all 26 comments

[–]6asdgfsafAS6SD 3 points4 points  (0 children)

I'll just leave this here.

[–]d4d5c4e5 4 points5 points  (1 child)

A much more likely explanation is that Wei Dai was by far the most significant influence on Bitcoin, and whoever Satoshi is very consciously adopted the same style.

[–]dhs6sd5dshskoo 1 point2 points  (0 children)

Look for C++ code from Hal, Szabo, Wright or Wei. You will find such code from only one of them. This code will be a general purpose cryptography library written in C++ primarily supporting the Windows platform. A small portion (single file) of this code is used in the original implementation of Bitcoin.

One of this same person's self-stated role models is a fictional character from Vernor Vinge's book, "A Fire Upon The Deep." This character is known for making the clearest and most insightful posts on the Internet.

The only other non-fictional person (aside from Tim May) this person references as a role model was Hal Finney -- the recipient of the first Bitcoin transaction. Hal Finney was obviously very deeply respected by SN. IMO, this person is more of a real-life Sandor than even Hal. This person's posts are some of the clearest, most insightful, and most thought-provoking I have ever encountered. I highly encourage reading them all; I have, or at least all that I could find. Displayed in them is both an immense depth of imagination and intelligence, which so rarely go hand in hand. What's more remarkable is the almost complete lack of ego and the attention to clarity. This person wants the reader to clearly understand their ideas -- not to prove their intelligence.

Try to find a photo of this person. I guess finding a photo of a person who deeply internalized a reading of True Names will be rather difficult.

I don't know if this person is SN or not. But they are the only person whom I believe (of the widely known candidates) possesses the imagination, intelligence, and abilities consistent with being SN.

[–]jstolfi 6 points7 points  (19 children)

Not convincing.

For one thing, the texts by Wei Dai and Satoshi seem to be the only two technical papers; all the others are either non-technical or of the "popular science" type. In technical writing, but not in the other kinds of writing, great value is placed on succinctness, avoidance of repetition and redundancy, and use of precise terminology. These qualities all imply that technical papers have higher entropy than other texts. So that seems to be all that the study proved.

Moreover, comparing texts by a single statistical measure is like comparing animals by their weight. By that criterion, you may conclude that the animal that most resembles a dog is a sea turtle.

[–]zawy2[S] 4 points5 points  (18 children)

I took the data from Szabo's blog and Wei Dai's blog. Most of Satoshi's writing was from discussing bitcoin. I think the white paper was 18kb. The files we have from him are 250 kb.

Entropy is very different from an average. It's looking at the difference in weight between the eyes, the stomach, the skin, the brain, etc and assigning a score to each value, with over 2000 different "body parts".

That's a VERY interesting thought though. Each body part is just a word and instead of word frequency, I could assign a weight in place of the count. Subtracting the entropy between two animals like this should work. If they are the same animal they should have a very low amount of disorder between them. If their total weight came out the same, it could still have a very high disorder value indicating it's not the same animal. Since I selected all texts to have the same number of words, it is like they have the same total weight.

[–]lifeboatz 1 point2 points  (9 children)

Do you have a comparison from, say, Kim Kardashian? Because a lot of us think Satoshi is one of those two... either Wei or Kim.

[–]zawy2[S] 1 point2 points  (8 children)

Just send me any 250 kb of text from anyone; no one you find will match Satoshi as well as Wei Dai does when using the equation or Perl script I provided. I can remove any economic or bitcoin-related words and it still comes out pretty much the same. Szabo actually ranks higher when cryptocurrency and smart-contract words are removed from all sample texts.
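A minimal sketch of what removing topic words from the frequency counts might look like -- in Python rather than the original Perl, and with a hypothetical word list, since the actual list used is not given:

```python
# Sketch only: hypothetical Python equivalent of stripping topic-specific
# words before comparing authors, so the score rests on style, not subject.

def strip_topic_words(counts, topic_words):
    """Drop economic/bitcoin-specific words from a word-frequency dict."""
    return {w: c for w, c in counts.items() if w not in topic_words}

# Hypothetical example: topic words are removed, function words remain.
sample = {"bitcoin": 12, "the": 80, "of": 55, "contract": 7}
filtered = strip_topic_words(sample, {"bitcoin", "contract", "currency"})
```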

[–]lifeboatz 1 point2 points  (4 children)

My point is that you need some "control" points. What does the average researcher look like in comparison?

[–]zawy2[S] 2 points3 points  (3 children)

Yes, it would be good to have more controls, but what's better than Nick Szabo and 5 bitcoin papers from Craig Wright? They failed miserably in comparison.

I know, what about Hal Finney? Or bitcoin forums?

[–]lifeboatz 1 point2 points  (2 children)

If you are doing a word frequency analysis, you need to look for authors with similar words. Instead of trying to prove you're right, try to prove that you are wrong.... maybe scan archives for people who have a closer match?

[–]zawy2[S] 1 point2 points  (1 child)

Wei Dai's articles were philosophy. I do not need him to be talking about the same subject. Bitcoin talkers are not going to match well with Satoshi. I only need other data to show the skeptics here that it works without caring whether the author is using the same nouns or not. Send me a list of bitcoin-like words to delete from every file, and Wei Dai will still be on top.

[–]d4d5c4e5 2 points3 points  (0 children)

It's a fairly fringe theory, but I think a number of people would be interested in a comparison to John Nash.

[–]SrPeixinho 0 points1 point  (2 children)

Man, screw that, just send me code from all 4 of them and I'll tell you who is Satoshi.

[–]zawy2[S] 0 points1 point  (0 children)

The program works even better on code, but it has to be a LOT. Professional versions can detect the right blogger out of 100,000 bloggers 20% of the time in the #1 spot.

[–]jstolfi 1 point2 points  (7 children)

Entropy is very different from an average. It's looking at the difference in weight between the eyes, the stomach, the skin, the brain, etc and assigning a score to each value, with over 2000 different "body parts".

The question is whether you are reducing each text to a single number BEFORE comparing all those features, or are comparing feature by feature and then combining the differences.

Anyway, the first point remains. Wei Dai and Satoshi wrote mostly about technical stuff. All the other texts, including Szabo's and Craig's, were of a very different character.

[–]zawy2[S] 2 points3 points  (4 children)

I calculated a Shannon entropy difference. It's K-L divergence without the leading p outside the log. You can't reduce the text before doing it. Some runs take 3 minutes since I'm using Perl. It counts each word into a hash, then does the equation on each word between two authors and sums up for that comparison, then goes to the next author. When I was testing it I would repeat it for 10 test authors on 40k words over 80 files. Something like 200 million loops.
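The counting step described here (a Perl hash of word frequencies per author) could be sketched like this in Python; the tokenization rule is my assumption, not necessarily the original script's:

```python
import re
from collections import Counter

def word_counts(text):
    """Count word frequencies for one author's sample, analogous to the
    Perl hash described above. Tokenizing as lower-cased runs of letters
    and apostrophes is an assumption, not the original script's rule."""
    return Counter(re.findall(r"[a-z']+", text.lower()))
```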

I can't see how you can call Wei Dai's blog more technical. Wei Dai's page seems half philosophy. I say it is "his" page, but it might just be a group he has been in. It's a philosophy page. That's really way different from Szabo's economics and Wright's cryptocurrency papers. I think you might be thinking of his old home page, which I didn't even bother to go find because he had so much stuff in his popular philosophy articles. I already knew the type of content would not matter, so I wasn't worried, as long as it's not code. Really, I was not expecting any kind of match. Everyone else listed on Wikipedia was failing my tests. 200k of Satoshi's 250k is talk in a forum that was kept in fairly general terms, so it wasn't too much code. They're all apples and oranges from what I can tell.

Dai's forum's mission statement is

Less Wrong is an online community for people who try to think rationally. To get a quick idea of why rationality is important and how to develop it, try reading Your Intuitions Are Not Magic, The Cognitive Science of Rationality, or What I've Learned From Less Wrong.

[–]jstolfi 0 points1 point  (2 children)

OK, now I understand better what you did. It is sound in principle. I am not sure yet about the details. You write that the score is

sum [ log(p/q) ] when p is greater than q

for all word counts p and q, where p is Satoshi's count for each word and q is the known author's count for that word.

This measure skips words that were used by Satoshi (S) less often than by the author being evaluated (X). Presumably you also skip words that were used by only one of them (p = 0 or q = 0).

This does not seem very robust, since the number of words N(S,X) that are used in the comparison varies, depending on the topic of the text X.

So perhaps your score is measuring mostly the number N(S,X) rather than the similarity of the texts.

Moreover:

  • If Satoshi uses "byte" only once and X uses "byte" 10 times, ignore the word "byte";

  • If Satoshi uses "byte" 9 times and X uses "byte" 10 times, ignore;

  • If Satoshi uses "byte" 11 times and X uses "byte" 10 times, add log(11/10) = ~0.04 to the score;

  • If Satoshi uses "byte" 10 times and X uses "byte" only once, add log(10/1) = 1.00 to the score;

So, even if two texts X and Y have the same numbers N(S,X) and N(S,Y) of qualifying words, it looks like the score above measures the difference of each text to Satoshi's, not the similarity. That is, the higher the score, the more different the text is from Satoshi's.
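As a sanity check of this reading, here is a minimal Python sketch of the one-sided score as stated (base-10 log, skipping words where p <= q or where either count is zero). This is my reconstruction of the description, not the actual Perl script:

```python
import math

def one_sided_score(p_counts, q_counts):
    """One-sided log-ratio score as described: sum log10(p/q) over words
    where Satoshi's count p exceeds the candidate's count q, skipping
    words either author never used. A higher score means the candidate's
    text is MORE different from Satoshi's."""
    score = 0.0
    for word, p in p_counts.items():
        q = q_counts.get(word, 0)
        if p > q > 0:  # skip p <= q, and skip q = 0
            score += math.log10(p / q)
    return score
```

Plugging in the "byte" cases above reproduces the ~0.04 and 1.00 contributions.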

I had to do such text comparisons once. I will try to describe my solution tomorrow...

[–]zawy2[S] 1 point2 points  (1 child)

You're right, it is the difference. I mention that in the text, and I write below the chart that the chart is misleading to someone like you who knows what it is; but I have it that way to be less confusing to everyone else, who would think a long line is what we're looking for instead of a short line. If you look at my rankings, you see the best match is the smallest value.

I do not know why I had to cut it off, but it matched the test authors better that way. Some of the data is not being used. I tried many ways to incorporate it, but the results were never better and often questionable. I have vague intuitions as to why.

No, I found it important to NOT skip when q=0, and CRUCIAL to skip when p=0, but I am not doing p=0 (Satoshi) with this one-sided method. I guessed 0.25 should be the value assigned when q=0, on the reasoning that it was halfway between a true zero and a 0.5. It turned out to work really well. 0.1 to 0.35 was the range where I could not see deterioration. There was some deterioration at 0.5. A divide-by-zero error in the log forced me to think about what to do with zero and then to test it.
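A sketch of that zero-handling variant, again in Python rather than the original Perl; the structure is assumed from the description above:

```python
import math

def score_with_zero_floor(p_counts, q_counts, q_floor=0.25):
    """Variant described above: skip words Satoshi never used (p = 0),
    but when the candidate never used a word (q = 0), substitute a small
    floor (0.25 reportedly worked well) instead of skipping, which also
    avoids the divide-by-zero in the log."""
    score = 0.0
    for word, p in p_counts.items():
        if p == 0:
            continue  # crucial: skip words Satoshi never used
        q = q_counts.get(word, 0)
        if q == 0:
            q = q_floor  # floor instead of skipping
        if p > q:
            score += math.log10(p / q)
    return score
```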

Also, p>q did not need to be exact. p>0.8q and p>1.2q were both OK; similar ratios did not affect it much. What happened down where p or q were close to 0 was what mattered.

I tried about 100 different variations of that equation with different conditionals.

[–]jstolfi 0 points1 point  (0 children)

I found it important to NOT skip when q=0

But then log(p/q) is infinity. How do you handle that?

[–]wycks 0 points1 point  (0 children)

If you watch some of these videos you can see CW's code (simple examples) and writing style in a word document, though you would have to re-write it since it's a video (obviously).

https://www.youtube.com/watch?v=qq_kVixpxrI

[–]zawy2[S] 0 points1 point  (1 child)

I'll get Adam Back on the list. Wei Dai helped him; they were looking into an attack Dai proposed, and Back talked about Dai's b-money. Back is not ranking as high, but not too far off. I need more files for him.

[–]zawy2[S] 0 points1 point  (0 children)

Adam Back results: out of 8 tests, he barely won on 3. He lost by a good margin on 3. On the other 3 he was only 2 or 3 spots behind. The two he lost by the widest margins are the ones I consider the most reliable.

[–]zawy2[S] 1 point2 points  (2 children)

I'm telling you, a lot of authors like Carl Sagan and Robert Heinlein did not match as well with themselves as Wei Dai is matching with Satoshi. And that's comparing Dai's philosophy articles to Satoshi's coding forum. He's beating every other potential bitcoin contender that is listed at Wikipedia for whom I can get the data.

[–]d4d5c4e5 0 points1 point  (1 child)

To play devil's advocate, you may have merely discovered how extensively these authors were edited and/or ghost-written. I recall ghost writing definitely being a strong possibility with Sagan vis-a-vis Ann Druyan, and with Heinlein there is a strong possibility that he dealt with varying degrees of control-freak editors, who in fiction could essentially end up being hard to distinguish from ghost writers.

[–]zawy2[S] 1 point2 points  (0 children)

Yeah, I was having trouble deciding what to do about Sagan, since I knew about Druyan. But the worst one is Ridley. I swear it has to be a different author in the second half of The Red Queen.

[–]zawy2[S] 0 points1 point  (0 children)

The windows executable that implements the included Perl code is now at the bottom of that page