all 26 comments

[–]6asdgfsafAS6SD 3 points4 points  (0 children)

I'll just leave this here.

[–]d4d5c4e5 4 points5 points  (1 child)

A much more likely explanation is that Wei Dai was by far the most significant influence on Bitcoin, and whoever Satoshi is very consciously adopted the same style.

[–]dhs6sd5dshskoo 1 point2 points  (0 children)

Look for C++ code from Hal, Szabo, Wright or Wei. You will find such code from only one of them. This code will be a general purpose cryptography library written in C++ primarily supporting the Windows platform. A small portion (single file) of this code is used in the original implementation of Bitcoin.

One of this same person's self-stated role models is a fictional character from Vernor Vinge's book, "A Fire Upon The Deep." This character is known for making the clearest and most insightful posts on the Internet.

The only other non-fictional person (aside from Tim May) this person references as a role model was Hal Finney -- the recipient of the first Bitcoin transaction. Hal Finney was obviously very deeply respected by SN. IMO, this person is more of a real-life Sandor than even Hal. This person's posts are some of the clearest, most insightful, and most thought-provoking I have ever encountered. I highly encourage reading them all; I have, or at least all that I could find. Displayed in them is both an immense depth of imagination and intelligence, which so rarely go hand in hand. What's more remarkable is the almost complete lack of ego and the attention to clarity. This person wants the reader to clearly understand their ideas -- not to prove their intelligence.

Try to find a photo of this person. I guess finding a photo of a person who deeply internalized a reading of True Names will be rather difficult.

I don't know if this person is SN or not. But they are the only person whom I believe (of the widely known candidates) possesses the imagination, intelligence, and abilities consistent with being SN.

[–]jstolfi 6 points7 points  (19 children)

Not convincing.

For one thing, the texts by Wei Dai and Satoshi seem to be the only two technical papers; all the others are either non-technical or of the "popular science" type. In technical writing, but not in the other kinds of writing, great value is placed on succinctness, avoidance of repetition and redundancy, and use of precise terminology. These qualities all imply that technical papers have higher entropy than other texts. So that seems to be all that the study proved.

Moreover, comparing texts by a single statistical measure is like comparing animals by their weight. By that criterion, you may conclude that the animal that most resembles a dog is a sea turtle.

[–]zawy2[S] 4 points5 points  (18 children)

I took the data from Szabo's blog and Wei Dai's blog. Most of Satoshi's writing was from discussing bitcoin. I think the white paper was 18kb. The files we have from him are 250 kb.

Entropy is very different from an average. It's looking at the difference in weight between the eyes, the stomach, the skin, the brain, etc and assigning a score to each value, with over 2000 different "body parts".

That's a VERY interesting thought though. Each body part is just a word and instead of word frequency, I could assign a weight in place of the count. Subtracting the entropy between two animals like this should work. If they are the same animal they should have a very low amount of disorder between them. If their total weight came out the same, it could still have a very high disorder value indicating it's not the same animal. Since I selected all texts to have the same number of words, it is like they have the same total weight.

[–]lifeboatz 1 point2 points  (9 children)

Do you have a comparison from, say, Kim Kardashian? Because a lot of us think Satoshi is one of those two... either Wei or Kim.

[–]zawy2[S] 1 point2 points  (8 children)

Just send me any 250 kb of text from anyone; no one you find will match Satoshi as well as Wei Dai does when using the equation or Perl script I provided. I can remove any economic or bitcoin-related words and it still comes out pretty much the same. Szabo actually ranks higher when cryptocurrency and smart-contract words are removed from all sample texts.
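A minimal sketch of what removing topic words from the frequency counts might look like -- in Python rather than the original Perl, and with a hypothetical word list, since the actual list used is not given:

```python
# Sketch only: hypothetical Python equivalent of stripping topic-specific
# words before comparing authors, so the score rests on style, not subject.

def strip_topic_words(counts, topic_words):
    """Drop economic/bitcoin-specific words from a word-frequency dict."""
    return {w: c for w, c in counts.items() if w not in topic_words}

# Hypothetical example: topic words are removed, function words remain.
sample = {"bitcoin": 12, "the": 80, "of": 55, "contract": 7}
filtered = strip_topic_words(sample, {"bitcoin", "contract", "currency"})
```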

[–]lifeboatz 1 point2 points  (4 children)

My point is that you need some "control" points. What does the average researcher look like in comparison?

[–]zawy2[S] 2 points3 points  (3 children)

Yes, it would be good to have more controls, but what's better than Nick Szabo and 5 bitcoin papers from Craig Wright? They failed miserably in comparison.

I know, what about Hal Finney? Or bitcoin forums?

[–]lifeboatz 1 point2 points  (2 children)

If you are doing a word frequency analysis, you need to look for authors with similar words. Instead of trying to prove you're right, try to prove that you are wrong.... maybe scan archives for people who have a closer match?

[–]zawy2[S] 1 point2 points  (1 child)

Wei Dai's articles were philosophy. I do not need him to be talking about the same subject. Bitcoin talkers are not going to match well with Satoshi. I only need other data to show the skeptics here that it works without caring whether the author is using the same nouns or not. Send me a list of bitcoin-like words to delete from every file, and Wei Dai will still be on top.

[–]d4d5c4e5 2 points3 points  (0 children)

It's a fairly fringe theory, but I think a number of people would be interested in a comparison to John Nash.

[–]SrPeixinho 0 points1 point  (2 children)

Man, screw that, just send me code from all 4 of them and I'll tell you who is Satoshi.

[–]zawy2[S] 0 points1 point  (0 children)

The program works even better on code, but it has to be a LOT. Professional versions can detect the right blogger out of 100,000 bloggers 20% of the time in the #1 spot.

[–]jstolfi 1 point2 points  (7 children)

Entropy is very different from an average. It's looking at the difference in weight between the eyes, the stomach, the skin, the brain, etc and assigning a score to each value, with over 2000 different "body parts".

The question is whether you are reducing each text to a single number BEFORE comparing all those features, or are comparing feature by feature and then combining the differences.

Anyway, the first point remains. Wei Dai and Satoshi wrote mostly about technical stuff. All the other texts, including Szabo's and Craig's, were of a very different character.

[–]zawy2[S] 2 points3 points  (4 children)

I calculated a Shannon entropy difference. It's K-L divergence without the leading p outside the log. You can't reduce the text before doing it. Some runs take 3 minutes since I'm using Perl. It counts each word into a hash, then does the equation on each word between two authors and sums up for that comparison, then goes to the next author. When I was testing it I would repeat it for 10 test authors on 40k words over 80 files. Something like 200 million loops.
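The counting step described here (a Perl hash of word frequencies per author) could be sketched like this in Python; the tokenization rule is my assumption, not necessarily the original script's:

```python
import re
from collections import Counter

def word_counts(text):
    """Count word frequencies for one author's sample, analogous to the
    Perl hash described above. Tokenizing as lower-cased runs of letters
    and apostrophes is an assumption, not the original script's rule."""
    return Counter(re.findall(r"[a-z']+", text.lower()))
```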

I can't see how you can call Wei Dai's blog more technical. Wei Dai's page seems half philosophy. I say it is "his" page, but it might just be a group he has been in. It's a philosophy page. That's really way different from Szabo's economics and Wright's cryptocurrency papers. I think you might be thinking of his old home page, which I didn't even bother to go find because he had so much stuff in his popular philosophy articles. I already knew the type of content would not matter, so I wasn't worried, as long as it's not code. Really, I was not expecting any kind of match. Everyone else listed on Wikipedia was failing my tests. 200k of Satoshi's 250k is talk in a forum that was kept in fairly general terms, so it wasn't too much code. They're all apples and oranges from what I can tell.

Dai's forum's mission statement is

Less Wrong is an online community for people who try to think rationally. To get a quick idea of why rationality is important and how to develop it, try reading Your Intuitions Are Not Magic, The Cognitive Science of Rationality, or What I've Learned From Less Wrong.

[–]jstolfi 0 points1 point  (2 children)

OK, now I understand better what you did. It is sound in principle. I am not sure yet about the details. You write that the score is

sum [ log(p/q) ] when p is greater than q

for all word counts p and q, where p is Satoshi's count for each word and q is the known author's count for that word.

This measure skips words that were used by Satoshi (S) less often than by the author being evaluated (X). Presumably you also skip words that were used by only one of them (p = 0 or q = 0).

This does not seem very robust, since the number of words N(S,X) that are used in the comparison varies, depending on the topic of the text X.

So perhaps your score is measuring mostly the number N(S,X) rather than the similarity of the texts.

Moreover:

  • If Satoshi uses "byte" only once and X uses "byte" 10 times, ignore the word "byte";

  • If Satoshi uses "byte" 9 times and X uses "byte" 10 times, ignore;

  • If Satoshi uses "byte" 11 times and X uses "byte" 10 times, add log(11/10) = ~0.04 to the score;

  • If Satoshi uses "byte" 10 times and X uses "byte" only once, add log(10/1) = 1.00 to the score;

So, even if two texts X and Y have the same numbers N(S,X) and N(S,Y) of qualifying words, it looks like the score above measures the difference of each text to Satoshi's, not the similarity. That is, the higher the score, the more different the text is from Satoshi's.
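As a sanity check of this reading, here is a minimal Python sketch of the one-sided score as stated (base-10 log, skipping words where p <= q or where either count is zero). This is my reconstruction of the description, not the actual Perl script:

```python
import math

def one_sided_score(p_counts, q_counts):
    """One-sided log-ratio score as described: sum log10(p/q) over words
    where Satoshi's count p exceeds the candidate's count q, skipping
    words either author never used. A higher score means the candidate's
    text is MORE different from Satoshi's."""
    score = 0.0
    for word, p in p_counts.items():
        q = q_counts.get(word, 0)
        if p > q > 0:  # skip p <= q, and skip q = 0
            score += math.log10(p / q)
    return score
```

Plugging in the "byte" cases above reproduces the ~0.04 and 1.00 contributions.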

I had to do such text comparisons once. I will try to describe my solution tomorrow...

[–]zawy2[S] 1 point2 points  (1 child)

You're right, it is the difference. I mention that in the text, and I write below the chart that the chart is misleading to someone like you who knows what it is; but I have it that way to be less confusing to everyone else, who would think a long line is what we're looking for instead of a short line. If you look at my rankings, you see the best match is the smallest value.

I do not know why I had to cut it off, but it matched the test authors better that way. Some of the data is not being used. I tried many ways to incorporate it, but the results were never better and often questionable. I have vague intuitions as to why.

No, I found it important to NOT skip when q=0, and CRUCIAL to skip when p=0, but I am not doing p=0 (Satoshi) with this one-sided method. I guessed 0.25 should be the value assigned when q=0, on the reasoning that it was halfway between a true zero and a 0.5. It turned out to work really well. 0.1 to 0.35 was the range where I could not see deterioration. There was some deterioration at 0.5. A divide-by-zero error in the log forced me to think about what to do with zero and then to test it.
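A sketch of that zero-handling variant, again in Python rather than the original Perl; the structure is assumed from the description above:

```python
import math

def score_with_zero_floor(p_counts, q_counts, q_floor=0.25):
    """Variant described above: skip words Satoshi never used (p = 0),
    but when the candidate never used a word (q = 0), substitute a small
    floor (0.25 reportedly worked well) instead of skipping, which also
    avoids the divide-by-zero in the log."""
    score = 0.0
    for word, p in p_counts.items():
        if p == 0:
            continue  # crucial: skip words Satoshi never used
        q = q_counts.get(word, 0)
        if q == 0:
            q = q_floor  # floor instead of skipping
        if p > q:
            score += math.log10(p / q)
    return score
```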

Also, p>q did not need to be exact. p>0.8q and p>1.2q were both OK; similar ratios did not affect it much. What happened down where p or q were close to 0 was what mattered.

I tried about 100 different variations of that equation with different conditionals.

[–]jstolfi 0 points1 point  (0 children)

I found it important to NOT skip when q=0

But then log(p/q) is infinity. How do you handle that?

[–]wycks 0 points1 point  (0 children)

If you watch some of these videos you can see CW's code (simple examples) and writing style in a word document, though you would have to re-write it since it's a video (obviously).

https://www.youtube.com/watch?v=qq_kVixpxrI

[–]zawy2[S] 0 points1 point  (1 child)

I'll get Adam Back on the list. Wei Dai helped him; they were looking into an attack Dai proposed, and Back talked about Dai's b-money. Back is not ranking as high, but not too far off. I need more files for him.

[–]zawy2[S] 0 points1 point  (0 children)

Adam Back results: out of 8 tests, he barely won on 3. He lost by a good margin on 3. On the other 3 he was only 2 or 3 spots behind. The two he lost by the widest margins are the ones I consider the most reliable.

[–]zawy2[S] 1 point2 points  (2 children)

I'm telling you, a lot of authors like Carl Sagan and Robert Heinlein did not match as well with themselves as Wei Dai is matching with Satoshi. And that's comparing Dai's philosophy articles to Satoshi's coding forum. He's beating every other potential bitcoin contender that is listed at Wikipedia for whom I can get the data.

[–]d4d5c4e5 0 points1 point  (1 child)

To play devil's advocate, you may have merely discovered how extensively these authors were edited and/or ghost-written. I recall ghost writing definitely being a strong possibility with Sagan vis-a-vis Ann Druyan, and with Heinlein there is a strong possibility that he dealt with varying degrees of control-freak editors, who in fiction could essentially end up being hard to distinguish from ghost writers.

[–]zawy2[S] 1 point2 points  (0 children)

Yeah, I was having trouble deciding what to do about Sagan, since I knew about Druyan. But the worst one is Ridley. I swear it has to be a different author in the second half of The Red Queen.

[–]zawy2[S] 0 points1 point  (0 children)

The windows executable that implements the included Perl code is now at the bottom of that page