[FAQ] AutoTLDR Bot : autotldr

Submitted 2 years ago * by autotldr

What is autotldr?

autotldr is a bot that uses SMMRY to automatically summarize long reddit submissions. It will remove extra examples, transition phrases, and unimportant details.

How does it work?

Refer to here for a basic understanding.

Why is autotldr useful?

Read here for a detailed explanation.

tl;dr's are frequently requested, yet rarely available, for long articles in external submissions. To increase the attention that sophisticated and scientific posts get, autotldr gives the gist of the article to redditors who prefer a summary and would otherwise have ignored it. This way, important yet long articles become more relevant and accessible to a larger portion of the reddit userbase. It also lets redditors who can't access the original submission still understand the context (useful when a site goes down after a submission or the content is removed).

When will autotldr make a post?

autotldr will only post if the content can be reduced by at least 70%. So if the summary is only 50% shorter than the original, autotldr will not post it. The tl;dr must also be between 450 and 700 characters long. autotldr does not summarize self posts, as providing that tl;dr is the OP's responsibility.
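
The posting rule above can be sketched as a simple check. This is a hypothetical illustration, not the bot's actual code: the function name `should_post`, its parameters, and the choice to measure the reduction in characters are all assumptions. The FAQ itself states only the 70% threshold, the 450 to 700 character range, and the self-post exclusion.

```python
def should_post(original: str, summary: str, is_self_post: bool = False) -> bool:
    """Return True if a summary meets the posting criteria stated in the FAQ.

    Assumption: the reduction is measured in characters (the FAQ does not say).
    """
    if is_self_post or not original:
        return False
    # "Reduced by at least 70%" means the summary is at most 30% of the original.
    # Integer arithmetic avoids floating-point rounding at the boundary.
    reduced_enough = 10 * len(summary) <= 3 * len(original)
    right_length = 450 <= len(summary) <= 700
    return reduced_enough and right_length
```

For example, a 600-character summary of a 3000-character article (an 80% reduction) would pass, while the same summary of a 1200-character article (only a 50% reduction) would not.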

Who do I contact about autotldr?

Message the bot account.

I'm a mod and I don't want autotldr to post on my subreddit

Send a message from your mod account to blacklist your subreddit. If you have valid reasons for blacklisting/banning autotldr, please contribute to the theory of autotldr discussion.

all 11 comments

[–]raldi 15 points 2 years ago (0 children)

[–]cruyff8 6 points 2 years ago (7 children)

[–][deleted] 2 years ago (6 children)

[deleted]

[–]cruyff8 7 points 2 years ago (5 children)

[–]iforgot120 3 points 2 years ago (3 children)

[–]cruyff8 1 point 2 years ago (2 children)

[–]iforgot120 4 points 2 years ago (1 child)

Specifics on TF-IDF? It's a very simple algorithm, so there really isn't all too much to it; you can try to improve accuracy by playing with the numbers, but the idea is the same.

The idea behind TF-IDF (which stands for "term frequency - inverse document frequency") is that it analyzes a single document (e.g. a posted article) for individual word counts (how often each word appears in the document). Words that appear more frequently are most likely important to that document; however, that'll be skewed by words that are simply frequent throughout the English language (e.g. conjunctions [and, or, but, etc.], determiners [this, that, each, my, the, etc.], and common verbs [is, are, was, etc.]).

To offset that, you need to weight the term frequency by the inverse document frequency, which looks at a body of different documents (called a "corpus" in NLP). Words that appear (however many times) in all or many of the documents are probably words that are just common in the English language, while words that are rare would be more specific to a single document.
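
As a rough sketch of that weighting, here is one common textbook variant of TF-IDF: tf is a word's share of the document, and idf is log(N / df), where N is the number of documents in the corpus and df is how many of them contain the word. The helper names `tokenize` and `tf_idf` are made up for illustration, and nothing here is claimed to be what SMMRY or autotldr actually runs.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def tf_idf(corpus, index):
    """Score every word of corpus[index] against the whole corpus.

    Assumption: idf = log(N / df), one of several variants in use.
    """
    counts = Counter(tokenize(corpus[index]))
    total = sum(counts.values())
    vocab_per_doc = [set(tokenize(doc)) for doc in corpus]
    scores = {}
    for word, count in counts.items():
        tf = count / total                                # frequency within the document
        df = sum(word in vocab for vocab in vocab_per_doc)  # documents containing the word
        scores[word] = tf * math.log(len(corpus) / df)    # df >= 1: the document is in the corpus
    return scores

corpus = ["the cat sat on the mat", "the dog ate the bone", "the bird sang"]
scores = tf_idf(corpus, 0)
```

With this variant, a word like "the" that appears in every document gets idf = log(1) = 0 and drops out entirely, while "cat", which appears only in the first document, keeps a positive score.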

So if you have a word that appears often in a single document, but only in that single document and in no other documents, then that's probably a relevant word to said document, meaning sentences containing that word probably have higher importance.
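
To make the last step concrete, here is a minimal sketch of ranking sentences by the average TF-IDF weight of their words. Scoring by the average, the log(N / df) form of idf, the naive sentence splitter, and the names `words` and `rank_sentences` are all assumptions for illustration; a real summarizer would likely be more careful.

```python
import math
import re
from collections import Counter

def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def rank_sentences(document, other_documents):
    """Return the document's sentences, highest average TF-IDF first."""
    corpus = [document] + list(other_documents)
    vocab_per_doc = [set(words(doc)) for doc in corpus]
    counts = Counter(words(document))
    total = sum(counts.values())
    # Weight of each word: term frequency times log(N / document frequency).
    weight = {
        w: (c / total) * math.log(len(corpus) / sum(w in v for v in vocab_per_doc))
        for w, c in counts.items()
    }
    # Naive split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    def score(sentence):
        tokens = words(sentence)
        return sum(weight[t] for t in tokens) / len(tokens) if tokens else 0.0

    return sorted(sentences, key=score, reverse=True)

article = "The bot is here. Quasars emit radiation. The bot is the bot."
background = ["The bot is a program.", "The weather is nice here."]
ranked = rank_sentences(article, background)
```

Here the sentence about quasars ranks first, because its words appear in no other document, even though "bot" is the most frequent content word in the article.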

[–]cruyff8 1 point 2 years ago (0 children)

[–]aristeiaa 2 points 2 years ago (1 child)

[–]polysemous_entelechy 1 point 2 years ago (0 children)
