Archive.is blog

Blog of http://archive.is/ project
  • Are there any limits on the number of URLs a user may submit each day? The FAQ is silent about this, if there are limits.
    Anonymous

    There is no explicit per-user limit.
    There are some performance limits derived from the limits of the hardware: for example, the number of running browsers is limited and the queue size is limited.
    They are very rarely hit, but a user who submits a very large number of URLs has a good chance of hitting them.
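
To illustrate how a limited queue behaves for a heavy submitter, here is a minimal sketch. It is not the site's actual code: `MAX_QUEUE_SIZE`, `submit`, and the example URLs are hypothetical names and values chosen only for this example.

```python
import queue

# Hypothetical limit; the real queue size is not stated in the post.
MAX_QUEUE_SIZE = 3


def submit(pending: queue.Queue, url: str) -> bool:
    """Try to enqueue a URL; return False if the queue is already full."""
    try:
        pending.put_nowait(url)
        return True
    except queue.Full:
        # No per-user rule is involved: the shared queue simply has no room.
        return False


pending = queue.Queue(maxsize=MAX_QUEUE_SIZE)

# A single user submitting five URLs in a burst fills the queue after three.
results = [submit(pending, f"http://example.com/{i}") for i in range(5)]
```

In this sketch nothing counts submissions per user; the rejections come only from the shared capacity, which matches the kind of limit described above.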

    • 1 day ago
  • So the data of the snapshots archive is gdok1, archive is AlnJe and archive is hUfRf are all saved separately, while their webpage capture and webpage screenshots are exactly the same? Could you check whether it's the same before saving a duplicate copy? Then you wouldn't need the 'If this snapshot looks obsolete you can save the page again.' alert.
    Anonymous

    1. Identical pages like the ones in your example are very rare. Ads, recommendation blocks, or tweet feeds usually differ.
    2. It might take considerable time to make a hidden snapshot to compare with the previous one; this operation may also fail due to network conditions or a server error.
    3. For some use cases it makes sense to save exactly the same page twice, to prove that it has not changed.

    • 2 days ago
  • Is there deduplication of the data on the archive? What if two snapshots of a URL are exactly the same, do you save them twice on disk?
    Anonymous

    Images are deduplicated, HTML pages are not. There are too many images that are the same across thousands of snapshots, for example the icons of social networks.
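
One common way to deduplicate images is content-addressed storage, sketched below. This is an assumption about the general technique, not a description of how the archive actually stores files: `BlobStore`, the snapshot dictionaries, and the sample bytes are all hypothetical.

```python
import hashlib


class BlobStore:
    """Stores each distinct byte string once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blobs = {}  # digest (hex string) -> image bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Only the first copy is kept; later identical uploads reuse it.
        self.blobs.setdefault(digest, data)
        return digest


store = BlobStore()
icon = b"\x89PNG fake social network icon bytes"

# Two snapshots with different HTML that embed the same icon.
snapshot_a = {"html": "<p>page A</p>", "images": [store.put(icon)]}
snapshot_b = {"html": "<p>page B</p>", "images": [store.put(icon)]}
```

Both snapshots end up referencing the same digest, so the icon occupies storage once, while each HTML document is kept in full.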

    • 2 days ago
  • Could you remove the German Google translate translation? I'd rather see the site in English than in bad German.
    Anonymous

    Could you help the free project and improve the translation?

    Alternatively, you can change the language: http://www.w3.org/International/questions/qa-lang-priorities.en.php#changing

    • 3 days ago
  • When will the index scheme be updated to accommodate more than 10000 URLs listed in each search?
    Anonymous

    I plan to index the whole archive using Elasticsearch and also use it for URL search.
    No exact date yet.

    • 1 week ago
  • Answer this already with absolutely no bullshitting, lying or talking nonsense: what is the real reason for blocking Finland and when will you unblock it? If the answer to the previous question is "never", how can ANYONE trust this site anymore?
    Anonymous

    Presumably, never.
    My view on the trust issue is different: it is better to sacrifice the global availability a bit (it is far from being 100% anyway) than to censor the content.

    • 1 week ago
  • The site can't archive PDF links. Is there a way to allow it?
    Anonymous

    Not yet.
    You can try archive.org, webcitation.org or scribd.com.

    • 1 week ago
  • A few hours ago, a user found an XSS vulnerability on both archive.org and archive.is.

    The page https://archive.is/VSGzW, saved from https://archive.org/search.php?query=1XSS&sort=-publicdate<svg%20onload=confirm(/XSSPOSED/)>, contained executable JavaScript.
    The bug is fixed.
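
The post does not say how the bug was fixed. As a general illustration of this class of problem, the sketch below escapes a URL before placing it into HTML, using Python's standard `html.escape`. The function `render_saved_from` and the surrounding markup are hypothetical.

```python
import html


def render_saved_from(url: str) -> str:
    """Build an HTML fragment that shows the source URL as inert text."""
    # html.escape turns &, <, > and quotes into entities, so markup that
    # was smuggled into the URL is displayed instead of being executed.
    return '<div class="saved-from">Saved from ' + html.escape(url, quote=True) + "</div>"


payload = (
    "https://archive.org/search.php?query=1XSS&sort=-publicdate"
    "<svg%20onload=confirm(/XSSPOSED/)>"
)
safe = render_saved_from(payload)
```

With escaping applied, the `<svg ...>` part of the URL reaches the browser as text rather than as an element with an `onload` handler.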

    • 1 week ago
  • You are blocking Finland-based users. It is not so smart to punish people for what you think is a government-based excuse of yours. And we do have F-Secure Freedom and lots of proxies on the web. You are not so smart with this move you made.
    Anonymous

    It seems I just created a good page to buy the F-Secure ad-space :)
    There is also Opera Turbo, for free.

    • 1 week ago
  • If you end up blocking Tor, only block archiving, not viewing archives.
    Anonymous

    If the number of abuses increases, submissions from Tor may end up being pre-moderated.
    So far they are post-moderated, with special attention to the ones made from Tor IPs.

    • 1 week ago