Archive.is blog

Blog of http://archive.is/ project
  • Privacy policy.

    There were many questions about the privacy policy and whether archive.is tracks users.

    The website itself does not set any cookies and it does not track user identities (there are no user accounts and no RTB-ads), but:

    1. Google Analytics JavaScript code is present on all archive.is pages for statistics and downtime alerts. It may track users. A private installation of the statistics/monitoring software is on the TODO list, although it is not high-priority.
    2. The access logs are kept for 7 days. Sometimes archive.is is used to capture ephemeral tweets or webpages with child-porn and we get requests about them from the police and the regional agencies dealing with child-porn abuse.
    • 13 hours ago
  • Hello, it looks like archive is is no longer able to expand Livejournal comments. Ex: archive is TQaZg. We ran a sample page through Dreamwidth (an LJ fork) and it still works: archive is OLlJq. Also, when we tried running just the comments we ended up with a blank page: archive is 6NjZz. Thank you.
    meeedeee

    I will check.

    Sometimes it does not work due to LiveJournal's slow responses.

    • 1 day ago
  • The bookmarklet doesn't work on Twitter. I'm using Firefox.
    Anonymous

    I am afraid the only ways are:

    1. use another browser
    2. try the Firefox extension instead of the bookmarklet (I have not tested it)

    Firefox's source code lists four domains on which bookmarklets should not work: twitter.com, github.com, mail.google.com, reddit.com (but they do work on www.reddit.com)
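
    The fact that reddit.com is affected while www.reddit.com is not would be consistent with an exact hostname comparison. The sketch below only illustrates that idea. It is not Firefox's actual code; the function name `bookmarklet_blocked` and the exact-match rule are assumptions, and only the four hostnames come from the answer above.

```python
from urllib.parse import urlsplit

# The four hostnames named in the answer above.
BLOCKED_HOSTS = {"twitter.com", "github.com", "mail.google.com", "reddit.com"}


def bookmarklet_blocked(url: str) -> bool:
    """Return True if the URL's hostname exactly matches a listed host.

    Assumption: the comparison is an exact hostname match, so subdomains
    such as www.reddit.com are not covered by the reddit.com entry.
    """
    host = urlsplit(url).hostname or ""
    return host in BLOCKED_HOSTS


if __name__ == "__main__":
    for candidate in ("https://reddit.com/r/test", "https://www.reddit.com/r/test"):
        print(candidate, bookmarklet_blocked(candidate))
```

    Under that assumption, a suffix match (anything ending in ".reddit.com") would have blocked www.reddit.com as well, which does not match the behaviour described.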

    • 2 weeks ago
  • Google Translate is no longer being archived. I used archive-is with news articles after being put through Google Translate. Now pages translated using that service no longer seem to be capable of being archived. Help!
    Anonymous

    Fixed!

    Thank you for reporting!

    • 2 weeks ago
  • You mentioned there's no hot backup as of yet. Just wondering - is it plausible there could be one? Or is the site too big at this point? (I'm guessing it's in the ~5TB range)
    Anonymous

    5 TB is the weekly growth :)

    • 2 weeks ago
  • Images on medium/com are blurry, because the medium/com dev team made images blurry until your browser focuses on them.
    Anonymous

    Fixed.

    Thank you for reporting the problem!

    • 2 weeks ago
  • I asked that question ("where are you located?") because you might be located in an earthquake area, like for example archive org (San Francisco), and then there is a risk that all the archiving makes no sense...
    Anonymous

    We have two copies in different locations (although there is no hot backup yet, and a disaster at the location of the primary copy may cause a few days of downtime).

    • 2 weeks ago
  • How do end users know that archived pages aren't doctored or altered?
    Anonymous

    Many pages are altered. The main reason is to remove a popup or login box covering the content.

    • 3 weeks ago
  • Do you ever plan to open-source the software so we can run our own instances?
    Anonymous

    No. You may be interested in similar Q&A: http://blog.archive.is/tagged/opensource

    • 3 weeks ago
  • Is there a new maximum number of archived pages per site? One major site which had several thousand archived pages suddenly has only 1000.
    Anonymous

    There is no limit, but when you search for a whole domain, it shows only the 10000 latest snapshots from that domain even if there are many more. This limit was 1000 for a short time; I just reverted it to 10000.

    • 3 weeks ago