Archive.is blog

Blog of http://archive.is/ project
  • Thanks for the outstanding service. I'm building a stand-alone archiving tool for "Web 2.0" pages which is heavily inspired by your approach. I know you mentioned repeatedly that the codebase is not useful out of context and I understand that, but I'd like to ask again just for the Javascript components, which are the ones that would be the most useful and portable, but also the trickiest to get right without iteration. Any chance I can reuse the website-specific plugins and/or the inliner?
    Anonymous

    Ironically, the JS part is the most useless one and is not portable; it requires PhantomJS with many patches applied.

    • 19 hours ago
  • I have almost completed a Chrome extension for you. I will release it in a few days. Then I'll remind you.
    Anonymous

    Thank you!

    • 19 hours ago
  • Is there any way to donate directly to the site? I went to the donate page, and it took me to a charity, but I'd like to help out with costs incurred by what I imagine is a hefty increase in archiving over the past... oh, say 10 months
    Anonymous

    There is no way.
    I do not know how to arrange it.
    Accepting credit cards would require incorporation and accounting (this is costly, perhaps more costly than the expected amount of donations).
    Maybe Bitcoin is a solution?
    Also you can support WebCite here: https://fundrazr.com/campaigns/aQMp7

    • 2 days ago
  • can you make the long, full memento URL for an archived page more prominent and copyable? the URL that goes /20150626003414/ whatever
    Anonymous

    ok, done.

    If you have an idea for a better design (I do not like the present solution), please let me know.

    • 5 days ago
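For readers who want to work with the long URL form mentioned in the question above, here is a minimal sketch that pulls the timestamp segment out of such a URL. It assumes the 14-digit segment (as in /20150626003414/) is laid out as year, month, day, hour, minute, second; the function name `snapshot_time` and the example URLs are hypothetical and not part of the site's API.

```python
import re
from datetime import datetime

# Assumption: the long URL contains a 14-digit YYYYMMDDhhmmss path segment,
# as in the /20150626003414/ example quoted in the question.
TIMESTAMP_RE = re.compile(r"/(\d{14})/")


def snapshot_time(memento_url):
    """Return the snapshot time embedded in a long archive URL, or None."""
    match = TIMESTAMP_RE.search(memento_url)
    if match is None:
        return None  # e.g. a short URL without a timestamp segment
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


if __name__ == "__main__":
    # Hypothetical example URL built around the timestamp from the question.
    print(snapshot_time("http://archive.is/20150626003414/http://example.com/"))
```

A URL without such a segment simply yields `None`, so callers can distinguish short links from long ones.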
  • Why isn't the project open source?
    Anonymous

    The codebase is far from the point where the open/closed-source distinction even makes sense. It is not in the form of a re-deployable product: besides the HTML standards (and a big list of exceptions), the code reflects the specific hardware and network, which pages are popular, the behaviour of SEO bots and users (in order to tune the caching strategies), etc.
    It would be a lot of work to create a transferable archiver (whether open or closed source) which anyone could set up on their own premises.

    • 3 weeks ago
  • what the heck - this is awesome. Who is running this? Why? Who's paying? How much does it cost? Why isn't everyone using it? How do I know my stuff is not going to disappear one day? Can you make a chrome plugin so I can save archived pages -- just like you archive them, the best way -- into Google drive, ensuring that I have control of them?
    Anonymous

    If you have control over Google Drive, you must have control over the Chrome team as well; why do you ask me to make a Chrome plugin?

    • 3 weeks ago
  • You wrote that a page can be rearchived "At least 2 hours after the last submission". When I tried three and a half hours later, it failed, i.e. it delivered the previous snapshot. It would be great if you could fix this. It would also be great if you could fix the fact that when an archive operation returns a previous snapshot rather than making a new one as requested, there is NO ALERT to the user. If he does not spot the old time, he is left unaware that the output URL shows a different snapshot.
    Anonymous

    Can you provide more details about your URL (maybe by email)? Actually, there are some conditions under which the delay is much longer than 2 hours; for example, if the submitted URL ends with .png or .jpg. If that is not the case, then it must be a bug.

    As for the alert, it would be a good feature, thank you.

    • 3 weeks ago
  • I archived a page containing a video. I can't play the video. What did I do wrong? Many thanks. J*
    jtoile

    Videos do not work, sorry.

    • 4 weeks ago
  • Is there any way to force the site to save an updated version of a URL that has already been archived once?
    Anonymous

    Yes. At least 2 hours after the last submission. There used to be a 5-minute threshold; I increased it because of the many submissions performed by broken bots, and some URLs (with static content!) were submitted thousands of times, every 5 minutes. I understand that this is not convenient for those who would like to archive dynamic content, so I am looking for a better solution (perhaps a whitelist of sites which are updated quickly, such as Twitter).

    • 1 month ago
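The re-archiving rules described in the answers above (a 2-hour default, a much longer delay for URLs ending in .png or .jpg, and a possible whitelist of quickly updated sites) could be sketched roughly as follows. This is only an illustration: the one-day delay for images, the 5-minute delay for whitelisted sites, the whitelist contents, and all names are assumptions, not the site's actual code or configuration.

```python
from datetime import datetime, timedelta
from urllib.parse import urlparse

# All values are illustrative assumptions unless noted otherwise.
DEFAULT_DELAY = timedelta(hours=2)       # the threshold stated in the answer
STATIC_FILE_DELAY = timedelta(days=1)    # hypothetical; the blog only says "much longer than 2 hours"
FAST_SITE_DELAY = timedelta(minutes=5)   # hypothetical; reuses the old 5-minute threshold
FAST_SITES = {"twitter.com"}             # hypothetical whitelist of quickly updated sites
STATIC_SUFFIXES = (".png", ".jpg")


def min_delay(url):
    """Pick the minimum wait between two snapshots of the same URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Image files are treated as static content and get the longest delay.
    if parsed.path.lower().endswith(STATIC_SUFFIXES):
        return STATIC_FILE_DELAY
    if host in FAST_SITES:
        return FAST_SITE_DELAY
    return DEFAULT_DELAY


def should_rearchive(url, last_snapshot, now):
    """Return True if enough time has passed to take a fresh snapshot."""
    return now - last_snapshot >= min_delay(url)


if __name__ == "__main__":
    last = datetime(2015, 6, 26, 0, 34, 14)
    later = last + timedelta(hours=3, minutes=30)
    print(should_rearchive("http://example.com/page", last, later))
```

When `should_rearchive` returns False, a front end could show the alert requested earlier, telling the user that the previous snapshot is being returned.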
  • Some pages which load content only after the user scrolls the page do not get archived fully. For example, most of the articles on Salon.com
    Anonymous

    Do you mean “loading comments” below? Scrolling does not help here :-( I will investigate it.

    • 1 month ago