(X)HTML5 validation results for https://archive.ph/faq

Validator Input
  1. Error: Start tag seen without seeing a doctype first. Expected <!DOCTYPE html>.

    From line 1, column 1; to line 1, column 39

    <html style="background-color:#FFFAE1"><head>

  2. Error: Element head is missing a required instance of child element title.

    From line 1, column 103; to line 1, column 109

    archive"/></head><body>

    Content model for element head:
    If the document is an iframe srcdoc document or if title information is available from a higher-level protocol: Zero or more elements of metadata content, of which no more than one is a title element and no more than one is a base element.
    Otherwise: One or more elements of metadata content, of which exactly one is a title element and no more than one is a base element.
  3. Error: Bad value http://archive.is/FWVL#40% for attribute href on element a: Percentage ("%") is not followed by two hexadecimal digits.

    From line 116, column 115; to line 116, column 151

    r example <a href="http://archive.is/FWVL#40%">http:/

  4. Error: Bad value http://archive.is/[2A00:1450:400C:C00::69] for attribute href on element a: Illegal character in path segment: [ is not allowed.

    From line 139, column 5; to line 139, column 57

    <ul><li><a href="http://archive.is/[2A00:1450:400C:C00::69]">http:/

  5. Warning: This document appears to be written in English. Consider adding lang="en" (or variant) to the html start tag.

    From line 1, column 1; to line 1, column 39

    <html style="background-color:#FFFAE1"><head>

There were errors. (Tried in the text/html mode.)
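
Errors 3 and 4 both concern characters that may not appear literally in an href: a bare "%" in the fragment and "[" / "]" in a path segment. A minimal Python sketch of percent-encoding such hrefs follows. The helper name `encode_href` is hypothetical, it assumes an absolute URL containing "://", and it assumes the input is not already percent-encoded (an existing "%XX" sequence would be encoded a second time). Whether the archive server treats the encoded links the same as the originals is not confirmed by this report.

```python
from urllib.parse import quote

# Characters left unescaped in path segments and fragments
# (sub-delimiters plus ":" and "@"; the fragment also allows "/" and "?").
PATH_SAFE = "/:@!$&'()*+,;=-._~"
FRAGMENT_SAFE = "/?:@!$&'()*+,;=-._~"


def encode_href(href: str) -> str:
    """Percent-encode the path and fragment of an absolute URL (hypothetical helper)."""
    base, hash_sep, fragment = href.partition("#")
    scheme, _, rest = base.partition("://")
    host, slash, path = rest.partition("/")
    # "%" is not in the safe sets, so a bare "%" becomes "%25";
    # "[" and "]" become "%5B" and "%5D".
    path = quote(path, safe=PATH_SAFE)
    fragment = quote(fragment, safe=FRAGMENT_SAFE)
    return f"{scheme}://{host}{slash}{path}{hash_sep}{fragment}"


print(encode_href("http://archive.is/FWVL#40%"))
print(encode_href("http://archive.is/[2A00:1450:400C:C00::69]"))
```

Errors 1 and 2 and the warning are markup issues rather than URL issues: they call for a `<!DOCTYPE html>` line, a `title` element inside `head`, and a `lang` attribute on the `html` start tag.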

Image report

No images in the document.

Source

  1. <html style="background-color:#FFFAE1"><head><meta name="robots" content="noindex,nocache,noarchive"/></head><body><h1 id = "FAQ">FAQ</h1>
  2. <h2 id = "Which_parts_of_web_page_are_saved_">Which parts of web page are saved?</h2>
  3. <ol>
  4. <li>Textual content of the web page.</li>
  5. <li>Images.</li>
  6. <li>Content of the frames.</li>
  7. <li>Content and images loaded or generated by Javascript on Web 2.0 sites</li>
  8. <li>Screenshot of 1024x768 pixels.</li>
  9. </ol>
  10. <h2 id = "Which_parts_of_web_page_are_not_saved_">Which parts of web page are not saved?</h2>
  11. <ol>
  12. <li>Flash and content loaded by flash.</li>
  13. <li>Video and sounds. It makes no sense to archive youtube.com unless you want to archive the title of the video and comments. The video itself will not be saved.</li>
  14. <li>PDF</li>
  15. <li>RSS and other XML pages are not saved reliably. Most of them are not saved or are saved as a blank page.</li>
  16. </ol>
  17. <h2 id = "How_long_does_it_take_to_make_a_snapshot__">How long does it take to make a snapshot ?</h2>
  18. <p>About the same time as it takes to load the page in your browser.
  19. However, saving pages with heavy scripts or many ads may take up to a few minutes.
  20. There is a 5-minute timeout: if the page is not fully loaded within 5 minutes, the save is considered failed.
  21. This does not happen often, but it happens.</p>
  22. <h2 id = "It_there_limit_on_the_page_size__">Is there a limit on the page size?</h2>
  23. <p>The stored page with all images must be smaller than 50Mb</p>
  24. <h2 id = "What_software_do_you_run_and_how_data_is_stored__">What software do you run and how is data stored?</h2>
  25. <p>The archive runs Apache Hadoop and Apache Accumulo.
  26. All data is stored on HDFS, textual content is duplicated 3 times among servers in different datacenters and images are duplicated 2 times.
  27. All datacenters are in Europe.</p>
  28. <h2 id = "How_long_the_page_will_be_stored__">How long will the page be stored?</h2>
  29. <p>Virtually forever.
  30. We have a lot of free space and although the archive grows with time, the storage and bandwidth get cheaper.</p>
  31. <h2 id = "Do_you_delete_my_stored_page_s___">Do you delete my stored page(s) ?</h2>
  32. <p>Pages which violate our hoster's rules (cracks, porn, etc) may be deleted.
  33. Also, completely empty pages (or pages which have nothing but text like &ldquo;502 Server Timeout&rdquo;) may be deleted.</p>
  34. <h2 id = "How_is_the_archive_funded_">How is the archive funded?</h2>
  35. <p>It is privately funded; there are no complex finances behind it.
  36. It may look more or less reliable compared to startup-style funding or a university project, depending on which risks are taken into account.</p>
  37. <h2 id = "Will_advertising_appear_on_the_archive_one_day__">Will advertising appear on the archive one day ?</h2>
  38. <p>I cannot make a promise that it will not.
  39. With the current growth rate I am able to keep the archive free of ads.
  40. Well, I can promise it will have no ads at least till the end of 2014.</p>
  41. <h2 id = "How_to_refer_to_the_saved_page__">How to refer to the saved page ?</h2>
  42. <p>Each page has a short URL http://archive.is/XXXXX, where XXXXX is the unique identifier of the page.
  43. A page can also be referred to with URLs like</p>
  44. <ul>
  45. <li><a href="http://archive.is/2013/http://www.google.de/">http://archive.is/<strong>2013</strong>/http://www.google.de/</a> - the newest snapshot in year 2013.</li>
  46. <li><a href="http://archive.is/201301/http://www.google.de/">http://archive.is/<strong>201301</strong>/http://www.google.de/</a> - the newest snapshot in January 2013.</li>
  47. <li><a href="http://archive.is/20130101/http://www.google.de/">http://archive.is/<strong>20130101</strong>/http://www.google.de/</a> - the newest snapshot within the day of 1st January 2013.</li>
  48. </ul>
  49. <p>The date can be extended further with hours, minutes and seconds:</p>
  50. <ul>
  51. <li><a href="http://archive.is/2013010103/http://www.google.de/">http://archive.is/<strong>2013010103</strong>/http://www.google.de/</a></li>
  52. <li><a href="http://archive.is/201301010313/http://www.google.de/">http://archive.is/<strong>201301010313</strong>/http://www.google.de/</a></li>
  53. <li><a href="http://archive.is/20130101031355/http://www.google.de/">http://archive.is/<strong>20130101031355</strong>/http://www.google.de/</a></li>
  54. </ul>
  55. <p>Year, month, day, hours, minutes and seconds can be separated with dots, dashes or colons to increase readability:</p>
  56. <ul>
  57. <li><a href="http://archive.is/2013-04-17/http://blog.bo.lt/">http://archive.is/<strong>2013-04-17</strong>/http://blog.bo.lt/</a></li>
  58. <li><a href="http://archive.is/2013.04.17-12:08:20/http://blog.bo.lt/">http://archive.is/<strong>2013.04.17-12:08:20</strong>/http://blog.bo.lt/</a></li>
  59. </ul>
  60. <p>It is also possible to refer to all snapshots of a given URL</p>
  61. <ul>
  62. <li><a href="http://archive.is/http://www.google.de/">http://archive.is/http://www.google.de/</a></li>
  63. </ul>
  64. <p>All saved pages from the domain</p>
  65. <ul>
  66. <li><a href="http://archive.is/www.google.de">http://archive.is/www.google.de</a></li>
  67. </ul>
  68. <p>All saved pages from all the subdomains</p>
  69. <ul>
  70. <li><a href="http://archive.is/&#42;.google.de">http://archive.is/*.google.de</a></li>
  71. </ul>
  72. <h2 id = "Is_there_a_way_to_link_to_the_most_recent_archive_of_an_article_by_including_the_URL_in_an_archive__is_link_">Is there a way to link to the most recent archive of an article by including the URL in an archive.is link?</h2>
  73. <p>Yes.</p>
  74. <p><a href="http://archive.is/newest/http://reddit.com/">http://archive.is/newest/http://reddit.com/</a>
  75. There is also <a href="http://archive.is/oldest/http://reddit.com/">http://archive.is/oldest/http://reddit.com/</a></p>
  76. <h2 id = "How_to_refer_to_exact_part_of_a_long_page__">How to refer to an exact part of a long page?</h2>
  77. <p>There are two options:</p>
  78. <ul>
  79. <li><p>add hashtag with the scroll position as a number between 0 (top of the page) and 100 (bottom). For example <a href="http://archive.is/FWVL#40%">http://archive.is/FWVL#40%</a></p></li>
  80. <li><p>select some text on the page and get URL with hashtag referring to the selection. For example <a href="http://archive.is/FWVL#selection-1493.0-1493.53">http://archive.is/FWVL#selection-1493.0-1493.53</a></p></li>
  81. </ul>
  82. <h2 id = "Does_it_support_any_API__">Does it support any API ?</h2>
  83. <p>archive.is supports MementoWeb API. More info can be found <a href="http://mementoweb.org/depot/native/archiveis/">here</a></p>
  84. <h2 id = "Can_I_have_an_account_to_manage_my_bookmarks__">Can I have an account to manage my bookmarks ?</h2>
  85. <p>No.
  86. But you can keep bookmarks to archived pages in one of the existing bookmark managers, like <a href="https://delicious.com/">Delicious</a>, <a href="http://www.google.com/bookmarks">Google Bookmarks</a>, …</p>
  87. <h2 id = "Why_does_archive_is_not_obey_robots_txt_">Why does archive.is not obey robots.txt?</h2>
  88. <p>Because it is not a free-walking crawler; it saves only one page, acting as a direct agent of the human user.
  89. Such services don't obey robots.txt (e.g. <a href="https://support.google.com/webmasters/answer/178852#robots">Google Feedfetcher</a>, screenshot- or pdf-making services, isup.me, …)</p>
  90. <h2 id = "Is_IPv6_supported__">Is IPv6 supported ?</h2>
  91. <p>Yes.</p>
  92. <ul>
  93. <li><a href="http://archive.is/[2A00:1450:400C:C00::69]">http://archive.is/[2A00:1450:400C:C00::69]</a></li>
  94. <li><a href="http://archive.is/ipv6.google.com">http://archive.is/ipv6.google.com</a></li>
  95. </ul>
  96. <h2 id = "Are_domains_with_national_characters_supported__">Are domains with national characters supported ?</h2>
  97. <p>Yes.</p>
  98. <ul>
  99. <li><a href="http://archive.is/www.maroñas.com.uy">http://archive.is/www.maroñas.com.uy</a></li>
  100. <li><a href="http://archive.is/&#42;.测试">http://archive.is/*.测试</a></li>
  101. </ul>
  102. <h2 id = "Do_you_preserve_archivers__privacy__E_g__not_disclose_the_source_IP_address_">Do you preserve archivers' privacy? E.g. not disclose the source IP address?</h2>
  103. <p>Yes.</p>
  104. <p>But keep in mind that when you archive a page, your IP address is sent to the website you archive, as though you were using a proxy (in the X-Forwarded-For header). This feature allows websites (e.g. shops or weather forecast sites) to target your region, not mine.</p>
  105. <h2 id = "I_found_incorrect_inaccurate_obsolete_informartion__Can_I_request_it_to_be_altered_or_deleted_">I found incorrect/inaccurate/obsolete information. Can I request it to be altered or deleted?</h2>
  106. <p>The archive is not a news agency nor an authoritative source of reference information.
  107. It merely certifies that at the given point of time there was a page on the web.
  108. The page might well contain a fairy tale and despite &ldquo;One day Little Red Riding Hood goes to visit her granny&rdquo; being a false statement it is not the reason to burn the books.
  109. Note that weather forecasts on the archived pages are outdated as well.</p>
  110. <h2 id = "My_question_is_not_here_">My question is not here!</h2>
  111. <p>More questions and answers: <a href="http://blog.archive.is/archive">http://blog.archive.is/archive</a></p></body></html>
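
The FAQ text in the source listing above describes snapshot URLs that carry a date prefix of varying precision, from a year down to seconds. A minimal Python sketch of building such URLs follows. The function name `snapshot_url` and the `precision` parameter are hypothetical, only the undelimited digit forms shown in the FAQ are produced, and the sketch builds the URL string without checking that a snapshot exists.

```python
from datetime import datetime

# strftime patterns for each date precision listed in the FAQ examples.
PRECISION_FORMATS = {
    "year": "%Y",
    "month": "%Y%m",
    "day": "%Y%m%d",
    "hour": "%Y%m%d%H",
    "minute": "%Y%m%d%H%M",
    "second": "%Y%m%d%H%M%S",
}


def snapshot_url(target: str, when: datetime, precision: str = "day") -> str:
    """Build a date-prefixed archive.is URL for `target` (hypothetical helper)."""
    prefix = when.strftime(PRECISION_FORMATS[precision])
    return f"http://archive.is/{prefix}/{target}"


when = datetime(2013, 1, 1, 3, 13, 55)
for precision in PRECISION_FORMATS:
    print(snapshot_url("http://www.google.de/", when, precision))
```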

Used the HTML parser. Externally specified character encoding was utf-8.

Total execution time 268 milliseconds.


Version: 22.3.8