Backup Tips

From Archiveteam

Personal digital archiving is all the rage nowadays. This article will give you a basic overview of why you should do it and how.

How Is Data Lost?

Here's a short list of the ways to lose data:

  • Disk failure
  • Software failure
  • Malicious software
  • Natural disaster
  • Clumsy user
  • Accidental deletion
  • Accidental overwriting
  • Cat hair
  • Refrigerator magnets
  • Solar radiation
  • You forgot where you put it
  • Your parents/roommate/spouse moved it and didn't tell you
  • The feds paid a surprise visit to you/your storage provider
  • Your storage provider went under/got bought/got bored

Any one of these can erase decades of data in a second. The goal of good backups is to contain the damage any one of these can cause, ideally to nearly zero. The ways to lose data can be summarized into these categories:

  1. Operational - drives wear down, software writes garbage, user error
  2. Environmental - building catches fire, hurricane knocks your house over
  3. Access - you lose track of data, or lose ability to get to it

Thus, a good backup plan is resilient against each of these types of failure. We'll use Michael Ashenfelder's four-step process as a model.

Identify/Decide

Before starting any sort of backup plan, it helps to identify what you're saving. Terabytes save differently than megabytes; knowing which you plan on saving can save money and grief.

One way to get started is to envision the following scenarios (which also serve as excellent fire drills):

  • You get a call from your lawyer, telling you that someone opened a suit against you. Your lawyer says that it's easy to take care of, but it requires as much documentation as possible to build your case. This encompasses things like financial data (Quicken, Excel, etc.), legal documentation, and the like.
  • You get a call from a client/boss, saying that the Big Project needs some crucial information from some old work of yours to save it. They don't have a specific date, but they'll sift through everything more than one year old to find it. This encompasses anything you do for work, be it media projects, code, reports, etc. Essentially, stuff your livelihood is based on.
  • You learn the hard way that nobody keeps backups of people. Your next of kin go through your effects, and come across your digital data. Here, they'll find personal things - photographs/videos, special emails, etc. - essentially, stuff that's meaningful to you.
  • You have to move to a developing country for a short time - not long enough to think long-term, but long enough that you'll want some amenities. Due to arcane customs laws, you can only bring one small hard drive into the country. These are things that you really like and would rather not lose, but don't fall into the above. Things like contact information, game saves, hard-to-replace data, favorite porn, etc. Think of this as a catch-all.

Here are things you probably shouldn't save:

  • Program and system files. Unless you run a high-reliability business server, there's little need to have a ready copy of explorer.exe. If you have the install discs handy, then there's no real reason to back these up. Note that software published by very small houses (music software comes to mind) can be hard to track down later - it may be prudent to archive these applications.

That being said, remember that storage is cheap, but your data is priceless. When in doubt, save it - the cost of doing so is nearly zero, and the cost of losing it is not.

Organize

Make sure you assess all possible data sources when deciding what to back up. There's nothing more embarrassing than losing your vacation photos because you didn't copy them off your phone before pitching it. If you have anything stored remotely - on a webhost, in an email server, in the cloud - copy it locally! Odds are good that the service won't be there when you need it most.

When backing up, it helps to keep everything together in one large archive. This solves a number of problems:

  • You won't forget where you put that Really Important Data - don't be like Jordan Mechner! - because it's all in one place.
  • It's easier to keep one big archive reliable than many small ones (think economy of scale)
  • Buying a few big hard drives is cheaper than buying many small ones (and they tend to be more reliable)

Save Copies

The goal here is to mitigate the damage caused by sudden catastrophic data loss, so that your valuable data (from above) is kept safe.

Scheme

First, buy some hard drives. Mechanical (traditional) drives are the cheapest per unit of storage, and their longevity and flaws are well documented. For purposes of personal archiving, consumer drives are sufficient, so long as you avoid a disastrously bad product line. 1-1.5TB drives are roomy and cheap, and are recommended.

A basic backup scheme may look like this:

  1. Primary storage (your PC/phone/tablet/etc) - changes constantly as you use it
  2. Secondary local storage (a hard drive in a closet) - changes once every 2-3 weeks
  3. Secondary offsite storage (a hard drive in a safety deposit box) - changes once or twice a year [optional but highly recommended]

This scheme provides resilience against most common failures: if one drive dies, there are two backups; if you delete something, you have two copies; if your house floods, you have one. So long as you are vigilant, the chance of total data loss is negligible, even in case of total disaster. You may wish to add more drives to each area in accordance with your paranoia.

Keeping backup cycles up is important for both the longevity of the hardware and security of your data. Not only does it allow you to keep your data current, but it can show early signs of hardware failure as you read/write to the disks.
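As a rough illustration of keeping cycles up, the sketch below flags any tier of the scheme above whose last backup is too old. The tier names, the `overdue_tiers` function, and the thresholds are assumptions made for this example: 21 days and 365 days are the loosest readings of "every 2-3 weeks" and "once or twice a year".

```python
from datetime import date, timedelta

# Assumed thresholds: the loosest reading of the intervals in the scheme above.
MAX_AGE = {
    "local": timedelta(days=21),     # "once every 2-3 weeks"
    "offsite": timedelta(days=365),  # "once or twice a year"
}


def overdue_tiers(last_backup, today=None):
    """Return the tiers whose latest backup is missing or older than its limit.

    last_backup maps a tier name to the date of its most recent backup.
    """
    today = today or date.today()
    return sorted(
        tier
        for tier, limit in MAX_AGE.items()
        if tier not in last_backup or today - last_backup[tier] > limit
    )
```

Calling `overdue_tiers({"local": date(2023, 5, 1)}, today=date(2023, 6, 1))` reports both tiers: the local copy is 31 days old, and no offsite backup is recorded at all.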

Why Not the Cloud?

You may be thinking "why not use cloud backups as offsite storage?" The answer is: you can, but it's risky, and you should only use it to supplement an already solid scheme.

The cloud offers many seductive features, such as high disk reliability, easy access, and cheap storage. However, there's a hidden cost: by using cloud storage, you lose control of your data. By trusting a storage provider with your data, you trust them to be there tomorrow. This has proven to be a very risky agreement, as cloud storage providers tend not to be long-lived, and those that fall don't give a damn about your data loss. The provider may lose interest (MobileMe), close shop (see the "Dead as a doornail" list on Deathwatch), or have a surprise party thrown by the Department of Justice (MegaUpload).

In short, cloud storage is a lot like real clouds - insubstantial, fleeting, and really bad to build on. This isn't to say it's useless - cloud storage can play a useful role in a storage scheme due to easy access - but it's not something you should trust your backups to. Put another way, Archive Team wouldn't be working overtime to save dying clouds if they were reliable long-term.

Doing It

Here's a quick, dirty, and platform-agnostic backup method:

  1. Connect the external drive to your computer.
  2. In the root of the external hard drive, create a folder called 'backups'.
  3. In the 'backups' folder, create another folder named after your computer's name (e.g. "POSEIDON").
  4. In this new folder, create another folder of today's date in the format of YYYY-MM-DD (e.g. 2023-01-06). This format ensures that Windows will properly sort the folders in date order.
  5. Copy everything you selected (if you're extra-thorough, everything from your C: drive) into this date folder. A program like TeraCopy is exceedingly useful, as it supports copy verification, pause/resume and most importantly, won't randomly die if it runs into any problems.
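The steps above can be sketched in a few lines of Python. This is a minimal sketch, not a replacement for a dedicated copy tool: the function names `dated_backup_dir` and `run_backup` are made up for this example, the computer name defaults to the machine's hostname, and the copy does not verify what it wrote.

```python
import shutil
import socket
from datetime import date
from pathlib import Path


def dated_backup_dir(drive_root, computer_name=None, today=None):
    """Build <drive_root>/backups/<COMPUTER>/<YYYY-MM-DD> (steps 2-4)."""
    name = computer_name or socket.gethostname().upper()
    day = today or date.today()
    # isoformat() zero-pads month and day, so folder names sort in date order.
    return Path(drive_root) / "backups" / name / day.isoformat()


def run_backup(sources, drive_root, computer_name=None, today=None):
    """Copy each selected file or folder into today's dated folder (step 5)."""
    target = dated_backup_dir(drive_root, computer_name, today)
    target.mkdir(parents=True, exist_ok=True)
    for src in map(Path, sources):
        dest = target / src.name
        if src.is_dir():
            # dirs_exist_ok (Python 3.8+) lets an interrupted run be repeated.
            shutil.copytree(src, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dest)  # copy2 also preserves timestamps
    return target
```

For example, `run_backup(["C:/Users/me/Documents"], "E:/", computer_name="POSEIDON")` would fill `E:/backups/POSEIDON/<today's date>/Documents`; both paths here are placeholders.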

You'll probably have space left over (unless you do a lot of media editing), so you can repeat this once per backup cycle on the same drive. This gives you some extra peace of mind in case one cycle's backup is corrupted, as you can use the next most recent one.
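One way to tell whether a cycle's backup is corrupted is to compare checksums against the source, then fall back to an older cycle if needed. The sketch below assumes the folder layout from the steps above; `sha256_of`, `mismatched_files`, and `cycles_newest_first` are names invented for this example, and SHA-256 is simply one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_files(source_dir, backup_dir):
    """Return relative paths that are missing from the backup or differ in content."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    bad = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            bad.append(rel)
    return bad


def cycles_newest_first(machine_dir):
    """List the YYYY-MM-DD folders, newest first; these names sort correctly as text."""
    dirs = [p for p in Path(machine_dir).iterdir() if p.is_dir()]
    return sorted(dirs, key=lambda p: p.name, reverse=True)
```

Note that this only compares a backup with the source as it is now, so files you have edited since the backup will also show up as mismatches.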

The 3-2-1 Backup Strategy

This rule of thumb can greatly reduce the likelihood of data loss:

  • Keep a total of at least 3 copies of your data,
  • at least 2 of which are on different media formats, and
  • at least 1 of which is off-site.

The keyword is "at least." The more copies of your data you have, the lower the chance of data loss.
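The rule is simple enough to check mechanically. The sketch below is only an illustration: the `Copy` record, its field names, and the media labels are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    label: str     # a human-readable name for this copy
    medium: str    # hypothetical labels such as "hdd", "optical", "tape"
    offsite: bool  # True if the copy lives somewhere other than your home


def satisfies_3_2_1(copies):
    """True with at least 3 copies, at least 2 distinct media, and at least 1 off-site."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)
```

Under this check, the basic scheme described earlier (a PC, a drive in a closet, and a drive in a safety deposit box) passes only if one of the copies sits on a different kind of medium; three hard drives give 3 copies and 1 off-site copy, but only 1 medium.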

Conclusion

Congrats! You're now resistant against catastrophic data loss. This is only the beginning of good archiving - vigilance is the watchword of digital archiving. Keep an eye on disk health, run through fire drills (either literally or figuratively), and stay consistent with backups.

As the old joke goes, there are two kinds of people in the world: those that keep backups, and those that haven't lost data yet. Don't let it happen to you.
