I used to work at Tumblr. The entirety of their user content is stored in a single multi-petabyte AWS S3 bucket, in a single AWS account, with no backup, no MFA delete, and no object versioning. It is all one fat finger away from oblivion.
> I used to work at Tumblr. The entirety of their user content is stored in a single multi-petabyte AWS S3 bucket, in a single AWS account, with no backup, no MFA delete, and no object versioning. It is all one fat finger away from oblivion.
Remember when Microsoft lost all of the data for their Sidekick users? Basically they were upgrading their SAN and things went badly.
What the hell. It is so easy to configure multi-region Glacier backups, MFA delete, etc. for a single S3 bucket. It took me a couple of hours to set up versioning and backups, and a few days to set up MFA for admin actions. Why would they not set this stuff up?
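For anyone following along, here is a rough sketch of the versioning and MFA delete part in Python. It assumes boto3, whose S3 client has a `put_bucket_versioning` call. The bucket name, MFA serial, and helper names (`versioning_request`, `apply`) are placeholders I made up, and as far as I know AWS only lets the root account turn MFA delete on. Not a hardened setup script, just the shape of the request.

```python
from typing import Optional


def versioning_request(bucket: str,
                       mfa_serial: Optional[str] = None,
                       mfa_code: Optional[str] = None) -> dict:
    """Build keyword arguments for S3's put_bucket_versioning call."""
    config = {"Status": "Enabled"}
    request = {"Bucket": bucket, "VersioningConfiguration": config}
    if mfa_serial and mfa_code:
        config["MFADelete"] = "Enabled"
        # S3 expects the device serial and the current code separated by a space.
        request["MFA"] = f"{mfa_serial} {mfa_code}"
    return request


def apply(request: dict) -> None:
    """Send the request with boto3 (needs boto3 installed and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it
    boto3.client("s3").put_bucket_versioning(**request)


if __name__ == "__main__":
    # "example-bucket" is a placeholder, not a real bucket name.
    print(versioning_request("example-bucket"))
```

Building the request separately from sending it means you can eyeball exactly what will be sent before touching a production bucket.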
The key words you probably need to look at are "multi-petabyte". Not saying they shouldn't be doing something but it all costs - and at multi-petabytes, it cooooosts
For 1 petabyte (and they have multiple):
S3 - $30,000 a month, $360,000 a year
S3 reduced redundancy - $24,000 a month, $288,000 a year
S3 infrequent access - $13,100 a month, $157,000 a year
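To make the arithmetic easy to redo for "multiple" petabytes, here is a small Python sketch. The monthly per-petabyte prices are the ones quoted above. The petabyte count of 2 in the demo is only an illustration, not a known figure for Tumblr, and `yearly_cost` is a name I picked. Note that 12 x $13,100 is $157,200, which the list above rounds to $157,000.

```python
# Monthly price per petabyte in USD, as quoted in the comment above.
MONTHLY_PER_PB = {
    "S3": 30_000,
    "S3 reduced redundancy": 24_000,
    "S3 infrequent access": 13_100,
}


def yearly_cost(monthly_per_pb: int, petabytes: int = 1) -> int:
    """Yearly storage cost in USD for a given number of petabytes."""
    return monthly_per_pb * 12 * petabytes


if __name__ == "__main__":
    petabytes = 2  # illustrative assumption, not a real figure
    for tier, monthly in MONTHLY_PER_PB.items():
        print(f"{tier}: ${yearly_cost(monthly, petabytes):,} a year for {petabytes} PB")
```

This covers storage only. Transit, requests, and CDN are extra.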
Add in transit and CDN, and Tumblr’s AWS bill was seven figures a month. A bunch of us wanted to build something like Facebook’s Haystack and do away with S3 altogether, but the idea kept getting killed because of concerns over all the places the S3 URLs were hard-coded, and over breaking third-party links to content in the bucket (for years you could link to the bucket directly, and you still can for content more than a couple of years old).
Well, the business was acquired for $500,000,000, and a single employee probably costs about as much as backing up two petabytes of data on Glacier for a year.
They could also always use tapes for something as critical as the data that is the lifeblood of your business.
Imagine if Facebook lost everyone's contact lists. How bad would that be for their business? Backups are cheap insurance.
Backups are still a hard sell for management, though. No matter how many companies die a quick and painful death when they lose too much business critical data, the bossmen just can't wrap their heads around spending $100k for what they perceive as no benefit.
Same problems with buying things like antivirus software or even IT management utilities; when they're doing their job, there's no perceivable difference. It's only when shit goes sideways that the value is demonstrated.
Hell, you could take this a step further for IT as a whole: if IT is doing their job well, they're invisible. Then management cans the entire department, outsources to offsite support, and the business starts hemorrhaging employees and revenue because nobody can get anything done.
>No matter how many companies die a quick and painful death when they lose too much business critical data, the bossmen just can't wrap their heads around spending $100k for what they perceive as no benefit.
Yeah, but what exactly IS the benefit? The business doesn't die if something really bad happens? Is that really important though?
Consider the two alternatives:
1) The business spends $x00k/year on backups. IF something happens, they're saved, and business continues as normal. However, this money comes out of their bottom line, making them less profitable.
2) The business doesn't bother with backups, and has more profit. The management can get bigger bonuses. But IF something bad happens, the company goes under, but then what happens to the managers who made these decisions? They just go on to another job at another company, right?
Devil's advocate: it depends on how many petabytes you have. This cloud of uncertainty over your uploads could be seen as the hidden cost of using a free platform.
It will probably cost more to connect all these drives to some sort of a server. Though 125 is within the realm of what a single USB host controller should be able to handle (127 devices per controller).
My experience with Tumblr was generally that a large part of the content, especially larger media content like videos, failed to load most of the time. Makes me wonder if that's related ...
Picasso (supposedly) drew on a napkin, and Banksy draws on derelict walls or sticks his work through a shredder. The medium doesn’t need to be lasting. Edit: The potentially short-lived medium was chosen by the above artists. Tumblr users may not be too happy if their work is lost.
Banksy's walls are sold, though, and he is still kind of the exception because of his art format. Not everything needs to be lasting, but 100% temporary art is not common.
How many of them do you think would be willing to pay some small monthly fee? I'm guessing most of them think their work is worth at least $5/month, right? Maybe Tumblr should become a paid service and ditch the advertising model. That way they could be more relaxed about what types of content they are willing to host.
That's basically what happened with S3 a couple years back. Mistyped command caused an outage for large parts of the internet in the US. Now, I dunno if they could make a big enough mistake that would bring down the whole company, but certainly it's been proven that a single mistake can affect major portions of the internet.
> experienced code reviewers verifying change sets using sophisticated deployment infrastructure targeting physical hardware spread out across one or more data centers in each availability zone
but the availability numbers speak for themselves :/
There was an S3 sync client that some people used that ran:
aws s3 sync --delete ./ s3://your-bucket/
The delete flag was added by just a very innocuous checkbox in the UI. The result is that it removes anything not in the source directory. Kaboom. Everything's gone. The point is you have no idea what stuff is going to do even if you think it's obvious.
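A toy model of that behavior in Python, in case it isn't obvious why an empty or wrong source directory is fatal. This is not the AWS CLI's actual code, just the set difference it boils down to as described above. The function name and the sample keys are made up for illustration.

```python
def keys_removed_by_sync_delete(source_keys, bucket_keys):
    """Keys a sync with --delete would remove: in the bucket, not in the source."""
    return sorted(set(bucket_keys) - set(source_keys))


if __name__ == "__main__":
    bucket = ["a.jpg", "b.jpg", "c.gif"]  # hypothetical bucket contents

    # Source directory holds some of the files: only the extras are removed.
    print(keys_removed_by_sync_delete(["a.jpg", "b.jpg"], bucket))

    # Run from the wrong (empty) directory: every key in the bucket is removed.
    print(keys_removed_by_sync_delete([], bucket))
```

The real `aws s3 sync` command also accepts `--dryrun`, which prints what it would do without doing it. That is worth running first whenever `--delete` is involved.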
Have you tried this? It takes forever to clean out a bucket. At the scale we're talking about, doing this on a single thread from the CLI tool means you could go home and come back the next day and cancel it then, and you still wouldn't have made a particularly big dent in the bucket. It's really a pain in the neck to delete a whole bucket full of data when you actually want to. It's "easy" to start off a recursive delete, sure, but I think you're overestimating the "kaboom" factor.
Tumblr rejected all things Yahoo, except the money, so the answer to just about anything Yahoo asked was either “no”, “get stuffed”, or silence and a note to David that he needed to escalate to Marissa.
On the other side, the Yahoo services were so heavily integrated that it was hard to carve out any piece of them, and the few times we tried it was a slow and painful process, because Yahoo’s piece was glitchy and unreliable outside of its home turf and the Tumblr engineers were defensive and argumentative about everything and not willing to help.
That's exactly how I imagined Tumblr's design and development, based on my multiple unsuccessful attempts, over the years, to find any useful navigation between blogs, or the function of reading comments.
> When was this? Being owned by Yahoo, I am surprised they don't use NetApp.
Dell used to offer an online backup service. It wasn't even running on Dell equipment!
Basically they acquired a company that offered the service, and while it would be "nice" if a Dell company ran on Dell gear, a lot of the time it's simply impractical/expensive to overhaul things.
I do this too with my data on a smaller scale, but I'm surprised Tumblr does this, because even with only a few million files, S3 buckets that big are awkward to work with.