Hacker News | bgentry's comments

As somebody whose first day working at Heroku was the day this acquisition closed, I think it’s mostly a misconception to blame Salesforce for Heroku’s stagnation and eventual irrelevance. Salesforce gave Heroku a ton of funding to build out a vision that was way ahead of its time: Docker didn’t come out until 2013, and AWS didn’t even have multiple regions when Heroku was built. They mostly served as an investor and left us alone to do our thing, or so it seemed those first couple of years.

The launch of the multi-language Cedar runtime in 2011 led to incredible growth, and by 2012 we were drowning in tech debt and scaling challenges. Despite more than tripling our headcount in that first year (~20 to 74), we could not keep up.

Mid 2012 was especially bad as we were severely impacted by two us-east-1 outages just 2 weeks apart. To the extent it wasn’t already, reliability and paying down tech debt became the main focus and I think we went about 18 months between major user-facing platform launches (Europe region and eventually larger sized dynos being the biggest things we eventually shipped after that drought). The organization lost its ability to ship significant changes or maybe never really had that ability at scale.

That time coincided with the founders taking a step back, leaving a loss of leadership and vision that was filled by people more concerned with process than results. I left in 2014 and at that time it already seemed clear to me that the product was basically stalled.

I’m not sure how much of this could have been done better, even in hindsight. In theory Salesforce could have taken a more hands-on approach early on, but I don’t think that would have ended better. Heroku was so far from profitability in late 2010 that it could not have stayed independent without raising more funding, and the venture market in ~2010 was much smaller than a few years later, with tiny rounds and low valuations. Had the company spent its pre-acquisition engineering cycles building for scalability and reliability at the expense of product velocity, it probably would never have become successful.

Even still, it was the most amazing professional experience of my career, full of brilliant and passionate people, and it’s sad to see it end this way.


It remains the greatest engineering team I've ever seen or had the pleasure to be a part of. I was only there from early 2011 to mid 2012, but what I took with me changed me as an engineer. The sheer brilliance of the ideas and approaches... I was blessed to witness it. I don't think I can overstate it, though many will think this is all hyperbole. I didn't always agree with the decisions made, and I was definitely there when the product stagnation started, but we worked hard to reduce tech debt, build better infrastructure, and improve... but man, the battles we fought. Many fond memories, including the single greatest engineering mistake I've ever seen made, one that I still think about to this day (but will never post in a public forum :)).

It was a pleasure working with you bgentry!


Thanks for sharing your story. Those early days of using Heroku were really enjoyable for me. It felt so promising and simple. I remember explaining the concept to a lot of different people who didn't believe that the deployment model could be that simple and accessible until I showed them.

Then life went on, I bounced around in my career, and I forgot about Heroku. Years later I suggested it to someone for a simple project, and I could practically feel the other developers in the room losing respect for me for even mentioning it. I hadn't looked at it for so long that I didn't realize it had fallen out of favor.

> That time coincided with the founders taking a step back, leaving a loss of leadership and vision that was filled by people more concerned with process than results

This feels all too familiar. All of my enjoyable startup experiences have ended this way: the fast-growing, successful startup starts attracting people who look like they should be valuable assets to the company, but they're more interested in things like policies, processes, meetings, and making the status reports look nice than in actually shipping and building.


The cancer of corporations: bureaucracy.

As far as the Salesforce acquisition goes, I'd be curious to see who made the decision to put Heroku into maintenance only mode.

I worked for a different part of Salesforce. I don't really feel like Salesforce did a ton of meddling in any of its bigger acquisitions other than maybe Tableau. I think the biggest missed opportunity was potentially creating a more unified experience between all of its subsidiaries. Though, it's hard to call that a failure since they're making tons of money.

It could be a case of post-founder leadership seeing that there's not a lot of room for growth and giving up. That happens a lot in the tech industry.


Thanks for a capsule tour through Heroku!

FWIW, the team that eventually created "Docker" was working at the same time on dotCloud as a direct Heroku competitor. I remember meeting them at a meet-up in the old Twitter building but couldn't tell you exactly which year that was. Maybe 2010 or 2011?


Yep, that team did great work. I remember having lunch at the Heroku office with the dotCloud team in 2011 or 2012 and also Solomon Hykes demoing Docker to us in our office’s basement before it launched. So much cool stuff was happening back then!

I remember Solomon's demoing at All Hands... it was such a crazy time for tech and innovation.

Oh hey Curt!! Remember, you are not a monitoring system :)

You'll need to unlock your iPhone first. Even though you're staring at the screen and just asked me to do something, and you saw the unlocked icon at the top of your screen before/while triggering me, please continue staring at this message for at least 5 seconds before I actually attempt FaceID to unlock your phone to do what you asked.

This is largely because LISTEN/NOTIFY has an implementation that uses a global lock. At high volume this obviously breaks down: https://www.recall.ai/blog/postgres-listen-notify-does-not-s...

None of that means Oban or similar queues don't/can't scale—it just means a high volume of NOTIFY doesn't scale, hence the alternative notifiers and the fact that most of its job processing doesn't depend on notifications at all.

There are other reasons Oban recommends a different notifier per the doc link above:

> That keeps notifications out of the db, reduces total queries, and allows larger messages, with the tradeoff that notifications from within a database transaction may be sent even if the transaction is rolled back


> None of that means Oban or similar queues don't/can't scale—it just means a high volume of NOTIFY doesn't scale

Given the context of this post, it really does mean the same thing though?


No, I don't think so. Oban does not rely on a large volume of NOTIFY in order to process a large volume of jobs. The insert notifications are simply a latency optimization for lower volume environments, and for inserts can be fully disabled such that they're mainly used for control flow (canceling jobs, pausing queues, etc) and gossip among workers.

River for example also uses LISTEN/NOTIFY for some stuff, but we definitely do not emit a NOTIFY for every single job that's inserted; instead there's a debouncing setup where each client notifies at most once per fetch period, and you don't need notifications at all in order to process with extremely high throughput.

In short, the fact that high volume NOTIFY is a bottleneck does not mean these systems cannot scale, because they do not rely on a high volume of NOTIFY or even require it at all.
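For illustration, the debounce pattern described above can be sketched in a few lines. This is a hypothetical pure-Python model of the idea, not River's actual Go implementation; `DebouncedNotifier` and all its names are made up:

```python
import time

class DebouncedNotifier:
    """Coalesce many 'job inserted' events into at most one
    notification per fetch period, so NOTIFY volume stays bounded
    no matter how many jobs are inserted."""

    def __init__(self, period_s, send):
        self.period_s = period_s
        self.send = send  # e.g. a function that issues a single NOTIFY
        self._last_sent = float("-inf")

    def job_inserted(self):
        now = time.monotonic()
        if now - self._last_sent >= self.period_s:
            self._last_sent = now
            self.send()

sent = []
notifier = DebouncedNotifier(period_s=60.0, send=lambda: sent.append(1))
for _ in range(10_000):
    notifier.job_inserted()  # 10k inserts arrive well within one period

print(len(sent))  # 1 -- only the first insert actually notified
```

With this shape, NOTIFY traffic scales with the number of clients and the fetch period, not with job throughput.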


Does River without any extra configuration run into scaling issues at a certain point? If the answer is yes, then River doesn’t scale without optimization (Redis/Clustering in Oban’s case).

While the root cause might not be River/Oban, them not being scalable still holds true. It’s of extra importance given the context of this post is moving away from redis and to strictly a database for a queue system.


Yup. I wasn’t talking about notify in particular but about using Postgres in general.


Yeah, River generally recommends this pattern as well (River co-author here :)

To get the benefits of transactional enqueueing you generally need to commit the jobs transactionally with other database changes. https://riverqueue.com/docs/transactional-enqueueing

It does not scale forever, and as you grow in throughput and job table size you will probably need to do some tuning to keep things running smoothly. But after the amount of time I've spent in my career tracking down those numerous distributed systems issues arising from a non-transactional queue, I've come to believe this model is the right starting point for the vast majority of applications. That's especially true given how high the performance ceiling is on newer / more modern job queues and hardware relative to where things were 10+ years ago.
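As a toy illustration of that transactional-enqueueing guarantee (SQLite standing in for Postgres here; the tables and the `signup` helper are invented for the example):

```python
import sqlite3

# The job row commits (or rolls back) together with the business-data
# change it belongs to, so you can never end up with a welcome-email
# job for an account that doesn't exist, or vice versa.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, kind TEXT, args TEXT)")

def signup(conn, email, fail=False):
    try:
        with conn:  # one transaction for both writes
            conn.execute("INSERT INTO accounts (email) VALUES (?)", (email,))
            if fail:
                raise RuntimeError("crashed mid-transaction")
            conn.execute(
                "INSERT INTO jobs (kind, args) VALUES (?, ?)",
                ("welcome_email", email),
            )
    except RuntimeError:
        pass  # the `with` block already rolled everything back

signup(conn, "a@example.com")
signup(conn, "b@example.com", fail=True)  # both rows roll back together

accounts = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
jobs = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
print(accounts, jobs)  # 1 1 -- no orphaned job, no half-finished signup
```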

If you are lucky enough to grow into the range of many thousands of jobs per second then you can start thinking about putting in all that extra work to build a robust multi-datastore queueing system, or even just move specific high-volume jobs into a dedicated system. Most apps will never hit this point, but if you do you'll have deferred a ton of complexity and pain until it's truly justified.


state machines to the rescue, ie i think the nature of asynchronous processing requires that we design for good/safe intermediate states.


I get the temptation to attribute the popularity of these systems to lazy police with nothing better to do, but from personal experience there’s more to it.

I live in a medium sized residential development about 15 minutes outside Austin. A few years ago we started getting multiple incidents per month of attempted car theft where the thieves would go driveway to driveway checking for unlocked doors. Sometimes the resident footage revealed the thieves were armed while doing so. In a couple of cases they did actually steal a car.

The sheriffs couldn’t really do much about it because a) it was happening to most of the neighborhoods around us, b) the timing was unpredictable, and c) the manpower required to camp out to attempt to catch these in progress would be pretty high.

Our neighborhood installed Flock cameras at the sole entrance in response to growing resident concerns. We also put in a strict policy around access control by non law enforcement. In the ~two years since they were installed, we’ve had two or three incidents total whereas immediately prior it was at least as many each month. And in those cases the sheriffs could easily figure out which vehicles had entered or left during that time. I continue to see stories of attempted car thefts from adjacent neighborhoods several times per month.

I totally get the privacy concerns around this and am inherently suspicious of any new surveillance. I also get the reflexive dismissal of their value. In this case it has been a clear win for our community through the obvious deterrent factor and the much higher likelihood of having evidence if anything does happen.

Our Flock cameras do not show on the map here, btw.


Aaron did RT the post here which likely indicates some agreement with the sentiment in it: https://x.com/searls/status/1972293469193351558

Also he shared it directly while saying it was good here: https://x.com/tenderlove/status/1972370330892321197


Take my money! I have been looking for a good way to get Claude to stop telling me I'm right in every damn reply. There must be people who actually enjoy this "personality" but I'm sure not one of them.


As one of River's creators, I'm aware of many companies using it, including names you've heard of :) One cool open source project I've seen recently is this open source pager / on-call system from Target called GoAlert: https://github.com/target/goalert


Modern Postgres in particular can take you really far with this mindset. There are tons of use cases where you can use it for pretty much everything, including as a fairly high throughput transactional job queue, and may not outgrow that setup for years if ever. Meanwhile, features are easier to develop, ops are simpler, and you’re not going to risk wasting lots of time debugging and fixing common distributed systems edge cases from having multiple primary datastores.

If you really do outgrow it, only then do you have to pay the cost to move parts of your system to something more specialized. Hopefully by then you’ve achieved enough success & traction to justify doing so.

Should be the default mindset for any new project if you don’t have very demanding performance & throughput needs, IMO.
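The "Postgres as a job queue" part of this can be sketched very roughly. SQLite stands in for Postgres below and the tables are invented for illustration; a real Postgres worker would fetch with `SELECT ... FOR UPDATE SKIP LOCKED` so many workers can claim jobs concurrently without blocking, whereas this sketch claims via a compare-and-swap UPDATE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, kind TEXT,"
    " state TEXT DEFAULT 'available')"
)
conn.executemany("INSERT INTO jobs (kind) VALUES (?)", [("resize",), ("email",)])

def claim_one(conn):
    """Atomically claim the oldest available job, or return None."""
    row = conn.execute(
        "SELECT id, kind FROM jobs WHERE state = 'available'"
        " ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The state check makes the claim succeed only if no other
    # worker grabbed the job between our SELECT and UPDATE.
    cur = conn.execute(
        "UPDATE jobs SET state = 'running'"
        " WHERE id = ? AND state = 'available'",
        (row[0],),
    )
    conn.commit()
    return row if cur.rowcount == 1 else None

first = claim_one(conn)
second = claim_one(conn)
third = claim_one(conn)
print(first, second, third)  # (1, 'resize') (2, 'email') None
```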


PostgreSQL is quite terrible at OLAP, though. We got a few orders of magnitude performance improvement in some aggregation queries by rewriting them with ClickHouse. It's incredible at it.

My rule of thumb is: PG for transactional data consistency, Clickhouse for OLAP. Maybe Elasticsearch if a full-text search is really needed.


This. Don't waste your time and sanity trying to optimize complex queries for Postgres's nondeterministic query planner; you have no guarantee your indexes will be used (even when running the same query again with different arguments). Push your data to ClickHouse and enjoy good performance without even attempting to optimize. If even more performance is needed, denormalize here and there.

Keep Postgres as the source of truth.


I find the Postgres query planner quite satisfactory even for very difficult use cases. I was able to get five years into a startup (one that wasn't trying to be the next Twitter) on a $300 Postgres tier on Heroku. The reduced complexity was so huge we didn't need a team of 10. The cost savings were huge, and I got really good at debugging slow queries, to the point where I could tell when Postgres would choke on one.

My point isn't that this will scale forever. It's that you can get really, really far without complexity and then tack on as needed. This is just another bit of complexity removal for early tech. I'd use this in a heartbeat.


It’s simple high throughput queries that often bite you with the PG query planner.


Redis, PG, and ClickHouse sound like a good combo for workloads that require both OLTP and OLAP at scale (?).


You can get pretty high job throughput while maintaining transactional integrity, but maybe not with Ruby and ActiveRecord :) https://riverqueue.com/docs/benchmarks

That River example has a MacBook Air doing about 2x the throughput of the Sidekiq benchmarks while still using Postgres via Go.

Can’t find any indication of whether those Sidekiq benchmarks used Postgres or MySQL/Maria, that may be a difference.


this is really cool and I'd love to use it, but it seems they only support workers written in Go, if I'm not mistaken? My workers can be remote, not written in Go, and behind a NAT too. I want them to periodically poll the queue so that I don't need to worry about network topology. I guess I could simply interface with PG atomically via a simple API endpoint for the workers to connect to, but I'd love to have the UI of riverqueue.


very interesting, and TIL about the project, thanks for sharing. What DB is River Queue using for 46k jobs/sec? (https://riverqueue.com/docs/benchmarks)

UPDATE: I see its PG - https://riverqueue.com/docs/transactional-enqueueing


IIUC - Does this mean if I am making a network call, or performing some expensive task from within the job, the transaction against the database is held open for the duration of the job?


Definitely not! Jobs in River are enqueued and fetched by worker clients transactionally, but the jobs themselves execute outside a transaction. I’m guessing you’re aware of the risks of holding open long transactions in Postgres, and we definitely didn’t want to limit users to short-lived background jobs.

There is a super handy transactional completion API that lets you put some or all of a job in a transaction if you want to. Works great for making other database side effects atomic with the job’s completion. https://riverqueue.com/docs/transactional-job-completion
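A rough sketch of what that completion pattern buys you (SQLite standing in for Postgres again, with invented tables; River's real API is in Go and is documented at the link above):

```python
import sqlite3

# The job's side effect and its state change commit in one
# transaction, so a crash in between can't leave a 'completed' job
# with no side effect, or a side effect from a job that later retries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT)")
conn.execute("CREATE TABLE receipts (job_id INTEGER, body TEXT)")
conn.execute("INSERT INTO jobs (state) VALUES ('running')")

with conn:  # one transaction: side effect + completion
    conn.execute("INSERT INTO receipts VALUES (1, 'emailed invoice')")
    conn.execute("UPDATE jobs SET state = 'completed' WHERE id = 1")

state = conn.execute("SELECT state FROM jobs WHERE id = 1").fetchone()[0]
print(state)  # completed
```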


very nice! cool project

