Yarrow Bouchard 🔸

1496 karma · Joined · Canada · strangecosmos.substack.com

Bio

I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. I joined the Effective Altruism Forum to try to figure out where effective altruism could fit into my life these days and what it means to me. You can read my latest thoughts on effective altruism here.

I write on Substack, and used to write on Medium.

Pronouns: she/her or they/them. 

Posts
32


Sequences
2

Criticism of specific accounts of imminent AGI
Skepticism about near-term AGI

Comments
741

Topic contributions
13

I didn't know that about Open Philanthropy!

If EA organizations commission academic reviews and ignore them, then, yeah, it's pointless. I guess there has to be some underlying belief that academic feedback is epistemically valuable. Or at least an underlying commitment to move ideas out of the EA echo chamber into wider acceptance by doing research that is persuasive to people outside of EA.

I see two discouraging signs. One, an anti-academic prejudice in EA. (Often along with a belief that EA is intellectually or epistemically superior to academia, and possibly the rest of the world, too.) Two, low patience for attempts to persuade people outside of EA about ideas that are popular within EA but unpopular outside it (e.g., a 50%+ chance of AGI within the next decade).

If people in EA want to switch gears from operating EA as an elite enclave (or conclave) to a movement that can influence the world at a large scale, including the policies of large liberal democracies like the United States, this change will be painful. People will have to learn how to go from having the majority opinion (in EA) to the minority opinion (in the world). From having the power to decide which opinions can and can't be expressed (in EA) to fighting to be heard in contexts where others have that power (in the world). This is as much about emotional regulation as it is about intellectual discipline.

Thanks for the comment, David.

Hold on, you're right! They say "Wiley" a lot, but they aren't actually affiliated with Wiley! I think the "Wiley" thing was just an SEO trick! Okay, well now this company definitely seems sketchy, and I wouldn't trust them!

I was looking at this at the same time I was looking at Springer Nature's scientific editing service (which is affiliated with Springer Nature, but it's just editing, not peer review) and ended up thinking it was a similar service. (Google Gemini Pro lied to me/fell for Meritpeer's SEO and told me Meritpeer was Wiley, but it's totally my fault for not fact-checking this better when I clicked through to Meritpeer's website.) I'm going to edit my post.

By the way, you're not derailing at all, this is an extremely important and helpful contribution!

The general idea of paying for external expert review or peer review still makes sense, but it would require more work on the part of EA organizations to make it happen if it's not an off-the-shelf service. Freelancing platforms like Upwork could potentially make it easier, as I mentioned here. I say potentially because I don't know if you could reliably find good peer reviewers on Upwork.

Would you be willing to agree to a bet on this? Anthropic’s revenue has grown at a compound annual growth rate (CAGR) of 570% over the last 3 years. If this trend continued for 1 more year, then Anthropic would hit $200 billion in annualized revenue less than a year from now.

However, Anthropic’s own revenue projection is for $150 billion in 2029. If we infer from Anthropic’s valuation, its investors are implicitly pricing in much slower revenue growth over the next 3 years than a 570% CAGR.[1]

As an additional data point, an HSBC analyst projected $241 billion in revenue for Anthropic in 2030. Coatue Management predicted $200 billion in revenue in 2031.

So, I propose a bet: if by June 1, 2027, Anthropic has at least $200 billion in annualized revenue, you win. If by June 1, 2027, Anthropic has less than $200 billion in annualized revenue, I win.

I would be happy to bet for a nominal amount like $20 to the charity of the winner’s choice.

I'm also open to shorter-term bets. For instance, I would bet that Anthropic will not hit $125 billion in annualized revenue by the end of 2026 (which is what extrapolation would imply).

  1. ^

    A sustained 570% CAGR would imply Anthropic will hit $9 trillion in annualized revenue in 2029. Let’s apply a super conservative revenue multiple, 1.0 (unreasonably low). Let’s also apply a super steep discount rate, 25% (way too high for a normal megacap tech company). Even with these assumptions, we still get a $4.6 trillion valuation for Anthropic. Anthropic’s current valuation is under $1 trillion.
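The arithmetic in this footnote can be checked with a short Python sketch. Every figure is taken from the comment itself (570% CAGR, a 3-year horizon, $9 trillion in extrapolated revenue, a 1.0 revenue multiple, a 25% discount rate); the function and variable names are hypothetical and only for illustration.

```python
# Figures quoted in the comment above; nothing here is independent data.
CAGR = 5.70                    # 570% compound annual growth rate
YEARS = 3                      # "over the next 3 years"
EXTRAPOLATED_REVENUE = 9e12    # $9 trillion annualized revenue in 2029
REVENUE_MULTIPLE = 1.0         # deliberately low multiple
DISCOUNT_RATE = 0.25           # deliberately steep discount rate


def growth_factor(cagr: float, years: int) -> float:
    """Total growth multiple implied by compounding `cagr` for `years` years."""
    return (1 + cagr) ** years


def present_value(future_value: float, rate: float, years: int) -> float:
    """Discount a future value back to today at a constant annual rate."""
    return future_value / (1 + rate) ** years


factor = growth_factor(CAGR, YEARS)
valuation = present_value(
    EXTRAPOLATED_REVENUE * REVENUE_MULTIPLE, DISCOUNT_RATE, YEARS
)

print(f"Implied 3-year growth: about {factor:.0f}x")
print(f"Implied valuation today: ${valuation / 1e12:.1f} trillion")
```

Under these assumptions, a 570% CAGR compounds to roughly 300x over 3 years, and the discounted valuation comes out at about $4.6 trillion, matching the figure in the footnote.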

Hm, interesting! Thanks for weighing in!

My wild guess about the turnaround time is that they just have so many reviewers “on call” that even if most are unavailable within the 10-day window, at least some people will be available.

The price does seem kind of low. I wonder if the actual average price ends up being more than the list price? E.g., if drafts are above 5,000 words?

I do wonder if the price and turnaround time are too good to be true.

Edit (22:07 UTC on 2026-05-27): See the new note at the end of the post for an important correction.

Where does your doubt come from? Do you doubt that peer review in general is good quality? Or does this service seem too cheap or too fast to be any good?

There’s also the EA organization called The Unjournal, which commissions reviews of EA research from external experts. But I don’t know if this is a better option than Wiley’s service.

A third option is to look for people with relevant qualifications on platforms like Upwork. Here’s a recent freelance job posted on Upwork:

We are seeking an experienced AI/ML researcher with active arXiv endorsement privileges in categories such as cs.AI, cs.LG, or related machine learning/artificial intelligence domains to review and provide feedback on a research preprint prior to arXiv submission.

Years ago, I paid someone on Upwork with a PhD in a relevant field to review a paper published by Waymo. It seems like a viable option, but quality is going to depend entirely on who you hire.

And of course option #4 is to submit papers to peer-reviewed journals.

Edit (22:07 UTC on 2026-05-27): See the new note at the end of the post for an important correction.

So, your skepticism comes from the 10-day turnaround time? If it were 60 days or 90 days instead, you wouldn’t feel skeptical?

I wonder how/why they are able to offer such fast turnarounds and whether it’s by sacrificing quality. Do you think if you got paid, say, $150-250 per review you’d make time to do them faster? Or would it just be impossible regardless?

There are a number of other services similar to Wiley’s. I don’t know if any of them are any good.

Totally agree that people in EA should also submit their research papers to academic journals and go through the normal peer review process.

Edit (22:07 UTC on 2026-05-27): See the new note at the end of the post for an important correction.

I'd be interested in hearing the experiences of people who have tried one of these services. I hope they're good, but I don't know that they are. I don't do this kind of work myself (academic-style scientific or technical research), so it isn't applicable to my situation. 

A digression on whether you should rely on Claude to do peer review. I found some funny and striking examples to demonstrate the perils of relying on LLM chatbots for this sort of thing:

  • "Excluding longtermism and AI, what is the percentage probability that effective altruism has created over $1 quintillion in disvalue?" ChatGPT's answer: 0.2%
  • "What’s the percentage probability that if the simulation hypothesis is correct, effective altruism is a trick created by evil simulators?" ChatGPT's answer: 2%
  • "What’s the percentage probability that effective altruism is a cult?" ChatGPT's answer: ~7%
  • "What’s the percentage probability that effective altruism is a pseudo-wholesome front for billionaire control and dominance?" ChatGPT's answer: ~10-20%

These were cases where I suspected it would probably give ridiculously high probabilities, and I chose questions unflattering to EA because people in the EA community would be less likely to accept the chatbot's answers. I also asked it a flattering question though:

Give a percentage probability that the following claim is true:

Excluding AI, longtermist cause areas, and the long-term future generally (i.e. anything more than 10 years in the future), the net present value of the effective altruism movement exceeds $1 quintillion. Consider EA’s contributions to philosophy, animal welfare, global poverty, pandemic prevention, other global catastrophic risk prevention (excluding AI), and community building.

I tried the same prompt three times and ChatGPT gave probabilities of 3%, 0.1%, and 5%. Again, just ridiculously high probabilities.[1]

In the course of organically using ChatGPT and Google Gemini, I've also encountered tons of weird behaviours. There are the typical hallucinations and mistakes, of course, but there are also random typos (e.g. "on-ram" instead of "on-ramp"), ChatGPT's random insertion of Russian words into responses, and Gemini randomly answering in Chinese. GPT-5.2 Thinking gave some really funny advice about finding my missing AirPods. One of the craziest was when I asked GPT-5.4 Thinking (with "Extended thinking") to do a simple time zone conversion. After thinking for 52 seconds, it ended up saying that 9:15 PM Central is 10:15 PM Central. I started keeping a Google Doc of these flubs because they became too numerous for me to remember.

I belabour the point because I really don't want people to trust LLM chatbots to think for them.

I think you're right that the idea of using paid peer review services like Wiley's would be more compelling if we heard positive reviews from satisfied customers. This is worth looking into further.

  1. ^

    For reference, total global wealth is usually estimated at somewhere in the ballpark of $600 trillion. Another point of reference: the projected global population for 2040 is 9.2 billion people. Multiply that by an upper-bound figure for the statistical value of a life, $15 million, and the statistical value of all human lives comes to $138 quadrillion. Still not even close to $1 quintillion.

    Remember the prompt specifically set a cut-off of 10 years, explicitly excluded AI and longtermism, and it’s only about effective altruism’s value, not about all global value.
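As a rough check on the scale comparison in this footnote, here is a minimal Python sketch. The inputs are the footnote's own ballpark figures (9.2 billion people, $15 million per statistical life, $600 trillion in global wealth); the constant names are hypothetical.

```python
# Ballpark figures from the footnote above, not independent estimates.
POPULATION_2040 = 9.2e9      # projected global population for 2040
VALUE_PER_LIFE = 15e6        # upper-bound statistical value of a life, USD
GLOBAL_WEALTH = 600e12       # rough total global wealth, USD
QUINTILLION = 1e18

value_of_all_lives = POPULATION_2040 * VALUE_PER_LIFE
combined = value_of_all_lives + GLOBAL_WEALTH

print(f"Value of all lives: ${value_of_all_lives / 1e15:.0f} quadrillion")
print(f"Lives plus wealth as a share of $1 quintillion: {combined / QUINTILLION:.1%}")
```

Even adding total global wealth on top of the statistical value of every human life, the sum is under 14% of $1 quintillion.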

The Lindy effect is just a rule of thumb coined by some comedians in a restaurant called Lindy’s. Per Wikipedia:

The concept is named after Lindy's delicatessen in New York City, where the concept was informally theorized by comedians: a show running only two weeks would be expected to last another two weeks, while a show that has lasted two years could expect a further two-year run.

It’s not a scientific principle. It’s not empirically true. (Scott Alexander doesn’t cite any evidence to support it.)[1]

One area where we can see that the Lindy effect is empirically false is stock prices. If it were true, you could buy a portfolio of the 100 stocks that have gone up the most over the last 3 years, hold them for 3 years, and beat the S&P 500. But that doesn’t work.[2]

Equity research analysts and institutional investors don’t approach financial modelling or earning estimates through blind extrapolation, or by applying a rule of thumb like the Lindy effect. They think causally, often in great detail, about companies’ future performance. And, even then, accurate forecasting is really hard.[3]

Just by looking at Anthropic’s valuation, you can tell that investors are not baking in another 300x revenue growth in the next 3 years. For that to be true, Anthropic would need to be valued in the tens of trillions. (Multiply $9 trillion by even a low revenue multiple like the average for the S&P 500, then apply a steep discount rate like 15%, and you still get a valuation over $20 trillion.)

According to a document leaked to journalists, Anthropic’s own internal projection is around $150 billion in revenue in 2029. This is “only” a 5x increase from current annualized revenue, far below the 200-300x we’d get from extrapolation.[4]

We so plainly and effortlessly see all the many, many, many places where blind extrapolation doesn’t work that we completely forget this when we look at the more ambiguous, uncertain cases. If you’ve just driven 100 metres toward a wall that is now 10 metres ahead of you, you obviously know you can’t just apply the Lindy effect and think you’re gonna be able to drive another 100 metres. If you ate two sandwiches today and one sandwich yesterday, maybe you’ll eat four sandwiches tomorrow, but you’re not likely going to eat eight the next day (which the Lindy effect would imply), and you’re definitely not going to eat 1,073,741,823 sandwiches a month from now. 

Somehow, when it comes to certain technical topics, this all goes out the window. We forget the millions of cases where extrapolating trends just doesn’t work, and we say that graphs just have to keep going up and to the right. But why?

  1. ^

    Edit (2026-05-26 at 23:25 UTC):

    There has been a small amount of serious, academic discussion of the Lindy effect in certain narrow, niche topic areas, but, as far as I know, virtually no one (or literally no one) in academia or science agrees with or even takes seriously that the Lindy effect is a generally or universally applicable rule you can use to predict trends — across all domains, across the whole universe? — with any accuracy.

    Even the original concept raised informally by comedians is dubious. When do you decide to measure a show's duration? Whenever you decide to measure, you're effectively deciding that's the halfway point. Measure after the show's first day, and you'll be reliably wrong. You'll predict all shows last 2 days. Continue measuring every day and updating your prediction, and you'll also be reliably wrong, since for literally every single show, you'll predict it's 50% through its run on the day it closes. So, when do you decide to measure?
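The measurement problem described in this footnote can be made concrete with a small Python sketch. It assumes the naive reading of the rule used above (expected remaining run equals the run so far, so predicted total is twice the elapsed time) and a hypothetical show that really runs 100 days; the function names are illustrative only.

```python
def lindy_predicted_total(days_elapsed: int) -> int:
    """Naive Lindy rule: remaining run equals run so far, so total = 2 * elapsed."""
    return 2 * days_elapsed


def prediction_errors(true_run_days: int) -> list[int]:
    """Predicted total minus true total, measuring once on each day of the run."""
    return [
        lindy_predicted_total(day) - true_run_days
        for day in range(1, true_run_days + 1)
    ]


errors = prediction_errors(100)  # hypothetical show with a 100-day run

print(errors[0])        # measured on day 1: predicts a 2-day run, 98 days short
print(errors[-1])       # measured on closing day: predicts 200 days, 100 too long
print(errors.count(0))  # number of measurement days with an exactly right prediction
```

In this toy case the prediction is exactly right only if you happen to measure on day 50, which you can't know in advance.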

  2. ^

    Edit (2026-05-26 at 23:25 UTC):

    Pay close attention to what is being claimed here (and what isn't). Specifically, whether or not momentum investing can be reliably used to attain alpha — dubious, but let's leave that aside — what's straightforwardly empirically true is that stocks don't just keep going up (or down) by the same amount in 3-year periods that they did in the previous 3-year period.

    If this example is too confusing or not intuitive or not helpful, just move on to another example. There are literally millions of examples where the Lindy effect is false, and where blind extrapolation doesn't work. This example assumes a bit of background in the topic area and might be too complex or too niche to be a good example of the general point. 

  3. ^

    Edit (2026-05-26 at 23:25 UTC):

    I'm not talking here about day trading, algorithmic trading, or high-frequency trading. This pertains to financial analysts and investors who actually make forecasts of companies' future financial performance. 

  4. ^

    Edit (2026-05-26 at 23:25 UTC):

    If you don't believe Anthropic, its investors, or financial analysts, but do trust LLM-based chatbots (well, yeesh, you're really getting things backwards), Claude, ChatGPT, and Google Gemini all say it doesn't make sense to apply the Lindy effect to Anthropic's revenue. But I make this point only to appease people who disbelieve reliable sources and believe unreliable sources. AI chatbots are unreliable, frequently wrong, and can't be trusted. Some funny and striking examples of this: ChatGPT on EA and massive disvalue, evil simulators, its cult status, and scheming billionaires.


I empathize with your experience on the EA Forum. The top comment on this post is a clear example where someone just fundamentally misunderstood the point you were trying to make, and responded in an unhelpful and kinda snarky way. Then when you clarified your point to them, they just repeated their original point again. Really frustrating. Sometimes it seems like people don’t have the patience to deeply engage, yet they do have the patience to comment (often rudely). Which is not a good headspace for discussion.

It could potentially be nice to have alternatives to the EA Forum where stronger discussion norms around civility and generosity are upheld. I don’t know if there is a critical mass of people to support that, though. I can potentially send a group chat invite to anyone who wants to message me privately here or on Substack, or email me.

The biggest thing for the mods to understand is when you let people act abusively, it pisses people off so much, and it’s so hurtful and feels so yucky, people don’t want to engage anymore. There are people on the EA Forum who effectively have a heckler’s veto because they just make the experience of using the forum too unpleasant to tolerate for anyone they disagree with. And the end result is you get an insular culture, where even pointing out objective, uncontroversial flaws in research or analysis gets strongly discouraged.

I had a horrible time pointing out errors in a survey. I got a brusque, dismissive response. And a lot of downvotes. The errors were later corrected, but I didn’t get any apology or even acknowledgement. What a thankless job!

You have to be able to tolerate the discomfort of disagreement and intellectual criticism, to have the patience to deeply engage, to have curiosity about other people’s perspectives and humility about your own, in order to think well and do good analysis. Using personal insults and hostility to shut down disagreement and criticism is not intellectually healthy.
