Yarrow Bouchard🔸

944 karma · Joined · Canada · medium.com/@strangecosmos

Bio

Pronouns: she/her or they/them. 

I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I’m trying to figure out where effective altruism can fit into my life these days and what it means to me.

Comments (295) · Topic contributions (1)

Thank you for your kindness. I appreciate it. :)

Do the two papers you mentioned give specific quantitative information about how much LLM performance increases as the compute used for RL scales? And is it a substantially more efficient scaling than what Toby Ord assumes in the post above?

In terms of AI safety research, this gets into a very broad, abstract, philosophical point, but, personally, I'm fairly skeptical that anybody can do AI safety research today that will apply to the much more powerful, much more general AI systems of the future. I guess if you think those future systems will just be bigger versions of the type of systems we have today, then it makes sense why you'd think AI safety research would be useful now. But I think there are good reasons for doubting that, and LLM scaling running out of steam is just one of them.

To take a historical example, the Machine Intelligence Research Institute (MIRI) had some very specific ideas about AI safety and alignment dating back to before the deep learning revolution that started around 2012. I recall having an exchange with Eliezer Yudkowsky, who co-founded MIRI and does research there, on Facebook sometime around 2015-2017 where he expressed doubt that deep learning was the way to get to AGI and said his best bet was that symbolic AI was the most promising approach. At some point, he must have changed his mind, but I can't find any writing he's done or any talk or interview where he explains when and why his thinking changed. 

In any case, one criticism — which I agree with — that has been made of Yudkowsky's and MIRI's current ideas about AI safety and alignment is that these ideas have not been updated in the last 13 years, and remain the same ideas that Yudkowsky and MIRI were advocating before the deep learning revolution. And there are strong reasons to doubt they still apply to frontier AI systems, if they ever did. What we would expect from Yudkowsky and MIRI at this point is either an updating of their ideas about safety and alignment, or an explanation of why their ideas developed with symbolic AI in mind should still apply, without modification, to deep learning-based systems. It's hard to understand why this point hasn't been addressed, particularly since people have been bringing it up for years. It comes across, in the words of one critic, as a sign of thinkers who are "persistently unable to update their priors."

What I just said about MIRI's views on AI safety and alignment could be applied to AI safety more generally. Ideas developed on the assumption that current techniques, architectures, designs, or paradigms will scale all the way to AGI could turn out to be completely useless and irrelevant if more powerful and more general AI systems end up being built using entirely novel ideas we can't anticipate yet. You used an aviation analogy. Let me try my own. AI safety research that assumes LLMs will scale to AGI, and that is therefore based on studying the properties peculiar to LLMs, might turn out to be a waste of time if the technology goes in another direction, just as aviation safety research that assumed airships would be the technology underlying air travel, and that focused on the properties of hydrogen and helium gas, has no relevance to a world where air travel is dominated by heavier-than-air airplanes.

It's relevant to bring up at this point that a survey of AI experts found that 76% of them think that it's unlikely or very unlikely that current AI techniques, such as LLMs, will scale to AGI. There are many reasons to agree with the majority of experts on this question, some of which I briefly listed in a post here.

Because I don't see scaling up LLMs as a viable path to AGI, I personally don't see much value in AI safety research that assumes that it is a viable path. (To be clear, AI safety research that is about things like how LLM-based chatbots can safely respond to users who express suicidal ideation, and not be prompted into saying something harmful or dangerous, could potentially be very valuable, but that's about present-day use cases of LLMs and not about AGI or global catastrophic risk, which is what we've been talking about.) In general, I'm very sympathetic to a precautionary, "better safe than sorry" approach, but, to me, AI safety or alignment research can't even be justified on those grounds. The chance of LLMs scaling up to AGI seems so remote. 

It's also unlike the remote chance of an asteroid strike, where we have hard science that can be used to calculate the probability rigorously. It's more like the remote chance that the Large Hadron Collider (LHC) would create a black hole, which can only be assigned a probability above zero because of fundamental epistemic uncertainty, i.e., the chance that we've gotten the laws of physics wrong. I don't know if I can quite put my finger on why I don't like this form of argument, in which practical measures to mitigate existential risk are justified by fundamental epistemic uncertainty, but I can point out that it seems to have some very bizarre implications.

For example, what probability do we assign to the possibility that Christian fundamentalism is correct? If we assign a probability above zero, then this leads us literally to Pascal's wager, because the utility of heaven is infinite, the disutility of hell is infinite, and the cost of complying with the Christian fundamentalist requirements for going to heaven is not only finite but relatively modest. Reductio ad absurdum?
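Spelled out as an expected-value calculation, with $p$ as whatever small nonzero credence you assign and $c$ as the finite cost of compliance:

$$
\mathbb{E}[\text{comply}] \;=\; p \cdot (+\infty) \;+\; (1 - p)\cdot(-c) \;=\; +\infty \quad \text{for any } p > 0,
$$

so any nonzero credence, however tiny, swamps every finite consideration.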

By contrast, we know for sure dangerous asteroids are out there, we know they've hit Earth before, and we have rigorous techniques for observing them, tracking them, and predicting their trajectories. When NASA says there's a 1 in 10,000 chance of an asteroid hitting Earth, that's an entirely different kind of probability than if a Bayesian utilitarian guesses there's a 1 in 10,000 chance that Christian fundamentalism is correct, that the LHC will create a black hole, or that LLMs will scale to AGI within two decades.

One way I can try to articulate my dissatisfaction with the argument that we should do AI safety research anyway, just in case, is to point out there's no self-evident or completely neutral or agnostic perspective from which to work on AGI safety. For example, what if the first AGIs we build would otherwise have been safe, aligned, and friendly, but by applying our alignment techniques developed from AI safety research, we actually make them unsafe and cause a global catastrophe? How do we know which kind of action is actually precautionary? 

I could also make the point that, in some very real and practical sense, all AI research trades off against other kinds of AI research that could have been done instead. So maybe, instead of focusing on LLMs, it's wiser to focus on alternative ideas like energy-based models, program synthesis, neuromorphic AI, or fundamental RL research. I think the approach of trying to squeeze Bayesian blood from the stone of uncertainty by making subjective guesses at probabilities can only take you so far, and the limitations become apparent pretty quickly.

To fully make myself clear and put my cards completely on the table, I don't find effective altruism's treatment of the topic of near-term AGI to be particularly intellectually rigorous or persuasive, and I suspect at least some people in EA who currently think very near-term AGI is very likely will experience a wave of doubt when the AI investment bubble pops sometime within the next few years. There is no external event, no evidence, and no argument that can compel someone to update their views if they're inclined enough to resist updating, but I suspect there are some people in EA who will interpret the AI bubble popping as new information and will take it as an opportunity to think carefully about their views on near-term AGI. 

If you think that very near-term AGI is very likely, and if you think LLMs very likely will scale to AGI, then this implies an entirely different idea about what should be done, practically, in the area of AI safety research today.

This is a very strange critique. The claim that research takes hard work does not logically imply the claim that hard work is all you need for research. In other words, saying hard work is necessary for research (or for good research) does not imply it is sufficient. I certainly would never say it is sufficient, although it is necessary.

Indeed, I explicitly discuss other considerations in this post, such as the "rigour and scrutiny" of the academic process and what I see as "the basics of good epistemic practice", e.g. open-minded discussion with people who disagree with you. I talk about specific problems I see in academic philosophy research that have nothing to do with whether people are working hard enough or not. I also discuss how, from my point of view, ego concerns can get in the way, and love for research itself — and maybe I should have added curiosity — seems to be behind most great research. But, in any case, this post is not intended to give an exhaustive, rigorous account of what constitutes good research. 

If picking examples of academic philosophers who did bad research or came to bad conclusions is intended to discredit the whole academic enterprise, I discussed that form of argument at length in the post and gave my response to it. (Incidentally, some members of the Bay Area rationalist community might see Heidegger's participation in the Nazi Party and his involvement in book burnings as evidence that he was a good decoupler, although I would disagree with that as strongly as I could ever disagree about anything.) 

I think accounting for bias is an important part of thinking and research, but I see no evidence that effective altruism is any better at being unbiased than anyone else. Indeed, I see many troubling signs of bias in effective altruist discourse, such as disproportionately valuing the opinion of other effective altruists and not doing much to engage seriously and substantively with the opinions of experts who are not affiliated with effective altruism.

I think effective altruism is as much attached to intellectual tradition and as much constrained by political considerations as pretty much anything else. No one can transcend the world with an act of will. We are all a part of history and culture. 


I think you should practice turning your loose collections of thoughts into more of a standard essay format. That's an important skill, and it's worth developing. (If you don't know how to do that, try looking for online writing courses or MOOCs. There are probably some free ones out there.)

One problem with using an LLM to do this for you is that it's easy to detect, and many people find that distasteful. Whether it's fully or partially generated by an LLM, people don't want to read it. 

Another problem with using an LLM is that you're not really thinking or communicating. The act of writing is not something that should be automated. If you think it should be automated, then don't post on the EA Forum and wait for humans to respond to you; just paste your post into ChatGPT and get its opinion. (If you don't want to do that, then you also understand why people don't want you to post LLM-generated stuff on here, either.)

I’m sorry to say this post is very difficult to follow. The discussion of the confidential information that Oliver Habryka allegedly shared is too vague to understand. I assume you are trying to be vague because you don’t want to disclose confidential information. That makes sense. But then this makes it impossible to understand the situation.

I wouldn’t donate to Lightcone Infrastructure and I’d recommend against it, but for different reasons than the ones stated in this post. 

No, irreducible uncertainty is not all-or-nothing. Obviously a person should do introspection and analysis when making important decisions. 

I can't see the downvoted comment in your comment history. Did you delete it? 

By the way, did you use an LLM such as ChatGPT or Claude to help write this post? It has the markings of LLM writing. I think when people detect that, they are turned off. They want to read what you wrote, not what an LLM wrote.

Another factor is that if you are a new poster, you get less benefit of the doubt and you need to work harder to state your points in plain English and make them clear as day. If it's not immediately clear what you're saying, and especially if your writing seems LLM-generated/LLM-assisted, people will not put in the time and effort to engage deeply. 

I don't think you can have it both ways: A superhuman coder (that is actually competent, which you don't think AI assistants are now) is relatively narrow AI, but would accelerate AI progress. A superhuman AI researcher is more general (which would drastically speed up AI progress), but is not fully general.

I definitely disagree with this. Hopefully what I say below will explain why.

I would argue that LLMs are already more general than the range of AI researcher tasks (though LLMs are currently not good at all of those tasks), because LLMs can competently discuss philosophy, economics, political science, art, history, engineering, science, etc.

The "general" in artificial general intelligence doesn't just refer to having a large repertoire of skills. Generality is about the ability to learn and to generalize beyond what a system has seen in its training data. An artificial general intelligence doesn't just need to have skills; it needs to be able to acquire new skills, including skills that have never existed in history before, by developing them itself, just as humans do.

If a new video game comes out today, I'm able to play that game and develop a new skill that has never existed before.[1] I will probably get the hang of it in a few minutes, with a few attempts. That's general intelligence. 

AlphaStar was not able to figure out how to play StarCraft using pure reinforcement learning. It just got stuck using its builders to attack the enemy, rather than figuring out how to use its builders to make buildings that produce units that attack. To figure out the basics of the game, it needed to do imitation learning on a very large dataset of human play. Then, after imitation learning, to get as good as it did, it needed an astronomical amount of self-play, around 60,000 years of playing StarCraft. That's not general intelligence. If acquiring a skill requires copying a large dataset of human examples and then millennia of training on automatically gradable, relatively short time horizon tasks (which often don't exist in the real world), that's something, and it's even something impressive, but it's not general intelligence.

Let's say you wanted to apply this kind of machine learning to AI R&D. The necessary conditions don't apply. You don't have a large dataset of human examples to train on. You don't have automatically gradable, relatively short time horizon tasks with which to do reinforcement learning. And if the tasks require real world feedback and can't be simulated, you certainly don't have 60,000 years.
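For concreteness, here is a toy sketch of the two-stage recipe I'm describing, imitation learning on human data followed by self-play reinforcement learning, using rock-paper-scissors instead of StarCraft. Everything in it (the fake "human" dataset, the REINFORCE update, the numbers) is made up for illustration and has nothing to do with AlphaStar's actual architecture or scale:

```python
import numpy as np

# Toy version of the two-stage recipe discussed above:
#   stage 1: imitation learning on a dataset of "human" actions
#   stage 2: self-play reinforcement learning (REINFORCE)
# The game is rock-paper-scissors, so this is purely a schematic.

rng = np.random.default_rng(0)
N_ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    """+1 if action a beats action b, -1 if it loses, 0 on a tie."""
    table = [[0, -1,  1],
             [1,  0, -1],
             [-1, 1,  0]]
    return table[a][b]

def policy(logits):
    """Softmax over action logits."""
    p = np.exp(logits - logits.max())
    return p / p.sum()

# --- Stage 1: imitation learning --------------------------------------
# Pretend we scraped a dataset in which humans over-play rock.
# A maximum-likelihood fit just copies their action frequencies.
human_actions = rng.choice(N_ACTIONS, size=10_000, p=[0.5, 0.3, 0.2])
counts = np.bincount(human_actions, minlength=N_ACTIONS)
logits = np.log(counts / counts.sum())
print("after imitation:", np.round(policy(logits), 2))

# --- Stage 2: self-play reinforcement learning ------------------------
# The agent plays against a copy of its current policy and nudges its
# logits toward whichever actions happened to win.
lr = 0.01
for _ in range(50_000):
    p = policy(logits)
    a = rng.choice(N_ACTIONS, p=p)   # agent's move
    b = rng.choice(N_ACTIONS, p=p)   # opponent is a copy of the same policy
    r = payoff(a, b)
    grad = -p.copy()
    grad[a] += 1.0                   # gradient of log pi(a) w.r.t. the logits
    logits = logits + lr * r * grad
print("after self-play:", np.round(policy(logits), 2))
# Self-play pushes against whatever bias imitation left behind
# (on rock-paper-scissors it wanders around the uniform Nash mix).
```

The point of the toy is just that both stages depend on the conditions listed above: a human dataset to imitate and a cheap, automatically gradable game to self-play in.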

I like what the AI researcher François Chollet has to say about this topic in this video from 11:45 to 20:00. He draws the distinction between crystallized behaviours and fluid intelligence, between skills and the ability to learn skills. I think this is important. This is really what the whole topic of AGI is about.

Why have LLMs absorbed practically all text on philosophy, economics, political science, art, history, engineering, science, and so on and not come up with a single novel and correct idea of any note in any of these domains? They are not able to generalize enough to do so. They can generalize or interpolate a little bit beyond their training data, but not very much. It's that generalization ability (which is mostly missing in LLMs) that's the holy grail in AI research. 

I'm claiming that they could approach an overall staff-to-vehicle ratio of 1:10 if the number of real-time helpers (who don't have to be engineers) and vehicles were dramatically scaled up, and that this would be enough for profitability.

There are two concepts here. One is remote human assistance, which Waymo calls fleet response. The other is Waymo's approach to the engineering problem. I was saying that I suspect Waymo's approach to the engineering problem doesn't scale. I think it probably relies on engineers doing too much special casing that doesn't generalize well when a modest amount of novelty is introduced. Waymo currently has something like 1,500 engineers supporting operations in the comparatively small geofenced areas it serves today. If it wanted to expand to a 10x larger service area, would its techniques generalize, or would it need to hire commensurately more engineers?
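To put rough numbers on that worry: if engineering headcount scales roughly with the size of the service area, then a 10x larger area implies something like

$$
1{,}500 \text{ engineers} \times 10 \;\approx\; 15{,}000 \text{ engineers},
$$

whereas a 1:10 staff-to-vehicle ratio with, say, 100,000 vehicles (a made-up number, not a Waymo figure) would cap total staff at around 10,000. The ratio only becomes reachable if the fleet grows much faster than the engineering team, which is exactly the generalization question.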

I suspect Waymo faces the problem of trying to do far too much essentially by hand, adding incremental fix after fix as problems arise. The ideal would instead be to apply machine learning techniques that can learn from data and generalize to new scenarios and new driving conditions. Unfortunately, current machine learning techniques do not seem to be up to that task.

In the 2023 LessWrong survey, the median answer was 2040 for the singularity and 2030 for "By what year do you think AI will be able to do intellectual tasks that expert humans currently do?". The second question was ambiguous, and some respondents gave dates in the past.

Thank you. Well, that isn't surprising at all.

  1. ^

    Okay, well maybe the play testers and the game developers have developed the skill before me, but then at some point one of them had to be the first person in history to ever acquire the skill of playing that game.

I think the uncertainty is often just irreducible. Someone faces the choice of either becoming an oncologist who treats patients or a cancer researcher. They don't know which option has higher expected value because they don't know the relevant probabilities. And there is no way out of that uncertainty, so they have to make a choice with the information they have. 

Very good thought experiment. The point is correct. The problem is that in real life we almost never know the probability of anything. So, it will almost never happen that someone knows a long shot bet has 10x better expected value than a sure thing. What will happen in almost every case is that a person faces irreducible uncertainty and takes a bet. That’s life and it ain’t such a bad gig. 
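To spell out the kind of case the thought experiment imagines, with made-up numbers: a sure thing that pays \$1,000 has an expected value of

$$
1.0 \times \$1{,}000 = \$1{,}000,
$$

while a long shot with a 1% chance of paying \$1,000,000 has an expected value of

$$
0.01 \times \$1{,}000{,}000 = \$10{,}000,
$$

ten times higher. But the comparison only works if you actually know the 1% figure, and in real life you almost never do.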
