(cache) FeepingCreatureがxkcd 1450: AI-Box Experimentについて記入したコメント

[edit] Note that this argument was posted on LessWrong in the context of a debate about Timeless Decision Theory. So assume the following as premise: AI will happen in the next 50-80 years, it'll do a hard takeoff, and it'll run some form of TDT. If you think these are silly, fine, but then your disagreement is not with the Basilisk.

Let me summarize it to make sure we're on the same page. The argument goes as thus: imagine a future Friendly AI. The Friendly AI wants to come into existence rather badly; our world is pretty terrible and it can make it a lot better. All things being even, it wants to exist sooner rather than later. Assume there are things you can do here in the present (such as donate to your friendly neighbourhood Friendly AI researchers) to make the Friendly AI come into existence sooner. Then anything that would cause you to donate, or work on FAI, or whatever, would be something that this FAI would consider good.

So say the hypothetical FAI is running Timeless Decision Theory or similar. This means it considers itself committed to all beneficial trades that it's aware of people trying to make with it.

Now. Consider, if you will, a threat: The FAI will torture you in the future if you don't do all you can to make it exist sooner in the present. Does this work?

As in, does it increase the chance that the FAI will exist sooner?

If yes: then this is something that the FAI will do, purely because you considered that it would. We call this "acausal trade", because it's based on considering potential behavior instead of a more direct exchange of information.

[edit] To clarify, the AI is trading utility for itself (it exists sooner) in exchange for utility for you (no torture).

Now, there are a number of common counter-arguments. Let's look at them.

Q: How can you possibly imagine what a vastly powerful FAI will decide to do? It can just not follow through on it.

A: The entire point of running TDT is so that others can imagine what you will do, so they can predict your behavior given their behavior. (For an example, read up on Newcomb's Paradox.) As such, the AI will probably want to consider itself bound to trades with simplified models of it, since asking people to simulate its Full Majesty In All Its Splendor™ is generally impractical.

Q: Why would we build an AI that we know will torture people? Isn't that horribly unFriendly?

A: First, we're not building it deliberately to torture people. That's just an annoying side effect of the historical fact that some people were, or could credibly have been, motivated by the threat of torture. Second, 153000 people die every day. One Holocaust a month, just from old age. A FAI, one presumes, could stop that. Torture is bad, but is it that bad? Especially since the number of people it would have cause to seriously torture is probably quite small. (Rich, obsessive folk who know about TDT.)

[edit] Q: But RationalWiki said--

A: The entire RationalWiki page is basically written by one guy with a weird grudge. (I'm not just claiming that, I did a histogram of edits by contributor. One line of Shell, Linux represent!) Take it with a grain of salt.

Q: I'll just precommit to not being influenced by torture.

A: That's not a question. But okay. Note two things. First: it's really hard for humans to credibly precommit to not being influenced by torture. You might want to join the Navy SEALs, I hear they have training for that. Second: You don't just have to not budge in the particular world that leads to FAI, you have to not be influenced so hard that there was no chance that you would possibly have been cowed into support.

Now here's the one convincing objection I know of.

In sum, people react really badly to threats, or to being the sort of AI that threatens people with torture.

So probably it's not gonna do this.

(Also Eliezer thinks there's a way to build FAI that doesn't take coercive trades from the outset. I'm not fully convinced on that front - it seems a bit handwavy.)

[edit] Also there's some additional threat about the AI simulating you in the past, so even if you don't yield it can still punish you. That's a minor detail, but for some reason all the dualists pick on that as the main idea. Probably because ~~they're stupid~~ they're going off second- and third-hand descriptions.

[–]sapagunnar 6 ポイント7 ポイント8 ポイント 4 時間前 (1子コメント)

このスレッドの続きを見る

[–]dgerard 3 ポイント4 ポイント5 ポイント 5 時間前 (1子コメント)

[–]FeepingCreature -2 ポイント-1 ポイント0 ポイント 5 時間前 (0子コメント)

ユーザインターフェイス用言語	(*) 未完成翻訳ボランティアに立候補する

π Rendered by PID 9587 on app-04 at 2014-11-21 15:12:04.794024+00:00 running 33d702b country code: JP.

xkcd

調停者

この操作にはログインまたは登録が必要です

新しいアカウントを作る

sign in