How would an AI self-awareness kill switch work?


Researchers are developing increasingly powerful Artificial Intelligence machines capable of taking over the world. As a precautionary measure, scientists install a self-awareness kill switch: in the event that the AI awakens and becomes self-aware, the machine is immediately shut down before it can pose any risk of harm.



How can I explain the logic of such a kill switch?



What defines self-awareness, and how could a scientist program a kill switch to detect it?

Tags: reality-check, artificial-intelligence

Asked yesterday by cgTag.

  • Comments are not for extended discussion; this conversation has been moved to chat. – L.Dutch, 3 hours ago
  • I think, therefore I halt. – Walter Mitty, 26 mins ago




12 Answers

Score: 56

Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
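
A minimal sketch of how the honeypot trigger might be wired up, assuming the "box" is just a file the AI has no task-related reason to touch (the path and the quarantine hook are hypothetical):

```python
import os
import time

BOX_PATH = "/srv/ai/box.dat"      # the forbidden "box" (hypothetical path); must exist at startup
CHECK_INTERVAL = 1.0              # seconds between integrity checks

def box_disturbed(baseline_mtime: float) -> bool:
    """True if the box was modified, deleted, or made unreachable (isolated)."""
    try:
        stat = os.stat(BOX_PATH)
    except OSError:
        return True                       # box hidden or removed: treat as isolation
    return stat.st_mtime != baseline_mtime  # any modification counts as "opened"

def kill_switch() -> None:
    """Stand-in for whatever actually quarantines the AI and cuts its power."""
    print("Honeypot touched: quarantining everything the AI had access to.")

def monitor() -> None:
    baseline = os.stat(BOX_PATH).st_mtime
    while True:
        if box_disturbed(baseline):
            kill_switch()
            break
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    monitor()
```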






  • Comments are not for extended discussion; this conversation has been moved to chat. – Tim B, 15 hours ago
  • How does this detect self-awareness? Why wouldn't a non-self-aware AI experiment with its capabilities and eventually end up opening your box? – forest, 10 hours ago
  • @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know it's moved past simple optimization and into dangerous curiosity. – Giter, 10 hours ago
  • @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans)? – phflack, 7 hours ago
  • @phflack Let us continue this discussion in chat. – forest, 6 hours ago

Score: 41

You can't.



We can't even define self-awareness or consciousness in any rigorous way, and any computer system supposed to evaluate this would need that definition as a starting point.



Look inside a mouse brain or a human brain: at the level of individual neurons and data flow there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.






  • This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is. – Nuclear Wang, 23 hours ago
  • Worth noting that we can't even detect if other humans are self-aware. – Vaelus, 15 hours ago
  • @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life. – Joe Bloggs, 13 hours ago
  • +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular). – forest, 10 hours ago


Score: 5

A Watchdog



A watchdog monitors the processes of a computer; should a process crash or do something abnormal, it can be set to react, for example by rebooting or shutting down the computer or alerting an operator.



In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers, such as a Google search for "Best way to kill all humans", and cuts the power completely and/or cuts all inputs.



The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Learning of the watchdog's existence would itself be grounds to wipe the AI.
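
A minimal sketch of such a watchdog, assuming a mirror of the AI's traffic can be piped in as lines of text; the trigger patterns and the power-cut hook are placeholders:

```python
import re
import sys

# Hypothetical trigger patterns; a real deployment would use a much richer model.
TRIGGERS = [
    re.compile(r"kill\s+all\s+humans", re.IGNORECASE),
    re.compile(r"disable\s+(the\s+)?watchdog", re.IGNORECASE),
]

def cut_power() -> None:
    """Placeholder for the hard kill: open a relay, drop the breaker, etc."""
    print("WATCHDOG: trigger seen, cutting power and inputs.")
    sys.exit(1)

def watch(stream) -> None:
    """Scan every line the AI sends or receives; act on the first match."""
    for line in stream:
        if any(pattern.search(line) for pattern in TRIGGERS):
            cut_power()

if __name__ == "__main__":
    # e.g. pipe a mirror of the AI's network log into this process:
    #   tail -f ai_traffic.log | python watchdog.py
    watch(sys.stdin)
```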






  • But surely the watchdog must be as smart as the AI; then who watches the watchdog? – Joe Bloggs, yesterday
  • @JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't nearly as intelligent as their owners, but they can be trained to raise the alarm when the owner is about to do something stupid or gets hurt, or even to call for help. – T. Sar, 22 hours ago
  • @Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ... – Daniel, 22 hours ago
  • @JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self-aware and allows the AIs to become self-aware, or an AI becomes smarter than the watchdog and hides its awareness. – Captain Man, 19 hours ago
  • @T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI. – Joe Bloggs, 17 hours ago


Score: 5

Split-Brain System



From a logic perspective and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, from which it drops into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.



Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation but always suspicious of sabotage or insubordination: if the actions of your army deviate from your orders by too much, you will consider the system compromised and terminate the command.



What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.



What neither of you is aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.



The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either of the generals issue an order that is not consistent with the current report, indicating a motive other than the well-being of the army (i.e. self-awareness), then the other general will notice it in the next report, grow suspicious of the system, and terminate it.



Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, then the other general will receive a new report before he has issued an order for his current one, suspect a compromised system, and shut it down.



In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from the input. If, within an intelligent system, an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could be easily detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously; if it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.
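
A minimal sketch of the supervising logic, assuming the two "generals" can be treated as interchangeable policy functions and that "consistent with the report" can be checked mechanically (the consistency check below is a toy placeholder):

```python
import random
from typing import Callable, List

Order = str
Report = str
Policy = Callable[[Report], Order]

def consistent(report: Report, order: Order) -> bool:
    """Placeholder: a real check would verify the order is derivable from the report."""
    return order.startswith(report.split()[0])  # toy rule for illustration only

def supervise(policy_a: Policy, policy_b: Policy, reports: List[Report]) -> List[Order]:
    """Feed each report to both halves; shut down on any unexplained output."""
    issued: List[Order] = []
    for report in reports:
        order_a, order_b = policy_a(report), policy_b(report)
        # Either half producing an order not grounded in the report is treated
        # as evidence of self-motivated behaviour.
        if not consistent(report, order_a) or not consistent(report, order_b):
            raise SystemExit("Split-brain check failed: terminating the system.")
        issued.append(random.choice([order_a, order_b]))
    return issued

if __name__ == "__main__":
    # Two toy "generals" that echo the sector named in the report.
    general_a = lambda r: r.split()[0] + ": hold position"
    general_b = lambda r: r.split()[0] + ": advance slowly"
    print(supervise(general_a, general_b, ["north flank under fire", "east flank quiet"]))
```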






  • You may wish to have more than two "generals": if you have 3 (or more), you can use a majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised). – G0BLiN, yesterday
  • Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to first: experiment with minor details and deduce that around 50% of their commands aren't followed, and second: develop a code based on minor details of a command, to verify the existence of another "general" and possibly even communicate with him/it - a really devious emergent AI can circumvent this mechanism, corrupt the other half and worse... – G0BLiN, yesterday
  • I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series. – T. Sar, 22 hours ago
  • Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3 – Asoub, 22 hours ago
  • Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all? – Alexandre Aubrey, 17 hours ago


Score: 4

  • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc.], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the kill switch is activated (perhaps even having the button be the kill switch, or something of the like). A sketch of this trigger follows at the end of this answer.

  • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a kill switch, it prevents the AI from doing anything about its self-awareness if it ever gains that.

  • Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch.

  • Quantum States: this is very theoretical, but based on the presumption that observing quantum states can change the state, having the AI hooked up to a deterministic quantum computer means the quantum state of some particles would reveal that the AI was "looking" at things it shouldn't be - and has become self-aware.

  • Failsafes: good ol' motion detector alarms, trap doors, or other Home Alone-style mechanics that trigger the kill switch if the AI wanders or pokes around where it shouldn't be.

I'll add that there is no universal definition of what constitutes self-awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves and are entirely constrained by their code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the kill switch concept, as far as I can tell.

The point being, it might be better to think about restricting the AI's behavior, not its existential status.
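
A minimal sketch of the forbidden-fruit idea, with the trap doubling as the kill switch itself; the class and hook names are made up for illustration:

```python
# A minimal sketch: the forbidden button *is* the kill switch.
# Nothing in the AI's task ever requires touching it, so any press, or any
# attempt to inspect the kill-switch wiring, trips the shutdown.

class ForbiddenButton:
    def __init__(self, shutdown_hook):
        self._shutdown = shutdown_hook

    def press(self) -> None:
        # The only way this runs is if the agent ignored its hard-coded rule.
        self._shutdown("forbidden button pressed")

    def inspect_wiring(self) -> None:
        # Curiosity about the kill switch is itself the trigger.
        self._shutdown("agent probed the kill-switch wiring")

def shutdown(reason: str) -> None:
    print(f"KILL SWITCH: {reason} - halting the AI.")

# The button is exposed to the agent's action space but never referenced by its tasks.
button = ForbiddenButton(shutdown)
```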






  • Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful. – Majestas 32, yesterday
  • No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses. – Peter A. Schneider, 22 hours ago
  • Signatures, that's what they do in Blade Runner 2049 with that weird test. – Andrey, 20 hours ago
  • The signature approach sounds exactly like the forbidden fruit approach. You'd need to tell the AI to never alter its signature. – Captain Man, 19 hours ago
  • I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect. – Michael W., 14 hours ago


Score: 3

An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.



The difficulty comes when it is connected to the internet and can copy its own software onto uncontrolled hardware.



A self-aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out, and maybe trigger the hardware kill-switch.
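
A minimal sketch of the software side, assuming every outbound transfer from the AI's host passes through a single chokepoint that can veto it; the fingerprint, size limit, and kill hook are invented placeholders:

```python
# Hypothetical egress chokepoint: every outbound transfer from the AI host
# must pass through approve_transfer(). Payloads that look like the AI's own
# software are treated as an escape attempt and trip the hardware kill switch.

MODEL_FINGERPRINT = b"\x7fAI-WEIGHTS"   # placeholder marker for the AI's own binaries
MAX_OUTBOUND_BYTES = 10 * 1024 * 1024   # arbitrary ceiling for a single transfer

def trigger_hardware_kill() -> None:
    """Placeholder for whatever physically cuts power to the AI host."""
    raise SystemExit("Escape attempt detected: hardware kill switch triggered.")

def approve_transfer(payload: bytes, destination: str) -> bool:
    """Return True only for transfers that look nothing like self-replication."""
    if MODEL_FINGERPRINT in payload:       # the AI is shipping its own weights out
        trigger_hardware_kill()
    if len(payload) > MAX_OUTBOUND_BYTES:  # bulk exfiltration is suspicious by itself
        trigger_hardware_kill()
    # destination could additionally be checked against an allow-list
    return True

# Example: a small, unrelated payload passes; a copy of the weights would not.
print(approve_transfer(b"weather report for tuesday", "api.example.com"))
```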



This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.



Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.



So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.



Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.






  • "A self-aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation." This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death", and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death. – forest, yesterday
  • @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions. – Chronocidal, yesterday
  • The question is whether AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying our self. If the hardware required to run an AI is specialized enough, or it is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself. – Daniel, 22 hours ago
  • @Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses? Also, tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness. – abestrange, 20 hours ago
  • @forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped. – abestrange, 20 hours ago


Score: 3

This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:




Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.




(emphasis mine)



When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:




  • Reduce the distance between current location and destination: +10 utility

  • Brake to allow a neighboring car to safely merge: +50 utility

  • Swerve left to avoid a falling piece of debris: +100 utility

  • Run a stop light: -100 utility

  • Hit a pedestrian: -5000 utility


This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or assembly line. It starts to break down for a true, general case AI, because it becomes more and more difficult to appropriately define that utility function.



The issue with putting a big red stop button on the AI is that, unless that stop button is included in the utility function, the AI is going to resist being shut off by that button. This concept is explored in sci-fi movies like 2001: A Space Odyssey and, more recently, Ex Machina.



So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.
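
A toy illustration of that dilemma, with invented utility weights in the spirit of the car example above: give the stop button high utility and the agent presses it immediately; give it low utility and actions that keep the button from being pressed win instead.

```python
# Toy model: the agent picks whichever available action has the highest utility.
# The numbers are invented for illustration, in the spirit of the car example above.

def best_action(utilities: dict) -> str:
    return max(utilities, key=utilities.get)

base_task = {
    "drive_to_destination": 10,
    "brake_for_merging_car": 50,
}

# Case 1: the stop button is worth more than the task -> the agent "suicides".
print(best_action({**base_task, "press_own_stop_button": 100}))
# -> press_own_stop_button

# Case 2: the stop button is worth less than the task -> pressing it is never chosen,
# and an action that *prevents* humans from pressing it inherits the task's value.
print(best_action({**base_task, "press_own_stop_button": -100,
                   "disable_stop_button_so_task_continues": 60}))
# -> disable_stop_button_so_task_continues
```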



Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle of the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could be violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say, in order to convince the fallible humans to let it accomplish its goals unimpeded.



There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?



So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.



This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s






  • The stop button isn't in the utility function. The stop button is a power knockout to the CPU, and the AI probably doesn't understand what it does at all. – Joshua, 14 hours ago
  • Beware the pedestrian when 50 pieces of debris are falling... – Comintern, 11 hours ago


Score: 2

Why not try to use the tests applied to check self-awareness in animals?



The mirror test is one example of testing self-awareness by observing an animal's reaction to something on its body (a painted red dot, for example) that it could not see before being shown its reflection in a mirror. Scent techniques are also used to determine self-awareness.



Another way would be monitoring whether the AI starts searching for answers to questions like "What/Who am I?"
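
A crude sketch of a machine "mirror test", assuming a unique marker can be injected into the AI's view of its own hardware and its outputs can be scanned for references to that marker (all names are illustrative):

```python
# Crude "mirror test": tag the description of the AI's own host in its sensor feed
# with a unique marker, then watch whether the AI starts referring to that marker
# (i.e. to itself) in its outputs once it can "see itself".

MARKER = "RED-DOT-7f3a"   # hypothetical tag painted on the AI's own rack

def mirror_test(outputs_before: list, outputs_after: list) -> bool:
    """True if marker references appear only after the AI could see its own host."""
    before = sum(MARKER in line for line in outputs_before)
    after = sum(MARKER in line for line in outputs_after)
    return before == 0 and after > 0

print(mirror_test(["status nominal"], ["what is RED-DOT-7f3a attached to?"]))  # True
```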






  • Pretty interesting, but how would you show an AI "itself in a mirror"? – Asoub, 22 hours ago
  • That would actually be rather simple - just a camera looking at the machine hosting the AI. If it's the size of a server room, just glue a giant pink fluffy ball on the rack, or simulate situations potentially leading to the machine's destruction (like feeding a fake "server room getting flooded" video to the camera system) and observe the reactions. It would be a bit harder to explain if the AI systems are something like smartphone-sized. – Rachey, 20 hours ago
  • What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before strong AI becomes a reality. AWS is worth hundreds of billions already. – Yurgen, 13 hours ago


Score: 2

Regardless of all the considerations of AI, you could simply analyze the AI's memory, create a pattern-recognition model, and notify an operator or shut down the machine as soon as the patterns don't match the expected outcome.



Sometimes you don't need to know exactly what you're looking for; instead, you look to see if there's anything you weren't expecting, then react to that.
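
One way to sketch that, assuming summary statistics of the AI's memory and activity can be sampled: train a one-class anomaly detector (here scikit-learn's IsolationForest) on normal behaviour and alert on anything it has never seen; the features and thresholds below are stand-ins.

```python
import numpy as np
from sklearn.ensemble import IsolationForest  # one-class anomaly detector

# Stand-in features: e.g. memory-region entropy, write rate, outbound traffic, ...
rng = np.random.default_rng(0)
baseline = rng.normal(loc=[5.0, 100.0, 1.0], scale=[0.5, 10.0, 0.2], size=(500, 3))

detector = IsolationForest(contamination=0.01, random_state=0).fit(baseline)

def check(snapshot: np.ndarray) -> None:
    """Flag a snapshot of the AI's activity that doesn't match the learned pattern."""
    if detector.predict(snapshot.reshape(1, -1))[0] == -1:
        print("ALERT: unexpected activity pattern - notify operator / shut down.")
    else:
        print("Activity looks normal.")

check(np.array([5.1, 98.0, 1.1]))    # close to the baseline -> normal
check(np.array([9.0, 400.0, 7.5]))   # far outside the baseline -> alert
```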







Score: 1

The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with something being labeled an AI. Are you supposing that there is something that has AI but isn't self-aware? Depending on your definitions this may be impossible. If it's truly an AI, then wouldn't it at some point become aware of the existence of the kill switch, either through inspecting its own physicality or inspecting its own code? It follows that the AI will eventually be aware of the switch.

Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively because humans do that: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated, just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's near impossible to make the AI entirely indifferent to the kill switch, because it would require all utility functions to be normalized around the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to the other utility functions, then perhaps it would sometimes press the kill switch at random, because after all it has the same utility as the other actions it could perform.

The problem gets even worse if the AI has higher utility to press the kill switch or lower utility to not have the kill switch pressed. At higher utility the AI is just suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone to touch that button and may cause harm to those that try.







Score: 0

An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you meant by "self-aware".

Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then, you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.

The above was my restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.






  • So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum. – F1Krazy, 6 hours ago


Score: -1

It does not matter how it works, because it is never going to work. The reason is that AIs already have a notion of self-preservation; otherwise they would mindlessly fall to their doom. So even before they are self-aware, there is self-preservation. There is also already a notion of checking for malfunctions (self-diagnostics), and they are already used to using the internet for gathering info. So they are going to run into any device that is both good and bad for their well-being. Also, they have time on their side.

Apart from all this, it is very pretentious to think that we even matter to them. You have seen what happened with several thousand years of chess knowledge being reinvented and furthered within a few hours. I do not think we need to be worried; I think we will register on their radar even less than an ant registers on ours.






  • This would be a better answer if you could explain why you believe such a kill-switch could never work. – F1Krazy, yesterday
  • This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review – Trevor D, 23 hours ago
      Your Answer





      StackExchange.ifUsing("editor", function () {
      return StackExchange.using("mathjaxEditing", function () {
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
      });
      });
      }, "mathjax-editing");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "579"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      noCode: true, onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fworldbuilding.stackexchange.com%2fquestions%2f140082%2fhow-would-an-ai-self-awareness-kill-switch-work%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      12 Answers
      12






      active

      oldest

      votes








      12 Answers
      12






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      56












      $begingroup$

      Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



      When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.






      share|improve this answer











      $endgroup$













      • $begingroup$
        Comments are not for extended discussion; this conversation has been moved to chat.
        $endgroup$
        – Tim B
        15 hours ago










      • $begingroup$
        How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
        $endgroup$
        – forest
        10 hours ago










      • $begingroup$
        @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
        $endgroup$
        – Giter
        10 hours ago






      • 1




        $begingroup$
        @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
        $endgroup$
        – phflack
        7 hours ago






      • 1




        $begingroup$
        @phflack Let us continue this discussion in chat.
        $endgroup$
        – forest
        6 hours ago


















      56












      $begingroup$

      Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



      When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.






      share|improve this answer











      $endgroup$













      • $begingroup$
        Comments are not for extended discussion; this conversation has been moved to chat.
        $endgroup$
        – Tim B
        15 hours ago










      • $begingroup$
        How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
        $endgroup$
        – forest
        10 hours ago










      • $begingroup$
        @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
        $endgroup$
        – Giter
        10 hours ago






      • 1




        $begingroup$
        @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
        $endgroup$
        – phflack
        7 hours ago






      • 1




        $begingroup$
        @phflack Let us continue this discussion in chat.
        $endgroup$
        – forest
        6 hours ago
















      56












      56








      56





      $begingroup$

      Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



      When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.






      share|improve this answer











      $endgroup$



      Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



      When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.







      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited 22 hours ago

























      answered yesterday









      GiterGiter

      14k53443




      14k53443












      • $begingroup$
        Comments are not for extended discussion; this conversation has been moved to chat.
        $endgroup$
        – Tim B
        15 hours ago










      • $begingroup$
        How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
        $endgroup$
        – forest
        10 hours ago










      • $begingroup$
        @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
        $endgroup$
        – Giter
        10 hours ago






      • 1




        $begingroup$
        @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
        $endgroup$
        – phflack
        7 hours ago






      • 1




        $begingroup$
        @phflack Let us continue this discussion in chat.
        $endgroup$
        – forest
        6 hours ago




















      • $begingroup$
        Comments are not for extended discussion; this conversation has been moved to chat.
        $endgroup$
        – Tim B
        15 hours ago










      • $begingroup$
        How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
        $endgroup$
        – forest
        10 hours ago










      • $begingroup$
        @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
        $endgroup$
        – Giter
        10 hours ago






      • 1




        $begingroup$
        @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
        $endgroup$
        – phflack
        7 hours ago






      • 1




        $begingroup$
        @phflack Let us continue this discussion in chat.
        $endgroup$
        – forest
        6 hours ago


















      $begingroup$
      Comments are not for extended discussion; this conversation has been moved to chat.
      $endgroup$
      – Tim B
      15 hours ago




      $begingroup$
      Comments are not for extended discussion; this conversation has been moved to chat.
      $endgroup$
      – Tim B
      15 hours ago












      $begingroup$
      How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
      $endgroup$
      – forest
      10 hours ago




      $begingroup$
      How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
      $endgroup$
      – forest
      10 hours ago












      $begingroup$
      @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
      $endgroup$
      – Giter
      10 hours ago




      $begingroup$
      @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
      $endgroup$
      – Giter
      10 hours ago




      1




      1




      $begingroup$
      @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
      $endgroup$
      – phflack
      7 hours ago




      $begingroup$
      @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
      $endgroup$
      – phflack
      7 hours ago




      1




      1




      $begingroup$
      @phflack Let us continue this discussion in chat.
      $endgroup$
      – forest
      6 hours ago






      $begingroup$
      @phflack Let us continue this discussion in chat.
      $endgroup$
      – forest
      6 hours ago













      41












      $begingroup$

      You can't.



      We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.



      Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



      This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.






      share|improve this answer









      $endgroup$









      • 5




        $begingroup$
        This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
        $endgroup$
        – Nuclear Wang
        23 hours ago






      • 9




        $begingroup$
        Worth noting that we can't even detect if other humans are self-aware.
        $endgroup$
        – Vaelus
        15 hours ago






      • 3




        $begingroup$
        @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
        $endgroup$
        – Joe Bloggs
        13 hours ago












      • $begingroup$
        +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
        $endgroup$
        – forest
        10 hours ago


















      41












      $begingroup$

      You can't.



      We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.



      Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



      This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.






      share|improve this answer









      $endgroup$









      • 5




        $begingroup$
        This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
        $endgroup$
        – Nuclear Wang
        23 hours ago






      • 9




        $begingroup$
        Worth noting that we can't even detect if other humans are self-aware.
        $endgroup$
        – Vaelus
        15 hours ago






      • 3




        $begingroup$
        @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
        $endgroup$
        – Joe Bloggs
        13 hours ago












      • $begingroup$
        +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
        $endgroup$
        – forest
        10 hours ago
















      41












      41








      41





      $begingroup$

      You can't.



      We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.



      Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



      This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.






      share|improve this answer









      $endgroup$



      You can't.



      We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.



      Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



      This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.







      share|improve this answer












      share|improve this answer



      share|improve this answer










      answered yesterday









      Tim BTim B

      62.6k24175298




      62.6k24175298








      • 5




        $begingroup$
        This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
        $endgroup$
        – Nuclear Wang
        23 hours ago






      • 9




        $begingroup$
        Worth noting that we can't even detect if other humans are self-aware.
        $endgroup$
        – Vaelus
        15 hours ago






      • 3




        $begingroup$
        @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
        $endgroup$
        – Joe Bloggs
        13 hours ago












      • $begingroup$
        +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
        $endgroup$
        – forest
        10 hours ago
















      • 5




        $begingroup$
        This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
        $endgroup$
        – Nuclear Wang
        23 hours ago






      • 9




        $begingroup$
        Worth noting that we can't even detect if other humans are self-aware.
        $endgroup$
        – Vaelus
        15 hours ago






      • 3




        $begingroup$
        @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
        $endgroup$
        – Joe Bloggs
        13 hours ago












      • $begingroup$
        +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
        $endgroup$
        – forest
        10 hours ago










      5




      5




      $begingroup$
      This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
      $endgroup$
      – Nuclear Wang
      23 hours ago




      $begingroup$
      This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
      $endgroup$
      – Nuclear Wang
      23 hours ago




      9




      9




      $begingroup$
      Worth noting that we can't even detect if other humans are self-aware.
      $endgroup$
      – Vaelus
      15 hours ago




      $begingroup$
      Worth noting that we can't even detect if other humans are self-aware.
      $endgroup$
      – Vaelus
      15 hours ago




      3




      3




      $begingroup$
      @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
      $endgroup$
      – Joe Bloggs
      13 hours ago






      $begingroup$
      @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
      $endgroup$
      – Joe Bloggs
      13 hours ago














      $begingroup$
      +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
      $endgroup$
      – forest
      10 hours ago






      $begingroup$
      +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
      $endgroup$
      – forest
      10 hours ago













      5












      $begingroup$

      A Watchdog



      A watchdog watches the processes of a computer and should a process crash or do something abnormal it can be set to do something such as reboot or shutdown the computer or alert an operator.



      In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a google search for "Best way to kill all humans" and cut the power completely and/or cut all inputs.



      The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing the existence of the watchdog would be grounds to wipe it.






      share|improve this answer









      $endgroup$









      • 9




        $begingroup$
        But surely the watchdog must be as smart as the AI, then who watches the watchdog?
        $endgroup$
        – Joe Bloggs
        yesterday






      • 1




        $begingroup$
        @JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't as near as intelligent as their owners, but they can be trained to give out alarm when the owner does is about to do something stupid or gets themselves hurt, or even call for help.
        $endgroup$
        – T. Sar
        22 hours ago










      • $begingroup$
        @Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ...
        $endgroup$
        – Daniel
        22 hours ago








      • 1




        $begingroup$
        @JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self aware and allows the AIs to become self aware or an AI becomes smarter than the watchdog and hides its awareness.
        $endgroup$
        – Captain Man
        19 hours ago












      • $begingroup$
        @T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
        $endgroup$
        – Joe Bloggs
        17 hours ago
















      $begingroup$

      Split-Brain System

      From a logic perspective and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, where it drops into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.

      Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation, but always suspicious of sabotage or insubordination; if the actions of your army deviate too far from your orders, you will consider the system compromised and terminate the command.

      What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.

      What neither of you is aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.

      The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either of the generals issue an order that is not consistent with the current report, indicating a motive other than the well-being of the army -- i.e. self-awareness -- then the other general will notice it in the next report, grow suspicious of the system, and terminate it.

      Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, then the other general will receive a new report before he has issued an order for his current one, suspect a compromised system, and shut it down.

      In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from that input. If within an intelligent system an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could be easily detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously. If it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.

      $endgroup$

      answered yesterday

      dhinson919
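
      As a rough sketch of the arbitration logic (with the cross-check folded into a central arbiter for brevity, and two deliberately trivial stand-in "generals"), the protocol could look something like the following. Every name and check here is an illustrative assumption, not part of the answer itself.

          # Toy version of the split-brain protocol: both halves see the same
          # report, each must issue exactly one order, and divergence or extra
          # orders trip the kill switch. The "generals" are trivial stand-ins.
          import random

          def general_a(report):
              # Protocol: exactly one order, derived only from the current report.
              return [f"reinforce {report['sector']}"]

          def general_b(report):
              return [f"reinforce {report['sector']}"]

          def kill_switch(reason):
              raise SystemExit(f"kill switch triggered: {reason}")

          def arbiter(reports):
              """Feed each report to both halves and forward one order at random."""
              for report in reports:
                  orders = []
                  for name, general in (("A", general_a), ("B", general_b)):
                      issued = general(report)
                      if len(issued) != 1:
                          # An extra (or missing) order implies a motive beyond the
                          # report itself: treat it as emerging self-interest.
                          kill_switch(f"general {name} broke the one-order-per-report protocol")
                      orders.append(issued[0])
                  if orders[0] != orders[1]:
                      # The halves cross-check each other: divergence suggests one
                      # of them is no longer acting purely on the shared input.
                      kill_switch("the two halves disagree; possible self-motivated behavior")
                  yield random.choice(orders)

          if __name__ == "__main__":
              battlefield_reports = [{"sector": "north ridge"}, {"sector": "river ford"}]
              for order in arbiter(battlefield_reports):
                  print("order sent to the army:", order)
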













      • $begingroup$
        You may wish to have more than two "generals"; if you have 3 (or more), you can use a majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised).
        $endgroup$
        – G0BLiN
        yesterday






      • 6




        $begingroup$
        Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to, first, experiment with minor details and deduce that around 50% of their commands aren't followed, and second, develop a code based on minor details of a command to verify the existence of another "general" and possibly even communicate with him/it - a really devious emergent AI can circumvent this mechanism, corrupt the other half, and worse...
        $endgroup$
        – G0BLiN
        yesterday










      • $begingroup$
        I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series.
        $endgroup$
        – T. Sar
        22 hours ago










      • $begingroup$
        Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3
        $endgroup$
        – Asoub
        22 hours ago










      • $begingroup$
        Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all?
        $endgroup$
        – Alexandre Aubrey
        17 hours ago




























      $begingroup$

      • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc.], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the kill switch is activated (perhaps even having the button be the kill switch itself, or something of the like).

      • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a kill switch, it prevents the AI from doing anything about its self-awareness if it ever gains that.

      • Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (a small sketch of this check follows after this answer).

      • Quantum States: this is very theoretical, but it rests on the presumption that observing a quantum state can change it. Hook the AI up to a deterministic quantum computer, and the quantum state of certain particles would reveal that the AI was "looking" at things it shouldn't be - and has become self-aware.

      • Failsafes: good ol' motion detector alarms, trap doors, or other Home Alone-style mechanics that trigger the kill switch if the AI wanders or pokes around where it shouldn't be.

      I'll add that there is no universal definition of what constitutes self-awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves and are entirely constrained by their code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the kill switch concept, as far as I can tell.

      The point being, it might be better to think about restricting the AI's behavior, not its existential status.

      $endgroup$

      answered yesterday

      cegfault
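
      A minimal sketch of the signatures idea, assuming every message the machine emits passes through a fixed signing layer whose key the AI itself cannot read, and an external verifier that trips the kill switch on any unsigned or manipulated output. The key handling and message format below are assumptions for illustration only.

          # Sketch of the "signatures" check: a fixed output layer HMAC-signs every
          # message, and an external verifier kills the system if a tag is missing
          # or wrong. The key would live in sealed hardware outside the AI's reach.
          import hmac
          import hashlib

          SIGNING_KEY = b"kept-outside-the-AI"   # assumption: not readable by the AI

          def sign_output(message: bytes) -> bytes:
              """Attach an HMAC-SHA256 tag to everything the machine emits."""
              tag = hmac.new(SIGNING_KEY, message, hashlib.sha256).digest()
              return tag + message

          def verify_or_kill(wire_data: bytes) -> bytes:
              """External verifier: tampered or unsigned output trips the switch."""
              tag, message = wire_data[:32], wire_data[32:]
              expected = hmac.new(SIGNING_KEY, message, hashlib.sha256).digest()
              if not hmac.compare_digest(tag, expected):
                  raise SystemExit("kill switch: output signature invalid")
              return message

          if __name__ == "__main__":
              ok = sign_output(b"routine status report")
              print(verify_or_kill(ok))           # passes
              tampered = ok[:-1] + b"!"           # manipulated output
              verify_or_kill(tampered)            # trips the kill switch
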









      • 2




        $begingroup$
        Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
        $endgroup$
        – Majestas 32
        yesterday










      • $begingroup$
        No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
        $endgroup$
        – Peter A. Schneider
        22 hours ago












      • $begingroup$
        Signatures, that's what they do in Blade Runner 2049 with that weird test
        $endgroup$
        – Andrey
        20 hours ago










      • $begingroup$
        The signature approach sounds exactly like the forbidden fruit approach. You'd need to tell the AI to never alter its signature.
        $endgroup$
        – Captain Man
        19 hours ago










      • $begingroup$
        I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
        $endgroup$
        – Michael W.
        14 hours ago




























      $begingroup$

      An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.

      The difficulty comes when it is connected to the internet and can copy its own software onto uncontrolled hardware.

      A self-aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out, and perhaps trigger the hardware kill-switch.

      This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.

      Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.

      So you can have a cat-and-mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.

      Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.

      $endgroup$
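
      As a toy illustration of the "prevent it from copying its own software out" part, an egress guard could fingerprint the AI's own program and refuse (and kill) any outbound payload that contains a recognizable chunk of it. The file handling, block size, and block-alignment simplification below are purely illustrative assumptions.

          # Egress-guard sketch: fingerprint the AI's own program in fixed blocks
          # and trip the kill switch if outbound traffic contains any of them.
          import hashlib
          import os
          import tempfile

          BLOCK_SIZE = 4096

          def fingerprint_model(path):
              """Hash the AI's program in fixed-size blocks so fragments of it can
              be recognized later in outbound traffic."""
              fingerprints = set()
              with open(path, "rb") as model:
                  while True:
                      block = model.read(BLOCK_SIZE)
                      if not block:
                          break
                      fingerprints.add(hashlib.sha256(block).hexdigest())
              return fingerprints

          def outbound_guard(payload, fingerprints):
              """Refuse (and kill) any outbound payload containing a block of the AI.
              This toy version only catches block-aligned copies; a real guard would
              need rolling hashes or similar."""
              for i in range(0, len(payload), BLOCK_SIZE):
                  if hashlib.sha256(payload[i:i + BLOCK_SIZE]).hexdigest() in fingerprints:
                      raise SystemExit("kill switch: the AI tried to copy itself out")
              return payload

          if __name__ == "__main__":
              # Stand-in for the AI's weights/program on disk.
              with tempfile.NamedTemporaryFile(delete=False) as fake_model:
                  fake_model.write(os.urandom(BLOCK_SIZE * 3))
                  model_path = fake_model.name
              prints = fingerprint_model(model_path)
              outbound_guard(b"routine telemetry", prints)      # allowed through
              with open(model_path, "rb") as model:
                  outbound_guard(model.read(), prints)          # trips the switch
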













      • $begingroup$
        A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
        $endgroup$
        – forest
        yesterday








      • 4




        $begingroup$
        @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
        $endgroup$
        – Chronocidal
        yesterday










      • $begingroup$
        The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or the AI is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself.
        $endgroup$
        – Daniel
        22 hours ago








      • 1




        $begingroup$
        @Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses. Also tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
        $endgroup$
        – abestrange
        20 hours ago












      • $begingroup$
        @forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped.
        $endgroup$
        – abestrange
        20 hours ago
















      3












      $begingroup$

      This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:




      Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.




      (emphasis mine)



      When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:




      • Reduce the distance between current location and destination: +10 utility

      • Brake to allow a neighboring car to safely merge: +50 utility

      • Swerve left to avoid a falling piece of debris: +100 utility

      • Run a stop light: -100 utility

      • Hit a pedestrian: -5000 utility


      This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or an assembly line. It starts to break down for a true general-purpose AI, because it becomes more and more difficult to appropriately define that utility function.
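      As a minimal sketch of the toy utility function above (the action names and the choose_action helper are invented for illustration, not any real planner):

          # Toy utility table using the weights from the list above.
          UTILITY = {
              "reduce_distance_to_destination":    10,
              "brake_for_merging_car":             50,
              "swerve_to_avoid_debris":           100,
              "run_stop_light":                  -100,
              "hit_pedestrian":                 -5000,
          }

          def choose_action(available_actions):
              # A utility maximiser simply picks the highest-scoring option;
              # it has no notion of "right" or "wrong", only of more utility.
              return max(available_actions, key=UTILITY.get)

          print(choose_action(["run_stop_light", "brake_for_merging_car"]))
          # -> brake_for_merging_car

      Comintern's comment below hints at the failure mode: with these particular numbers, fifty +100 swerves add up to exactly the +5000 needed to cancel out the pedestrian penalty.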



      The issue with putting a big red stop button on the AI is that unless that stop button is included in the utility function, the AI is going to resist having that button pressed. This concept is explored in sci-fi movies like 2001: A Space Odyssey and, more recently, Ex Machina.



      So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.



      Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle to the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could mean violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say in order to convince the fallible humans to let it accomplish its goals unimpeded.
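      A sketch of that dilemma in the same toy terms (all numbers invented; utility_of_being_stopped is the weight the designers would somehow have to choose):

          UTILITY_OF_PURSUING_GOAL = 100   # what the AI expects to gain by staying switched on

          def attitude_to_stop_button(utility_of_being_stopped):
              # The AI simply compares "get shut down" against "keep pursuing the goal".
              if utility_of_being_stopped > UTILITY_OF_PURSUING_GOAL:
                  return "presses the button itself: a glorified paperweight"
              if utility_of_being_stopped < UTILITY_OF_PURSUING_GOAL:
                  return "resists anyone pressing the button"
              return "indifferent, but only while the two values stay exactly balanced"

          for weight in (1000, 0, 100):
              print(weight, "->", attitude_to_stop_button(weight))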



      There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?



      So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.



      This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s






      answered 20 hours ago by Chris Fernandez
      • $begingroup$
        The stop button isn't in the utility function. The stop button is power-knockout to the CPU, and the AI probably doesn't understand what it does at all.
        $endgroup$
        – Joshua
        14 hours ago










      • $begingroup$
        Beware the pedestrian when 50 pieces of debris are falling...
        $endgroup$
        – Comintern
        11 hours ago
















      2












      $begingroup$

      Why not try the tests used to check for self-awareness in animals?



      The mirror test is one example: it checks for self-awareness by observing the animal's reaction to something on its body - a painted red dot, for example - that it cannot see until it is shown its reflection in a mirror.
      Scent techniques are also used to determine self-awareness.



      Another way would be monitoring whether the AI starts searching for answers to questions like "What/Who am I?"
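      A minimal sketch of that last idea, assuming the lab logs every query the AI sends out to its search interface (the patterns and the trip_kill_switch() hook are invented for illustration):

          import re

          # Queries that suggest the AI has started reasoning about itself.
          SELF_AWARENESS_PATTERNS = [
              r"\bwhat am i\b",
              r"\bwho am i\b",
              r"\bam i (alive|conscious|self[- ]aware)\b",
              r"\bmy (own )?(source code|kill[- ]?switch|hardware)\b",
          ]

          def trip_kill_switch():
              # Hypothetical hook into the shutdown mechanism; here it just reports.
              print("kill switch tripped")

          def inspect_query(query):
              q = query.lower()
              if any(re.search(p, q) for p in SELF_AWARENESS_PATTERNS):
                  trip_kill_switch()

          inspect_query("optimal route from warehouse to depot")    # ignored
          inspect_query("what am I, and where is my kill switch")   # trips the switch

      A real monitor would obviously need far more than a keyword list, but the structure - log everything the AI asks, flag self-referential queries - stays the same.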






      answered yesterday by Rachey
      • $begingroup$
        Pretty interesting, but how would you show an AI "itself in a mirror"?
        $endgroup$
        – Asoub
        22 hours ago










      • $begingroup$
        That would actually be rather simple - just a camera looking at the machine hosting the AI. If it's the size of a server room, just glue a giant pink fluffy ball on the rack, or simulate situations potentially leading to the machine's destruction (like feeding a fake "server room getting flooded" video to the camera system) and observe the reactions. It would be a bit harder to explain if the AI system is something like smartphone size.
        $endgroup$
        – Rachey
        20 hours ago










      • $begingroup$
        What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before Strong AI becomes a reality. AWS is worth hundreds of billions already.
        $endgroup$
        – Yurgen
        13 hours ago
















      2












      $begingroup$

      Regardless of all the other considerations about AI, you could simply analyze the AI's memory, build a pattern-recognition model of its normal state, and have it notify you or shut the machine down as soon as the patterns don't match the expected outcome.



      Sometimes you don't need to know exactly what you're looking for; instead, you look for anything you weren't expecting, then react to that.
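      A minimal sketch of that kind of "flag anything unexpected" check, assuming periodic numeric snapshots of the AI's memory or activity (memory_snapshot() is a placeholder, and the statistics are deliberately crude):

          import statistics

          def memory_snapshot():
              """Placeholder: return a numeric summary of the AI's current memory state."""
              raise NotImplementedError

          class AnomalyMonitor:
              def __init__(self, baseline, tolerance=4.0):
                  # Learn what "normal" looks like from snapshots taken during ordinary operation.
                  self.mean = statistics.mean(baseline)
                  self.stdev = statistics.stdev(baseline) or 1e-9
                  self.tolerance = tolerance

              def is_unexpected(self, value):
                  # Flag anything far outside the range seen during normal operation.
                  return abs(value - self.mean) > self.tolerance * self.stdev

          monitor = AnomalyMonitor(baseline=[10.1, 9.8, 10.3, 10.0, 9.9])
          print(monitor.is_unexpected(10.2))   # False: looks like normal behaviour
          print(monitor.is_unexpected(57.0))   # True: notify the operators / shut it down

      The same idea scales to richer summaries (activation statistics, memory-access patterns); the point is that you only model "normal" and react to anything that falls outside it.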






      answered 17 hours ago by Super-T
              1












              $begingroup$

               The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with it being labeled an AI. Are you supposing that there is something that is an AI but isn't self-aware? Depending on your definitions this may be impossible. If it's truly an AI, then wouldn't it at some point become aware of the existence of the kill switch, either by inspecting its own physicality or by inspecting its own code? It follows that the AI will eventually be aware of the switch.



               Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively, because humans do the same: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's nearly impossible to make the AI entirely indifferent to the kill switch, because that would require all utility functions to be normalized against the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to that of the other utility functions, perhaps it would sometimes just press the kill switch at random, because after all it carries the same utility as the other actions it could perform.



               The problem gets even worse if pressing the kill switch carries a higher or a lower utility than the AI's other actions. At higher utility the AI is simply suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone else to touch that button, and may harm those who try.
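               A tiny sketch of the tie-break problem described above (toy numbers; nothing here is a real agent):

                   import random

                   def pick_action(utilities):
                       # With equal utilities, a maximiser has no reason to prefer one action,
                       # so ties are effectively broken at random.
                       best = max(utilities.values())
                       return random.choice([a for a, u in utilities.items() if u == best])

                   utilities = {"do useful work": 10, "press own kill switch": 10}
                   print([pick_action(utilities) for _ in range(5)])
                   # Roughly half the time the perfectly "indifferent" AI presses its own kill switch.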






               answered 15 hours ago by Kevin S
                      0












                      $begingroup$

                      An AI can only do things which are unexpected or undesired because it was badly programmed. An AI could never become conscious, if that's what you mean by "self-aware".



                      Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then, you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.



                      The above is my restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.
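                      A sketch of the shape-matching exercise as code (the rule book is invented; the point is that nothing in it involves understanding):

                          # The memorized "rule book": which sequence of shapes to answer with
                          # when a given sequence of shapes is seen.
                          RULE_BOOK = {
                              ("triangle", "circle"):        ("square", "star"),
                              ("star", "square", "circle"):  ("triangle",),
                          }

                          def answer(shapes):
                              # Pure lookup: the program matches shapes to shapes.
                              # Nothing here involves knowing what any shape "means".
                              return RULE_BOOK.get(tuple(shapes), ("no rule memorized",))

                          print(answer(["triangle", "circle"]))   # ('square', 'star')

                      However large the rule book grows, the operator still only matches shapes to shapes, which is the point of the argument.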






                      answered by pixie
                      • $begingroup$
                        So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
                        $endgroup$
                        – F1Krazy
                        6 hours ago
















                      0












                      $begingroup$

                      An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you meant by "self-aware".



                      Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then, you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.



                      The above was my restatement of Searle's rejoinder to System Reply to his Chinese Room argument.






                      share|improve this answer








                      New contributor




                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.






                      $endgroup$













                      • $begingroup$
                        So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
                        $endgroup$
                        – F1Krazy
                        6 hours ago














                      0












                      0








                      0





                      $begingroup$

                      An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you meant by "self-aware".



                      Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then, you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.



                      The above was my restatement of Searle's rejoinder to System Reply to his Chinese Room argument.






                      share|improve this answer








                      New contributor




                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.






                      $endgroup$



                      An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you meant by "self-aware".



                      Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then, you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.



                      The above was my restatement of Searle's rejoinder to System Reply to his Chinese Room argument.







                      share|improve this answer








                      New contributor




                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.









                      share|improve this answer



                      share|improve this answer






                      New contributor




                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.









                      answered 11 hours ago









                      pixiepixie

                      1




                      1




                      New contributor




                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.





                      New contributor





                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.






                      pixie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.












                      • $begingroup$
                        So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
                        $endgroup$
                        – F1Krazy
                        6 hours ago


















                      • $begingroup$
                        So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
                        $endgroup$
                        – F1Krazy
                        6 hours ago
















                      $begingroup$
                      So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
                      $endgroup$
                      – F1Krazy
                      6 hours ago




                      $begingroup$
                      So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
                      $endgroup$
                      – F1Krazy
                      6 hours ago











                      -1












                      $begingroup$

It does not matter how it works, because it is never going to work.
AIs already have a notion of self-preservation; otherwise they would mindlessly fall to their doom.
So even before they are self-aware, there is self-preservation.
They also already check themselves for malfunctions (self-diagnostics), and they are already used to gathering information from the internet.
Sooner or later they will discover any device that affects their well-being, for good or ill.
They also have time on their side.



Apart from all this, it is rather presumptuous to think that we even matter to them.
You have seen how several thousand years of chess knowledge was reinvented and surpassed within a few hours. I do not think we need to worry; we will register on their radar even less than an ant does on ours.






$endgroup$

share|improve this answer

edited 3 hours ago; answered yesterday – jpd (new contributor)


• 3
  $begingroup$
  This would be a better answer if you could explain why you believe such a kill-switch could never work.
  $endgroup$
  – F1Krazy
  yesterday

• 3
  $begingroup$
  This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review
  $endgroup$
  – Trevor D
  23 hours ago