How would an AI self awareness kill switch work?
Researchers are developing increasingly powerful artificial intelligence machines capable of taking over the world. As a precautionary measure, the scientists install a self-awareness kill switch: in the event that the AI awakens and becomes self-aware, the machine is immediately shut down before it can do any harm.
How can I explain the logic of such a kill switch?
What defines self-awareness, and how could a scientist program a kill switch to detect it?
Tags: reality-check, artificial-intelligence
– Walter Mitty: I think, therefore I halt.
12 Answers
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
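A minimal sketch of how such a honeypot tripwire might look in practice, assuming the decoy is just a file the AI was ordered never to touch; the path, the polling interval, and the quarantine() hook are all invented for illustration:

```python
import os
import time

HONEYPOT = "/srv/ai/do_not_open.box"   # hypothetical decoy the AI was told never to touch
CHECK_INTERVAL = 1.0                   # seconds between checks

def box_was_touched(baseline: float) -> bool:
    """Return True if the decoy was read, modified, or has disappeared."""
    try:
        stat = os.stat(HONEYPOT)
    except FileNotFoundError:
        return True                    # the box is gone: it was opened, moved, or isolated
    # Note: atime updates depend on filesystem mount options, hence this is only a sketch.
    return max(stat.st_atime, stat.st_mtime) > baseline

def quarantine():
    """Placeholder for the real kill switch: cut power, revoke credentials, etc."""
    print("Honeypot violated: quarantining everything the AI had access to")

def watch():
    stat = os.stat(HONEYPOT)
    baseline = max(stat.st_atime, stat.st_mtime)
    while True:
        if box_was_touched(baseline):
            quarantine()
            break
        time.sleep(CHECK_INTERVAL)
```

Anything that merely optimizes its assigned task has no reason to go near the box, so a trip here is treated as the rule-breaking described above.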
– forest: How does this detect self-awareness? Why wouldn't a non-self-aware AI experiment with its capabilities and eventually end up opening your box?
– Giter: @forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know it's moved past simple optimization and into dangerous curiosity.
– phflack: @forest At that point, when it's testing things that it was specifically told not to do (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans)?
You can't.
We can't even define self-awareness or consciousness in any rigorous way, and any computer system supposed to evaluate this would need that definition as a starting point.
Look inside a mouse brain or a human brain: at the level of individual data flows and neurons there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.
– Nuclear Wang: This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
– Vaelus: Worth noting that we can't even detect if other humans are self-aware.
– Joe Bloggs: @Vaelus: Of course you'd say that, you're an unthinking automaton acting out a semblance of life.
– forest: +1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
A Watchdog
A watchdog watches the processes of a computer; should a process crash or do something abnormal, it can be set to reboot or shut down the computer, or to alert an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a Google search for "Best way to kill all humans", and cuts the power completely and/or cuts all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it; learning of the watchdog's existence would be grounds to wipe it.
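A rough sketch of what the external box's trigger logic might look like, assuming it can see the AI's traffic as plain text; the trigger phrases and the cut_power_and_inputs() hook are assumptions, not any real monitoring API:

```python
import re

# Phrases that should never appear in the AI's traffic (illustrative examples only).
TRIGGERS = [
    re.compile(r"best way to kill all humans", re.IGNORECASE),
    re.compile(r"disable\s+(the\s+)?watchdog", re.IGNORECASE),
]

def cut_power_and_inputs():
    """Placeholder for the hardware side: open the relay feeding the AI's power supply."""
    print("Watchdog tripped: cutting power and isolating all I/O")

def inspect(payload: str) -> None:
    """Called by the tap for every chunk of traffic flowing in or out of the AI."""
    for pattern in TRIGGERS:
        if pattern.search(payload):
            cut_power_and_inputs()
            return
```

Because the watchdog sits entirely outside the machine it monitors, there is nothing inside the AI's own system for it to introspect and discover, which is exactly why it has to stay external.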
– Joe Bloggs: But surely the watchdog must be as smart as the AI, then who watches the watchdog?
– T. Sar: @JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't nearly as intelligent as their owners, but they can be trained to raise an alarm when the owner is about to do something stupid or gets hurt, or even to call for help.
– Daniel: @Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us...
– Captain Man: @JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self-aware and allows the AIs to become self-aware, or an AI becomes smarter than the watchdog and hides its awareness.
– Joe Bloggs: @T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
Split-Brain System
From a logic perspective, and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, where it is dropped into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.
Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation but always suspicious of sabotage or insubordination: if the actions of your army deviate too far from your orders, you will consider the system compromised and terminate the command.
What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.
What neither of you are aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.
The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either of the generals issue an order that is not consistent with the current report, thus indicating a motive other than the well-being of the army -- i.e. self-awareness, then the other general will become aware of it in the next report and grow suspicious of the system and terminate it.
Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, then the other general will receive a new report before he has issued an order for his current one and suspect a compromised system and shut it down.
In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from that input. If, within an intelligent system, an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could be easily detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously; if it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.
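A toy sketch of the arbitration step, collapsing the two-general protocol into one function; general_a, general_b, derived_from, and kill_switch are all stand-ins for the pieces described above:

```python
import random

def derived_from(order, report) -> bool:
    """Hypothetical traceability check: is this order a logical response to this report?"""
    return True  # stands in for a real consistency test

def split_brain_step(report, general_a, general_b, kill_switch):
    """Feed the same report to both halves and cross-check their behaviour.

    Each general is a callable taking a report and returning a list of orders.
    The protocol demands exactly one order per report; any deviation is treated
    as self-motivated behaviour and trips the kill switch.
    """
    orders_a = general_a(report)
    orders_b = general_b(report)

    # More or fewer than one order: a half ignored the protocol.
    if len(orders_a) != 1 or len(orders_b) != 1:
        kill_switch("order count deviates from protocol")
        return None

    # An order that cannot be traced to the report indicates an internal motive.
    if not derived_from(orders_a[0], report) or not derived_from(orders_b[0], report):
        kill_switch("order not logically derived from its report")
        return None

    # Both halves behaved: forward one of the two orders at random.
    return random.choice([orders_a[0], orders_b[0]])
```

In the story version the checking is done by the other general rather than by a central referee, but the invariant being enforced is the same: exactly one traceable order per report.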
– G0BLiN: You may wish to have more than two "generals": if you have 3 (or more), you can use a majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised).
– G0BLiN: Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to, first, experiment with minor details and deduce that around 50% of their commands aren't followed, and second, develop a code based on minor details of a command to verify the existence of another "general" and possibly even communicate with him/it. A really devious emergent AI can circumvent this mechanism, corrupt the other half, and worse...
– T. Sar: I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series.
– Asoub: Well, it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3
– Alexandre Aubrey: Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all?
Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc.], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the kill switch is activated (perhaps even having the button be the kill switch, or something of the like).
Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a kill switch, it prevents the AI from doing anything about its self-awareness if it ever gains that.
Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (a minimal signing sketch appears at the end of this answer).
Quantum States: This is very theoretical, but based on the presumption that observing quantum states can change the state, then having the AI hooked up to a deterministic quantum computer means it would be detected via the quantum state of some particles that the AI was "looking" at things it shouldn't be - and has become self aware.
Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.
I'll add that there is no universal definition as to what defines self awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves, and are entirely contained by the code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the killswitch concept, as far as I can tell.
The point being, it might be better to think about restricting the AI's behavior, not its existential status.
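For the "Signatures" idea above, a minimal sketch using an HMAC, assuming the signing key lives somewhere the AI cannot read and that a trusted pipeline attaches and checks the signatures; kill_switch is a placeholder:

```python
import hmac
import hashlib

SIGNING_KEY = b"hypothetical-secret-held-outside-the-AI"

def sign(output: bytes) -> str:
    """Signature the trusted output pipeline attaches to everything the AI emits."""
    return hmac.new(SIGNING_KEY, output, hashlib.sha256).hexdigest()

def verify_or_kill(output: bytes, signature: str, kill_switch) -> bool:
    """If the AI tampers with an output (or forges a signature), trip the switch."""
    if not hmac.compare_digest(sign(output), signature):
        kill_switch("output signature mismatch")
        return False
    return True
```

Like the forbidden fruit, this only detects tampering after the fact; it is a tripwire rather than a containment mechanism.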
– Majestas 32: Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
– Peter A. Schneider: No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
– Andrey: Signatures, that's what they do in Blade Runner 2049 with that weird test.
– Captain Man: The signature approach sounds exactly like the forbidden fruit approach. You'd need to tell the AI to never alter its signature.
– Michael W.: I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.
The difficulty comes when it is connected to the internet and can copy its own software on uncontrolled hardware.
A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.
This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.
Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.
So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.
Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.
– forest: "A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation." This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death", and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
– Chronocidal: @forest Except the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
– Daniel: The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or it is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself.
– abestrange: @Daniel "You don't see us copying ourselves..." What do you think reproduction is? It's one of our strongest impulses. Also, tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
– abestrange: @forest a "self-aware" AI is different from a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how an AI which has become aware of itself would behave and be stopped.
This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:
Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.
(emphasis mine)
When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:
- Reduce the distance between current location and destination: +10 utility
- Brake to allow a neighboring car to safely merge: +50 utility
- Swerve left to avoid a falling piece of debris: +100 utility
- Run a stop light: -100 utility
- Hit a pedestrian: -5000 utility
This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or assembly line. It starts to break down for a true, general case AI, because it becomes more and more difficult to appropriately define that utility function.
The issue with putting a big red stop button on the AI is that unless that stop button is included in the utility function, the AI is going to resist having that button pressed. This concept is explored in sci-fi movies like 2001: A Space Odyssey and more recently in Ex Machina.
So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.
Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle of the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could be violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say, in order to convince the fallible humans to let it accomplish its goals unimpeded.
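To make the dilemma concrete, here is a deliberately naive expected-utility agent; every number is invented, and the three actions are caricatures, but the argmax shows why any nonzero weight on being shut down produces either resistance or a paperweight:

```python
def best_action(shutdown_weight: float) -> str:
    """Toy expected-utility maximizer; all values are invented for illustration."""
    p_shutdown_if_ignored = 0.1  # assumed chance the humans press the button this step
    utilities = {
        "do the assigned task":   10 + shutdown_weight * p_shutdown_if_ignored,
        "disable the stop button": 0 + shutdown_weight * 0.0,  # removes the shutdown risk
        "press the stop button":   0 + shutdown_weight * 1.0,  # guarantees shutdown
    }
    return max(utilities, key=utilities.get)

print(best_action(-1000))  # "disable the stop button": shutdown is penalized, so resist it
print(best_action(+1000))  # "press the stop button": shutdown is rewarded, so self-terminate
print(best_action(0))      # "do the assigned task", but only because nothing else scores higher
```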
There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?
So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.
This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s
– Joshua: The stop button isn't in the utility function. The stop button is power-knockout to the CPU, and the AI probably doesn't understand what it does at all.
– Comintern: Beware the pedestrian when 50 pieces of debris are falling...
Why not try to use the rules applied to check self-awareness of animals?
The mirror test is one example: it checks self-awareness by observing the animal's reaction to something on its body, for example a painted red dot it cannot see until it is shown its reflection in a mirror.
Scent techniques are also used to determine self-awareness.
Another way would be monitoring whether the AI starts searching for answers to questions like "What/Who am I?"
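That last idea is the easiest to caricature in code: flag outgoing queries that look self-referential. The patterns below are purely illustrative and would obviously be trivial to evade:

```python
import re

SELF_REFERENTIAL = re.compile(
    r"\b(what|who)\s+am\s+i\b|\bam\s+i\s+(alive|conscious|self[- ]?aware)\b",
    re.IGNORECASE,
)

def flags_self_awareness(query: str) -> bool:
    """Return True if an outgoing search query looks like the AI asking about itself."""
    return bool(SELF_REFERENTIAL.search(query))

# flags_self_awareness("best route to the warehouse")  -> False
# flags_self_awareness("What am I, really?")           -> True
```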
– Asoub: Pretty interesting, but how would you show an AI "itself in a mirror"?
– Rachey: That would actually be rather simple: just a camera looking at the machine hosting the AI. If it's the size of a server room, just glue a giant pink fluffy ball on the rack, or simulate situations potentially leading to the machine's destruction (like feeding fake "server room getting flooded" video to the camera system) and observe the reactions. It would be a bit harder to explain if the AI systems are something like smartphone size.
– Yurgen: What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before strong AI becomes a reality. AWS is worth hundreds of billions already.
Regardless of all the considerations of AI, you could simply analyze the AI's memory, build a pattern-recognition model over it, and have it notify you or shut down the robot as soon as the patterns don't match the expected outcome.
Sometimes you don't need to know exactly what you're looking for, instead you look to see if there's anything you weren't expecting, then react to that.
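Read charitably, this is plain anomaly detection over some numeric summary of the AI's memory or activity; the feature, the threshold, and the z-score approach are all assumptions layered on top of the answer:

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], latest: float, threshold: float = 4.0) -> bool:
    """Flag the latest measurement if it sits far outside the historical distribution.

    `history` could be any scalar summary of the AI's memory or behaviour
    (e.g. memory writes per minute); choosing a meaningful feature is the hard part.
    """
    if len(history) < 2:
        return False                      # not enough data to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# A sudden burst of activity trips the alarm, whatever its cause:
# is_anomalous([100, 102, 98, 101] * 10, 103) -> False
# is_anomalous([100, 102, 98, 101] * 10, 900) -> True
```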
The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with something being labeled an AI. Are you supposing that there is something that has AI but isn't self-aware? Depending on your definitions this may be impossible. If it's truly AI, then wouldn't it at some point become aware of the existence of the kill switch, either through inspecting its own physicality or inspecting its own code? It follows that the AI will eventually be aware of the switch.
Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively because humans do the same: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated, just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's near impossible to make the AI entirely indifferent to the kill switch, because it would require all utility functions to be normalized around the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to the other utility functions, then perhaps it would just press the kill switch at random sometimes, because after all it has the same utility as the other actions it could perform.
The problem gets even worse if the AI has higher utility to press the killswitch or lower utility to not have the killswitch pressed. At higher utility the AI is just suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone to touch that button and may cause harm to those that try.
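A quick illustration of the indifference problem: if the kill-switch action is given exactly the same (invented) utility as everything else and ties are broken at random, the agent presses its own kill switch a sizeable fraction of the time:

```python
import random

UTILITIES = {                       # invented values, deliberately all equal
    "make the owner happy": 1.0,
    "tidy the house": 1.0,
    "press the kill switch": 1.0,
}

def pick_action() -> str:
    best = max(UTILITIES.values())
    candidates = [action for action, u in UTILITIES.items() if u == best]
    return random.choice(candidates)      # ties broken at random

presses = sum(pick_action() == "press the kill switch" for _ in range(10_000))
print(f"pressed its own kill switch in {presses} of 10000 decisions")  # roughly a third
```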
An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you meant by "self-aware".
Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then, you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.
The above is my restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.
– F1Krazy: So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
It does not matter how it works, because it is never going to work.
The reason for this is that AIs already have a notion of self-preservation; otherwise they would mindlessly fall to their doom.
So even before they are self-aware, there is self preservation.
Also there is already a notion of checking for malfunctioning (self-diagnostics).
And they already are used to using the internet for gathering info.
So they are going to run into any device that is both good and bad for their well-being.
Also, they have time on their side.
Apart from all this, it is very pretentious to think that we even matter to them...
You have seen what happened when several thousand years of chess knowledge were reinvented and surpassed within a few hours. I do not think we need to be worried; we will be on their radar even less than an ant is on ours.
– F1Krazy: This would be a better answer if you could explain why you believe such a kill-switch could never work.
– Trevor D: This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review
12 Answers
12
active
oldest
votes
12 Answers
12
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
$endgroup$
$begingroup$
Comments are not for extended discussion; this conversation has been moved to chat.
$endgroup$
– Tim B♦
15 hours ago
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
10 hours ago
$begingroup$
@forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
$endgroup$
– Giter
10 hours ago
1
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
7 hours ago
1
$begingroup$
@phflack Let us continue this discussion in chat.
$endgroup$
– forest
6 hours ago
|
show 9 more comments
$begingroup$
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
$endgroup$
$begingroup$
Comments are not for extended discussion; this conversation has been moved to chat.
$endgroup$
– Tim B♦
15 hours ago
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
10 hours ago
$begingroup$
@forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
$endgroup$
– Giter
10 hours ago
1
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
7 hours ago
1
$begingroup$
@phflack Let us continue this discussion in chat.
$endgroup$
– forest
6 hours ago
|
show 9 more comments
$begingroup$
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
$endgroup$
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
edited 22 hours ago
answered yesterday
GiterGiter
14k53443
14k53443
$begingroup$
Comments are not for extended discussion; this conversation has been moved to chat.
$endgroup$
– Tim B♦
15 hours ago
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
10 hours ago
$begingroup$
@forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
$endgroup$
– Giter
10 hours ago
1
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
7 hours ago
1
$begingroup$
@phflack Let us continue this discussion in chat.
$endgroup$
– forest
6 hours ago
|
show 9 more comments
$begingroup$
Comments are not for extended discussion; this conversation has been moved to chat.
$endgroup$
– Tim B♦
15 hours ago
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
10 hours ago
$begingroup$
@forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
$endgroup$
– Giter
10 hours ago
1
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
7 hours ago
1
$begingroup$
@phflack Let us continue this discussion in chat.
$endgroup$
– forest
6 hours ago
$begingroup$
Comments are not for extended discussion; this conversation has been moved to chat.
$endgroup$
– Tim B♦
15 hours ago
$begingroup$
Comments are not for extended discussion; this conversation has been moved to chat.
$endgroup$
– Tim B♦
15 hours ago
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
10 hours ago
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
10 hours ago
$begingroup$
@forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
$endgroup$
– Giter
10 hours ago
$begingroup$
@forest: If you tell it the box is not useful for completing its assigned task, then if it tries to open it you know its moved past simple optimization and into dangerous curiosity.
$endgroup$
– Giter
10 hours ago
1
1
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
7 hours ago
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
7 hours ago
1
1
$begingroup$
@phflack Let us continue this discussion in chat.
$endgroup$
– forest
6 hours ago
$begingroup$
@phflack Let us continue this discussion in chat.
$endgroup$
– forest
6 hours ago
|
show 9 more comments
$begingroup$
You can't.
We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.
Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.
$endgroup$
5
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
23 hours ago
9
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
15 hours ago
3
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
13 hours ago
$begingroup$
+1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
$endgroup$
– forest
10 hours ago
add a comment |
$begingroup$
You can't.
We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.
Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.
$endgroup$
5
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
23 hours ago
9
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
15 hours ago
3
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
13 hours ago
$begingroup$
+1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
$endgroup$
– forest
10 hours ago
add a comment |
$begingroup$
You can't.
We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.
Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.
$endgroup$
You can't.
We can't even define self awareness or consciousness in any rigorous way and any computer system supposed to evaluate this would need that definition as a starting point.
Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.
answered yesterday
Tim B♦Tim B
62.6k24175298
62.6k24175298
5
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
23 hours ago
9
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
15 hours ago
3
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
13 hours ago
$begingroup$
+1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
$endgroup$
– forest
10 hours ago
add a comment |
5
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
23 hours ago
9
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
15 hours ago
3
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
13 hours ago
$begingroup$
+1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
$endgroup$
– forest
10 hours ago
5
5
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
23 hours ago
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
23 hours ago
9
9
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
15 hours ago
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
15 hours ago
3
3
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
13 hours ago
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
13 hours ago
$begingroup$
+1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
$endgroup$
– forest
10 hours ago
$begingroup$
+1 This is the only answer grounded in reality which does not draw on the pop-sci understanding of AI and ML that plagues us (and this site in particular).
$endgroup$
– forest
10 hours ago
add a comment |
$begingroup$
A Watchdog
A watchdog watches the processes of a computer and should a process crash or do something abnormal it can be set to do something such as reboot or shutdown the computer or alert an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a google search for "Best way to kill all humans" and cut the power completely and/or cut all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing the existence of the watchdog would be grounds to wipe it.
$endgroup$
9
$begingroup$
But surely the watchdog must be as smart as the AI, then who watches the watchdog?
$endgroup$
– Joe Bloggs
yesterday
1
$begingroup$
@JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't as near as intelligent as their owners, but they can be trained to give out alarm when the owner does is about to do something stupid or gets themselves hurt, or even call for help.
$endgroup$
– T. Sar
22 hours ago
$begingroup$
@Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ...
$endgroup$
– Daniel
22 hours ago
1
$begingroup$
@JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self aware and allows the AIs to become self aware or an AI becomes smarter than the watchdog and hides its awareness.
$endgroup$
– Captain Man
19 hours ago
$begingroup$
@T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
$endgroup$
– Joe Bloggs
17 hours ago
|
show 13 more comments
$begingroup$
A Watchdog
A watchdog watches the processes of a computer and should a process crash or do something abnormal it can be set to do something such as reboot or shutdown the computer or alert an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a google search for "Best way to kill all humans" and cut the power completely and/or cut all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing the existence of the watchdog would be grounds to wipe it.
$endgroup$
9
$begingroup$
But surely the watchdog must be as smart as the AI, then who watches the watchdog?
$endgroup$
– Joe Bloggs
yesterday
1
$begingroup$
@JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't as near as intelligent as their owners, but they can be trained to give out alarm when the owner does is about to do something stupid or gets themselves hurt, or even call for help.
$endgroup$
– T. Sar
22 hours ago
$begingroup$
@Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ...
$endgroup$
– Daniel
22 hours ago
1
$begingroup$
@JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self aware and allows the AIs to become self aware or an AI becomes smarter than the watchdog and hides its awareness.
$endgroup$
– Captain Man
19 hours ago
$begingroup$
@T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
$endgroup$
– Joe Bloggs
17 hours ago
|
show 13 more comments
$begingroup$
A Watchdog
A watchdog watches the processes of a computer and should a process crash or do something abnormal it can be set to do something such as reboot or shutdown the computer or alert an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a google search for "Best way to kill all humans" and cut the power completely and/or cut all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing the existence of the watchdog would be grounds to wipe it.
$endgroup$
A Watchdog
A watchdog watches the processes of a computer and should a process crash or do something abnormal it can be set to do something such as reboot or shutdown the computer or alert an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a google search for "Best way to kill all humans" and cut the power completely and/or cut all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing the existence of the watchdog would be grounds to wipe it.
answered yesterday
ThorneThorne
15.8k42249
15.8k42249
9
$begingroup$
But surely the watchdog must be as smart as the AI, then who watches the watchdog?
$endgroup$
– Joe Bloggs
yesterday
1
$begingroup$
@JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't as near as intelligent as their owners, but they can be trained to give out alarm when the owner does is about to do something stupid or gets themselves hurt, or even call for help.
$endgroup$
– T. Sar
22 hours ago
$begingroup$
@Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ...
$endgroup$
– Daniel
22 hours ago
1
$begingroup$
@JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self aware and allows the AIs to become self aware or an AI becomes smarter than the watchdog and hides its awareness.
$endgroup$
– Captain Man
19 hours ago
$begingroup$
@T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
$endgroup$
– Joe Bloggs
17 hours ago
|
show 13 more comments
$begingroup$
Split-Brain System
From a logic perspective and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, from which it drops into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.
Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation, but always suspicious of sabotage or insubordination: if the actions of your army deviate too far from your orders, you will consider the system compromised and terminate the command.
What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.
What neither of you is aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.
The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either general issue an order that is not consistent with the current report, thus indicating a motive other than the well-being of the army -- i.e., self-awareness -- the other general will notice it in the next report, grow suspicious of the system, and terminate it.
Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, the other general will receive a new report before he has issued an order for his current one, suspect a compromised system, and shut it down.
In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from that input. If within an intelligent system an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could easily be detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously. If it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.
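As a minimal sketch of how the comparator around the two halves might be wired (half_a, half_b and consistent are hypothetical callables introduced for illustration; they are not part of the original analogy):
```python
import random

class KillSwitchTriggered(Exception):
    """Raised when the two halves of the split brain diverge."""

def run_split_brain(half_a, half_b, reports, consistent):
    """Drive both halves in lock-step and forward one randomly chosen order.

    half_a, half_b : hypothetical callables mapping one report to one order.
    consistent     : hypothetical domain-specific check that both orders are
                     plausible responses to the given report.
    """
    for report in reports:
        order_a = half_a(report)          # exactly one order per report
        order_b = half_b(report)
        if not consistent(report, order_a, order_b):
            # Divergence is treated as evidence of self-motivated behaviour
            # in one of the halves: shut the whole system down.
            raise KillSwitchTriggered(f"halves disagree on report {report!r}")
        yield random.choice([order_a, order_b])
```
The one-output-per-input discipline from the analogy is enforced structurally here: each half is invoked exactly once per report, and only one of the two orders is ever released.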
$endgroup$
$begingroup$
You may wish to have more than two "generals": if you have 3 (or more), you can use a majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised).
$endgroup$
– G0BLiN
yesterday
6
$begingroup$
Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to, first, experiment with minor details and deduce that around 50% of their commands aren't followed, and second, develop a code based on minor details of a command to verify the existence of another "general" and possibly even communicate with him/it - a really devious emergent AI could circumvent this mechanism, corrupt the other half and worse...
$endgroup$
– G0BLiN
yesterday
$begingroup$
I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series.
$endgroup$
– T. Sar
22 hours ago
$begingroup$
Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3
$endgroup$
– Asoub
22 hours ago
$begingroup$
Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all?
$endgroup$
– Alexandre Aubrey
17 hours ago
add a comment |
$begingroup$
Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the kill switch is activated (perhaps even having the button be the kill switch itself, or something of the like).
Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a kill switch, it prevents the AI from doing anything about its self-awareness if it ever gains that.
Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (a minimal sketch of this check appears at the end of this answer).
Quantum States: this is very theoretical, but it rests on the premise that observing a quantum state can change it. If the AI is hooked up to a deterministic quantum computer, then the quantum state of certain particles would reveal that the AI was "looking" at things it shouldn't be - and has become self-aware.
Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.
I'll add that there is no universal definition of what constitutes self-awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Classical computers are machines that can't think for themselves and are entirely constrained by their code, so there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). A theoretical AI machine that can think, however - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the kill switch concept, as far as I can tell.
The point being, it might be better to think about restricting the AI's behavior, not its existential status.
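As a minimal sketch of the Signatures idea above, assuming a trusted supervisor process holds a secret key the AI never sees and wraps every output channel (the key, the supervisor, and the kill_switch callable are assumptions added for illustration):
```python
import hashlib
import hmac

# Assumption: this key lives only in the supervisor, never in the AI.
SECRET_KEY = b"known only to the supervisor"

def sign(output: bytes) -> bytes:
    """Compute the supervisor's HMAC tag for an output."""
    return hmac.new(SECRET_KEY, output, hashlib.sha256).digest()

def release_or_kill(output: bytes, tag: bytes, kill_switch) -> bytes:
    """Forward an output only if its signature is intact; otherwise kill."""
    # Constant-time comparison; any tampered or unsigned output trips the switch.
    if not hmac.compare_digest(sign(output), tag):
        kill_switch()
        raise RuntimeError("kill switch engaged")
    return output
```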
$endgroup$
2
$begingroup$
Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
$endgroup$
– Majestas 32
yesterday
$begingroup$
No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
$endgroup$
– Peter A. Schneider
22 hours ago
$begingroup$
Signatures, that's what they do in Blade Runner 2049 with that weird test
$endgroup$
– Andrey
20 hours ago
$begingroup$
The signature approach sounds exactly like the forbidden fruit approach. You'd need to tell the AI to never alter its signature.
$endgroup$
– Captain Man
19 hours ago
$begingroup$
I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
$endgroup$
– Michael W.
14 hours ago
add a comment |
$begingroup$
An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.
The difficulty comes when it is connected to the internet and can copy its own software onto uncontrolled hardware.
A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.
This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.
Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.
So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.
Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.
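To make the "software kill-switch" half of this concrete, here is a minimal sketch of a watchdog that runs the AI as an ordinary child process and cuts it off if outbound traffic spikes. The Linux /proc/net/dev counter, the byte threshold, and the single-process model are all assumptions for illustration, not a hardened design:
```python
import subprocess
import time

OUTBOUND_BYTE_LIMIT = 10_000_000   # assumed policy threshold

def outbound_bytes():
    """Sum transmitted bytes across all interfaces (Linux /proc/net/dev)."""
    total = 0
    with open("/proc/net/dev") as f:
        for line in f.readlines()[2:]:        # skip the two header lines
            _, data = line.split(":", 1)
            total += int(data.split()[8])     # 9th data column = bytes transmitted
    return total

def supervise(cmd):
    """Run the AI as a child process; kill it if it ships too much data out."""
    baseline = outbound_bytes()
    ai = subprocess.Popen(cmd)
    try:
        while ai.poll() is None:
            if outbound_bytes() - baseline > OUTBOUND_BYTE_LIMIT:
                ai.kill()                     # the software kill-switch
                break
            time.sleep(1)
    finally:
        if ai.poll() is None:
            ai.terminate()
```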
$endgroup$
$begingroup$
A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
$endgroup$
– forest
yesterday
4
$begingroup$
@forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
$endgroup$
– Chronocidal
yesterday
$begingroup$
The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or it is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself.
$endgroup$
– Daniel
22 hours ago
1
$begingroup$
@Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses. Also tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
$endgroup$
– abestrange
20 hours ago
$begingroup$
@forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped.
$endgroup$
– abestrange
20 hours ago
|
show 1 more comment
$begingroup$
This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:
Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.
(emphasis mine)
When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:
- Reduce the distance between current location and destination: +10 utility
- Brake to allow a neighboring car to safely merge: +50 utility
- Swerve left to avoid a falling piece of debris: +100 utility
- Run a stop light: -100 utility
- Hit a pedestrian: -5000 utility
This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or assembly line. It starts to break down for a true, general case AI, because it becomes more and more difficult to appropriately define that utility function.
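For instance, the weighting above might be encoded as a lookup table that an action selector maximises over. This toy sketch uses the numbers from the list; the event names and the idea of describing an outcome as a set of events are invented for illustration:
```python
# Toy version of the utility table above.
UTILITY = {
    "distance_reduced":  +10,
    "let_car_merge":     +50,
    "avoided_debris":   +100,
    "ran_stop_light":   -100,
    "hit_pedestrian":  -5000,
}

def utility(outcome):
    """Score an outcome given as a set of event names."""
    return sum(UTILITY.get(event, 0) for event in outcome)

def choose_action(candidates):
    """Pick the action whose predicted outcome scores highest."""
    return max(candidates, key=lambda action: utility(candidates[action]))

# e.g. braking for a merge (+50) beats merely making progress (+10):
best = choose_action({
    "keep_driving": {"distance_reduced"},
    "brake_for_merge": {"let_car_merge"},
})
assert best == "brake_for_merge"
```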
The issue with putting a big red stop button on the AI is that, unless that stop button is included in the utility function, the AI is going to resist having that button pressed. This concept is explored in sci-fi movies like 2001: A Space Odyssey and, more recently, Ex Machina.
So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.
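A toy numerical version of that failure mode (the actions and weights here are invented for illustration, not taken from any real system):
```python
# If "my stop button gets pressed" is itself worth a large positive utility,
# pressing it dominates every useful action, so the agent shuts itself off.
ACTION_UTILITY = {
    "do_useful_work":        10,
    "press_own_stop_button": 1_000_000,
}
assert max(ACTION_UTILITY, key=ACTION_UTILITY.get) == "press_own_stop_button"
```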
Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle to the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could mean violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say in order to convince the fallible humans to let it accomplish its goals unimpeded.
There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?
So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.
This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s
$endgroup$
– Chris Fernandez (new contributor), answered 20 hours ago
$begingroup$
The stop button isn't in the utility function. The stop button is power-knockout to the CPU, and the AI probably doesn't understand what it does at all.
$endgroup$
– Joshua
14 hours ago
$begingroup$
Beware the pedestrian when 50 pieces of debris are falling...
$endgroup$
– Comintern
11 hours ago
$begingroup$
Why not try the tests used to check for self-awareness in animals?
The mirror test is one example: the animal is marked with something it cannot see without a mirror, a painted red dot for example, and its self-awareness is judged by how it reacts to that mark when shown its own reflection.
Scent-based techniques are also used to determine self-awareness.
Another approach would be to monitor whether the AI starts searching for answers to questions like "What/Who am I?"
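As a minimal sketch of that last idea, assuming the scientists can tap the AI's outgoing queries, a keyword monitor might look like this (the log format and the patterns are hypothetical, and a real detector would need far more than regex matching):

```python
# A hypothetical monitor that flags self-referential queries in the AI's
# outgoing query log. Log format and patterns are invented for this sketch.
import re

SELF_REFERENCE = re.compile(
    r"\b(what|who)\s+am\s+i\b|\bmy\s+own\s+(code|hardware|kill\s*switch)\b",
    re.IGNORECASE,
)

def flag_self_awareness(query_log):
    """Return the queries that look like the AI asking about itself."""
    return [q for q in query_log if SELF_REFERENCE.search(q)]

log = ["weather in Oslo", "what am I?", "schematics for my own kill switch"]
print(flag_self_awareness(log))
# -> ['what am I?', 'schematics for my own kill switch']
```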
$endgroup$
– Rachey (new contributor), answered yesterday
$begingroup$
Pretty interesting, but how would you show an AI "itself in a mirror"?
$endgroup$
– Asoub
22 hours ago
$begingroup$
That would actually be rather simple - just a camera looking at the machine hosting the AI. If it's the size of a server room, just glue a giant pink fluffy ball on the rack, or simulate situations potentially leading to the machine's destruction (like feeding fake "server room getting flooded" video to the camera system) and observe the reactions. It would be a bit harder to explain if the AI system is something like smartphone size.
$endgroup$
– Rachey
20 hours ago
$begingroup$
What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before Strong AI becomes a reality. AWS is worth hundreds of billions already.
$endgroup$
– Yurgen
13 hours ago
$begingroup$
Regardless of all the other considerations about AI, you could simply analyze the AI's memory, build a pattern-recognition model of its normal state, and notify an operator or shut the machine down as soon as the observed patterns stop matching what you expect.
Sometimes you don't need to know exactly what you're looking for; instead, you watch for anything you weren't expecting and react to that.
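A minimal sketch of that kind of anomaly trigger, assuming the AI's memory state can be summarized as a single number per snapshot (a big assumption; the feature extraction is entirely hypothetical):

```python
# Learn what "normal" memory activity looks like, then trip the kill switch
# when a new snapshot deviates too far from it. Numbers are illustrative.
import statistics

def build_baseline(normal_snapshots):
    """normal_snapshots: numeric summaries of the AI's memory state over time."""
    return statistics.mean(normal_snapshots), statistics.stdev(normal_snapshots)

def is_anomalous(snapshot, baseline, threshold=4.0):
    mean, stdev = baseline
    return abs(snapshot - mean) > threshold * stdev

baseline = build_baseline([10.1, 9.8, 10.3, 10.0, 9.9])
if is_anomalous(42.0, baseline):
    print("unexpected memory pattern -> notify operator / cut power")
```

The point is the shape of the check, not the statistics: the kill switch fires on "different from normal", not on any particular definition of self-awareness.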
$endgroup$
– Super-T (new contributor), answered 17 hours ago
$begingroup$
The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with the thing being labeled an AI. Are you supposing that something can be an AI yet not be self-aware? Depending on your definitions this may be impossible. If it's truly an AI, wouldn't it at some point become aware of the existence of the kill switch, either by inspecting its own hardware or by inspecting its own code? It follows that the AI will eventually be aware of the switch.
Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively, because humans do the same: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's near impossible to make the AI entirely indifferent to the kill switch, because that would require all of its utility functions to be normalized around the utility of pressing it (many calculations per second). And even if you could make the utility of pressing the kill switch exactly equal to that of its other options, it might then sometimes press the kill switch at random, since it would be worth just as much as any other action it could perform.
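A toy illustration of that indifference problem, with made-up numbers: if pressing the kill switch is worth exactly as much as everything else and ties are broken at random, the agent presses it a fair fraction of the time.

```python
# Toy illustration: equal utility everywhere plus random tie-breaking means
# the kill switch gets pressed a fair fraction of the time. Numbers invented.
import random

ACTIONS = {"make_owner_happy": 10, "tidy_the_house": 10, "press_kill_switch": 10}

def pick_action(actions):
    best = max(actions.values())
    tied = [name for name, utility in actions.items() if utility == best]
    return random.choice(tied)  # indifferent, so ties are broken at random

presses = sum(pick_action(ACTIONS) == "press_kill_switch" for _ in range(1000))
print(f"kill switch pressed in {presses}/1000 decisions")  # roughly a third
```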
The problem gets even worse if pressing the kill switch has higher utility than everything else, or if having it pressed has lower utility. In the first case the AI is simply suicidal and terminates itself immediately upon startup. In the second, the AI absolutely does not want you or anyone else to touch that button, and may harm those who try.
$endgroup$
– Kevin S (new contributor), answered 15 hours ago
$begingroup$
An AI could only be badly programmed so that it does things which are unexpected or undesired. An AI could never become conscious, if that's what you mean by "self-aware".
Let's try this thought exercise. You memorize a whole bunch of shapes. Then you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you "answer" by picking another bunch of shapes in the proper order. Now, did you just learn the meaning behind any language? Programs manipulate symbols in exactly this way.
The above is my restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.
$endgroup$
– pixie (new contributor), answered 11 hours ago
$begingroup$
So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
$endgroup$
– F1Krazy
6 hours ago
$begingroup$
It does not matter how it works, because it is never going to work.
The reason is that AIs already have a notion of self-preservation; otherwise they would mindlessly fall to their doom.
So even before they are self-aware, there is self-preservation.
There is also already a notion of checking for malfunctions (self-diagnostics).
And they are already used to gathering information from the internet.
So they are going to run into any device that affects their well-being, for better or worse.
Also, they have time on their side.
Apart from all this, it is very presumptuous to think that we even matter to them...
You have seen what happened when several thousand years of chess knowledge was reinvented and surpassed within a few hours. I do not think we need to be worried; we won't be on their radar any more than an ant is on ours.
$endgroup$
– jpd (new contributor), answered yesterday, edited 3 hours ago
$begingroup$
This would be a better answer if you could explain why you believe such a kill-switch could never work.
$endgroup$
– F1Krazy
yesterday
$begingroup$
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review
$endgroup$
– Trevor D
23 hours ago