General impossibility theorems
An impossibility theorem demonstrates that a particular problem cannot be solved as described in the claim, or that a particular set of problems cannot be solved in general. The best-known general examples are Gödel’s incompleteness theorems and Turing’s undecidability results.
In the case of AI, impossibility theorems put limits on what it is possible to do with artificial intelligence, especially superintelligent AI.
Below, I list impossibility results from the literature. The list is not exhaustive, and I will expand it in the future.
- Unverifiability [Yampolskiy2017] states a fundamental limit on the verification of mathematical proofs, computer software, the behavior of intelligent agents, and all formal systems.
- Unpredictability [Vinge1993, Arbital2019, Yampolskiy2019] states our inability to precisely and consistently predict the specific actions an intelligent system will take to achieve its objectives, even if we know the system’s terminal goals.
- Unexplainability [Yampolskiy2019b] states the impossibility of providing an explanation for certain decisions made by an intelligent system that is both 100% accurate and comprehensible.
- Incomprehensibility [Yampolskiy2019b] states the impossibility of any human completely understanding a 100%-accurate explanation for certain decisions of an intelligent system.
- Uncontrollability [Yampolskiy2020] states that humanity cannot remain safely in control while benefiting from a superior form of intelligence.
- Limits on utility-based value alignment [Eckersley2019] state a number of impossibility theorems on multi-agent alignment due to competing utilitarian objectives. This is not just an AI-related topic. The most famous example is Arrow’s Impossibility Theorem from social choice theory, which shows there is no satisfactory way to compute society’s preference ordering via an election in which members of society vote with their individual preference orderings.
- Limits on preference deduction [Armstrong2019] state that even Occam’s razor is insufficient to decompose observations of behavior into preferences and a planning algorithm. Assumptions beyond the data are necessary to disambiguate the preferences from the planning algorithm.
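The obstruction behind Arrow’s Impossibility Theorem can be seen in a minimal worked example (the ballots below are hypothetical, chosen only for illustration): with three voters and three candidates, pairwise majority voting can yield a cyclic, non-transitive "society preference", so no consistent social ordering exists.

```python
# Condorcet-cycle sketch: three voters rank candidates A, B, C
# best-to-worst; pairwise majorities then form a cycle.
ballots = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]

def majority_prefers(x, y):
    """True if a strict majority of voters rank x above y."""
    votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    return votes > len(ballots) / 2

# A beats B, B beats C, yet C beats A -- a cycle, not an ordering.
print(majority_prefers("A", "B"))  # True
print(majority_prefers("B", "C"))  # True
print(majority_prefers("C", "A"))  # True
```

Arrow’s theorem generalizes this: any aggregation rule satisfying a few natural fairness axioms runs into such inconsistencies.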
Impossibility results in AI are proven by contradiction. Some proofs use the suboptimality of humans and the definition of superintelligence as something strictly more intelligent than humans, and then derive a limit on some attainable relation between humans and a superintelligent AI. The other general option is to use the Liar paradox (Gödel-like self-referentiality). Sometimes the AI does not even need to be strictly superintelligent across the whole domain, so some impossibility results hold even under relaxed conditions.
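The self-referential style of such proofs can be sketched with a toy diagonalization in the spirit of Turing’s halting argument (all names here are illustrative, not taken from the cited papers): assume a perfect predictor of an agent’s action, then build a "contrarian" agent that consults the predictor and does the opposite.

```python
# Toy Liar-paradox diagonalization: a would-be perfect predictor
# is refuted by an agent that does the opposite of its prediction.
# (Hypothetical sketch; names and structure are illustrative only.)

def make_predictor(table):
    """A would-be perfect predictor backed by a fixed lookup table."""
    return lambda agent_name: table[agent_name]

predictor = make_predictor({"contrarian": "cooperate"})

def contrarian():
    """Acts opposite to whatever the predictor says it will do."""
    predicted = predictor("contrarian")
    return "defect" if predicted == "cooperate" else "cooperate"

# Whatever the table says, the contrarian does the other thing,
# so no fixed prediction of its action can be correct.
print(predictor("contrarian"))  # cooperate
print(contrarian())             # defect
```

Swapping the table entry to "defect" just flips the outcome; the prediction and the action can never agree, which is the contradiction such proofs exploit.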
These impossibility results serve as guidelines, reminders, and warnings to AI safety and security researchers, and to the wider community.
Lipton argues for the usefulness of impossibility results, but also adds warnings:
“I would say that they are useful, and that they can add to our understanding of a problem. At a minimum they show us where to attack the problem in question.
If you prove that no X can solve some problem Y, then the proper view is that I should look carefully at methods that lie outside X. I should not give up. I would look carefully—perhaps more carefully than is usually done—to see if X really captures all the possible attacks.
What troubles me about impossibility proofs is that they often are not very careful about X. They often rely on testimonial, anecdotal evidence, or personal experience to convince one that X is complete.”
(Reproduced from the original post on LinkedIn)