This paper discusses the main criticisms raised against Alan Turing’s test and argues that these criticisms lack a practical basis.
Is Alan Turing’s statement correct that machines need to imitate humans in order to be called intelligent?
In 1950, Alan Turing published his paper on the “Imitation Game”, in which he set out a precise criterion for machine intelligence. He put forth a test, now famously known as the “Turing Test” (hereafter TT), to determine whether a machine is intelligent: a computer is considered intelligent if its conversation cannot be distinguished from a human’s. Critics have debated the validity of the TT for decades, and their criticisms make profound and interesting points that force the reader to ask whether Turing’s hypothesis is flawed. This essay discusses three major criticisms that challenge Turing’s claim and gives my opinion as to why most of them are impractical.
Firstly, one of the best-known critiques of the TT is the “Chinese Room” thought experiment by John R. Searle [1]. Searle explains that the TT could easily be passed by the brute-force machine in the Chinese Room, which is “not” intelligent. This machine has a large lookup dictionary which, for a given input of Chinese characters, returns a fixed output of Chinese characters. The output makes sense to a native Chinese speaker, but the machine has no understanding of either the input or the output. The machine fools the native Chinese speaker into believing that he is in fact talking to a machine that understands Chinese, and therefore it passes the TT. Searle’s criticism is that such machines could easily fool judges and that the TT is flawed because it examines only a machine’s output, not its understanding.
Nonetheless, the Chinese Room machine, or in general any other brute-force machine, is impractical to build. This has been argued most comprehensively by Levesque [2], who showed that the lookup table in Searle’s Chinese Room could not exist in the physical universe. He then points out that there is a book that could potentially pass the TT: a Chinese-English learning manual, that is, a book that teaches you Chinese. Therefore, in order to pass the TT, a machine would have to somehow learn the language rather than rely on brute force.
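To give a rough sense of scale for this infeasibility claim, the following sketch counts the entries a lookup table would need if it stored one reply for every possible input string of a fixed length. The vocabulary size, the message length and the atom count are illustrative assumptions of mine and are not figures taken from Levesque [2].

```python
# Illustrative assumptions only (not figures from Levesque [2]):
VOCABULARY_SIZE = 3_000        # assumed number of common Chinese characters
MESSAGE_LENGTH = 30            # assumed length of one input, in characters
ATOMS_IN_UNIVERSE = 10 ** 80   # commonly quoted order-of-magnitude estimate


def table_entries(vocabulary_size: int, message_length: int) -> int:
    """Number of distinct inputs of exactly `message_length` characters."""
    return vocabulary_size ** message_length


def exceeds_universe(entries: int, atoms: int = ATOMS_IN_UNIVERSE) -> bool:
    """True if even one table entry per atom would not be enough."""
    return entries > atoms


entries = table_entries(VOCABULARY_SIZE, MESSAGE_LENGTH)
# Python integers have arbitrary precision, so the digit count is exact.
print(f"entries needed: about 10^{len(str(entries)) - 1}")
print("more entries than atoms:", exceeds_universe(entries))
```

Under these assumptions, a table covering even a single 30-character message already needs more entries than the assumed number of atoms, before any multi-turn conversation is taken into account.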
Secondly, in a TT the human judge interacts only through text messages and has no visual or physical contact with the subjects. On the one hand, this method may seem fair, because we should judge a subject’s intelligence not by how it looks but by its conversational skills. On the other hand, a few philosophers hold that language may not capture all the types of intelligence that humans have, and hence that the TT is not comprehensive. Gunderson [3] compares the TT with “the Toe-Stepping Game” (hereafter TSG), in which the judge’s toes are stepped on either by a human or by a rock-and-lever mechanism (which drops the rock on the judge’s toes and then removes it quickly, just as a human would withdraw a foot). The judge then has to say whether he was stepped on by the human or hit by the rock. Gunderson claims that it is very easy for the lever mechanism to pass the test: one might well think that the rock is actually a human. He concludes that the TSG is flawed because it reflects only one facet of the rock’s abilities. He then draws an analogy between the TSG and the TT, stating that the TT is flawed in the same way, since it takes into consideration only the linguistic capabilities of the machine.
However, Gunderson’s argument rests on one mistaken assumption: that the act of toe-stepping in the TSG is analogous to language imitation in the TT. The TSG does not involve any capability of a machine other than stepping on a toe, whereas linguistic capability includes a broad range of abilities. As a matter of fact, humans use language to learn almost everything in their lives, and language is a general tool for learning [4]. In other words, how do we know whether a human is intelligent or not? Through language. So, in essence, the TSG and the TT are not comparable.
Finally, many philosophers claim that the TT cannot assess forms of intelligence unknown to humans: since the judge in the TT is human and the computer is competing against another human’s messages, the TT can test only for “human intelligence” and not for general intelligence. To explain this, French [5] gives the example of the “Seagull Test”, in which the isolated native inhabitants of a Nordic island want to capture the notion of flight. The only flying object they have ever seen is the Nordic seagull, which is native to their island. So the inhabitants develop a test in which they are shown two 3D radar screens, one containing the figure of a seagull and the other a candidate flying object. Only if the inhabitants cannot tell which of the two is the seagull is the candidate object deemed able to fly. So, if the inhabitants were shown a seagull and an airplane on the two radars, they would conclude that the airplane cannot fly, which is wrong. French uses the Seagull Test to refute the TT: just as the Seagull Test does not test for general-purpose flight but only for flying like a Nordic seagull, the TT does not test for general intelligence but only for human intelligence.
However, French makes one unjustified assumption: that if the Nordic inhabitants saw an airplane on the radar, they would immediately classify it as a non-flying object. This is not likely. The inhabitants might instead classify the plane as an unknown object that is doing something they are not yet aware of, go out and study the plane, and then refine their test accordingly. The same could happen with the TT when it is presented with a totally new intelligence unknown to humans: after experiencing a new way of conversing with this intelligence, the judge, instead of outright dismissing the conversation as unintelligent, could also recognise a totally new intelligence [4].
To conclude, the main argument of this essay has been that, even though there are many arguments against the TT, they do not offer solid practical examples that refute it. One can argue that the TT can test only human intelligence, but we currently do not know whether other types of intelligence exist. We have no way of knowing whether these criticisms are correct until we have access to more intelligent machines than we have today, or until we research intelligence in more detail. The TT may not remain an ideal test in the future, but it is at present one of the best tests available for measuring the intelligence of a machine.
References:
[1] J. R. Searle. Minds, Brains, and Programs. In Behavioral and Brain Sciences, 1980
[2] H. J. Levesque. Is it Enough to Get the Behaviour Right? 2009
[3] K. Gunderson. The Imitation Game. In Mind, 1964
[4] K. LaCurts. Criticisms of the Turing Test and Why You Should Ignore (Most of) Them, pages 4-8, 2011
[5] R. M. French. Subcognition and the Limits of the Turing Test. In Mind, 1990
Frequently Asked Questions
What is the main topic of the text "Is Alan Turing’s statement correct that machines need to imitate humans in order to be called intelligent"?
The text explores the validity of Alan Turing's "Turing Test" (TT) as a measure of artificial intelligence. It discusses criticisms of the TT, particularly whether machines need to imitate humans perfectly to be considered intelligent, and argues why many of these criticisms are impractical.
What is the Turing Test (TT)?
The Turing Test is a test proposed by Alan Turing to determine if a machine is intelligent. In this test, a computer is considered intelligent if its conversation cannot be distinguished from that of a human.
What is the "Chinese Room" argument, and how does it relate to the Turing Test?
The "Chinese Room" argument, proposed by John R. Searle, is a critique of the Turing Test. It suggests that a machine could pass the TT by brute force (e.g., using a large lookup dictionary) without actually understanding the language it's processing. The argument claims such a machine is not truly intelligent, even if it can fool a human judge.
What is Levesque's counter-argument to the Chinese Room critique?
Levesque argues that the brute-force approach described in the Chinese Room experiment is impractical. He suggests that creating a lookup dictionary large enough to successfully fool a human in the Turing Test is not feasible, effectively implying that machines would need to learn language rather than simply mimic it.
What is the "Toe-Stepping Game" analogy, and how does it critique the Turing Test?
Gunderson proposes the "Toe-Stepping Game" (TSG) as an analogy to the Turing Test. In TSG, a judge tries to determine if their toe was stepped on by a human or a rock and lever mechanism. Gunderson argues that just as TSG only reflects one facet of an object's abilities (the ability to step on a toe), the Turing Test only considers linguistic capabilities and may not capture other forms of intelligence.
What is the author's counter-argument to the "Toe-Stepping Game" analogy?
The author argues that the act of toe-stepping in TSG is not analogous to language imitation in TT. Linguistic capabilities, unlike toe-stepping, encompass a broad range of abilities crucial for learning and understanding. The author asserts that judging intelligence through language is a valid approach, making the comparison between TSG and TT flawed.
What is the "Seagull Test" analogy, and how does it criticize the Turing Test?
French proposes the "Seagull Test" as an analogy to the Turing Test. In this scenario, inhabitants of a Nordic island try to define flight based only on their experience with seagulls. If the inhabitants' test only recognizes seagulls as flying objects, it would incorrectly reject other flying objects like airplanes. French argues that the Turing Test might only be testing for "human intelligence" and not general intelligence, similar to how the Seagull Test is limited to identifying Nordic seagulls.
What is the author's counter-argument to the "Seagull Test" analogy?
The author argues that the "Seagull Test" analogy assumes the Nordic inhabitants would immediately dismiss an airplane as a non-flying object. Instead, they might classify it as an unknown object and investigate it further. Similarly, in the Turing Test, a human judge might recognize a new form of intelligence rather than immediately dismissing it as unintelligent.
What is the author's overall conclusion regarding the Turing Test?
The author concludes that while the Turing Test has been criticized, the criticisms lack solid practical examples. The author acknowledges that TT may only test for human intelligence and does not have the capabilities to measure other forms of intelligence unknown to humans. However, it remains one of the best current methods for assessing machine intelligence. The author suggests that future advancements in AI and our understanding of intelligence might necessitate a re-evaluation of the Turing Test.
What are the key references cited in the text?
The references cited include:
- J. R. Searle. Minds, Brains, and Programs. In Behavioral and Brain Sciences, 1980
- H. J. Levesque. Is it Enough to Get the Behaviour Right? 2009
- K. Gunderson. The Imitation Game. In Mind, 1964
- Katrina LaCurts. Criticisms of the Turing Test and Why You Should Ignore (Most of) Them, pages 4-8, 2011
- R. M. French. Subcognition and the Limits of the Turing Test. In Mind, 1990
Cite this paper
Karan Singh (Author), 2018, Is Alan Turing’s statement correct that machines need to imitate humans in order to be called intelligent, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/452210