The results are in. An artificial intelligence has gone to the top of its class after passing an English exam. Though it still can't beat the most able human students, it achieved the best mark yet for a machine.
Hai Zhao at Shanghai Jiao Tong University in China and his colleagues trained their AI on more than 25,000 English reading comprehension tests.
Each test contained a 200- to 300-word story followed by a series of related multiple-choice questions. The tests were sourced from English proficiency exams aimed at Chinese students aged 12 to 18.
While some answers could be found directly in the text, more than half required a degree of reasoning. For example, one question asked the test-taker to choose the best headline for a story from four options.
After the training, the AI sat a final exam consisting of 1400 tests it hadn’t seen before. It achieved an overall score of 74 per cent, better than all previous machine attempts.
Zhao's AI uses a system that identifies the parts of the story relevant to the question, then selects the answer option closest to them in meaning and logic.
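As a rough illustration of that two-step idea, the sketch below first retrieves the passage sentences most relevant to the question, then picks the option most similar to that evidence. It is not Zhao's model: the function names, the TF-IDF similarity measure and the example passage are all assumptions standing in for the learned representations a real system would use.

```python
# Toy evidence-then-match reading comprehension pipeline (illustrative only).
# TF-IDF cosine similarity stands in for the learned matching a real model uses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def answer(passage_sentences, question, options, top_k=2):
    """Pick the option most similar to the passage evidence relevant to the question."""
    vec = TfidfVectorizer().fit(passage_sentences + [question] + options)

    # Step 1: find the passage sentences most relevant to the question.
    sent_vecs = vec.transform(passage_sentences)
    q_vec = vec.transform([question])
    relevance = cosine_similarity(sent_vecs, q_vec).ravel()
    evidence = " ".join(
        passage_sentences[i] for i in relevance.argsort()[::-1][:top_k]
    )

    # Step 2: select the option closest in meaning to that evidence.
    ev_vec = vec.transform([evidence])
    opt_vecs = vec.transform(options)
    scores = cosine_similarity(opt_vecs, ev_vec).ravel()
    return options[int(scores.argmax())]


if __name__ == "__main__":
    passage = [
        "Tom missed the bus because he overslept.",
        "He arrived at school an hour late.",
        "His teacher asked him to stay after class.",
    ]
    print(answer(passage, "Why was Tom late for school?",
                 ["He missed the bus.", "He forgot his homework.",
                  "The school was closed.", "He walked slowly."]))
```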
The next best was a system made by Tencent, a leading Chinese technology firm, which scored 72 per cent on the same exam. Tencent’s AI learned to compare the information carried by each option and use their differences as cues to look for evidence in the text.
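In the same toy spirit, the option-difference idea might be sketched as follows, using simple word overlap in place of Tencent's learned comparisons; the function name and scoring rule are illustrative assumptions, not the firm's method.

```python
# Illustrative "option-difference" heuristic: words that distinguish one option
# from the others are treated as cues, and the option whose cues have the most
# support in the passage wins. A toy word-overlap version, not the real model.
def answer_by_differences(passage, options):
    passage_words = set(passage.lower().split())
    best_option, best_score = options[0], -1.0
    for i, option in enumerate(options):
        own = set(option.lower().split())
        others = set()
        for j, other in enumerate(options):
            if j != i:
                others |= set(other.lower().split())
        # Cues are the words this option does not share with the alternatives.
        cues = own - others
        # Score: fraction of distinguishing cues found in the passage.
        score = len(cues & passage_words) / max(len(cues), 1)
        if score > best_score:
            best_option, best_score = option, score
    return best_option
```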
Despite topping the leaderboard, Zhao is determined to improve his system's abilities. "What our AI got is very average, a C+ at most," he says. "For students who want to get into good universities in China, they will aim for 90 per cent."
To increase its score, the team will try to modify the AI so that it can understand information embedded in sentence structure, and will feed it more data to expand its vocabulary.
Understanding human language is a major headache for AI, as it is often imprecise and involves hidden contextual and societal clues that machines struggle to pick up on.
It is unclear what rules AIs follow when they learn our languages, says Guokun Lai at Carnegie Mellon University in Pennsylvania, who originally collated the tests in 2017 for AI research. “They seem to be able to [understand our logic] after reading tonnes of sentences and stories.”