A Data-Driven Metric of Hardness for WSC Sentences

14 pages • Published: September 17, 2018

Abstract

The Winograd Schema Challenge (WSC), the task of resolving pronouns in certain sentences where shallow parsing techniques do not seem to be directly applicable, has been proposed as an alternative to the Turing Test. According to Levesque, having access to a large corpus of text would likely not help much in the WSC. Among a number of attempts to tackle this challenge, one particular approach has demonstrated the plausibility of using commonsense knowledge automatically acquired from raw text in English Wikipedia.

Here, we present the results of a large-scale experiment that shows how the performance of that automated approach varies with the availability of training material. We compare the results of this experiment with two studies: one from the literature that investigates how adult native speakers tackle the WSC, and one that we design and undertake to investigate how teenage non-native speakers tackle the WSC. We find that the performance of the automated approach correlates positively with the performance of humans, suggesting that its performance could be used as a metric of hardness for WSC instances.

Keyphrases: common sense reasoning, coreference resolution, knowledge based learning, winograd schema challenge, wsc sentence hardness

In: Daniel Lee, Alexander Steen and Toby Walsh (editors). GCAI-2018. 4th Global Conference on Artificial Intelligence, vol. 55, pages 107-120.
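The abstract's central idea, that the automated approach's per-sentence performance tracks human performance and could therefore serve as a hardness metric, can be illustrated with a minimal sketch. The correlation measure (Spearman rank correlation), the sample accuracies, and the variable names below are assumptions for illustration only; the paper's actual methodology is not described in this abstract.

from scipy.stats import spearmanr

# Hypothetical per-WSC-sentence accuracies (fraction of correct pronoun resolutions).
automated_accuracy = [0.91, 0.55, 0.78, 0.40, 0.88, 0.63]  # automated approach
human_accuracy = [0.97, 0.62, 0.85, 0.51, 0.93, 0.70]      # human respondents

# A positive rank correlation would support using the automated score as a
# proxy for hardness: lower automated accuracy suggests a harder sentence.
rho, p_value = spearmanr(automated_accuracy, human_accuracy)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")

# One simple hardness score per sentence, under these assumptions.
hardness = [1.0 - acc for acc in automated_accuracy]
print(hardness)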