Research Institute in Information and Language Processing
Recent Submissions
-
Web impact factors and search engine coverage
Search engines index only a proportion of the web, and this proportion is not determined randomly but by algorithms that take into account the properties that impact factors measure. A survey was conducted to test the coverage of search engines and to decide whether their partial coverage is indeed an obstacle to using them to calculate web impact factors. The results indicate that search engine coverage, even of large national domains, is extremely uneven and would be likely to lead to misleading calculations.
-
A High Precision Information Retrieval Method for WiQA
This paper presents Wolverhampton University’s participation in the WiQA competition. The method chosen for this task combines a high-precision but low-recall information retrieval approach with a greedy sentence ranking algorithm. High precision is ensured by querying the search engine with the exact topic, so that only sentences which contain the topic are retrieved. In one of the runs, the set of retrieved sentences is expanded using coreferential relations between sentences. The greedy ranking algorithm selects one sentence at a time, always the one which adds the most information to the set of sentences without repeating the existing information too much. The evaluation revealed that the method achieves a performance similar to other systems participating in the competition and that the run which uses coreference obtains the highest MRR score among all the participants.
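The greedy selection step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-overlap scoring, the `penalty` parameter, and the names `tokens` and `greedy_rank` are all assumptions made for illustration, standing in for whatever measure of new and repeated information the paper actually uses.

```python
def tokens(sentence):
    """Lowercased word set; a crude stand-in for real content analysis (assumption)."""
    return {w.strip(".,;:!?()\"'").lower() for w in sentence.split()} - {""}


def greedy_rank(sentences, known_text="", penalty=0.5):
    """Rank sentences greedily: at each step pick the sentence that adds the
    most unseen words, minus a penalty for words that are already covered.

    `penalty` is a hypothetical knob controlling how strongly repetition
    is discouraged; it does not come from the paper.
    """
    covered = tokens(known_text)
    remaining = list(sentences)
    ranked = []
    while remaining:
        def gain(sentence):
            words = tokens(sentence)
            return len(words - covered) - penalty * len(words & covered)

        best = max(remaining, key=gain)   # ties go to the earliest candidate
        ranked.append(best)
        covered |= tokens(best)           # the chosen sentence is now "known"
        remaining.remove(best)
    return ranked


example = [
    "Wolverhampton is a city in England.",
    "Wolverhampton is a city in the West Midlands of England.",
    "The city has a university.",
]
ranked = greedy_rank(example)
```

In this toy example the longest sentence is picked first, and the near-duplicate of it is pushed to the end because almost all of its words are already covered.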
-
NP animacy identification for anaphora resolution
In anaphora resolution for English, animacy identification can play an integral role in the application of agreement restrictions between pronouns and candidates, and as a result, can improve the accuracy of anaphora resolution systems. In this paper, two methods for animacy identification are proposed and evaluated using intrinsic and extrinsic measures. The first method is a rule-based one which uses information about the unique beginners in WordNet to classify NPs on the basis of their animacy. The second method relies on a machine learning algorithm which exploits a WordNet enriched with animacy information for each sense. The effect of word sense disambiguation on the two methods is also assessed. The intrinsic evaluation reveals that the machine learning method reaches human levels of performance. The extrinsic evaluation demonstrates that animacy identification can be beneficial in anaphora resolution, especially in the cases where animate entities are identified with high precision.
-
Refined Salience Weighting and Error Analysis in Anaphora Resolution
In this paper, the behaviour of an existing pronominal anaphora resolution system is modified so that different types of pronoun are treated in different ways. Weights are derived using a genetic algorithm for the outcomes of tests applied by this branching algorithm. Detailed evaluation and error analysis are undertaken. Proposals for future research are put forward.