A Novel Entity Type Filtering Model for Related Entity Finding

Authors

  • Junsan Zhang, Beijing Jiaotong University
  • Youli Qu
  • Shengfeng Tian

Keywords:

related entity finding, entity, entity ranking, type filtering

Abstract

Entities are important carriers of information in Web pages. Searchers often want a ranked list of relevant entities directly rather than a list of documents, which makes related entity finding (REF) a meaningful research problem. In this paper we investigate the most important task of REF: entity ranking. A common issue in entity ranking is wrong entity type: some retrieved entities do not belong to the target entity type. To address it, we propose a novel entity type filtering model in which the target types consist of the originally assigned type and a new type that is automatically acquired from the topic's narrative; entities of any other type are filtered out. For the query, we propose a method that processes the original narrative to obtain a new query composed of noun and verb phrases. Experimental results show that our type filtering model outperforms the traditional filtering model in terms of both precision and recall. The experiments also show that our method of acquiring a new query is feasible.
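
The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not say how the new type is acquired from the narrative or how types are matched, so this example assumes the narrative-derived types are already available as strings and uses exact, case-insensitive matching. The names `Candidate`, `filter_by_type`, and all sample entities and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A retrieved entity with a type label and a retrieval score (hypothetical structure)."""
    name: str
    entity_type: str
    score: float


def filter_by_type(candidates, assigned_type, narrative_types):
    """Keep candidates whose type is in the target set, ranked by score.

    The target set is the originally assigned type plus any types derived
    from the topic narrative. Exact string matching is an assumption made
    for this sketch only.
    """
    target_types = {assigned_type.lower()} | {t.lower() for t in narrative_types}
    kept = [c for c in candidates if c.entity_type.lower() in target_types]
    return sorted(kept, key=lambda c: c.score, reverse=True)


# Made-up candidates for illustration.
candidates = [
    Candidate("Example Airlines", "organization", 0.91),
    Candidate("Jane Doe", "person", 0.87),
    Candidate("Example Air Cargo", "airline", 0.80),
    Candidate("Example City", "location", 0.75),
]

# Assigned type "organization" extended with a narrative-derived type "airline".
ranked = filter_by_type(candidates, "organization", ["airline"])
print([c.name for c in ranked])
```

With only the assigned type, the candidate labeled "airline" would be discarded; extending the target set with the narrative-derived type keeps it, while the person and location candidates are still removed.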

Int. J. of Computers, Communication and Control (Date of submission: January 15, 2012)

Published

2014-01-03
