A Novel Entity Type Filtering Model for Related Entity Finding
Keywords:
related entity finding, entity, entity ranking, type filtering
Abstract
Entities are important information carriers in Web pages. Searchers often want a ranked list of relevant entities directly rather than a list of documents, which makes research on related entity finding (REF) worthwhile. In this paper we investigate the most important task of REF: entity ranking. To address the problem of wrong entity types in entity ranking, where some retrieved entities do not belong to the target entity type, we propose a novel entity type filtering model. In this model, the target types consist of the originally assigned type and a new type acquired automatically from the topic's narrative, and together they are used to filter out wrong-type entities. For the query, we propose a method that processes the original narrative to obtain a new query composed of noun and verb phrases. Experimental results show that our type filtering model outperforms the traditional filtering model across precision and recall levels. The experiments also show that our method of acquiring a new query is feasible.
References
K. Balog, A. P. de Vries, P. Serdyukov, P. Thomas, and T. Westerveld, Overview of the TREC 2009 entity track, In TREC '09, 2009.
E. Riloff, Automatically generating extraction patterns from untagged text, In AAAI, Vol.2, 1044-1049, 1996.
E. M. Voorhees, Overview of the TREC 2002 Question Answering Track, In TREC '02, 115-123, 2009.
K. Balog, People Search in the Enterprise. PhD thesis, University of Amsterdam, 2008.
K. Balog, L. Azzopardi, and M. de Rijke, A language modeling framework for expert finding, Inf. Proc. and Man., 45(1), 1-19, 2009. http://dx.doi.org/10.1016/j.ipm.2008.06.003
J. Pehcevski, J. A. Thom, A.-M. Vercoustre, and V. Naumovski, Entity ranking in Wikipedia: utilising categories, links and topic difficulty prediction, Information Retrieval, 13(5), 568-600, 2010. http://dx.doi.org/10.1007/s10791-009-9125-9
W. Zheng, S. Gottipati, J. Jiang, and H. Fang, UDEL/SMU at TREC 2009 Entity Track, In TREC '09, 2009.
Q. Yang, P. Jiang, C. Zhang, and Z. Niu, Experiments on related entity finding track at TREC 2009, In TREC '09, 2009.
Y. Wu and H. Kashioka, NiCT at TREC 2009: Employing three models for Entity Ranking Track, In TREC '09, 2009.
Y. Fang et al., Entity retrieval with hierarchical relevance model, In TREC '09, 2009.
R. Kaptein, P. Serdyukov, A. de Vries, and J. Kamps, Entity ranking using Wikipedia as a pivot, In CIKM, 2010.
M. Bron, K. Balog, and M. de Rijke, Ranking related entities: components and analyses, In CIKM, 2010.
P. Serdyukov and A. de Vries, Delft university at the TREC 2009 Entity Track: Ranking wikipedia entities, In TREC '09, 2009.
F. Song and W. B. Croft, A general language model for information retrieval, In CIKM '99, 77-82, 1999.
C. D. Manning and H. Schuetze, Foundations of Statistical Natural Language Processing, The MIT Press, 1999.
A. de Vries et al., Overview of the INEX 2007 Entity Ranking Track, 245-251, 2007.
Int. J. of Computers, Communication and Control (Date of submission: January 15, 2012)
E. Brill, Transformation-based error-driven learning and natural language processing: a case study in part of speech tagging, Computational Linguistics, 21(4), 543-565, 1995.
L. Ramshaw and M. Marcus, Text Chunking Using Transformation-Based Learning, In Proc. of the Third ACL Workshop on Very Large Corpora, MIT, 1995.
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.