
Automatic named-entity set expansion from the Web using a mutual importance measure

  • Dong-A University

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes an effective set expansion system that automatically extracts named entities (NEs) from the Web to construct NE domain dictionaries. The purpose of a set expansion system is to expand a given partial set of objects into a more complete set. Google Sets is a representative set expansion system that uses the Web. The proposed system uses several seed words as initial information to collect Web pages that are likely to contain many NEs, and then extracts NE candidates from the collected pages. A mutual-importance measurement technique is developed to estimate the importance scores of the NE candidates, and these scores are then used to rank the candidates. Real NEs can then be extracted easily from the ordered list of NE candidates. In experiments, the proposed method achieved a mean average precision (MAP) of 95.60% in 7 Korean NE domains and 99.98% in 8 English NE domains. In particular, the accuracy of the proposed system on the English domains is higher than that of Google Sets.
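The MAP figures reported above are computed over ranked candidate lists. As a rough illustration of the metric only (not the paper's evaluation code), the following minimal sketch computes the standard average precision for one ranked list and averages it across domains. The function names and the toy data are hypothetical, and the sketch assumes that each ranked list contains no duplicates and that average precision is normalized by the size of the gold set; the abstract does not say which normalization the authors used.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked list against a gold set.

    Assumption: `ranked` has no duplicates, and the score is normalized
    by the number of gold items (the common textbook definition).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of average precision over several (ranked list, gold set) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical toy domains, for illustration only.
    toy_runs = [
        (["seoul", "busan", "apple"], {"seoul", "busan"}),
        (["apple", "paris"], {"paris"}),
    ]
    print(f"MAP = {mean_average_precision(toy_runs):.4f}")
```

Here each domain contributes one ranked candidate list and one gold set, so a MAP near 100% means that almost every true NE appears before any false candidate in the ranking.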

Original language: English
Pages (from-to): 5029-5040
Number of pages: 12
Journal: Information
Volume: 15
Issue number: 11 B
State: Published - Nov 2012
Externally published: Yes

Keywords

  • Mutual importance measurement (MIM)
  • Named entity
  • Named entity recognition
  • Set expansion
