Multi-document Summarization by Creating Synthetic Document Vector Based on Language Model

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Multi-document summarization aims to create a summary covering the major information that multiple documents share. Existing methods for this task rely on hand-crafted features of words and sentences. However, it is difficult to identify the core content of each document with hand-crafted features, because they capture only limited information about the given documents. Identifying the major information is further limited because documents with the same meaning are often paraphrased differently depending on their writers. It is therefore necessary to represent the semantic meaning of documents as well as sentences through natural language understanding. In this paper, we propose a new multi-document summarization system that creates a synthetic document vector covering the whole set of documents based on a language model, which is well known for learning semantic features of text. We experimented with the DUC 2004 dataset provided by the Document Understanding Conference (DUC), and the results show that our method summarizes multiple documents effectively based on their core contents.

Original language: English
Title of host publication: Proceedings - 2016 Joint 8th International Conference on Soft Computing and Intelligent Systems and 2016 17th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 605-609
Number of pages: 5
ISBN (Electronic): 9781467390415
State: Published - 28 Dec 2016
Externally published: Yes
Event: 8th Joint International Conference on Soft Computing and Intelligent Systems and 17th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2016 - Sapporo, Hokkaido, Japan
Duration: 25 Aug 2016 to 28 Aug 2016

Publication series

Name: Proceedings - 2016 Joint 8th International Conference on Soft Computing and Intelligent Systems and 2016 17th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2016

Conference

Conference: 8th Joint International Conference on Soft Computing and Intelligent Systems and 17th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2016
Country/Territory: Japan
City: Sapporo, Hokkaido
Period: 25/08/16 to 28/08/16

Keywords

  • Core content
  • Language model
  • Major Information
  • Multi-document summarization
  • Synthetic document vector
