Efficient question classification and retrieval using category information and word embedding on cQA services

Kyoungman Bae, Youngjoong Ko

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Question classification, the task of automatically assigning unlabeled questions to predefined categories (or topics), and the effective retrieval of similar questions are crucial aspects of an effective cQA service. First, we address the problems associated with estimating and utilizing the distribution of words in each category when computing word weights. We then apply an automatic expansion word generation technique, based on our proposed weighting method and pseudo-relevance feedback, to question classification. Second, to address the lexical gap problem in question retrieval, we define the case frame of a sentence using components extracted from the sentence, and derive a similarity measure based on the case frame and word embeddings to determine the similarity between two sentences. These similarities are then used to reorder the results of the first retrieval model. Consequently, the proposed methods significantly improve the performance of question classification and retrieval.
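The reordering step can be illustrated with a minimal sketch. It is not the authors' method: the case frame is omitted, sentence similarity is approximated by the cosine of averaged word vectors, and the toy embeddings, the `rerank` helper, and the interpolation weight `alpha` are all hypothetical names and values chosen for illustration.

```python
import math

# Toy 3-dimensional word vectors; hypothetical values for illustration only.
EMBEDDINGS = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.1],
    "battery": [0.1, 0.9, 0.0],
    "charge": [0.2, 0.8, 0.1],
    "recipe": [0.0, 0.1, 0.9],
    "pasta": [0.1, 0.0, 0.9],
}


def sentence_vector(text):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(query, candidates, alpha=0.5):
    """Reorder (question, first_pass_score) pairs.

    The final score interpolates the first-pass retrieval score with the
    embedding similarity; alpha is an assumed weight, not a value from the paper.
    """
    query_vec = sentence_vector(query)
    rescored = []
    for question, base_score in candidates:
        cand_vec = sentence_vector(question)
        sim = cosine(query_vec, cand_vec) if query_vec and cand_vec else 0.0
        rescored.append((question, alpha * base_score + (1 - alpha) * sim))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# The query shares no words with either candidate (the lexical gap),
# yet the embedding similarity promotes the semantically closer question.
first_pass = [("pasta recipe", 0.6), ("notebook charge", 0.5)]
reranked = rerank("laptop battery", first_pass)
```

In this toy setting the first retrieval model ranks "pasta recipe" higher, while the combined score moves "notebook charge" to the top, since its averaged vector lies close to that of the query.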

Original language: English
Pages (from-to): 27-49
Number of pages: 23
Journal: Journal of Intelligent Information Systems
Volume: 53
Issue number: 1
State: Published - 15 Aug 2019
Externally published: Yes

Keywords

  • Category information
  • Pseudo-relevance feedback
  • Question classification
  • Question expansion
  • Word weighting method
