Abstract
Question classification, the task of automatically assigning unlabeled questions to predefined categories (or topics), and the effective retrieval of similar questions are crucial aspects of an effective community question answering (cQA) service. First, we address the problems of estimating the distribution of words in each category and using that distribution in word weights. We then apply to question classification an automatic expansion-word generation technique based on our proposed weighting method and pseudo-relevance feedback. Second, to address the lexical gap problem in question retrieval, we define the case frame of a sentence from its extracted components and derive a similarity measure, based on the case frame and word embeddings, that determines the similarity between two sentences. These similarities are then used to reorder the results of the first retrieval model. As a result, the proposed methods significantly improve the performance of both question classification and question retrieval.
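The reordering step can be illustrated with a minimal sketch. It is not the authors' method: it swaps in plain averaged word vectors and cosine similarity for the case-frame-based measure, and the toy vectors, the `rerank` function, and the interpolation weight `alpha` are all hypothetical names and values chosen for illustration.

```python
import math

# Hypothetical toy word vectors; a real system would load trained embeddings.
TOY_VECTORS = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.1],
    "battery": [0.1, 0.9, 0.0],
    "charge": [0.2, 0.8, 0.1],
    "recipe": [0.0, 0.1, 0.9],
    "pasta": [0.1, 0.0, 0.9],
}


def sentence_vector(tokens):
    """Average the vectors of known tokens; return None if none are known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank(query_tokens, candidates, alpha=0.5):
    """Reorder first-stage results by mixing their score with embedding similarity.

    candidates: list of (tokens, first_stage_score), scores assumed to lie in [0, 1].
    alpha: assumed interpolation weight given to the first-stage score.
    """
    q = sentence_vector(query_tokens)
    rescored = []
    for tokens, score in candidates:
        c = sentence_vector(tokens)
        sim = cosine(q, c) if q is not None and c is not None else 0.0
        rescored.append((alpha * score + (1 - alpha) * sim, tokens))
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return [tokens for _, tokens in rescored]


query = ["laptop", "battery"]
candidates = [
    (["pasta", "recipe"], 0.6),     # ranked first by the first-stage model
    (["notebook", "charge"], 0.5),  # shares no words with the query
]
reranked = rerank(query, candidates)
```

In this toy setup the second candidate shares no surface words with the query, which is the lexical gap the abstract describes, yet its embedding similarity is high enough to move it above the first-stage winner.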
| Original language | English |
|---|---|
| Pages (from-to) | 27-49 |
| Number of pages | 23 |
| Journal | Journal of Intelligent Information Systems |
| Volume | 53 |
| Issue number | 1 |
| State | Published - 15 Aug 2019 |
| Externally published | Yes |
Keywords
- Category information
- Pseudo-relevance feedback
- Question classification
- Question expansion
- Word weighting method
Fingerprint
Dive into the research topics of 'Efficient question classification and retrieval using category information and word embedding on cQA services'. Together they form a unique fingerprint.