How to utilize syllable distribution patterns as the input of LSTM for Korean morphological analysis

Hyemin Kim, Seon Yang, Youngjoong Ko

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

This paper proposes the use of syllable distribution patterns as deep learning inputs for morphological analysis. The proposed syllable distribution pattern comprises two parts: a distributed syllable embedding vector and a morpheme syllable-level distribution pattern. As a learning method, we utilize bidirectional long short-term memory with a conditional random field layer (Bi-LSTM-CRF) for Korean part-of-speech tagging tasks. After syllable-level outputs are generated by Bi-LSTM-CRF, a morpheme restoration process is performed utilizing pre-analyzed dictionaries that were automatically created from a training corpus. Experimental results reveal outstanding performance for the proposed method with an F1-score of 98.65%.
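The abstract does not spell out how the two parts of the input are computed, so the following is only a minimal sketch of one plausible reading: the "morpheme syllable-level distribution pattern" is taken to be the relative frequency with which each syllable carries each syllable-level tag in the training corpus, and it is concatenated with that syllable's embedding vector before being fed to the Bi-LSTM-CRF. The function names, the B-/I- tag scheme, the toy corpus, and the two-dimensional embeddings are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_distribution_patterns(tagged_sentences):
    """Map each syllable to its relative tag frequencies in the training data.

    tagged_sentences: list of sentences, each a list of (syllable, tag) pairs.
    Returns (tags, patterns) where tags is the sorted tag inventory and
    patterns[syllable] is a list of frequencies aligned with tags.
    """
    counts = defaultdict(Counter)
    tagset = set()
    for sentence in tagged_sentences:
        for syllable, tag in sentence:
            counts[syllable][tag] += 1
            tagset.add(tag)

    tags = sorted(tagset)
    patterns = {}
    for syllable, tag_counts in counts.items():
        total = sum(tag_counts.values())
        patterns[syllable] = [tag_counts[t] / total for t in tags]
    return tags, patterns


def make_input_vector(syllable, embeddings, patterns, n_tags):
    """Concatenate the syllable embedding with its distribution pattern.

    Syllables missing from either table fall back to zero vectors
    (an assumption; the paper may handle unknown syllables differently).
    """
    dim = len(next(iter(embeddings.values())))
    emb = embeddings.get(syllable, [0.0] * dim)
    pat = patterns.get(syllable, [0.0] * n_tags)
    return list(emb) + list(pat)


# Hypothetical toy corpus with syllable-level B-/I- POS tags.
toy_corpus = [
    [("나", "B-NP"), ("는", "B-JX")],
    [("나", "B-NP"), ("는", "B-ETM")],
    [("하", "B-NR"), ("나", "I-NR")],
]
tags, patterns = build_distribution_patterns(toy_corpus)

# Hypothetical 2-dimensional embeddings; real ones would be learned.
toy_embeddings = {"나": [0.1, 0.2], "는": [0.3, 0.4], "하": [0.5, 0.6]}
vector = make_input_vector("나", toy_embeddings, patterns, len(tags))
```

In this sketch each per-syllable input has length `embedding_dim + len(tags)`; a sequence of such vectors would then be passed to the tagger, whose syllable-level output is what the morpheme restoration step described in the abstract operates on.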

Original language: English
Pages (from-to): 39-45
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 120
DOIs
State: Published - 1 Apr 2019
Externally published: Yes

Keywords

  • Bi-LSTM-CRF
  • Morpheme distribution
  • Morphological analysis
  • POS tagging
  • Syllable distribution pattern
  • Syllable embedding
