Abstract
This paper presents a new method for automatic word spacing in Korean. The main challenge is to reduce memory usage and processing time enough to suit mobile devices while maintaining high performance. Because typing on mobile devices is currently more difficult than on ordinary PCs, spacing errors, such as text written with no spaces at all, occur relatively frequently on them. As more text is typed on mobile devices to access social network services, more people are likely to be confused by spacing errors in communication. We propose a novel "discriminative probabilistic model" in which syllables are classified into eight classes of "syllable-based space patterns". As baseline systems, we employ two different types of Conditional Random Field models, which are regarded as among the highest-performing models for sequence labeling tasks such as word spacing. Our experimental results show that the proposed system performs better than the baseline systems while using significantly less memory and spacing time.
| Original language | English |
|---|---|
| Pages (from-to) | 5055-5065 |
| Number of pages | 11 |
| Journal | International Journal of Innovative Computing, Information and Control |
| Volume | 8 |
| Issue number | 7 B |
| State | Published - Jul 2012 |
| Externally published | Yes |
Keywords
- Discriminative probabilistic model
- Mobile device
- Syllable-based space pattern
- Word spacing