Abstract
Image emotion analysis has gained notable attention owing to the growing importance of computationally modeling human emotions. Most previous studies have focused on classifying the feelings evoked by an image into predefined emotion categories. Because such categorical approaches cannot address the ambiguity and complexity of human emotions, recent studies have adopted dimensional approaches instead. However, a limitation remains: far fewer dimensional datasets than categorical datasets are available for model training. We propose four types of frameworks that use categorical datasets to predict emotion values for a given image in the valence–arousal (VA) space. Specifically, our proposed framework is trained to predict continuous emotion values under the supervision of categorical labels. Extensive experiments demonstrate that the predictions of our approach correlate positively with the actual VA values of the dimensional dataset. In addition, our framework improves further when a small number of dimensional datasets are available for fine-tuning.
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 455-464 |
| Number of pages | 10 |
| Journal | IEEE Transactions on Cognitive and Developmental Systems |
| Volume | 17 |
| Issue number | 3 |
| State | Published - 2025 |
Keywords
- Dimensional emotion model
- image emotion detection
- neural network
- valence-arousal