TY - GEN
T1 - Richness evaluation of blogs on its topics using a generative model and probabilistic analysis
AU - Park, Jinhee
AU - Lee, Jaedong
AU - Jung, Hye Wuk
AU - Lee, Jee Hyong
PY - 2012
Y1 - 2012
N2 - Blogs are now one of the important web services for publishing and sharing information. Accordingly, evaluating the keywords in blogs is an important research topic for effective and efficient classification and retrieval of blogs in the blogosphere. In this paper, we propose a method to identify important keywords in a blog. To identify such keywords, we consider web context, assuming that blog documents are generated from web contexts by the proposed generative model. Therefore, if the contexts of a keyword on the web are reflected well in the blog, we may regard the keyword as essential because the blog is rich on that keyword. We clustered the blog articles on a given keyword into several subtopics using LDA (Latent Dirichlet Allocation) and compared the clusters with the web context documents obtained by web search. Finally, we evaluated the richness of the blog on each keyword.
AB - Blogs are now one of the important web services for publishing and sharing information. Accordingly, evaluating the keywords in blogs is an important research topic for effective and efficient classification and retrieval of blogs in the blogosphere. In this paper, we propose a method to identify important keywords in a blog. To identify such keywords, we consider web context, assuming that blog documents are generated from web contexts by the proposed generative model. Therefore, if the contexts of a keyword on the web are reflected well in the blog, we may regard the keyword as essential because the blog is rich on that keyword. We clustered the blog articles on a given keyword into several subtopics using LDA (Latent Dirichlet Allocation) and compared the clusters with the web context documents obtained by web search. Finally, we evaluated the richness of the blog on each keyword.
KW - Data Mining
KW - Information Retrieval
KW - Semantic Web
KW - Text Mining
UR - https://www.scopus.com/pages/publications/84877804280
U2 - 10.1109/SCIS-ISIS.2012.6505393
DO - 10.1109/SCIS-ISIS.2012.6505393
M3 - Conference contribution
AN - SCOPUS:84877804280
SN - 9781467327428
T3 - 6th International Conference on Soft Computing and Intelligent Systems, and 13th International Symposium on Advanced Intelligence Systems, SCIS/ISIS 2012
SP - 381
EP - 385
BT - 6th International Conference on Soft Computing and Intelligent Systems, and 13th International Symposium on Advanced Intelligence Systems, SCIS/ISIS 2012
T2 - 2012 Joint 6th International Conference on Soft Computing and Intelligent Systems, SCIS 2012 and 13th International Symposium on Advanced Intelligence Systems, ISIS 2012
Y2 - 20 November 2012 through 24 November 2012
ER -