TY - GEN
T1 - Blog topic analysis using TF smoothing and LDA
AU - Lee, Sungwoo
AU - Lee, Jaedong
AU - Park, Chang Yong
AU - Lee, Jee Hyong
PY - 2013
Y1 - 2013
N2 - In the era of Web 2.0, the number of blogs has increased explosively. With the appearance of social network services, blogs have become places for sharing professional knowledge and personal branding. Therefore, in order to understand topic trends or to analyze the content of blogs, time-sensitive topic extraction and topic change analysis are important and necessary. In previous studies, most topic extraction models extracted topic words independently from each time slice and then tried to combine them. However, these methods did not perform well in analyzing topic trends because the topics extracted from different time slices are independent. To cope with this problem, we propose a term frequency smoothing method that weaves time slices together so that more closely related topics are extracted from each time slice and a better topic trend analysis is generated. To extract topics from the smoothed term frequencies, LDA, a generative topic model, is adopted. An evaluation of the proposed method on IT blogs shows that it can effectively discover meaningful topic patterns and topic words.
AB - In the era of Web 2.0, the number of blogs has increased explosively. With the appearance of social network services, blogs have become places for sharing professional knowledge and personal branding. Therefore, in order to understand topic trends or to analyze the content of blogs, time-sensitive topic extraction and topic change analysis are important and necessary. In previous studies, most topic extraction models extracted topic words independently from each time slice and then tried to combine them. However, these methods did not perform well in analyzing topic trends because the topics extracted from different time slices are independent. To cope with this problem, we propose a term frequency smoothing method that weaves time slices together so that more closely related topics are extracted from each time slice and a better topic trend analysis is generated. To extract topics from the smoothed term frequencies, LDA, a generative topic model, is adopted. An evaluation of the proposed method on IT blogs shows that it can effectively discover meaningful topic patterns and topic words.
KW - Blog text mining
KW - LDA
KW - Term frequency smoothing
KW - Topic trend change
UR - https://www.scopus.com/pages/publications/84875842517
U2 - 10.1145/2448556.2448631
DO - 10.1145/2448556.2448631
M3 - Conference contribution
AN - SCOPUS:84875842517
SN - 9781450319584
T3 - Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2013
BT - Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2013
T2 - 7th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2013
Y2 - 17 January 2013 through 19 January 2013
ER -