ZHANG Yinming, HUANG Tinglei, LIN Ke, ZHANG Qiangqiang. An improved k-means algorithm for text clustering[J]. Journal of Guilin University of Electronic Technology, 2016, 36(4): 311-314.

An improved k-means algorithm for text clustering

  • Random selection of the initial cluster centroids in k-means text clustering tends to trap the results in local optima, while outliers and an undetermined number of clusters reduce accuracy and slow convergence. To address these problems, an improved k-means algorithm for text clustering is proposed. The algorithm uses FP-growth to mine the frequent itemsets of the texts, filters them to obtain the core frequent itemsets, and uses the core frequent itemsets to generate both the initial cluster centroids and the number of clusters. k-means is then run with the generated centroids and cluster count. Text clustering experiments on a Sina microblog dataset show that the improved algorithm effectively improves clustering accuracy, accelerates convergence, and is robust.
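
The pipeline described in the abstract (mine frequent itemsets, keep a core subset, derive the initial centroids and the cluster count from it, then run k-means) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: a brute-force itemset counter stands in for FP-growth, the greedy "disjoint itemsets" rule in `core_itemsets` is a hypothetical filter (the abstract does not state the actual criterion), documents are plain term-count vectors, and all function names and the toy data are invented for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(docs, min_support, max_size=2):
    """Brute-force stand-in for FP-growth: count term sets up to max_size."""
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc))
        for size in range(1, max_size + 1):
            counts.update(combinations(terms, size))
    return {frozenset(s): c for s, c in counts.items() if c >= min_support}


def core_itemsets(itemsets):
    """Hypothetical filter: greedily keep larger, better-supported itemsets
    that share no term with an itemset already kept."""
    ranked = sorted(itemsets, key=lambda s: (-len(s), -itemsets[s], sorted(s)))
    core, used = [], set()
    for s in ranked:
        if used.isdisjoint(s):
            core.append(s)
            used |= s
    return core


def vectorize(docs):
    """Turn each tokenized document into a term-count vector."""
    vocab = sorted({t for doc in docs for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    vectors = []
    for doc in docs:
        v = [0.0] * len(vocab)
        for t in doc:
            v[index[t]] += 1.0
        vectors.append(v)
    return vectors


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(docs, min_support, max_iter=100):
    """k-means whose k and initial centroids come from the core itemsets."""
    vectors = vectorize(docs)
    core = core_itemsets(frequent_itemsets(docs, min_support))
    if not core:
        raise ValueError("no frequent itemsets at this support threshold")
    # One initial centroid per core itemset: the mean of the documents
    # that contain every term of that itemset. k is len(core).
    centroids = [
        mean([v for v, d in zip(vectors, docs) if s <= set(d)]) for s in core
    ]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return labels, centroids


# Toy, made-up documents (already tokenized) for illustration only.
docs = [
    ["stock", "market", "rise"],
    ["stock", "market", "fall"],
    ["stock", "market", "trade"],
    ["football", "match", "goal"],
    ["football", "match", "win"],
    ["football", "match", "team"],
]
labels, centroids = seeded_kmeans(docs, min_support=3)
print(len(centroids), labels)
```

Because the centroids start inside dense regions of the data rather than at random points, a run like this is deterministic and typically needs few iterations; a real FP-growth implementation would replace `frequent_itemsets` on corpora where brute-force counting is too slow.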