An improved k-means algorithm for text clustering

ZHANG Yinming; HUANG Tinglei; LIN Ke; ZHANG Qiangqiang

ZHANG Yinming, HUANG Tinglei, LIN Ke, ZHANG Qiangqiang. An improved k-means algorithm for text clustering[J]. Journal of Guilin University of Electronic Technology, 2016, 36(4): 311-314.

Abstract

In k-means text clustering, random selection of the initial cluster centroids tends to trap the result in a local optimum, while isolated points and an undetermined number of clusters reduce accuracy and slow convergence. To address these problems, an improved k-means algorithm for text clustering is proposed. The FP-growth algorithm is first used to mine the frequent itemsets of the texts. These itemsets are then filtered to obtain the core frequent itemsets, which are used to generate the initial cluster centroids and the number of clusters. Finally, k-means is applied to the texts with the generated initial centroids and cluster number. Text clustering experiments on a Sina microblog dataset show that the improved k-means algorithm effectively improves clustering accuracy, accelerates convergence, and is robust.
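
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: frequent itemsets are mined by brute-force enumeration as a simple stand-in for FP-growth, "core" itemsets are taken to be the maximal frequent itemsets (the abstract does not state the actual filtering rule), texts are represented as binary bag-of-words vectors, and all function names (`frequent_itemsets`, `core_itemsets`, `cluster_texts`, and so on) are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(docs, min_support, max_size=2):
    """Brute-force frequent itemset mining (stand-in for FP-growth)."""
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, max_size + 1):
        for items in combinations(vocab, size):
            # Support = number of documents containing every item.
            support = sum(1 for d in docs if set(items) <= d)
            if support >= min_support:
                result[frozenset(items)] = support
    return result


def core_itemsets(itemsets):
    """Assumed filter: keep itemsets not contained in a larger frequent one."""
    return [s for s in itemsets if not any(s < t for t in itemsets)]


def to_vector(doc, vocab):
    """Binary bag-of-words vector for one document."""
    return [1.0 if w in doc else 0.0 for w in vocab]


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, centroids, max_iter=100):
    """Standard k-means iterations starting from the given centroids."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:  # leave a centroid unchanged if its cluster is empty
                centroids[j] = mean(members)
    return labels


def cluster_texts(docs, min_support):
    """Derive k and the initial centroids from core itemsets, then run k-means."""
    vocab = sorted(set().union(*docs))
    cores = sorted(core_itemsets(frequent_itemsets(docs, min_support)), key=sorted)
    vectors = [to_vector(d, vocab) for d in docs]
    # One initial centroid per core itemset: the mean of the texts that contain it.
    centroids = [
        mean([v for v, d in zip(vectors, docs) if s <= d]) for s in cores
    ]
    return kmeans(vectors, centroids), len(centroids)
```

Here `docs` is a list of token sets, one per text, and `cluster_texts(docs, min_support)` returns the cluster label of each text together with the derived number of clusters. Because both the centroids and the cluster count come from the mined itemsets, no random seeding is involved in this sketch.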
