Affiliation of Author(s):
College of Information and Control Engineering
Journal:
The 2nd International Conference on Business Intelligence and Financial Engineering
Key Words:
Chinese text categorization; attribute bagging; vector space model; information gain
Abstract:
To improve the precision and recall of a Chinese text classifier, this paper uses an improved bagging algorithm, attribute bagging. Documents are represented with the vector space model, and Information Gain is used to select term features. Attributes are re-sampled to obtain multiple training sets, and kNN is selected as the individual classifier. The final classification is obtained by voting. Experiments show that attribute bagging achieves lower error and better performance than both bagging and kNN in Chinese text categorization.
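The procedure described in the abstract (re-sample attributes, train one kNN per attribute subset, combine by voting) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function names, the toy term-weight vectors, the squared Euclidean distance, the subset size, and the number of classifiers are all assumptions, and the Information Gain feature-selection step is omitted.

```python
import random
from collections import Counter

# Toy document vectors (hypothetical term weights), not data from the paper.
TRAIN_X = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.7, 0.9, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.1, 0.7, 0.9],
]
TRAIN_Y = ["sports", "sports", "sports", "finance", "finance", "finance"]


def knn_predict(train_x, train_y, x, features, k=3):
    """Classify x by majority label among its k nearest training vectors,
    measuring distance only over the given feature indices."""
    def dist(a, b):
        # Squared Euclidean distance (an assumed choice of metric).
        return sum((a[j] - b[j]) ** 2 for j in features)

    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def attribute_bagging_predict(train_x, train_y, x,
                              n_classifiers=11, subset_size=2, k=3, seed=0):
    """Draw a random attribute subset for each kNN classifier and
    return the label that receives the most votes."""
    rng = random.Random(seed)
    n_features = len(train_x[0])
    votes = Counter()
    for _ in range(n_classifiers):
        # Sample attributes without replacement within each subset.
        features = rng.sample(range(n_features), subset_size)
        votes[knn_predict(train_x, train_y, x, features, k)] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(attribute_bagging_predict(TRAIN_X, TRAIN_Y, [0.8, 0.7, 0.1, 0.0]))
```

Unlike ordinary bagging, which re-samples training documents, each classifier here sees every training document but only a subset of the attributes, so the individual kNN classifiers differ in the term features they compare.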