Classification for Unbalanced Dataset by an Improved KNN Algorithm Based on Weight
Posted: 2024-08-09
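- Affiliation:
- School of Information and Control Engineering
- Journal:
- INFORMATION (SCI)
- Keywords:
- Chinese keywords (translated): imbalanced dataset; classification; K-nearest neighbor algorithm; weight assignment model; genetic algorithm; K-means algorithm. English keywords: imbalanced dataset; classification; KNN; weight assig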
- Abstract:
- Based on an analysis of the shortcomings of the KNN (K-Nearest Neighbor) algorithm in classifying imbalanced datasets, a novel weight-based KNN approach (GAK-KNN for short) is presented. The key to GAK-KNN is a new weight assignment model that takes into account the adverse effects of the uneven distribution of training samples both between classes and within classes. The method has three steps: first, a K-means algorithm based on a genetic algorithm clusters the training sample set; next, a weight is computed for each training sample from the clustering results and the weight assignment model; finally, the improved KNN algorithm classifies the test samples. GAK-KNN can significantly improve the identification rate of minority samples and the overall classification performance.
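- Notes:
- 王超学
- Co-author:
- 潘正茂
- First author:
- 张涛 [duplicate name, pending confirmation], 董丽丽, 王超学
- Paper type:
- Journal article
- Volume:
- Vol. 15
- Issue:
- No. 11
- Page range:
- pp. 4983-4988
- Translated work:
- No
- Publication date:
- 2012-11-01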
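The three steps named in the abstract (cluster the training set, weight each training sample, classify with a weighted KNN vote) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method. The record gives neither the genetic-algorithm-based K-means nor the weight assignment model, so plain Lloyd's K-means stands in for the former and a hypothetical weight, 1 / (number of clusters in the class × size of the sample's cluster), stands in for the latter. All function names and the toy data are made up for the example.

```python
import math
from collections import Counter, defaultdict


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means with a deterministic start (first k points).

    Assumption: a stand-in for the GA-based K-means named in the abstract.
    """
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each non-empty cluster's centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def sample_weights(X, y, n_clusters=2):
    """Hypothetical weight model: cluster each class separately, then give a
    sample the weight 1 / (non-empty clusters in its class * its cluster size).
    Every class then carries total weight 1 (between-class balance) and every
    cluster in a class carries equal weight (within-class balance)."""
    weights = [0.0] * len(X)
    for label in set(y):
        idx = [i for i, lab in enumerate(y) if lab == label]
        assign = kmeans([X[i] for i in idx], min(n_clusters, len(idx)))
        sizes = Counter(assign)  # counts non-empty clusters only
        for i, a in zip(idx, assign):
            weights[i] = 1.0 / (len(sizes) * sizes[a])
    return weights


def weighted_knn_predict(X, y, weights, query, k=5):
    """Each of the k nearest neighbors votes with its sample weight."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]
    votes = defaultdict(float)
    for i in nearest:
        votes[y[i]] += weights[i]
    return max(votes, key=votes.get)


# Toy imbalanced data: six majority samples, two minority samples.
X = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (0.1, 0.1), (0.1, 0.3),
     (1.0, 1.0), (1.2, 1.0)]
y = ["maj"] * 6 + ["min"] * 2
query = (0.7, 0.7)

plain = weighted_knn_predict(X, y, [1.0] * len(X), query, k=5)
weighted = weighted_knn_predict(X, y, sample_weights(X, y), query, k=5)
print(plain, weighted)
```

With k = 5 the query's neighborhood holds three majority samples and two minority samples, so an unweighted vote picks the majority class, while the weighted vote favors the minority class. That is the kind of effect the abstract attributes to sample weighting, shown here only for this toy weight model.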



