Multimodal prototype fusion network for paper-cut image classification
Release date: 2025-09-20
- Impact factor:
- 4.9
- DOI:
- 10.1038/s40494-025-02036-8
- Journal:
- npj Heritage Science
- Abstract:
- This paper proposes a Multimodal Prototype Fusion Network (MPFN) to address challenges in paper-cut image classification, including artistic abstraction, imbalanced data, and unseen category adaptation. The framework introduces two variants: AMPFN, which dynamically fuses multimodal prototypes via cross-modal attention and residual learning, and IMPFN, a training-free model for rapid deployment. Leveraging CLIP for feature extraction, AMPFN achieves 90.71% accuracy (16-shot) on seen classes, while IMPFN attains 84.98% accuracy (16-shot) on unseen classes without training. Evaluations on paper-cut datasets and public benchmarks (PACS, ArtDL, CUB-200-2011) demonstrate superiority over existing methods. The approach mitigates data imbalance through n-shot prototypes and reduces computational costs via pre-trained features, proving robust in fine-grained and abstract art classification. This work offers a scalable solution for cultural heritage digitization and multimodal art analysis.
- Notes:
- Zhang, X., Chen, D. & Qin, Y. Multimodal prototype fusion network for paper-cut image classification. npj Herit. Sci. 13, 462 (2025). https://doi.org/10.1038/s40494-025-02036-8
- Paper type:
- Journal article
- Article number:
- 462
- Discipline category:
- Engineering
- First-level discipline:
- Computer Science and Technology
- Document type:
- J
- Volume:
- 13
- Issue:
- 462
- Page range:
- 1-14
- ISSN:
- 3059-3220
- Translation:
- No
- Publication date:
- 2025-01-01
- Indexed in:
- SCI
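
The abstract describes classifying images with n-shot prototypes built from pre-trained features, including a training-free variant. The sketch below illustrates that general idea only and is not the paper's AMPFN or IMPFN. It assumes feature vectors have already been extracted (for example by CLIP), and the fixed-weight fusion with `alpha`, the helper names, and the toy data are all hypothetical choices made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_prototypes(support, text_features, alpha=0.5):
    """Build one fused prototype per class without any training.

    support: {label: [image feature vectors]} (the n shots per class)
    text_features: {label: text feature vector}
    alpha: assumed fixed weight on the visual prototype (hypothetical).
    """
    prototypes = {}
    for label, feats in support.items():
        visual = l2_normalize(mean_vector([l2_normalize(f) for f in feats]))
        textual = l2_normalize(text_features[label])
        fused = [alpha * v + (1 - alpha) * t for v, t in zip(visual, textual)]
        prototypes[label] = l2_normalize(fused)
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype has the highest cosine similarity."""
    q = l2_normalize(query)
    return max(
        prototypes,
        key=lambda label: sum(a * b for a, b in zip(q, prototypes[label])),
    )


# Toy 3-dimensional features standing in for real embeddings (made up).
support = {
    "bird": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0]],
    "flower": [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]],
}
text_features = {"bird": [1.0, 0.0, 0.0], "flower": [0.0, 0.0, 1.0]}

prototypes = build_prototypes(support, text_features)
predicted = classify([0.8, 0.1, 0.1], prototypes)
```

Because each prototype is just an average of a few support features, every class contributes the same number of shots regardless of how many images it has overall, which is one way to read the abstract's remark about mitigating data imbalance.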