DOI number:
10.1038/s41467-025-67145-1
Key Words:
large language models; mis/disinformation; computational linguistics; information governance
Abstract:
The persuasive capability of large language models (LLMs) in generating mis/disinformation is widely recognized, but the linguistic ambiguity of such content and inconsistent findings on LLM-based detection reveal unresolved risks in information governance. To address the lack of Chinese datasets, this study compiles two datasets of Chinese AI mis/disinformation generated by multilingual models, covering both deepfakes and cheapfakes. Psycholinguistic and computational linguistic analyses reveal quality modulation effects for eight language features (including sentiment, cognition, and personal concerns), together with differences in toxicity scores and syntactic dependency distance. The study further examines key factors that influence how zero-shot LLMs comprehend and detect AI mis/disinformation. The results show that although implicit linguistic distinctions exist, the intrinsic detection capability of LLMs remains limited. Moreover, the quality modulation effects of the linguistic features of AI mis/disinformation may cause AI mis/disinformation detectors to fail. These findings highlight the major challenges of applying LLMs to information governance.
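As a rough illustration of the syntactic dependency distance measure mentioned in the abstract, the minimal Python sketch below computes a mean dependency distance over a Chinese sentence using spaCy. The pipeline name zh_core_web_sm and the exact distance definition (absolute token-index distance to the syntactic head, root excluded) are assumptions for illustration and may differ from the measure actually used in the paper.

```python
import spacy

# Assumes the spaCy Chinese pipeline "zh_core_web_sm" is installed:
#   python -m spacy download zh_core_web_sm
nlp = spacy.load("zh_core_web_sm")

def mean_dependency_distance(text: str) -> float:
    """Average absolute token-index distance between each token and its syntactic head."""
    doc = nlp(text)
    # The root token is its own head, so it is excluded from the average.
    distances = [abs(tok.i - tok.head.i) for tok in doc if tok.head is not tok]
    return sum(distances) / len(distances) if distances else 0.0

# Example usage on a short Chinese sentence about AI-generated misinformation.
print(mean_dependency_distance("人工智能生成的虚假信息正在迅速传播。"))
```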
Links to published journals: