Large Language Models (LLMs) are increasingly deployed in sensitive contexts where fairness and inclusivity are critical. Pronoun usage, especially concerning gender-neutral pronouns and neopronouns, remains a key challenge for responsible AI. Prior work, such as the MISGENDERED benchmark, revealed significant limitations in earlier LLMs' handling of inclusive pronouns, but was constrained to outdated models and limited evaluations. In this study, we introduce MISGENDERED+, an extended and updated benchmark for evaluating LLMs' pronoun fidelity. We benchmark five representative LLMs (GPT-4o, Claude 4, DeepSeek-V3, Qwen Turbo, and Qwen2.5) across zero-shot, few-shot, and gender identity inference settings. Our results show notable improvements over previous studies, especially in binary and gender-neutral pronoun accuracy. However, accuracy on neopronouns and reverse inference tasks remains inconsistent, underscoring persistent gaps in identity-sensitive reasoning. We discuss implications, model-specific observations, and avenues for future inclusive AI research.

Author Affiliations

Xushuo Tang

University of New South Wales

Yi Ding

University of New South Wales

Zhengyi Yang

University of New South Wales

Yin Chen

University of Technology Sydney

Yongrui Gu

Euler AI

Wenke Yang

University of New South Wales

Mingchen Ju

University of New South Wales

Xin Cao

University of New South Wales

Yongfei Liu

Euler AI

Wenjie Zhang

University of New South Wales

BibTeX

@inproceedings{tang2025understand,
  title = {Do They Understand Them? An Updated Evaluation on Nonbinary Pronoun Handling in Large Language Models},
  author = {Tang, Xushuo and Ding, Yi and Yang, Zhengyi and Chen, Yin and Gu, Yongrui and Yang, Wenke and Ju, Mingchen and Cao, Xin and Liu, Yongfei and Zhang, Wenjie},
  editor = {Liu, Miaomiao and Yu, Xin and Xu, Chang and Song, Yiliao},
  booktitle = {AI 2025: Advances in Artificial Intelligence},
  year = {2026},
  publisher = {Springer Nature Singapore},
  address = {Singapore},
  pages = {204--219},
  isbn = {978-981-95-4969-6}
}

Do They Understand Them? An Updated Evaluation on Nonbinary Pronoun Handling in Large Language Models
