📚 Publication
# Equal Contribution
Under Review

Trustworthy and Fair SkinGPT-R1 for Democratizing Dermatological Reasoning across Diverse Ethnicities
Yuhao Shen, Zhangtianyi Chen, Yuanhao He, Yan Xu, Shuping Zhang, Liyuan Sun, Zijian Wang, Yinghao Zhu, Yuyuan Yang, Jiahe Qian, Ziwen Wang, Xinyuan Zhang, Wenbin Liu, Zongyuan Ge, Tao Lu, Siyuan Yan, Juexiao Zhou
- Introduce SkinGPT-R1, a dermatology VLM that achieves interpretable and equitable diagnosis across diverse ethnicities by performing explicit, step-by-step, and verifiable diagnostic chain-of-thought reasoning.
Under Review

SkinCaRe: A Multimodal Dermatology Dataset Annotated with Medical Caption and Chain-of-Thought Reasoning
Yuhao Shen, Liyuan Sun, Yan Xu, Wenbin Liu, Shuping Zhang, Shawn Afvari, Zhongyi Han, Jiaoyan Song, Yongzhi Ji, Tao Lu, Xiaonan He, Xin Gao, Juexiao Zhou
- Release SkinCaRe, unifying SkinCAP (medical captions) and SkinCoT (clinician-verified chains-of-thought) for transparent dermatologic reasoning.
Under Review

Towards Trustworthy Dermatology MLLMs: A Benchmark and Multimodal Evaluator for Diagnostic Narratives
Yuhao Shen, Jiahe Qian, Shuping Zhang, Zhangtianyi Chen, Tao Lu, Juexiao Zhou
- Propose DermBench (six clinical dimensions) and DermEval (reference-free evaluator) for image–text dermatology reasoning aligned with physician scoring.
Under Review

SkinGPT-X: A Multimodal Collaborative Multi-agent System with Self-evolving Dermatological Memory for Transparent and Trustworthy Diagnosis
Zhangtianyi Chen#, Yuhao Shen#, Florensia Widjaja#, Yan Xu, Liyuan Sun, Zijian Wang, Hongyi Chen, Wufei Dai, Juexiao Zhou
- Introduce SkinGPT-X, a multimodal collaborative multi-agent dermatology diagnosis system with self-evolving memory, achieving strong performance on public, large-scale multi-class, and rare-disease benchmarks.
Under Review

CoTBox-TTT: Grounding Medical VQA with Visual Chain-of-Thought Boxes During Test-time Training
Jiahe Qian#, Yuhao Shen#, Zhangtianyi Chen, Juexiao Zhou, Peisong Wang
- Evidence-first test-time training with all backbones frozen; update a small set of continuous soft prompts guided by visual chain-of-thought boxes.