📚 Publication

# = Equal contribution

Under Review
SkinGPT-R1
Trustworthy and Fair SkinGPT-R1 for Democratizing Dermatological Reasoning across Diverse Ethnicities

Yuhao Shen, Zhangtianyi Chen, Yuanhao He, Yan Xu, Shuping Zhang, Liyuan Sun, Zijian Wang, Yinghao Zhu, Yuyuan Yang, Jiahe Qian, Ziwen Wang, Xinyuan Zhang, Wenbin Liu, Zongyuan Ge, Tao Lu, Siyuan Yan, Juexiao Zhou

  • Introduce SkinGPT-R1, a dermatology VLM that achieves interpretable and equitable diagnosis across diverse ethnicities by performing explicit, step-by-step, and verifiable diagnostic chain-of-thought reasoning.
Under Review
SkinCaRe Dataset
SkinCaRe: A Multimodal Dermatology Dataset Annotated with Medical Caption and Chain-of-Thought Reasoning

Yuhao Shen, Liyuan Sun, Yan Xu, Wenbin Liu, Shuping Zhang, Shawn Afvari, Zhongyi Han, Jiaoyan Song, Yongzhi Ji, Tao Lu, Xiaonan He, Xin Gao, Juexiao Zhou

  • Release SkinCaRe, unifying SkinCAP (medical captions) and SkinCoT (clinician-verified chains-of-thought) for transparent dermatologic reasoning.
Under Review
DermBench & DermEval
Towards Trustworthy Dermatology MLLMs: A Benchmark and Multimodal Evaluator for Diagnostic Narratives

Yuhao Shen, Jiahe Qian, Shuping Zhang, Zhangtianyi Chen, Tao Lu, Juexiao Zhou

  • Propose DermBench, a benchmark covering six clinical dimensions, and DermEval, a reference-free multimodal evaluator, for assessing image–text dermatology reasoning in alignment with physician scoring.
Under Review
SkinGPT-X
SkinGPT-X: A Multimodal Collaborative Multi-agent System with Self-evolving Dermatological Memory for Transparent and Trustworthy Diagnosis

Zhangtianyi Chen#, Yuhao Shen#, Florensia Widjaja#, Yan Xu, Liyuan Sun, Zijian Wang, Hongyi Chen, Wufei Dai, Juexiao Zhou

  • Introduce SkinGPT-X, a multimodal collaborative multi-agent dermatology diagnosis system with self-evolving memory, achieving strong performance on public, large-scale multi-class, and rare-disease benchmarks.
Under Review
CoTBox-TTT
CoTBox-TTT: Grounding Medical VQA with Visual Chain-of-Thought Boxes During Test-time Training

Jiahe Qian#, Yuhao Shen#, Zhangtianyi Chen, Juexiao Zhou, Peisong Wang

  • Propose CoTBox-TTT, an evidence-first test-time training approach for medical VQA that keeps all backbones frozen and updates only a small set of continuous soft prompts, guided by visual chain-of-thought boxes.