2605.09272
2026-05-12
cs.AI
cs.CL
cs.CV
Towards Conversational Medical AI with Eyes, Ears and a Voice
Meet Shah, Jason Gusdorf, Anil Palepu, Chunjong Park, Jack W. O'Sullivan, Vishnu Ravi, Tim Strother, Pavel Dubov, Aliya Rysbek, Toshiyuki Fukuzawa, Yana Lunts, Jan Freyberg, Michael B. Chang, Aniruddh Raghu, David Stutz, Devora Berlowitz, Eliseo Papa, Taylan Cemgil, JD Velasquez, Jack Chen, Arthur Chen, Doug Fritz, Charlie Taylor, Katya Tregubova, Jing Rong Lim, Richard Green, Sara Mahdavi, Mahvish Nagda, Jihyeon Lee, Craig Schiff, Liviu Panait, Sukhdeep Singh, Valentin Liévin, David G. T. Barrett, Hannah Gladman, Anna Cupani, Francesca Pietra, Uchechi Okereke, Katherine Tong, Clemens Meyer, Erwan Rolland, Mili Sanwalka, Michael D. Howell, Shixiang Shane Gu, Bibo Xu, Euan A. Ashley, S. M. Ali Eslami, Gregory Wayne, Pushmeet Kohli, Vivek Natarajan, Adam Rodman, Alan Karthikesalingam, Ryutaro Tanno
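AI Summary
This study presents AI co-clinician, a new conversational medical AI system that processes audiovisual data from doctor-patient conversations in real time to support clinical decision-making. Built on Gemini's low-latency audio and video processing capabilities, the system uses a dual-agent architecture that balances deep clinical reasoning with the low-latency responses that natural conversation requires. Experiments show that AI co-clinician approaches primary care physicians on several key evaluation dimensions and significantly outperforms GPT-Realtime on general evaluation criteria. It still falls short in physical examination and disease-specific reasoning, which underscores the importance of audiovisual information in medical consultations.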