Article
Published: 12 June 2024

A Multimodal Generative AI Copilot for Human Pathology

Ming Y. Lu^1,2,3,4^,
Bowen Chen^1,2^,
Drew F. K. Williamson^1,2,3^ (ORCID: orcid.org/0000-0003-1745-8846),
Richard J. Chen^1,2,3^ (ORCID: orcid.org/0000-0003-0389-1331),
Melissa Zhao^1,2^,
Aaron K. Chow^5^,
Kenji Ikemura^1,2^,
Ahrong Kim^1,10^ (ORCID: orcid.org/0000-0003-2317-8920),
Dimitra Pouli^1,2^ (ORCID: orcid.org/0000-0003-0890-9326),
Ankush Patel^6^,
Amr Soliman^5^,
Chengkuan Chen^1^,
Tong Ding^1,7^,
Judy J. Wang^1^,
Georg Gerber^1^ (ORCID: orcid.org/0000-0002-9149-5509),
Ivy Liang^1,7^,
Long Phi Le^2^,
Anil V. Parwani^5^,
Luca L. Weishaupt^1,8^ &
Faisal Mahmood^1,2,3,9^ (ORCID: orcid.org/0000-0001-7587-1562)

Nature (2024)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

The field of computational pathology[1,2] has witnessed remarkable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders[3,4]. However, despite the explosive growth of generative artificial intelligence (AI), there has been limited study on building general-purpose, multimodal AI assistants and copilots[5] tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We build PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question-answer turns. We compare PathChat against several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4[7]. PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases of diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive and general vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat can potentially find impactful applications in pathology education, research, and human-in-the-loop clinical decision-making.

Author information

These authors contributed equally: Ming Y. Lu, Bowen Chen, Drew F. K. Williamson

Authors and Affiliations

Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Melissa Zhao, Kenji Ikemura, Ahrong Kim, Dimitra Pouli, Chengkuan Chen, Tong Ding, Judy J. Wang, Georg Gerber, Ivy Liang, Luca L. Weishaupt & Faisal Mahmood
Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Melissa Zhao, Kenji Ikemura, Dimitra Pouli, Long Phi Le & Faisal Mahmood
Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Ming Y. Lu, Drew F. K. Williamson, Richard J. Chen & Faisal Mahmood
Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
Ming Y. Lu
Department of Pathology, Wexner Medical Center, Ohio State University, Columbus, OH, USA
Aaron K. Chow, Amr Soliman & Anil V. Parwani
Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
Ankush Patel
Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
Tong Ding & Ivy Liang
Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
Luca L. Weishaupt
Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
Faisal Mahmood
Department of Pathology, Pusan National University, Busan, South Korea
Ahrong Kim

Corresponding author

Correspondence to Faisal Mahmood.

Supplementary information

Supplementary Information

Supplementary Tables 1–64.

Reporting Summary

About this article

Cite this article

Lu, M.Y., Chen, B., Williamson, D.F.K. et al. A Multimodal Generative AI Copilot for Human Pathology. Nature (2024). https://doi.org/10.1038/s41586-024-07618-3

Received: 11 December 2023
Accepted: 28 May 2024
Published: 12 June 2024
DOI: https://doi.org/10.1038/s41586-024-07618-3
