OpenAI’s GPT-3 Reported as Unviable in Medical Tasks by Healthcare Firm

October 27, 2020

French digital health firm Nabla evaluated GPT-3's performance in medical documentation, diagnosis support, and treatment recommendations. It found the model's output inconsistent and lacking in scientific and medical expertise, and deemed it unsuitable for healthcare applications. The incident is a reminder of the importance of trustworthy AI governance and of safe and secure AI practices, and it aligns with the 'Govern' function of HISPI Project Cerebellum's TAIM, underscoring the need for careful consideration when employing AI in sensitive areas such as healthcare. Those interested in shaping the future of responsible AI and contributing to Project Cerebellum's incident-mapping efforts are invited to join us.

Alleged deployer

none

Alleged developer

openai, nabla

Alleged harmed parties

nabla-customers

AI governance case studies

For forensic AI governance failure analysis (TAIMScore™ case studies), browse Human Signal’s Failure Files™.

Data source

Incident data is from the AI Incident Database (AIID).

When citing the database as a whole, please use:

McGregor, S. (2021) Preventing Repeated Real World AI Failures by Cataloging Incidents: The AI Incident Database. In Proceedings of the Thirty-Third Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-21). Virtual Conference.

We use weekly snapshots of the AIID for stable reference. For the official suggested citation of a specific incident, use the “Cite this incident” link on each incident page.
