Safeguarding PHI in ChatGPT

Dean Levitt, March 27, 2023

A Paubox customer asked about using PHI in ChatGPT. Healthcare professionals and even patients have already been using OpenAI’s ChatGPT to research and supplement their knowledge. To answer questions about PHI in ChatGPT, we’ll need to look into how ChatGPT uses data to generate responses. We’ll also examine how ChatGPT manages inputted protected health information (PHI).

Training ChatGPT: Conversations & Usage Information

ChatGPT is an AI language model developed by OpenAI, trained on vast amounts of text data from various sources, including books, websites, and articles. In addition to this diverse dataset, ChatGPT utilizes inputted conversations and other usage information to improve its ability to generate context-relevant and accurate responses, particularly in specialized fields like healthcare.

The summary:

ChatGPT learns from a diverse dataset
Conversations & other usage data enhance training
Ensures context-relevant & accurate responses

Does ChatGPT use user conversations to train the language model?

According to ChatGPT, yes: “Your conversations may be reviewed by our AI trainers to improve our systems.” OpenAI’s privacy policy also says the company may use personal information to research and develop new programs and services.

They also state, “we will maintain and use de-identified information in anonymous or de-identified form and we will not attempt to reidentify the information.”

However, it’s best to avoid inputting PHI, since doing so may violate HIPAA regulations.

Protecting PHI: Privacy Concerns & Risks

If you mistakenly input PHI into ChatGPT, it is unlikely to surface in a response to another user, though an error could make that possible.

As a precaution, healthcare professionals and users interacting with ChatGPT in a healthcare context are encouraged to avoid sharing sensitive information to minimize any potential privacy risks.
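One way to act on that precaution is to screen text before it ever reaches a prompt. The sketch below is illustrative only: the `PATTERNS` table and the `scrub` and `contains_flagged` helpers are hypothetical names, and the three regular expressions catch only a few obvious identifier formats. It is not HIPAA de-identification and will miss names, addresses, dates, record numbers, and much else.

```python
import re

# Hypothetical patterns for illustration only. Real PHI takes many more forms
# (names, addresses, dates, record numbers) that simple regexes will miss.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scrub(text: str) -> str:
    """Replace every pattern match with a bracketed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def contains_flagged(text: str) -> bool:
    """Return True if any pattern matches, so the caller can stop and review."""
    return any(pattern.search(text) for pattern in PATTERNS.values())


if __name__ == "__main__":
    draft = "Call 555-123-4567 or mail jane@example.com, SSN 123-45-6789"
    print(scrub(draft))
```

A check like `contains_flagged` is better treated as a tripwire than a guarantee: a match is a reason to stop and review the text, while no match says little about whether the text is safe to share.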

It actually depends on the account

Yaniv Markovski, Head of AI Specialist at OpenAI, said, “OpenAI does not use data submitted by customers via our API to train OpenAI models or improve OpenAI’s service offering… When you use our non-API consumer services ChatGPT or DALL-E, we may use the data you provide us to improve our models.”

The summary:

Low risk of PHI appearing elsewhere
Users are encouraged to avoid sharing sensitive data
Different accounts have different policies, so always check with OpenAI first

How your inputted data is used to train ChatGPT

User data, including conversations and usage information, is essential in refining ChatGPT’s performance. By learning from interactions, the model can better understand context and user needs and generate more accurate and relevant responses. To protect user privacy, this data is anonymized before it is used in the model’s training process.

The summary:

Data anonymization for privacy protection
Enhances understanding of context & user needs

Opting Out: How to exclude your data from model training

Those interacting with ChatGPT through the API can ask to have their data excluded from model training by contacting the support team, although under the March 1, 2023 policies listed below, API data is not used for training unless the customer opts in. Restrictions may apply depending on the user type and access level.

The summary:

Opt-out available for API users
Contact support for opting out
Restrictions may apply to some user types

According to OpenAI's data policies as of March 1, 2023:

OpenAI will not use data submitted by customers via API to train or improve our models, unless explicitly opted in to share data.
Any data sent through the API will be retained for abuse and misuse monitoring purposes for a maximum of 30 days, after which it will be deleted (unless otherwise required by law).
Non-API data is used in AI training by default, but users can opt out of having their data used in training by submitting this form.
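To make the difference between the two channels easier to see, here is a small sketch that restates the three points above as data. Everything in it is a hypothetical illustration: `DataPolicy`, `POLICIES`, and `latest_api_deletion_date` are made-up names, not part of any OpenAI API, and the values only mirror the March 1, 2023 policies as summarized here, which may have changed since.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Maximum retention for API data as stated in the policy summary above.
API_RETENTION_DAYS = 30


@dataclass(frozen=True)
class DataPolicy:
    used_for_training_by_default: bool
    how_to_change: str


# Hypothetical lookup table restating the March 1, 2023 points; not an OpenAI API.
POLICIES = {
    "api": DataPolicy(False, "explicitly opt in to share data"),
    "non_api": DataPolicy(True, "submit the opt-out form"),
}


def latest_api_deletion_date(submitted: date) -> date:
    """Latest date API data would be kept, ignoring any legal retention requirement."""
    return submitted + timedelta(days=API_RETENTION_DAYS)


if __name__ == "__main__":
    for channel, policy in POLICIES.items():
        print(channel, policy.used_for_training_by_default, policy.how_to_change)
    print(latest_api_deletion_date(date(2023, 3, 1)))
```

The date helper deliberately ignores the "unless otherwise required by law" exception, so treat its result as an upper bound under normal circumstances rather than a promise.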

Data Deletion: Erasing Your Information

To have your data deleted, you can submit a request to the support team. This process typically applies to API users. Non-API users may have different options or requirements. Always check with OpenAI support before you input patient information and other PHI or personally identifiable information (PII).

The summary:

Request data deletion by contacting support
Applies to API users
Non-API users may have different options

Is it possible to delete specific prompts?

Whether or not it is possible to delete specific prompts depends on the user type and access level. API users can request the deletion of particular prompts by contacting the support team. Non-API users may have different options or requirements and should contact OpenAI support for further information.

The summary:

Feasibility depends on user type and access
API users can request prompt deletion
Non-API users should contact OpenAI support

Better safe than sorry: Don’t enter PHI in ChatGPT without a BAA

Although all data entered into ChatGPT is encrypted in transit and at rest, a search of OpenAI’s documentation and the web does not indicate that OpenAI will sign a business associate agreement (BAA). OpenAI states that it removes personal data, but entering PHI into ChatGPT may still be a HIPAA violation.