New research uses AI to extract HIPAA lessons from breach reports

Written by Caitlin Anthoney | November 7, 2025

The study Ember: Learning Lessons from Breach Reports describes “an automated approach of extracting information from breach reports and suggesting actions based on the extracted knowledge.”

Healthcare organizations can use the information from these reports to prevent future incidents and maintain HIPAA compliance.

The value of breach reports

Healthcare organizations must protect sensitive patient information. Breach reports, while often seen as bureaucratic requirements, contain a wealth of knowledge that can guide prevention and remediation.

Under HIPAA, breaches affecting 500 or more individuals must be publicly reported. The authors of the EMBER study explain that breach reports “describe cases where deployed systems fail, or are maliciously or accidentally misused,” and often include “corrective steps that suggest actions to prevent, mitigate, and recover from future breaches.”

These reports can help organizations understand vulnerabilities, recurring threats, and the most effective responses.

Automated insights for HIPAA compliance

EMBER (Extracting Meaningful Breach Reports) is a tool that automates breach analysis and operates through three stages:

Identifying informative sentences: EMBER first “identifies informative sentences with a classifier” from a large pool of breach reports. This step filters out irrelevant information, focusing only on sentences that provide actionable knowledge.
Extracting descriptive phrases and useful actions: Using advanced natural language processing (NLP) techniques, EMBER “extracts descriptive phrases and useful actions from the informative sentences.” It allows healthcare entities to pinpoint specific vulnerabilities and recommended corrections without manually reading thousands of reports.
Suggesting actionable steps: Finally, EMBER “suggests actions based on descriptions of the breach that the responsible party wishes to prevent or remedy.” These suggestions can include staff training, technical safeguards, or administrative interventions based on the type of breach documented.
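The three stages above can be sketched as a toy pipeline. This is a minimal illustration and not EMBER's actual implementation: the keyword cues (INFORMATIVE_CUES), the regular expression (ACTION_PATTERN), the word-overlap matching in suggest_actions, and the sample report and cases are all assumptions made for this example. The study itself uses a trained classifier and NLP techniques.

```python
import re

# Hypothetical cue stems standing in for a trained sentence classifier.
INFORMATIVE_CUES = ("retrain", "sanction", "implement", "encrypt")

# Hypothetical pattern: an action verb followed by its object, stopping at
# "and" or punctuation so that coordinated actions are split apart.
ACTION_PATTERN = re.compile(
    r"\b(retrained|sanctioned|implemented|encrypted)\s+([^.,;]+?)(?=\s+and\b|[.,;]|$)",
    re.IGNORECASE,
)


def split_sentences(report: str) -> list[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def is_informative(sentence: str) -> bool:
    """Stage 1 (toy): keep sentences that contain a corrective-action cue."""
    lowered = sentence.lower()
    return any(cue in lowered for cue in INFORMATIVE_CUES)


def extract_actions(sentence: str) -> list[str]:
    """Stage 2 (toy): pull 'verb object' phrases out of an informative sentence."""
    return [f"{verb.lower()} {obj.strip()}" for verb, obj in ACTION_PATTERN.findall(sentence)]


def tokens(text: str) -> set[str]:
    """Lowercased alphabetic words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_actions(description: str, cases: list[tuple[str, list[str]]]) -> list[str]:
    """Stage 3 (toy): return the actions of the past case sharing the most words."""
    query = tokens(description)
    best = max(cases, key=lambda case: len(query & tokens(case[0])), default=None)
    return best[1] if best else []


# Made-up report text and past cases, for illustration only.
report = (
    "The covered entity reported that an employee emailed PHI to the wrong recipient. "
    "The office is located downtown. "
    "The covered entity retrained staff and implemented new email safeguards."
)
informative = [s for s in split_sentences(report) if is_informative(s)]
actions = [a for s in informative for a in extract_actions(s)]

cases = [
    ("email sent to wrong recipient", ["retrain staff", "implement email safeguards"]),
    ("stolen unencrypted laptop", ["encrypt devices"]),
]
suggested = suggest_actions("PHI emailed to the wrong recipient", cases)

print(informative)
print(actions)
print(suggested)
```

In this sketch only the third sentence survives the filter, two actions are extracted from it, and the new description is matched to the misdirected-email case. A real system would replace each toy stage with a learned model.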

Predictive compliance

The success of EMBER suggests that healthcare organizations can use artificial intelligence to analyze what went wrong and anticipate what could go wrong next. EMBER turns historical patterns into action suggestions that healthcare organizations can use to address risks before they escalate into violations.

This predictive potential is particularly relevant to the Office for Civil Rights (OCR), the federal agency responsible for enforcing HIPAA. The OCR routinely investigates breaches to determine whether healthcare entities failed to meet their obligations under the Privacy, Security, or Breach Notification Rules.

Each OCR settlement details what went wrong, such as improper access controls, delayed notifications, or insufficient employee training. Tools like EMBER automate this same process of pattern recognition at scale.

EMBER learns from thousands of previous breaches, helping organizations to map vulnerabilities and see where they align with known compliance failures. The researchers explain that this process can help entities “implement additional administrative and technical safeguards” and “retrain the staff” before the OCR needs to intervene.

How effective is automated analysis?

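The EMBER system was tested on 3,144 breach reports. It “achieves 78% recall in information extraction, outperforming average humans.” Recall measures the share of relevant information a system actually finds, so this figure speaks to how thoroughly EMBER captures the useful content of a report rather than to its overall accuracy.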
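To make the recall figure concrete, here is a minimal sketch of how recall is computed. The counts below are hypothetical and chosen only to reproduce a 78% value; they are not taken from the study.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of relevant items that were actually found: TP / (TP + FN)."""
    relevant = true_positives + false_negatives
    if relevant == 0:
        raise ValueError("recall is undefined when there are no relevant items")
    return true_positives / relevant


# Hypothetical counts: 100 relevant phrases in the reports, 78 of them extracted.
found = 78
missed = 22
print(f"recall = {recall(found, missed):.0%}")
```

Recall says nothing about how many extracted items were wrong; that is measured separately by precision.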

The study also found that commonly recommended actions included:

Retraining staff: Employees who work with protected health information (PHI) must understand HIPAA rules and organizational policies.
Implementing administrative and technical safeguards: Adding security controls, access restrictions, and monitoring systems.
Sanctioning responsible employees: Holding individuals accountable for negligence or misuse.

By systematically extracting and categorizing these actions, EMBER creates a resource that can guide healthcare organizations toward proactive compliance measures.

Why automation matters in healthcare

Manually analyzing breach reports can be time-consuming and prone to human error. Large healthcare systems often generate hundreds of breach reports annually, making it difficult to identify trends or recurring issues. EMBER addresses this challenge by automating information extraction, so healthcare organizations can:

Quickly identify the root causes of breaches.
Prioritize corrective actions based on severity and frequency.
Reduce human error in interpreting complex or lengthy reports.
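One simple way to prioritize by severity and frequency is to count how often each root cause appears and weight the count by a severity score. The sketch below assumes a hypothetical list of extracted causes and hypothetical severity weights (SEVERITY); neither comes from the study, and an organization would set its own scale.

```python
from collections import Counter

# Hypothetical root causes extracted from a batch of breach reports.
causes = [
    "phishing",
    "misdirected email",
    "phishing",
    "lost device",
    "phishing",
    "misdirected email",
]

# Hypothetical severity weights (1 = low, 3 = high).
SEVERITY = {"phishing": 3, "misdirected email": 2, "lost device": 3}


def prioritize(causes: list[str], severity: dict[str, int]) -> list[tuple[str, int]]:
    """Rank causes by frequency multiplied by severity, highest score first."""
    counts = Counter(causes)
    scored = {cause: count * severity.get(cause, 1) for cause, count in counts.items()}
    # Sort by descending score, then alphabetically to keep ties stable.
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))


ranking = prioritize(causes, SEVERITY)
for cause, score in ranking:
    print(f"{cause}: {score}")
```

Multiplying frequency by severity is only one possible scoring rule; a compliance team might prefer separate thresholds for each.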

The authors note that EMBER’s output “presents the extracted information in an easy-to-use action suggestion tool, which helps HIPAA-covered entities comply with regulations and protect health information.”

Moreover, automated analysis can help organizations that handle thousands of patient records work efficiently, stay compliant, and avoid regulatory penalties.


Applications for Paubox users

HIPAA compliant email platforms, like Paubox, secure electronic communications and protect sensitive patient information, including PHI. Healthcare organizations can use these platforms and combine them with tools like EMBER to strengthen their compliance strategy in the following ways:

Preventative training: Organizations can analyze breach reports and suggested actions, designing staff training programs focused on the most common sources of breaches, like phishing attacks, misrouted emails, or improper handling of electronic health records (EHRs).
Improved technical safeguards: EMBER identifies actions like implementing access controls or encryption protocols. Using Paubox, healthcare teams can automatically encrypt emails containing PHI, so the information is safeguarded during transmission and at rest.
Policy enforcement: The tool’s recommendations, such as sanctioning responsible employees or revising internal policies, can help organizations enforce accountability, improving their HIPAA compliance.

Implementing breach reports in daily practice

One of the challenges in healthcare is translating lessons from breach reports into daily operational improvements. EMBER addresses this through actionable guidance derived directly from incidents.

EMBER’s automated approach allows organizations to “suggest actions based on descriptions of the breach” and strengthen HIPAA compliance. As the study notes, breach reports often contain “useful actions…helpful toward HIPAA compliance of the covered entity (CE).”

For example, a breach involving unauthorized access to patient records might suggest retraining staff on role-based permissions, implementing stricter login controls, and auditing access logs regularly.

Paubox users can apply these lessons by reviewing access permissions for sensitive email communications, setting up multi-factor authentication (MFA), and monitoring for unusual email activity.

The impact on healthcare security

Automation tools like EMBER are catalysts for cultural change in healthcare security. They systematically analyze breaches and suggest preventive actions, encouraging organizations to move from reactive to proactive security measures.

The study states that automated systems “outperform average humans” in recall, allowing organizations to detect patterns and vulnerabilities that might otherwise be overlooked. This has direct implications for patient safety, data integrity, and legal compliance.

Ethical AI in HIPAA compliance

Even though EMBER operates on publicly available breach reports, it still raises ethical questions that extend to all AI applications in healthcare, including who controls the data, how it is used, and how its outputs influence human decision-making.

The authors of the EMBER study note that their system is meant to “assist, not replace, human judgment in interpreting regulatory data.” In practice, compliance officers, risk managers, and data privacy specialists must remain in the loop, reviewing automated recommendations, validating findings, and applying AI-driven insights in ways that uphold HIPAA requirements and institutional policies.

Ethical AI also demands transparency. Healthcare entities should understand how an algorithm arrives at its conclusions, what data it was trained on, and what limitations it might have. Black-box AI systems, i.e., those whose decision-making processes are opaque, can introduce new risks if they make compliance recommendations that can’t be easily explained or justified.

Additionally, the principle of “minimum necessary” should apply to both human access and AI access. Automated systems should process only the data required for their intended purpose and should log every instance of data handling for auditability.
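As a rough sketch of what this could look like in code, the example below filters a record down to an allow-list of fields before an automated tool sees it and writes an audit entry for each access. The field names, the allow-list (ALLOWED_FIELDS), and the in-memory audit_log are all hypothetical; a production system would use durable, tamper-resistant logging and policy-driven field rules.

```python
from datetime import datetime, timezone

# Hypothetical allow-list: the only fields an analysis tool needs.
ALLOWED_FIELDS = {"breach_type", "description", "corrective_action"}

# In-memory audit trail for illustration; real systems need durable storage.
audit_log: list[dict] = []


def minimum_necessary(record: dict, purpose: str) -> dict:
    """Return only allow-listed fields and log what was released and withheld."""
    released = {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
    audit_log.append(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "purpose": purpose,
            "fields_released": sorted(released),
            "fields_withheld": sorted(set(record) - ALLOWED_FIELDS),
        }
    )
    return released


# Made-up record for illustration only.
record = {
    "breach_type": "misdirected email",
    "description": "Lab results were sent to the wrong recipient.",
    "patient_name": "Jane Doe",
    "ssn": "000-00-0000",
}
filtered = minimum_necessary(record, purpose="breach pattern analysis")
print(filtered)
print(audit_log[-1]["fields_withheld"])
```

The audit entry records field names only, never values, so the log itself does not become a second copy of the sensitive data.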


Human oversight still matters

Even though EMBER “outperforms average humans” in extracting information from text, the authors acknowledge that AI is not a replacement for human judgment. Automated systems must still be monitored to check that they do not misinterpret the context of a breach report or overlook unique circumstances. Human compliance officers bring ethical reasoning and regulatory understanding that AI cannot replicate.

More specifically, the study notes that while the tool can detect common phrases like “sanction the responsible employee,” it is up to compliance teams to determine when disciplinary action is appropriate or when a systemic change is more effective.

Ultimately, the combination of machine precision and human discernment creates a balanced model of governance. AI identifies patterns and recommends actions, while human experts make the final decisions based on organizational culture and patient care priorities.

FAQs

How can AI improve HIPAA compliance?

AI analyzes vast amounts of breach data to uncover patterns that humans might miss. It can automatically detect risky behaviors, identify policy gaps, and flag potential vulnerabilities before they turn into reportable incidents. It continuously monitors breach trends, helping organizations take proactive steps, like tightening access controls or improving staff training, to prevent future violations.

What is the purpose of using AI on breach reports?

Analyzing breach reports with AI makes it possible to detect recurring causes of data loss, like phishing, misdirected messages, or unencrypted devices, at a scale no human team could match. The goal is to extract lessons from past incidents and apply them to future prevention efforts.

As the EMBER researchers note, automation allows compliance teams to “identify systemic weaknesses and improvement opportunities faster and more consistently than manual review alone.”

Can private companies use EMBER’s findings?

Yes. EMBER’s findings can guide private-sector efforts to improve data protection and compliance strategies. Health IT vendors, compliance officers, and risk managers can use its findings to assess their own breach prevention protocols, benchmark against national trends, and improve their HIPAA compliant communication systems. It translates public breach data so regulators and private organizations can improve their overall healthcare cybersecurity.
