Why human digital twins bring a new realm to data privacy

An editorial in Frontiers in Bioengineering and Biotechnology titled ‘Human digital twins for medical and product engineering’ states, “We refer to a human digital twin as a virtual representation or digital replica of an individual, created using data from various sources, including sensors, medical records, and other digital inputs. This digital twin mirrors certain physical and behavioural characteristics of the person, enabling simulations, predictions, and analyses.”

Human digital twins (HDTs) introduce unprecedented data privacy challenges due to their reliance on dynamic, multi-source health data integration. These virtual replicas aggregate sensitive information, including genomic data, real-time biometrics from wearables, medical imaging, and electronic health records (EHRs). 

The bidirectional data flow between physical and virtual twins necessitates continuous updates, amplifying exposure points for breaches. Ethical concerns arise around informed consent, as traditional static consent models fail to address the evolving nature of HDTs, which may autonomously update using new data streams without explicit patient approval. 

Legal frameworks struggle to classify HDTs, leaving them in a regulatory grey area. They are often recognized as medical device software only when applied to individuals, which leaves broader public health applications underregulated.

 

What are human digital twins?

Human digital twins are virtual replicas of individuals that simulate biological processes and health trajectories using integrated multi-modal data. Defined by the 5Is framework (individualized, interconnected, interactive, informative, and impactful), they combine clinical records, genetic profiles, environmental factors, and real-time sensor data from wearable devices.

The Journal of Personalized Medicine study ‘Digital Twins’ Advancements and Applications in Healthcare, Towards Precision Medicine’ notes, “These virtual models facilitate a deeper understanding of disease progression and enhance the customization and optimization of treatment plans by modeling complex interactions between genetic factors and environmental influences. By establishing dynamic, bidirectional connections between the DTs of physical objects and their digital counterparts, these technologies enable real-time data exchange, thereby transforming electronic health records.”

Unlike conventional medical models, HDTs maintain bidirectional connections with their physical counterparts, enabling real-time adjustments through machine learning (ML) and artificial intelligence (AI). For example, a cardiac digital twin might integrate MRI scans, electrophysiology data, and blood biomarkers to predict arrhythmia risks. 

Their architecture comprises three components: 

  • The physical entity (patient)
  • The virtual model
  • Data pipelines enabling continuous feedback loops
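As a rough illustration of how these three components relate, the sketch below models a patient reading, a virtual model, and a pipeline that feeds readings to the model and collects its feedback. Every name here (`Reading`, `VirtualModel`, `run_pipeline`) and the heart-rate threshold are assumptions made for the example, not part of any real HDT platform or clinical guidance.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reading:
    """One observation from the physical entity (the patient)."""
    source: str   # hypothetical source label, e.g. "wearable"
    metric: str   # hypothetical metric name, e.g. "heart_rate"
    value: float


@dataclass
class VirtualModel:
    """The virtual model; here it only keeps the latest value per metric."""
    state: Dict[str, float] = field(default_factory=dict)

    def update(self, reading: Reading) -> None:
        self.state[reading.metric] = reading.value

    def recommend(self) -> List[str]:
        # Illustrative rule only: the threshold is arbitrary, not clinical advice.
        if self.state.get("heart_rate", 0.0) > 120.0:
            return ["flag for clinician review"]
        return []


def run_pipeline(readings: List[Reading], model: VirtualModel) -> List[str]:
    """The data pipeline: push each reading to the model, collect feedback."""
    feedback: List[str] = []
    for reading in readings:
        model.update(reading)
        feedback.extend(model.recommend())
    return feedback


twin = VirtualModel()
alerts = run_pipeline(
    [
        Reading("wearable", "heart_rate", 72.0),
        Reading("wearable", "heart_rate", 131.0),
    ],
    twin,
)
print(alerts)
```

Each pass through the loop is one turn of the feedback cycle: data flows from the patient to the model, and a recommendation flows back. It is also a point where sensitive data is in motion, which is why continuous updates widen the exposure surface.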

 

Current and emerging uses

Current applications include personalized oncology, where twins model tumor progression and simulate chemotherapy responses, reducing trial-and-error prescribing. The Mayo Clinic employs knee health digital twins to predict osteoarthritis progression using biomechanical data. 

Emerging uses focus on public health: during the COVID-19 pandemic, researchers proposed epidemic twins to simulate mask mandate impacts. Siemens Healthineers is developing perinatal twins that combine fetal MRI and maternal biomarkers to predict birth complications. 

A 2024 trial at Johns Hopkins used liver twins to optimize transplant eligibility assessments, reducing waitlist mortality by 18%. Future applications aim to integrate data for preventive care; for instance, diabetes twins could predict hypoglycemic events by analyzing continuous glucose monitoring and gut microbiome data.

 

The data involved 

  1. Clinical: EHRs, imaging (CT/MRI), and pathology reports
  2. Biometric: Real-time vitals from wearables (e.g., glucose levels, ECG)
  3. Genomic: Whole genome sequencing, proteomic, and metabolomic profiles 
  4. Environmental: Geolocation, air quality, and lifestyle data from smartphones 
  5. Social determinants: Income, education, and community health indices

 

Data privacy challenges unique to human digital twins

According to a letter to the editor in the Chinese Medical Journal on the challenges presented by HDTs, “because the digital twin is constantly adjusted on the basis of data in time and space; a person's consent should consider its specific form and time point, so that a person can accurately control the digital twin representing them. Digital twin technology requires continuous improvement to meet the challenges it faces.”

HDTs introduce privacy risks rooted in their data-intensive and dynamic nature. Consent is the first of these: HDTs continuously assimilate new data streams (e.g., wearable biometrics, genomic updates) without explicit patient reauthorization, which traditional one-time consent frameworks were not designed to handle. Static consent forms fail to address scenarios where twins autonomously integrate third-party data, such as environmental sensors, raising ethical concerns about ongoing patient agency.
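One way to picture consent that tracks "specific form and time point" is a ledger that records a separate, time-limited grant per data category and is checked before each ingestion. The sketch below is a minimal illustration under that assumption; the `ConsentLedger` class, the category names, and the 90-day window are all hypothetical and do not reflect any legal standard or existing product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class ConsentGrant:
    """A patient's consent for one data category, valid for a limited time."""
    granted_at: datetime
    valid_for: timedelta
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        expires_at = self.granted_at + self.valid_for
        return not self.revoked and self.granted_at <= now < expires_at


class ConsentLedger:
    """Hypothetical per-category consent record for one patient's twin."""

    def __init__(self) -> None:
        self._grants: Dict[str, ConsentGrant] = {}

    def grant(self, category: str, now: datetime, valid_for: timedelta) -> None:
        self._grants[category] = ConsentGrant(now, valid_for)

    def revoke(self, category: str) -> None:
        if category in self._grants:
            self._grants[category].revoked = True

    def may_ingest(self, category: str, now: datetime) -> bool:
        # No grant on file means no ingestion: new streams are opt-in.
        grant = self._grants.get(category)
        return grant is not None and grant.is_active(now)


start = datetime(2025, 1, 1)
ledger = ConsentLedger()
ledger.grant("wearable_biometrics", start, timedelta(days=90))

within_window = ledger.may_ingest("wearable_biometrics", start + timedelta(days=30))
new_stream = ledger.may_ingest("environmental_sensors", start + timedelta(days=30))
after_expiry = ledger.may_ingest("wearable_biometrics", start + timedelta(days=120))
print(within_window, new_stream, after_expiry)
```

The design choice worth noting is the default: a data stream the patient never approved is refused rather than silently absorbed, and an old approval lapses instead of lasting indefinitely.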

Re-identification risks persist even with anonymization, as synthetic datasets generated by HDTs often retain unique biometric patterns (e.g., cardiac rhythms, gait metrics) that could be linked to individuals by cross-referencing public databases. Algorithmic bias compounds the problem: training data that underrepresents minority populations, as many genomic repositories do, skews predictive outputs and can lead to discriminatory treatment recommendations.
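A common way to reason about this kind of risk is k-anonymity, a technique brought in here for illustration rather than taken from the sources quoted in this article. It asks how many records share the same combination of indirectly identifying attributes. The sketch below counts those groups for a toy dataset; the field names and values are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Sequence

Record = Dict[str, str]


def group_sizes(records: List[Record], quasi_identifiers: Sequence[str]) -> Counter:
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records: List[Record], quasi_identifiers: Sequence[str]) -> int:
    """Size of the smallest group; 1 means at least one record is unique."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def singled_out(records: List[Record], quasi_identifiers: Sequence[str]) -> List[Record]:
    """Records whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    ]


# Hypothetical "de-identified" rows: no names, but banded biometrics remain.
rows = [
    {"zip3": "941", "resting_hr_band": "60-69", "gait_band": "normal"},
    {"zip3": "941", "resting_hr_band": "60-69", "gait_band": "normal"},
    {"zip3": "941", "resting_hr_band": "50-59", "gait_band": "slow"},
]
fields = ["zip3", "resting_hr_band", "gait_band"]

k = k_anonymity(rows, fields)
unique_rows = singled_out(rows, fields)
print(k, unique_rows)
```

A result of k = 1 means at least one person is the only match for their attribute combination, so anyone holding an outside dataset with the same attributes could plausibly link that row back to them.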

 

Do human digital twins qualify as PHI?

HDT data qualifies as protected health information (PHI) under HIPAA when it is created or maintained by a covered entity or its business associate. HDTs integrate identifiers like biometric signatures (e.g., retinal scans, ECG patterns) and geolocation data, which can directly link to individuals even without traditional demographics. As covered entities (e.g., hospitals) create and manage HDTs for treatment purposes, the data falls under HIPAA’s “designated record set” criteria, requiring safeguards against unauthorized access.

Notably, synthetic health data generated by twins retains PHI status if reverse-engineering could reveal individual identities. Research from the Delft University of Technology illustrates the scale of the data behind such models, noting, “Publicly funded initiatives like Genomics England (The 100.000 Genomes Project, 2017) or the US precision medicine... gather genomic information on large numbers of individuals. These initiatives ultimately aim at the development of digital models of certain aspects of patients, allowing for more targeted health care interventions.”

 

How human digital twins create gaps with other laws

U.S. regulations struggle to address HDTs’ adaptive architectures. According to an Applied Clinical Trials article titled ‘A New Regulatory Road in Clinical Trials: Digital Twins,’ “The FDA has indicated willingness to investigate the use of digital twins…A recent collaboration between the US National Science Foundation, the NIH, and the FDA has sought to explore how digital twins can serve as a ‘catalyzer’ of biomedical innovation…The EMA… published a five-year ‘AI Action Plan,’ which includes a commitment to conduct several technical deep dives into specific tools and techniques, including digital twin technology.”

The FDA’s 510(k) clearance process, designed for static medical devices, fails to evaluate machine learning models that evolve post-deployment, leaving continuously learning twins in regulatory limbo. HIPAA’s minimum necessary standard conflicts with HDTs’ need for exhaustive data ingestion, creating compliance dilemmas when twins require non-essential data (e.g., social media activity) to optimize predictions. 
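To make the tension with the minimum necessary standard concrete, the sketch below filters a record down to the fields approved for a stated purpose before it reaches a twin. The purposes, field names, and allowlist are hypothetical and are not an interpretation of what HIPAA permits; they only show the shape of a purpose-based filter.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical allowlist: which fields each purpose is approved to use.
ALLOWED_FIELDS: Dict[str, FrozenSet[str]] = {
    "arrhythmia_risk": frozenset({"ecg", "mri_summary", "blood_biomarkers"}),
    "billing": frozenset({"patient_id", "procedure_codes"}),
}


def minimum_necessary(record: Dict[str, Any], purpose: str) -> Dict[str, Any]:
    """Keep only the fields approved for this purpose; unknown purposes get nothing."""
    allowed = ALLOWED_FIELDS.get(purpose, frozenset())
    return {name: value for name, value in record.items() if name in allowed}


patient_record = {
    "patient_id": "example-001",
    "ecg": [0.1, 0.4, 0.2],
    "blood_biomarkers": {"troponin": 0.01},
    "social_media_activity": ["post-1", "post-2"],
}

twin_input = minimum_necessary(patient_record, "arrhythmia_risk")
print(sorted(twin_input))
```

In this sketch the social media field never reaches the twin. A model that performs better with that field included is exactly the compliance dilemma described above: the filter and the model's appetite for data pull in opposite directions.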

While the 21st Century Cures Act promotes health data interoperability, it excludes AI-driven tools like HDTs from its mandates, hindering standardized data sharing across platforms. State laws like the California Consumer Privacy Act (CCPA) lack provisions for synthetic health data, complicating compliance when twins generate predictive biomarkers that could be classified as personal information.

 

FAQs

What types of information are considered PHI?

PHI encompasses a broad range of information, such as medical records, test results, diagnoses, treatment details, billing information, and any identifiers linked to these data.

 

How do HDTs affect health insurance practices?

Insurers like UnitedHealthcare are piloting twins to simulate chronic disease trajectories, raising concerns about premium adjustments based on predictive risk scores.

 

How do HDTs impact medical liability?

Malpractice risks emerge if twin-based predictions lead to adverse outcomes. For example, a 2024 lawsuit alleged that a liver transplant twin’s flawed eligibility assessment caused patient harm. US courts have yet to establish clear liability frameworks for AI-driven clinical decisions.
