What is Open-Source Intelligence (OSINT) in healthcare?

Open-Source Intelligence refers to the process of collecting and analyzing information from publicly available sources to produce intelligence. As defined in research on intelligence and global health, OSINT consists of information "produced from publicly available information" and these tools can "automatically collect and collate data, thereby referencing much larger quantities of information."

According to research by Ghioni, Taddeo, and Floridi in "Open source intelligence and AI: a systematic review of the GELSI literature," OSINT now comprises between seventy and ninety percent of all intelligence material used by law enforcement agencies and intelligence services in Western countries.

The U.S. Department of Health and Human Services Health Sector Cybersecurity Coordination Center (HC3) further clarifies that, "Open-source intelligence is the data and information that is available to the public. It's not just the information that we can access online from a search engine." The HC3 notes that information "can come from many sources," including "newspapers, court filings, and of course the internet." Importantly, "Information available to the public isn't necessarily just free information either it can consist of proprietary or subscription-based information."

OSINT isn’t new; it "has been around for a long time and was even used as a collection method in World War II," as documented by HC3. 

Ghioni, Taddeo, and Floridi identify three distinct generations of OSINT evolution. The first generation focused on physical document retrieval and translation, exemplified by agencies monitoring and translating foreign broadcasts during wartime. The second generation, emerging around 2005 with the creation of the Open Source Center, introduced digital OSINT capabilities including linguistic and text-based tools, geospatial analysis, network mapping, and visual forensics. Now, a third generation is emerging, characterized by AI-automated collection and analysis that requires minimal human supervision.

Unlike hacking or unauthorized access, OSINT relies on legitimate, openly accessible data. This includes information from websites, social media platforms, public records, job postings, professional networking sites, news articles, academic publications, conference presentations, and even metadata from digital files.

 

OSINT collection methods

The HC3 document outlines three approaches to OSINT collection, each with different levels of visibility and risk:

  • Passive collection: This is "the most common type of OSINT and the intent behind it is to only target publicly available information." Passive collection involves gathering data without directly interacting with target systems, making it the least detectable method.
  • Semi-passive collection: This approach is "more technical in nature. Through this form of collection an investigator will be sending traffic to a server to gather information." This method involves some interaction with target systems but stops short of direct engagement.
  • Active collection: In this method, investigators are "directly engaging with a system or even a person." This can include scanning for vulnerabilities or looking for open ports, activities that are "normally considered malicious" and increase the likelihood of detection.

What distinguishes OSINT is its legality and accessibility. Anyone with internet access and the right skills can gather this intelligence without breaking into systems or bypassing security measures. This democratization of intelligence gathering means that threat actors, from cybercriminals to nation-state operatives, can build profiles of healthcare organizations without triggering a security alert. As HC3 warns, "it's also important to be aware that threat actors leverage OSINT for their own benefit as well."

 

OSINT in the healthcare context

Healthcare organizations are vulnerable to OSINT reconnaissance for several reasons. First, the industry's regulatory requirements often mandate public transparency about certain operations, licenses, and affiliations. Second, healthcare professionals frequently share information about their work, research, and achievements on professional platforms. Third, the system of vendors, partners, and contractors creates an expanded attack surface with numerous publicly visible connections.

For example, job postings can reveal the specific technologies, software platforms, and security tools an organization uses. LinkedIn profiles of IT staff can expose network architecture details, recent migrations, and technology implementations. Social media posts from employees at conferences can reveal unpatched systems or upcoming changes. Public financial records may show budget constraints that indicate deferred security investments. Healthcare provider directories and facility information can provide physical security details.

 

Common OSINT sources targeting healthcare

Social media intelligence (SOCMINT)

The HC3 identifies Social Media Intelligence (SOCMINT) as "a branch of open-source intelligence, but the information being collected is obtained from social media websites." According to HC3, data obtained from SOCMINT is "usually broken down into two separate categories: Content posted by the person: profile pictures, images, and other multimedia files [and] Secondary metadata such as birthdates, geolocation, friends, workplace, and political views."

Employees sharing workplace photos may inadvertently reveal security badge designs, access control systems, or visitor protocols. Posts about work accomplishments can disclose system vulnerabilities, implementation timelines, or security gaps during transitions. 

 

Professional networking sites

Professional networking sites like LinkedIn provide adversaries with organizational charts, technology stacks, vendor relationships, and even insights into employee dissatisfaction that could make someone vulnerable to social engineering.

 

Job postings

Job postings often detail specific software versions, security tools, network configurations, and even information about legacy systems that may contain known vulnerabilities.
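
A security team can check its own postings for this kind of disclosure before they go live. The following is a minimal sketch in Python, not a complete audit tool. The watch list and the function name `find_disclosures` are hypothetical, and a real review would draw its terms from the organization's own asset inventory.

```python
import re

# Hypothetical watch list for illustration only; replace with terms from
# the organization's own technology inventory.
WATCHED_TERMS = ["Epic", "Citrix", "Windows Server"]


def find_disclosures(posting_text, terms=WATCHED_TERMS):
    """Return (term, version) pairs found in a job posting.

    version is None when the term appears without a version-like token.
    """
    findings = []
    for term in terms:
        # Match the term as a whole word, optionally followed by a version
        # such as "2012", "7.15", or "2012 R2".
        pattern = re.compile(
            r"\b" + re.escape(term) + r"\b(?:\s+(v?\d+(?:\.\d+)*(?:\s?R\d)?))?",
            re.IGNORECASE,
        )
        for match in pattern.finditer(posting_text):
            findings.append((term, match.group(1)))
    return findings


if __name__ == "__main__":
    sample = "Seeking admin with Windows Server 2012 R2 and Citrix experience."
    for term, version in find_disclosures(sample):
        print(term, version or "(no version stated)")
```

Any hit that carries a version number is a candidate for rewording, since a phrase like "experience with server operating systems" tells an applicant enough without telling an adversary which release is in production.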

 

Public records and regulatory filings

Public records and regulatory filings are another source, including:

  • HIPAA breach reports detailing security incidents and weaknesses
  • State licensing databases revealing facility locations and operational details
  • Medicare provider data exposing financial information and operational capacity
  • Court records and legal filings disclosing security failures, vendor disputes, and internal challenges

 

Digital presence and technical infrastructure

The organization's own digital presence also contributes to OSINT exposure:

  • Website source code may contain comments revealing internal systems or developer notes
  • DNS records expose network infrastructure
  • SSL certificates show subdomains and internal naming conventions
  • Metadata in published documents can reveal user accounts, software versions, and internal file structures
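
The first item in the list above is straightforward to audit. The sketch below uses Python's standard `html.parser` module to pull every HTML comment out of a page so a reviewer can read them before publication. The class and function names are illustrative, and the sketch assumes the page source has already been saved or fetched as a string.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment encountered while parsing."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called by HTMLParser for each <!-- ... --> block.
        text = data.strip()
        if text:
            self.comments.append(text)


def extract_comments(html_text):
    """Return a list of comment strings found in html_text."""
    parser = CommentCollector()
    parser.feed(html_text)
    parser.close()
    return parser.comments


if __name__ == "__main__":
    page = "<html><!-- TODO: move off intranet-db01 --><body><p>Hi</p></body></html>"
    for comment in extract_comments(page):
        print(comment)
```

Comments that mention hostnames, ticket numbers, usernames, or staging environments are worth stripping in the build step rather than relying on manual review.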

 

Additional OSINT sources

According to HC3, other valuable OSINT sources include:

  • Newspapers and magazine articles
  • Academic papers and other research
  • Books
  • Public trading data
  • Public surveys
  • Location data
  • Breached data information
  • Public indicators (IPs, domains, or hashes)
  • Certificate/Domain registration data
  • Application/system vulnerability data
  • Arrest records

 

How threat actors use healthcare OSINT

Cybercriminals use OSINT as the reconnaissance phase of targeted attacks. By mapping an organization's digital footprint, they identify the most vulnerable entry points for ransomware attacks. They create phishing campaigns using information about organizational structure, vendor relationships, and current projects. They discover unpatched systems and vulnerable applications through version information inadvertently disclosed online.

Nation-state actors employ OSINT to understand healthcare supply chains, research capabilities, and intellectual property assets. During the COVID-19 pandemic, reports emerged of foreign intelligence services targeting vaccine research through campaigns that began with OSINT reconnaissance.

 

The limitations of OSINT

According to research published in "Intelligence and global health: assessing the role of open source and social media intelligence analysis in infectious disease outbreaks," OSINT faces several challenges that both defenders and attackers must contend with.

 

The verification problem

Researchers note, "OSINT used on its own is therefore not sufficient; it must be corroborated from other sources." Ghioni, Taddeo, and Floridi reinforce this point, emphasizing that the verification problem remains one of the most significant challenges in OSINT operations. This means that while threat actors can gather information about healthcare organizations, they still need additional verification methods to confirm the accuracy and current validity of what they've discovered. 

 

Information overload

Research indicates that "one of the greatest issues with OSINT is that there can be so much data that deriving analytics becomes difficult." For healthcare organizations, this presents both a vulnerability and a potential defense. While attackers can access vast amounts of information, extracting meaningful, actionable intelligence from it requires skill and resources. A major health surveillance system "was unable to verify over 30% of its total reports in 2002," demonstrating that even sophisticated OSINT operations struggle with data quality and verification.

 

Privacy expectations and ethical concerns

Although information shared on platforms like Facebook may technically be public, research notes that "we still expect a contextual degree of privacy." This creates a gray area where information is legally accessible but ethically questionable to exploit.

More concerning is the potential for misidentification and harm. The research references the Boston Marathon bombing incident, where "internet users identified the wrong suspects...which led to an innocent student being identified and victimised." In healthcare contexts, similar mistakes could lead to wrongful accusations, privacy violations, or discrimination against patients or staff.

 

Real-world impact

The 2020 ransomware attack on Universal Health Services showed how attackers used publicly available information to maximize impact. The extended network downtime affected hundreds of facilities and forced a return to paper records, endangering patient care.

More recently, sophisticated social engineering campaigns in late 2023 showcased how OSINT reconnaissance enables targeted attacks against healthcare organizations. The attackers claimed their phones were broken and couldn't receive MFA tokens, successfully convincing help desk staff to enroll new devices. Once inside corporate systems, they targeted payer website credentials and manipulated ACH payment information to divert legitimate payments to attacker-controlled accounts.

The threat actors conducted reconnaissance on platforms like LinkedIn to identify employees, gathered organizational information from public records, and even registered typosquatting domains to support their phishing campaigns. The local area code spoofing and detailed employee information made the social engineering attempts particularly convincing to help desk personnel.
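
Defenders can anticipate the typosquatting step by generating likely look-alike domains and checking whether anyone has registered them. The sketch below is a simplified illustration in Python. It covers only three common typo patterns (omission, transposition, and repetition), the function name is hypothetical, and it assumes a simple "name.tld" input. It deliberately performs no network lookups, so checking registration status is left to whatever DNS or WHOIS tooling the team already uses.

```python
def typosquat_candidates(domain):
    """Generate simple look-alike variants of a domain's first label."""
    name, _, suffix = domain.partition(".")
    variants = set()

    # Omission: drop one character ("clinic" -> "clinc").
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])

    # Transposition: swap two adjacent characters ("clinic" -> "cilnic").
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Repetition: double one character ("clinic" -> "clinnic").
    for i in range(len(name)):
        variants.add(name[:i] + name[i] + name[i:])

    # Remove the legitimate name and any empty result.
    variants.discard(name)
    variants.discard("")

    if suffix:
        return sorted(v + "." + suffix for v in variants)
    return sorted(variants)


if __name__ == "__main__":
    for candidate in typosquat_candidates("clinic.org"):
        print(candidate)
```

A list like this can feed a periodic check, so that a newly registered look-alike domain is noticed before it appears in a phishing email or a help desk call.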

 

The OSINT paradox in healthcare

As noted in intelligence and global health research, healthcare organizations need transparency but face "severe breaches of privacy through their communications being intercepted." Sharing research advances medical knowledge but may expose valuable intellectual property. Recruiting top talent requires describing exciting technologies but may reveal security details. Building partnerships demands information sharing but expands the attack surface.

Ghioni, Taddeo, and Floridi describe this as the "privacy paradox" of OSINT: information that is publicly available and technically not private can still be sensitive and personal in nature.

The researchers identify several additional challenges that make up this paradox. The asymmetric technological advantage created by AI-powered OSINT tools means that actors with access to greater computing power and more advanced algorithms can extract more intelligence from the same publicly available data than those without such resources. This creates an uneven playing field where well-resourced threat actors, whether cybercriminals or nation-states, can conduct far more effective reconnaissance than healthcare organizations can defend against.

This paradox requires healthcare organizations to balance transparency with security, sharing information purposefully while maintaining awareness of how that information could be weaponized by adversaries. 

 

FAQs

How does OSINT differ from traditional intelligence gathering?

OSINT relies solely on publicly accessible information, whereas traditional intelligence gathering often involves classified or covert sources.

 

Why is healthcare an attractive target for OSINT-based attacks?

Healthcare organizations hold high-value personal, financial, and research data that are often partially exposed through required transparency and public reporting.

 

Can OSINT be used defensively in healthcare?

Yes, security teams can use OSINT to monitor their organization’s public exposure and detect potential data leaks or impersonation attempts.

 

What role does artificial intelligence play in modern OSINT?

AI automates data collection, pattern recognition, and analysis, enabling faster and more intelligent extraction from public sources.

 

How can healthcare organizations limit OSINT-related risks?

They can implement strict social media policies, conduct regular OSINT audits, and train staff on what information should remain confidential.
