FHIR, Ontologies, and Why Healthcare Data Is Hard

Neeam Hayder

When I tell people I do AI/ML research in healthcare informatics, they picture models diagnosing diseases. The day-to-day reality is further upstream and, honestly, more interesting: most of the field's effort goes into making health data usable at all.

The same fact, five ways

A single clinical fact — say, a patient's blood pressure reading — can be recorded in wildly different ways across systems: different units, different codes, different field names, buried in free text, or attached to the wrong encounter. Multiply that by every lab, clinic, and legacy EHR, and you get the core problem: data that exists but can't be combined.

That's what standards like FHIR (Fast Healthcare Interoperability Resources) are for — a common shape for health data so systems can exchange it without a bespoke translator for every pair.
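To make "a common shape" concrete, here is a minimal sketch of two differently shaped source records converted into the same FHIR-style Observation. The source records and their field names (`sbp`, `bp`, `taken_at`, and so on) are invented for illustration, and the output is a simplified subset of a blood pressure Observation rather than a validated resource. The LOINC codes and UCUM unit follow the FHIR vital signs convention for blood pressure.

```python
import json

LOINC = "http://loinc.org"
UCUM = "http://unitsofmeasure.org"


def bp_component(loinc_code, display, value_mmhg):
    """One systolic or diastolic component, coded with LOINC and UCUM."""
    return {
        "code": {"coding": [{"system": LOINC, "code": loinc_code, "display": display}]},
        "valueQuantity": {
            "value": value_mmhg,
            "unit": "mmHg",
            "system": UCUM,
            "code": "mm[Hg]",
        },
    }


def to_fhir_bp(patient_id, systolic, diastolic, when):
    """Build a simplified blood pressure Observation as a plain dict."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": LOINC, "code": "85354-9",
                             "display": "Blood pressure panel with all children optional"}]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": when,
        "component": [
            bp_component("8480-6", "Systolic blood pressure", systolic),
            bp_component("8462-4", "Diastolic blood pressure", diastolic),
        ],
    }


# Two hypothetical source systems recording the same reading differently.
clinic_a = {"pid": "123", "sbp": 120, "dbp": 80, "ts": "2024-01-15T09:30:00Z"}
clinic_b = {"patient": "123", "bp": "120/80", "taken_at": "2024-01-15T09:30:00Z"}


def from_clinic_a(record):
    return to_fhir_bp(record["pid"], record["sbp"], record["dbp"], record["ts"])


def from_clinic_b(record):
    # Clinic B stores the reading as a single "systolic/diastolic" string.
    systolic, diastolic = record["bp"].split("/")
    return to_fhir_bp(record["patient"], int(systolic), int(diastolic), record["taken_at"])


obs_a = from_clinic_a(clinic_a)
obs_b = from_clinic_b(clinic_b)
print(json.dumps(obs_a, indent=2))
```

Each source still needs its own small adapter, but every adapter targets one shape, so anything downstream only has to understand that shape.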

Ontologies are the unsung heroes

Standards give data a shape; ontologies and terminologies give it meaning. Controlled vocabularies let "myocardial infarction," "MI," and "heart attack" resolve to the same concept. Without that layer, your dataset silently fragments — and any model you train on it learns the fragmentation instead of the medicine.
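A toy illustration of that fragmentation, assuming a tiny hand-written synonym table. The `SYNONYMS` dict and `normalize` helper are hypothetical; a real pipeline would query a terminology service instead of a hard-coded lookup. The concept identifier is the SNOMED CT code for myocardial infarction.

```python
from collections import Counter

SNOMED_MI = "22298006"  # SNOMED CT: Myocardial infarction (disorder)

# Hypothetical, hand-written synonym table for illustration only.
SYNONYMS = {
    "myocardial infarction": SNOMED_MI,
    "mi": SNOMED_MI,
    "heart attack": SNOMED_MI,
}


def normalize(term):
    """Return the concept code for a raw term, or None if it is unknown."""
    return SYNONYMS.get(term.strip().lower())


raw_terms = ["Myocardial infarction", "MI", "heart attack", "MI "]

# Counting raw strings treats every spelling as a separate diagnosis.
raw_counts = Counter(raw_terms)

# Counting concepts collapses them into one.
concept_counts = Counter(normalize(term) for term in raw_terms)

print(len(raw_counts), "distinct raw strings")
print(dict(concept_counts))
```

The raw counts see four different "conditions", each with one patient; the normalized counts see one condition with four patients. A model trained on the raw strings inherits the first view.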

This is where the FAIR principles (Findable, Accessible, Interoperable, Reusable) stop being an acronym on a slide and start being an engineering checklist.

Where the ML actually fits

Once the plumbing exists, the interesting questions open up:

  • Can models help map messy real-world records onto standard terminologies, instead of humans hand-coding crosswalks?
  • Can LLMs give clinicians and researchers a natural-language interface to structured health data?
  • How do you evaluate any of this when errors have clinical consequences?

That last one is the question I keep coming back to. In most ML applications, a wrong answer costs you a click. Here, a wrong answer can feed into a clinical decision, which is why evaluation is the research.
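As a rough sketch of what evaluating a mapper could look like, the example below scores a term-to-concept mapper against a small gold-standard crosswalk. Everything here is an assumption for illustration: the vocabulary, the placeholder concept IDs (`C-MI` and friends), and the gold labels are made up, and plain string similarity from Python's `difflib` stands in for a real model. The point is the shape of the evaluation: the mapper may abstain, and coverage is reported separately from precision.

```python
import difflib

# Hypothetical target vocabulary: display name -> placeholder concept ID.
VOCAB = {
    "myocardial infarction": "C-MI",
    "hypertension": "C-HTN",
    "diabetes mellitus type 2": "C-T2DM",
}


def suggest(term, cutoff=0.8):
    """Suggest a concept for a raw term, or abstain (None) below the cutoff."""
    cleaned = term.strip().lower()
    matches = difflib.get_close_matches(cleaned, list(VOCAB), n=1, cutoff=cutoff)
    return VOCAB[matches[0]] if matches else None


def evaluate(gold, cutoff=0.8):
    """Score suggestions against (raw term, expected concept) pairs."""
    answered = 0
    correct = 0
    for term, expected in gold:
        predicted = suggest(term, cutoff)
        if predicted is None:
            continue  # abstention: left for a human reviewer
        answered += 1
        if predicted == expected:
            correct += 1
    return {
        "coverage": answered / len(gold),
        "precision": correct / answered if answered else None,
    }


# Hypothetical gold-standard crosswalk, as a human coder might produce it.
gold = [
    ("Myocardial infarction", "C-MI"),
    ("hypertension ", "C-HTN"),
    ("myocardial infraction", "C-MI"),  # typo in the source record
    ("heart attack", "C-MI"),           # synonym with no surface similarity
]

result = evaluate(gold)
print(result)
```

String similarity recovers the typo but abstains on "heart attack", which is the kind of gap that curated terminologies or learned models are meant to close. Keeping coverage and precision separate makes the trade-off visible: a mapper that abstains often can look precise while leaving most of the work to humans.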

Reading list

If this space interests you, start with the FHIR overview and the FAIR principles — then look at how far real hospital data is from either. That gap is the job.
