Genetic & Genomic Medicine

A Hospital Biobank Shows Why Genetic Medicine Keeps Failing Non-Europeans

Researchers linked genomes to health records for nearly 94,000 patients across 36 ancestry groups at UCLA. The tools meant to predict disease worked far worse outside European ancestry, and even a popular weight-loss drug showed different effects by ancestry.

Abel Chen · April 3, 2026 · 4 min read

Most of what we know about the genetics of common disease comes from people of European descent. That has been true for two decades. The consequence is quieter than it sounds. When a clinic calculates your genetic risk of heart disease or diabetes from a DNA sample, the math behind that number was mostly trained on one slice of humanity. If you fall outside it, the estimate can be wrong in ways nobody flags.

A team centered at UCLA set out to measure how bad this gap actually is, using their own hospital. Writing in Cell, Roni Haas and colleagues analyzed 93,936 participants from the UCLA ATLAS Community Health Initiative, a biobank that ties each person's genome to their electronic health record. They sorted patients into five continental ancestry groups and 36 finer ones, then asked a blunt question: do our genetic prediction tools hold up across all of them?

The prediction gap, in numbers

Polygenic scores are the workhorse here. They add up thousands of small genetic effects into a single risk figure for a common disease. In the ATLAS data these scores predicted disease well overall, but the authors report that accuracy dropped sharply in people who were not of European ancestry. A score tuned on European genomes carries less information when applied to someone whose genetic background differs, because the statistical landmarks it relies on shift: how common each variant is, and how closely the measured marker variants track the ones that actually cause disease, both differ between populations.
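
To make the arithmetic concrete, a polygenic score can be sketched as a weighted sum: each variant's weight, estimated in a reference study, is multiplied by the number of copies of the risk allele a person carries (0, 1, or 2), and the products are added together. The example below is a minimal illustration under that assumption, not the pipeline used in the ATLAS study. The variant names, the weights, and the function name polygenic_score are all hypothetical.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float],
                    weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele counts.

    dosages: copies of the risk allele per variant (0, 1, or 2).
    weights: per-variant effect sizes from a reference study.
    Variants missing from the patient's data are skipped.
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Hypothetical variants and weights, for illustration only.
weights = {"variant_A": 0.5, "variant_B": -0.25, "variant_C": 0.125}
patient = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = polygenic_score(patient, weights)
print(score)  # 2*0.5 + 1*(-0.25) + 0*0.125 = 0.75
```

The weights are fixed by whichever study produced them. If that study was mostly European, the same numbers get applied to every patient regardless of background, which is where the loss of accuracy described above can creep in.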

The clinical-variant databases had the same problem from a different angle. Curated lists of disease-causing mutations lean heavily European, so a variant that matters in an underrepresented group may never have been catalogued. To push against that bias, the team used computational predictors to reassess variants across ancestries. Doing so surfaced associations that the standard databases had missed, including a link between the gene ANKZF1 and peripheral vascular disease in African American patients.

Diversity was not only a fairness concern. It was a discovery engine. Because the cohort spanned so many backgrounds, the researchers turned up gene-phenotype connections that a uniform cohort would have hidden. One example: they tied the gene FN3K to intestinal disaccharidase deficiency, a difficulty digesting certain sugars, in Europeans and in admixed American patients.

When a blockbuster drug reads differently by ancestry

The part likely to travel furthest involves semaglutide, the GLP-1 drug behind Ozempic and Wegovy. Because ATLAS includes longitudinal records, the team could watch how patients responded over time rather than in a single snapshot. Semaglutide's efficacy varied across ancestry groups. It also tracked with a patient's polygenic score for type 2 diabetes, and it was modulated by genetic variation in a gene called PTPRU.

That is a concrete pharmacogenomic signal for one of the most prescribed drug classes in the world. It hints that who benefits most, and by how much, may partly be written in the genome, and that the answer is not identical across populations. The finding is an association drawn from health-system data, not a controlled trial, so it points toward hypotheses to test rather than a dosing rule to adopt tomorrow.

What this does and does not settle

A few limits are worth stating plainly. This is one health system in Los Angeles, and its patient mix reflects that region rather than the whole country or world. Electronic health records are messy: diagnoses get miscoded, follow-up is uneven, and a biobank captures people who happen to seek care, not a random sample. Ancestry itself is a statistical construct here, a way of grouping genetic similarity, not a biological category with hard edges. Associations like the semaglutide and PTPRU result need replication in independent cohorts before anyone changes practice.

Still, the direction is clear enough. The paper shows that a single institution, if its biobank is genuinely diverse, can generate robust disease associations and pharmacogenomic leads that European-heavy datasets keep missing. It also quantifies the cost of the status quo. Tools built on narrow data do not fail loudly. They just quietly serve some patients worse than others. Building biobanks that look like the people walking into the clinic is one way to stop pretending that gap does not exist.
