Genetic Diversity Is Missing From Many Biobanks : Shots - Health News
Scientists around the world are working to correct a problem with genetic health information: too much of it is currently based on samples of Europeans.

Lack Of Diversity In Genetic Databases Hampers Research


Millions of people around the world have donated DNA samples and medical records to researchers. They form the foundation of the science of so-called precision medicine. To date, most of the results have been based on samples from people of European ancestry. Scientists are now working to correct the skewed picture that results. NPR's science correspondent Richard Harris explains.

RICHARD HARRIS, BYLINE: When you get lab results from your doctor, you may have noticed that some of them specify different normal ranges depending on your race. Arjun Manrai says for people with kidney disease, those details can really matter.

ARJUN MANRAI: You can change the stage of kidney disease that you have and when we start initiating things like dialysis.

HARRIS: Manrai studies health genetics at Harvard Medical School, but he also has a personal story about his mother, who was born in India, immigrated to the United States and, later in life, developed kidney disease. Her lab results showed normal ranges for blacks and everybody else. But her doctor explained that the everybody-else category was really based on results from Europeans.

MANRAI: So my mom responded to her nephrologist like, hey. Well, you know, you have this African American range. You have this white range. I'm an Indian American woman. Which of these ranges actually should I be using? He responded that, you know, you're not either of these two, so I averaged the two numbers for you. And it just...

HARRIS: That's ridiculous, isn't it?

MANRAI: It's ridiculous. But in the absence, I'd say in this kind of vacuum of information, this was what he was doing as his approach to staging her kidney disease.

HARRIS: Doctors don't have good information about people of different ancestries because so much of the basic science has been done on samples drawn from European populations. One of the most widely used repositories is called the U.K. Biobank, which contains samples from half a million middle-aged Brits. Chief scientist Cathie Sudlow says 95% came from white Europeans.

CATHIE SUDLOW: At the time that they were recruited and the age group that were recruited, that largely reflected the average across the U.K. So because the study was in the U.K., that's what we got.

HARRIS: The Biobank has been a boon to scientists who want to identify the genes that are involved in disease. Genes are universal, but it doesn't work to identify all the genetic variants that differ based on ancestry.

SUDLOW: There is no one cohort anywhere in the world that can answer all questions for all people. And one of the really important things about some of the - what the U.K. Biobank's now engaged in with others in the U.S. and, indeed, throughout the world is to look at ways in which data from different ethnic groups can learn from each other.

HARRIS: The U.K. Biobank has helped establish large repositories in Mexico and in China. In the U.S., the National Institutes of Health is gradually putting together a biobank that aims to have a diverse population of a million volunteers. There are dozens and dozens of collections like this scattered around the world; some in private hands, others accessible to scientists.

EWAN BIRNEY: Broadly, we're talking about at least millions of people.

HARRIS: Ewan Birney, co-director of the European Bioinformatics Institute, is part of an effort to find ways to link some of these together so scientists can quickly see how a discovery in one group applies to people with different ancestries. Birney says even though most of the initial work has been in European populations, a lot of it is relevant to everyone.

BIRNEY: How genetics works in different countries - sort of a surprise is that very often, the genetics is pretty much the same as you move between different countries.

HARRIS: Where it breaks down is in the details. The same genes and proteins are involved in, say, diabetes, but the variants differ based on a person's genetic heritage. Birney expects that the new databases will not only help identify issues of concern to a particular ethnic group but can identify genes that are important for everybody's health. He's particularly eager to learn what comes out of a biobank project taking shape in sub-Saharan Africa.

BIRNEY: Because Africa is the birthplace of humans, there's just the highest amount of genetic diversity inside of sub-Saharan Africa. And it's really clear, if you're a geneticist, we should be spending an awful lot more time studying humans there. But then that also is really important that we do that in a way which is empowering and enabling of the scientists who come from these different countries.

HARRIS: The African scientists will get a chance to do their own research before opening this resource up to the rest of the world, Birney says. Arjun Manrai at Harvard is tapping into data that's already available, as he works to make sure that genetic tests and lab results aren't skewed by a person's ancestry.

MANRAI: I think understanding ancestry, race, ethnicity is an area that we're going to see a tremendous, tremendous amount of work in over the next 10 years.

HARRIS: And he says these new resources will help a lot.

Richard Harris, NPR News.

Copyright © 2019 NPR. All rights reserved. Visit our website terms of use and permissions pages for further information.

NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.