Genetic studies have revealed thousands of genetic variants underlying hundreds of human diseases and traits. However, the variants discovered to date explain less than half of the apparent heritability in most diseases and traits. Since these efforts have largely focused on common genetic variants, researchers suspect much of this missing heritability is due to rare genetic variants. Now scientists describe a conceptual framework for the design of studies of rare variants. The findings are detailed in the Proceedings of the National Academy of Sciences.
Studies of common genetic variants are typically known as genome-wide association studies (GWAS), whereas studies of rare variants are often simply called sequencing studies. However, Eric Lander, founding director of the Broad Institute of Harvard and MIT, and his colleagues now suggest the terms common variant association study (CVAS) and rare variant association study (RVAS).
The researchers defined common variants as those with at least one carrier per 100 people. Enormous studies of common genetic variants “have taught us a tremendous amount, but attempts to find rare genetic variants so far have been done with small samples without any idea of how large a sample was needed,” Lander says. “We wanted to see what the principles were for carrying out those studies. We want to be able to find the complete genetic basis of disease.”
The researchers mathematically analyzed which factors are most critical for detecting associations between rare variants and diseases or traits. They found that the design of an RVAS depends crucially on how strongly the genetic variants it analyzes have been selected for or against, that is, how strongly natural selection has driven a variant to spread or suppressed it.
Altogether, the findings suggest that studies of both common and rare variants “actually need to be a similar size,” Lander says. “Many rare variant association studies have tried so far to look at about 1,000 cases, and you’d probably need to at least look at 25,000 cases to reach a pretty good level of power for a study.” An RVAS can begin by analyzing existing sample collections that were assembled for CVAS, many of which contain tens of thousands of cases, the researchers note.
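To give a feel for why sample size matters so much, here is a rough back-of-the-envelope sketch of statistical power for a simple case-control comparison of carrier frequencies. It is not the framework from the paper. It uses a textbook normal approximation for a two-proportion test, and every input is an assumption chosen purely for illustration: the function name `approx_power`, a combined carrier frequency of 0.5 percent among controls, an odds ratio of 2, equal numbers of cases and controls, and a significance threshold of 2.5e-6.

```python
from statistics import NormalDist


def approx_power(n_cases, carrier_freq_controls, odds_ratio, alpha=2.5e-6):
    """Approximate power of a two-sided two-proportion z-test.

    Assumes n_cases cases and the same number of controls.
    All inputs are illustrative assumptions, not values from the study.
    """
    p0 = carrier_freq_controls
    # Convert the assumed odds ratio into a carrier frequency among cases.
    odds_cases = odds_ratio * p0 / (1 - p0)
    p1 = odds_cases / (1 + odds_cases)
    # Standard error of the difference in carrier frequencies.
    se = ((p0 * (1 - p0) + p1 * (1 - p1)) / n_cases) ** 0.5
    # Critical value for the assumed two-sided significance threshold.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Normal approximation, ignoring the negligible opposite tail.
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)


if __name__ == "__main__":
    for n in (1_000, 5_000, 25_000):
        power = approx_power(n, carrier_freq_controls=0.005, odds_ratio=2.0)
        print(f"{n:>6} cases: approximate power {power:.3f}")
```

Under these made-up inputs, power is close to zero at 1,000 cases and rises steeply as the number of cases grows into the tens of thousands. Different assumptions about carrier frequency or effect size would shift the exact numbers considerably.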
Surprisingly, the researchers also found that in RVAS, looking within the coding regions of genes was far more powerful for discovering associations than looking outside them, Lander says. This contrasts with CVAS, which are equally powerful in coding and noncoding regions and have identified many common variants in noncoding regions that affect common traits.
Designing a study properly by thinking about its power “is much better to do before you do the study than after you do the study,” Lander notes wryly. “Now we’ve got to execute these studies in diseases such as early heart attack and schizophrenia.”