Study populations

Asthma Translational Genomics Collaborative (ATGC)

The ATGC has whole genome sequences and corresponding measurements of asthma and lung function on 15,580 minority children, constituting the largest genomic study of asthma in minority children in the U.S. The ATGC is composed of several study cohorts including populations from three UC San Francisco studies:

  1. GALA I: Genetics of Asthma in Latino Americans
  2. GALA II: Genes-environments & Admixture in Latino Americans
  3. SAGE: Study of African Americans, Asthma, Genes, & Environments.

GALA I is a family-based study that includes genetic and environmental data on Latino American asthma cases, their parents, and controls. GALA II and SAGE are parallel case-control studies of asthma in Latinos (GALA II) and African Americans (SAGE) recruited from the continental U.S., Mexico, and Puerto Rico, with substantial genetic, demographic, social, environmental, and clinical data. Details: Study Populations & Recruitment Staff.

Puerto Rican Infant Metagenomic and Epidemiologic study of Respiratory Outcomes (PRIMERO)

PRIMERO is a birth cohort study in Puerto Rico designed to probe the relationship between early-life respiratory illnesses and childhood respiratory diseases. Recruitment of 4,000 mothers and their newborns for this study will begin in October 2019. It will be the largest birth cohort of minority children in the U.S.

Latina Breast Health

The Latina Breast Health study contains biological samples and genome-wide array data on 3,457 women, of whom about half are cancer cases. As of June 2019, Latina Breast Health is funded to grow by an additional 2,000 germline samples and 350 tumor/normal samples.

California Pacific Medical Center Research Institute (CPMCRI)

The CPMCRI cohort contains samples from 19,094 women (including 5,108 Asian, Latina, and African American women) prospectively collected during mammography, of which 1,062 are from breast cancer cases.

Ongoing research projects

Early-life respiratory illnesses and later development of asthma

It is unknown whether early-life respiratory illnesses cause later development of asthma or whether a genetic predisposition to asthma first manifests as an early-life respiratory illness. Asthma morbidity and mortality are 2.4-fold and 4-fold higher, respectively, in Puerto Ricans than in Caucasians.1 We investigate whether early-life respiratory illnesses, such as infection with respiratory syncytial virus (RSV), which has a year-round viral season in Puerto Rico,2 can explain such disparities. This research laid the foundation for our Puerto Rican Infant Metagenomic and Epidemiologic study of Respiratory Outcomes (PRIMERO).

Cystic fibrosis in Caribbean populations

Cystic fibrosis (CF) is a lung disease caused by mutations in the CFTR gene. As of June 2019, the reference panel of CFTR mutations had been developed using a mostly European population. Our collaborators in Puerto Rico have observed patients clinically diagnosed with CF who did not carry any known CFTR mutations. We aim to expand the CFTR mutation reference panel to include CF-causing mutations from a diverse set of individuals, starting with those from Puerto Rico and the Dominican Republic, and thereby reduce the disparity in genetic diagnosis of CF in Caribbean individuals.

Inhaled corticosteroid response differs across populations

For treatment of persistent asthma, the clinical gold standard prescription is inhaled corticosteroid (ICS) therapy combined with a short-acting bronchodilator. However, few studies have examined the efficacy of this combination treatment across diverse populations, and even fewer have focused on children. We leverage our ethnically diverse dataset to study how response to ICS differs between populations and to highlight the importance of investigating the influence of race/ethnicity on pharmacological response.

Population-specific bronchodilator response cutoffs

As of June 2019, the cutoff for determining variable expiratory flow limitation in clinical practice is a 12% increase in FEV1 after administration of a bronchodilator. However, some evidence suggests that a cutoff of 8% may be a better determinant of impaired lung function in children. Additionally, no thresholds exist to differentiate between intermittent and persistent asthma. We therefore aim to determine whether population-specific bronchodilator response thresholds have more clinical value than a single cutoff point, and to provide population-specific threshold values that can distinguish between different severities of asthma.
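The percent-change criterion described above can be sketched as follows. This is a minimal illustration, assuming the change is expressed relative to the pre-bronchodilator FEV1; the function names and example values are hypothetical, and the sketch checks only the percent criterion, not any other clinical requirement.

```python
def bdr_percent(pre_fev1: float, post_fev1: float) -> float:
    """Percent change in FEV1 relative to the pre-bronchodilator value.

    Assumption: both values are in the same unit (for example, liters).
    """
    if pre_fev1 <= 0:
        raise ValueError("pre_fev1 must be positive")
    return 100.0 * (post_fev1 - pre_fev1) / pre_fev1


def is_responder(pre_fev1: float, post_fev1: float,
                 cutoff_percent: float = 12.0) -> bool:
    """True if the percent change in FEV1 meets or exceeds the cutoff."""
    return bdr_percent(pre_fev1, post_fev1) >= cutoff_percent


# Hypothetical child: FEV1 rises from 2.00 L to 2.20 L (about a 10% increase).
change = bdr_percent(2.00, 2.20)
meets_12 = is_responder(2.00, 2.20, cutoff_percent=12.0)
meets_8 = is_responder(2.00, 2.20, cutoff_percent=8.0)
```

With these example values, the same child falls below the 12% cutoff but above the 8% cutoff, which illustrates why the choice of threshold matters.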

Genetic ancestry in lung-function predictions

There has been great debate over the use of racial classification in medicine and biomedical research. Race and ethnicity are complex constructs which incorporate social, cultural, and genetic factors. Presently, pulmonary function testing is one of the few clinical applications where self-reported race/ethnicity is used in interpreting a “normal” range. Normative equations of lung function have been developed by testing large populations categorized by self-reported race/ethnicity. However, many populations are racially mixed (i.e., admixed) and self-identified racial and ethnic categories are crude measures of individual genetic ancestry. Using self-reported race/ethnicity may result in misclassifying individuals with respect to the normal range for physiologic measures, if the latter are dependent on ancestry. This error could lead to inaccuracies in evaluating individual pulmonary function and in determining population-specific disease prevalence and severity.

Recent advances in genetics allow genetic ancestry to be easily and inexpensively estimated in admixed populations such as African Americans and Latinos. Ancestry may serve as a proxy for differentially distributed genetic factors that vary according to historic geographic separations. Therefore, even within racial/ethnic groups, quantitative traits may vary with ancestry. Ancestry may thus be relevant to lung function, which also has a genetic component. We demonstrated that incorporating individual genetic ancestry into estimates of pulmonary function could improve predictions among self-identified African Americans and Mexicans.
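As a rough illustration of what "incorporating ancestry" into a prediction equation can look like, the sketch below fits two ordinary least squares models to synthetic data: one with age and height only, and one that adds an ancestry proportion as a covariate. Everything here is an assumption made for illustration: the data are simulated, the coefficients are invented and carry no clinical meaning, and this is not the model used in our published analyses.

```python
import numpy as np


def fit_rss(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, residual sum of squares).
    """
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return beta, float(resid @ resid)


rng = np.random.default_rng(0)
n = 200
age = rng.uniform(8, 21, n)            # years (synthetic)
height_cm = rng.uniform(120, 185, n)   # centimeters (synthetic)
ancestry = rng.uniform(0, 1, n)        # hypothetical ancestry proportion

# Synthetic outcome; every coefficient below is invented for illustration.
fev1 = (-2.0 + 0.03 * age + 0.03 * height_cm
        - 0.4 * ancestry + rng.normal(0, 0.2, n))

# Baseline model: age and height only.
_, rss_base = fit_rss(np.column_stack([age, height_cm]), fev1)
# Extended model: adds the ancestry proportion as a covariate.
beta_anc, rss_anc = fit_rss(np.column_stack([age, height_cm, ancestry]), fev1)
```

Because the simulated outcome depends on ancestry, the extended model leaves a smaller residual sum of squares. In real data, whether the improvement is meaningful has to be assessed on held-out individuals rather than on the fitting sample.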

Air pollution and telomere length in African Americans

Telomere length provides a potential biomarker for conditions like asthma that are associated with chronic oxidative stress and inflammation. Air pollution is one potential source of oxidative stress in the lungs. Understanding the relationship between telomere length, asthma, and air pollution is important for identifying risk factors contributing to unhealthy aging in children. Therefore, we aim to investigate the associations between exposures to ambient air pollutants and telomere length in African American children and adolescents, and to examine whether asthma status alters the association.

Portability of public gene expression databases

Transcriptome-wide association studies (TWAS) analyze the contribution of genetic variants to complex traits mediated through gene expression. Public genotype-expression repositories such as the Genotype-Tissue Expression (GTEx) project allow researchers to perform TWAS by predicting gene expression levels from genotypes using linear models from PrediXcan. However, predictive models trained on GTEx expression data are built almost exclusively with reference data from individuals of European descent, which decreases their portability to non-European populations.3 We study techniques for improving the transethnic portability of these models and provide guidance to users who wish to use gene expression prediction models for their own TWAS.
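The two-step logic described above (predict expression from genotypes with a linear model, then test the predicted expression against a trait) can be sketched in a few lines. This is a toy example and not the PrediXcan software: the weights, dosages, trait values, and function names are all hypothetical, and a real analysis would use trained per-gene weights and a proper association test.

```python
import numpy as np


def predict_expression(dosages, weights):
    """Predicted expression of one gene for each individual.

    dosages: (n_individuals, n_variants) allele dosages in [0, 2].
    weights: (n_variants,) per-variant weights from a trained linear model.
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_variants)")
    return dosages @ weights


def association_slope(predicted, trait):
    """Slope of a simple linear regression of the trait on predicted expression."""
    slope, _intercept = np.polyfit(predicted, trait, 1)
    return float(slope)


# Hypothetical weights for three variants and dosages for three individuals.
weights = [0.5, -0.2, 0.1]
dosages = [[0, 1, 2],
           [2, 0, 1],
           [1, 1, 0]]
predicted = predict_expression(dosages, weights)

# Hypothetical trait constructed to depend linearly on predicted expression.
trait = 2.0 * predicted + 1.0
slope = association_slope(predicted, trait)
```

The portability problem arises in the first step: if the weights were trained in a population whose allele frequencies and linkage patterns differ from those of the target population, the predicted expression can be a poor proxy for true expression, which weakens the association step.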

Latina breast cancer whole exome sequencing: germline

Rare variants associated with breast cancer may be identified by sequencing. Although these variants affect only a small group of women, the women who carry them are at much higher risk of breast cancer and may benefit from increased screening and/or preventive therapies. We are using a combination of whole exome and targeted sequencing to understand the contribution of high- and intermediate-penetrance breast cancer susceptibility genes in Latinas.

Latina breast cancer whole exome tumor/normal analysis

Mutations that occur in the cancer itself (somatic mutations) are important both for understanding tumor biology and for directing targeted therapy in some women with breast cancer. We are performing whole exome sequencing of breast tumors from Latina women to understand the patterns of somatic mutations in this population.

Our public resources

Data sets




  1. Akinbami LJ, Moorman JE, Bailey C, Zahran HS, King M, Johnson CA, Liu X. Trends in asthma prevalence, health care use, and mortality in the United States, 2001-2010. NCHS Data Brief. 2012 May;(94):1-8. PubMed PMID: 22617340.
  2. McGuiness CB, Boron ML, Saunders B, Edelman L, Kumar VR, Rabon-Stith KM. Respiratory syncytial virus surveillance in the United States, 2007-2012: results from a national surveillance system. Pediatr Infect Dis J. 2014 Jun;33(6):589-94. doi: 10.1097/INF.0000000000000257. PubMed PMID: 24445835; PubMed Central PMCID: PMC4025589.
  3. Keys KL, Mak ACY, White MJ, Eckalbar WL, Dahl AW, Mefford J, Mikhaylova AV, Contreras MG, Elhawary JR, Eng C, Hu D, Huntsman S, Oh SS, Salazar S, Lenoir MA, Ye JC, Thornton TA, Zaitlen N, Burchard EG, Gignoux CR. On the cross-population portability of gene expression prediction models. bioRxiv. March 19, 2019.