May 31, 2019


Using bioinformatics and ontologies to bridge between healthcare, human genetics and model organism research


Project Description

This project will build on a major public database of disease genetics (GWAS Central, https://www.gwascentral.org) to provide new methods and interfaces to: (a) visually and programmatically interrogate GWAS Central data using a range of research and clinical ontology terms; and (b) integrate and compare GWAS Central data with similar data from other public sources including those of other species. Possible extensions include: (i) GWAS data publishing and linking via the semantic web of ‘Linked Data’; (ii) natural language processing of GWAS publications to extract genotype and phenotype information; and (iii) developing and implementing an application ontology for GWAS summary-level metadata.

More specifically, genome-wide association studies (GWAS) identify genetic variants associated with phenotypes. The comprehensive and widely used GWAS Central database (Beck et al., 2014) collates published summary-level GWAS findings from thousands of studies. The phenotype descriptions in GWAS Central are standardised using publicly available ontologies. Ontologies are controlled vocabularies in which the terms are precisely defined and related to each other in meaningful ways, and they are widely used by bioinformaticians to integrate and compare heterogeneous datasets. GWAS Central currently uses ontologies that have been developed with a research focus. However, greater integration with clinical datasets will be achieved by additionally using ontologies that have a clinical focus, such as those used by the NHS. Furthermore, GWAS data will be integrated with genetic data from other mammalian species, such as mouse, by mapping each human phenotype to its mouse equivalent. International efforts to map between ontologies provide a foundation for enabling this.
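
As a rough illustration of the cross-species mapping idea described above, the following Python sketch links human GWAS records to mouse gene annotations through a term-to-term ontology mapping. Everything in it is an assumption made for illustration: the term identifiers, marker names, gene names and the function link_human_to_mouse are hypothetical placeholders, and it does not reflect the GWAS Central data model or any real mapping resource.

```python
from collections import defaultdict

# Hypothetical human-to-mouse phenotype term mapping.
# All identifiers are made-up placeholders, not real ontology IDs.
HUMAN_TO_MOUSE = {
    "HUMAN:0001": "MOUSE:0101",
    "HUMAN:0002": "MOUSE:0102",
}

# Hypothetical human GWAS summary records annotated with human phenotype terms.
human_gwas = [
    {"marker": "rsA", "gene": "GENE1", "phenotype": "HUMAN:0001"},
    {"marker": "rsB", "gene": "GENE2", "phenotype": "HUMAN:0002"},
    {"marker": "rsC", "gene": "GENE3", "phenotype": "HUMAN:0003"},  # no mapping available
]

# Hypothetical mouse gene-to-phenotype annotations.
mouse_annotations = [
    {"gene": "Gene1", "phenotype": "MOUSE:0101"},
    {"gene": "Gene9", "phenotype": "MOUSE:0101"},
    {"gene": "Gene2", "phenotype": "MOUSE:0102"},
]


def link_human_to_mouse(gwas_records, mouse_records, mapping):
    """Attach mouse genes that share the mapped phenotype to each GWAS record."""
    # Index mouse genes by their phenotype term for quick lookup.
    genes_by_term = defaultdict(list)
    for rec in mouse_records:
        genes_by_term[rec["phenotype"]].append(rec["gene"])

    linked = []
    for rec in gwas_records:
        mouse_term = mapping.get(rec["phenotype"])
        if mouse_term is None:
            continue  # skip human terms that have no mouse equivalent
        linked.append({
            **rec,
            "mouse_phenotype": mouse_term,
            "mouse_genes": sorted(genes_by_term.get(mouse_term, [])),
        })
    return linked


if __name__ == "__main__":
    for row in link_human_to_mouse(human_gwas, mouse_annotations, HUMAN_TO_MOUSE):
        print(row["marker"], row["phenotype"], "->", row["mouse_phenotype"], row["mouse_genes"])
```

In practice the mapping table would come from curated ontology mapping efforts rather than a hand-written dictionary, and human terms without a mouse equivalent would need explicit handling rather than being silently skipped.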


Beck T, Hastings RK, Gollapudi S, Free RC, Brookes AJ. GWAS Central: a comprehensive resource for the comparison and interrogation of genome-wide association studies. Eur J Hum Genet. 2014 Jul;22(7):949-52.


This fully-funded studentship is available to Home/EU students and covers UK/EU tuition fees plus an annual tax-free stipend for 3 years (for 2018/19 the stipend rate is £14,777). 

The studentship will be held in the Department of Genetics and Genome Biology in the College of Life Sciences at the University of Leicester and will commence on 23 September 2019.

