Projects

Bioinformatics workflows and software

I like to make useful, reliable, and correct scientific tools.

Most of the things I develop are available to view, download, or modify from open source repositories.

Below is a summary of some things I've helped to make with various excellent people.

Polygenic Score (PGS) Catalog Calculator

pgsc_calc is a Nextflow workflow for Polygenic Score (PGS) calculation.

The aim of pgsc_calc is to make it easy to re-use polygenic scores. I helped to take research code from Cambridge University and transform it into a portable, reliable, and scalable workflow to calculate PGS. We worked hard to make sure scores were calculated correctly across different genetic ancestry groups.
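As a rough illustration of what calculating a PGS means, here is a toy sketch. It is not how pgsc_calc is implemented. A score is usually a weighted sum: for each variant in the score, multiply the effect weight by the number of copies of the effect allele a person carries (0, 1, or 2), then add everything up. The function name, variant IDs, and weights below are made up for the example, and skipping variants that are missing from the sample is a simplifying assumption.

```python
def polygenic_score(weights, dosages):
    """Return the sum of effect_weight * effect-allele dosage.

    weights: dict mapping variant ID -> effect weight (hypothetical values)
    dosages: dict mapping variant ID -> effect-allele count (0, 1, or 2)
    Variants missing from `dosages` are skipped (a simplifying assumption).
    """
    total = 0.0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            continue  # variant not genotyped in this sample
        total += weight * dosage
    return total


# Hypothetical scoring file with three variants
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}

# Hypothetical sample genotyped at two of the three variants
sample = {"rs1": 2, "rs2": 1}

score = polygenic_score(weights, sample)
print(score)  # 0.5 * 2 + (-0.25) * 1 = 0.75
```

Real workflows have much more to deal with, such as matching variants between the scoring file and the target genomes, handling strand and allele flips, and adjusting for genetic ancestry. None of that is shown here.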

snpQT

snpQT is a Nextflow workflow that simplifies working with human genetics data.

Sequencing human genomes is now quite cheap, so many more scientists have access to human genetic data, but not all of them know how to work with it (myself included at the time). The aim of the workflow was to formalise a set of best practices at our research centre. We published the workflow hoping others might find it helpful.


Software development

I'm currently working on integrating pgsc_calc with a genetic data analysis platform for the INTERVENE project.

This involves building tools to help deploy and monitor the workflow across a diverse set of systems.


Data science

COVID-19

At the start of the pandemic, I was part of a small team that helped the Public Health Agency (PHA) set up data science workflows and applications for contact tracing. Public health experts at the PHA guided the development of interactive data visualisations, and they were happy to upgrade from dodgy spreadsheets 👀

Unfortunately, the applications we developed aren't publicly available.