I like to make useful, reliable, and correct scientific tools.
Most of the things I develop are available to view, download, or modify from open source repositories.
Below is a summary of some things I've helped to make with various excellent people.
pgsc_calc is a Nextflow workflow for Polygenic Score (PGS) calculation. The basic idea is:
The aim of pgsc_calc is to make it easy to re-use polygenic scores. I helped to take research code from Cambridge University and transform it into a portable, reliable, and scalable workflow to calculate PGS. We worked hard to make sure scores were calculated correctly across different genetic ancestry groups.
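For anyone unfamiliar with polygenic scores, the core arithmetic is a weighted sum: each variant's effect allele dosage (0, 1, or 2 copies) is multiplied by its effect weight, and the products are added up. The snippet below is only a minimal sketch of that arithmetic, not how pgsc_calc is implemented. The function name, variant IDs, and weights are all made up for illustration.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of effect allele dosages.

    dosages: dict mapping variant ID -> copies of the effect allele (0, 1, or 2)
    weights: dict mapping variant ID -> effect weight from a scoring file
    Variants missing from either input are skipped (a simplifying assumption).
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[variant] * weights[variant] for variant in shared)


# Hypothetical scoring file weights and one person's genotype dosages
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}
dosages = {"rs1": 2, "rs2": 1, "rs4": 0}

score = polygenic_score(dosages, weights)
print(score)  # 2 * 0.5 + 1 * -0.25 = 0.75
```

A real workflow has much more to do than this, such as checking that the effect allele in the scoring file matches the allele counted in the genotype data.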
snpQT is a Nextflow workflow to simplify working with human genetics data and doing:
Sequencing human genomes is now quite cheap, so many more scientists have access to human genetic data, but a lot of them don't really know how to work with it (myself included at the time). The aim of the workflow was to formalise a set of best practices at our research centre. We published the workflow hoping others might find it helpful.
I'm currently working on integrating pgsc_calc with a genetic data analysis platform for the INTERVENE project.
This involves building tools to help deploy and monitor the workflow across a diverse set of systems, including:
At the start of the pandemic I was part of a small team that helped the Public Health Agency set up data science workflows and applications for contact tracing. Public health experts at the PHA guided the development of interactive data visualisations, and they were happy to upgrade from dodgy spreadsheets 👀
Unfortunately, the applications we developed aren't publicly available.