We live in an age of incredible innovation. It is hard to say which of the many mind-blowing advances made in our lifetimes will have the biggest impact on us, but in the medical field I would hazard that the most significant developments for health, humanity and the world have been advances in imaging technology (namely MRI and PET) and in DNA sequencing.
If you have been in a pub with me at any point in the last three years I have probably tried to explain my PhD project to you. The result is usually:
- Five minutes of me trying to say ‘haemoglobinopathy’ correctly
- Knocking over someone’s drink while trying to explain Next Generation Sequencing (NGS) through the medium of wild gesticulation
- A lot of this:
Here, in silico, in the hallowed land of Ctrl + Z, I am going to explain coherently what NGS is and why it’s special to me, you and the world.
Rolls up sleeves, pushes mug of wine safe distance from keyboard
What is Next Generation Sequencing?
Next Generation Sequencing is the latest technique for sequencing DNA. It allows you to take a ton of DNA from anything or anyone and work out the sequence of nucleotide bases in that DNA. NGS has revolutionised genetics because, compared to older techniques, it allows rapid sequencing of very large amounts of DNA.
This is how the most commonly used NGS platforms at the moment, the HiSeq and MiSeq (made by Illumina), work:
>Pick a DNA sample (from hair, blood, whatever)
>Chop all the DNA (which will be thousands of copies of the genome) into tiny pieces
>‘Unzip’ the pieces so they are in single strands, rather than double helices
>[Do some chemistry]
>Stick all the fragments to a chip
>Make them clone themselves so they form tight little colonies of identical fragments
>Wash fluorescently labelled DNA bases (A,G,C,T) over them
>The fragments bind to these bases, releasing their fluorescent label
>Use a camera to record the coloured flashes from all the fragments (the clones you made earlier make these signals stronger and easier to detect)
>>> Translate this into the DNA sequence of each original fragment
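For the code-minded, here is a very rough sketch of that last step in Python. It assumes each colony gives one brightness reading per colour per cycle and that you simply pick the brightest colour each time. The channel order and the numbers are made up by me for illustration; the real machines do far more signal processing than this.

```python
# Toy base caller: one intensity per colour channel per cycle,
# and we call whichever base's channel is brightest.
# Channel order and intensity values are invented for illustration.
CHANNELS = "ACGT"

def call_bases(cycles):
    """Turn a list of per-cycle intensity tuples (A, C, G, T) into a sequence."""
    sequence = []
    for intensities in cycles:
        brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
        sequence.append(CHANNELS[brightest])
    return "".join(sequence)

# Hypothetical flashes recorded from one colony over four cycles
signals = [
    (0.9, 0.1, 0.0, 0.1),  # A is brightest
    (0.1, 0.0, 0.8, 0.2),  # G is brightest
    (0.0, 0.1, 0.1, 0.7),  # T is brightest
    (0.2, 0.9, 0.1, 0.0),  # C is brightest
]
print(call_bases(signals))  # AGTC
```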
The sheer amount of sequencing these machines can do is enormous. A DNA sample from a millilitre of blood contains millions of copies of a person’s genome. One of these machines has the capacity to sequence enough DNA fragments to cover every base of your genome 10,000 times in less than a week (or 500x in a day if you wish). By contrast, the Human Genome Project, which was started in 1990, took 13 years and $3bn to produce a complete draft of the human genome (sequencing 4 individuals), using the previous generation of sequencing techniques. Multiple samples can also be sequenced at once, by adding a different nucleotide ID tag to all the DNA from each sample, then mixing the samples together and sequencing the whole lot. Some platforms can sequence up to 96 people simultaneously and then work out from the ID tags which fragments came from where.
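The ID tag trick is easy to picture in code. This is a minimal sketch, assuming every read starts with a fixed-length tag; the tag sequences, the tag length and the sample names are all hypothetical, and real kits use their own tags and tolerate sequencing errors in them.

```python
from collections import defaultdict

# Hypothetical ID tags, invented for this example.
TAGS = {"ACGT": "sample_1", "TTAG": "sample_2"}
TAG_LENGTH = 4

def demultiplex(reads):
    """Sort reads into per-sample piles using the tag at the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        tag, fragment = read[:TAG_LENGTH], read[TAG_LENGTH:]
        sample = TAGS.get(tag, "unknown")  # unrecognised tags are set aside
        by_sample[sample].append(fragment)
    return dict(by_sample)

# A mixed pool of reads, as they might come off the machine
reads = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTCCCGGG", "GGGGAAAAAA"]
sorted_reads = demultiplex(reads)
print(sorted_reads)
```

Reads with a tag that matches nothing land in an "unknown" pile rather than being guessed at, which seemed the safest choice for a sketch.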
Why Should I Care?
The volume of data produced means this data can be manipulated in a ton of ways to answer practically every biological question you can dream of. Some major uses of NGS data are:
Whole Genome Sequencing
You can now have your entire genome sequenced in a day. From this, we can tell if you’re predisposed to heart disease, cancer, Alzheimer’s and many other things. This means you can take preventative measures against these conditions, or that your doctors can keep a lookout for early signs of them. Conversely, if you have a disease we don’t completely understand we can sequence you along with some other sufferers and see if you share any mutations, and what genes they affect. Bingo – new drug target, new means of diagnosing cases, better understanding of the disease.
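The ‘see if you share any mutations’ step is, at its simplest, a set intersection. Here is a sketch in Python, assuming each person’s mutations have already been listed as (chromosome, position, original base, new base). Every variant below is invented, and real studies also have to compare against healthy people and do proper statistics.

```python
def shared_variants(genomes):
    """Return the variants that appear in every genome in the group."""
    variant_sets = list(genomes.values())
    if not variant_sets:
        return set()
    return set.intersection(*variant_sets)

# Invented variants: (chromosome, position, reference base, observed base)
genomes = {
    "patient_1": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr7", 42, "G", "A")},
    "patient_2": {("chr1", 1000, "A", "G"), ("chr7", 42, "G", "A"), ("chr9", 77, "T", "C")},
    "patient_3": {("chr1", 1000, "A", "G"), ("chr7", 42, "G", "A")},
}
shared = shared_variants(genomes)
print(sorted(shared))
```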
Minor Allele Sequencing
Sometimes the genome you were born with is not the only one that shows up when we sequence a sample from, say, your blood. There can also be DNA from viruses, from tumours* and, if you’re pregnant, from your unborn baby. NGS can pick up and sequence this DNA along with your own. Therefore, just by taking a sample of your blood, we can determine the genotype of a cancer you might be suffering from and treat it better; we can identify the genotype of your baby, which is particularly useful if there’s a risk of it inheriting a genetic disease from you; and on top of that, we can detect viral and bacterial infection. Previously these things were only possible by potentially risky biopsy or, in the case of non-solid tumours, not possible at all.
*which are caused by mutations in a single cell or small group of cells, which then divide and mutate rapidly
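To give a flavour of how a second genome shows up in the data: at any one position in the genome you have a pile of reads, and a base that appears in only a small share of them hints at DNA from somewhere else. A minimal sketch, where the 1% and 25% cut-offs are my own assumptions rather than anything standard:

```python
from collections import Counter

def minor_alleles(bases_at_position, min_fraction=0.01, max_fraction=0.25):
    """Return bases seen in only a small share of the reads at one position.

    The thresholds are assumptions for this sketch: below min_fraction we
    treat a base as probable sequencing error, above max_fraction as part
    of the person's own genome.
    """
    counts = Counter(bases_at_position)
    total = sum(counts.values())
    found = {}
    for base, count in counts.items():
        fraction = count / total
        if min_fraction <= fraction <= max_fraction:
            found[base] = fraction
    return found

# Hypothetical pile of 100 reads at one position: 95 say A, 5 say G
pileup = "A" * 95 + "G" * 5
result = minor_alleles(pileup)
print(result)
```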
Knowing what genes you have is one thing – knowing where and how they’re being used is quite another. While all your different cells start with the same pairs of chromosomes, they take on different shapes and perform different functions depending on which genes are actually active. Studying this can help us work out what is happening in the body at a given time and identify disease conditions. A common method your body uses to ‘switch off’ genes is sticking a methyl group to them, so that nothing else can interact with them. Methylation-specific sequencing, which only sequences unmethylated DNA, can therefore be used to work out which genes are on and which genes are off in a particular population of cells.
In the olden days of a few decades ago, assembling a new genome would have been an incredible feat: years of work sequencing small pieces of DNA and slotting them all together. On paper, NGS does exactly the same thing, except that it does it all at once, at the speed of 2,000 badly dressed ’70s scientists. We can now decode the genome for an entirely new organism in a matter of days. With this we can find out more about a species, and see how it compares to us, even identifying our last common ancestors.
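The ‘slotting them all together’ part can be sketched as a greedy puzzle solver: keep finding the two fragments whose ends overlap the most and glue them together. This is a toy version under my own assumptions (error-free fragments, a minimum overlap of 3 bases, invented sequences); real assemblers use cleverer graph-based methods and have to cope with errors and repeats.

```python
def overlap(a, b, min_length=3):
    """Length of the longest end of `a` that matches the start of `b`."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_assemble(fragments, min_length=3):
    """Repeatedly merge the pair of fragments with the biggest overlap."""
    fragments = list(fragments)
    while len(fragments) > 1:
        best_length, best_i, best_j = 0, None, None
        for i, a in enumerate(fragments):
            for j, b in enumerate(fragments):
                if i != j:
                    length = overlap(a, b, min_length)
                    if length > best_length:
                        best_length, best_i, best_j = length, i, j
        if best_length == 0:
            break  # nothing left overlaps, so stop with several pieces
        merged = fragments[best_i] + fragments[best_j][best_length:]
        fragments = [f for k, f in enumerate(fragments) if k not in (best_i, best_j)]
        fragments.append(merged)
    return fragments

# Invented fragments of a tiny 15-base 'genome'
pieces = ["ATGGCGTA", "CGTACGTT", "CGTTAGC"]
assembled = greedy_assemble(pieces)
print(assembled)  # ['ATGGCGTACGTTAGC']
```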
On top of this, the technique can be used to amplify the severely degraded, previously unusable DNA we have from extinct species. The result of this so far has been the sequencing of the Neanderthal and Denisovan genomes, both of which can now be accessed online, just like the human genome. The whole Neanderthal genome was elucidated from three pieces of bone…found in caves. The Denisovan genome as it currently stands originates from 50 mg of a 40,000-year-old bone. If that isn’t the coolest thing you have ever heard then I don’t know what is.
If the blood from your crime scene seems to come from about 6 different people, do not fear! Because NGS sequences everything, you can separate all the DNA out into different genomes based, for example, on the proportions at which you find all the different sequences in your sample!
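As a very loose illustration of sorting by proportion: if one person contributed most of the blood, their sequences should all turn up at a similar, high share of the reads, and a minor contributor’s at a similar, low share. The sketch below just sorts the shares and starts a new group whenever there is a big jump. The variant names, the numbers and the 0.05 tolerance are all invented, and real forensic mixture analysis is a good deal more statistical than this.

```python
def group_by_proportion(fractions, tolerance=0.05):
    """Group sequence variants whose share of the reads is similar.

    `fractions` maps a variant name to the fraction of reads it appears in.
    A new group starts whenever the gap to the previous variant's fraction
    is bigger than `tolerance` (an assumed value for this sketch).
    """
    groups = []
    previous = None
    for variant, fraction in sorted(fractions.items(), key=lambda item: item[1]):
        if previous is not None and fraction - previous <= tolerance:
            groups[-1].append(variant)
        else:
            groups.append([variant])
        previous = fraction
    return groups

# Invented read shares for five variants found in one mixed sample
observed = {
    "variant_1": 0.70,
    "variant_2": 0.68,
    "variant_3": 0.21,
    "variant_4": 0.19,
    "variant_5": 0.10,
}
contributors = group_by_proportion(observed)
print(contributors)
```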
We can make some pretty nice SD cards these days, but we still haven’t got data storage down to the same level as mother nature: the nucleus of a cell is 6 micrometers across yet contains gigabytes of data**. If you filled cells with an artificial genome you had written yourself, you could totally, like, send a ton of information to someone in code and they could then decode it with NGS.
**how much, exactly? We don’t know – estimates range from 20-100Gb right now: we keep finding new ways that DNA writes different information into the same DNA sequences!
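Writing your own message in DNA is easy to sketch: with four bases, each base can stand for two bits, so one byte becomes four bases. The mapping below (A=00, C=01, G=10, T=11) is just one I picked for illustration; actual DNA storage schemes add error correction and avoid awkward sequences like long runs of the same base.

```python
# Assumed mapping for this sketch: A=00, C=01, G=10, T=11
BASES = "ACGT"

def encode(message: bytes) -> str:
    """Write each byte as four bases, two bits per base, high bits first."""
    bases = []
    for byte in message:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)

def decode(sequence: str) -> bytes:
    """Read the bases back four at a time to recover the original bytes."""
    out = bytearray()
    for start in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[start:start + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

secret = encode(b"Hi")
print(secret)          # CAGACGGC
print(decode(secret))  # b'Hi'
```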
What’s Next for NGS?
NGS technology is constantly moving towards cheaper, faster sequencing. We completed the Human Genome Project in 2003 and both the 1,000 genomes project and the $1,000 genome project in 2012. The new goals are the $100 genome project, the 10,000 genomes project and the 1000 plant genomes project. Single Molecule Sequencing is also imminent, with the development of nanopore platforms by Oxford Nanopore Technologies. These platforms, including this one that fits on a freaking flash drive, skip some of the chemistry steps that other machines require and that are known to occasionally introduce sequencing errors.
Next Generation Sequencing has driven us into an age of exponential increases in our sequencing capabilities, opening the doors to new opportunities in every aspect of the study of genetics. The ‘Cheaper’ and ‘Faster’ aspects of NGS developments are crucial to its application to medicine, where we will see its biggest impacts. The challenge now is to develop equally good tools for actually handling the enormous volume of data it creates, and knowing how to correctly interpret what we find…