What’s hot in DNA sequencing – and what’s next?
DNA sequencing has come a long way
Got $1,000 burning a hole in your pocket? Can I interest you in a miniature DNA sequencer that will plug into your laptop? Just four decades after Frederick Sanger developed the Sanger method, a DNA sequencing technique that won him a Nobel Prize, DNA sequencing is a reliable and essential tool not only for scientists, but also for clinicians, law enforcement and even educators.
Our appetite for faster, better sequencing is insatiable, and it has spurred the development of new sequencing technologies, from large banks of high-throughput sequencers to portable units that fit in a drawer. While the Human Genome Project took about 13 years to sequence the human genome, researchers using current sequencing technology, supported and enabled by advances in computing and data analysis, now contemplate sequencing the genomes of populations of 100,000 people.
Next-generation sequencing: where are we now?
After the completion of the Human Genome Project, a new wave of technologies, called next-generation sequencing (NGS), came onto the market. These sequencers use massively parallel sequencing of short reads for high throughput.
- Illumina platforms became market leaders, developing a sequencing-by-synthesis approach on a clonally amplified template.
- BGI used the Illumina platform to become the largest NGS data producer in the world, while developing systems of its own based on a sequencing-by-ligation approach.
- Both technologies now produce gigabases of data per run.
NGS has fundamentally changed genomics research. It enables scientists to sequence an entire genome, transcriptome, or exome, conducting experiments that were previously impossible. To put that into perspective, generating the raw data for the Human Genome Project now would take less than one day with these systems.
But these platforms have some limitations. NGS relies on short reads (less than 1 kb), so it has to scale up to millions of fragments sequenced in parallel. Short reads are not well suited to de novo assembly, because the genome isn’t easily reconstructed from them without a reference genome. Deep sequencing, in which each base is read many times, partially compensates for this limitation by minimizing errors, but repetitive regions are still a challenge.
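To get a feel for the scale involved, average sequencing depth is commonly estimated as the number of reads times the read length, divided by the genome size. The snippet below is a minimal sketch of that arithmetic. The function names, the roughly 3 Gb genome size and the 150-base read length are illustrative assumptions, not figures from any particular platform.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced: C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: int, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Illustrative assumptions: a ~3 Gb genome and 150-base reads.
GENOME_SIZE = 3_000_000_000
READ_LENGTH = 150

depth = mean_coverage(600_000_000, READ_LENGTH, GENOME_SIZE)
print(f"600 million reads give about {depth:.0f}x average coverage")
```

Under these assumptions, reaching 30x average coverage takes hundreds of millions of short reads, which is why massively parallel sequencing matters so much.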
Sample preparation for NGS can be labor-intensive. This is being addressed with commercial kits that simplify the amplification and purification of starting DNA.
Read more about how these systems work in Nature Reviews.
Third-generation sequencing: single-molecule technologies
Enter third-generation sequencers, which provide long-read sequencing (tens of kb) without clonal amplification or the various preparative steps needed for NGS. Long-read sequencing can simplify de novo assembly, span repetitive regions and structural variants in a single read, and show less bias in GC-rich and GC-poor regions.
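Why does read length matter so much for repeats? A read that falls entirely inside a repeated stretch could have come from any copy, while a read longer than the repeat carries some unique flanking sequence with it. The toy sketch below illustrates the idea. The sequence is made up, and `non_unique_windows` is a hypothetical helper written for this example, not part of any sequencing toolkit.

```python
from collections import Counter


def non_unique_windows(genome: str, read_len: int) -> int:
    """Count read-length windows whose sequence occurs at more than one position."""
    windows = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(windows)
    return sum(1 for window in windows if counts[window] > 1)


# Made-up toy genome: two copies of a 6-base repeat separated by unique sequence.
REPEAT = "ACGTAC"
genome = "TTGA" + REPEAT + "GGCT" + REPEAT + "CATG"

for read_len in (4, 8):
    ambiguous = non_unique_windows(genome, read_len)
    print(f"read length {read_len}: {ambiguous} ambiguous windows")
```

In this toy case, 4-base reads leave several positions ambiguous, while 8-base reads, being longer than the repeat, can each be placed at exactly one position.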
Single-molecule sequencing struggles with accuracy, but error rates are improving as the technology matures. It’s already possible to reconstruct a genome with long-read sequencing as easily as by NGS.
Key players in the commercialization of single-molecule sequencing technologies are Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). To compete with these technologies, NGS platform suppliers such as Illumina have developed synthetic long-read technology. Here, the short reads derived from each long fragment are tagged with a shared barcode, which provides a way to rebuild the original sequence computationally without investing in a new piece of hardware.
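The barcoding idea can be sketched in a few lines: short reads that share a barcode are binned together, and each bin can then be assembled separately into one longer sequence. The example below is a simplified illustration only. It assumes the barcode is a fixed-length prefix of each read, and both `BARCODE_LEN` and `group_by_barcode` are hypothetical names; real synthetic long-read workflows are considerably more involved.

```python
from collections import defaultdict

BARCODE_LEN = 4  # hypothetical barcode length, chosen for illustration


def group_by_barcode(reads: list[str], barcode_len: int = BARCODE_LEN) -> dict[str, list[str]]:
    """Bin reads by their leading barcode and strip the barcode from each read."""
    groups: defaultdict[str, list[str]] = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(insert)
    return dict(groups)


# Made-up reads: the first four bases are the barcode, the rest is genomic sequence.
reads = ["AAAAGATT", "CCCCTTGA", "AAAATTAC"]
for barcode, inserts in group_by_barcode(reads).items():
    print(barcode, inserts)
```

Each bin here stands in for the short reads that came from one original long fragment, so local assembly within a bin avoids mixing up reads from similar regions elsewhere in the genome.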
The future of next-generation sequencing
As next-generation sequencing technologies continue to develop, expect a step change in speed and convenience, doing away with the need for technical preparation steps, and simplifying workflows. These new technologies will inevitably become routine in research, in clinical settings, and far beyond. These are interesting times for genome sequencing technology, indeed.