Hybrid Sequencing: The Future of Accurate and Complete Genome Assembly

Reconstructing a genome to decode an organism’s full genetic blueprint is essential to modern biology. It is also a difficult task, especially for organisms with complex genomes.

Plants, for example, can have very large genomes packed with long, repetitive DNA sequences, which make accurate assembly of the complete genome a technical challenge. The challenge is even greater for non-model organisms and for species with high levels of heterozygosity (natural differences between the two copies of each chromosome, one inherited from each parent). This heterozygosity contributes to “genetic noise”, making de novo assembly of the genome a much harder puzzle to solve.

Short-Read Sequencing (SRS)

Short-read sequencing (SRS) has long been the cornerstone of genomic research, valued for its exceptional accuracy and high throughput. This technology, exemplified by Illumina platforms, typically generates reads of 50 to 300 base pairs (bp).

Advantages of SRS
  • High Accuracy: Extremely low error rates make SRS ideal for detecting single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels).
  • High Throughput: Can generate large volumes of data quickly, supporting large-scale genomic studies.
  • Cost-Effective: Low cost per base, making it affordable for population-scale or deep sequencing projects.
  • Mature Ecosystem: The widespread adoption of SRS has led to:
    • Established workflows
    • Validated protocols
    • Rich bioinformatics support

Consequently, SRS remains the preferred method for applications such as whole-genome resequencing, whole-exome sequencing, and gene panel sequencing, especially when a high-quality reference genome is available.

Disadvantages of SRS
  • Short read lengths (50–300 bp): Cannot span repetitive or structurally complex regions, limiting resolution.
  • Fragmented assemblies: Produces highly fragmented assemblies, which is especially problematic for genomes with high heterozygosity and those rich in repeat elements.
  • Misannotation issues: Exons from the same gene can be mistakenly split into separate scaffolds, causing annotation errors.
  • Poor detection of structural variants: Often misses or mischaracterizes large structural variants like inversions, translocations, and large indels due to inability to span breakpoints.
  • Not ideal for de novo assembly of genomes: In organisms without a reference genome, SRS results in fragmented, hard-to-interpret assemblies.

Long-Read Sequencing (LRS)

The arrival of long-read sequencing (LRS) technologies, notably from Oxford Nanopore Technologies and PacBio, marked a pivotal advancement in overcoming the limitations of short reads.

Advantages of LRS
  • Long read lengths: Generates reads from kilobases to megabases, enabling comprehensive coverage of complex regions.
  • Better at assembling complex regions: Effectively resolves repetitive and difficult genomic regions such as centromeres, telomeres, and segmental duplications.
  • Accurate detection of structural variants: Identifies large inversions, translocations, and indels often missed by SRS.
  • Improved de novo assembly: Enables highly contiguous and accurate genome assemblies, crucial for organisms without a reference genome.
  • Improved mapping across areas with high numbers of repeats: Long reads can span the entire length of repetitive regions, which can then be aligned using unique flanking anchors, enhancing alignment accuracy.
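
The anchoring idea in the last point can be illustrated with a few lines of Python. This is a minimal sketch, not an aligner: the function name `spans_repeat`, the coordinates, and the 50 bp minimum anchor length are all assumptions chosen for the example.

```python
def spans_repeat(read_start, read_end, repeat_start, repeat_end, min_anchor=50):
    """Return True if a read covers the whole repeat plus unique sequence on both sides.

    Coordinates are 0-based, half-open positions on a reference.
    min_anchor is a hypothetical threshold for how much unique flanking
    sequence is needed to place the read unambiguously.
    """
    left_anchor = repeat_start - read_start
    right_anchor = read_end - repeat_end
    return left_anchor >= min_anchor and right_anchor >= min_anchor


# A hypothetical 5 kb repeat located at positions 10,000 to 15,000.
repeat = (10_000, 15_000)

# A 150 bp short read lying entirely inside the repeat has no unique anchor.
short_read_ok = spans_repeat(12_000, 12_150, *repeat)

# A 12 kb long read starting 2 kb upstream covers the repeat and both flanks.
long_read_ok = spans_repeat(8_000, 20_000, *repeat)

print(short_read_ok, long_read_ok)
```

In this toy case the short read cannot be placed, while the long read has unique sequence on both sides of the repeat.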

Disadvantages of LRS
  • Higher cost per base: Significantly more expensive than SRS, especially for large-scale projects.
  • Higher raw read error rates: Especially prone to small indel errors, requiring error correction.
  • Requires high coverage: Robust assemblies typically need ≥50X genome coverage, increasing cost and data volume.
  • High computational demands: Requires more complex algorithms and greater processing power for assembly and analysis—especially in large, repetitive eukaryotic genomes.
  • Higher input (DNA sample) requirements: Generally requires more starting material than SRS, which can be a limitation for low-yield samples.
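
To put the coverage requirement in perspective, the data volume can be estimated from the usual relationship: total sequenced bases = coverage × genome size. The sketch below is illustrative only. The 1 Gb genome, the 15 kb mean read length, and the 20X comparison point are hypothetical values, not figures from any particular study.

```python
import math


def required_reads(genome_size_bp, target_coverage, mean_read_length_bp):
    """Estimate how many reads are needed to reach a target average coverage."""
    total_bases = genome_size_bp * target_coverage
    return math.ceil(total_bases / mean_read_length_bp)


genome_size = 1_000_000_000  # hypothetical 1 Gb genome
mean_long_read = 15_000      # hypothetical mean long-read length in bp

reads_50x = required_reads(genome_size, 50, mean_long_read)
reads_20x = required_reads(genome_size, 20, mean_long_read)  # assumed lower target

print(f"50X: {genome_size * 50:,} bases, about {reads_50x:,} reads")
print(f"20X: {genome_size * 20:,} bases, about {reads_20x:,} reads")
```

Because the required number of bases scales linearly with coverage, any reduction in the long-read coverage target translates directly into less sequencing and less data to process.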

The Hybrid Approach: A Synergistic Solution

Hybrid sequencing refers to generating data with more than one sequencing technology, most commonly by combining SRS and LRS.

This approach has emerged as an elegant and powerful solution to the challenges posed by complex genomes. Because it relaxes the stringent long-read coverage requirement, it significantly reduces the financial and computational burden. The strategy uses high-throughput, high-accuracy SRS data to first correct the sequencing errors inherent in LRS data. De novo assembly is then performed using these error-corrected long reads, yielding highly contiguous assemblies. This synergy facilitates more complete and accurate assemblies, particularly in repeat-rich regions, while optimizing resource utilization.
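
The correction step described above can be sketched as a per-position majority vote. This is a deliberately simplified toy, not how production hybrid correctors work: it assumes the short reads are already aligned to the long read without gaps, it only fixes substitutions (real long-read errors are often indels), and the function name `correct_long_read`, the `min_support` threshold, and the sequences are all hypothetical.

```python
from collections import Counter


def correct_long_read(long_read, aligned_short_reads, min_support=2):
    """Correct substitution errors in a long read by short-read majority vote.

    aligned_short_reads is a list of (offset, sequence) pairs, where offset
    is the 0-based position on the long read at which the short read aligns.
    """
    votes = [Counter() for _ in long_read]
    for offset, seq in aligned_short_reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    corrected = []
    for original, counter in zip(long_read, votes):
        if counter:
            base, support = counter.most_common(1)[0]
            # Overwrite only when enough short reads agree on the base.
            corrected.append(base if support >= min_support else original)
        else:
            corrected.append(original)  # no short-read coverage at this position
    return "".join(corrected)


# Hypothetical long read with one substitution error at index 5 (G instead of C).
noisy_long_read = "ACGTTGCAAGGCTA"
short_reads = [
    (0, "ACGTTCCA"),
    (3, "TTCCAAGG"),
    (4, "TCCAAGGCTA"),
]

print(correct_long_read(noisy_long_read, short_reads))
```

Positions covered by fewer than `min_support` short reads are left unchanged, which reflects the general principle that correction is only as reliable as the short-read depth behind it.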

The utility of hybrid sequencing extends well beyond de novo genome assembly, addressing critical analytical needs across various domains in genomics.

  • Improved assembly of eukaryotic genomes: Hybrid error correction has dramatically improved de novo eukaryotic genome assemblies, as demonstrated on the Saccharomyces cerevisiae genome.
  • Bacterial genomics: Hybrid sequencing has revolutionised bacterial genome analysis by enabling the assembly of up to 17 complete and 3 partial bacterial genomes from a mixed microbial community. In addition, this approach has been used to generate 557 metagenome-assembled genomes to explore the highly complex microbiome of an activated sludge sample.
  • Viral community analysis: For viral genome assembly, hybrid approaches outperform single-technology methods when studying a community of viruses. In a study of bacteriophage communities, using a hybrid approach reduced error rates to levels comparable with SRS and improved genome recovery.
  • Personalised medicine: Hybrid sequencing approaches have reshaped the use of personalised medicine in key clinical areas:
    • Structural variant and haplotype detection: Hybrid approaches allow for complete phasing and detection of structural variants that may be missed by SRS alone. For example, PacBio HiFi long reads can phase complex pharmacogene haploblocks, resolving medically relevant variants in CYP2D6 and other drug-response genes.
    • Personalised cancer genomics: Hybrid approaches enable high-quality tumor–normal genome assemblies, uncovering complex somatic variants and novel gene fusions in cancers more effectively than reference-based short-read pipelines.
    • Personalised genomics and diagnosis of rare diseases: In an exome-negative patient with a rare glycogen storage disease, hybrid whole-genome sequencing identified a causal structural variant, informing the clinical diagnosis and enabling preimplantation genetic diagnosis.

Comparison of Sequencing Approaches

| Feature | Short-Read Sequencing | Long-Read Sequencing | Hybrid Sequencing |
|---|---|---|---|
| Read Length | 50–300 bp | 5,000–100,000+ bp | Combines both |
| Accuracy (per read) | High (≥99.9%) | Moderate (85–98% raw) | High (≥99.9%; after correction with SRS) |
| Platforms | Illumina, BGI | Oxford Nanopore, PacBio | Illumina + ONT/PacBio |
| Cost per Base | Low | Higher | Moderate |
| Throughput | Very high | Moderate to high | Depends on balance of platforms |
| Turnaround Time | Fast | Moderate | Slower due to dual workflows |
| Best For | Variant calling; RNA-seq; population studies | Structural variation, isoform detection, de novo assembly | Comprehensive genome analysis, complex regions |
| Limitations | Limited context for repeats or SVs | Higher error rates; more complex prep | More complex analysis; higher cost/logistics |
| Data Output | Short, high-quality reads | Long, potentially noisy reads | Complete, polished assemblies |
| Genome Assembly | Fragmented assemblies, gaps likely | Near-complete, fewer gaps | Highly contiguous and accurate assemblies |
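
As a rough way to read the accuracy row, the expected number of errors in a single read is its length multiplied by the per-base error rate. The sketch below assumes the quoted accuracies can be treated as uniform per-base rates, which is a simplification. The 150 bp and 10,000 bp read lengths and the 90% raw accuracy are illustrative picks from within the ranges in the table.

```python
def expected_errors(read_length_bp, per_base_accuracy):
    """Expected number of erroneous bases in one read, assuming uniform error."""
    return read_length_bp * (1.0 - per_base_accuracy)


short_read_errors = expected_errors(150, 0.999)    # short read at 99.9% accuracy
long_read_errors = expected_errors(10_000, 0.90)   # raw long read at 90% accuracy

print(f"Short read: {short_read_errors:.2f} expected errors")
print(f"Raw long read: {long_read_errors:.2f} expected errors")
```

Under these assumptions a short read carries well under one expected error, while an uncorrected long read carries on the order of a thousand, which is why correction with accurate short reads matters before assembly.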

The Future of Genomics is Integrated

In an era demanding ever-greater precision and comprehensiveness in genomic analyses, hybrid sequencing stands as a testament to the power of technological integration. By resolving the inherent challenges of SRS (fragmented data and an inability to map highly repetitive regions) and LRS (sequencing errors), it facilitates the generation of complete, high-quality genomic insights at optimized costs. Hybrid sequencing uses the strengths of each sequencing technique to shore up the weaknesses of the other. It is a clever optimization of existing technologies for tackling complex biological systems across diverse fields, from agricultural genomics and environmental microbiology to precision medicine and infectious disease epidemiology. The hybrid paradigm is set to continue redefining the landscape of de novo genome assembly, enabling a more holistic and accurate exploration of the tree of life.

Strand’s Offerings for Hybrid Sequencing

We offer robust hybrid sequencing capabilities powered by state-of-the-art platforms. For short-read sequencing, the Illumina NovaSeq X Plus enables ultra-high throughput and precision, ideal for SNP detection and deep coverage applications. For long-read sequencing, the ONT PromethION P2 Solo delivers comprehensive structural insights, full-length transcripts, and superior repeat resolution. What sets us apart is our end-to-end, integrated bioinformatics infrastructure, which seamlessly combines short- and long-read data for high-quality genome assembly, variant calling, and transcriptomics. This unified approach ensures accurate, scalable, and clinically meaningful insights—essential for complex research and precision medicine.

References:
  1. Hu T, Chitnis N, Monos D, Dinh A. Next-generation sequencing technologies: An overview. Human Immunology. 2021 Nov 1;82(11):801-11.
  2. Logsdon GA, Vollger MR, Eichler EE. Long-read human genome sequencing and its applications. Nature Reviews Genetics. 2020 Oct;21(10):597-614.
  3. Gehrig JL, Portik DM, Driscoll MD, Jackson E, Chakraborty S, Gratalo D, Ashby M, Valladares R. Finding the right fit: evaluation of short-read and long-read sequencing approaches to maximize the utility of clinical microbiome data. Microbial Genomics. 2022 Mar 18;8(3):000794.
  4. Eisenhofer R, Nesme J, Santos-Bay L, Koziol A, Sørensen SJ, Alberdi A, Aizpurua O. A comparison of short-read, HiFi long-read, and hybrid strategies for genome-resolved metagenomics. Microbiology Spectrum. 2024 Apr 2;12(4):e03590-23.
  5. Goodwin S, Gurtowski J, Ethe-Sayers S, Deshpande P, Schatz MC, McCombie WR. Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome. Genome Research. 2015 Nov 1;25(11):1750-6.
  6. Derakhshani H, Bernier SP, Marko VA, Surette MG. Completion of draft bacterial genomes by long-read sequencing of synthetic genomic pools. BMC Genomics. 2020 Dec;21:1-1.
  7. Liu L, Wang Y, Yang Y, Wang D, Cheng SH, Zheng C, Zhang T. Charting the complexity of the activated sludge microbiome through a hybrid sequencing strategy. Microbiome. 2021 Dec;9:1-5.
  8. Cook R, Brown N, Rihtman B, Michniewski S, Redgwell T, Clokie M, Stekel DJ, Chen Y, Scanlan DJ, Hobman JL, Nelson A. The long and short of it: Benchmarking viromics using Illumina, Nanopore and PacBio sequencing technologies. Microbial Genomics. 2024 Feb 29;10(2):001198.
  9. van der Lee M, Rowell WJ, Menafra R, Guchelaar HJ, Swen JJ, Anvar SY. Application of long-read sequencing to elucidate complex pharmacogenomic regions: a proof of principle. The Pharmacogenomics Journal. 2022 Feb;22(1):75-81.
  10. Ermini L, Driguez P. The application of long-read sequencing to cancer. Cancers. 2024 Mar 25;16(7):1275.
  11. Miao H, Zhou J, Yang Q, Liang F, Wang D, Ma N, Gao B, Du J, Lin G, Wang K, Zhang Q. Long-read sequencing identified a causal structural variant in an exome-negative case and enabled preimplantation genetic diagnosis. Hereditas. 2018 Dec;155:1-9.