Sequencing the gigabase plant genome of the wild tomato species Solanum pennellii using Oxford Nanopore single molecule sequencing

 

Maximilian Schmidt1, Alexander Vogel1, Alisandra Denton1, Benjamin Istace6, Alexandra Wormit1, Henri van de Geest2, Marie E. Bolger3, Saleh Alseekh4, Janina Maß3, Christian Pfaff3, Ulrich Schurr3, Roger Chetelat, Florian Maumus, Jean-Marc Aury6, Alisdair R. Fernie4, Dani Zamir5, Anthony Bolger1, Björn Usadel1,3.

1 Institute for Botany and Molecular Genetics, BioEconomy Science Center, RWTH Aachen University, Aachen, Germany.
2 Wageningen Plant Research, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands.
3 Institute for Bio- and Geosciences (IBG-2: Plant Sciences), Forschungszentrum Jülich, Jülich, Germany.
4 Department of Molecular Physiology, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany.
5 Faculty of Agriculture, Hebrew University of Jerusalem, Rehovot, Israel.
6 Genoscope (CEA) and UMR 8030 CNRS-Genoscope-Université d'Evry, 2 rue Gaston Crémieux, BP5706, 91057 Evry, France.

 

Background

Recent updates to Oxford Nanopore technology (R9.4) have made it possible to obtain gigabases of sequence data from a single flow cell. Moreover, unlike other next-generation sequencing technologies, Oxford Nanopore sequencing does not require any upfront capital investment. We therefore evaluated whether Oxford Nanopore sequencing can be used to analyze plant genomes. To this end, we sequenced and are assembling an accession of the wild tomato species Solanum pennellii. This accession had been spuriously identified as a tomato accession. Unlike the frequently used Solanum pennellii LA716 accession, for which we have previously generated a high-quality draft genome, this new accession does not appear to exhibit any dwarfed, necrotic leaf phenotype when introgressed into modern tomato cultivars.

Here we present approximately 134 Gbases of third-generation sequencing data, representing a raw coverage of ca. 110x. Of this, 110 Gbases pass the Oxford Nanopore quality filter, representing about 90x coverage. In addition, we provide approximately 20-30x coverage of Illumina data. The average Q value given in the table below is the plain arithmetic mean of all Q values in a read (as delivered by e.g. FastQC) and is thus higher than the one reported by Oxford Nanopore.
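The coverage and Q-value statements above can be checked with a little arithmetic. The following is a minimal sketch, not the authors' pipeline: the genome size of 1.2 Gbp is an assumption inferred from the figures quoted (134 Gbases at ca. 110x) and is not stated on this page, and all function names are hypothetical. The second half contrasts a plain arithmetic mean of Phred Q values with a mean taken over error probabilities, which is assumed here to be closer to how the basecaller summarises read quality.

```python
import math

# Assumption: genome size inferred from "134 Gbases ~ 110x"; not stated in the source.
ASSUMED_GENOME_SIZE_BP = 1.2e9


def coverage(yield_bases, genome_size_bp=ASSUMED_GENOME_SIZE_BP):
    """Fold coverage = sequenced bases / genome size."""
    return yield_bases / genome_size_bp


def mean_q_arithmetic(q_values):
    """Plain arithmetic mean of per-base Phred Q values."""
    return sum(q_values) / len(q_values)


def mean_q_from_error(q_values):
    """Convert each Q to an error probability, average those, convert back to Q."""
    probs = [10 ** (-q / 10) for q in q_values]
    return -10 * math.log10(sum(probs) / len(probs))


if __name__ == "__main__":
    print("raw coverage:     ", coverage(134e9))
    print("filtered coverage:", coverage(110e9))
    toy_read = [10, 20, 30]  # made-up per-base Q values for illustration
    print("arithmetic mean Q: ", mean_q_arithmetic(toy_read))
    print("error-based mean Q:", mean_q_from_error(toy_read))
```

Under the assumed genome size, the two yields come out at roughly 112x and 92x, in line with the rounded figures quoted. For any read with varying base qualities the arithmetic mean of Q is larger than the error-based value, because low-quality bases dominate the mean error probability.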

Please cite our manuscript:-

De novo Assembly of a New Solanum pennellii Accession Using Nanopore Sequencing

The Plant Cell, Oct 2017. 

Other supplementary data:-

Data is open for use (please remember to cite the paper).

MinION data

 

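Each row below starts with three file entries for the run: FAST5 files (size in parentheses), Fastq files (original Albacore run), Fastq files (Albacore 2.1.1).

These are followed by four blocks of statistics, in this order:
Fastq (original Albacore run), unfiltered: # reads, yield (bases), average read length (RL, bases), longest read (bases), avg. Q
Fastq (original Albacore run), filtered: # reads, yield, RL average, longest read, avg. Q
Fastq (Albacore 2.1.1), unfiltered: # reads, yield, RL average, longest read, avg. Q
Fastq (Albacore 2.1.1), filtered: # reads, yield, %PF (fraction of yield passing the filter), RL average, longest read, avg. Q
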
20161027_Spenn_001_001 (512GB) 20161027_Spenn_001_001 20161027_Spenn_001_001 252,424.00 3,339,142,817.00 13,228.00 258,042.00 8.96 205,010.00 2,818,546,677.00 13,748.00 71,457.00 9.55 252,980.00 3,502,241,408.00 13,843.95 390,285.00 8.95 222,992.00 3,210,123,159.00 0.92 14,395.69 76,034.00 9.46
20161101_Spenn_002_002 (826GB) 20161101_Spenn_002_002 20161101_Spenn_002_002 439,240.00 5,183,715,529.00 11,802.00 171,384.00 8.79 340,880.00 4,202,561,007.00 12,329.00 81,554.00 9.49 439,296.00 5,461,248,451.00 12,431.82 170,901.00 8.68 365,518.00 4,755,062,637.00 0.87 13,009.11 170,901.00 9.34
20161103_Spenn_003_003 (1.1TB) 20161103_Spenn_003_003 20161103_Spenn_003_003 520,761.00 6,650,042,702.00 12,770.00 160,531.00 8.87 412,111.00 5,466,343,110.00 13,264.00 95,933.00 9.43 521,334.00 6,962,701,269.00 13,355.55 166,463.00 8.72 443,147.00 6,193,529,874.00 0.89 13,976.24 166,463.00 9.32
20161108_Spenn_004_004 (753GB) 20161108_Spenn_004_004 20161108_Spenn_004_004 431,400.00 5,252,529,782.00 12,176.00 147,621.00 8.86 343,999.00 4,360,721,187.00 12,677.00 105,470.00 9.45 431,474.00 5,540,797,850.00 12,841.56 140,235.00 8.82 371,673.00 4,973,407,918.00 0.90 13,381.14 140,235.00 9.37
20161108_Spenn_004_005 (1.1TB) 20161108_Spenn_004_005 20161108_Spenn_004_005 561,300.00 7,058,376,081.00 12,575.00 206,494.00 8.95 458,908.00 6,009,783,868.00 13,096.00 92,999.00 9.46 561,362.00 7,428,206,310.00 13,232.47 233,377.00 8.87 491,757.00 6,795,726,493.00 0.91 13,819.28 233,377.00 9.38
20161110_Spenn_005_006 (813GB) 20161110_Spenn_005_006 20161110_Spenn_005_006 380,518.00 5,364,783,960.00 14,099.00 190,799.00 8.83 298,794.00 4,460,813,385.00 14,929.00 131,801.00 9.44 380,705.00 5,653,369,695.00 14,849.74 138,150.00 8.71 322,980.00 5,104,296,304.00 0.90 15,803.75 138,150.00 9.38
20161110_Spenn_005_007 (732GB) 20161110_Spenn_005_007 20161110_Spenn_005_007 346,686.00 4,956,793,607.00 14,297.00 164,918.00 9.07 285,832.00 4,294,311,690.00 15,023.00 109,564.00 9.55 347,079.00 5,218,433,504.00 15,035.29 143,906.00 8.92 304,640.00 4,824,584,280.00 0.92 15,837.00 143,906.00 9.47
20161112_Spenn_006_008 (431GB) 20161112_Spenn_006_008 20161112_Spenn_006_008 219,392.00 2,942,797,165.00 13,413.00 131,281.00 9.10 176,739.00 2,535,756,222.00 14,347.00 89,450.00 9.63 219,571.00 3,077,195,783.00 14,014.58 123,787.00 8.86 185,767.00 2,800,253,293.00 0.91 15,074.01 89,970.00 9.59
20161112_Spenn_006_009 (732GB) 20161112_Spenn_006_009 20161112_Spenn_006_009 379,071.00 5,171,422,068.00 13,642.00 149,605.00 9.07 313,732.00 4,464,764,479.00 14,231.00 100,221.00 9.55 379,237.00 5,448,867,688.00 14,367.97 132,020.00 9.00 334,238.00 5,009,007,328.00 0.92 14,986.35 105,329.00 9.49
20161114_Spenn_007_010 (1.1TB) 20161114_Spenn_007_010 20161114_Spenn_007_010 451,070.00 6,721,699,609.00 14,901.00 198,894.00 8.74 344,128.00 5,460,999,489.00 15,869.00 100,546.00 9.40 451,270.00 7,070,634,612.00 15,668.30 148,822.00 8.53 375,419.00 6,277,227,410.00 0.89 16,720.59 148,822.00 9.28
20161114_Spenn_007_011 (211GB) 20161114_Spenn_007_011 20161114_Spenn_007_011 92,994.00 1,334,258,832.00 14,348.00 107,121.00 8.59 66,439.00 1,035,193,238.00 15,581.00 90,489.00 9.39 93,032.00 1,399,668,682.00 15,045.02 114,980.00 8.34 72,447.00 1,187,443,861.00 0.85 16,390.52 88,575.00 9.28
20161116_Spenn_009_012 (799GB) 20161116_Spenn_009_012 20161116_Spenn_009_012 523,760.00 5,360,546,441.00 10,238.00 162,037.00 9.00 430,481.00 4,558,323,604.00 10,589.00 67,428.00 9.54 523,757.00 5,650,882,208.00 10,789.13 141,897.00 8.98 459,425.00 5,129,078,435.00 0.91 11,164.13 89,621.00 9.47
20161116_Spenn_009_013 (250GB) 20161116_Spenn_009_013 20161116_Spenn_009_013 166,466.00 1,717,528,003.00 10,317.00 114,435.00 8.98 134,116.00 1,441,345,290.00 10,747.00 56,948.00 9.57 166,552.00 1,803,799,080.00 10,830.25 65,727.00 8.92 141,882.00 1,599,743,855.00 0.89 11,275.17 59,889.00 9.50
20161118_Spenn_008_014 (174GB) 20161118_Spenn_008_014 20161118_Spenn_008_014 122,505.00 1,146,492,883.00 9,358.00 146,319.00 8.96 99,017.00 961,490,600.00 9,710.00 58,788.00 9.59 123,625.00 1,200,962,979.00 9,714.56 175,642.00 8.81 105,539.00 1,080,066,667.00 0.90 10,233.82 61,659.00 9.48
20161118_Spenn_008_015 (296GB) 20161118_Spenn_008_015 20161118_Spenn_008_015 186,513.00 1,815,953,333.00 9,736.00 181,283.00 8.84 148,452.00 1,495,777,948.00 10,075.00 62,177.00 9.50 186,910.00 1,902,441,366.00 10,178.38 131,748.00 8.80 159,309.00 1,696,083,933.00 0.89 10,646.50 66,511.00 9.41
20161121_Spenn_011_016 (727GB) 20161121_Spenn_011_016 20161121_Spenn_011_016 328,053.00 4,215,219,836.00 12,849.00 161,459.00 8.84 260,702.00 3,485,323,522.00 13,368.00 81,602.00 9.50 328,267.00 4,414,265,051.00 13,447.18 137,344.00 8.79 280,747.00 3,952,418,629.00 0.90 14,078.22 120,499.00 9.41
20161121_Spenn_011_017 (423GB) 20161121_Spenn_011_017 20161121_Spenn_011_017 203,146.00 2,594,558,262.00 12,771.00 153,016.00 8.79 155,876.00 2,105,331,803.00 13,506.00 88,254.00 9.50 203,279.00 2,720,576,015.00 13,383.46 238,936.00 8.66 167,707.00 2,386,049,312.00 0.88 14,227.49 105,080.00 9.40
20161123_Spenn_012_018 (650GB) 20161123_Spenn_012_018 20161123_Spenn_012_018 327,709.00 3,870,110,250.00 11,809.00 180,596.00 8.76 254,308.00 3,133,636,771.00 12,322.00 85,535.00 9.49 328,350.00 4,036,113,614.00 12,292.11 129,774.00 8.68 273,662.00 3,544,343,780.00 0.88 12,951.54 90,413.00 9.39
20161123_Spenn_012_019 (896GB) 20161123_Spenn_012_019 20161123_Spenn_012_019 499,116.00 5,886,334,951.00 11,793.00 167,972.00 8.87 392,448.00 4,878,338,636.00 12,430.00 118,685.00 9.51 499,824.00 6,171,272,387.00 12,346.89 145,910.00 8.73 419,542.00 5,488,400,148.00 0.89 13,081.88 88,850.00 9.41
20161123_Spenn_012_020 (646GB) 20161123_Spenn_012_020 20161123_Spenn_012_020 370,723.00 4,223,196,504.00 11,391.00 181,030.00 8.92 290,213.00 3,500,334,970.00 12,061.00 90,173.00 9.54 371,609.00 4,443,108,401.00 11,956.41 108,644.00 8.79 316,526.00 4,017,146,779.00 0.90 12,691.36 96,971.00 9.44
20161125_Spenn_010_021 (564GB) 20161125_Spenn_010_021 20161125_Spenn_010_021 416,857.00 3,459,880,516.00 8,299.00 150,499.00 8.88 335,613.00 2,913,698,492.00 8,681.00 56,044.00 9.46 417,487.00 3,627,019,345.00 8,687.74 169,514.00 8.72 356,707.00 3,266,344,223.00 0.90 9,156.94 169,514.00 9.34
20161130_Spenn_014_022 (277GB) 20161130_Spenn_014_022 20161130_Spenn_014_022 144,680.00 1,763,487,955.00 12,188.00 188,526.00 8.95 115,211.00 1,490,693,758.00 12,938.00 77,113.00 9.52 144,841.00 1,840,563,228.00 12,707.47 108,006.00 8.75 122,432.00 1,661,746,134.00 0.90 13,572.81 80,884.00 9.44
20161130_Spenn_014_023 (559GB) 20161130_Spenn_014_023 20161130_Spenn_014_023 249,427.00 3,061,683,900.00 12,274.00 169,609.00 8.64 186,877.00 2,447,344,412.00 13,096.00 93,030.00 9.39 249,648.00 3,180,153,904.00 12,738.55 107,542.00 8.43 203,528.00 2,811,421,534.00 0.88 13,813.44 98,174.00 9.27
20161130_Spenn_014_024 (851GB) 20161130_Spenn_014_024 20161130_Spenn_014_024 395,151.00 5,042,395,024.00 12,760.00 230,409.00 8.61 294,534.00 3,946,433,116.00 13,398.00 78,966.00 9.41 396,516.00 5,297,048,958.00 13,358.98 358,713.00 8.54 326,870.00 4,604,556,443.00 0.87 14,086.81 124,229.00 9.26
20161202_Spenn_016_025 (784GB) 20161202_Spenn_016_025 20161202_Spenn_016_025 371,137.00 4,578,018,827.00 12,335.00 192,749.00 8.63 280,153.00 3,604,968,054.00 12,867.00 109,370.00 9.41 372,149.00 4,791,498,491.00 12,875.22 149,406.00 8.50 303,673.00 4,150,146,602.00 0.87 13,666.50 149,406.00 9.26
20161202_Spenn_016_026 (827GB) 20161202_Spenn_016_026 20161202_Spenn_016_026 431,745.00 5,333,852,075.00 12,354.00 202,965.00 8.68 326,658.00 4,214,255,431.00 12,901.00 153,099.00 9.43 432,622.00 5,622,615,652.00 12,996.60 672,474.00 8.74 365,112.00 4,939,809,559.00 0.88 13,529.57 99,452.00 9.35
20161202_Spenn_016_027 (949GB) 20161202_Spenn_016_027 20161202_Spenn_016_027 484,175.00 6,103,006,087.00 12,604.00 305,986.00 8.68 371,827.00 4,859,413,464.00 13,069.00 79,985.00 9.42 485,163.00 6,430,996,802.00 13,255.33 374,023.00 8.72 410,406.00 5,645,415,708.00 0.88 13,755.69 111,014.00 9.31
20161204_Spenn_015_028 (512GB) 20161204_Spenn_015_028 20161204_Spenn_015_028 251,396.00 2,705,390,485.00 10,761.00 140,518.00 8.58 182,745.00 2,111,380,840.00 11,553.00 83,721.00 9.42 253,480.00 2,798,547,416.00 11,040.51 117,952.00 8.24 197,682.00 2,416,371,861.00 0.86 12,223.53 83,321.00 9.29
20161204_Spenn_015_029 (600GB) 20161204_Spenn_015_029 20161204_Spenn_015_029 317,257.00 3,566,412,110.00 11,241.00 253,497.00 8.76 244,735.00 2,884,284,143.00 11,785.00 97,690.00 9.51 320,491.00 3,742,222,498.00 11,676.53 766,390.00 8.58 264,535.00 3,283,863,360.00 0.88 12,413.72 127,189.00 9.38
20161206_Spenn_017_030 (1.2TB) 20161206_Spenn_017_030 20161206_Spenn_017_030 1,133,214.00 7,324,445,582.00 6,463.00 163,606.00 8.89 908,610.00 6,019,467,101.00 6,625.00 94,944.00 9.55 1,133,449.00 7,694,792,575.00 6,788.83 144,140.00 8.89 970,352.00 6,779,404,141.00 0.88 6,986.54 118,872.00 9.46
20161206_Spenn_017_031 (1.1TB) 20161206_Spenn_017_031 20161206_Spenn_017_031 1,009,641.00 7,161,552,200.00 7,093.00 228,704.00 8.87 798,551.00 5,802,073,420.00 7,266.00 108,965.00 9.59 1,010,016.00 7,572,683,619.00 7,497.59 573,773.00 9.00 869,908.00 6,682,822,091.00 0.88 7,682.22 185,132.00 9.53
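
The derived columns in the table above (average read length and %PF) follow from the read counts and yields. A minimal sketch, using the Albacore 2.1.1 values of run 20161027_Spenn_001_001 copied from the table; the function names are hypothetical, and reading %PF as the yield-based pass fraction is an inference from the numbers rather than something the page states:

```python
def mean_read_length(yield_bases, n_reads):
    """Average read length = total bases / number of reads."""
    return yield_bases / n_reads


def pass_fraction(filtered_yield, unfiltered_yield):
    """Fraction of the sequenced bases that pass the quality filter."""
    return filtered_yield / unfiltered_yield


# Values copied from the table row for 20161027_Spenn_001_001 (Albacore 2.1.1).
UNFILTERED_READS = 252_980
UNFILTERED_YIELD = 3_502_241_408
FILTERED_READS = 222_992
FILTERED_YIELD = 3_210_123_159

if __name__ == "__main__":
    print(round(mean_read_length(UNFILTERED_YIELD, UNFILTERED_READS), 2))
    print(round(mean_read_length(FILTERED_YIELD, FILTERED_READS), 2))
    print(round(pass_fraction(FILTERED_YIELD, UNFILTERED_YIELD), 2))
```

The three printed values can be compared against the RL average and %PF entries of the same row.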

 

MiSeq data

SpnLY-PF55-MS01-01-1_S1_L001_R1_001
SpnLY-PF55-MS01-01-1_S1_L001_R2_001
SpnLY-PF55-MS01-01-2_S1_L001_R1_001
SpnLY-PF55-MS01-01-2_S1_L001_R2_001
SpnLY-PF55-MS01-01-3_S1_L001_R1_001
SpnLY-PF55-MS01-01-3_S1_L001_R2_001

 

 

NextSeq data used for independent error correction

SpnLY-PF50-MS02-01-1_S1_L001_R1_001
SpnLY-PF50-MS02-01-1_S1_L001_R2_001
SpnLY-PF50-MS02-01-1_S1_L002_R1_001
SpnLY-PF50-MS02-01-1_S1_L002_R2_001
SpnLY-PF50-MS02-01-1_S1_L003_R1_001
SpnLY-PF50-MS02-01-1_S1_L003_R2_001
SpnLY-PF50-MS02-01-1_S1_L004_R1_001
SpnLY-PF50-MS02-01-1_S1_L004_R2_001

NextSeq data for accession LA2963

SpnRC-PF50-MS04-01-01_S1_L001_R1_001
SpnRC-PF50-MS04-01-01_S1_L002_R1_001
SpnRC-PF50-MS04-01-01_S1_L003_R1_001
SpnRC-PF50-MS04-01-01_S1_L004_R1_001
SpnRC-PF50-MS05-10-01_S2_L001_R1_001
SpnRC-PF50-MS05-10-01_S2_L002_R1_001
SpnRC-PF50-MS05-10-01_S2_L003_R1_001
SpnRC-PF50-MS05-10-01_S2_L004_R1_001
SpnRC-PF50-MS06-20-01_S3_L001_R1_001
SpnRC-PF50-MS06-20-01_S3_L002_R1_001
SpnRC-PF50-MS06-20-01_S3_L003_R1_001
SpnRC-PF50-MS06-20-01_S3_L004_R1_001

Assemblies

Assembly                  N50   L50  Total size  Largest contig  Total contigs  Illumina mapping rate (%)  Qualimap discrepancy rate (%)  Complete BUSCO (%)
Canu (929MB)              1.55  169  961.83      10.01           2010           98.95                      0.82                           96.46
SMARTdenovo (923MB)       1.06  270  955.31      5.84            1901           98.99                      0.91                           96.11
Miniasm (945MB)           1.75  156  977.78      9.49            2704           98.24                      2.48                           85.69
CanuSMARTdenovo (885MB)   2.52  106  915.60      12.72           899            98.98                      0.85                           96.46

All sequence lengths are in Mbp.

All assemblies were polished with five rounds of Pilon.
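
For readers unfamiliar with the N50 and L50 columns, the sketch below shows the usual definition: N50 is the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half of the total assembly size, and L50 is the number of contigs needed to get there. The function name and the toy contig lengths are made up for illustration and are not taken from the assemblies above.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50 contig; 'count' contigs were needed (L50).
            return length, count
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    toy_contigs = [10, 8, 6, 4, 2]  # hypothetical lengths, total 30
    print(n50_l50(toy_contigs))
```

With the toy lengths, half of the total is 15, which is reached after the two longest contigs (10 + 8), so the function returns an N50 of 8 and an L50 of 2.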

 

Acknowledgement

We want to acknowledge partial funding through the Federal Ministry of Education and Research (0315961, 031A053, and 031A536C), the Ministry of Innovation, Science, and Research within the framework of the NRW Strategieprojekt BioSC (313/323-400-002 13), the Deutsche Forschungsgemeinschaft (Grants US98/7-1 and FE552/29-1) within ERACAPS Regulatome, and support for large equipment from Deutsche Forschungsgemeinschaft (Grossgeräte NextSeq LC-MS) and France Génomique (ANR-10-INBS-09). D.Z. was supported by the Horizon-2020 Grant G2P-SOL (677379). S.K. was supported in part by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health. This work utilized the computational resources of the NIH HPC Biowulf cluster (https://hpc.nih.gov).