Functional Plant Biology Functional Plant Biology Society
Plant function and evolutionary biology
RESEARCH ARTICLE

Survey sequencing of soybean elucidates the genome structure, composition and identifies novel repeats

Andrew Nunberg A, Joseph A. Bedell A, Mohammad A. Budiman A, Robert W. Citek A, Sandra W. Clifton B, Lucinda Fulton B, Deana Pape B, Zheng Cai C, Trupti Joshi C D, Henry Nguyen E F, Dong Xu C D E and Gary Stacey D E F G H

A Orion Genomics, LLC, 4041 Forest Park Ave, St Louis, MO 63108, USA.

B Genome Sequencing Center, School of Medicine, Washington University, St Louis, MO 63130, USA.

C Computer Science Department, University of Missouri, Columbia, MO 65211, USA.

D Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

E National Center for Soybean Biotechnology, University of Missouri, Columbia, MO 65211, USA.

F Division of Plant Science, University of Missouri, Columbia, MO 65211, USA.

G Division of Biochemistry, Department of Molecular Microbiology and Immunology, University of Missouri, Columbia, MO 65211, USA.

H Corresponding author. Email: staceyg@missouri.edu

I This paper originates from a presentation at the Third International Conference on Legume Genomics and Genetics, Brisbane, Queensland, Australia, April 2006.

Functional Plant Biology 33(8) 765-773 https://doi.org/10.1071/FP06106
Submitted: 27 April 2006  Accepted: 24 May 2006   Published: 2 August 2006

Abstract

To expand our knowledge of the soybean genome and to create a useful DNA repeat sequence database, over 24 000 DNA fragments from a soybean [Glycine max (L.) Merr.] cv. Williams 82 genomic shotgun library were sequenced. Additional sequences came from over 29 000 bacterial artificial chromosome (BAC) end sequences derived from a BstI library of the cv. Williams 82 genome. Analysis of these sequences identified 348 different DNA repeats, many of which appear to be novel. To extend the utility of the work, a pilot study was also conducted using methylation filtration to estimate the size of the hypomethylated soybean gene space. A comparison between 8366 sequences obtained from a filtered library and 23 788 sequences from an unfiltered library indicates a gene enrichment of ~3.2-fold in the hypomethylated sequences. Given the 1.1-Gb soybean genome, our analysis predicts a hypomethylated, gene-rich space of ~343 Mb.
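The arithmetic behind the final estimate can be sketched as follows. The abstract does not state the formula, so this assumes the gene-rich space is simply the genome size divided by the fold enrichment; the function name and the 1100-Mb figure (1.1 Gb expressed in Mb) are illustrative choices, not taken from the paper.

```python
def hypomethylated_space_mb(genome_size_mb: float, fold_enrichment: float) -> float:
    """Estimate the hypomethylated, gene-rich space in Mb.

    Assumption (not stated in the abstract): if filtration enriches genes
    k-fold, the filtered fraction represents roughly 1/k of the genome.
    """
    if fold_enrichment <= 0:
        raise ValueError("fold_enrichment must be positive")
    return genome_size_mb / fold_enrichment


# Values from the abstract: 1.1-Gb genome, ~3.2-fold gene enrichment.
estimate = hypomethylated_space_mb(1100.0, 3.2)
print(f"Estimated gene-rich space: {estimate:.2f} Mb")
```

Under this assumption the result is about 344 Mb, close to the reported ~343 Mb; the small difference presumably reflects rounding of the enrichment value quoted in the abstract.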


Acknowledgments

Research was supported by a grant (to GS and SWC) from the National Science Foundation (grant DBI-0417357) and by the Missouri Soybean Merchandising Council.

