Mercator is a tool for batch classification of protein or gene sequences into MapMan functional plant categories. Because many MapMan categories cover metabolic pathways and enzyme functions, the pipeline can be used to establish a draft metabolic network, especially after manual correction of the automatically derived classification (see e.g. May et al. 2008).
To classify a sequence, Mercator performs:

The results of the individual searches are then weighted by reliability. For example, Uniref receives a low reliability because all proteins from Uniref90 are classified based on keywords only. The classifications with the highest reliability are retained.

Current statistics (December 08):

  • Accuracy ca. 90%
  • Domains and families > 1000
  • Proteins and genes > 30,000
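
The reliability weighting described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Hit` record, the `RELIABILITY` table, its weight values, and the 0..1 score scale are hypothetical and are not taken from Mercator itself. The sketch only shows the general idea of scaling each search result by the reliability of its source and keeping the best-weighted classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result (hypothetical structure, for illustration only)."""
    source: str    # reference database label, e.g. "TAIR" or "UNIREF"
    bincode: str   # MapMan bin code proposed by this search
    score: float   # search score, assumed to be normalised to 0..1


# Hypothetical reliability weights; Mercator's real values are not given here.
RELIABILITY = {"TAIR": 1.0, "PPAP": 0.9, "UNIREF": 0.3}


def best_bins(hits, reliability=RELIABILITY):
    """Return the bincode(s) with the highest reliability-weighted score."""
    weighted = {}
    for hit in hits:
        # Unknown sources get weight 0.0, so they never win on their own.
        value = reliability.get(hit.source, 0.0) * hit.score
        weighted[hit.bincode] = max(weighted.get(hit.bincode, 0.0), value)
    if not weighted:
        return []
    top = max(weighted.values())
    # Ties are all returned, sorted for a stable result.
    return sorted(b for b, v in weighted.items() if v == top)


if __name__ == "__main__":
    hits = [Hit("UNIREF", "1.1", 0.9), Hit("TAIR", "2.2", 0.5)]
    print(best_bins(hits))
```

In this example the keyword-based hit has the higher raw score, but its low reliability weight means the hit from the curated source is retained.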

TAIR:   TAIR Release 10
PPAP:   SwissProt/UniProt Plant Proteins
CHLAMY:   JGI Chlamy release 4 Augustus models
ORYZA:   TIGR5 rice proteins
KOG:   Clusters of orthologous eucaryotic genes database (KOG)
CDD:   Use conserved domain database
IPR:   Include InterPro scan (long runtime)
MULTIPLE:   Allow multiple bin assignments
CONSERVATIVE:   Consider the "unassigned" bin with equal weight when assigning bincodes.
ANNOTATE:   Append database annotation to mapping
IS_DNA:   Sequence file contains DNA sequence
A maximum of 30 * 10^6 symbols (nucleotides or amino acids) may be uploaded in FASTA format.
If you would like to submit a larger data set, please contact us.
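
Before uploading, the size of a FASTA file can be checked locally. The following is a minimal Python sketch, not part of Mercator: the function names are hypothetical, and it assumes that only sequence characters count towards the limit (header lines starting with `>` and whitespace are ignored) and that the limit is inclusive.

```python
MAX_SYMBOLS = 30 * 10**6  # upload limit stated above


def count_fasta_symbols(lines):
    """Count sequence symbols in FASTA lines, skipping headers and whitespace."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # blank line or header line
        # Remove any whitespace inside the sequence line before counting.
        total += len("".join(line.split()))
    return total


def within_upload_limit(path, limit=MAX_SYMBOLS):
    """Return True if the FASTA file at `path` has at most `limit` symbols."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return count_fasta_symbols(handle) <= limit
```

For example, `within_upload_limit("proteins.fasta")` (with a file name of your own) returns `False` for a data set that is too large to submit directly.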