Job submission for protein annotation
Include Swissprot annotations | additional protein description by Blast sequence comparison to Swissprot database |
Include Prot-scriber annotations | additional protein description by Prot-scriber tool |
list with simple statistics on how many of the protein sequences were successfully categorized
Submitted seqs (S) | total number of user-submitted sequences |
Classified seqs (C) | protein sequences put into Mercator4 protein categories |
Annotated seqs (A) | sum of classified seqs (C) and sequences assigned to the pseudo-category BIN-35.1 (see ) |
Occupied BINs (O) | occupied Mercator4 protein categories for which the user-submitted set contains matching sequences |
BINs available (B) | total number of all true Mercator4 protein categories (no pseudo-categories) |
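The five statistics above can be derived from a per-sequence list of BIN assignments. The following is a minimal sketch, not the Mercator4 implementation: the input mapping (sequence ID to a dotted BIN code such as "1.2.3" or "35.1") and the function name `summary_statistics` are assumptions made for illustration.

```python
from typing import Dict


def summary_statistics(assignments: Dict[str, str], bins_available: int) -> Dict[str, int]:
    """Compute S, C, A, O and B from a hypothetical {sequence_id: bin_code} mapping.

    bin_code is assumed to be a dotted string such as "1.2.3", "35.1" or "35.2".
    bins_available (B) is supplied by the caller, since it depends on the framework version.
    """
    submitted = len(assignments)

    # Classified sequences sit in a true category, i.e. not under pseudo-category 35.
    classified_bins = [b for b in assignments.values() if b.split(".")[0] != "35"]
    classified = len(classified_bins)

    # Annotated = classified + sequences in pseudo-category BIN-35.1.
    other_annotation = sum(1 for b in assignments.values() if b == "35.1")
    annotated = classified + other_annotation

    # Occupied BINs = distinct true categories hit by at least one sequence.
    occupied = len(set(classified_bins))

    return {"S": submitted, "C": classified, "A": annotated, "O": occupied, "B": bins_available}


if __name__ == "__main__":
    example = {"p1": "1.1.1", "p2": "1.1.1", "p3": "2.3", "p4": "35.1", "p5": "35.2"}
    print(summary_statistics(example, bins_available=10))
```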
bar chart summarising the protein assignments across the top-level context descriptions
Each bar represents a top-level context description and the percentage of its protein categories occupied by at least one protein from the submitted protein sequences.
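The percentage shown by each bar can be illustrated as follows. This is a sketch under assumptions: the list of all leaf-level BIN codes and the list of occupied BIN codes are hypothetical inputs, and the function name is made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable


def occupancy_by_top_level(all_bins: Iterable[str], occupied_bins: Iterable[str]) -> Dict[str, float]:
    """Return, per top-level number, the percentage of its categories that are occupied.

    Both arguments are assumed to hold dotted BIN codes such as "1.2.3".
    """
    occupied_set = set(occupied_bins)
    total = defaultdict(int)
    occupied = defaultdict(int)

    for code in all_bins:
        top = code.split(".")[0]  # first number = top-level category
        total[top] += 1
        if code in occupied_set:
            occupied[top] += 1

    return {top: 100.0 * occupied[top] / total[top] for top in total}


if __name__ == "__main__":
    framework = ["1.1", "1.2", "2.1", "2.2", "2.3", "2.4"]
    print(occupancy_by_top_level(framework, ["1.1", "2.1"]))
```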
bar chart displaying the distribution of protein lengths based on the differences to category-specific reference lengths
Each bar represents the number of proteins having a certain length difference to the reference length of the corresponding Mercator4 category.
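A length-difference distribution of this kind could be computed as sketched below. The bin width of 10 residues, the input layout and the function name are assumptions for illustration; the binning actually used by the online chart is not stated here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def length_difference_histogram(
    proteins: Iterable[Tuple[int, str]],
    reference_lengths: Dict[str, int],
    bin_width: int = 10,
) -> Dict[int, int]:
    """Count proteins per length-difference bin.

    proteins: (protein_length, assigned_category) pairs (hypothetical layout).
    reference_lengths: category -> reference length (hypothetical values).
    Keys of the result are the lower edges of the difference bins.
    """
    counts = Counter()
    for length, category in proteins:
        diff = length - reference_lengths[category]
        # Floor division keeps negative differences in the correct bin (e.g. -5 -> -10).
        lower_edge = (diff // bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    refs = {"1.1": 100, "2.1": 200}
    data = [(105, "1.1"), (95, "1.1"), (100, "1.1"), (230, "2.1")]
    print(length_difference_histogram(data, refs))
```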
download of protein annotation files for further processing on your local computer
mercator4_result_data_fasta.zip | FASTA-format that contains the protein annotations |
mercator4_result_data.zip | specific tabular format that is required for the and the MapMan desktop application |
Visualize result in tree viewer
The TreeViewer shows the protein categorization visualized as hierarchical tree with annotation context descriptions as branch nodes and protein categories as leaf nodes.
Visualize result in heatmap viewer
The HeatmapViewer displays the comparison of two protein sets with protein categories as spots colored according to the comparison outcome.
Online legacy protein annotation tool
Mercator version 3.6 is an older release of the context-based annotation approach based on a different annotation framework (see publication). The Mercator version 3.6 online tool is still available but is no longer actively maintained.
Publication
Lohse et al. (2014)
Mercator: a fast and simple web server for genome scale functional annotation of plant sequence data
Plant Cell Environ. 2014 May, 37(5): 1250-1258.
Online legacy protein annotation tool
Although it is recommended to use the latest version of Mercator4, it is possible to submit sequences to legacy versions. Please note that the available older Mercator4 versions no longer support the online tools and .
Online enrichment analysis of protein categories
The online Mercator4 enrichment analysis identifies protein classes that are over- or under-represented within the full set of Mercator4 protein categories (BINs). The method uses statistical approaches to identify significantly enriched or depleted groups of protein categories.
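The statistical method is not specified above. As one common way to test for over- or under-representation, the sketch below uses a hypergeometric test; this choice, the function names and the parameter layout are assumptions for illustration and may differ from what the online tool does.

```python
from math import comb


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k): probability of seeing at least k category members in the subset.

    N: proteins in the background set, K: of those in the category,
    n: proteins in the subset of interest, k: of those in the category.
    A small value suggests enrichment.
    """
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def hypergeom_lower_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X <= k): a small value suggests depletion."""
    total = comb(N, n)
    upper = min(k, K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(0, upper + 1)) / total


if __name__ == "__main__":
    # Hypothetical numbers: 40 of 1000 background proteins are in the category,
    # and 12 of the 100 proteins of interest are in it.
    print(hypergeom_upper_tail(k=12, K=40, n=100, N=1000))
    print(hypergeom_lower_tail(k=12, K=40, n=100, N=1000))
```

When many categories are tested at once, a multiple-testing correction would normally be applied on top of the raw p-values.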
When an enrichment analysis finishes, a tabular output is generated and displayed. It shows the Mercator4 protein categories found to be enriched or depleted along with a description. Click on the Download CSV button to download the table.
Below the table, the same information is visualized as an interactive BIN tree diagram with nodes colored according to an enrichment
Online validation of FASTA-formatted sequences
The FASTA-format is a text-based format for representing protein or nucleotide sequences. The FASTA validator allows users to test the FASTA-format of a sequence file before submitting it to Mercator4. Each record in the FASTA-formatted file will be validated and all records not supported by Mercator4 will be listed. Optionally, the user can check the Create Mercator4-valid FASTA file checkbox and download a Mercator4-valid version of the file with all records containing errors removed.
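The record-by-record check described above can be approximated locally. The sketch below is not the Mercator4 validator: the accepted alphabet (the 20 standard amino-acid letters) and the three error conditions are assumptions for illustration, and the rules actually enforced by Mercator4 may be broader or stricter.

```python
from typing import List, Set, Tuple

# Assumption: only the 20 standard amino-acid letters are accepted in this sketch.
ASSUMED_ALPHABET: Set[str] = set("ACDEFGHIKLMNPQRSTVWY")

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(text: str) -> List[Record]:
    """Split FASTA text into (header, sequence) records; lines before the first header are ignored."""
    records: List[Record] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def validate(records: List[Record], alphabet: Set[str] = ASSUMED_ALPHABET):
    """Return (valid_records, errors); errors is a list of (header, message)."""
    valid, errors = [], []
    for header, seq in records:
        unsupported = set(seq.upper()) - alphabet
        if not header:
            errors.append((header, "empty header"))
        elif not seq:
            errors.append((header, "empty sequence"))
        elif unsupported:
            errors.append((header, "unsupported characters: " + "".join(sorted(unsupported))))
        else:
            valid.append((header, seq))
    return valid, errors


def to_fasta(records: List[Record]) -> str:
    """Write the records back as FASTA text, one sequence line per record."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


if __name__ == "__main__":
    text = ">a\nMKV\nLLA\n>b\nMK1Z\n>c\n"
    ok, bad = validate(parse_fasta(text))
    print(to_fasta(ok), bad)
```

Writing only the valid records back with `to_fasta` mirrors the idea of downloading a cleaned file with the erroneous records removed.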
Requirements for a FASTA-formatted file
General requirements for a FASTA-formatted file
Single-letter codes for protein sequences supported in Mercator4
Mercator4 - an online protein annotation tool
Mercator4 is an online tool to assign functional annotations to protein sequences of land plants (including flowering plants, ferns, horsetails, mosses, liverworts, and hornworts). Mercator4 can also annotate highly conserved proteins among the green algae groups of Archaeplastida. The results from user-submitted protein sequences can be visualized online and/or downloaded for further analysis.
The Mercator4 functional annotations are designed as a hierarchical framework ("Mapman4 framework") with each child node term being more specialised than its parent node term. The framework has 31 top-level categories (see figure above) which end with the protein categories at the leaf-level. Protein sequences are only assigned to leaf-level categories but the annotation is based on the full hierarchical path including all levels.
Protein pseudo-categories
A protein's context and category are depicted as a hierarchical number. The first number of the hierarchy refers to one of the 31 Mercator4 top-level categories (see list in top figure). Protein sequences that cannot be categorized by Mercator4 are assigned by default to the top-level protein pseudo-category BIN-35 "no Mercator4 annotation" (the pseudo-category BIN-35 was introduced in an old framework developed for the MapMan desktop application, Thimm et al. 2004).
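The hierarchical numbering can be handled with a few string operations. The sketch below assumes BIN codes are written as dot-separated integers (for example "1.2.3" or "35.1"); the helper names are hypothetical.

```python
from typing import List


def top_level(code: str) -> int:
    """Return the first number of a dotted BIN code, i.e. the top-level category."""
    return int(code.split(".")[0])


def is_pseudo_category(code: str) -> bool:
    """True for codes under the pseudo-category BIN-35 ("no Mercator4 annotation")."""
    return top_level(code) == 35


def hierarchy_path(code: str) -> List[str]:
    """Return every level from the top-level category down to the code itself."""
    parts = code.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    print(top_level("50.1.2"))
    print(is_pseudo_category("35.1"))
    print(hierarchy_path("1.2.3"))
```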
For a standard plant proteome, approximately 55% to 60% of the predicted protein sequences can be categorized by Mercator4. An option to increase the protein annotation rate is the annotation tool ProtScriber v.0.1.3 (Eiteneuer and Hallab, unpublished, available on GitHub). Another option is the alignment tool Blast by which the Swiss-Prot protein annotation of a similar protein is selected (Swiss-Prot dataset of Viridiplantae proteins). For an average plant proteome, ProtScriber and Swiss-Prot annotations are available for more than 60% of all the plant protein sequences, but most protein descriptions are less specific than Mercator4 protein categories.
Mercator4 protein (pseudo-)category | Mercator4 protein category | other protein description |
---|---|---|
BIN-1 .. BIN-30 or BIN-50 | yes | yes or no |
BIN-35.1 (no Mercator4 annotation.other annotation available) | no | yes |
BIN-35.2 (no Mercator4 annotation.no other annotation) | no | no |
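The decision logic expressed by this table can be written out as a small function. This is an illustrative sketch only; the function name and the use of `None` for "not available" are assumptions.

```python
from typing import Optional


def assign_bin(mercator_bin: Optional[str], other_description: Optional[str]) -> str:
    """Pick the (pseudo-)category for one protein.

    mercator_bin: dotted code of a true Mercator4 category, or None if none was found.
    other_description: an additional description (e.g. from Prot-scriber or Swiss-Prot), or None.
    """
    if mercator_bin:
        # A true Mercator4 category always wins, with or without another description.
        return mercator_bin
    if other_description:
        return "35.1"  # no Mercator4 annotation, other annotation available
    return "35.2"      # no Mercator4 annotation, no other annotation


if __name__ == "__main__":
    print(assign_bin("1.2.3", None))
    print(assign_bin(None, "hypothetical description"))
    print(assign_bin(None, None))
```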
Protein function annotation results
When a Mercator4 job finishes, an overview of the results is displayed as
The Mercator4 protein annotation results can be downloaded for
The protein annotations can also be visualized by two interactive online tools
Mercator4 updates
The hierarchical framework for Mercator4 is regularly updated and extended. For details about the version history see the . Although it is strongly recommended to use the latest version of Mercator4, it is also possible to submit sequences to legacy versions .
Contact & publications
For any questions and suggestions, please feel free to contact us (plabipd@fz-juelich.de).
Mercator4 v.7 (October 2024)
Can I submit a FASTA file containing both DNA and protein sequences?
No, this will result in an error. A FASTA file submitted to Mercator4 must contain either DNA sequences only or protein sequences only.
Very few of my sequences are assigned to functional BINs. Why?
Verify that you have selected the correct sequence type (DNA or Protein) before submitting the Mercator4 job. If you submit DNA sequences but specify the sequence type as Protein, no error will be generated, but very few sequences are likely to be assigned to functional BINs. If you are sure that you have selected the correct sequence type, verify that you have submitted gene sequences with introns removed. Mercator4 is designed for land plant protein annotations: if you submit sequences from non-plant organisms, the classification and annotation rate will likely be low.
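A quick local check can help catch a wrong sequence-type selection before submission. The heuristic below is an assumption made for illustration, not a check performed by Mercator4: it flags a sequence as DNA-like when at least 90% of its letters are A, C, G, T, U or N.

```python
def looks_like_dna(sequence: str, threshold: float = 0.9) -> bool:
    """Heuristically guess whether a sequence is nucleotide rather than protein.

    The letter set and the 0.9 threshold are arbitrary choices for this sketch.
    """
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        return False
    nucleotide_like = sum(1 for c in letters if c in "ACGTUN")
    return nucleotide_like / len(letters) >= threshold


if __name__ == "__main__":
    print(looks_like_dna("ATGGCGTTAGCNN"))
    print(looks_like_dna("MKVLLAGSTW"))
```

Short protein sequences that happen to consist mostly of A, C, G and T residues would be misclassified, so the result should be treated as a hint only.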
I get an error that my sequences are incompatible with Mercator4 - what can I do?
Mercator4 has been upgraded to accept a number of ambiguous protein sequences. However, a sequence still has to meet certain criteria. To validate your sequences for Mercator4, you can check your FASTA file with the FASTA validator, which gives a detailed report and can generate a Mercator4-valid FASTA file with the offending records removed.
I get an Unknown Error (or Internal Error or Server Error) when running my job. What does this mean?
We try to handle every error scenario and provide a detailed description of why the job failed. If you experience such an error, please send an email to plabipd@gmail.com with the 'JOB ID' (starts with GFA-XXXXXXXX).
I ran my sequences on Mercator4 six months ago, but now the version has changed. Can I run my sequences against an older version?
Yes. Sequences can be submitted to legacy versions, which allows users to run against older versions of Mercator4.
My job has been queued for hours. Is it really running?
This cluster is capable of running many jobs in parallel, but it can still be overloaded if many users submit jobs simultaneously. If your job has been queued for hours, submitting the same job again will not speed up the process. If your job has not completed after 4 hours, please contact us at plabipd@gmail.com and provide the 'JOB ID'.
My browser crashed while running a job, and now I cannot access my job any more. What can I do?
As we do not require users to log in to submit a job, the only way we have to track your job is using a 'browser session'. If your browser has crashed, a new session is created and the link to your jobs is lost. However, if you entered an email address when you submitted the job, you will still be notified (along with a link to the results) when the job has finished. If you did not enter an email address but have taken a note of the JOB ID, you can email us at plabipd@gmail.com to get the results.