Overview
PepSweetener is a software tool that facilitates the manual annotation of intact glycopeptide spectra by providing a detailed visualization of all theoretical glycopeptides that match the molecular mass of the queried precursor ion. Ion matching can be performed in two modes: simple and advanced search.
Simple Search
In the simple search mode, precursor ions are matched against all combinations of tryptic peptides containing N-glycosites in the human proteome (1 missed cleavage allowed) and all human N-glycans reported in the October 2017 release of GlyConnect, extended in-house by manually mining and curating published articles. This represents ~150 million combinations. Several filters are available to limit the search space. Please note that, due to its size, the dataset is stored on one of the ExPASy servers.
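As an illustration of what "tryptic peptides with 1 missed cleavage" means, here is a minimal sketch of an in silico digest. It assumes the common rule that trypsin cleaves after K or R unless the next residue is P. The function name `tryptic_digest` is hypothetical and is not part of PepSweetener, whose actual digestion rules may differ.

```python
import re


def tryptic_digest(sequence: str, missed_cleavages: int = 1) -> list[str]:
    """Return peptides from a simplified tryptic digest.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    """
    # Zero-width split after K/R (not before P); drop the empty trailing piece.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


if __name__ == "__main__":
    print(tryptic_digest("MKNGSRPTK"))
```

In this toy sequence the R is followed by P, so no cleavage occurs there and only one cleavage site remains.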
Search Parameters
The input is a list of hypothetical intact N-glycopeptide masses or precursor ions. The query list is a comma-separated series of experimental masses or ions. For ions, the charge state is given in brackets after the value, e.g. 999.99(+3).
The tolerance (±), the error window on experimental glycopeptide mass values, is specified in parts per million (ppm) or dalton (Da) units.
Here is a sample input list:
910.386(+3), 1393.6226(+2), 2887.2452
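The query format can be sketched as follows. The parser and the conversion to a neutral mass are illustrative only: `parse_query` and `neutral_mass` are hypothetical names, and the conversion assumes positively charged, protonated ions ([M+zH]z+), which the tool's documentation does not state explicitly.

```python
import re
from typing import Optional

PROTON_MASS = 1.007276  # Da

# A number, optionally followed by a charge such as "(+3)".
_ENTRY = re.compile(r"([0-9]*\.?[0-9]+)(?:\(\+(\d+)\))?")


def parse_query(query: str) -> list[tuple[float, Optional[int]]]:
    """Split a comma-separated query into (value, charge) pairs.

    charge is None when the entry is a plain mass without a charge state.
    """
    entries = []
    for token in query.split(","):
        token = token.strip()
        if not token:
            continue
        match = _ENTRY.fullmatch(token)
        if match is None:
            raise ValueError(f"Unrecognized entry: {token!r}")
        charge = int(match.group(2)) if match.group(2) else None
        entries.append((float(match.group(1)), charge))
    return entries


def neutral_mass(value: float, charge: Optional[int]) -> float:
    """Convert m/z to neutral mass, assuming protonated positive ions."""
    if charge is None:
        return value  # already a mass
    return charge * (value - PROTON_MASS)


if __name__ == "__main__":
    for value, charge in parse_query("910.386(+3), 1393.6226(+2), 2887.2452"):
        print(value, charge, round(neutral_mass(value, charge), 4))
```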
Visualization of matched glycopeptides
All the theoretical glycopeptides are visualized on a heat-map chart with tiles colored according to the ppm distance from the query mass. The chart is sortable by peptide mass, glycan mass, or ppm distance to the queried mass. Hovering over the row labels pops up the peptide mass and the UniProt identifier of the source protein.
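The ppm distance used for tile coloring, and the tolerance window from the search parameters, can be expressed as a short sketch. It assumes the conventional definition with the theoretical mass in the denominator; whether PepSweetener normalizes by the theoretical or the query mass is not stated here. The function names are hypothetical.

```python
def ppm_distance(query_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million, relative to the theoretical mass."""
    return (query_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(query_mass: float, theoretical_mass: float,
                     tolerance: float, unit: str = "ppm") -> bool:
    """Check whether a candidate falls inside the +/- tolerance window."""
    if unit == "ppm":
        return abs(ppm_distance(query_mass, theoretical_mass)) <= tolerance
    if unit == "Da":
        return abs(query_mass - theoretical_mass) <= tolerance
    raise ValueError(f"Unknown unit: {unit!r}")


if __name__ == "__main__":
    # A 0.02 Da error at 2000 Da is roughly 10 ppm.
    print(round(ppm_distance(2000.02, 2000.0), 3))
```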
Glycan Composition Filter
To filter results by glycan composition, click the "Glycan filter" button. The filter either excludes glycans containing particular monomers or requires a minimum number of particular monomers in the searched glycans.
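The two filter behaviours described above (exclusion and minimum count) could be modelled roughly like this. The representation of a composition as a dictionary and the name `passes_glycan_filter` are assumptions made for this sketch, not the tool's internals.

```python
def passes_glycan_filter(composition: dict[str, int],
                         excluded=(),
                         minimum=None) -> bool:
    """Return True if a glycan composition survives the filter.

    excluded: monomers that must not occur in the glycan.
    minimum:  mapping of monomer -> minimum required count.
    """
    minimum = minimum or {}
    # Reject glycans containing any excluded monomer.
    if any(composition.get(monomer, 0) > 0 for monomer in excluded):
        return False
    # Require at least the requested number of each listed monomer.
    return all(composition.get(monomer, 0) >= count
               for monomer, count in minimum.items())


if __name__ == "__main__":
    glycan = {"Hex": 5, "HexNAc": 4, "NeuAc": 1}
    print(passes_glycan_filter(glycan, excluded={"NeuGc"}, minimum={"NeuAc": 1}))
```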
Theoretical Peptide Ions
To assist peptide assignment, theoretical b and y peptide ions are displayed after clicking on the glycopeptide tile.
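For readers unfamiliar with b and y ions, the following sketch computes them for an unmodified peptide using standard monoisotopic residue masses. It assumes singly charged ions and ignores the attached glycan and any modifications, so it is not necessarily how PepSweetener computes the displayed values. The function name `b_y_ions` is hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def b_y_ions(peptide: str) -> tuple[list[float], list[float]]:
    """Return singly charged b and y ion m/z values (b1..b(n-1), y1..y(n-1))."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    b_ions, y_ions = [], []
    prefix = 0.0
    for i in range(1, len(peptide)):
        prefix += masses[i - 1]
        b_ions.append(prefix + PROTON)                  # N-terminal fragment
        y_ions.append(total - prefix + WATER + PROTON)  # complementary C-terminal fragment
    y_ions.reverse()  # order as y1, y2, ...
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = b_y_ions("PEPTIDE")
    print([round(m, 4) for m in b])
    print([round(m, 4) for m in y])
```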
Advanced Search
In the advanced mode, the search space is custom-built by specifying peptides and glycans. Peptides can be generated automatically for a chosen protein by providing a UniProt identifier and a cleavage enzyme, or uploaded in a text file (file format description). The input glycan repertoire can be the whole glycan composition dataset, its restriction to N-glycans, or a text file containing a comma-separated list of compositions (file format description). Peptides may carry modifications. In this case, the corresponding mass shifts must be placed within square brackets (e.g. [+80] for phosphorylation). Raw masses can be used instead of peptide sequences or glycan compositions.
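The bracketed mass-shift notation can be illustrated with a small sketch that separates a peptide string into its bare sequence and the total modification mass. The exact placement rules for brackets are defined in the tool's file format description, which is not reproduced here, so the example input `NGS[+80]K` and the function name `split_modifications` are assumptions.

```python
import re

# Matches a signed or unsigned numeric shift in square brackets, e.g. [+80] or [-17.03].
_SHIFT = re.compile(r"\[([+-]?\d+(?:\.\d+)?)\]")


def split_modifications(peptide: str) -> tuple[str, float]:
    """Return (bare sequence, summed mass shift in Da) for a peptide string."""
    shifts = [float(value) for value in _SHIFT.findall(peptide)]
    bare_sequence = _SHIFT.sub("", peptide)
    return bare_sequence, sum(shifts)


if __name__ == "__main__":
    print(split_modifications("NGS[+80]K"))
```

The summed shift would then be added to the peptide mass before matching against the query mass.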
Peptide Input
Supported cleavage enzymes are: asp-n, bnps-skatole, caspase 1, caspase 2, caspase 3, caspase 4, caspase 5, caspase 6, caspase 7, caspase 8, caspase 9, caspase 10, chymotrypsin high specificity, chymotrypsin low specificity, clostripain, cnbr, enterokinase, factor Xa, formic acid, glutamyl endopeptidase, granzyme B, hydroxylamine, iodosobenzoic acid, lysc, ntcb, pepsin ph2.0, pepsin ph1.3, proline endopeptidase, pronase, proteinase K, staphylococcal peptidase i, thermolysin, thrombin, trypsin.
Glycosylation Sites Filtering
Two checkboxes at the bottom of the peptide input section filter peptides by the presence of the N-glycosylation site motif or of an S/T residue.
Note: The N-glycosylation site filter works differently in the protein digestion and peptide file upload modes. In the protein digestion mode, N-glycosylation sites are mapped on the protein, and the filter removes only the peptides that do not contain an asparagine located within an N-glycosylation motif. In the peptide upload mode, only peptides with a complete N-glycosylation motif are retained.
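The difference between the two modes can be made concrete with a sketch. It assumes the canonical N-glycosylation sequon N-X-S/T (X is any residue except P); the motif definition actually used by PepSweetener may be broader. All function names are hypothetical.

```python
import re

# Lookahead so that overlapping sequons are all reported.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence: str) -> list[int]:
    """Zero-based positions of asparagines that start an N-X-S/T motif."""
    return [match.start() for match in _SEQUON.finditer(sequence)]


def keep_uploaded_peptide(peptide: str) -> bool:
    """Upload mode: the complete motif must lie inside the peptide."""
    return bool(sequon_positions(peptide))


def keep_digested_peptide(protein: str, start: int, end: int) -> bool:
    """Digestion mode: the peptide protein[start:end] is kept if it contains
    an asparagine that belongs to a motif mapped on the whole protein."""
    return any(start <= pos < end for pos in sequon_positions(protein))


if __name__ == "__main__":
    protein = "MKAANKTLLK"
    # "AANK" (positions 2..5) ends before the T that completes the motif.
    print(keep_digested_peptide(protein, 2, 6), keep_uploaded_peptide("AANK"))
```

Here the peptide AANK is kept in digestion mode because its asparagine is part of the N-K-T motif on the protein, but the same peptide would be rejected in upload mode because the motif is incomplete within the peptide itself.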
Glycan Input
Supported file format for glycan input
- Only plain text files are supported
- Glycan compositions must be separated by a comma
Supported glycan composition format
- Only the following glycan monomers are currently permitted:
Hex, dHex, HexA, HexNAc, Kdn, NeuAc, NeuGc, Neu, Fuc, S, P
- Count for each monomer should be placed after a colon
- Each glycan monomer is separated by a vertical line character (|)
- Glycan composition is case sensitive
Here are some valid glycan composition examples:
Hex:4|S:1|NeuAc:1|HexNAc:5, Hex:5|dHex|HexNAc:5
Click here to download sample data file with glycan compositions
Note: a missing count value after a monomer is interpreted as a single occurrence of monomer.
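The composition format rules above (permitted monomers, colon-separated counts, vertical-line separators, case sensitivity, and a default count of one) can be summarized in a short parser sketch. The function names are hypothetical, and summing repeated monomers within one composition is an assumption of this sketch rather than documented behaviour.

```python
PERMITTED_MONOMERS = {
    "Hex", "dHex", "HexA", "HexNAc", "Kdn",
    "NeuAc", "NeuGc", "Neu", "Fuc", "S", "P",
}


def parse_composition(text: str) -> dict[str, int]:
    """Parse one composition such as 'Hex:5|dHex|HexNAc:5'."""
    composition: dict[str, int] = {}
    for part in text.strip().split("|"):
        name, _, count = part.partition(":")
        name = name.strip()
        # Names are case sensitive, so 'hex' is rejected.
        if name not in PERMITTED_MONOMERS:
            raise ValueError(f"Unsupported monomer: {name!r}")
        # A missing count means a single occurrence.
        number = int(count) if count.strip() else 1
        composition[name] = composition.get(name, 0) + number
    return composition


def parse_glycan_list(text: str) -> list[dict[str, int]]:
    """Parse a comma-separated list of compositions."""
    return [parse_composition(item) for item in text.split(",") if item.strip()]


if __name__ == "__main__":
    print(parse_glycan_list("Hex:4|S:1|NeuAc:1|HexNAc:5, Hex:5|dHex|HexNAc:5"))
```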
Acknowledgements
PepSweetener was developed in collaboration with members of Dr. Daniel Kolarich's group at the Max Planck Institute of Colloids and Interfaces in Berlin, Germany.
This work was supported by the European Union FP7 GastricGlycoExplorer Innovative Training Network under Grant Agreement No. 316929.