Identifies peptides in MS/MS spectra via OMSSA (Open Mass Spectrometry Search Algorithm).
pot. predecessor tools | → OMSSAAdapter → | pot. successor tools |
any signal-/preprocessing tool (in mzML format) | | IDFilter or any protein/peptide processing tool |
OMSSA must be installed on your system before you can use the OMSSAAdapter. See pubchem.ncbi.nlm.nih.gov/omssa/ for further information on how to download and install OMSSA. You might find that the latest OMSSA version does not run on your system; to test this, run omssacl in your OMSSA/bin/ directory and see if it crashes. If you encounter an error message, try another OMSSA version.
Sequence databases in FASTA format must be converted into the NCBI (BLAST) format before OMSSA can read them. For this, you can use the program 'formatdb' (old releases) or 'makeblastdb' (recent releases) of the NCBI tools suite, which is freely available for download. Use formatdb -i SwissProt_TargetAndDecoy.fasta -o to create a BLAST database, which actually consists of multiple files. The more recent 'makeblastdb' has a similar syntax, e.g., makeblastdb -dbtype prot -in SwissProt_TargetAndDecoy.fasta .
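The conversion step can be scripted. The following is a minimal Python sketch, not part of OpenMS or OMSSA: the function names blast_db_command and build_blast_db are hypothetical, and the only flags used are the ones quoted in the paragraph above.

```python
import shutil
import subprocess


def blast_db_command(fasta_path, use_makeblastdb=True):
    """Return the argument list for converting a FASTA file into a BLAST database.

    Only the flags quoted in the documentation above are used.
    """
    if use_makeblastdb:
        return ["makeblastdb", "-dbtype", "prot", "-in", fasta_path]
    return ["formatdb", "-i", fasta_path, "-o"]


def build_blast_db(fasta_path):
    """Run whichever NCBI tool is found on PATH, preferring makeblastdb.

    Hypothetical helper: raises FileNotFoundError if neither tool is installed
    and subprocess.CalledProcessError if the tool exits with an error.
    """
    for use_new in (True, False):
        cmd = blast_db_command(fasta_path, use_makeblastdb=use_new)
        if shutil.which(cmd[0]):
            subprocess.run(cmd, check=True)
            return cmd
    raise FileNotFoundError("neither makeblastdb nor formatdb was found on PATH")
```

Calling build_blast_db("SwissProt_TargetAndDecoy.fasta") would then write the database files next to the FASTA file, assuming one of the two tools is installed.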
Make sure that your FASTA file (which you convert into BLAST database files) is properly formatted, especially that it conforms to the FASTA Defline format and that there is a description(!), i.e. '>ID DESCRIPTION'. Otherwise you might get an error.
As the database parameter for the OMSSAAdapter you can specify either the .psq file generated by formatdb/makeblastdb (e.g., 'SwissProt_TargetAndDecoy.fasta.psq') or the original FASTA file (e.g., 'SwissProt_TargetAndDecoy.fasta'). Allowing the FASTA format makes it easy to specify a common TOPPAS input node (using only the FASTA suffix) for multiple downstream adapters. Just make sure that the .psq and .fasta files reside in the same directory!
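Both requirements on the database (a description on every defline, and the .psq file sitting next to the FASTA file) can be checked before starting a search. The sketch below is an illustration only; the function names are hypothetical, and it assumes the '>ID DESCRIPTION' convention means an identifier followed by whitespace and at least one more token.

```python
import os


def defline_problems(fasta_path):
    """Return (line_number, defline) pairs for headers that lack a description."""
    problems = []
    with open(fasta_path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.startswith(">"):
                # Split into ID and DESCRIPTION at the first run of whitespace.
                parts = line[1:].strip().split(None, 1)
                if len(parts) < 2:
                    problems.append((number, line.rstrip("\n")))
    return problems


def database_pair(db_param):
    """Map either accepted form of the database parameter to (fasta, psq)."""
    if db_param.endswith(".psq"):
        fasta = db_param[: -len(".psq")]
    else:
        fasta = db_param
    return fasta, fasta + ".psq"


def missing_files(db_param):
    """Return those of the two expected files that do not exist on disk."""
    return [path for path in database_pair(db_param) if not os.path.isfile(path)]
```

An empty result from both defline_problems and missing_files suggests the database is ready to be passed to the adapter in either form.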
This adapter supports relative database filenames, which (when not found in the current working directory) are looked up in the directories specified by 'OpenMS.ini:id_db_dir' (see TOPP for Advanced Users).
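The lookup order described above can be pictured with a small Python sketch. It is not the adapter's actual implementation: resolve_database is a hypothetical name, and search_dirs stands in for the directories listed under 'OpenMS.ini:id_db_dir'.

```python
import os


def resolve_database(filename, search_dirs, cwd=None):
    """Resolve a database filename: working directory first, then search_dirs.

    Returns the first existing path, or None if the file is found nowhere.
    Absolute paths are only checked for existence, never searched.
    """
    if os.path.isabs(filename):
        return filename if os.path.isfile(filename) else None
    cwd = os.getcwd() if cwd is None else cwd
    for directory in [cwd, *search_dirs]:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Because the working directory is tried first, a local copy of a database shadows one with the same name in the configured directories.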
The options that specify the protease specificity (e) are directly taken from OMSSA. A complete list of available proteases can be found by executing omssacl -el.
Pre-built versions of OMSSA are 32-bit for Windows and 64-bit for Linux and Mac OS X. If the input dataset contains many spectra (>30k), a 32-bit version of OMSSA will likely crash due to memory allocation issues. To prevent this, the adapter automatically splits the data into chunks of appropriate size (10k spectra by default) and calls OMSSA for each chunk. Running time is about the same (even slightly faster) for 10k chunks, but deteriorates slightly (by about 15%) if the chunk size is too small (1k spectra). The disadvantage of chunking is that no protein hits (nor their scores) are stored in the output, since peptide evidence is split between chunks. If you want to disable chunking at the risk of provoking a memory allocation error in OMSSA, set the chunk size to '0'.
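To make the chunking behaviour concrete, here is a minimal Python sketch of how a run of spectra might be split. How the adapter actually partitions the data is not stated here; the sketch assumes consecutive chunks in input order, and chunk_ranges is a hypothetical name.

```python
def chunk_ranges(n_spectra, chunk_size=10000):
    """Return half-open (start, end) index ranges, one per OMSSA call.

    A chunk_size of 0 disables chunking, so all spectra go into one range.
    """
    if n_spectra < 0 or chunk_size < 0:
        raise ValueError("n_spectra and chunk_size must be non-negative")
    if n_spectra == 0:
        return []
    if chunk_size == 0:
        return [(0, n_spectra)]
    return [
        (start, min(start + chunk_size, n_spectra))
        for start in range(0, n_spectra, chunk_size)
    ]
```

With the default chunk size, 25,000 spectra yield three ranges, the last one covering the remaining 5,000 spectra; with a chunk size of 0 the same input yields a single range.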
This wrapper has been tested successfully with OMSSA, version 2.x.
The command line parameters of this tool are:
INI file documentation of this tool:
OpenMS / TOPP release 2.1.0 | Documentation generated on Sat Apr 8 2017 21:05:40 using doxygen 1.8.13 |