Help

Pasting sequences:

Ensure that the pasted sequence only contains a string of single letter amino acid codes.
Pasted sequences can only be uploaded one at a time.
Between successive uploads, the 'upload' button must be clicked to clear the field.
A successful upload of a pasted sequence is reflected in the counter below the field.


Upload button for pasted sequence:

Multiple sequences can be pasted by iteratively clicking 'upload' after every pasted sequence.

Upload button for file:

Multiple sequences/PDBs can be uploaded by iteratively clicking 'upload' after every uploaded file.

File upload:

The server can recognize the contents of the following file formats:
Alignment files: MODELLER PIR or FASTA
Structure files: PDB or mmCIF format (atomic coordinate files can have '.ent', '.pdb', '.cif' or no extension at all)
Compressed files (.zip or .tar.gzip) consisting of files of one or more of the above formats are also allowed.

Multiple files can be uploaded by iteratively clicking 'upload' after every upload.

MODELLER PIR structure entries are considered structures if the corresponding PDB file can be found among the uploaded files or in the Sali Lab PDB library.
If not found it is simply considered a sequence.

The web page displays all files that have been uploaded, their sizes and when they were uploaded (when they were compressed for compressed files).


Accepted file formats:

MODELLER PIR, FASTA, PDB, .zip or .tar.gz

Importantly, archives (.zip and .tar.gz) should contain all files in the top level, and not within a folder. Thus, when preparing these, the user should archive all desired files directly, not a folder containing the files.


Choosing PDB structures:

The user can choose structures to align from the PDB without having to upload his/her own structure files.

Structures are chosen by entering their 4 letter PDB codes.
Multiple structures are entered on separate lines.
Each PDB should be entered only once, even if several segments of the same PDB are to be aligned.
Residue ranges are chosen in the next step.


E-mail address:

If an email address is provided, an email will be sent out when the job is completed.


Job name:

The job can be given a name to more easily identify it.


Choice of alignment category:

Upon submission of the start page inputs the SALIGN server determines the alignment category which is likely to produce the most accurate alignment.
Click here for more information about the decision process
The proposed alignment category can be changed by pressing 'Advanced' if other options exist.
For a flowchart of default and optional alignment categories, click here


PDB segments:

Segments of the structure files can be chosen by specifying the capping residues and chain identifiers
eg. 1a0l 20:A:40:A will choose residues 20 though 40 from chain A of the PDB file identified as 1a0l

Default values of the structure segment always span all residues of the first chain of the given structure file.
If residue ranges are specified in the input MODELLER PIR files, these are taken instead of the default.
Multiple segments from the same PDB should be entered on separate lines.
If the user chooses to only align the sequences of structures (advanced), the segments to align are still to be set here.


Alignment category:

The user can override the default alignment category choice.

Examples:

If only structures have been uploaded, SALIGN will by default do a multiple structure alignment. This can be overridden to perform a multiple sequence alignment instead. Structure-sequence alignment is not an option when only structures have been uploaded. See third example below for structure-sequence.

If two sets of sequences have been uploaded, SALIGN will by default align each set separately, then align the resulting profiles to each other. This can be overriden to perform a multiple sequence alignment of all uploaded sequences instead.

If structures and sequences have been uploaded, SALIGN will by default do a structure-sequence alignment. This can be overriden to perform a multiple sequence alignment instead.


Only viable alternatives to the alignment category are displayed.
For a flowchart of default and optional alignment categories, click here


Optimal alignment type:

To optimize alignment time and quality the following scheme has been chosen as default:
30 structures/sequences or less are aligned using the tree algorithm, while progressive alignments are performed for larger numbers of entries
For information about exceptions and intricate scenarios, click here


1D gap penalties:

In most cases there is only one set of 1D gap penalties - in these situations the defaults (-450.00 and -50.00) are simply the values for these.
In some situations more than one set of 1D gap penalties exist - in these cases the default setting may have different values for different sets of 1D gap penalties. In such cases the displayed values reflect the main set of 1D gap penalties.
If the user changes the 1D gap penalties all sets of 1D gap penalties will acquire this value.
Scenarios with multiple sets of 1D gap penalties only arise in alignments that contain multiple alignment steps.

Example: A structure-sequence alignment of multiple structures and sequences includes one multiple structure alignment, one multiple sequence alignment and one structure-sequence alignment.
Each of these alignments have a distinct set of 1D gap penalties.


For more information about 1D gap penalties, click here


Launch Chimera

If structures have been aligned, the results page provides a link that retrieves the aligned structure files and opens them locally in the molecular graphics viewer Chimera. For this function, the user must have Chimera installed on their computer. On Mac, the launcher downloads a file that needs to be clicked in order to start Chimera.


Examples

1. Multiple structure alignment - asp.ali

This alignment file in MODELLER PIR format contains several structure entries and no sequence entries. The comment section at the top of the file describes how the server recognizes these as structures rather than sequences. Since the input consists of only structures, the server suggests a multiple structure alignment by default. Start and end residues/chains are extracted from the headers in the MODELLER PIR file.

Structures to align can also be specified by entering their PDB codes in the main interface, or by uploading cutom PDB files. If no structure alignment file is uploaded, the server suggests aligning the entire first chain of each PDB. The user will thus often need to modify these alignment segments. Click here for more information on specifying PDB segments.


2. Multiple sequence alignment - asp_seqs.ali

This alignment file in MODELLER PIR format contains several sequence entries. The comment section at the top of the file explains why the server recognizes these as sequences rather than structures. Since the input consists of one group of sequences, the server carries out a multiple sequence alignment.

As an alternative to uploading alignment files, sequences without headers can be pasted into the appropriate field of the interface. Click here for more information on pasting sequences.


3. Profile-profile sequence alignment - asp_seqs_half1.ali + asp_seqs_half2.ali

These alignment files in MODELLER PIR format contain several sequences each. The comment sections at the top of the files explain why the server recognizes these as sequences rather than structures. Since the input consists of two groups of sequences, the server first generates multiple sequence alignments of each file separately, and then aligns these to each other using the profile-profile alignment algorithm.


4. Structure-sequence alignment - asp_strs_seqs.ali

This alignment file in MODELLER PIR format contains both structure entries and sequence entries. The comment sections in the file explain why the server recognizes some entries as sequences and some as structures. Since the input consists of both sequences and structures, the server first generates multiple sequence and structure alignments separately, and then aligns the resulting alignments to each other using the structure-sequence alignment algorithm.

As an alternative to uploading sequence alignment files, sequences without headers can be pasted into the appropriate field of the interface. Click here for more information on pasting sequences. Structures to align can also be specified by entering their PDB codes in the main interface, or by uploading cutom PDB files. If no structure alignment file is uploaded, the server suggests aligning the entire first chain of each PDB. The user will thus often need to modify these alignment segments. Click here for more information on specifying PDB segments.


Citations

In addition to the SALIGN web server, please cite the following publications:

  • Profile-profile sequence aligments

    M.A. Marti-Renom, M.S. Madhusudhan, A. Sali. Alignment of protein sequences by their profiles. Protein Sci 13, 1071-1087, 2004.

  • Structure-sequence aligments

    M.S. Madhusudhan, M.A. Marti-Renom, R. Sanchez, A. Sali. Variable gap penalty for protein sequence-structure alignment. Protein Eng Des Sel 19, 129-133, 2006.

  • Structure-structure alignments

    M.S. Madhusudhan, B.M. Webb, M.A. Marti-Renom, N. Eswar, A. Sali. Alignment of multiple protein structures based on sequence and structure features. Protein Eng Des Sel 22, 569-574, 2009.