| ModBase Documentation | ||||||||||||
|
ModBase is a comprehensive Database of Comparative Protein Structure Models. |
||||||||||||
| Help Topics | ||||||||||||
| ||||||||||||
|
MODBASE Search | ||||||||||||
|
User Login
To access the User Login page, please go to the login page.
For a ModWeb dataset, the user receives a username and password by email after the ModWeb calculation has finished and the data have been stored in ModBase. | ||||||||||||
|
Datasets
ModBase is organized into Datasets. The comprehensive Dataset (combination of all SP/TR datasets) comprises comparative models of
all sequences in the SwissProt and TrEMBL databases that have detectable similarity to an experimental protein
structure. | ||||||||||||
|
Dataset Selection
To select a specific subset of the datasets available to you, go to the search page and click on "Select specific dataset(s)". This opens two boxes: the left box contains all datasets available to you, and the right box contains the subset included in the search. To include a dataset in the search, either click on it in the left box and then click the arrow, or double-click it. To exclude a dataset from the search, double-click it in the right box. If you do not use this menu (Collapse dataset selection), all available datasets are selected by default. | ||||||||||||
|
Search Types
Different search options are available in ModBase.
| ||||||||||||
|
Display Types
Depending on the chosen search mode, different display options are available in ModBase.
| ||||||||||||
|
Search Properties
Search properties are dependent on the chosen search mode.
| ||||||||||||
|
Advanced Search Type
Often, Modbase returns several models for one sequence. If you check: | ||||||||||||
|
Advanced Search Properties
Search options for model properties are available: | ||||||||||||
|
Original Sequence
ModPipe, the software that calculates the ModBase models, modifies the original protein sequences to replace non-standard amino acid residues. The Original Sequence is the unmodified version; it is displayed only if a modification has taken place. | ||||||||||||
|
Input Sequence
Enter a sequence either in FASTA format or as plain amino acid residues. Non-standard residues are ignored. | ||||||||||||
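The input handling described above can be sketched as follows. This is only an illustration, not the ModBase implementation: the function name `clean_input_sequence` and the choice of the 20 standard one-letter codes as the accepted alphabet are assumptions.

```python
# Assumption: "standard" means the 20 common amino acid one-letter codes.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_input_sequence(text):
    """Return the standard residues from FASTA or bare sequence input.

    Hypothetical helper: FASTA header lines (starting with '>') are skipped,
    and every character that is not a standard residue is ignored.
    """
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA header lines
        residues.extend(ch for ch in line.upper() if ch in STANDARD_RESIDUES)
    return "".join(residues)
```

For example, both `">my_protein\nMKT XB\nacd*"` and the bare string `"MKTACD"` reduce to the same residue string.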
|
FASTA Format
| ||||||||||||
|
Sequence Similarity Search
Search by sequence similarity using BLAST. | ||||||||||||
|
Action Pulldown Menus
The Action Pulldown Menus give users the option to:
| ||||||||||||
|
Linking to ModBase Models from External Databases
To link from external pages to specific ModBase sequences or models, please use the following link construction:
| ||||||||||||
|
Model Details page
This is the default MODBASE page for the models of one sequence. It displays sequence information, model/sequence coverage, and model information. Two versions of this page are available: graphical and schematic. | ||||||||||||
|
Model Details (Graphical)
The graphical Model Details page gives access to all available information for the models of one sequence: sequence information, model information, and database cross-links. If there are several models for the current sequence, the model with the highest sequence identity to its template is displayed, and thumbnails of the other models are shown as well. Hover over a thumbnail to get information about that model. | ||||||||||||
|
Model Coverage Sketch
| ||||||||||||
|
Model Image
The model images are created on the fly using MolScript and Raster3D. | ||||||||||||
|
Model Details (Schema)
The schematic Model Details page shows a thumbnail of each model and a sketch of the model/sequence coverage. | ||||||||||||
|
Filtered Models
ModBase contains many models for some sequences. This may be because the sequence is very long, or because it has been processed in several datasets. To avoid confusion, only a "filtered" subset of the models is displayed on the Model Details pages if there are more than 5 models for the current sequence. The filtering is crude and aims only to span the whole length of the sequence. Click on "all models" if you are not satisfied with the displayed models; a better one might be hidden. On the schematic page, hovering over a thumbnail gives more information about that model. | ||||||||||||
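The actual filtering procedure is not documented here. Purely as an illustration of what a crude, coverage-oriented filter could look like, the sketch below greedily picks models that add the most uncovered sequence positions. The function name `filter_models`, the `(start, end, sequence_identity)` tuple layout, and the tie-break on sequence identity are all assumptions, not the ModBase algorithm.

```python
def filter_models(models, max_models=5):
    """Pick a small subset of models that spans as much of the sequence as possible.

    models: list of (start, end, sequence_identity) tuples, 1-based inclusive.
    Hypothetical greedy sketch; filtering only applies above max_models.
    """
    if len(models) <= max_models:
        return sorted(models)  # nothing to filter

    covered = set()
    selected = []
    remaining = list(models)

    def gain(model):
        # Number of sequence positions this model would newly cover.
        start, end, _identity = model
        return len(set(range(start, end + 1)) - covered)

    while remaining and len(selected) < max_models:
        # Prefer the largest coverage gain; break ties by sequence identity.
        best = max(remaining, key=lambda m: (gain(m), m[2]))
        if gain(best) == 0:
            break  # the sequence is already spanned as far as possible
        selected.append(best)
        covered.update(range(best[0], best[1] + 1))
        remaining.remove(best)
    return sorted(selected)
```

A filter of this kind ignores model quality except as a tie-break, which is one way a better model can end up hidden behind the "all models" link.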
|
Sequence/Model Overview
This is the default page when the search returns more than one sequence. There are two modes: Sequence Overview and Model Overview. | ||||||||||||
|
Sequence Overview
The Sequence Overview page summarizes the search results for many sequences. The sequence coverage sketch indicates the modeled region(s) of the given sequence. Click on the sketch to go to the "Model Details" page. | ||||||||||||
|
Model Coverage Sketch
Similar to the Model Details sketch, but smaller and less detailed. | ||||||||||||
|
Model Overview
The Model Overview page displays the search results with one line per model, including details about modeling quality and templates. Click on the thumbnail on the left to go to the "Model Details" page for that model. | ||||||||||||
|
Model/Fold Reliability
Please click on the Ball to go to the Model Details (schema) page for this model/sequence. | ||||||||||||
|
Model Thumbnail
Please click on the thumbnail to go to the Model Details (graphical) page for this model. | ||||||||||||
|
Sequence Information | ||||||||||||
|
Primary Sequence Database ID
The Sequence Database ID displayed on the "Sequence Information" section is chosen according to the following
order of availability: | ||||||||||||
|
Original Sequence Database ID
The original sequence database ID from the FASTA file that was used for the modeling calculation. The prefix "CU" indicates a custom database ID. This ID can be useful to identify sequences from ModWeb calculations. | ||||||||||||
|
Organism Information (Taxonomy)
MODBASE currently contains 16 datasets of complete genomes. These are included in the pull-down menu. | ||||||||||||
|
Model Information | ||||||||||||
|
Alignment Significance
Significance of the alignment between the target and the template as reported by NCBI's PSI-BLAST program (Nucl. Acids Res. 25, 3389-3402, 1997). This is the significance reported during the template (PDB) database search. It is not the significance of the modeling alignment produced by MODELLER. | ||||||||||||
|
E-Value
ModPipe1.0: Significance of the alignment between the target and the template as reported by NCBI's PSI-BLAST program (Nucl. Acids Res. 25, 3389-3402, 1997). This is the significance reported during the template (PDB) database search. It is not the significance of the modeling alignment produced by MODELLER.
ModPipe2 and later: A similar significance value, but calculated by MODELLER using the Build-Profile routine. | ||||||||||||
|
Model Score
Score for the reliability of a model, derived from statistical potentials (F. Melo, R. Sanchez, A. Sali, 2001). A model is predicted to be good when the model score is higher than a pre-specified cutoff (0.7). A reliable model has a probability of the correct fold that is larger than 95%. A fold is correct when at least 30% of its Cα atoms superpose within 3.5 Å of their correct positions. | ||||||||||||
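The fold-correctness criterion can be written down directly. The sketch below assumes the model and the correct structure have already been superposed and that their Cα atoms are given as equal-length lists of equivalent positions; the function name `fold_is_correct` and its parameters are hypothetical, and the superposition step itself is not shown.

```python
import math


def fold_is_correct(model_ca, native_ca, cutoff=3.5, min_fraction=0.30):
    """Check the 30% / 3.5 Angstrom criterion on pre-superposed C-alpha atoms.

    model_ca, native_ca: equal-length lists of (x, y, z) tuples for
    equivalent residues (assumption: superposition was done beforehand).
    """
    if not model_ca or len(model_ca) != len(native_ca):
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Count C-alpha atoms lying within the cutoff of their correct position.
    close = sum(
        1 for a, b in zip(model_ca, native_ca) if math.dist(a, b) <= cutoff
    )
    return close / len(model_ca) >= min_fraction
```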
|
Protein Size
Length of the modeled sequence (the full original sequence, not just the modeled part). | ||||||||||||
|
Model Size
Length of the model. | ||||||||||||
|
Reliable Model
A reliable model is a model that is evaluated as good by a new model evaluation procedure (F. Melo, R. Sanchez, A. Sali, 2001). A model is predicted to be good when the model score is higher than a pre-specified cutoff (0.7). A reliable model has a probability of the correct fold that is larger than 95%. A fold is correct when at least 30% of its Cα atoms superpose within 3.5 Å of their correct positions. | ||||||||||||
|
Reliable Fold Assignment
A reliable fold assignment is a fold assignment that corresponds to a significant PSI-BLAST hit or to a reliable model. A PSI-BLAST hit is significant when it is obtained in a filtered search and its E-value is smaller than 0.0001. Thus, a reliable fold assignment can correspond to an unreliable model if the PSI-BLAST score is significant. | ||||||||||||
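The two cutoffs combine as a simple either/or rule. The following is a minimal sketch of that logic only; the function and constant names are hypothetical, and the requirement that the PSI-BLAST hit come from a filtered search is assumed to be satisfied by the caller.

```python
MODEL_SCORE_CUTOFF = 0.7   # model score above this: reliable model
EVALUE_CUTOFF = 0.0001     # E-value below this: significant PSI-BLAST hit


def is_reliable_model(model_score):
    return model_score > MODEL_SCORE_CUTOFF


def is_significant_hit(evalue):
    # Assumption: the hit was obtained in a filtered search.
    return evalue < EVALUE_CUTOFF


def is_reliable_fold_assignment(model_score, evalue):
    # Either condition is sufficient, so an unreliable model can still
    # carry a reliable fold assignment.
    return is_reliable_model(model_score) or is_significant_hit(evalue)
```

For instance, a model with score 0.2 and E-value 1e-10 is not a reliable model, yet its fold assignment counts as reliable.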
|
PSI-Blast Fold Assignment
A PSI-BLAST fold assignment is a fold assignment that corresponds to a significant PSI-BLAST hit. A PSI-BLAST hit is significant when it is obtained in a filtered search and its E-value is smaller than 0.0001. Thus, a PSI-BLAST fold assignment can correspond to an unreliable model. | ||||||||||||
|
Sequence Identity
Percentage of identical residues in the alignment between the target and the template as reported during the template search. | ||||||||||||
|
ModPipe Protein Quality Score
The ModPipe Protein Quality Score (MPQS) is a composite score comprising sequence identity to the template, coverage, and the three individual scores E-value, z-Dope and GA341. | ||||||||||||
|
z-Dope
| ||||||||||||
|
Target Region
The region of the protein sequence that is modeled. | ||||||||||||
|
Protein Length
The length of the original protein sequence. | ||||||||||||
|
Template PDB Code
The PDB code of the template the model is based on. | ||||||||||||
|
Template Region
The region of the PDB structure that was used as a template. | ||||||||||||
|
ModPipe Version
ModPipe is the underlying software pipeline used to build all ModBase models. ModPipe1.0 relies on PSI-BLAST and IMPALA for template selection and fold assignment. ModPipe2 additionally uses the Build-Profile method in MODELLER. ModPipe2 models are also scored with the MPQS and z-Dope. | ||||||||||||
|
Coordinate (3D) File
Coordinate file for the model in the PDB format. The "fifth column" (which normally contains B-factors or order parameters) contains the MODELLER error profile. | ||||||||||||
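In the fixed-width PDB format, the B-factor field occupies columns 61-66 of each ATOM record. A minimal sketch for pulling the per-residue values out of a model file is shown below; the function name `read_error_profile` and the choice to read one value per residue from its Cα atom are assumptions made for illustration.

```python
def read_error_profile(pdb_text):
    """Return {(chain_id, residue_number): value} from the B-factor field.

    Hypothetical helper: reads one value per residue, taken from the
    C-alpha ATOM record, using the fixed PDB column positions.
    """
    profile = {}
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16; keep only C-alpha atoms.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain_id = line[21]              # column 22
            residue_number = int(line[22:26])  # columns 23-26
            profile[(chain_id, residue_number)] = float(line[60:66])  # columns 61-66
    return profile
```

Typical use would be `read_error_profile(open("model.pdb").read())`, where `model.pdb` stands for a coordinate file downloaded from a Model Details page.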
|
Modeller Error Profile
| ||||||||||||
|
PAP Alignment Format
The 'PAP' format is nicer to look at than the 'PIR' format, but not as computer friendly. The WRITE_ALIGNMENT command description in the MODELLER manual contains more detailed information about this format. | ||||||||||||
|
PIR Alignment Format
The 'PIR' format resembles that of the PIR sequence database. It is described in the MODELLER manual and is used for comparative modeling with MODELLER because it can contain all the information useful for modeling. | ||||||||||||
|
LIGBASE (as integrated in MODBASE)
Ligbase is a structural database of ligand binding sites. The Ligbase database tables contain all amino acid residues that are within 5 Angstroms of a small-molecule ligand in a given PDB file. The current version of Ligbase contains ligand binding information from 16629 PDB files.
1. Putative ligand binding sites derived from the template
Putative ligand binding sites of MODBASE models are derived from the template on the fly by parsing the MODBASE alignment file. The native ligand binding residues of the template (TEMPL) and the derived ligand binding sites of the model (MODEL) are shown. Additionally, the putative binding residues of the model are colored in its image. If a gap is found in the alignment at a ligand binding residue, a "-" is displayed instead of the model residue.
Many PDB files used as templates don't include a ligand, but closely related PDB files might have one or more ligands bound. Using DBALI, a database of structural alignments, together with the information in the PDB-Ligbase tables, the INHERITOR table of Ligbase stores the amino acid residues of PDB files with ligands and the equivalent residues of related PDB files. Additionally, the sequence identity and coverage between those PDB files and between the respective binding sites are stored. Once the INHERITOR information is retrieved, putative ligand binding sites from related PDB files are determined in the same way as the binding sites derived from the templates. The inherited residues are shown (INHER) together with the equivalent model residues (MODEL).
Ligbase model coverage sketch
The Ligbase coverage sketch displays the same information as the general model coverage sketch. Additionally, it displays the positions of the amino acid residues that are putative ligand binding residues. | ||||||||||||
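Deriving the model's putative binding residues from the template amounts to walking the alignment column by column. The sketch below illustrates that idea under simplifying assumptions: both sequences are given as equal-length aligned strings with "-" for gaps, and binding residues are identified by their 1-based position in the ungapped template sequence (real PDB residue numbering may differ). The function name `transfer_binding_sites` is hypothetical.

```python
def transfer_binding_sites(template_aln, model_aln, template_sites):
    """Map template ligand binding residues onto the model via the alignment.

    template_aln, model_aln: aligned sequences of equal length, '-' = gap.
    template_sites: set of 1-based positions in the ungapped template.
    Returns (template_position, template_residue, model_residue) tuples;
    model_residue is '-' when the model has a gap at that column.
    """
    if len(template_aln) != len(model_aln):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    template_position = 0
    for template_residue, model_residue in zip(template_aln, model_aln):
        if template_residue == "-":
            continue  # column carries no template residue
        template_position += 1
        if template_position in template_sites:
            pairs.append((template_position, template_residue, model_residue))
    return pairs
```

The same walk could be reused for inherited sites, with the related structure's residues first mapped to the template.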
|
ModWeb
ModWeb is a comparative modeling server. | ||||||||||||
|
ModWeb
When a MODWEB job is finished (with the option of depositing the models into MODBASE), the user gets an email
including the dataset name, and the username/password for MODBASE to access that particular dataset.
| ||||||||||||
|
ModWeb Input Mode
There are three ModWeb input options:
| ||||||||||||
|
ModWeb Personal Information
Personal information for ModWeb calculations is collected solely for dataset identification when the models are displayed in ModBase. | ||||||||||||
|
ModWeb Input Sequence
Enter a sequence either in FASTA format or as plain amino acid residues. Non-standard residues are ignored. | ||||||||||||
|
ModWeb Fasta Format
If a non-standard sequence (i.e. a sequence that is not 100% identical to a sequence in SwissProt, TrEMBL or GenPept) is submitted to ModWeb, it is advisable to create the following header line for each sequence:
| ||||||||||||
|
ModWeb Output Options
When only one sequence is submitted to ModWeb, the user has the option of:
For all other requests, the results will be deposited in ModBase. The user will receive an email about the access information (dataset name, username, password) and a short summary of the results. | ||||||||||||
|
ModWeb Advanced Properties
| ||||||||||||
|
ModBase Command line Retrieval
| ||||||||||||
|
Predicted Protein Complexes
MODBASE contains structure-based predictions of 3,213 binary and 1,234 higher order protein complexes in Saccharomyces cerevisiae involving 750 and 195 proteins, respectively. To generate candidate complexes, comparative models of individual proteins were built and combined using complexes of known structure as templates. These candidate complexes were then assessed using a statistical potential derived from binary domain interfaces in PIBASE (http://salilab.org/pibase). A benchmark indicates a false positive rate of 3% and a true positive rate of 97%. Moreover, the predicted complexes are also filtered using functional annotation (http://yeastgenome.org) and sub-cellular localization (http://yeastgfp.ucsf.edu) data. | ||||||||||||
|
SNP Stability
| ||||||||||||