
Last Update: 6/21/2006
Help
ID_type
LS-SNP currently supports queries using the following ID types:
SwissProt ID
     Examples: O94759, P78312
dbSNP rsID
     Examples: rs273258, rs987495
KEGG pathway ID
     Examples: hsa00600, hsa04210
HUGO gene ID
     Examples: ABCB1, BRCA2
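The four ID types above can usually be told apart by shape alone. The following is a minimal sketch of such a check, useful for confirming that a list of IDs is all of one type before querying. The regular expressions are assumptions inferred from the example IDs on this page and from the common six-character SwissProt accession format; they are not LS-SNP's own validation rules, and `guess_id_type` is a hypothetical helper name.

```python
import re

# Patterns are assumptions inferred from the example IDs on this page;
# they are not taken from the LS-SNP server itself.
_PATTERNS = [
    ("dbSNP rsID", re.compile(r"rs\d+")),
    ("KEGG pathway ID", re.compile(r"hsa\d{5}")),
    (
        "SwissProt ID",
        re.compile(r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9][A-Z][A-Z0-9]{2}[0-9]"),
    ),
]


def guess_id_type(identifier):
    """Return a best-guess ID type name for a single identifier string."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(identifier):
            return name
    # Anything that matches none of the patterns is treated as a gene symbol.
    return "HUGO gene ID"


def all_same_type(identifiers):
    """True if every identifier in the list is guessed to be the same type."""
    return len({guess_id_type(i) for i in identifiers}) <= 1
```

Because gene symbols have no fixed shape, the fallback to "HUGO gene ID" is only a heuristic.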
Comparative structure model viewing
RasMol scripts are provided for viewing the location of SNPs on comparative structure models. On UNIX, this script can be used as a RasMol viewer.
The models are displayed in ribbon (cartoon) format with the SNP position in spacefill.
SVM disease predictions
Predicted disease-associated with high confidence
Predicted disease-associated with low confidence
Insufficient confidence to predict
Predicted neutral with low confidence
Predicted neutral with high confidence
Disease-associated nsSNPs are predicted by a support vector machine (SVM) trained on OMIM amino-acid variants and putatively neutral nsSNPs from dbSNP.
What do the confidence ratings mean?
Excluding the "low confidence" predictions slightly improves the overall accuracy, false positive fraction and false negative fraction of our method.
The method was evaluated on a benchmark set of 1457 disease-associated missense mutations and 2504 putatively neutral missense polymorphisms. Using a three-fold cross-validation protocol, we measured the overall accuracy, false positive fraction (fraction of disease-associated missense mutations that are incorrectly classified), and false negative fraction (fraction of neutral missense polymorphisms that are incorrectly classified). Error bars were obtained by repeating the protocol ten times.
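The three quantities defined above can be computed from paired true and predicted labels. The sketch below follows this page's definitions, in which the false positive fraction is taken over the disease-associated set and the false negative fraction over the neutral set. The function name and the label strings "disease" and "neutral" are illustrative assumptions; this is not the evaluation code used for the benchmark.

```python
def benchmark_fractions(true_labels, predicted_labels):
    """Compute accuracy and error fractions as defined on this page.

    Labels are the strings "disease" or "neutral" (an assumed encoding).
    """
    disease_total = disease_wrong = 0
    neutral_total = neutral_wrong = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == "disease":
            disease_total += 1
            disease_wrong += predicted != truth
        else:
            neutral_total += 1
            neutral_wrong += predicted != truth
    if disease_total == 0 or neutral_total == 0:
        raise ValueError("both classes must be present in the benchmark set")
    total = disease_total + neutral_total
    return {
        "accuracy": (total - disease_wrong - neutral_wrong) / total,
        # Fraction of disease-associated variants that are misclassified.
        "false_positive_fraction": disease_wrong / disease_total,
        # Fraction of neutral variants that are misclassified.
        "false_negative_fraction": neutral_wrong / neutral_total,
    }
```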
Confidence is measured by the distance of the SVM's discriminant score from 0. The scores range from -1.77 (for the most confidently predicted disease-associated nsSNPs) to 1.71 (for the most confidently predicted neutral nsSNPs).
We have found empirically that if only predictions whose scores have an absolute value greater than 0.2 are accepted, the SVM is slightly more accurate than if all predictions are accepted. In our benchmark tests, this threshold results in accuracy of 0.805 (error 0.003), false positive fraction of 0.197 (error 0.002), and false negative fraction of 0.187 (error 0.008).
If we accept all SVM predictions, regardless of confidence, accuracy is slightly degraded to 0.782 (error 0.003), with false positive fraction 0.214 (error 0.003), and false negative fraction 0.235 (error 0.007).
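The accept/reject rule described above can be sketched as a small function. Negative scores indicate disease-associated predictions and positive scores indicate neutral ones, as stated above, and 0.2 is the empirical cutoff. The function name and return strings are illustrative assumptions. The cutoffs that separate the five confidence categories listed on this page are not given here, so the sketch only distinguishes accepted predictions from low-confidence ones.

```python
CONFIDENCE_THRESHOLD = 0.2  # empirical cutoff quoted on this page


def classify_score(score, threshold=CONFIDENCE_THRESHOLD):
    """Map an SVM discriminant score to an accepted call or "low confidence"."""
    if abs(score) <= threshold:
        # Too close to the decision boundary at 0 to be accepted.
        return "low confidence"
    # Negative scores are disease-associated, positive scores are neutral.
    return "disease-associated" if score < 0 else "neutral"
```

For example, `classify_score(-1.77)` returns `"disease-associated"` and `classify_score(0.1)` returns `"low confidence"`.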
For details of our feature selection and prediction method see:
R. Karchin, M. Diekhans, L. Kelly, D. Thomas, U. Pieper, N. Eswar, D. Haussler, A. Sali "LS-SNP: Large-scale annotation of coding non-synonymous SNPs based on multiple information sources" Bioinformatics. 2005 Jun 15;21(12):2814-20. [Epub 2005 Apr 12]
R. Karchin, L. Kelly, A. Sali "Improving functional annotation of non-synonomous SNPs with information theory" Pacific Symposium on Biocomputing 2005 World Scientific
Linking to LS-SNP
You can link directly to LS-SNP from a web page or external program by constructing URL query strings.
The query strings may contain either a genomic range or an arbitrary number of genes, SwissProt/TrEMBL IDs, dbSNP rsIDs, or KEGG pathway IDs. Mixed ID type queries (such as querying for several HUGO gene IDs and several dbSNP rsIDs) are not supported.
Query URL strings should have one of the following formats:
Query by genomic range:
https://salilab.org/LS-SNP-cgi/LS_SNP_query.pl?RequestType=QueryByRange&idtype=IDTYPE&Range=RANGE&PropertySelect=PROPERTY
Query by id list:
https://salilab.org/LS-SNP-cgi/LS_SNP_query.pl?idvalue=IDVALUE&RequestType=QueryById&idtype=IDTYPE&PropertySelect=PROPERTY
Examples of valid ID types
Example url query strings
Supported values of IDTYPE and PROPERTY
IDTYPE: rsID, geneID, sprotID, keggID
PROPERTY: Protein_structure, Protein_sequence, Functional, Genomic_sequence
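The two URL formats above can be assembled programmatically. The following sketch uses Python's standard `urllib.parse.urlencode` and checks IDTYPE and PROPERTY against the supported values listed above. Two points are assumptions rather than documented behavior: multiple IDs are joined with commas (as in the examples of valid IDs on this page), and the RANGE value is passed through unchanged because its format is not specified here. The function names are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://salilab.org/LS-SNP-cgi/LS_SNP_query.pl"
ID_TYPES = {"rsID", "geneID", "sprotID", "keggID"}
PROPERTIES = {"Protein_structure", "Protein_sequence", "Functional", "Genomic_sequence"}


def _check(id_type, prop):
    """Reject values that are not in the supported IDTYPE/PROPERTY lists."""
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported IDTYPE: {id_type}")
    if prop not in PROPERTIES:
        raise ValueError(f"unsupported PROPERTY: {prop}")


def build_id_query_url(id_type, ids, prop):
    """Build a QueryById URL; all IDs must be of the single type id_type."""
    _check(id_type, prop)
    params = {
        "idvalue": ",".join(ids),  # comma separator is an assumption
        "RequestType": "QueryById",
        "idtype": id_type,
        "PropertySelect": prop,
    }
    return BASE_URL + "?" + urlencode(params, safe=",")


def build_range_query_url(id_type, range_value, prop):
    """Build a QueryByRange URL; range_value is passed through as given."""
    _check(id_type, prop)
    params = {
        "RequestType": "QueryByRange",
        "idtype": id_type,
        "Range": range_value,
        "PropertySelect": prop,
    }
    return BASE_URL + "?" + urlencode(params)
```

For instance, `build_id_query_url("rsID", ["rs273258", "rs987495"], "Protein_structure")` yields a URL in the "query by id list" format with both rsIDs in the `idvalue` parameter.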
