From sequence to structure
![Page 1: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/1.jpg)
From sequence to structure
Mauro Fasano
Dipartimento di Biologia Strutturale e Funzionale, Centro di Neuroscienze
Università dell’Insubria – Busto Arsizio
http://fisio.dipbsf.uninsubria.it/cns/fasano
![Page 2: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/2.jpg)
From sequence to structure
O2
VLSEGEWQLVLV . . .
Sequence → Structure → Function
![Page 3: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/3.jpg)
What information does the structure provide?
• Conformation of active and binding sites
• Orientation of conserved residues
• Interpretation of mechanisms
• Visualization of cavities
• Calculation of the electrostatic potential
• …
![Page 4: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/4.jpg)
Example
• FtsZ – cell division in prokaryotes, mitochondria and chloroplasts.
• Tubulin – structural component of microtubules – intracellular communication and cell division.
• FtsZ and tubulin have low sequence similarity and would not appear to be homologous.
![Page 5: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/5.jpg)
Burns, R., Nature 391:121-123
Picture from E. Nogales
![Page 6: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/6.jpg)
Are FtsZ and tubulin homologous?
• Proteins that have conserved their three-dimensional structure may derive from a common ancestor even when sequence divergence no longer allows the homology to be recognized.
![Page 7: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/7.jpg)
Another example
• α-lactalbumin and lysozyme have:
– The same fold
– Moderate similarity
– Different function
![Page 8: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/8.jpg)
Experimental methods:
• X-ray diffraction
• Nuclear magnetic resonance
![Page 9: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/9.jpg)
X-ray crystallography
• Obtain crystals of the protein
– 0.3-1.0 mm
– The individual molecules are arranged in a periodic, repeating order.
• The structure is determined from the diffraction data.
![Page 10: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/10.jpg)
Image from http://www-structure.llnl.gov/Xray/101index.html
![Page 11: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/11.jpg)
Schmid, M. Trends in Microbiology, 10:s27-s31.
![Page 12: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/12.jpg)
X-ray crystallography
• Proteins must crystallize
– Large quantity
– Soluble
• Access to a suitable radiation source
• Computing time to solve the structure
![Page 13: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/13.jpg)
Nuclear Magnetic Resonance (NMR)
• Proteins in solution
• Size limit ~ 40 kDa
• Proteins must remain stable for a long time
• Labelling with 15N, 13C, 2H
• Very expensive instrumentation
• Time needed to assign the resonances
![Page 14: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/14.jpg)
![Page 15: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/15.jpg)
The Protein Data Bank
![Page 16: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/16.jpg)
![Page 17: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/17.jpg)
![Page 18: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/18.jpg)
Growth of the PDB
![Page 19: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/19.jpg)
Structural motifs deposited each year
![Page 20: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/20.jpg)
Percentage of new structural motifs
![Page 21: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/21.jpg)
HEADER BINDING PROTEIN 01-JUN-95 1HXN 1HXN 2
COMPND MOL_ID: 1; 1HXN 3
COMPND 2 MOLECULE: HEMOPEXIN; 1HXN 4
COMPND 3 CHAIN: NULL; 1HXN 5
COMPND 4 DOMAIN: C-TERMINAL DOMAIN; 1HXN 6
COMPND 5 SYNONYM: HPX; 1HXN 7
COMPND 6 HETEROGEN: PO4 1HXN 8
SOURCE MOL_ID: 1; 1HXN 9
SOURCE 2 ORGANISM_SCIENTIFIC: ORYCTOLAGUS CUNICULUS; 1HXN 10
SOURCE 3 ORGANISM_COMMON: RABBIT; 1HXN 11
SOURCE 4 TISSUE: SERUM 1HXN 12
KEYWDS HEME 1HXN 13
EXPDTA X-RAY DIFFRACTION 1HXN 14
AUTHOR H.R.FABER,E.N.BAKER 1HXN 15
REVDAT 1 15-OCT-95 1HXN 0 1HXN 16
JRNL AUTH H.R.FABER,C.R.GROOM,H.BAKER,W.MORGAN,A.SMITH, 1HXN 17
JRNL AUTH 2 E.N.BAKER 1HXN 18
JRNL TITL 1.8 ANGSTROMS CRYSTAL STRUCTURE OF THE C-TERMINAL 1HXN 19
JRNL TITL 2 DOMAIN OF RABBIT SERUM HEMOPEXIN 1HXN 20
JRNL REF TO BE PUBLISHED 1HXN 21
JRNL REFN 0353 1HXN 22
REMARK 1 1HXN 23
![Page 22: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/22.jpg)
ATOM 1 CA GLU 225 -0.900 -1.002 39.233 1.00 70.00 1HXN 170
ATOM 2 C GLU 225 -0.185 0.146 39.970 1.00 70.00 1HXN 171
ATOM 3 O GLU 225 -0.514 1.329 39.758 1.00 70.00 1HXN 172
ATOM 4 N SER 226 0.788 -0.203 40.823 1.00 70.00 1HXN 173
ATOM 5 CA SER 226 1.534 0.805 41.594 1.00 70.00 1HXN 174
ATOM 6 C SER 226 2.231 1.806 40.681 1.00 68.89 1HXN 175
ATOM 7 O SER 226 1.883 1.952 39.514 1.00 70.00 1HXN 176
ATOM 8 CB SER 226 2.572 0.130 42.515 1.00 70.00 1HXN 177
ATOM 9 OG SER 226 3.237 -0.941 41.848 1.00 70.00 1HXN 178
ATOM 10 N THR 227 3.242 2.478 41.223 1.00 65.51 1HXN 179
ATOM 11 CA THR 227 3.989 3.417 40.410 1.00 70.00 1HXN 180
ATOM 12 C THR 227 4.274 2.705 39.080 1.00 56.25 1HXN 181
ATOM 13 O THR 227 4.179 3.296 38.022 1.00 44.63 1HXN 182
ATOM 14 CB THR 227 5.354 3.797 41.074 1.00 70.00 1HXN 183
ATOM 15 OG1 THR 227 5.114 4.682 42.172 1.00 70.00 1HXN 184
ATOM 16 CG2 THR 227 6.256 4.492 40.065 1.00 70.00 1HXN 185
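The ATOM records above can be read with a few lines of Python. This is only a sketch: it assumes the whitespace-separated layout printed on the slide (serial, atom name, residue name, residue number, x, y, z, occupancy, B-factor). Real PDB files use fixed columns and usually carry a chain identifier, so a column-based parser is safer there. The `Atom` type and `parse_atom_line` are illustrative names, not part of any library.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Atom:
    """Parse one whitespace-separated ATOM record as printed on the slide."""
    f = line.split()
    if f[0] != "ATOM":
        raise ValueError("not an ATOM record")
    # Assumed field order: serial, name, residue, residue number, x, y, z, occ, B.
    return Atom(int(f[1]), f[2], f[3], int(f[4]),
                float(f[5]), float(f[6]), float(f[7]),
                float(f[8]), float(f[9]))


records = """\
ATOM 1 CA GLU 225 -0.900 -1.002 39.233 1.00 70.00 1HXN 170
ATOM 4 N SER 226 0.788 -0.203 40.823 1.00 70.00 1HXN 173
ATOM 5 CA SER 226 1.534 0.805 41.594 1.00 70.00 1HXN 174
"""
atoms = [parse_atom_line(line) for line in records.splitlines()]
# Keep only the alpha carbons, e.g. to trace the backbone.
ca_atoms = [a for a in atoms if a.name == "CA"]
```

Selecting the CA atoms is a common first step before comparing or superposing structures.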
![Page 23: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/23.jpg)
http://www.expasy.ch/spdbv
Run
![Page 24: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/24.jpg)
Protein classification:
• SCOP (Structural Classification of Proteins, scop.mrc-lmb.cam.ac.uk/scop/, Murzin et al.):
548 folds (major structural similarity in terms of secondary structures, e.g. globin-like, Rossmann fold); 1296 families (clear evolutionary relationship or homology, e.g. globins, Ras)
• CATH (Class, Architecture, Topology, Homologous Superfamily, www.biochem.ucl.ac.uk/bsm/cath/, Orengo et al.):
35 architectures (gross arrangement of secondary structures, e.g. non-bundle, sandwich); 580 topologies (connectivity of secondary structures, e.g. globin-like, Rossmann fold); 1846 families (clear homology, same function)
![Page 25: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/25.jpg)
Structural Classification Of Proteins
![Page 26: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/26.jpg)
Prediction of secondary and tertiary structure
![Page 27: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/27.jpg)
Predictive methods
» Comparative modeling: > 30% similarity
» Threading/Fold recognition: 0 – 30% similarity
» Ab initio: no homolog
![Page 28: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/28.jpg)
![Page 29: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/29.jpg)
Quality of the comparative model
Sequence identity:
60-100%: comparable to medium-resolution NMR; substrate specificity
30-60%: molecular replacement in crystallography; starting point for site-directed mutagenesis
<30%: serious errors
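A minimal sketch of how the identity ranges above could be applied to a pairwise alignment. It assumes identity is counted over columns where neither sequence has a gap (other definitions of the denominator exist), and the two aligned fragments are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def expected_quality(identity: float) -> str:
    """Map sequence identity to the quality ranges listed on the slide."""
    if identity >= 60.0:
        return "comparable to medium-resolution NMR"
    if identity >= 30.0:
        return "usable for molecular replacement or to plan mutagenesis"
    return "serious errors likely"


# Hypothetical target/template pair: 8 identical residues out of 10 columns.
identity = percent_identity("VLSEGEWQLV", "VLSAGEWELV")
quality = expected_quality(identity)
```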
![Page 30: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/30.jpg)
Building by homology (Homology modelling)
[Figure: alignment columns of the target sequence against proteins of known structure]
Alignment with proteins of known structure
Structural model
![Page 31: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/31.jpg)
Fold recognition (Threading)
Sequence: SLVAYGAAM
+ Known structural motifs
Structural model
![Page 32: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/32.jpg)
Ab initio
Sequence: SLVAYGAAM
Structural model
![Page 33: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/33.jpg)
General Flowchart
![Page 34: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/34.jpg)
Building by homology
A very large number of polypeptides fold into a finite (and relatively small) number of folds.
At least one in two of the proteins in the database has a homolog (identity > 30%) that almost always has the same fold.
![Page 35: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/35.jpg)
Building the comparative model
1) Search for the largest possible number of homologs that have an entry in the PDB. Tools that use a PSSM (position-specific scoring matrix) are more sensitive. In this case, sequences without a known structure are also used to build the PSSM.
2) Build an accurate multiple alignment between the sequence to be modelled and all the entries that will be used as templates.
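To make the PSSM idea concrete, here is a minimal sketch that turns an ungapped multiple alignment into per-column log-odds scores. It assumes a uniform background frequency (1/20) and a simple pseudocount, which is a simplification of what profile search tools actually do. The toy alignment and the function names are illustrative only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict of log-odds scores (bits) per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_sequence(pssm, seq):
    """Sum the column scores of an ungapped sequence of the same length."""
    return sum(col[aa] for col, aa in zip(pssm, seq))


# Toy alignment of four short homologous fragments (invented).
pssm = build_pssm(["AKAY", "ASGF", "ATGF", "AAGY"])
```

A sequence resembling the family scores higher than an unrelated one, which is why profile searches can find homologs that a single-sequence search misses.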
![Page 36: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/36.jpg)
Find structures of proteins with a similar sequence
Alignment
Structural model
Verification
OK!
![Page 37: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/37.jpg)
Building the model itself
Determine the secondary structure from the alignment.
Build the conserved regions. For each region, the coordinates can be taken from the fragment with the highest sequence similarity.
Build the variable regions, usually loops.
![Page 38: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/38.jpg)
Loop construction:
Using collections of loops observed in known structures, selected by their length and sequence.
Building the loop conformation ab initio: many random conformations are generated and their energy is computed in a suitable force field.
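The ab initio route can be sketched as random sampling of backbone dihedral angles followed by an energy evaluation. The energy function below is a placeholder, not a force field, and the sketch ignores the requirement that both loop ends must connect to the rest of the model. All names are hypothetical.

```python
import math
import random


def toy_energy(conformation):
    """Placeholder energy: lowest when every (phi, psi) is near (-60, -45) degrees."""
    return sum(
        (1 - math.cos(math.radians(phi + 60))) + (1 - math.cos(math.radians(psi + 45)))
        for phi, psi in conformation
    )


def sample_loop(n_residues, n_trials, energy=toy_energy, seed=0):
    """Generate random loop conformations and return the lowest-energy one."""
    rng = random.Random(seed)
    best, best_e = None, float("inf")
    for _ in range(n_trials):
        conf = [(rng.uniform(-180, 180), rng.uniform(-180, 180))
                for _ in range(n_residues)]
        e = energy(conf)
        if e < best_e:
            best, best_e = conf, e
    return best, best_e


best_conf, best_energy = sample_loop(n_residues=5, n_trials=2000)
```

In practice the energy callable would be replaced by a real force field and candidates that do not close the loop would be discarded.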
![Page 39: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/39.jpg)
![Page 40: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/40.jpg)
![Page 41: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/41.jpg)
Some web sites for homology modeling
COMPOSER – felix.bioccam.ac.uk/soft-base.html
MODELLER – guitar.rockefeller.edu/modeller/modeller.html
WHAT IF – www.sander.embl-heidelberg.de/whatif/
SWISS-MODEL – www.expasy.ch/SWISS-MODEL.html
![Page 42: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/42.jpg)
![Page 43: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/43.jpg)
Swiss-Model
http://www.expasy.ch/swissmod/SWISS-MODEL.html
![Page 44: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/44.jpg)
![Page 45: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/45.jpg)
Modeller
http://guitar.rockefeller.edu/modeller/about_modeller.shtml
Advanced program for homology modeling
Based on distance constraints
Implemented in several popular modelling packages such as InsightII
The source is available for unix platforms at the above URL
![Page 46: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/46.jpg)
Threading (fold recognition)
The input sequence is compared with a library of known folds.
A score is computed that expresses the compatibility between the sequence and each fold considered.
Statistically significant scores indicate that the sequence has a certain probability of adopting the same 3D structure as the fold considered.
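One common way to judge significance is to convert raw compatibility scores into Z-scores. The sketch below standardizes each score against the scores of all folds in the library. This is an assumption about the reference distribution (servers may instead use shuffled sequences), and the fold names and values are made up.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Map fold name -> Z-score of its raw compatibility score."""
    values = list(raw_scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    return {fold: (s - mu) / sigma for fold, s in raw_scores.items()}


# Hypothetical raw scores of one sequence threaded onto five folds.
scores = {"fold_A": 20.0, "fold_B": 5.0, "fold_C": -2.0, "fold_D": 1.0, "fold_E": 3.0}
z = z_scores(scores)
best_fold = max(z, key=z.get)
```

A fold whose Z-score stands well above the rest is the candidate template. What counts as "well above" depends on the method and its calibration.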
![Page 47: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/47.jpg)
Input:
Sequence
Legend: H-bond donor, H-bond acceptor, Gly, hydrophobic
Collection of folds of known proteins
![Page 48: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/48.jpg)
Input:
Sequence
Legend: H-bond donor, H-bond acceptor, Gly, hydrophobic
Collection of folds of known proteins
![Page 49: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/49.jpg)
S = 20, S = 5, S = -2
Z = 5, Z = 1.5, Z = -1
Legend: H-bond donor, H-bond acceptor, Gly, hydrophobic
![Page 50: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/50.jpg)
Chain/Domain Library
![Page 51: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/51.jpg)
Scoring functions for fold recognition
There are two methods for evaluating sequence-structure (1D-3D) compatibility.
In methods based on structural profiles, a profile is built for every fold from the structural features of the fold and the compatibility of each amino acid with those features.
The structural features of each position are determined from the combination of secondary structure, solvent accessibility and the properties of the local environment (hydrophobic/hydrophilic).
The profile is a defined mathematical structure, suited to pairwise comparison and dynamic programming.
![Page 52: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/52.jpg)
[Table: profile scores by position on sequence (rows 1 … N) and amino acid type (columns A, C, D, … Y, plus Gop and Gext)]
![Page 53: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/53.jpg)
Contact potentials
This method is based on predefined tables that assign a pseudo-energetic score to each pairwise interaction between two amino acids.
It uses a distance matrix to represent the different folds.
For each pair of amino acids that are close in space, the interaction energy is added to a running sum. The total indicates how well the sequence fits that structure.
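The summation described above can be sketched as follows. The pair table values, the 8 Å cutoff and the minimum sequence separation are illustrative assumptions, not parameters of any published potential. Distances are computed on the fly here rather than read from a precomputed distance matrix.

```python
import math


def contact_energy(sequence, coords, pair_table, cutoff=8.0, min_separation=3):
    """Sum pairwise pseudo-energies over residue pairs closer than `cutoff`."""
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    total = 0.0
    for i in range(len(sequence)):
        # Skip near neighbours along the chain, which are always close in space.
        for j in range(i + min_separation, len(sequence)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pair = frozenset((sequence[i], sequence[j]))
                total += pair_table.get(pair, 0.0)
    return total


# Invented pseudo-energies: negative values favour the contact.
toy_table = {
    frozenset("L"): -1.0,   # L-L
    frozenset("LV"): -0.8,  # L-V
    frozenset("KE"): -0.5,  # K-E
    frozenset("K"): 0.6,    # K-K
}
# One representative point per residue of a hypothetical five-residue fold.
fold_coords = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (7.6, 5, 0), (0, 5, 0)]
energy = contact_energy("LKVEL", fold_coords, toy_table)
```

Threading the same sequence onto different folds and comparing the totals gives the ranking used for fold recognition.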
![Page 54: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/54.jpg)
Scoring Function
…YKLILNGKTKGETTTEAVDAATAEKVFQYANDNGVDGEW…
Alignment quality at a given position: E_m (mutation term)
Propensity to be in a given environment: E_s (singleton term)
Propensity to be close together: E_p (pairwise term)
Alignment gap penalty: E_g
Total energy: E_m + E_p + E_s + E_g
Describes how closely the sequence resembles the template
![Page 55: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/55.jpg)
[Figure: matrix with amino acid index (1 … N) on both axes]
![Page 56: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/56.jpg)
Expected Performance
Predicted model
X-ray structure
PROSPECT prediction in CASP4: 12 out of 19 folds (no homology) recognized
![Page 57: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/57.jpg)
Web sites for fold recognition
3D-PSSM - http://www.bmm.icnet.uk/~3dpssm
Profiles:
Libra I - http://www.ddbj.nig.ac.jp/htmls/E-mail/libra/LIBRA_I.html
UCLA DOE - http://www.doe-mbi.ucla.edu/people/frsvr/frsvr.html
Contact potentials:
123D - http://www-Immb.ncifcrf.gov/~nicka/123D.html
Profit - http://lore.came.sbg.ac.at/home.html
![Page 58: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/58.jpg)
Results
![Page 59: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/59.jpg)
Ab initio methods for modelling
This field is of great theoretical interest but has so far found very few practical applications. It uses neither sequence alignments nor, directly, known structures.
The basic idea is to build an empirical function that simulates the real physical forces and the potentials of chemical contacts.
If we had a perfect function and could scan all possible conformations, we would be able to identify the correct fold.
![Page 60: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/60.jpg)
Algorithms for ab initio prediction include:
A. A searching procedure that scans many possible structures (conformations)
B. A scoring function to evaluate and rank the structures
Because the search space is so large, heuristic methods are usually applied.
The parameters of the searching procedure are the dihedral angles, which specify the exact fold of the polypeptide chain.
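As one example of such a heuristic, the sketch below runs a Metropolis Monte Carlo search over a list of dihedral angles. The scoring function is a placeholder with a known minimum, standing in for a real energy function. The temperature, step size and function names are assumptions chosen for illustration.

```python
import math
import random


def toy_score(angles):
    """Placeholder scoring function; a real one would model physical interactions."""
    return sum(1 - math.cos(math.radians(a + 60)) for a in angles)


def monte_carlo_search(n_angles, n_steps, score=toy_score, temperature=0.5,
                       max_step=30.0, seed=0):
    """Metropolis search over dihedral angles (degrees); returns best angles and score."""
    rng = random.Random(seed)
    current = [rng.uniform(-180, 180) for _ in range(n_angles)]
    current_s = score(current)
    best, best_s = list(current), current_s
    for _ in range(n_steps):
        trial = list(current)
        k = rng.randrange(n_angles)
        # Perturb one angle and wrap it back into [-180, 180).
        trial[k] = (trial[k] + rng.uniform(-max_step, max_step) + 180) % 360 - 180
        trial_s = score(trial)
        delta = trial_s - current_s
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_s = trial, trial_s
            if current_s < best_s:
                best, best_s = list(current), current_s
    return best, best_s


best_angles, best_score = monte_carlo_search(n_angles=10, n_steps=5000)
```

Accepting some uphill moves lets the search escape local minima, which plain downhill minimization cannot do.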
![Page 61: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/61.jpg)
[Figure: diagram with elements labeled A to E]
![Page 62: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/62.jpg)
[Figure: diagram with elements labeled A to E]
![Page 63: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/63.jpg)
New Fold Methods
• Since almost all predictors use sequence and structural databases in some form, there is no longer an “ab initio” category
• Assessment is sometimes difficult to communicate due to the complexity of the protein structure and completeness of prediction
• Methods are still somewhat limited to smaller proteins
![Page 64: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/64.jpg)
Rosetta (David Baker)
• Based on the assumption that the distribution of conformations sampled by a local segment of the polypeptide chain is reasonably approximated by the distribution of structures adopted by that sequence and closely related sequences in known protein structures.
• Fragment libraries for all possible three and nine residue segments of the chain are extracted from PDB by profile methods
![Page 65: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/65.jpg)
Rosetta: Simulation Procedure
• Information on fragments from secondary structure prediction methods is compiled and scored using an equation for local secondary structure propensity
• The conformational space defined by these fragments is then searched by a Monte Carlo procedure with an energy function that favors compact structures with paired beta strands and buried hydrophobic residues; the procedure is refined late in the simulation
• Thousands of structures are generated
• Filters remove bad structures
• The remaining structures are clustered and the cluster center is taken as the prediction.
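The final clustering step can be sketched with a simple neighbour count: the structure with the most neighbours within an RMSD threshold is taken as the center of the largest cluster. This is a generic illustration, not Rosetta's actual code. It assumes the decoys are already superposed and uses tiny invented coordinate sets.

```python
import math


def rmsd(a, b):
    """RMSD between two equally long coordinate lists, assumed already superposed."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def largest_cluster_center(structures, threshold):
    """Return (index of center, indices of its neighbours within `threshold`)."""
    best_idx, best_members = None, []
    for i, s in enumerate(structures):
        members = [j for j, t in enumerate(structures) if rmsd(s, t) <= threshold]
        if len(members) > len(best_members):
            best_idx, best_members = i, members
    return best_idx, best_members


# Four toy "decoys" of two atoms each; the first three are nearly identical.
decoys = [
    [(0, 0, 0), (1, 0, 0)],
    [(0, 0.1, 0), (1, 0.1, 0)],
    [(0, -0.1, 0), (1, -0.1, 0)],
    [(5, 5, 5), (6, 5, 5)],
]
center, members = largest_cluster_center(decoys, threshold=0.15)
```

The rationale is that conformations visited repeatedly by independent simulations are more likely to be near the native structure than isolated low-scoring outliers.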
![Page 66: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/66.jpg)
Methods to evaluate structures are based on:
Force fields: collections of terms that simulate the forces acting between atoms
Terms based on the probability of finding pairs of amino acids or atoms within specific distances
Terms based on surface area and on the overlapping volume of spheres representing atoms
![Page 67: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/67.jpg)
Side chain construction
In homology modelling, the side chains are built from the template structures when there is high similarity between the protein being built and the templates.
Without such similarity, the construction can be done using rotamer libraries.
Despite the huge size of the problem (each side chain influences its neighbours), there are quite successful algorithms for it.
A compromise between the probability of the rotamer and its fitness in the specific position determines the score. Comparing the scores of all the rotamers for a given amino acid determines the preferred rotamer.
![Page 68: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/68.jpg)
In this work we examined differences in the structures of amino acid side chains around point mutations.
Phe
Asn
Conformation - a given set of dihedral angles that defines a structure.
Rotamer - an energetically favourable conformation.
![Page 69: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/69.jpg)
Example of a rotamer library
SER 59.6 41.0
SER -62.5 26.4
SER 179.6 32.6
TYR 63.6 90.5 21.0
TYR 68.5 -89.6 16.4
TYR 170.7 97.8 13.3
TYR -175.0 -100.7 20.0
TYR -60.1 96.6 10.0
TYR -63.0 -101.6 19.3
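A sketch of how such a library could drive rotamer selection. It assumes the columns are the side-chain dihedral angles (χ1, and χ2 for TYR) in degrees followed by the observed frequency in percent, which is consistent with each residue's frequencies summing to 100. The score combines the rotamer probability with an environment penalty. The penalty function here is hypothetical and stands in for a real clash or energy term.

```python
import math

# Rows from the slide, assuming chi angles (degrees) followed by frequency (%).
ROTAMERS = {
    "SER": [((59.6,), 41.0), ((-62.5,), 26.4), ((179.6,), 32.6)],
    "TYR": [((63.6, 90.5), 21.0), ((68.5, -89.6), 16.4), ((170.7, 97.8), 13.3),
            ((-175.0, -100.7), 20.0), ((-60.1, 96.6), 10.0), ((-63.0, -101.6), 19.3)],
}


def choose_rotamer(residue, environment_penalty, weight=1.0):
    """Pick the rotamer minimising -ln(frequency) + weight * environment penalty."""
    def score(entry):
        chis, percent = entry
        return -math.log(percent / 100.0) + weight * environment_penalty(chis)
    return min(ROTAMERS[residue], key=score)[0]


# Hypothetical environment term: pretend chi1 near +60 degrees clashes with a neighbour.
def clash_near_plus_60(chis):
    return 5.0 if abs(chis[0] - 60.0) < 30.0 else 0.0


no_clash = choose_rotamer("SER", lambda chis: 0.0)
with_clash = choose_rotamer("SER", clash_near_plus_60)
```

Without an environment term the most frequent rotamer wins. With the clash penalty the choice shifts to the next most probable rotamer that fits.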
![Page 70: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/70.jpg)
![Page 71: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/71.jpg)
Model evaluation
After the model is built, it can be checked by various methods.
The main approaches to model evaluation are:
A. Use of internal information (such as that used for the model construction)
B. Use of external information derived from the databases
If the model turns out to be bad, several stages of the model building must be repeated.
![Page 72: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/72.jpg)
![Page 73: Dalla sequenza alla struttura](https://reader035.vdocumenti.com/reader035/viewer/2022081502/56815b7b550346895dc977b4/html5/thumbnails/73.jpg)
Algorithms are usually checked by building models of proteins whose structure has already been solved and comparing the model with the native structure.
It is always possible that information from the native structure is used, directly or indirectly, in building the model.
A more objective test is to predict structures before they are publicly released (this is the idea of the CASP competitions).
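Comparing a model with the native structure usually comes down to a distance measure over equivalent atoms. The sketch below computes the Cα RMSD and the fraction of Cα atoms within a distance cutoff. It assumes the two structures are already superposed and list the same residues in the same order. The coordinates are invented and the 2 Å cutoff is an arbitrary choice.

```python
import math


def ca_rmsd(model, native):
    """Root-mean-square deviation between matched, pre-superposed CA coordinates."""
    if len(model) != len(native):
        raise ValueError("model and native must list the same residues")
    return math.sqrt(sum(math.dist(m, n) ** 2 for m, n in zip(model, native)) / len(model))


def fraction_within(model, native, cutoff):
    """Fraction of CA atoms whose model position lies within `cutoff` of the native one."""
    close = sum(1 for m, n in zip(model, native) if math.dist(m, n) <= cutoff)
    return close / len(model)


# Invented four-residue example: two residues match exactly, two deviate.
model_ca = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
native_ca = [(0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 0, 3)]
overall = ca_rmsd(model_ca, native_ca)
coverage = fraction_within(model_ca, native_ca, cutoff=2.0)
```

RMSD is dominated by the worst-placed residues, so reporting the fraction within a cutoff alongside it gives a fairer picture of a model that is mostly right.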