The objective of the TP is to use molecular modeling to study the specific recognition between aspartyl-tRNA synthetase and its substrate Asp. We will evaluate the specificity by comparing the binding of the ligands Asp and Asn. We will then seek to identify and model active-site mutations that could favor the binding of Asn over Asp. This is a first step towards engineering the genetic code.
Aminoacyl-tRNA synthetases (aaRS) are a family of enzymes involved in protein synthesis. They take part in translation by attaching an amino acid to its transfer RNA. Each one is highly specific for its amino acid and the corresponding transfer RNA, so there is one for each amino acid.
We will focus on aspartyl-tRNA synthetase (AspRS). The goal is to introduce point mutations into this enzyme to reduce its affinity for its natural ligand, aspartate, and to favor the binding of asparagine.
To do so, we will consider the problem in terms of protein sequences and structures. The study includes three steps:
This analysis will lead us to propose judicious mutations of the active site that modify the AspRS specificity by favoring Asn binding over Asp.
The E. coli AspRS has three domains: the tRNA anticodon binding domain, the catalytic site domain, and a third one inserted into the catalytic site domain.
Launch a BLAST search. What types of proteins are found?
Identify strongly conserved regions that can correspond to the active site. Choose some positions that seem distinctive of the AspRS and of Asp binding.
Which strategy did you use? Which mutations do you propose to modify the AspRS affinity for aspartate and asparagine?
Using the information obtained previously, propose judicious mutations to modify the AspRS affinity for aspartate and asparagine. We will test several of them in the modeling step that follows.
Does examining the active site lead you to revise the mutations you proposed from the sequences?
Can we use the structure to verify the sequence alignment?
This is the most ambitious and complex part of the TP. There are two steps:
We will follow this protocol with the XPLOR program:
The PDB files must comply with a particular format to be readable by XPLOR. The segment name has to be written as 4 characters in columns 73-76. Note also that the 3-letter code of histidines has been changed from HIS to HIE. Histidine has 3 possible protonation states, and XPLOR must be told which one is chosen among HID, HIE, and HIP (see the topology file amber.rtf for the definition of these states).
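The two formatting rules above can be illustrated with a short Python sketch. It is not the tool used in this TP: the function name to_xplor_line and its default arguments are hypothetical, and the column positions follow the standard fixed-width PDB layout (residue name in columns 18-20, segment name in columns 73-76).

```python
def to_xplor_line(line, segid="PROT", his_state="HIE"):
    """Return a PDB record with the segment name written in columns 73-76.

    Hypothetical helper for illustration only. Records other than
    ATOM/HETATM are returned unchanged.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return line
    # Pad to 80 columns so that slicing by column is always safe.
    rec = line.rstrip("\n").ljust(80)
    # Residue name: columns 18-20, i.e. the 0-based slice 17:20.
    resname = rec[17:20]
    if resname == "HIS":
        resname = his_state  # HID, HIE, or HIP
    # Segment name: columns 73-76, i.e. the 0-based slice 72:76.
    return rec[:17] + resname + rec[20:72] + segid.ljust(4)[:4] + rec[76:]


# Build an example record field by field to keep the columns exact.
example = ("ATOM".ljust(6) + "1".rjust(5) + " " + " N  " + " " + "HIS"
           + " " + "A" + "10".rjust(4) + "    "
           + "  11.104   6.134  -6.504" + "  1.00  0.00")
print(to_xplor_line(example))
```

Passing his_state="HID" or his_state="HIP" gives a quick way to prepare the alternative protonation states for comparison.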
Does the HIE protonation state chosen for all histidines seem reasonable to you? If in doubt, try other protonation states and evaluate the impact on the results.
xplor < build.inp > build.out
xplor < minimize.inp > minimize.out
xplor < energy.inp > energy.out
What is the affinity of the AspRS protein for the Asp ligand?
What is the affinity of the AspRS for Asn?
Experimentally, the wild-type enzyme binds Asp considerably more strongly than Asn, with a difference in association free energy of more than 7 kcal/mol. Do you find the same trend?
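To get a feel for what 7 kcal/mol means, the free energy difference can be converted into a ratio of association constants with the standard relation K_Asp / K_Asn = exp(ΔΔG / RT). The sketch below assumes a temperature of 300 K, which is not stated for the experiment; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def binding_constant_ratio(ddg_kcal, temperature=300.0):
    """Ratio of association constants for a free energy gap ddg_kcal.

    The temperature (in K) is an assumption made for illustration.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature))


# A gap of 7 kcal/mol corresponds to a ratio on the order of 10^5.
print(f"{binding_constant_ratio(7.0):.3g}")
```

Because the relation is exponential, an error of 1 to 2 kcal/mol in a computed energy difference already changes the predicted ratio by one or more orders of magnitude.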
A simple mutation (for example Asp→Asn or Gln→Glu) can be made by editing the PDB file, as explained for the ligand.
A more complex mutation can be performed with the SCWRL program. The mutation (for example R10K) is specified in the asprs.seq file by replacing the lowercase one-letter code of the native amino acid with the uppercase code of the amino acid chosen for the mutation (for example, replacing the lowercase "r" at position 10 with an uppercase "K").
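The edit of the sequence string can be sketched in Python as follows. The function apply_mutation is hypothetical, and it assumes that the position in the mutation name is the 1-based index in the sequence file, which may differ from the residue numbering of the PDB file.

```python
import re


def apply_mutation(seq, mutation):
    """Apply a point mutation such as 'R10K' to a lowercase sequence.

    The mutated residue is written in uppercase; every other character
    is left untouched. The position is assumed to be a 1-based index
    into seq (an assumption, check it against your file).
    """
    m = re.fullmatch(r"([A-Za-z])(\d+)([A-Za-z])", mutation)
    if m is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    native, pos, new = m.group(1).upper(), int(m.group(2)), m.group(3).upper()
    idx = pos - 1
    if not 0 <= idx < len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    # Guard against off-by-one errors: the native residue must match.
    if seq[idx].upper() != native:
        raise ValueError(f"position {pos} is {seq[idx]!r}, not {native!r}")
    return seq[:idx] + new + seq[idx + 1:]


# Toy sequence with an arginine at position 10.
print(apply_mutation("acdefghikrlm", "R10K"))
```

The check on the native residue is the useful part: it catches a numbering mismatch before SCWRL is run on the wrong position.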
We then launch the SCWRL program as follows:
scwrl -s asprs.seq -i asprs.wt.pdb -o asprs.pdb > scwrl.out
Compare the resulting mutated structure (asprs.pdb) with the native structure (asprs.wt.pdb).
It is recommended to work in a separate folder for each mutant.
Before the structure mutated by SCWRL can be used with XPLOR, it must be correctly formatted. To do so, we use the pdb2xplor program as follows:
pdb2xplor asprs.pdb A PROT > asprs.xplor.pdb
What are the affinities for Asp and Asn obtained with the mutated protein?
Have you succeeded in inverting the specificity?
Give a structural interpretation of the effect of the mutations.
The objective of the TP is to study the structure and stability of a small protein, the Trp-cage.
Trp-cage is a small artificial protein of 20 amino acids that was designed to fold easily. Its amino acid sequence is NLYIQWLKDGGPSSGRPPPS. The protein folding problem is one of the most important challenges of structural bioinformatics. It consists of predicting the three-dimensional structure of a protein from the sequence information alone.
We will employ the methods of molecular mechanics to model the Trp-cage.
xplor < build.inp > build.out
This script builds a model of the Trp-cage with XPLOR and performs an energy minimization to improve the geometry.
Inspect the output file and visualize the structures produced.
xplor < md.inp > md.out
This script runs a molecular dynamics simulation of the Trp-cage for 20 ps, assigning random initial velocities and then maintaining the temperature at 300 K.
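The idea of assigning random initial velocities at a given temperature can be sketched in Python. This is a generic Maxwell-Boltzmann illustration, not the procedure implemented in XPLOR: the function names are hypothetical, and the sketch ignores constraints and the removal of the center-of-mass motion.

```python
import math
import random

K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053906660e-27  # atomic mass unit, kg


def random_velocities(masses_amu, temperature, rng):
    """Draw each velocity component from a Gaussian of variance kT/m."""
    velocities = []
    for m in masses_amu:
        sigma = math.sqrt(K_B * temperature / (m * AMU))
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


def instantaneous_temperature(masses_amu, velocities):
    """Temperature from the kinetic energy, using 3N degrees of freedom."""
    twice_kinetic = sum(
        m * AMU * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses_amu, velocities)
    )
    return twice_kinetic / (3 * len(masses_amu) * K_B)


rng = random.Random(0)
masses = [12.011] * 5000  # toy system: 5000 carbon-like atoms
vel = random_velocities(masses, 300.0, rng)
print(round(instantaneous_temperature(masses, vel), 1))
```

The instantaneous temperature fluctuates around the target, and the relative fluctuation grows as the number of atoms decreases. For a system as small as the Trp-cage, visible temperature fluctuations in the output file are therefore expected.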
Inspect the output file and track the energy and temperature as a function of time.
xplor < traj2mpdb.inp > traj2mpdb.out
This script converts the format of the trajectory produced from DCD to multiple PDB.
We can then visualize the trajectory with PyMOL by loading it as follows:
load md.multi.pdb, multiplex=0
xplor < analyze.inp > analyze.out
This script reads the trajectory produced (md.dcd) and performs structural or energetic calculations at each step. The results are written to a text file (md.dat). Plot them.
The analyses included in the script are only examples. It is your task to add more relevant ones with the help of the XPLOR documentation.
Mimivirus is a giant DNA virus. It is larger than many bacteria, and can itself be infected by other viruses. It was discovered that mimivirus possesses certain genes for proteins involved in translation, absent in other viruses that use the host cell's machinery to multiply. These discoveries have fuelled debate about the boundary between living and inert matter.
Homology modeling aims at building a model of the unknown structure of a target protein, knowing its sequence and the structure of another template protein of homologous sequence. The method can be decomposed into four steps:
The aim of this work is to propose the best possible structural model (criteria to be defined) of mimivirus tyrosyl-tRNA synthetase (its structure is assumed to be unknown) by homology modeling with the Modeller program.
Retrieve the sequence of mimivirus tyrosyl-tRNA synthetase in FASTA format from the UniProt database (http://www.uniprot.org).
Carefully select a structure to be used as a guide for homology modeling (of course, we will not use the structure of mimivirus tyrosyl-tRNA synthetase, which we assume to be unknown). Retrieve this structure in PDB format.
Convert the query sequence from FASTA to the PIR format used by Modeller (http://salilab.org/modeller/manual, File formats, Alignment file (PIR)). Example of a sequence in PIR format:
>P1;TvLDH
sequence:TvLDH::::::::
MSEAAHVLITGAAGQIGYILSHWIASGELYGDRQVYLHLLDIPPAMNRLTALTMELEDCAFPHLAGFVATTDPKA
AFKDIDCAFLVASMPLKPGQVRADLISSNSVIFKNTGEYLSKWAKPSVKVLVIGNPDNTNCEIAMLHAKNLKPEN
FSSLSMLDQNRAYYEVASKLGVDVKDVHDIIVWGNHGESMVADLTQATFTKEGKTQKVVDVLDHDYVFDTFFKKI
GHRAWDILEHRGFTSAASPTKAAIQHMKAWLFGTAPGEVLSMGIPVPEGNPYGIKPGVVFSFPCNVDKEGKIHVV
EGFKVNDWLREKLDFTEKDLFHEKEIALNHLAQGG*
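The conversion can be done by hand, or with a few lines of Python such as the sketch below. The function fasta_to_pir is hypothetical (it is not part of Modeller). It keeps only the first record of the FASTA input and writes a header in the same style as the example above, with the optional fields left empty.

```python
def fasta_to_pir(fasta_text, code):
    """Convert the first record of a FASTA text to a PIR sequence entry.

    code is the short identifier to use in the PIR header; it is chosen
    by the user and must match the name used in the Modeller scripts.
    """
    lines = [ln.strip() for ln in fasta_text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not look like FASTA")
    parts = []
    for ln in lines[1:]:
        if ln.startswith(">"):
            break  # stop at the second record, if any
        parts.append(ln)
    seq = "".join(parts).upper()
    if not seq:
        raise ValueError("empty sequence")
    # Wrap at 75 residues per line, as in the example entry.
    wrapped = [seq[i:i + 75] for i in range(0, len(seq), 75)]
    wrapped[-1] += "*"  # a PIR sequence ends with an asterisk
    header = ">P1;" + code
    fields = "sequence:" + code + ":" * 8
    return "\n".join([header, fields] + wrapped) + "\n"


print(fasta_to_pir(">sp|X00000|TEST toy record\nMSEA\nAHVL\n", "query"), end="")
```

The toy record used here is invented for the demonstration; in practice the input would be the FASTA file retrieved from UniProt.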
The Modeller program is launched as follows:
modeller file.py
Align the query sequence with the sequence of the selected guide structure by adapting the align2d.py script.
modeller align2d.py
The alignment produced is written in PIR, PAP, and FASTA formats. Examine these files.
Build a homology model of the target structure by adapting the model-single.py script.
modeller model-single.py
A summary including the PDB file names of the models produced as well as the value of the Modeller energy function and the DOPE score for each model can be found at the end of the output file (model-single.log). Examine the models produced with PyMOL.
Evaluate the models generated in the previous step by adapting the evaluate_model.py script.
modeller evaluate_model.py
This script allows a more detailed evaluation of the models produced by calculating the DOPE score for each position of the alignment. Plot the DOPE score as a function of position (columns 1 and 42) for the different models.
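Extracting the two columns before plotting can be sketched in Python as follows. The layout assumed here (whitespace-separated fields, comment lines starting with "#", an integer position in column 1) is an assumption to check against the profile files actually written by the script; the function read_profile is hypothetical.

```python
def read_profile(lines, pos_col=1, score_col=42):
    """Return (positions, scores) read from 1-based columns of a profile.

    Comment lines, blank lines, and lines with too few fields are skipped.
    """
    positions, scores = [], []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < score_col:
            continue
        positions.append(int(fields[pos_col - 1]))
        scores.append(float(fields[score_col - 1]))
    return positions, scores


# Synthetic demonstration data: 42 fields per line, invented values.
demo = ["# synthetic header"] + [
    " ".join([str(i)] + ["0"] * 40 + [str(s)])
    for i, s in [(1, -0.03), (2, -0.01), (3, -0.05)]
]
positions, scores = read_profile(demo)
# Lower DOPE is better, so the highest value marks the weakest position.
worst = positions[scores.index(max(scores))]
print(positions, scores, worst)
```

With a real file, the lines would come from open(filename), and the two lists can be passed directly to a plotting tool to overlay the profiles of the different models.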
For further information:
Modeller tutorial: http://salilab.org/modeller/tutorial/basic.html
Modeller manual: http://salilab.org/modeller/manual
AutoDock is a suite of tools for molecular docking. Docking consists of predicting how ligands, such as substrates or drug candidates, bind to a receptor of known three-dimensional structure.
The docking procedure with AutoDock consists of two main steps:
We will also use a graphical interface called AutoDockTools (ADT) to make it easier to prepare the docking and visualize the results.
The method will be illustrated with the docking of Indinavir, an HIV-1 protease inhibitor used as an antiretroviral in the treatment of AIDS.
AutoDock needs to know the charge and atom type of each atom, as well as the list of freely rotatable bonds present in the ligand.
adt
→ → → → "ind.pdb"
AutoDockTools reads the ligand and performs the following steps: calculation of Gasteiger atomic charges, merging of non-polar hydrogens, assignment of atom types, and detection of the number of torsional degrees of freedom.
The atom types and charges are used in the molecular mechanics terms of the AutoDock scoring function. The number of torsional degrees of freedom of the ligand determines its flexibility and also enters the calculation of the entropic penalty upon binding.
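As an order of magnitude for this entropic penalty, the sketch below assumes a term that grows linearly with the number of torsions, with a weight of 0.2983 kcal/mol per torsion. Both the linear form and the weight are assumptions recalled from the AutoDock 4 force field, not values taken from this document, and should be checked against the AutoDock documentation. Which count AutoDock actually uses (all rotatable bonds or only the active ones) should be checked there as well.

```python
# Assumed weight of the torsional term, in kcal/mol per torsion.
# This value is an assumption to verify in the AutoDock documentation.
W_TORS = 0.2983


def torsional_penalty(n_torsions, weight=W_TORS):
    """Entropic penalty modeled as weight times the number of torsions."""
    if n_torsions < 0:
        raise ValueError("the number of torsions cannot be negative")
    return weight * n_torsions


for n in (14, 6):
    print(n, "torsions:", round(torsional_penalty(n), 2), "kcal/mol")
```

Under these assumptions the penalty for a flexible ligand is a few kcal/mol, which is comparable to the differences between docked conformations and therefore not negligible.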
→ →
The smallest rigid group of the molecule includes this atom and all atoms connected to it by bonds that cannot rotate freely.
→ →
Bonds that cannot rotate freely appear in red, bonds that could rotate but are marked as inactive appear in purple, and bonds marked as active appear in green. Keep the default definition, which corresponds to 14 degrees of freedom.
→ →
Reduce the number of degrees of freedom to 6 ("fewest atoms") to speed up the calculation.
→ → → "ind.pdbqt"
→ → → → "hsg1.pdb"
AutoDockTools reads the receptor and, as for the ligand, calculates Gasteiger atomic charges, merges non-polar hydrogens, and assigns atom types.
Save the receptor as "hsg1.pdbqt".
An interaction map must be generated for each atom type of the ligand, plus one map for electrostatics and one for desolvation.
→ → → "ind" →
→
Choose 60, 60, and 66 for the number of grid points in the x, y, and z directions, and 2.5, 6.5, and -7.5 for the x, y, and z coordinates of the grid center.
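To check that the box covers the binding site, its extent can be computed from these values. The sketch below assumes a grid spacing of 0.375 Å, which is not stated in this document, and takes the box length along each axis as the number of points times the spacing. The function grid_box is hypothetical.

```python
def grid_box(npts, center, spacing=0.375):
    """Return the (lower, upper) corners of the grid box, in angstroms.

    The spacing is an assumed value. The box length along each axis is
    taken as npts * spacing, centered on the given center.
    """
    half = [n * spacing / 2.0 for n in npts]
    lower = tuple(c - h for c, h in zip(center, half))
    upper = tuple(c + h for c, h in zip(center, half))
    return lower, upper


lower, upper = grid_box((60, 60, 66), (2.5, 6.5, -7.5))
print("lower corner:", lower)
print("upper corner:", upper)
```

Any ligand atom that leaves this box during the search falls outside the precomputed maps, so the box should enclose the whole binding site with a margin.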
→
→ → → "hsg1.gpf"
The .gpf file contains the parameters for the AutoGrid program.
→ →
The AutoGrid program is run with the parameter file generated previously. The corresponding command line is:
autogrid4 -p hsg1.gpf -l hsg1.glg
The maps produced are written to the current directory and have the .map extension.
→ → → "hsg1.pdbqt"
→ → → "ind" → →
→ →
Reduce the number of evaluations of the genetic algorithm to 250000 ("short"), then click on
.→ → → "ind.dpf"
The .dpf file contains the parameters for the AutoDock program. The Lamarckian genetic algorithm was chosen as the conformational sampling method.
→ →
The AutoDock program is run with the parameter file generated previously. The corresponding command line is:
autodock4 -p ind.dpf -l ind.dlg
→ →
→ → → "ind.dlg"
→ →
→ →
A control bar appears that lets you browse the conformations found by the docking. Conformation 0 is the starting conformation.
Change the options by clicking the "&" button on the bar.
Select "Show Info" and examine the energy terms displayed.
Browse the conformations.
Change the representation and coloring mode of the ligand and the receptor. For example, display the molecular surface of the receptor to visualize its shape complementarity with the ligand.
Move to the best conformation with the control bar.
A new object "ind_conf_1" is created.
Select this object.
Save the structure to a PDB file
→ → → "ind_conf_1.pdb" →
The experimental structure of the complex has the PDB code 1HSG.
XPLOR 3.1 online documentation
Linux installation in a virtual machine
Linux Command Line Cheat Sheet