Biochemistry
Biostatistics and Bioinformatics
Protein-Ligand Docking
Description of Module
Subject Name: Biochemistry
Paper Name: 13 Biostatistics and Bioinformatics
Module Name/Title: 18 Protein-Ligand Docking
Dr. Vijaya Khader, Dr. MC Varadaraj
1. Objectives: The objectives of the present module on protein-ligand docking are
1.1. To obtain 3D structures for a protein, such as an enzyme, and its ligand, such as a substrate
1.2. To reformat the 3D structures so that they can be used for docking
1.3. To produce three-dimensional dockings of the protein and its ligand
1.4. To analyse the dockings manually to select the correct docking, with biochemically relevant interactions between the protein and its ligand.
2. Concept Map
[Concept map: Brief Description; Obtaining 3D Structures (Protein Structure, Ligand); Interconverting Formats; Protein-Ligand Docking; Analyzing Dockings; Summary]
3. Description
Proteins are involved in various biochemical functions, including enzyme catalysis, mechanical support, coordinated motion, storage, transport, immune protection, and control of growth and differentiation. A protein that is involved in a diseased condition is a potential target for developing a drug molecule to alleviate that condition. A target protein in a diseased condition may be identified using the high-throughput technologies of transcriptomics and proteomics. The function of a protein target is determined by its three-dimensional (3D) structure. Therefore, the first step is to obtain an experimental three-dimensional structure of the target protein. If an experimental structure is not available, one can opt for homology modeling to obtain an adequate model of the protein structure. Drug development begins with the identification of a binding molecule that regulates the biological activity of the target protein in a desired way. This involves high-throughput virtual screening of the 3D structures in a library of chemical molecules or drugs to identify a lead compound. The 3D structures of the target protein and of the library molecules are then used to produce complex structures between the two. The high-throughput virtual screening is carried out by in silico docking of each drug molecule to the target protein and ranking the drug molecules by their binding affinity. Alternatively, a substrate of an enzyme may be the starting ligand structure for producing enzyme-substrate docking complexes.
The resulting dockings are then analysed using binding pocket information to identify the correct binding of the ligand and to elucidate the chemical interactions underlying its function. This is followed by in silico lead optimization, in which changes are introduced into the structure of the ligand to improve its binding efficiency to the target protein and to achieve the desired action, such as inhibition or activation of the biochemical function of the target protein.
Alternatively, quantitative structure-activity relationship (QSAR) experiments may be conducted to improve the lead compound or the ligand.
Therefore, docking of a known ligand, such as a substrate, with its binding protein, such as an enzyme, begins with obtaining 3D structures of the target enzyme and its substrate. This is followed by reformatting the 3D structure coordinates for submission to docking software, which produces enzyme-substrate dockings.
Finally, we need to analyse the resulting dockings to select a biochemically relevant enzyme-substrate complex, in order to understand the interactions between amino acid residues in the enzyme and chemical groups on the substrate, so as to design alternative molecules having the desired interactions for controlling the biochemical activity of the target enzyme.
3.1. Obtaining 3D Coordinates
In the last module, we have seen the “Hpr Kinase/Phosphorylase” (Hpr K/P) enzyme of Gram-positive bacteria, a bifunctional enzyme that adds phosphate to Ser46 of Hpr (histidine-containing protein) as a kinase and removes phosphate from Ser46 of phosphorylated Hpr as a phosphorylase. As a kinase, Hpr K/P accepts Hpr and ATP as substrates to produce serine-phosphorylated Hpr, i.e. Ser-(P)-HPr. After the reaction, the products Ser-(P)-HPr and ADP leave the binding pocket of the enzyme. The enzyme follows a bi-bi sequential kinetic mechanism. Therefore, both Hpr and ATP bind to the enzyme before the reaction, and both Ser-(P)-HPr and ADP leave the binding pocket after the reaction.
In this module we are interested in visualizing, in three dimensions, the binding of Hpr and ATP to the binding pocket of the Hpr K/P enzyme. Therefore, we need the 3D coordinates of the enzyme and the Hpr protein as well as the 3D coordinates of ATP.
3.1.1. Obtaining 3D Coordinates of proteins
Visit PDB at www.rcsb.org/pdb/ and start entering “Hpr Kinase” in the search text box. The text box will suggest names, such as UniProt Molecule Names or Ontology Terms, to guide your selection. The first suggestion under UniProt Molecule Name, i.e. “Hpr Kinase/Phosphorylase”, matches our requirement. Therefore, simply click on the first suggestion, or enter the complete name “Hpr Kinase/Phosphorylase” and click the “Go” button.
This will display the search results as shown next:
The ORGANISM column shows available structures. Scroll down the list to inspect each entry. These are extracted next for ready reference.
1. 2QMH - Structure of L. casei V267F mutant HprK/P
2. 1KNX - HPr kinase/phosphatase from Mycoplasma pneumoniae
3. 1KKL - L. casei HprK/P in complex with Hpr from B. subtilis
4. 1KKM - L. casei HprK/P in complex with Ser-(P)-Hpr from B. subtilis
5. 1KO7 - X-ray structure of full length HPr kinase/phosphatase from Staphylococcus xylosus
6. 1JB1 - Lactobacillus casei HprK/P Bound to Phosphate
The third structure in the list, 1KKL, is the “X-ray structure of a bifunctional protein kinase in complex with its protein substrate HPr”. It therefore allows visualization of the substrate Hpr in the binding pocket of the enzyme. However, this structure contains only the C-terminal residues 135 to 310 of Hpr K/P.
The list does not include a structure of Hpr K/P with bound ATP. However, the fifth structure, 1KO7, i.e. “Structure of the full-length HPr kinase/phosphatase from Staphylococcus xylosus at 1.95 Å resolution: Mimicking the product/substrate of the phospho transfer reactions”, is available. Therefore, the full-length 1KO7 structure of Hpr K/P can be used to dock ATP to Hpr K/P, which will allow visualization of the substrate ATP in the binding pocket of the enzyme. 1KO7 also has two phosphates bound in the binding pocket, which will help in selecting the correctly oriented ATP dockings in the P-loop. After docking, we will have two structures: one with the substrate ATP docked in the binding pocket of the full-length enzyme, and the other with the substrate Hpr bound in the binding pocket of the part-length enzyme. The two structures may then be superimposed. This superposition of the Hpr-bound part-length Hpr K/P with the ATP-docked full-length Hpr K/P will allow visualization of both substrates in the active site of the enzyme. Therefore, download 1KKL and 1KO7.
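For readers who prefer to fetch the two entries from a script rather than through the browser, the download step might be sketched as follows. The URL pattern of the RCSB file server and the function names are assumptions made for this illustration, not part of the module; please verify the pattern against the PDB site before relying on it.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumed download pattern of the RCSB file server (not taken from this module).
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_download_url(pdb_id: str) -> str:
    """Return the assumed download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a valid PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


def download_pdb(pdb_id: str, folder: str = ".") -> Path:
    """Download one entry into `folder` and return the path of the saved file."""
    url = pdb_download_url(pdb_id)
    target = Path(folder) / url.rsplit("/", 1)[-1]   # e.g. 1KO7.pdb
    urlretrieve(url, target)                         # needs network access
    return target
```

Calling `download_pdb("1KKL")` and `download_pdb("1KO7")` would then save both files in the current folder, provided the assumed URL pattern is valid.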
Open the 1KO7 structure using SwissPDBViewer and scroll down the control panel. It shows two chains, named A and B. Therefore, we need to extract the coordinates of a Hpr K/P monomer to be used in docking.
To extract the coordinates of the monomer, select the first residue in chain ‘A’ by clicking on “MET1”. Now scroll down to the last residue of chain ‘A’, i.e. ASN298. Press the “Shift” key on the keyboard and click ASN298.
This will select residues 1 to 298, i.e. from MET1 to ASN298, in chain ‘A’. The selection is successful if the color of the residue labels in the first column changes to red. Now save the selected residues using the command “Selected Residues Of Current Layer…” of the “Save” option in the “File” menu. Save the file with the file name “1KO7monomer.pdb”. This structure will be used to produce dockings of ATP to Hpr K/P and then to screen the docking results for correctly docked ATP in the P-loop.
Now, scroll down to the end of the control panel. With the “Ctrl” key pressed, click successively on PO4316 and PO4317 of chain ‘A’ to add these two residues to the selection, and save the selected residues using the command “Selected Residues Of Current Layer…” of the “Save” option in the “File” menu. Save the file with the file name “1KO7monomerWithPhosphates.pdb”. This structure will be used to screen the docking results for selecting only those dockings in which the phosphates of ATP overlap with the experimentally bound phosphates in the P-loop.
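The two selections made above in SwissPDBViewer can also be approximated with a few lines of Python. This is only a sketch: it relies on the fixed column layout of PDB coordinate records, the function names are made up for this illustration, and it assumes that the two phosphates are stored as HETATM records with residue name PO4 in chain A.

```python
def select_records(pdb_lines, chain_id, het_resnames=()):
    """Keep ATOM records of one chain, plus HETATM records of that chain
    whose residue name is listed in het_resnames (for example "PO4")."""
    kept = []
    for line in pdb_lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM") or len(line) < 22:
            continue
        if line[21] != chain_id:           # column 22 holds the chain identifier
            continue
        resname = line[17:20].strip()      # columns 18-20 hold the residue name
        if record == "ATOM" or resname in het_resnames:
            kept.append(line)
    return kept


def write_selection(src, dst, chain_id, het_resnames=()):
    """Read a PDB file, keep the selected records and write them to dst."""
    with open(src) as handle:
        kept = select_records(handle.read().splitlines(), chain_id, het_resnames)
    with open(dst, "w") as handle:
        handle.write("\n".join(kept) + "\nEND\n")
```

Under these assumptions, `write_selection("1KO7.pdb", "1KO7monomer.pdb", "A")` would correspond to the monomer file, and adding `het_resnames=("PO4",)` would correspond to the monomer with its bound phosphates. The result should still be checked in the viewer.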
In the control panel, make visible the two phosphates PO4316 and PO4317, as well as residues GLY151 to SER158, i.e. the residues of the P-loop. Also make visible the labels of residues 151 and 158, with the complete chain displayed as ribbons from the fifth column.
Now zoom in to reveal the binding of the two phosphates in the P-loop of Hpr K/P.
Now start SwissPDBViewer afresh and open the 1KKL structure. Scroll down the control panel. It reveals six chains, labeled A, B and C as well as H, I and J. Press Shift and click in the first column of the control panel to make the main chains invisible. Similarly, press Shift and click in the second column of the control panel to make the side chains invisible. Scroll down to the last residue in chain C, i.e. GLU310. Left-click in the fifth column of the control panel at the GLU310 residue of chain ‘C’ and, with the mouse button kept pressed, move the mouse upward to display chains A, B and C as ribbons. Now scroll to the first residue, GLN3, in chain H. Left-click in the first column at residue GLN3 of chain H and, with the mouse button kept pressed, move the mouse downward to display the main chain atoms in chains H, I and J. Similarly, click in the second column at residue GLN3 of chain H and, with the mouse button kept pressed, move the mouse downward to display the side chain atoms in chains H, I and J. Now color chains H, I and J cyan from the sixth column of the control panel: click the box corresponding to the first residue in chain H, i.e. GLN3, and, with the mouse button kept pressed, move the mouse downward to the last residue in chain J, then release the mouse button. The color dialog box will appear immediately. Select the cyan color and click the “OK” button. The boxes in the sixth column of the control panel change to cyan for these chains, and the three chains H, I and J are colored cyan. These three chains are Hpr monomers made up of 88 amino acids each. With these actions, the three chains ‘A’, ‘B’ and ‘C’ of Hpr K/P are colored as per the preset ribbon colors, and the main chain and side chain atoms of the three Hpr chains ‘H’, ‘I’ and ‘J’ are colored cyan, as shown next.
However, for superimposing the ATP-docked Hpr K/P with the Hpr-bound Hpr K/P, we need only one chain of Hpr K/P with the corresponding bound chain of Hpr. Therefore, extract Hpr K/P chain ‘A’ and the corresponding bound Hpr chain ‘H’. To extract chains ‘A’ and ‘H’, move to the last residue, GLU310, of the first chain ‘A’ in the control panel. Click GLU310 and, with the mouse button kept pressed, move the mouse upward to the first residue. This will select all residues in chain ‘A’. Now scroll down to the first residue in chain ‘H’, i.e. GLN3. Press the Control key on the keyboard, click GLN3 and drag to the last residue of chain ‘H’. This adds all residues of chain H to the existing selection of chain A. The selection is successful if the color of the residue labels in the first column changes to red. Now save the selected residues using the command “Selected Residues Of Current Layer…” of the “Save” option in the “File” menu. Save the file with the file name “1KKLMonomers.pdb”. This will save Hpr K/P chain ‘A’ with the bound Hpr chain ‘H’. This structure will be used for superimposing the ATP-docked Hpr K/P with the Hpr-bound Hpr K/P. Close all layers.
Now open the saved file “1KKLMonomers.pdb”, containing the Hpr K/P chain ‘A’ and Hpr chain ‘H’ monomers.
From the “Select” menu, choose the “None” command. Scroll down to SER46 of Hpr and display its main chain and side chain. Display it as a sphere by marking the fourth column. Color the residues by ‘type’ from the ‘Color’ menu.
Make visible the main chain and side chains of residues 155-162 of the P-loop. Make visible the labels of residues 155 and 162. Display the P-loop residues of Hpr K/P as ribbons. Display in stereo view and center the molecule. This reveals that SER46 of Hpr is placed very close to the ATP-binding Walker A motif, i.e. the P-loop, of Hpr K/P.
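As a small aside, the Walker A motif can also be located in a protein sequence by pattern matching. The sketch below uses the commonly quoted consensus G-x(4)-G-K-[S/T] written as a regular expression; the function name and the short flanking residues in the usage note are made up for illustration, while the eight-residue stretch GDSGVGKS is the P-loop sequence quoted later in this module for residues 155-162.

```python
import re

# Walker A (P-loop) consensus G-x(4)-G-K-[S/T] as a regular expression.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(sequence: str):
    """Return (1-based start position, matched motif) for every hit."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(sequence.upper())]
```

For example, `find_walker_a("AAGDSGVGKSTT")` reports the motif GDSGVGKS starting at position 3 of that hypothetical fragment. Positions found this way are relative to the sequence supplied, so they match the residue numbers in a structure only if the sequence starts at residue 1.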
3.1.2. Obtaining 3D coordinates of Ligands
Visit PubChem at https://pubchem.ncbi.nlm.nih.gov/ and search for ATP.
This will present a list of PubChem compounds containing keyword ATP.
Click the first record, for adenosine triphosphate, to reach the ATP record. Scroll down to the 3D conformer, download the PubChem compound in SDF format and save it with the file name “ATP.sdf”.
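Before going further, it can be worth confirming that the downloaded file really contains 3D coordinates and not a flat 2D drawing. A minimal sketch, assuming the file is in the fixed-column V2000 molfile layout (counts on the fourth line, then one atom per line with x, y and z in three 10-character fields); the function names are made up for this illustration.

```python
def read_v2000_atoms(sdf_text: str):
    """Parse the atom block of the first V2000 record in an SDF/MOL text.
    Returns a list of (element, x, y, z) tuples."""
    lines = sdf_text.splitlines()
    counts = lines[3]                      # fourth line is the counts line
    n_atoms = int(counts[0:3])             # first three columns: number of atoms
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z = (float(line[i:i + 10]) for i in (0, 10, 20))
        atoms.append((line[31:34].strip(), x, y, z))
    return atoms


def has_3d_coordinates(atoms) -> bool:
    """True when at least one atom has a non-zero z coordinate."""
    return any(abs(z) > 1e-6 for _, _, _, z in atoms)
```

With the file saved above, `has_3d_coordinates(read_v2000_atoms(open("ATP.sdf").read()))` should return True for a 3D conformer. A file in the newer V3000 layout would need a different parser.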
3.2. Reformatting Ligand Coordinates
SwissDock, at the Swiss Institute of Bioinformatics, requires the ligand coordinates in Mol2 format.
Therefore, to convert ATP.sdf to Mol2 format, visit the WebQC molecular formats converter page at http://www.webqc.org/molecularformatsconverter.php. Open the ATP.sdf file with Notepad, an accessory of the Windows operating system, copy all the contents to the clipboard and paste them into the text box on the WebQC molecular formats converter page. Select the input file type as “sdf – MDL MOL format” and the output file type as “mol2 – Sybyl Mol2 format”.
Check the “Add Hydrogens” check box and click the “Convert” button. This will present the coordinates in Mol2 format in the text box named “MOLECULE IN OUTPUT FORMAT”. Click anywhere in the output format box and copy the contents to the clipboard.
Paste the contents into Notepad and save them with the file name “ATP.mol2”, choosing “All Files” as the file type.
Alternatively, for interconverting formats, Open Babel is another option, available at http://openbabel.org/wiki/Main_Page.
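If Open Babel is installed locally, the same conversion might be scripted as sketched below. The sketch assumes that the `obabel` command-line program is on the system path; the wrapper function names are made up for this illustration. The `-O` option names the output file and `-h` asks Open Babel to add hydrogens, mirroring the “Add Hydrogens” check box used above.

```python
import subprocess


def obabel_command(src: str, dst: str, add_hydrogens: bool = True):
    """Build the argument list for an Open Babel conversion.
    Input and output formats are inferred from the file extensions."""
    cmd = ["obabel", src, "-O", dst]
    if add_hydrogens:
        cmd.append("-h")                   # add hydrogens during conversion
    return cmd


def convert(src: str, dst: str, add_hydrogens: bool = True) -> None:
    """Run the conversion; raises CalledProcessError if obabel reports failure."""
    subprocess.run(obabel_command(src, dst, add_hydrogens), check=True)
```

Here `convert("ATP.sdf", "ATP.mol2")` would be the scripted counterpart of the WebQC steps. The resulting file should still be inspected before it is submitted for docking.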
3.3. Protein-Ligand Docking
Computational protein-ligand docking, i.e. in silico protein-ligand docking, is undertaken to visualize the binding of a small molecule to the active site of a protein with a known 3D structure. The protein may be an enzyme, docked to understand the binding of a substrate, a product or an inhibitor. It may also be a protein receptor, docked to understand the binding of a small molecule such as a hormone, for designing a lead compound to be used in drug development. We will use SwissDock for protein-ligand docking in this module. Visit http://www.swissdock.ch/ and open the tab “Submit Docking”. Upload the saved file “1KO7monomer.pdb” as the target and “ATP.mol2” as the ligand.
After successful setup, enter the description details.
The user can click on “Show extra parameters” to set user-defined docking parameters. For now, however, click the “Start Docking” button.
Click on the “here” hyperlink to reach the results page.
The user can bookmark this results page for a later visit, because the results appear only after some time, within a few hours. SwissDock will also send a link to the results page via e-mail. In case the link sent via e-mail shows ‘Unknown Job’ when clicked, remove the extra apostrophe at the end of the link address in the address bar, as in http://www.swissdock.ch/docking/view/swissdockd_WW53in_GNSNNLKRBAXRIIWUQKJ9', which was appended inadvertently by SwissDock, and visit the link to view the results output.
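When result links are collected in a script, the stray apostrophe can be removed automatically. A minimal sketch with a made-up helper name:

```python
def clean_result_link(link: str) -> str:
    """Remove surrounding whitespace and any trailing quote characters."""
    return link.strip().rstrip("'\"")
```

Applied to the example address above, the function returns the same link without the final apostrophe.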
Download the predictions file and extract the zipped files, which include the clusters.dock4.PDB file.
In addition to SwissDock, a comprehensive, regularly updated list of available docking portals is kept on the click2drug web portal at http://www.click2drug.org.
3.4. Analysing Dockings
Open the clusters.dock4.PDB file with SwissPDBViewer. Also open the “1KO7monomer.pdb” file. Display 1KO7monomer as ribbons, with the P-loop residues as spheres colored cyan. This display shows that ATP (shown in grey) has several docked positions on Hpr K/P.
We know that ATP binds within the P-loop. Therefore, dockings outside the P-loop are wrong predictions. To separate correct dockings from wrong ones, we first need to select only those dockings that are in contact with the P-loop of the Hpr K/P monomer. Then we need to select only those dockings in which ATP overlaps with the experimentally bound phosphates in the P-loop of the “1KO7monomerWithPhosphates.pdb” file.
Finally, we need to select the ATP dockings in the P-loop that have the gamma phosphate of ATP in close contact with serine 46 of the target Hpr, i.e. the second substrate.
3.4.1. Select Dockings with ATP in the P-loop
Make all dockings invisible by pressing Shift and clicking in the first column and the second column. Now, in stereo view, display the first docking, i.e. ‘LIG1’, by marking ‘v’ in the first column. It appears within the P-loop; therefore select it by clicking its name ‘LIG1’.
Now make the next docking visible by marking ‘v’ in the first column. If this docking does not lie outside the P-loop, make the previous docking invisible. If the currently displayed docking is within the P-loop, add it to the selection by pressing the “Control” key and clicking its name ‘LIG1’. Now make the next docking visible by marking ‘v’ in the first column, third row, and follow the same procedure as for the previous docking. Continue with the following dockings until a docking appears outside the P-loop, as shown next.
Do not select such a docking, and make it invisible. This leaves us with the first eight dockings selected. Follow this procedure for the remaining docked ATP views. At the end, only those dockings in which ATP is docked within the P-loop will be selected. Now make each of the selected dockings visible by marking ‘v’ in the first column. This will display only those dockings which fall within the P-loop; all these selected dockings in the control panel are shown next.
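The visual check described above can be complemented by a simple distance test. The sketch below is an illustration under stated assumptions, not the procedure of this module: it assumes the atom coordinates of each docked pose and of the P-loop residues have already been read into lists of (x, y, z) tuples, and it treats a pose as being in contact with the P-loop when any pair of atoms is closer than a cutoff. The 4.0 Å cutoff and the function names are assumptions. How the clusters.dock4.PDB file is split into individual poses is not covered here.

```python
import math


def min_distance(atoms_a, atoms_b) -> float:
    """Smallest distance between any atom of one set and any atom of the other."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def poses_in_contact(poses, pocket_atoms, cutoff=4.0):
    """poses maps a pose name to its list of (x, y, z) coordinates.
    Returns the names of poses lying within `cutoff` angstroms of the pocket."""
    return [
        name
        for name, atoms in poses.items()
        if min_distance(atoms, pocket_atoms) <= cutoff
    ]
```

A pose that passes this test is merely near the P-loop. Whether its phosphates point into the loop still has to be judged as described in the next section.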
3.4.2. Select ATP dockings in P-loop with ATP phosphates overlapping with experimentally docked phosphates
The ATP dockings in the P-loop may have their phosphates docked in the correct orientation. But some dockings may be wrong, with the P-loop containing the sugar or base component of ATP. Correctly docked ATP in the P-loop has to be screened visually. If an experimental structure showing the placement of phosphates in the P-loop is available, it should be used to select those dockings in which the predicted phosphates in the P-loop overlap with the phosphates in the experimental structure obtained by co-crystallization with phosphates. In the present case, we have already saved “1KO7monomerWithPhosphates.pdb”, with phosphates bound in the P-loop. Therefore, this file will be used. Hide the structure “1KO7monomer” in the control panel and open the file “1KO7monomerWithPhosphates”, displayed as ribbons. Display the last two residues, i.e. the phosphates (PO4316 and PO4317), as spheres with CPK color, and the P-loop residues (151-158) as spheres in cyan.
The P-loop residues in cyan can be distinguished from the phosphates in CPK color. In the control panel, switch to the clusters.dock4 layer.
Now hide all ligands by clicking anywhere in the first column of the control panel while pressing the Shift key.
Display the first ligand by marking ‘v’ in the first column. Now rotate, move and zoom the molecule to display ATP docked in the P-loop with its phosphates clearly visible in front, as shown next.
In this view, it is clear that the alpha and beta phosphates of ATP overlap with the two experimentally bound phosphates.
Therefore, leave the ligand selected in the control panel and hide it. Now display the next ligand. If the phosphates of ATP overlap with the two experimentally bound phosphates, leave the ligand selected. If not, ATP is not docked correctly, so deselect the ligand by clicking on its name with the Control key pressed. Hide this ligand, display the next one and follow the same procedure. Continue up to the last selected ligand, so that only those dockings remain selected in which the phosphates of ATP overlap with the experimental phosphates in the P-loop.
This leaves us with eight ligands selected. Now display all eight ligands by marking ‘v’ in the first column of the control panel. They will be displayed as shown next.
Continue with the next selected dockings to keep selected only those dockings with overlapping phosphates. In the next group, we find 13 correctly overlapping dockings, as shown next.
The third group, with eight dockings, is shown next.
In the next group, the adenine ring collides with the P-loop in all dockings; therefore they are deselected.
In the next group, the adenine ring of ATP is sitting in the P-loop; therefore these dockings are deselected.
In the last group, the phosphates do not overlap; therefore these dockings are deselected.
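The overlap criterion used in this section can be expressed numerically as a rough cross-check of the visual screening. In the sketch below, a docking passes when every experimentally bound phosphate has a phosphorus atom of the docked ATP within a cutoff distance. The coordinates are assumed to have been extracted beforehand as (x, y, z) tuples; the 2.0 Å cutoff and the function name are assumptions made for this illustration.

```python
import math


def phosphates_overlap(ligand_p, reference_p, cutoff=2.0) -> bool:
    """ligand_p: phosphorus coordinates of one docked ATP.
    reference_p: phosphorus coordinates of the experimentally bound phosphates.
    True when every reference atom has a ligand atom within `cutoff` angstroms."""
    if not ligand_p or not reference_p:
        return False
    return all(
        min(math.dist(ref, lig) for lig in ligand_p) <= cutoff
        for ref in reference_p
    )
```

A docking in which the adenine ring occupies the P-loop would typically fail this test, because its phosphorus atoms lie far from the bound phosphates. The choice of cutoff affects the result and should be checked against the visual inspection.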
3.4.3. Select ATP dockings in P-loop with gamma phosphate of ATP in close contact to serine 46 in target Hpr, i.e. second substrate
Open 1KKLMonomers and superimpose it onto 1KO7monomer. Hide 1KO7monomerWithPhosphates.
Center the view. Display the P-loop (155-162), “GDSGVGKS”, in cyan as spheres. Hide the backbone of chain H, i.e. the Hpr substrate, and display it as ribbons. Display Ser46 as a sphere and select “Type” from the “Color” menu.
Rotate, move and zoom to display the two substrates, i.e. Ser46 in Hpr and ATP in the P-loop.
Only the third group, with eight dockings, places the gamma phosphate of ATP next to Ser46 of Hpr without clashing with any other residue in Hpr K/P or Hpr. Therefore, this docking is correctly positioned to transfer the gamma phosphate of ATP to Hpr in the Hpr kinase reaction.
4. Summary
Dear Students, we know that proteins are involved in various biochemical processes. In these processes, proteins interact with small molecules known as ligands. Therefore, it is very important to understand the biochemical interactions between the two, such as those between the active site of an enzyme and its substrate.
However, sometimes the 3D structure of the complex is not available. If the 3D structures of the individual enzyme and its substrate are available, these structures can be used to produce the complex structure by in silico docking. In this module, we have learned to obtain individual 3D structures for a protein, such as an enzyme, and its ligand, such as a substrate, and to reformat these 3D structures for submission to docking. The 3D structures of the resulting dockings, or complexes, were analysed manually to select a correct docking with biochemically relevant interactions between the protein and its ligand, so as to understand the mechanism of biochemical processes. We used the bifunctional enzyme Hpr K/P, which follows a bi-bi sequential kinetic mechanism, with Hpr and ATP binding to the enzyme. We learned to obtain the individual 3D structures of Hpr K/P and its substrate ATP, and used these for docking. We then analysed the resulting dockings using binding pocket information to select the correct docking. The correctness of the selected docking was confirmed by superposition of the bound second substrate, i.e. Hpr. In this way, protein-ligand docking may be used to produce biochemically relevant complexes to understand the mechanism of action of proteins.