Homology modeling, also known as comparative modeling, is a protein structure prediction method based on the structural conservation of framework regions between proteins (Xiang, 2006). It constructs an atomic-resolution model of the target protein, whose structure is unknown, from its amino acid sequence and an experimentally determined 3D structure of a related homologous protein, referred to as the template. Homology modeling is widely used in drug design and structure-based drug discovery. It is attractive because it avoids the delays of experimental structure elucidation, such as obtaining a sufficient amount of material and overcoming crystallization difficulties (Xiang, 2006).
Protein Structure Prediction
Protein structure prediction is a method of translating protein sequences into three-dimensional structures using computational algorithms (Xiang, 2006). The computational approaches include comparative modeling, threading, and ab initio prediction. Protein structure prediction aims to determine the native, in vivo structure of an amino acid sequence. The process applies knowledge of the determinants of protein structure, for example, hydrogen and covalent bonds, electrostatic interactions, the hydrophobicity and hydrophilicity of residues, enthalpy and entropy, bond angle stresses, and van der Waals interactions. In practice, protein structure prediction relies heavily on fold recognition and homology modeling methods.
Comparisons between Comparative/Homology Modeling, Threading/Fold Recognition, and Ab Initio/De Novo Structure Prediction
Comparative or homology modeling of a protein is the construction of an atomic-resolution model of a target protein from its amino acid sequence and the 3D structure of a template, which is a related homologous protein. The process seeks to identify one or more known protein structures that are likely to resemble the structure of the sequence in question, and to produce an alignment that maps the target’s residues to the template sequence’s residues.
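The residue mapping that an alignment provides can be made concrete with a short sketch. The example below assumes a pairwise alignment given as two equal-length strings in which "-" marks a gap; the function name and the toy sequences are hypothetical and are not taken from any of the cited tools.

```python
from typing import Dict


def map_target_to_template(target_aln: str, template_aln: str) -> Dict[int, int]:
    """Map 1-based target residue numbers to 1-based template residue numbers.

    Columns where either sequence has a gap ("-") produce no mapping entry.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    mapping = {}
    target_pos = 0    # residues counted so far in the target
    template_pos = 0  # residues counted so far in the template
    for t_res, s_res in zip(target_aln, template_aln):
        if t_res != "-":
            target_pos += 1
        if s_res != "-":
            template_pos += 1
        if t_res != "-" and s_res != "-":
            mapping[target_pos] = template_pos
    return mapping


# Toy alignment (hypothetical sequences, for illustration only).
target_aln = "MKT-AYIAK"
template_aln = "MRTLAY-AK"
mapping = map_target_to_template(target_aln, template_aln)
print(mapping)
```

Target residues that are missing from the mapping (residue 6 in this toy case) have no template counterpart, so their coordinates cannot be copied from the template and must be built in another way, for example by loop modeling.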
Threading, on the other hand, searches the Protein Data Bank (PDB) for templates that share a structural motif with the target protein. Threading differs from homology modeling in that homology modeling only considers the similarity of sequences between the target and the template proteins, while threading also considers the template’s structural information. Threading aims to detect template proteins with folds similar to that of the target protein so that the target can be aligned correctly.
The ab initio method generates protein structures from scratch when the relationship is too distant to allow for threading or when there is no homologous structure in the PDB. Ab initio modeling differs from homology modeling in that it builds the query from scratch when no structurally related proteins are found in the template database (Xiang, 2006).
The Steps in Comparative/Homology Modeling
Homology modeling of the target structure involves seven steps. The first is template recognition and initial alignment, in which the target sequence is compared against all known structures in the PDB. The second step is alignment correction, which takes into account the residues that are less likely to be changed. In the third step the backbone of the target is generated, and in the fourth step the loops are modeled. The fifth step entails side-chain modeling, where side chains are added to the model’s backbone. The sixth step is optimization of the model, whereby refinements are done using molecular dynamics simulations. In the last step, the model is validated: it is checked for bumps and for bond lengths and angles that fall outside normal ranges.
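The validation step can be illustrated with a deliberately simplified geometric check. The sketch below looks only at alpha-carbon (CA) positions: it flags consecutive CA atoms whose separation deviates from the roughly 3.8 angstrom spacing expected across a trans peptide bond, and it flags non-adjacent CA atoms that come too close to each other (bumps). The tolerance, the clash cutoff, the function name, and the coordinates are assumptions made for illustration; real validation tools examine all atoms and many more geometric criteria.

```python
import math

CA_CA_IDEAL = 3.8      # approximate CA-CA distance in angstroms (trans peptide)
CA_CA_TOLERANCE = 0.3  # assumed tolerance for this sketch
CLASH_CUTOFF = 3.0     # assumed minimum distance between non-adjacent CA atoms


def validate_ca_trace(coords):
    """Return (bad_bonds, bumps) for CA coordinates listed in residue order.

    Each entry is (residue_i, residue_j, distance) with 1-based residue numbers.
    """
    bad_bonds = []
    bumps = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if j == i + 1:
                # Consecutive residues: check the virtual CA-CA bond length.
                if abs(d - CA_CA_IDEAL) > CA_CA_TOLERANCE:
                    bad_bonds.append((i + 1, j + 1, round(d, 2)))
            elif d < CLASH_CUTOFF:
                # Non-adjacent residues that are too close: a bump.
                bumps.append((i + 1, j + 1, round(d, 2)))
    return bad_bonds, bumps


# Toy five-residue CA trace (made-up coordinates, in angstroms).
ca_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (1.0, 3.8, 0.0),  # only 2.8 from residue 3: distorted geometry
    (1.0, 0.0, 0.0),  # folds back onto residues 1 and 2: bumps
]
bad_bonds, bumps = validate_ca_trace(ca_coords)
print("bad CA-CA distances:", bad_bonds)
print("bumps:", bumps)
```

A model that passes such a check is not necessarily correct; the check only rules out some physically implausible geometry.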
Software that supports these steps includes SWISS-MODEL, MODELLER, and Geno3D. SWISS-MODEL is an automated web server that generates alignments and homology models for a protein sequence. MODELLER produces homology models by the satisfaction of spatial restraints, using a methodology derived from NMR spectroscopy data processing. Geno3D likewise produces homology models by the satisfaction of spatial restraints using NMR data processing methodology.
Fig. 1. A flow chart of the steps involved in comparative protein structure modeling (Madhusudhan, Marti-Renom & Eswar, 2007).
Most Commonly Used Homology-Modeling Tools/Software/Websites
Commonly used homology modeling software includes SWISS-MODEL and MODELLER. SWISS-MODEL lets the user submit a target sequence and receive automatically generated comparative models. The template, in this case, is identified and aligned automatically. However, the automated models may sometimes contain errors. MODELLER is among the most widely used homology modeling programs since it is relatively fast and thus appropriate for whole-genome modeling. In MODELLER, models are obtained by satisfying spatial restraints that are derived from the alignment and expressed as probability density functions (Xiang, 2006).
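The idea of restraints expressed as probability density functions can be sketched in a few lines. The example below is not MODELLER code and does not reproduce its restraint types; it assumes, for illustration only, that each restraint is a Gaussian density on the distance between two atoms and that the quantity to minimize is the sum of the negative log densities. All names and numbers are hypothetical.

```python
import math


def gaussian_restraint_penalty(distance, mean, sigma):
    """Negative log of a Gaussian probability density for one distance restraint."""
    z = (distance - mean) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))


def restraint_objective(coords, restraints):
    """Sum the penalties of all restraints for a set of atom coordinates.

    Each restraint is (atom_i, atom_j, mean_distance, sigma), using 0-based
    indices into coords.
    """
    total = 0.0
    for i, j, mean, sigma in restraints:
        d = math.dist(coords[i], coords[j])
        total += gaussian_restraint_penalty(d, mean, sigma)
    return total


# Hypothetical restraints, as if derived from equivalent template distances.
restraints = [
    (0, 1, 3.8, 0.1),
    (1, 2, 3.8, 0.1),
    (0, 2, 6.0, 0.5),
]

good_model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0)]
poor_model = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 3.0, 0.0)]

good_score = restraint_objective(good_model, restraints)
poor_score = restraint_objective(poor_model, restraints)
print(good_score < poor_score)
```

Under these assumptions a conformation that satisfies the restraints more closely receives a lower score, so an optimizer searching for the minimum is driven toward models that agree with the information taken from the alignment.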
The template structure provides a full three-dimensional description on which the model of the target is based. Comparative modeling therefore relies on detectable similarity between the modeled sequence and at least one known structure. The fold is assigned by selecting from a set of possible templates, and a full-atom model is then built. As such, one known structure can serve a whole protein family, since the other members can be modeled depending on their alignment to the template (Fiser, 2010).
The Accuracy of the Method
The accuracy of a model determines the information that can be extracted from it. Estimating the accuracy of protein models is therefore vital for their interpretation. The evaluation of the model can be done as a whole or in the individual regions. The accuracy depends on the percentage of sequence identity on which the model is based, reflecting the relationship between the sequence and the structure of proteins (Xiang, 2006). Models based on more than 50 percent sequence identity to their templates are generally accurate and can be trusted, whereas models based on less than 30 percent sequence identity have low accuracy.
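The sequence identity thresholds can be applied with a short calculation. The sketch below assumes that identity is counted over the aligned, gap-free columns of a pairwise alignment (other conventions exist), and it labels the range between 30 and 50 percent as "medium"; the function names, the labels, and the toy sequences are hypothetical.

```python
def percent_identity(target_aln, template_aln):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(target_aln, template_aln):
        if a == "-" or b == "-":
            continue  # gap columns are ignored under this convention
        aligned += 1
        if a.upper() == b.upper():
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


def expected_accuracy(identity):
    """Rough accuracy class from the 50 and 30 percent thresholds."""
    if identity > 50.0:
        return "high"
    if identity >= 30.0:
        return "medium"
    return "low"


# Toy alignment (hypothetical sequences, for illustration only).
identity = percent_identity("MKT-AYIAK", "MRTLAY-AK")
print(round(identity, 1), expected_accuracy(identity))
```

The class is only a first indication; as stated above, a model should still be evaluated as a whole and in its individual regions.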
Applications of Comparative/Homology Modeling
Homology modeling is applied in the structure-based drug design process. Another common application is ligand design, which involves identifying active and binding sites on a protein and designing novel ligands for a particular binding site (Jacobson & Sali, 2004). Homology modeling is also applied in data mining, to search for the ligands of a given binding site. The procedure is useful in refining functional predictions using high- and medium-accuracy comparative models. Homology models can also correctly predict features of the target protein that do not occur in the template structure. Molecular replacement in X-ray structure refinement and protein-protein docking simulations also use homology modeling.
Example of a Homology Modeling Study
An example of homology modeling starts from the amino acid sequence of a Zika virus (ZIKV) strain; modeling its proteins is essential for understanding the exposed epitopes that differentiate ZIKV from other flaviviruses such as dengue (Ekins et al., 2016). Such modeling is done using the SWISS-MODEL server or a target-template sequence alignment after a search for putative X-ray template proteins in the PDB.
Homology modeling can predict the structure of a protein from its sequence with an accuracy that, in favorable cases, approaches that of experimentally obtained results. As such, it provides a feasible, cost-effective alternative method for model generation. As the number of experimentally determined structures increases, so will the reliability and role of homology modeling. The technique serves to guide mutagenesis experiments, to model enzyme-substrate and ligand-receptor interactions, to predict loop structures, and to identify hits.
Homology modeling proceeds through the seven steps described above and supports virtual screening. The process is applicable to drug discovery and can be used with reasonable confidence when the target and template share at least 30 percent sequence identity. However, models must be evaluated to ensure their correctness. More research should focus on the role of homology modeling in the drug discovery process. In particular, future research should seek to improve the accuracy of the models produced by homology modeling.