In silico identification of the potential natural inhibitors of SARS-CoV-2 Guanine-N7 methyltransferase
Abstract
Background: The outbreak of the COVID-19 pandemic caused by SARS-CoV-2 has triggered intense research into therapeutic strategies that can combat the disease. One such strategy is the inhibition of an enzyme that drives an essential physiological process of the virus. The enzyme, Guanine-N7 methyltransferase, is responsible for capping the SARS-CoV-2 mRNA to conceal it from the host’s cellular defense. Aim of the study: This study aims to computationally identify potential natural inhibitors of the SARS-CoV-2 Guanine-N7 methyltransferase that bind at the active site (Pocket 41). Materials and methods: A library of small molecules obtained from edible African plants was docked against the SARS-CoV-2 Guanine-N7 methyltransferase (QHD43415_13.pdb) using the PyRx software. Sinefungin, an approved antiviral drug with a binding score of -7.6 kcal/mol against the target, was chosen as the standard. Using the molecular descriptors of the compounds, virtual screening for oral availability was performed using the PubChem and SwissADME web tools. The online servers pkCSM and Molinspiration were used for further screening of pharmacokinetic properties and bioactivity, respectively. The molecular dynamics simulation and analyses of the Apo and Holo proteins were performed using the GROMACS software on the Galaxy web server. Results: With a total RMSD of 77.78, average RMSD of 3.704, total regional (active site) RMSF of 30.61, average regional RMSF of 1.91, gyration of 6.9986, and B factor of 696.14, Crinamidine showed the greatest distortion of the target. Conclusion: All the lead compounds performed better than the standard, and Crinamidine is predicted to show the greatest inhibitory activity. Further tests are required to investigate the inhibitory activities of the lead compounds.
Keywords: COVID-19, SARS-CoV-2, guanine-N7 methyltransferase, inhibition, molecular docking, molecular dynamics simulation
Introduction. Coronavirus disease 2019 (COVID-19) is a novel infection that began in China and resulted in a worldwide outbreak. The disease was declared a global health emergency and later recognized as a pandemic by the World Health Organization in March 2020 [1]. As of 25 July 2020, the global number of reported cases stood at 15,975,268, with 643,476 deaths and 9,766,873 recoveries [2]. COVID-19 is caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), which causes mild to severe respiratory illness with symptoms such as fever, cough, and shortness of breath. The illness becomes life-threatening in the presence of co-morbidities such as diabetes, hypertension, and cardiovascular diseases [3,4]. There is currently no WHO-approved drug or vaccine for the cure or prevention of COVID-19. SARS-CoV-2 belongs to a large family of viruses consisting of multiple strains that are known to cause illnesses ranging from the common cold to more severe diseases such as the Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS) [4,5]. SARS-CoV-2 is a positive-sense, single-stranded RNA virus possessing one of the largest and most complex genomes among RNA viruses (about 30 kb), packed inside a nucleocapsid protein and enveloped with several structural proteins [6]. The viral particle is 80-90 nm in size, and bulbous surface projections form crown-like patterns (corona) on its surface [7]. The potential therapeutic strategies for the treatment of COVID-19 include immunomodulation and viral inhibition. Several enzymes and structural proteins of SARS-CoV-2 are potential drug targets because they directly affect physiological processes such as RNA synthesis, replication, assembly, and human cell receptor binding [8,9]. Guanine N-7-MethylTransferase (GNMT), the enzyme responsible for capping SARS-CoV-2 mRNA, is one such target.
For many life-sustaining processes such as replication, protein translation, and metabolism, viruses require a host cell because they lack the necessary cellular machinery. Viral propagation within the host cell requires the transcription of viral mRNA. To achieve this, the viral mRNA assumes molecular anonymity to evade detection in the host cell cytoplasm: it is structurally modified with a 5′ cap. By evading the host cell defense system, viral mRNA can be effectively translated into proteins. The addition of the N7-methylguanosine cap is necessary for the maturation, stability, nuclear export, and efficient translation of viral mRNA. Eukaryotic mRNA is modified by the addition of the 5′ cap structure, which is a 7-methylguanosine linked to the first transcribed nucleotide by a 5′-5′ triphosphate bridge [10]. The mRNA cap is formed on the first transcribed nucleotide of transcripts by three sequential enzymatic activities: triphosphatase, guanylyltransferase, and methyltransferase [11,12]. The 5′ triphosphate of pre-mRNA is hydrolyzed to diphosphate by a 5′-triphosphatase, and guanosine monophosphate (GMP) is then added to the diphosphate by the RNA guanylyltransferase to create the cap intermediate, GpppN. Guanine-N-7-methyl transferase (GNMT), also known as mRNA cap guanine-N7 methyltransferase, catalyzes the final methylation step and is therefore a necessary part of the RNA capping reaction. It creates the mature cap, m7GpppN, and a byproduct, AdoHcy (S-adenosyl homocysteine), by methylating the cap intermediate using the methyl donor AdoMet [13]. The GNMT in coronaviruses belongs to a large class of SAM (S-adenosyl methionine)-dependent methyltransferases [14]. Additionally, it is linked with a unique 3′ to 5′ exoribonuclease (ExoN) domain in non-structural protein 14 (nsp14).
The diversity of the capping apparatus makes viral RNA capping an attractive target for drug design and development [14,15]. Accordingly, GNMT is an important drug target because its inhibition may induce potent antiviral activity [16]. Incompletely capped mRNAs can be recognized by immune sensors, which trigger innate immunity pathways that culminate in the expression of type I interferon and other cytokines with antiviral activity in neighboring cells [17,18]. The active site of GNMT is found in Pocket 41 and includes residues ARG 289, VAL 290, TRP 292, GLY 333, PRO 335, ASP 352, ALA 353, GLN 354, PRO 355, CYS 356, SER 357, TRP 385, ASN 386, CYS 387, ASN 388 and PHE 426 [19].
In light of the absence of a universally accepted drug for the treatment of COVID-19 and the severity of the pandemic, this study aims to identify potential natural inhibitors of SARS-CoV-2 GNMT.
Materials and Methods
Preparation, analysis, and validation of target protein structure: The 3D structure of SARS-CoV-2 GNMT in the Protein Data Bank (pdb) format (ID: QHD43415_13.pdb) was obtained from the I-TASSER online server with an estimated Template Modelling (TM) score of 0.99 [20]. The Volume, Area, Dihedral Angle Reporter (VADAR 1.8) web server was used to reveal the architecture of GNMT. The structure of the target was further analysed using the Ramachandran plot obtained from the MolProbity web server [21].
Ligand preparation: A library of 1,048 compounds obtained from edible African plants such as fruits, spices, and vegetables was downloaded from the PubChem database [22]. All the compounds had been pre-screened against the Lipinski (hydrogen bond donors (HBD) ≤ 5, hydrogen bond acceptors (HBA) ≤ 10, molecular weight ≤ 500, and logP ≤ 5) and Veber (polar surface area (PSA) ≤ 140, and rotatable bonds ≤ 10) rules [23]. The 3D structures of all the compounds and that of the standard, Sinefungin (PubChem CID 65482), were downloaded from PubChem in the structure-data file (sdf) format [22].
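The pre-screening criteria quoted above can be expressed as a small filter. The following is a minimal sketch only: the `Descriptors` class, its field names, and the example compound are hypothetical, and the strict "no violation" reading of the Lipinski thresholds is an assumption drawn from the wording of the text rather than a documented setting of the original workflow.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Molecular descriptors for one compound (field names are illustrative)."""
    mol_weight: float      # molecular weight, g/mol
    log_p: float           # octanol-water partition coefficient
    hbd: int               # hydrogen bond donors
    hba: int               # hydrogen bond acceptors
    psa: float             # polar surface area, square angstroms
    rotatable_bonds: int


def passes_lipinski(d: Descriptors) -> bool:
    # Assumption: every threshold must hold (no violation tolerated).
    return d.hbd <= 5 and d.hba <= 10 and d.mol_weight <= 500 and d.log_p <= 5


def passes_veber(d: Descriptors) -> bool:
    return d.psa <= 140 and d.rotatable_bonds <= 10


def passes_prescreen(d: Descriptors) -> bool:
    return passes_lipinski(d) and passes_veber(d)


# Hypothetical compound, not taken from the study's library.
example = Descriptors(mol_weight=320.0, log_p=2.1, hbd=2, hba=6,
                      psa=85.0, rotatable_bonds=4)
print(passes_prescreen(example))
```

A compound whose polar surface area exceeds 140 Å² fails `passes_veber` regardless of its other descriptors.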
Molecular docking and virtual screening: In preparation for molecular docking, all the ligands were loaded into the virtual screening software PyRx (Python Prescription), version 0.8, using the Open Babel plug-in tool [24] and converted from sdf to the Protein Data Bank, Partial Charge, & Atom Type (pdbqt) format [25]. To obtain stable conformations, the Universal Force Field (UFF) was used for energy minimization, with conjugate gradient descent as the optimization algorithm. Using the AutoDock Vina plug-in tool in PyRx, all ligands and the standard were docked against the target protein, SARS-CoV-2 GNMT, using the following grid parameters [26]: Centre X = 92.432, Y = 92.529, Z = 92.555 and Dimensions (Angstrom) X = 87.658, Y = 97.427, Z = 64.081 [24]. Using Microsoft Excel, the docking results, exported in comma-separated values (.csv) format, were screened using the docking score of the standard, Sinefungin (-7.6 kcal/mol), as the cut-off. The SwissADME, pkCSM, and Molinspiration web servers were used to predict the molar refractivity, pharmacokinetic properties, and bioactivity of all the ligands, respectively [27-30]. The SMILES for Sinefungin and the ligands were downloaded from PubChem.
Binding site analyses: Using the PyMOL software, the target protein was superimposed with the docked poses of all the front-runner compounds [31]. The Protein-Ligand Interaction Profiler (PLIP) web server was used to evaluate the resultant protein-ligand complexes for hydrogen bonds, salt bridges, and other protein-ligand interactions. The analyses carried out include the name and number of residues, exhaustiveness, bond distance, and bond angle [32]. The binding pockets of the target protein were analysed with the Fpocket web server [19].
Molecular Dynamic Simulations (MDS) and Analyses: A 2-nanosecond MDS of the Apo and Holo structures of SARS-CoV-2 GNMT was performed using the GROMACS tools (versions 2019.1 and 2019.1.4) on the Galaxy supercomputing server [33]. For ligand parameterization, the LigParGen server was used to generate GROMACS-compatible topology files for the small molecules. OPLS-AA/1.14*CM1A was the force field used [34, 35]. After initial conversion to topology files, solvation, energy minimization, and equilibration (NVT and NPT), a 1,000,000-step MDS was performed. The trajectories were analysed using the Bio3D tool on the Galaxy supercomputing platform [36]. The analyses include Principal Component Analysis (PCA), per-residue Root Mean Square Fluctuation (RMSF) of the protein backbone, Root Mean Square Deviation of atomic positions (RMSD), and the Dynamical Cross-Correlation Matrix (DCCM) [37]. The radius of gyration and the B factor were also analysed using the MDWeb web server [38].
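The cut-off screening of the exported docking results can be sketched in a few lines. This is an illustration under stated assumptions: the column names `ligand` and `binding_affinity` and the row `hypothetical_A` are invented for the example, and the strict inequality (scores must be more negative than the standard) is one reading of "cut-off". Only the values for Sinefungin (-7.6 kcal/mol) and Crinamidine (-8.5 kcal/mol) come from the text.

```python
import csv
import io

CUTOFF_KCAL_PER_MOL = -7.6  # docking score of the standard, Sinefungin


def select_hits(csv_text, cutoff=CUTOFF_KCAL_PER_MOL):
    """Return (ligand, score) pairs that bind more strongly than the cutoff.

    More negative scores indicate stronger predicted binding, so a hit
    must have a score strictly below the cutoff (an assumption).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        score = float(row["binding_affinity"])
        if score < cutoff:
            hits.append((row["ligand"], score))
    # Strongest predicted binder first.
    return sorted(hits, key=lambda hit: hit[1])


# Column names and the last row are hypothetical.
sample = (
    "ligand,binding_affinity\n"
    "Sinefungin,-7.6\n"
    "Crinamidine,-8.5\n"
    "hypothetical_A,-6.9\n"
)
print(select_hits(sample))
```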
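The simulation length and step count reported above together imply the integration time step. The short check below makes that arithmetic explicit; the function name is illustrative, and the text itself does not state the time step.

```python
def timestep_fs(total_time_ns: float, n_steps: int) -> float:
    """Integration time step in femtoseconds (1 ns = 1,000,000 fs)."""
    return total_time_ns * 1_000_000.0 / n_steps


# 2 ns over 1,000,000 steps, as reported in the text.
print(timestep_fs(2.0, 1_000_000))
```

Under these figures the implied time step is 2 fs.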
Results and Discussion
Structural analysis, validation, and preparation of SARS-CoV-2 GNMT (QHD43415_13.pdb): The Apo structure of SARS-CoV-2 GNMT (QHD43415_13.pdb) has 527 amino acids with the following constituent secondary structures: α-helix 21%; β-sheets 30%; coil 48%; and turns 16% (Fig. 1). The Total Accessible Solvent Area (ASA) is 260,780 Å². The geometry of SARS-CoV-2 GNMT (QHD43415_13.pdb) reveals 8.01% poor rotamers, 83.98% favored rotamers, 4.00% Ramachandran outliers, 82.29% Ramachandran favored, 3.22% Carbon Beta deviations > 0.25 Å, 0.00% bad bonds and 1.04% bad angles (Fig. 2). The peptide omegas of SARS-CoV-2 GNMT (QHD43415_13.pdb) include 0.00% cis prolines and 3.04% twisted peptides. The low-resolution criteria include 8.2% CaBLAM outliers and 0.96% CA geometry outliers.
Chemoinformatic profile of ligands (Fig. 3, Table 1): A combination of the Ghose, Lipinski, and Veber rules defines the molecular descriptors necessary for good oral bioavailability of drugs and their penetration through biological membranes. The molecular descriptors include a molecular weight ≤ 500 g/mol, log P ≤ 5, hydrogen bond donors ≤ 5, hydrogen bond acceptors ≤ 10, molar refractivity between 40 and 130, number of rotatable bonds ≤ 10, and polar surface area (PSA) ≤ 140 [39-42].
The results in Table 1 reveal that none of the lead compounds violated the Ghose, Lipinski, and Veber rules. This suggests that they have good oral bioavailability and permeability. Therefore, we predict that these compounds are good drug candidates, having met the criteria for drug-likeness assessment [43]. However, the standard (Sinefungin) violates the Veber rule with a high TPSA value (208.65 Å²). This suggests that it would have considerably lower intestinal absorption, blood-brain barrier permeation, and cellular potency than the lead compounds [44].
The molecular complexity of a compound is measured by the ratio of sp3-hybridized carbons to the total carbon count of the molecule (Fraction Csp3). It is an important property in determining the success of drug development. A value of at least 0.25 indicates saturation [45]. From Table 1, all lead compounds and the standard are saturated, suggesting molecular stability. Crinamidine has a higher saturation than the standard, while Sinensetin has the lowest.
Due to problematic structural moieties, promiscuous bioactive compounds interact with multiple biological targets and aggregate under assay conditions, giving false-positive results. While this might be good for polypharmacology, unintended interactions are likely to lead to many undesired side effects [46]. From Table 1, all lead compounds and the standard are predicted to be non-promiscuous.
Beyond binding to the appropriate target, a ligand should elicit a pharmacological effect. Drug candidates are classified based on their bioactivity as GPCR ligands, ion channel modulators, kinase inhibitors, nuclear receptor ligands, protease inhibitors, and other enzyme inhibitors [47]. In this study, the results showed that only the standard and Crinamidine had poor bioactivity scores, as nuclear receptor ligand and kinase inhibitor respectively. All other scores for the standard and lead compounds revealed moderate to good bioactivity against the targets. Furthermore, all lead compounds showed good activity as enzyme inhibitors. While the standard showed the highest enzyme inhibition, Marmesin showed the least activity (Table 1) [29, 48].
Pharmacokinetic properties of ligands: Pharmacokinetic properties play an important role in drug discovery and development. The primary goal of drug discovery or design projects is to identify potential drug candidates that have the greatest efficacy and least toxicity. To avoid failures in the drug development process, it is proper to identify good Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties of the front-runner compounds through in silico methods [28]. An excellent drug candidate should have good ADMET properties at therapeutic doses [28, 49].
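The Fraction Csp3 definition above reduces to a simple ratio. The sketch below uses hypothetical carbon counts; the function names are illustrative and the counts are not taken from Table 1.

```python
def fraction_csp3(n_sp3_carbons: int, n_total_carbons: int) -> float:
    """Ratio of sp3-hybridized carbons to all carbons in the molecule."""
    if n_total_carbons <= 0:
        raise ValueError("molecule must contain at least one carbon")
    return n_sp3_carbons / n_total_carbons


def is_saturated(n_sp3_carbons: int, n_total_carbons: int) -> bool:
    # Threshold of 0.25 as quoted in the text.
    return fraction_csp3(n_sp3_carbons, n_total_carbons) >= 0.25


# Hypothetical molecule with 4 sp3 carbons out of 16 carbons in total.
print(fraction_csp3(4, 16), is_saturated(4, 16))
```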
The penetration of a target molecule by a drug candidate is a good marker of its therapeutic potential and is influenced by absorption parameters such as human intestinal absorption (poor: < 30%), Caco-2 permeability (high: > 0.9), water solubility (insoluble: less than -4.0 log mol/L), and skin permeability (low: log Kp > -2.5). The data in Table 2 suggest that the standard and all lead compounds have good human intestinal absorption and skin permeability. The ability to permeate human epithelial colorectal adenocarcinoma (Caco-2) cells is lowest in the standard and highest in Crinamidine.
The pharmacological markers for distribution include CNS permeability (permeable: Log PS > -2; poor: Log PS < -3), BBB permeability (permeable: Log BBB > 0.3; poor: Log BBB < -1), volume of distribution at steady state (low: Log VDss < -0.15; high: Log VDss > 0.45), and fraction unbound. From Table 2, Sinensetin has a high VDss, while the values for Marmesin and Crinamidine are below the pharmacological range. This can be corrected by dose adjustment.
The standard and Marmesin have a poor ability to permeate into the brain tissue, while other lead compounds can permeate. The standard and all the lead compounds have poor CNS permeability. The fraction unbound values for standard and all lead compounds are within an acceptable range.
P-glycoprotein is a transmembrane efflux pump that pumps its substrates from inside to outside the cell [50]. All the lead compounds except Marmesin were shown to be P-glycoprotein substrates, which implies that they should be co-administered with a P-glycoprotein inhibitor to prevent a potential reduction in absorption and oral bioavailability that would decrease the retention time of the drug [51]. However, all lead compounds and the standard showed no inhibition of P-glycoprotein I and II, indicating a lower likelihood of its substrates inducing cellular toxicity and drug interactions [52,53].
The predicted metabolic behavior of bioactive compounds is a determinant of their inclusion or elimination in the drug discovery process. The inhibition or non-inhibition of the isoforms of the Cytochrome P450 enzyme determines whether the drug candidates would undergo biotransformation or accumulate in the cellular spaces with toxic tendencies. If drug candidates are Cytochrome P450 enzyme substrates, they would be administered with inhibitors to facilitate their metabolism [54]. From Table 2, all lead compounds are neither inhibitors nor substrates of the CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4 enzymes.
The predicted Total Clearance values for the standard and the lead compounds are within the pharmacological range [23]. Similarly, they are all predicted to be non-substrates of Renal Organic Cation Transporter 2 (OCT2). This implies that their elimination is not expected to depend on OCT2-mediated transport from the blood into the proximal tubular cells [24].
The toxicity profile for the standard and all lead compounds suggests that they are non-mutagenic, non-cardiotoxic, non-hepatotoxic, and non-dermatotoxic, as revealed in their AMES toxicity, hERG I & II toxicity, hepatotoxicity, and skin sensitization predictions respectively [28].
The dose administered at clinical trials is determined by the maximum recommended tolerated dose. Values less than 0.477 log mg/kg/day are considered low, while values higher than 0.477 log mg/kg/day are considered high. From Table 2, the predicted values suggest that Marmesin and Crinamidine are the most and least potent compounds respectively [55]. The predicted values for Oral Rat Acute Toxicity and Oral Rat Chronic Toxicity should be considered alongside factors such as the concentration of the drug, the dose, and the length of time over which it is administered [55]. In this study, the data on oral rat acute and chronic toxicity were obtained from the pkCSM online server.
Inhibition of 50% of the growth of Tetrahymena pyriformis, a ciliated protozoan (IGC50), is a toxicity marker in drug discovery. When the pIGC50 value is greater than -0.5 log µg/L, the drug candidate is considered toxic. From Table 2, all lead compounds and the standard are predicted to be toxic against T. pyriformis, suggesting antimicrobial properties (that might be harmless to human cells) [55]. Similarly, in fathead minnows, the log LC50 is the log of the concentration of a compound that causes the death of 50% of the population. High acute toxicity is indicated by values less than 0.3 log mM. The results in Table 2 show that all lead compounds and the standard are not toxic to minnows [55].
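Because the threshold above is given on a log10 scale, it can help to see the corresponding absolute dose. The sketch below converts it and classifies a value; the function name is illustrative, and treating a value exactly equal to 0.477 as "low" is an assumption, since the text leaves that boundary unspecified.

```python
MRTD_THRESHOLD_LOG = 0.477  # log10 of mg/kg/day, as quoted in the text


def classify_mrtd(log_dose: float) -> str:
    """Classify a predicted maximum recommended tolerated dose.

    Assumption: a value exactly at the threshold is treated as low.
    """
    return "high" if log_dose > MRTD_THRESHOLD_LOG else "low"


# The threshold corresponds to roughly 3 mg/kg/day on a linear scale.
print(round(10 ** MRTD_THRESHOLD_LOG, 2))
print(classify_mrtd(0.2), classify_mrtd(0.9))
```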
Molecular docking analyses of ligands against SARS-CoV-2 GNMT: In molecular docking, the binding affinity score is a measure of the ability of the small molecule to find the optimal conformation in the protein binding pocket. Hence, the ligand with the lowest binding energy has the greatest predicted binding affinity, making it a possible drug candidate [56].
All lead compounds show greater potential as drug candidates because they all have a stronger binding affinity than the standard. Crinamidine has the strongest binding affinity, at -8.5 kcal/mol (Table 3).
Binding site analyses: Hydrogen bonding plays an important role in many biochemical processes such as protein-ligand interactions. By displacing water molecules, it enhances ligand binding [57]. Also, the orientation and length of an intermolecular hydrogen bond determine the direction and specificity of ligand binding [58].
Hydrogen bonds (H-bonds) are abundant in nature and are vital in protein folding, protein-ligand interactions as well as catalytic reactions. In biological systems, they are generally considered facilitators of protein-ligand binding [59,60]. An increasing number of H-bonds between protein and drug molecule in molecular simulations is indicative of a stronger binding affinity [61].
Figures 4 & 5 and Table 4 reveal that the standard has the highest number of intermolecular hydrogen bonds (eight), while Marmesin forms the fewest (one). Of all the lead compounds, Crinamidine has the highest number of hydrogen bonds (four).
All hydrogen bonds of the lead compounds and the standard fall within Pocket 41. Regarding the angles formed by hydrogen bonds, the standard forms four strong (greater than 130°) and four weak (less than 130°) hydrogen bonds with the target protein. Crinamidine forms two weak and two strong hydrogen bonds. The other lead compounds form only weak hydrogen bonds [62].
Regarding the donor-to-acceptor distance, the standard forms six moderate (2.5-3.2 Å) and two weak (3.2-4.0 Å) hydrogen bonds with the target protein. Crinamidine forms two moderate and two weak hydrogen bonds. Marmesin and Sinensetin form only weak bonds [62].
The identification of potential protein-ligand interactions is an integral aspect of drug discovery, as it aids the discovery of possible new drug leads, thus contributing to the advancement from hits to leads and to the prediction of likely explanations for side effects of approved drug candidates [63]. The most frequently observed interactions in ligand design are hydrophobic bonds, hydrogen bonds, and π-stacking, followed by weak hydrogen bonds, salt bridges, amide stacking, and cation-π interactions [64]. The presence of hydrophobic interactions and salt bridges further strengthens and stabilizes the protein-ligand complexes [65].
The salt bridge is the strongest non-covalent bond, and it gives greater stability to the protein-ligand complex [66]. From Table 5, the GNMT-Crinamidine and GNMT-Marmesin complexes form salt bridges at residues ASP352 and HIS424 respectively. GNMT-Marmesin also has the highest number of hydrophobic interactions. This suggests a slightly more atom-efficient binding than in the other complexes. In GNMT-Crinamidine, π-stacking also contributes to the small-molecule interaction.
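The angle and distance criteria used in this section to grade hydrogen bonds can be written as two small classifiers. This is a sketch only: the function names are illustrative, and the handling of values that fall exactly on a boundary (130°, 3.2 Å) is an assumption because the text does not specify it.

```python
def classify_by_angle(angle_degrees: float) -> str:
    """Grade a hydrogen bond by its donor-hydrogen-acceptor angle.

    Assumption: exactly 130 degrees is treated as weak.
    """
    return "strong" if angle_degrees > 130.0 else "weak"


def classify_by_distance(distance_angstrom: float) -> str:
    """Grade a hydrogen bond by its donor-to-acceptor distance.

    Assumption: exactly 3.2 angstroms is treated as moderate.
    """
    if 2.5 <= distance_angstrom <= 3.2:
        return "moderate"
    if 3.2 < distance_angstrom <= 4.0:
        return "weak"
    return "outside quoted ranges"


# Hypothetical bond geometries, not taken from Table 4.
print(classify_by_angle(152.0), classify_by_distance(2.9))
print(classify_by_angle(118.0), classify_by_distance(3.6))
```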
Analysis of MDS
Root Mean Square Deviation of Atomic Positions (RMSD): In computational studies, the RMSD is used to assess the quality of a reproduced binding pose. The new structures induced by simulation and/or ligand binding are compared to a reference structure for which the RMSD is zero. The structural distance between the Cα atoms of the protein backbone is used as the means of evaluation. Lower RMSD values show greater stability of the biological configuration, while higher values suggest greater structural instability [56, 67, 68]. Fig. 6 is a screenshot showing the conformational changes that the Apo and Holo structures underwent after the MDS.
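The RMSD described above can be illustrated with a few lines of plain Python. This is a minimal sketch, assuming the two structures have already been superimposed and list the same Cα atoms in the same order; it is not the routine used by the analysis tools named in the Methods.

```python
import math


def rmsd(reference, structure):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed (no fitting is done).
    """
    if len(reference) != len(structure) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, structure):
        squared += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared / len(reference))


# Hypothetical three-atom example: every atom displaced by 1 angstrom along x.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = [(x + 1.0, y, z) for x, y, z in ref]
print(rmsd(ref, ref), rmsd(ref, moved))
```

A structure compared with itself gives zero, which matches the statement that the reference sits at an RMSD of zero.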
In a 2-nanosecond trajectory, the RMSD of Apo and Holo-structures were measured over consistent time frames (Fig. 7 and Table 6). Of all the Holo-structures, the GNMT-Sinefungin complex has the least total and average RMSD values. The other lead compounds produced greater total and average RMSD values than the standard (Sinefungin). Crinamidine followed closely by Sinensetin induced the greatest total and average RMSD values.
There is a steep increase in RMSD of the simulated Apo protein relative to the crystal structure as the production time increased. The slope suggests that the RMSD values would increase with more simulation time. Similarly, the Holo-structures formed by the Crinamidine and Sinensetin also showed a steep increase of RMSD values all through the trajectory showing instability. This is also shown in the time frame in which their respective highest RMSD values were attained (20 and 19 respectively). The GNMT-Marmesin complex shows a gentle slope that flattens towards the end of the trajectory. The GNMT-Sinefungin complex shows the greatest stability with the least gradient of the slope.
The distribution of RMSD values of the Apo and Holo-structures (Fig. 8 and Table 6) suggests that the greatest deviation to the right of the respective reference structures comes from the GNMT-Crinamidine complex. A total of 17 peaks were found between RMSD values of 3.0 and 5.0 Å for the GNMT-Crinamidine complex, while 17, 17, and 16 peaks were found in the same positions for the GNMT-Sinefungin, GNMT-Sinensetin, and GNMT-Marmesin complexes respectively. The GNMT-Crinamidine complex shows a wider RMSD range than the GNMT-Sinefungin and GNMT-Sinensetin complexes. This is because the GNMT-Crinamidine complex has a peak in the 5.00-5.49 Å range, while the GNMT-Sinefungin complex has no peak beyond 4.0 Å and the GNMT-Sinensetin complex has no peak beyond 5.0 Å.
Put together, during the course of the simulation, the ligand-induced protein conformations changed between different time points in the trajectory. The RMSD data suggest that Crinamidine, Marmesin, and Sinensetin, in this order, induced more structural distortion of GNMT than the standard. Crinamidine, followed closely by Sinensetin, showed the greatest ligand-induced instability of the viral protein.
RMSF: The function of a protein is largely dependent on its structure and dynamics. Protein motions are global, regional (domain or active site), and local (residue). Protein dynamics can be evaluated by measuring the root mean square fluctuations (RMSF) of aligned residues [69].
From Figure 9 and Table 6, the total and average global RMSF are greater in the GNMT-Crinamidine complex than in all the other Holo-structures and lowest in the GNMT-Sinefungin complex (standard). In this regard, the GNMT-Crinamidine complex is followed by the GNMT-Sinensetin complex. The total and average regional (Pocket 41) RMSF also remained highest in the GNMT-Crinamidine complex, followed by the GNMT-Sinensetin complex. The lowest values for Pocket 41 are seen in the GNMT-Sinefungin complex. In a similar vein, the highest fluctuation and the widest range of RMSF were found in the GNMT-Crinamidine complex, followed by the GNMT-Sinensetin complex. The GNMT-Sinefungin complex (standard) had the lowest values.
Put together, Crinamidine showed the most instability, with the greatest fluctuations at both global and regional sites, followed by Sinensetin. Sinefungin showed the least fluctuation both globally and at the regional (Pocket 41) site.
Radius of Gyration: The RoG analysis is run to ascertain the compactness of the secondary structures within the 3D structure of the protein. It is measured from the center of mass of the molecule, with a high RoG suggesting loose packing and a low RoG suggesting tight packing of the protein [70].
The graphical representation of the RoG reveals that the GNMT-Crinamidine complex has a steep upward slope, showing the least compactness. The GNMT-Sinefungin complex also progressed upward, albeit with a gentle slope (Figure 10). The GNMT-Marmesin complex shows a gentle slope with a downward trend as the trajectory progressed. The GNMT-Sinensetin complex appears to be flat with a slight downward trend.
Trajectory data for RoG reveal that the GNMT-Crinamidine complex had the highest values of average gyration, range of gyration, and percentage gyration over the trajectory. This made it the least compact of all the Holo-structures. The GNMT-Sinefungin complex is the most compact and only marginally different from the GNMT-Marmesin complex (Table 6). Put together, Crinamidine, followed closely by Sinensetin, induced the greatest conformational changes in the target protein, as shown by the least compactness. This suggests that they are better GNMT inhibitors than the standard.
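The per-atom RMSF mentioned above measures how far each atom strays from its own time-averaged position. The following is a minimal sketch on hypothetical coordinates, assuming the frames have already been aligned; the data layout (a list of frames, each a list of `(x, y, z)` tuples) is chosen for illustration only.

```python
import math


def rmsf(frames):
    """Per-atom RMSF over aligned frames.

    frames: list of frames, each a list of (x, y, z) tuples, one per atom.
    Returns one fluctuation value per atom, in the input length unit.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [sum(frame[atom][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msf = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


# Hypothetical two-atom, two-frame trajectory: atom 0 oscillates, atom 1 is fixed.
trajectory = [
    [(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]
print(rmsf(trajectory))
```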
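The radius of gyration described above is the mass-weighted root mean square distance of the atoms from the center of mass. The sketch below computes it for a single frame on hypothetical coordinates and masses; it is an illustration of the quantity, not the procedure used by the web server named in the Methods.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z) tuples; masses: matching list of atomic masses.
    """
    total_mass = sum(masses)
    # Center of mass of the structure.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted squared distances from the center of mass.
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Hypothetical example: two equal masses placed 2 angstroms apart.
compact = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
loose = [(-2.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radius_of_gyration(compact, [12.0, 12.0]))
print(radius_of_gyration(loose, [12.0, 12.0]))
```

Spreading the same atoms further from the center of mass raises the value, which is the sense in which a high RoG indicates loose packing.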
B-Factor: The B-factor, or temperature factor, is an evaluation of the thermostability of the protein molecule, as it measures the internal atomic motions reflected in their flexibility or rigidity [71]. The B-factor also directly impacts the residual factor (R factor), which is a determinant of the stereochemical quality of protein structure coordinates [72].
From Figure 11 and Table 6, the graphical plots of the B-factor values show high values at the termini of the protein molecules, suggesting molecular flexibility at these ends, and indicate that the GNMT-Sinefungin complex is the most thermally stable of all the Holo-structures. The global average B-factor value of the GNMT-Crinamidine complex is the highest of all the Holo-structures, while the GNMT-Sinefungin complex has the lowest value. This suggests that, at the global level, the conformations induced by the lead compounds are more thermally unstable than the conformation induced by the standard. In a similar vein, the regional average B-factor obtained from Pocket 41 suggests that the GNMT-Crinamidine complex has the highest value of all the Holo-structures, followed by the GNMT-Sinensetin complex. Only in the GNMT-Crinamidine complex is the average B-factor value for Pocket 41 higher than the global average. The GNMT-Marmesin complex has the lowest B-factor value at the regional level. Put together, the greatest temperature-dependent atomic vibrations were induced by Crinamidine binding, causing the greatest dynamic disorder of the GNMT stereochemistry.
Principal Component Analysis (PCA): New conformations are generated during the molecular dynamic simulation of a protein. The statistical significance of these conformations is determined by the use of principal component analysis (PCA) [73]. Of all the Holo-structures, the total global motions (mean of PC1, PC2, and PC3) were highest in the GNMT-Sinensetin complex and lowest in the GNMT-Crinamidine complex. However, the total regional motions (mean of PC1, PC2 & PC3) were highest in the GNMT-Sinefungin complex, followed closely by the GNMT-Sinensetin and GNMT-Crinamidine complexes (Figure 12 and Table 6).
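In simulation analyses, a B-factor is commonly estimated from atomic fluctuations through the isotropic relation B = (8π²/3)·RMSF². The sketch below applies that relation; whether the web server used in this study computes its B-factor values in exactly this way is an assumption, since the text does not describe the calculation.

```python
import math


def b_factor_from_rmsf(rmsf_angstrom: float) -> float:
    """Isotropic B-factor (square angstroms) from an RMSF in angstroms.

    Uses B = (8 * pi^2 / 3) * RMSF^2; assumed here, not stated in the text.
    """
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_angstrom ** 2


# Hypothetical fluctuations of 0.5 and 1.0 angstrom.
for value in (0.5, 1.0):
    print(value, round(b_factor_from_rmsf(value), 2))
```

Because the relation is quadratic, doubling the fluctuation quadruples the estimated B-factor.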
Specifically, based on the greatest motions, the best global conformations are PC2 of the Apo protein, PC1 of the GNMT-Sinefungin complex, PC1 of the GNMT-Crinamidine complex, PC1 of the GNMT-Marmesin complex, and PC3 of the GNMT-Sinensetin complex. Of all these Holo structures, the GNMT-Sinensetin complex has the greatest motion. Similarly, the best conformations that produced the greatest motions at Pocket 41 are PC3, PC3, PC3, PC1, and PC2 of the Apo protein, GNMT-Sinefungin complex, GNMT-Crinamidine complex, GNMT-Marmesin complex, and the GNMT-Sinensetin complexes respectively. Of all these Holo-structures, the GNMT-Crinamidine complex has the greatest motion at the Pocket 41.
The convergence of the MD simulation is revealed by the cosine contents of the principal components. Convergence shows sampling quality, accuracy, and reproducibility. Table 6 shows the results of the cosine content. They show good quality except for a slight non-convergence at the PC3 of the GNMT-Sinefungin complex [74].
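The cosine content of a principal component compares its time series with a half-period cosine; values close to 1 indicate motion resembling random diffusion and therefore poor sampling. The sketch below assumes the commonly used definition, c_i = (2/T)·(∫cos(iπt/T)·p_i(t)dt)² / ∫p_i(t)²dt, which the text does not spell out, and evaluates the integrals by a midpoint sum over equally spaced frames.

```python
import math


def cosine_content(projection, i=1):
    """Cosine content of a principal-component projection.

    projection: values of the projection at equally spaced frames.
    i: index of the cosine (1 for PC1, 2 for PC2, ...).
    Time is rescaled to the unit interval and sampled at frame midpoints.
    """
    n = len(projection)
    numerator = 0.0
    denominator = 0.0
    for k, p in enumerate(projection):
        t = (k + 0.5) / n
        numerator += math.cos(i * math.pi * t) * p
        denominator += p * p
    return 2.0 * numerator ** 2 / (n * denominator)


# Synthetic projections (not from the study): a pure half cosine and a full cosine.
n_frames = 100
half_cosine = [math.cos(math.pi * (k + 0.5) / n_frames) for k in range(n_frames)]
full_cosine = [math.cos(2.0 * math.pi * (k + 0.5) / n_frames) for k in range(n_frames)]
print(cosine_content(half_cosine), cosine_content(full_cosine))
```

The pure half cosine scores essentially 1, while a projection orthogonal to it scores essentially 0.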
The dynamic cross-correlation (DCC) analysis: This is a standard method for analyzing significant intermolecular contacts that are rapidly substituted by side-chain flipping in molecular dynamic simulations [75]. The dynamic cross-correlation map captures the multimodal characteristics of atoms, especially at the interface of macromolecules by quantifying the correlation coefficients of motions between atoms depicting data as positive and negative correlation effect of amino acids [75,76].
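Each entry of a dynamic cross-correlation map is a normalized covariance of the displacements of two atoms from their mean positions: +1 for fully correlated motion, -1 for fully anti-correlated motion, and 0 for uncorrelated motion. The following is a minimal sketch for a single pair of atoms on synthetic data, assuming aligned frames; a full map repeats the calculation for every residue pair.

```python
import math


def cross_correlation(traj_i, traj_j):
    """Normalized displacement correlation between two atoms.

    traj_i, traj_j: lists of (x, y, z) positions, one per aligned frame.
    """
    n = len(traj_i)
    mean_i = [sum(p[k] for p in traj_i) / n for k in range(3)]
    mean_j = [sum(p[k] for p in traj_j) / n for k in range(3)]
    dot = sq_i = sq_j = 0.0
    for a, b in zip(traj_i, traj_j):
        da = [a[k] - mean_i[k] for k in range(3)]
        db = [b[k] - mean_j[k] for k in range(3)]
        dot += sum(x * y for x, y in zip(da, db))
        sq_i += sum(x * x for x in da)
        sq_j += sum(y * y for y in db)
    return dot / math.sqrt(sq_i * sq_j)


# Synthetic two-frame trajectories: atoms moving together, then in opposition.
atom_a = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
atom_b_same = [(4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
atom_b_opposite = [(6.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(cross_correlation(atom_a, atom_b_same))
print(cross_correlation(atom_a, atom_b_opposite))
```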
From Figure 13, the strongest overall anti-correlated motion of residues occurred in the GNMT-Crinamidine complex. The active site of GNMT falls within the range of residues 289-426. The GNMT-Sinefungin complex showed non-correlated motions between residues 300-400, while the other residues in the active site showed moderate anti-correlated motions. The GNMT-Crinamidine complex showed strong anti-correlated motions between residues 250-450, which covers the whole area of the active site. The GNMT-Marmesin complex shows moderate anti-correlated motions at approximately residues 280-300, predominantly non-correlated motions between residues 300-350, and predominantly moderate anti-correlated motions between residues 350-400. The GNMT-Sinensetin complex showed non-correlated, moderately correlated, and moderately anti-correlated motions between residues 250-300. However, the greater portion, consisting of residues 300-450, shows moderate anti-correlated motions.
Put together, the greatest anticorrelation motions both globally and regionally (at the active site) were found in the GNMT-Crinamidine complex suggesting the greatest inhibitory activity. The heat map of the GNMT-Sinensetin complex also suggests a greater inhibitory activity than the standard at the active site.
The compounds all showed good oral bioavailability properties except for the standard, which has a high TPSA value. The standard and the lead compounds all showed favorable absorption, metabolism, excretion, and toxicity properties. The distribution pharmacokinetics are generally favorable, except that all the compounds have poor CNS permeability and poor BBB permeability (except the standard and Marmesin), and they are P-glycoprotein substrates (except Marmesin). The standard has the highest number of hydrogen bonds formed within the active site, followed by Crinamidine. The trajectory data, such as RMSD, RMSF, B-factor, DCCM, and RoG, suggest that Crinamidine caused the greatest distortion of the target protein while the standard caused the least, at both the global and regional (Pocket 41) levels. Specifically, of all the compounds, the PC3 of Crinamidine is the conformation that caused the greatest distortion at the active site.
Isolated from Streptomyces species, Sinefungin is a natural nucleoside that is a derivative of S-adenosylmethionine (SAM) [77]. It has shown a wide range of biological effects, which include amoebicidal, antifungal, antibacterial (Streptococcus pneumoniae) and antiparasitic (Plasmodium, malarial, trypanosomal, and leishmanial species) activities [77, 78, 79, 80]. The antiviral activity of Sinefungin has also been established, as it has been shown to be an inhibitor of mRNA(guanine-7-)-methyltransferase, mRNA(nucleoside-2'-)-methyltransferase, and DNA methyltransferases [81, 82]. Sinefungin has been shown to inhibit the multiplication of feline herpesvirus type I, Newcastle disease virus and vaccinia virus [82, 83].
Crinamidine is an alkaloid obtained from Crinum latifolium and Talinum triangulare. In Chinese ethnomedicine, the antiviral and antitumor properties of the extract of Crinum latifolium have been reported [84, 85]. Sinensetin can be found in orange (Citrus sinensis) peel, and it has a wide range of biological activities, including antiviral, anticancer, antitumor, and anti-inflammatory activity. Sinensetin is an important ingredient of the aqueous extract of Orthosiphon stamineus, which has shown inhibitory properties against Herpes Simplex Virus type 1 [86, 87]. Marmesin can be found in mango and wheat [88]. Its inhibitory activity against the Epstein-Barr virus (EBV) has been reported [89].
Conclusion. After the virtual screening of a library of 1,048 natural compounds against the SARS-CoV-2 GNMT, three lead compounds namely Crinamidine, Sinensetin and Marmesin were identified. Overall, the lead compounds proved to be better drug candidates than the standard in the following order: Crinamidine, Sinensetin and Marmesin.
It is recommended that the inhibitory effect of Crinamidine, Sinensetin and Marmesin on the active site of SARS-CoV-2 GNMT should be further investigated.
Reference lists