Introduction to Molecular Modeling
Molecular modeling is a general term which covers a wide range of molecular graphics and computational chemistry techniques used to build, display, manipulate, simulate, and analyze molecular structures, and to calculate properties of these structures.
Molecular modeling is used in a number of different research areas, and therefore the term does not have a rigid definition. To a chemical physicist, molecular modeling might imply performing a high quality quantum mechanical calculation using a supercomputer on a structure with 4 or 5 atoms; to an organic chemist, molecular modeling might mean displaying and modifying a candidate drug molecule on a desktop computer. The criterion for a successful modeling experiment should not be how accurately the calculations are performed, but whether they are useful in rationalizing the behavior of the molecule, or in enhancing the creativity of the chemist in the design of novel compounds.
This section gives a brief introduction to techniques that can be used to model molecules of interest to the bio-pharmaceutical industry. The references included in Appendix A contain more in-depth information. Chapter 3, Basic Applications, lists a series of tutorials designed to help you get started with molecular modeling in Insight II.
A somewhat arbitrary, but frequently used, distinction is to divide molecular modeling techniques into molecular graphics and computational chemistry. Molecular graphics is the core of a modeling system, providing for the visualization of molecular structures and their properties. Molecular graphics provides the ability to display structures in a variety of styles (from simple line displays to solid renderings known as CPKs) and color schemes, with visual aids such as depth-cueing, and the ability to move the structures interactively in three dimensions. Simple tools for manipulating structures, such as modifying torsion angles and calculating geometry, are frequently included under the molecular graphics banner.
The visualization of molecular properties is an extremely important aspect of molecular modeling. The properties might be calculated using a computational chemistry program and visualized as three-dimensional contours, along with the associated structures. While manipulation of structures is usually interactive, the calculation of properties may require significant computer time. Calculations are usually run in the background rather than interactively, leaving the modeling system free for interactive work. The graphics part of the modeling system also provides the interface to the computational chemistry tools, allowing calculations to be defined and run, and then analyzed when complete. In Insight II the Viewer module provides the tools for displaying and manipulating structures and representations of their properties.
There are a number of sources of experimentally derived molecular structural data. These include the Brookhaven Protein Data Bank which maintains a library of protein and nucleic acid structures, and the Cambridge Structural Database with a database of small molecule structures. (Basics Application 6 shows how to use structural data from these sources.) In the course of a modeling experiment it is frequently necessary to modify these structures, and of course to build completely new structures. A number of techniques exist for building and modifying structures.
Building and Modifying Structures
Small molecule structures can be built in three dimensions by joining together basic building blocks from a fragment library, and then modifying atom and bond types (this functionality is provided by the Builder module; see Basics Applications 1 and 2). Macromolecule structures such as proteins and nucleic acids, consisting of large numbers of specific units, can be built by specifying the sequence of the units and the conformation in which the units should be joined. When necessary, individual units (amino acids or nucleotides) may be modified or replaced in the same manner that small molecule structures are modified (see the Biopolymer module documentation).
An alternative to building in three dimensions (3D) is to sketch a structure freehand in two dimensions, and then convert the sketch into three dimensions. This approach has the advantage of speed and the ease with which complex structures can be built; only element types, connectivity, and (where necessary) stereochemistry need to be defined. This capability is provided by the MolBuilder toolbox of the Builder module, which uses distance geometry techniques to convert a two-dimensional sketch to 3D. From the connectivity of the sketch, distance geometry derives a series of distance bounds for all atom pairs, and a 3D structure is generated in which the interatomic distances satisfy these bounds. The strength of the 3D conversion algorithm is that it is virtually insensitive to the quality of the original 2D sketch. (See Basics Applications 1 and 2 for examples of sketching.)
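The idea of deriving distance bounds from connectivity can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions, not the algorithm used by MolBuilder: every bond is given the same hypothetical length (`bond_length`), every nonbonded pair the same hypothetical lower bound (`vdw_contact`), and upper bounds are tightened with the triangle inequality. A real implementation would use element-specific lengths, angle-derived 1-3 distances, and stereochemistry.

```python
def distance_bounds(n_atoms, bonds, bond_length=1.5, vdw_contact=2.0):
    """Return (lower, upper) distance-bound matrices in angstroms.

    bonds is a list of (i, j) atom index pairs taken from the 2D sketch.
    bond_length and vdw_contact are illustrative values, not real parameters.
    """
    inf = float("inf")
    upper = [[0.0 if i == j else inf for j in range(n_atoms)] for i in range(n_atoms)]
    lower = [[0.0 if i == j else vdw_contact for j in range(n_atoms)] for i in range(n_atoms)]

    # Bonded pairs are pinned to the assumed bond length.
    for i, j in bonds:
        upper[i][j] = upper[j][i] = bond_length
        lower[i][j] = lower[j][i] = bond_length

    # Triangle-inequality smoothing of the upper bounds: two atoms can be
    # no further apart than the shortest path of bounds connecting them.
    for k in range(n_atoms):
        for i in range(n_atoms):
            for j in range(n_atoms):
                if upper[i][k] + upper[k][j] < upper[i][j]:
                    upper[i][j] = upper[i][k] + upper[k][j]
    return lower, upper


# A four-atom chain 0-1-2-3, as it might come from a sketch.
lower, upper = distance_bounds(4, [(0, 1), (1, 2), (2, 3)])
```

A 3D structure would then be generated by choosing coordinates whose interatomic distances fall between `lower` and `upper` for every pair; that embedding step is omitted here.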
When a structure is built it usually needs to be refined to bring it to a stable, sterically acceptable, conformation. This is especially true after building certain structures in three dimensions, when the process of adding fragments can generate serious atom clashes. The refinement process is known as minimization (or optimization), and is an iterative procedure in which the coordinates of the atoms are adjusted so that the energy of the structure is brought to a minimum. The structure with the lowest energy is considered to have the most stable arrangement, and by definition the optimum geometry. Minimization generally results in a modeled structure with a close resemblance to a real physical structure. The ability to compute the energy of a structure is a necessary part of the minimization process, and is an extremely important aspect of a modeling system.
Molecular mechanics techniques take a classical approach to calculating the energy of a structure. The molecule is treated essentially as a set of charged point masses (the atoms) which are coupled together with springs. The total energy of a structure is calculated using an analytical function which sums a number of individual energy terms. At its simplest level the function includes bond stretching, valence angle bending, torsion, and nonbond interaction terms, which assign an energetic penalty to the structure based upon deviations from an idealized equilibrium geometry. For instance, the bond term is a summation over all bonds in the structure, in which the energy of each bond is evaluated based on how far it is deformed from its equilibrium value. The amount by which the energy increases for a given deviation from the ideal value is determined by a parameter called a force constant.
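As an illustration of the bond term, the following Python sketch sums a harmonic penalty over all bonds. The harmonic form k * (b - b0)^2 is an assumption made here for simplicity; forcefields differ in the exact functional form and in whether a factor of one half is included. The function name and the force constant in the example are hypothetical.

```python
import math


def bond_energy(coords, bonds):
    """Sum harmonic bond-stretch terms over all bonds.

    coords: list of (x, y, z) tuples in angstroms.
    bonds:  list of (i, j, force_constant, b0) tuples, where b0 is the
            equilibrium length and force_constant sets the penalty scale.
    """
    total = 0.0
    for i, j, force_constant, b0 in bonds:
        b = math.dist(coords[i], coords[j])      # current bond length
        total += force_constant * (b - b0) ** 2  # penalty for deformation
    return total


# One bond stretched 0.1 angstrom beyond its equilibrium length of 1.5.
energy = bond_energy([(0.0, 0.0, 0.0), (1.6, 0.0, 0.0)], [(0, 1, 300.0, 1.5)])
```

The angle and torsion terms follow the same pattern: measure the internal coordinate, compare it with its ideal value, and add the resulting penalty to the total.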
The nonbond interaction energy actually includes three terms:
1. van der Waals attraction,
2. van der Waals repulsion, and
3. electrostatic interactions.
These three terms are summed for all atom pairs in the structure that are 1-4 nonbonded and above. Any atom pair in which one atom is bonded to, or involved in an angle interaction with, the other member of the pair is excluded from the nonbond energy calculation. The electrostatic term requires that each atom be assigned a partial charge; the interaction between any two atoms can be either attractive or repulsive, depending on the signs of the charges involved. The number of nonbonded atom pairs rises with the square of the number of atoms, and therefore the nonbond interaction term dominates the computer time required for energy evaluation in calculations on larger structures. For this reason, it is normal to limit the nonbond energy evaluation to atom pairs within a certain cutoff distance, based upon the assumption that atom pairs separated by a larger distance make a negligible contribution to the total energy.
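The pair loop, the exclusion of bonded and angle-related pairs, and the cutoff can be sketched as follows. The sketch assumes a 12-6 Lennard-Jones form for the van der Waals terms and a plain Coulomb term, with one hypothetical `epsilon` and `r_min` shared by all atoms; a real forcefield assigns these per potential type and may use a different functional form. The default cutoff is an arbitrary illustrative value.

```python
import math
from itertools import combinations

# Approximate factor converting q_i * q_j / r (charges in electron units,
# r in angstroms) to kcal/mol.
COULOMB_CONSTANT = 332.06


def excluded_pairs(n_atoms, bonds):
    """Collect 1-2 (bonded) and 1-3 (angle) pairs, which are skipped."""
    neighbors = {i: set() for i in range(n_atoms)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    excluded = set()
    for i in range(n_atoms):
        for j in neighbors[i]:
            excluded.add(frozenset((i, j)))          # 1-2 pair
            for k in neighbors[j]:
                if k != i:
                    excluded.add(frozenset((i, k)))  # 1-3 pair
    return excluded


def nonbond_energy(coords, charges, bonds, epsilon, r_min, cutoff=9.5):
    """Sum repulsion, attraction, and electrostatics over 1-4 and higher pairs."""
    excluded = excluded_pairs(len(coords), bonds)
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        if frozenset((i, j)) in excluded:
            continue
        r = math.dist(coords[i], coords[j])
        if r > cutoff:
            continue  # assumed negligible beyond the cutoff
        ratio6 = (r_min / r) ** 6
        repulsion = epsilon * ratio6 ** 2
        attraction = -2.0 * epsilon * ratio6
        electrostatic = COULOMB_CONSTANT * charges[i] * charges[j] / r
        total += repulsion + attraction + electrostatic
    return total
```

The double loop over pairs is what makes the cost grow with the square of the number of atoms; production codes avoid testing every pair by keeping a neighbor list that is refreshed only occasionally.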
The idealized equilibrium values of bond lengths and angles, plus the force constants, and the van der Waals radii and associated constants required to calculate the nonbond interactions, are stored in a file which is referred to when the energy calculation is run. The combination of these parameters with the functional forms of the individual energy terms is known as a forcefield. It is important to appreciate that the ideal values of bonds and angles cannot be based upon element type alone. For example, a carbon-carbon single bond is longer than a carbon-carbon double bond. For this reason, each atom in a structure must have a potential type assigned before the energy calculation is run. The potential type assigned depends on the hybridization state and environment of each atom. The forcefield parameters themselves are stored in the parameter file according to potential type.
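The role of potential types can be illustrated with a small lookup table. Everything named here is hypothetical: the potential type labels (`c_sp3`, `c_sp2`), the table layout, and the force constants are invented for the example and do not reflect the format of any actual Insight II parameter file. The equilibrium lengths are typical textbook values for carbon-carbon single and double bonds.

```python
def bond_key(type_a, type_b):
    """Order-independent key so that (a, b) and (b, a) find the same entry."""
    return tuple(sorted((type_a, type_b)))


# Hypothetical parameter table: equilibrium length b0 in angstroms and an
# illustrative force constant k, stored by potential type rather than element.
BOND_PARAMETERS = {
    bond_key("c_sp3", "c_sp3"): {"b0": 1.53, "k": 300.0},
    bond_key("c_sp2", "c_sp2"): {"b0": 1.34, "k": 600.0},
    bond_key("c_sp2", "c_sp3"): {"b0": 1.50, "k": 320.0},
}


def lookup_bond(potential_types, i, j):
    """Return the parameters for the bond between atoms i and j."""
    key = bond_key(potential_types[i], potential_types[j])
    if key not in BOND_PARAMETERS:
        raise KeyError(f"no bond parameters for potential types {key}")
    return BOND_PARAMETERS[key]


# Both bonds join two carbon atoms, yet they receive different parameters.
single = lookup_bond(["c_sp3", "c_sp3"], 0, 1)
double = lookup_bond(["c_sp2", "c_sp2"], 0, 1)
```

Raising an error for a missing entry, rather than silently falling back on a default, mirrors the point made above: an energy calculation cannot run meaningfully until every atom has a potential type with matching parameters.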
Molecular mechanics enables the energy of a structure to be evaluated quickly, and may be applied to structures of a size up to and including large proteins. Energy calculations have a range of applications in molecular modeling. They can be used in conformational analysis to evaluate the relative stability of different conformers (see below), and to predict the equilibrium geometry of a structure. They can also be used to evaluate the energy of two or more interacting molecules, as when docking a substrate into an enzyme active site. (Basics Application 3 illustrates docking in the formic acid dimer using the Docking module.)
As described above, molecular mechanics energy calculations are an integral part of energy minimizations. From the energy functions it is possible to evaluate the forces acting on the atoms. Minimization actually uses information on the atomic forces to adjust atomic coordinates in an iterative manner to bring the structure to a minimum energy conformation. (Basics Applications 1 and 2 illustrate use of the Optimize command in the Builder module.) There are several different minimization algorithms which can be used, depending on the nature of the problem (see Basics Applications 8 and 10).
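The iterative use of forces can be sketched with the simplest minimization algorithm, steepest descent. This is offered as an illustration only and is not claimed to be what the Optimize command does. For brevity the forces are obtained by numerical differentiation of the energy, whereas production codes use analytical derivatives; the step size and tolerance are arbitrary illustrative values, and a step size that is too large for the stiffness of the energy function will make the iteration diverge.

```python
def numerical_forces(energy, coords, step=1e-5):
    """Force on each coordinate = -dE/dx, by central differences.

    coords is a flat list of coordinates; energy is a function of that list.
    """
    forces = []
    for n in range(len(coords)):
        plus = list(coords)
        minus = list(coords)
        plus[n] += step
        minus[n] -= step
        forces.append(-(energy(plus) - energy(minus)) / (2.0 * step))
    return forces


def steepest_descent(energy, coords, step_size=0.001, max_iter=5000, tol=1e-4):
    """Move every coordinate along its force until the forces are small."""
    coords = list(coords)
    for _ in range(max_iter):
        forces = numerical_forces(energy, coords)
        if max(abs(f) for f in forces) < tol:
            break  # converged: no significant force remains
        coords = [x + step_size * f for x, f in zip(coords, forces)]
    return coords


# A one-dimensional diatomic with a harmonic bond (ideal length 1.5),
# started from a stretched geometry.
def diatomic_energy(x):
    return 300.0 * (abs(x[1] - x[0]) - 1.5) ** 2


minimized = steepest_descent(diatomic_energy, [0.0, 2.0])
```

Steepest descent is robust far from a minimum but slow close to one, which is one reason several algorithms are offered for different stages of a refinement.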
A great deal of useful information can be gained from the study of minimum energy structures. However, these structures are static models, whereas in reality, molecules are flexible structures subject to thermal motion. The technique of molecular dynamics can be used to simulate the thermal motion of a structure as a function of time, using the forces acting on the atoms to drive the motion.
Starting with the molecular mechanics energy description of the structure as described above, the forces acting on the atoms can be evaluated. As the masses of the atoms are known, Newton's second law of motion (force = mass * acceleration) may be used to compute the accelerations, and thus the velocities, of the atoms. The accelerations and velocities may then be used to calculate new positions for the atoms over a short time step (around 1 femtosecond, where a femtosecond is equal to 10^-15 seconds), thus moving each atom to a new position in space. This process iterates many thousands of times, generating a series of conformations of the structure known as a trajectory. A simulation is frequently run for many tens of picoseconds (1 picosecond is equal to 10^-12 seconds).
The velocities of the atoms are related directly to the temperature at which the simulation is run. A simulation run at 300K provides information on structural fluctuations that occur around the starting conformation, perhaps to illustrate which parts of a molecule are most flexible, and also can provide information on the pathways of conformational transitions. If the temperature of the simulation is increased, more energy is available to climb and cross energetic barriers. Thus high-temperature (e.g., 1000K) simulations are often used to search conformational space. (Basics Applications 10, 11, 12, and 13 illustrate some applications of molecular dynamics.)
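The time-stepping loop and the link between velocities and temperature can be sketched as follows. The velocity Verlet integrator used here is one common choice and is an assumption of this example, not necessarily the scheme used by any particular program. Units are angstroms, femtoseconds, atomic mass units, and kcal/mol, with the two conversion constants given in the code. The temperature function counts every coordinate as a degree of freedom, which a real program would correct for overall translation and rotation.

```python
ACCEL_CONVERSION = 4.184e-4  # (kcal/mol/angstrom)/amu -> angstrom/fs^2
BOLTZMANN = 0.0019872        # kcal/(mol*K)


def velocity_verlet(positions, velocities, masses, force_fn, dt=1.0, n_steps=1000):
    """Integrate Newton's second law; returns (trajectory, final velocities).

    positions, velocities, masses are flat lists with one entry per coordinate.
    force_fn maps a list of positions to a list of forces in kcal/mol/angstrom.
    """
    x = list(positions)
    v = list(velocities)
    a = [ACCEL_CONVERSION * f / m for f, m in zip(force_fn(x), masses)]
    trajectory = [list(x)]
    for _ in range(n_steps):
        # New positions from the current velocities and accelerations.
        x = [xi + vi * dt + 0.5 * ai * dt * dt for xi, vi, ai in zip(x, v, a)]
        # New accelerations from the forces at the new positions.
        a_new = [ACCEL_CONVERSION * f / m for f, m in zip(force_fn(x), masses)]
        # New velocities from the average of old and new accelerations.
        v = [vi + 0.5 * (ai + an) * dt for vi, ai, an in zip(v, a, a_new)]
        a = a_new
        trajectory.append(list(x))
    return trajectory, v


def temperature(velocities, masses):
    """Instantaneous temperature in kelvin from the kinetic energy."""
    kinetic = sum(0.5 * m * v * v for m, v in zip(masses, velocities)) / ACCEL_CONVERSION
    return 2.0 * kinetic / (len(velocities) * BOLTZMANN)


# One-dimensional diatomic with a harmonic bond, started stretched and at rest.
def diatomic_force(x):
    pull = -2.0 * 300.0 * ((x[1] - x[0]) - 1.5)
    return [-pull, pull]


trajectory, final_v = velocity_verlet([0.0, 1.6], [0.0, 0.0], [12.0, 12.0],
                                      diatomic_force, dt=1.0, n_steps=100)
```

The time step must be small compared with the fastest vibration in the system; here the bond vibrates with a period of roughly 30 fs, so a 1 fs step resolves it comfortably.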
Molecular modeling techniques can become very powerful when combined with experimental information. For instance, NMR experiments may indicate that specific atoms are separated by a certain distance. This information can be added to a simulation in the form of restraints. During a simulation a force can be applied to an atom pair to restrain it to the specified separation. Restraints are important because they provide a certain amount of control over a simulation. (Basics Applications 10 and 11 illustrate the use of restraints in a dynamics simulation.)
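A distance restraint of this kind can be sketched as an extra energy term whose derivative supplies the restraining force. The harmonic form and the force constant used here are assumptions chosen for illustration; programs commonly offer other forms, such as flat-bottomed wells that apply no penalty within an allowed range. The sketch also assumes the two atoms do not coincide.

```python
import math


def restraint_energy_and_forces(pos_a, pos_b, target, k_restraint=10.0):
    """Harmonic restraint pulling the a-b distance toward target.

    Returns (energy, force_on_a, force_on_b) with forces as 3-component lists.
    """
    delta = [b - a for a, b in zip(pos_a, pos_b)]
    r = math.sqrt(sum(d * d for d in delta))
    energy = k_restraint * (r - target) ** 2
    # -dE/dr: negative when the pair is too far apart, so b is pulled toward a.
    magnitude = -2.0 * k_restraint * (r - target)
    unit = [d / r for d in delta]
    force_b = [magnitude * u for u in unit]
    force_a = [-f for f in force_b]  # equal and opposite
    return energy, force_a, force_b


# Two atoms 5 angstroms apart, restrained to an NMR-style target of 4.
energy, force_a, force_b = restraint_energy_and_forces(
    (0.0, 0.0, 0.0), (5.0, 0.0, 0.0), target=4.0)
```

In a simulation these forces are simply added to the forcefield forces on the two atoms at every step, so the restraint biases the motion without fixing the distance rigidly.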
Minimizations and molecular dynamics simulations may be performed in Insight II using the Discover module. (For more information, refer to the Theory and Methodology sections of the Discover manual.)
Using molecular dynamics simulations it is easy to generate a vast amount of data. The problem is in analyzing the data. The simulation is usually analyzed first in a qualitative way by replaying the simulation as a movie, a process known as animation. A simulation can be analyzed quantitatively by defining properties of interest, and then graphing those properties against each other. For example, graphs that illustrate how the geometry or energy of the structure varies during the simulation can be created and compared. The Analysis or, alternatively, the Decipher module provides a number of tools for analyzing a dynamics simulation. (Refer to Basics Applications 9, 10, and 11.)
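Quantitative analysis of this kind amounts to evaluating a property for every frame of the trajectory and then summarizing or plotting the resulting series. The sketch below assumes a hypothetical in-memory trajectory format, a list of frames where each frame is a list of (x, y, z) tuples; it is not the file format of any particular program.

```python
import math
from statistics import mean, pstdev


def distance_series(trajectory, i, j):
    """Distance between atoms i and j in every frame."""
    return [math.dist(frame[i], frame[j]) for frame in trajectory]


def summarize(values):
    """Basic statistics describing how a property fluctuates."""
    return {
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
    }


# Three frames of a two-atom system.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
stats = summarize(distance_series(frames, 0, 1))
```

The same series can be paired with another per-frame property, such as the energy, to graph one against the other.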
The electronic structure of a molecule is of prime importance in determining its properties. When a drug molecule and receptor interact, each "sees" the other as a blob of electron density which is held together by the positive charges of the atomic nuclei. Although molecular mechanics calculations are extremely useful, they consider essentially only the position of the nuclei and therefore cannot fully represent chemical reality. Molecular mechanics provides no information on electronic structure, and furthermore cannot be used when the molecule is not in its ground state, or when covalent bonds are being broken or formed. There are a number of properties that can be derived only using quantum mechanics calculations.
Starting with a specified nuclear geometry, quantum mechanics calculations solve the Schrödinger equation for this arrangement of electrons and nuclei. This yields both the energy of the molecule, and the associated wave function from which electronic properties, such as electron density, can be calculated.
The energy of a structure calculated quantum mechanically can be used in conformational searches, in the same way that the molecular mechanics energy is used. Quantum mechanics calculations can also be used for energy minimization. However, quantum mechanics calculations typically consume far more computer resources than molecular mechanics calculations, and are also generally limited to small molecules, whereas molecular mechanics can be applied to structures up to the size of large proteins. Molecular mechanics and quantum mechanics should thus be viewed as complementary techniques. For instance, conformational energy calculations for a peptide are best carried out using molecular mechanics. However, molecular mechanics is generally ineffective for handling conjugated systems, while quantum mechanics, in calculating electronic structure, takes account of conjugation automatically and is therefore recommended for optimizing the structure of a small molecule containing conjugated systems.
The wave function can be used to calculate a range of chemical properties, which can be used in structure activity studies. These include electrostatic potential and electron density, dipole moment, and the energies and positions of frontier orbitals. As with the analysis of a molecular dynamics calculation, molecular graphics is essential for visualizing these properties. Quantum mechanics calculations are also used frequently to derive atom-centered partial charges (although note that the term charge itself does not have a strict quantum mechanical definition). Charges have a wide range of applications in modeling, and are used in the calculation of electrostatic energies in molecular mechanics calculations and in computing electrostatic potentials.
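Once atom-centered partial charges are available, an approximate electrostatic potential can be computed at any point in space by summing the Coulomb contribution of each charge. The sketch below shows this point-charge approximation only; it does not compute the potential from the wave function itself, and the charges in the example are invented for illustration.

```python
import math

# Approximate factor giving the potential in kcal/(mol*e) for charges in
# electron units and distances in angstroms.
COULOMB_CONSTANT = 332.06


def electrostatic_potential(point, coords, charges):
    """Point-charge electrostatic potential at one location.

    point must not coincide with an atom position.
    """
    return sum(COULOMB_CONSTANT * q / math.dist(point, xyz)
               for xyz, q in zip(coords, charges))


# A simple dipole: equal and opposite charges one angstrom either side of
# the origin, probed on the axis beyond the positive end.
dipole_coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dipole_charges = [-0.4, 0.4]
potential = electrostatic_potential((3.0, 0.0, 0.0), dipole_coords, dipole_charges)
```

Evaluating this function over a grid of points surrounding the molecule gives the data that the graphics system displays as three-dimensional contours.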
The most widely used quantum mechanics packages are the public domain programs AMPAC and MOPAC, to which an interface is provided in the Ampac/Mopac module. These programs utilize the molecular orbital formalism in solving the Schrödinger equation, and are therefore known as molecular orbital programs. (The application of molecular orbital calculations to computing partial charges and then electrostatic potentials, and to calculating frontier orbitals, is illustrated in Basics Application 5.)
Once a modeling experiment is complete it is essential to be able to convey the results. Hardcopy plots of molecular structures with suitable annotations convey a large amount of information, and the ability to incorporate structural displays into reports produced using word processing packages is extremely useful. (Basics Application 7 illustrates annotation and the production of hardcopy.)
Last updated December 17, 1998 at 04:25PM PST.
Copyright © 1998, Molecular Simulations Inc. All rights