Insight II

Molecular modeling is a general term which covers a wide range of molecular graphics and computational chemistry techniques used to build, display, manipulate, simulate, and analyze molecular structures, and to calculate properties of these structures.

Molecular modeling is used in a number of different research areas, and therefore the term does not have a rigid definition. To a chemical physicist, molecular modeling might imply performing a high quality quantum mechanical calculation using a supercomputer on a structure with 4 or 5 atoms; to an organic chemist, molecular modeling might mean displaying and modifying a candidate drug molecule on a desktop computer. The criterion for a successful modeling experiment should not be how accurately the calculations are performed, but whether they are useful in rationalizing the behavior of the molecule, or in enhancing the creativity of the chemist in the design of novel compounds.

This section gives a brief introduction to techniques that can be used to model molecules of interest to the bio-pharmaceutical industry. The references included in Appendix A contain more in-depth information. Chapter 3, *Basic Applications*, lists a series of tutorials designed to help you get started with molecular modeling in Insight II.

Molecular Graphics

The visualization of molecular properties is an extremely important aspect of molecular modeling. The properties might be calculated using a computational chemistry program and visualized as three-dimensional contours, along with the associated structures. While manipulation of structures is usually interactive, the calculation of properties may require significant computer time. Calculations are usually run in the background rather than interactively, leaving the modeling system free for interactive work. The graphics part of the modeling system also provides the interface to the computational chemistry tools, allowing calculations to be defined and run, and then analyzed when complete. In Insight II the **Viewer** module provides the tools for displaying and manipulating structures and representations of their properties.

Building and Modifying Structures

Small molecule structures can be built in three dimensions by joining together basic building blocks from a fragment library, and then modifying atom and bond types (this functionality is provided by the **Builder** module; see Basics Applications 1 and 2). Macromolecule structures such as proteins and nucleic acids, consisting of large numbers of specific units, can be built by specifying the sequence of the units and the conformation in which the units should be joined. When necessary, individual units (amino acids or nucleotides) may be modified or replaced in the same manner that small molecule structures are modified (see the Biopolymer module documentation).

An alternative to building in three dimensions (3D) is to sketch a structure freehand in two dimensions, and then convert the sketch into three dimensions. This approach has the advantages of speed and of the ease with which complex structures can be built; only element types, connectivity, and (where necessary) stereochemistry need to be defined. This capability is provided by the **MolBuilder** toolbox of the **Builder** module, which uses distance geometry techniques to convert a two-dimensional sketch to 3D. From the connectivity of the sketch, distance geometry derives a series of distance bounds for all atom pairs, and a 3D structure is generated in which the interatomic distances satisfy these bounds. The strength of the 3D conversion algorithm is that it is virtually insensitive to the quality of the original 2D sketch. (See Basics Applications 1 and 2 for examples of sketching.)
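The first stage of distance geometry, deriving distance bounds from connectivity, can be sketched as follows. This is only an illustration of the general idea, not the MolBuilder algorithm: the function name, the single `bond_length` used for every bond, and the `min_contact` lower bound are all assumptions made for the example. Upper bounds for non-bonded pairs are tightened with the triangle inequality.

```python
import math


def distance_bounds(n_atoms, bonds, bond_length=1.5, min_contact=2.0):
    """Return (lower, upper) distance-bound matrices built from connectivity.

    bond_length and min_contact are placeholder values (in Angstroms) chosen
    for illustration; a real implementation would use per-type values.
    """
    upper = [[0.0 if i == j else math.inf for j in range(n_atoms)]
             for i in range(n_atoms)]
    lower = [[0.0 if i == j else min_contact for j in range(n_atoms)]
             for i in range(n_atoms)]

    # Bonded pairs are pinned to the assumed ideal bond length.
    for i, j in bonds:
        upper[i][j] = upper[j][i] = bond_length
        lower[i][j] = lower[j][i] = bond_length

    # Triangle-inequality smoothing: no pair can be further apart than
    # the shortest chain of upper bounds connecting it.
    for k in range(n_atoms):
        for i in range(n_atoms):
            for j in range(n_atoms):
                via_k = upper[i][k] + upper[k][j]
                if via_k < upper[i][j]:
                    upper[i][j] = via_k
    return lower, upper
```

A 3D structure would then be generated by choosing coordinates whose interatomic distances fall between `lower` and `upper`; that embedding step is not shown here.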

Molecular Mechanics

Molecular mechanics techniques take a classical approach to calculating the energy of a structure. The molecule is treated essentially as a set of charged point masses (the atoms) which are coupled together with springs. The total energy of a structure is calculated using an analytical function which sums a number of individual energy terms. At its simplest level the function includes bond stretching, valence angle bending, torsion, and nonbond interaction terms, which associate an energetic penalty to the structure based upon deviations from an idealized equilibrium geometry. For instance, the bond term is a summation over all bonds in the structure, in which the energy of each bond is evaluated based on how far it is deformed from its equilibrium value. The amount by which the energy increases for a given deviation from the ideal value is determined by a parameter called a force constant.
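As an illustration of the bond term described above, the following sketch sums a penalty over all bonds. It assumes a simple harmonic form, E = 0.5 * k * (r - r0)^2, which is only one of several functional forms a forcefield might use; the function name and the data layout are hypothetical.

```python
import math


def bond_energy(coords, bonds):
    """Sum harmonic bond-stretch penalties over all bonds.

    coords: list of (x, y, z) tuples.
    bonds:  list of (i, j, force_constant, equilibrium_length) tuples.
    The harmonic form is an assumption made for this example.
    """
    total = 0.0
    for i, j, force_constant, r0 in bonds:
        r = math.dist(coords[i], coords[j])
        # The energy rises with the square of the deviation from r0;
        # the force constant sets how steeply it rises.
        total += 0.5 * force_constant * (r - r0) ** 2
    return total
```

A bond at exactly its equilibrium length contributes nothing, and doubling the force constant doubles the penalty for the same deformation.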

The nonbond interaction energy actually includes three terms:

1. van der Waals attraction

2. van der Waals repulsion, and

3. electrostatic

These three terms are summed for all atom pairs in the structure that are separated by three or more bonds (1-4 interactions and above). Any atom pair in which one atom is bonded to, or involved in an angle interaction with, the other member of the pair is excluded from the nonbond energy calculation. The electrostatic term requires that each atom be assigned a partial charge; the interaction for any atom pair can be either attractive or repulsive, depending on the signs of the charges involved. The number of nonbonded atom pairs rises with the square of the number of atoms, and therefore the nonbond interaction term dominates the computer time required for energy evaluation in calculations on larger structures. For this reason, it is normal to limit the nonbond energy evaluation to atom pairs within a certain cutoff distance, based upon the assumption that atom pairs separated by a larger distance make a negligible contribution to the total energy.
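The nonbond sum, its exclusions, and the cutoff can be sketched as below. The example assumes a 12-6 Lennard-Jones form for the van der Waals terms, a single `epsilon` and `sigma` shared by every pair, and a Coulomb conversion factor of about 332.06 for energies in kcal/mol with distances in Angstroms and charges in electron units. These are simplifications for illustration; the function names are hypothetical.

```python
import math
from itertools import combinations

COULOMB_CONSTANT = 332.06  # approximate, kcal*Angstrom/(mol*e^2)


def excluded_pairs(n_atoms, bonds):
    """Pairs that are directly bonded (1-2) or share a neighbor (1-3)."""
    neighbors = {i: set() for i in range(n_atoms)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    excluded = set()
    for i in range(n_atoms):
        for j in neighbors[i]:
            excluded.add(frozenset((i, j)))          # 1-2 pair
            for k in neighbors[j]:
                if k != i:
                    excluded.add(frozenset((i, k)))  # 1-3 pair
    return excluded


def nonbond_energy(coords, charges, bonds, epsilon, sigma, cutoff):
    """Sum repulsion, attraction, and electrostatics over allowed pairs."""
    excluded = excluded_pairs(len(coords), bonds)
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        if frozenset((i, j)) in excluded:
            continue
        r = math.dist(coords[i], coords[j])
        if r > cutoff:
            continue  # assumed negligible beyond the cutoff
        sr6 = (sigma / r) ** 6
        repulsion = 4.0 * epsilon * sr6 * sr6
        attraction = -4.0 * epsilon * sr6
        electrostatic = COULOMB_CONSTANT * charges[i] * charges[j] / r
        total += repulsion + attraction + electrostatic
    return total
```

The double loop over pairs makes the quadratic cost visible: without the cutoff test, every allowed pair is evaluated regardless of distance.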

The idealized equilibrium values of bond lengths and angles, plus the force constants, and the van der Waals radii and associated constants required to calculate the nonbond interactions, are stored in a file which is referred to when the energy calculation is run. The combination of these parameters with the functional forms of the individual energy terms is known as a *forcefield*. It is important to appreciate that the ideal values of bonds and angles cannot be based upon element type alone. For example, a carbon-carbon single bond is longer than a carbon-carbon double bond. For this reason, each atom in a structure must have a *potential type* assigned before the energy calculation is run. The potential type assigned depends on the hybridization state and environment of each atom. The forcefield parameters themselves are stored in the parameter file according to potential type.
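The role of potential types can be illustrated with a small lookup table keyed by pairs of types rather than by element. The type names (`c_sp3`, `c_sp2`) and the force constants below are placeholders invented for this example and do not come from any Insight II parameter file; only the ordering of the bond lengths (single longer than double) reflects the point made in the text.

```python
# Hypothetical parameter table: (type_a, type_b) -> (length in Angstroms,
# force constant). The force constants are illustrative placeholders.
BOND_PARAMETERS = {
    ("c_sp3", "c_sp3"): (1.53, 300.0),  # carbon-carbon single bond
    ("c_sp2", "c_sp3"): (1.50, 320.0),
    ("c_sp2", "c_sp2"): (1.34, 600.0),  # carbon-carbon double bond
}


def lookup_bond(type_a, type_b, table=BOND_PARAMETERS):
    """Return (equilibrium_length, force_constant) for a pair of types.

    Keys are stored in sorted order so the lookup does not depend on
    which atom of the bond is listed first.
    """
    key = tuple(sorted((type_a, type_b)))
    return table[key]
```

Both atoms here are carbon, yet the parameters differ, which is why each atom needs a potential type assigned before the energy can be evaluated.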

Molecular mechanics enables the energy of a structure to be evaluated quickly, and may be applied to structures of a size up to and including large proteins. Energy calculations have a range of applications in molecular modeling. They can be used in conformational analysis to evaluate the relative stability of different conformers (see below), and to predict the equilibrium geometry of a structure. They can also be used to evaluate the energy of two or more interacting molecules, as when docking a substrate into an enzyme active site. (Basics Application 3 illustrates docking in the formic acid dimer using the **Docking** module.)

Molecular mechanics energy calculations are also an integral part of energy minimization. From the energy function it is possible to evaluate the forces acting on the atoms. Minimization uses these atomic forces to adjust the atomic coordinates iteratively, bringing the structure to a minimum energy conformation. (Basics Applications 1 and 2 illustrate use of the **Optimize** command in the **Builder** module.) Several different minimization algorithms are available, and the appropriate choice depends on the nature of the problem (see Basics Applications 8 and 10).
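One of the simplest minimization algorithms, steepest descent, can be sketched as follows. The example is a generic illustration rather than the method used by any particular Insight II module: forces are obtained here by numerical differentiation of an arbitrary energy function, and the function names, step size, and convergence threshold are assumptions chosen for the example.

```python
def numerical_forces(energy_fn, x, h=1e-6):
    """Force on each coordinate: minus the central-difference gradient."""
    forces = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += h
        minus[i] -= h
        forces.append(-(energy_fn(plus) - energy_fn(minus)) / (2.0 * h))
    return forces


def steepest_descent(energy_fn, x, step=0.01, force_tol=1e-4, max_iter=10000):
    """Move coordinates along the forces until the largest force is small."""
    x = list(x)
    for _ in range(max_iter):
        forces = numerical_forces(energy_fn, x)
        if max(abs(f) for f in forces) < force_tol:
            break  # converged: forces are close to zero at a minimum
        x = [xi + step * fi for xi, fi in zip(x, forces)]
    return x


def stretched_bond(x, k=10.0, r0=1.5):
    """Toy energy: two atoms on a line joined by a harmonic spring."""
    return 0.5 * k * (abs(x[1] - x[0]) - r0) ** 2


minimized = steepest_descent(stretched_bond, [0.0, 1.0])
```

Starting from a separation of 1.0, the two atoms are pushed apart until their separation approaches the equilibrium value of 1.5. A fixed step size is used for clarity; practical minimizers usually adjust the step as they go.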

Molecular Dynamics

Starting with the molecular mechanics energy description of the structure as described above, the forces acting on the atoms can be evaluated. As the masses of the atoms are known, Newton's second law of motion (force = mass × acceleration) may be used to compute the accelerations, and thus the velocities, of the atoms. The accelerations and velocities may then be used to calculate new positions for the atoms over a short time step (around 1 femtosecond, where a femtosecond is equal to 10^{-15} seconds), thus moving each atom to a new position in space. This process iterates many thousands of times, generating a series of conformations of the structure known as a trajectory. A simulation is frequently run for many tens of picoseconds (1 picosecond is equal to 10^{-12} seconds).
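The iteration described above can be sketched with the velocity Verlet integration scheme. The choice of velocity Verlet is an assumption made for this example (the text does not say which integrator is used), and the function name and flat coordinate layout are hypothetical.

```python
def velocity_verlet(x, v, masses, force_fn, dt, n_steps):
    """Integrate Newton's second law and return (trajectory, velocities).

    x, v, masses: flat lists with one entry per coordinate.
    force_fn:     function mapping coordinates to a list of forces.
    """
    x = list(x)
    v = list(v)
    trajectory = [list(x)]
    forces = force_fn(x)
    for _ in range(n_steps):
        # acceleration = force / mass; advance velocities by half a step
        v = [vi + 0.5 * dt * fi / m for vi, fi, m in zip(v, forces, masses)]
        # move every coordinate to its new position
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # re-evaluate forces at the new positions, finish the velocity step
        forces = force_fn(x)
        v = [vi + 0.5 * dt * fi / m for vi, fi, m in zip(v, forces, masses)]
        trajectory.append(list(x))
    return trajectory, v


def spring_force(x, k=1.0):
    """Toy force field: a single particle on a harmonic spring."""
    return [-k * xi for xi in x]


trajectory, final_v = velocity_verlet([1.0], [0.0], [1.0], spring_force,
                                      dt=0.01, n_steps=628)
```

In this toy system the particle oscillates about the origin, and the total energy stays close to its starting value as long as the time step is small compared with the period of the fastest motion. The same consideration is the reason molecular simulations use steps of about a femtosecond.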

The velocities of the atoms are related directly to the temperature at which the simulation is run. A simulation run at 300 K provides information on structural fluctuations that occur around the starting conformation, perhaps to illustrate which parts of a molecule are most flexible, and can also provide information on the pathways of conformational transitions. If the temperature of the simulation is increased, more energy is available to climb and cross energetic barriers. Thus high-temperature (e.g., 1000 K) simulations are often used to search conformational space. (Basics Applications 10, 11, 12, and 13 illustrate some applications of molecular dynamics.)
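The link between velocities and temperature can be made concrete through the kinetic energy. The sketch below uses the equipartition relation T = 2 * KE / (N_dof * k_B) in SI units and assumes three degrees of freedom per atom, ignoring any constraints or removal of overall translation; the function name is hypothetical.

```python
BOLTZMANN = 1.380649e-23  # J/K


def instantaneous_temperature(masses, velocities):
    """Temperature in kelvin from masses (kg) and velocities (m/s).

    velocities is a list of (vx, vy, vz) tuples, one per atom.
    Assumes 3 degrees of freedom per atom (no constraints).
    """
    kinetic = sum(0.5 * m * (vx * vx + vy * vy + vz * vz)
                  for m, (vx, vy, vz) in zip(masses, velocities))
    degrees_of_freedom = 3 * len(masses)
    return 2.0 * kinetic / (degrees_of_freedom * BOLTZMANN)
```

Because kinetic energy depends on the square of the velocity, doubling every velocity quadruples the temperature.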

Molecular modeling techniques can become very powerful when combined with experimental information. For instance, NMR experiments may indicate that specific atoms are separated by a certain distance. This information can be added to a simulation in the form of *restraints*. During a simulation a force can be applied to an atom pair to restrain it to the specified separation. Restraints are important because they provide a certain amount of control over a simulation. (Basics Applications 10 and 11 illustrate the use of restraints in a dynamics simulation.)
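A distance restraint of this kind can be sketched as an extra energy term with its associated forces. The example assumes a harmonic penalty about the target separation; other forms (for example, one that only penalizes distances outside an allowed range) are also possible. The function name and signature are hypothetical, and the two atoms are assumed not to coincide.

```python
import math


def distance_restraint(pos_a, pos_b, target, force_constant):
    """Return (energy, force_on_a, force_on_b) for a harmonic restraint."""
    delta = [b - a for a, b in zip(pos_a, pos_b)]   # vector from a to b
    r = math.sqrt(sum(c * c for c in delta))
    energy = 0.5 * force_constant * (r - target) ** 2
    # When the pair is too far apart, atom a is pulled toward atom b and
    # vice versa; when too close, the signs flip and the atoms are pushed
    # apart.
    scale = force_constant * (r - target) / r
    force_on_a = [scale * c for c in delta]
    force_on_b = [-f for f in force_on_a]
    return energy, force_on_a, force_on_b
```

These forces would simply be added to the forcefield forces at each step of the simulation, so the restrained pair is steered toward the experimental distance without being rigidly fixed.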

Minimizations and molecular dynamics simulations may be performed in Insight II using the **Discover** module. (For more information, refer to the Theory and Methodology sections of the **Discover** manual.)

Using molecular dynamics simulations it is easy to generate a vast amount of data; the difficulty lies in analyzing it. The simulation is usually analyzed first in a *qualitative* way by replaying the simulation as a movie, a process known as animation. A simulation can be analyzed *quantitatively* by defining properties of interest, and then graphing those properties against each other. For example, graphs that illustrate how the geometry or energy of the structure varies during the simulation can be created and compared. The **Analysis** or, alternatively, the **Decipher** module provides a number of tools for analyzing a dynamics simulation. (Refer to Basic Applications 9, 10, and 11.)
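As a small example of quantitative analysis, the sketch below extracts one geometric property, the distance between two chosen atoms, from every frame of a trajectory and summarizes it. The trajectory layout (a list of frames, each a list of coordinate tuples) and the function names are assumptions made for this example and do not reflect the Analysis or Decipher file formats.

```python
import math
from statistics import mean, pstdev


def distance_series(trajectory, i, j):
    """Distance between atoms i and j in every frame."""
    return [math.dist(frame[i], frame[j]) for frame in trajectory]


def summarize(series):
    """Average value and spread of a property over the simulation."""
    return {
        "mean": mean(series),
        "fluctuation": pstdev(series),  # population standard deviation
        "min": min(series),
        "max": max(series),
    }
```

The resulting series can be plotted against frame number, or against another property such as the energy, to produce the kind of graph described above.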

Quantum Mechanics

Starting with a specified nuclear geometry, quantum mechanics calculations solve the Schrödinger equation for this arrangement of electrons and nuclei. This yields both the energy of the molecule and the associated wave function, from which electronic properties such as electron density can be calculated.

The energy of a structure calculated quantum mechanically can be used in conformational searches, in the same way that the molecular mechanics energy is used. Quantum mechanics calculations can also be used for energy minimization. However, quantum mechanics calculations typically consume far more computer resources than molecular mechanics calculations, and are also generally limited to small molecules, whereas molecular mechanics can be applied to structures up to the size of large proteins. Molecular mechanics and quantum mechanics should thus be viewed as complementary techniques. For instance, conformational energy calculations for a peptide are best carried out using molecular mechanics. However, molecular mechanics is generally ineffective for handling conjugated systems, while quantum mechanics, in calculating electronic structure, takes account of conjugation automatically and is therefore recommended for optimizing the structure of a small molecule containing conjugated systems.

The wave function can be used to calculate a range of chemical properties, which can be used in structure activity studies. These include electrostatic potential and electron density, dipole moment, and the energies and positions of frontier orbitals. As with the analysis of a molecular dynamics calculation, molecular graphics is essential for visualizing these properties. Quantum mechanics calculations are also used frequently to derive atom-centered partial charges (although note that the term *charge* itself does not have a strict quantum mechanical definition). Charges have a wide range of applications in modeling, and are used in the calculation of electrostatic energies in molecular mechanics calculations and in computing electrostatic potentials.
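Once atom-centered partial charges are available, the electrostatic potential at any point in space can be approximated by a Coulomb sum over the atoms. The sketch below shows this point-charge approximation only; it is not the potential computed directly from the wave function. It assumes distances in Angstroms, charges in electron units, and a conversion factor of about 332.06 to give kcal/mol per unit charge. The function name is hypothetical.

```python
import math

COULOMB_CONSTANT = 332.06  # approximate, kcal*Angstrom/(mol*e^2)


def electrostatic_potential(point, coords, charges):
    """Point-charge potential at `point`, in kcal/mol per unit charge.

    The point is assumed not to coincide with any atom position.
    """
    return sum(COULOMB_CONSTANT * q / math.dist(point, xyz)
               for q, xyz in zip(charges, coords))
```

Evaluating this function on a grid of points around the molecule gives the kind of data that can be displayed as three-dimensional contours.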

The most widely used quantum mechanics packages are the public domain programs AMPAC and MOPAC, to which an interface is provided in the **Ampac/Mopac** module. These programs utilize the molecular orbital formalism in solving the Schrödinger equation, and are therefore known as molecular orbital programs. (The application of molecular orbital calculations to computing partial charges and then electrostatic potentials, and to calculating frontier orbitals, is illustrated in Basics Application 5.)

Finally. . .