Insight II



G       Potential Template Rules

The potential type template file is used for the assignment of potential atom types to the atoms of a molecule. It consists of one or more potential type templates, followed by a precedence tree. The potential type templates describe the connectivity and other properties that an atom must have in order to be assigned a particular potential type. The precedence tree describes the relative priorities of the potential types, and is used for deciding which type to assign in the case of multiple template matches. Any line whose first character is an exclamation point (!) is a comment. Comments may appear anywhere in the file. The environment variable INSIGHT_POTENTIAL_TEMPLATES indicates which potential file in the BIOSYM_LIBRARY directory is to be used. If INSIGHT_POTENTIAL_TEMPLATES is undefined, the old hard-coded assignment rules are used.
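The file lookup and comment rule described above can be sketched in a few lines of Python. This is only an illustration, not Insight II code: the function names are hypothetical, and it assumes that BIOSYM_LIBRARY is an environment variable holding the library directory.

```python
import os
from typing import List, Mapping, Optional


def template_file_path(env: Mapping[str, str]) -> Optional[str]:
    """Return the template file path, or None when the variable is undefined."""
    name = env.get("INSIGHT_POTENTIAL_TEMPLATES")
    if not name:
        return None  # the caller would fall back to the hard-coded rules
    # Assumption: BIOSYM_LIBRARY is an environment variable naming the directory.
    return os.path.join(env.get("BIOSYM_LIBRARY", ""), name)


def strip_comments(text: str) -> List[str]:
    """Drop comment lines (first character is '!') and blank lines."""
    kept = []
    for line in text.splitlines():
        if line.startswith("!"):
            continue
        if line.strip():
            kept.append(line)
    return kept
```

Calling template_file_path(os.environ) would use the real environment; passing a plain dictionary makes the behavior easy to check in isolation.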

Note that in the definitions throughout this chapter, parentheses are used to distinguish separate levels of precedence. Each left parenthesis indicates the next lower level of precedence. Each right parenthesis ends the level begun by the preceding left parenthesis.


Potential Type Template

Each potential type template is of the following form:

type:ccc
template:(>....)
[optional atom tests]
end_type

where ccc is the 1-to-3 letter potential type name, the template is the description of the surrounding chemical environment of atoms receiving this potential type, and the tests are a way to refine the description of this environment.
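As a rough sketch of how the type/end_type layout could be read, the Python function below splits file text into one record per potential type template. The function name and the record layout are hypothetical; lines outside a type block (such as the precedence section) are simply skipped here.

```python
from typing import Any, Dict, List, Optional


def read_type_blocks(text: str) -> List[Dict[str, Any]]:
    """Collect one record per type ... end_type block, in file order."""
    blocks: List[Dict[str, Any]] = []
    current: Optional[Dict[str, Any]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or raw.startswith("!"):
            continue  # blank line or comment
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "type":
            current = {"type": value.strip(), "template": None, "body": []}
        elif key == "end_type":
            if current is None:
                raise ValueError("end_type without a matching type")
            blocks.append(current)
            current = None
        elif current is not None:
            if key == "template":
                # Spaces in a template are only there for readability.
                current["template"] = value.replace(" ", "")
            else:
                current["body"].append(line)  # atom test lines, kept verbatim
    return blocks
```

A list is used rather than a dictionary keyed by type name because the same potential type may be given more than one template, as the sample file later in this chapter does for several types.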

The template is a SMILES-like string describing the bonded atoms and bond orders that must be encountered in the search outward from the atom in question. The first atom is always the one being assigned a potential, and is preceded by the special character >. All bonded substructures are enclosed in parentheses or brackets, and begin with a bond order character from the following set:

- = single bond

: = partial double bond

= = double bond

# = triple bond

~ = wildcard bond (matches any bond order)

Atoms are indicated by their chemical symbol, and the special symbol * is used to indicate a match with any element. Any substructure can in turn have nested substructures. This can continue as deep as necessary to uniquely identify the atom's environment. Spaces can be used in the template to increase readability.

Example:

The template (>C (=O) (-O(-H)) (-*)) describes the carbon in the following group:

anything
/
O=C
\
O-H

The closing parenthesis for an atom does not indicate that there may not be any more bonds out of the atom, only that no further specific bonds are required for a match. Thus the template (>N) matches any nitrogen, no matter what its bonded neighbors are.

Brackets around an atom are used when you wish to indicate that the connectivity must match the template exactly. Thus the template [>Ca] can be used for ionic calcium, and will not match a calcium with atoms bonded to it. When an atom with substructures is bracketed, it is only that atom, not all the substructures, which is limited in its connectivity.
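The template notation (parentheses, brackets, bond characters, and nesting) can be made concrete with a small recursive parser. The Python below is a minimal sketch under the grammar described above; the Node class and function names are hypothetical, and truncated input may raise IndexError rather than a tidy error.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BONDS = "-:=#~"                 # bond order characters listed above
CLOSERS = {"(": ")", "[": "]"}


@dataclass
class Node:
    symbol: str                 # chemical symbol, or "*" for any element
    bond: Optional[str]         # bond to the parent; None for the first atom
    exact: bool                 # True when the atom was written in brackets
    children: List["Node"] = field(default_factory=list)


def parse_template(text: str) -> Node:
    """Parse a whole template string such as (>C[:N(-*)])."""
    s = text.replace(" ", "")   # spaces are only there for readability
    node, pos = _parse_node(s, 0)
    if pos != len(s):
        raise ValueError("unexpected text after the template")
    return node


def _parse_node(s: str, pos: int) -> Tuple[Node, int]:
    opener = s[pos]
    if opener not in CLOSERS:
        raise ValueError(f"expected ( or [ at position {pos}")
    marker = s[pos + 1]
    if marker != ">" and marker not in BONDS:
        raise ValueError(f"expected > or a bond character at position {pos + 1}")
    start = pos + 2
    end = start + 1
    if s[start] != "*":
        if not s[start].isupper():
            raise ValueError(f"expected an element symbol at position {start}")
        while end < len(s) and s[end].islower():
            end += 1            # second letter of symbols such as Ca or Br
    node = Node(s[start:end], None if marker == ">" else marker, opener == "[")
    pos = end
    while pos < len(s) and s[pos] in CLOSERS:
        child, pos = _parse_node(s, pos)
        node.children.append(child)
    if pos >= len(s) or s[pos] != CLOSERS[opener]:
        raise ValueError(f"missing {CLOSERS[opener]} for atom {node.symbol}")
    return node, pos + 1
```

For (>C[:N(-*)]) this yields a carbon whose only required neighbor is a nitrogen with exact set to True, reflecting that the bracket restricts the nitrogen's connectivity and not that of its substructures.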

Example:

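The template (>C[:N(-*)]) matches a carbon joined by a partial double bond to a nitrogen that has exactly one other atom singly bonded to it, but not a carbon whose nitrogen carries any additional bonded atoms.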

Note that for the matching to work correctly the subtrees at any level must be ordered from most specific to most general. This prevents an atom that is necessary for a specific subtree match from being incorrectly paired with an earlier wildcard.

Example:

(-N(-H)(-C)) before (-N(-H)(-*)) before (-N) before (-*)
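The ordering rule is easiest to see with a matcher that pairs substructures with neighbors in template order and never backtracks. The source does not spell out the matching algorithm, so the Python below is a simplified assumption that only compares element symbols; the function name is hypothetical.

```python
from typing import List


def greedy_match(required: List[str], neighbors: List[str]) -> bool:
    """Pair each required element, in template order, with the first
    unused neighbor that fits. There is no backtracking."""
    unused = list(neighbors)
    for want in required:
        for i, have in enumerate(unused):
            if want == "*" or want == have:
                del unused[i]   # this neighbor is now taken
                break
        else:
            return False        # nothing left that satisfies this substructure
    return True


# Neighbors of a CH2 carbon bonded to two other carbons.
methylene = ["H", "C", "H", "C"]

# Specific substructures first, as in (>C(-H)(-H)(-*)(-*)).
print(greedy_match(["H", "H", "*", "*"], methylene))   # True

# Wildcards first: a wildcard consumes a hydrogen that (-H) needs later.
print(greedy_match(["*", "*", "H", "H"], methylene))   # False
```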

Atom Tests

Atom tests are used to refine the matching criteria by placing further restrictions on the atoms being matched. Each test refers to an atom in the template, using the order in which it appears in the template. The atom tests are of the following form:

atom_test:x
.
.
tests
.
.
end_test

where x is the atom number in the template, for example 3 for the second oxygen in the template (>C(=O)(-O(-H))). The atom_test section for a particular atom can contain one or more tests from the set described below. The only test type that is allowed to occur multiple times is the ring test.
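Atom numbers follow the order in which atoms appear in the template string, so they can be listed with a single regular expression. The short Python sketch below uses a hypothetical helper name and assumes that every atom is either a chemical symbol (one capital letter with an optional lowercase letter) or the wildcard *.

```python
import re
from typing import List

# A chemical symbol such as C, Si, or Br, or the wildcard *.
ATOM = re.compile(r"[A-Z][a-z]?|\*")


def number_atoms(template: str) -> List[str]:
    """Return the template atoms in order; atom number = list index + 1."""
    return ATOM.findall(template)


for number, symbol in enumerate(number_atoms("(>C(=O)(-O(-H)))"), start=1):
    print(number, symbol)
# 1 C
# 2 O
# 3 O
# 4 H
```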

Element Test

The element test allows you to specify which elements are permitted or not permitted at this location in the template. Using element tests only makes sense for template atoms of the wildcard (*) type. It has two forms:

allowed_atoms: ..list..

and

disallowed_atoms: ..list..

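In both cases, the list consists of atomic symbols (e.g., C, Si, Br, H) separated by commas, indicating which elements are allowed or disallowed. Note that the sample template file later in this chapter spells these keywords allowed_elements and disallowed_elements.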

Hybridization Test

The hybridization test allows you to specify what hybridization state(s) an atom may have and still match the atom in the template. The form is:

hybridization: ..list..

where the list is a comma-separated list of hybridization types from among sp, sp2, sp3.

Aromaticity Test

This test allows specification of whether the atom is to be aromatic or not. Its form is:

aromaticity: aromatic

or:

aromaticity: non_aromatic

Ring Test

The ring test allows you to specify ring size and ring geometry requirements for the atom. Either the size or the geometry requirement can be wildcarded to indicate that that property of the ring is not constrained. The form of the ring test is:

ring: geometry(size)

where geometry is planar, non_planar, or *. The size is an integer or the wildcard *.
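To tie the four test types together, the following Python sketch parses the lines found between atom_test:x and end_test into a dictionary. It is an illustration only: the function name and result layout are hypothetical, and treating keywords and values as case-insensitive is an assumption based on the mixed spellings (sp2, SP2, NON_AROMATIC) used in this chapter.

```python
import re
from typing import Any, Dict, List

# geometry(size), where either part may be the wildcard *.
RING = re.compile(r"(planar|non_planar|\*)\s*\(\s*(\d+|\*)\s*\)", re.IGNORECASE)


def parse_atom_test(lines: List[str]) -> Dict[str, Any]:
    """Parse the test lines of one atom_test section."""
    tests: Dict[str, Any] = {"ring": []}
    for line in lines:
        key, _, value = line.partition(":")
        key = key.strip().lower()
        value = value.strip()
        if key == "ring":
            # The ring test is the only one allowed to occur several times.
            m = RING.fullmatch(value)
            if m is None:
                raise ValueError(f"bad ring test: {line!r}")
            size = None if m.group(2) == "*" else int(m.group(2))
            tests["ring"].append((m.group(1).lower(), size))
        elif key in tests:
            raise ValueError(f"test {key!r} may appear only once")
        elif key == "aromaticity":
            tests[key] = value.lower()
        elif key == "hybridization":
            tests[key] = [item.strip().lower() for item in value.split(",")]
        else:
            # Element tests: keep the case of chemical symbols such as Si.
            tests[key] = [item.strip() for item in value.split(",")]
    return tests
```

A size of None stands for the * wildcard, so ("planar", None) would mean a planar ring of any size.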


Precedence Tree

The precedence tree is used to indicate which potential types have priority over others in the case of multiple template matches for a given atom. It is specified using the same nested parenthetical style as the templates. To properly resolve the precedence between a number of matches, there must be a path down from the root of the tree (? in the example below) that contains all the matches. The last matched potential type on this path has the highest precedence and is assigned. For example, in the tree below there is a path that contains the following set of matches: ?, c?, and c=, and so an atom matching these three types would be assigned c=. However, if c1 were also matched for that atom, no single path could contain the whole set of matches and Insight II would report an error about being unable to resolve the precedence. Notice that the path containing the matches does not have to go all the way to the lowest level of the tree; it must simply be a continuous path starting at the root and containing all the matches.

The simple precedence tree (?(h?(hc))(c?(c1(c2))(c=))) represents the following precedence relationships:

        ?                 |
       / \                |
     h?   c?              |  Increasing
     /   /  \             |  Priority
   hc  c=    c1           |
               \          |
                c2        V

Note that the ordering of the subtrees is unimportant. It is the order as you descend a path in the tree that defines the precedence. Thus the simple trees (?(Br)(br)(Si)) and (?(Si)(br)(Br)) are equivalent, and indicate the same precedence for Br, br, and Si. However, the tree (?(Br(br))(Si)) indicates that the br potential takes precedence over Br if both are matched.

The format of the precedence tree section is:

precedence:
.
.
tree
.
.
end_precedence

The actual tree can be broken into multiple lines, and punctuated with spaces and comments for readability.
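The path rule for resolving precedence can be sketched as follows. This Python is an illustration under the description in this section, not the Insight II implementation; the function names are hypothetical. It records the root-to-node path for every node, which also copes with a type name that appears in more than one place in the tree.

```python
import re
from typing import Iterable, List


def precedence_paths(tree: str) -> List[List[str]]:
    """Return the root-to-node path for every node of the tree."""
    paths: List[List[str]] = []
    stack: List[str] = []
    expect_name = False
    for tok in re.findall(r"[()]|[^()\s]+", tree):
        if tok == "(":
            expect_name = True
        elif tok == ")":
            if expect_name or not stack:
                raise ValueError("malformed precedence tree")
            stack.pop()
        else:
            if not expect_name:
                raise ValueError(f"unexpected type name {tok!r}")
            stack.append(tok)
            paths.append(list(stack))
            expect_name = False
    if stack:
        raise ValueError("unbalanced parentheses")
    return paths


def resolve(paths: List[List[str]], matches: Iterable[str]) -> str:
    """Pick the matched type that lies deepest on a path holding all matches."""
    wanted = set(matches)
    candidates = [p for p in paths if wanted.issubset(p)]
    if not candidates:
        raise LookupError("unable to resolve precedence among "
                          + ", ".join(sorted(wanted)))
    # The shortest such path necessarily ends at one of the matched types.
    return min(candidates, key=len)[-1]


paths = precedence_paths("(?(h?(hc))(c?(c1(c2))(c=)))")
print(resolve(paths, ["?", "c?", "c="]))   # c=
```

Adding c1 to the set of matches makes resolve raise LookupError, mirroring the error case described earlier in this section.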

Examples of Potential Assignment for a Small Set of Potential Types

Suppose we have the following simple set of parameters to assign:

h generic hydrogen parameter

hc hydrogen bonded to carbon

h* hydrogen in water

c sp3 carbon

c2 sp3 carbon with 2 hydrogens, 2 heavy atoms

c= non-aromatic double bond sp2 carbon

o generic oxygen parameter

? parameter for unknown atoms

The rules for these parameters would be as follows:

type: h

template: (>H)

! Matches any hydrogen

end_type

type: hc

template: (>H(-C))

! Matches a hydrogen single bonded to a carbon. No

! restriction on what may be bonded to the carbon.

end_type

type: h*

template: (>H[-O(-H)])

! Matches a hydrogen single bonded to an oxygen which

! in turn has a hydrogen bonded to it. The brackets

! around the OH prevent matches in the unlikely

! situation of a charged oxygen which happens to have

! two hydrogens. With the brackets present the oxygen

! must have only the bonds specified.

end_type

type: c

template: (>C)

atom_test: 1

hybridization: sp3

end_test

! The template matches any carbon. The atom test is

! applied to atom 1 (in this case the only atom) of

! the template string and restricts the match to only

! sp3 carbons.

end_type

type: c2

template: (>C(-H)(-H)(-*)(-*))

! Matches a carbon with 4 single bonds, two of which

! are to hydrogens. No hybridization test was felt

! necessary since a carbon with 4 single bonds is always

! sp3 to Insight II. Note that the hydrogen substructures

! appear before the wildcard substructures. This

! prevents a hydrogen from being used for the wildcard

! match and hence not being available for the more specific

! substructure match.

end_type

type: c=

template: (>C(=*))

atom_test: 1

aromaticity: NON_AROMATIC

hybridization: sp2

end_test

! The template will match any carbon with a double bond.

! To prevent matches to carbons in aromatic rings there

! is an aromaticity test on the first atom of the template (the

! carbon). The hybridization test somewhat redundantly specifies

! that the carbon must be SP2.

end_type

type: o

template: (>O)

! Matches any oxygen

end_type

type: ?

template: (>*)

! Matches anything

end_type

The precedence tree for this small set of parameters would be:

precedence:

(? (h(hc)(h*)) (c(c2)) (c=) (o) )

end_precedence

Graphically this looks like:


        ____?____
       /   / \   \
      h   c   c=  o
     / \  |
    hc h* c2


The lower you go in the tree, the greater the precedence. Thus if an atom matches the ?, h, and h* types, it is assigned the h* potential. In this manner you can ensure that the most appropriate potential type out of all that were matched is assigned.

Examples of Potential Assignments Using These Types


    H
     \
      C=O        aldehyde
     /
    H


 The hydrogens in this molecule match the ?, h, and hc types. Since hc has the highest precedence of this set of matches, it is assigned to the hydrogens. The carbon matches only the c= and ? types; the hybridization test prevents a match with the c template. Since c= has a higher precedence than ?, the carbon is assigned c=. Similarly, the oxygen matches o and ?, and is assigned o.

Example of unmatched atom:

      C---C
    //     \\
    C       C       benzene
     \     /
      C===C

The simple set of parameters described so far does not do a good job on this aromatic ring. The c parameter is not matched because the carbons are not SP3, the c= rule fails because the carbons are aromatic, and we do not even get a match with the template in the c2 rule. The only match is ?, and this is what is assigned to the carbons of the aromatic ring.

Adding A New Parameter Type

One way to handle the case above is to add a new parameter for SP2 carbons; call it cp. Once the new parameter is added to the forcefield, we want to add to the potential assignment rules so that the new parameter is assigned to the appropriate atoms. The new rule will be:

type: cp

template: (>C)

atom_test: 1

hybridization: sp2

end_test

end_type

We also need to add this type into the precedence tree. At first glance it might seem it should just go below ?:

precedence:

(? (h(hc)(h*)) (c(c2)) (c=) (cp) (o) )

end_precedence

The graphical representation of this is:


            ____?________
           /   / \   \   \
          h   c   c=  o  cp
         / \  |
        hc h* c2


However, if we go back and look at the aldehyde case, we see that this presents a problem. The carbon will still match ? and c=, but now it will also match cp. No single path in the precedence tree contains all three matches, so Insight II has no way to determine which of the potentials has the highest precedence. The cp type will be matched any time the c= type is matched because it is just a more general (no aromaticity test) version of the c= rule. This means that we can resolve the precedence conflict by putting c= under cp:

precedence:

(? (h(hc)(h*)) (c(c2)) (cp(c=)) (o) )

end_precedence

The graphical representation of this is:

            ____?____
           /   / \   \
          h   c   cp  o
         / \  |   |
        hc h* c2  c=
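The difference between the two candidate trees can be checked mechanically. The compact Python sketch below (hypothetical function name, assuming well-formed trees) walks a tree once and keeps the shortest root path that holds every match for the aldehyde carbon.

```python
import re
from typing import List, Optional, Set


def assign(tree: str, matches: Set[str]) -> str:
    """Return the deepest matched type on a root path holding every match."""
    stack: List[str] = []
    best: Optional[List[str]] = None
    for tok in re.findall(r"[()]|[^()\s]+", tree):
        if tok == ")":
            stack.pop()                     # leave the current subtree
        elif tok != "(":
            stack.append(tok)               # descend to this type
            if matches <= set(stack) and (best is None or len(stack) < len(best)):
                best = list(stack)
    if best is None:
        raise LookupError("no single path contains " + ", ".join(sorted(matches)))
    return best[-1]


aldehyde_carbon = {"?", "c=", "cp"}
flat = "(? (h(hc)(h*)) (c(c2)) (c=) (cp) (o) )"
nested = "(? (h(hc)(h*)) (c(c2)) (cp(c=)) (o) )"

print(assign(nested, aldehyde_carbon))      # c=
try:
    assign(flat, aldehyde_carbon)
except LookupError as err:
    print("conflict:", err)
```

With the nested tree, a benzene carbon that matches only ? and cp resolves to cp, which is the outcome the new rule was added to achieve.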


cvff_templates.dat

The following is a sample potential type template file that does potential type assignment for the cvff.frc library.

! cvff_templates.dat

! Template file of potential type assignment templates for the cvff

! forcefield

!

! Revision History:

! KWC 6/5/90 Initial version without ci and ni types

! DWS 8/13/90 Fixes for beta bugs, addition of ci,ni

! KWC 8/16/90 added o- templates for carboxylate oxygens,

! changed precedence for Cl/cl since brackets

! now work correctly for first atom in template

! KWC 8/16/90 Fixed the error introduced into the o-

! template covering O:*:O

! DWS 11/17/90 Removed SP3 test for o potential, changed

! precedence for some of the c' and o' stuff

! JDC 4/8/92 Added alternative bonding for nitro group

!

! SHS 1/9/92 Changed the atom type for Calcium ion - Ca++

! from c+ to ca+

! JDC 9/10/92 Update template for ca from cff91.

!

!

type: ?

! anything

template: (>*)

end_type

type:h

! hydrogen bonded to carbon

template: (>H (-*) )

atom_test:2

allowed_elements:C,Si,H

end_test

end_type

type:hn

! hydrogen bonded to nitrogen

template: (>H(-N))

end_type

type:ho

! hydrogen bonded to oxygen

template: (>H(-O))

end_type

type:hp

! hydrogen bonded to phosphorus

template: (>H(-P))

end_type

type:hs

! hydrogen bonded to sulfur

template: (>H(-S))

end_type

type:h*

! hydrogen in water

template: (>H(-O(-H)))

end_type

type: lp

!lone pair

template: (>L (-*))

end_type

type:c

! generic SP3 carbon

template: (>C)

atom_test:1

hybridization:sp3

end_test

end_type

type: c=

! non aromatic double bonded carbon

template: (>C(=*))

atom_test:1

aromaticity:NON_AROMATIC

end_test

end_type

type:c'

! carbonyl (c=O) group

template: (>C(=*))

atom_test: 2

allowed_elements: O,S

end_test

end_type

type:c'

! carbonyl (c:O) group

template: (>C(:*)(~*)(~*))

atom_test: 2

allowed_elements: O,S

end_test

end_type

type:cp

! SP2 aromatic carbon

template:(>C)

atom_test:1

hybridization: SP2

aromaticity:AROMATIC

end_test

end_type

type:cp

! This is used for aromatic carbons that fail the aromaticity test because

! the current ring checker is too lame to figure out a ring with more than

! seven or eight sides. The NON_AROMATIC test is to eliminate the conflict

! with the above 'cp' definition. This can be removed when the ring checker

! is made more robust.

template: [>C(-*)(:*)(:*)]

atom_test:1

hybridization: SP2

aromaticity:NON_AROMATIC

end_test

end_type

type: c5

! SP2 aromatic in 5 membered ring

template:(>C)

atom_test: 1

hybridization: SP2

aromaticity:AROMATIC

ring:PLANAR(5)

end_test

end_type

type: cs

! SP2 aromatic in 5 membered ring next to S

template:(>C(-S))

atom_test: 1

hybridization: SP2

aromaticity:AROMATIC

ring:PLANAR(5)

end_test

atom_test: 2

hybridization: SP2

aromaticity:AROMATIC

ring:PLANAR(5)

end_test

end_type

! The cn type is not assigned currently

!type: cn

! template (>C(-N)(-*)(-*)(-*))

!end_type

type: cr

! c in guanidinium group

template: (>C (=N(-*)) (-N(-H)(-H)) (-N(-H)(-H)) )

end_type

type: c-

! c in charged carboxylate

template: (>C(:O)(:O))

end_type

type: c-

! c in charged carboxylate

! How do we indicate that the second O has nothing bonded to it ?

! what makes it not match COOH ?

template: (>C(=O)[-O])

end_type

type: c1

! sp3 carbon with 1 h 3 heavies

template: (>C(-H)(-*)(-*)(-*))

atom_test:3

disallowed_elements:H

end_test

atom_test:4

disallowed_elements:H

end_test

atom_test:5

disallowed_elements:H

end_test

end_type

type: ca

! template: (>C(-H)(-C[~O])(-C)(-N(-H)))

template: (>C(-N(-*))(-C[~O])(~*)(~*))

end_type

type:c2

! sp3 carbon with 2 H's, 2 Heavy's

template:(>C(-H)(-H)(-*)(-*))

atom_test:4

disallowed_elements:H

end_test

atom_test:5

disallowed_elements:H

end_test

end_type

type: cg

template: (>C(-H)(-H)(-C[~O])(-N(-H)))

end_type

type: c3

! sp3 carbon with 3 h's 1 heavy

template: (>C(-H)(-H)(-H)(-*))

atom_test:5

disallowed_elements:H

end_test

end_type

type: ct

! sp carbon involved in a triple bond

template: (>C(#*))

end_type

type: n

! Billed as sp2 with 1 H but seems to be used for sp2 with no H's as

! well in proline

template: (>N(-*))

atom_test:1

aromaticity: non_aromatic

hybridization:SP2

end_test

end_type

type: n1

!sp2 nitrogen in charged arginine

template: (>N(-C)(=C)(-H))

atom_test:1

hybridization:SP2

end_test

end_type

type: n2

!sp2 nitrogen with two hydrogens

template: (>N(-H)(-H))

atom_test:1

aromaticity: non_aromatic

hybridization:SP2

end_test

end_type

type: np

!sp2 sp2 aromatic

template: (>N)

atom_test:1

aromaticity:aromatic

hybridization:SP2

end_test

end_type

type: np

!sp2 sp2 aromatic

template: (>N(:*)(:*))

atom_test:1

aromaticity:non_aromatic

hybridization:SP2

end_test

end_type

type: np

!

template: (>N(=O)(-O))

atom_test:1

aromaticity:non_aromatic

hybridization:SP2

end_test

end_type

type: n3

! sp3 nitrogen with 3 substituents

template: (>N(-*)(-*)(-*))

atom_test:1

hybridization:SP3

end_test

end_type

type: n4

! sp3 nitrogen with 4 substituents

template: (>N(-*)(-*)(-*)(-*))

atom_test:1

hybridization:SP3

end_test

end_type

type:n=

! sp2 nitrogen involved in a double bond (non-aromatic)

template:(>N(=*))

atom_test:1

aromaticity:NON_AROMATIC

hybridization:SP2

end_test

end_type

type:nt

! sp nitrogen involved in a triple bond

template:(>N(#*))

end_type

type:nz

! sp nitrogen in N2

template:[>N[#N]]

end_type

type: ni

! Nitrogen in charged imidazole ring

template:[>N(=C(-N(-C(=C))))(-H)(-C)]

atom_test:1

ring:PLANAR(5)

end_test

atom_test:2

ring:PLANAR(5)

end_test

atom_test:3

ring:PLANAR(5)

end_test

atom_test:4

ring:PLANAR(5)

end_test

atom_test:5

ring:PLANAR(5)

end_test

atom_test:7

ring:PLANAR(5)

end_test

end_type

type: ni

! Nitrogen in charged imidazole ring

template:(>N(-C(=C(-N(=C)(-H))))(-C)(-*))

atom_test:1

ring:PLANAR(5)

end_test

atom_test:2

ring:PLANAR(5)

end_test

atom_test:3

ring:PLANAR(5)

end_test

atom_test:4

ring:PLANAR(5)

end_test

atom_test:5

ring:PLANAR(5)

end_test

atom_test:7

ring:PLANAR(5)

end_test

end_type

type: ci

! Carbon in charged imidazole ring

template:(>C(=N(-C(=C(-N)))(-H)))

atom_test:1

ring:PLANAR(5)

end_test

atom_test:2

ring:PLANAR(5)

end_test

atom_test:3

ring:PLANAR(5)

end_test

atom_test:4

ring:PLANAR(5)

end_test

atom_test:5

ring:PLANAR(5)

end_test

end_type

type: ci

! Carbon in charged imidazole ring

template:(>C(-N(-C(=N(-H)(-C)))))

atom_test:1

ring:PLANAR(5)

end_test

atom_test:2

ring:PLANAR(5)

end_test

atom_test:3

ring:PLANAR(5)

end_test

atom_test:4

ring:PLANAR(5)

end_test

atom_test:6

ring:PLANAR(5)

end_test

end_type

type: ci

! Carbon in charged imidazole ring

template:(>C(=C(-N(-C(=N(-C)(-H))))))

atom_test:1

ring:PLANAR(5)

end_test

atom_test:2

ring:PLANAR(5)

end_test

atom_test:3

ring:PLANAR(5)

end_test

atom_test:4

ring:PLANAR(5)

end_test

atom_test:5

ring:PLANAR(5)

end_test

atom_test:6

ring:PLANAR(5)

end_test

end_type

type:o

! generic oxygen

template: (>O)

! SP3 test removed by DWS 11/27/90

! atom_test:1

! hybridization:SP3

! end_test

end_type

type: o'

! oxygen in carbonyl group

template: (>O(=*))

atom_test:2

allowed_elements: N,C,S

end_test

end_type

type: o'

! oxygen in carbonyl group

template: [>O(:*(:*))]

atom_test:3

disallowed_elements: O

end_test

end_type

type: o-

! partial double oxygen bonded to something then bonded to

! another partial double oxygen

template: [>O(:*[:O](-*))]

end_type

type: o-

! double bonded oxygen in charged carboxylate COO-

template: [>O(=C[-O](-*))]

end_type

type: o-

! single bonded oxygen in charged carboxylate COO-

template: [>O[-C[=O](-*)]]

end_type

type: o-

! single bonded oxygen in incorrectly bond ordered charged carboxylate COO-

! NOTE: the carbon will be flagged as having unfilled valences

template: [>O[-C[-O](-*)]]

end_type

type: o-

! single bonded oxygen in incorrect nitro group

template: (>O(-N(=O)))

end_type

type: o-

! double bonded oxygen in incorrect nitro group

template: (>O(=N(-O)))

atom_test:1

hybridization:SP2

end_test

end_type

type: oh

! oxygen bonded to hydrogen

template: (>O(-H)(-*))

end_type

type: op

! SP2 aromatic in 5 membered ring

template:(>O)

atom_test: 1

hybridization: SP2

aromaticity:AROMATIC

ring:PLANAR(5)

end_test

end_type

type: o*

!oxygen in water

template: (>O(-H)(-H))

end_type

type: sh

template: (>S(-H)(-*))

atom_test:3

disallowed_elements:S

end_test

end_type

type: s1

! disulfide

template: (>S(-S))

end_type

type: s

! methionine sulfur

template: (>S)

end_type

type: sp

! SP2 aromatic in 5 membered ring

template:(>S)

atom_test: 1

hybridization: SP2

aromaticity:AROMATIC

ring:PLANAR(5)

end_test

end_type

type: s'

! S in thioketone group

template: (>S(=*))

atom_test:2

allowed_elements: C

end_test

end_type

type: p

! General phosphorus atom

template: (>P)

end_type

type: ca+

! calcium ion

template: [>Ca]

end_type

type: f

!fluorine bonded to carbon

template: (>F (-*))

atom_test: 2

allowed_elements: C,F

end_test

end_type

type: cl

!chlorine bonded to carbon

template: (>Cl (-*))

atom_test: 2

allowed_elements: C,Cl

end_test

end_type

type: Cl

!chlorine ion

template: [>Cl]

end_type

type: Na

!sodium ion

template: [>Na]

end_type

type: br

!bromine bonded to carbon

template: (>Br (-*))

atom_test: 2

allowed_elements: C,Br

end_test

end_type

type: Br

!bromine ion

template: [>Br]

end_type

type: i

!iodine

template: (>I (-*))

atom_test: 2

allowed_elements: C,I

end_test

end_type

type: si

!silicon

template: (>Si)

end_type

type: ar

! Argon

template: (>Ar)

end_type

precedence:

(?

(ca+)

(Cl) (cl)

(Br) (br)

(Na)

(p)

(si)

(f)

(i)

(h) (ho(h*)) (hn) (hp) (hs)

(lp)

(ct) (c=(ci)(c'(c-)) (cr)) (cp(c'(c-))) (c'(c-))

(cp(c5(ci)(c')(cs(c')))) (c(c1(ca)) (c2(ca(cg))) (c3) (ca))

(o (o-) (oh(o*)) (op)) (o(o'(o-))) (o-)

(nt (nz)) (n(ni)(np(ni))(n2(n=))(n=(ni)(np(ni)(n1(ni))))) (np(ni)) (n3(n4))

(s (sh) (s1) (sp) (s')) (n(n=(n1(np(ni))(ni)))(n1(np(ni))(ni)))

(ar)

)

end_precedence


Out-of-Plane Assignment Rules

When assigning an out-of-plane value to an atom, Insight II first looks for matches in the currently assigned residue library. If a match is found, the residue library entry is used to assign the out-of-plane value to the atom. If no match is found, the following rules are used instead. Note, however, that these rules are not designed to work for the AMBER forcefield.

Table 7-1. Out-of-Plane Assignment Rules

ELEMENT              ENVIRONMENT                      OOP FLAG
-------              -----------                      --------
N                    -NH3                             0

                     =X-NH2                           1
                       where X=SP2 and N=SP2

                     -NH2                             0

                      \
                       N-H                            1
                      //

                     All other environments           0
--------------------------------------------------------------
                      \
C                      C--                            1
                      //
                       where C=SP2

                     All other environments           0
--------------------------------------------------------------
All other elements   All other environments           0




Last updated December 17, 1998 at 04:29PM PST.
Copyright © 1998, Molecular Simulations Inc. All rights reserved.