The ens Command

ens add

ens add ehandle ?ehandle_list?...

e.add(?eref/erefsequence?,...)

e += eref

This command performs the same operation as the ens merge command, but preserves the ensembles in the merge lists (argument four and onwards in the Tcl command variant). The base ensemble (third argument) is modified.

Please refer to the ens merge command for more detailed documentation.

The Python arithmetic command returns a reference to the original ensemble, not the new first atom label or the reference of the merged ensemble (see ens merge).

ens align3d

ens align3d ehandle box/center/masscenter/pmi ?usehydrogens? ?property?

e.align3d(?mode=?,?usehydrogens=?,?coordinateproperty=?)

Perform a 3D alignment by modifying the standard atom coordinate property A_XYZ, or an alternative, explicitly specified atomic coordinate property.

The possible alignment modes are

box
move center of enclosing 3D coordinate box to origin
center
move average atom coordinates to origin
masscenter
move mass-weighted average of atom coordinates to origin
pmi
align ensemble to principal moments of inertia (largest on x axis), and move the mass-weighted center to the origin.

By default all atoms are used to compute the alignment rotation and movement vectors, including hydrogens. If these should be omitted from computing the movement vectors (but not the subsequent atom movement), the optional usehydrogens parameter can be set to false .

The command returns the handle or reference of the ensemble.
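The translation modes can be modelled in a few lines of plain Python. This is only an illustrative sketch, not the toolkit implementation: the function name and the (element, mass, coordinates) tuple layout are hypothetical, and the pmi rotation is not covered. It shows that hydrogens excluded via usehydrogens are left out of the reference point computation but are still moved with the rest.

```python
def align_translate(atoms, mode="center", usehydrogens=True):
    """atoms: list of (element, mass, (x, y, z)); returns shifted coordinates.

    Hypothetical helper, not part of the toolkit API.
    """
    # Atoms used to compute the reference point (hydrogens optionally skipped)
    ref_atoms = [a for a in atoms if usehydrogens or a[0] != "H"]
    axes = list(zip(*(xyz for _, _, xyz in ref_atoms)))
    if mode == "box":
        # Center of the enclosing coordinate box
        ref = [(min(ax) + max(ax)) / 2.0 for ax in axes]
    elif mode == "center":
        # Unweighted average of the coordinates
        ref = [sum(ax) / len(ax) for ax in axes]
    elif mode == "masscenter":
        # Mass-weighted average of the coordinates
        masses = [m for _, m, _ in ref_atoms]
        total = sum(masses)
        ref = [sum(m * c for m, c in zip(masses, ax)) / total for ax in axes]
    else:
        raise ValueError("unsupported mode: " + mode)
    # Every atom is moved, including hydrogens left out of the reference
    return [tuple(c - r for c, r in zip(xyz, ref)) for _, _, xyz in atoms]
```

For two carbons at x=0 and x=2 plus a hydrogen at x=5, mode center with usehydrogens disabled puts the reference point at x=1, and the hydrogen ends up at x=4.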

ens append

ens append ehandle ?property value?...

e.append({?property:value,?...})

e.append(?property,value,?...)

Standard data manipulation command for appending property data. It is explained in more detail in the section about setting property data.

The command returns the first data value.

Example:

ens append $ehandle E_NAME "_linker"

ens assign

ens assign ehandle srcproperty dstproperty

e.assign(srcproperty=,dstproperty=)

Assign property data to another property on the same ensemble. Both properties must be associated with the ensemble object class. This process is more efficient than going through a pair of ens get/ens set commands, because in most cases no string or Tcl/Python script object representations of the property data need to be created.

Both source and destination properties may be addressed with field specifications. A data conversion path must exist between the data types of the involved properties. If any data conversion fails, the command fails. For example, it is possible to assign a string property to a numeric property - but only if all property values can be successfully converted to that numeric type. The reverse example case always succeeds, out-of-memory errors and similar global events excluded.
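The all-or-nothing conversion rule can be pictured with a small plain-Python model. This is a sketch under stated assumptions, not the toolkit's code: the dictionary stands in for the ensemble's property store, and assign_property, E_SRC, E_DST, E_NUM and E_STR are hypothetical names.

```python
def assign_property(data, src, dst, convert):
    """Copy data[src] to data[dst], converting every value.

    Hypothetical model: the destination is only written if all
    conversions succeed; the source data is left untouched.
    """
    converted = [convert(value) for value in data[src]]  # raises on first failure
    data[dst] = converted
    return data


# String to numeric: succeeds only if every value is convertible
store = {"E_SRC": ["1", "2.5"]}
assign_property(store, "E_SRC", "E_DST", float)

# Numeric to string: always succeeds
assign_property({"E_NUM": [1, 2.5]}, "E_NUM", "E_STR", str)
```

If a single value such as "abc" cannot be converted, the list comprehension raises before the destination is touched, which mirrors the documented failure of the whole command.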

The original property data remains valid. The command variant ens rename directly exchanges the property name without any data duplication or conversion, if that is possible. In any case, the original property data is no longer present after the execution of this command variant.

The command returns the original object handle for Tcl , or object reference for Python .

Examples:

ens assign $ehandle A_XY A_XY%

ens assign $ehandle E_NMRSPECTRUM(spectrometer) E_METHOD

ens rename $ehandle E_IDENT E_NAME

ens atoms

ens atoms ehandle ?filterset? ?filtermode?

e.atoms(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the atoms the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

Examples:

ens atoms $ehandle

ens atoms $ehandle hydrogen

ens atoms $ehandle !hydrogen count

The first example simply returns a list of the labels of the atoms the ensemble contains as minor objects. The second example returns the atom label(s) of all hydrogen atoms in the ensemble. If there are no such atoms, an empty list is returned. The final example counts the number of non-hydrogen atoms in the ensemble.

ens bondangles

ens bondangles ehandle ?filterset? ?filtermode?

e.bondangles(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the bond angle objects the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

ens bonds

ens bonds ehandle ?filterset? ?filtermode?

e.bonds(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the bonds the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

Examples:

ens bonds $ehandle

ens bonds $ehandle doublebond

ens bonds $ehandle carbon count

The first example simply returns a list of the labels of the bonds the ensemble contains as minor objects. The second example returns the bond label(s) of all double bonds in the ensemble. If there are no such bonds, an empty list is returned. The final example counts the number of bonds which involve one or more carbon atoms in the ensemble.

ens cast

ens cast ehandle dataset/ens/reaction/table ?propertylist?

e.cast(objectclass=,?properties=?)

Transform the ensemble into a different object. Depending on the target object class, the result is as follows:

dataset
A new dataset which contains the ensemble as its first and only object.
ens
Only supplied for the sake of completeness. This mode does nothing.
reaction
A new reaction, which contains the original and a duplicate of the ensemble as reagent and product components, and an auto-generated 1:1 A_MAPPING property.
table
A new table with one row and automatically generated columns for all properties of the input ensemble of the ens (E_*) object class. The row is filled with the input ensemble data, and the ensemble is moved to the internal dataset of the table.

If the optional property list is specified, an attempt is made to compute the listed properties before the cast operation, so that they may become a part of the new object. No error is raised if a computation fails.

The command returns the handle (reference for Python ) of the new object, or the input object in case of mode ens .

ens clear

ens clear ehandle ?keepensprops?

e.clear(?keepensproperties=?)

This command resets an ensemble to a virgin state. All minor objects and all property data of the ensemble are deleted. However, the ensemble handle or reference remains valid, representing an ensemble without any atoms, bonds, rings or other minor objects. If the optional argument is set to a true value, ensemble-class properties ( E_* ) are not deleted, but everything else still is.

Ensemble membership in datasets, reactions, etc. is not changed by this command.

The command returns the original handle or reference.

ens compare

ens compare ehandle ehandle2

e.compare(eref/ehandle)

Compare two ensembles, yielding a stable sort order. The compared attributes are, in this order, the number of atoms, the number of bonds, the ensemble molecular weight, the number of ESSSR rings and finally the stereo- and isotope aware 64-bit hashcode ( E_ISOTOPE_STEREO_HASHY ). The command returns 1 if the first ensemble is larger, -1 if the second is larger, and 0 if they are identical according to the comparison scheme.

The compared property values, with the exception of the final hashcode tiebreaker, are compatible with the RDKit model.
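The comparison cascade amounts to a lexicographic comparison of a five-element key. The following plain-Python sketch illustrates that ordering only. The EnsSummary class and its field names are hypothetical stand-ins for values that the toolkit would compute from the ensembles, and the numbers in the usage example are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class EnsSummary:
    """Hypothetical container for the compared attributes."""
    natoms: int
    nbonds: int
    weight: float
    nrings: int
    hashcode: int


def compare(a, b):
    """Return 1 if a sorts after b, -1 if before, 0 if all keys are equal."""
    key_a = (a.natoms, a.nbonds, a.weight, a.nrings, a.hashcode)
    key_b = (b.natoms, b.nbonds, b.weight, b.nrings, b.hashcode)
    # Tuples compare element by element, so later fields act as tiebreakers
    return (key_a > key_b) - (key_a < key_b)


small = EnsSummary(8, 7, 30.07, 0, 0x1111)
large = EnsSummary(11, 10, 44.10, 0, 0x2222)
order = compare(small, large)
```

Because the hashcode is the last field, it only decides the order when atom count, bond count, weight and ring count are all equal.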

ens copy

ens copy src_ehandle dst_ehandle

e.copy(eref_dst)

Create a copy of the input ensemble into the framework of an existing ensemble. The old data of the destination ensemble is destroyed, but its handle or reference is reused for the copy. The destination handle can be an empty string, #new, #auto or None for Python . In that case, the ensemble is duplicated and a new handle assigned.

This command is useful when an ensemble handle or reference is potentially stored in unknown locations and the ensemble data needs to be updated.

The return value of the command is the handle or reference of the destination ensemble. It is allowed to copy an ensemble onto itself.

Example:

set eh1 [ens create CC]

set eh2 [ens create CCC]

ens copy $eh1 $eh2

After the example code sequence, both ensembles represent ethane, the first compound. However, these are independent ensembles. Any further modification of the data of either ensemble will not be seen by the other.

The command returns the handle or reference of the target ensemble.

ens create

ens create ?codestring? ?mode? ?datasethandle? ?macroset?

Ens(?data=?,?mode=?,?dataset=?,?macroset=?)

Ens.Create(?data=?,?mode=?,?dataset=?,?macroset=?)

This command creates a new molecular ensemble and returns its handle or reference. If none of the optional arguments are specified, or the argument string is an empty string (or None for Python), an empty ensemble without any atoms or bonds is created. Such an ensemble may later be populated with commands like atom create.

The data string may begin with an automatically recognized prefix. Otherwise, an automatic format detection process is initiated. Recognized prefixes are:

aa:
Decode a 1-letter or 3-letter case-sensitive amino acid sequence. Stereochemistry is assumed to be natural (i.e. L amino acids). The first amino acid has the free amino group, the last the free carboxyl group.
aldrich:
Decode a Sigma-Aldrich catalog number via the sigmaaldrich.com website. There are a couple of alias prefixes: sigma:, sial:, milliporesigma:, sigmaaldrich: and ms:
cas:
Decode CAS number via the NCI resolver, PubChem or commonchemistry.org. Since properly formed CAS numbers are distinctive, they are also recognized without a prefix.
cdx:
Decode base64-encoded ChemDraw CDX data. This is the format sent to Web servers by the ChemDraw browser plug-in.
chebi:
Decode ChEBI ID
chembl:
Decode ChEMBL ID
chemspider:
Decode Chemspider ID
cid:
Decode PubChem CID
drugbank:
Decode DrugBank ID
emol:
Decode EMolecules ID. emolecules: is an equivalent prefix.
formula:
Decode as formula, i.e. create elemental atoms, but no bonds
inchi:
Decode an InChI string. Usually this is not a needed prefix since the standard beginning of an InChI string ( InChI=) is sufficiently unique to prevent misinterpretation. The prefix can be useful in case it is not known whether the InChI string has a proper lead-in. If the InChI= part has been stripped, the decoder does not automatically recognize the encoding. With the explicit prefix, InChI strings with and without the lead-in are decodable.
jme:
Decode as data string of JME Java structure editor
kegg:
Decode KEGG ID
lincs:
Decode a LINCS ID.
mcule:
Decode MCULE ID
mesh:
Decode NCBI MeSH ID
mfcd:
Decode MDL structure ID. The value following the colon can be either a simple number, or start with the MFCD prefix in upper case.
name:
Perform name resolution using the NCI resolver, OPSIN, KEGG or ChemSpider, depending on the system configuration. By default, first the NCI resolver and, if that fails, OPSIN are contacted.
patran:
Decode a Lhasa 1D Patran query pattern. 2D Patran patterns can be decoded with reaction create .
pdb:
Decode a PDB ID (4 characters, initial number plus 3 alphanumeric characters)
querysln:
Decode as Query SLN string.
sid:
Decode PubChem SID
sln:
Decode as SLN string
smarts:
Decode as SMARTS (explicitly not as SMILES )
smiles:
Decode as SMILES (explicitly not as SMARTS )
strictsmiles:
Decode as SMILES (explicitly not as SMARTS ), and also use hypervalent hydrogen addition as per the original Daylight definition (see also the description of the ::cactvs(smiles_hypervalent_hydrogen_addition) control variable).
unii:
Decode as FDA UNII code. Properly formed UNII s are also automatically recognized without a prefix.
zinc:
Decode ZINC ID
quoted with ' or "
Handled the same way as the name: prefix. These must be explicit quotes that are part of the string, not string syntax elements of the script. Example: ens create "aspirin" vs. ens create \"aspirin\" or ens create 'aspirin' - the latter two commands work as expected, the first does not, because the quotes are not an actual part of the string, and aspirin can be decoded (in a very lenient fashion) as SMILES, which has precedence.

The colon in the prefix may be omitted (except for the name: item), but this is not recommended, since it may lead to misinterpretation of the data if the prefix is also part of a valid structure encoding.

In addition, URLs as structure data argument are automatically detected and handled specially. If the URL is a data URI, it is unpacked and its payload processed in a second cycle. If it is an HTTP or FTP URL, the file is downloaded and its contents read as a structure file with automatic format detection. This is not identical to data URI processing: Data URIs are again interpreted as command arguments with all prefix and line notation interpretation, while file contents are only interpreted as a record in a structure data file.

If none of the above special cases are recognized, automatic interpretation is performed next. Currently, the encoding then may either be

a SMILES/SMARTS string (see below on how to distinguish these)
a hex-encoded SMILES/SMARTS string, as used by some Daylight tools
an InChI string, with a proper lead-in ( InChI= )
a Cactvs packed serialized object string, as it is generated by the ens pack command
a Cactvs Minimol object in binary or base64 -encoded form
a plain text or base64 -encoded blob of the contents of a structure file record, such as an MDL SDfile. The format must be identifiable by the currently loaded set of structure file I/O modules. Since the data has no file name, automatic loading of modules is not possible.
a PubChem CID - any simple integer argument is interpreted as CID
a CAS number which is looked up on the Internet provided general Internet access is enabled in the toolkit
an MDL structure ID, starting with a proper lead-in (MFCD), followed by eight digits, which is also resolved by Internet access to the chemsynthesis.com site if possible.
an InChI key, with or without lead-in (InChIKey=). This only works for keys which can be looked up via the NCI resolver over the Internet.
a properly formed FDA UNII code.
a structure file record image as produced by the MySQL database compress() function (i.e. 4 byte binary uncompressed size prefix plus zlib-compressed content). This is primarily useful when the command is used in the context of the MySQL database cartridge.
a compound name as last resort, which is by default looked up via the NCI resolver and the OPSIN service

In the absence of a prefix, the encoding is automatically detected. With the exception of PubChem CIDs, the long form of a database ID must be used, not its simple integer value (i.e. a simple 70 is interpreted as PubChem CID, while CHEMBL70 or chembl:70 are decoded as ChEMBL database IDs).
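The distinction between prefixed identifiers, long-form database IDs and bare integers can be sketched in plain Python. This is a heavily simplified illustration, not the toolkit's detection logic: the classify function is hypothetical, only a handful of the documented prefixes are listed, and the long-form check covers the ChEMBL case alone.

```python
import re

# Small, hypothetical subset of the documented prefixes
KNOWN_PREFIXES = {"cas", "chebi", "chembl", "cid", "name", "sid", "smiles", "zinc"}


def classify(data):
    """Return (kind, payload) for a structure data string."""
    head, sep, rest = data.partition(":")
    if sep and head in KNOWN_PREFIXES:
        return head, rest                    # explicit prefix wins
    if re.fullmatch(r"[0-9]+", data):
        return "cid", data                   # bare integer: PubChem CID
    match = re.fullmatch(r"CHEMBL([0-9]+)", data)
    if match:
        return "chembl", match.group(1)      # long form of a database ID
    return "auto", data                      # left to further format detection
```

With this model, 70 is classified as a PubChem CID, while CHEMBL70 and chembl:70 both resolve to ChEMBL ID 70.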

For the base64 -encoded compressed records, the compression algorithm may be raw zlib , gzip or zip and its type is automatically detected.
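One plausible way to detect the compression type is to inspect the leading bytes of the decoded payload. The sketch below uses only the Python standard library and is an assumption about how such detection could work, not a description of the toolkit internals. The function name unpack_record is hypothetical.

```python
import base64
import gzip
import io
import zipfile
import zlib


def unpack_record(b64text):
    """Decode base64 text and undo gzip, zip or raw zlib compression."""
    raw = base64.b64decode(b64text)
    if raw[:2] == b"\x1f\x8b":                       # gzip magic bytes
        return gzip.decompress(raw)
    if raw[:4] == b"PK\x03\x04":                     # zip local file header
        with zipfile.ZipFile(io.BytesIO(raw)) as archive:
            return archive.read(archive.namelist()[0])
    # zlib stream: method nibble is 8 and the 2-byte header is a multiple of 31
    if len(raw) >= 2 and (raw[0] & 0x0F) == 8 and (raw[0] * 256 + raw[1]) % 31 == 0:
        try:
            return zlib.decompress(raw)
        except zlib.error:
            pass                                     # not zlib after all
    return raw                                       # uncompressed record
```

For a zip archive, this sketch reads only the first member, on the assumption that the blob holds a single structure record.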

In case one of the SMILES-class encoding schemes is used, the mode argument of the ens create command provides finer control of the decoding. By default, or when this argument is an empty string, the string is interpreted as standard SMILES, except when there are elements in the string which cannot occur in SMILES but can occur in SMARTS. In SMILES mode, query expressions are only recognized to a very limited degree, and implicit hydrogens are automatically added. This decoding scheme may also be explicitly selected by specifying hadd as mode.

In order to force a full hydrogen addition to the raw decoded structure even if it would not be done otherwise, use the mode forcehadd .

Mode strictsmiles decodes SMILES with hydrogen addition but as if the strictsmiles: prefix was set. This is described above.

Mode nohadd is essentially the same as basic SMILES decoding, but implicit hydrogen addition does not happen. In any case, explicitly encoded hydrogen is decoded and preserved.

Mode smarts (or query ) also skips hydrogen addition, but in addition the decoder now fully parses SMARTS , including Recursive SMARTS, but it also becomes less lenient in the area of superatom encodings and similar gray areas, in order to avoid ambiguity. The recognized SMILES dialect may be switched via the control variable ::cactvs(smiles_version). The default is Daylight release 4.9 with Cactvs and EliLilly extensions.

Mode sln forces the interpretation of the input string as Sybyl Line Notation . If the SLN I/O module has already been loaded, interpretation as SLN is automatically attempted in any case, but only after SMILES decoding has failed. Since there are strings which are both valid SMILES and SLN , but mean something different, this automatism can lead to misinterpretation, so if you know you are dealing with SLN , it is a good idea to specify it. The sln mode attempts to auto-load the SLN I/O module if it is not yet loaded. In case it cannot be loaded, this mode raises an error. Mode querysln is similar, but assumes the input is query SLN , not plain SLN .

The 3D decoder mode prefers resolution of identifiers as 3D model instead of 2D connectivity. This has an effect only with a few select combinations of identifiers and resolvers and should be considered experimental.

Instead of using an explicit decoder mode or a data prefix, it is also possible to supply the name of a property the structure data is an instance of. Examples are E_SDF_STRING or E_SMILES . Such properties are expected to provide suitable default decoder configuration data in their fileformat and fileflags attributes, and these are then used to decode the structure.

In nohadd decoder mode, the structure code is finally, if everything else fails, interpreted as a plain molecular formula. If the string is parsed successfully as a formula, a collection of atoms of the specified elements is created, without any bonds.

By default, or if the optional target dataset parameter is an empty string, the new ensemble is not a member of any dataset. It may be directly made a dataset member if a dataset handle is specified.

If a macro set name is specified, SMILES and SMARTS with macro definitions can be processed. Any pattern names which belong to the specified set are expanded. Set names, pattern names and expansion fragments are specified in the system macro table. Macro expansion is not available if the toolkit was compiled without table support.

Examples:

set eh [ens create]

set eh [ens create CCC]

set sshandle [ens create {[CH3][Cl,Br,I]} smarts]

set eh [ens create [decode -url C%23C] nohadd]

In case a structure is encoded as a string in a format which cannot be directly decoded by the ens create command (such as a plain string representation of an MDL molfile), the standard method is to load the appropriate file format decoder (if not built in, this is needed so that automatic format detection of the memory image record works), open the structure string as a memory-based structure file, and read from this file. This technique allows the input of multiple records from the in-memory file and thus is also useful in cases like a multi-record SMILES file encoded as a string.

Example:

filex load cdx

set fh [molfile open [decode -base 64 $cdxstring] s]

set eh [molfile read $fh]

molfile close $fh

ens dataset

ens dataset ehandle ?filterlist?

e.dataset(?filters=?)

Return the dataset handle or reference of the dataset the ensemble is part of. If the ensemble is not a member of a dataset, or does not pass all of the optional filters, an empty string or None for Python is returned.

Example:

ens dataset $ehandle

ens defined

ens defined ehandle property

e.defined(property)

This command checks whether a property is defined for the ensemble. This is explained in more detail in the section about property validity checking. Note that this is not a check for the presence of property data! The ens valid command is used for this purpose.

The command returns a boolean result.

ens delete

ens delete all

ens delete ?ehandlelist?...

e.delete()

Ens.Delete("all")

Ens.Delete(?erefsequence/eref/ehandle?,...)

Delete ensembles and the minor objects which are part of the deleted ensembles. The special parameter all may be used to delete all ensembles currently registered in the application, including those which are part of reactions or other major objects. Alternatively, any number of lists of ensemble handles may be specified for specific deletions.

The command returns the number of deleted ensembles.

For historic reasons, the same command may also be invoked as ens destroy .

Example:

ens delete $ehandle

ens delete $ehandlelist1 $ehandlelist2

ens dget

ens dget ehandle propertylist ?filterset? ?parameterdict?

e.dget(property=,?filters=?,?parameters=?)

Ens.Dget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens dget is that the latter does not attempt computation of property data, but rather initializes the property values to the default and returns that default if the data is not yet available. For data already present, ens get and ens dget are equivalent.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes. The data for the creation of the temporary ensemble is equivalent to the first argument of the standard constructor. Additional constructor parameters cannot be used.

ens dup

ens dup ehandle ?datasethandle? ?position? ?filterset? ?ctonlyflag?

e.dup(?dataset=?,?position=?,?filters=?,?ctonly=?)

Duplicate an ensemble. The return value is the handle or reference of the new ensemble.

The duplicate ensemble is placed into the same dataset as the source, if it is a member of a dataset. Specifying an explicitly empty dataset argument (including None for Python ) places the duplicate outside any dataset, regardless of the dataset membership of the source ensemble.

If the duplicate is moved to a dataset, it is appended to the dataset end by default. This happens also if the position parameter is explicitly specified as end or an empty string. Otherwise, the ensemble is inserted at the given position, starting with 0. If the requested position is larger than the current size of the dataset, the ensemble is appended.
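The position rule corresponds to the following plain-Python model, in which a list stands in for the dataset. The helper name insert_at is hypothetical, and negative positions are not considered because the text above does not define them.

```python
def insert_at(dataset, item, position="end"):
    """Insert item into the dataset list following the documented position rule."""
    if position in ("end", "") or int(position) >= len(dataset):
        dataset.append(item)                 # default, explicit end, or past the end
    else:
        dataset.insert(int(position), item)  # zero-based insertion
    return dataset
```

A position of 0 therefore places the duplicate in front of all existing members, while any position at or beyond the current dataset size behaves like end.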

The filter parameter allows the selection of only a subset of atoms to be copied. All atoms which do not pass the filters are discarded, as are all bonds which connect to discarded atoms. If no atoms pass the filters, the result is an empty ensemble. By default, no atom filtering takes place, and all atoms and bonds of the original ensemble are part of the duplicate.

The final optional parameter can be used to make the duplicate lightweight. If this boolean parameter is set, the duplicate is limited to the basic connectivity information with all atom and bond properties, but it has no copies of properties of other object classes, and no copies of rings, molecules, groups or other minor object classes.

The ens hdup command is a variant of this command. It automatically adds a hydrogen set to the duplicate.

Examples:

ens dup $ehandle

ens dup $ehandle [dataset create] end ringatom

The first sample line is a standard use. The second example moves the duplicate into a newly created dataset, and isolates the ring systems. All other atoms are stripped.

ens exists

ens exists ehandle ?filterset?

e.exists(?filters=?)

Ens.Exists(eref=,?filters=?)

Check whether an ensemble handle or reference is valid. The command returns boolean 0 or 1. Optionally, the ensemble may be filtered by a standard filter list and it is reported as not valid if it does not pass the filters. If filters in the filter list operate on atoms, bonds, or other minor objects, it is sufficient if a single minor object of the ensemble passes the filter.

Example:

ens exists $ehandle chlorine

Check whether the ensemble with the handle in variable $ehandle exists and, if it exists, whether it contains one or more chlorine atoms.

ens expand

ens expand ehandle ?allowambiguous? ?noimplicith?

e.expand(?allowambiguous=?,?noimplicith=?)

This command expands all superatoms in the ensemble. The mechanisms for the expansion of superatoms are described in detail for the atom expand command. This command is functionally equivalent, working on all atoms in the ensemble instead of a single atom.

Example:

ens expand $ehandle

The command returns the total number of successfully expanded atoms.

ens expr

ens expr ehandle expression

e.expr(expression)

Compute a standard SQL -style property expression for the ensemble. This is explained in detail in the chapter on property expressions.

ens fill

ens fill ehandle ?property value?...

e.fill({?property:value,...})

e.fill(?property,value?,...)

Standard data manipulation command for setting data, ignoring possible mismatches between the lengths of the lists of objects associated with the property and the value list. It is explained in more detail in the section about setting property data.

Example:

ens fill $ehandle B_COLOR red

sets the color of the first bond in the ensemble to red.

ens filter

ens filter ehandle filterlist

e.filter(filters)

Check whether the ensemble passes a filter list. The return value is boolean 1 for success and 0 for failure.

Example:

ens filter [ens create CCCl] chlorine

checks whether the ensemble contains one or more chlorine atoms. If the filter operates on minor objects of the ensemble, it is sufficient to have a single ensemble minor object pass the filter condition.

ens forget

ens forget ehandle ?objclass?

e.forget(?objectclass=?)

Delete specific classes of minor objects and their data from the ensemble data structure. If no object class is specified, all minor object classes except atoms and bonds and the ensemble data are purged.

If the object class ens is specified, all property data attached to the ensemble object class (usually those properties starting with E_* ) are deleted, but not the ensemble itself.

The command returns the original ensemble handle or reference.

ens formulamatch

ens formulamatch ehandle formula_expression ?other_elements?

e.formulamatch(query=,?other_elements=?)

Match the ensemble against a formula expression. Its syntax is the same as in formula queries in molfile scan and other scan commands.

There are several methods to specify whether any elements not mentioned in the formula expression may or must be present. If the other_elements flag is used, it has the highest priority. It may be set to 0 (no other elements allowed), 1 (allowed) or 2 (required), and if it is set, any prefix in the formula expression is ignored. If it is not used, a prefix in the formula expression may be used to control the matching. Supported prefixes are = (no other elements), >= (other elements allowed) and > (required). If no prefix is used, the default mode is an exact match without other elements.
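The precedence between the other_elements flag and the expression prefix can be written down as a short plain-Python function. This is a sketch of the rule as described above. The function name is hypothetical and the formula expression itself is not parsed.

```python
def resolve_other_elements(expression, other_elements=None):
    """Return (mode, bare_expression) with mode 0 (none), 1 (allowed) or 2 (required)."""
    prefix_mode = None
    # ">=" must be tested before ">" because it is the longer prefix
    for prefix, mode in ((">=", 1), (">", 2), ("=", 0)):
        if expression.startswith(prefix):
            prefix_mode = mode
            expression = expression[len(prefix):]
            break
    if other_elements is not None:
        return other_elements, expression    # explicit flag overrides any prefix
    if prefix_mode is not None:
        return prefix_mode, expression
    return 0, expression                     # default: exact match
```

Under this model, >=C6 yields mode 1, a bare C6 yields mode 0, and >C6 combined with an explicit flag of 0 yields mode 0 because the flag takes priority.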

The return value is the boolean match result.

Example:

ens formulamatch $eh >=C6

Matches any ensemble which has six carbon atoms, regardless of which other elements are present.

ens formulamatch $eh C5-6(Cl+Br+I)2- 1

Matches an ensemble with five or six carbon atoms, two or more heavy halogens, and potentially any other elements.

ens fragment

ens fragment ehandle atomlist ?datasethandle? ?position?

e.fragment(atomsequence=,?dataset=?,?position=?)

Create a new ensemble from a set of atoms in another ensemble. All bonds existing between those atoms are also preserved. The atoms can be selected with any standard atom selection syntax, with one selector per list element. Duplicate atom specifications are ignored. Atom specifications which cannot be resolved generate an error.

By default, the new ensemble becomes a member of the same dataset (if any) as the source ensemble, but this can be changed with the optional fifth argument. If no explicit position is given, the ensemble is appended to the end of the target dataset. The new ensemble only inherits the selected atoms and bonds plus stable atom and bond properties, but not other minor objects or ensemble data.

The command returns the handle or reference of the new ensemble object.

Example:

match ss $substructure $ehandle amap

set ehfrag [ens fragment $ehandle [unzip $amap 1]]

The above code sequence matches a substructure, and then extracts the matched structure part as a new ensemble.

ens get

ens get ehandle propertylist ?filterset? ?parameterdict?

ens get ehandle attribute

e.get(property=,?filters=?,?parameters=?)

e.get(attribute)

e[property/attribute]

e.property/attribute

Ens.Get(data,property=,?filters=?,?parameters=?)

Ens.Get(data,attribute)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

Examples:

ens get $ehandle {M_WEIGHT A_ELEMENT}

yields a nested list with two elements. The first element is a list of the molecular weights of all molecules in the ensemble. The second element is a list of the element numbers of all atoms in the ensemble. If the information is not yet available, an attempt is made to compute it. If the computation fails, an error results.

ens get $ehandle B_ORDER ringbond

gives the bond orders of all bonds of the ensemble which are ring bonds.

The format of the optional parameter list argument is a series of keyword/value pairs, as produced by the Tcl command array get or the standard Tcl dictionary commands. If this parameter list is present as an argument, and the requested property data is already valid for the ensemble, a check is made whether all the specified parameters are the same as the parameters the present property data was computed with. If this is the case, the values are directly returned as usual. Otherwise, the data is discarded and re-computed.

If computation of the property data is performed, either because the parameter set was not matched, or the requested data was not valid, the computation integrates the specified parameter set into the parameters of the computation function. Parameters from the list temporarily override the global settings of these parameters in the property definition. Parameters used by the property computation function but not listed in the local parameter list are neither used for data validity checking, nor their value changed during the computation request. After the computation finishes, the old global parameter settings of the property definition are restored.
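The interplay of cached data, local parameter overrides and global settings can be modelled in plain Python. The class below is an illustrative sketch of the described behavior, not the toolkit's implementation: PropertyModel, render and the parameter names are hypothetical.

```python
class PropertyModel:
    """Hypothetical model of one property with global parameters and a data cache."""

    def __init__(self, compute, defaults):
        self.compute = compute             # function(params) -> value
        self.globals = dict(defaults)      # global parameter settings
        self.cache = None                  # (value, parameters used) or None

    def get(self, overrides=None):
        overrides = overrides or {}
        if self.cache is not None:
            value, used = self.cache
            # Only the explicitly listed parameters are checked
            if all(used.get(k) == v for k, v in overrides.items()):
                return value
        saved = dict(self.globals)
        self.globals.update(overrides)     # temporary local override
        try:
            value = self.compute(self.globals)
            self.cache = (value, dict(self.globals))
        finally:
            self.globals = saved           # restore global settings
        return value


def render(params):
    """Stand-in computation function that reports the effective size."""
    return "%dx%d" % (params["width"], params["height"])


image = PropertyModel(render, {"width": 100, "height": 100})
first = image.get({"width": 200})          # computed with width overridden
second = image.get({"width": 200})         # parameters match, cache is reused
```

After both calls the global width is still 100. The height parameter was never listed, so it is neither checked against the cache nor changed.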

The use of a parameter list argument is primarily useful only if a single property is requested with this command, but its use with a multiple-property request is not illegal - the parameter list is simply applied to all properties in sequence.

Example:

ens get $ehandle E_GIF {} [dict create width 200 height 200 bgcolor white]

Variants of the ens get command are ens new, ens dget, ens jget, ens jnew, ens jshow, ens nget, ens show, ens sqldget, ens sqlget, ens sqlnew, and ens sqlshow.

Further examples:

ens get $ehandle E_NAME

ens get $ehandle A_FLAGS(boxed)

In addition to property data, the ensemble object possesses a few attributes, which can be retrieved with the ens get command (but not by its related sister subcommands like ens dget, ens sqlget, etc.). Some of them are also modifiable via ens set. These attributes are:

coords
If the toolkit was compiled with factory support, these are the coordinates of the object icon on its workbench, encoded as integer pair. This attribute can be changed.
deletable
Flag indicating whether this object can be deleted with a standard ens delete command. This attribute is read-only. Objects which are, for example, property data values or a part of a molfile loop command cannot be deleted by standard means.
failures
If the property computation failure cache is active, return a list of all properties which have failed computation for this ensemble after the last structural change. This attribute is read-only.
footer
If the toolkit was compiled with factory support, this is the footer of the object icon on a workbench. This attribute can be changed.
gflags
If the toolkit was compiled with factory support, this is the currently set object icon rendering flag collection.
header
If the toolkit was compiled with factory support, this is the header of the object icon on a workbench. This attribute can be changed.
hidden
Flag indicating whether the object is hidden. This is not the same as the invisible state. This attribute is intended to be used for rendering selections. This attribute can be changed.
incomplete
Boolean status flag indicating an aborted input operation during the read of the object from file, which returned the structure intact but without the complete set of associated data. An aborted input may either be the result of an explicitly set input control flag, or of encountering property data which could not be decoded. This attribute is read-only.
invisible
Flag indicating whether the object is invisible. This is not the same as the hidden state. An invisible object is no longer accessible via its handle. This is usually the case for objects which are scheduled for deletion, but still have lingering pointer references. This attribute is read-only.
javaobject
If the toolkit was compiled with JNI support, this attribute reports the memory address of the JNI wrapper class instance, if it exists.
modcount
Object modification count. This attribute is read-only.
mutexcount
The number of recursive mutex locks held for this object. Only supported on Linux.
pyobject
If the toolkit was compiled with Python support, this attribute reports the memory address of the Python wrapper class instance, if it exists. This attribute is read-only.
pyrefcount
If the toolkit was compiled with Python support, this attribute reports the reference count of the Python wrapper class instance, if it exists. This attribute is read-only.
record
The current iterator record (starting with 1) of the ensemble. It is possible to set the value and thus skip or revisit ensemble molecules in the iterator.
refcount
If the Tcl interpreter is using native Cactvs objects instead of string-based major object handles and integer-based minor object labels to identify toolkit objects, this returns the number of Tcl object references active for this ensemble. This attribute is read-only.
scoped
A boolean object visibility control flag. If set, and global control flag ::cactvs(object_scope) is also set, the object is visible only in the Tcl interpreter which set the scope flag and thus claimed it. Object list commands executed in other interpreters omit this object, and attempts to decode its handle in other interpreters will fail. The most common use of this feature is the hiding of persistent chemistry objects in scripted property computation functions.
selected
Flag indicating whether the object is selected. This attribute can be changed.
tooltip
If the toolkit was compiled with factory support, this is the tooltip of the object icon on a workbench. This attribute can be changed.
uuid
An automatically generated UUID globally identifying the object. This attribute is read-only, different for every object, and not dependent on its contents.
x
If the toolkit was compiled with factory support, this is the x coordinate of the object icon on its workbench. This attribute can be changed.
y
If the toolkit was compiled with factory support, this is the y coordinate of the object icon on its workbench. This attribute can be changed.

ens getparam

ens getparam ehandle property ?key? ?default?

e.getparam(property=,?key=?,?default=?)

Retrieve a named computation parameter from valid property data. If the key is not present in the parameter list, an empty string is returned (None for Python ). If the default argument is supplied, that value is returned in case the key is not found.

If the key parameter is omitted, a complete set of the parameters used for computation of the property value is returned in dictionary format.

This command does not attempt to compute property data. If the specified property is not present, an error results.

Example:

ens getparam $ehandle E_GIF format

returns the actual format of the image, which could be gif , png , or various bitmap formats.

ens groups

ens groups ehandle ?filterset? ?filtermode?

e.groups(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the groups the ensemble contains. This is explained in more detail in the section about object cross-references.

Example:

ens groups $ehandle

ens hadd

ens hadd ehandle ?filterset? ?flags? ?changeset?

e.hadd(?filters=?,?flags=?,?changeset=?)

Add a standard set of hydrogens to the ensemble. If the filterset parameter is specified, only those atoms which pass the filter set are processed.

Additional operation flags may be activated by setting the flags parameter to a list of flag names, or a numerical value representing the bit-ored values of the selected flags. By default, the flag set is empty, corresponding to the use of an empty string or none as parameter value. These flags are currently supported:

keepflags
For expert use only. Do not discard min/max values and property scope flags for atom properties when hydrogen is added.
no2dcoords
Do not assign 2D coordinates to the added hydrogens, even if the rest of the atoms in the ensemble have valid 2D coordinates. In any case, 2D coordinates are never added when the ensemble does not already possess valid 2D coordinates.
no3dcoords
Do not assign 3D coordinates to the added hydrogens, even if the rest of the atoms in the ensemble have valid 3D coordinates. In any case, 3D coordinates are never added when the ensemble does not already possess valid 3D coordinates.
noanions
Do not add hydrogen to atoms with a negative formal charge.
noatoms
Do not add hydrogen to atoms without any bonds.
nocations
Do not add hydrogen to atoms with a positive formal charge.
noelements
Do not add hydrogen if the ensemble consists purely of isolated metal atoms, which probably represent the material in elementary form, or as an alloy.
noexcessvalences
Similar to nohighvalences , but hydrogen is not added to any atom which is not in its lowest standard bonded valence state.
nofixatomtext
Do not adjust property A_TEXTLABEL (if present) by removing references to implicit H from it on atoms where hydrogen is added. For example, by default “NHCOOEt” becomes “NCOOEt” after adding an instantiated hydrogen to the nitrogen atom. This reduces confusion on the hydrogen status when rendering all atoms.
nohighvalences
Do not add hydrogen to atoms which already exceed their lowest standard valence minus any formal charge. This option only applies to elements which have a defined lowest standard valence (this is configurable via the element table).
nomemory
Do not remember the added hydrogen atoms as automatically added. Normally, a flag is retained as part of the atom information which distinguishes atoms which were added by automatic processing, such as hydrogen addition, from those which were originally input.
nometals
Do not attempt to add hydrogen to atoms which are metals (as defined in the system element table).
nospecial
Do not perform hydrogen addition to atoms which participate in non-standard bonds (all bonds with B_TYPE not normal ).
protonate
Add a single proton to the first suitable atom. The charge of the atom is increased, and only a single hydrogen is added regardless of the standard number of missing hydrogens. This command does issue the standard property invalidation event for atom and bond changes. In the ensemble command variant, this option is rarely useful. It is supported for compatibility with the atom hadd command.
resetmemory
Reset the origin flag described above for all atoms in the ensemble. All current atoms act as if they were part of the original atom set.

Adding hydrogens with this command, except with the protonate flag set, is less destructive to the property data set of the ensemble than adding them with individual atom create/bond create commands, because many properties are designed to be indifferent to explicit hydrogen status changes, but are invalidated if the structure is changed in other ways.

If the effects of the hydrogen addition step to the validity of the property data set should not be handled according to this standard procedure, it is possible to explicitly generate additional property invalidation events by specifying an event list as the optional last parameter, for example a list of atom and bond to trigger both the atom change and bond change events.

The command returns the number of hydrogens which were added.

Example:

set ehandle [ens create {[C].[C]}]

ens hadd $ehandle

adds a total of eight hydrogens to the two carbon atoms, transforming them into methane.

ens hdup

ens hdup ehandle ?datasethandle? ?position? ?filterset? ?ctonlyflag?

e.hdup(?dataset=?,?position=?,?filters=?,?ctonly=?)

This command is a convenience variant of the ens dup command. It has the same parameters, but also adds a full standard hydrogen set (equivalent to executing an ens hadd $eh command) to the duplicate.

The command arguments are documented in the paragraph on ens dup.

ens hfragment

ens hfragment ehandle atomlist ?datasethandle? ?position?

e.hfragment(atomsequence=,?dataset=?,?position=?)

This command has the same arguments as ens fragment. The only difference is that after the duplication all open valences in the fragment are plugged with hydrogen, as if an ens hadd command had been executed immediately after the fragment creation command.

The command returns the handle or reference of the new ensemble object.

ens hierarchy

ens hierarchy ehandle ?filterlist? ?root?

e.hierarchy(?filters=?,?root=?)

Return the handle or reference of the hierarchy the ensemble is part of. If the ensemble is not a member of a hierarchy, or does not pass all of the optional filters, an empty string (None for Python) is returned. By default, the hierarchy object which directly contains the ensemble is returned. If the root flag is set, the root hierarchy object is reported instead, which is the same only if the hierarchy has only a single level.

Example:

ens hierarchy $ehandle

ens hstrip

ens hstrip ehandle ?flags? ?changeset?

e.hstrip(?flags=?,?changeset=?)

This command removes hydrogens from the ensemble. By default, all hydrogen atoms in the ensemble are removed.

The flags parameter can be used to make the operation more selective. It may be a list of the following flags:

deprotonate
If this flag is set, a single proton is removed from the first suitable atom. This command variant does issue a standard atom and bond change property invalidation event, and it always ends processing after removing the first proton. Proton removal decreases the charge of the atom by one. In the ensemble command variant, this flag is rarely useful - it is supported for compatibility with the atom hstrip command.
keepalphawedge
Keep hydrogen atoms which are bonded to an atom which is at the tip of a wedge bond. This flag excludes the case where the bond to the hydrogen atom is the wedge bond - use the keepwedge flag to cover this case.
keepisotopes
Keep hydrogen atoms which are isotope labels (including enriched/depleted 1H).
keeporiginal
Hydrogen atoms which were not automatically added via a hydrogen addition command are retained. Note that these commands can be run in a mode which does not leave information about automatic addition - hydrogens added this way are not retained.
keepprotons
Keep any molecules which consist only of hydrogen atoms (such as protons, hydride anions, and molecular hydrogen).
keepspecial
If this flag is set, hydrogens which are usually displayed, such as on aldehydes, wedge bonds, carbon triple bonds or hetero atoms are retained.
keepwedge
Keep hydrogens which are at the end of a wedge bond, indicating stereochemistry.
normalize
Normalize the wedge pattern for standard cases, removing excess wedges from hydrogens if the result structure is still stereochemically defined. Hydrogens which lose their wedge in this process are no longer protected by the keepwedge flag.
wedgetransfer
If a hydrogen atom is removed which is at the end of a wedge, the wedge information is saved by transferring the wedge (changing its up/down status if necessary) to an adjacent, surviving bond. This flag has no effect if the keepspecial or keepwedge flags are set. This flag is set by default.

If the flags parameter is an empty string, or none , it is ignored. The default flag value is wedgetransfer - but this default value is overridden if any flags are set!
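
The interaction between the default flag value and explicitly supplied flags is easy to get wrong, so the rule is modelled here as a minimal stand-alone Python sketch. It is not toolkit code: the function effective_hstrip_flags is a hypothetical helper that only mirrors the rule stated above, and it accepts either a whitespace-separated string or a sequence of flag names as an assumption about the input form.

```python
DEFAULT_FLAGS = frozenset({"wedgetransfer"})


def effective_hstrip_flags(flags=None):
    """Return the flag set that would be in effect for a given flags argument.

    Hypothetical model: an omitted, empty, or 'none' argument is ignored and
    the default applies; any other argument replaces the default completely.
    """
    if flags is None:
        names = []
    elif isinstance(flags, str):
        names = flags.split()
    else:
        names = list(flags)
    if not names or names == ["none"]:
        return set(DEFAULT_FLAGS)
    return set(names)


print(effective_hstrip_flags())
print(effective_hstrip_flags(["keeporiginal"]))
print(effective_hstrip_flags("keeporiginal wedgetransfer"))
```

The second call illustrates the pitfall: naming any flag drops wedgetransfer from the effective set, so it has to be listed explicitly if wedge transfer is still wanted.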

If the changeset parameter is specified, the property change events listed in the parameter are triggered after the command.

Hydrogen stripping is not as disruptive to the ensemble data content as normal atom deletion, except when the deprotonate flag is set. The system assumes that this operation is done as part of some file output or visualization preparation. However, if any new data is computed after stripping, the computation functions see the stripped structure, and proceed to work on that reduced structure without knowledge that the structure may contain implicit hydrogens.

The command returns the number of stripped hydrogens.

Example:

ens hstrip $ehandle [list keeporiginal wedgetransfer]

ens hydrogenate

ens hydrogenate ehandle ?filterset? ?changeset?

e.hydrogenate(?filters=?,?changeset=?)

Reduce all bonds in the ensemble to single bonds, except those excluded by the filter set.

If a change set is supplied, its interpretation is the same as in ens hadd.

The command returns the number of added hydrogens.

Example:

ens hydrogenate $eh {!arobond !ccbond}

This reduces all non-aromatic bonds involving hetero atoms to single bonds.

ens image

ens image ehandle ?width? ?height? ?options?

This command generates a Tk image object displaying the ensemble as an icon. The command is only available in toolkit variants which are linked with the portable Tk GUI toolkit library and which are either statically linked with the GD image drawing library, or can load it dynamically. It is currently not supported in the Python interface.

The default image size is 64x64 pixels, but this may be overridden by the width and height parameters. If only width is set, it is also used for the height. The command returns a Tk image handle. These images may for example be placed on Tk canvases as canvas objects, or used on buttons and other GUI objects.

Because of the small size of the images, atoms are not displayed as symbols, but small color-coded squares. This is a command for the implementation of graphical structure-handling applications with icons. For serious structure visualization, use the E_GIF , E_EMF_IMAGE or E_EPS_IMAGE properties.

Additional options may be added by an arbitrary sequence of option/value pairs. Color names can be those registered in the X11 color database, or a numeric specification in the #rrggbb format. These options are currently supported:

-background color
Background color. The default is black.
-border npixels
Thickness of the image border. The default is 5 pixels.
-bordercolor color
Border color. The default is blue.
-cmode none/special/all
Display mode for carbon atoms. The default is special , meaning that only carbon atoms which usually are drawn with a C symbol are displayed as a colored rectangle and not just a bond node. Highlighted atoms are always displayed.
-highlightatom label
Select an atom for highlighting. By default, no atom is highlighted.
-highlightcolor color
Set the highlighting color. The default is chartreuse .
-hmode none/special/all
Display mode for hydrogen atoms. The default is special , meaning that only hydrogen atoms which usually are drawn with an H symbol are displayed as a colored rectangle. Other hydrogen atoms and the bonds leading to them are suppressed. Highlighted atoms are always displayed.
-imagename name
Explicitly set a name for the image. By default, a name of the form imageN (where N is a running number) is automatically generated. It is possible to specify the name of an existing image, which will then be overwritten.
-linecolor color
Color of bond lines and wedges. The default is white .

Images are cached. If an image for the selected ensemble with the same display attributes exists, it is reused.

Example:

set img [ens image $ehandle 80 80 -bordercolor yellow -linecolor blue]

.canvaswin create image 50 50 -image $img

ens index

ens index ehandle

e.index()

Get the position of the ensemble in the object list of its dataset. If the ensemble is not a member of a dataset, -1 is returned.

ens isotopecheck

ens isotopecheck ehandle ?failedatomvariable? ?extended?

e.isotopecheck(variable=,extended=)

Test whether the isotope labels on the atoms of the ensemble, if they exist, are physically reasonable. The command returns the number of failed atoms. If a capture variable is specified, the atom labels or references of these atoms are stored therein. If no isotope labels are set in A_ISOTOPE , the command always reports zero problems.

By default, a smaller isotope table is used which contains only isotopes which are sufficiently long-lived to perform chemistry on. These include naturally occurring isotopes as well as isotopes used for experimental labeling, such as 3H or 14C. If the extended boolean flag is set, a larger table containing all known isotopes of the elements is used.

The isocheck command is an alias.

ens jget

ens jget ehandle propertylist ?filterset? ?parameterdict?

e.jget(property=,?filters=?,?parameters=?)

Ens.Jget(data,property=,?filters=?,?parameters=?)

This is a variant of ens get which returns the result data as a JSON formatted string instead of Tcl interpreter objects. The command is usable only for property data, not attribute retrieval.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens jnew

ens jnew ehandle propertylist ?filterset? ?parameterdict?

e.jnew(property=,?filters=?,?parameters=?)

Ens.Jnew(data,property=,?filters=?,?parameters=?)

This is a variant of ens new which returns the result data as a JSON formatted string instead of Tcl interpreter objects.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens jshow

ens jshow ehandle propertylist ?filterset? ?parameterdict?

e.jshow(property=,?filters=?,?parameters=?)

Ens.Jshow(data,property=,?filters=?,?parameters=?)

This is a variant of ens show which returns the result data as a JSON formatted string instead of Tcl interpreter objects.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens ldup

ens ldup ?ehandlelist?...

Ens.Ldup(?eref/erefsequence?,...)

Duplicate all ensembles in the argument list(s) in default mode.

The return value is a single list (even if multiple source lists are used) of the duplicated ensemble handles or references. If an argument list element is an empty string (or None for Python ), it indicates a missing object, and the output list also receives an empty string element (for Tcl ) or None (for Python ) at its position, without raising an error.

ens lhdup

ens lhdup ?ehandlelist?...

Ens.Lhdup(?eref/erefsequence?,...)

Duplicate all ensembles in the argument list(s) in default mode, and add hydrogens.

ens list

ens list ?filterlist?

Ens.List(?filters=?)

This command returns a list of the ensemble handles currently registered in the application. This list may optionally be filtered by a standard filter list. If the filter operates on ensemble minor objects such as atoms or bonds and not directly on the ensemble object, it is sufficient if a single minor object passes the filter.

Example:

ens list halogen

lists the handles of all ensembles in the application which contain one or more halogen atoms.

ens lock

ens lock ehandle propertylist/objclass/all ?compute?

e.lock(property=,?compute=?)

Lock property data of the ensemble, meaning that it is no longer managed by the standard data consistency manager. The data consistency manager deletes specific property data if anything is done to the ensemble which would invalidate the information. Blocking the consistency manager can be useful when building ensembles from components in a script. Property data remains locked until it is explicitly unlocked.

The property data to lock can be selected by providing a list of the following identifiers:

Property names
Valid property instances on the ensembles or ensemble minor objects are locked. If the boolean compute flag is set, an attempt is made to compute the property if it is not yet present. Otherwise, a request to lock non-existent data is silently ignored. It is not possible to lock individual property fields.
all
All valid ensemble and ensemble sub-object properties are locked. The compute flag is ignored.
ens,atom,bond,...
These are object class identifiers. All property data which is controlled by the ensemble major object and attached to the specified object class is locked.

The lock can be released by an ens unlock command.

The return value is the original ensemble handle or reference.

Example:

set eh [ens create CCC]

ens lock $eh A_SYMBOL 1

ens purge $eh A_ELEMENT

atom set $eh 1 A_query(dsearch) 3

ens unlock $eh A_SYMBOL

In this example, an ensemble is created, and the atom symbol information is locked. Next, the element number property is deleted, and a query attribute is set. Finally, the lock is released. Had the element symbol information not been locked, the ensemble would have become unusable due to an overzealous data consistency manager. Setting query information in property A_query can have an influence on the atom symbol. So the default action of invalidating A_SYMBOL when manipulating A_query is correct. However, in case there is no element information A_ELEMENT , and no atom symbol information A_SYMBOL , the element information is completely lost, and the ensemble becomes unusable. So in this case, locking A_SYMBOL (or alternatively A_ELEMENT ) is required to avoid unexpected side effects of structure editing.

ens loop

ens loop ehandle objvariable ?maxmol? ?offset? body

e.loop(function=,?maxloop=?,?offset=?,?variable=?)

for m in e:

Loop over all molecules in the ensemble, by providing a temporary ensemble duplicate of each found molecule. The handle of the duplicate is stored in the object variable and visible to the loop code.

The loop code cannot delete the duplicate ensemble. It is automatically deleted at the end of each cycle. Changes made to the duplicate molecule are not seen in the base ensemble. It is however possible to explicitly assign data computed on the duplicate ensemble to the base ensemble.

The optional parameters allow more control over which molecules are processed. By default the maxmol parameter is -1, meaning an unlimited number of fragments are processed, and the offset is zero, meaning that processing begins with the first molecule in the molecule list of the base ensemble.

For Tcl scripts, within the loop code, the standard Tcl commands break and continue work as expected.

The Python version of the loop method intentionally has a different argument sequence for convenience. The function argument may either be a multi-line string (similar to the Tcl construct), or a function reference. Functions are called with the reference of the current loop object as single argument, and have their own context frame, so that the specification of a reference variable is not generally useful in that call style, though it is allowed. For string function blocks the code is executed in the local call frame, and the variable with the current object reference is visible locally. Script code blocks must be written with an initial indentation level of zero. Within the Python functions, the normal break and continue commands cannot be used due to scope limitations. Instead, the custom exceptions BreakLoop and ContinueLoop can be raised. These are automatically caught and processed in the loop body handler code.

In Python , there is also an object iterator so that simple loops over ensemble molecules can be written with a for statement. The ensemble object iterator is of the self style (i.e. there is one per ensemble, these are not independent objects), so nesting them is not possible on the same ensemble.

Python object loop constructs and their peculiarities are discussed in more detail in the general chapter on Python scripting.
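
The exception-based loop control used by the Python function call style can be illustrated with a minimal stand-alone Python sketch. This is a model, not the toolkit's loop handler: BreakLoop and ContinueLoop are redefined locally as stand-ins for the toolkit exceptions of the same name, run_loop is a hypothetical driver, plain strings take the place of molecule duplicates, and counting a cycle that raised one of the exceptions as processed is an assumption.

```python
class BreakLoop(Exception):
    """Local stand-in for the toolkit exception of the same name."""


class ContinueLoop(Exception):
    """Local stand-in for the toolkit exception of the same name."""


def run_loop(items, function, maxloop=-1, offset=0):
    """Call function(item) per item, honouring BreakLoop and ContinueLoop."""
    processed = 0
    for item in items[offset:]:
        if maxloop >= 0 and processed >= maxloop:
            break
        processed += 1              # assumption: every started cycle counts
        try:
            function(item)
        except ContinueLoop:
            continue                # skip the rest of this cycle
        except BreakLoop:
            break                   # leave the loop early
    return processed


seen = []


def visit(mol):
    # A function body cannot use 'continue' or 'break' on the caller's loop,
    # so it signals the driver by raising an exception instead.
    if mol == "water":
        raise ContinueLoop
    if mol == "stop":
        raise BreakLoop
    seen.append(mol)


count = run_loop(["ethane", "water", "propane", "stop", "butane"], visit)
print(seen, count)
```

Because the function runs in its own call frame, raising an exception is the only way it can reach the driver's loop, which is why ordinary break and continue statements are not usable there.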

The command returns the number of molecule fragments processed.

Example:

set midx 0

ens loop $ehandle ehdup {

	mol set $ehandle [mol mol $ehandle #$midx] M_MYPROP [ens get $ehdup E_MYPROP]

	incr midx

}

The example loop assigns a custom property where the compute function is only defined for a single-fragment ensemble to the equivalent molecule property in a multi-fragment base ensemble.

ens mask

ens mask ehandle labellist/all property onvalue ?offvalue?

e.mask(objects=,property=,onvalue=,?offvalue=?)

e.mask("all",property=,onvalue=,?offvalue=?)

This command sets property values of a subset of minor objects of one class in the ensemble to a specific value, and optionally resets the values of the same property for all other minor objects of the ensemble which are not selected.

The first argument after the ensemble handle is either a list of object identifiers, or the magic value all . Object identifiers are usually the standard numerical labels, but any construct which identifies an atom, a bond, etc. can be used. The next argument identifies the property. The object identifiers in the previous argument must correspond to the object class of the property, i.e. atom label pairs can only be used if the property is a bond property, but simple numerical labels work for all classes. If data for that property is not present on the ensemble, it is instantiated with the default value. The final one or two arguments must be decodable data values for that property.

If the all object subset identifier is used, all values of the property in the ensemble are set to the onvalue . Any offvalue specification is ignored.

Otherwise, the explicit label list is processed. If an off value is given, all values of the property in the ensemble are first reset to that value. If no off value is specified, no reset is performed and the current values remain valid. Then, all minor objects in the list are looked up from their labels or other identifiers, and their property value is set to the onvalue .

Example:

ens mask $eh [ens atoms $eh carbon] A_COLOR green black

This command sets the A_COLOR property value for all carbon atoms in the ensemble to green, and all other atoms to black. This is shorter and more efficient than explicitly coding a loop of atom set statements.

The command returns the original ensemble handle or reference.
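
The order of operations of the mask command (optional reset to the off value first, then assignment of the on value to the selected objects) can be modelled with a minimal stand-alone Python sketch. It does not call the toolkit: the function mask, the dictionary of label/value pairs, and the use of None to mean that no off value was given are all assumptions made for illustration.

```python
def mask(values, selected, onvalue, offvalue=None):
    """Model of the mask semantics on a dict of label -> property value.

    values is modified in place and also returned.
    """
    if selected == "all":
        # Every value is set to the on value; any off value is ignored.
        for label in values:
            values[label] = onvalue
        return values
    if offvalue is not None:
        # Reset all values first, selected or not.
        for label in values:
            values[label] = offvalue
    for label in selected:
        if label not in values:
            raise KeyError(label)   # unknown object identifier
        values[label] = onvalue
    return values


# Hypothetical four-atom ensemble: atoms 1 and 2 are carbon.
colors = {1: "red", 2: "red", 3: "blue", 4: "blue"}
mask(colors, [1, 2], "green", "black")
print(colors)

# Without an off value the unselected entries keep their current value.
charges = {1: 0, 2: 0, 3: -1}
mask(charges, [1], 1)
print(charges)
```

The first call corresponds to the A_COLOR example above, the second shows that omitting the off value leaves all unselected objects untouched.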

ens match

ens match ehandle ss_ehandle ?matchflags? ?ignoreflags? ?atommatchvar? ?bondmatchvar? ?molmatchvar?

e.match(substructure=,?matchflags=?,?ignoreflags=?,?atommatchvariable=?,?bondmatchvariable=?,?molmatchvariable=?)

Check whether the ensemble matches a substructure. The substructure may be any structure ensemble, and even be in the same ensemble as the primary command ensemble.

The precise operation of the substructure match routine can be tuned by providing a standard set of match flags and feature ignore flags. The default match flag set has set bits for the bondorder , atomtree and bondtree comparison features, and an empty ignore set. If a flag set is specified as an empty string, the default set is used. In order to reset a flag set, an explicit none value must be used. The bit options of the match flag are explained in the documentation of the match ss command.

The command returns boolean 1 for a successful match, 0 otherwise. If an optional atom, bond, or molecule match variable is specified, it is set to a nested list of matching substructure/structure atom, bond or molecule labels ( Tcl ) or references ( Python ). If no match can be found, the variable is set to an empty list. In case only a bond or molecule match variable is needed, an empty string can be used to skip the unused match variable argument positions.

This is a very simple variant of substructure matching. The match ss command provides many more advanced match determination and match processing options.

ens max

ens max ehandle propertylist ?filterset?

e.max(property=,?filters=?)

Get the maximum values of the properties named in the propertylist parameter. The return value of the command is a list of the maximum property values. The objects whose property values are used for the determination of the maximum values may optionally be filtered by a standard filter set. If no objects pass the filter, the result is an empty string.

Example:

ens max $ehandle A_ELEMENT

computes the maximum element number in the ensemble.

ens merge

ens merge ehandle ?ehandle_list?...

e.merge(?eref/erefsequence?,...)

Merge a set of ensembles into one ensemble. All structure information is accumulated in the first (base) ensemble. Its handle remains unchanged. All other ensembles are destroyed. It is not possible to name an ensemble more than once in the argument lists, and ensembles cannot be merged with themselves.

The merged ensemble has a consistent property set for all minor objects. If the information content of the input ensembles varies, an attempt is made to compute the missing information for ensembles which do not have valid data for each individual property. If the computation fails, the property data is discarded for all merged objects. In addition, a merge property invalidation event is issued, which may lead to additional loss of property data. For surviving properties which have defined a merge update function, this function is then called and may perform additional data adjustments. For example, the A_XY 2D plot coordinate property merge function transforms the structure plot coordinates in the new ensemble to a uniform scale and arranges the coordinates for the atoms from the merged ensembles as a sequence of plots from left to right.

The return value of this command is a list of the new first atom labels or references for every merged ensemble, excluding the base ensemble. All minor object labels in the merged ensembles are re-assigned to avoid collisions. The new labels begin with the highest respective minor object label in use in the base ensemble plus one, and are thereafter assigned in sequence. In case an empty ensemble was merged, the list contains an empty string ( Tcl ) or None ( Python ) at its merge position.

The ens add command performs the same operation as the ens merge command, but merges duplicates of the input ensembles, thus preserving them.

Example:

ens merge [ens create CC] [list [ens create CCC.CCCC] [ens create C]]

Merge three ensembles into one. The new ensemble contains the molecules ethane, propane, butane and methane in that order.
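
The label re-assignment rule and the shape of the return value can be modelled with a minimal stand-alone Python sketch. It does not call the toolkit: the function merge_first_atom_labels is a hypothetical helper, only atom labels are modelled, and the atom counts in the usage lines are invented (they count heavy atoms only, which is an assumption made for brevity).

```python
def merge_first_atom_labels(base_labels, merged_atom_counts):
    """Return the new first atom label for every merged ensemble.

    base_labels: atom labels in use in the base ensemble.
    merged_atom_counts: number of atoms in each ensemble merged into it.
    An empty merged ensemble yields None at its position.
    """
    next_label = max(base_labels, default=0) + 1   # highest base label plus one
    first_labels = []
    for count in merged_atom_counts:
        if count == 0:
            first_labels.append(None)
            continue
        first_labels.append(next_label)
        next_label += count                        # labels assigned in sequence
    return first_labels


# Base ensemble with atoms 1 and 2, then three merged ensembles with
# seven atoms, no atoms, and one atom respectively.
print(merge_first_atom_labels([1, 2], [7, 0, 1]))
```

The returned list has one entry per merged ensemble and none for the base ensemble, whose labels are left unchanged.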

ens metadata

ens metadata ehandle property ?field ?value??

e.metadata(property=,?field=?,?value=?)

Obtain property metadata information, or set it. The handling of property metadata is explained in more detail in its own introductory section. The related commands ens setparam and ens getparam can be used for convenient manipulation of specific keys in the computation parameter field. Metadata can only be read from or set on valid property data.

Valid field names are bounds , comment , info , flags , parameters and unit .

Examples:

array set gifparams [ens metadata $ehandle E_GIF parameters]

ens metadata $ehandle E_NAME comment "This is a CAS name in 1995 revision. The IUPAC name, or any previous or later CAS revision name, look completely different."

The first line retrieves the computation parameters of the property E_GIF as keyword/value pairs. These are read into the array variable gifparams , and may subsequently be accessed as $gifparams(format) , $gifparams(height) , etc. The second example shows how to attach a comment to a property value.

ens min

ens min ehandle propertylist ?filterset?

e.min(property=,?filters=?)

Get the minimum values of the properties named in the propertylist parameter. The return value of the command is a list of the minimum property values. The objects whose property values are used for the determination of the minimum values may optionally be filtered by a standard filter set. If no objects pass the filter, the result is an empty string.

Example:

ens min $ehandle A_FORMAL_CHARGE xatom

This returns the lowest formal charge value found on any hetero atom in the ensemble.

ens mols

ens mols ehandle ?filterset? ?filtermode?

e.mols(?filters=?,?mode=?)

Standard cross-referencing command to obtain the label(s) of the molecules the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

Examples:

ens mols $ehandle

ens mols $ehandle heterocycle

The first example simply returns a list of the labels of the molecules the ensemble contains as minor objects. Note that it is possible that there is more than one molecule in the ensemble - this is the reason why the command name is mols , not mol . The second example returns the molecule label(s) of all the molecules in the ensemble which contain one or more heterocycles. If there are no such molecules, an empty list is returned.

ens move

ens move ehandle ?datasethandle|remotehandle? ?position?

e.move(?target=?,?position=?)

Make the ensemble a member of a dataset, or remove it from a dataset. If the dataset handle or reference parameter is omitted, or is an empty string, or None for Python , the object is removed from its current dataset. The dataset handle or reference may be the name of a remote dataset for moving objects over a network connection.

If a target dataset handle or reference is specified, the ensemble is added to the dataset, if allowed by the acceptance bits of the dataset, and removed from any dataset it was a member of before the execution of the command. By default the ensemble is added to the end of the dataset object list, but the final optional parameter allows the specification of an object list index. The first position is index zero. If the parameter value end is used, or the index is bigger than the current number of dataset objects minus one, the ensemble is appended as per the default. It is legal to use this command for moving ensembles within the same dataset.

Another special position value is random or rnd . This value moves the object to a random position in the dataset. Using this mode with remote datasets is currently not supported.
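The position rules can be sketched with ordinary Python lists standing in for datasets. This is a simplified model, not toolkit code: the function name move and its arguments are hypothetical, acceptance bits and remote datasets are ignored, and negative positions are rejected because their behavior is not described here.

```python
import random


def move(obj, source, target=None, position="end", rng=random):
    """Move obj from list `source` into list `target`; return the old dataset."""
    index = None
    if target is not None and position not in ("end", "random", "rnd"):
        index = int(position)
        if index < 0:
            raise ValueError("negative positions are not modeled in this sketch")
    # The object leaves its previous dataset first (also for same-dataset moves).
    if source is not None and obj in source:
        source.remove(obj)
    if target is None:
        return source  # plain removal from the current dataset
    if position in ("random", "rnd"):
        index = rng.randint(0, len(target))  # any valid insertion slot
    elif index is None or index > len(target) - 1:
        index = len(target)  # "end" or an index past the end: append
    target.insert(index, obj)
    return source


dataset = ["a", "b", "c"]
move("x", None, dataset, 0)        # insert as the first element
front = list(dataset)
move("x", dataset, dataset, 99)    # same dataset, index past the end
back = list(dataset)
previous = move("x", dataset)      # remove from the dataset again
```

The return value mirrors the documented behavior: the dataset the object belonged to before the move, or None if it had none.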

The dataset handle cannot be a transient dataset.

The return value of the command is the dataset of the object prior to the move operation. It is either a dataset handle/reference, or an empty string ( Tcl ) or None ( Python ) if it was not a member of a dataset.

This command interacts with the insert control mechanism of size-constrained datasets. More information is provided in the description of the sizecontrol dataset parameter.

Examples:

ens move $ehandle $dhandle 0

ens move $ehandle

In the first example, the ensemble is inserted as the first element in a dataset. The second line reverts this operation and removes the ensemble from the dataset.

This command can be used with a remote dataset descriptor. In that case, the ensemble is packed into a serialized object representation, transmitted over the network and restored as member of the remote dataset at the specified position. The local ensemble is deleted if the transfer succeeds.

Example:

ens move $ehandle blockbuster@server2:9998 end

This command moves the ensemble to the dataset which was set up as a listener on port 9998 with pass phrase blockbuster on host server2 . The local ensemble is deleted, and its copy is inserted at the end of the remote dataset.

ens mutex

ens mutex ehandle mode

e.mutex(mode)

Manipulate the object mutex.

During the execution of a script command, the mutex of the major object(s) associated with the command are automatically locked and unlocked, so that the operation of the command is thread-safe. This applies to toolkit builds that support multi-threading, either by allowing multiple parallel script interpreters in separate threads or by supporting helper threads for the acceleration of command execution or background information processing.

Going beyond this automatic per-statement protection, this command locks major objects for a period of time that exceeds a single command. A lock on the object can only be released from the same interpreter thread that set the lock. Any other threaded interpreter or auxiliary thread that accesses a locked object blocks until a mutex release command has been executed. This command supports the following modes:

lock
Increase the recursive mutex lock count on the object. The command returns the current lock count after the command, excluding the transient single-command lock.
reset
Release all persistent locks on the object, if they exist.
test
Return the current persistent lock count on the object. This excludes the transient per-command lock.
unlock
Decrease the recursive lock count on the object. The command returns the current lock count after the command, excluding the transient single-command lock. Unlocking an object which has not been persistently locked results in an error.

There is no trylock command variant because the command already needs to be able to acquire a transient object mutex lock for its execution.

The command returns the current lock count.
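The bookkeeping of the persistent lock count can be modeled as follows. This is a single-threaded sketch of the counter only, under the assumption that the four modes behave exactly as listed above; the class name PersistentLockCount is hypothetical, and neither the blocking of other threads nor the transient per-command lock is modeled.

```python
class PersistentLockCount:
    """Model of the recursive, persistent lock count of one object."""

    def __init__(self):
        self.count = 0

    def mutex(self, mode):
        if mode == "lock":
            self.count += 1
        elif mode == "unlock":
            if self.count == 0:
                # Unlocking an object without a persistent lock is an error.
                raise RuntimeError("object is not persistently locked")
            self.count -= 1
        elif mode == "reset":
            self.count = 0  # release all persistent locks, if any
        elif mode != "test":
            raise ValueError("unknown mode: %s" % mode)
        return self.count  # every mode reports the current count


obj = PersistentLockCount()
counts = [obj.mutex(m) for m in ("lock", "lock", "test", "unlock", "reset")]
```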

ens need

ens need ehandle propertylist ?mode? ?parameterdict?

e.need(property=,?mode=?,?parameters=?)

Standard command for the computation of property data, without immediate retrieval of results. This command is explained in more detail in the section about retrieving property data.

The return value is the original ensemble handle or reference.

Examples:

ens need $ehandle A_XY recalc

ens need $ehandle E_EINECS_ID threaded

ens new

ens new ehandle propertylist ?filterset? ?parameterdict?

e.new(property=,?filters=?,?parameters=?)

Ens.New(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens new is that the latter forces the re-computation of the property data, regardless of whether it is present and valid, or not.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens nget

ens nget ehandle propertylist ?filterset? ?parameterdict?

e.nget(property=,?filters=?,?parameters=?)

Ens.Nget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens nget is that the latter returns numeric data, even if symbolic names for the values are available.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens nnew

ens nnew ehandle propertylist ?filterset? ?parameterdict?

e.nnew(property=,?filters=?,?parameters=?)

Ens.Nnew(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data and attributes. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens nnew is that the latter always returns numeric data, even if symbolic names for the values are available, and that property data re-computation is enforced.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens nitrostyle

ens nitrostyle ehandle style

e.nitrostyle(style=)

Change the internal encoding of nitro groups and similar functional groups in the ensemble. Possible values for the style parameter are:

asis
No change.
ionic
Change the encoding to a positive charge on the center atom, and a negative charge on one of the oxygens.
xionic
As above, but also change the encoding of azides, etc.
neutral
Change the encoding to the neutral form with extended valence. pentavalent is an alias.
xneutral
As above, but also change the encoding of azides, etc.

The command returns the original ensemble handle or reference.

ens op2d

ens op2d ehandle mode ?atomfilter_bit/degrees?

e.op2d(mode=,?atomfilter=?)

Perform various operations on the standard 2D layout coordinates of the structure (property A_XY ). Properties tightly connected to A_XY are also updated (most notably, B_FLAGS to keep wedges in sync with stereochemistry defined in other properties).

In mode rotate , the optional argument is the rotation angle in degrees. If it is not specified, the default is 30 degrees.

For alignment and flipping operations, the atoms which are used to determine the orientation can be filtered by specifying one or more value bits of property A_FLAGS . Only atoms where one or more of these bits are set in A_FLAGS are used for computing the alignment (in modes xalign , yalign , xyalign - all atoms are moved) or are flipped (modes hflip , vflip - unselected atoms are not moved). If no bit filter values are specified, or none is used, all ensemble atoms and bonds are processed.

The following modes are supported:

rotate
Rotate the 2D structure coordinates counterclockwise.
hflip
Perform a horizontal flip around the X axis, while maintaining stereochemistry.
vflip
Perform a vertical flip around the Y axis, while maintaining stereochemistry.
xalign
The largest eigenvector of the unweighted XY coordinates of the selected atoms is aligned with the X axis.
xyalign
The largest eigenvector of the unweighted XY coordinates of the selected atoms is aligned with the XY diagonal.
yalign
The largest eigenvector of the unweighted XY coordinates of the selected atoms is aligned with the Y axis.
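The geometry behind the rotate and xalign modes can be sketched in plain Python. This is an illustration of the underlying math only, not the toolkit's implementation: the function names are hypothetical, the rotation center is assumed to be the coordinate average, a mathematical y-up coordinate system is assumed for "counterclockwise", and the atom filter and wedge updates are not modeled.

```python
import math


def rotate2d(points, degrees=30.0, center=None):
    """Rotate (x, y) points counterclockwise around center (default: average)."""
    if center is None:
        n = len(points)
        center = (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
    cos_t = math.cos(math.radians(degrees))
    sin_t = math.sin(math.radians(degrees))
    cx, cy = center
    return [(cx + (x - cx) * cos_t - (y - cy) * sin_t,
             cy + (x - cx) * sin_t + (y - cy) * cos_t) for x, y in points]


def principal_angle(points):
    """Angle in degrees between the X axis and the largest eigenvector of the
    unweighted coordinate scatter matrix."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))


def xalign(points):
    """Rotate all points so that the principal axis lies along the X axis."""
    return rotate2d(points, -principal_angle(points))


diagonal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
aligned = xalign(diagonal)
```

For the three points on the diagonal, the principal axis is at 45 degrees, so xalign rotates them by -45 degrees and all resulting y coordinates coincide.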

Additionally, the mode argument may be an ensemble handle or reference. In that case, it is interpreted as a substructure, matched onto the ensemble, and if a match is found, the 2D coordinates of the ensemble atoms are adjusted by scaling and rotation for maximum overlap between the 2D coordinates of the substructure and the matched part of the ensemble. This mode retains the relative positions of the matched atoms - this is not a full redraw operation around a match template.

The command returns 0 (nothing done) or 1 (coordinates changed).

ens pack

ens pack ehandle ?maxsize? ?requestprops? ?suppressedprops? ?compressionlib?

e.pack(?maxsize=?,?requestprops=?,?suppressedprops=?,?compressionlib=?)

Pack the ensemble object into a base64-encoded compressed serialized object string. This string does not contain any non-printable characters and is a full dump of the internal state of the object, omitting only property data that was declared to be so easily re-computed that a dump is not worthwhile. Outside object relationship information, such as the dataset the ensemble might be a member of, or tables the ensemble is associated with, are not included.

The maximum size of the object string (default -1, meaning unlimited) can be configured by the optional maxsize parameter. The size is specified in bytes. If the pack string would be longer than the maximum size, an error results.

The two optional property list parameters make it possible to request a specific property set to be part of the package, even if it normally would not be included, and to explicitly omit properties from the dump. No property computation is performed, and suppressed properties are not purged from the source ensemble.

Ensembles can be restored from a packed object string by the ens unpack and ens create commands.

The ensemble object and its minor objects are unchanged after using this command.

The default compression library is zlib . Other useful variants include lzo and gzip (and there are other internal types), but these may not be available on all builds due to license issues, and the compression library must then also be specified when the packed object is unpacked. It is generally recommended to stay with zlib .

The return value of this command is the packed string.

In Python , ensembles support the standard pickle / unpickle protocol.

Example:

set dbstring [ens pack [ens create CC=O]]
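The general shape of such a pack string (serialize, compress, then base64-encode, with an optional size limit) can be illustrated with the Python standard library. This is an analogy only: the toolkit's serialization format is internal and is not JSON, and the function names pack and unpack below are hypothetical stand-ins operating on a plain dictionary.

```python
import base64
import json
import zlib


def pack(state, maxsize=-1):
    """Serialize, zlib-compress and base64-encode a plain data structure."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    packed = base64.b64encode(zlib.compress(raw)).decode("ascii")
    # A negative maxsize means unlimited; otherwise an oversized string fails.
    if maxsize >= 0 and len(packed) > maxsize:
        raise ValueError("pack string exceeds maxsize")
    return packed


def unpack(packed):
    """Reverse the three steps of pack()."""
    return json.loads(zlib.decompress(base64.b64decode(packed)).decode("utf-8"))


state = {"smiles": "CC=O", "properties": {"name": "acetaldehyde"}}
packed = pack(state)
restored = unpack(packed)
```

Because the last step is base64 encoding, the result contains only printable ASCII characters and can be stored in text columns or passed through text-only channels.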

ens pis

ens pis ehandle ?filterset? ?filtermode?

e.pis(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the π systems the ensemble contains. This is explained in more detail in the section about object cross-references.

Examples:

ens pis $ehandle

π systems are a rather exotic feature and not commonly used. These are essentially descriptions of bonding interactions which use p or d orbitals, such as in standard covalent multiple bonds. A simple double bond is described with one σ system and one π system in this representation.

ens prepare

ens prepare ehandle molfilehandle

e.prepare(molfileref)

Prepare the ensemble for output via the specified file handle, for example by pre-computing properties that are needed for output. This has only an effect if the I/O module for the format of the file handle provides an output object preparation function, which is currently only the case for the BDB database format. The output of prepared and unprepared ensembles sent to the same file handle is indistinguishable.

The purpose of this command is to allow the preparation of the ensembles for output in a separate thread. For unprepared ensembles, a significant part of the time to write the record may be spent in computing required data. During this time, the file handle is blocked. Prepared ensembles already contain all required data, and are thus faster to write to file. The total time required in single-thread scripts for a simple molfile write command vs. an ens prepare plus molfile write combo is not much different. However, these operations are largely independent, and in multi-threaded scripts the total time savings can be significant if the two commands are executed in different threads.

The command returns the molfile handle or reference.

ens properties

ens properties ehandle ?pattern? ?noempty?

e.properties(?pattern=?,?noempty=?)

Get a list of valid properties of the ensemble and its minor objects. Property subsets may be selected by a non-empty filter pattern, which the property names must match in order to be listed. If the ensemble is a member of a reaction, reaction properties are included in the list. The same mechanism is used for dataset properties.

If the noempty flag is set, only properties where at least one data element controlled by the ensemble (i.e. a value for an atom of the ensemble, etc.) is not the property default value are output. By default, the filter pattern is an empty string, and the noempty flag is not set.

This command may also be invoked as ens props or e.props() .

Examples:

ens properties $ehandle X_*

ens props $ehandle

The first example returns a list of the currently valid reaction properties of the reaction the ensemble is a member of, or an empty list if it is not. The second example lists all properties, including those of the ensemble proper, its minor objects such as atoms and bonds, and possibly of the reaction the ensemble is a member of, if it is a reaction ensemble.

ens purge

ens purge ehandle propertylist/objectclass/specialname ?emptyonly?

e.purge(?properties=?,?emptyonly=?)

Delete property data from the ensemble. The properties may either be properties of a reaction the ensemble is a member of (prefix X_ ), properties of a dataset the ensemble is a member of (prefix D_ ), or properties of the ensemble proper and its minor objects, such as ensemble or atom properties. If a property marked for deletion is not present, it is silently ignored.

If an object class name, such as ens or atom , is used instead of a property name, all properties of that class set on the ensemble are deleted, if they are not locked, or filtered out by the optional empty-only flag.

Setting the optional boolean flag emptyonly restricts the deletion to those properties where all the values for a property associated with a major object (such as on all atoms in an ensemble for atom properties, or just the single ensemble property value for ensemble properties) are set to the default property value.
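The effect of the emptyonly flag can be modeled with a dictionary of per-object value lists. This is a sketch under simplifying assumptions, not toolkit code: the function name purge_model is hypothetical, property locking is ignored, and the default values used below are illustrative.

```python
def purge_model(data, defaults, properties, emptyonly=False):
    """Delete the named properties from `data` (name -> list of values)."""
    for name in properties:
        values = data.get(name)
        if values is None:
            continue  # properties that are not present are silently ignored
        if emptyonly and any(value != defaults[name] for value in values):
            continue  # at least one non-default value: the property survives
        del data[name]
    return data


data = {"A_ISOTOPE": [0, 0, 0], "A_FORMAL_CHARGE": [0, 1, 0]}
defaults = {"A_ISOTOPE": 0, "A_FORMAL_CHARGE": 0}
purge_model(data, defaults, ["A_ISOTOPE", "A_FORMAL_CHARGE", "A_MISSING"], emptyonly=True)
```

Only A_ISOTOPE is removed in this run, because one atom carries a non-default formal charge.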

Besides normal property names, a few convenient special names for common property deletion tasks are defined and can be used as a replacement for the property list. These include:

atomquery
Delete atom query properties ( A_QUERY and any other atom query property).
atomstereochemistry
Delete all atom stereo descriptors, but keep those for bonds.
bondquery
Delete bond query properties ( B_QUERY and any other bond query property).
bondstereochemistry
Delete all bond stereo descriptors, but keep those for atoms.
isotopes
Delete isotope information in A_ISOTOPE and other isotope properties which may be defined in future software versions.
query
Delete query information ( A_QUERY and B_QUERY , and any other query property).
radicals
Delete atomic radical information in A_RADICAL and other radical-related properties which may be defined in future software versions.
stereochemistry
Delete all stereochemistry descriptors, including 2D wedges, but not 3D coordinates. The implicit property list includes A_LABEL_STEREO, B_LABEL_STEREO, A_CIP_STEREO, B_CIP_STEREO, A_DL_STEREO, B_CISTRANS_STEREO, A_HASH_STEREO, B_HASH_STEREO, A_MAP_STEREO, B_MAP_STEREO, A_STEREOINFO, B_STEREOINFO, A_STEREO_GROUP, M_STEREO_COUNT, E_STEREO_COUNT and B_FLAGS (only selected bits, the property remains valid if present).
wedges
Delete wedge bond flags in property B_FLAGS . If B_FLAGS is not present, the command is ignored and no computation attempt is made.

Examples:

ens purge $ehandle X_IDENT

ens purge $ehandle E_IDENT 1

ens purge $ehandle stereochemistry

The first example deletes the property data X_IDENT from the reaction the ensemble is a member of - provided it actually is a reaction ensemble. The second example deletes property E_IDENT from the ensemble if the property value is equal to the default value for E_IDENT . The last example removes all stereochemistry information from the ensemble.

The command returns the original ensemble handle or reference.

ens reaction

ens reaction ehandle ?filterlist?

e.reaction(?filters=?)

Return the handle or reference of the reaction the ensemble is a member of. Optionally, the reaction may be filtered by a simple filter list. If the ensemble is not part of a reaction, or does not pass the filter, an empty string is returned for Tcl , and None for Python .

Because an ensemble can only participate in a single reaction, the command is spelled ens reaction in singular.

Example:

ens reaction $ehandle

ens rebuild

ens rebuild ehandle ?minor_objectclass?

e.rebuild(?objectclass=?)

This command discards all minor objects and attached property data of a specific class associated with the ensemble. Afterwards, the minor object set is re-populated by the standard set-up function of the object class, if such a set-up function is defined.

If no minor object class is specified, bonds are regenerated - for example from 3D atomic coordinates. Bonds , molecules ( mols ), sigma and pi systems ( sigmas , pis ), rings and ring systems ( rings , ringsystems ) can all be rebuilt. However, by default no reconstruction function is defined for groups and surface patches ( surfaces ), although it is possible to set one via the object class manipulation command.

Generally, object sets should only be regenerated under exceptional circumstances, for example in order to undo a manual manipulation. Object sets are automatically generated when they are required - for example, bonds are automatically derived from atomic 3D coordinates if any property data associated with bonds is used in any context, and the ensemble so far did not contain bond information. An explicit request to generate connectivity is rarely needed.

Under normal circumstances, the use of minor object information such as bonds encoded explicitly in an input file is preferable to indirectly derived sets, such as regenerated connectivity. The connectivity algorithm of the toolkit is rather capable, but has its limitations, especially when hydrogen-depleted charged structures are encountered.

Files encoded in a few notorious structure file formats, such as PDB , may contain an incomplete bond set - without any indication that the bond set is incomplete. The PDB input routine tries to detect this, and automatically augments the bond set if obvious deficiencies are found. However, in case of minor omissions in the input data, a PDB structure may be one of the rare cases when an explicit request for a rebuild of the bond set can be helpful.

Besides the set of ensemble minor objects, the pseudo object class aro is also recognized. This keyword triggers a re-evaluation of aromatic systems and re-assigns Kekulé bond orders, but does not completely redo the bond set.

Example:

ens rebuild $ehandle bonds

This command discards the old bond set, and generates a new one. This only works if there is information which can be used for regeneration, such as atomic 3D coordinates. If no such information is present, the loss of bonds is irreversible and the ensemble useless for almost all applications short of a simulated plasma torch atomization.

The command returns the original ensemble handle or reference.

ens ref

Ens.Ref(identifier)

Python-only method to get an ensemble reference from a string handle or another identifier. For ensembles, other recognized identifiers are ensemble references, or integers encoding the numeric part of the handle string.

ens rename

ens rename ehandle srcproperty dstproperty

e.rename(srcproperty=,dstproperty=)

This is a variant of the ens assign command. Please refer to the command description in that paragraph.

ens replace

ens replace ehandle property/enshandle/emptystring ?preserved_propertylist/all?

e.replace(source=,?keep=?)

Substitute the ensemble with a structure decoded from data held in an ensemble property of that ensemble, or with the structure and associated data of another ensemble identified by its handle.

The original handle of the command ensemble is always preserved. The original structure data, with the exception of explicitly saved properties, is discarded. If the structure source argument is an ensemble handle, that ensemble is deleted.

For convenience, the replacement data argument may also be an empty string, which results in a no-op.

If the replacement argument is a property name, the exact type of operation depends on the data type of the property. The following data types are currently supported:

structure
Replace command ensemble directly with the property data ensemble.
string
Try to interpret the string as a structure line notation (as in ens create ).
url
Try to download the file behind the Internet address and read it as a structure file.
blob
Try to read the contents as an in-memory structure file record.
diskfile , mapfile
Try to read it as a single-record structure file.

Any other property data type, NULL values of the property, non-ensemble properties, or malformed data result in an error and the original structure remains unchanged.

The structure source property data does not become a property of the updated ensemble. In that ensemble, by default all other ensemble properties of the original are also purged, and all ensemble properties of the replacement structure are retained. However, by specifying a list of properties to be transferred, or using the special argument all , all or a subset of the ensemble property data of the original ensemble can be transferred to the replacement structure and thus saved. Under these circumstances, property data from the original ensemble has precedence and overwrites existing values of the same property on the replacement ensemble. However, all ensemble property data on the replacement ensemble which are not overwritten remain present in the updated ensemble. It is not possible to transfer atom, bond, or any other ensemble minor object property data to the replacement structure directly with this command.
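The precedence rules for ensemble-level property data can be sketched with dictionaries. This is a model, not toolkit code: the function name merged_properties is hypothetical, and it assumes that the structure source property is dropped even when all is requested, which the text above suggests but does not state explicitly for that case.

```python
def merged_properties(original, replacement, source_property=None, keep=()):
    """Return the ensemble property data of the updated ensemble."""
    # All ensemble properties of the replacement structure are retained.
    result = dict(replacement)
    names = list(original) if keep == "all" else list(keep)
    for name in names:
        if name == source_property:
            continue  # assumption: the source property is never carried over
        if name in original:
            # Data from the original has precedence over the replacement.
            result[name] = original[name]
    return result


original = {"E_IDENT": 7, "E_NAME": "original name", "E_CANONIC_TAUTOMER": "<structure>"}
replacement = {"E_NAME": "tautomer name", "E_SMILES": "CC=O"}
updated = merged_properties(original, replacement,
                            source_property="E_CANONIC_TAUTOMER",
                            keep=["E_IDENT", "E_NAME"])
```

The property values are placeholders; only the way the two property sets are combined is of interest here.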

The command returns the original, unchanged ensemble handle or reference.

Examples:

ens replace $eh E_CANONIC_TAUTOMER [list E_IDENT E_NAME]

This command replaces the current structure with its canonic tautomer. The values of properties E_IDENT and E_NAME from the original ensemble are kept in the updated form, all other ensemble property data of the original is discarded.

ens replace $eh $ehnew

Replace the structure with the one in $ehnew . The second ensemble is destroyed in the process.

ens replicate

ens replicate ehandle ?count?

e.replicate(?count=?)

This command duplicates all molecules in the ensemble and appends them to the atom, bond and other minor object lists of the ensemble.

The default replication count is one, but any other number of duplications may be chosen by an appropriate count parameter. If the count is less than one, the command is silently ignored.

The command returns the original ensemble handle or reference. As part of the integration step, merge property invalidation events are generated.

The ens dup command generates a new ensemble, while this command expands the current ensemble.

Example:

echo [ens get [ens replicate [ens create C.CC]] E_SMILES]

This prints C.CC.C.CC as the resulting SMILES string, because both molecules in the original ensemble were duplicated and appended to the existing ensemble data.
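The effect on the molecule list can be mimicked on the level of dot-separated SMILES fragments. This is a toy model in plain Python, not toolkit code: the function name replicate_smiles is hypothetical, and it assumes that a count of n appends n further copies of the original molecule set.

```python
def replicate_smiles(smiles, count=1):
    """Append `count` copies of all fragments of a dot-separated SMILES."""
    if count < 1:
        return smiles  # counts below one are silently ignored
    fragments = smiles.split(".")
    # Original fragments first, then `count` appended duplicates in order.
    return ".".join(fragments * (count + 1))


once = replicate_smiles("C.CC")
twice = replicate_smiles("C.CC", 2)
```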

ens rings

ens rings ehandle ?filterset? ?filtermode?

e.rings(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the rings the ensemble contains. This is explained in more detail in the section about object cross-references.

Examples:

ens rings $ehandle

ens rings $ehandle [list heterocycle aroring]

The first example returns the labels of all rings the ensemble contains. If the ensemble does not contain any rings, an empty list is returned. Only labels of rings in the SSSR or ESSSR set are returned, even if the currently configured ring set is larger. The second example filters the rings - only heteroaromatic rings are reported.

ens ringsystems

ens ringsystems ehandle ?filterset? ?filtermode?

e.ringsystems(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the ring systems the ensemble contains. This is explained in more detail in the section about object cross-references.

Examples:

ens ringsystems $ehandle

ens ringsystems $ehandle [list heterocycle aroring]

The first example returns the labels of all ring systems the ensemble contains. If the ensemble does not contain any ring systems, an empty list is returned. The second example filters the ring systems - a ring system label is included in the output list only if that ring system contains one or more heteroaromatic rings.

ens rotate

ens rotate ehandle angle axis ?center? ?property?

e.rotate(angle=,axis=,?center=?,?coordinateproperty=?)

Rotate the ensemble in 3D space by manipulating property A_XYZ , or a custom atom float vector coordinate property.

The angle argument is a floating-point number in degrees. The axis argument is a 3D vector in standard notation, i.e. usually a list/tuple of three floating point numbers for the x, y and z components. If the optional center argument is omitted, the center of rotation is the 3D unweighted coordinate average of all ensemble atoms with valid 3D coordinates, which is computed as property E_CENTER . If the center argument is specified, it is expected to be a 3D point which is used as center of rotation instead.

This operation triggers a 3dglop property invalidation event.

The command returns the original ensemble handle or reference.

Example:

ens rotate $eh 60 {0 0 1}

Rotate the ensemble 60 degrees counterclockwise around the z axis.
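The coordinate transformation can be sketched with Rodrigues' rotation formula. This is an illustration of the math only, not the toolkit's implementation: the function name rotate3d is hypothetical, coordinates are plain tuples instead of property data, and a right-handed convention (counterclockwise when looking against the axis direction) is assumed.

```python
import math


def rotate3d(points, angle_degrees, axis, center=None):
    """Rotate (x, y, z) points around an axis through center."""
    if center is None:
        # Default center: unweighted average of all coordinates.
        n = len(points)
        center = tuple(sum(p[i] for p in points) / n for i in range(3))
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0.0:
        raise ValueError("rotation axis must not be the null vector")
    kx, ky, kz = (c / length for c in axis)
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for p in points:
        vx, vy, vz = (p[i] - center[i] for i in range(3))
        dot = kx * vx + ky * vy + kz * vz
        # Cross product k x v.
        wx, wy, wz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        rotated.append((
            center[0] + vx * cos_t + wx * sin_t + kx * dot * (1.0 - cos_t),
            center[1] + vy * cos_t + wy * sin_t + ky * dot * (1.0 - cos_t),
            center[2] + vz * cos_t + wz * sin_t + kz * dot * (1.0 - cos_t),
        ))
    return rotated


turned = rotate3d([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], 90.0, (0.0, 0.0, 1.0))
```

With the two sample points, the default center is the origin, and a 90 degree turn around the z axis maps the point on the positive x axis onto the positive y axis.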

ens scan

ens scan ehandle expression/queryhandle ?mode? ?parameterdict?

e.scan(query=,?resultmode=?,?parameters=?)

Perform a query on the ensemble object. The syntax of the query expression and the optional selection list is the same as that of the dataset scan command with a transient dataset consisting of the current ensemble only. For more details, please refer to the paragraphs on dataset scan and molfile scan .

The return value depends on the mode. The default query mode is exists , which is different from the default in molfile scan .

ens set

ens set ehandle ?property value?...

e.set(property,value,...)

e.set({property:value,...})

e.property = value

e[property] = value

Standard data manipulation command for setting property data. It is explained in more detail in the section about setting property data.

Example:

ens set $ehandle E_NAME "Pharmacon X-25"

ens setparam

ens setparam ehandle property ?key value?...

ens setparam ehandle property dictionary

e.setparam(property,?key,value?...)

e.setparam(property,dict)

Set or update a property computation parameter in the metadata parameter list of a valid property. This command is described in the section about retrieving property data. The current settings of the computation parameters in the property definition are not changed.

The return value is the updated property computation parameter dictionary.

Example:

ens setparam $ehandle E_GIF comment "Top Secret Lead Structure"

ens setup

ens setup ehandle ?minorobjclass?

e.setup(?objectclass=?)

Query the status of the minor object lists in the ensemble, or initialize one of these to an empty list.

If no class is specified, a dictionary with all currently registered minor object classes of the ensemble is returned. The object class names are the key, the value is a boolean flag for the status.

If an object class argument is supplied, the object class is instantiated on the ensemble, if necessary by auto-loading an object class handler module. Unknown object class names result in an error. If the minor object class is already instantiated, it is not changed. Otherwise, an empty minor object set is added. This is even the case if the minor object class handler provides a default object setup function (see the ens rebuild command). Instantiating an object class with this command always creates an empty collection of the minor objects associated with the ensemble.

Minor object lists are usually implicitly instantiated, as in

ens get $eh M_LABEL

which automatically sets up the molecule/fragment object set if it is not yet present, and populates it with objects identifying disconnected fragments in the ensemble, or

group create $eh [list $a1 $a2 $a3]

which adds a group to the ensemble, again automatically initializing the group object set if it was not initialized.

The ens setup command is intended for special circumstances and not commonly used.

ens show

ens show ehandle propertylist ?filterset? ?parameterdict?

e.show(property=,?filters=?,?parameters=?)

Ens.Show(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens show is that the latter does not attempt computation of property data, but raises an error if the data is not present and valid. For data already present, ens get and ens show are equivalent.
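The different computation policies of ens get , ens new and ens show can be contrasted in a small cache model. This is a conceptual sketch in plain Python, not toolkit code: the class PropertyStore, its methods and the property name E_DEMO are hypothetical, and data validity is reduced to mere presence in the cache.

```python
class PropertyStore:
    """Cache of property values with three retrieval policies."""

    def __init__(self, compute_functions):
        self._compute = compute_functions  # property name -> function
        self._data = {}
        self.computations = 0

    def _recompute(self, name):
        self._data[name] = self._compute[name]()
        self.computations += 1
        return self._data[name]

    def get(self, name):
        # Compute only if the data is not yet present.
        if name not in self._data:
            return self._recompute(name)
        return self._data[name]

    def new(self, name):
        # Always recompute, even if data is present.
        return self._recompute(name)

    def show(self, name):
        # Never compute; missing data is an error.
        if name not in self._data:
            raise KeyError("no valid data for property %s" % name)
        return self._data[name]


store = PropertyStore({"E_DEMO": lambda: 42})
try:
    store.show("E_DEMO")
    show_failed = False
except KeyError:
    show_failed = True
store.get("E_DEMO")
store.get("E_DEMO")
after_get = store.computations
store.new("E_DEMO")
after_new = store.computations
```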

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sigmas

ens sigmas ehandle ?filterset? ?filtermode?

e.sigmas(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the σ systems the ensemble contains. This is explained in more detail in the section about object cross-references.

Example:

ens sigmas $ehandle

σ systems are a rather exotic feature and not commonly used. These are essentially descriptions of bonding interactions which use s orbitals, such as normal, covalent single bonds, or the central bond in multiple bonds. A simple double bond is described with one σ system and one π system in this representation.

ens sort

ens sort ehandle ?sort_property? ?relabel? ?duplicate? ?datasethandle? ?position?

e.sort(?property=?,?relabel=?,?duplicate=?,?target=?,?position=?)

Sort the atoms in an ensemble according to a property value. The default property is A_LABEL , the standard atom label. The first optional argument can be used to sort on a different property, or a property field. However, the property must be either an atom property, or a molecule property. If the relabel flag is set, the ensemble atoms and molecules are renumbered after the sort in ascending order, starting with one. By default, atoms and molecules retain their original labels even if they change positions. If the duplicate flag is set, the sort operation works on a duplicate of the original ensemble. If the flag is unset, or the argument omitted, the operation modifies the original ensemble object.

The final two optional arguments allow the direct transfer of the modified ensemble or duplicate into a dataset, similar to an ens move command. The ensemble may be inserted into a specific position of a target dataset. If the special value end is used, or the zero-based position index is beyond the current end of the target dataset, the ensemble is simply appended. By default the ensemble is not moved, and if it is moved without an explicit position, it is appended.

The sequence of the atoms in the ensemble is rearranged so that the atoms are in ascending order of the values of the sort property or property field. Indirectly, molecules are also rearranged to correspond to the sequence of the first atoms in every molecule. This operation triggers a shuffle property invalidation event. If the renumbering option is selected, the atom and molecule sets are re-labeled with their standard label properties (i.e. A_LABEL for atoms, M_LABEL for molecules) in ascending order, starting with one. Other minor object collections remain in their original sequence and retain their current labels. Certain important properties which, if present, are dependent on atom label values, notably A_LABEL_STEREO , B_LABEL_STEREO and B_FLAGS , are specifically adjusted to the new labeling scheme instead of being invalidated.

The command returns an ensemble handle or reference. If the sort was performed on a duplicate, it is the handle or reference of the new ensemble, otherwise that of the original ensemble.
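The difference between keeping and renumbering labels can be pictured with a small stand-alone Python model. It does not call the toolkit: atoms are plain dictionaries, and the names sort_atoms, label, element and charge are invented for this illustration. The sketch relies on Python's stable sort, which is an assumption, since the manual does not say how ties are ordered.

```python
def sort_atoms(atoms, key, relabel=False):
    """Return copies of the atom records ordered by atom[key].

    With relabel=True the labels are renumbered 1..n after the sort,
    otherwise every atom keeps the label it had before.
    """
    # Copy the records so the caller's data stays untouched
    # (comparable to sorting a duplicate of the ensemble).
    ordered = sorted((dict(a) for a in atoms), key=lambda a: a[key])
    if relabel:
        for new_label, atom in enumerate(ordered, start=1):
            atom["label"] = new_label
    return ordered


atoms = [
    {"label": 1, "element": "O", "charge": 0.4},
    {"label": 2, "element": "C", "charge": -0.2},
    {"label": 3, "element": "N", "charge": 0.1},
]

kept = sort_atoms(atoms, "charge")
renumbered = sort_atoms(atoms, "charge", relabel=True)

print([a["label"] for a in kept])        # labels travel with the atoms
print([a["label"] for a in renumbered])  # labels follow the new positions
```

In both results the atom sequence is the same. Only the label values differ.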

ens split

ens split ehandle ?minsize? ?splitproperty?

e.split(?minsize=?,?splitproperty=?)

Split the molecules of the ensemble into individual ensembles. The return value is a list of the handles or references of the new ensembles. If the original structure contains only a single fragment, the result is the same as a simple ens dup command. The split structures do not become members of a reaction or dataset, even if the original structure is one.

The optional minsize parameter is a minimum value for the number of heavy atoms (property M_HEAVY_ATOM_COUNT ) in the molecules. If this is not an empty string, molecules which have fewer heavy atoms than the minimum are not duplicated. If all molecules in the input ensemble are smaller than the required size, an empty list is returned.

The optional splitproperty argument can be used to split the ensemble on values of a molecule property, which needs to be either already set or computable, instead of simply separating fragments on connectivity. All molecules in the input ensemble which have a common value of this property are put into a joint result ensemble, and each distinct split property value starts a new result ensemble. Molecules with a common property value do not need to be present in the input ensemble in a consecutive sequence, nor are there any special requirements for the data type or value range of the split property, as long as the data type has a comparison function. If the values of the split property are distinct over all molecules in the input ensemble, the outcome of the command is indistinguishable from running it without any split property.

Example:

lassign [ens split [ens create "CC.CC"]] eh1 eh2

This example creates an ensemble with two ethane molecules, splits it, and assigns the two new ensemble handles to variables eh1 and eh2 .

set elist [ens split $eh {} M_REACTION_LABEL]

Split ensemble along the original reagent or product data blocks found in an RXN or RDF file.
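The grouping rules of the minsize and splitproperty arguments can be sketched as a stand-alone Python model. It does not use the toolkit: molecules are plain dictionaries, and split_molecules, heavy_atoms and role are hypothetical names. Two points are assumptions rather than documented behavior: result groups appear in the order in which their property value is first seen, and the size filter is applied to each molecule before grouping.

```python
def split_molecules(molecules, minsize=None, splitkey=None):
    """Group molecule records into result 'ensembles' (lists of records)."""
    groups = {}  # dicts keep insertion order, so groups come out first-seen first
    for index, mol in enumerate(molecules):
        if minsize is not None and mol["heavy_atoms"] < minsize:
            continue  # below the size limit: not copied into any result
        # Without a split property every molecule forms its own group.
        group_id = mol[splitkey] if splitkey is not None else index
        groups.setdefault(group_id, []).append(mol)
    return list(groups.values())


mols = [
    {"name": "A", "heavy_atoms": 6, "role": "reagent"},
    {"name": "B", "heavy_atoms": 1, "role": "product"},
    {"name": "C", "heavy_atoms": 3, "role": "reagent"},
]

by_fragment = split_molecules(mols)
by_role = split_molecules(mols, splitkey="role")
large_only = split_molecules(mols, minsize=2)
nothing = split_molecules(mols, minsize=10)

print([[m["name"] for m in g] for g in by_role])
```

Molecules A and C share a role value and end up in one group although they are not adjacent in the input, which mirrors the statement that a consecutive sequence is not required.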

ens sqldget

ens sqldget ehandle propertylist ?filterset? ?parameterdict?

e.sqldget(property=,?filters=?,?parameters=?)

Ens.Sqldget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The differences between ens get and ens sqldget are that the latter does not attempt computation of property data, but initializes the property value to the default and returns that default, if the data is not present and valid; and that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sqlget

ens sqlget ehandle propertylist ?filterset? ?parameterdict?

e.sqlget(property=,?filters=?,?parameters=?)

Ens.Sqlget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens sqlget is that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sqlnew

ens sqlnew ehandle propertylist ?filterset? ?parameterdict?

e.sqlnew(property=,?filters=?,?parameters=?)

Ens.Sqlnew(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The differences between ens get and ens sqlnew are that the latter forces re-computation of the property data, and that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sqlshow

ens sqlshow ehandle propertylist ?filterset? ?parameterdict?

e.sqlshow(property=,?filters=?,?parameters=?)

Ens.Sqlshow(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The differences between ens get and ens sqlshow are that the latter does not attempt computation of property data, but raises an error if the data is not present and valid, and that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.
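The retrieval variants described above differ only in how they treat missing data. The following stand-alone Python model contrasts the four behaviors (compute on demand, raise an error, fall back to the default, force re-computation). It is a sketch, not toolkit code: the class PropertyStore, its method names and the property name weight are hypothetical, and the SQL value formatting of the sql* variants is not modelled.

```python
class PropertyStore:
    """Toy stand-in for an object that caches property values."""

    def __init__(self, compute_functions, defaults):
        self.values = {}
        self.compute_functions = compute_functions
        self.defaults = defaults

    def get(self, name):
        # get style: use valid data, otherwise compute it
        if name not in self.values:
            self.values[name] = self.compute_functions[name]()
        return self.values[name]

    def show(self, name):
        # show style: never compute, fail if no valid data is present
        if name not in self.values:
            raise KeyError("no valid data for " + name)
        return self.values[name]

    def dget(self, name):
        # dget style: never compute, store and return the default instead
        if name not in self.values:
            self.values[name] = self.defaults[name]
        return self.values[name]

    def new(self, name):
        # new style: always recompute, even if valid data is present
        self.values[name] = self.compute_functions[name]()
        return self.values[name]


calls = []

def compute_weight():
    calls.append("computed")  # record every computation
    return 46.07

store = PropertyStore({"weight": compute_weight}, {"weight": 0.0})
try:
    store.show("weight")
    missing = False
except KeyError:
    missing = True            # show refuses to compute
first = store.get("weight")   # computes once
again = store.show("weight")  # now succeeds, no computation
forced = store.new("weight")  # computes a second time

other = PropertyStore({"weight": compute_weight}, {"weight": 0.0})
fallback = other.dget("weight")  # default, no computation
```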

ens subcommands

ens subcommands

dir(Ens)

Lists all subcommands of the ens command. Note that this command does not require an ensemble handle.

ens surfaces

ens surfaces ehandle ?filterset? ?filtermode?

e.surfaces(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of surface patches the ensemble contains. This is explained in more detail in the section about object cross-references.

Example:

ens surfaces $ehandle carbon

This example lists all surface patches which are associated with carbon atoms. Surface patches associated with other atoms, or with no atoms, are not listed.

ens swapin

ens swapin ehandle

e.swapin()

Swap an ensemble from the disk store fully back into memory, and disable further automatic loading and shelving. If the ensemble was not swapped out, the command does nothing.

The command returns the original ensemble handle or reference.

ens swapout

ens swapout ehandle

e.swapout()

Remove most of the ensemble data from memory and store it in a temporary disk store. The ensemble handle remains valid. As soon as it is used in a command again after this command has been executed, the swapped ensemble data is automatically reloaded from file, and then stored again when the object lock is released. To disable the automatic swapping of an ensemble, use the ens swapin command.

This command is intended to be used in cases where a large number of ensembles must be kept in memory. Its routine use is not encouraged - it is only useful if the programmer knows the access patterns of the application. In other cases, the standard virtual memory mechanism of the operating system might yield better performance.

The ensembles are stored as binary blobs in a key/value store in a process-specific swap directory cactvs%d (where %d is replaced by the process ID), which is created automatically in the standard temporary directory. When an ensemble is deleted, its swap record is also removed, if one was created during the lifetime of the ensemble. When a Cactvs application program exits, the swap store as well as the swap directory are automatically deleted, even without explicit deletion of the last set of ensembles in memory. In case of program crashes, however, the swap directory and its contents may survive. If ensemble swapping is used with unstable applications, the temporary directory should be checked from time to time.

The command returns the original ensemble handle or reference.

Example:

ens swapout $ehandle

ens tables

ens tables ehandle ?filterlist?

e.tables(?filters=?)

Return a list of the handles of all table objects the ensemble is associated with. Optionally, the table set may be filtered by a simple filter list. If the ensemble is not related to any table, or none of these tables passes the filter list, an empty string is returned.

This command is only available if the toolkit was compiled with table support.

Example:

ens tables $ehandle

ens taint

ens taint ehandle propertylist/changeset ?purge?

e.taint(property=,?purge=?)

Issue a property data tainting event which acts on the ensemble data.

If the ensemble is a member of a dataset, the dataset and its objects are not tainted.

The event list may contain any number of the following items:

A property name.
In that case, all properties which depend on the specified one are invalidated. If the optional purge parameter flag is also set, the specified property itself is also deleted. By default the self-deletion flag is not set.
An object class
All properties which are sensitive to changes in the object class collection associated with the target ensemble are deleted. Example:

ens taint $eh atom

This deletes all properties which are sensitive to changes in the atom make-up of the ensemble.

2dop
All properties which are dependent on 2D layout coordinates are invalidated.
3drelative
All properties which are dependent on relative inter-atomic 3D atomic coordinate changes are invalidated.
3dabsolute
All properties which are dependent on absolute 3D atomic coordinate changes are invalidated.
dup
All properties which do not survive duplication of the underlying object are invalidated.
hadd
All properties which are sensitive to hydrogen addition or deletion via dedicated hydrogen processing commands, which do not trigger the default atom and bond change events associated with atom addition or deletion and bond changes, are purged.
merge
All properties which are invalidated by merging ensembles are invalidated.
shuffle
All properties which are dependent on the order of minor objects in the ensemble are purged.
stereo
All properties which are invalidated by stereo changes are dropped.

The command returns the original ensemble handle or reference.
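The effect of tainting a named property, with and without the purge flag, can be pictured with a stand-alone Python model of a property dependency table. This is a sketch under assumptions: the function taint, the property names and the dependency table are invented for illustration, and following dependencies transitively is an assumption about what "depend on" covers.

```python
def taint(valid, depends_on, changed, purge=False):
    """Return the set of properties that are still valid after tainting.

    valid      -- names of currently valid properties
    depends_on -- maps a property name to the names it is computed from
    changed    -- the property named in the taint event
    purge      -- if true, the named property itself is dropped as well
    """
    doomed = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for prop, sources in depends_on.items():
            if current in sources and prop not in doomed:
                doomed.add(prop)       # depends on something that changed
                frontier.append(prop)  # its own dependents go as well
    if purge:
        doomed.add(changed)
    return set(valid) - doomed


depends_on = {
    "surface": {"coords"},
    "volume": {"surface"},
    "name": set(),
}
valid = {"coords", "surface", "volume", "name"}

after_taint = taint(valid, depends_on, "coords")
after_purge = taint(valid, depends_on, "coords", purge=True)

print(sorted(after_taint))
print(sorted(after_purge))
```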

ens torsions

ens torsions ehandle ?filterset? ?filtermode?

e.torsions(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the torsion objects the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

ens transfer

ens transfer ehandle propertylist ?targethandle? ?targetpropertylist?

e.transfer(properties=,?target=?,?targetproperties=?)

Copy property data from one ensemble to another ensemble or other major object, without going through an intermediate scripting language object representation, or dissociate property data from the ensemble. If a property in the argument property list is not already valid on the source ensemble, an attempt is made to compute it.

If a target object is specified, and a property is not an ensemble but an ensemble minor object property, the number of property-associated minor objects is usually expected to be the same in both ensembles, and expected to have the same label set, though it is not required that they are in the same sequence. Property data is assigned to the target ensemble minor objects with the minor object label as reference key. In case of a label set or object count mismatch between the two ensembles, no error is raised. Excess source data items are discarded, and excess target minor objects, or those with unmatched labels, retain their original value if the property was present on the target, or are set to the default value if the property was freshly instantiated. In this command mode, the return value is the handle of the target ensemble. Source and target ensembles cannot be the same object.
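The label-keyed assignment and its mismatch rules can be modelled in a few lines of stand-alone Python. The function transfer_by_label and its arguments are hypothetical and do not correspond to toolkit calls. A target_values argument of None stands for a property that is freshly instantiated on the target.

```python
def transfer_by_label(source_values, target_labels, target_values=None, default=0):
    """Assign source data to target objects, using the object label as key.

    source_values -- maps source object label to its property value
    target_labels -- labels of the target objects, in target order
    target_values -- existing target data (label -> value), or None if the
                     property did not exist on the target before
    """
    result = {}
    for label in target_labels:
        if label in source_values:
            result[label] = source_values[label]  # matched label: copy value
        elif target_values is not None:
            result[label] = target_values[label]  # property existed: keep old value
        else:
            result[label] = default               # fresh property: default value
    # Source labels that do not occur in target_labels are silently dropped.
    return result


source = {1: 0.1, 2: 0.2, 5: 0.5}
fresh = transfer_by_label(source, [2, 1, 3])
existing = transfer_by_label(source, [2, 1, 3], {1: 9.0, 2: 9.0, 3: 9.0})

print(fresh)
print(existing)
```

The source item with label 5 has no counterpart and is discarded, while target object 3 either keeps its old value or receives the default.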

If a target property list is given, the data from the source is stored as content of a different property on the target. For this, the data types of the properties must be compatible, and the object class of the target property must be that of the target object. No attempt is made to convert data of mismatched types. In case of multiple properties, the source property list and the target property list are stepped through in parallel. If there is no target property list, or it is shorter than the source list, unmatched entries are stored as original property values, and this implies that the object class of the source and target objects are the same.
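Stepping through the two property lists in parallel can be sketched with the standard library function itertools.zip_longest. The helper pair_properties and the property names E_A, E_B and E_X are placeholders for this illustration. Ignoring surplus target names is an assumption, since the text does not cover that case.

```python
from itertools import zip_longest


def pair_properties(source_props, target_props=()):
    """Return (source, target) property name pairs for a transfer."""
    pairs = []
    for src, dst in zip_longest(source_props, target_props):
        if src is None:
            break  # assumption: surplus target names are ignored
        # No target name at this position: keep the source property name.
        pairs.append((src, dst if dst is not None else src))
    return pairs


same_names = pair_properties(["E_A", "E_B"])
renamed_first = pair_properties(["E_A", "E_B"], ["E_X"])

print(same_names)
print(renamed_first)
```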

If no target object is specified, or it is spelled as an empty string or Python None , the visible effect of the command is the same as a simple ens get , i.e. the result is the property data value or value list. The property data is then deleted from the source object. In case the data type of the deleted property was a major object (i.e. an ensemble, reaction, table, dataset or network), it is only unlinked from the source object, but not destroyed. This means that the object handles returned by the command can henceforth be used as independent objects. They can be deleted by a normal object deletion command, and are no longer managed by the source object.

Properties which are ensemble minor object properties can only be transferred to another ensemble. Ensemble properties can be moved to other major objects.

Example:

ens transfer $eh E_EMF_IMAGE $eh2

This copies property E_EMF_IMAGE from the first ensemble to the second. The property data remains valid on the source ensemble.

set ehc [ens transfer $eh E_CANONIC_TAUTOMER]

Get the handle of the canonic tautomer of the source ensemble, and dissociate it from the source ensemble.

ens transform

ens transform ehandle SMIRKSlist ?direction? ?reactionmode? ?selectionmode? ?flags? ?overlapmode? ?{?exclusionmode? excludesslist}? ?maxstructures? ?timeout? ?maxtransforms? ?niterations? ?statusvariable?

e.transform(transforms=,?direction=?,?reactionmode=?,?selectionmode=?,?flags=?,?overlapmode=?,?excludess=?,?maxstructures=?,?timeout=?,?maxtransforms=?,?iterations=?,?statusvariable=?)

This command applies one or more SMIRKS transforms to an ensemble and returns a list of ensemble handles or references of transformation products. The transformation products are filtered for duplicates. The original start structure is never returned - if a transform set does not match at all, an empty list is returned.

The required parameter after the ensemble handle is a list of SMIRKS lines, where each SMIRKS line is itself a list. A SMIRKS line is in the simplest case a simple SMIRKS transform without any extra data, but it may be padded by additional parameters which apply only to the application of that transform. If these optional parameters local to the current transform are not specified, their global counterpart on the command line is used instead. The syntax of an individual SMIRKS line is

SMIRKStransform ?step? ?direction? ?flags? ?overlapmode?

The SMIRKS transform part is the only required list element. It may be provided either as a string in standard Daylight notation, or as a handle of a reaction, which should have been decoded in SMIRKS mode (see the reaction create command). Care should be taken to pass SMIRKS strings as proper elements of a list, even if only a single string is used, because they may contain whitespace and naming information after the actual transform code. Example:

ens transform $ehandle [list [list {[C:1][C:2]>>[C:1]=[C:2] Dehydrogenation} 1]]

The string Dehydrogenation is part of the transform specification string and not the transform step. The name string is attached to the (intermediate, in this case) transform reaction object as property X_NAME and can be used to track the reaction history of transform result structures.

The optional step element in a transform line (a positive integer or 0) identifies the reaction step of the transform. Transform sets of different step numbers are isolated from each other and do not interact. Transforms are executed in ascending step number. Transforms with different step numbers need not be sorted, and the step numbers neither need to begin with one, nor form an uninterrupted sequence. A step number of 0 disables the transform. The default step number is one. All transforms of the same step number are essentially executed in parallel and may interact with each other.
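The step rules (default of one, zero disables, ascending execution order, gaps allowed) can be pictured with a stand-alone Python model that only organizes the transform lines and does not apply any chemistry. The function schedule is a hypothetical helper, and the SMIRKS strings are merely sample input.

```python
def schedule(transform_lines):
    """Group transform lines by step number, in execution order.

    Each line is a list whose first element is the SMIRKS string and
    whose optional second element is the step number.
    """
    by_step = {}
    for line in transform_lines:
        smirks = line[0]
        step = line[1] if len(line) > 1 else 1  # default step number is one
        if step == 0:
            continue                            # step 0 disables the transform
        by_step.setdefault(step, []).append(smirks)
    # Ascending step numbers; they need not start at one or be contiguous.
    return [(step, by_step[step]) for step in sorted(by_step)]


lines = [
    ["[C:1][C:2]>>[C:1]=[C:2] Dehydrogenation", 5],
    ["[C:1]=[C:2]>>[C:1][C:2]"],
    ["[N:1][C:2]>>[N:1]=[C:2]", 0],
    ["[O:1][C:2]>>[O:1]=[C:2]", 5],
]

plan = schedule(lines)
for step, transforms in plan:
    print(step, transforms)
```

The two transforms sharing step 5 end up in the same group, which corresponds to the set that is executed together and may interact.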

The third and again optional element of transform lines is the direction identifier. It may be either forward, backward, or bidirectional. In forward mode, only the left part of a transform is used for matching, and the matched structure part is modified according to the description on the right side. backward works the other way around, and in bidirectional mode, both sides of the transform scheme are independently matched, and, if the match is successful, transformed to the other side. If this parameter is not specified, or specified as an empty string, the global direction parameter from the command line is substituted.

The fourth and once more optional element of a transform line is a list of flag words. Every word sets an additional flag. Currently, the following flag words are recognized:

absolutestereo
If this flag is set, the stereochemistry of the right side of a transform is transferred unchanged to the transform result ensemble, without attempting to interpret the operation as a reaction with stereochemistry inversion or retention by examination of the pattern on the left side. If the left side does not contain stereochemistry, the behavior induced by this flag is already the default and it has no effect. It also has no effect if the right side of the transform does not specify stereochemistry.
allrequired
If this flag is set, only result structures which were generated by the combined application of all transforms marked with this flag are accepted as final results. If any of the transforms marked with this flag did not contribute to a result structure, it is discarded. By default, the result set is not filtered by its origination from any specific transform.
anyrequired
If this flag is set, only result structures which were generated by the application of at least one of all transforms marked with this flag are accepted as final results. By default, the result set is not filtered by its origination from any specific transform.
appendpathname
The same as setpathname , except that the content of an existing property E_NAME on the input ensembles is not overwritten. Transform path information is always appended. If the input structure does not have initial name information, the operation of the two flags is indistinguishable.
changeelements
If set, the element number and atom type of matched atoms is changed to that of the matching right side template. By default, atom type and element number of atoms which are not newly added are preserved in the transformed ensembles. This is usually desirable for the use of element lists and other generic expressions as part of transform patterns. If this flag is set, the atom is changed to the exact template definition - including changes to any atoms, element lists, or complete atomic recursive SMARTS expressions.
chargeneutral
If set, the sum of all changes in the formal atom charges in the set modified by the application of the transform, excluding any atoms which are deleted or added, must be zero. This is helpful for example for charge redistribution transforms. For example, a transform like

[*;+1:1]=[*:2][N:3]>>[*:1][*:2]=[N;+1:3]

only works on structures where the nitrogen atom is neutral, because otherwise the total charge of the matched three-atom block would change. It would be possible to achieve the same effect with explicit indication of allowed charges on all involved atoms, but this flag can be convenient.

chargeradicals
If this flag is set, radicals which are generated as the result of a transform are charged according to common chemical sense. A cleaner and preferable method is to explicitly encode the charge in the transform.
checkaro
If set, aromaticity checking takes place. Atoms specified as aromatic in the transform pattern only match aromatic atoms in the target ensemble, and all other atoms only match non-aromatic atoms. By default, the aromaticity status of atoms is ignored in evaluating the pattern match.
checkcharges
If set, formal charges on the match side of the transform must exactly match the charges on the matched structure atoms. By default, charges are not used for determining a match. This flag should be set if the transform pattern should only match specific charges.
checkkekule
If this flag is set, bond orders of aromatic systems in the substrate molecules must be matched exactly as specified in the transform.
checkstereo
If set, the stereochemistry on the match side of the transform must match the stereochemistry on the matched structure atoms. By default, stereochemistry is not used for matching.
checkwedges
If this flag is set, bonds in the transform ensemble must match the wedge style specified in the left side of the transform template. This is useful only under very specific circumstances, since the style and placement of wedges does not uniquely identify a stereo isomer. Checking stereochemistry is therefore usually performed via the checkstereo flag, which relies on the comparison of stereo descriptors instead of wedges.
distinctpatternmatch
If this flag is set, the match mode of the substructure side of the transform is changed. The default match mode is all , meaning that all possible orientations of the substructure are generated, except in case of a transform application mode first , where the substructure match mode is also first . If this flag is set, the match mode is changed to distinct . In this mode, only pattern matches which differ in the set of structure atoms matched are generated, removing alternative mappings of the substructure on the same set of structure atoms. This mode is faster and can reduce the number of computation steps significantly, but the applicability of this match mode for the generation of the full set of desired transform results must be determined by the programmer with an eye for possible asymmetry of the matched structures outside the atom set of the transform substructure.
dropradicals
If this flag is set, transform result structures are discarded if they are radicals. In case the chargeradicals flag is also set, the radical check is performed after the attempt to charge standard radical centers and may thus be used as a second line of defense against unreasonable structures.
filtercharges
If set, use the localization of formal charges on atoms as a criterion to distinguish transformation results. By default, the standard hashcoding process is used which does not care about the placement of formal charges as long as these forms are interconvertible. For example, with the standard duplicate filtering process, pentavalent and ionic forms of nitro groups are considered equivalent. However, this will also prevent transforms which convert one form of a nitro group into another from working, since the transform result is discarded as being equivalent to the input structure. In order for this kind of transform to function as expected, the filtercharges flag must be set, which configures the duplicate filter to distinguish between the two forms. In that case, the preservecharges flag (see below) must not be set in order to allow the transformation to change the charge, but the checkcharges flag (see above) should be set in order to restrict the match of the transform to a specific ionic or pentavalent form.
filterisotopes
This flag instructs the duplicate detection mechanism to use hash codes which use isotope labeling information for duplicate removal. This flag and the filterstereo flag are not mutually exclusive - both attributes can be combined to select a suitable hash code.
filterkekule
This flag instructs the duplicate detection mechanism to use hash codes which are dependent on the exact bond order - including that of Kekulé structures of aromatic systems - for duplicate removal.
filterradicals
If set, use the localization of free electrons on atoms as a criterion to distinguish transformation results. By default, the standard hashcoding process is used which does not care about the placement of electrons as long as these are interconvertible. However, this also prevents transforms which convert from one electron localization scheme to another without accompanying atom or bond changes from working, since the results are discarded as being equivalent to the input structure.
filterstereo
This flag instructs the duplicate detection mechanism to use stereo-specific hash codes for duplicate removal. This flag and the filterisotopes flag are not mutually exclusive - both attributes can be combined to select a suitable hash code.
keepiterationintermediates
If this flag is set, and multiple iterations are run, the results from intermediate iteration steps are part of the returned set. By default, only the results of the last iteration are returned.
kekulize
If this flag is set, a new Kekulé form of aromatic systems in the transformed structure is constructed. This is useful when the matching pattern did not check for explicit single and double bonds, so that after applying the transformation the Kekulé pattern may be wrong, for example after swapping an electron-pair donating N with a normal pi-bonded C atom. Still, it is generally recommended to use explicit bond order manipulation since that method is more robust.
linkreaction
Ensembles which are created via a transform for which this flag is set are linked to an automatically created reaction object in which the transform result ensemble is the reaction product, and a duplicate of the input ensemble the reagent. In addition, the X_NAME and E_REACTION_ROLE properties are set. The return value of the ens transform command is still a list of the handles of the transform result ensembles. The additional reagent ensemble handles are not included, and neither are the handles of the reactions. In order to access the reaction information, a lookup command such as ens reaction with a result ensemble as argument can be used.
lockimplicitbonds
Bonds in the transform structure which are represented by bonds with an implicit bond order on the right side of the pattern (in forward direction, left side for reverse transforms) do not get their bond order adjusted, with the rationale that these pattern bond orders are not well defined anyway.
nitrostandardizer
If this flag is set, the input structure duplicate(s) entered as start compound in the transform processing queue is standardized to possess the neutral, pentavalent form of nitro groups and similar groups. This option does not change the input structure(s) of this command but only the first structure duplicate entered into the processing queue.
nochargepaircollapse
Disable the feature that bonds which connect atoms of opposite +1 and -1 formal charges are also matched by the equivalent bond with a bond order increased by one and neutral atoms.
nohadd
Part of the normal transformation procedure is a final hydrogen addition step before duplicate checks etc. are performed. This default behavior is designed to result in standard fully hydrogen-complete structures. If this flag is set, this step is omitted. This can for example be useful to avoid the addition of hydrogens to atoms with different default hydrogen addition characteristics if formal atomic charges have been moved. This option does not apply to the input structure(s) of this command but only to the first structure duplicate entered into the processing queue.
preservecharges
If set, charges are not modified after a transform is matched. By default, the charge of matched atoms is set to the charge of the matching atom in the transform template, as long as the atom has sufficient free electrons to allow the charge change. Atoms which are newly introduced by the transform always bear the charge specified in the transform description. This flag does not influence the match process - charges specified in the transform may still be used for selecting specific atoms via the checkcharges flag.
preservestereo
If set, atom and bond stereochemistry are not changed on matched atoms and bonds. By default, changes do occur - changed atoms or bonds have their stereochemistry reset if the transform pattern does not contain stereochemistry, or set to a specific stereochemistry if it does. If only the right side of a transform contains stereochemical descriptors, the stereochemistry of the transformed product is set to that of the template (for example, a cis double bond). If both the left and right side of a transform contain stereochemistry, the chemistry at the transform product is inverted or retained, depending on the stereochemistry change in the transform. Having stereochemistry only on the left side is possible, and potentially useful for selecting specific enantiomers or diastereomers via the checkstereo flag, but results in a reset (if this flag is not set) or retained (if this flag is set) stereochemistry in the transformed ensemble.
preservecoordinates
If this flag is set, 2D and 3D coordinates of the transform ensemble are retained. Newly added atoms are set to a magic coordinates value. By default, a successful transformation invalidates 2D and 3D coordinates, as well as all property data dependent on these.
preservestereo
If the flag is set, all stereo descriptors are retained. By default, stereo centers and bonds matched by the pattern are reset, or inherit their new value from the pattern.
preservewedges
If set, the wedge status of bonds matching the transform pattern is preserved. By default, wedges involving bonds which are changed, or which connect atoms which are changed (or deleted and then re-added in other form), are reset. Note that this flag operates independently of the set of stereo flags listed above. In most cases, the desired mode of stereochemistry processing should be selected by specifying these flags, and the wedges regenerated as needed. If combined with stereochemistry changes, the use of this flag may otherwise lead to conflicting stereochemical information on the result ensembles.
removeh
If set, an attempt is made to rescue bond changes which would fail because of insufficient electrons for bond manipulations by deleting a minimum number of hydrogen atoms on the bond atoms needed for the bond creation or bond order change. Without this option, a transform like [C:1][C:2]>>[C:1]=[C:2] usually does not work, since Cactvs is designed to work on structures with a full hydrogen set. When this flag is set, the transform succeeds if C1 and C2 both have at least one hydrogen. Alternatively, the transform can be specified with explicit hydrogens as in [#1][C:1][C:2][#1]>>[C:1]=[C:2] . In that form, it always removes the hydrogens because they do not appear on the right side. This form is slightly more complex and different from the Daylight mechanism.
requireheteromatch
If set, among the structure atoms matched by the pattern there must be at least one hetero atom in order to proceed with the transform modifications. If only carbon or hydrogen is matched, this match is ignored.
restricthydrogenmatch
If set, hydrogen matches are not permuted. This means that, for example, the first explicit hydrogen around a transform substructure atom can only match the first hydrogen around a structure atom, not all of them (for example, all 3 in a methyl structure group) in different matches. This is an optimization which is frequently useful, if the transform results are guaranteed to be identical regardless of which hydrogen atom was matched - for example, when generating tautomers. However, if extended attributes of the structure hydrogen atoms are significant, such as 3D position, charge or isotope labels, etc., setting this flag can lead to the non-generation of distinct result structures.
setatommatch
If this flag is set, the atom labels ( A_LABEL ) of the stored atoms on the left (substructure) side of the transform are stored on the transformation result ensembles as property A_SSMATCH . Atoms which are not matched are assigned a zero value.

In case a transform result structure is the product of more than one transform, each transformation step adds a new property instance A_SSMATCH , A_SSMATCH/2, and so on. Pre-existing A_SSMATCH properties on the transform input ensemble are not deleted. If these exist, the new data is stored in the next unused property instance after the current instance with the highest slot number.
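The slot numbering described above can be sketched in a few lines of Python. This is an illustration only, not toolkit code: the helper name next_instance is hypothetical, and the sketch assumes that the unsuffixed property name counts as slot 1.

```python
def next_instance(existing, base="A_SSMATCH"):
    """Return the next unused instance name after the highest used slot.

    Assumption: the plain name (no suffix) is slot 1, 'NAME/2' is slot 2, etc.
    """
    highest = 0
    for name in existing:
        if name == base:
            highest = max(highest, 1)
        elif name.startswith(base + "/") and name[len(base) + 1:].isdigit():
            highest = max(highest, int(name[len(base) + 1:]))
    # With no prior instance the plain name is used, otherwise slot highest+1.
    return base if highest == 0 else f"{base}/{highest + 1}"


print(next_instance([]))
print(next_instance(["A_SSMATCH", "A_SSMATCH/3"]))
```

Note that a gap in the existing slots is not filled in this sketch, in line with the statement that new data goes after the instance with the highest slot number.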

setatomstatus
If this flag is set, the status of the atom during the last transform is marked in property A_TRANSFORM_STATUS . Possible values are none : atom did not participate, matched : it was matched by the transform substructure, but did not change, changed : one or more atom attributes, including possibly the element number, were edited, new : the atom was added by the transform.

In case a transform result structure is the product of more than one transform, each transformation step adds a new property instance A_TRANSFORM_STATUS , A_TRANSFORM_STATUS/2, and so on. Pre-existing A_TRANSFORM_STATUS properties on the transform input ensemble are not deleted. If these exist, the new data is stored in the next unused property instance after the current instance with the highest slot number.

setbondmatch
This option is very similar to setatommatch described above, except that matching left-side substructure bond labels B_LABEL are stored in property B_SSMATCH .
setbondstatus
This option is very similar to setatomstatus described above, except that bond history is stored in property B_TRANSFORM_STATUS .
setpathname
If this flag is set, the name (property E_NAME ) of the result ensembles is set to display the transformation sequence the structure underwent from the input structure. The name is formatted as a Tcl -conforming list with one element for each transform applied. The first character of each list element is either ’>’ or ’<’ to indicate application of the transform in forward or reverse direction. It is followed by either the transform name (property X_NAME ), if it is available, or the transform index number (starting with 0). Any initial name of the start structures of the transformation is cleared, so that the result name only contains transform path information.
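As an illustration of the name format, the following Python sketch assembles such a path string. The function path_name and its input tuples are hypothetical. It assumes that transform names contain no whitespace or Tcl special characters, so that joining the elements with spaces yields a valid Tcl list.

```python
def path_name(steps):
    """Build a transform path name from (direction, name, index) tuples.

    direction is 'forward' or 'reverse'; an empty name falls back to the index.
    """
    parts = []
    for direction, name, index in steps:
        prefix = ">" if direction == "forward" else "<"
        parts.append(prefix + (name if name else str(index)))
    return " ".join(parts)


# First step: named transform applied forward; second: unnamed transform 2 in reverse.
print(path_name([("forward", "enol", 0), ("reverse", "", 2)]))
```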

The fifth, final, and again optional element of a SMIRKS line is the overlap mode. Again, if this parameter is omitted or supplied as an empty string, the global default from the command line is used. The overlap mode determines whether a transform substructure which consists of multiple disconnected fragments may match onto common target structure atoms or bonds. The following values are supported:

none
No overlap of the substructure fragments, neither on atoms nor on bonds. This is the default mode, and the most commonly used.
distinctmols
All disconnected fragments in the substructure must match different molecules in the target structure. This is a useful mode to prevent, for example, intra molecular reactions.
any
Any overlap of the substructure fragments is possible. This mode is rather useless for transforms.
nobonds
Atoms may overlap, but not bonds. This mode is actually highly useful in some contexts.
noembed
Atoms and bonds may overlap, but no substructure fragment may be completely embedded in the structure part matched by another fragment, meaning that at least one of any pair of matching substructure fragments must match an atom which is not matched by the other fragment.
distinctatoms
Between any pair of matched substructure fragments, both fragments must match at least one atom not matched by the other fragment.
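The overlap modes can be read as pairwise conditions on what two fragments matched. The Python sketch below spells these conditions out on plain sets. It is a toy model written from the descriptions above, not the toolkit implementation, and the Match type and overlap_ok function are hypothetical names.

```python
from typing import NamedTuple


class Match(NamedTuple):
    atoms: frozenset  # labels of the structure atoms matched by one fragment
    bonds: frozenset  # labels of the structure bonds matched by that fragment
    mols: frozenset   # labels of the molecules the match touches


def overlap_ok(mode, a, b):
    """Decide whether two fragment matches are compatible in the given mode."""
    if mode == "none":
        return not (a.atoms & b.atoms) and not (a.bonds & b.bonds)
    if mode == "distinctmols":
        return not (a.mols & b.mols)
    if mode == "any":
        return True
    if mode == "nobonds":
        return not (a.bonds & b.bonds)
    if mode == "noembed":
        # At least one fragment matches an atom the other does not.
        return bool(a.atoms - b.atoms) or bool(b.atoms - a.atoms)
    if mode == "distinctatoms":
        # Both fragments match an atom the other does not.
        return bool(a.atoms - b.atoms) and bool(b.atoms - a.atoms)
    raise ValueError("unknown overlap mode: " + mode)


m1 = Match(frozenset({1, 2}), frozenset({"1-2"}), frozenset({1}))
m2 = Match(frozenset({2, 3}), frozenset({"2-3"}), frozenset({1}))
for mode in ("none", "distinctmols", "any", "nobonds", "noembed", "distinctatoms"):
    print(mode, overlap_ok(mode, m1, m2))
```

In the sample, both fragments share atom 2 but no bond, so the pair is rejected in none and distinctmols mode and accepted in the remaining modes.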

Every SMIRKS line follows the outlined scheme, and all settings within that line are applicable only to the current transform scheme.

There is no general limit for the maximum number of transforms in this command. However, if transforms are combined with exclusion substructures, and these exclusion substructures are to be applied on a per-transform basis (see below), the highest transform index for which an applicability flag can be set is 63. Every transform which is applied in bidirectional fashion, either by global configuration or transform-specific flags, is counted twice toward this limit.
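A quick way to check a transform set against this limit is to count the occupied slots. The following Python sketch does only the counting described above. The names LIMIT, slots_used and all_addressable are hypothetical, and how the toolkit numbers the two directions of a bidirectional transform internally is not modeled.

```python
LIMIT = 64  # transform indices 0..63 can carry an applicability flag


def slots_used(directions):
    """Count slots for a list of effective per-transform directions.

    A bidirectional transform counts twice, any other direction once.
    """
    return sum(2 if d == "bidirectional" else 1 for d in directions)


def all_addressable(directions):
    """True if per-transform exclusion flags can address every transform."""
    return slots_used(directions) <= LIMIT


print(slots_used(["forward", "bidirectional", "reverse"]))
print(all_addressable(["bidirectional"] * 33))
```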

All parameters after the SMIRKS lines list act globally. The third and optional direction parameter, command word number five, sets the default for the directionality of all transforms for which no local override was set in their respective SMIRKS lines. If this parameter is not specified, the default is forward .

The optional reaction mode, parameter four and command word six, does not have a counterpart in the SMIRKS lines. This parameter determines how the possibility of multiple matches of a transform substructure in the target molecules is handled. It can be one of these values:

first
Only the first match which is found is executed, all other possible matches are disregarded. The location of the first match should be considered random.
exhaustive
Only those transform products where all possible match sites have been processed are produced. For example, for a structure with two reaction sites A and B, only the product where both A and B have been transformed is reported - provided that the initial transformation of A or B did not influence the possibility of matching the second part. So, in case of the hypothetical hydrolysis of a dihalogen compound with explicit water molecules, the fully reacted product will only be obtained if the input ensemble contained two water molecules. Otherwise, one (in case of symmetry) or both products of a single hydrolysis step are obtained. This mode operates by generating the intermediate products and re-submitting them. If these generate one or more new compounds, they are discarded from the result list. An older and still recognized name of this reaction mode is all .
singlestep
All matches are found and the transform executed, but the transform results are not re-submitted for matching as they are in the exhaustive mode. All different products which result from a single application of the transform are returned. For the hypothetical example of the hydrolysis of an asymmetrical dihalogen compound, both partial hydrolysis products are generated, but not the fully hydrolyzed end product.
multistep
This mode generates all transform products by systematically applying the transforms to all structures and re-submitting the results again and again, until no new compounds are generated. In contrast to the exhaustive mode, intermediate products which further react are not discarded. The hypothetical example of the hydrolysis of an asymmetrical dihalogen compound yields three products - two partial hydrolysis products, and the fully hydrolyzed end product.

The default value for the reaction mode is first .
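The difference between the four reaction modes can be made concrete with a toy model of the dihalogen hydrolysis example. In the Python sketch below a molecule is a tuple of site states, and the transform replaces one halogen site X by OH. All function names are hypothetical, water is assumed to be available without limit, and duplicate detection is plain tuple equality instead of a structure comparison.

```python
def apply_once(mol):
    """Return all products of hydrolysing exactly one halogen site 'X'."""
    return [mol[:i] + ("OH",) + mol[i + 1:]
            for i, site in enumerate(mol) if site == "X"]


def first(mol):
    # Only the first match which is found is executed.
    return apply_once(mol)[:1]


def singlestep(mol):
    # All single applications, without re-submitting the products.
    return sorted(set(apply_once(mol)))


def multistep(mol):
    # Re-submit results until nothing new appears; intermediates are kept.
    seen, queue = set(), [mol]
    while queue:
        for product in apply_once(queue.pop()):
            if product not in seen:
                seen.add(product)
                queue.append(product)
    return sorted(seen)


def exhaustive(mol):
    # Like multistep, but intermediates which react further are discarded.
    return [p for p in multistep(mol) if not apply_once(p)]


start = ("X", "X")  # two distinguishable halogen positions
for mode in (first, singlestep, multistep, exhaustive):
    print(mode.__name__, mode(start))
```

For the asymmetrical start structure, the single-step mode yields the two partial products, the multi-step mode adds the fully hydrolyzed product, and the exhaustive mode keeps only the latter.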

The next optional command parameter, the selection mode (command argument five, command word seven), again has no counterpart in the SMIRKS line parameters. It determines the interaction of transforms of the same step number. All these transforms form a group. This parameter determines which of the transforms from the current group are executed, and in which order. The parameter can be set to one of the following values:

first
The first transform from the current group which matches is processed according to the reaction mode setting. All other transforms in the group are ignored, regardless whether they would match or not.
sequence
All transforms in the current group are applied once in the order they are specified, with the current reaction mode. Each transform is applied to the result ensembles of the previous transform, or the start ensemble for the first transform. All results, including those which did undergo further changes by later transforms, are returned.
seqendpoints
Similar to the sequence mode, but only those result ensembles which did not lead to further transformation results (either actually generated, or discarded as duplicates) are returned. Again, each transform in the sequence is only applied to the result structures of the previous transform.
endpoints
Similar to the all mode, but every transform is applied to all result ensembles which have accumulated before. Only those ensembles which did not yield additional, structurally distinguishable result ensembles are returned as final result.
all
All transformations are applied to all result ensembles. This process is repeated until no additional, structurally distinguishable result ensembles are generated. The full set of result ensembles is returned.
newseqendpoints
This mode is similar to the seqendpoints mode. In seqendpoints mode, if a transform does not match any of the current input structures, an empty set is passed on to the next transform as input data. Thus, the transforms which follow a failing transform cannot produce any results themselves and are effectively ignored. In this mode, if a transform does not yield any results on the current input set, the current input set is re-used for the next transform, so that transforms which do not match cannot interrupt the chain. If the current transform yields results, that result set is used. The final result set is filtered, as in the endpoints and seqendpoints modes, to contain only structures which did not produce any transform results themselves.
parallel
All transforms of the current group are applied, but only to the start structure set, not to any results produced by the successful application of any previous transform.

The default selection mode is first .
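The chained selection modes differ mainly in what is passed from one transform to the next and in which results are finally reported. The Python sketch below models this on strings, with each toy transform rewriting one character. It is a simplified reading of the descriptions above with hypothetical names throughout: the first returned list corresponds to sequence mode, the second to seqendpoints, and the reuse_input_on_miss switch gives the newseqendpoints behavior.

```python
def run_chain(start, transforms, reuse_input_on_miss=False):
    """Apply transforms in order, each to the results of the previous one."""
    current = [start]
    produced = []    # every distinct result, in generation order
    reacted = set()  # structures which yielded at least one result
    for transform in transforms:
        results = []
        for structure in current:
            products = transform(structure)
            if products:
                reacted.add(structure)
            for product in products:
                if product not in results:
                    results.append(product)
        for product in results:
            if product not in produced:
                produced.append(product)
        # A failing transform passes on an empty set unless input is re-used.
        if results or not reuse_input_on_miss:
            current = results
    endpoints = [p for p in produced if p not in reacted]
    return produced, endpoints


def run_parallel(start, transforms):
    """Apply every transform to the start structure only."""
    results = []
    for transform in transforms:
        for product in transform(start):
            if product not in results:
                results.append(product)
    return results


def rewrite(old, new):
    # Toy transform: rewrite the first occurrence of old, or do not match.
    return lambda s: [s.replace(old, new, 1)] if old in s else []


chain = [rewrite("a", "b"), rewrite("z", "y"), rewrite("b", "c")]
print(run_chain("a", chain))
print(run_chain("a", chain, reuse_input_on_miss=True))
print(run_parallel("a", chain))
```

The second transform in the sample chain never matches. Without input re-use it interrupts the chain after the first result, while with input re-use the third transform still gets to work on that result.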

The next and again optional flags parameter (command argument six, command word eight) defines the default for those transforms which do not possess an override flag set in their SMIRKS line. Note that if a flag set is specified on a SMIRKS line it completely replaces the default flag set. It does not simply add or bit-or more flags compared to the global setting. The default flag set is empty.

Similarly, the overlap mode parameter (command argument seven, command word nine) sets the default for handling potential overlap when matching disconnected transform fragments onto the structure to be transformed. The default setting is none , disallowing any fragment overlap. If the transforms only consist of a single fragment in the applicable direction(s), there is no effect of this parameter.

The excludesslist parameter (command parameter eight, command word ten) again has a potentially complex internal structure. It defines exclusion fragments. An exclusion fragment blocks all sections of the target structure from matching any transform substructure, either by preventing the match of transform atoms (the default) or transform bonds. This is a useful feature for example to easily prevent amide groups from matching amino group transforms. The default exclusion substructure list is empty. The parameter is a list. Every list element can be a simple structure identifier, or a list of a structure identifier and a transform index list.

Structure identifiers recognized by this command are:

ensemble handles
This selects the complete parameter ensemble as exclusion substructure.
lists of an ensemble handle and a molecule label
This selects a specific molecule from the ensemble as exclusion substructure.
SMARTS strings
The SMARTS string is temporarily decoded and used like an ensemble handle. The transient ensemble is automatically destroyed when the ens transform command has finished.

If the exclusion substructure identifier is not associated with a transform index list, the substructure applies to all transforms. The optional transform index list consists of an arbitrary number of transform indices in the range 0...63. If a transform index list is supplied, the exclusion substructure applies only to the listed transforms. Note that it is not possible to set individual exclusion indices for transforms beyond the 64th, even though it is allowable to use any number of transforms in the transform list. All ensembles, including intermediate result ensembles, are checked against all applicable exclusion structures immediately before the application of a transform is attempted.

The exclusion substructure specification list may be prepended by a magical list element with value ( marked ) atoms , ( marked ) bonds, unmarkedatoms or unmarkedbonds . These control the mechanism how matched substructures are marked in the transform source structure. The default mode is atoms , where excluded atoms are prevented from matching transform pattern atoms. The bonds mode switches this to preventing a bond match. The difference is that in bonds mode, transform pattern atoms can still overlap, by a single atom, excluded regions, but not change bonds therein, while in atoms mode absolutely no atom or bond overlap between excluded regions and transform patterns is allowed. The unmarked variants operate with a reversed exclusion set - i.e. atoms or bonds which are not matched are excluded from the structure region eligible for transform application.

In case the exclusion mode is ( marked ) atoms or unmarkedatoms , an atom identifier, i.e. any notation which is supported to identify an atom in the atom command, may also be used in addition to the three substructure specification styles listed above to directly exclude a single atom from matching by all transforms. In ( marked ) bonds or unmarkedbonds marker mode, bond identifications in the same style as supported by the bond command, such as bond labels or bond atom label pairs, are similarly allowed as additional direct bond exclusion specifications, and these again apply to all transforms.

Exclusion markings, once set for the input structure, are inherited by newly generated result structures, so that the protection remains active even for structures undergoing sequences of transformations.

The related dataset transform command does not support direct atom or bond exclusion marking, even if the dataset only contains a single structure.

An example for an exclusion list:

ens transform $eh $tlist ... [list "atoms" {C(=O)[NH2]} {{C[NH]C} {0 1}} 1]

This exclusion set protects amide groups (the first substructure) from all transforms, secondary amines including their immediate carbon neighbor atoms from the first two transforms in the set (index 0 and 1, the transform set is specified in the tlist variable), and the single atom with label 1 in the input ensemble. The exclusion marker mode is explicitly spelled out as atoms in the first exclusion list element, although this is already the default.

Another example:

ens transform $eh $tlist ... [list "unmarkedatoms" {*}$statoms]

This transform only operates on the atoms of which the labels or other identifiers are included in the list in variable statoms . All other parts of the structure are excluded and cannot participate in the transform.

The next optional global command parameter (parameter nine, command word eleven) is the maximum number of result ensembles to generate. The input ensemble is not counted. As soon as the maximum is reached, the command finishes and returns the result ensembles which were generated so far. If the maximum number of results is set to a negative number (the default), no limit applies. If it is set to zero, the transform command is effectively disabled. The global control variable ::cactvs(setsize_exceeded) is set to 1 if the specified maximum number of result ensembles was going to be exceeded. At the beginning of the execution of the ens transform command, this control variable is reset to zero. The limit applies to the total of generated unique structures, which is not necessarily the same as the number of output structures in case the processing mode dictates that they are processed further and not included as intermediates in the result set. In the special case of exhaustive transform application, the parameter limits the size of the intermediate result set after each pass, not the overall total of unique structures.

The timeout parameter (command parameter ten, command word twelve) can be used to set a time limit in seconds for the command execution. If this parameter is set to 0 or a negative number, no timeout applies. This is the default. Otherwise, the generation of result ensembles is stopped after the specified time, and the command returns with the results generated so far. The global control variable ::cactvs(interrupted) is set to 1 if a timeout occurs. It is reset to 0 at the beginning of the execution of the command.

The next optional parameter (command parameter eleven, command word thirteen) can be used to limit the number of transforms applied to the starting structure and intermediate structures. If this parameter is not specified, or specified as an empty string or a negative value, no limit is imposed. If this parameter or the timeout option is used, the result set may become dependent on the atom and bond order of the input structure because the traversed part of the possible transform match space is different and might yield different and/or a different number of results when the timeout or application count restriction is triggered.

The second last optional parameter (command parameter twelve, command word fourteen) is an iteration count. Its default value is one, meaning that the whole transformation process is only executed once. If set to a larger value, the transformation routine calls itself recursively. This is equivalent to first running ens transform with a start structure, and then repeatedly executing dataset transform commands for the second and later iterations with the last result set. All limits and other control parameters are passed in the original configuration, and apply only to the next iteration, not globally over the sum of all transform cycles. By default, the result set of this mode is what the last iteration produced, but this can be changed to the union of all iteration results by the keepiterationintermediates flag. Uniqueness checking of result structures is applied to the full return set. If the parameter is set to zero or a negative value, no transformations are executed. If the setpathname flag is set, it is automatically switched to appendpathname for the second and later cycles, so that the name mirrors the full transformation history and is not reset in each cycle.
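The effect of the iteration count on the returned set can be sketched as a simple loop. The Python code below is a toy model, not the toolkit routine: iterate and run_once are hypothetical names, run_once stands in for one complete transformation pass on a single structure, and limits, timeouts and name handling are not modeled.

```python
def iterate(start, run_once, count=1, keep_intermediates=False):
    """Feed the result set of each pass into the next pass.

    Returns the last result set, or the union of all passes if
    keep_intermediates is set. A count of zero or less runs nothing.
    """
    current, union, last = [start], [], []
    for _ in range(max(count, 0)):
        last = []
        for structure in current:
            for product in run_once(structure):
                if product not in last:
                    last.append(product)
        for product in last:
            if product not in union:  # uniqueness over the full return set
                union.append(product)
        current = last
    return union if keep_intermediates else last


def step(n):
    # Toy pass: every "structure" n yields the single product n + 1.
    return [n + 1]


print(iterate(0, step, count=3))
print(iterate(0, step, count=3, keep_intermediates=True))
```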

The final optional parameter is an array variable name. If it is specified, various statistics about the transform application are collected and stored in that array. Some important array elements are:

patternmatches
The total number of transform pattern substructure matches, on both sides of the transform if the transform settings allow this.
leftpatternmatches
The total number of left-side transform pattern substructure matches.
rightpatternmatches
The total number of right-side transform pattern substructure matches.
leftpatternmatches$n
The number of left-side pattern matches for transform pattern n, beginning with index 0.
rightpatternmatches$n
The number of right-side pattern matches for transform pattern n, beginning with index 0.
datafailures
The number of aborted transform applications because property data could not be computed on the transform result structures.
applicationfailures
The total count of failures to apply the transform instructions of a matched pattern, for example because of bad electron counts.
applicationsuccesses
The total count of successful applications of transforms, before the duplicate check.
duplicaterejections
The number of successfully transformed structures which were not added to the result set because they were a duplicate of an already-registered structure.
duplicateaccepts
The number of transform result structures which passed the duplicate result structure check.

Example:

set t1 {{[O,S;X1:1]=[C:2x1][C:3X4][#1:4]>>[#1:4][O,S;X2:1][C:2x1]=[C:3] enol/thioenol}}

set elist [ens transform $eh [list $t1] bidirectional multistep all preservecharges none]

This example is part of a tautomer generator. The full standard generator in the toolkit uses a lengthy list of transform schemes and not just the one sample keto/enol scheme displayed here. Because the operation is bidirectional, the transform converts ketones into enols, and vice versa. If more than one interchangeable group exists, all intermediate structures are generated ( multistep reaction mode). All results are retained ( all selection mode), and all intermediate structures are again subjected to all transforms (this does not have any effect with a single transform, but the real application uses a set of transforms). Finally, charges should not be changed ( preservecharges flag), and fragment overlap is not allowed ( none overlap mode) - this again is without effect in this sample transform, because it does not consist of disconnected fragments on either side.

Multiple structures may be jointly transformed in a single command by means of the very similar dataset transform command.

ens translate

ens translate ehandle pt1 ?pt2? ?property?

e.translate(point1=,?point2=?,?coordinateproperty=?)

Move the atoms of the ensemble by modifying their 3D coordinates in property A_XYZ , or a custom atomic float vector coordinate property. This command requires atomic 3D coordinates and will attempt to compute them if they are not yet present. If no 3D atomic coordinates can be generated, the command fails with an error.

The first argument is interpreted as a 3D vector if this is the only coordinate argument. All atoms with valid 3D coordinates are moved according to the vector coordinates. In case a second argument is supplied, both arguments are interpreted as points in 3D space. The ensemble atoms are moved according to the difference vector between the second and the first point.

This operation triggers a 3dglop property invalidation event.

The command returns the original ensemble handle or reference.

Examples:

ens translate $eh {0 0 1}

ens translate $eh [atom get $eh $a1 A_XYZ] [atom get $eh $a2 A_XYZ]
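The two argument forms boil down to simple vector arithmetic, which the following Python sketch reproduces on plain coordinate tuples. It is not toolkit code and the function name is hypothetical. The sketch assumes that with two points the displacement runs from the first point to the second, i.e. it is computed as the second point minus the first.

```python
def translate(coords, pt1, pt2=None):
    """Shift a list of (x, y, z) tuples.

    With one argument, pt1 is the displacement vector itself. With two
    arguments, the displacement is assumed to be pt2 - pt1.
    """
    if pt2 is None:
        shift = pt1
    else:
        shift = tuple(b - a for a, b in zip(pt1, pt2))
    return [tuple(c + d for c, d in zip(xyz, shift)) for xyz in coords]


atoms = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
print(translate(atoms, (0, 0, 1)))             # move by a vector
print(translate(atoms, (1, 0, 0), (3, 0, 0)))  # move by a point difference
```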

ens trim

ens trim ehandle ?propertylist?

e.trim(?properties=?)

Reduce the information content of a structure to a standard minimum set and discard any additional information. This process minimizes the storage requirements of the ensemble. The properties of the internally defined minimum set are computed if required. The retained property set is designed to support a faithful representation of connectivity including bond and atom labels and types as well as formal charges, stereochemistry, isotopes, 2D and 3D coordinates, but not of auxiliary additional attributes of atoms, bonds or other minor objects.

The optional fourth argument is a list of properties which should be retained in addition to the standard set. If any of these are not present on the ensemble to be trimmed, they are silently ignored and no attempt is made to compute them. Specifying properties of the standard retention set in this list is allowed but has no additional effect.

The return value of the command is a list of the remaining properties of the ensemble.

Example:

ens trim $ehandle {E_GIF E_SMILES}
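The retention logic can be pictured as a filter over the property set. In the Python sketch below, a dictionary stands in for the ensemble data. The function name is hypothetical, and the standard set is a made-up stand-in, since the real minimum set is defined internally by the toolkit. The computation of missing standard properties is not modeled.

```python
def trim(props, standard, extra=()):
    """Drop every property which is neither in the standard nor the extra set.

    Extra names which are not present are silently ignored. The names of the
    remaining properties are returned as a sorted list.
    """
    keep = set(standard) | set(extra)
    for name in list(props):
        if name not in keep:
            del props[name]
    return sorted(props)


# Stand-in standard set for illustration only.
standard = {"A_ELEMENT", "A_XYZ"}
data = {"A_ELEMENT": [6, 6], "A_XYZ": [], "E_NAME": "ethane", "E_SMILES": "CC"}
print(trim(data, standard, ["E_GIF", "E_SMILES"]))
```

As in the example above, E_GIF is requested but absent, so it simply does not show up in the result.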

ens uncharge

ens uncharge ehandle ?filterset? ?flags?

e.uncharge(?filters=?,?flags=?)

Attempt to remove charges on atoms in a chemically sensible way. Charge removal by default happens via addition or removal of protons. In cases where this does not make chemical sense, a direct charge manipulation may be performed instead. Charged metal ions and other charged species without an obvious method for neutralization remain unchanged.

By default all atoms are processed, but the set of processed atoms can be limited by specifying a filter collection. Additional conditions on processed atoms can be set via the flag argument, which accepts the same values as ens hadd . Please refer to that command for a list and explanation of these flags.

The command returns the number of atoms which were neutralized.

Example:

ens uncharge [ens create {[NH3+]CC(=O)[O-]}]

This sample line removes a proton from the charged amino group and adds a proton to the charged carboxyl group of the initial glycine zwitterion. The returned result value is 2. In this example the total hydrogen count has not changed. In case of an unbalanced set of positive and negative, modified charged centers this is usually not the case.

ens unlock

ens unlock ehandle propertylist/objclass/all

e.unlock(property=)

Unlock property data for the ensemble, meaning that they are again under the control of the standard data consistency manager.

The property data to unlock can be selected by providing a list of the following identifiers:

Property names or references
Valid property instances on the ensemble, or ensemble minor objects are unlocked. Non-existent data is silently ignored. It is not possible to unlock individual property fields.
all
All valid ensemble or ensemble minor object properties are unlocked. Ensemble properties and ensemble minor object properties are not affected.
ens,atom,bond...
These are object class identifiers. All property data which is controlled by the ensemble major object and attached to the specified object class is unlocked.

Property data locks are obtained by the ens lock command.

The return value is the original ensemble handle or reference.

Example:

set eh [ens create CCC]

ens lock $eh A_SYMBOL 1

ens purge $eh A_ELEMENT

atom set $eh 1 A_query(dsearch) 3

ens unlock $eh A_SYMBOL

ens unpack

ens unpack packstring ?compressionlib?

Ens.Unpack(data=,?compressionlib=?)

Unpack a base64-encoded serialized object string which was created by an ens pack command. The return value of this function is the handle of the newly created ensemble object, which is an exact duplicate of the packed original ensemble.

Packed ensembles may also be unpacked by the ens create command.

The default compression library is zlib . For more options, see ens pack .

Example:

set packdata [ens pack [ens create CCCl]]

set ehandle [ens unpack $packdata]

ens valencecheck

ens valencecheck ehandle ?failedatomvariable? ?nitrogenmode?

e.valencecheck(?variable=?,?nitrogenmode=?)

Perform a valence check on the ensemble, comparing the current bonding situation at all atoms to the list of element-specific valence states in the system element table. This command is intentionally quite picky, discouraging for example the use of pentavalent nitrogen by default. For the calculation of valence, only bonds of type normal (valence bonds) are taken into account. Complex bonds and pseudo bond types thus do not interfere in the calculation. Some more exotic metal atoms with many different valence states, or few well-defined covalent compounds, such as vanadium or rhodium , always pass.

The handling of nitrogen in pentavalent or ionic form can be controlled by setting the optional nitrogenmode argument, or modifying the global ::cactvs(nitrogen_valence_check) variable. Possible values are xionic , ionic (the default), asis , pentavalent and xpentavalent . These are the same values as with the ens nitrostyle command - please refer to that command for more information. In asis mode, both ionic and pentavalent forms pass.

The return value of this command is the number of atoms which failed the valence check. If the optional failedatomvariable argument is specified as non-empty string, it is the name of a variable which receives a list of the atom labels which failed the check, or is set to an empty list in case no problems were found.

Note that this command assumes that all hydrogen atoms are in place. Processing of structures with implicit hydrogen atoms is not supported.

mol valcheck is a short command alias.

Example:

ens valencecheck [ens create {CN(=O)=O.C[N+](=O)[O-]}] badatoms

This sample command checks the valence situation of nitromethane in two encoding formats. The first molecule, using a pentavalent nitrogen encoding, is responsible for the result value 1, indicating one failed atom, and the variable badatoms is set to 2, the label of the pentavalent nitrogen atom. The second molecule passes the check and reports no additional problems.
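The principle of the check, comparing the bond order sum at each atom to a list of allowed valence states, can be sketched in Python as follows. The valence table is a heavily reduced, hypothetical stand-in for the system element table and corresponds roughly to the default ionic nitrogen mode. Atoms are described by hand, with all hydrogens explicit.

```python
# Hypothetical mini table: (element, formal charge) -> allowed valences.
ALLOWED = {
    ("H", 0): {1},
    ("C", 0): {4},
    ("N", 0): {3},
    ("N", 1): {4},
    ("O", 0): {2},
    ("O", -1): {1},
}


def valence_check(atoms):
    """Return (number of failed atoms, list of failed atom labels).

    atoms maps an atom label to (element, charge, list of bond orders).
    """
    failed = [label for label, (element, charge, orders) in sorted(atoms.items())
              if sum(orders) not in ALLOWED.get((element, charge), set())]
    return len(failed), failed


hydrogens = {5: ("H", 0, [1]), 6: ("H", 0, [1]), 7: ("H", 0, [1])}
pentavalent = {1: ("C", 0, [1, 1, 1, 1]), 2: ("N", 0, [1, 2, 2]),
               3: ("O", 0, [2]), 4: ("O", 0, [2]), **hydrogens}
ionic = {1: ("C", 0, [1, 1, 1, 1]), 2: ("N", 1, [1, 2, 1]),
         3: ("O", 0, [2]), 4: ("O", -1, [1]), **hydrogens}
print(valence_check(pentavalent))
print(valence_check(ionic))
```

In this toy model only the pentavalent nitrogen, atom 2 of the first encoding, fails the check.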

ens valcheck is a short alias.

ens valid

ens valid ehandle propertylist

e.valid(property/propertysequence)

Returns a list of boolean values indicating whether values for the named properties are currently set for the ensemble. No attempt at computation is made. For Python , where single-item lists are syntactically not the same as a single value, the return value is a single boolean if the argument was a string or a property reference, and only a single property was decoded.

Example:

ens valid $ehandle E_IDENT

reports whether the ensemble has a standard ID (has a valid E_IDENT property) or not.

ens has is an alias to this command.

ens vector

ens vector ehandle property vectorname ?invert? ?integrate?

Map ensemble property data to a Blt library vector object. Please refer to the Blt manual pages for more information on these. Blt vector objects are very useful, for example, for the efficient set-up of GUI graphing widgets which are provided by the Blt Tk extension. This command automatically attempts to load the Blt Tcl module if necessary. If that fails, an error results.

The vectorized property data must be of a vector type, and the element type of the vector must either be a simple numeric type, or a bit for bitvectors, or a floating-point pair. It is possible to address a property field, for example the X/Y data points of a spectrum which are typically stored as a field in a complex compound property.

If the invert flag is set, the stored Blt vector object values are set to 1.0 minus the property data value. By default, this flag is not active. If the integrate flag is set, the Blt vector object element values are set to the sum of all preceding property data values. This flag is also disabled by default.

If the property data type is a float pair vector, two vector objects are created in the Blt namespace, with suffixes _X and _Y. For simple vector types, the vector name is used directly. It is possible to overwrite existing Blt vectors of the same name with this command.

The return value of the command is a list of the generated name of the vector, followed by the minimum and maximum data values in that vector object. These may be different from the ensemble property data values because of the application of the invert or integrate flags. For float pair vectors, the same information is repeated for the second vector object.
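
The effect of the two flags on the reported minimum and maximum can be illustrated with a small plain-Python model. It does not use Blt or the toolkit, and the function name vector_summary is hypothetical. Two points are assumptions rather than documented behaviour: integrate is modelled as a running total that includes the current element, and invert is applied before integrate when both flags are set.

```python
from itertools import accumulate


def vector_summary(name, values, invert=False, integrate=False):
    """Return (name, stored values, minimum, maximum) for a simple vector."""
    data = [float(v) for v in values]
    if invert:
        # Stored value is 1.0 minus the property data value.
        data = [1.0 - v for v in data]
    if integrate:
        # Assumption: running total including the current element.
        data = list(accumulate(data))
    return name, data, min(data), max(data)


property_data = [0.25, 0.5, 1.0]
print(vector_summary("v", property_data))
print(vector_summary("v", property_data, invert=True))
print(vector_summary("v", property_data, integrate=True))
```

With either flag set, the minimum and maximum describe the stored vector values, not the original property data, which is why they can differ from the values held by the ensemble.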

The command is not supported in the Python interface.

ens verify

ens verify ehandle property

e.verify(property)

Verify the values of the specified property on the ensemble. The property data must be valid, and of an ensemble or ensemble minor object property. If the data can be found, it is checked against all constraints defined for the property, and, if such a function has been defined, is tested with the value verification function of the property.

If all tests are passed, the command returns boolean 1. If the data could be found but fails the tests, it returns 0. In all other cases, an error results.

ens weed

ens weed ehandle keywords

e.weed(keywordsequence)

e.weed(?keyword?,...)

This command performs a number of common clean-up and standardization operations on the ensemble, which are especially useful in the context of processing PDB files. The ensemble is potentially modified, but keeps its handle or reference, which is returned as command result. In addition, properties A_XYZ and A_RESIDUE, which are normally susceptible to bond manipulations, are locked and retained.

The keywords argument selects the desired set of operations. Most of the keywords are single words, but the minsize and maxsize as well as the minaminoacids and maxaminoacids keywords take an additional integer number as argument. The following operations are currently supported:

carbonless
Remove all molecules/fragments which do not contain carbon.
disulphides
Split and hydrogenate all disulfide bridges. This operation can change the molecule and ring set.
duplicates
Remove all molecules/fragments which are duplicates (taking isotope labels and stereochemistry into account) of another molecule in the ensemble. Only a single instance of any duplicate molecule is retained. Internally, this is a check on property M_HASHISY .
hydrogenless
Remove all molecules which do not contain hydrogen.
inorganic
Remove all inorganic molecules.
ligands
Remove all molecules which do not consist exclusively of linked standard amino acids. This flag is complementary to proteins.
maxaminoacids n
Discard all molecules from the ensemble which consist only of linked standard amino acids and contain more than the specified number of them. This operation requires an additional integer after the keyword.
maxsize n
Discard all molecules from the ensemble which have more than the specified number of atoms. This operation requires an additional integer after the keyword.
metalatoms
Remove all metal atoms from the ensemble. This operation can change the molecule and ring set.
metalions
Remove all molecules which are unbonded metal atoms. Bonded metal atoms are not affected.
metaloxygenbonds
Remove all bonds between metal atoms and oxygen atoms. This operation can change the molecule and ring set.
minaminoacids n
Discard all molecules from the ensemble which consist only of linked standard amino acids and contain fewer than the specified number of them. This operation requires an additional integer after the keyword.
minsize n
Discard all molecules from the ensemble which have fewer than the specified number of atoms. This operation requires an additional integer after the keyword.
proteins
Discard all molecules which only consist of linked standard amino acids. This is a shortcut for maxaminoacids 0.
proteinhetatmbonds
Discard all bonds between the protein core and heterogens, i.e. all bonds where the property field A_RESIDUE(hetatom) is different among the involved bond atoms. This operation can change the molecule and ring set.
proteinspecialbonds
Discard all special bonds (i.e. complex bonds, link bonds, etc.) where at least one atom is from the protein, i.e. was encoded with an ATOM line in a PDB file, not HETATM. This operation can change the molecule and ring set.
specialbonds
Delete all bonds which are not VB bonds. This operation can change the molecule and ring set.
water
Discard water molecules, i.e. all molecules which consist of one oxygen atom, any number of hydrogen atoms, and no other element.

The order of the keywords is not important. The operations are always applied in this sequence:

metalatoms > specialbonds > proteinspecialbonds,proteinhetatmbonds > metaloxygenbonds > disulphides > carbonless,hydrogenless,inorganic,maxsize,metalions,minsize,water > maxaminoacids,minaminoacids > duplicates

Applied operations which potentially change the set of molecules and rings trigger an automatic re-evaluation of this data after the operation block has been executed.
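
The keyword handling described above (some keywords consume an integer, and execution order is fixed regardless of argument order) can be sketched in plain Python. This is only a model of the documented behaviour, not the toolkit's implementation. The names STAGES, TAKES_INTEGER and plan_weed are hypothetical. Keywords that share a position in the sequence are returned alphabetically, which is an arbitrary choice for this sketch. The keywords ligands and proteins are rejected here only because the sequence above does not name a position for them.

```python
# Fixed execution sequence as documented; keywords in one set share a position.
STAGES = [
    {"metalatoms"},
    {"specialbonds"},
    {"proteinspecialbonds", "proteinhetatmbonds"},
    {"metaloxygenbonds"},
    {"disulphides"},
    {"carbonless", "hydrogenless", "inorganic", "maxsize",
     "metalions", "minsize", "water"},
    {"maxaminoacids", "minaminoacids"},
    {"duplicates"},
]
# Keywords that must be followed by an integer argument.
TAKES_INTEGER = {"minsize", "maxsize", "minaminoacids", "maxaminoacids"}


def plan_weed(keywords):
    """Parse a flat keyword list and return (keyword, argument) pairs
    in the documented execution order."""
    tokens = list(keywords)
    requested = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        i += 1
        if key in TAKES_INTEGER:
            if i >= len(tokens):
                raise ValueError(f"{key} requires an integer argument")
            requested[key] = int(tokens[i])
            i += 1
        else:
            requested[key] = None
    plan = []
    for stage in STAGES:
        # Alphabetical order within a stage is an assumption of this sketch.
        for key in sorted(stage):
            if key in requested:
                plan.append((key, requested.pop(key)))
    if requested:
        raise ValueError(f"no documented position for: {sorted(requested)}")
    return plan


print(plan_weed("duplicates minsize 10 water metalatoms".split()))
```

The plan puts metalatoms first and duplicates last even though the argument list names them in the opposite order.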

Example:

The code below is part of a reliable PDB ligand extractor.

ens weed $eh {metaloxygenbonds water proteinspecialbonds duplicates minsize 10 \
	maxsize 300 maxaminoacids 6 disulphides}
if {[ens get $eh E_NATOMS]==0} {
	# Try again with an additional bond cut step. Cannot do this by default, because
	# there are plenty of ligands with embedded amino acid parts
	# that are encoded as ATOM lines. PDB files suck.
	molfile backspace $fh
	set eh [molfile read $fh]
	ens weed $eh {metaloxygenbonds water proteinspecialbonds proteinhetatmbonds \
		duplicates minsize 10 maxsize 300 maxaminoacids 6 disulphides}
}

ens xhandle

ens xhandle ehandle

Return the remote handle of the ensemble if it was exported and is currently under the control of a live-linked application. In case the ensemble is not exported, an error results.

This command is not supported in the Python interface.