General
What is Cactvs?
Cactvs is a general-purpose toolkit for chemical information
processing. Its special strengths are a very powerful scripting
environment with special Web support features, very good 2D structure
layout and rendering functions, a rich set of high-quality I/O
modules, extreme extensibility by means of external modules and
data definitions, and a powerful lazy computation and data validity
maintenance mechanism. By virtue of its extensibility, it can
be configured to deal with almost any type of data, in any format.
Is this Cactvs toolkit the same as Cactus from CactusCode?
No. Cactus (with a u) is an engineering/physics computational
toolkit without any relationship to our Cactvs (with a v) chemical
information processing toolkit. Unfortunately, both toolkits have
a long history reaching back more than a decade each, and a name
change is thus problematic for both sides. So please watch the
spelling...
How do I cite Cactvs?
Computation and Management of Chemical Properties in CACTVS:
An extensible Networked Approach toward Modularity and Flexibility
W. D. Ihlenfeldt, Y. Takahashi, H. Abe, S. Sasaki, J. Chem.
Inf. Comp. Sci. 34 (1994), 109-116
Installation
How do I install the standard academic or commercial scripting
toolkit versions on Unix/Linux?
First, download a suitable package from our server.
The normal distribution format is a gzip-compressed tarfile. Next,
make a temporary staging directory (this is not the final installation
directory), and unpack the package in this directory. Among the
unpacked contents of the package is an installation shell script
named installme. Run that script, and answer its questions,
for example concerning the installation directories. If you install
in /usr/local or similar standard locations, you will
probably need root permissions to succeed. Here is a representative
example command sequence:
su
mkdir stagedir
cd stagedir
gunzip < ../cactvstools-Linux2.6-SuSE11.0-64-3.353.tar.gz | tar xf -
./installme
cd ..
rm -rf stagedir
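Before unpacking, it can be useful to confirm that a downloaded package is intact and really contains the installme script. The following is a minimal sketch of such a check, not part of the Cactvs distribution; the function name has_installer is made up for illustration, and the check assumes that installme sits at the top level of the archive, as the command sequence above implies.

```shell
#!/bin/sh
# Sketch: verify that a gzip-compressed tar package lists "installme"
# at its top level, without extracting anything.
# has_installer is a hypothetical helper name, not a Cactvs tool.

# Usage: has_installer PACKAGE.tar.gz
# Returns 0 if the archive lists installme, non-zero otherwise.
has_installer() {
    package=$1
    # A missing or unreadable file cannot contain the installer.
    [ -f "$package" ] || return 1
    # List the archive contents from stdin and accept both
    # "installme" and "./installme" as the entry name.
    gunzip < "$package" 2>/dev/null \
        | tar tf - 2>/dev/null \
        | grep -Eq '^(\./)?installme$'
}

# Example (run from the staging directory):
#   has_installer ../cactvstools-Linux2.6-SuSE11.0-64-3.353.tar.gz \
#       && echo "package looks fine"
```

The exit status is that of grep, so a truncated or corrupt download that cannot be listed is treated the same way as a package without an installer.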
Why can't I have an rpm/deb/whatever package? Tar files are
so nineties...
Well, if you are a commercial customer, you can request these,
built to order for your specific Linux distribution and its
installation options. Non-paying users, please accept that
supporting many different package formats is a major hassle. Tar
files have the advantage of being the most portable format,
and they are easy to generate.
How do I uninstall the software on Unix/Linux?
Just remove the files. There is no specific uninstaller. The
bulk of the Cactvs components are all concentrated in a single
subdirectory (usually a subdirectory of /usr/local/lib)
which you can remove as a whole. Additionally, delete the cs*
wrapper scripts from the executable directory (usually /usr/local/bin),
and you are done.
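The manual removal described above can be sketched as a small shell function. This is an illustration only, not a script shipped with Cactvs: the function name remove_cactvs is made up, and both directory arguments are assumptions that must be replaced by the locations you chose when running installme. Because a cs* pattern may also match unrelated programs in the same directory, the sketch only prints what it would delete unless --force is given.

```shell
#!/bin/sh
# Sketch: remove a Cactvs installation by hand.
# remove_cactvs is a hypothetical helper name, not a Cactvs tool.

# Usage: remove_cactvs LIBDIR BINDIR [--force]
#   LIBDIR  the single Cactvs subdirectory (usually below /usr/local/lib)
#   BINDIR  the directory holding the cs* wrapper scripts
# Without --force, nothing is deleted; candidates are only listed.
remove_cactvs() {
    libdir=$1
    bindir=$2
    mode=${3:-}
    if [ -z "$libdir" ] || [ -z "$bindir" ]; then
        echo "usage: remove_cactvs LIBDIR BINDIR [--force]" >&2
        return 2
    fi
    # Visit the library directory and every cs* entry in BINDIR.
    for target in "$libdir" "$bindir"/cs*; do
        # Skip paths that do not exist, including an unmatched cs* pattern.
        [ -e "$target" ] || continue
        if [ "$mode" = "--force" ]; then
            rm -rf -- "$target"
        else
            echo "would remove: $target"
        fi
    done
}

# Example: review the list first, then repeat with --force.
#   remove_cactvs /usr/local/lib/YOUR_CACTVS_DIR /usr/local/bin
```

Reviewing the dry-run output first is the point of the design: any cs* file in the list that does not belong to Cactvs should be moved aside or the wrappers deleted individually instead.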
How do I install the standard academic or commercial scripting
toolkit versions on Windows?
The Windows package is a standard executable installer. Just
download the package and execute it.
How do I uninstall the software on Windows?
Use the normal Windows software manager tool. The package is
registered in the standard software list. Depending on your Windows
set-up, there may be a lot of questions asked when you uninstall,
specifically whether you really want to remove individual font
and dll files. This is due to Cactvs not installing these files
in the system directories, which seems to confuse Windows to a
certain extent. Just park the mouse over the OK button and put
something heavy on the Enter key if you see these questions popping
up.
Implementation
I heard that Cactvs is implemented in Tcl. That's weird.
And it is wrong. The core system is implemented in efficient
and very portable ANSI C. On top of the core, there is a scripting
language layer in some packages. Currently, we support Tcl as
scripting language, but in the future we may provide other interfaces
such as Python. Specialized libraries provided for linking with
other systems may or may not contain a script interpreter. A customized
package with selected, essential functionalities can be provided
as a single compiled C library, without any dependency on a script
interpreter.
Features
Can Cactvs handle reactions?
Yes, we have full reaction support, including reaction properties
and reaction queries. Reaction transformations are possible by
means of advanced SMIRKS transform capabilities. Reaction data
can be read and written in MDL RXN, RDF, Reaction SMILES, the CDX/CDXML
and SKC/TGF native molecule editor formats, and of course the native
Cactvs formats (CBIN, CBS, BDB). We also support reaction depictions
in pixel and vector formats.
Can Cactvs handle 3D molecules?
Yes. The only frequently requested component missing is 3D rendering,
though even that is not entirely correct: we support 3D output
as POV-Ray renderer data files and VRML. Cactvs can handle 3D atomic
coordinates including multiple conformers, and that even for non-element
atoms such as Gaussian Bqs. It also supports ISIS-compatible
3D queries, and some limited 3D structure manipulation such as
bond rotations. 3D atomic coordinates can be computed for example
by an integrated Corina 3D generator module (to be licensed separately
from Molecular Networks), or by automatic submission of requests
to the chembiogrid.org Web service. On the input front, Cactvs has a
rather nice PDB I/O module with full connectivity reconstruction, which
measurably outperforms the supposedly best previous reader, described in
J. Chem. Inf. Model. 2007, 47, 1379-1385,
on the same test data set. Of course, plenty of other 3D file formats
are also recognized, read and written.
The InChI or SMILES string computed by some other package does
not agree.
Well, in all likelihood the Cactvs version is the correct one,
especially if stereochemistry is involved. There were (and are)
severely broken implementations of these properties in open-source
toolkits that have vocal advocates. In the case of SMILES,
you also need to be aware that there are several canonicalization
algorithms. Cactvs implements the algorithm as originally and
incompletely published by Daylight. Unfortunately, this is no
longer the method used by Daylight in current releases, or by
the OpenEye toolkit, which strives to be compatible. The official
Daylight canonical SMILES algorithm is now proprietary and undocumented.
Some older ChemDraw releases have bugs in R/S stereo descriptor
generation which were detected by comparison with Cactvs results.
However, this does not mean that our software is flawless. Such
software does not exist, and if you find a reproducible and understandable
problem, we want to hear about it.
Licensing
I am in academia. Do I need a license for the software?
No. The software is free for all academic and educational
use. However, this does not guarantee that we will support you
in your project. In the absence of a support agreement, support
is only given on a case-by-case basis.
I am at a government institution, or a charity, but not at
a university. Do I need a license for the software?
Generally, yes, though we may give you a discount or even
make software available without charge for public, open-access
projects. Support must be paid for, though.
I have written an interesting application script. May I redistribute
it?
Yes, of course, and you may even charge for it if you can
find a taker. Any application scripts developed with the software
are legally unencumbered and property of the developer (or her
institution). However, you must not redistribute our software
which executes your application script, and you cannot set up
a Cactvs-driven Web site which generates income without a license.
PubChem
Interesting. I did not know that PubChem is built with Cactvs.
I heard it was implemented with OpenEye's OEChem library.
Well, that is not wrong, but only part of the story. Both Cactvs
and OpenEye software are used, in combination with the in-house
NCBI toolkit. Software built with Cactvs is in charge of structure
submission pre-parsing, structure input, structure ASN.1 encoding,
2D layout coordinate computation, structure rendering, most property
computations, structure unification, the complete structure search
system and Web-based structure sketching for queries. The OpenEye
toolkit is used for structure standardization, some property computation
(names, canonical SMILES), and SD file output. There are applications
at NCBI which link to both OpenEye and Xemistry libraries; they
can peacefully co-exist.
Are developments for PubChem US government work in the public
domain?
Yes, they are, and you can download source code and script code
for specific PubChem developments from the PubChem Web site. However,
this does not mean that the complete Cactvs system is available.
You may or may not learn something from examining the source code
fragments from NCBI, but it will not be sufficient to give you a
working system.
Can Cactvs handle the ASN.1 structure data files of PubChem?
Yes, it can, and to our knowledge it is the only external toolkit
which currently has this ability. Using the ASN.1 forms of PubChem
records is preferable to the SD files, because the latter are
only approximations of the internal PubChem data structure. Cactvs
can represent and handle every bit of the PubChem data. If you
still prefer using PubChem SD files, you are probably excited
to learn that the Cactvs SDF I/O module understands all NCBI extensions
for special bond types and annotations, etc.