This module contains the core classes of NuriKit. The core module is not
very useful by itself, but is a dependency of many other modules. Chemical
data structures, such as elements, isotopes, and molecules, and also the
graph structure and algorithms, are defined in this module.
The returned atom is invalidated when a mutator context is exited. If the atom
must be kept alive, copy the atom data first with the Atom.copy_data()
method.
The returned bond is invalidated when a mutator context is exited. If the bond
must be kept alive, copy the bond data first with the Bond.copy_data()
method.
The returned bond may not satisfy bond.src.id == src and
bond.dst.id == dst, as the source and destination atoms of the bond may be
swapped.
Note
The returned bond is invalidated when a mutator context is exited. If the bond
must be kept alive, copy the bond data first with the Bond.copy_data()
method.
The returned bond may not satisfy bond.src.id == src.id and
bond.dst.id == dst.id, as the source and destination atoms of the bond
may be swapped.
Note
The returned bond is invalidated when a mutator context is exited. If the bond
must be kept alive, copy the bond data first with the Bond.copy_data()
method.
Get a mutator for the molecule. Use this as a context manager to make changes to
the molecule.
Note
The mutator will invalidate all atom and bond objects when the context is
exited, whether or not any changes are made. If the objects must be kept
alive, copy the data first with the Atom.copy_data() and
Bond.copy_data() methods.
Note
Successive calls to this method will raise an exception if the previous
mutator is not finalized.
add_conf(self: nuri.core._core.Molecule, coords: object) -> int
Add a conformation to the molecule at the end.
Parameters:
coords – The coordinates of the atoms in the conformation. Must be
convertible to a numpy array of shape (num_atoms, 3).
Returns:
The index of the added conformation.
add_conf(self: nuri.core._core.Molecule, coords: object, conf: int) -> int
Add a conformation to the molecule.
Parameters:
coords – The coordinates of the atoms in the conformation. Must be
convertible to a numpy array of shape (num_atoms, 3).
conf – The index at which to insert the new conformation. If
negative, counts from back to front (i.e., the new conformer
will be created at max(0, num_confs() + conf)). Otherwise, the
coordinates are added at min(conf, num_confs()). This resembles
the behavior of Python’s list.insert() method.
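The index rule for conf can be sketched in plain Python. The helper names below (resolve_insert_index, insert_like_list) are hypothetical and are not part of NuriKit; the sketch only mirrors the clamping described above and checks it against list.insert().

```python
def resolve_insert_index(idx, size):
    """Clamp an insertion index the same way list.insert() does."""
    if idx < 0:
        # Negative indices count from the back, but never go below 0.
        return max(0, size + idx)
    # Non-negative indices never go past the current end.
    return min(idx, size)


def insert_like_list(items, idx, value):
    """Return a copy of `items` with `value` inserted at the clamped index."""
    pos = resolve_insert_index(idx, len(items))
    return items[:pos] + [value] + items[pos:]


# Stand-in for a molecule that already has three conformations.
confs = ["conf0", "conf1", "conf2"]
for idx in (-10, -1, 0, 2, 3, 99):
    expected = list(confs)
    expected.insert(idx, "new")
    assert insert_like_list(confs, idx, "new") == expected
```

Under this rule an out-of-range index never raises; it is silently clamped to the nearest valid position.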
method – The charge assignment method. See below for the possible charge
assignment methods. Defaults to "gasteiger".
Raises:
RuntimeError – If the charge assignment method fails.
ValueError – If the charge assignment method is not supported.
Supported methods:
"gasteiger": Assigns Marsili-Gasteiger charges, as described in the
original paper[1].
The Gasteiger algorithm requires initial “seed” charges to be assigned to
atoms. In this implementation, the initial charges are assigned from the
(localized) formal charges of the atoms, then a charge delocalization
algorithm is applied to the terminal atoms of a conjugated system with the
same Gasteiger type (e.g., oxygens of a carboxylate group will be assigned
-0.5 charge each).
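The carboxylate case can be illustrated with a small sketch. It assumes that delocalization amounts to averaging the formal charges over a group of equivalent terminal atoms; the source only states the resulting -0.5 per oxygen, so the averaging rule, the function name, and the atom numbering below are assumptions for illustration and not NuriKit's implementation.

```python
from typing import Dict, List


def delocalize_seed_charges(formal_charges: Dict[int, float],
                            equivalent_groups: List[List[int]]) -> Dict[int, float]:
    """Average formal charges within each group of equivalent terminal atoms.

    Assumption: equal sharing of the localized charge among the group members.
    """
    seeds = dict(formal_charges)
    for group in equivalent_groups:
        mean = sum(formal_charges[i] for i in group) / len(group)
        for i in group:
            seeds[i] = mean
    return seeds


# Acetate heavy atoms (hypothetical numbering): 0 = CH3, 1 = C, 2 = O, 3 = O(-)
formal = {0: 0.0, 1: 0.0, 2: 0.0, 3: -1.0}
seeds = delocalize_seed_charges(formal, [[2, 3]])
assert seeds[2] == seeds[3] == -0.5
```

The total charge is conserved by construction, since each group only redistributes its own sum.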
Convert trivial explicit hydrogen atoms of the molecule to implicit hydrogens.
Trivial explicit hydrogen atoms are the hydrogen atoms that are connected to
only one heavy atom with a single bond and have no other neighbors (including
implicit hydrogens).
Get an iterable object of all conformations of the molecule. Each conformation
is a 2D array of shape (num_atoms, 3). The coordinates cannot be updated
through the returned conformers; to update the coordinates, you must assign
the new values to the conformers explicitly.
Convert implicit hydrogen atoms of the molecule to explicit hydrogens.
Parameters:
update_confs – If True, the conformations of the molecule will be
updated to include the newly added hydrogens. When set to False, the
coordinates of the added hydrogens will have garbage values. Defaults to True.
optimize – If True, the conformations will be optimized after adding
hydrogens. Defaults to True. This parameter is ignored if update_confs is
False.
Raises:
ValueError – If the hydrogens cannot be added. This can only happen if
update_confs is True and the molecule has at least one conformation.
The sanitization is done in the order of conjugation, aromaticity,
hybridization, and valence. If any sanitization step fails, the subsequent
steps will not be attempted.
Note
The sanitization is done in place. The state of the molecule will be mutated
even if the sanitization fails.
Note
If any of the other three sanitization steps is requested, conjugation will be
automatically turned on.
Warning
This interface is experimental and may change in the future.
atoms (Iterable[Atom | int]) – The atoms to include in the
substructure.
bonds (Iterable[Bond | int]) – The bonds to include in the
substructure.
cat – The category of the substructure.
This has three modes of operation:
If both atoms and bonds are given, a substructure is created with
the given atoms and bonds. The atoms connected by the bonds will also be
added to the substructure, even if they are not in the atoms list.
If only atoms are given, a substructure is created with the given atoms.
All bonds between the atoms will also be added to the substructure.
If only bonds are given, a substructure is created with the given bonds.
The atoms connected by the bonds will also be added to the substructure.
If neither atoms nor bonds are given, an empty substructure is
created.
Tip
Pass an empty list to bonds to create an atoms-only substructure.
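The selection rules above can be sketched with plain sets. This is an illustration under stated assumptions, not NuriKit code: the function resolve_substructure and the representation of a molecule as a dict mapping bond index to an atom-index pair are hypothetical.

```python
def resolve_substructure(mol_bonds, atoms=None, bonds=None):
    """Return (atom_ids, bond_ids) selected by the rules described above.

    mol_bonds maps a bond index to its (src, dst) atom indices.
    """
    atom_set = set(atoms or [])
    if bonds is None:
        # Atoms only (or neither): take every bond between the chosen atoms.
        bond_set = {b for b, (src, dst) in mol_bonds.items()
                    if src in atom_set and dst in atom_set}
    else:
        # Bonds given (possibly empty): keep exactly these bonds and
        # pull in their end atoms.
        bond_set = set(bonds)
        for b in bond_set:
            atom_set.update(mol_bonds[b])
    return sorted(atom_set), sorted(bond_set)


# A three-atom chain 0-1-2 with bond 0 = (0, 1) and bond 1 = (1, 2).
chain = {0: (0, 1), 1: (1, 2)}

assert resolve_substructure(chain, atoms=[0, 1]) == ([0, 1], [0])
assert resolve_substructure(chain, atoms=[0, 1], bonds=[]) == ([0, 1], [])
assert resolve_substructure(chain, atoms=[0], bonds=[1]) == ([0, 1, 2], [1])
assert resolve_substructure(chain, bonds=[0]) == ([0, 1], [0])
assert resolve_substructure(chain) == ([], [])
```

The second assertion corresponds to the tip: an explicitly empty bond list suppresses the automatic bond lookup.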
This is a proxy object to the AtomData of the atom in a molecule. The
proxy object is invalidated when any changes are made to the molecule. If the
underlying data must be kept alive, copy the data first with the copy_data()
method.
We only document the differences from the original class. Refer to the
AtomData class for common properties and methods.
Note
Unlike the underlying data object, the atom cannot be created
directly. Use the Mutator.add_atom() method to add an atom to a
molecule.
Count the atoms connected to the atom. Includes both explicit and implicit
neighbors.
Note
This is not the same as len(atom). The length of the atom is the number of
explicit neighbors, i.e., the neighbors of the atom that can be iterated.
Implicit hydrogens cannot be iterated and are thus not counted in the length.
This is a proxy object to the BondData of the bond in a molecule. The
proxy object is invalidated when any changes are made to the molecule. If the
underlying data must be kept alive, copy the data first with the copy_data()
method.
We only document the differences from the original class. Refer to the
BondData class for common properties and methods.
Note
Unlike the underlying data object, the bond cannot be created
directly. Use the Mutator.add_bond() method to add a bond to a molecule.
Rotate the bond by the given angle. The components connected only to the
destination atom (excluding this bond) are rotated around the bond axis.
Rotation is done in the direction of the right-hand rule, i.e., the rotation is
counter-clockwise with respect to the src -> dst vector.
Parameters:
angle – The angle to rotate the bond by, in degrees.
rotate_src – If True, the source atom side is rotated instead.
strict – If True, rotation will fail for multiple bonds and conjugated
bonds. If False, the rotation will be attempted regardless.
conf – The index of the conformation to rotate the bond in. If not given,
all conformations are rotated.
Raises:
ValueError – If the bond is not rotatable. If strict is False, it
will be raised only if the bond is a member of a ring, as it will be
impossible to rotate the bond without breaking the ring.
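The right-hand-rule convention can be made concrete with Rodrigues' rotation formula. The sketch below rotates one point about the src -> dst axis; it is a generic geometric illustration with a hypothetical function name, not the routine NuriKit uses internally.

```python
import math


def rotate_about_axis(point, src, dst, angle_deg):
    """Rotate `point` about the src -> dst axis by `angle_deg` degrees.

    Uses Rodrigues' formula; a positive angle follows the right-hand rule.
    """
    axis = [d - s for s, d in zip(src, dst)]
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                  # unit axis vector
    v = [p - s for p, s in zip(point, src)]       # point relative to src
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1 - cos_t)
               for i in range(3)]
    return [r + s for r, s in zip(rotated, src)]


# Axis along +z: a 90 degree rotation moves +x onto +y.
moved = rotate_about_axis((1.0, 0.0, 0.5), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 90.0)
assert all(abs(a - b) < 1e-12 for a, b in zip(moved, (0.0, 1.0, 0.5)))
```

With the axis pointing toward the viewer, +x moving onto +y is a counter-clockwise turn, which matches the convention described above.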
atoms (Iterable[Atom | int]) – The atoms to add to the
substructure. The atoms must belong to the same molecule as the substructure.
All duplicate atoms are ignored.
add_bonds (bool) – If True, the bonds between the added atoms are also added
to the substructure. If False, the bonds are not added.
Due to the implementation, it is much faster to add atoms in bulk than adding
them one by one. Thus, we explicitly provide only the bulk addition method.
Add bonds to the substructure. If any atom of the bond does not belong to the
substructure, the atom is also added to the substructure.
Parameters:
bonds (Iterable[Bond | int]) – The bonds to add to the
substructure. The bonds must belong to the same molecule as the substructure.
All duplicate bonds are ignored.
Due to the implementation, it is much faster to add bonds in bulk than adding
them one by one. Thus, we explicitly provide only the bulk addition method.
The returned atom is invalidated when the parent molecule is modified, or if
the substructure is modified. If the atom must be kept alive, copy the atom
data first.
The returned bond is invalidated when the parent molecule is modified, or if
the substructure is modified. If the bond must be kept alive, copy the bond
data first.
The returned bond may not satisfy bond.src.id == src.id and
bond.dst.id == dst.id, as the source and destination atoms of the bond may
be swapped.
Note
The returned bond is invalidated when the parent molecule is modified, or if
the substructure is modified. If the bond must be kept alive, copy the bond
data first.
ValueError – If the bond does not exist, the source or destination
atom does not belong to the substructure, or any of the atoms does not belong
to the same molecule.
The source and destination atoms of the bond may be swapped.
Note
The returned bond is invalidated when the parent molecule is modified, or if
the substructure is modified. If the bond must be kept alive, copy the bond
data first.
Convert trivial explicit hydrogen atoms of the substructure to implicit
hydrogens.
Trivial explicit hydrogen atoms are the hydrogen atoms that are connected to
only one heavy atom with a single bond and have no other neighbors (including
implicit hydrogens).
Get an iterable object of all conformations of the substructure. Each
conformation is a 2D array of shape (num_atoms, 3). The coordinates cannot be
updated through the returned conformers; to update the coordinates, you must
assign the new values to the conformers explicitly.
ValueError – If the underlying bond does not exist, the source or
destination atom does not belong to the substructure, or any of the atoms does
not belong to the same molecule.
Refresh the bonds of the substructure. All bonds between the atoms of the
substructure are removed, and new bonds are added based on the parent molecule.
This represents a substructure managed by a molecule. To create a short-lived
substructure not managed by a molecule, use the Molecule.substructure() method
instead.
The substructure is invalidated when the parent molecule is modified or any
substructure is removed from the parent molecule. If the substructure must be
kept alive, convert it first with the copy() method.
Explicit chirality of the atom. Note that this does not imply the atom is a
stereocenter chemically and might not correspond to the geometry of the
molecule. See Chirality for formal definition.
A dictionary-like object to store additional properties of the atom. The keys
and values are both strings.
Note
The properties are shared with the underlying AtomData object. If the
properties are modified, the underlying object is also modified.
As a result, the property map is also invalidated when any changes are made
to the molecule. If the properties must be kept alive, copy the properties
first with the copy() method.
The explicit configuration of the bond. Note that this does not imply the bond
is a torsionally restricted bond chemically.
Note
For bonds with more than 3 neighboring atoms, BondConfig.Cis or
BondConfig.Trans configurations are not well defined terms. In such
cases, this will return whether the first neighbors are on the same side of
the bond. For example, in the following structure (assuming the neighbors
are ordered in the same way as the atoms), the bond between atoms 0 and 1 is
considered to be in a cis configuration (first neighbors are marked with angle
brackets):
<2>     <4>
   \   /
    0=1
   /   \
  3     5
On the other hand, when the neighbors are ordered in the opposite way, the
bond between atoms 0 and 1 is considered to be in a trans configuration:
<2>     5
   \   /
    0=1
   /   \
  3     <4>
Tip
Assigning None clears the explicit bond configuration.
A dictionary-like object to store additional properties of the bond. The keys
and values are both strings.
Note
The properties are shared with the underlying BondData object. If the
properties are modified, the underlying object is also modified.
As a result, the property map is invalidated when any changes are made to the
molecule. If the properties must be kept alive, copy the properties first with
the copy() method.
Explicit chirality of the atom. Note that this does not imply the atom is a
stereocenter chemically and might not correspond to the geometry of the
molecule. See Chirality for formal definition.
The explicit configuration of the bond. Note that this does not imply the bond
is a torsionally restricted bond chemically.
Note
For bonds with more than 3 neighboring atoms, BondConfig.Cis or
BondConfig.Trans configurations are not well defined terms. In such
cases, this will return whether the first neighbors are on the same side of
the bond. For example, in the following structure (assuming the neighbors
are ordered in the same way as the atoms), the bond between atoms 0 and 1 is
considered to be in a cis configuration (first neighbors are marked with angle
brackets):
<2>     <4>
   \   /
    0=1
   /   \
  3     5
On the other hand, when the neighbors are ordered in the opposite way, the
bond between atoms 0 and 1 is considered to be in a trans configuration:
<2>     5
   \   /
    0=1
   /   \
  3     <4>
Tip
Assigning None clears the explicit bond configuration.
When viewed from the first neighboring atom of a “chiral” atom, the chirality
is determined by the spatial arrangement of the remaining neighbors. That is,
when the remaining neighbors are arranged in a clockwise direction, the
chirality is “clockwise” (CW), and when they are arranged in a
counter-clockwise direction, the chirality is “counter-clockwise” (CCW).
If the atom is not a stereocenter or the chirality is unspecified, the chirality
is “unknown” (Unknown).
If the atom has an implicit hydrogen, it will always be placed at the end of the
neighbor list. This is to ensure that the chirality of the atom is not affected
by adding back the implicit hydrogen (which will be placed at the end).
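For a center with four neighbors and known 3D coordinates, the viewing rule above reduces to the sign of a scalar triple product. The sketch below is a geometric illustration only; the function name and the returned strings are hypothetical and do not correspond to NuriKit's API.

```python
def chirality_from_coords(first, others):
    """Classify three remaining neighbors as seen from the first neighbor.

    `first` is the position of the first neighbor (the viewpoint) and `others`
    holds the positions of the remaining three neighbors in list order.
    """
    a, b, c = ([o[i] - first[i] for i in range(3)] for o in others)
    b_cross_c = [b[1] * c[2] - b[2] * c[1],
                 b[2] * c[0] - b[0] * c[2],
                 b[0] * c[1] - b[1] * c[0]]
    volume = sum(a[i] * b_cross_c[i] for i in range(3))
    if volume > 0:
        return "CW"
    if volume < 0:
        return "CCW"
    return "Unknown"  # coplanar or degenerate geometry


# Viewpoint on +z looking down at three neighbors placed at 0, 120, 240 degrees,
# i.e. counter-clockwise when seen from above.
top = (0.0, 0.0, 1.0)
ring = [(1.0, 0.0, -0.33), (-0.5, 0.866, -0.33), (-0.5, -0.866, -0.33)]
assert chirality_from_coords(top, ring) == "CCW"
# Swapping any two of the remaining neighbors reverses the sense.
assert chirality_from_coords(top, [ring[0], ring[2], ring[1]]) == "CW"
```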
Note
It is worth noting that this chirality definition (“NuriKit Chirality”) is not
strictly equivalent to the chirality definition in SMILES (“SMILES
Chirality”), although it appears to be similar and often resolves to the same
chirality.
One notable difference is that most SMILES parser implementations place the
implicit hydrogen where it appears in the SMILES string. [2]
For example, consider the stereocenter in the following SMILES string:
[C@@H](F)(Cl)Br
The SMILES Chirality of the atom is “clockwise” because the implicit hydrogen
is interpreted as the first neighbor. On the other hand, the NuriKit Chirality
of the atom is “counter-clockwise” because the implicit hydrogen is
interpreted as the last neighbor.
This is not a problem in most cases, because when the stereocenter is not the
first atom of a fragment, the SMILES Chirality and the NuriKit Chirality are
consistent. For example, a slightly modified SMILES string of the above
example will result in a “counter-clockwise” configuration in both
definitions:
F[C@H](Cl)Br
Another neighbor ordering inconsistency might occur when ring closure is
involved. This is because a ring-closing bond addition could only be done
after the partner atom is added, but the SMILES Chirality is resolved in the
order of the appearance of the bonds in the SMILES string. For example,
consider the following SMILES string, in which the two stereocenters are both
“clockwise” in terms of the SMILES Chirality (atoms are numbered for
reference):
1 2  3  4 5     6 7
C[C@@H]1C[C@@]1(F)C
The NuriKit Chirality of atom 2 is “counter-clockwise” because the order of
the neighbors is 1, 3, 5, 4 in the SMILES Chirality (atom 5 precedes atom 4
because the ring-closing bond between atoms 2 and 5 appears before the bond
between atoms 2 and 4), but 1, 3, 4, 5 in the NuriKit Chirality (atom 4
precedes atom 5 because the ring-closing bond is added after the bond
between atoms 2 and 4).
On the other hand, the NuriKit Chirality of atom 5 is “clockwise” because the
order of the neighbors is 4, 2, 6, 7 in both definitions. Unlike the other
stereocenter, the partner of the ring-closing bond (atom 2) is already added,
and the ring-closing bond can now be added where it appears in the SMILES
string.
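The three cases above follow from one rule: reordering the four neighbors of a tetrahedral center by an odd permutation flips the chirality label, while an even permutation preserves it. The sketch below checks the examples with that rule; the helper names are hypothetical, and the neighbor orders are the ones stated in the text.

```python
def permutation_is_odd(order_a, order_b):
    """Return True if turning order_a into order_b takes an odd number of swaps."""
    position = {atom: i for i, atom in enumerate(order_b)}
    perm = [position[atom] for atom in order_a]
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return inversions % 2 == 1


def relabel(chirality, order_a, order_b):
    """Translate a CW/CCW label from neighbor order_a to neighbor order_b."""
    if permutation_is_odd(order_a, order_b):
        return "CCW" if chirality == "CW" else "CW"
    return chirality


# [C@@H](F)(Cl)Br: the hydrogen moves from the front to the back (odd).
assert relabel("CW", ["H", "F", "Cl", "Br"], ["F", "Cl", "Br", "H"]) == "CCW"
# F[C@H](Cl)Br: the hydrogen moves back by two places (even).
assert relabel("CCW", ["F", "H", "Cl", "Br"], ["F", "Cl", "Br", "H"]) == "CCW"
# Ring closure on atom 2: atoms 5 and 4 trade places (odd).
assert relabel("CW", [1, 3, 5, 4], [1, 3, 4, 5]) == "CCW"
# Atom 5: same order in both definitions.
assert relabel("CW", [4, 2, 6, 7], [4, 2, 6, 7]) == "CW"
```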
All instances of this class are immutable singletons. To check whether two
instances are the same, just use the is operator. You can also compare
two elements using the comparison operators, which in turn compare their
atomic_number (added for convenience).
All instances of this class are immutable singletons. To check whether two
instances are the same, just use the is operator. You can also compare
two elements using the comparison operators, which in turn compare their
mass_number (added for convenience).
Refer to the nuri::Element class in the C++ API Reference for more details.
The periodic table is a singleton object. You can access the periodic table via
the nuri.periodic_table attribute, or the factory static method
PeriodicTable.get(). Both of them refer to the same object. Note that a
PeriodicTable object is not constructible from the Python side.
You can access the periodic table as a dictionary-like object. The keys are
atomic numbers, atomic symbols, and atomic names, tried in this order. The
returned values are Element objects.
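The lookup order can be illustrated with a small stand-in class. MiniPeriodicTable, its dict-based element records, and the capitalized element name are hypothetical and exist only for this sketch; they do not reproduce NuriKit's PeriodicTable or Element classes.

```python
class MiniPeriodicTable:
    """Hypothetical stand-in that tries number, then symbol, then name."""

    def __init__(self, elements):
        self._by_number = {e["atomic_number"]: e for e in elements}
        self._by_symbol = {e["symbol"]: e for e in elements}
        self._by_name = {e["name"]: e for e in elements}

    def __getitem__(self, key):
        # Try each index in the documented order and return the first hit.
        for index in (self._by_number, self._by_symbol, self._by_name):
            if key in index:
                return index[key]
        raise KeyError(key)

    def __contains__(self, key):
        return any(key in index for index in
                   (self._by_number, self._by_symbol, self._by_name))


table = MiniPeriodicTable([
    {"atomic_number": 6, "symbol": "C", "name": "Carbon"},
    {"atomic_number": 8, "symbol": "O", "name": "Oxygen"},
])

# All three key types resolve to the same object.
assert table[6] is table["C"] is table["Carbon"]
assert "O" in table and "Xx" not in table
```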
Add a substructure to the collection and return it.
Parameters:
atoms (Iterable[Atom]) – The atoms to include in the
substructure.
bonds (Iterable[Bond]) – The bonds to include in the
substructure.
cat – The category of the substructure.
Returns:
The newly added substructure.
This has three modes of operation:
If both atoms and bonds are given, a substructure is created with
the given atoms and bonds. The atoms connected by the bonds will also be
added to the substructure, even if they are not in the atoms list.
If only atoms are given, a substructure is created with the given atoms.
All bonds between the atoms will also be added to the substructure.
If only bonds are given, a substructure is created with the given bonds.
The atoms connected by the bonds will also be added to the substructure.
If neither atoms nor bonds are given, an empty substructure is
created.
Tip
Pass an empty list to bonds to create an atoms-only substructure.
Add a substructure to the collection at the given index and return it.
Effectively invalidates all currently existing substructures.
Parameters:
idx – The index of the new substructure. If negative, counts from back to
front (i.e., the new substructure will be created at
max(0, len(subs) + idx)). Otherwise, the substructure is added at
min(idx, len(subs)). This resembles the behavior of Python’s
list.insert() method.
atoms (Iterable[Atom]) – The atoms to include in the
substructure.
bonds (Iterable[Bond]) – The bonds to include in the
substructure.
cat – The category of the substructure.
Returns:
The newly added substructure.
This has three modes of operation:
If both atoms and bonds are given, a substructure is created with
the given atoms and bonds. The atoms connected by the bonds will also be
added to the substructure, even if they are not in the atoms list.
If only atoms are given, a substructure is created with the given atoms.
All bonds between the atoms will also be added to the substructure.
If only bonds are given, a substructure is created with the given bonds.
The atoms connected by the bonds will also be added to the substructure.
If neither atoms nor bonds are given, an empty substructure is
created.
Tip
Pass an empty list to bonds to create an atoms-only substructure.
Add a substructure to the collection at the given index. Effectively invalidates
all currently existing substructures.
Parameters:
idx – The index of the new substructure. If negative, counts from back to
front (i.e., the new substructure will be created at
max(0, len(subs) + idx)). Otherwise, the substructure is added at
min(idx, len(subs)). This resembles the behavior of Python’s
list.insert() method.
Find a 4x4 best-fit rigid-body transformation tensor, to align query to
template.
Parameters:
query – The query points. Must be representable as a 2D numpy array of
shape (N, 3).
template – The template points. Must be representable as a 2D numpy array
of shape (N, 3).
method – The alignment method to use. Defaults to "qcp". Currently
supported methods are:
"qcp": The Quaternion Characteristic Polynomial (QCP) method, based on
the implementation of Liu and Theobald [3][4][5]. Unlike the original
implementation, this version can also handle reflection based on the
observations of Coutsias, Seok, and Dill [6].
"kabsch": The Kabsch algorithm [7][8]. This implementation is based on
the implementation in TM-align [9].
reflection – Whether to allow reflection in the alignment. Defaults to
False.
Returns:
A tuple of the transformation tensor and the RMSD of the alignment.
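To show what a 4x4 rigid-body transformation and the reported RMSD mean, the sketch below applies a known transform to a few points and measures the residual. It assumes the common homogeneous-coordinate convention (rotation in the upper-left 3x3 block, translation in the last column, acting on column vectors); the source does not state the convention, and the helper names are hypothetical. The fitting step itself (QCP or Kabsch) is not reproduced here.

```python
import math


def apply_transform(xform, points):
    """Apply a 4x4 homogeneous transform (nested lists) to (N, 3) points."""
    return [[row[0] * x + row[1] * y + row[2] * z + row[3] for row in xform[:3]]
            for x, y, z in points]


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized point lists."""
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))


# Rotate 90 degrees about z, then translate by (1, 2, 3).
xform = [[0.0, -1.0, 0.0, 1.0],
         [1.0, 0.0, 0.0, 2.0],
         [0.0, 0.0, 1.0, 3.0],
         [0.0, 0.0, 0.0, 1.0]]

query = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
template = [(1.0, 3.0, 3.0), (0.0, 2.0, 3.0), (1.0, 2.0, 4.0)]

# A perfect alignment leaves no residual between the moved query and template.
assert rmsd(apply_transform(xform, query), template) < 1e-12
```

Before alignment the same two point sets differ by a non-zero RMSD, which is the quantity the fit minimizes.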
Calculate the RMSD of the best-fit rigid-body alignment of query to
template.
Parameters:
query – The query points. Must be representable as a 2D numpy array of
shape (N, 3).
template – The template points. Must be representable as a 2D numpy array
of shape (N, 3).
method – The alignment method to use. Defaults to "qcp". Currently
supported methods are:
"qcp": The Quaternion Characteristic Polynomial (QCP) method, based on
the implementation of Liu and Theobald [3][4][5]. Unlike the original
implementation, this version can also handle reflection based on the
observations of Coutsias, Seok, and Dill [6].
"kabsch": The Kabsch algorithm [7][8]. This implementation is based on
the implementation in TM-align [9].
reflection – Whether to allow reflection in the alignment. Defaults to
False.