nuri.tools
TM-tools
from nuri.tools import tm as tmtools
This module provides a ground-up reimplementation of the TM-align algorithm, based on the original TM-align code (version 20220412) by Yang Zhang. The implementation aims to reproduce the results of the original code while providing an improved user interface and better maintainability. Refer to the following paper for details of the algorithm. [1]
All input structures must have only a single atom per residue (usually the CA atom), as the original TM-align algorithm assumes this.
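Because every input must carry one atom per residue, a common preprocessing step is to extract the CA coordinates from a structure file. The following is a minimal sketch, assuming PDB-format text as input; ca_coords_from_pdb is a hypothetical helper and not part of nuri. It keeps only ATOM records whose atom name is CA and skips alternate locations other than blank or A.

```python
def ca_coords_from_pdb(pdb_text):
    """Return one (x, y, z) tuple per CA atom found in PDB-format text."""
    coords = []
    for line in pdb_text.splitlines():
        # HETATM records are skipped, so calcium ions (also named CA) are ignored.
        if not line.startswith("ATOM"):
            continue
        # Fixed-width PDB columns: atom name 13-16, altLoc 17, x/y/z 31-54.
        if line[12:16].strip() != "CA":
            continue
        if line[16] not in (" ", "A"):
            continue  # keep only the first alternate location
        coords.append(
            (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        )
    return coords
```

The returned list of N triples can be converted to an array of shape (N, 3), which is the form the query and templ arguments are documented to accept.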
- class nuri.tools.tm.TMAlign
- __init__(self: TMAlign, query: object, templ: object, query_ss: str | None = None, templ_ss: str | None = None, *, gapless: bool = True, sec_str: bool = True, local_sup: bool = True, local_with_ss: bool = True, fragment_gapless: bool = True) None
Prepare TM-align algorithm with the given structures.
- Parameters:
query – The query structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (N, 3), where N is the number of residues.
templ – The template structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (M, 3), where M is the number of residues.
query_ss – The secondary structure of the query structure. When provided, must be an ASCII string of length N.
templ_ss – The secondary structure of the template structure. When provided, must be an ASCII string of length M.
gapless – Enable gapless threading.
sec_str – Enable secondary structure assignment.
local_sup – Enable local superposition. Note that this is the most expensive initialization method due to the exhaustive pairwise distance calculation. Consider disabling this flag if alignment takes too long.
local_with_ss – Enable local superposition with secondary structure-based alignment.
fragment_gapless – Enable fragment gapless threading.
- Raises:
If:
The query or template structure has fewer than 5 residues.
The secondary structure of the query or template structure has a different length than the structure.
No initialization flag is set.
The initialization fails (for any other reason).
Note
If the secondary structure is not provided, it will be assigned using the approximate secondary structure assignment algorithm defined in the TM-align code. When neither the sec_str nor the local_with_ss flag is set, the secondary structures are ignored.
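The error conditions listed above can be checked on the caller's side before constructing the object. The sketch below is a hypothetical pre-check written only from the documented conditions; check_tm_inputs is not part of nuri, and the library's own validation and error messages may differ.

```python
def check_tm_inputs(n_query, n_templ, query_ss=None, templ_ss=None, *,
                    gapless=True, sec_str=True, local_sup=True,
                    local_with_ss=True, fragment_gapless=True):
    """Return a list of problems matching the documented error conditions."""
    problems = []
    if n_query < 5 or n_templ < 5:
        problems.append("structure has fewer than 5 residues")
    # Secondary structure strings are optional, but must match the lengths.
    if query_ss is not None and len(query_ss) != n_query:
        problems.append("query_ss length differs from query length")
    if templ_ss is not None and len(templ_ss) != n_templ:
        problems.append("templ_ss length differs from templ length")
    # At least one initialization method has to stay enabled.
    if not any((gapless, sec_str, local_sup, local_with_ss, fragment_gapless)):
        problems.append("no initialization flag is set")
    return problems
```

An empty list means none of the documented input problems were detected; initialization can still fail for other reasons.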
- static from_alignment(query: object, templ: object, alignment: object = None) TMAlign
Prepare TM-align algorithm with the given structures and user-provided alignment.
- Parameters:
query – The query structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (N, 3), where N is the number of residues.
templ – The template structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (M, 3), where M is the number of residues.
alignment – Pairwise alignment of the query and template structures. Must be in a form representable as a 2D numpy array of shape (L, 2), in which rows must contain (query index, template index) pairs. If not provided, the query and template must have the same length and are assumed to be aligned in order.
- Returns:
A TMAlign object initialized with the given alignment.
- Raises:
If:
The query or template structure has fewer than 5 residues.
The alignment contains out-of-range indices.
Alignment is not provided and the query and template structures have different lengths.
The initialization fails (for any other reason).
Tip
When initialized by this method, the result is equivalent to the “TM-score” program in the TM-tools suite.
Note
Duplicate values in alignment are not checked and may result in an invalid alignment.
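Since duplicates are not checked by the library, callers may want to validate an alignment themselves. The following is a minimal sketch of such a check; check_alignment is a hypothetical helper, and it assumes 0-based, non-negative indices, which the text above does not state explicitly.

```python
def check_alignment(pairs, n_query, n_templ):
    """Validate (query index, template index) pairs and return them as a list."""
    seen_query, seen_templ = set(), set()
    for q, t in pairs:
        # Assumption: indices are 0-based and negative values are not allowed.
        if not (0 <= q < n_query and 0 <= t < n_templ):
            raise ValueError(f"pair ({q}, {t}) is out of range")
        # Each residue may take part in at most one aligned pair.
        if q in seen_query or t in seen_templ:
            raise ValueError(f"index reused in pair ({q}, {t})")
        seen_query.add(q)
        seen_templ.add(t)
    return list(pairs)
```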
- aligned_pairs(self: TMAlign) ndarray[numpy.int32]
Get pairwise alignment of the query and template structures.
- Returns:
A 2D numpy array of shape (L, 2), where L is the number of aligned pairs. Each row is a (query index, template index) pair.
Tip
This will always return the same alignment once the TMAlign object is created.
Note
Even if the TMAlign object is created with from_alignment(), the pairs returned from this method may not be the same as the input alignment. This is because the TM-align algorithm filters out far-apart pairs when calculating the final alignment.
- score(self: TMAlign, l_norm: int | None = None, *, d0: float | None = None) tuple[ndarray[numpy.float64], float]
Calculate TM-score using the current alignment.
- Parameters:
l_norm – Length normalization factor. If not specified, the length of the template structure is used.
d0 – Distance scale factor. If not specified, calculated based on the length normalization factor.
- Returns:
A pair of the transformation tensor and the TM-score of the alignment.
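To illustrate how l_norm and d0 enter the result, here is a sketch of the published TM-score formula evaluated on precomputed distances between aligned pairs. This is not nuri's code: tm_d0 and tm_score_from_distances are hypothetical names, the superposition search is left out, and the special handling that the original TM-align code applies to short chains is omitted, so the default d0 is only meaningful for l_norm well above 15.

```python
def tm_d0(l_norm):
    """Distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8 for a normalization length L."""
    return 1.24 * (l_norm - 15) ** (1.0 / 3.0) - 1.8


def tm_score_from_distances(distances, l_norm, d0=None):
    """Sum 1 / (1 + (d / d0)^2) over aligned pairs, then divide by l_norm."""
    if d0 is None:
        d0 = tm_d0(l_norm)  # derived from l_norm when not given explicitly
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_norm
```

With the same distances, a larger l_norm lowers the score, which is why normalizing by the query length and by the template length generally gives different values.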
- nuri.tools.tm.tm_align(query: object, templ: object, l_norm: int | None = None, query_ss: str | None = None, templ_ss: str | None = None, *, d0: float | None = None, gapless: bool = True, sec_str: bool = True, local_sup: bool = True, local_with_ss: bool = True, fragment_gapless: bool = True) tuple[ndarray[numpy.float64], float]
Run TM-align algorithm with the given structures and parameters.
- Parameters:
query – The query structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (N, 3), where N is the number of residues.
templ – The template structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (M, 3), where M is the number of residues.
l_norm – Length normalization factor. If not specified, the length of the template structure is used.
query_ss – The secondary structure of the query structure. When provided, must be an ASCII string of length N.
templ_ss – The secondary structure of the template structure. When provided, must be an ASCII string of length M.
d0 – Distance scale factor. If not specified, calculated based on the length normalization factor.
gapless – Enable gapless threading.
sec_str – Enable secondary structure assignment.
local_sup – Enable local superposition. Note that this is the most expensive initialization method due to the exhaustive pairwise distance calculation. Consider disabling this flag if alignment takes too long.
local_with_ss – Enable local superposition with secondary structure-based alignment.
fragment_gapless – Enable fragment gapless threading.
- Returns:
A pair of the transformation tensor and the TM-score of the alignment.
- Raises:
If:
The query or template structure has fewer than 5 residues.
The secondary structure of the query or template structure has a different length than the structure.
No initialization flag is set.
The initialization fails (for any other reason).
Tip
If you want to calculate the TM-score for multiple l_norm or d0 values, or want more details such as RMSD or aligned pairs, consider using the TMAlign object directly.
Note
If the secondary structure is not provided, it will be assigned using the approximate secondary structure assignment algorithm defined in the TM-align code. When neither the sec_str nor the local_with_ss flag is set, the secondary structures are ignored.
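The tip above suggests reusing a single TMAlign object when several normalization lengths are needed. A minimal sketch of that pattern follows; scores_for_norms is a hypothetical helper that only relies on the documented score(l_norm) method returning a (transformation, TM-score) pair.

```python
def scores_for_norms(aligner, norms):
    """Call aligner.score(l_norm) for each length and collect the TM-scores."""
    # score() returns (transformation, tm_score); keep only the score here.
    return {n: aligner.score(n)[1] for n in norms}


# Intended use, assuming nuri is installed and query/templ are (N, 3) and
# (M, 3) coordinate arrays:
#   from nuri.tools import tm as tmtools
#   aligner = tmtools.TMAlign(query, templ)
#   scores = scores_for_norms(aligner, [len(query), len(templ)])
```

The alignment is computed once when the object is created, so each additional score only repeats the scoring step.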
- nuri.tools.tm.tm_score(query: object, templ: object, alignment: object = None, l_norm: int | None = None, *, d0: float | None = None) tuple[ndarray[numpy.float64], float]
Run TM-align algorithm with the given structures and alignment. This is also known as the “TM-score” program in the TM-tools suite, from which the function got its name.
- Parameters:
query – The query structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (N, 3), where N is the number of residues.
templ – The template structure, in which each residue is represented by a single atom (usually CA). Must be representable as a 2D numpy array of shape (M, 3), where M is the number of residues.
alignment – Pairwise alignment of the query and template structures. Must be in a form representable as a 2D numpy array of shape (L, 2), in which rows must contain (query index, template index) pairs. If not provided, the query and template must have the same length and are assumed to be aligned in order.
l_norm – Length normalization factor. If not specified, the length of the template structure is used.
d0 – Distance scale factor. If not specified, calculated based on the length normalization factor.
- Returns:
A pair of the transformation tensor and the TM-score of the alignment.
- Raises:
If:
The query or template structure has fewer than 5 residues.
The alignment contains out-of-range indices.
Alignment is not provided and the query and template structures have different lengths.
The initialization fails (for any other reason).
Tip
If you want to calculate the TM-score for multiple l_norm or d0 values, or want more details such as RMSD or aligned pairs, consider using the TMAlign object directly.
Note
Duplicate values in alignment are not checked and may result in an invalid alignment.
GAlign
This module provides a Python interface to the GAlign flexible molecular alignment algorithm. The paper describing the GAlign algorithm is under preparation and will be cited here once available.
- class nuri.tools.galign.GAlign
- __init__(self: GAlign, templ: Molecule, *, conf: int | None = None, vdw_scale: float = 0.8, hetero_scale: float = 0.7, dcut: int = 6) None
Prepare GAlign algorithm with the given template structure.
- Parameters:
templ – The template structure. Must have at least 3 atoms and 3D coordinates.
conf – The conformation index to use as the template. If not provided, the first conformation is used.
vdw_scale – The scale factor for van der Waals radii when calculating shape overlap score.
hetero_scale – The scale factor for atom type mismatch when calculating shape overlap score.
dcut – The distance cutoff for neighbor search, in angstroms.
- Raises:
ValueError – If the template structure has fewer than 3 atoms or no 3D conformation, or if invalid parameters are provided (e.g., negative dcut).
IndexError – If the provided conformation index is out of range.
- align(self: GAlign, query: Molecule, flexible: bool = True, max_confs: int = 1, *, conf: int | None = None, max_translation: float = 2.5, max_rotation: float = 2.0943951023931953, max_torsion: float = 2.0943951023931953, rigid_min_msd: float = 9.0, rigid_max_confs: int = 4, pool_size: int = 10, sample_size: int = 30, max_generations: int = 50, patience: int = 5, n_mutation: int = 5, p_mutation: float = 0.5, opt_ftol: float = 0.01, opt_max_iters: int = 300) list[GAlignResult]
Align the given query molecule to the template structure.
- Parameters:
query – The query molecule to be aligned. Must have at least one 3D conformation.
flexible – Whether to perform flexible alignment. When False, only rigid alignment is performed and the flexible alignment parameters are ignored.
max_confs – The maximum number of alignment results to return.
conf – The conformation index to use as the query structure. If not provided, the first conformation is used.
max_translation – The maximum translation allowed during flexible alignment, in angstroms.
max_rotation – The maximum rotation allowed during flexible alignment, in radians.
max_torsion – The maximum torsion angle change allowed during flexible alignment, in radians.
rigid_min_msd – The minimum mean-squared deviation between different conformations to consider them as distinct during rigid alignment.
rigid_max_confs – The maximum number of conformations to consider for initial rigid alignment. Ignored in rigid mode; set max_confs instead.
pool_size – The size of the population pool during flexible alignment.
sample_size – The number of new trial conformations to sample in each generation.
max_generations – The maximum number of generations to run.
patience – The number of generations to wait for improvement before early stopping.
n_mutation – The number of mutation operations to perform when generating new trial conformations.
p_mutation – The probability of mutation when generating new trial conformations.
opt_ftol – The function tolerance for the Nelder-Mead optimization.
opt_max_iters – The maximum number of iterations for the Nelder-Mead optimization.
- Returns:
At most max_confs alignment results as a list of GAlignResult objects, sorted by their alignment scores in descending order.
- Raises:
ValueError – If the query molecule has no 3D conformation, or if invalid parameters are provided (e.g., negative max_translation).
IndexError – If the provided conformation index is out of range.
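The angular limits are given in radians, and the signature default 2.0943951023931953 corresponds to 120 degrees. For callers who think in degrees, a small conversion sketch is shown below; angle_limits is a hypothetical helper, not part of nuri, and it only builds keyword arguments named after the documented parameters.

```python
import math

# Signature default shared by max_rotation and max_torsion, in radians.
DEFAULT_MAX_ANGLE = 2.0943951023931953


def angle_limits(rotation_deg, torsion_deg):
    """Build keyword arguments in radians from limits given in degrees."""
    return {
        "max_rotation": math.radians(rotation_deg),
        "max_torsion": math.radians(torsion_deg),
    }


# Intended use, assuming aligner is a GAlign object and query a Molecule:
#   results = aligner.align(query, **angle_limits(60, 30))
```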
- class nuri.tools.galign.GAlignResult
- property pos
A copy of the aligned conformation as a 2D numpy array of shape (N, 3), where N is the number of atoms in the query molecule.
- property score
The alignment score (shape overlap) of this result.
- nuri.tools.galign.galign(query: Molecule, templ: Molecule, flexible: bool = True, max_confs: int = 1, *, qconf: int | None = None, tconf: int | None = None, vdw_scale: float = 0.8, hetero_scale: float = 0.7, dcut: int = 6, max_translation: float = 2.5, max_rotation: float = 2.0943951023931953, max_torsion: float = 2.0943951023931953, rigid_min_msd: float = 9.0, rigid_max_confs: int = 4, pool_size: int = 10, sample_size: int = 30, max_generations: int = 50, patience: int = 5, n_mutation: int = 5, p_mutation: float = 0.5, opt_ftol: float = 0.01, opt_max_iters: int = 300) list[GAlignResult]
Align the given query molecule to the template structure.
- Parameters:
query – The query molecule to be aligned. Must have at least one 3D conformation.
templ – The template structure. Must have at least 3 atoms and 3D coordinates.
flexible – Whether to perform flexible alignment. When False, only rigid alignment is performed and the flexible alignment parameters are ignored.
max_confs – The maximum number of alignment results to return.
qconf – The conformation index to use as the query structure. If not provided, the first conformation is used.
tconf – The conformation index to use as the template structure. If not provided, the first conformation is used.
vdw_scale – The scale factor for van der Waals radii when calculating shape overlap score.
hetero_scale – The scale factor for atom type mismatch when calculating shape overlap score.
dcut – The distance cutoff for neighbor search, in angstroms.
max_translation – The maximum translation allowed during flexible alignment, in angstroms.
max_rotation – The maximum rotation allowed during flexible alignment, in radians.
max_torsion – The maximum torsion angle change allowed during flexible alignment, in radians.
rigid_min_msd – The minimum mean-squared deviation between different conformations to consider them as distinct during rigid alignment.
rigid_max_confs – The maximum number of conformations to consider for initial rigid alignment. Ignored in rigid mode; set max_confs instead.
pool_size – The size of the population pool during flexible alignment.
sample_size – The number of new trial conformations to sample in each generation.
max_generations – The maximum number of generations to run.
patience – The number of generations to wait for improvement before early stopping.
n_mutation – The number of mutation operations to perform when generating new trial conformations.
p_mutation – The probability of mutation when generating new trial conformations.
opt_ftol – The function tolerance for the Nelder-Mead optimization.
opt_max_iters – The maximum number of iterations for the Nelder-Mead optimization.
- Returns:
At most max_confs alignment results as a list of GAlignResult objects, sorted by their alignment scores in descending order.
- Raises:
ValueError – If the query or template molecule is invalid, or if any of the parameters are invalid (e.g., negative max_translation).
IndexError – If the provided conformation index is out of range.
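The max_generations and patience parameters together bound how long the flexible search runs. The loop below is a generic early-stopping sketch of how such a pair of limits usually interacts; it is an illustration under that assumption, not GAlign's actual implementation, and run_generations and step are hypothetical names.

```python
def run_generations(step, max_generations=50, patience=5):
    """Call step() once per generation; stop after `patience` stale generations."""
    best = float("-inf")
    stale = 0        # consecutive generations without improvement
    generations = 0  # generations actually executed
    for _ in range(max_generations):
        generations += 1
        score = step()
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stopping: no improvement for `patience` generations
    return best, generations
```

In this sketch a lower patience ends a stalled search sooner, while max_generations caps the run even when the score keeps improving.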