Lomap gufe bindings API guide
This section describes the optional gufe bindings API. These bindings allow Lomap to be used seamlessly with other components of the Open Free Energy ecosystem.
Lomap Atom Mapper
- class lomap.LomapAtomMapper(*, time: int = 20, threed: bool = True, max3d: float = 1.0, element_change: bool = True, seed: str = '', shift: bool = False)
Wraps the MCS atom mapper from Lomap.
Keyword arguments are passed directly to Lomap's MCS class for each mapping created.
- Parameters:
time (int, default 20) – Timeout of MCS algorithm in seconds, passed to RDKit.
threed (bool, default True) – If True, positional info is used to choose between symmetrically equivalent mappings and to prune the mapping.
max3d (float, default 1.0) – Maximum discrepancy in Angstroms between atoms before mapping is not allowed.
element_change (bool, default True) – If True, allow element changes in the mappings.
seed (str, default "") – SMARTS string to use as seed for MCS searches. When used across an entire set of ligands, this can speed up calculations considerably.
shift (bool, default False) – If True, translate the two molecules’ MCS to minimize the RMSD. This can boost potential alignment when determining 3D overlap.
- copy_with_replacements(**replacements)
Make a modified copy of this object.
Since GufeTokenizables are immutable, this is essentially a shortcut to mutate the object. Note that the keyword arguments it takes are based on the keys of the dictionaries used in the _to_dict/_from_dict cycle for this object; in most cases these are the same as the parameters to __init__, but not always.
This will always return a new object in memory. So using obj.copy_with_replacements() (with no keyword arguments) is a way to create a shallow copy: the object is different in memory, but its attributes will be the same objects in memory as the original.
- Parameters:
replacements (Dict) – keyword arguments with keys taken from the keys given by the output of this object’s to_dict method.
- classmethod defaults()
Dict of default key-value pairs for this GufeTokenizable object.
These defaults are stripped from the dict form of this object produced with to_dict(include_defaults=False) where default values are present.
- classmethod from_dict(dct: dict)
Generate an instance from full dict representation.
- Parameters:
dct (Dict) – A dictionary produced by to_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.
- classmethod from_json(file: PathLike | TextIO | None = None, content: str | None = None)
Generate an instance from JSON keyed chain representation.
Can provide either a filepath/filelike as file, or JSON content via content.
- Parameters:
file – A filepath or filelike object to read JSON data from.
content – A string to read JSON data from.
- classmethod from_keyed_chain(keyed_chain: list[tuple[str, dict]])
Generate an instance from keyed chain representation.
- Parameters:
keyed_chain (List[Tuple[str, Dict]]) – The keyed_chain representation of the GufeTokenizable.
See also
KeyedChain
- classmethod from_keyed_dict(dct: dict)
Generate an instance from keyed dict representation.
- Parameters:
dct (Dict) – A dictionary produced by to_keyed_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.
- classmethod from_msgpack(file: PathLike | BinaryIO | None = None, content: bytes | None = None)
Generate an instance from a MessagePack keyed chain representation.
Can provide either a filepath/filelike as file, or msgpack content via content.
- Parameters:
file (BinaryIO | PathLike | None) – A filepath or filelike object to read msgpack data from.
content (bytes) – Bytes to read msgpack data from.
- classmethod from_shallow_dict(dct: dict)
Generate an instance from shallow dict representation.
- Parameters:
dct (Dict) – A dictionary produced by to_shallow_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.
- property key
Tokenized representation of this object, aka ‘gufe key’.
- property logger
Return logger adapter for this instance.
- classmethod serialization_migration(old_dict: dict, version: int) -> dict
Migrate old serialization dicts to the current form.
The input dict old_dict comes from some previous serialization version, given by version. The output dict should be in the format of the current serialization dict.
The recommended pattern to use looks like this:
    def serialization_migration(cls, old_dict, version):
        if version == 1:
            ...  # do things for migrating version 1->2
        if version <= 2:
            ...  # do things for migrating version 2->3
        if version <= 3:
            ...  # do things for migrating version 3->4
        # etc
This approach steps through each old serialization model on its way to the current version. It keeps code relatively minimal and readable.
As a convenience, the following functions are available to simplify the various kinds of changes that are likely to occur as serialization versions change:
- new_key_added()
- old_key_removed()
- key_renamed()
- nested_key_moved()
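The stepwise pattern can be sketched with plain dictionaries. This is only an illustration under stated assumptions: the function name `migrate`, the keys `timeout`, `time` and `shift`, and the three-version history are all hypothetical, and the gufe convenience functions listed above are deliberately not used.

```python
def migrate(old_dict: dict, version: int) -> dict:
    """Step a serialized dict forward one version at a time (hypothetical schema)."""
    new_dict = dict(old_dict)  # work on a copy so the caller's dict is untouched
    if version <= 1:
        # hypothetical 1 -> 2 change: key 'timeout' was renamed to 'time'
        new_dict["time"] = new_dict.pop("timeout")
    if version <= 2:
        # hypothetical 2 -> 3 change: key 'shift' was added with a default value
        new_dict.setdefault("shift", False)
    return new_dict


v1 = {"timeout": 20}
v2 = {"time": 30}
migrated_v1 = migrate(v1, version=1)  # passes through both migration steps
migrated_v2 = migrate(v2, version=2)  # skips the first step
```

Because every `if` block is checked in order, a dict from any older version falls through all the later steps and ends up in the current format.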
- suggest_mappings(componentA: SmallMoleculeComponent, componentB: SmallMoleculeComponent) -> Iterable[LigandAtomMapping]
Generate one or more mappings between two small molecules.
- Parameters:
componentA (gufe.SmallMoleculeComponent) – The first of the two SmallMoleculeComponents to suggest mappings between.
componentB (gufe.SmallMoleculeComponent) – The second of the two SmallMoleculeComponents to suggest mappings between.
- Returns:
mapping – Potential mappings between componentA and componentB.
- Return type:
Iterable[LigandAtomMapping]
- to_dict(include_defaults=True) -> dict
Generate full dict representation, with all referenced GufeTokenizable objects also given in full dict representations.
- Parameters:
include_defaults (bool) – If False, strip keys from dict representation with values equal to those in defaults.
See also
GufeTokenizable.to_shallow_dict(), GufeTokenizable.to_keyed_dict()
- to_json(file: PathLike | TextIO | None = None) -> None | str
Generate a JSON keyed chain representation.
This will be written to the filepath or filelike object if passed.
- Parameters:
file – A filepath or filelike object to write the JSON to.
- Returns:
A minimal JSON representation of the object if file is None; else None.
- Return type:
None | str
- to_keyed_chain() -> list[tuple[str, dict]]
Generate a keyed chain representation of the object.
See also
KeyedChain
- to_keyed_dict(include_defaults=True) -> dict
Generate keyed dict representation, with all referenced GufeTokenizable objects given in keyed representations.
A keyed representation of an object is a dict of the form:
{':gufe-key:': <GufeTokenizable.key>}
These function as stubs to allow for serialization and storage of GufeTokenizable objects with minimal duplication.
The original object can be re-assembled with from_keyed_dict.
See also
GufeTokenizable.to_dict(), GufeTokenizable.to_shallow_dict()
- to_msgpack(file: PathLike | BinaryIO | None = None, compress: bool = True) -> None | bytes
Generate a MessagePack keyed chain representation.
This will be written to the filepath or filelike object if passed.
- Parameters:
file – A filepath or filelike object to write the encoded msgpack to.
compress – Whether or not to zstandard compress the serialized bytes. The default is True.
- Returns:
A minimal msgpack representation of the object if file is None; else None.
- Return type:
None | bytes
Lomap Network Generator
- lomap.generate_lomap_network(ligands: list[SmallMoleculeComponent], mappers: AtomMapper | list[AtomMapper], scorer: Callable, distance_cutoff: float = 0.4, max_path_length: int = 6, actives: list[bool] | None = None, max_dist_from_active: int = 2, require_cycle_covering: bool = True, radial: bool = False, fast: bool = False, hub: SmallMoleculeComponent | None = None) -> LigandNetwork
Generate a LigandNetwork according to Lomap’s network creation rules.
- Parameters:
ligands (list[SmallMoleculeComponent]) – Molecules to include in the network.
mappers (list[AtomMapper] or AtomMapper) – One or more Mapper functions to use to propose edge mappings.
scorer (Callable) – Scoring function for edges. Should be a function which takes an AtomMapping and returns a value from 0.0 (worst) to 1.0 (best), inclusive. These values determine the “distance” between two molecules, which is compared against the distance_cutoff parameter.
distance_cutoff (float, default 0.4) – Edges with a score < 1 - distance_cutoff will be rejected.
max_path_length (int, default 6) – Maximum edge distance between any two molecules in the resulting network.
actives (list[bool] | None, default None) – If defined, a tag for each ligand which defines if it is an active molecule.
max_dist_from_active (int, default 2) – When ‘actives’ is given, constrains the resulting map to be within this number of edges (i.e. graph distance) from an active molecule.
require_cycle_covering (bool, default True) – If True, attempt to ensure that every ligand has redundant paths to its neighbors, giving the network robustness against individual perturbation failures. This is achieved by rejecting edge removals that would leave a node outside a cycle or create a new bridge (an edge whose removal disconnects the graph). If False, this constraint is relaxed and the resulting network may have no cycles.
radial (bool, default False) – Construct a radial (star) network. Note that the map will not necessarily be a true radial map; edges will still obey the distance_cutoff and, if require_cycle_covering is True, this radial map will still feature cycles.
fast (bool, default False) – When both fast and radial are True, switch the initial graph construction to only consider hub-spoke edges (every ligand connected to the hub/lead) rather than all pairwise edges. This makes network construction faster, at the potential cost of a less optimal network.
hub (SmallMoleculeComponent | None, default None) – If radial is True, force this ligand to be the center/hub of the radial graph.
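The relationship between the scorer output and distance_cutoff can be illustrated with a small sketch. It applies only the threshold rule stated above (reject when score < 1 - distance_cutoff) to made-up scores; the helper name `edge_is_kept` and the ligand labels are hypothetical, and none of Lomap's actual network construction is reproduced here.

```python
def edge_is_kept(score: float, distance_cutoff: float = 0.4) -> bool:
    """Documented rule: an edge is rejected when score < 1 - distance_cutoff."""
    return score >= 1.0 - distance_cutoff


# Made-up scores for three candidate edges between hypothetical ligands A, B, C
scores = {("A", "B"): 0.9, ("A", "C"): 0.3, ("B", "C"): 0.75}

# With the default cutoff of 0.4, only edges scoring at least 0.6 survive
kept = sorted(pair for pair, score in scores.items() if edge_is_kept(score))

# A looser cutoff of 0.8 lowers the score threshold to 0.2, so every edge survives
kept_loose = sorted(
    pair for pair, score in scores.items() if edge_is_kept(score, distance_cutoff=0.8)
)
```

A larger distance_cutoff therefore admits lower-scoring edges into the candidate set.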
Lomap Atom Mapping scorers
For most users, default_lomap_score() is all you will need:
- lomap.default_lomap_score(mapping: LigandAtomMapping, charge_changes_score: float = 0.1) -> float
The default score function from Lomap2.
This score is a combination of many rules combined and considers factors such as the number of heavy atoms in common, if ring sizes are changed or rings are broken, or if other alchemically unwise transformations are attempted.
- Parameters:
mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.
charge_changes_score (float, default 0.1) – The electrostatic score to be assigned for mappings of ligands that differ in net charge. The default of 0.1 allows net charge changes.
- Returns:
score – A rating of how good this mapping is, from 0.0 (terrible) to 1.0 (great).
- Return type:
float
However, you can also use the underlying sub-scores that make up default_lomap_score():
- lomap.gufe_bindings.scorers.ecr_score(mapping: LigandAtomMapping, charge_changes_score: float) -> float
Equal charge rule (ECR) score.
Returns 1.0 if both molecules have the same formal charge, otherwise returns charge_changes_score.
- lomap.gufe_bindings.scorers.mcsr_score(mapping: LigandAtomMapping, beta: float = 0.1) -> float
Maximum common substructure rule (MCSR) score.
This rule is defined as:
\[mcsr = exp(-beta * (n1 + n2 - 2 * n\_common))\]
Where n1 and n2 are the number of heavy atoms in each molecule, and n_common is the number of heavy atoms in the MCS. This makes the term n1 + n2 - 2 * n_common the total number of atoms inserted or deleted in the transformation.
The exponential is used to ensure the score ranges between 0 and 1, and to strongly favor small structural changes.
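As a worked example of the formula, the sketch below evaluates it directly from atom counts. The function name `mcsr` and the atom counts are illustrative assumptions; the real mcsr_score() takes a LigandAtomMapping and derives the counts itself.

```python
import math


def mcsr(n1: int, n2: int, n_common: int, beta: float = 0.1) -> float:
    """Evaluate exp(-beta * (n1 + n2 - 2 * n_common)) from heavy-atom counts."""
    return math.exp(-beta * (n1 + n2 - 2 * n_common))


# Identical molecules: nothing inserted or deleted, so the exponent is zero
identical = mcsr(20, 20, 20)

# 20 and 22 heavy atoms sharing 19: 1 + 3 = 4 atoms inserted or deleted
small_change = mcsr(20, 22, 19)

# Sharing only 10 heavy atoms: 22 atoms inserted or deleted
large_change = mcsr(20, 22, 10)
```

With the default beta of 0.1, four changed atoms give a score of about 0.67, while 22 changed atoms give about 0.11.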
- lomap.gufe_bindings.scorers.mncar_score(mapping: LigandAtomMapping, ths: int = 4) -> float
Minimum number of common atoms rule (MNCAR) score.
The two molecules must share at least ths heavy atoms to be regarded as similar. Returns 1.0 if this condition is met, or if either molecule is small (fewer than ths + 3 heavy atoms), otherwise returns 0.0.
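The rule as described can be sketched from atom counts alone. This follows the wording here ("at least ths"), so the comparison is written as `>=`; whether the library uses exactly this comparison at the boundary is not confirmed, and the function name `mncar` and the counts are illustrative assumptions.

```python
def mncar(n1: int, n2: int, n_common: int, ths: int = 4) -> float:
    """Return 1.0 if enough heavy atoms are shared or either molecule is small."""
    either_small = n1 < ths + 3 or n2 < ths + 3
    enough_common = n_common >= ths  # assumption: "at least ths" read as >=
    return 1.0 if (enough_common or either_small) else 0.0


similar = mncar(20, 22, 10)      # shares 10 heavy atoms, well above ths
dissimilar = mncar(20, 22, 2)    # shares 2 heavy atoms, neither molecule is small
small_ligand = mncar(5, 22, 2)   # 5 heavy atoms is fewer than ths + 3 = 7
```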
- lomap.gufe_bindings.scorers.atomic_number_score(mapping: LigandAtomMapping, beta: float = 0.1, difficulty: dict[int, dict[int, float]] | None = None) -> float
A score on the elemental changes happening in the mapping.
For each transmuted atom, a mismatch score is summed, according to the difficulty scores (see the difficulty parameter). The final score is then given as:
\[score = exp(-beta * mismatch)\]
- Parameters:
mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.
beta (float, default 0.1) – Scaling factor for this rule.
difficulty (dict[int, dict[int, float]] | None, default None) – A dict of dicts, mapping the atomic number of one species, to another, to a mismatch in the identity of these elements. 1.0 indicates two elements are considered interchangeable, 0.0 indicates two elements are incompatible, and a default of 0.5 is used. The scores in openfe.setup.lomap_mapper.DEFAULT_ANS_DIFFICULT are used by default.
- Returns:
score
- Return type:
float
- lomap.gufe_bindings.scorers.hybridization_score(mapping: LigandAtomMapping, beta: float = 0.15) -> float
Hybridization score — penalizes atom hybridization mismatches in the mapping.
For each mapped heavy atom pair with differing hybridization states, a mismatch is counted. N sp3/sp2 interchanges are permitted. The final score is:
\[score = exp(-beta * nmismatch)\]
- lomap.gufe_bindings.scorers.sulfonamides_score(mapping: LigandAtomMapping, beta: float = 0.4) -> float
Sulfonamide score: penalizes mappings that mutate a sulfonamide group in or out.
Testing has shown that growing a sulfonamide from scratch performs very badly. Returns math.exp(-beta) if a sulfonamide group appears in the unmapped remainder of either molecule, otherwise 1.0.
- Parameters:
mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.
beta (float, default 0.4) – Scaling factor controlling the size of the penalty. Larger values give larger penalties (lower scores), since the returned score is math.exp(-beta).
- Returns:
score – math.exp(-beta) if a sulfonamide is mutated in or out, else 1.0.
- Return type:
float
- lomap.gufe_bindings.scorers.heterocycles_score(mapping: LigandAtomMapping, beta: float = 0.4) -> float
Heterocycle score: penalizes mappings that form a heterocycle from a hydrogen.
Returns math.exp(-beta) if a heterocycle is formed from a hydrogen. Testing has shown that growing a pyridine or other heterocycle is unlikely to work (better to grow phenyl than mutate).
- lomap.gufe_bindings.scorers.transmuting_methyl_into_ring_score(mapping: LigandAtomMapping, beta: float = 0.1, penalty: float = 6.0) -> float
Penalises a non-mapped ring atom becoming a non-ring atom.
This score would, for example, penalise R-CH3 to R-Ph where R is the same mapped atom and both CH3 and Ph are unmapped. It does not penalise R-H to R-Ph. If any atoms trigger the rule, it returns a score of:
\[exp(-1 * beta * penalty)\]
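To get a feel for the size of this penalty, the sketch below evaluates the expression with the default beta and penalty. The helper name `ring_transmutation_score` and the `rule_triggered` flag are hypothetical, and returning 1.0 when the rule is not triggered is an assumption made by analogy with the other scorers, not something stated above.

```python
import math


def ring_transmutation_score(
    rule_triggered: bool, beta: float = 0.1, penalty: float = 6.0
) -> float:
    """Return exp(-1 * beta * penalty) when the rule fires.

    Assumption: an untriggered rule leaves the score at 1.0.
    """
    return math.exp(-1 * beta * penalty) if rule_triggered else 1.0


penalised = ring_transmutation_score(True)     # exp(-0.6) with the defaults
unpenalised = ring_transmutation_score(False)  # assumed neutral score
```

With the defaults, a triggered rule gives a score of about 0.55.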
- lomap.gufe_bindings.scorers.transmuting_ring_sizes_score(mapping: LigandAtomMapping) -> float
Ring size score: penalizes mappings that alter a ring size.
Checks first-degree neighbors of mapped atoms; if a non-mapped neighbor is in a ring in both molecules but the ring sizes differ, a value of 0.1 is returned, otherwise 1.0 is returned.
- Parameters:
mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.
- Returns:
score – 0.1 if any ring size change is detected, else 1.0.
- Return type:
float