Lomap gufe bindings API guide

This section describes the optional gufe bindings API. These bindings allow Lomap to be used seamlessly with other components of the Open Free Energy ecosystem.

Lomap Atom Mapper

class lomap.LomapAtomMapper(*, time: int = 20, threed: bool = True, max3d: float = 1.0, element_change: bool = True, seed: str = '', shift: bool = False)

Wraps the MCS atom mapper from Lomap.

Keyword arguments are passed directly to Lomap's MCS class for each mapping created.

Parameters:
  • time (int, default 20) – Timeout of MCS algorithm in seconds, passed to RDKit.

  • threed (bool, default True) – If True, positional info is used to choose between symmetrically equivalent mappings and prune the mapping.

  • max3d (float, default 1.0) – Maximum discrepancy in Angstroms between atoms before mapping is not allowed.

  • element_change (bool, default True) – If True, allow element changes in the mappings.

  • seed (str, default "") – SMARTS string to use as seed for MCS searches. When used across an entire set of ligands, this can speed up calculations considerably.

  • shift (bool, default False) – If True, translate the two molecules’ MCS to minimize the RMSD. This can boost potential alignment when determining 3D overlap.
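A minimal usage sketch, assuming lomap and gufe are installed and that mol_a and mol_b are gufe.SmallMoleculeComponent objects prepared elsewhere. The helper name first_lomap_mapping is hypothetical; the keyword arguments simply restate the defaults documented above.

```python
def first_lomap_mapping(mol_a, mol_b):
    """Return the first mapping Lomap suggests between two ligands, or None.

    Hypothetical helper for illustration; mol_a and mol_b are assumed to be
    gufe.SmallMoleculeComponent objects.
    """
    # Imported lazily so this sketch can be defined without lomap installed.
    from lomap import LomapAtomMapper

    mapper = LomapAtomMapper(
        time=20,              # MCS timeout in seconds
        threed=True,          # use 3D positions to pick between symmetric mappings
        max3d=1.0,            # maximum allowed discrepancy between atoms, in Angstroms
        element_change=True,  # allow element changes in the mapping
        seed="",              # no SMARTS seed for the MCS search
        shift=False,          # do not translate the MCS before the 3D check
    )
    # suggest_mappings returns an iterable of LigandAtomMapping objects;
    # the default of None avoids StopIteration if nothing is suggested.
    return next(iter(mapper.suggest_mappings(mol_a, mol_b)), None)
```

Because every argument is keyword-only, any subset of them can be omitted to fall back on the documented defaults.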

copy_with_replacements(**replacements)

Make a modified copy of this object.

Since GufeTokenizables are immutable, this is essentially a shortcut to mutate the object. Note that the keyword arguments it takes are based on the keys of the dictionaries used in the _to_dict/_from_dict cycle for this object; in most cases these are the same as the parameters to __init__, but not always.

This will always return a new object in memory. So using obj.copy_with_replacements() (with no keyword arguments) is a way to create a shallow copy: the object is different in memory, but its attributes will be the same objects in memory as the original.

Parameters:

replacements (Dict) – keyword arguments with keys taken from the keys given by the output of this object’s to_dict method.

classmethod defaults()

Dict of default key-value pairs for this GufeTokenizable object.

These defaults are stripped from the dict form of this object produced with to_dict(include_defaults=False) where default values are present.

classmethod from_dict(dct: dict)

Generate an instance from full dict representation.

Parameters:

dct (Dict) – A dictionary produced by to_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.

classmethod from_json(file: PathLike | TextIO | None = None, content: str | None = None)

Generate an instance from JSON keyed chain representation.

Can provide either a filepath/filelike as file, or JSON content via content.

Parameters:
  • file – A filepath or filelike object to read JSON data from.

  • content – A string to read JSON data from.

See also

to_json

classmethod from_keyed_chain(keyed_chain: list[tuple[str, dict]])

Generate an instance from keyed chain representation.

Parameters:

keyed_chain (List[Tuple[str, Dict]]) – The keyed_chain representation of the GufeTokenizable.

See also

KeyedChain

classmethod from_keyed_dict(dct: dict)

Generate an instance from keyed dict representation.

Parameters:

dct (Dict) – A dictionary produced by to_keyed_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.

classmethod from_msgpack(file: PathLike | BinaryIO | None = None, content: bytes | None = None)

Generate an instance from a MessagePack keyed chain representation.

Can provide either a filepath/filelike as file, or msgpack content via content.

Parameters:
  • file (BinaryIO | PathLike | None) – A filepath or filelike object to read msgpack data from.

  • content (bytes) – Bytes to read msgpack data from.

See also

to_msgpack

classmethod from_shallow_dict(dct: dict)

Generate an instance from shallow dict representation.

Parameters:

dct (Dict) – A dictionary produced by to_shallow_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.

property key

Tokenized representation of this object, aka ‘gufe key’.

property logger

Return logger adapter for this instance.

classmethod serialization_migration(old_dict: dict, version: int) -> dict

Migrate old serialization dicts to the current form.

The input dict old_dict comes from some previous serialization version, given by version. The output dict should be in the format of the current serialization dict.

The recommended pattern to use looks like this:

@classmethod
def serialization_migration(cls, old_dict, version):
    if version == 1:
        ...  # do things for migrating version 1->2
    if version <= 2:
        ...  # do things for migrating version 2->3
    if version <= 3:
        ...  # do things for migrating version 3->4
    # etc
    return old_dict  # now in the current serialization format

This approach steps through each old serialization model on its way to the current version. It keeps code relatively minimal and readable.

As a convenience, the following functions are available to simplify the various kinds of changes that are likely to occur as serialization versions change:

  • new_key_added()

  • old_key_removed()

  • key_renamed()

  • nested_key_moved()

Parameters:
  • old_dict (dict) – dict as received from a serialized form

  • version (int) – the serialization version of old_dict

Returns:

serialization dict suitable for the current implementation of from_dict.

Return type:

dict
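The stepwise pattern described above can be sketched with plain dicts. This is a toy illustration only: the function name migrate and the keys "timeout", "time" and "shift" are hypothetical and do not reflect Lomap's actual serialization history. In a real class the logic would live in the serialization_migration classmethod.

```python
def migrate(old_dict, version):
    """Step a serialized dict forward one version at a time (toy example)."""
    new_dict = dict(old_dict)  # work on a copy; leave the input untouched
    if version == 1:
        # 1 -> 2: a new key was added, so fill in its default (hypothetical key)
        new_dict.setdefault("shift", False)
    if version <= 2:
        # 2 -> 3: a key was renamed (hypothetical names)
        new_dict["time"] = new_dict.pop("timeout")
    return new_dict


# A version 1 dict passes through both steps; a version 2 dict only the second.
from_v1 = migrate({"timeout": 20}, version=1)
from_v2 = migrate({"timeout": 20, "shift": True}, version=2)
```

Because each block tests `version <= n`, an old dict falls through every later step in order, so each migration only has to describe a single version bump.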

suggest_mappings(componentA: SmallMoleculeComponent, componentB: SmallMoleculeComponent) -> Iterable[LigandAtomMapping]

Generate one or more mappings between two small molecules.

Parameters:
  • componentA (gufe.SmallMoleculeComponent) – The SmallMoleculeComponents to suggest mappings between.

  • componentB (gufe.SmallMoleculeComponent) – The SmallMoleculeComponents to suggest mappings between.

Returns:

mapping – Potential mappings between componentA and componentB.

Return type:

Iterable[LigandAtomMapping]

to_dict(include_defaults=True) -> dict

Generate full dict representation, with all referenced GufeTokenizable objects also given in full dict representations.

Parameters:

include_defaults (bool) – If False, strip keys from dict representation with values equal to those in defaults.

See also

GufeTokenizable.to_shallow_dict(), GufeTokenizable.to_keyed_dict()

to_json(file: PathLike | TextIO | None = None) -> None | str

Generate a JSON keyed chain representation.

This will be written to the filepath or filelike object if passed.

Parameters:

file – A filepath or filelike object to write the JSON to.

Returns:

A minimal JSON representation of the object if file is None; else None.

Return type:

None | str

See also

from_json
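A minimal round-trip sketch using only the to_json and from_json signatures documented above. The helper name json_round_trip is hypothetical, and it assumes the object passed in (for example a LomapAtomMapper) provides both methods.

```python
def json_round_trip(obj):
    """Serialize obj to a JSON string and rebuild an instance from it."""
    # With no file argument, to_json returns the JSON as a str.
    as_json = obj.to_json()
    # from_json is a classmethod; the string is passed through `content`.
    return type(obj).from_json(content=as_json)
```

Passing a path or file-like object as file to both methods instead would write to, and read from, disk.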

to_keyed_chain() -> list[tuple[str, dict]]

Generate a keyed chain representation of the object.

See also

KeyedChain

to_keyed_dict(include_defaults=True) -> dict

Generate keyed dict representation, with all referenced GufeTokenizable objects given in keyed representations.

A keyed representation of an object is a dict of the form:

{':gufe-key:': <GufeTokenizable.key>}

These function as stubs to allow for serialization and storage of GufeTokenizable objects with minimal duplication.

The original object can be re-assembled with from_keyed_dict.

See also

GufeTokenizable.to_dict(), GufeTokenizable.to_shallow_dict()

to_msgpack(file: PathLike | BinaryIO | None = None, compress: bool = True) -> None | bytes

Generate a MessagePack keyed chain representation.

This will be written to the filepath or filelike object if passed.

Parameters:
  • file – A filepath or filelike object to write the encoded msgpack to.

  • compress – Whether or not to zstandard compress the serialized bytes. The default is True.

Returns:

A minimal msgpack representation of the object if file is None; else None.

Return type:

None | bytes

See also

from_msgpack

to_shallow_dict() -> dict

Generate shallow dict representation, with all referenced GufeTokenizable objects left intact.

See also

GufeTokenizable.to_dict(), GufeTokenizable.to_keyed_dict()

Lomap Network Generator

lomap.generate_lomap_network(ligands: list[SmallMoleculeComponent], mappers: AtomMapper | list[AtomMapper], scorer: Callable, distance_cutoff: float = 0.4, max_path_length: int = 6, actives: list[bool] | None = None, max_dist_from_active: int = 2, require_cycle_covering: bool = True, radial: bool = False, fast: bool = False, hub: SmallMoleculeComponent | None = None) -> LigandNetwork

Generate a LigandNetwork according to Lomap’s network creation rules

Parameters:
  • ligands (list[SmallMoleculeComponent]) – Molecules to include in the network.

  • mappers (list[AtomMapper] or AtomMapper) – One or more Mapper functions to use to propose edge mappings.

  • scorer (Callable) – Scoring function for edges. Should be a function which takes an AtomMapping and returns a value from 0.0 (worst) to 1.0 (best), inclusive. The “distance” between two molecules is taken as 1 - score, and is compared against the distance_cutoff parameter.

  • distance_cutoff (float, default 0.4) – Edges with a score < 1 - distance_cutoff will be rejected.

  • max_path_length (int, default 6) – Maximum edge distance between any two molecules in the resulting network.

  • actives (list[bool] | None, default None) – If defined, a tag for each ligand which defines if it is an active molecule.

  • max_dist_from_active (int, default 2) – When ‘actives’ is given, constrains the resulting map to be within this number of edges (i.e. graph distance) from an active molecule.

  • require_cycle_covering (bool, default True) – If True, attempt to ensure that every ligand has redundant paths to its neighbors, giving the network robustness against individual perturbation failures. This is achieved by rejecting edge removals that would leave a node outside a cycle or create a new bridge (an edge whose removal disconnects the graph). If False, this constraint is relaxed and the resulting network may have no cycles.

  • radial (bool, default False) – Construct a radial (star) network. Note that the map will not necessarily be a true radial map; edges will still obey the distance_cutoff and if require_cycle_covering is True, this radial map will still feature cycles.

  • fast (bool, default False) – When both fast and radial are True, switch the initial graph construction to only consider hub-spoke edges (every ligand connected to the hub/lead) rather than all pairwise edges. This makes network construction faster, at the potential cost of a less optimal network.

  • hub (SmallMoleculeComponent | None, default None) – If radial is True, force this ligand to be the center/hub of the radial graph.
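The distance_cutoff rule above can be sketched in a few lines. This is an illustration of the documented inequality only, not Lomap's implementation; the function name edge_is_rejected and the example scores are hypothetical.

```python
def edge_is_rejected(score: float, distance_cutoff: float = 0.4) -> bool:
    """Apply the documented rule: reject when score < 1 - distance_cutoff."""
    return score < 1.0 - distance_cutoff


# Hypothetical edge scores, for illustration only.
scores = {("A", "B"): 0.9, ("A", "C"): 0.3, ("B", "C"): 0.7}

# With the default cutoff of 0.4, only edges scoring at least 0.6 survive.
kept = [edge for edge, s in scores.items() if not edge_is_rejected(s)]
```

Raising distance_cutoff admits lower-scoring edges, while lowering it makes the network more selective.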

Lomap Atom Mapping scorers

For most users the default_lomap_score() is all you will need to use:

lomap.default_lomap_score(mapping: LigandAtomMapping, charge_changes_score: float = 0.1) -> float

The default score function from Lomap2.

This score is a combination of many rules combined and considers factors such as the number of heavy atoms in common, if ring sizes are changed or rings are broken, or if other alchemically unwise transformations are attempted.

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • charge_changes_score (float, default 0.1) – The electrostatic score assigned to mappings between ligands that differ in net charge. The default of 0.1 allows net charge changes while scoring them poorly.

Returns:

score – A rating of how good this mapping is, from 0.0 (terrible) to 1.0 (great).

Return type:

float

However, you can also use the underlying sub-scores that make up the default_lomap_score():

lomap.gufe_bindings.scorers.ecr_score(mapping: LigandAtomMapping, charge_changes_score: float) -> float

Equal charge rule (ECR) score.

Returns 1.0 if both molecules have the same formal charge, otherwise returns charge_changes_score.

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • charge_changes_score (float) – Score assigned when the two molecules differ in net formal charge.

Returns:

score – Value in the range [0, 1.0].

Return type:

float
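The equal charge rule reduces to a single comparison. The sketch below works on two integer formal charges rather than on a LigandAtomMapping, and the name ecr_sketch is hypothetical.

```python
def ecr_sketch(charge_a: int, charge_b: int, charge_changes_score: float) -> float:
    """Return 1.0 for equal formal charges, else charge_changes_score."""
    return 1.0 if charge_a == charge_b else charge_changes_score
```

Setting charge_changes_score to 0.0 would give any charge-changing mapping a score of zero under this rule.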

lomap.gufe_bindings.scorers.mcsr_score(mapping: LigandAtomMapping, beta: float = 0.1) -> float

Maximum common substructure rule (MCSR) score.

This rule is defined as:

\[mcsr = exp(-beta * (n1 + n2 - 2 * n\_common))\]

Where n1 and n2 are the number of heavy atoms in each molecule, and n_common is the number of heavy atoms in the MCS. This makes the term n1 + n2 - 2 * n_common the total number of heavy atoms inserted or deleted in the transformation.

The exponential is used to ensure the score ranges between 0 and 1, and to strongly favor small structural changes.

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • beta (float, default 0.1) – Scaling factor.

Returns:

score – Value in the range [0, 1.0], with 1.0 indicating complete overlap.

Return type:

float
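The MCSR formula above can be evaluated directly from heavy-atom counts. This sketch takes the counts as plain integers instead of a LigandAtomMapping; the name mcsr_sketch is hypothetical.

```python
import math


def mcsr_sketch(n1: int, n2: int, n_common: int, beta: float = 0.1) -> float:
    """exp(-beta * (n1 + n2 - 2 * n_common)), as in the formula above."""
    n_changed = n1 + n2 - 2 * n_common  # heavy atoms inserted or deleted
    return math.exp(-beta * n_changed)


# Identical molecules: nothing inserted or deleted, so the score is 1.0.
identical = mcsr_sketch(20, 20, 20)

# 20 and 22 heavy atoms sharing an 18-atom MCS: 6 atoms change.
six_changed = mcsr_sketch(20, 22, 18)
```

With the default beta of 0.1, each additional inserted or deleted heavy atom multiplies the score by exp(-0.1), roughly 0.905.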

lomap.gufe_bindings.scorers.mncar_score(mapping: LigandAtomMapping, ths: int = 4) -> float

Minimum number of common atoms rule (MNCAR) score.

The two molecules must share at least ths heavy atoms to be regarded as similar. Returns 1.0 if this condition is met, or if either molecule is small (fewer than ths + 3 heavy atoms), otherwise returns 0.0.

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • ths (int, default 4) – Minimum number of common heavy atoms required, default 4.

Returns:

score – 1.0 if the constraint is satisfied, 0.0 otherwise.

Return type:

float
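The MNCAR rule as described above can be sketched on heavy-atom counts alone. The name mncar_sketch is hypothetical and the function operates on integers, not on a LigandAtomMapping.

```python
def mncar_sketch(n1: int, n2: int, n_common: int, ths: int = 4) -> float:
    """1.0 if enough heavy atoms are shared or either molecule is small."""
    either_small = min(n1, n2) < ths + 3  # fewer than ths + 3 heavy atoms
    return 1.0 if (n_common >= ths or either_small) else 0.0
```

The small-molecule exemption means that, with the default ths of 4, any pair involving a molecule of fewer than 7 heavy atoms passes regardless of the overlap.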

lomap.gufe_bindings.scorers.atomic_number_score(mapping: LigandAtomMapping, beta: float = 0.1, difficulty: dict[int, dict[int, float]] | None = None) -> float

A score on the elemental changes happening in the mapping

For each transmuted atom, a mismatch score is summed according to the difficulty scores (see the difficulty parameter). The final score is then given as:

\[score = exp(-beta * mismatch)\]

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • beta (float, default 0.1) – Scaling factor for this rule, default 0.1

  • difficulty (dict[int, dict[int, float]] | None, default None) – A dict of dicts, mapping the atomic number of one species to that of another, and then to a score for the mismatch in identity between these elements. A value of 1.0 indicates the two elements are considered interchangeable and 0.0 indicates they are incompatible; a default of 0.5 is used for pairs not listed. If None, the scores in openfe.setup.lomap_mapper.DEFAULT_ANS_DIFFICULT are used.

Returns:

score

Return type:

float
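A rough sketch of how the difficulty table could feed the formula above. It assumes that each transmuted pair contributes 1 - difficulty to the mismatch sum; the documentation here does not state that detail, so treat it as an assumption. The name atomic_number_sketch and the example table are hypothetical.

```python
import math


def atomic_number_sketch(transmuted_pairs, difficulty=None, beta=0.1):
    """exp(-beta * mismatch) over (atomic_number_a, atomic_number_b) pairs."""
    difficulty = difficulty or {}
    mismatch = 0.0
    for z_a, z_b in transmuted_pairs:
        # Unlisted element pairs fall back to the documented default of 0.5.
        compatibility = difficulty.get(z_a, {}).get(z_b, 0.5)
        # ASSUMPTION: 1.0 (interchangeable) adds no mismatch, 0.0 adds one.
        mismatch += 1.0 - compatibility
    return math.exp(-beta * mismatch)


# Hypothetical table: treat C -> N as fully interchangeable.
example_table = {6: {7: 1.0}}
free_swap = atomic_number_sketch([(6, 7)], difficulty=example_table)
one_unknown = atomic_number_sketch([(6, 8)])
```

Under this assumption, a mapping with no element changes scores exactly 1.0, and each unlisted element change costs a factor of exp(-0.05) at the default beta.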

lomap.gufe_bindings.scorers.hybridization_score(mapping: LigandAtomMapping, beta: float = 0.15) -> float

Hybridization score — penalizes atom hybridization mismatches in the mapping.

For each mapped heavy atom pair with differing hybridization states, a mismatch is counted. Nitrogen (N) sp3/sp2 interchanges are permitted and are not counted. The final score is:

\[score = exp(-beta * nmismatch)\]

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • beta (float, default 0.15) – Scaling factor.

Returns:

score – Value in the range [0, 1.0], with 1.0 indicating no hybridization mismatches.

Return type:

float

lomap.gufe_bindings.scorers.sulfonamides_score(mapping: LigandAtomMapping, beta: float = 0.4) -> float

Sulfonamide score — penalizes mappings that mutate a sulfonamide group in or out.

Testing has shown that growing a sulfonamide from scratch performs very badly. Returns math.exp(-beta) if a sulfonamide group appears in the unmapped remainder of either molecule, otherwise 1.0.

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • beta (float, default 0.4) – Scaling factor controlling the size of the penalty. Larger values give larger penalties, since the returned score is math.exp(-beta).

Returns:

score – math.exp(-beta) if a sulfonamide is mutated in or out, else 1.0.

Return type:

float

lomap.gufe_bindings.scorers.heterocycles_score(mapping: LigandAtomMapping, beta: float = 0.4) -> float

Heterocycle score — penalizes mappings that form a heterocycle from a hydrogen.

Returns math.exp(-beta) if a heterocycle is formed from a hydrogen. Testing has shown that growing a pyridine or other heterocycle is unlikely to work (better to grow phenyl than mutate).

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • beta (float, default 0.4) – Scaling factor controlling the size of the penalty.

Returns:

score – math.exp(-beta) if a disallowed heterocycle is formed, else 1.0.

Return type:

float

lomap.gufe_bindings.scorers.transmuting_methyl_into_ring_score(mapping: LigandAtomMapping, beta: float = 0.1, penalty: float = 6.0) -> float

Penalises an unmapped ring atom becoming a non-ring atom.

This score would, for example, penalise R-CH3 to R-Ph, where R is the same mapped atom and both CH3 and Ph are unmapped. It does not penalise R-H to R-Ph. If any atoms trigger the rule, the returned score is:

\[exp(-1 * beta * penalty)\]

Parameters:
  • mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

  • beta (float, default 0.1) – Score scaling factor.

  • penalty (float, default 6.0) – Penalty applied when the rule is triggered; it is multiplied by beta in the exponent.

Returns:

score

Return type:

float
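As a worked example of the expression above, the default beta and penalty give the following score when the rule is triggered. This is plain arithmetic on the documented defaults; the variable names are illustrative.

```python
import math

beta = 0.1     # documented default
penalty = 6.0  # documented default

# Score returned when any atom triggers the rule: exp(-1 * beta * penalty)
triggered_score = math.exp(-1 * beta * penalty)
print(round(triggered_score, 4))  # 0.5488
```

Since beta and penalty only enter as a product, doubling either one squares this value, giving roughly 0.30.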

lomap.gufe_bindings.scorers.transmuting_ring_sizes_score(mapping: LigandAtomMapping) -> float

Ring size score — penalizes mappings that alter a ring size.

Checks first-degree neighbors of mapped atoms; if a non-mapped neighbor is in a ring in both molecules but the ring sizes differ, a value of 0.1 is returned, otherwise 1.0 is returned.

Parameters:

mapping (LigandAtomMapping) – Mapping between the two ligands in the edge.

Returns:

score – 0.1 if any ring size change is detected, else 1.0.

Return type:

float