atomium.mmcif

Contains functions for dealing with the .cif file format.

atomium.mmcif.add_atom_to_non_polymer(atom, aniso, model, mol_type, names)[source]

Takes an MMCIF atom dictionary, converts it, and adds it to a non-polymer dictionary.

Parameters

atom (dict) – the .mmcif dictionary to read.
aniso (dict) – lookup dictionary for anisotropy information.
model (dict) – the model to update.
mol_type (str) – non-polymer or water.
names (dict) – the lookup dictionary for full name information.

atomium.mmcif.add_atom_to_polymer(atom, aniso, model, names)[source]

Takes an MMCIF atom dictionary, converts it, and adds it to a polymer dictionary.

Parameters

atom (dict) – the .mmcif dictionary to read.
aniso (dict) – lookup dictionary for anisotropy information.
model (dict) – the model to update.
names (dict) – the lookup dictionary for full name information.

atomium.mmcif.add_secondary_structure_to_polymers(model, ss_dict)[source]

Updates polymer dictionaries with secondary structure information, from a previously created mapping.

Parameters

model (dict) – the model to update.
ss_dict (dict) – the mapping to read.

atomium.mmcif.add_sequences_to_polymers(model, mmcif_dict, entities)[source]

Takes a pre-populated mapping of chain IDs to entity IDs, and uses them to add sequence information to a model.

Parameters

model (dict) – the model to update.
mmcif_dict (dict) – the .mmcif dictionary to read.
entities (dict) – a mapping of chain IDs to entity IDs.

atomium.mmcif.assign_metrics_to_assembly(mmcif_dict, assembly)[source]

Takes an assembly dict, and goes through an mmcif dictionary looking for relevant metrics (energy and so on) to update it with.

Parameters

mmcif_dict (dict) – the dictionary to read.
assembly (dict) – the assembly to update.

atomium.mmcif.assign_transformations_to_assembly(mmcif_dict, operations, assembly)[source]

Takes an assembly dict, and goes through an mmcif dictionary looking for relevant transformation information to update it with.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
operations (dict) – the processed operations matrices.
assembly (dict) – the assembly to update.

atomium.mmcif.atom_dict_to_atom_dict(d, aniso_dict)[source]

Turns an .mmcif atom dictionary into an atomium atom data dictionary.

Parameters

d (dict) – the .mmcif atom dictionary.
aniso_dict (dict) – the mapping of atom IDs to anisotropy.

Return type

dict

atomium.mmcif.atom_to_atom_line(atom)[source]

Takes an atomium atom and turns it into a .cif ATOM record.

Parameters: atom (Atom) – the atom to read.
Return type: str

atomium.mmcif.consolidate_strings(lines)[source]

Generally, .cif files have a one-to-one correspondence between file lines and table rows. Sometimes, however, a string cell is given a line of its own, breaking the row over several lines. This function takes the lines of a .cif file and puts each table row on a single line.

Parameters: lines (deque) – the .cif file lines.
Return type: deque
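
The consolidation described above can be sketched roughly as follows. This is not atomium's own code: the function name join_multiline_strings is hypothetical, and the sketch assumes that a broken-out string cell is delimited by lines starting with a semicolon (the usual CIF convention for multi-line text fields) and that the folded text is re-attached to the previous line in double quotes.

```python
from collections import deque

def join_multiline_strings(lines):
    """Hypothetical sketch: fold ';'-delimited text blocks into the previous line."""
    lines = deque(lines)
    joined = deque()
    while lines:
        line = lines.popleft()
        if line.startswith(";") and joined:
            # Collect everything up to the closing ';' line.
            parts = [line[1:].strip()]
            while lines and not lines[0].startswith(";"):
                parts.append(lines.popleft().strip())
            if lines:
                lines.popleft()  # discard the closing ';' line
            text = " ".join(p for p in parts if p)
            # Assumption: the string is appended to the row it belongs to, quoted.
            joined[-1] += ' "{}"'.format(text)
        else:
            joined.append(line)
    return joined

example = deque(["_struct.title", ";A long", "title", ";", "_struct.id 1"])
print(list(join_multiline_strings(example)))
```

The real function may treat embedded quote marks and other edge cases differently; the sketch only shows the row-folding idea.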

atomium.mmcif.create_entities(chains, ligands, waters)[source]

Creates a list of entities from chains, ligands and waters.

Parameters

chains (set) – the chains.
ligands (set) – the ligands.
waters (set) – the waters.

Return type

list

atomium.mmcif.get_atom_name(atom)[source]

Formats an atom name for packing in .cif.

Parameters: atom (Atom) – the atom to read.
Return type: str

atomium.mmcif.get_operation_id_groups(expression)[source]

Takes an operator expression from an .mmcif transformation dict, and works out what transformation IDs it is referring to. For example, (1,2,3) becomes [[1, 2, 3]], (1-3)(8-11,17) becomes [[1, 2, 3], [8, 9, 10, 11, 17]], and so on.

Parameters: expression (str) – the expression to parse.
Return type: list
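
The expansion described above can be illustrated with a minimal sketch. The function name parse_operation_expression is hypothetical and this is not the library's implementation; it assumes every ID in the expression is an integer, as in the examples given, and that a bare expression without brackets forms a single group.

```python
import re

def parse_operation_expression(expression):
    """Hypothetical sketch: expand an operator expression into ID groups."""
    groups = []
    # Each bracketed section is one group; a bare expression is one group.
    sections = re.findall(r"\(([^()]*)\)", expression) or [expression]
    for section in sections:
        ids = []
        for part in section.split(","):
            part = part.strip()
            if "-" in part:
                # A range such as 8-11 is inclusive at both ends.
                start, end = part.split("-")
                ids.extend(range(int(start), int(end) + 1))
            elif part:
                ids.append(int(part))
        groups.append(ids)
    return groups

print(parse_operation_expression("(1,2,3)"))         # [[1, 2, 3]]
print(parse_operation_expression("(1-3)(8-11,17)"))  # [[1, 2, 3], [8, 9, 10, 11, 17]]
```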

atomium.mmcif.get_structure_from_atom(atom, chains, ligands, waters)[source]

Gets an atom’s molecule and adds it to one of three sets.

Parameters

atom (Atom) – the atom to check.
chains (set) – the set of chains.
ligands (set) – the set of ligands.
waters (set) – the set of waters.

atomium.mmcif.loop_block_to_list(block)[source]

Takes a loop block dict where the initial lines are table headers and turns it into a table list. Sometimes a row is broken over several lines so this function deals with that too.

Parameters: block (dict) – the .cif block to process.
Return type: list

atomium.mmcif.make_aniso(mmcif_dict)[source]

Makes a mapping of atom IDs to anisotropy information.

Parameters: mmcif_dict (dict) – the .mmcif dict to read.
Return type: dict

atomium.mmcif.make_residue_id(d)[source]

Generates a residue ID for an atom.

Parameters: d (dict) – the atom dictionary to read.
Return type: str

atomium.mmcif.make_secondary_structure(mmcif_dict)[source]

Creates a dictionary of helices and strands, with each having a list of start and end residues.

Parameters: mmcif_dict (dict) – the .mmcif dict to read.
Return type: dict

atomium.mmcif.make_sequences(mmcif_dict)[source]

Creates a mapping of entity IDs to sequences.

Parameters: mmcif_dict (dict) – the .mmcif dictionary to read.
Return type: dict

atomium.mmcif.mmcif_dict_to_data_dict(mmcif_dict)[source]

Converts an .mmcif dictionary into an atomium data dictionary, with the same standard layout that the other file formats get converted into.

Parameters: mmcif_dict (dict) – the .mmcif dictionary.
Return type: dict

atomium.mmcif.mmcif_lines_to_mmcif_blocks(lines)[source]

A .cif file is ultimately a list of tables. This function takes a list of .cif file lines and splits them into these table blocks. Each block will be a dict containing a category name and a list of lines.

Parameters: lines (deque) – the .cif file lines.
Return type: list
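
One way to picture the splitting is the sketch below. It is an illustration under stated assumptions rather than the library's code: the name lines_to_blocks and the dict keys "category" and "lines" are hypothetical, a loop_ line is assumed to open a table whose header lines follow it, and consecutive non-loop lines sharing a category are assumed to form one block.

```python
def lines_to_blocks(lines):
    """Hypothetical sketch: group .cif lines into one dict per table."""
    blocks = []
    for line in lines:
        if line.startswith("data_"):
            continue  # skip the data block header
        if line == "loop_":
            # The category is not known until the first header line is seen.
            blocks.append({"category": None, "lines": []})
            continue
        if line.startswith("_"):
            category = line.split(".")[0].lstrip("_")
            if blocks and blocks[-1]["category"] is None:
                blocks[-1]["category"] = category
            elif not blocks or blocks[-1]["category"] != category:
                blocks.append({"category": category, "lines": []})
        if blocks:
            blocks[-1]["lines"].append(line)
    return blocks

example = [
    "data_1ABC", "_entry.id 1ABC",
    "loop_", "_atom_type.symbol", "_atom_type.mass", "C 12", "N 14",
    "_cell.length_a 10", "_cell.length_b 20",
]
for block in lines_to_blocks(example):
    print(block["category"], block["lines"])
```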

atomium.mmcif.mmcif_string_to_mmcif_dict(filestring)[source]

Takes a .cif filestring and turns it into a dict which represents its table structure. Only lines which aren’t empty and which don’t begin with # are used.

Multi-line strings are consolidated onto one line, and the whole thing is then split into the blocks that will become table lists. At the end, quote marks are removed from any string which retains them.

Parameters: filestring (str) – the .cif filestring to process.
Return type: dict
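
The line filtering mentioned above can be sketched as below. The helper name usable_lines is hypothetical and only the first step (dropping empty lines and # lines) is shown; returning a deque is an assumption based on the deque parameters of the related functions.

```python
from collections import deque

def usable_lines(filestring):
    """Hypothetical helper: keep non-empty lines that do not start with '#'."""
    return deque(
        line for line in filestring.splitlines()
        if line.strip() and not line.startswith("#")
    )

sample = "data_1ABC\n#\n_entry.id 1ABC\n\n# a comment\n_cell.length_a 10\n"
print(list(usable_lines(sample)))
# ['data_1ABC', '_entry.id 1ABC', '_cell.length_a 10']
```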

atomium.mmcif.mmcif_to_data_transfer(mmcif_dict, data_dict, d_cat, d_key, m_table, m_key, date=False, split=False, multi=False, func=None)[source]

Transfers a single piece of data from an .mmcif dictionary to a data dictionary, or does nothing if the data doesn’t exist.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.
d_cat (str) – the top-level key in the data dictionary.
d_key (str) – the data dictionary field to update.
m_table (str) – the name of the .mmcif table to look in.
m_key (str) – the .mmcif field to read.
date (bool) – if True, the value will be converted to a date.
split (bool) – if True, the value will be split on commas.
multi (bool) – if True, every row in the table will be read.
func (function) – if given, this will be applied to the value.
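
The behaviour of these parameters might look like the following sketch. It is a simplified stand-in, not the library's function: the name transfer is hypothetical, the .mmcif dictionary is assumed to map table names to lists of row dicts, dates are assumed to be in YYYY-MM-DD form, and combinations such as date together with multi are not handled.

```python
from datetime import datetime

def transfer(mmcif_dict, data_dict, d_cat, d_key, m_table, m_key,
             date=False, split=False, multi=False, func=None):
    """Hypothetical sketch: copy one value across, or do nothing if absent."""
    try:
        rows = mmcif_dict[m_table]
        # multi reads the field from every row, otherwise only the first row.
        value = [row[m_key] for row in rows] if multi else rows[0][m_key]
    except (KeyError, IndexError):
        return  # table or field missing: leave data_dict untouched
    if date:
        value = datetime.strptime(value, "%Y-%m-%d").date()
    if split:
        value = value.split(",")
    if func:
        value = func(value)
    data_dict[d_cat][d_key] = value

mmcif = {"struct": [{"title": "LYSOZYME"}]}
data = {"description": {"title": None, "code": None}}
transfer(mmcif, data, "description", "title", "struct", "title")
transfer(mmcif, data, "description", "code", "entry", "id")  # no such table
print(data)
```

In this example the second call leaves the code field as None because the entry table is absent.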

atomium.mmcif.non_loop_block_to_list(block)[source]

Takes a simple block dict with no loop and turns it into a table list.

Parameters: block (dict) – the .cif block to process.
Return type: list

atomium.mmcif.operation_id_groups_to_operations(operations, operation_id_groups)[source]

Creates a list of operation matrices for an assembly, from a list of operation IDs - cross multiplying as required.

Parameters

operations (dict) – the parsed .mmcif operations.
operation_id_groups (list) – the operation IDs.
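
The cross multiplication can be sketched as follows. This is an illustration only: the names matmul and combine_operations are hypothetical, the operations dict is assumed to map IDs to square matrices stored as nested lists, and the left-to-right multiplication order is an assumption. One matrix product is produced for every combination that takes one ID from each group.

```python
from functools import reduce
from itertools import product

def matmul(a, b):
    """Multiplies two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine_operations(operations, operation_id_groups):
    """Hypothetical sketch: one matrix product per combination of IDs."""
    combined = []
    for ids in product(*operation_id_groups):
        matrices = [operations[id_] for id_ in ids]
        combined.append(reduce(matmul, matrices))
    return combined

# Toy 2x2 matrices keyed by operation ID (real operations would be larger).
operations = {
    1: [[1, 0], [0, 1]],   # identity
    2: [[0, -1], [1, 0]],  # 90 degree rotation
    3: [[2, 0], [0, 2]],   # uniform scaling
}
print(combine_operations(operations, [[1, 2], [3]]))
# [[[2, 0], [0, 2]], [[0, -2], [2, 0]]]
```

With a single group, such as [[1, 2, 3]], each matrix is returned unchanged because there is nothing to multiply it with.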

atomium.mmcif.split_residue_id(atom)[source]

Takes an atom and splits its het ID into components.

Parameters: atom (Atom) – the atom to read.
Return type: tuple

atomium.mmcif.split_values(line)[source]

The body of a .cif table is a series of lines, with each cell divided by whitespace. This function takes a string line and breaks it into cells.

There are a few peculiarities to handle. Sometimes a cell is a string enclosed in quote marks, and spaces within this string obviously shouldn’t be used to break the line. This function handles all of that.

Parameters: line (str) – the .cif line to split.
Return type: list
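
A quote-aware split of this kind can be sketched with a small state machine. The name split_cif_line is hypothetical and the sketch simplifies CIF quoting: a quote mark only opens a string at the start of a cell, and the next matching quote mark closes it. Unlike the real function may do, the sketch drops the enclosing quote marks straight away.

```python
def split_cif_line(line):
    """Hypothetical sketch: split on whitespace, keeping quoted cells whole."""
    cells, current, quote = [], "", None
    for char in line:
        if quote:
            if char == quote:
                # Closing quote: the cell is complete, even if empty.
                cells.append(current)
                current, quote = "", None
            else:
                current += char
        elif char in "'\"" and not current:
            quote = char  # opening quote at the start of a cell
        elif char.isspace():
            if current:
                cells.append(current)
                current = ""
        else:
            current += char
    if current:
        cells.append(current)
    return cells

print(split_cif_line("ATOM 1 'N A' \"O5'\" x"))
# ['ATOM', '1', 'N A', "O5'", 'x']
```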

atomium.mmcif.strip_quotes(mmcif_dict)[source]

Goes through each table in the mmcif dict and removes any unneeded quote marks from the cells.

Parameters: mmcif_dict (dict) – the almost finished .mmcif dictionary to clean.

atomium.mmcif.structure_to_mmcif_string(structure)[source]

Converts an AtomStructure to a .cif filestring.

Parameters: structure (AtomStructure) – the structure to convert.
Return type: str

atomium.mmcif.update_crystallography_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its crystallography sub-sub-dictionary with information from a .mmcif dictionary.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_description_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its description sub-dictionary with information from a .mmcif dictionary.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_experiment_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its experiment sub-dictionary with information from a .mmcif dictionary.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_geometry_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its geometry sub-dictionary with information from a .mmcif dictionary.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_lines_with_entities(lines, entities)[source]

Updates a list of .cif lines with relevant information about entities.

Parameters

lines (list) – the list of lines to update.
entities (list) – the entities to pack.

atomium.mmcif.update_lines_with_structures(lines, chains, ligands, waters, entities)[source]

Updates a list of .cif lines with relevant information about structures.

Parameters

lines (list) – the list of lines to update.
chains (set) – the chains.
ligands (set) – the ligands.
waters (set) – the waters.
entities (list) – the entities to pack.

atomium.mmcif.update_models_list(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its models list with information from a .mmcif dictionary.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_quality_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its quality sub-dictionary with information from a .mmcif dictionary.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.
data_dict (dict) – the data dictionary to update.