atomium.mmcif

Contains functions for dealing with the .cif file format.

atomium.mmcif.add_atom_to_non_polymer(atom, aniso, model, mol_type, names)[source]

Takes an MMCIF atom dictionary, converts it, and adds it to a non-polymer dictionary.

Parameters
  • atom (dict) – the .mmcif dictionary to read.

  • aniso (dict) – lookup dictionary for anisotropy information.

  • model (dict) – the model to update.

  • mol_type (str) – non-polymer or water.

  • names (dict) – the lookup dictionary for full name information.

atomium.mmcif.add_atom_to_polymer(atom, aniso, model, names)[source]

Takes an MMCIF atom dictionary, converts it, and adds it to a polymer dictionary.

Parameters
  • atom (dict) – the .mmcif dictionary to read.

  • aniso (dict) – lookup dictionary for anisotropy information.

  • model (dict) – the model to update.

  • names (dict) – the lookup dictionary for full name information.

atomium.mmcif.add_secondary_structure_to_polymers(model, ss_dict)[source]

Updates polymer dictionaries with secondary structure information, from a previously created mapping.

Parameters
  • model (dict) – the model to update.

  • ss_dict (dict) – the mapping to read.

atomium.mmcif.add_sequences_to_polymers(model, mmcif_dict, entities)[source]

Takes a pre-populated mapping of chain IDs to entity IDs, and uses it to add sequence information to a model.

Parameters
  • model (dict) – the model to update.

  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • entities (dict) – a mapping of chain IDs to entity IDs.

atomium.mmcif.assign_metrics_to_assembly(mmcif_dict, assembly)[source]

Takes an assembly dict, and goes through an .mmcif dictionary looking for relevant metrics (such as energy information) to update it with.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • assembly (dict) – the assembly to update.

atomium.mmcif.assign_transformations_to_assembly(mmcif_dict, operations, assembly)[source]

Takes an assembly dict, and goes through an mmcif dictionary looking for relevant transformation information to update it with.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • operations (dict) – the processed operations matrices.

  • assembly (dict) – the assembly to update.

atomium.mmcif.atom_dict_to_atom_dict(d, aniso_dict)[source]

Turns an .mmcif atom dictionary into an atomium atom data dictionary.

Parameters
  • d (dict) – the .mmcif atom dictionary.

  • aniso_dict (dict) – the mapping of atom IDs to anisotropy.

Return type

dict

atomium.mmcif.atom_to_atom_line(atom)[source]

Takes an atomium atom and turns it into a .cif ATOM record.

Parameters

atom (Atom) – the atom to read.

Return type

str

atomium.mmcif.consolidate_strings(lines)[source]

Generally, each line of a .cif file corresponds to one table row. Sometimes, however, a string cell is given a line of its own, breaking the row over several lines. This function takes the lines of a .cif file and puts each table row on a single line.

Parameters

lines (deque) – the .cif file lines.

Return type

deque
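
As a rough illustration of the consolidation described above, here is a minimal sketch rather than the library's own code. It assumes the multi-line strings are the semicolon-delimited text fields of the CIF syntax, joins their parts with spaces, and wraps the result in double quotes. The name join_multiline_strings is hypothetical.

```python
from collections import deque

def join_multiline_strings(lines):
    """Hypothetical sketch: fold ';'-delimited text fields onto the previous line."""
    lines = deque(lines)
    joined = deque()
    while lines:
        line = lines.popleft()
        if line.startswith(";") and joined:
            # Collect everything up to the closing ';' line.
            parts = [line[1:]]
            while lines:
                following = lines.popleft()
                if following.strip() == ";":
                    break
                parts.append(following)
            text = " ".join(part.strip() for part in parts if part.strip())
            # Assumption: the string is re-attached as a double-quoted cell.
            joined[-1] += ' "' + text + '"'
        else:
            joined.append(line)
    return joined

example = ["_struct.title", ";Crystal structure", "of a protein", ";", "_struct.id 1"]
print(list(join_multiline_strings(example)))
```

A string that itself contains double quotes would need extra care, which this sketch does not attempt.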

atomium.mmcif.create_entities(chains, ligands, waters)[source]

Creates a list of entities from chains, ligands and waters.

Parameters
  • chains (set) – the chains.

  • ligands (set) – the ligands.

  • waters (set) – the waters.

Return type

list

atomium.mmcif.get_atom_name(atom)[source]

Formats an atom name for packing in .cif.

Parameters

atom (Atom) – the atom to read.

Return type

str

atomium.mmcif.get_operation_id_groups(expression)[source]

Takes an operator expression from an .mmcif transformation dict, and works out what transformation IDs it is referring to. For example, (1,2,3) becomes [[1, 2, 3]], (1-3)(8-11,17) becomes [[1, 2, 3], [8, 9, 10, 11, 17]], and so on.

Parameters

expression (str) – The expression to parse.

Return type

list
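
The two examples above can be reproduced with a short sketch. This is not the library's implementation; it assumes every ID in the expression is an integer, as in the examples, and the name parse_operation_expression is hypothetical.

```python
import re

def parse_operation_expression(expression):
    """Hypothetical sketch: expand an operator expression into groups of IDs."""
    # Each parenthesised section is one group; an expression without
    # parentheses is treated as a single group.
    sections = re.findall(r"\(([^()]*)\)", expression) or [expression]
    groups = []
    for section in sections:
        ids = []
        for part in section.split(","):
            part = part.strip()
            if "-" in part:
                # A range such as "8-11" is inclusive at both ends.
                start, end = part.split("-")
                ids.extend(range(int(start), int(end) + 1))
            elif part:
                ids.append(int(part))
        groups.append(ids)
    return groups

print(parse_operation_expression("(1,2,3)"))         # [[1, 2, 3]]
print(parse_operation_expression("(1-3)(8-11,17)"))  # [[1, 2, 3], [8, 9, 10, 11, 17]]
```

Real files may use non-numeric operation IDs, which this sketch would reject with a ValueError.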

atomium.mmcif.get_structure_from_atom(atom, chains, ligands, waters)[source]

Gets an atom’s molecule and adds it to one of three sets: chains, ligands or waters.

Parameters
  • atom (Atom) – the atom to check.

  • chains (set) – the set of chains.

  • ligands (set) – the set of ligands.

  • waters (set) – the set of waters.

atomium.mmcif.loop_block_to_list(block)[source]

Takes a loop block dict where the initial lines are table headers and turns it into a table list. Sometimes a row is broken over several lines so this function deals with that too.

Parameters

block (dict) – the .cif block to process.

Return type

list
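
One way to cope with rows broken over several lines is to pool every value and then cut the pool into rows as wide as the header. The sketch below illustrates that idea under stated assumptions: the block stores its lines under a "lines" key, the first line is loop_, and cells are split on plain whitespace with no quote handling. The name loop_block_to_rows and the row-dict output are hypothetical.

```python
def loop_block_to_rows(block):
    """Hypothetical sketch: turn a loop block into a list of row dicts."""
    headers, values = [], []
    for line in block["lines"][1:]:  # skip the leading "loop_" line
        if line.startswith("_") and not values:
            # Header lines look like "_category.field"; keep the field name.
            headers.append(line.strip().split(".", 1)[1])
        else:
            # Pool the cells, so a row split over two lines is still complete.
            values.extend(line.split())
    if not headers:
        return []
    width = len(headers)
    return [
        dict(zip(headers, values[start:start + width]))
        for start in range(0, len(values), width)
    ]

block = {
    "category": "atom_type",
    "lines": ["loop_", "_atom_type.symbol", "_atom_type.mass", "C 12.011", "N", "14.007"],
}
print(loop_block_to_rows(block))
```

Here the second row is spread over two lines ("N" and "14.007") and still comes out as one row.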

atomium.mmcif.make_aniso(mmcif_dict)[source]

Makes a mapping of atom IDs to anisotropy information.

Parameters

mmcif_dict (dict) – the .mmcif dict to read.

Return type

dict

atomium.mmcif.make_residue_id(d)[source]

Generates a residue ID for an atom.

Parameters

d (dict) – the atom dictionary to read.

Return type

str

atomium.mmcif.make_secondary_structure(mmcif_dict)[source]

Creates a dictionary of helices and strands, with each having a list of start and end residues.

Parameters

mmcif_dict (dict) – the .mmcif dict to read.

Return type

dict

atomium.mmcif.make_sequences(mmcif_dict)[source]

Creates a mapping of entity IDs to sequences.

Parameters

mmcif_dict (dict) – the .mmcif dictionary to read.

Return type

dict

atomium.mmcif.mmcif_dict_to_data_dict(mmcif_dict)[source]

Converts an .mmcif dictionary into an atomium data dictionary, with the same standard layout that the other file formats get converted into.

Parameters

mmcif_dict (dict) – the .mmcif dictionary.

Return type

dict

atomium.mmcif.mmcif_lines_to_mmcif_blocks(lines)[source]

A .cif file is ultimately a list of tables. This function takes a list of .cif file lines and splits them into these table blocks. Each block will be a dict containing a category name and a list of lines.

Parameters

lines (deque) – the .cif file lines.

Return type

list
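
The splitting described above might look roughly like the following sketch. It is an illustration only: the "category" and "lines" keys, the rule that a loop_ line always opens a new block, and the name lines_to_blocks are all assumptions rather than documented behaviour.

```python
from collections import deque

def lines_to_blocks(lines):
    """Hypothetical sketch: group .cif lines into per-category blocks."""
    lines = deque(lines)
    blocks, current = [], None
    while lines:
        line = lines.popleft()
        if line.startswith("data_"):
            continue  # the data_ header is not part of any table
        if line.startswith("loop_"):
            # The category of a loop is taken from its first header line.
            category = lines[0].split(".")[0][1:] if lines else ""
            current = {"category": category, "lines": [line]}
            blocks.append(current)
        elif line.startswith("_"):
            category = line.split(".")[0][1:]
            if current is None or current["category"] != category:
                current = {"category": category, "lines": []}
                blocks.append(current)
            current["lines"].append(line)
        elif current is not None:
            current["lines"].append(line)  # a data row of the current table
    return blocks

example = [
    "data_1XOM", "_entry.id 1XOM", "_cell.length_a 10.0", "_cell.length_b 20.0",
    "loop_", "_atom_type.symbol", "C", "N",
]
for block in lines_to_blocks(example):
    print(block["category"], block["lines"])
```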

atomium.mmcif.mmcif_string_to_mmcif_dict(filestring)[source]

Takes a .cif filestring and turns it into a dict which represents its table structure. Only lines which aren’t empty and which don’t begin with # are used.

Multi-line strings are consolidated onto one line, and the whole thing is then split into the blocks that will become table lists. At the end, quote marks are removed from any string which retains them.

Parameters

filestring (str) – the .cif filestring to process.

Return type

dict
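
The first step, discarding empty lines and lines that begin with #, can be sketched as below. The name usable_lines is hypothetical, and treating whitespace-only lines as empty is an assumption.

```python
from collections import deque

def usable_lines(filestring):
    """Hypothetical sketch: keep only non-empty lines that do not start with '#'."""
    return deque(
        line for line in filestring.splitlines()
        if line.strip() and not line.startswith("#")
    )

print(list(usable_lines("data_1XOM\n#\n\n_entry.id 1XOM\n# comment\n")))
```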

atomium.mmcif.mmcif_to_data_transfer(mmcif_dict, data_dict, d_cat, d_key, m_table, m_key, date=False, split=False, multi=False, func=None)[source]

A function for transferring a bit of data from a .mmcif dictionary to a data dictionary, or doing nothing if the data doesn’t exist.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.

  • d_cat (str) – the top-level key in the data dictionary.

  • d_key (str) – the data dictionary field to update.

  • m_table (str) – the name of the .mmcif table to look in.

  • m_key (str) – the .mmcif field to read.

  • date (bool) – if True, the value will be converted to a date.

  • split (bool) – if True, the value will be split on commas.

  • multi (bool) – if True, every row in the table will be read.

  • func (function) – if given, this will be applied to the value.
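
To make the interplay of these flags concrete, here is a minimal sketch of a transfer helper with the same parameters. It is not the library's code. It assumes each .mmcif table is a list of row dicts, that dates are written as YYYY-MM-DD, and that a missing table, row or field is simply skipped. The name transfer_value is hypothetical.

```python
from datetime import date as date_type, datetime

def transfer_value(mmcif_dict, data_dict, d_cat, d_key, m_table, m_key,
                   date=False, split=False, multi=False, func=None):
    """Hypothetical sketch: copy one value across, or do nothing if absent."""
    try:
        rows = mmcif_dict[m_table]
        # multi reads the field from every row, otherwise only the first row.
        value = [row[m_key] for row in rows] if multi else rows[0][m_key]
        if date:
            value = datetime.strptime(value, "%Y-%m-%d").date()
        if split:
            value = [part.strip() for part in value.split(",")]
        if func:
            value = func(value)
    except (KeyError, IndexError, ValueError):
        return  # the data doesn't exist or can't be converted: leave as is
    data_dict[d_cat][d_key] = value

mmcif = {
    "struct_keywords": [{"text": "LYASE, TIM BARREL"}],
    "pdbx_database_status": [{"recvd_initial_deposition_date": "2002-05-06"}],
}
data = {"description": {"keywords": [], "deposition_date": None, "title": None}}
transfer_value(mmcif, data, "description", "keywords", "struct_keywords", "text", split=True)
transfer_value(mmcif, data, "description", "deposition_date", "pdbx_database_status",
               "recvd_initial_deposition_date", date=True)
transfer_value(mmcif, data, "description", "title", "struct", "title")  # table missing
print(data)
```

The last call leaves the title untouched because the example dictionary has no struct table.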

atomium.mmcif.non_loop_block_to_list(block)[source]

Takes a simple block dict with no loop and turns it into a table list.

Parameters

block (dict) – the .cif block to process.

Return type

list

atomium.mmcif.operation_id_groups_to_operations(operations, operation_id_groups)[source]

Creates a list of operation matrices for an assembly from a list of operation ID groups, cross-multiplying as required.

Parameters
  • operations (dict) – the parsed .mmcif operations.

  • operation_id_groups (list) – the operation IDs.
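
The cross-multiplication can be pictured as taking one ID from each group, in every possible combination, and multiplying the corresponding matrices together. The sketch below shows this under stated assumptions: operations maps each ID to a 4x4 matrix given as nested lists, and matrices are multiplied left to right in group order. The names matmul and combine_operations are hypothetical.

```python
from functools import reduce
from itertools import product

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def combine_operations(operations, operation_id_groups):
    """Hypothetical sketch: one combined matrix per combination of IDs."""
    combined = []
    for ids in product(*operation_id_groups):
        matrices = [operations[id_] for id_ in ids]
        combined.append(reduce(matmul, matrices))
    return combined

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
shift_x = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
flip_x = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
operations = {1: identity, 2: shift_x, 3: flip_x}

# Two groups give 2 x 1 = 2 combined matrices; a single group is passed through.
print(len(combine_operations(operations, [[1, 2], [3]])))
print(len(combine_operations(operations, [[1, 2, 3]])))
```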

atomium.mmcif.split_residue_id(atom)[source]

Takes an atom and splits its het ID into components.

Parameters

atom (Atom) – the atom to read.

Return type

tuple

atomium.mmcif.split_values(line)[source]

The body of a .cif table is a series of lines, with each cell divided by whitespace. This function takes a string line and breaks it into cells.

There are a few peculiarities to handle. Sometimes a cell is a string enclosed in quote marks, and spaces within this string obviously shouldn’t be used to break the line. This function handles all of that.

Parameters

line (str) – the .cif line to split.

Return type

list
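
A quote-aware split of this kind can be sketched as a single pass over the characters. This is an illustration, not the library's code. It assumes a quote only opens a string at the start of a cell and only closes it when followed by whitespace or the end of the line. Unlike the documented pipeline, which removes quote marks in a later step, this sketch drops them as it goes. The name split_cif_line is hypothetical.

```python
def split_cif_line(line):
    """Hypothetical sketch: split a .cif row into cells, respecting quotes."""
    cells, current, quote = [], [], None
    for index, char in enumerate(line):
        if quote:
            at_end = index + 1 == len(line) or line[index + 1].isspace()
            if char == quote and at_end:
                # Closing quote: the quoted string is one complete cell.
                cells.append("".join(current))
                current, quote = [], None
            else:
                current.append(char)
        elif char in "'\"" and not current:
            quote = char  # opening quote at the start of a cell
        elif char.isspace():
            if current:
                cells.append("".join(current))
                current = []
        else:
            current.append(char)
    if current:
        cells.append("".join(current))
    return cells

print(split_cif_line("ATOM 1 'N A' \"C B\" 5.0"))
print(split_cif_line("C O5' x"))
```

The second call shows why the opening rule matters: the apostrophe in O5' sits inside a cell, so it is kept as part of the value rather than treated as a quote.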

atomium.mmcif.strip_quotes(mmcif_dict)[source]

Goes through each table in the mmcif dict and removes any unneeded quote marks from the cells.

Parameters

mmcif_dict (dict) – the almost finished .mmcif dictionary to clean.

atomium.mmcif.structure_to_mmcif_string(structure)[source]

Converts an AtomStructure to a .cif filestring.

Parameters

structure (AtomStructure) – the structure to convert.

Return type

str

atomium.mmcif.update_crystallography_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its crystallography sub-sub-dictionary with information from a .mmcif dictionary.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_description_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its description sub-dictionary with information from a .mmcif dictionary.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_experiment_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its experiment sub-dictionary with information from a .mmcif dictionary.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_geometry_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its geometry sub-dictionary with information from a .mmcif dictionary.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_lines_with_entities(lines, entities)[source]

Updates a list of .cif lines with relevant information about entities.

Parameters
  • lines (list) – the list of lines to update.

  • entities (list) – the entities to pack.

atomium.mmcif.update_lines_with_structures(lines, chains, ligands, waters, entities)[source]

Updates a list of .cif lines with relevant information about structures.

Parameters
  • lines (list) – the list of lines to update.

  • chains (set) – the chains.

  • ligands (set) – the ligands.

  • waters (set) – the waters.

  • entities (list) – the entities to pack.

atomium.mmcif.update_models_list(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its models list with information from a .mmcif dictionary.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.

atomium.mmcif.update_quality_dict(mmcif_dict, data_dict)[source]

Takes a data dictionary and updates its quality sub-dictionary with information from a .mmcif dictionary.

Parameters
  • mmcif_dict (dict) – the .mmcif dictionary to read.

  • data_dict (dict) – the data dictionary to update.