tidytcells.mh

Functions to manage MH gene data.

Functions

tidytcells.mh.get_chain(symbol=None, log_failures=None, gene=None, suppress_warnings=None)

Given a standardized MH gene / allele symbol, detect whether it codes for an alpha chain, beta chain, or beta-2 microglobulin (B2M) molecule.

Note

This function currently only recognises HLA (human leucocyte antigen or Homo sapiens MH), and not MH from other species.

Parameters:
  • symbol (str) – Standardized MH gene / allele symbol

  • log_failures (bool) – Report standardisation failures through logging (at level WARNING). Defaults to True.

  • gene (str) – Alias for symbol.

  • suppress_warnings (bool) – Disable warnings that are usually logged when standardisation fails. Deprecated in favour of log_failures.

Returns:

'alpha' or 'beta' if symbol is recognised and its chain is known, else None.

Return type:

Optional[str]
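
The call pattern described above can be sketched as follows. The helper chain_table is hypothetical and not part of tidytcells; it takes the lookup function as an argument so that it can be tried without the library installed. It assumes only the symbol and log_failures keyword arguments documented above.

```python
from typing import Callable, Dict, Iterable, Optional


def chain_table(
    symbols: Iterable[str],
    get_chain: Callable[..., Optional[str]],
) -> Dict[str, Optional[str]]:
    """Map each symbol to the chain reported by `get_chain`.

    `get_chain` is expected to behave like tidytcells.mh.get_chain:
    it accepts `symbol` and `log_failures` keyword arguments and
    returns a string, or None when the symbol is not recognised.
    """
    # log_failures=False keeps unrecognised symbols out of the log.
    return {s: get_chain(symbol=s, log_failures=False) for s in symbols}


# Intended use (assumes tidytcells is installed):
#   import tidytcells as tt
#   chain_table(["HLA-A", "HLA-DRB1"], tt.mh.get_chain)
```

Symbols that map to None are either unrecognised or have no known chain, so downstream code should handle that case explicitly.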

tidytcells.mh.get_class(symbol=None, log_failures=None, gene=None, suppress_warnings=None)

Given a standardized MH gene / allele symbol, detect whether it comprises a class I (MH1) or II (MH2) receptor.

Note

This function currently only recognises HLA (human leucocyte antigen or Homo sapiens MH), and not MH from other species.

Parameters:
  • symbol (str) – Standardized MH gene / allele symbol

  • log_failures (bool) – Report standardisation failures through logging (at level WARNING). Defaults to True.

  • gene (str) – Alias for symbol.

  • suppress_warnings (bool) – Disable warnings that are usually logged when standardisation fails. Deprecated in favour of log_failures.

Returns:

1 or 2 if symbol is recognised and its class is known, else None.

Return type:

Optional[int]
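
Because the return value is 1, 2, or None, it can be used directly as a grouping key. The sketch below illustrates this; group_by_class is a hypothetical helper, not part of tidytcells, and it receives the lookup function as an argument so that it does not depend on the library being installed.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional


def group_by_class(
    symbols: Iterable[str],
    get_class: Callable[..., Optional[int]],
) -> Dict[Optional[int], List[str]]:
    """Group symbols under the key 1, 2, or None.

    `get_class` is expected to behave like tidytcells.mh.get_class.
    """
    groups: Dict[Optional[int], List[str]] = defaultdict(list)
    for s in symbols:
        groups[get_class(symbol=s, log_failures=False)].append(s)
    return dict(groups)


# Intended use (assumes tidytcells is installed):
#   import tidytcells as tt
#   group_by_class(["HLA-A", "HLA-DRB1"], tt.mh.get_class)
```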

tidytcells.mh.query(species=None, precision=None, contains_pattern=None)

Query the list of all known MH genes / alleles.

Note

tidytcells’ knowledge of MH alleles is limited, especially outside of humans. tidytcells will allow you to query HLA alleles up to the level of the protein (first two allele designators), but that is the highest resolution available. For Mus musculus, there is currently only support for gene-level querying.

Parameters:
  • species (str) – Species to query (see above for supported species). Defaults to "homosapiens".

  • precision (str) – The level of precision to query. "allele" queries the set of all possible alleles. "gene" queries the set of all possible genes. Defaults to "allele".

  • contains_pattern (str) – An optional regular expression string used to filter the query result. If supplied, only genes / alleles that contain a match for the regular expression are returned. Defaults to None.

Returns:

The set of all genes / alleles that satisfy the given constraints.

Return type:

FrozenSet[str]
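
The filtering described for contains_pattern can be illustrated with a small stand-alone sketch. It assumes that "contain" means a match anywhere in the symbol (as with re.search); the helper filter_symbols and the sample set are hypothetical and are not tidytcells' implementation or data.

```python
import re
from typing import FrozenSet, Iterable, Optional


def filter_symbols(
    symbols: Iterable[str],
    contains_pattern: Optional[str] = None,
) -> FrozenSet[str]:
    """Keep only the symbols in which the pattern matches somewhere."""
    if contains_pattern is None:
        return frozenset(symbols)
    pattern = re.compile(contains_pattern)
    return frozenset(s for s in symbols if pattern.search(s))


# Illustrative gene symbols only; the real set comes from the library.
sample = frozenset({"HLA-A", "HLA-B", "HLA-DRB1", "HLA-DRB3"})
drb_genes = filter_symbols(sample, "DRB[0-9]")

# Intended use with the library itself (assumes tidytcells is installed):
#   import tidytcells as tt
#   tt.mh.query(species="homosapiens", precision="gene", contains_pattern="DRB")
```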

tidytcells.mh.standardise(*args, **kwargs)

Alias for tidytcells.mh.standardize().

Return type:

str | None

tidytcells.mh.standardize(symbol=None, species=None, precision=None, on_fail=None, log_failures=None, gene=None, suppress_warnings=None)

Attempt to standardize an MH gene / allele symbol to be IMGT-compliant.

Note

This function only verifies the validity of an MH gene / allele up to the level of the protein. More precise allele designators are not verified, apart from the requirement that their format (colon-separated numbers) looks valid. There are three reasons for this: new alleles at that level are added to the IMGT list quite often, which makes accurate verification difficult; people rarely need verification at such a precise level; and such verification costs more computational effort with diminishing returns.
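
To make the distinction concrete, here is a simplified sketch of a format-only check on allele designators, with the protein-level part (the first two fields) separated from the rest. The pattern and the helper split_designators are hypothetical illustrations, not the checks tidytcells actually performs.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical format check: one or more colon-separated numeric fields,
# for example "01:01:01".
FIELDS_RE = re.compile(r"\d+(?::\d+)*")


def split_designators(fields: str) -> Optional[Tuple[str, List[str]]]:
    """Return (protein-level part, remaining fields), or None if malformed."""
    if FIELDS_RE.fullmatch(fields) is None:
        return None
    parts = fields.split(":")
    # The first two fields identify the protein; later fields would only
    # be format-checked, not looked up.
    return ":".join(parts[:2]), parts[2:]
```

For example, split_designators("01:01:01:02") yields ("01:01", ["01", "02"]), while a malformed string such as "01:0x" yields None.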

Parameters:
  • symbol (str) – Potentially non-standardized MH gene / allele symbol.

  • species (str) –

    Can be specified to standardise to an MH symbol that is known to be valid for that species (see above for supported species). If set to "any", then first attempts standardisation for Homo sapiens, then Mus musculus. Defaults to "homosapiens".

    Note

    From version 3, the default behaviour will change to "any".

  • precision (str) – The maximum level of precision to standardize to. "allele" standardizes to the maximum precision possible. "protein" keeps allele designators up to the level of the protein. "gene" standardizes only to the level of the gene. Defaults to "allele".

  • on_fail (str) – Behaviour when standardization fails. If set to "reject", returns None on failure. If set to "keep", returns the original input. Defaults to "reject".

  • log_failures (bool) – Report standardisation failures through logging (at level WARNING). Defaults to True.

  • gene (str) – Alias for symbol.

  • suppress_warnings (bool) – Disable warnings that are usually logged when standardisation fails. Deprecated in favour of log_failures.

Returns:

If the specified species is supported and symbol could be standardized, returns the standardized symbol. If species is unsupported, the function does not attempt to standardize and returns the unaltered symbol string. Otherwise, follows the behaviour set by on_fail.

Return type:

Optional[str]
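
The on_fail="reject" behaviour makes it easy to separate symbols that standardize from those that do not. The following is a minimal sketch; standardize_all is a hypothetical helper, not part of tidytcells, and it takes the standardizing function as an argument so that it can be run without the library. It uses only the keyword arguments documented above.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def standardize_all(
    symbols: Iterable[str],
    standardize: Callable[..., Optional[str]],
    species: str = "homosapiens",
) -> Tuple[Dict[str, str], List[str]]:
    """Return (original -> standardized mapping, list of rejected inputs).

    `standardize` is expected to behave like tidytcells.mh.standardize.
    """
    clean: Dict[str, str] = {}
    rejected: List[str] = []
    for s in symbols:
        result = standardize(
            symbol=s,
            species=species,
            on_fail="reject",    # failures come back as None
            log_failures=False,  # rejections are collected here instead
        )
        if result is None:
            rejected.append(s)
        else:
            clean[s] = result
    return clean, rejected


# Intended use (assumes tidytcells is installed):
#   import tidytcells as tt
#   clean, rejected = standardize_all(["HLA-A*01:01", "not a gene"], tt.mh.standardize)
```

Note that for an unsupported species the function returns the input unaltered rather than None, so this helper would record such symbols as successfully standardized.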