tidytcells.tr

Functions to manage TR gene data.

Functions

tidytcells.tr.get_aa_sequence(symbol=None, species=None, gene=None)

Look up the amino acid sequence of a given TR allele.

Parameters:
  • symbol (str) – Standardized allele symbol, which must be specified to the level of the allele. Some alleles, notably those of non-functional genes, will not have resolvable amino acid sequences.

  • species (str) – Species to which the TR gene in question belongs (see above for supported species). Defaults to "homosapiens".

  • gene (str) – Alias for symbol.

Returns:

A dictionary with keys corresponding to names of different sequence regions within the allele, and values corresponding to their amino acid sequences.

Return type:

Dict[str, str]
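As a rough illustration of working with the returned dictionary, the sketch below summarises each sequence region by its length. The helper names region_lengths and lookup_region_lengths are hypothetical, and the region names and sequences in the example dictionary are made-up placeholders, not real output. The call to tidytcells.tr.get_aa_sequence uses only the parameters documented above and is imported lazily, so the pure helper can be tried without the library installed.

```python
from typing import Dict


def region_lengths(regions: Dict[str, str]) -> Dict[str, int]:
    """Map each sequence region name to the length of its amino acid sequence."""
    return {name: len(sequence) for name, sequence in regions.items()}


def lookup_region_lengths(symbol: str, species: str = "homosapiens") -> Dict[str, int]:
    """Look up an allele with tidytcells and summarise its regions by length.

    Hypothetical convenience wrapper; requires tidytcells to be installed.
    """
    import tidytcells as tt  # lazy import: only needed when this function is called

    return region_lengths(tt.tr.get_aa_sequence(symbol=symbol, species=species))


# Placeholder dictionary with made-up region names and sequences,
# standing in for a real return value of get_aa_sequence.
example = {"region_a": "ACDEF", "region_b": "GHI"}
print(region_lengths(example))
```

Because the return type is a plain Dict[str, str], ordinary dictionary operations such as iteration and key lookup apply directly.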

tidytcells.tr.query(species, precision=None, functionality=None, contains_pattern=None)

Query the list of all known TR genes / alleles.

Parameters:
  • species (str) – Species to query (see above for supported species). Defaults to "homosapiens".

  • precision (str) – The level of precision to query. "allele" queries from the set of all possible alleles. "gene" queries from the set of all possible genes. Defaults to "allele".

  • functionality (str) – Gene/allele functionality to subset by. "any" queries from all possible genes/alleles. "F" queries from functional genes/alleles. "NF" queries from pseudogenes and ORFs. "P" queries from pseudogenes. "ORF" queries from ORFs. An allele is considered queryable if its functionality label matches the description. A gene is considered queryable if at least one of its alleles has a functionality label that matches the description. Defaults to "any".

  • contains_pattern (str) – An optional regular expression string used to filter the query result. If supplied, only genes/alleles whose symbols contain a match for the regular expression are returned. Defaults to None.

Returns:

The set of all genes / alleles that satisfy the given constraints.

Return type:

FrozenSet[str]
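The filtering rules described for precision, functionality and contains_pattern can be sketched with a toy model. This is not the library's implementation: the function toy_query, the tables TOY_ALLELES and ACCEPTED, and the symbols such as GENEA*01 are all hypothetical, and the assumption that the pattern is applied with re.search is one reading of "contains".

```python
import re
from typing import Dict, FrozenSet, Optional, Set

# Hypothetical allele -> functionality label table (made up, not real IMGT data).
TOY_ALLELES: Dict[str, str] = {
    "GENEA*01": "F",
    "GENEA*02": "P",
    "GENEB*01": "ORF",
    "GENEC*01": "P",
}

# Labels accepted by each `functionality` option, following the description above.
ACCEPTED: Dict[str, Set[str]] = {
    "any": {"F", "P", "ORF"},
    "F": {"F"},
    "NF": {"P", "ORF"},
    "P": {"P"},
    "ORF": {"ORF"},
}


def toy_query(
    precision: str = "allele",
    functionality: str = "any",
    contains_pattern: Optional[str] = None,
) -> FrozenSet[str]:
    """Toy model of the documented filtering rules over TOY_ALLELES."""
    accepted = ACCEPTED[functionality]
    # An allele qualifies if its own label is accepted.
    matching = {allele for allele, label in TOY_ALLELES.items() if label in accepted}
    if precision == "gene":
        # A gene qualifies if at least one of its alleles qualifies,
        # so dropping the allele designator from the matches is enough.
        matching = {allele.split("*")[0] for allele in matching}
    if contains_pattern is not None:
        # Keep only symbols containing a match for the regular expression.
        matching = {s for s in matching if re.search(contains_pattern, s)}
    return frozenset(matching)


print(sorted(toy_query(precision="gene", functionality="P")))
print(sorted(toy_query(functionality="any", contains_pattern="^GENEA")))
```

Note that in this toy table GENEA counts as a pseudogene match at gene precision even though it also has a functional allele, since one matching allele is sufficient.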

tidytcells.tr.standardise(*args, **kwargs)

Alias for tidytcells.tr.standardize().

tidytcells.tr.standardize(symbol=None, species=None, enforce_functional=None, precision=None, on_fail=None, log_failures=None, gene=None, suppress_warnings=None)

Attempt to standardize a TR gene / allele symbol to be IMGT-compliant.

Parameters:
  • symbol (str) – Potentially non-standardized TR gene / allele symbol.

  • species (str) – Species to which the TR gene / allele belongs (see above for supported species). Defaults to "homosapiens".

  • enforce_functional (bool) – If True, disallows TR genes / alleles that are recognised by IMGT but are marked as non-functional (ORF or pseudogene). Defaults to False.

  • precision (str) – The maximum level of precision to standardize to. "allele" standardizes to the maximum precision possible. "gene" standardizes only to the level of the gene. Defaults to "allele".

  • on_fail (str) – Behaviour when standardization fails. If set to "reject", returns None on failure. If set to "keep", returns the original input. Defaults to "reject".

  • log_failures (bool) – Report standardisation failures through logging (at level WARNING). Defaults to True.

  • gene (str) – Alias for the parameter symbol.

  • suppress_warnings (bool) – Disable warnings that are usually logged when standardisation fails. Deprecated in favour of log_failures.

Returns:

If the specified species is supported and symbol could be standardized, returns the standardized symbol name. If species is unsupported, the function does not attempt to standardize and returns the unaltered symbol string. Otherwise, follows the behaviour set by on_fail.

Return type:

Union[str, None]
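The three return cases described above can be sketched as a small decision function. This is only a toy model under stated assumptions: toy_standardize, its supported_species and resolve arguments, and the symbol table in the example are hypothetical and do not reflect how the library resolves symbols internally. The wrapper standardize_with_tidytcells calls the real function using only the parameters documented above, and imports the library lazily.

```python
from typing import Callable, Optional, Set


def toy_standardize(
    symbol: str,
    species: str,
    supported_species: Set[str],
    resolve: Callable[[str], Optional[str]],
    on_fail: str = "reject",
) -> Optional[str]:
    """Toy model of the documented return behaviour."""
    if species not in supported_species:
        # Unsupported species: no attempt is made, the input comes back unaltered.
        return symbol
    standardized = resolve(symbol)
    if standardized is not None:
        # Supported species and a successful resolution.
        return standardized
    # Failure: "keep" returns the original input; this toy treats
    # any other value like "reject" and returns None.
    return symbol if on_fail == "keep" else None


def standardize_with_tidytcells(symbol: str) -> Optional[str]:
    """Hypothetical wrapper around the real function; requires tidytcells."""
    import tidytcells as tt  # lazy import: only needed when this function is called

    return tt.tr.standardize(
        symbol=symbol, species="homosapiens", precision="gene", on_fail="keep"
    )


# Made-up lookup table standing in for a real symbol resolver.
table = {"genea": "GENEA"}
supported = {"homosapiens"}
print(toy_standardize("genea", "homosapiens", supported, table.get))
print(toy_standardize("unknown", "homosapiens", supported, table.get, on_fail="keep"))
```

With on_fail="reject", callers should be prepared to handle None, which is why the return type is Union[str, None].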