tidytcells.mhc
Functions to clean and standardise MHC gene data.
Functions
- tidytcells.mhc.get_chain(gene: str | None = None, suppress_warnings: bool = False, gene_name: str | None = None) → str
Given a standardised MHC gene name, detect whether it codes for an alpha or a beta chain molecule.
- Parameters:
gene (str) – Standardised MHC gene name.
suppress_warnings (bool) – Disable warnings that are usually emitted when chain classification fails. Defaults to False.
gene_name (str) – Alias for the parameter gene.
- Returns:
'alpha' or 'beta' if gene is recognised and its chain is known, else None.
- Return type:
str or None
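The lookup described above can be applied to a whole list of names. The following is a minimal sketch, not part of tidytcells: the helper chain_counts is hypothetical, and it takes the lookup function as an argument so that it can be tried without the package installed. It assumes the supplied function behaves as documented for get_chain, returning 'alpha', 'beta' or None.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def chain_counts(
    genes: Iterable[str],
    get_chain: Callable[[str], Optional[str]],
) -> Counter:
    """Tally chain labels for already-standardised MHC gene names.

    `get_chain` is assumed to behave like tidytcells.mhc.get_chain:
    it returns 'alpha' or 'beta', or None when the chain is unknown.
    `chain_counts` itself is a hypothetical helper for illustration.
    """
    counts: Counter = Counter()
    for gene in genes:
        chain = get_chain(gene)
        # Keep unresolved names visible instead of silently dropping them.
        counts[chain if chain is not None else "unknown"] += 1
    return counts
```

In real use, one would pass tidytcells.mhc.get_chain itself (or a small wrapper that sets suppress_warnings=True) as the second argument.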
- tidytcells.mhc.get_class(gene: str | None = None, suppress_warnings: bool = False, gene_name: str | None = None) → int
Given a standardised MHC gene name, detect whether it codes for a component of a class I or class II MHC receptor complex.
- Parameters:
gene (str) – Standardised MHC gene name.
suppress_warnings (bool) – Disable warnings that are usually emitted when classification fails. Defaults to False.
gene_name (str) – Alias for the parameter gene.
- Returns:
1 or 2 if gene is recognised and its class is known, else None.
- Return type:
int or None
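Because the return value is 1, 2 or None, it works directly as a grouping key. The following is a minimal sketch under that assumption; split_by_class is a hypothetical helper, not a tidytcells function, and the classifier is passed in as an argument so the sketch runs without the package.

```python
from typing import Callable, Dict, Iterable, List, Optional


def split_by_class(
    genes: Iterable[str],
    get_class: Callable[[str], Optional[int]],
) -> Dict[Optional[int], List[str]]:
    """Group standardised MHC gene names by MHC class.

    `get_class` is assumed to behave like tidytcells.mhc.get_class:
    it returns 1 or 2, or None when the class is unknown.
    `split_by_class` itself is a hypothetical helper for illustration.
    """
    groups: Dict[Optional[int], List[str]] = {1: [], 2: [], None: []}
    for gene in genes:
        # setdefault guards against a classifier returning an unexpected key.
        groups.setdefault(get_class(gene), []).append(gene)
    return groups
```

The None bucket collects every name that could not be classified, which makes it easy to inspect those entries separately.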
- tidytcells.mhc.standardise(gene: str | None = None, species: str = 'homosapiens', precision: str = 'allele', suppress_warnings: bool = False, gene_name: str | None = None) → tuple
Attempt to standardise an MHC gene name to be IMGT-compliant.
Note
This function only verifies the validity of an MHC gene/allele up to the level of the protein. More precise allele designations are not verified, apart from the requirement that their format (colon-separated numbers) looks valid. There are three reasons for this: new alleles at that level are added to the IMGT list quite often, so accurate verification is difficult; people rarely need verification at such a precise level; and such verification costs more computational effort with diminishing returns.
- Parameters:
gene (str) – Potentially non-standardised MHC gene name.
species (str) – Species to which the MHC gene belongs (see Supported species and species strings). Defaults to 'homosapiens'.
precision (str) – The maximum level of precision to standardise to. 'allele' standardises to the maximum precision possible. 'protein' keeps allele designators up to the level of the protein (first two). 'gene' standardises only to the level of the gene. Defaults to 'allele'.
suppress_warnings (bool) – Disable warnings that are usually emitted when standardisation fails. Defaults to False.
gene_name (str) – Alias for the parameter gene.
- Returns:
If the specified species is supported, and gene could be standardised, then return the standardised gene name. If species is unsupported, then the function does not attempt to standardise, and returns the unaltered gene string. Else returns None.
- Return type:
str or None
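A common pattern is to standardise a batch of raw names and keep track of the ones that fail. The following is a minimal sketch: standardise_names is a hypothetical helper, not part of tidytcells, and it assumes the str-or-None return value described under Returns. The keyword arguments it forwards (gene, species, precision, suppress_warnings) are the ones listed in the signature above; the standardiser is passed in so the sketch runs without the package.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def standardise_names(
    raw_names: Iterable[str],
    standardise: Callable[..., Optional[str]],
    species: str = "homosapiens",
    precision: str = "allele",
) -> List[Tuple[str, Optional[str]]]:
    """Pair each raw MHC name with its standardised form.

    `standardise` is assumed to behave like tidytcells.mhc.standardise
    and to return a string, or None when standardisation fails.
    `standardise_names` itself is a hypothetical helper for illustration.
    """
    pairs: List[Tuple[str, Optional[str]]] = []
    for raw in raw_names:
        tidy = standardise(
            gene=raw,
            species=species,
            precision=precision,
            # Failures are reported through the None result instead.
            suppress_warnings=True,
        )
        pairs.append((raw, tidy))
    return pairs
```

In real use, one would pass tidytcells.mhc.standardise as the second argument. Names paired with None are the ones that could not be standardised for a supported species and are candidates for manual review.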
- tidytcells.mhc.standardize(*args, **kwargs)
Alias for tidytcells.mhc.standardise().