tidytcells.junction

Functions to clean and standardise junction (CDR3) data.

Functions

tidytcells.junction.standardise(seq: str, strict: bool = False, suppress_warnings: bool = False)

Ensures that a string value looks like a valid junction (CDR3) amino acid sequence. This function is a special variant of tidytcells.aa.standardise().

A valid junction sequence must:

  1. Be a valid amino acid sequence

  2. Begin with a cysteine (C)

  3. End with a phenylalanine (F) or a tryptophan (W)
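The three rules above can be sketched as a small standalone check. This is an illustration only, not the library's implementation: the name looks_like_junction and the AMINO_ACIDS set are hypothetical, and the sketch assumes that "valid amino acid sequence" means a non-empty string of the 20 standard one-letter codes in upper case.

```python
# Hypothetical helper, not part of tidytcells.
# Assumption: only the 20 standard upper-case one-letter codes are valid.
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def looks_like_junction(seq: str) -> bool:
    """Return True if seq satisfies the three junction rules listed above."""
    # Rule 1: non-empty and made up only of amino acid letters.
    if not seq or not set(seq) <= AMINO_ACIDS:
        return False
    # Rule 2: begins with C. Rule 3: ends with F or W.
    return seq.startswith("C") and seq.endswith(("F", "W"))


if __name__ == "__main__":
    for candidate in ["CASSLGQAYEQYF", "ASSLGQAYEQY", "CASSX1F"]:
        print(candidate, looks_like_junction(candidate))
```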

Parameters:
  • seq (str) – String value representing a junction sequence.

  • strict (bool) – If True, any string that does not look like a junction sequence is rejected. If False, an input that is a valid amino acid sequence but does not start with C and end with F/W is not rejected; instead, it is corrected by prepending a C and appending an F. Defaults to False.

  • suppress_warnings (bool) – Disable warnings that are usually emitted when standardisation fails. Defaults to False.
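The effect of the strict flag can be sketched as follows. This is a rough model under stated assumptions, not the library's code: standardise_sketch and both regular expressions are hypothetical, warnings are left out, and the sketch assumes that in non-strict mode both a C and an F are added whenever the input fails the junction check.

```python
import re
from typing import Optional

# Hypothetical patterns; assumption: 20 standard upper-case amino acid codes.
AA_RE = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]+")
JUNCTION_RE = re.compile(r"C[ACDEFGHIKLMNPQRSTVWY]*[FW]")


def standardise_sketch(seq: str, strict: bool = False) -> Optional[str]:
    """Toy model of the strict / non-strict behaviour described above."""
    # Anything that is not an amino acid sequence is rejected in both modes.
    if AA_RE.fullmatch(seq) is None:
        return None
    # Already junction-like: returned unchanged.
    if JUNCTION_RE.fullmatch(seq) is not None:
        return seq
    # Valid amino acids, but not C...F/W: rejected only in strict mode.
    if strict:
        return None
    # Non-strict mode (assumed reading): add a leading C and a trailing F.
    return "C" + seq + "F"


if __name__ == "__main__":
    print(standardise_sketch("ASSLGQ"))               # corrected
    print(standardise_sketch("ASSLGQ", strict=True))  # rejected
```

Under this reading, an input such as "ASSLGQ" is padded in the default mode and rejected with strict=True, while a string containing non-amino-acid characters is rejected in both modes.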

Returns:

If possible, a standardised version of the input string is returned. If the input string cannot be standardised, it is rejected and None is returned.

Return type:

str or None
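Because the return value may be None, callers generally need to separate accepted sequences from rejected ones. The sketch below shows one way to do that. The helper split_standardised and the stand-in toy_standardiser are hypothetical; in real use, tidytcells.junction.standardise would be passed in as the standardiser argument.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def split_standardised(
    seqs: Iterable[str],
    standardiser: Callable[[str], Optional[str]],
) -> Tuple[List[str], List[str]]:
    """Apply a standardiser and separate accepted results from rejected inputs."""
    accepted: List[str] = []
    rejected: List[str] = []
    for seq in seqs:
        result = standardiser(seq)
        if result is None:
            # Keep the original input so rejected rows can be inspected later.
            rejected.append(seq)
        else:
            accepted.append(result)
    return accepted, rejected


def toy_standardiser(seq: str) -> Optional[str]:
    """Stand-in for illustration only: accepts purely alphabetic strings."""
    return seq.upper() if seq.isalpha() else None


if __name__ == "__main__":
    ok, bad = split_standardised(["cassf", "123", "CASSW"], toy_standardiser)
    print(ok, bad)
```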

tidytcells.junction.standardize(*args, **kwargs)

Alias for tidytcells.junction.standardise().