tidytcells.junction
Functions to clean and standardize junction (CDR3) data.
Functions
- tidytcells.junction.standardise(*args, **kwargs)
Alias for tidytcells.junction.standardize().
- tidytcells.junction.standardize(seq: str, strict: bool = False, on_fail: str = 'reject', suppress_warnings: bool = False)
Ensures that a string value looks like a valid junction (CDR3) amino acid sequence. This function is a special variant of tidytcells.aa.standardize().
A valid junction sequence must:
Be a valid amino acid sequence
Begin with a cysteine (C)
End with a phenylalanine (F) or a tryptophan (W)
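The three conditions above can be expressed as a small predicate. This is a minimal sketch, not the tidytcells implementation: the name `looks_like_junction` is hypothetical, and the sketch assumes that a "valid amino acid sequence" means a non-empty, upper-case string of the 20 standard one-letter codes.

```python
import re

# Assumption: the 20 standard amino acid one-letter codes, upper case only.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A leading C, then any number of residues, then a final F or W.
JUNCTION_PATTERN = re.compile(rf"C[{AMINO_ACIDS}]*[FW]")


def looks_like_junction(seq: str) -> bool:
    """Hypothetical helper: True if seq meets the three conditions above."""
    return JUNCTION_PATTERN.fullmatch(seq) is not None


print(looks_like_junction("CASSPGQGNTEAFF"))  # begins with C, ends with F
print(looks_like_junction("ASSPGQGNTEAF"))    # missing the leading C
```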
- Parameters:
seq (str) – String value representing a junction sequence.
strict (bool) – If True, any string that does not look like a junction sequence is rejected. If False, inputs that are valid amino acid sequences but do not start with C and end with F/W are not rejected; instead, they are corrected by prepending a C and appending an F. Defaults to False.
on_fail (str) – Behaviour when standardization fails. If set to "reject", returns None on failure. If set to "keep", returns the original input. Defaults to "reject".
suppress_warnings (bool) – Disable warnings that are usually emitted when standardization fails. Defaults to False.
- Returns:
If possible, a standardized version of the input string is returned. If the input string cannot be standardized, the function follows the behaviour set by on_fail.
- Return type:
Union[str, None]
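To make the interaction between strict and on_fail concrete, the following re-implements the documented behaviour in plain Python. It is a sketch under stated assumptions rather than the library's source: `sketch_standardize` is a hypothetical name, a valid amino acid sequence is assumed to be a non-empty, upper-case string of the 20 standard one-letter codes, and warnings (and therefore suppress_warnings) are left out.

```python
from typing import Optional

# Assumption: the 20 standard amino acid one-letter codes, upper case only.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def sketch_standardize(
    seq: str, strict: bool = False, on_fail: str = "reject"
) -> Optional[str]:
    """Hypothetical illustration of the documented strict / on_fail behaviour."""
    # Value returned whenever standardization fails.
    failure = seq if on_fail == "keep" else None

    # Not a valid amino acid sequence at all: standardization fails.
    if not seq or not set(seq) <= AMINO_ACIDS:
        return failure

    # Already begins with C and ends with F or W: nothing to correct.
    if seq.startswith("C") and seq.endswith(("F", "W")):
        return seq

    # Valid amino acids, but not junction-shaped.
    if strict:
        return failure
    return "C" + seq + "F"


print(sketch_standardize("ASSPGQGNTEA"))                               # CASSPGQGNTEAF
print(sketch_standardize("ASSPGQGNTEA", strict=True))                  # None
print(sketch_standardize("ASSPGQGNTEA", strict=True, on_fail="keep"))  # ASSPGQGNTEA
```

With tidytcells installed, the corresponding calls would go through tidytcells.junction.standardize() using the same strict and on_fail keyword arguments; the actual results may differ wherever the assumptions above do not match the library.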