skbio.embedding.ProteinEmbedding
- class skbio.embedding.ProteinEmbedding(embedding, sequence, clip_head=False, clip_tail=False, **kwargs)
Embedding of a protein sequence.
- Parameters:
- embedding : array_like
The embedding of the protein sequence. Row vectors correspond to the latent residue coordinates.
- sequence : str, Protein, or 1D ndarray
Characters representing the protein sequence itself.
- clip_head : bool, optional
If True, the first row of the embedding is removed. Some language models add a start token, and this parameter accounts for it.
- clip_tail : bool, optional
If True, the last row of the embedding is removed. Some language models add an end token, and this parameter accounts for it.
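The effect of the two clipping flags can be illustrated with plain NumPy. This is a minimal sketch and does not call scikit-bio: the helper `clip_rows` and the array `raw` are hypothetical names introduced here, and the sketch assumes a model that emits exactly one start-token row and one end-token row around the per-residue rows.

```python
import numpy as np

sequence = "ACDEFGHIKL"  # 10 residues
embed_dim = 3

# Hypothetical language-model output: one row per residue plus one
# start-token row and one end-token row (12 rows in total).
rng = np.random.default_rng(0)
raw = rng.random((len(sequence) + 2, embed_dim))


def clip_rows(emb, clip_head=False, clip_tail=False):
    """Drop the first and/or last row (illustrative helper, not part of skbio)."""
    start = 1 if clip_head else 0
    stop = emb.shape[0] - 1 if clip_tail else emb.shape[0]
    return emb[start:stop]


# After clipping both ends, there is one row per residue.
clipped = clip_rows(raw, clip_head=True, clip_tail=True)
assert clipped.shape == (len(sequence), embed_dim)
```

With the flags left at their default of False, the array is returned unchanged, so an embedding that already has one row per residue needs no clipping.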
Examples
>>> from skbio.embedding import ProteinEmbedding
>>> import numpy as np
>>> embedding = np.random.rand(10, 3)
>>> sequence = "ACDEFGHIKL"
>>> ProteinEmbedding(embedding, sequence)
ProteinEmbedding
--------------------------
Stats:
    length: 10
    embedding dimension: 3
    has gaps: False
    has degenerates: False
    has definites: True
    has stops: False
--------------------------
0 ACDEFGHIKL
Attributes
- default_write_format
- embedding: The embedding tensor.
- ids: IDs corresponding to each row of the embedding.
- residues: Array containing underlying residue characters.
- sequence: String representation of the underlying sequence.
Built-ins
- __eq__(value, /): Return self==value.
- __ge__(value, /): Return self>=value.
- __getstate__(/): Helper for pickle.
- __gt__(value, /): Return self>value.
- __hash__(/): Return hash(self).
- __le__(value, /): Return self<=value.
- __lt__(value, /): Return self<value.
- __ne__(value, /): Return self!=value.
- __str__(): String representation of the underlying sequence.
Methods