itpseq.Replicate.load_data
- Replicate.load_data(how='aax', min_peptide=None, max_peptide=None, limit=None, sample=None)
Reads the nucleotide or amino-acid inverse-toeprint file as a pandas Series and filters entries based on peptide length and stop codons.
- Parameters:
how (str, optional) – Defines which type of inverse toeprints to load:
  - 'nuc': loads the nucleotide data
  - 'aa': loads the amino acid data
  - 'aax': loads the amino acid data and removes peptides with a stop in the coding sequence (before the A-site)
filename (Path or str) – Path to the nucleotide/amino-acid file
min_peptide (int, optional) – Minimum peptide length to keep, by default None
max_peptide (int, optional) – Maximum peptide length to keep, by default None
limit (int or None, optional) – If not None, limits the number of reads to process. This is useful to perform quick tests.
sample (str or None, optional) – If not None, randomly samples <sample> reads.
- Returns:
Series of inverse-toeprints for the replicate.
- Return type:
Series
Examples
- Load the inverse-toeprints with a minimum peptide length of 3 and keep the internal stops.
>>> replicate.load_data(min_peptide=3, how='aa')
0           mFIVRGWQV
1                 mWQ
2                 m*T
3          mEVHATTSGQ
4          mHPNYTS*PV
              ...
2828877          mTGA
2828878     mRSATINLQ
2828879    mSLMPHHRGN
2828880          mHWH
2828881     mSSTRSSRS
Length: 2828882, dtype: object
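- The filtering rules described for how, min_peptide and max_peptide can be sketched in plain Python on a toy list of peptides. This is only an illustration, not itpseq's implementation: the helper filter_peptides is hypothetical, it returns a list rather than a pandas Series, and it assumes that peptide length is the number of characters in the string and that the last residue is the one in the A-site. The read 'mK*' is made-up data added to show a stop at the last position.

```python
def filter_peptides(peptides, how="aax", min_peptide=None, max_peptide=None):
    """Toy version of the documented filters (hypothetical, not itpseq code)."""
    kept = []
    for pep in peptides:
        # Assumption: peptide length is the number of characters in the string.
        if min_peptide is not None and len(pep) < min_peptide:
            continue
        if max_peptide is not None and len(pep) > max_peptide:
            continue
        # Assumption: the last residue occupies the A-site, so with 'aax'
        # only a stop ('*') before that last position removes the peptide.
        if how == "aax" and "*" in pep[:-1]:
            continue
        kept.append(pep)
    return kept


# Toy reads: the first five come from the example output above,
# 'mK*' is invented for illustration.
reads = ["mFIVRGWQV", "mWQ", "m*T", "mEVHATTSGQ", "mHPNYTS*PV", "mK*"]

kept_aa = filter_peptides(reads, how="aa", min_peptide=3)    # internal stops kept
kept_aax = filter_peptides(reads, how="aax", min_peptide=3)  # internal stops dropped
short_only = filter_peptides(reads, how="aa", max_peptide=3)

print(kept_aa)
print(kept_aax)
print(short_only)
```

Under these assumptions, how='aa' keeps 'm*T' and 'mHPNYTS*PV', while how='aax' drops both but keeps 'mK*', whose stop sits at the last position.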