itpseq.Sample.DE
- Sample.DE(pos=None, how='aax', join=False, quiet=True, filter_size=True, translate=False, multi=True, n_cpus=None, raw=False, _nocache=False, **kwargs)
Computes the differential expression between the sample and its reference.
- Parameters:
pos (str, optional) – Ribosome positions to consider when computing the differential expression, for example 'E:P'.
how (str, optional) – Type of inverse toeprints to analyze (see Replicate.load_data).
join (bool, optional) – If True, joins the DE results back to the original DataFrame. Defaults to False.
quiet (bool, optional) – If True, suppresses the console output of the pydeseq2 library. Defaults to True.
translate (bool, optional) – If True, translates the nucleotide motif into amino-acids. Defaults to False. This doesn’t check that the nucleotide motif is in frame or composed of consecutive positions.
multi (bool, optional) – Whether to compute DE with a specific contrast (cond vs. ref). Defaults to True.
n_cpus (int, optional) – The number of CPUs to utilize for parallel processing. Defaults to the total number of available CPUs.
filter_size (bool, optional) – If True, only considers reads for which an amino acid is present in all target positions. Defaults to True.
**kwargs (optional) – Additional parameters passed to get_counts_ratio and Replicate.load_data. For instance, min_peptide and max_peptide are useful to filter the peptide size of the inverse toeprints to consider.
- Returns:
DataFrame of the differential expression statistics with a row per motif.
- Return type:
DataFrame
See also
Sample.get_counts_ratio – Gets the inverse toeprint counts and the sample/reference ratio of normalized counts.
Sample.volcano – Draws a volcano plot from the Differential Expression data.
Sample.subset_logo – Creates a logo from a subset of the Differential Expression data.
Examples
- Compute the differential expression for positions E-P
>>> sample.DE('E:P')
        baseMean  log2FoldChange     lfcSE       stat        pvalue          padj  log10pvalue  log10padj
QK   5537.704183        0.778031  0.073280  10.617238  2.477833e-26  7.582170e-24    25.605928  23.120206
VI   6874.891363        0.295160  0.371018   0.795542  4.262985e-01  7.718778e-01     0.370286   0.112451
MY    747.538317        0.263705  0.074294   3.549477  3.859965e-04           NaN     3.413417        NaN
YY   2216.501684        0.259860  0.068213   3.809545  1.392226e-04  6.086018e-03     3.856290   2.215667
WM    200.446070        0.226720  0.111555   2.032371  4.211614e-02           NaN     1.375551        NaN
..           ...             ...       ...        ...           ...           ...          ...        ...
TP  15256.234795       -0.255940  0.061793  -4.141886  3.444618e-05  2.635133e-03     4.462859   2.579197
mK  10824.395771       -0.308538  0.210353  -1.466765  1.424400e-01  4.737680e-01     0.846368   0.324434
EP   8950.363473       -0.321266  0.068514  -4.689045  2.744828e-06  2.799725e-04     5.561485   3.552885
PP  20880.530910       -0.372203  0.078851  -4.720365  2.354220e-06  2.799725e-04     5.628153   3.552885
KK   7645.111411       -0.390365  0.096140  -4.060381  4.899280e-05  2.998359e-03     4.309868   2.523116
[420 rows x 8 columns]
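A common next step is to pull out the motifs that are significantly enriched or depleted. The following is a minimal sketch and not part of itpseq: it rebuilds a few rows of the E:P table by hand so that it runs on its own, and both the 0.05 threshold and the helper name `significant_motifs` are assumptions chosen for illustration. In a real analysis, `de` would be the DataFrame returned by `sample.DE('E:P')`.

```python
import numpy as np
import pandas as pd

# A few rows copied from the E:P example output (subset of columns).
de = pd.DataFrame(
    {
        "log2FoldChange": [0.778031, 0.295160, 0.263705, -0.321266, -0.372203],
        "padj": [7.582170e-24, 7.718778e-01, np.nan, 2.799725e-04, 2.799725e-04],
    },
    index=["QK", "VI", "MY", "EP", "PP"],
)


def significant_motifs(table, alpha=0.05):
    """Hypothetical helper: split motifs with padj < alpha by direction of change."""
    # Comparisons with NaN evaluate to False, so rows without an
    # adjusted p-value (such as MY) are excluded automatically.
    sig = table[table["padj"] < alpha]
    enriched = sig[sig["log2FoldChange"] > 0].index.tolist()
    depleted = sig[sig["log2FoldChange"] < 0].index.tolist()
    return enriched, depleted


enriched, depleted = significant_motifs(de)
print("enriched:", enriched)
print("depleted:", depleted)
```

Rows whose `padj` is NaN never pass the threshold, so they need no separate handling in this sketch.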
Include the read counts for each replicate, the average count per million reads, and the sample/reference ratio of the normalized counts.
>>> sample.DE('E:P', join=True)
    noa.1  noa.2  noa.3  sample.1  sample.2  sample.3          noa       sample     ratio      baseMean  log2FoldChange     lfcSE       stat        pvalue          padj  log10pvalue  log10padj
QK   4312   3594   4506      7161      7153      6396  1414.463921  2663.851852  1.883294   5537.704183        0.778031  0.073280  10.617238  2.477833e-26  7.582170e-24    25.605928  23.120206
VI   6696   4833   7340      6335      5263     10473  2146.655446  2805.667048  1.306995   6874.891363        0.295160  0.371018   0.795542  4.262985e-01  7.718778e-01     0.370286   0.112451
MY    760    633    672       855       795       767   233.801175   309.668069  1.324493    747.538317        0.263705  0.074294   3.549477  3.859965e-04           NaN     3.413417        NaN
YY   2365   1826   1956      2686      2231      2260   692.561206   914.186048  1.320008   2216.501684        0.259860  0.068213   3.809545  1.392226e-04  6.086018e-03     3.856290   2.215667
WM    204    172    186       236       243       164    63.727865    82.966388  1.301886    200.446070        0.226720  0.111555   2.032371  4.211614e-02           NaN     1.375551        NaN
..    ...    ...    ...       ...       ...       ...          ...          ...       ...           ...             ...       ...        ...           ...           ...          ...        ...
TP  17811  14607  18129     15427     13436     12471  5751.773266  5279.144489  0.917829  15256.234795       -0.255940  0.061793  -4.141886  3.444618e-05  2.635133e-03     4.462859   2.579197
mK  15656   9112  12044     11894      7434      9530  4099.750100  3623.052276  0.883725  10824.395771       -0.308538  0.210353  -1.466765  1.424400e-01  4.737680e-01     0.846368   0.324434
EP  10913   8512  10882      8323      7911      7345  3441.100658  3024.810787  0.879024   8950.363473       -0.321266  0.068514  -4.689045  2.744828e-06  2.799725e-04     5.561485   3.552885
PP  24521  19997  27230     19044     17820     17053  8191.673763  6910.316773  0.843578  20880.530910       -0.372203  0.078851  -4.720365  2.354220e-06  2.799725e-04     5.628153   3.552885
KK   9528   6918  10052      7128      5688      6785  3010.061611  2490.775936  0.827483   7645.111411       -0.390365  0.096140  -4.060381  4.899280e-05  2.998359e-03     4.309868   2.523116
[420 rows x 17 columns]
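The derived columns of the joined table can be checked by hand. In the rows shown, `ratio` matches `sample` divided by `noa`, and `log10pvalue` and `log10padj` match the negative base-10 logarithm of `pvalue` and `padj`. The sketch below verifies this on two rows copied from the joined output. It is an observation about the printed values, not a statement about how itpseq computes them internally.

```python
import numpy as np
import pandas as pd

# Two rows copied from the joined E:P output (subset of columns).
rows = pd.DataFrame(
    {
        "noa": [1414.463921, 8191.673763],
        "sample": [2663.851852, 6910.316773],
        "ratio": [1.883294, 0.843578],
        "pvalue": [2.477833e-26, 2.354220e-06],
        "padj": [7.582170e-24, 2.799725e-04],
        "log10pvalue": [25.605928, 5.628153],
        "log10padj": [23.120206, 3.552885],
    },
    index=["QK", "PP"],
)

# ratio is consistent with sample / noa (within rounding of the printed values).
ratio_ok = np.allclose(rows["sample"] / rows["noa"], rows["ratio"], atol=1e-4)

# The log10 columns are consistent with -log10 of the corresponding p-values.
log_p_ok = np.allclose(-np.log10(rows["pvalue"]), rows["log10pvalue"], atol=1e-4)
log_padj_ok = np.allclose(-np.log10(rows["padj"]), rows["log10padj"], atol=1e-4)

print(ratio_ok, log_p_ok, log_padj_ok)
```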
- Export the previous table as CSV (name the index “motif”)
>>> sample.DE('E:P', join=True).rename_axis('motif').to_csv('sample_enrichment_EP.csv')
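When reloading such a CSV with pandas, one detail is worth keeping in mind: by default pandas reads the string `NA` as a missing value, and `NA` is also a plausible two-residue motif. The sketch below is a minimal illustration under stated assumptions. It uses a small stand-in table and an in-memory buffer instead of the real `sample_enrichment_EP.csv`, and the `NA` row and its values are made up for the demonstration.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the DataFrame returned by sample.DE('E:P', join=True).
# The "NA" row is hypothetical and only there to show the parsing pitfall.
de = pd.DataFrame(
    {"log2FoldChange": [0.778031, 0.1], "padj": [7.582170e-24, np.nan]},
    index=["QK", "NA"],
)

buffer = io.StringIO()
de.rename_axis("motif").to_csv(buffer)  # NaN is written as an empty field
buffer.seek(0)

# Treat only empty fields as missing so the motif "NA" stays a string.
reloaded = pd.read_csv(
    buffer,
    index_col="motif",
    keep_default_na=False,
    na_values=[""],
)

print(reloaded.index.tolist())
print(reloaded["padj"].isna().tolist())
```

With these options the motif index is preserved as text, while empty `padj` fields are still read back as NaN.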