Posted by Nathan Edwards
Accepted talk at the 2013 USHUPO conference in Baltimore, in the Tuesday morning “Systems Proteomics” session.
Georgetown University Medical Center, Washington, DC
Functional enrichment analysis is used extensively in systems biology analyses of transcriptomics data, linking phenotypically distinct experimental samples via differentially expressed genes to knowledgebases that categorize genes by function, cellular location, domain, or canonical pathway. We address the challenges in applying functional enrichment analysis to proteomic data by ensuring a minimal protein set is considered, and by using spectral counting to detect differentially abundant proteins and protein isoforms.
We propose stringent criteria for inferring proteins from bottom-up peptide-fragmentation spectra: each inferred protein must be supported by at least two unshared peptides, and the proportion of peptide identifications omitted must be consistent with FDR-based peptide identification filtering strategies. We also show that hypergeometric-based statistical models and Fisher exact tests can be readily applied to spectral counts to determine differentially abundant proteins and, when coupled with stringent protein inference to ensure an appropriate statistical background, can be used to infer functional enrichment of differential proteins using the DAVID tool. Furthermore, we show that we can test functional protein sets directly, finding evidence for differentially abundant protein sets based on spectral counts and so avoiding the perils of counting proteins. Lastly, we show that spectral counts can be used to detect alternative splicing events, even when the underlying set of observed peptides does not provide evidence of distinguishing amino-acid sequence.
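
As an illustration of the spectral-count testing described above, the following is a minimal sketch of a two-sided Fisher exact test built on the hypergeometric distribution. It is not the talk's actual implementation. The 2x2 table layout (one protein's spectral counts versus all other spectra, in two samples), the function names, and the example counts are assumptions made for illustration.

```python
from math import comb


def hypergeom_pmf(x, row1, row2, col1):
    """Probability that x of the protein's row1 spectra fall in sample 1,
    given that sample 1 holds col1 of the row1 + row2 spectra overall."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration):
      a = spectra for the protein in sample 1
      b = spectra for the protein in sample 2
      c = all other spectra in sample 1
      d = all other spectra in sample 2
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo = max(0, col1 - row2)   # smallest feasible value of a
    hi = min(row1, col1)       # largest feasible value of a
    p_obs = hypergeom_pmf(a, row1, row2, col1)
    # Sum the probabilities of all tables at least as unlikely as the
    # observed one; the small tolerance guards against rounding ties.
    total = 0.0
    for x in range(lo, hi + 1):
        p = hypergeom_pmf(x, row1, row2, col1)
        if p <= p_obs * (1 + 1e-9):
            total += p
    return min(1.0, total)


# Hypothetical counts: 30 vs 10 spectra for one protein, with
# 10,000 spectra identified in each sample.
p_value = fisher_exact_two_sided(30, 10, 10000 - 30, 10000 - 10)
print(f"two-sided p-value: {p_value:.3g}")
```

In practice each protein would yield its own p-value, so some form of multiple-testing correction would normally follow. The same table layout could in principle be applied to pooled counts for a functional protein set rather than a single protein.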