OProtInDepth

A proteomics analysis platform for clinical research teams working with Olink NPX data.

Problem

Olink's NPX output is a specialised format, and the teams working with it typically stitch together R scripts, Excel workbooks, and ad-hoc plotting to run an analysis end to end. The workflow is slow, hard to reproduce, and discourages iteration on analysis choices that should be cheap to revisit.

Approach

A Next.js frontend handles upload, cohort and covariate configuration, and result review. A FastAPI service runs the analysis pipeline across thirteen modules, including NPX parsing and QC, differential abundance via Welch's t-test with Benjamini-Hochberg FDR correction, ElasticNet logistic regression with nested cross-validation, sparse co-expression networks via Graphical Lasso with Louvain community detection, and pathway enrichment against Enrichr (GO + KEGG), STRING PPI, and Reactome. Supabase stores runs, artifacts, and per-user access. Plotly powers the interactive figures, so a reviewer can drill into a volcano or enrichment plot without leaving the report view. Methods text is auto-generated with citations for inclusion in manuscripts.
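For readers unfamiliar with the Benjamini-Hochberg correction named above, here is a minimal standard-library sketch of the step-up adjustment. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions, not the platform's actual code.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original input order.

    Illustrative helper (hypothetical name), not the project's implementation.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the raw p-values, sorted ascending.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the ones ranked above it (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for five proteins.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
```

A protein is then typically called significant when its adjusted value falls below the chosen FDR threshold, for example 0.05.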

Stack

Next.js, FastAPI, Python, Supabase, Plotly, Tailwind

Synopsis

The scope covers the complete proteomics analysis loop for Olink NPX data, from ingest and QC through differential abundance, pathway enrichment, co-expression networks, and classification to auto-generated methods text. The purpose is to give bench scientists without dedicated bioinformatics support a no-code path from raw NPX file to publication-ready figures, while preserving the statistical rigour a reviewer will scrutinise.

Outcome

Feature-complete: 13 analysis modules and 116 passing tests.

Gallery

Quality control

Quality-control step: NPX distributions per sample, outlier flags, and missing-data summaries. Runs on upload, no tuning required.

Differential abundance — volcano plot

Differential abundance by Welch's t-test with Benjamini-Hochberg correction, rendered as an interactive Plotly volcano. Hovering any protein surfaces fold change, adjusted p-value, and direction; significance thresholds are marked.
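As a rough illustration of the per-protein quantities behind a volcano plot like this one, the sketch below computes the effect size, Welch's t statistic, and the Welch-Satterthwaite degrees of freedom using only the standard library. It assumes NPX values are on a log2 scale, so the difference in group means is read as a log2 fold change. The function name `welch_t` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (log2 fold change, t statistic, degrees of freedom).

    Illustrative helper (hypothetical name); assumes log2-scale NPX input
    and at least two samples per group.
    """
    na, nb = len(group_a), len(group_b)
    # Sample variances (n - 1 denominator), kept separate per group:
    # Welch's test does not pool them.
    se_a = variance(group_a) / na
    se_b = variance(group_b) / nb
    diff = mean(group_a) - mean(group_b)
    t = diff / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return diff, t, df


# Made-up NPX values for one protein in two groups.
cases = [1.0, 2.0, 3.0, 4.0, 5.0]
controls = [2.0, 4.0, 6.0, 8.0, 10.0]
log2_fc, t_stat, dof = welch_t(cases, controls)
```

Turning the statistic into a p-value needs the t distribution's CDF, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` is one common way to get it.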

Pathway enrichment

Enrichment against Enrichr (GO + KEGG), STRING PPI, and Reactome, ranked by fold enrichment and shaded by adjusted p-value.

Differential interaction network

A sparse co-expression network via Graphical Lasso with Louvain community detection, visualising the edges that shift between experimental groups.

Classification & feature importance

ML layer: cross-validated confusion matrix and ElasticNet-regularised feature importances that surface which proteins drive classification between groups.
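The nested cross-validation behind a classifier like this one has a simple structure, sketched below as index splitting only: outer folds estimate performance, and inner folds, drawn solely from each outer training set, select hyperparameters. The helper names are hypothetical, and the splits are unshuffled and unstratified for brevity. A real pipeline would more likely use stratified splits, for example scikit-learn's `StratifiedKFold`.

```python
def k_fold_indices(indices, k):
    """Split a list of indices into k (train, test) pairs (no shuffling)."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def nested_cv_splits(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold."""
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        # Inner folds come only from the outer training set, so the outer
        # test samples never influence hyperparameter selection.
        inner_splits = k_fold_indices(outer_train, inner_k)
        yield outer_train, outer_test, inner_splits


splits = list(nested_cv_splits(20, outer_k=5, inner_k=3))
```

Each outer test fold is scored once, with a model tuned on its own inner splits, and those held-out predictions are the ones that belong in a cross-validated confusion matrix.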