Conduit

Coming Soon

DIA Metaproteomics Ecosystem

Open Science · Metaproteomics

DIA Metaproteomics.
Finally solved.

A modular, end-to-end analysis ecosystem for DIA-based metaproteomics. Built for scientists who need rigorous results — not workarounds.


§ 01 — Problem & Solution

Metaproteomics is hard.
Conduit makes it easy.


The core bottleneck:
defining the search space.

In DIA-based metaproteomics, the protein database used for spectral matching defines everything downstream. Too broad and you drown in false positives. Too narrow and you miss real biology. Building that database correctly — from community composition data, taxonomic assignments, or genomic assemblies — has been the single hardest step in the entire workflow.

"Every analysis starts with this question: what proteins could possibly be here? Conduit answers it — five different ways."

Most metaproteomics workflows are assembled from disconnected scripts, incompatible databases, and brittle manual steps. Getting from raw DIA spectra to biological insight requires navigating a maze of bioinformatics tools that were never designed to work together.

Conduit is a fully integrated ecosystem — a Snakemake pipeline (Ascent), a GUI launcher (Basecamp), a visualization app (Summit), and an R package (conduitR) — all designed around a single unified data model.

You bring the data. Conduit handles the rest: database construction, spectral search, quantification, and analysis. One ecosystem. One output. Every answer.


§ 02 — Search Space Construction

Your experiment.
Your data. Your search space.

Strategy 01
NCBI Taxonomy IDs
Supply a list of NCBI taxonomy IDs and Conduit retrieves the corresponding proteomes from UniProt. Ideal when community composition is approximately known from prior work.

Strategy 02
UniProt Proteome IDs
Supply a list of UniProt proteome IDs to handpick exactly which proteomes define your search space. Best when you need finer control than taxonomy alone provides.

Strategy 03
MetaPhlAn-guided Metagenomics
Supply metagenomic FASTQ files and Conduit uses MetaPhlAn to select reference proteomes matching your community composition. Best for well-profiled microbiomes.

Strategy 04
MAGs
Provide metagenome-assembled genomes directly. Conduit translates coding sequences to protein space, enabling sample-specific databases that capture novel or uncultured organisms.

Strategy 05
Peptidotyping
Catches what you didn't know to look for. Using only the peptides already detected in your sample, with no additional sequencing, Peptidotyping surfaces unexpected protein sources, such as dietary proteins, that never made it into your original database.
Novel approach · No orthogonal data required
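
As an illustration of Strategy 02, the sketch below turns a list of UniProt proteome IDs into FASTA download URLs. It is not Conduit's own code: the function names are hypothetical, and it assumes UniProt's public REST streaming endpoint with a `proteome:` query, plus a simple "UP followed by digits" ID pattern. It only builds URLs and makes no network calls.

```python
import re
from urllib.parse import urlencode

# Assumption: UniProt's public REST endpoint for streaming UniProtKB entries.
UNIPROT_STREAM = "https://rest.uniprot.org/uniprotkb/stream"

# Assumption: proteome IDs look like "UP" followed by digits (e.g. UP000005640).
PROTEOME_ID = re.compile(r"^UP\d+$")


def proteome_fasta_url(proteome_id: str) -> str:
    """Build a URL that streams all UniProtKB entries of one proteome as FASTA."""
    proteome_id = proteome_id.strip()
    if not PROTEOME_ID.match(proteome_id):
        raise ValueError(f"not a UniProt proteome ID: {proteome_id!r}")
    params = {"query": f"proteome:{proteome_id}", "format": "fasta"}
    return f"{UNIPROT_STREAM}?{urlencode(params)}"


def search_space_urls(proteome_ids):
    """Return one FASTA URL per unique proteome ID, preserving input order."""
    unique_ids = dict.fromkeys(pid.strip() for pid in proteome_ids)
    return [proteome_fasta_url(pid) for pid in unique_ids]


if __name__ == "__main__":
    for url in search_space_urls(["UP000005640", "UP000000625"]):
        print(url)
```

Deduplicating the IDs before building URLs keeps the same proteome from entering the search space twice, which would otherwise inflate the database without adding any new sequences.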

UNIQUE TO CONDUIT

Peptidotyping closes the loop — your proteomics experiment can now define its own search space, with no additional experimental cost.
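
To make "translating coding sequences to protein space" (Strategy 04) concrete, here is a minimal sketch of in-frame translation with the standard genetic code. It is an illustration only, not the Ascent implementation: `translate_cds` is a hypothetical helper, it assumes the coding sequence is already in frame, and it does not special-case alternative bacterial start codons.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G at each position).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases become 'X'; a trailing partial codon
    is ignored. Alternative start codons are not converted to methionine.
    """
    seq = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate_cds("ATGGCCAAATAA"))
```

In practice the coding sequences would first have to be predicted from each MAG by a gene caller; that step is outside the scope of this sketch.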


§ 03 — Ecosystem Components

Four tools.
One unified workflow.

conduit-ascent · Pipeline

The core Snakemake/Singularity pipeline. Handles all stages of DIA metaproteomics analysis — database construction, spectral search, quantification, and output generation — in a fully reproducible, containerised workflow.

github.com/baynec2/conduit-ascent

conduit-basecamp · Coming Soon

A PyQt graphical interface for configuring and launching Conduit Ascent pipelines. Designed for scientists who want the full power of the pipeline without touching the command line — form-based configuration, run management, and progress monitoring.

In development, launching soon

conduit-summit · Visualization

An R Shiny web application for interactive exploration of Conduit results. Visualise protein abundance, community composition, differential expression, and functional annotations — all from the single standardized output file produced by Ascent.

github.com/baynec2/conduit-summit

conduitR · R Package

The R package that powers Conduit's statistical analysis layer. Provides programmatic access to all Conduit data structures, normalization methods, differential abundance testing, and plotting — for users who want full scripting control over their analysis.

github.com/baynec2/conduitR

§ 04 — Technical Specifications

Built for how
science actually works.

Cross-platform
Runs on macOS, Windows, Linux, and HPC clusters. It just works.
macOS · Windows · Linux · HPC / SLURM

Full container support
Reproducible environments via Docker and Singularity. No dependency hell.
Docker · Singularity

CLI and GUI
Snakemake CLI for power users and reproducible pipelines. Basecamp GUI for everyone else.
Snakemake CLI · Basecamp GUI · PyQt

Unified output
One structured output file contains every result your analysis needs: proteins, abundances, annotations, and metadata.
Single file output · Standardized schema