Conduit — DIA Metaproteomics Ecosystem

§ 01 — Problem & Solution

Metaproteomics is hard.
Conduit makes it easy.

The core bottleneck:
defining the search space.

In DIA-based metaproteomics, the protein database used for spectral matching defines everything downstream. Too broad and you drown in false positives. Too narrow and you miss real biology. Building that database correctly — from community composition data, taxonomic assignments, or genomic assemblies — has been the single hardest step in the entire workflow.

"Every analysis starts with this question: what proteins could possibly be here? Conduit answers it — five different ways."

Most metaproteomics workflows are assembled from disconnected scripts, incompatible databases, and brittle manual steps. Getting from raw DIA spectra to biological insight requires navigating a maze of bioinformatics tools that were never designed to work together.

Conduit is a fully integrated ecosystem — a Snakemake pipeline (Ascent), a GUI launcher (Basecamp), a visualization app (Summit), and an R package (conduitR) — all designed around a single unified data model.

You bring the data. Conduit handles the rest: database construction, spectral search, quantification, and analysis. One ecosystem. One output. Every answer.

§ 02 — Search Space Construction

Your experiment.
Your data. Your search space.

Strategy 01

NCBI Taxonomy IDs

Supply a list of NCBI taxonomy IDs and Conduit retrieves corresponding proteomes from UniProt. Ideal when community composition is approximately known from prior work.
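
To make the taxonomy-to-proteome lookup concrete, here is a minimal sketch of how such a query could be expressed against UniProt's public REST service. It is not Conduit's internal code: the function name `proteome_search_url` is hypothetical, and the `taxonomy_id` query field is an assumption that should be checked against the current UniProt API documentation. The sketch only builds the request URL; it does not contact the network.

```python
from urllib.parse import urlencode

# Public UniProt REST endpoint for proteome searches.
UNIPROT_PROTEOMES_SEARCH = "https://rest.uniprot.org/proteomes/search"


def proteome_search_url(taxonomy_id: int, size: int = 25) -> str:
    """Build a proteome search URL for one NCBI taxonomy ID.

    Hypothetical helper for illustration only; the `taxonomy_id`
    query field is assumed, not taken from Conduit.
    """
    params = {
        "query": f"taxonomy_id:{taxonomy_id}",
        "format": "json",
        "size": size,
    }
    return f"{UNIPROT_PROTEOMES_SEARCH}?{urlencode(params)}"


# Example: one URL per taxonomy ID in a user-supplied list.
taxonomy_ids = [562, 1280]
urls = [proteome_search_url(tax_id) for tax_id in taxonomy_ids]
```

Each URL can then be fetched with any HTTP client, and the returned proteome identifiers used to download the matching protein sequences.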

Strategy 02

UniProt Proteome IDs

Supply a list of UniProt proteome IDs to handpick exactly which proteomes define your search space. Best when you need finer control than taxonomy alone provides.

Strategy 03

MetaPhlAn-guided Metagenomics

Supply metagenomic FASTQ files and Conduit uses MetaPhlAn to select reference proteomes matching your community composition. Best for well-profiled microbiomes.

Strategy 04

MAGs

Provide metagenome-assembled genomes directly. Conduit translates coding sequences to protein space, enabling sample-specific databases that capture novel or uncultured organisms.
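
As an illustration of what "translating coding sequences to protein space" involves, the sketch below translates a single in-frame coding sequence using the standard codon assignments. It is a simplified example, not Conduit's implementation: the function name `translate_cds` is hypothetical, gene prediction is assumed to have already happened, and alternative bacterial start codons (which NCBI translation table 11 reads as methionine at initiation) are not handled.

```python
from itertools import product

# Standard codon assignments in NCBI order (bases T, C, A, G).
# Table 11 (bacterial/archaeal) uses the same codon-to-amino-acid
# mapping and differs only in its permitted start codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence up to the first stop codon.

    Codons containing ambiguous bases are emitted as 'X'.
    Hypothetical helper for illustration only.
    """
    seq = cds.upper().replace("U", "T")
    usable_length = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    protein = []
    for i in range(0, usable_length, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


print(translate_cds("ATGGCCAAATAA"))  # prints "MAK"
```

In practice the coding sequences would come from a gene caller run on each MAG, and the translated proteins would be written to a FASTA file that becomes the sample-specific database.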

Strategy 05

Peptidotyping

Catches what you didn't know to look for. Using only the peptides already detected in your sample, with no additional sequencing needed, Peptidotyping surfaces sources that never made it into your original database, such as unexpected organisms or dietary proteins.

Novel approach · No orthogonal data required

UNIQUE TO CONDUIT

Peptidotyping closes the loop — your proteomics experiment can now define its own search space, with no additional experimental cost.
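
The general idea of inferring sources from detected peptides can be sketched with a deliberately naive example: rank candidate taxa by how many detected peptides occur in that taxon and in no other candidate. This is not Conduit's Peptidotyping algorithm, which is not described here; the function name, the input shapes, and the scoring rule are all assumptions made for illustration.

```python
from collections import Counter


def rank_taxa_by_unique_peptides(detected, taxon_peptides):
    """Rank taxa by the number of detected taxon-unique peptides.

    detected: iterable of peptide sequences identified in the sample.
    taxon_peptides: dict mapping a taxon name to the set of peptides
        its proteome could produce (for example, from an in-silico digest).
    Hypothetical scoring rule for illustration only.
    """
    # Count how many candidate taxa contain each peptide.
    occurrences = Counter(
        peptide
        for peptides in taxon_peptides.values()
        for peptide in set(peptides)
    )
    detected_set = set(detected)
    scores = {
        taxon: sum(
            1 for peptide in set(peptides) & detected_set
            if occurrences[peptide] == 1  # peptide is unique to this taxon
        )
        for taxon, peptides in taxon_peptides.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


candidates = {
    "Bos taurus": {"AAAK", "CCCR", "SHAREDK"},
    "Escherichia coli": {"DDDK", "SHAREDK"},
}
ranking = rank_taxa_by_unique_peptides(["AAAK", "CCCR", "SHAREDK"], candidates)
```

Here the shared peptide contributes to neither taxon, so the ranking rests only on evidence that discriminates between candidates. A real method would also need to account for identification confidence and proteome size.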

§ 03 — Ecosystem Components

Four tools.
One unified workflow.

conduit-ascent Pipeline

The core Snakemake/Singularity pipeline. Handles all stages of DIA metaproteomics analysis — database construction, spectral search, quantification, and output generation — in a fully reproducible, containerised workflow.

github.com/baynec2/conduit-ascent

conduit-basecamp Coming Soon

A PyQt graphical interface for configuring and launching Conduit Ascent pipelines. Designed for scientists who want the full power of the pipeline without touching the command line — form-based configuration, run management, and progress monitoring.

In development — launching soon

conduit-summit Visualization

An R Shiny web application for interactive exploration of Conduit results. Visualise protein abundance, community composition, differential expression, and functional annotations — all from the single standardized output file produced by Ascent.

github.com/baynec2/conduit-summit

conduitR R Package

The R package that powers Conduit's statistical analysis layer. Provides programmatic access to all Conduit data structures, normalization methods, differential abundance testing, and plotting — for users who want full scripting control over their analysis.

github.com/baynec2/conduitR

§ 04 — Technical Specifications

Built for how
science actually works.

	Capability	Description	Details
	Cross-platform	Runs on Mac, Windows, Linux, and HPC clusters. It just works.	macOS, Windows, Linux, HPC / SLURM
	Full container support	Reproducible environments via Docker and Singularity. No dependency hell.	Docker, Singularity
	CLI and GUI	Snakemake CLI for power users and reproducible pipelines. Basecamp GUI for everyone else.	Snakemake CLI, Basecamp GUI, PyQt
	Unified output	One structured output file contains every result your analysis needs: proteins, abundances, annotations, and metadata.	Single file output, Standardized schema

Coming Soon

DIA Metaproteomics. Finally solved.
