This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is a Rust library for somatic variant calling and analysis from Oxford Nanopore long-read sequencing data. The library provides a complete pipeline from POD5 files through basecalling, alignment, variant calling, annotation, and statistical analysis. It supports execution both locally and via Slurm HPC environments.
# Build the library
cargo build
# Run tests with full output
cargo test -- --nocapture
# Run tests with debug logging
RUST_LOG=debug cargo test -- --nocapture
# Format code
cargo fmt
# Lint with warnings as errors
cargo clippy -- -D warnings
# Generate documentation
cargo doc --open
The library requires a configuration file at ~/.local/share/pandora/pandora-config.toml. Use pandora-config.example.toml as a template. The configuration system uses path templates with placeholders like {result_dir}, {id}, {time}, {reference_name}, and {haplotagged_bam_tag_name}.
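The placeholder substitution described above can be pictured with a minimal sketch. The function name `expand_template` and the example template are hypothetical and only illustrate the idea; the actual expansion logic lives in the library's config code and may differ.

```rust
/// Replace each `{key}` placeholder in `template` with its value.
/// Placeholders without a matching key are left untouched.
/// Substitution is sequential, so a value that itself contains a
/// placeholder may be expanded by a later key (kept simple on purpose).
fn expand_template(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        // `{{` and `}}` are literal braces in a format string.
        let placeholder = format!("{{{}}}", key);
        out = out.replace(&placeholder, value);
    }
    out
}
```

For example, expanding the hypothetical template `{result_dir}/{id}/{time}` with `result_dir = /data`, `id = CASE01`, and `time = diag` would give `/data/CASE01/diag`.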
Key configuration sections:
The library uses a trait-based command execution system defined in src/commands/mod.rs:
- Command trait: Provides init(), cmd(), and clean_up() lifecycle methods
- LocalRunner trait: Executes commands directly via bash
- SlurmRunner trait: Wraps commands with srun or sbatch for HPC execution
- run! macro (line 639): Dispatches to LocalRunner or SlurmRunner based on config.slurm_runner
- run_many! macro (line 987): Parallelizes multiple commands using Rayon

All external tools (dorado, samtools, bcftools, longphase, modkit) implement these traits, allowing seamless switching between local and Slurm execution.
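As a rough illustration of the lifecycle and dispatch described above, here is a simplified, std-only sketch. The trait signatures, the `SamtoolsIndex` struct, and the `shell_line`/`run` functions are assumptions made for this example. The real code uses separate LocalRunner and SlurmRunner traits plus the run! macro, and actually spawns processes.

```rust
/// Hypothetical stand-in for the lifecycle trait; real signatures may differ.
trait Command {
    /// Prepare inputs (e.g. create output directories). No-op by default.
    fn init(&mut self) -> Result<(), String> {
        Ok(())
    }
    /// The shell command line to execute.
    fn cmd(&self) -> String;
    /// Remove temporary files after the run. No-op by default.
    fn clean_up(&mut self) -> Result<(), String> {
        Ok(())
    }
}

/// Hypothetical example tool: index a BAM file with samtools.
struct SamtoolsIndex {
    bam: String,
    threads: u8,
}

impl Command for SamtoolsIndex {
    fn cmd(&self) -> String {
        format!("samtools index -@ {} {}", self.threads, self.bam)
    }
}

/// Build the shell line for local or Slurm execution.
/// Quoting is deliberately naive: commands containing `'` would break.
fn shell_line<C: Command>(tool: &C, slurm_runner: bool) -> String {
    if slurm_runner {
        format!("srun bash -c '{}'", tool.cmd())
    } else {
        format!("bash -c '{}'", tool.cmd())
    }
}

/// Walk the lifecycle: init, build the command line, clean up.
/// A real runner would spawn the process between init and clean_up.
fn run<C: Command>(tool: &mut C, slurm_runner: bool) -> Result<String, String> {
    tool.init()?;
    let line = shell_line(tool, slurm_runner);
    tool.clean_up()?;
    Ok(line)
}
```

The point of the design is that a tool only describes its command once in `cmd()`, and the choice between local and Slurm execution is made by the caller from configuration.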
callers/: Variant calling tool interfaces
  - clairs.rs: ClairS somatic small variant caller with LongPhase haplotagging
  - deep_variant.rs, deep_somatic.rs: Google DeepVariant/DeepSomatic wrappers
  - nanomonsv.rs: Structural variant calling (paired tumor/normal)
  - savana.rs: SV and CNV analysis with haplotagged BAM support
  - severus.rs: VNTR and repeat-based variant calling
commands/: External command wrappers implementing Command, LocalRunner, and SlurmRunner
  - dorado.rs: Basecalling and alignment from POD5 files
  - samtools.rs, bcftools.rs: SAM/BAM/VCF manipulation
  - longphase.rs: Phasing and modcall for methylation
  - modkit.rs: Methylation pileup and summary
collection/: Input data discovery and organization
  - run.rs, prom_run.rs: PromethION run metadata and POD5 file discovery
  - bam.rs: BAM file collection across cases and time points
  - vcf.rs: VCF file organization
  - flowcells.rs: Flowcell metadata management
  - minknow.rs: MinKNOW sample sheet and telemetry parsing
runners.rs: Defines Run, Wait, RunWait traits and run_wait() function for command execution lifecycle with timestamped RunReport generation
pipes/: Multi-caller pipeline composition
  - somatic.rs: Orchestrates full somatic pipeline across ClairS, Nanomonsv, Savana, etc.
  - somatic_slurm.rs: Slurm-optimized batch submission variants
annotation/: VEP (Variant Effect Predictor) line parsing and consequence filtering
variant/: Variant data structures, loading, filtering, and statistics
  - variant.rs: Core variant types, BND graph construction, alteration categorization
  - variant_collection.rs: Bulk variant loading and grouping operations
  - variants_stats.rs: Mutation rates, depth quality ranges, panel-based stats
io/: File readers/writers (BED, GFF, VCF, gzip handling)
positions.rs: Genome coordinate representations (GenomePosition, GenomeRange) with parallel overlap operations
config.rs: Global Config struct loaded from TOML (line 14 defines the struct)
helpers.rs: Path utilities, Shannon entropy, Singularity bind flag generation
scan/: Somatic variant scanning algorithms
functions/: Genome assembly and custom analysis logic
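Two of the small utilities named in the module list, range overlap (positions.rs) and Shannon entropy (helpers.rs), are easy to picture with a sketch. The struct layout, the half-open coordinate convention, and the function names below are assumptions made for this example, not the library's actual definitions, and the overlap check is sequential rather than parallel.

```rust
use std::collections::HashMap;

/// Hypothetical half-open interval [start, end) on a contig.
#[derive(Debug, Clone, PartialEq)]
struct GenomeRange {
    contig: String,
    start: u32,
    end: u32,
}

impl GenomeRange {
    /// Two ranges overlap when they share a contig and at least one base.
    fn overlaps(&self, other: &GenomeRange) -> bool {
        self.contig == other.contig && self.start < other.end && other.start < self.end
    }
}

/// Shannon entropy, in bits, of the byte distribution of `seq`:
/// H = -sum(p_i * log2(p_i)) over the distinct bytes in the sequence.
fn shannon_entropy(seq: &str) -> f64 {
    if seq.is_empty() {
        return 0.0;
    }
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for b in seq.bytes() {
        *counts.entry(b).or_insert(0) += 1;
    }
    let n = seq.len() as f64;
    counts
        .values()
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}
```

Under these assumptions, a homopolymer such as `AAAA` has entropy 0 and a sequence with four equally frequent bases has entropy 2 bits, which is why entropy is a convenient low-complexity filter.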
- commands::dorado::Dorado basecalls and aligns POD5 to reference
- callers::clairs::ClairS::initialize(...)?.run()?
- variant::variant_collection::Variants, filter, annotate with annotation::vep
- variant::variants_stats::VariantsStats for mutation rates and quality metrics
- pipes::somatic::Somatic runner or collection::run::Collections for batch processing

- TEST_DIR constant (src/lib.rs:158): /mnt/beegfs02/scratch/t_steimle/test_data
- #[cfg(test)]

External tools required at runtime (ensure they are in PATH or configured in config file):
- pandora_lib_variants for VEP install)

Rust dependencies of note:
- rust-htslib: HTSlib bindings for BAM/VCF reading (requires cmake, libclang-dev for build)
- rayon: Parallel iteration across samples and tasks
- dashmap: Concurrent hashmaps for thread-safe collections
- arrow: Efficient columnar data handling (from Apache Arrow)
- noodles-*: Pure-Rust bioinformatics file parsers (FASTA, GFF, CSI)

Tools like ClairS, DeepVariant, and DeepSomatic run via Singularity containers. The config.singularity_bin setting defaults to module load singularity-ce && singularity. Image paths are specified per tool in the config (e.g., deepvariant_image, clairs_image).
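To make the container invocation concrete, here is a minimal sketch of how a command line might be assembled from the Singularity prefix, bind mounts, and an image path. The function names `bind_flags` and `singularity_exec`, the image path, and the inner command are hypothetical; the library's own helper for bind flag generation (helpers.rs) may produce a different layout.

```rust
/// Build one `--bind` flag per host directory, mounted at the same path
/// inside the container.
fn bind_flags(dirs: &[&str]) -> String {
    dirs.iter()
        .map(|d| format!("--bind {d}:{d}"))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Assemble `<singularity_bin> exec [binds] <image> <inner command>`.
/// `singularity_bin` may itself be a shell snippet, such as the default
/// `module load singularity-ce && singularity`.
fn singularity_exec(singularity_bin: &str, image: &str, binds: &[&str], inner: &str) -> String {
    let flags = bind_flags(binds);
    if flags.is_empty() {
        format!("{singularity_bin} exec {image} {inner}")
    } else {
        format!("{singularity_bin} exec {flags} {image} {inner}")
    }
}
```

With the default prefix, binds `/data` and `/ref`, a hypothetical image `/images/tool.sif`, and inner command `ls /data`, this would yield `module load singularity-ce && singularity exec --bind /data:/data --bind /ref:/ref /images/tool.sif ls /data`.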
- anyhow::Result with ? operator; avoid unwrap() in production code paths
- .context() for debugging clarity
- format!() and config field substitution
- tumoral_name (default "diag"), normal is normal_name (default "norm")
- haplotagged_bam_tag_name config field (default "HP")