README.md

Install

Dependencies

To build the required HTSlib, install:

sudo apt install cmake libclang-dev

Windows build

Install the native dependencies with Rtools/MSYS2:

pacman -S --needed mingw-w64-x86_64-curl mingw-w64-x86_64-sqlite3 mingw-w64-x86_64-openssl mingw-w64-x86_64-tre mingw-w64-x86_64-libiconv mingw-w64-x86_64-gettext

Make sure these user environment variables are set:

[Environment]::SetEnvironmentVariable("LIBCLANG_PATH", "$env:USERPROFILE\tools\libclang-win", "User")
[Environment]::SetEnvironmentVariable("PKG_CONFIG", "C:\rtools45\usr\bin\pkg-config.exe", "User")
[Environment]::SetEnvironmentVariable("PKG_CONFIG_PATH", "C:\rtools45\mingw64\lib\pkgconfig", "User")

Also ensure the user Path contains:

%USERPROFILE%\tools\libclang-win
C:\rtools45\mingw64\bin
C:\rtools45\usr\bin

Then open a fresh PowerShell in the repository and run:

.\setup-patches.ps1
cargo build

setup-patches.ps1 generates the Windows-only local Cargo patch config in .cargo/config.toml and patched crate sources in patches/. Both are ignored by git. This keeps Linux builds on the registry crates while Windows builds use the patched hts-sys / rust-htslib sources.

Usage

SomaticPipe output container

A proposed single-file container for SomaticPipe results is documented in docs/somaticpipe-output-format.md.

Use jq for selecting variants

  • Somatic Variants of chrM (25)

    zcat /data/longreads_basic_pipe/*/diag/somatic_variants.json.gz | \
    jq -L ./jq_filters -C 'include "jq_variants"; [.data[] | select(contig("chrM") and n_in_constit <= 1) | format]'
    

Using jq and find to look for chrM norm coverage

find /data/longreads_basic_pipe/ -name "*_diag_hs1_info.json" -type f -exec sh -c 'basename "$(dirname "$(dirname "$1")")" | tr -d "\n"' sh {} \; -printf "\t" -exec jq -L ./jq_filters -r 'include "jq_bam"; contig_coverage("chrM")' {} \;
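
The command above labels each row with the sample directory, obtained by stripping two path components from the matched file. Here is that step in isolation, as a minimal sketch that assumes the layout `<root>/<sample>/<subdir>/<file>` implied by the paths in this README. `sample_id` is a hypothetical helper name and `SAMPLE_A` is a made-up sample; neither comes from the repository.

```shell
#!/bin/sh
# sample_id is a hypothetical helper, not part of the repository.
# It assumes the layout <root>/<sample>/<subdir>/<file> and prints <sample>.
sample_id() {
    # Quote every expansion so paths containing spaces survive intact.
    basename "$(dirname "$(dirname "$1")")"
}

sample_id "/data/longreads_basic_pipe/SAMPLE_A/diag/SAMPLE_A_diag_hs1_info.json"
# prints: SAMPLE_A
```

Passing the path to `sh -c` as `"$1"` rather than embedding `{}` in the script text keeps file names with spaces or shell metacharacters from being reinterpreted.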

Using jq to inspect VEP consequences (see https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html)

zcat /data/longreads_basic_pipe/ADJAGBA/diag/somatic_variants.json.gz | jq -L ./jq_filters -C 'include "jq_variants"; consequence("SynonymousVariant")' | bat

Using jq and find to count VEP consequences

find /data/longreads_basic_pipe/ -name "somatic_variants.json.gz" -type f -exec sh -c 'sample=$(basename "$(dirname "$(dirname "$1")")"); count=$(zcat "$1" | jq -L ./jq_filters -r '\''include "jq_variants"; count_consequence("SynonymousVariant") | [.true_count, .total_count, .proportion] | @tsv'\''); printf "%s\t%s\n" "$sample" "$count"' sh {} \;

Find recurrence by VEP consequence

find /data/longreads_basic_pipe/ -name "somatic_variants.json.gz" -type f -exec sh -c 'zcat "$1" | jq -L ./jq_filters -r '\''include "jq_variants"; consequence("StopGained") | .[] | select(.has_consequence == true) | [.chr, .position, .ref, .alt] | @tsv'\''' sh {} \; | sort -k1,1V -k2,2n | uniq -c | awk '$1 > 1 {print $2"\t"$3"\t"$4"\t"$5"\t"$1}'
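
The tail of the pipeline above (`sort`, `uniq -c`, `awk`) is what turns per-sample variant rows into a recurrence table. The sketch below runs that stage on made-up rows so it can be tried without the pipeline data. `recurrent` is a hypothetical function name, the coordinates are invented for illustration, and `sort -V` assumes GNU sort.

```shell
#!/bin/sh
# recurrent is a hypothetical helper, not part of the repository.
# It reads TSV rows (chr, position, ref, alt) on stdin and prints the rows
# seen more than once, with the occurrence count appended as a fifth column.
recurrent() {
    sort -k1,1V -k2,2n |   # GNU sort: version order on contig, numeric on position
        uniq -c |          # prefix each distinct row with its occurrence count
        awk '$1 > 1 { print $2 "\t" $3 "\t" $4 "\t" $5 "\t" $1 }'
}

# Made-up variants standing in for the rows emitted by several samples.
printf 'chr2\t200\tC\tT\nchr1\t100\tG\tA\nchr2\t200\tC\tT\nchr10\t5\tA\tG\nchr2\t200\tC\tT\nchr1\t100\tG\tA\n' | recurrent
```

With this input the function prints two rows: `chr1 100 G A 2` and `chr2 200 C T 3` (tab-separated). The `chr10` variant occurs only once and is dropped. Because each sample contributes a variant at most once, the count in the last column corresponds to the number of samples carrying it.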

Reading log files

zcat /data/longreads_basic_pipe/ID/log/deepsomatic/deepvariant_e7ed1.log.gz | jq -r '.log'