//! File I/O for all genomic formats used by Pandora.
//!
//! # Coordinate convention
//!
//! All types and functions in this module use **0-based, half-open `[start, end)`**
//! coordinates unless explicitly documented otherwise. This matches the BED format,
//! Rust's `Range<u32>`, and the internal [`GenomeRange`](crate::positions::GenomeRange)
//! representation. Conversions to/from 1-based formats (GFF3, VCF POS, SAM POS,
//! Tabix positions) are handled internally and noted in each function's documentation.
//!
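//! A minimal sketch of the arithmetic behind that convention, not this crate's actual
//! code: the helper names `one_based_closed_to_half_open` and
//! `half_open_to_one_based_closed` are hypothetical and only illustrate converting a
//! 1-based, closed interval (as used by GFF3) to and from a 0-based, half-open range.
//!
//! ```rust
//! use std::ops::Range;
//!
//! /// Hypothetical helper: 1-based closed `[start, end]` -> 0-based half-open `[start - 1, end)`.
//! /// Returns `None` for inputs that are not valid 1-based closed intervals.
//! fn one_based_closed_to_half_open(start: u32, end: u32) -> Option<Range<u32>> {
//!     if start == 0 || end < start {
//!         return None;
//!     }
//!     Some((start - 1)..end)
//! }
//!
//! /// Hypothetical helper: 0-based half-open range -> 1-based closed `(start, end)`.
//! /// An empty range has no closed representation, so it yields `None`.
//! fn half_open_to_one_based_closed(range: &Range<u32>) -> Option<(u32, u32)> {
//!     if range.start >= range.end {
//!         return None;
//!     }
//!     Some((range.start + 1, range.end))
//! }
//!
//! // A feature covering the first ten bases: 1..=10 in GFF3, 0..10 in BED.
//! assert_eq!(one_based_closed_to_half_open(1, 10), Some(0..10));
//! assert_eq!(half_open_to_one_based_closed(&(0..10)), Some((1, 10)));
//! ```
//!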
//! # BGZF vs standard gzip
//!
//! All `.gz` files are treated as **BGZF** (Blocked GNU Zip Format), not standard gzip.
//! BGZF is produced by `bgzip` and used by BAM, VCF.gz, BED.gz, etc.
//! Files compressed with plain `gzip` will not decompress correctly. See [`readers`] for details.
//!
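//! A minimal sketch of how a caller could tell the two apart before opening a file.
//! It assumes the `BC` extra subfield comes first in the gzip header, as `bgzip` writes
//! it. The function name `looks_like_bgzf` is hypothetical and is not part of
//! [`readers`].
//!
//! ```rust
//! /// Hypothetical check: does `header` start like a BGZF block?
//! /// Layout: gzip magic (1f 8b), CM = 8, FLG with FEXTRA set, then after
//! /// MTIME/XFL/OS/XLEN an extra subfield with SI1 = 'B', SI2 = 'C', SLEN = 2.
//! fn looks_like_bgzf(header: &[u8]) -> bool {
//!     header.len() >= 16
//!         && header[0] == 0x1f
//!         && header[1] == 0x8b
//!         && header[2] == 8 // CM: deflate
//!         && header[3] & 0x04 != 0 // FLG: FEXTRA
//!         && header[12] == b'B'
//!         && header[13] == b'C'
//!         && header[14] == 2 // SLEN, little-endian low byte
//!         && header[15] == 0 // SLEN, little-endian high byte
//! }
//!
//! // First 16 bytes of the standard BGZF end-of-file marker block.
//! let bgzf = [
//!     0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00,
//!     0x00, 0xff, 0x06, 0x00, 0x42, 0x43, 0x02, 0x00,
//! ];
//! // A plain gzip header has no FEXTRA flag, so no `BC` subfield.
//! let plain = [
//!     0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,
//!     0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
//! ];
//! assert!(looks_like_bgzf(&bgzf));
//! assert!(!looks_like_bgzf(&plain));
//! ```
//!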
//! # Submodules
//!
//! | Module | Purpose |
//! |--------|---------|
//! | [`bam`] | BAM/CRAM reading, SA-tag parsing, fold-back inversion detection |
//! | [`bed`] | BED file I/O, overlap queries, gene annotation, tabix compression |
//! | [`vcf`] | VCF file I/O with BGZF + Tabix index |
//! | [`fasta`] | Indexed FASTA access, contig splitting |
//! | [`gff`] | GFF3 feature range extraction |
//! | [`modkit`] | Modkit bedMethyl pileup parsing, epigenetic activity computation |
//! | [`straglr`] | Straglr STR genotyper TSV parsing |
//! | [`liftover`] | UCSC chain file parsing and coordinate liftover |
//! | [`readers`] | Generic BGZF/plain readers, Tabix region fetch (`fetch_tabix_lines_with`) |
//! | [`writers`] | BGZF writers, `BgzTabixWriter` for combined BGZF + Tabix output |
//! | [`tsv`] | `TsvLine`: reusable delimiter-agnostic line buffer (replaces `csv::ByteRecord`) |
//! | [`dict`] | Sequence dictionary (`.dict`) reader |
//! | [`fastq`] | FASTQ writer from BAM records |
//! | [`pod5_infos`] | POD5 run metadata extraction via Arrow IPC + flatbuffers |
//! | [`pod5_footer_generated`] | Auto-generated flatbuffers types for the POD5 footer |
pub mod bam;
pub mod bed;
pub mod dict;
pub mod fasta;
pub mod fastq;
pub mod gff;
pub mod liftover;
pub mod modkit;
pub mod pod5_footer_generated;
pub mod pod5_infos;
pub mod readers;
pub mod straglr;
pub mod tsv;
pub mod vcf;
pub mod writers;