3 changes: 3 additions & 0 deletions .gitattributes
@@ -1,4 +1,7 @@
*.config linguist-language=nextflow
*.nf.test linguist-language=nextflow
*.bwt binary
*.pac binary
*.sa binary
modules/nf-core/** linguist-generated
subworkflows/nf-core/** linguist-generated
2 changes: 1 addition & 1 deletion .github/workflows/nf-test.yml
@@ -119,7 +119,7 @@ jobs:
confirm-pass:
needs: [nf-test]
if: always()
runs-on: # use self-hosted runners
runs-on: # use self-hosted runners
- runs-on=${{ github.run_id }}-confirm-pass
- runner=2cpu-linux-x64
steps:
11 changes: 11 additions & 0 deletions .pre-commit-config.yaml
@@ -1,3 +1,14 @@
exclude: |
(?x)^(
\.nf-test/|
results/|
work/|
tmp/|
\.codex$|
\.codex/|
assets/testdata/.*\.(amb|ann|bwt|pac|sa)$
)

repos:
- repo: https://github.com/pre-commit/mirrors-prettier
rev: "v3.1.0"
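
The new `exclude` pattern can be exercised outside pre-commit with Python's `re` module, which is the regex engine pre-commit uses for `files`/`exclude` matching. The following is a minimal sketch: the pattern is copied from the hunk above, while the helper name `is_excluded` and the sample paths are hypothetical and exist only for illustration.

```python
import re

# Pattern copied from the `exclude` block in .pre-commit-config.yaml.
# (?x) turns on verbose mode, so whitespace and newlines in the pattern are ignored.
EXCLUDE = re.compile(r"""(?x)^(
    \.nf-test/|
    results/|
    work/|
    tmp/|
    \.codex$|
    \.codex/|
    assets/testdata/.*\.(amb|ann|bwt|pac|sa)$
)""")


def is_excluded(path: str) -> bool:
    """Return True if the repo-relative path matches the exclude pattern."""
    return EXCLUDE.search(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, not taken from the repository.
    samples = [
        "work/ab/123456/.command.sh",
        "assets/testdata/genome.fa.bwt",
        "README.md",
        "src/work/notes.txt",
    ]
    for path in samples:
        print(f"{path}\t{is_excluded(path)}")
```

Because the pattern is anchored with `^`, only top-level `work/`, `results/`, and `tmp/` directories are skipped; a nested `src/work/` path is still checked.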
31 changes: 29 additions & 2 deletions README.md
@@ -8,6 +8,7 @@ TrESFlow is a Nextflow DSL2 pipeline for the preprocessing of TrES-seq data from
## Install

Install your conda/mamba/micromamba env as follows (conda-forge & bioconda channels):

```bash
micromamba env create -n tres
micromamba activate tres
@@ -16,12 +17,14 @@ micromamba install screen samtools bwa-mem2 star fastqc multiqc trim-galore deep
```

Download the repo and cd in it:

```bash
git clone git@github.com:CSOgroup/TrESFlow.git
cd TrESFlow
```

Install codon in your env:

```bash
./scripts/install_codon_0.16.3.sh --prefix /path/to/env/prefix
```
@@ -110,16 +113,25 @@ RNA publishes:
- `rna_split_fastqs/`
- `rna_align/`
- `TrES_Stats/`
- `qc/samtools/`
- `multiqc/`
- `tres_report/`
- `pipeline_info/`

DNA publishes:

- `dna_split_fastqs/`
- `dna_align/`
- `TrES_Stats/`
- `qc/samtools/`
- `multiqc/`
- `tres_report/`
- `pipeline_info/`

`TrES_Stats/` includes RNA and DNA sequencing-efficiency UpSet PDF plots. Sankey plots, HTML reports, count tables, combined RNA+DNA reports, and sequencing-efficiency warning TSVs are not produced. Optional unavailable BAM-derived categories are skipped with warnings in the process log.
The pipeline also writes two end-of-run HTML reports:

- `tres_report/tres_report.html`: compact TrESFlow-specific RNA/DNA mapping and barcode summary
- `multiqc/multiqc_report.html`: nf-core MultiQC aggregation of supported logs and QC files

## Runtime Contract

@@ -162,7 +174,7 @@ Default local CPU budget:

Work-directory cleanup is intentionally aggressive: `--cleanup_work true` uses Nextflow's successful-run cleanup to remove task work directories after outputs have been published and downstream tasks have completed. This substantially reduces retained `work/` storage, but cleaned tasks are not expected to be usable with `--resume`. Set `--cleanup_work false` when you need the previous resume-friendly behavior for debugging or iterative development.

DNA alignment no longer removes low-count cell barcodes during `ALIGN_DNA`. The aligned BAM still keeps proper-pair mapped, non-blacklisted reads; duplicate removal is represented later by `*_NoDup.bam`, and duplicate status appears in DNA sequencing-efficiency plots as `Unique +`.
DNA alignment no longer removes low-count cell barcodes during `ALIGN_DNA`. The aligned BAM still keeps proper-pair mapped, non-blacklisted reads; duplicate removal is represented later by `*_NoDup.bam`.

Every run writes:

@@ -171,6 +183,21 @@ Every run writes:
- `${outdir}/pipeline_info/execution_trace.tsv`
- `${outdir}/pipeline_info/flowchart.html`
- `${outdir}/pipeline_info/runtime_contract.tsv`
- `${outdir}/tres_report/tres_report.html` with per-library main statistics, detailed QC tables, and CSV/Excel export buttons
- `${outdir}/tres_report/tres_report_metrics.json`
- `${outdir}/multiqc/multiqc_report.html`

Runs with real BAM outputs also write nf-core samtools sidecar QC under:

- `${outdir}/qc/samtools/*.flagstat`
- `${outdir}/qc/samtools/*.stats`
- `${outdir}/qc/samtools/*.idxstats`
- `${outdir}/qc/samtools/*.quickcheck.tsv`

Raw FASTQ QC from nf-core FastQC is written under:

- `${outdir}/qc/fastqc/*_fastqc.html`
- `${outdir}/qc/fastqc/*_fastqc.zip`
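
The output contract listed above can be checked after a run with a short script. This is a sketch under stated assumptions: it covers only the paths visible in this hunk (the start of the "Every run writes" list is elided in the diff), and the function names `missing_outputs` and `optional_counts` and the `results` directory are hypothetical rather than part of TrESFlow.

```python
from pathlib import Path

# Fixed-name files that, per the README hunk, every run writes.
# Only entries visible in this diff are listed; earlier ones are elided.
REQUIRED = [
    "pipeline_info/execution_trace.tsv",
    "pipeline_info/flowchart.html",
    "pipeline_info/runtime_contract.tsv",
    "tres_report/tres_report.html",
    "tres_report/tres_report_metrics.json",
    "multiqc/multiqc_report.html",
]

# Glob patterns for sidecar QC that appears only when BAM or FASTQ inputs exist.
OPTIONAL_GLOBS = [
    "qc/samtools/*.flagstat",
    "qc/samtools/*.stats",
    "qc/samtools/*.idxstats",
    "qc/samtools/*.quickcheck.tsv",
    "qc/fastqc/*_fastqc.html",
    "qc/fastqc/*_fastqc.zip",
]


def missing_outputs(outdir: Path) -> list[str]:
    """Return the required relative paths that are absent under outdir."""
    return [rel for rel in REQUIRED if not (outdir / rel).is_file()]


def optional_counts(outdir: Path) -> dict[str, int]:
    """Count files matching each optional QC pattern under outdir."""
    if not outdir.is_dir():
        return {pattern: 0 for pattern in OPTIONAL_GLOBS}
    return {pattern: sum(1 for _ in outdir.glob(pattern)) for pattern in OPTIONAL_GLOBS}


if __name__ == "__main__":
    outdir = Path("results")  # hypothetical --outdir value
    for rel in missing_outputs(outdir):
        print(f"missing: {rel}")
    for pattern, count in optional_counts(outdir).items():
        print(f"{pattern}: {count} file(s)")
```

Treating the samtools and FastQC files as counts rather than hard requirements mirrors the README wording, which says they are written only for runs with real BAM or raw FASTQ inputs.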

The active runtime scripts live under [`scripts/core_runtime/`](scripts/core_runtime/). `upstream/source_scripts/` is kept only as provenance for the vendored core code.

48 changes: 48 additions & 0 deletions assets/test_realdata/ligation_barcode_whitelist.txt
@@ -0,0 +1,48 @@
ATCACGTT
CGATGTTT
TTAGGCAT
TGACCACT
ACAGTGGT
GCCAATGT
CAGATCTG
ACTTGATG
GATCAGCG
TAGCTTGT
GGCTACAG
CTTGTACT
TGGTTGTT
TCTCGGTT
TAAGCGTT
TCCGTCTT
TGTACCTT
TTCTGTGT
TCTGCTGT
TTGGAGGT
TCGAGCGT
TGATACGT
TGCATAGT
TTGACTCT
TGCGATCT
TTCCTGCT
TAGTGACT
TACAGGAT
TCCTCAAT
TGTGGTTG
TACTAGTC
TTCCATTG
TCGAAGTG
TAACGCTG
TTGGTATG
TGAACTGG
TACTTCGG
TCTCACGG
TCAGGAGG
TAAGTTCG
TCCAGTCG
TGTATGCG
TCATTGAG
TGGCTCAG
TATGCCAG
TCAGATTC
TAGTCTTG
TTCAGCTC
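
A whitelist like the one added above (48 eight-base barcodes, one per line) can be sanity-checked before use. The sketch below is not part of TrESFlow: `load_whitelist`, `hamming`, and `check_whitelist` are hypothetical helpers, and the checks (fixed length, `ACGT` alphabet, no duplicates, minimum pairwise Hamming distance) are common barcode hygiene checks assumed here rather than requirements stated in the PR.

```python
from itertools import combinations
from pathlib import Path


def load_whitelist(path: str) -> list[str]:
    """Read one barcode per line, skipping blank lines."""
    return [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def check_whitelist(barcodes: list[str], length: int = 8) -> dict:
    """Report malformed entries, duplicates, and the minimum pairwise distance."""
    invalid = [bc for bc in barcodes if len(bc) != length or set(bc) - set("ACGT")]
    seen, duplicates = set(), []
    for bc in barcodes:
        if bc in seen:
            duplicates.append(bc)
        seen.add(bc)
    # Distance is computed only over well-formed, distinct barcodes.
    valid = sorted(seen - set(invalid))
    distances = [hamming(a, b) for a, b in combinations(valid, 2)]
    return {
        "invalid": invalid,
        "duplicates": duplicates,
        "min_hamming": min(distances) if distances else None,
    }


if __name__ == "__main__":
    # First three entries of the whitelist in this diff, used as a small demo.
    demo = ["ATCACGTT", "CGATGTTT", "TTAGGCAT"]
    print(check_whitelist(demo))
```

To check the full file, pass `load_whitelist("assets/test_realdata/ligation_barcode_whitelist.txt")` to `check_whitelist` from the repository root.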