NanoMEI

Build sample-specific diploid genomes with non-reference mobile element insertions from targeted long-read TE-capture data.

Overview

NanoMEI is a post-processing tool for TEnCATS and Fiber-TEnCATS datasets analyzed with NanoPal.

Starting from NanoPal mobile element insertion calls, NanoMEI extracts insertion-supporting sequence from read softclips or BLAST-defined read segments, groups reads that support the same insertion event, and builds a consensus sequence for each MEI. It then writes an (optionally phased) MEI VCF and can use vcf2diploid to augment the reference genome with the reconstructed non-reference insertions.
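The grouping step described above can be illustrated with a minimal sketch. This is not NanoMEI's actual implementation: the `(read_name, chrom, pos)` tuple layout, the `window` tolerance, and both function names are assumptions made for illustration. Only the default support threshold of 10 reads is taken from the `--min-reads-support` option.

```python
from collections import defaultdict


def group_reads_by_insertion(calls, window=50):
    """Cluster read-level calls that likely support the same insertion.

    `calls` is an iterable of (read_name, chrom, pos) tuples (hypothetical
    layout). Calls on the same chromosome are sorted by position, and a call
    joins the current cluster when it lies within `window` bp of the previous
    call in that cluster (an assumed rule, not NanoMEI's).
    """
    by_chrom = defaultdict(list)
    for read_name, chrom, pos in calls:
        by_chrom[chrom].append((pos, read_name))

    clusters = []
    for chrom, items in by_chrom.items():
        items.sort()
        current = [items[0]]
        for pos, read_name in items[1:]:
            if pos - current[-1][0] <= window:
                current.append((pos, read_name))
            else:
                clusters.append((chrom, current))
                current = [(pos, read_name)]
        clusters.append((chrom, current))
    return clusters


def filter_by_support(clusters, min_reads_support=10):
    """Keep only clusters with at least `min_reads_support` reads."""
    return [(chrom, members) for chrom, members in clusters
            if len(members) >= min_reads_support]
```

Only clusters that pass the support filter would go on to consensus building; the consensus step itself is not sketched here because the source does not describe how it is computed.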

Installation

All dependencies can be installed with the provided conda environment:

conda env create -f environment.yml
conda activate NanoMEI
pip install -e .

Running tests

To check that everything is working, run the test suite from the main repository folder:

pytest -v

Usage

Please first run NanoPal on your (Fiber-)TEnCATS data to identify captured non-reference TEs. Once you have the NanoPal output files summary.final.PALMER.TE.read.txt and blastn_refine.all.txt, you can use NanoMEI to reconstruct MEI sequences and augment your reference genome.

reconstruct-mei \
  --final-summary-palmer summary.final.PALMER.TE.read.txt \
  --blastn-refine blastn_refine.all.txt \
  --bam-file reads.hg38.bam \
  --output-vcf sample.MEI.vcf \
  --reference-genome-fasta hg38.fa \
  --sample-id sample
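When NanoMEI is run for many samples, the command above can be assembled from a script. The following is a minimal sketch that uses only the flags documented in this README; the helper name `build_reconstruct_mei_command` and its parameter names are hypothetical and not part of NanoMEI.

```python
import shlex


def build_reconstruct_mei_command(summary, blastn, bam, output_vcf,
                                  reference, sample_id,
                                  min_reads_support=None):
    """Return the reconstruct-mei command as an argument list.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    cmd = [
        "reconstruct-mei",
        "--final-summary-palmer", summary,
        "--blastn-refine", blastn,
        "--bam-file", bam,
        "--output-vcf", output_vcf,
        "--reference-genome-fasta", reference,
        "--sample-id", sample_id,
    ]
    if min_reads_support is not None:
        cmd += ["--min-reads-support", str(min_reads_support)]
    return cmd


cmd = build_reconstruct_mei_command(
    "summary.final.PALMER.TE.read.txt",
    "blastn_refine.all.txt",
    "reads.hg38.bam",
    "sample.MEI.vcf",
    "hg38.fa",
    "sample",
)
# Print the command for inspection; to execute it, pass the list to
# subprocess.run(cmd, check=True) with the NanoMEI environment active.
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.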

Argument	Required	Description
`--final-summary-palmer`	Yes	Path to the NanoPal/PALMER summary file, usually named `summary.final.PALMER.TE.read.txt`. This file provides read-level evidence for captured non-reference TE insertions.
`--blastn-refine`	Yes	Path to the NanoPal BLAST refinement output, usually named `blastn_refine.all.txt`. This file is used to determine which portion of each read corresponds to the mobile element insertion.
`--bam-file`	Yes	BAM file containing nanopore reads aligned to the reference genome. NanoMEI uses this file to extract insertion-supporting read sequence from softclips or BLAST-defined read segments.
`--output-vcf`	Yes	Path/name for the final MEI VCF produced by NanoMEI.
`--reference-genome-fasta`	Yes	Reference genome FASTA used by `vcf2diploid` when creating the sample-specific diploid genome.
`--sample-id`	Yes	Sample identifier to use in the final VCF and `vcf2diploid` output.
`--vcf2diploid-jar`	No	Path to the `vcf2diploid.jar` file. A copy of `vcf2diploid.jar` is already included in the `resources` subdirectory, but you can provide a different path if you want to use another version.
`--min-reads-support`	No	Minimum number of reads required to build a consensus sequence for an insertion event. Default: `10`.
`--phased-reads`	No	Optional file with read-level haplotype assignments. The first column must contain the read name matching the FASTA header, and the second column must contain the phase (`1
`--output-dir`	No	Directory for intermediate files and `vcf2diploid` output. Default: `TE_vcf`.
`--vcf-header`	No	Path to the VCF header file. The provided default is a minimal header that can be used with any reference genome. An example of a more detailed header is also provided for hg38; use a different header if working with another reference.
`--use-blast-defined-only`	No	If set, NanoMEI extracts only the BLAST-defined read segment instead of the full trimmed softclip.
`-v`, `--version`	No	Print the NanoMEI version and exit.
`-h`, `--help`	No	Print the help message and exit.

Citation

If you use this repository, please cite:

Fiber-TEnCATS reveals haplotype-specific chromatin accessibility and DNA methylation at human L1HS loci

