bartools

An R package for analyzing Bar-seq (barcode sequencing) data. bartools provides functions to ingest, tidy, summarize, and quality-control barcode count data generated by BarNone or Bartender.

Installation

# install devtools if needed
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

devtools::install_github("GreshamLab/bartools", build_vignettes = TRUE)

Once installed, browse the package documentation with:

help(package = "bartools", help_type = "html")

Overview

Bar-seq experiments track strain-specific DNA barcodes across samples over time. bartools takes the raw count outputs from BarNone or Bartender, attaches sample metadata, and returns tidy long-format tibbles ready for downstream analysis.

Workflow:

  1. Count barcodes → BarNone / Bartender
  2. Tidy & attach metadata → tidy_bar_none() / tidy_bar_tender()
  3. Summarize per sample → summarize_bar()
  4. Plot QC → summarize_barnon_plot()
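The steps above can be chained in a few lines. The following is a minimal sketch rather than a tested pipeline: the wrapper name run_barseq_qc and its outdir argument are hypothetical, it assumes the three bartools functions are exported under the names used in this README, and it relies on the documented defaults of summarize_bar().

```r
# Hypothetical wrapper around steps 2-4 of the workflow.
# Assumes BarNone has already produced the count files listed in the sample sheet.
run_barseq_qc <- function(sample_sheet, outdir = getwd()) {
  if (!requireNamespace("bartools", quietly = TRUE)) {
    message("bartools is not installed; skipping.")
    return(invisible(NULL))
  }

  # Step 2: tidy the counts and attach sample metadata
  tidy <- bartools::tidy_bar_none(sample_sheet)

  # Step 3: per-sample summary using the documented default column names
  per_sample <- bartools::summarize_bar(tidy)

  # Step 4: write the QC plots to outdir
  bartools::summarize_barnon_plot(tidy, datadir = outdir, format = ".pdf")

  list(tidy = tidy, summary = per_sample)
}
```

Calling run_barseq_qc("sample_sheet.csv") (a placeholder path) would return the tidy table and the summary in one list, so both can be inspected after the plots are written.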

Input formats

All workflows require a sample sheet CSV with at least two columns:

Column     Description
File_path  Path to the barcode count file
Sample     Sample name to assign

Additional metadata columns (e.g. time point, condition) may be included and will be preserved in the output.
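As an illustration, a sample sheet with the two required columns plus two optional metadata columns could be written from R as follows. The file paths, sample names, and the Timepoint and Condition columns are made up for this sketch.

```r
# Hypothetical file names and metadata values, for illustration only.
sample_sheet <- data.frame(
  File_path = c("counts/run1_counts.txt", "counts/run2_counts.txt"),
  Sample    = c("T0_rep1", "T8_rep1"),
  Timepoint = c(0, 8),          # optional extra metadata
  Condition = c("YPD", "YPD"),  # optional extra metadata
  stringsAsFactors = FALSE
)

# Write the sheet to a temporary location.
sheet_path <- file.path(tempdir(), "sample_sheet.csv")
write.csv(sample_sheet, sheet_path, row.names = FALSE)

# Check that the two required columns are present before going further.
required <- c("File_path", "Sample")
stopifnot(all(required %in% names(read.csv(sheet_path))))
```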

BarNone format

Tab-separated files with barcodes in a Strain column and sample counts in columns named Sample_<name>_UP / Sample_<name>_DOWN.
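To make the column naming concrete, here is a base R sketch that reshapes a mock BarNone-style table into long format by parsing the Sample_<name>_UP / Sample_<name>_DOWN headers. It is not the code used inside tidy_bar_none(). The strain IDs and counts are invented, and the output column names (Strain, Sample, Tag, Counts) are an assumption chosen to mirror the summarize_bar() defaults.

```r
# Mock BarNone-style table; strain IDs and counts are made up.
barnone <- data.frame(
  Strain        = c("YAL001C", "YAL002W"),
  Sample_A_UP   = c(10L, 0L),
  Sample_A_DOWN = c(12L, 3L),
  Sample_B_UP   = c(7L, 5L),
  Sample_B_DOWN = c(9L, 4L),
  stringsAsFactors = FALSE
)

# Select the count columns by their naming pattern.
pattern <- "^Sample_(.+)_(UP|DOWN)$"
count_cols <- grep(pattern, names(barnone), value = TRUE)

# Stack one block per count column, splitting the header into sample and tag.
long <- do.call(rbind, lapply(count_cols, function(col) {
  data.frame(
    Strain = barnone$Strain,
    Sample = sub(pattern, "\\1", col),
    Tag    = sub(pattern, "\\2", col),
    Counts = barnone[[col]],
    stringsAsFactors = FALSE
  )
}))
```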

Bartender format

CSV files with barcodes in a Center column and counts in columns named time_point_<n>.

Functions

tidy_bar_none(sample_sheet, u_vector = NA)

Reads BarNone output files listed in sample_sheet, reshapes counts to long format, joins sample metadata, looks up gene names from the bundled SGD lookup table, and returns a tidy tibble.

  • sample_sheet — path to sample sheet CSV
  • u_vector — optional character vector of column names to combine into a unique sample ID (UID); defaults to File_path + Sample
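The idea behind u_vector can be sketched in base R. The helper make_uid below is hypothetical, and the underscore separator is an assumption; the package may combine the columns differently.

```r
# Mock sample sheet: two samples read from the same count file.
sheet <- data.frame(
  File_path = c("counts/run1.txt", "counts/run1.txt"),
  Sample    = c("A", "B"),
  Timepoint = c(0, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: paste the chosen columns together row by row.
make_uid <- function(sheet, u_vector = c("File_path", "Sample"), sep = "_") {
  stopifnot(all(u_vector %in% names(sheet)))
  do.call(paste, c(unname(as.list(sheet[u_vector])), sep = sep))
}

uid_default <- make_uid(sheet)                                       # File_path + Sample
uid_custom  <- make_uid(sheet, u_vector = c("Sample", "Timepoint"))  # custom columns

# A usable UID must identify each row of the sample sheet uniquely.
stopifnot(anyDuplicated(uid_default) == 0)
```

Whichever columns are chosen, checking for duplicates as in the last line is a cheap way to catch a u_vector that does not separate the samples.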

tidy_bar_tender(sample_sheet, u_vector = NA)

Same as tidy_bar_none() but for Bartender output files.

summarize_bar(data, sum_by, barcode_by, count_by, tag_by, tags)

Produces a per-sample summary table from a tidy barcode tibble.

Argument    Default         Description
data        (none)          Tidy tibble from tidy_bar_*
sum_by      "SampleNum"     Column to group by
barcode_by  "Strain"        Column containing barcode IDs
count_by    "Counts"        Column containing read counts
tag_by      "Tag"           Column containing UP/DOWN tag labels (set to NULL to skip)
tags        c("UP","DOWN")  Names of the two tag directions

Returns a tibble whose columns include LibrarySize, NumSamples, and Shannon (the Shannon diversity index).
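For intuition about two of these columns, the sketch below computes a library size and a Shannon index per sample in base R. It assumes LibrarySize is the total read count per sample and that Shannon uses the natural logarithm, which is the default of vegan::diversity(). Neither assumption is confirmed by this README, and the data are invented.

```r
# Mock tidy table: S1 is perfectly even, S2 is dominated by one strain.
tidy <- data.frame(
  SampleNum = rep(c("S1", "S2"), each = 4),
  Strain    = rep(c("YAL001C", "YAL002W", "YAL003W", "YAL005C"), times = 2),
  Counts    = c(25, 25, 25, 25, 97, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Shannon index H = -sum(p * log(p)) over strains with non-zero counts.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

lib_size <- tapply(tidy$Counts, tidy$SampleNum, sum)
shannon_h <- tapply(tidy$Counts, tidy$SampleNum, shannon)

summary_tbl <- data.frame(
  SampleNum   = names(lib_size),
  LibrarySize = as.numeric(lib_size),
  Shannon     = as.numeric(shannon_h),
  stringsAsFactors = FALSE
)
```

Both mock samples have the same library size, but the even sample S1 reaches the maximum index for four strains, log(4), while the skewed sample S2 scores much lower. This is why the index is useful for spotting samples dominated by a few barcodes.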

summarize_barnon_plot(df, datadir, format)

Generates four QC plots and saves them to disk:

  1. Library size per sample (bar chart, log scale, colored by tag)
  2. Strain frequency histogram across all samples
  3. UP vs DOWN tag scatter plots per sample
  4. Pairwise sample correlation matrix

Argument  Default   Description
df        (none)    Tidy tibble from tidy_bar_none()
datadir   getwd()   Output directory for plots
format    ".pdf"    File extension / format for saved plots

Data

The package ships with data/gene_lookup_table.csv, a mapping of S. cerevisiae systematic names (e.g. YAL001C) to common gene names downloaded from the Saccharomyces Genome Database (SGD). This table is joined automatically by tidy_bar_none() to annotate barcodes with gene names.
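The annotation step amounts to a left join on the systematic name. The base R sketch below shows the idea with a two-row stand-in for the lookup table. The column names Systematic and Gene are hypothetical, since the layout of the bundled CSV is not described here, and YAL999X is an invented ID included to show what happens when no match is found.

```r
# Stand-in for the bundled lookup table (column names are assumptions).
lookup <- data.frame(
  Systematic = c("YAL001C", "YAL002W"),
  Gene       = c("TFC3", "VPS8"),
  stringsAsFactors = FALSE
)

# Mock barcode counts; YAL999X is a made-up ID with no lookup entry.
barcodes <- data.frame(
  Strain = c("YAL002W", "YAL001C", "YAL999X"),
  Counts = c(5, 10, 2),
  stringsAsFactors = FALSE
)

# Left join via match(): row order is preserved, unmatched strains get NA.
barcodes$Gene <- lookup$Gene[match(barcodes$Strain, lookup$Systematic)]
```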

To regenerate the table from SGD:

source("R/get_sgd.R")

Dependencies

Package    Use
tidyverse  data import, reshaping, and plotting
vegan      Shannon diversity calculation
GGally     pairwise correlation plots
gridExtra  multi-panel plot layout

License

MIT
