An R package for analyzing Bar-seq (barcode sequencing) data. bartools provides functions to ingest, tidy, summarize, and quality-control barcode count data generated by BarNone or Bartender.
```r
# Install devtools if needed
install.packages("devtools")
library(devtools)
install_github("GreshamLab/bartools", build_vignettes = TRUE)
```

Once installed, browse the package documentation with:

```r
help(package = "bartools", help_type = "html")
```

Bar-seq experiments track strain-specific DNA barcodes across samples over time. bartools takes the raw count outputs from BarNone or Bartender, attaches sample metadata, and returns tidy long-format tibbles ready for downstream analysis.
Workflow:
- Count barcodes → BarNone / Bartender
- Tidy & attach metadata → `tidy_bar_none()` / `tidy_bar_tender()`
- Summarize per sample → `summarize_bar()`
- Plot QC → `summarize_barnon_plot()`

All workflows require a sample sheet CSV with at least two columns:
| Column | Description |
|---|---|
| `File_path` | Path to the barcode count file |
| `Sample` | Sample name to assign |

Additional metadata columns (e.g. time point, condition) may be included and will be preserved in the output.
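
A sample sheet can be written from R as well as by hand. The sketch below builds a minimal one with base R. The file paths, sample names, and the extra `Time_point` and `Condition` columns are hypothetical placeholders, not files or names defined by bartools.

```r
# Minimal sample sheet sketch; paths, sample names, and the extra
# metadata columns are placeholders, not files shipped with bartools.
sample_sheet <- data.frame(
  File_path  = c("counts/run1_barnone.txt", "counts/run2_barnone.txt"),
  Sample     = c("T0_rep1", "T1_rep1"),
  Time_point = c(0, 1),
  Condition  = c("glucose", "glucose"),
  stringsAsFactors = FALSE
)

# The two required columns must be present.
stopifnot(all(c("File_path", "Sample") %in% names(sample_sheet)))

# Write the sheet to a temporary CSV without row names.
sheet_path <- file.path(tempdir(), "sample_sheet.csv")
write.csv(sample_sheet, sheet_path, row.names = FALSE)
```

Checking for the required columns before writing catches a misnamed header early, before any count files are read.
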
BarNone output: tab-separated files with barcodes in a `Strain` column and sample counts in columns named `Sample_<name>_UP` / `Sample_<name>_DOWN`.

Bartender output: CSV files with barcodes in a `Center` column and counts in columns named `time_point_<n>`.
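
To make the BarNone layout concrete, here is a base R sketch that stacks the `Sample_<name>_UP` / `Sample_<name>_DOWN` columns of a toy table into long format. It only illustrates the reshaping idea and is not the code bartools runs; the strain names, counts, and the output column names `Sample`, `Tag`, and `Counts` are assumptions chosen for the example.

```r
# Toy BarNone-style table: one row per barcode, one column per sample/tag.
# Strain names and counts are made up for illustration.
wide <- data.frame(
  Strain        = c("YAL001C", "YAL002W"),
  Sample_A_UP   = c(10, 0),
  Sample_A_DOWN = c(12, 3),
  Sample_B_UP   = c(7, 5),
  Sample_B_DOWN = c(9, 4),
  check.names = FALSE
)

# Pick out the count columns by their naming pattern.
pattern    <- "^Sample_(.+)_(UP|DOWN)$"
count_cols <- grep(pattern, names(wide), value = TRUE)

# Stack the count columns: one row per barcode x sample x tag.
long <- do.call(rbind, lapply(count_cols, function(col) {
  data.frame(
    Strain = wide$Strain,
    Sample = sub(pattern, "\\1", col),  # text between "Sample_" and the tag
    Tag    = sub(pattern, "\\2", col),  # "UP" or "DOWN"
    Counts = wide[[col]],
    stringsAsFactors = FALSE
  )
}))
```
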
`tidy_bar_none()` reads the BarNone output files listed in `sample_sheet`, reshapes counts to long format, joins sample metadata, looks up gene names from the bundled SGD lookup table, and returns a tidy tibble.

- `sample_sheet`: path to the sample sheet CSV
- `u_vector`: optional character vector of column names to combine into a unique sample ID (UID); defaults to `File_path` + `Sample`

`tidy_bar_tender()` works the same way as `tidy_bar_none()` but reads Bartender output files.
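
The idea behind `u_vector` can be sketched in base R: paste the chosen sample sheet columns together row by row to get one ID per sample. This is a hypothetical illustration, not the package's own code. In particular, the `_` separator and the `UID` column name used here are assumptions, and the sheet contents are placeholders.

```r
# Hypothetical sample sheet rows; values are placeholders.
sheet <- data.frame(
  File_path = c("run1.txt", "run1.txt", "run2.txt"),
  Sample    = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# Columns to combine into the ID (mirrors the role of u_vector).
u_vector <- c("File_path", "Sample")

# Paste the chosen columns row-wise; the "_" separator is an assumption.
sheet$UID <- do.call(paste, c(sheet[u_vector], sep = "_"))

# A usable ID should be unique per row.
stopifnot(!anyDuplicated(sheet$UID))
```

Neither `File_path` nor `Sample` is unique on its own in this toy sheet, which is why combining several columns is useful.
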
`summarize_bar()` produces a per-sample summary table from a tidy barcode tibble.

| Argument | Default | Description |
|---|---|---|
| `data` | none | Tidy tibble from `tidy_bar_*` |
| `sum_by` | `"SampleNum"` | Column to group by |
| `barcode_by` | `"Strain"` | Column containing barcode IDs |
| `count_by` | `"Counts"` | Column containing read counts |
| `tag_by` | `"Tag"` | Column containing UP/DOWN tag labels (set to `NULL` to skip) |
| `tags` | `c("UP","DOWN")` | Names of the two tag directions |

Returns a tibble with columns including `LibrarySize`, `NumSamples`, and Shannon diversity (`Shannon`).
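
For intuition about the library size and Shannon diversity columns, the following base R sketch computes both per sample on a toy table. It is a hand-rolled illustration under stated assumptions, not the implementation in `summarize_bar()`, which relies on vegan for the diversity calculation. The counts are invented, and the natural-log form of the index, H = -Σ p·ln(p), is assumed here.

```r
# Toy long-format counts; column names follow the summarize_bar() defaults,
# values are invented.
tidy <- data.frame(
  SampleNum = c(1, 1, 1, 2, 2, 2),
  Strain    = c("s1", "s2", "s3", "s1", "s2", "s3"),
  Counts    = c(50, 50, 0, 90, 5, 5)
)

# Shannon index with natural log; zero counts contribute nothing.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Total reads and diversity for each sample.
library_size <- tapply(tidy$Counts, tidy$SampleNum, sum)
shannon_h    <- tapply(tidy$Counts, tidy$SampleNum, shannon)
```

In this toy data the first sample splits its reads evenly between two strains, while the second is dominated by one strain and therefore has the lower diversity.
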
`summarize_barnon_plot()` generates four QC plots and saves them to disk:
- Library size per sample (bar chart, log scale, colored by tag)
- Strain frequency histogram across all samples
- UP vs DOWN tag scatter plots per sample
- Pairwise sample correlation matrix

| Argument | Default | Description |
|---|---|---|
| `df` | none | Tidy tibble from `tidy_bar_none()` |
| `datadir` | `getwd()` | Output directory for plots |
| `format` | `".pdf"` | File extension / format for saved plots |

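
The numbers behind a pairwise sample correlation plot can be sketched in base R. This is an illustration only and not the plotting code in bartools. The counts are invented, the column names are illustrative, and the log10 transform with a pseudocount of 1 is an assumption.

```r
# Toy long-format counts (invented values; column names are illustrative).
tidy <- data.frame(
  Strain = rep(c("s1", "s2", "s3", "s4"), times = 3),
  Sample = rep(c("A", "B", "C"), each = 4),
  Counts = c(10, 20, 30, 40,  12, 18, 33, 41,  40, 30, 20, 10)
)

# Strain x Sample count matrix.
mat <- tapply(tidy$Counts, list(tidy$Strain, tidy$Sample), sum)

# Pearson correlation between samples on log-transformed counts;
# the pseudocount of 1 is an assumption to keep zeros finite.
cor_mat <- cor(log10(mat + 1))
```

Samples A and B have similar strain profiles here and correlate strongly, while C is the reverse of A and correlates negatively with it.
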
The package ships with data/gene_lookup_table.csv, a mapping of S. cerevisiae systematic names (e.g. YAL001C) to common gene names downloaded from the Saccharomyces Genome Database (SGD). This table is joined automatically by tidy_bar_none() to annotate barcodes with gene names.
To regenerate the table from SGD:

```r
source("R/get_sgd.R")
```

Dependencies:

| Package | Use |
|---|---|
| tidyverse | data import, reshaping, and plotting |
| vegan | Shannon diversity calculation |
| GGally | pairwise correlation plots |
| gridExtra | multi-panel plot layout |

License: MIT