Enhance plotting functions and documentation #8
Updated documentation for plotPRC, plotROC, plotCM, plotDensity, and plotTopFeatsVI functions. Added new functions for plotting ROC curves, confusion matrices, and density of predicted class probabilities.
epbrenner left a comment:
These work, but I'd strongly suggest that the plotting functions read the TSV files themselves rather than requiring the caller to import the data first.
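One way to read this suggestion, as a minimal base R sketch: a small helper accepts either a path to a TSV file or an already-loaded data frame, so the plotting code never requires a separate import step. The helper `read_tsv_or_df()`, the wrapper `confusion_counts()`, and the column names `truth` and `prediction` are hypothetical and are not taken from the PR.

```r
# Hypothetical helper (not part of the PR): accept either a path to a TSV
# file or an already-loaded data frame, and always return a data frame.
read_tsv_or_df <- function(x) {
  if (is.data.frame(x)) {
    return(x)
  }
  if (is.character(x) && length(x) == 1 && file.exists(x)) {
    # read.delim() uses tab as the default separator and expects a header row
    return(utils::read.delim(x, stringsAsFactors = FALSE))
  }
  stop("`x` must be a data frame or a path to an existing TSV file.")
}

# Hypothetical wrapper: tabulate a confusion matrix from the assumed
# columns `truth` and `prediction`; a plotting function could start the same way.
confusion_counts <- function(test_data_plus_predictions_file) {
  preds <- read_tsv_or_df(test_data_plus_predictions_file)
  table(truth = preds$truth, prediction = preds$prediction)
}

# Usage with a temporary TSV file (toy data, for illustration only)
tmp <- tempfile(fileext = ".tsv")
write.table(
  data.frame(truth = c("R", "S", "R"), prediction = c("R", "S", "S")),
  tmp, sep = "\t", row.names = FALSE, quote = FALSE
)
cm <- confusion_counts(tmp)
print(cm)
```

Accepting both input types keeps existing callers that already hold a data frame working while letting new callers pass a file path directly.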
Add functions to plot drug phenotype distribution, model performance, and cross-drug generalization heatmaps.
I have modified the function to take a TSV file as input.
@AbhirupaGhosh @jananiravi @amcim
    #' @return A heatmap (`ggplot2` object) showing the confusion matrix.
    #'
    #' @export
    plotCM <- function(test_data_plus_predictions_file) {
plotConfusion will seem odd?
    #' @param shuffled_label_results Output of `runMLPipeline(shuffle_labels = TRUE)`
    #'
    #' @importFrom graphics barplot
    #' @return A base R barplot comparing balanced accuracy across models.

    beside = TRUE,
    legend.text = TRUE, col = c("skyblue", "lightpink"),
    ylab = "Balanced accuracy", xlab = "Antibiotic"
    ylab = "Balanced Accuracy"
Suggested change:
-   ylab = "Balanced Accuracy"
+   ylab = "Balanced accuracy"
    "#0F2A5A" # very dark for ~1
    ),
    values = scales::rescale(c(-1, 0, 0.85, 1)),
    name = "Best MCC"

    feat_pal <- c(
      "args" = "#56B4E9", # sky blue
      "cogs" = "#E69F00", # orange
      "genes" = "#009E73", # bluish green
      "domains" = "#F0E442", # yellow
      "proteins" = "#CC79A7", # reddish purple
      "struct" = "#D55E00" # vermillion
    )
This palette is not defined separately in a function, so is it the same set of colors across plots/panels?
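To illustrate the question, here is a minimal base R sketch of defining the palette once and looking colors up by feature type, so every plot or panel draws from the same mapping. The color values are copied from the hunk above; the lookup function `feat_colors()` is hypothetical and not part of the PR.

```r
# Palette copied from the diff hunk above; defined once at the top level.
feat_pal <- c(
  "args"     = "#56B4E9", # sky blue
  "cogs"     = "#E69F00", # orange
  "genes"    = "#009E73", # bluish green
  "domains"  = "#F0E442", # yellow
  "proteins" = "#CC79A7", # reddish purple
  "struct"   = "#D55E00"  # vermillion
)

# Hypothetical lookup helper: return the colors for the requested feature
# types, failing loudly on a type that has no assigned color.
feat_colors <- function(feature_types) {
  unknown <- setdiff(feature_types, names(feat_pal))
  if (length(unknown) > 0) {
    stop("No color defined for: ", paste(unknown, collapse = ", "))
  }
  unname(feat_pal[feature_types])
}

# Two different panels that show different subsets still agree on colors.
panel_a <- feat_colors(c("genes", "args"))
panel_b <- feat_colors(c("args", "struct"))
print(panel_a)
print(panel_b)
```

In a `ggplot2` plot, passing the named vector to `scale_fill_manual(values = feat_pal)` or `scale_color_manual(values = feat_pal)` has a similar effect, because named values are matched to the factor levels by name.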
    final_plot
    }

    #' Plot cross-drug generalization heatmap
unless I'm missing something
Suggested change:
-   #' Plot cross-drug generalization heatmap
+   #' Plot cross-drug prediction as a heatmap
    cross_drug <- arrow::read_parquet(file.path(cross_test_performance_path, "cross_drug_perf.parquet"))
    performance <- arrow::read_parquet(file.path(drug_performance_path, "all_performance.parquet"))

    ###################### CROSS DRUG Testing #############################
Suggested change:
-   ###################### CROSS DRUG Testing #############################
    tibble::column_to_rownames("drug_or_class") |>
    as.matrix()

    row_order <- heatmap_df |>
Are the drugs ordered by their classes?
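For reference, ordering heatmap rows and columns by drug class could look like the base R sketch below. The lookup table `drug_classes`, the toy matrix `heatmap_mat`, and its values are assumptions made for illustration; the PR's actual `row_order` logic is not visible in this hunk.

```r
# Illustrative drug-to-class lookup (assumed; not taken from the PR data).
drug_classes <- data.frame(
  drug = c("levofloxacin", "amikacin", "ciprofloxacin", "gentamicin"),
  drug_class = c("fluoroquinolone", "aminoglycoside",
                 "fluoroquinolone", "aminoglycoside"),
  stringsAsFactors = FALSE
)

# Toy square matrix standing in for the cross-drug performance matrix.
heatmap_mat <- matrix(
  seq_len(16), nrow = 4,
  dimnames = list(drug_classes$drug, drug_classes$drug)
)

# Sort by class first, then alphabetically by drug within each class.
row_order <- drug_classes$drug[order(drug_classes$drug_class, drug_classes$drug)]

# Apply the same order to rows and columns so class blocks sit on the diagonal.
heatmap_ordered <- heatmap_mat[row_order, row_order, drop = FALSE]
print(rownames(heatmap_ordered))
```

Ordering this way groups drugs of the same class into adjacent rows and columns, which makes within-class versus cross-class performance easier to compare at a glance.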
    ggplot2::labs(
      title = if (year_or_country == "year") {
        "Temporal performance by drug"
      } else {
        "Geographical performance by drug"
Suggested change:
-   ggplot2::labs(
-     title = if (year_or_country == "year") {
-       "Temporal performance by drug"
-     } else {
-       "Geographical performance by drug"
+   ggplot2::labs(
+     title = if (year_or_country == "year") {
+       "Performance of AMR drug models with temporal holdouts"
+     } else {
+       "Performance of AMR drug models with geographical holdouts"
    #' )
    plotTopClusters <- function(top_feat_path = ".", cluster_feature_path = ".",
                                protein_names_path = ".", top_n = 10) {
      ################### Top features #########################
Suggested change:
-   ################### Top features #########################