Add LOO and cross-drug test matrix generation #19
Added functions to build leave-one-out and cross-drug test matrices from parquet files. Updated the main matrix generation function to include calls to these new functions.
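As a rough illustration of the two split types named above, here is a minimal sketch in Python. It is not the code in this PR: the function names and drug codes are hypothetical, and it assumes that "leave-one-out" holds out one drug at a time and that "cross-drug" means training on one drug and testing on a different one.

```python
from itertools import permutations


def loo_splits(drugs):
    """Return one (training_drugs, held_out_drug) pair per drug."""
    unique = sorted(set(drugs))
    return [([d for d in unique if d != held_out], held_out) for held_out in unique]


def cross_drug_splits(drugs):
    """Return every ordered (train_drug, test_drug) pair of distinct drugs."""
    return list(permutations(sorted(set(drugs)), 2))


# Hypothetical drug codes, used only for illustration.
drugs = ["AMK", "CIP", "GEN"]
loo = loo_splits(drugs)
cross = cross_drug_splits(drugs)
```

Under these assumptions, n drugs give n LOO splits and n * (n - 1) cross-drug pairs, which is one way to sanity-check how many test matrices a run should produce.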
jananiravi left a comment
Because of the GitHub Actions (GHA) styler updates, I missed the main changes in this PR, but it looks good based on a quick look.
I tested the new functions and they look good: they produce the correct number of cross-drug parquets and LOO drug parquets, and there is no genome leakage between the training and held-out drugs.
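For reference, a leakage check of this kind could look roughly like the sketch below. It is written in Python rather than the package's own code, the `genome_id` and `drug` field names and the sample rows are assumptions, and it takes "no genome leakage" to mean that the held-out drug never appears in training and that no genome is shared between the training and test rows.

```python
def check_loo_split(train_rows, test_rows, held_out_drug):
    """Return a list of problems found in one LOO split (empty if clean)."""
    problems = []
    if any(row["drug"] == held_out_drug for row in train_rows):
        problems.append("held-out drug appears in training rows")
    if any(row["drug"] != held_out_drug for row in test_rows):
        problems.append("test rows contain a drug other than the held-out one")
    # Genomes present on both sides of the split count as leakage here.
    shared = {r["genome_id"] for r in train_rows} & {r["genome_id"] for r in test_rows}
    if shared:
        problems.append(f"{len(shared)} genome(s) appear in both sets")
    return problems


# Hypothetical rows standing in for the contents of the parquet files.
train = [{"genome_id": "g1", "drug": "AMK"}, {"genome_id": "g2", "drug": "CIP"}]
clean_test = [{"genome_id": "g3", "drug": "GEN"}]
leaky_test = [{"genome_id": "g1", "drug": "GEN"}]

clean_report = check_loo_split(train, clean_test, "GEN")
leaky_report = check_loo_split(train, leaky_test, "GEN")
```

Returning a list of problems instead of raising on the first one makes it easier to report everything wrong with a split in a single pass.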
I fixed multiple issues with CI and R CMD check. The notable one was a failing query; I believe a conflict was introduced when main was originally merged in. The column reference should be genome_drug.genome_id. The failed CI run that points out this error can be found in the previous Actions runs, and the fix was simply to correct this query.
Now that main will be back to passing CI, reviewing the other open PRs here will be much more straightforward (particularly PR #8).
One note on the filename parsing: if the filenames change to remove the bug name from the prefix (such as in JRaviLab/amRviz#36), then the parsing will need to be updated.
If CI passes, I approve the merge.