Scripts for RATTACA LD Pruning Testing from Palmer Lab @ UCSD

mika-okamoto/RATTACA_LD_Pruning

Scripts for RATTACA LD Pruning Testing

Benjamin B. Johnson, Thiago M. Sanches, Mika H. Okamoto, Khai-Min Nguyen, Clara A. Ortez, Oksana Polesskaya, Abraham A. Palmer

Main Notebook: plot_correlations.ipynb

This repository contains R and Python scripts that generate phenotype prediction performance data for a set of experiments and visualize the results.

The experiments compare phenotype prediction performance across genome subsampling methods and settings, including:

  • LD pruning parameters - $r^2$ and window size
  • Number of random SNPs
  • Number of training rats
  • Random SNP selection vs. LD pruning
  • LD clumping
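
As a rough illustration of the first and fourth items above, the sketch below shows how an $r^2$ threshold and a window size interact in a greedy LD pruning pass, plus a random SNP subset for comparison. It is a simplified variant written for this README, not the code used in the experiments; the function names `ld_prune` and `random_snp_subset`, the default parameter values, and the choice to measure the window in SNP indices rather than base pairs are all assumptions.

```python
import numpy as np

def ld_prune(genotypes, r2_threshold=0.5, window=50):
    """Greedy windowed LD pruning (illustrative, hypothetical helper).

    genotypes: (n_animals, n_snps) array of 0/1/2 allele counts.
    Returns the column indices of the SNPs that are retained.
    """
    centered = genotypes - genotypes.mean(axis=0)
    std = centered.std(axis=0)
    kept = []
    for j in range(genotypes.shape[1]):
        if std[j] == 0:
            continue  # monomorphic SNP carries no information
        redundant = False
        # Compare only against retained SNPs fewer than `window` positions back.
        for k in reversed(kept):
            if j - k >= window:
                break
            r = np.mean(centered[:, j] * centered[:, k]) / (std[j] * std[k])
            if r * r > r2_threshold:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return kept

def random_snp_subset(n_snps, n_keep, seed=0):
    """Pick n_keep SNP indices uniformly at random, without replacement."""
    rng = np.random.default_rng(seed)
    return sorted(rng.choice(n_snps, size=n_keep, replace=False).tolist())
```

A lower `r2_threshold` or a larger `window` removes more SNPs. Drawing a random subset of the same size as the pruned set gives a like-for-like baseline for the random vs. LD pruning comparison.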

The notebooks plot:

  • Prediction performance distributions
  • Runtimes

Main folder: main prediction pipeline and correlation plots

  • full_pred_pipeline.r: general prediction pipeline
  • plot_correlations.ipynb: correlation plots from the experiments
  • plot_runtimes_py.ipynb: runtime plots from the experiments, showing the computational cost of each method
  • convex_hull.ipynb: tests of rat breeding algorithms that aim to maximize the genetic diversity of offspring

Subfolders:

  • /experiments/code: pipelines that generate performance data for each genome subsampling method
  • /experiments/plots: notebooks that plot the results of each genome subsampling experiment
  • /pyrrBLUP: class for running rrBLUP in Python, used for scikit-learn comparisons
  • /experimental: scripts for testing new ideas (not public)
  • /old: intermediate scripts and old ideas that were refined in other scripts (not public)
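
For readers unfamiliar with rrBLUP, the sketch below shows the underlying idea: marker effects are estimated by ridge regression, and prediction performance is measured as the correlation between predicted and observed phenotypes in held-out animals. This is not the pyrrBLUP class or its API. The function names, the fixed shrinkage value `lam` (rrBLUP estimates the variance ratio from the data instead), and the simulated genotypes are all assumptions made for illustration.

```python
import numpy as np

def ridge_marker_effects(Z, y, lam):
    """Ridge estimate of marker effects (hypothetical helper).

    Z: (n_animals, n_snps) genotype matrix, y: (n_animals,) phenotypes,
    lam: shrinkage parameter, fixed here rather than estimated.
    """
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    n = Z.shape[0]
    # Dual form: solve an n x n system, cheaper when SNPs outnumber animals.
    alpha = np.linalg.solve(Zc @ Zc.T + lam * np.eye(n), y - mu)
    return mu, col_means, Zc.T @ alpha

def predict_phenotypes(Z_new, mu, col_means, u):
    """Predict phenotypes for new animals from estimated marker effects."""
    return mu + (Z_new - col_means) @ u

# Simulated data only, to show the train/test workflow.
rng = np.random.default_rng(0)
Z = rng.integers(0, 3, size=(200, 500)).astype(float)
true_effects = rng.normal(scale=0.1, size=500)
y = Z @ true_effects + rng.normal(size=200)

train, test = np.arange(150), np.arange(150, 200)
mu, col_means, u = ridge_marker_effects(Z[train], y[train], lam=100.0)
predicted = predict_phenotypes(Z[test], mu, col_means, u)
performance = np.corrcoef(predicted, y[test])[0, 1]
print(f"prediction performance (Pearson r): {performance:.3f}")
```

Repeating the split with different training sizes or SNP subsets, and collecting the resulting correlations, yields the kind of performance distribution that the plotting notebooks summarize.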
