Flaky Tests Question #8

Open
jordan-gillard opened this issue Sep 28, 2023 · 1 comment

Comments

@jordan-gillard

Tests such as test_conditional_prob_inf_given_vl_dist are flaky because they are non-deterministic. The Monte Carlo simulations used in the baseline_exposure_model fixture and in test_conditional_prob_inf_given_vl_dist are probabilistic, so their results vary from run to run and sometimes fail the assertion. For modelling this is great - but for unit tests it's not 😅 It's a nice feeling to have a 🟢 CI.

All you'd have to do to fix this test is set the random seed, like np.random.seed(42). However, a fixed seed is misleading if the goal (as I suspect) is to ensure the model's accuracy within a certain tolerance, because the test would then only ever check one particular sequence of random draws.
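A minimal sketch of the seeding option, to show what it does and does not buy. The function `estimate_probability` is a hypothetical stand-in for the real model, not code from this project; only `np.random.seed` comes from the suggestion above.

```python
import numpy as np


def estimate_probability(n_samples):
    """Hypothetical stand-in for a Monte Carlo model.

    Estimates the probability that a uniform draw falls below 0.3.
    """
    draws = np.random.uniform(size=n_samples)
    return float((draws < 0.3).mean())


# Seeding the global NumPy state before each call makes the draws,
# and therefore the estimate, identical from run to run.
np.random.seed(42)
first = estimate_probability(10_000)

np.random.seed(42)
second = estimate_probability(10_000)

# The test is now deterministic, but it only exercises this one
# sequence of draws; it says nothing about other seeds.
print(first == second)
```

This makes CI repeatable, but a pass only tells you the model is within tolerance for seed 42.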

If the goal is to measure the model's accuracy, what if you removed the 3x retry and instead ran the model 10 times, gathering the results each time? Then you could do a statistical analysis against the mean/median/standard deviation/percentiles etc.
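Roughly what the repeated-run idea could look like. This is a sketch under assumptions: `run_model` is a hypothetical placeholder with a known true value of 0.3, and the seed is only there so the example itself is repeatable.

```python
import numpy as np


def run_model(rng, n_samples=50_000):
    """Hypothetical stand-in for one Monte Carlo run (true value 0.3)."""
    return float((rng.uniform(size=n_samples) < 0.3).mean())


def summarize_runs(n_runs=10, seed=0):
    """Run the model n_runs times and collect summary statistics."""
    rng = np.random.default_rng(seed)
    results = np.array([run_model(rng) for _ in range(n_runs)])
    return {
        "mean": float(results.mean()),
        "median": float(np.median(results)),
        "std": float(results.std(ddof=1)),  # sample standard deviation
        "p5": float(np.percentile(results, 5)),
        "p95": float(np.percentile(results, 95)),
    }


summary = summarize_runs()
print(summary)
```

A test could then assert on the mean of the runs, or on a percentile band, instead of on a single noisy result.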

Another alternative is to change the fixed 0.002 tolerance. What if you calculated the tolerance based on the number of runs? Could an absolute tolerance of 0.002 be wishful thinking?
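One way the tolerance could depend on the number of runs is to base it on the standard error of the mean, which shrinks as 1/sqrt(n). The sketch below assumes that approach; the helper names, the z=3 multiplier, and the example numbers are all made up for illustration.

```python
import math

import numpy as np


def tolerance_from_runs(results, z=3.0):
    """Tolerance as z standard errors of the mean of the runs."""
    results = np.asarray(results, dtype=float)
    standard_error = results.std(ddof=1) / math.sqrt(results.size)
    return z * standard_error


def within_tolerance(results, expected, z=3.0):
    """Check whether the mean of the runs is within z standard errors."""
    mean = float(np.mean(results))
    return abs(mean - expected) <= tolerance_from_runs(results, z)


# Made-up results from four hypothetical runs, and a made-up expected value.
example_results = [0.298, 0.300, 0.302, 0.300]
tolerance = tolerance_from_runs(example_results)
passed = within_tolerance(example_results, expected=0.3)
print(tolerance, passed)
```

With this kind of rule, more runs give a tighter tolerance automatically, and a fixed value such as 0.002 can be compared against what the run-to-run spread actually supports.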

It could also help to log model deviations and store them with a timestamp. Then you can monitor deviations over time.
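A small sketch of the logging idea, assuming a plain CSV file is good enough as storage. The function name, the column names, the test name, and the values are hypothetical; a temporary directory is used so the example leaves nothing behind.

```python
import csv
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_deviation(path, test_name, expected, observed):
    """Append one timestamped deviation record to a CSV file."""
    path = Path(path)
    is_new_file = not path.exists()
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "test": test_name,
        "expected": expected,
        "observed": observed,
        "deviation": observed - expected,
    }
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new_file:
            writer.writeheader()  # header only once, on first write
        writer.writerow(row)
    return row


with tempfile.TemporaryDirectory() as tmp_dir:
    log_file = Path(tmp_dir) / "deviations.csv"
    # Made-up values for two hypothetical test runs.
    log_deviation(log_file, "example_test", expected=0.3, observed=0.3015)
    log_deviation(log_file, "example_test", expected=0.3, observed=0.2991)
    logged_lines = log_file.read_text().splitlines()

print(logged_lines[0])
```

In CI the file would need to live somewhere persistent (an artifact or a small database) for the history to be useful.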

I'm happy to fix this test (and get that CI ✅). It's a cool project. Just let me know what the expected behavior is & how I can help out.

@jordan-gillard
Author

😱 y'all are on GitLab? Which repo should I create PRs for?
