Benchmarks for compute resources #41

Open
awgymer opened this issue Nov 13, 2024 · 1 comment

Comments

awgymer commented Nov 13, 2024

Hi, I have looked through the documentation but I can't find any speed benchmarks or recommended compute resources for achieving a given throughput.

We run jobs through a scheduler that requires explicit resource requests, so I am wondering whether you can shed any light on what you would consider sensible defaults for:

  • --threads argument
  • cpus/cores to give a job
  • memory (overall or per-thread) to give a job
  • what sort of runtime you might expect for a typical sample with these settings

aquaskyline (Member) commented

Line 680 of the ClairS preprint gives some figures for running ClairS on a whole genome. If you are distributing ClairS jobs across multiple nodes by setting intervals, you will need to adjust --chunk_size accordingly. Say you set --threads 32 for each 5 Mbp interval on a single compute node: the best chunk_size is then 5 Mbp / 32 * 4 = 625,000 bp, where the constant 4 is there because ClairS uses 4 threads for each chunk.
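
For anyone scripting this per interval, the arithmetic above can be sketched in Python. This is only an illustration of the formula in the comment, not code from ClairS: the function name `recommended_chunk_size` is hypothetical, the 4-threads-per-chunk constant is taken from the comment above, and rounding down to a whole number of base pairs is an assumption.

```python
# Sketch of the chunk size rule described above:
#   chunk_size = interval_length / threads * threads_per_chunk
# The helper name and the integer rounding are assumptions for illustration.

THREADS_PER_CHUNK = 4  # per the comment above: ClairS uses 4 threads per chunk


def recommended_chunk_size(interval_bp: int, threads: int,
                           threads_per_chunk: int = THREADS_PER_CHUNK) -> int:
    """Return a chunk size (in bp) that keeps all threads busy on one interval."""
    if interval_bp <= 0:
        raise ValueError("interval_bp must be positive")
    if threads < threads_per_chunk:
        raise ValueError("threads must be at least threads_per_chunk")
    # Multiply before dividing so integer division only rounds once.
    return interval_bp * threads_per_chunk // threads


if __name__ == "__main__":
    # The example from the comment: a 5 Mbp interval with 32 threads.
    print(recommended_chunk_size(5_000_000, 32))
```

The resulting number would then be passed as the --chunk_size value for jobs that each cover one interval of that length with that thread count.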
