CP013: Create a motivational example (P1795) #129
Comments
Do we plan on being able to look up cache sizes at different levels of the memory hierarchy? We could use Strassen for a simple example, and have it use the lowest-level cache size to decide when to stop recursing.
I would like to have a property that reflects the various cache levels and their sizes. I'm not entirely sure how best to represent those in a generic way yet, perhaps through a hierarchy of managed memory resources which provide constructive/destructive interference. Yeah, I like that idea, it would be a good example of using the topology information. So we would recursively divide the matrices into blocks until they fit into the lowest-level cache and then compute one block at a time, per group of threads sharing that cache. It would be interesting to then generalize this further so that larger matrices could be subdivided across NUMA regions as well.
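To make the recursion described above concrete, here is a rough C++ sketch. It is only an illustration under stated assumptions: `cache_info` is a hypothetical stand-in for whatever cache-size property the topology query ends up exposing (P1795 does not define it), the recursion is plain block subdivision rather than Strassen, and the per-thread-group and NUMA aspects are left out entirely.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a topology query result; not part of P1795.
struct cache_info {
    std::size_t size_bytes;  // capacity of the cache we want blocks to fit in
};

// True when three n x n blocks of double (A, B and C) fit in the cache together.
bool fits_in_cache(std::size_t n, const cache_info& cache) {
    return 3 * n * n * sizeof(double) <= cache.size_bytes;
}

// Base case: C += A * B on n x n blocks of row-major matrices with
// leading dimension ld, using a plain triple loop.
void multiply_base(const double* a, const double* b, double* c,
                   std::size_t n, std::size_t ld) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * ld + j] += a[i * ld + k] * b[k * ld + j];
}

// Recursive case: split each operand into four quadrants until the blocks
// fit in the cache (or can no longer be halved evenly), then fall back to
// the base case. Computes C += A * B.
void multiply_recursive(const double* a, const double* b, double* c,
                        std::size_t n, std::size_t ld,
                        const cache_info& cache) {
    assert(n > 0 && n <= ld);
    if (n % 2 != 0 || fits_in_cache(n, cache)) {
        multiply_base(a, b, c, n, ld);
        return;
    }
    const std::size_t h = n / 2;
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j)
            for (std::size_t k = 0; k < 2; ++k)
                multiply_recursive(a + (i * h) * ld + k * h,
                                   b + (k * h) * ld + j * h,
                                   c + (i * h) * ld + j * h,
                                   h, ld, cache);
}
```

A caller would zero-initialise `C`, fill in `cache_info` from whatever the system reports, and call `multiply_recursive(A, B, C, n, n, cache)`. In a fuller example the eight sub-products are where work could be handed to the group of threads sharing a cache.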
I am working on a pseudo-generic algorithm for this, incorporating the various pieces of architecture-agnostic information that we will need to be able to query. This is a summary of what I have so far:
From last heterogeneous C++ call:
In our last discussion, we decided that we should create a motivational example of how a developer could use the topology discovery design proposed in P1795 to optimise an algorithm such as matrix multiply based on different system architectures.