HAShCache is a shared last-level die-stacked DRAM cache for an integrated heterogeneous CPU-GPU processor.
HAShCache can adapt dynamically to address the inherent disparity of demands in integrated CPU-GPU processors (heterogeneity-aware).
HAShCache proposes an intelligent DRAM cache controller which employs:
- an intelligent DRAM request scheduler (PrIS)
- a selective temporal bypass scheme (ByE)
- a cache-line occupancy controlling mechanism (Chaining)
More details are in the TACO '17 paper and at http://adar.sh/hashcache-acm-taco
This repo adds the following to gem5-gpu:
- 3D stacked DRAM as hardware managed cache
  adds logic to use an HMC-like stacked DRAM as a hardware managed, memory side DRAM cache (gem5/src/mem/DRAMCacheCtrl.py, gem5/src/mem/dramcache_ctrl.cc, gem5/src/mem/dramcache_ctrl.hh)
- Ability to differentiate between memory requests from CPU and GPU
  adds metadata to Ruby's Abstract Controller to differentiate between requests originating from CPU and GPU by modifying the request packet (gem/src/mem/ruby/slicc_interface/AbstractController.cc)
- DRAM Cache optimization
  implements the PrIS, ByE and Chaining mechanisms (gem5/src/mem/dramcache_ctrl.cc)
- Concurrently executing CPU-GPU workload mixes
  scripts to run the workload mixes from the paper (regression/runme.py)
- Stability fixes
  several fixes throughout the code to run the heterogeneous SPEC+Rodinia workload
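
The idea behind tagging each request with its origin can be illustrated with a small toy model. This is only a sketch and is not code from this repo: the names `Origin`, `Request` and `OriginAwareQueue` are hypothetical, and the strict CPU-first policy is an assumption chosen for simplicity, not the actual PrIS algorithm (see the paper and gem5/src/mem/dramcache_ctrl.cc for that).

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """Hypothetical tag standing in for the CPU/GPU metadata on a request packet."""
    CPU = "cpu"
    GPU = "gpu"


@dataclass(frozen=True)
class Request:
    addr: int
    origin: Origin


class OriginAwareQueue:
    """Toy request queue that keeps one FIFO per origin."""

    def __init__(self):
        self._queues = {Origin.CPU: deque(), Origin.GPU: deque()}

    def push(self, req):
        # The origin tag decides which FIFO the request joins.
        self._queues[req.origin].append(req)

    def pop(self):
        # Assumed policy for illustration only: serve CPU requests first,
        # FIFO order within each origin. Returns None when both are empty.
        for origin in (Origin.CPU, Origin.GPU):
            if self._queues[origin]:
                return self._queues[origin].popleft()
        return None


if __name__ == "__main__":
    q = OriginAwareQueue()
    q.push(Request(0x100, Origin.GPU))
    q.push(Request(0x200, Origin.CPU))
    q.push(Request(0x300, Origin.GPU))
    while (req := q.pop()) is not None:
        print(hex(req.addr), req.origin.value)
```

Once requests carry an origin, a controller can apply different policies per origin; a real scheduler would also need to account for DRAM timing, row-buffer state and starvation of the lower-priority class, which this sketch ignores.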
If you are using HAShCache for your work, please cite:
@article{hashcache-taco17,
author = {Patil, Adarsh and Govindarajan, Ramaswamy},
title = {HAShCache: Heterogeneity-Aware Shared DRAMCache for Integrated Heterogeneous Systems},
year = {2017},
issue_date = {December 2017},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {14},
number = {4},
issn = {1544-3566},
url = {https://doi.org/10.1145/3158641},
doi = {10.1145/3158641},
journal = {ACM Trans. Archit. Code Optim.},
month = {dec},
articleno = {51},
numpages = {26},
keywords = {3D-stacked memory, Integrated CPU-GPU processors, cache sharing, DRAM cache}
}
The original gem5-gpu simulator:
- Merges gem5 and gpgpu-sim simulators
- Uses the gem5 memory hierarchy to unify CPU/GPU memory
- Adds heterogeneous coherence protocols