Research

Modeling and Simulation

The evolution to many core architectures has left a void in accurate systems simulation that cannot be filled by traditional point tools such as standalone microarchitecture simulators and interconnection network simulators. We need integrated (application, OS, microarchitecture) modeling and simulation environments that span a range of fidelity and scale. The first project is Manifold, a multi-faculty effort developing an open source infrastructure for workload-driven, full-system parallel simulation of many core architectures. Manifold includes i) KitFox, an open source multi-physics library of public domain tools for integrated energy/power, reliability, cooling, and thermal modeling, and ii) QSim, a QEMU-based multicore front-end that drives timing models, with current support for x86-64 and ARM64. The second project targets coarse-grained, application-level modeling of time and energy and includes several tools: Eiger (automated model construction), lwperf (model construction API), and eAudit (energy profiling of applications).
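To make the idea of automated, coarse-grained model construction concrete, the sketch below fits a simple analytical model of execution time as a function of problem size from profiled samples. This is only an illustration of the general approach, not Eiger's actual API or model forms; the linear model t(n) = a·n + b and the profile data are assumptions chosen for clarity.

```python
# Illustrative sketch of application-level model construction in the spirit
# of Eiger/lwperf: fit an analytical execution-time model from profile data.
# The model form t(n) = a*n + b and the sample points are assumptions.

def fit_linear_model(samples):
    """Ordinary least-squares fit of t = a*n + b over (n, t) pairs."""
    k = len(samples)
    sx = sum(n for n, _ in samples)
    sy = sum(t for _, t in samples)
    sxx = sum(n * n for n, _ in samples)
    sxy = sum(n * t for n, t in samples)
    a = (k * sxy - sx * sy) / (k * sxx - sx * sx)
    b = (sy - a * sx) / k
    return a, b

# Hypothetical (problem size, measured seconds) profile points.
profile = [(1000, 0.12), (2000, 0.21), (4000, 0.41), (8000, 0.79)]
a, b = fit_linear_model(profile)

def predict(n):
    """Predicted execution time for an unprofiled problem size."""
    return a * n + b
```

In practice such models are built over many application parameters and hardware counters rather than a single size variable, but the workflow is the same: profile, fit, then predict time or energy for unmeasured configurations.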

Manifold more…

Eiger and related tools more…

Heterogeneous Computing

Current trends have led to the development of chip-scale and rack-scale heterogeneous many core platforms: large scale systems in which homogeneous general purpose cores are intermingled with customized heterogeneous cores and coupled to diverse memory and cache hierarchies. Our past efforts focused on software execution environments for these new-generation architectures at the chip and system level. Our current efforts are driven by the observation that emerging applications (e.g., query processing and graph analytics) exhibit irregular control flow and memory access patterns across massive data sets. Parallelism in such applications is dynamic, fine-grained, time-varying, and nested. We therefore focus on dynamic parallelism, exploring new execution mechanisms and microarchitecture support for efficiently harnessing the compute and memory bandwidth of GPUs for these applications.
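The character of this workload class can be seen in a level-synchronous graph traversal, sketched below in plain Python: the amount of parallel work available changes at every step, and each vertex spawns a nested, data-dependent amount of child work. On a GPU, mechanisms such as DTBL aim to launch child thread blocks sized to this dynamically discovered work rather than leaving threads idle; the graph here is an illustrative assumption, not an actual benchmark.

```python
# Conceptual sketch of the irregular, time-varying parallelism in graph
# analytics. Each frontier expansion produces a variable amount of new,
# nested work; the frontier size at each level is the parallelism available
# to the hardware at that step. The example graph is illustrative.

def bfs_frontier_sizes(adj, source):
    """Level-synchronous BFS; returns the frontier size at each level."""
    visited = {source}
    frontier = [source]
    sizes = []
    while frontier:
        sizes.append(len(frontier))
        nxt = []
        for u in frontier:                # each u spawns len(adj[u]) work items
            for v in adj.get(u, ()):
                if v not in visited:
                    visited.add(v)
                    nxt.append(v)
        frontier = nxt
    return sizes

# Small illustrative graph: available parallelism grows, then collapses.
adj = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [7], 4: [7], 5: [], 6: [], 7: []}
levels = bfs_frontier_sizes(adj, 0)       # e.g., [1, 2, 4, 1]
```

A static launch configuration must be sized for the largest frontier and wastes resources on the others; dynamically launching work per level (or per vertex) is what device-side mechanisms for dynamic parallelism make efficient.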

Dynamic Thread Block Launch (DTBL) for supporting dynamic parallelism more…

Red Fox: Compilation of queries over massive data sets to GPUs more…

Previous Projects: GPUOcelot, Oncilla, Lynx

Power and Thermal Management

As industry moves to increasingly small feature sizes, performance scaling will become dominated by the physics of the computing environment. There are fundamental trade-offs to be made at the architectural level between performance, energy/power, and reliability, based on understanding, characterizing, and managing the multi-physics, multi-scale (nanoseconds to milliseconds) transient interactions between the delivery, dissipation, and removal (cooling) of energy, and their impact on system-level performance. Much of this analysis is carried out in the context of integrating computation into 3D memory stacks.  more…
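The simplest instance of such a transient interaction is a single lumped thermal RC node, where stored heat (capacitance) and conduction to ambient (resistance) set the time scale over which dissipated power becomes temperature. The sketch below steps C·dT/dt = P − (T − T_amb)/R with forward Euler; the parameter values are illustrative assumptions, not measurements of a real 3D stack.

```python
# Minimal sketch of transient thermal analysis: one lumped thermal RC node,
#   C * dT/dt = P - (T - T_amb) / R,
# integrated with forward Euler. All parameter values are illustrative.

def simulate_temperature(power_trace, r_th, c_th, t_amb, dt):
    """Return the node temperature after each power sample."""
    temps = []
    t = t_amb
    for p in power_trace:
        dTdt = (p - (t - t_amb) / r_th) / c_th
        t += dTdt * dt
        temps.append(t)
    return temps

# A step in dissipated power: temperature rises toward T_amb + P * R_th
# with time constant R_th * C_th (here 50 ms, hence the multi-scale concern).
trace = [20.0] * 5000                      # watts, sampled every 1 ms
temps = simulate_temperature(trace, r_th=0.5, c_th=0.1, t_amb=45.0, dt=0.001)
```

Real analyses (e.g., in KitFox-class tools) couple many such nodes across the die, stack layers, and heat sink, but the same transient behavior is what management policies must anticipate: power changes in nanoseconds while temperature responds over milliseconds.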

Processing Near Memory

This research thrust addresses the challenges facing the design of processing-near-memory (PNM) architectures. Research in the 1990s explored the idea of processing-in-memory (PIM), but the continued evolution of Moore's Law and the architectural advances of the day precluded the need for such architectures. The modern notion of PNM, however, is driven by a compelling confluence of technology and application trends under which the cost ($), execution time, and energy of applications are increasingly dominated by the memory system. Consequently, there is a pressing need for a pronounced shift from traditional processor-centric system architectures to memory-centric system architectures.
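The memory-system argument can be made with back-of-the-envelope roofline arithmetic: for low arithmetic-intensity applications, attainable performance is capped by memory bandwidth, not by compute, so moving even a modest processor next to high in-stack bandwidth wins. The machine numbers below are illustrative assumptions, not measurements of any particular system.

```python
# Roofline arithmetic illustrating why memory-bound applications motivate
# memory-centric (PNM) designs. All machine parameters are illustrative.

def attainable_gflops(peak_gflops, bw_gb_s, intensity_flop_per_byte):
    """Roofline model: performance is capped by compute or by bandwidth."""
    return min(peak_gflops, bw_gb_s * intensity_flop_per_byte)

# A low-intensity kernel (0.25 flop/byte, stream-like) on:
#  - a big host processor behind off-chip DRAM, and
#  - a smaller near-memory processor with in-stack bandwidth.
host = attainable_gflops(peak_gflops=1000.0, bw_gb_s=100.0,
                         intensity_flop_per_byte=0.25)    # bandwidth-bound
pnm = attainable_gflops(peak_gflops=200.0, bw_gb_s=640.0,
                        intensity_flop_per_byte=0.25)     # still bandwidth-bound
```

Under these assumed numbers the host is limited to 25 GFLOP/s by its memory channels while the weaker near-memory processor reaches 160 GFLOP/s, which is the essence of the processor-centric to memory-centric shift.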

Cymric: Prototyping Near Memory Architectures  more…

Previous Projects: Network and Memory Systems more…