Publications

PDFs are provided for personal use only and are subject to the copyright of their respective publishers.

2017

  • K. Rao, J. Wang, S. Yalamanchili, Y. Wardi and H. Ye, “Application-Specific Performance-Aware Energy Optimization on Android Mobile Devices,” The 2017 International Symposium on High-Performance Computer Architecture (HPCA-23). 2017. paper

2016

  • E. Anger, J. Wilke, and S. Yalamanchili, “Power-Constrained Performance Scheduling of Data Parallel Tasks,” in Energy Efficient Supercomputing Workshop (E2SC), 2016. paper
  • Y. Wardi, C. Seatzu, X. Chen, and S. Yalamanchili, “Performance Regulation of Event-Driven Dynamical Systems using Infinitesimal Perturbation Analysis”, Nonlinear Analysis: Hybrid Systems, vol. 22, November 2016, pp. 116-136.
  • M. Hassan and S. Yalamanchili, “Understanding the Impact of Air and Microfluidic Cooling on the Performance of 3D Stacked Memory Systems,” International Symposium on Memory Systems, October 2016.
  • J. Wang, N. Rubin, A. Sidelnik and S. Yalamanchili. “LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs.” The 43rd International Symposium on Computer Architecture (ISCA). June 2016. paper
  • D. Kim, J. Kung, S. Chai, S. Yalamanchili, and S. Mukhopadhyay, “Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory,” IEEE/ACM International Symposium on Computer Architecture (ISCA), June 2016.
  • S. Li, V. Sridharan, S. Gurumurthi, and S. Yalamanchili, “Software-based Dynamic Reliability Management for GPU Applications,” IEEE International Reliability Physics Symposium, April 2016.
  • S. Yalamanchili, “New Rules: Sustaining Performance Scaling in a Physical World,” Department of Electrical Engineering, University of Southern California, April 2016. Slides
  • S. Yalamanchili, “When is Energy Not Equal To Time: Understanding Energy Scaling with eAudit,” SIAM Conference on Parallel Processing for Scientific Computing, Paris, France, April 2016. Slides
  • W. J. Song, S. M. Hassan, S. Mukhopadhyay, and S. Yalamanchili, “Reliability-Performance Tradeoff between 2.5D and 3D-Stacked DRAM Processors,” IEEE International Reliability Physics Symposium, (short paper/poster). April 2016.
  • D. Zinn, H. Wu, J. Wang, M. Aref and S. Yalamanchili. “General-Purpose Join Algorithms for Large Graph Triangle Listing on Heterogeneous Systems.” Proceedings of 9th Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-9). March 2016. paper
  • W. J. Song, S. Mukhopadhyay and S. Yalamanchili. “Amdahl’s Law for Lifetime Reliability Scaling in Heterogeneous Multicore Processors.” The 2016 International Symposium on High-Performance Computer Architecture (HPCA-22). March 2016. paper
  • J. Wang, Z. Dong, G. Riley, and S. Yalamanchili, “FNM: An Enhanced Null-Message Algorithm for the Parallel Simulation of Multicore Systems,” ACM Transactions on Modeling and Simulation, vol. 26, no. 2, January 2016.

2015
  • X. Chen, H. Xiao, Y. Wardi, and S. Yalamanchili. “Throughput Regulation in Shared Memory Multicore Processors.” 2015 IEEE International Conference on High Performance Computing. December 2015. paper
  • W. J. Song, S. Mukhopadhyay and S. Yalamanchili. “KitFox: Multi-Physics Libraries for Integrated Power, Thermal, and Reliability Simulations of Multicore Microarchitecture.” IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 5, no. 11. November 2015. paper
  • H. Xiao, W. Yueh, S. Mukhopadhyay and S. Yalamanchili, “Thermally Adaptive Cache Access Mechanisms for 3D Many-core Architectures”, Computer Architecture Letters, October 2015. paper
  • C. Kersey, H. Kim, and S. Yalamanchili, “SIMT-Based Logic Layer for Stacked DRAM Architectures: A Prototype,” International Symposium on Memory Systems, October 2015.
  • S. M. Hassan, S. Yalamanchili and S. Mukhopadhyay. “Near Data Processing: Impact and Optimization of 3D Memory System Architecture on the Uncore.” 2015 International Symposium on Memory Systems (MEMSYS 2015). October 2015. paper
  • H. Kim, H. Kim, S. Yalamanchili, and A. Rodrigues, “Understanding the Energy Aspects of Processing Near Memory Workloads,” International Symposium on Memory Systems, October 2015.
  • S. Yalamanchili, “Implications of Memory Centric Computing Architectures for NoCs,” Keynote, IEEE/ACM International Symposium on Network on Chip Architectures (NOCS), September 2015. slides
  • K. Rao, W. Song, S. Yalamanchili, and Y. Wardi, “Temperature Regulation in Multicore Processors using Adjustable-Gain Integral Controllers,” IEEE Multi-Conference on Systems and Control, September 2015. paper
  • H. Xiao, W. Yueh, S. Mukhopadhyay and S. Yalamanchili, “Short-Stack: Pushing Back the Pin Bandwidth Wall with FinFET-based eDRAM In-Package Last Level Cache”, SRC TECHCON, September 2015. paper
  • E. Anger, S. Yalamanchili, D. Dechev, G. Hendry and J. Wilke. “Application Modeling for Scalable Simulation of Massively Parallel Systems.” 2015 IEEE International Conference on High Performance Computing and Communications (HPCC 2015). August 2015. paper
  • H. Xiao, W. Yueh, S. Mukhopadhyay and S. Yalamanchili, “Multi-physics Driven Co-design of 3D Multicore Architectures.” ASME 2015 International Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Microsystems collocated with the ASME 2015 13th International Conference on Nanochannels, Microchannels, and Minichannels (InterPACK & ICNMM 2015). July 2015. paper
  • Y. Joshi, B. Barabadi, R. Ghosh, Z. Wan, H. Xiao, S. Yalamanchili, and S. Kumar, “Thermal Simulations in support of Multi-Scale Co-Design of Energy Efficient Information Technology Systems,” International Journal of Numerical Methods for Heat and Fluid Flow, vol. 25, no. 6, pp. 1385-1403, 2015.
  • C. Kersey, H. Kim and S. Yalamanchili, “Cymric: A Framework for Prototyping Near-Memory Architectures,” Sixth Workshop on Architectural Research Prototyping (WARP), held with ISCA. June 2015.
  • I. Paul, W. N. Huang, M. Arora and S. Yalamanchili. “Harmonia: Balancing Compute and Memory Power in High Performance GPUs.” The 42nd International Symposium on Computer Architecture (ISCA). June 2015.
  • J. Wang, N. Rubin, A. Sidelnik and S. Yalamanchili. “Dynamic Thread Block Launch: A Lightweight Execution Mechanism to Support Irregular Applications on GPUs.” The 42nd International Symposium on Computer Architecture (ISCA). June 2015. paper
  • J. Kung, W. Yueh, S. Yalamanchili, and S. Mukhopadhyay, “Post-silicon Estimation of Spatiotemporal Temperature Variations Using MIMO Thermal Filters,” IEEE Transactions on Components, Packaging and Manufacturing Technology, May 2015.
  • X. Chen, H. Xiao, W. Song, Y. Wardi, and S. Yalamanchili, “Performance and Power Regulation in Multicore Processors”, Poster session in Power Delivery for Electronics Systems (PDES) Workshop, May 2015. poster
  • W. J. Song, S. Mukhopadhyay and S. Yalamanchili. “Managing Performance-Reliability Tradeoffs in Multicore Processors.” The 2015 IEEE International Reliability Physics Symposium (IRPS 2015). April 2015 (Best Student Paper). paper
  • S. Li, V. Sridharan, S. Gurumurthi and S. Yalamanchili. “Software-based Dynamic Reliability Management for GPU Applications.” IEEE Workshop in Silicon Errors in Logic – System Effects (SELSE 2015). March 2015. paper
  • I. Saeed, J. Young, and S. Yalamanchili. “A portable benchmark suite for highly parallel data intensive query processing.” Proceedings of the 2nd Workshop on Parallel Programming for Analytics Applications. In conjunction with PPoPP 2015. pg. 31-38. February 2015. paper
  • S. Yalamanchili, “Relational Processing Accelerators: From Clouds to Memory Systems,” Intel India, January 2015. slides

2014

  • E. Anger, S. Yalamanchili, S. Pakin and P. McCormick. “Architecture-Independent Modeling of Intra-Node Data Movement.” The LLVM Compiler Infrastructure in HPC Workshop (in conjunction with Supercomputing 2014). November 2014. paper
  • K. J. Barker, D. J. Kerbyson and E. Anger. “On the Feasibility of Dynamic Power Steering.” 2014 Second Workshop on Energy Efficient Supercomputing (E2SC). November 2014. paper
  • J. Wang and S. Yalamanchili. “Characterization and Analysis of Dynamic Parallelism in Unstructured GPU Applications.” 2014 IEEE International Symposium on Workload Characterization (IISWC). October 2014. (Best paper nominee) paper
  • S. Hassan and S. Yalamanchili. “Bubble Sharing: Area and Energy Efficient Adaptive Routers using Centralized Buffers.” 2014 International Symposium on Networks-on-Chip (NOCS). September 2014. paper
  • W. Song, S. Mukhopadhyay, and S. Yalamanchili, “Lifetime Reliability Characterization and Management of Many-Core Processors,” SRC TECHCON, September 2014. (Best in Session Award)
  • B. Alexandrov, O. Sullivan, W. Song, S. Yalamanchili, S. Kumar and S. Mukhopadhyay, “Control Principles and On-Chip Circuits for Active Cooling using Integrated Super Lattice Based Thin Film Thermoelectric Devices,” IEEE Transactions on VLSI, vol. 22, no. 9, September 2014. paper
  • I. Saeed, H. Wu, S. Yalamanchili, “A Portable Relational Algebra Library for High-Performance Data-Intensive Query Processing”, CERCS Industry Advisory Board meeting, Georgia Tech, 2014. Poster
  • H. Wu, D. Zinn, M. Aref, and S. Yalamanchili. “Multipredicate Join Algorithms for Accelerating Relational Graph Processing on GPUs.” The 5th International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS). September 2014. paper
  • W. J. Song, S. Mukhopadhyay, A. Rodrigues and S. Yalamanchili. “Energy Introspector: Standard Physical Library Interface for Full-System Microarchitecture and Multi-Physics Simulations.” The 2014 Workshop on Modeling & Simulation of Systems and Applications (ModSim 2014). August 2014. paper
  • H. Kim, E. Anger, P. Gera, J. J. Wilke, P. S. McCormick and S. Yalamanchili. “Integrated, Application-Level, Performance‐Energy Modeling for Heterogeneous Architectures.” Workshop on Modeling & Simulation of Systems and Applications (MODSIM 2014). August 2014. paper
  • W. Song, S. Mukhopadhyay and S. Yalamanchili. “Architectural Reliability: Lifetime Reliability Characterization and Management of Many-Core Processors.” Computer Architecture Letters, vol. 14, no. 2. July 2014. paper
  • J. Lim, N. Lakshminarayana, H. Kim, W. Song, S. Yalamanchili, S. Manne, “Power Modeling of GPU Architecture Using McPAT”, ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 19, no. 3, June 2014. paper
  • C. Kersey, S. Yalamanchili, H. Kim, N. Nigania, H. Kim. “Harmonica: An FPGA-based Data Parallel Soft Core.” The 22nd IEEE International Symposium on Field-Programmable Custom Computing Machines (poster), May 2014. poster
  • H. Xiao, Z. Min, S. Yalamanchili and Y. Joshi, “Leakage Power Characterization and Minimization over 3D Stacked Multi-core Chip with Microfluidic Cooling,” 30th Semiconductor Thermal Measurement & Management Symposium (SEMI-THERM), March 2014. paper
  • S. Li, V. Sridharan, S. Gurumurthi, and S. Yalamanchili, “Position Paper: Software Techniques for Reducing the Vulnerability of GPU Applications,” Workshop on Dependable GPU Computing (at DATE), March 2014. paper
  • Z. Dong, J. Wang, G. Riley, and S. Yalamanchili, “An Efficient Front-End for Timing-Directed Parallel Simulation of Multi-Core System,” The 7th IEEE/ICST International Conference on Simulation Tools and Techniques, March 2014. paper
  • H. Wu, G. Diamos, T. Sheard, M. Aref, and S. Yalamanchili, “Red Fox: An Execution Environment for Relational Queries Processing on GPUs,” GPU Technology Conference, March 2014.
  • J. Wang, J. Beu, R. Bheda, T. Conte, Z. Dong, C. Kersey, M. Rasquinha, G. Riley, W. Song, H. Xiao, P. Xu, and S. Yalamanchili, “Manifold: A Parallel Simulation Framework for Multicore Systems”, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2014. paper
  • W. Song, S. Mukhopadhyay, and S. Yalamanchili, “Energy Introspector: A Parallel Composable Framework for Integrated Power-Reliability-Thermal Modeling for Multicore Architectures (short paper)”, 2014 IEEE International Symposium on Performance Evaluation of Systems and Software (ISPASS), March 2014. paper
  • N. Farooqui, K. Schwan, and S. Yalamanchili, “Efficient Instrumentation of GPGPU Programs using Information Flow Analysis and Symbolic Execution,” Proceedings of Seventh Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-7), March 2014. paper
  • J. Wang, N. Rubin and S. Yalamanchili. “ParallelJS: An Execution Framework for JavaScript on Heterogeneous Systems.” Seventh Workshop on General Purpose Processing Using GPUs (GPGPU-7). March 2014. paper
  • H. Wu, G. Diamos, T. Sheard, M. Aref, S. Baxter, M. Garland, and S. Yalamanchili. “Red Fox: An Execution Environment for Relational Query Processing on GPUs.” International Symposium on Code Generation and Optimization (CGO). February 2014. paper
  • S. Yalamanchili, “Constructing High-Level Application Models for Exascale Co-Design Simulations,” 16th SIAM Conference on Parallel Processing for Scientific Computing, February 2014. slides
  • M. Cho, K. Z. Ahmed, W. Song, S. Yalamanchili, and S. Mukhopadhyay, “Post-Silicon Characterization and On-Line Prediction of Transient Thermal Field in Integrated Circuits Using Thermal System Identification,” IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 4, no. 1. January 2014.

2013
  • J. Lee, S. Li, H. Kim, and S. Yalamanchili, “Design Space Exploration of On-chip Ring Interconnection for a CPU-GPU Architecture”, Journal of Parallel and Distributed Computing (JPDC), vol. 73, no. 12, December 2013.
  • I. Paul, V. Ravi, S. Manne, M. Arora, and S. Yalamanchili, “Coordinated Energy Management in Heterogeneous Processors,” IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC) (Best Paper Finalist). November 2013. paper
  • I. Saeed, S. Shon, H. Wu, J. Young, S. Yalamanchili, “Exploration of Data Warehousing and Graph Applications with GPUs”, 3rd Intel Science and Technology Center for Cloud Computing (ISTC-CC). CMU 2013. Poster
  • J. Kung, M. Cho, S. Yalamanchili, and S. Mukhopadhyay, “On-Line Real-Time Power Estimation of an IC using Time Domain Thermal Filters,” IEEE Conference on Electrical Performance of Electronic Packaging and Systems, October 2013.
  • J. Lee, S. Li, H. Kim, and S. Yalamanchili, “Adaptive Virtual Channel Partitioning for On-chip-Network in Heterogeneous Architectures”, ACM Transactions on Design Automation of Electronic Systems, October 2013.
  • Z. Wan, H. Xiao, Y. Joshi, and S. Yalamanchili, “Co-design of Multicore Architectures and Microfluidic Cooling for 3D Stacked ICs,” 19th IEEE International Workshop on Thermal Investigations of ICs and Systems (THERMINIC), September 2013. paper
  • J. Kung, M. Cho, S. Yalamanchili and S. Mukhopadhyay, “A Test Driven Methodology for On-Line Real Time Temperature Prediction and Power Estimation,” 2013 IEEE International Test Conference, September 2013. paper
  • W. Song, S. Mukhopadhyay, and S. Yalamanchili, “Lifetime Reliability and Accelerated Execution,” SRC TECHCON, September 2013. (Best in Session Award)
  • J. Young, S. Shon, S. Yalamanchili, A. Merritt, K. Schwan, H. Fröning, “Oncilla: A GAS Runtime for Efficient Resource Allocation and Data Movement in Accelerated Clusters.” IEEE International Conference on Cluster Computing (Cluster). September 2013. paper
  • B. Alexandrov, O. Sullivan, W. Song, S. Yalamanchili, S. Kumar, and S. Mukhopadhyay, “Control Principles and On-Chip Circuits for Active Cooling Using Integrated Superlattice-Based Thin-Film Thermoelectric Devices,” IEEE Transactions on Very Large Scale Integrated Systems (VLSI), September 2013.
  • Z. Dong, J. Wang, G. Riley, and S. Yalamanchili. “A Study of the Effect of Partitioning on Parallel Simulation of Multicore Systems.” IEEE 21st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS’13). August 2013. paper
  • Sudhakar Yalamanchili. “New Rules: Managing Processor Physics to Sustain Reliable Performance Scaling.” Third Workshop on Energy-Secure System Architectures (ESSA), held in conjunction with ISCA-2013. June 2013. slides
  • I. Paul, S. Manne, M. Arora, W. L. Bircher and S. Yalamanchili, “Cooperative Boosting: Needy Versus Greedy Power Management.” IEEE/ACM International Symposium on Computer Architecture (ISCA-2013). June 2013. paper
  • J. Wang, Z. Dong, S. Yalamanchili, and G. Riley, “Optimizing Parallel Simulation of Multicore Systems Using Domain-Specific Knowledge.” 2013 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (PADS), May 2013. paper
  • S. Yalamanchili, “Scaling Data Warehousing Applications using GPUs.” Second International Workshop on Performance Analysis of Workload Optimized Systems (FastPath-2013), held with ISPASS-2013. April 2013. slides
  • S. Hassan and S. Yalamanchili, “Centralized Buffer Router: A Low Latency, Low Power Router for High Radix NoCs.” IEEE/ACM International Symposium on Networks on Chip (NOCS-2013). April 2013. paper
  • S. Yalamanchili, “New Rules: Sustaining Performance Through Extreme Scale.” Special Session on Emerging Interconnects Technologies at IEEE/ACM International Symposium on Networks on Chip (NOCS-2013). April 2013. slides
  • J. Wang, N. Rubin, H. Wu and S. Yalamanchili. “Accelerating Simulations of Agent Based Models on Heterogeneous Architectures.” Sixth Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU6). March 2013. paper
  • G. Diamos, H. Wu, J. Wang, A. Lele and S. Yalamanchili. “Efficient Relational Algebra Algorithms and Data Structures for Hierarchical Parallel Processors (short paper).” Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP-2013). February 2013. paper
  • S. Li, N. Farooqui and S. Yalamanchili. “Software Reliability Enhancements for GPU Applications.” Sixth Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG-2013), held with HiPEAC’13. January 2013. paper
  • S. Yalamanchili. “Architectural Alternatives for Energy Efficient Performance Scaling.” 26th IEEE International Conference on VLSI Design, Special Session on Low Power Computing – Reducing the gap between the Physical and Practical Limits. January 2013. slides
  • M. Cho, C. Kersey, M. Gupta, N. Sathe, S. Kumar, S. Yalamanchili, and S. Mukhopadhyay, “Power Multiplexing for Thermal Field Management in Many Core Processors,” IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 3, no. 1. January 2013. (2013 Best Paper, Components: Characterization & Modeling Category)
  • J. S. Vetter, R. Glassbrook, K. Schwan, S. Yalamanchili, M. Horton, A. Gavrilovska, M. Slawinska, J. Dongarra, J. Meredith, P. C. Roth, K. Spafford, S. Tomov, and J. Wynkoop. “Keeneland: Computational Science using Heterogeneous GPU Computing.” in Contemporary High Performance Computing: From Petascale to Exascale, CRC Computational Science Series, 2013.

2012
  • N. Almoosa, W. Song, S. Yalamanchili, and Y. Wardi. “Throughput Regulation in Multicore Processors via IPA.” IEEE Annual Conference on Decision and Control. December 2012. paper
  • A. Kerr, E. Anger, G. Hendry, and S. Yalamanchili. “Eiger: A framework for the automated synthesis of statistical performance models.” 1st Workshop on Performance Engineering and Applications (WPEA), held with HiPC. December 2012. paper
  • H. Wu, G. Diamos, S. Cadambi, and S. Yalamanchili. “Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation.” The 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 45). December 2012. paper source code
  • J. Young, A. Merritt, H. Wu, S. Yalamanchili. “Oncilla – A GAS Run-time for Efficient Resource Partitioning in Data Centers (Poster).” 2nd Annual Intel Science and Technology Center for Cloud Computing (ISTC-CC) Retreat. December 2012. poster
  • J. Young, H. Wu, S. Yalamanchili. “Satisfying Data-Intensive Queries Using GPU Clusters.” 2nd Annual Workshop on High-Performance Computing meets Databases (HPCDB), held with SC12. November 2012. paper and slides
  • J. Wang, J. Beu, S. Yalamanchili, and T. Conte. “Designing Configurable, Modifiable and Reusable Components for Simulation of Multicore Systems.” 3rd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12), held with SC12, November 2012. paper
  • I. Paul, S. Yalamanchili, and L. John. “Performance Impact of Virtual Machine Placement in a Datacenter.” Proceedings of the 31st IEEE International Performance Computing and Communication Conference. October 2012.
  • S. Yalamanchili. “Keynote: Scalable Resource Composition in a Flat World.” First International Workshop on Unconventional Cluster Architectures and Applications (UCAA), held with 41st International Conference on Parallel Processing (ICPP-2012). September 2012. slides
  • N. Almoosa, W. J Song, Y. Wardi, and S. Yalamanchili. “A Power Capping Controller for Multicore Processors.” IEEE American Control Conference. June 2012. paper
  • J. Young, S. Yalamanchili. “Commodity Converged Fabrics for Global Address Spaces in Accelerator Clouds.” The 14th IEEE Conference on High Performance Computing and Communications (HPCC). June 2012. paper and slides
  • H. Wu, G. Diamos, A. Lele, J. Wang, S. Cadambi, S. Yalamanchili, and S. Chakradhar. “Optimizing Data Warehousing Applications for GPUs using Kernel Fusion/Fission.” Workshop on Multicore and GPU Programming Models, Languages and Compilers (PLC), held with IPDPS12. May 2012. paper
  • N. Farooqui, A. Kerr, G. Eisenhauer, K. Schwan, S. Yalamanchili. “Lynx: A Dynamic Instrumentation System for Data-Parallel Applications on GPGPU Architectures.” IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). April 2012. paper
  • A. Kerr, G. Diamos, S. Yalamanchili. “Dynamic Compilation of Data-Parallel Kernels for Vector Processors.” Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO). April 2012. paper
  • W. Song, S. Yalamanchili, S. Mukhopadhyay and A. F. Rodrigues. “Instruction-Based Energy Estimation Methodology for Asymmetric Manycore Processor Simulations.” IEEE/ICST International Conference on Simulation Tools and Techniques. March 2012. paper
  • M. Cho, W. Song, S. Yalamanchili, and S. Mukhopadhyay. “Thermal System Identification (TSI): A Methodology for Post-silicon Characterization and Prediction of the Transient Thermal Field in Multicore Chips.” IEEE Symposium on Thermal Measurement, Modeling, and Management. March 2012. paper
  • A. Rodrigues, K. Bergman, D. Bunde, E. Cooper-Balis, K. Ferreira, S. Hemmert, B. Barrett, R. Hendry, B. Jacob, H. Kim, V. Leung, M. Levenhagen, M. Rasquinha, R. Riesen, P. Rosenfeld, M. Varela, S. Yalamanchili and C. Versaggi. “Improvements to the Structural Simulation Toolkit.” IEEE/ICST International Conference on Simulation Tools and Techniques. March 2012.
  • H. Wu, G. Diamos, J. Wang, S. Li, and S. Yalamanchili. “Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications.” International Journal of High Performance Computing Applications (JHPCA). February 2012. paper
  • C. Kersey, A. Rodrigues, and S. Yalamanchili. “A Universal Parallel Front-End for Execution-Driven Microarchitecture Simulation.” HIPEAC Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools. January 2012. paper and slides

2011
  • S. Yalamanchili. “Switching Techniques. (invited)” Encyclopedia of Parallel Computing. 2011.
  • S. Yalamanchili. “Interconnection Networks. (invited)” Encyclopedia of Parallel Computing. 2011.
  • G. Diamos, B. Ashbaugh, S. Maiyuran, A. Kerr, H. Wu, Sudhakar Yalamanchili. “SIMD Re-Convergence At Thread Frontiers.” 44th International Symposium on Microarchitecture (MICRO 44). December 2011. paper
  • S. M. Hassan, D. Choudhary, M. Rasquinha, and S. Yalamanchili. “Regulating Locality vs Parallelism Tradeoffs in Multiple Memory Controller Environments. (short paper, poster)” IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques. October 2011.

  • J.S. Vetter, R. Glassbrook, J. Dongarra, K. Schwan, B. Loftis, S. McNally, J. Meredith, J. Rogers, P. Roth, K. Spafford, and S. Yalamanchili. “Keeneland: Bringing heterogeneous GPU computing to the computational science community.” IEEE Computing in Science and Engineering, vol.13. September-October 2011. paper
  • A. Kerr, G. Diamos, and S. Yalamanchili. “GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot.” GPU Computing GEMS Jade Edition, 1st Edition. September 2011. book
  • H. Wu, G. Diamos, S. Li, and S. Yalamanchili. “Characterization and Transformation of Unstructured Control Flow in GPU Applications.” First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems (CACHES), held with ICS’2011. June 2011. paper
  • S. Chatterjee, M. Rasquinha, S. Yalamanchili, and S. Mukhopadhyay. “A Scalable Design Methodology for Energy Minimization of STTRAM: A Circuit and Architecture Perspective.” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19. May 2011.
  • N. Farooqui, A. Kerr, G. Diamos, S. Yalamanchili, and K. Schwan. “A Framework for Dynamically Instrumenting GPU Compute Applications within GPU Ocelot.” Fourth Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-4). March 2011. paper
  • S. Chatterjee, S. Mukhopadhyay, M. Rasquinha, S. Yalamanchili, S. Bhania. “Energy Efficient Circuit-System Co-Design for Spin Torque Transfer Random Access Memory (STTRAM) in Submicron Technologies.” Proceedings of the Second Annual Non-Volatile Memories Workshop. March 2011.
  • J. Young, S. Yalamanchili, B. Holden, M. Cavalli, P. Miranda. “HyperTransport Over Ethernet – A Scalable, Commodity Standard for Resource Sharing in the Data Center.” Second International Workshop on HyperTransport Research and Applications (WHTRA). February 2011. paper

2010
  • M. Rasquinha, K. Chae, M. Cho, M. Hassan, W. Song, S. Mukhopadhyay, S. Yalamanchili. “Exploiting the Long-term Advantages of 3D Integration: A System-Driven Approach (Poster).” Interconnect Focus Center Annual Review. October 2010.
  • G. Diamos, A. Kerr, and S. Yalamanchili. “Ocelot: A Dynamic Execution Environment for Heterogeneous Architectures.” GPU Technology Conference Tutorial. September 2010.
  • G. Diamos, A. Kerr, S. Yalamanchili, and N. Clark. “Ocelot: A Dynamic Compiler for Bulk-Synchronous Applications in Heterogeneous Systems.” The Nineteenth International Conference on Parallel Architectures and Compilation Techniques (PACT2010). September 2010. paper
  • J. Young and S. Yalamanchili. “Dynamic Partitioned Global Address Spaces for Power Efficient DRAM Virtualization.” Workshop on Work in Progress in Green Computing at the First International Green Computing Conference. August 2010. paper
  • M. Rasquinha, D. Choudhary, S. Chatterjee, S. Mukhopadhyay, S. Yalamanchili. “An Energy Efficient Cache Design Using Spin Torque Transfer (STT) RAM.” The International Symposium on Low Power Electronics and Design (ISLPED). August 2010.
  • S. Yalamanchili. “Architectural Level Modeling and Simulation of Many Core Architectures.” Georgia Tech Summer School on Cyber Physical Systems. June 2010.
  • N. Almoosa, Y. Wardi, and S. Yalamanchili. “Controller Design for Tracking Induced Miss Rate in Cache Memories.” Proceedings of the IEEE International Conference on Control and Automation. June 2010.
  • S. Yalamanchili. “Predictable System Design: Control Theory Meets Computer Architecture.” Shanghai Jiao Tong University Workshop. June 2010.
  • G. Diamos and S. Yalamanchili. “Speculative Execution On Multi-GPU Systems.” IEEE International Parallel & Distributed Processing Symposium (IPDPS 2010). April 2010. paper
  • A. Kerr, G. Diamos, and S. Yalamanchili. “Modeling GPU-CPU Workloads and Systems.” Third Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-3). March 2010. paper
  • M. Cho, N. Sathe, M. Gupta, S. Kumar, S. Yalamanchili, and S. Mukhopadhyay. “Proactive Power Migration to Reduce Maximum Value and Spatiotemporal Non-uniformity of On-chip Temperature Distribution in Homogeneous Many-Core Processors.” The 26th IEEE Annual Thermal Measurement, Modeling and Management Symposium (SEMI-THERM). February 2010.

2009
  • S. Padalikar and G. Diamos. “Exploring The Latency and Bandwidth Tolerance of CUDA Applications.” NFinTes Tech Report. December 2009. paper
  • S. Chatterjee, M. Rasquinha, S. Yalamanchili, and S. Mukhopadhyay. “A Methodology for Robust, Energy Efficient Design of Spin-Torque-Transfer RAM Arrays at Scaled Technologies.” The International Conference on Computer-Aided Design (ICCAD’09). November 2009.
  • A. Kerr, G. Diamos, and S. Yalamanchili. “A Characterization and Analysis of PTX Kernels.” IEEE International Symposium on Workload Characterization (IISWC). October 2009.
  • S. Yalamanchili and J. Young. “System Impact of Integrated Interconnects. (Invited)” High Performance Communications: A Vertical Approach, (A. Gavrilovska, ed.), CRC Press, Taylor & Francis Group. September 2009.
  • S. Yalamanchili. “Manifold: Modeling and Simulation of Many Core Architectures.” First Workshop on High Performance Computing Architectural Simulation (HPCAS), DoE Institute for Advanced Architectures and Algorithms. September 2009.
  • D. Lewis, S. Yalamanchili, and Hsien Hsin Lee. “High Performance Non blocking Switch Design in 3D Die Stacking Technology.” IEEE Annual Symposium on VLSI. May 2009.
  • S. Yalamanchili. “Keynote: System Impact of Integrated Interconnects.” Second Symposium of the HyperTransport Center of Excellence, University of Heidelberg, Mannheim, Germany. February 2009.
  • J. Young, S. Yalamanchili, J. Duato, and F. Silla. “A HyperTransport-Enabled Global Memory Model For Improved Memory Efficiency.” First Workshop on HyperTransport Research and Applications. February 2009. paper
  • J. Duato, F. Silla, B. Holden, P. Miranda, J. Underhill, M. Cavalli, S. Yalamanchili and U. Bruning. “Extending HyperTransport Protocol for Improved Scalability.” First Workshop on HyperTransport Research and Applications. February 2009.

2008
  • S. Ramaswamy and S. Yalamanchili. “A Utilization Driven Framework for Energy Efficient Caches.” IEEE International Conference on High Performance Computing. December 2008.
  • G. Diamos and S. Yalamanchili. “Harmony: A Flexible Runtime for Heterogeneous Many Core Architectures.” ACM/IEEE International Symposium on High Performance Distributed Computing, Session on Hot Topics. June 2008.
  • K. Chuang, S. Yalamanchili, A. Gavrilovska, K. Schwan. “Sharestreams-V: A Virtualized, QoS Packet Scheduling Accelerator.” IEEE Symposium on Custom Computing Machines. April 2008.

2007
  • S. Ramaswamy and S. Yalamanchili. “Improving Cache Efficiency via Resizing + Remapping.” Proceedings of the IEEE International Conference on Computer Design. October 2007.
  • G. Diamos, S. Yalamanchili, J. Duato. “STARS: A System for Tuning and Actively Reconfiguring Links (Poster).” Workshop on Diagnostic Services in Network on Chips, held with Design and Test Europe. April 2007.
  • G. Diamos, A. Gavrilovska, V. Gupta, S. Kumar, H. Raj, K. Schwan, S. Yalamanchili. “Virtualizing Heterogeneous Many-core Platforms (Poster).” EuroSys 2007, Lisbon, Portugal. March 2007.
  • S. Ramaswamy and S. Yalamanchili. “Customized Placement for Embedded Processor Caches.” Proceedings of Architecture of Computing Systems. March 2007.

2006

  • S. Ramaswamy and S. Yalamanchili. “Customizable Fault Tolerant Caches for Embedded Processors.” Proceedings of the IEEE International Conference on Computer Design. October 2006.
  • B. Caminero, C. Carron, F. J. Quiles, J. Duato, S. Yalamanchili, “MMR: A Multi-Media Router Architecture to Support Multi-Media Workloads,” Journal of Parallel and Distributed Systems, vol. 66, no. 2, pp. 307-321. February 2006.

2005

  • B. Caminero, C. Carron, F. Quiles, J. Duato, and S. Yalamanchili. “Traffic Scheduling Solutions with QoS Support for an Input-Buffered MultiMedia Router.” IEEE Transactions on Parallel and Distributed Systems, pp. 1009-1021. November 2005.