STI Publications - View Publication Form #17021

Publication Information

Title

MemHC: An Optimized GPU Memory Management Framework for Accelerating Many-body Correlation

Abstract

The many-body correlation function is a fundamental computation kernel in modern physics computing applications, e.g., Hadron Contractions in Lattice quantum chromodynamics (QCD). This kernel is both computation and memory intensive, involving a series of tensor contractions, and thus usually runs on accelerators like GPUs. Existing optimizations on many-body correlation mainly focus on individual tensor contractions (e.g., cuBLAS libraries and others). In contrast, this work discovers a new optimization dimension for many-body correlation by exploring the optimization opportunities among tensor contractions. More specifically, it targets general GPU architectures (both NVIDIA and AMD) and optimizes many-body correlation's memory management by exploiting a set of memory allocation and communication redundancy elimination opportunities: first, GPU memory allocation redundancy: the intermediate output frequently occurs as input in the subsequent calculations; second, CPU-GPU communicatio

Author(s)

Qihan Wang, Zhen Peng, Bin Ren, Jie Chen, Robert Edwards

Publication Date

June 2022

Document Type

Journal Article

Primary Institution

Thomas Jefferson National Accelerator Facility, Newport News

Affiliation

Comp Sci&Tech (CST) Div / Scientific Computing / Scientific Computing

Funding Source

Nuclear Physics (NP)

Proprietary?

This publication conveys

Technical Science Results

Document Numbers

JLAB Number: JLAB-CST-22-3602
OSTI Number: 1867362
LANL Number:
Other Number: DOE/OR/23177-5487

Associated with an experiment

Associated with EIC

Supported by Jefferson Lab LDRD Funding

Journal Article

Journal Name	ACM Transactions on Architecture and Code Optimization
Refereed	Yes
Volume	19
Issue	2
Page(s)	24

Attachments/Datasets/DOI Link

Document(s)	3506705.pdf (STI Document) 3506705.pdf (Accepted Manuscript)
DOI Link	https://doi.org/10.1145/3506705
Dataset(s)	(none)
