CppCon 2021 has ended
Back To Schedule
Thursday, October 28 • 9:00am - 10:00am
A Persistent Hash Map for Graph Processing Workloads and a Methodology for Persistent Transactional Data Structures

Log in to save this to your schedule, view media, leave feedback and see who's attending!

Feedback form is now closed.
Emerging byte-addressable Non-Volatile Memories (NVMs) provide higher densities and data persistence where process state can be recovered after crashes. Graph analytic applications such as search and ranking, pattern matching, and clustering benefit from the large capacity available with persistent memory since the data sets are very large with up to trillions of vertices and edges. A hash map is fundamental for graph analytics because it associatively map keys to values, offering constant time lookup for graph updates. While lock-free persistent hash maps have been developed, most perform poorly due to a heavy use of pointer dereferences. Additionally, it is often desirable to perform a composition of graph updates as a transaction. Although persistent data structures enable applications to rely on persistent data with failure-atomic operations, they lack the ability to allow users to execute a sequence of operations as transactions. Meanwhile, persistent transactional memory (PTM) has been proposed by adding durability to Software Transactional Memory (STM). However, PTM suffers from high performance overheads and low scalability due to false aborts, logging, and ordering constraints on persistence.

In this talk, we present PMap, a persistent concurrent hash map with open addressing and non-blocking parallel resizing to cater to large graph processing workloads. Our PMap achieves lock-freedom using compare_exchange_strong from the C++11 Atomic Operations Library. To provide persistence, our design uses a modified version of the link-and-persist approach. While this approach is traditionally used to persist pointers, our modification enables link-and-persist to be used in-place, persisting keys and values directly. This is crucial, as logs, or separate object allocations may require pointer dereferences, which undermines the cache locality benefits of pure open addressing. The PMap uses the C++11 alignas specifier to ensure data can be aligned and flushed on a per-cacheline basis into persistent memory. To accommodate durable transactions, we present PETRA, a new approach for constructing persistent transactional non-blocking linked data structures. PETRA natively supports transactions, but unlike PTM, relies on the high-level information from the data structure semantics. This gives PETRA unique advantages in the form of high performance and high scalability. We present diagrams detailing the PMap design and PETRA methodology for the C++ programming language, and illustrate how to keep the number of cache line flushes and memory fences low through the use of descriptor objects, which are commonly used in the design of lock-free data structures.

We showcase the performance of our proposed persistent data structures by presenting a live demonstration of PMap and PETRA data structures including a linked list, skiplist, map, and multi-dimensional list. We compare PMap to clevel, OneFile, STL, and PMDK, tested on a micro-benchmark and on files from Recursive Model for graph mining (R-MAT). We compare the PETRA data structures to OneFile, Romulus, and PMDK, tested on a micro-benchmark and the Telecommunication Application Transaction Processing (TATP) benchmark. The live demonstration illustrates how PMap outperforms state-of-the-art alternatives averaging 3122x faster under Intel Optane DC, and PETRA outperforms the state-of-the-art PTMs by an order of magnitude in transactions of size greater than one, and demonstrates superior performance in transactions of size one.

ALL TALK SESSIONS CAN BE ACCESSED FROM THE MAIN LOBBY: https://cppcon.digital-medium.co.uk/

avatar for Kenneth Lamar

Kenneth Lamar

Graduate Research Assistant, University of Central Florida
Kenneth Lamar received his Bachelor of Science (2017) from the University of Central Florida. He is currently pursuing his PhD in Computer Science at UCF. His research interests include concurrent data structures and distributed computing. His work is currently focused on HPC scheduling... Read More →
avatar for Christina Peterson

Christina Peterson

Postdoctoral Associate, University of Central Florida
Dr. Christina Peterson is a Postdoctoral Associate at the University of Central Florida. Dr. Peterson completed her dissertation work under the supervision of Dr. Damian Dechev with a focus on correctness and progress guarantees of multiprocessor algorithms and data structures. Her... Read More →

Thursday October 28, 2021 9:00am - 10:00am MDT