Multi- and Manycore Systems Research
Introduction
Future CPU generations will no longer be able to increase their single-thread performance exponentially. Instead, CPUs will scale the number of processing cores. As a consequence, software will no longer run faster automatically with each hardware upgrade, but will have to be adapted to the higher level of parallelism exposed by the CPU. Existing parallelization techniques become more and more complex as the number of execution threads increases, which is why the software industry is looking for new, less complex parallel programming paradigms.
Transactional memory is a promising programming model that provides transactions (known from database technology), which relieve programmers of the burden of synchronizing concurrent data access. However, today's software implementations of transactional memory, known as Software Transactional Memory (STM), still incur too much overhead for synchronization and bookkeeping, making STMs impractical for the CPU counts to be expected in the near future. One way to reduce this overhead is to accelerate STMs with new hardware mechanisms.
Another promising programming paradigm is that of lock-free data structures. Many authors have shown that lock-free algorithms perform and scale well and are robust against deadlocks, but to date these algorithms have been limited by incomplete hardware support: lock-free programming relies on atomically modifying a set of memory locations using instructions like test-and-set and compare-and-swap. However, these instructions typically operate on only one or two words of memory and have a high latency, making lock-free programming impractical for more complex data structures or when low latency is required.
AMD's Operating System Research Center helps define and evaluate CPU-architecture extensions that make parallel programs faster as well as easier to write.
We work with the STM community to develop CPU extensions for speeding up STM systems, and we evaluate an experimental AMD64 feature known as the Advanced Synchronization Facility (ASF).
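As an illustration of the compare-and-swap style of lock-free programming described in the introduction, here is a minimal C++ sketch of a retry loop over a single machine word. The function name `add_if_below` and its parameters are hypothetical and chosen only for this example; it is not taken from AMD's work. Note that the update covers exactly one word, which is the limitation the introduction points out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically add `delta` to `value` unless the result
// would exceed `limit`. Built from a single-word compare-and-swap retry loop.
bool add_if_below(std::atomic<std::uint64_t>& value,
                  std::uint64_t delta,
                  std::uint64_t limit) {
    assert(delta <= limit);  // keeps `limit - delta` from wrapping around
    std::uint64_t seen = value.load(std::memory_order_relaxed);
    while (seen <= limit - delta) {
        // On failure, compare_exchange_weak stores the current value into
        // `seen`, so the loop re-checks the limit against fresh data.
        if (value.compare_exchange_weak(seen, seen + delta,
                                        std::memory_order_acq_rel,
                                        std::memory_order_relaxed)) {
            return true;  // our update was installed atomically
        }
    }
    return false;  // adding `delta` would exceed `limit`
}
```

The function can be called concurrently from any number of threads without a lock. Extending the same pattern to an update that spans several independent words is not possible with a single-word primitive alone.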
VELOX
VELOX is an EU-sponsored research project that aims to improve STM technology for the multicore CPUs found in computers today or in the near future. VELOX takes a whole-system approach and looks at all system aspects, from hardware through operating systems and runtimes all the way to applications. AMD participates in the Architecture work package of this project, helping to develop simulator technology and to evaluate architecture-extension proposals.
Advanced Synchronization Facility (ASF)
ASF is an experimental AMD64 extension that allows user- and system-level code to modify a set of memory objects atomically without requiring expensive synchronization mechanisms. The ASF extension provides an inexpensive primitive from which higher-level synchronization mechanisms can be synthesized: for example, multi-word compare-and-exchange, load-locked/store-conditional, lock-free data structures, and primitives for software transactional memory.
ASF is both more flexible and faster than existing lock-free atomic memory-modification approaches. Instead of offering new instructions with hardwired semantics (such as compare-and-exchange for two independent memory locations), ASF only exposes a mechanism for atomically updating multiple independent memory locations and allows software to implement the intended synchronization semantics.
We have evaluated ASF in the contexts of both lock-free programming and software transactional memory. Please find more information in the papers posted in the Publications section of this page.
We have released the simulator we used in our evaluation: a version of the open-source AMD64 simulator PTLsim that we extended with an implementation of ASF. The simulator can be downloaded from the download section of this page.
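To make the idea of a multi-word compare-and-exchange concrete, the following C++ sketch models only its semantics: two independent memory locations are checked and updated as one indivisible step. This is not ASF code. Because ASF is described here as experimental, the sketch substitutes a global `std::mutex` for the atomicity that a hardware mechanism would provide, and all names (`dcas`, `snapshot`, `transfer_one`, `g_region_lock`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Assumption: a global mutex stands in for hardware-provided atomicity.
static std::mutex g_region_lock;

struct Pair {
    std::uint64_t first;
    std::uint64_t second;
};

// Read both locations as one consistent snapshot.
Pair snapshot(const std::uint64_t* a, const std::uint64_t* b) {
    std::lock_guard<std::mutex> guard(g_region_lock);
    return Pair{*a, *b};
}

// Double compare-and-swap: store both new values only if both locations
// still hold their expected values; otherwise change nothing.
bool dcas(std::uint64_t* a, std::uint64_t expect_a, std::uint64_t new_a,
          std::uint64_t* b, std::uint64_t expect_b, std::uint64_t new_b) {
    assert(a != b);  // the two locations must be distinct
    std::lock_guard<std::mutex> guard(g_region_lock);
    if (*a != expect_a || *b != expect_b) {
        return false;
    }
    *a = new_a;
    *b = new_b;
    return true;
}

// Example use: move one unit between two counters so that no observer
// ever sees the sum change. Returns false if `from` is already empty.
bool transfer_one(std::uint64_t* from, std::uint64_t* to) {
    for (;;) {
        Pair seen = snapshot(from, to);
        if (seen.first == 0) {
            return false;
        }
        if (dcas(from, seen.first, seen.first - 1,
                 to, seen.second, seen.second + 1)) {
            return true;
        }
        // Another thread changed a counter in between; retry.
    }
}
```

The retry loop in `transfer_one` has the same shape as a single-word compare-and-swap loop; what changes is that the compared and updated state spans two independent locations. The text describes ASF as exposing a general mechanism for such updates, so that software rather than a hardwired instruction decides what is compared and what is written.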
Publications
Hardware acceleration for lock-free data structures and software transactional memory
Hardware acceleration for software transactional memory
PTLsim-ASF release
PTLsim-ASF is a variant of the open-source AMD64 simulator PTLsim that we modified to simulate ASF. Please refer to the papers posted in the Publications section of this page for information on how we simulated and used ASF. For general information regarding PTLsim, including a detailed user's manual, please refer to the PTLsim home page.
Release notes
License
Full release
Xen hypervisor with PTLsim patches