Publications
Hardware acceleration for lock-free data structures and software transactional memory Stephan Diestelhorst, Michael Hohmuth. In the proceedings of the Workshop on Exploiting Parallelism with Transactional Memory and other Hardware Assisted Methods (EPHAM), April 2008. Boston, MA
In this paper, we report on a new CPU-architecture extension proposal, named Advanced Synchronization Facility (ASF), which is geared toward accelerating and easing lock-free programming and software transactional memory (STM). We present an initial performance simulation and usability study of ASF’s application to a lock-free data structure (a singly linked list) and to accelerating a state-of-the-art STM system, TinySTM. Our results indicate that ASF can significantly increase the throughput and scaling behavior of both workloads: The lock-free implementation has doubled single-threaded performance and maintains a 66 % increase for eight CPUs, while application-transparent enhancement of the STM increases single-thread performance by up to 15 %, and the factor of scaling to eight CPUs by up to 20 %.
Paper: PDF Talk: PDF
Hardware acceleration for software transactional memory Stephan Diestelhorst. Diploma thesis, Technische Universität Dresden, January 2008. Dresden, Germany
Stephan's diploma thesis originated during his internship at AMD's OSRC in 2007.
Thesis: PDF
How to Deal with Lock-Holder Preemption Thomas Friebel. Presentation at the Xen Summit North America, July 2008. Boston, MA
Lock-holder preemption is the preemption of a virtual CPU (VCPU) holding a spinlock. Other VCPUs of the same guest that try to acquire the same lock will have to wait until the lock-holder is scheduled again and releases the lock. On a multi-core machine, lock-holder preemption can cause Xen guests to waste about 7% of their time waiting for spinlocks. In this presentation we will show the effects of lock-holder preemption, show two ways to counteract it, and analyze one approach in detail. We will give a short overview of our modifications to the Xen scheduler, and show how we regained the lost performance.
Extended abstract: PDF Talk: PDF, PDF with comments
Nested paging hardware and software Benjamin Serebrin, Joerg Roedel. Presentation at the KVM Forum, June 2008. Napa, CA
This presentation covers the ASPLOS paper 'Accelerating two-dimensional page walks for virtualized systems', as well as the implementation details and performance of nested paging support in KVM.
Talk: PDF
Accelerating two-dimensional page walks for virtualized systems Ravi Bhargava, Benjamin Serebrin, Francesco Spadini, Srilatha Manne. In the proceedings of the 13th international conference on Architectural support for programming languages and operating systems (ASPLOS), March 2008. Seattle, WA
Nested paging is a hardware solution for alleviating the software memory management overhead imposed by system virtualization. Nested paging complements existing page walk hardware to form a two-dimensional (2D) page walk, which reduces the need for hypervisor intervention in guest page table management. However, the extra dimension also increases the maximum number of architecturally-required page table references.
This paper presents an in-depth examination of the 2D page table walk overhead and options for decreasing it. These options include using the AMD Opteron processor's page walk cache to exploit the strong reuse of page entry references. For a mix of server and SPEC benchmarks, the presented results show a 15%-38% improvement in guest performance by extending the existing page walk cache to also store the nested dimension of the 2D page walk. Caching nested page table translations and skipping multiple page entry references produce an additional 3%-7% improvement.
Much of the remaining 2D page walk overhead is due to low-locality nested page entry references, which result in additional memory hierarchy misses. By using large pages, the hypervisor can eliminate many of these long-latency accesses and further improve the guest performance by 3%-22%.
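The growth in page table references mentioned in the abstract can be illustrated with a little arithmetic. The following is a minimal sketch, not taken from the paper: the function name is hypothetical, and the level counts of 4 are an assumption based on ordinary 4-level x86-64 paging for both the guest and the nested tables. It counts the worst case with no page walk cache or TLB hits.

```python
def max_2d_walk_references(guest_levels: int, nested_levels: int) -> int:
    """Worst-case memory references for one uncached 2D page walk.

    Hypothetical helper for illustration only. Every guest page table
    entry sits at a guest-physical address, so reading it costs one full
    nested walk (nested_levels references) plus the read of the entry
    itself (1 reference). The final guest-physical address of the data
    then needs one more nested walk.
    """
    per_guest_level = nested_levels + 1
    final_translation = nested_levels
    return guest_levels * per_guest_level + final_translation


if __name__ == "__main__":
    # Assumed: 4 guest levels and 4 nested levels (4-level long-mode paging).
    native = 4
    two_dimensional = max_2d_walk_references(4, 4)
    print(f"native walk: {native} references")
    print(f"2D walk:     {two_dimensional} references")  # 4 * 5 + 4 = 24
```

Under these assumptions, a native walk needs 4 references while the 2D walk needs up to 24, which is why caching page entry references and using large pages pay off.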
Paper: PDF
Partitioning the physical TLB with SVM ASIDs Sebastian Biemueller. Presentation at Xen Summit, April 2007. Yorktown Heights, NY
Slide deck used at the 2007 Xen Summit.
Talk: PDF
Nested paging support in Xen Wei Huang. Presentation at Xen Summit, April 2007. Yorktown Heights, NY
Slide deck used at the 2007 Xen Summit, including an introduction to the AMD Barcelona technology by Elsie Wahlig.
Talk: PDF
Myths and facts about 64-bit Linux Andreas Herrmann, Andre Przywara. Presentation at Chemnitzer Linux-Tage, March 2008. Chemnitz, Germany
Ever since 64-bit Linux arrived on PCs, myths have circulated about the 64-bit topic. These slides present technical details to establish the facts. An overview of the hardware changes in the x86-64 architecture is followed by a short report on the necessary changes to Linux and the GCC toolchain. One focus is compatibility with 32-bit, covering both the hardware side and the Linux implementation. Some real-life experiences and pitfalls are shown, along with hints for porting old 32-bit programs to 64-bit. A range of benchmark results concludes the presentation, giving a view of the actual performance of 64-bit applications.
Talk: PDF