PTLsim-ASF 219.asf.1.1 Release Notes

Stephan Diestelhorst
AMD Operating System Research Center
August 2008

Overview
========

PTLsim is a near-cycle-accurate AMD64 simulator with in-order and
out-of-order execution-core models. PTLsim-ASF is a version of PTLsim
that implements an experimental extension of the AMD64 architecture: the
Advanced Synchronization Facility (ASF).

The following paper describes ASF and its PTLsim implementation:

  Hardware acceleration for lock-free data structures and software
  transactional memory
  Stephan Diestelhorst, Michael Hohmuth.
  In the proceedings of the Workshop on Exploiting Parallelism with
  Transactional Memory and other Hardware Assisted Methods (EPHAM),
  April 2008. Boston, MA
  http://www.amd64.org/fileadmin/user_upload/pub/epham08-asf-eval.pdf

Please find more information on ASF and PTLsim on the following web
site:

  http://www.amd64.org/research/multi-and-manycore-systems.html

Disclaimer
==========

PTLsim-ASF introduces support for the experimental ASF extension. AMD
provides this implementation of ASF without any intent or commitment to
release such functionality in any future microprocessor product.
Furthermore, because the extension is experimental, the functionality,
implementation, and interface of ASF may change in the future and
deviate from previous versions (such as the one presented in the
publication cited above).

Having said that, you are welcome and invited to experiment with the
Advanced Synchronization Facility and to let us know if you have
suggestions for improvement, discover bugs, or have any other comments.
Use the contact provided at the end of this document. We would also be
glad to learn about interesting uses of ASF. If you want to refer to ASF
in a publication, please use the paper cited above.

While the simulation core of PTLsim-ASF has been tweaked to behave
similarly to an AMD Opteron(tm) family 10h (Barcelona) processor, this
model is not accurate in functionality, performance, or internal
implementation. Do not use this model to assess, project, or derive any
properties of a real AMD Opteron(tm) processor.

License
=======

PTLsim-ASF is released under GPLv2. See LICENSE for details.

Changes to baseline PTLsim-219
==============================

PTLsim-ASF contains the following enhancements over the baseline PTLsim
release:

* An implementation of ASF in the out-of-order core model
* Support for simulating multicore configurations with a simple
  cache-coherence performance model
* Updated microarchitecture for AMD Opteron(tm) processors (family 10h,
  Barcelona core)
* Various bugfixes back-ported from current PTLsim releases

Configuration
=============

The following configurations are known to work:

* Out-of-order core model "asfsmt" with ASF and multicore support

  This core is a modification of PTLsim's original "smtcore" simulation
  core with extensions to simulate a true multicore system. The core has
  been tested in full-system simulation mode only (using PTLsim/X), as
  it relies on support for multithreaded simulation, which is not
  available in the user-space version of PTLsim.

  The modified core still resides in the smt*.{h,cpp} files, but has
  been renamed to "asfsmt" and replaces the original "smt" core. Pass
  "-core asfsmt -run" as parameters to PTLsim/X to select this core
  model.

  In contrast to the original "smt" simulation model, this enhanced
  version contains multiple truly independent cores that do not compete
  for functional units or ROB entries, and each core has a private cache
  hierarchy. These cache hierarchies are kept coherent with a simplified
  coherence protocol that models first-order performance effects of
  coherent caches.

* Out-of-order core model "asfooo" with ASF support

  To allow experiments in user space, an additional core model named
  "asfooo" has been created that adds ASF to the existing "ooo"
  single-core model with out-of-order execution. All modifications have
  been made to a new clone of the "ooo" core, so both versions ("ooo"
  and "asfooo") remain available and can be selected with the
  appropriate command-line option, such as "-core asfooo -run".

  ASF support in this simulation core is functionally limited: PTLsim's
  user-space simulation infrastructure does not support multithreaded
  applications, so concurrent threads cannot cause ASF critical-section
  aborts due to contention on data. However, this core can still be used
  for quick user-space prototyping of applications using ASF, because
  various sources of interference with ASF can be simulated using a
  newly added random-injection framework. Several random predicates have
  been defined that are evaluated at various stages in the pipeline. In
  a configurable percentage of evaluations (0 % by default), these
  predicates evaluate to true and, for example, trigger an ASF roll-back
  due to data contention.

  In the recent upstream version of PTLsim, the "ooo" core has been
  replaced by the "smt" core (renamed to "ooo"), so that a single core
  model is used in both user-space and full-system simulation. As the
  PTLsim-ASF project is currently preparing for a merge with that most
  recent version, the "asfooo" core is not tested as much as the
  "asfsmt" core and is slated for removal. In that future version, the
  "asfsmt" core will be available for user-space testing and rapid
  prototyping.

* In-order core model "seq" without ASF support

  The sequential in-order simulation core "seq" has not been changed and
  is still present in the release. It can be used for fast-forwarding
  into the simulation, but care has to be taken because it does not
  support ASF.

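The random-injection idea described for the "asfooo" core can be
pictured with a small stand-alone sketch. It is not taken from the
PTLsim-ASF sources: the type and function names below are hypothetical,
and the xorshift32 generator is merely a convenient choice for the
illustration. It only shows a predicate that evaluates to true in a
configurable percentage of evaluations, with 0 % meaning "never".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only: none of these names come from PTLsim. */
typedef struct {
    unsigned percent;   /* chance of firing per evaluation, 0..100 */
    uint32_t state;     /* xorshift32 generator state, must be nonzero */
} random_predicate;

/* Create a predicate; percent = 0 mirrors the documented default. */
static random_predicate predicate_init(unsigned percent, uint32_t seed)
{
    assert(percent <= 100);
    random_predicate p = { percent, seed ? seed : 1u };
    return p;
}

/* Advance the xorshift32 generator and return the next value. */
static uint32_t predicate_next(random_predicate *p)
{
    uint32_t x = p->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    p->state = x;
    return x;
}

/* Evaluate the predicate once: nonzero means "inject an event now",
 * for example a simulated roll-back due to data contention. */
static int predicate_fires(random_predicate *p)
{
    return (predicate_next(p) % 100u) < p->percent;
}

/* Count how often the predicate fires in `evaluations` trials. */
static unsigned predicate_count(random_predicate *p, unsigned evaluations)
{
    unsigned fired = 0;
    for (unsigned i = 0; i < evaluations; i++)
        fired += (unsigned)predicate_fires(p);
    return fired;
}
```

With percent set to 0 the predicate never fires, and with 100 it fires
on every evaluation; values in between give a rough injection rate. How
the real framework is configured is not covered by this sketch.
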
Usage
=====

To use ASF from within C / C++, include asf-highlevel.h (in the root of
the package), which provides convenient wrappers around the ASF
primitives for loading and prefetching values and for starting and
ending ASF critical sections. Refer to the publication above for details
on ASF primitives and section layout.

Due to the special treatment of the frame-pointer register within GCC
(it cannot be clobbered), two different flavours of the macro for the
ACQUIRE instruction exist. If you want to compile your application with
_enabled_ frame pointers, specify "-DASF_PUSH -fno-omit-frame-pointer"
on GCC's command line. If you want to _disable_ frame pointers, use
"-DASF_STACK -fomit-frame-pointer".

This behaviour is an artifact of GCC not directly supporting ASF and its
inability to clobber RBP, even if frame pointers are disabled. As ASF
does not restore any of the GPRs (except the stack pointer) after an
abort, RBP has to be saved on the stack manually. The ACQUIRE macros
hence expect local 64-bit storage on the stack, which should be declared
as follows: "volatile unsigned long acq_state;".

Please note that the ASF extension is only supported for 64-bit
applications. Support for (legacy) 32-bit environments is not tested and
will likely result in unexpected errors anywhere in the tool chain.

Known issues
============

* PTLsim/X sometimes crashes when rapidly switching between
  native-execution mode and simulation mode. The workaround is to avoid
  switching back from simulation mode to native-execution mode, or to
  resort to user-mode-only support. Note that this issue originates in
  the original PTLsim version.

* This modification to PTLsim is based on an old release of PTLsim.
  Several of the improvements made to upstream PTLsim have been
  contributed by us, and hence are included in this tree as well.
  Various others have been back-ported. We are currently working on a
  merge with the current upstream version of PTLsim.

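The build rules from the Usage section (exactly one of ASF_PUSH or
ASF_STACK, 64-bit only, a local "volatile unsigned long acq_state;") can
be checked at compile time. The following is a minimal sketch under
stated assumptions: the enum, the helper functions, and the zero
initializer are hypothetical and not part of asf-highlevel.h, and the
header itself is left out so that the sketch compiles without the
package.

```c
#include <assert.h>

/* #include "asf-highlevel.h" would go here in a real ASF program; it is
 * omitted so that this sketch compiles without the PTLsim-ASF package. */

/* ASF is documented as 64-bit only, and the ACQUIRE macros expect a
 * 64-bit local declared as "volatile unsigned long", so check early. */
static_assert(sizeof(unsigned long) == 8,
              "ASF needs a 64-bit build (unsigned long must be 64 bits)");

/* The two ACQUIRE flavours are alternatives; reject builds with both. */
#if defined(ASF_PUSH) && defined(ASF_STACK)
#error "Define only one of ASF_PUSH and ASF_STACK"
#endif

/* Hypothetical helper type, not part of asf-highlevel.h. */
enum asf_flavour { ASF_FLAVOUR_NONE, ASF_FLAVOUR_PUSH, ASF_FLAVOUR_STACK };

/* Report which flavour was selected on the compiler command line. */
static enum asf_flavour asf_selected_flavour(void)
{
#if defined(ASF_PUSH)
    return ASF_FLAVOUR_PUSH;   /* -DASF_PUSH -fno-omit-frame-pointer */
#elif defined(ASF_STACK)
    return ASF_FLAVOUR_STACK;  /* -DASF_STACK -fomit-frame-pointer */
#else
    return ASF_FLAVOUR_NONE;   /* neither flag was given */
#endif
}

/* Shape of a function that would contain an ASF critical section. */
static unsigned long critical_section_skeleton(void)
{
    /* Declaration quoted from the notes; the "= 0" is added here. */
    volatile unsigned long acq_state = 0;
    /* The wrappers from asf-highlevel.h that start and end the critical
     * section would be used at this point. */
    return acq_state;
}
```

Building this file with neither flag makes asf_selected_flavour()
report ASF_FLAVOUR_NONE, which a real project could turn into a build
error to catch a missing flag early.
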
Contact
=======

This version of PTLsim is maintained by

  Stephan Diestelhorst (AMD OSRC)
  stephan.diestelhorst@amd.com

Any questions specific to the extensions of PTLsim-ASF should be
directed to him.

The mailing list for general PTLsim questions is
ptlsim-devel@ptlsim.org. Its archives can be found at:

  http://www.ptlsim.org/pipermail/ptlsim-devel/