--------------------------------------------------------------------
CS 758 Programming Multicore Processors
Fall 2012 Section 1
Instructor Mark D. Hill
--------------------------------------------------------------------

------------
TITLE

------------
OUTLINE

------------------------------
BODY

HW vs SW
bounded vs. unbounded
Intel coming

Outline
* LogTM-SE -- Unbounded DONE
* Intel HLE & RTM
* STM & HybridTM (NOrec)

Intel Haswell
* Hardware Lock Elision
* Best-Effort HTM

Speculative Lock Elision
  http://www.cs.wisc.edu/~markhill/restricted/micro01_sle.pdf
  http://www.cs.wisc.edu/~markhill/restricted/micro01_sle_talk_2up.pdf
  http://www.cs.wisc.edu/~markhill/restricted/others/micro01_sle_talk.ppt

Best-Effort HTM
  http://www.cs.wisc.edu/multifacet/papers/vldb08_keynote.ppt
  Slides 21-43

Intel Haswell

REFERENCE {
Intel has implemented HLE/TSX (speculative lock elision & some
transactional semantics) in Haswell.

Intel blog post about this:
  http://software.intel.com/en-us/blogs/2012/02/07/transactional-synchronization-in-haswell/

Programming reference (see chapter 8, large pdf):
  http://software.intel.com/file/41417
}

Hardware Lock Elision
-- based on Rajwar and Goodman's Speculative Lock Elision
* XACQUIRE and XRELEASE prefixes
* Put prefixes on lock and unlock for suitable short critical sections
* HW reads the lock (eliding the write) and tries to execute the CS
  as an implicit transaction

    insert(hashtable, key, value) {
        XACQUIRE "LOCK"(hashtable.lock)
        bucket = hash (key)
        ... manipulate bucket ...
        XRELEASE "UNLOCK"(hashtable.lock)
    }

* On (repeated) failure, HW will fall back on RMW of lock
  (aborting other implicit transactions)
+ HW can be limited or omitted
+ EASY change to lock-based SW
- Does not allow one to write SW free of locks

Also Restricted Transactional Memory (RTM)
* XBEGIN, XEND, and XABORT instructions
* Undefined on older processors
* Programmers must provide alternative code path
  (as branch offset on XBEGIN)

Do HLE or RTM restore registers?

Software TM
* See Harris et al. TM2 Synthesis Lecture
* SW to record conflicts and save old values
* At transaction end, check conflicts (atomically) and commit/abort
- Much progress, but integer factors slower
+ no need to write lock-based program

Hybrid TM
* Have bounded HTM
* Fall back to STM
* "play" together
+ no need to write lock-based program
* Naive
  * Add single, global STMlock
  * HTM path reads STMlock
  * STM path RMW STMlock
    (aborting HTM transaction & serializing STM transactions)
  * Ok if STM fallback rare
* Many fancier schemes
  -- e.g., HybridTM (ownership records), PhasedTM, NOrec

NOrec slides?

OLD {

Outline
* STM / Hybrid Intro
  uses part of ACACES 4of5 STM slides.
* Discussion / Reviews

Mention Sun Rock -- and COI

Reviews
* Andrew N.: What HW now and later?
  -- Mention Sun Rock -- and COI
* Marc: Hybrid not compelling
  -- Mentioned Phased TM -- Transact'07
* Syed: Livelock
* David: LogTM performance, but context switching
  -- point to Transact'08
* Eric: use LogTM more?
* Daniel: What is minimum HW?
* Guoliang: STM data structures? Complex?
* Brian: Why rdcnt?
  -- so all readers can go away

}
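
Sketch: HLE insert in C

The insert() pseudocode in the Hardware Lock Elision notes can be
sketched in C roughly as follows. This is a minimal illustration, not
code from the lecture. It assumes GCC's x86 memory-order hints
__ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE, and falls back to a plain
test-and-set spinlock when they are not defined. The table layout
(NBUCKETS, one slot per bucket) and the names ht_lock, ht_unlock,
ht_insert, and ht_lookup are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC on x86 defines these hint bits. Elsewhere the hints
   become 0 and the lock is an ordinary test-and-set spinlock. */
#ifdef __ATOMIC_HLE_ACQUIRE
#define HLE_ACQUIRE_HINT __ATOMIC_HLE_ACQUIRE
#define HLE_RELEASE_HINT __ATOMIC_HLE_RELEASE
#else
#define HLE_ACQUIRE_HINT 0
#define HLE_RELEASE_HINT 0
#endif

#define NBUCKETS 16              /* hypothetical table size */

struct hashtable {
    int lock;                    /* 0 = free, 1 = held */
    int used[NBUCKETS];
    int keys[NBUCKETS];
    int values[NBUCKETS];
};

/* XACQUIRE "LOCK": with the hint, capable HW may elide the write and
   start an implicit transaction; otherwise this is a normal exchange. */
void ht_lock(struct hashtable *t)
{
    while (__atomic_exchange_n(&t->lock, 1,
                               __ATOMIC_ACQUIRE | HLE_ACQUIRE_HINT)) {
        /* Lock really held: spin on plain reads until it looks free. */
        while (__atomic_load_n(&t->lock, __ATOMIC_RELAXED))
            ;
    }
}

/* XRELEASE "UNLOCK": ends the elided region or releases the real lock. */
void ht_unlock(struct hashtable *t)
{
    __atomic_store_n(&t->lock, 0, __ATOMIC_RELEASE | HLE_RELEASE_HINT);
}

static size_t bucket_of(int key)
{
    return (size_t)((unsigned)key % NBUCKETS);
}

/* Short critical section: one slot per bucket, later inserts overwrite. */
void ht_insert(struct hashtable *t, int key, int value)
{
    assert(t != NULL);
    ht_lock(t);
    size_t b = bucket_of(key);
    t->used[b] = 1;
    t->keys[b] = key;
    t->values[b] = value;
    ht_unlock(t);
}

/* Returns 1 and stores the value in *out if key is present, else 0. */
int ht_lookup(struct hashtable *t, int key, int *out)
{
    int found = 0;
    ht_lock(t);
    size_t b = bucket_of(key);
    if (t->used[b] && t->keys[b] == key) {
        *out = t->values[b];
        found = 1;
    }
    ht_unlock(t);
    return found;
}
```

The only change relative to ordinary lock-based code is the pair of hint
bits on the lock and unlock operations, which matches the "EASY change
to lock-based SW" point in the notes. On hardware without HLE the
prefixes are ignored and the code behaves as a normal spinlock.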