Some question in Remzi class ---------------------------- * What is challenging about implementing lockset? A: + a lot of state of keep track + need to instrument ~ load/store instruction ~ acquire/release lock ~ call to storage allocator + overhead: memory overhead Solution: table of all possible lockset, for every memory location, store an index to the table * What guarantees can Eraser make about a program? A: No guarantee (part of a program may not run) * Some forms of shared data access requires modification to the basic lockset algorithm: What were they and why were they problematic for lockset? A: too strict, and may cause high overhead. Some problems are: + Initialization: ~ shared variables are frequently initialized WITHOUT holding a lock ~ Why? Because there is only one thread that creates, and initializes that variables, others threads are not aware of that + Read-shared data ~ this type of data can be safely be access without lock ~ no need to incur monitoring overhead in this case + Read-write lock: ~ read-write locks allow multiple readers to access a shared variable, but allow only one writer to do so. * Does the quality of Eraser's results depend on how threads are schedule? A: The author claims that Eraser is less susceptible to thread scheduling than happens before, but still: The initialization problem: **PROBLEM**: dependent on thread interleaving. It is possible that a thread allocate and initializes a variable without a lock, and before it finishes initializing, a second thread can access it. --> Eraser can detect the problem only if the second thread *actually* access the variable before the first thread finishes initializing it. Otherwise, Eraser will miss it. (FALSE NEGATIVE) * Eraser relies on binary instrumentation to work. What does Eraser instrument? A: - load/store - lock acquire/release - memory allocation - thread creation * Does the fact that a Eraser-modified binary runs 10-30x slower than normal affect its results? A: By and large, not much * What are the common false alarms produces by Eraser? How does one get rid of them? A: Source of false alarms - private lock - memory reuse - benign races Solution: anotation * Why Alta Vista Ni2 example not a real bug? A: benign races, no harm, just optimization * Compare Data Race and Deadlock. If data races are Scylla, why are deadlocks Charybdis? A:??? ################################################################################ =========== E R A S E R =========== link: http://www.cs.cmu.edu/~15712/lectures/05-eraser.pdf Summary: -------- this paper solve the problem of detecting data races in a lock-based multithreaded programs. It monitors every shared memory reference and verify the consistent locking behavior (i.e, every shared variable should be protected by a lock) Back ground: ------------ - What is a data race? A: Data race occurs when two concurrent threads access a shared variable, and: 1) at least one access is write 2) there is no explicit mechanism to prevent accesses happening at the same time - What is a monitor? A: Brief summary of monitor: monitor is a group of shared variables together with procedures to access them, all bundled together with a single anonymous lock is acquired and released at the entry and exit of the procedures. Shared variable is *invisible* to out side world, and it can only be accessed by procedures ==> Monitor is static, compiled-time guarantee for freedom from data races. - What is happen before relation? A: In distributed system, there is no way to agree in a physical time. So there is need to know what event happens before others. Happen before relation is a *partial* ordering of events in distributed system. 1) if A and B are events in the same process, A is executed before B, then A->B 2) If A is the event of sending a message by one process and B is the event of receiving that message by another process, then A -> B 3) if A -> B and B -> C then A -> C - How happen-before relation is implemented? A: In short, using vector timestamp. if A->B, then timestamp(A) < timestamp(B) Details: + each process has a logical clock, each event is associated with a *time stamp*. + Logical clock can be a simple counter that is incremented between any two successive events in a single process. + a process update its logical clock whenever it receives a message with time stamp greater than its logical clock from other process. - How to use happen-before relation in detecting data race? A: we keep a vector time stamp for every access in a concurrent program. Every synchronization event (lock, unlock, wait, signal...) is a message. (For example, unlock sends a 'message' to the follow up lock, parent thread sends a message to a child through thread_create) Then at run time, we maintain and compare time stamps; races are reported if two conflicting accesses have concurrent timestamps Motivation: ----------- - Does monitor solve the data race problem? A: Yes, but it works for only static global variables (Eraser can work with dynamically allocated shared data) - Why happen-before relation is bad? A: because it is: 1) difficult to implement because it requires per-thread information about concurrent accesses to each shared memory location. 2) it effectiveness depends on the interleaving of threads produced scheduler (i.e, in some interleaving, it can catch the data race bug, in some others, it cannot) --> false negative ==> ERASER: is complement to happen before, it is simple, and less dependent in the interleaving of threads. What **The lockset algorithm**: ------------------------------- *Note*: Eraser can only be used for lock-based synchronization technique only. - Idea: Eraser will enforce the locking discipline at run time (i.e every shared variable is protected by some lock). The idea is to keep a lock-set record for every shared variable. This lock-set is updated by intersecting itself to the set of locks that current thread accessing the variable holds. The intuition is if the variable is consistently protected by some lock, then its lock-set is never be empty. Hence, during runtime, if a lock-set of a particular shared variable is empty, Eraser issues a warning. - simple algorithm: Let locks_held(t) is set of locks held by thread t For each shared variable v, C(v) is the set of candidate locks that protect v. C(v) is initialized by the set of all locks. For every access to v by thread t. C(v) = C(v) intersects locks_held(t) if C(v) = empty then issue a warning - Refinement: above algorithm is too strict, and may cause high overhead. Some problems are: + *Initialization*: shared variables are frequently initialized without holding a lock (Why? Because there is only one thread that creates, and initializes that variables, others threads are not aware of that) + *Read-shared data*: this type of data can be safely be access without lock, so no need to incur monitoring overhead in this case + *Read-write lock*: read-write locks allow multiple readers to access a shared variable, but allow only one writer to do so. To addresses, refinement algorithm has 4-state transition diagram for each memory location. (Virgin -> Exclusive -> Shared Read -> Shared Write) - When a variable is first allocated, it has Virgin state. - When it is first read or write by the initializing thread, it state changes to Exclusive. Subsequent accesses to the variable by the same thread does not change the state and does not update C(v) - When other thread read the variable, the state turns to Shared-Read. Subsequent reads does not change the state. C(v) is updated but data race is not reported even if C(v) is empty. - When other thread writes the variable, ==> Shared-Write, now like simple algorithm **PROBLEM**: dependent on thread interleaving. It is possible that a thread allocate and initializes a variable without a lock, and before it finishes initializing, a second thread can access it. --> Eraser can detect the problem only if the second thread *actually* access the variable before the first thread finishes initializing it. Otherwise, Eraser will miss it. (FALSE NEGATIVE) How - Implementation -------------------- - Eraser uses binary instrument, it instrument: + each load and store in the binary program + each acquire and release a lock + each call to storage allocator - Granularity: 32-bit word in heap or global data as a shared variable - Maintain a *shadow word* that encodes an index to the lock-set table and the state of the variable (index is hash in to the lockset table) - Entry in the table is a vector of locks sorted by lock address (for easier intersection) - Entries are never deallocated or modified --> each lockset index is valid for a life time of the program **QUESTION**: How this table is constructed??? Is it list all possible locks, and brute-forcely list all combination of them?? - A: we can assume that this table is built on the fly (dynamically) **PROBLEM**: since Eraser only instruments standard locking library (like pthread),there is chance for false positive (false alarm) if program use private lock (because Eraser does not intercept that). Some other sources to false alarm: - Memory reuse: memory is reused without resetting the shadow memory ??? Example: let say we access shared variable V1 and has a shadow word W for it. If we free V1, but now we allocate V2 to the same place as V1 (mem reuse), and we do not clear the shadow word correctly (we can assume V1 is protected by L1) then another thread which uses L2 to protect V2, but the intersection becomes empty --> false alarm - Benign races: Example: some data races is on purpose, for example, some stats information that is not harmful, when updating it, we don't need a lock, although data race can happen, but it is ok Weakness: --------- - Eraser has high false alarms - it is just a testing methodology, not a proven of program that is free from DR - work only for lock - depends on workload/input, because it may not cover all the code path (where a shared variable is not protected by a lock) - multiple locks: can Eraser work with this? Happen Before vs. Eraser: -------------------------- - HB: small false positive, large false negative, works with a lot of synchronization primitives - Eraser: high false positive (due to memory reuse, private lock, benign races), small false negative, works only with lock Open Question: -------------- - can we apply the idea of Eraser to detect deadlock? A: yes, we can monitor the locking primitives, and check for a simple discipline like this: choose a partial order among all locks and program each thread so that whenever it holds more than one lock, it acquires them in ascending order. - Can we use Eraser with other synchronization method other than lock? A: unknown (think about binary semaphore) + i think it can be apply to binary semaphore, but not to for private sem (wait, signal) - Is there any false negative case with Eraser? + yes, during initialization, if a thread let it data known by other thread before it finishes initializing, other thread can access it, hence a race. Eraser can detect this race if other thread access data before this guy finishes initialization, but not after that. Open topic: ----------- - Multiple lock: will eraser work with multiple locks? Yeah, but it needs some refinements, and some times it can cause false alarms and false negative. + The setup: some shared variables are protected by multiple locks instead of single one. The rule is: **every thread that writes the variable must hold all the protecting locks, and every thread that reads the variable must hold at least one lock** --> This allow only one write but many reads at a time --> somewhat similar to read lock and write lock, but the purpose is different. While read lock and write lock aim at increasing concurrency, this Eraser's locking discipline aims at detecting data race and avoiding deadlock in programs that contain upcalls. + False alarms: happen when two readers access the same shared location with two different locks (think about brlock) --> Solution: update the lock set when write only On each read of v by thread t: if C(v) = {} then issue a warning // note: we don't update c(v) On each write of v by tread t set C(v) = C(v) intersect locks_held(t); if C(v) = {} then issue a warning; This prevents false alarm but can cause false negative: Example: t1 read v while holding lock l1, t2 write v while holding lock l2 the violation will be reported only if write precedes read??? WHY ??? Question: how to handle multiple locks that without any risk of false negative? - Granularity: + char X,y this are both shared variable, say t1 use l1 to access x, t2 use l2 to access y --> FP ** BIG shared locking in kernel **