Milo M. K. Martin, Mark D. Hill, and David A. Wood, "Token Coherence: Decoupling Performance and Correctness," International Symposium on Computer Architecture (ISCA), June 2003.
Workload and technology trends
Workload trend (good thread-level parallelism, low latency preferred) > favors snooping
Technology trend > favors directory
+ Fast cache-to-cache misses
+ No bus-like interconnect
avoids virtual bus ordering
+ Bandwidth efficiency
Token counting > ensures safety
Persistent requests > prevent starvation
Decoupling correctness and performance
Broadcast with direct responses
use unordered interconnect
Need one token to read, all tokens to write.
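To avoid sending data with every token message: designate one owner token (clean/dirty); only a message carrying the owner token must also carry data
- Non-silent evictions: an evicted block's tokens must be sent on (e.g., to memory), never silently dropped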
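The token-counting rules above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the names `Holder`, `transfer`, `evict`, and the choice of `T = 4` tokens per block are assumptions made for the example.

```python
from dataclasses import dataclass

T = 4  # assumption: total tokens per block (e.g., one per processor)


@dataclass
class Holder:
    """Token state that one component (cache or memory) keeps for one block."""
    tokens: int = 0
    owner: bool = False        # holds the owner token (counted in `tokens`)
    valid_data: bool = False

    def can_read(self) -> bool:
        # Need at least one token (and valid data) to read.
        return self.tokens >= 1 and self.valid_data

    def can_write(self) -> bool:
        # Need all tokens to write.
        return self.tokens == T


def transfer(src, dst, count, with_owner=False, with_data=False):
    """Move `count` tokens from src to dst; tokens are never created or lost."""
    if count < 1 or count > src.tokens:
        raise ValueError("cannot send tokens that are not held")
    if with_owner and not src.owner:
        raise ValueError("src does not hold the owner token")
    if src.owner and not with_owner and count == src.tokens:
        raise ValueError("the owner token must leave with the last token")
    if with_data and not src.valid_data:
        raise ValueError("src has no valid data to send")
    src.tokens -= count
    dst.tokens += count
    if with_owner:
        src.owner, dst.owner = False, True
    if with_owner or with_data:
        dst.valid_data = True  # the owner token always travels with data
    if src.tokens == 0:
        src.valid_data = False  # no tokens left: the copy is no longer readable


def evict(cache, memory):
    """Non-silent eviction: hand every held token (and the owner token) to memory."""
    if cache.tokens:
        transfer(cache, memory, cache.tokens, with_owner=cache.owner)
```

For example, if memory starts with all `T` tokens and hands one token plus data to a reader, the reader can read but not write; a writer must collect every remaining token, including the one held by the reader, before `can_write()` becomes true.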
Prevent starvation
invoke a persistent request after a timeout
send it to all components
each component records it in a table and keeps redirecting all tokens for the block to the requester
Deactivate when complete
What if many processors issue persistent requests simultaneously?
Use starvation-free arbiter
Single/Banked/Distributed
need to prevent reordering of activate and deactivate messages
- Scalability of persistent requests
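The persistent-request steps above can be sketched as follows. This is a simplified model under stated assumptions: the class names `Component` and `Arbiter` are hypothetical, the arbiter is the single (not banked or distributed) variant with a FIFO queue, the timeout that triggers a persistent request is not modeled, and direct method calls stand in for activate/deactivate messages that arrive in order.

```python
from collections import deque


class Component:
    """A cache or memory controller that holds tokens for blocks."""

    def __init__(self, name):
        self.name = name
        self.tokens = {}      # block -> number of tokens held
        self.persistent = {}  # block -> requester (table of active persistent requests)

    def activate(self, block, requester):
        self.persistent[block] = requester
        self._forward(block)

    def deactivate(self, block):
        self.persistent.pop(block, None)

    def receive(self, block, count):
        self.tokens[block] = self.tokens.get(block, 0) + count
        self._forward(block)  # tokens arriving later are redirected as well

    def _forward(self, block):
        target = self.persistent.get(block)
        if target is not None and target is not self:
            count = self.tokens.pop(block, 0)
            if count:
                target.receive(block, count)


class Arbiter:
    """Single arbiter: activates one persistent request at a time, in FIFO order."""

    def __init__(self, components):
        self.components = components
        self.queue = deque()
        self.active = None

    def request(self, block, requester):
        self.queue.append((block, requester))
        self._activate_next()

    def complete(self, requester):
        if self.active is None or self.active[1] is not requester:
            raise ValueError("only the active requester can deactivate")
        block, _ = self.active
        for c in self.components:
            c.deactivate(block)
        self.active = None
        self._activate_next()

    def _activate_next(self):
        if self.active is None and self.queue:
            self.active = self.queue.popleft()
            block, requester = self.active
            for c in self.components:
                c.activate(block, requester)
```

Because the queue is FIFO and only one request is active at a time, every queued requester eventually receives all tokens in this sketch, which is the starvation-freedom property the arbiter is meant to provide.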
Predictor based (reduce broadcasts)
broadcast-if-shared.
Owner
Groups
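A rough sketch of how the three predictor styles listed above might choose a destination set is given below. The policies are deliberately simplified, and the names `Predictor`, `ALL`, and `HOME`, the 4-node system, and the training rule in `observe` are all assumptions for illustration rather than the designs evaluated in the paper. A wrong prediction only costs performance: token counting still ensures safety, and the request can be reissued or escalated to a persistent request.

```python
ALL = frozenset(range(4))  # assumption: a 4-node system
HOME = 0                   # assumption: node 0 is the home (memory) for every block


class Predictor:
    """Per-node destination-set predictor (simplified, hypothetical policies)."""

    def __init__(self, policy):
        self.policy = policy
        self.seen = {}        # block -> other nodes observed touching the block
        self.last_owner = {}  # block -> node that last supplied data

    def observe(self, block, node, supplied_data=False):
        """Train on an incoming request or response from `node`."""
        self.seen.setdefault(block, set()).add(node)
        if supplied_data:
            self.last_owner[block] = node

    def predict(self, block):
        """Return the set of nodes the next request for `block` is sent to."""
        if self.policy == "broadcast-if-shared":
            # Broadcast only when the block looks shared; otherwise ask home alone.
            return ALL if self.seen.get(block) else frozenset({HOME})
        if self.policy == "owner":
            # Home plus the node that most recently supplied the data.
            owner = self.last_owner.get(block)
            return frozenset({HOME}) | ({owner} if owner is not None else set())
        if self.policy == "group":
            # Home plus every node recently seen using the block.
            return frozenset({HOME}) | frozenset(self.seen.get(block, ()))
        raise ValueError(f"unknown policy: {self.policy}")
```

The trade-off is between request bandwidth and the chance of missing a token holder: broadcast-if-shared sends the most messages, owner the fewest, and group sits in between.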