Anoop Gupta, Wolf-Dietrich Weber: Cache Invalidation Patterns in Shared-Memory Multiprocessors. IEEE Trans. Computers 41(7): 794-810 (1992).
Conclusion : Directory-based schemes with 3-4 pointers per entry should work well for executing well-designed parallel programs.
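A minimal sketch of why few pointers suffice, assuming a limited-pointer directory that falls back to broadcast on overflow (the `DirectoryEntry` class and its parameters are hypothetical, not from the paper): as long as sharing sets stay small, writes send precise, small invalidations; only when more processors share a line than the entry can track does a write degrade to a broadcast.

```python
# Toy limited-pointer directory entry (illustrative sketch, not the
# paper's hardware): tracks up to num_pointers sharers per line; on
# overflow, a write must broadcast invalidations to all processors.
class DirectoryEntry:
    def __init__(self, num_pointers=4, num_procs=16):
        self.num_pointers = num_pointers
        self.num_procs = num_procs
        self.sharers = set()
        self.overflow = False

    def read(self, proc):
        if self.overflow:
            return  # already imprecise; nothing more to track
        if proc in self.sharers or len(self.sharers) < self.num_pointers:
            self.sharers.add(proc)
        else:
            self.overflow = True  # too many sharers for the pointers

    def write(self, proc):
        # returns the number of invalidation messages this write sends
        if self.overflow:
            invalidations = self.num_procs - 1  # broadcast to everyone
        else:
            invalidations = len(self.sharers - {proc})
        self.sharers = {proc}
        self.overflow = False
        return invalidations

entry = DirectoryEntry()
for p in (0, 1, 2):           # three readers: fits in 4 pointers
    entry.read(p)
small = entry.write(0)        # precise: 2 invalidation messages

for p in range(8):            # eight readers: overflows the pointers
    entry.read(p)
large = entry.write(0)        # broadcast: 15 messages on 16 procs
```

With the small sharing sets the paper measures for well-designed programs, the precise (non-broadcast) path dominates, which is the intuition behind the 3-4 pointer recommendation.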
Cache line sizes
as line size increases > larger invalidations
> data traffic goes up
> coherence traffic comes down
> overall traffic is minimized at a line size of 32 bytes.
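The trade-off above can be sketched with a toy traffic model (the numbers and the `locality` parameter are illustrative assumptions, not the paper's measurements): per-miss data traffic grows linearly with line size, while the number of fixed-size coherence messages falls only as long as larger lines capture real spatial locality, so total traffic bottoms out at an intermediate size.

```python
# Toy model of traffic vs. cache line size (illustrative only):
# misses per reference fall until the line size exceeds the program's
# spatial locality (assumed here to be 32 bytes), after which bigger
# lines just move more unused data.
def total_traffic(line_size, header_bytes=8, locality=32):
    miss_rate = 1.0 / min(line_size, locality)
    data = miss_rate * line_size           # bytes of data moved
    coherence = miss_rate * header_bytes   # fixed-size control messages
    return data + coherence

sizes = [4, 8, 16, 32, 64, 128, 256]
best = min(sizes, key=total_traffic)       # minimum at 32 in this model
```

The minimum lands at 32 bytes here only because the assumed locality is 32 bytes; the real curve's shape (data traffic up, coherence traffic down, minimum in between) is the point.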
Classification of data objects
Code and read-only data
Migratory data : high proportion of single invalidations
Mostly-read data : small invalidations
Frequently read/written objects : large invalidations, e.g. the count of processors waiting in a global queue
Sync objects : locks and barriers
low-contention sync objects : distributed locks, easy to implement, optimal for directory-based schemes
high-contention sync objects
distinguish : large invalidation > a single write to a line cached by many processors ; frequent invalidation > a line whose copies get invalidated at a high rate, even if each invalidation is small
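A small simulation (illustrative, not the paper's traces; the helper and workloads are made up) contrasts two of the classes above under write-invalidate: migratory data, where ownership passes from processor to processor, yields mostly single invalidations, while mostly-read data accumulates many readers between writes, so the occasional write invalidates many copies at once.

```python
# Sketch of invalidation-size behavior under write-invalidate:
# accesses is a list of (processor, "r" | "w"); each write invalidates
# every other cached copy and leaves the writer as sole holder.
def invalidation_sizes(accesses):
    sharers, sizes = set(), []
    for proc, kind in accesses:
        if kind == "r":
            sharers.add(proc)
        else:
            sizes.append(len(sharers - {proc}))  # copies invalidated
            sharers = {proc}
    return sizes

# migratory: each processor in turn reads then writes the object
migratory = [op for p in range(4) for op in [(p, "r"), (p, "w")]]
# mostly-read: eight readers accumulate, then one processor writes
mostly_read = [(p, "r") for p in range(8)] + [(0, "w")]

mig_sizes = invalidation_sizes(migratory)     # [0, 1, 1, 1]
mr_sizes = invalidation_sizes(mostly_read)    # [7]
```

After the first hand-off, every migratory write invalidates exactly one other copy, matching the "high proportion of single invalidations" note; the mostly-read write produces one large invalidation.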
Effect of Cache line size
large size :
> better hardware efficiency, prefetching, increase in message traffic (raises the minimum communication granularity between processors).
> parallel programs exhibit less spatial locality than sequential programs.
> false sharing becomes significant.
> more processors end up sharing a cache line (false sharing) > increase in the size of invalidations.
> spatial locality depends on the class of object.
> fewer messages of each type (control/data), but the size of each data message increases.
Of course : the best case is a cache line size equal to the size of the object that is shared.
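The false-sharing point can be made concrete with a simulation (a sketch of the effect, not a real memory system; the mapping and workload are invented): two processors each update their own variable, and when both variables map to one cache line, every write invalidates the other processor's copy, while placing each variable on its own line eliminates those invalidations.

```python
# Sketch of false sharing under write-invalidate: line_of maps each
# variable to a cache line; writes is a sequence of (processor, var).
# Each write invalidates every other processor's copy of that line.
def count_invalidations(line_of, writes):
    holders = {}          # cache line -> procs with a valid copy
    invalidations = 0
    for proc, var in writes:
        line = line_of[var]
        invalidations += len(holders.get(line, set()) - {proc})
        holders[line] = {proc}
    return invalidations

# proc 0 writes x, proc 1 writes y, strictly alternating
writes = [(0, "x"), (1, "y")] * 100

shared = count_invalidations({"x": 0, "y": 0}, writes)  # same line
padded = count_invalidations({"x": 0, "y": 1}, writes)  # own lines
```

With the two unrelated variables on one line, nearly every write triggers an invalidation; split onto separate lines (the software analogue of padding, or of matching line size to object size), the coherence traffic vanishes.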