(3.2.2) Cache Invalidation Patterns

Anoop Gupta, Wolf-Dietrich Weber: Cache Invalidation Patterns in Shared-Memory Multiprocessors. IEEE Trans. Computers 41(7): 794-810 (1992).

Conclusion : Directory-based schemes with 3-4 pointers per entry should work well for executing well-designed parallel programs.

cache line sizes
    increase > larger invalidations
             > data traffic goes up
             > coherence traffic comes down
             > overall traffic is minimized when line size = 32 bytes.

Classification of data objects
    Code and read-only data
    Migratory data                      high proportion of single invalidations
    Mostly-read data                    small invalidations
    Frequently read/written objects     large invalidations, e.g. the number of processors waiting in a global queue
    Sync objects                        locks and barriers
        low-contention sync objects     distributed locks, easy to implement, optimal for directory-based schemes
        high-contention sync objects

distinguish : large invalidation > a write to a line cached in many processors ; frequent invalidation > ...

Effect of Cache line size
large size:
    > better hardware efficiency and prefetching, but message traffic increases (a larger line raises the minimum communication granularity between processors).
    > parallel programs exhibit less spatial locality than sequential programs.
    > false sharing becomes significant.
    > more processors share a cache line (false sharing) > larger invalidations.
    > spatial locality depends on the class of object.
    > fewer messages of each type (control/data), but the size of each data message increases.

Of course : the best case is when the cache line size equals the size of the shared object.