Discussion of: "Exploiting Gray-Box Knowledge of Buffer-Cache Management"

Problem: An application wants to know the state of the buffer cache, e.g. to
reorder file accesses so that cached data is read first
  * The OS provides no interface to get this information
  * So the app can model the cache internally
     - But to do that it needs the cache size and replacement policy

DUST: a fingerprinting tool that determines OS buffer cache size and
replacement policy
  * Assumptions
     - The policy is based on a known set of attributes: frequency, FIFO
       order, recency, etc.
     - Otherwise quiet system
  * Finding size: easy (increase the amount of data accessed until re-reads
    start to miss)
  * Finding policy: harder
     - Use a file almost as big as the cache
     - Read it in once
     - Re-read parts of it in various orders and frequencies so that
       different policies would rank the blocks differently
     - Force some data to be evicted and see what is still cached
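
The size-finding step can be sketched against a toy cache. This is only an
illustration: LRUCache and estimate_cache_size are hypothetical names, the
cache is simulated rather than the real OS cache, and a real probe would
have to infer hits and misses from read timings on a quiet system.

```python
from collections import OrderedDict

class LRUCache:
    """Toy stand-in for the OS buffer cache (an assumption for this sketch)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # oldest (least recently used) first

    def access(self, block):
        """Return True on a hit, False on a miss, updating recency."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        self.blocks[block] = True
        return False

def estimate_cache_size(cache, max_blocks):
    """Grow the working set until a second pass over it sees a miss."""
    for n in range(1, max_blocks + 1):
        for b in range(n):                           # first pass: load blocks
            cache.access(b)
        hits = [cache.access(b) for b in range(n)]   # second pass: re-read
        if not all(hits):
            return n - 1    # the last working set that fit entirely
    return None             # cache is at least max_blocks large

size = estimate_cache_size(LRUCache(8), 32)
```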

DUST fingerprints of real systems are compared with fingerprints from a
simulation
  * Problems
     - Must visually inspect each fingerprint
     - Simulated policies are simplistic
        + OK if real system uses a simple policy like NetBSD
        + Not OK if real system uses complex combinations of policies (Solaris)
     - Limited number of known policies
     - Slow to run: time grows with cache size
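
A minimal sketch of comparing an observed fingerprint against simulated
ones, assuming only two simple policies (FIFO and LRU) and a fingerprint
reduced to "which blocks are still cached". The names simulate,
fingerprint_trace and best_match are hypothetical, and the scoring is an
illustration rather than the comparison the DUST authors perform.

```python
from collections import OrderedDict

def simulate(policy, capacity, trace):
    """Replay a block trace under 'fifo' or 'lru'; return resident blocks."""
    cache = OrderedDict()  # oldest entry first
    for block in trace:
        if block in cache:
            if policy == "lru":
                cache.move_to_end(block)  # only LRU reorders on a hit
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[block] = True
    return set(cache)

def fingerprint_trace(capacity, evict):
    """Read a cache-sized file, re-read its first half, then add pressure."""
    initial = list(range(capacity))
    reread = list(range(capacity // 2))
    pressure = [("other", i) for i in range(evict)]  # unrelated data
    return initial + reread + pressure

def best_match(observed, capacity, evict, policies=("fifo", "lru")):
    """Pick the simulated policy whose resident set differs least."""
    trace = fingerprint_trace(capacity, evict)
    return min(policies,
               key=lambda p: len(simulate(p, capacity, trace) ^ observed))

# Pretend the real system kept the re-read half of the file (LRU-like).
observed = set(range(4)) | {("other", i) for i in range(4)}
guess = best_match(observed, capacity=8, evict=4)
```

With more policies or noisy measurements the minimum may be ambiguous,
which is one way the "limited number of known policies" problem shows up.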


Application that uses Algorithmic Mirroring to simulate the cache and predict
its contents
  * App knows size, policy, and workload (assumes an otherwise quiet system)
  * Models contents of cache in parallel with the disk reads that it performs
  * Schedules shortest-job-first (reorders accesses so that data predicted to
    be in the cache is read first)
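
A minimal sketch of the mirroring idea, assuming the policy is plain LRU
and that the app's own reads are the only cache activity. MirrorLRU and
schedule are hypothetical names, and ordering by predicted misses is just
one way to approximate shortest-job-first.

```python
from collections import OrderedDict

class MirrorLRU:
    """App-level model of the cache, updated alongside the app's own reads."""
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # least recently used first

    def record_read(self, name, nblocks):
        """Mirror a sequential read of a whole file into the model."""
        for i in range(nblocks):
            key = (name, i)
            if key in self.blocks:
                self.blocks.move_to_end(key)
            else:
                if len(self.blocks) >= self.capacity:
                    self.blocks.popitem(last=False)
                self.blocks[key] = True

    def predicted_misses(self, name, nblocks):
        """Number of blocks of this file the model believes are not cached."""
        return sum((name, i) not in self.blocks for i in range(nblocks))

def schedule(mirror, files):
    """Order files so those with the fewest predicted misses are read first."""
    return sorted(files, key=lambda f: mirror.predicted_misses(f, files[f]))

mirror = MirrorLRU(capacity_blocks=4)
for name in ("a", "b", "c"):       # reading c pushes a out of the model
    mirror.record_read(name, 2)
order = schedule(mirror, {"a": 2, "b": 2, "c": 2, "d": 1})
```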


Could this information be obtained passively?
  * Would probably be hard to do without injecting additional probes
  * Because the app knows the rest of the workload, it could slowly increase
    the period of a probe to a particular file to determine cache size
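
The probe-period idea can be sketched in simulation. The assumptions are
strong and are stated here on purpose: an LRU cache, a workload that only
ever touches new blocks, and a probe whose hit or miss can be observed
directly. The names lru_access and probe_period_until_miss are
hypothetical.

```python
from collections import OrderedDict
from itertools import count

def lru_access(cache, capacity, block):
    """Access a block in a simulated LRU cache; return True on a hit."""
    if block in cache:
        cache.move_to_end(block)
        return True
    if len(cache) >= capacity:
        cache.popitem(last=False)  # evict least recently used
    cache[block] = True
    return False

def probe_period_until_miss(capacity, max_period):
    """Lengthen the gap between probes until the probed block is evicted."""
    cache = OrderedDict()
    workload = count()             # known workload: a stream of new blocks
    lru_access(cache, capacity, "probe")
    for period in range(1, max_period + 1):
        for _ in range(period):    # workload accesses between two probes
            lru_access(cache, capacity, next(workload))
        if not lru_access(cache, capacity, "probe"):
            return period          # under these assumptions, the cache size
    return None

estimate = probe_period_until_miss(capacity=5, max_period=20)
```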