









## Example: Array Iterator

<pre>
int sum = 0;
for (i = 0; i &lt; n; i++) {
    sum += a[i];
}
</pre>

Assume 'a' starts at 0x3000. Ignore instruction fetches.

| What virtual addresses? | What physical addresses?   |
|-------------------------|----------------------------|
| load 0x3000             | load 0x100C<br>load 0x7000 |
| load 0x3004             | load 0x100C<br>load 0x7004 |
| load 0x3008             | load 0x100C<br>load 0x7008 |
| load 0x300C             | load 0x100C<br>load 0x700C |

Each virtual load costs two physical loads: one for the PTE, then one for the data.

Aside: What can you infer?

- ptbr: 0x1000; PTEs are 4 bytes each
- VPN 3 -> PPN 7

The same PTE is accessed repeatedly because the program repeatedly accesses the same virtual page.
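
The physical addresses in the trace can be reproduced with a minimal sketch of a linear page table lookup. It assumes 4 KB pages (a 12-bit offset), which the slide does not state but which is consistent with 0x3000 falling in VPN 3. The function names `pte_address` and `translate` are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT  12u                         /* assumption: 4 KB pages */
#define PTE_SIZE    4u                          /* from the slide: PTEs are 4 bytes each */
#define OFFSET_MASK ((1u << PAGE_SHIFT) - 1u)

/* Physical address of the PTE that must be read to translate vaddr. */
uint32_t pte_address(uint32_t ptbr, uint32_t vaddr) {
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    return ptbr + vpn * PTE_SIZE;
}

/* Physical address of the data, given the PPN found in that PTE. */
uint32_t translate(uint32_t ppn, uint32_t vaddr) {
    return (ppn << PAGE_SHIFT) | (vaddr & OFFSET_MASK);
}
```

With ptbr = 0x1000 and VPN 3 mapped to PPN 7, `pte_address(0x1000, 0x3008)` yields 0x100C and `translate(7, 0x3008)` yields 0x7008, which are the two physical loads the trace shows for the virtual load of 0x3008.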

















## TLB Performance with Workloads

Sequential array accesses almost always hit in TLB

- Very fast!

What access pattern will be slow?

- Highly random, with no repeat accesses
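
The contrast between the two access patterns can be checked with a small miss counter. This is a sketch under stated assumptions, not a model of any real hardware: a 4-entry fully associative TLB with LRU replacement and 4 KB pages. The name `count_tlb_misses` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4       /* assumption: tiny TLB for illustration */
#define PAGE_SHIFT  12u     /* assumption: 4 KB pages */

/* Count TLB misses for a trace of virtual addresses, using LRU replacement. */
size_t count_tlb_misses(const uint32_t *trace, size_t n) {
    uint32_t vpn[TLB_ENTRIES] = {0};
    size_t last_used[TLB_ENTRIES] = {0};    /* 0 means the slot was never filled */
    size_t misses = 0;

    for (size_t t = 0; t < n; t++) {
        uint32_t v = trace[t] >> PAGE_SHIFT;
        int slot = -1;

        /* Hit if a filled slot already holds this VPN. */
        for (int e = 0; e < TLB_ENTRIES; e++) {
            if (last_used[e] != 0 && vpn[e] == v) { slot = e; break; }
        }

        if (slot < 0) {
            /* Miss: take the slot with the oldest timestamp (empty slots first). */
            misses++;
            slot = 0;
            for (int e = 1; e < TLB_ENTRIES; e++) {
                if (last_used[e] < last_used[slot]) slot = e;
            }
            vpn[slot] = v;
        }
        last_used[slot] = t + 1;            /* timestamps start at 1 */
    }
    return misses;
}
```

Under these assumptions, a sequential walk over 4-byte ints misses only once per page (once per 1024 accesses), while a trace that touches a different page on every access, and never returns to one before it is evicted, misses every time.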







## TLB Performance

Context switches are expensive

Even with ASIDs, other processes "pollute" the TLB

- Process A's TLB entries are discarded to make room for process B's entries
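
The pollution effect can be sketched with an ASID-tagged TLB. Everything here is an assumption made for illustration: the `tlb_entry` layout, the function names, and the 4-entry direct-mapped organization, which was chosen only to keep the sketch short. A lookup hits only when both the ASID and the VPN match, so no flush is needed on a context switch, but an insert by one process still overwrites another process's entry in the same slot.

```c
#include <stdint.h>

#define TLB_ENTRIES 4   /* assumption: tiny direct-mapped TLB for illustration */

typedef struct {
    int      valid;
    uint8_t  asid;      /* address space ID of the owning process */
    uint32_t vpn;
    uint32_t ppn;
} tlb_entry;

/* Returns 1 and sets *ppn on a hit; a hit requires both ASID and VPN to match. */
int tlb_lookup(const tlb_entry *tlb, uint8_t asid, uint32_t vpn, uint32_t *ppn) {
    const tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    if (e->valid && e->asid == asid && e->vpn == vpn) {
        *ppn = e->ppn;
        return 1;
    }
    return 0;
}

/* Installs a translation, overwriting whatever occupied the slot. */
void tlb_insert(tlb_entry *tlb, uint8_t asid, uint32_t vpn, uint32_t ppn) {
    tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    e->valid = 1;
    e->asid  = asid;
    e->vpn   = vpn;
    e->ppn   = ppn;
}
```

If process A (ASID 1) installs VPN 3 and process B (ASID 2) later installs its own VPN 3, A's next lookup of VPN 3 misses even though the TLB was never flushed.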

Architectures can have multiple TLBs

- 1 TLB for data, 1 TLB for instructions
- 1 TLB for regular pages, 1 TLB for "super pages"



## TLB Performance

How can the system improve TLB performance (hit rate) given a fixed number of TLB entries?

Increase page size

- Fewer unique translations needed to access the same amount of memory
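
The effect of page size can be made concrete with a little arithmetic. The sketch below uses hypothetical helper names, and the page sizes (4 KB and 2 MB) and the 64-entry TLB in the usage note are illustrative assumptions rather than values from the slides.

```c
#include <stdint.h>

/* Number of distinct pages covered by the byte range [start, start + len), len > 0. */
uint64_t pages_touched(uint64_t start, uint64_t len, unsigned page_shift) {
    uint64_t first = start >> page_shift;
    uint64_t last  = (start + len - 1) >> page_shift;
    return last - first + 1;
}

/* TLB reach: bytes of memory a full TLB can map at once. */
uint64_t tlb_reach(uint64_t entries, unsigned page_shift) {
    return entries << page_shift;
}
```

For a 4 MB region starting at address 0, `pages_touched` gives 1024 translations with 4 KB pages (`page_shift` = 12) but only 2 with 2 MB pages (`page_shift` = 21). Likewise, a 64-entry TLB reaches 256 KB with 4 KB pages and 128 MB with 2 MB pages.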









