Lightweight Remote Procedure Call

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy, University of Washington

ACM Transactions on Computer Systems, February 1990, pages 37-55

Problem

Brian's Note

Observations

Brian's Note

Design and Implementation of LRPC

Execution Model

Client calls server procedure = kernel trap

Then the kernel

validates caller

creates call linkage

dispatches the client thread to the server domain

client thread runs in server's address space with the argument stack shared between them

Binding

Binding is done at the granularity of interface--a set of procedures

A server module exports an interface

LRPC runtime library (server clerk) registers the interface with a name server

A client binds to the interface by making an import call to the kernel

The kernel notifies the server's waiting clerk

The clerk replies to the kernel with a list of procedure descriptors (PDs):

one PD per procedure in the interface

contains:

an entry address in the server domain (= address space)

the size of A-stack (Argument stack): for arguments and return value

For each PD

the kernel pairwise allocates an A-stack in the client and server domains

this A-stack is read-write shared by the client and server

the kernel allocates a linkage record for the A-stack

The kernel returns to the client

a Binding Object -- an unforgeable certificate to access the server's interface

List of A-stacks for procedures in the interface
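The binding steps above can be sketched as a small model. This is an illustration only, not the paper's implementation: the class names (`ProcedureDescriptor`, `LinkageRecord`, `Binding`, `Kernel`), the use of a random token as the Binding Object, and a `bytearray` standing in for memory mapped into both domains are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable
import secrets


@dataclass
class ProcedureDescriptor:
    """One PD per exported procedure (fields follow the notes above)."""
    entry: Callable      # stands in for the entry address in the server domain
    astack_size: int     # bytes reserved for arguments and return value


@dataclass
class LinkageRecord:
    """Kernel-private record for the caller's return state (filled at call time)."""
    return_address: object = None
    stack_pointer: object = None


@dataclass
class Binding:
    """What the kernel hands back to the client after an import."""
    binding_object: str                          # hard-to-guess token (assumption)
    astacks: list = field(default_factory=list)  # one A-stack per procedure


class Kernel:
    def __init__(self):
        self.exports = {}    # interface name -> list of PDs supplied by the clerk
        self.bindings = {}   # binding object -> (PDs, A-stacks, linkage records)

    def export(self, name, pds):
        """The server clerk registers an interface (name server omitted)."""
        self.exports[name] = pds

    def import_interface(self, name):
        """The client binds: allocate an A-stack and linkage record per PD."""
        pds = self.exports[name]
        # The same bytearray object is visible to both sides, which models
        # the read-write sharing of the A-stack between client and server.
        astacks = [bytearray(pd.astack_size) for pd in pds]
        linkages = [LinkageRecord() for _ in pds]
        token = secrets.token_hex(16)
        self.bindings[token] = (pds, astacks, linkages)
        return Binding(token, astacks)
```

The kernel keeps the linkage records to itself; the client only ever sees the token and the A-stacks, which is what lets the kernel validate both at call time.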

Calling

Client calls user-stub

The stub puts arguments into the A-stack given by the kernel at the time of binding

The stub places the following into registers and traps to the kernel

Binding Object

A-stack address for the procedure

procedure id

The kernel executes in the context of the client thread

verifies Binding Object, A-stack

locates the corresponding linkage record

puts the return address and the current stack pointer into the linkage record

finds an E-stack (Execution stack) of the server

creates a new E-stack or allocates one from the pool

updates the thread's user stack pointer--presumably the SP register--to run off the server's E-stack

note that the thread is the client thread

reloads the processor's virtual memory registers--base register--with those of the server domain

Note that the last two steps together perform the lightweight context switch

performs upcall into server-stub

Server-stub

calls the server, which executes with the A-stack and E-stack

when the server returns, traps to the kernel

the kernel does the lightweight context switch back to the client address space

Client-stub again

reads the return value from the A-stack

returns the result to the client
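The call path above can be traced in a toy simulation. Everything here is a hypothetical stand-in: `Domain`, `Thread`, `Kernel`, and the two `add_*_stub` functions are invented for the sketch, a single A-stack is used for brevity, string labels replace real stacks, and an ordinary function return models the server's trap back to the kernel.

```python
import struct


class Domain:
    """Stands in for an address space; `name` is just a label."""
    def __init__(self, name):
        self.name = name


class Thread:
    def __init__(self, domain):
        self.domain = domain        # address space the thread currently runs in
        self.stack = "client-stack"


class Kernel:
    def __init__(self, server_domain, procedures):
        self.server_domain = server_domain
        self.procedures = procedures      # procedure id -> server stub
        self.astack = bytearray(16)       # shared A-stack (one, for brevity)
        self.binding_object = object()    # opaque capability
        self.linkage = None
        self.estack_pool = ["E-stack-0"]

    def call(self, thread, binding_object, astack, proc_id):
        # Verify the Binding Object and the A-stack.
        if binding_object is not self.binding_object or astack is not self.astack:
            raise PermissionError("invalid binding or A-stack")
        # Save the caller's return state in the linkage record.
        self.linkage = (thread.domain, thread.stack)
        # Take an E-stack from the pool, or create one.
        estack = self.estack_pool.pop() if self.estack_pool else "E-stack-new"
        # Lightweight context switch: new stack, new address space, same thread.
        thread.stack, thread.domain = estack, self.server_domain
        # Upcall into the server stub.
        self.procedures[proc_id](thread, astack)
        # Return path: recycle the E-stack and restore the caller's state.
        self.estack_pool.append(estack)
        thread.domain, thread.stack = self.linkage


def add_server_stub(thread, astack):
    """Server stub: reads arguments from the A-stack, writes the result back."""
    assert thread.domain.name == "server"   # the client thread, in the server domain
    a, b = struct.unpack_from("<ii", astack, 0)
    struct.pack_into("<i", astack, 8, a + b)


def add_client_stub(kernel, thread, a, b):
    """Client stub: the only argument copy is into the A-stack."""
    struct.pack_into("<ii", kernel.astack, 0, a, b)
    kernel.call(thread, kernel.binding_object, kernel.astack, proc_id=0)
    return struct.unpack_from("<i", kernel.astack, 8)[0]


client, server = Domain("client"), Domain("server")
kernel = Kernel(server, {0: add_server_stub})
thread = Thread(client)
result = add_client_stub(kernel, thread, 2, 3)   # result is 5
```

The point of the sketch is that no second thread is ever created or scheduled: the same `Thread` object changes its stack and domain on the way in and gets them back on the way out.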

Stub Generation

Two types of stubs are automatically generated from a Modula-2+ definition file

Simple and fast stub in assembly language for most cases

Complex and general stub in Modula-2+ for complex arguments, exception handling, etc.

LRPC on MP

Locking mechanism is required for A-stacks

Further reduced context switch

On a single-processor machine, even the lightweight context switch incurs substantial overhead:

TLB misses

register updates

bringing the page table into memory

Context switch on MP

Popular servers' contexts (= processes) are cached on idle processors

When a client calls the server procedure, the kernel exchanges the caller's processor with the server's processor

On return, kernel exchanges those processors back
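A minimal sketch of the processor exchange described above, assuming a simple list of processors that the kernel scans for an idle one with the target domain already loaded. The names `Processor`, `find_cached`, and `call_transfer` are hypothetical, and domains are plain strings.

```python
class Processor:
    def __init__(self, cpu_id, loaded_domain, idle=False):
        self.cpu_id = cpu_id
        self.loaded_domain = loaded_domain   # domain whose context is loaded
        self.idle = idle


def find_cached(processors, domain):
    """Return an idle processor already holding `domain`'s context, if any."""
    for p in processors:
        if p.idle and p.loaded_domain == domain:
            return p
    return None


def call_transfer(thread_cpu, processors, target_domain):
    """Pick the processor the calling thread continues on in `target_domain`.

    Returns (processor, switched), where `switched` says whether the
    virtual memory registers had to be reloaded.
    """
    cached = find_cached(processors, target_domain)
    if cached is not None:
        # Exchange: the thread moves to the cached processor, and the
        # original processor idles with the old context still loaded,
        # ready for the return.
        cached.idle = False
        thread_cpu.idle = True
        return cached, False
    # No cached context: fall back to a switch on the same processor.
    thread_cpu.loaded_domain = target_domain
    return thread_cpu, True


cpu0 = Processor(0, "client")
cpu1 = Processor(1, "server", idle=True)
cpus = [cpu0, cpu1]
on_call, switched_call = call_transfer(cpu0, cpus, "server")       # moves to cpu1
on_return, switched_ret = call_transfer(on_call, cpus, "client")   # back to cpu0
```

The same function serves both directions, since the return is just an exchange back onto the processor that kept the client's context loaded.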

Argument Copying

Conventional RPC: 4 times

user-stub -> RPC message -> kernel -> RPC message -> server-stub

LRPC: once

user-stub -> A-stack
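The copy counts above can be made concrete with a toy model that counts buffer-to-buffer copies. The helper names (`CopyCounter`, `conventional_rpc`, `lrpc`) and the buffer sizes are assumptions for illustration; the stages follow the two chains listed in the notes.

```python
class CopyCounter:
    def __init__(self):
        self.count = 0

    def copy_into(self, dst, src):
        """Copy src into the front of dst and count it as one copy."""
        self.count += 1
        dst[:len(src)] = src
        return dst


def conventional_rpc(args, counter):
    n = len(args)
    client_msg = counter.copy_into(bytearray(n), args)         # user stub -> RPC message
    kernel_buf = counter.copy_into(bytearray(n), client_msg)   # RPC message -> kernel
    server_msg = counter.copy_into(bytearray(n), kernel_buf)   # kernel -> RPC message
    return counter.copy_into(bytearray(n), server_msg)         # RPC message -> server stub


def lrpc(args, astack, counter):
    counter.copy_into(astack, args)          # user stub -> shared A-stack
    # The server reads the arguments in place; a memoryview does not copy.
    return memoryview(astack)[:len(args)]


args = b"\x01\x02\x03\x04"
conventional, shared = CopyCounter(), CopyCounter()
seen_by_server_rpc = conventional_rpc(args, conventional)
seen_by_server_lrpc = lrpc(args, bytearray(64), shared)
# conventional.count is 4, shared.count is 1
```

In both cases the server sees the same bytes; the difference is only in how many intermediate buffers the data passes through.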