We first address the common concerns and then answer the review-specific questions for
each reviewer.
------------------------------------------------------------------------------------------

We thank the reviewers for their detailed feedback, and take encouragement from their
words: "The idea of slightly tweaking power management code in drivers to save and restore
device is so clever that we should accept the paper for that alone." (Review 1), "The
paper presents a fresh new-look to driver and device recovery" (Review 4), and "The
authors of this paper have done an impressive amount of engineering" (Review 3).

Our primary contribution is twofold: first, we show how to re-use existing power
management code to provide device checkpoints, and have designed a fault tolerance system
based on it (as identified by Reviews 1,2,4,5). Second, we demonstrate a transactional
isolation model that can improve performance by selectively isolating specific functions.

Review 1 is not clear about why the cost of checkpoints is so low when compared to device
restart. We briefly mention this in Section 4.2: device initialization performs a full
probe sequence to determine the type, capabilities and environment for the device, while
FGFT merely reloads an existing configuration. We will describe this in detail in
subsequent revisions of the paper.

Reviews 1, 2, and 3 ask what types of faults we handle and question if they are more limited
than related work (Nooks). FGFT traps on all processor exceptions (NULL pointer exception,
general protection fault, alignment fault, divide error (divide by zero), missing
segments, and stack faults) apart from memory errors. It also detects malformed data
structures during marshaling. Compared to Nooks, it may not detect corruption that occurs
in one call and is accessed in another, although the marshaling may detect such
corruption. In addition, due to automated generation of marshaling code, FGFT does not
include the explicit parameter checks. Recent work on statically determining kernel
entrypoint pre- and post-conditions [1] could be used to detect more faults automatically.
Our fault injection tests (5.1) used different bug types (Table 2), which manifest as
memory violations or as one of the above processor exceptions.

Review 3: "This reviewer believes the paper should be rejected because it is long on
engineering and short on science." We politely disagree. Our novel contribution is device
checkpointing, which can be used for a variety of uses apart from fault tolerance (Table 1),
and the idea of transactional isolation. Within fault tolerance, the availability of
checkpointing introduces a fundamentally different way to think about driver isolation.
Furthermore, to clearly demonstrate its value and overheads, we implemented a driver
isolation and a driver recovery solution, which has made the paper heavy on implementation
and engineering. In subsequent revisions of the paper, we will better describe our research
contributions of device checkpoints and in-kernel SFI using marshaling. However, we
also believe that rigorous engineering is one of our important contributions.

Reviews 3,4 ask whether device checkpointing can work if there are bugs in power
management code. The answer is no. However, FGFT represents a new place in the tradeoff
between complexity of implementation and fault tolerance: Nooks and other full isolation
systems require much more hand-written code in the OS, but isolate the entire driver.
Microdrivers isolate only non-critical path code. FGFT isolates all code except for
checkpoint/restore, which is typically less than 5% of the driver code.

Reviews 1,3,4,5 discuss selective isolation and where it is useful. FGFT is most useful if
it can be determined that specific entrypoints are likely to contain faults, such as if
they had recent patches or were flagged by static bug-finding tools. Past work on
Microdrivers showed that bugs do not have a higher density in the I/O path code and that
such code is a small fraction of total driver code. In subsequent revisions of the paper,
we will demonstrate this with an example (Review 1).

Reviews 2,3,4 discuss our synchronization policy. FGFT uses lazy version management and
holds locks until the entry point completes successfully. This isolates threads from each
other, and ensures that conflicts between concurrent threads are impossible. It is
possible that some locking patterns could lead to deadlocks, but we have not seen those
patterns in any of the drivers we have examined. Furthermore, deadlock could be detected
at lock acquire time and handled by aborting one of the threads involved.
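
The lazy version management described above can be illustrated with a minimal user-space
sketch. This is not FGFT's implementation: the names drv_state, entry_fn, and run_isolated
are hypothetical, a fault is simulated by a nonzero return value rather than a trapped
processor exception, and the driver lock and the device checkpoint restore are only
indicated in comments.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical driver state; a real driver structure is far larger. */
struct drv_state {
    int mtu;
    int link_up;
};

/* Hypothetical entry point signature: returns 0 on success, nonzero on a fault. */
typedef int (*entry_fn)(struct drv_state *);

/*
 * Run one entry point against a private copy of the state (lazy versioning).
 * Updates reach the live state only if the call completes successfully.
 */
static int run_isolated(struct drv_state *live, entry_fn fn)
{
    struct drv_state shadow;

    assert(live != 0 && fn != 0);

    /* The driver's existing lock would be acquired here and held until return. */
    shadow = *live;             /* copy in */

    if (fn(&shadow) != 0) {
        /* Fault: discard the shadow copy. A device checkpoint restore would go here. */
        return -EIO;            /* fail the call instead of crashing */
    }

    *live = shadow;             /* copy out (commit) */
    return 0;
}

/* Two hypothetical entry points used to exercise the sketch. */
static int set_mtu_ok(struct drv_state *s)     { s->mtu = 9000; return 0; }
static int set_mtu_faulty(struct drv_state *s) { s->mtu = -1;   return -1; }
```

Calling run_isolated with set_mtu_faulty leaves the live state unchanged and returns -EIO,
while set_mtu_ok commits the new value.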

Review 2 compares us with past transactional systems TxOS (SOSP 09), TxLinux (SOSP 07) and
xCalls (Eurosys 09). These systems do not perform device I/O transactionally and either
rely on higher-level atomicity techniques (TxOS and xCalls) or serialize transactions with
a lock (TxLinux). We believe device checkpointing is a useful contribution and would be
excited to see it applied to other applications.

[1] Diagnosys: Automatic generation of a debugging interface to the Linux kernel. In ASE
2012.

------------------------------------------------------------------------------------------

We now discuss important concerns raised by reviewers that are not discussed above.
Questions are given in quotation marks, and answers are marked with an arrow (==>):

Review 1: 
=========

The questions in this review were discussed above. We will apply the helpful suggestions
mentioned in this review.

Review 2: 
=========

"Also, there appear to be limitations in supporting disks that the authors gloss over
(with a reference to Membrane [39].",  "Where is the USB mass storage device?  Is its
absence due to the problem of persistent storage(4.1.4)?  Or performance overhead of
copyin/out (sec 3)?"

==> Drivers managing persistent state will not recover that state via FGFT. For example, a
faulty disk driver could write to the wrong block, and neither FGFT nor Membrane would
solve that problem. For failures that do not write to the wrong block, FGFT provides
at-most-once failure semantics. The overhead of FGFT with storage devices is likely to be
lower than with network devices because the data itself need not be copied; only the
descriptors pointing to the data. In addition, storage devices typically generate fewer
requests per second: the network driver described handled 70,000 packets/second, each of
which took a separate driver invocation.

"Finally, it is a bit suspicious that the authors used a 3 year old kernel (2.6.29 was
released 3/09) for their evaluation. "

==> We used an older kernel only because we were already using it for other projects; our
tools have no dependency on a particular kernel version.

"I am most concerned with how FGFT must take exclusive access to the device to take a
checkpoint....The actual locking that must take place, especially during device callbacks
is ad-hoc often difficult to determine.  "

"Can you please analyze the drivers (even classes you didn't evaluate) to convince me that
this isn't a hopeless task for entire classes of drivers?" "You have just modified the
kernel locking convention, and how to you guarantee you won't deadlock? "

"I also don't understand how copyin/out can work if there are multiple threads ever let
into the driver, even after configuration.  "

==> In general, FGFT does not let other driver threads execute while one thread is
executing in isolation. Hence, there can be no conflicts between concurrent threads. It
does this by re-using existing locks present in the driver and expanding their scope for
isolated calls to cover the entire call. In the next version of the paper we will add an
analysis of more drivers to demonstrate that resynchronization is not a widespread
problem. For code that could deadlock if it holds locks across a kernel call, it may not be
possible to use FGFT; however, it can still be applied to other entry points in the driver.

"Comparisons with Mondrix (SOSP '05)"

==> Mondrix offers cheap memory protection using specialized hardware.  But it does not
create a copy of data being accessed and cannot provide rollback. We will compare our SFI
to systems like Mondrix in the paper.


Review 3: 
=========

"Assumptions in the paper/system that are not addressed or validated in the paper (in
priority order):

1) Memory safety violations are the primary cause of driver failures. This neglects other
causes of driver failure including race conditions, lock inversions, state machine errors,
errors in logic, etc.  Key unanswered question: what fraction of driver failures are cause
by memory safety violations?"

==> Unfortunately there is no data available on the causes of driver crashes; from
preliminary analysis of available kernel crash dumps, almost all driver-related crash
failures are due to memory safety problems. FGFT can address lock inversions via detecting
deadlock and aborting a call, but cannot automatically address state machine errors or
logic errors. Then again, no other automatic fault-tolerance system can handle these.

"2) SFI isolation can automatically separate locking and ordering operations for memory
accesses.  Key unanswered question: how does isolator identify and reactor locking
operations."

==> We use static analysis to detect locking operations in suspect code, based on the
common Linux kernel locking functions.

"is refactoring power management code into checkpoint and restore code automatically
possible?  if it can't be done automatically, how much domain and device expertise is
required to do it manually?"

==> We show in the evaluation how few changes were required to refactor driver code, and
we have no special knowledge of these devices. While automating the conversion for all
drivers would likely be difficult or impossible, it could be done for drivers with simple
suspend/resume routines (which are the majority). We will look at analyzing more drivers
to evaluate this question.

"if the system is so easy to apply, why wasn't it applied to 60 or 600 drivers instead of
just 6?  what about more complex drivers like queuing storage drivers or graphics
drivers?"

==> Evaluating a driver requires having the device present, and it would be expensive to
purchase 60 or 600 devices. We evaluate with a comparable number of drivers to past work
on driver fault tolerance, and we augment that with statistics about all drivers. We agree
that the technique may not apply to all drivers, particularly complex ones such as
graphics. However, that does not reduce the value it offers to all other drivers. No prior
driver fault tolerance paper has been able to address graphics drivers.


"Does refactoring of power management code for checkpointing violate any ordering
assumptions in the code?"

==> Existing suspend/resume code does have assumptions that we address by acquiring driver
locks before suspend. In Section 4.1.3 we discuss changes needed to perform checkpoints in
interrupt or atomic contexts.

Review 4: 
=========

"From my understanding of paragraph 4 in section 2.1, I gather that you are assuming every
driver invocation is state-less, i.e., one invocation does not affect the next. "

==> We will clarify the text, as that is not our assumption: we acquire locks during
isolated driver calls so that the state from one call is available to the next call.


"While the driver-state touched is explicitly annotated by the user, it is unclear how the
kernel-state touched is identified."

==> We currently do this manually. However, Isolator can identify statically which fields
have been modified by the driver prior to a kernel call, and which fields are accessed
afterwards. These are the fields that must be synchronized with the kernel.

"Can the authors provide an example of the kinds of structures that were touched and which
fields were copied, as opposed to the entire structure?"

==> We will add an example to the next version of the paper.

For example, suppose the driver issues an ioctl that updates the driver's internal private
structure (usually pointed to by netdev->priv, where netdev is the kernel's netdevice). In
such cases, FGFT uses points-to analysis to pre-determine the fields touched, such as
netdev->priv->tx_ring and netdev->priv->rx_ring, and generates marshaling code to copy
in/out only these fields (rather than the complete netdev or netdev->priv). This reduces
marshaling code and unnecessary copying.
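
The field-level copying in this example can be sketched in user-space C as follows. This
is a minimal illustration, not the code Isolator generates: the layout of drv_priv, the
ring size, the stats field, and the helper names marshal_in, marshal_out, and
marshaled_bytes are all hypothetical; only the field names tx_ring and rx_ring come from
the example above.

```c
#include <assert.h>
#include <string.h>

#define RING_SIZE 4

/* Hypothetical stand-in for the structure behind netdev->priv. */
struct drv_priv {
    int  tx_ring[RING_SIZE];   /* reported as touched by the isolated ioctl */
    int  rx_ring[RING_SIZE];   /* reported as touched by the isolated ioctl */
    char stats[4096];          /* not touched, so never copied */
};

/* Copy in: only the fields the analysis reports as accessed. */
static void marshal_in(struct drv_priv *shadow, const struct drv_priv *kern)
{
    assert(shadow != 0 && kern != 0);
    memcpy(shadow->tx_ring, kern->tx_ring, sizeof kern->tx_ring);
    memcpy(shadow->rx_ring, kern->rx_ring, sizeof kern->rx_ring);
}

/* Copy out: intended to run only after the isolated call completes without a fault. */
static void marshal_out(struct drv_priv *kern, const struct drv_priv *shadow)
{
    assert(shadow != 0 && kern != 0);
    memcpy(kern->tx_ring, shadow->tx_ring, sizeof shadow->tx_ring);
    memcpy(kern->rx_ring, shadow->rx_ring, sizeof shadow->rx_ring);
}

/* Bytes copied in each direction, to compare against sizeof(struct drv_priv). */
static size_t marshaled_bytes(void)
{
    return sizeof(((struct drv_priv *)0)->tx_ring) +
           sizeof(((struct drv_priv *)0)->rx_ring);
}
```

In this sketch each direction copies only the two rings, while the much larger stats
array is left alone.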


" How were the time-related measurements in section 5.3 done? What is the error margin of
the measurement?"

==> We read the processor's timestamp counter (TSC) via rdtscll, which gives very high
precision over short intervals, and we report the average of 5 runs.
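
A user-space sketch of this style of measurement is shown below. It is an approximation
under stated assumptions, not our in-kernel harness: it uses the compiler intrinsic
__rdtsc() in place of the kernel's rdtscll, falls back to clock_gettime on non-x86
machines, and the names read_ticks, average_ticks, and sample_work are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

#define RUNS 5

/* Read a tick counter: the TSC on x86, otherwise nanoseconds from a monotonic clock. */
static uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Time fn() RUNS times and return the mean elapsed ticks per run. */
static uint64_t average_ticks(void (*fn)(void))
{
    uint64_t total = 0;

    assert(fn != 0);
    for (int i = 0; i < RUNS; i++) {
        uint64_t start = read_ticks();
        fn();
        total += read_ticks() - start;
    }
    return total / RUNS;
}

/* Hypothetical workload standing in for the operation being measured. */
static volatile unsigned sink;
static void sample_work(void)
{
    for (unsigned i = 0; i < 100000; i++)
        sink += i;
}
```

Calling average_ticks(sample_work) returns the mean tick count over the 5 runs. The sketch
does not serialize the instruction stream around the counter reads, so very short
intervals may be slightly skewed.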

Review 5:
=========

"cost of protection: 20+ us. In other words, the approach adds 60,000 cycles to each
driver entry point that needs to be protected."

==> FGFT explicitly trades a higher per-invocation latency for fewer isolated invocations,
by isolating only selected entry points. If the majority of entry points require isolation
then whole-driver techniques are more useful. However, if it is possible to identify one
or two suspect entry points, then FGFT can have much lower cost.
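
As a rough back-of-the-envelope illustration of this tradeoff, the sketch below multiplies
a call rate by a per-call cost. The 20 us cost is the reviewer's estimate and the 70,000
calls/second rate is the network driver figure quoted earlier in this response; the rate
of 10 calls/second for a rarely used entry point is an assumed value for illustration
only, and the function name isolation_cpu_fraction is hypothetical.

```c
#include <assert.h>

/* Fraction of one CPU consumed by isolation overhead at a given call rate. */
static double isolation_cpu_fraction(double calls_per_sec, double cost_us)
{
    assert(calls_per_sec >= 0.0 && cost_us >= 0.0);
    return calls_per_sec * cost_us / 1e6;   /* microseconds of overhead per second */
}
```

Under these assumptions, isolating a per-packet entry point (70,000 calls/second at 20 us
each) would cost about 1.4 CPU-seconds per second, whereas an entry point invoked 10 times
per second would cost about 0.02% of one CPU.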

"knowing what to protect (the method is probably too expensive to protect everything)."

==> In the paper we suggest using static bug-finding tools and applying isolation to
recently patched code. In addition, crash dump stack traces could be used to identify
candidates for isolation.

"which class of bugs does it actually help against (e.g., would a restarted driver after
checkpoint resumption just fail again the same way, in the common case?)."

==> For bugs on the common-case path, the system can recover by re-invoking the call only
if they are heisenbugs. For other bugs, it can still prevent a crash by failing the call
instead of re-invoking it. This allows higher-level recovery techniques, such as unloading
and reloading the driver, to be applied that are more likely to resolve persistent faults.
In addition, the system can keep buggy code on an uncommon path from affecting the common
case.


"do the performance data presented account for the possibility that the method forces more
serialization than would otherwise be needed for the driver (e.g., multiqueue nics, etc)."

==> Yes, FGFT imposes more serialization while invoking isolated calls. However, if FGFT
is applied to non-critical path code, this is unlikely to reduce performance.

"what assumptions are made about drivers for stateful devices like disks where a
checkpoint can't include the data being written to disk?"

==> We assume the driver will not write to the wrong block on disk and that it either
writes the correct data or no data at all. This is similar to past work on file-system
recovery such as Membrane.