Student
The paper surveys distributed systems as of the 80s and tries to predict
the future line of research. The main goal of the distributed systems
of the 80s was to provide a price/performance advantage over
mainframes (as workstations started getting cheaper). In doing so, these
early systems also focused on providing a higher degree of availability
and fault tolerance compared to a centralized system.
These systems focused on distributing every OS functionality
transparently. Systems like Locus (I think it was around the same time)
were tightly coupled as opposed to the systems today. The focus was on
having a distributed operating system rather than building distributed
service on top of a vanilla OS. These systems mainly aimed at
parallelizing the computation tasks of the users and were typically
small scale. Even though data was distributed, it was not emphasized as
it is today. Further, these systems didn't decouple storage and
computation.
These systems focused on communication primitives and resource
management. Also, the systems were not very resilient to faults, and
fault tolerance is cited as a future research area. As I said before,
the focus was on small-scale, tightly coupled systems with a
focus on computation intensive tasks. Systems today (like clusters and
p2p) are loosely coupled and carefully choose which services to distribute.
Also systems today are highly scalable and focus a lot on fault
tolerance. Data and computation are decoupled, and the focus of most
systems today is on distributing data. Data, as a resource, distributes better
than computation, which is probably why distributed file systems stayed
on as opposed to distributed OSes. I think most of today's paradigm was
mainly because of the WWW.
Student
The paper is a good discussion of the goals, key design issues and examples
of distributed systems as of 1985. The goals of these systems as described
in the paper are:
1. To have a global, systemwide operating system instead of each computer
having its own OS.
2. Dynamic allocation of processes to various CPUs which is transparent to
the user, instead of the user having to do a remote login to use another
machine.
3. Transparent placement of files by the system, instead of having the
user worry about the specific machine on which they are located.
4. A high degree of reliability (not losing data) and availability
(crashing of one process or processor allowing the system to proceed
normally).
However, there seem to be a few assumptions (in terms of workload and
environment) that bear upon the design and implementation aspects detailed
in the paper. In particular, the systems seem to assume that the
processors available in the system will only leave the system (in event of
a crash). The design does not seem to accommodate dynamic addition of
resources (CPU, disk space etc.) as is common in a mobile environment of
today with laptops and other mobile devices. Also, the system design
focuses only on the various aspects of allocating CPU to the workloads,
thus implicitly assuming that they would be mainly CPU-intensive. Thus I/O
intensive and memory intensive workloads seem to have been deemed not
important or not frequently used in the design of these systems.
The state of the art for distributed systems at this time was to use the
client-server model with some form of RPC as the communication base. Thus
the systems were not robust for applications demanding multicast or
broadcast semantics (like video conferencing). The name servers were
straightforward (e.g., the Cambridge system had a single name server),
thus limiting the
scalability of the system. Resource allocation was either not handled by
the system (e.g., in Eden, it was done by the underlying existing operating
systems) or focused only on the processor banks (e.g., Cambridge, Amoeba, V).
Fault tolerance was limited and was compromised in favour of performance
(Cambridge, Amoeba, V) or very inefficient to use (e.g., checkpointing in
Eden was very slow).
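The connectionless client-server RPC pattern these systems relied on can be sketched minimally in Python. The UDP loopback transport and the "op:arg" message format here are my own illustrative choices, not any surveyed system's actual protocol:

```python
import socket
import threading

def serve_forever(sock):
    """Toy server loop: parse 'op:arg' requests and reply to the sender."""
    while True:
        data, addr = sock.recvfrom(1024)
        op, _, arg = data.decode().partition(":")
        result = str(len(arg)) if op == "strlen" else "error"
        sock.sendto(result.encode(), addr)

def rpc_call(server_addr, op, arg, timeout=2.0):
    """Client stub: send one datagram, block for the reply. Lost
    messages simply time out; there is no retry or at-most-once logic."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(f"{op}:{arg}".encode(), server_addr)
        reply, _ = s.recvfrom(1024)
        return reply.decode()

# Start a toy server on an ephemeral loopback port.
srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))
threading.Thread(target=serve_forever, args=(srv,), daemon=True).start()

print(rpc_call(srv.getsockname(), "strlen", "hello"))  # → 5
```

The request/reply pairing over an unreliable datagram is exactly where the fragility mentioned above comes from: anything beyond one client and one server (multicast, broadcast, retransmission semantics) has to be bolted on.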
Student
According to Tanenbaum, the primary goal of distributed systems in
1985 was to exploit the "price/performance advantage" of
microprocessors. In other words, for a given price, the aggregate
computing power of many smaller systems was greater than that of a
few larger systems -- so if new operating systems could enable easier
(or even transparent) use of such multiple distributed resources, in
theory users could enjoy more bang for the buck (or maybe just the
same bang for less buck).
Secondary goals were incremental resource growth (the smaller the
units, the more gradually you can scale), software simplicity through
the assignment of discrete services to discrete processors, and
improved fault-tolerance by allowing distributed OS components to
fail independently of one another and potentially restart on
functioning resources elsewhere.
One implicit assumption of the four reviewed systems, according to
Tannenbaum, is that given a tradeoff between performance and
reliability, users value the former somewhat more than the latter.
This is reflected in design decisions, e.g. the lack of atomic file
operations in the V system, and in actual user practice, e.g., the
infrequent use of Eden's expensive object-checkpointing mechanism.
In terms of workload and environment, the systems are designed to fit
one (or more) of three broad scenarios: a small group of federated
but independent multi-user minicomputers; a collection of single-user
workstations sharing common services (e.g., a fileserver); or a large
pool of processors, more than one of which may be dedicated to a
given user at one time.
In terms of design, the state of the art at the time, as evidenced by
Amoeba, included a shift away from monolithic kernels toward
microkernels, where key OS services operated either in user space, or
were at least abstracted into clear objects or modules within kernel
space. The use of capabilities to achieve protection was also
present in two of the four systems surveyed.
Interestingly, although Tanenbaum doesn't go into great detail on
the actual users of the four implemented systems he surveys, the
least-advanced system in terms of design (Cambridge) appears to have
usefully served the largest actual community, consisting of over 90
machines across three sites. And this without objects, capabilities,
etc.
Student
One goal of distributed systems at that time was to provide a way to
share computation among many processors. While distributing data among
the entire system was certainly important, the distribution of
computation was arguably the more interesting technical problem. This
is in contrast to current times where we see somewhat of a shift, with
many large distributed systems such as P2P networks more focused on
distributing data versus computation. Another goal was to provide
unified services and protection. For example, it was important to be
able to have a system-wide service that provided files and also
provided user-level protection to those files. Lastly, it was
important for the distributed systems to have some amount of fault
tolerance. While none of the four specific systems mentioned in the
paper went too far to provide complete fault tolerance, it was a goal
nonetheless.
The most basic assumption was that the nodes that comprise the
distributed system were all *near* one another. That is, they would
all likely be connected by a local area network as opposed to a
wide-area network where they would be physically separated by hundreds
of miles. Also, in distributed systems meant to share processing, the
jobs submitted were expected to be batch-like jobs. While some
terminals connected to the distributed environment allowed more
interactive computing, most of the distributed computation system
expected batch jobs.
At the time, a state of the art system had tens of nodes, and very
rarely hundreds of nodes. Since fault tolerance was a young area of
research at the time, any sort of added fault tolerance to the system
would have been considered state of the art. Finally, the use of
cryptography to provide protection within a distributed system, as in
the Amoeba system, was also an advanced concept.
Student
The goals of these distributed systems were, on one hand, to speed up big
tasks for users, since they could use multiple processors, and on the other,
to allow them to log in anywhere and still be able to access their own work.
I think the assumption was that the workloads could be pretty high, since
there was a relatively small number of computers. Also the emphasis seems
to be on fileservers.
The state of the art for that time seemed to be a 10 Mbit network and maybe
a couple dozen 12 MHz computers. They were talking about 90 workstations for
Amoeba. Also, if a computer had a hard disk, it was already quite something.
Student
The goals of the distributed systems at this time were manifold.
Incremental growth was one main objective of building these systems.
The motivation was that by adding more computing resources in the
network, the computing power could be increased proportionately.
Another important consideration was making services available even if a
few machines in the system were down. The goal was to achieve such
reliability and availability without increasing the communication
protocol overhead. One important goal was also to make services
accessible transparently. For example, being able to access files
transparently by name irrespective of their location in the network,
while addressing issues of protection/access rights.
A few assumptions have been made regarding the type of workloads and the
kinds of systems connected together by the Distributed System. Most of
the systems studied operate on a network spread over a small area,
typically with < 100 machines connected together. Network overheads,
hence, are not considered as elaborately as they are in today's widely
distributed systems (e.g., a peer-to-peer system like Kazaa). For
example, the V system broadcasts a query to all kernels whenever a
client wants to access a service by name. By maintaining centralized
services (e.g., a centralized file server), the utility of these
distributed systems is also targeted primarily towards low-end diskless
systems (again, as in the V system). For these centralized services to
work well, the assumption that the number of computers is small again
comes into the picture. These systems hence seem to be targeted towards
small amounts of data movement/transactions.
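The V-style broadcast name lookup can be sketched as an in-process simulation (no real network traffic); the node addresses and service names below are hypothetical, for illustration only:

```python
# Each "kernel" knows only its own locally registered services; a
# lookup "broadcasts" the query to every kernel, and only the kernel
# hosting the named service answers.
kernels = [
    {"addr": "node-1", "services": {"timeserver"}},
    {"addr": "node-2", "services": {"fileserver"}},
    {"addr": "node-3", "services": set()},
]

def resolve(name):
    """Return the address of the first kernel claiming the service,
    or None. Note the cost: every lookup touches every kernel, which
    is why the scheme only works on small local networks."""
    for kernel in kernels:  # stands in for a network broadcast
        if name in kernel["services"]:
            return kernel["addr"]
    return None

print(resolve("fileserver"))  # → node-2
```

The linear sweep makes the scaling problem concrete: every name lookup is O(number of machines), which is tolerable at under 100 machines but not in a wide-area setting.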
The state of the art in these distributed systems revolved mainly around
the aspects of reliability, security and resource management. Amoeba,
for example, protected the rights part of the capabilities
cryptographically to prevent user programs from manipulating them.
Dynamic allocation of processors from processor pools was also new.
Services are also kept continually running via multiple copies of the
same service (multiple processes), enabling other processes to run even if
some are blocked waiting for replies. Though the systems were built for
performance rather than reliability, some systems do address issues like
bringing up crashed servers (Still, crashes are assumed to be
infrequent). Also, the Cambridge distributed computing systems provides
the notion of special files, the writes to which are atomic. Most
systems also emphasize the property of statelessness of the various
servers, with mechanisms to account for crashes, etc. Amoeba also
managed to deliver signals/interrupts to related processes in the system.
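Amoeba's cryptographic protection of the rights part of a capability can be sketched roughly as follows. The field names and the use of HMAC-SHA256 are illustrative stand-ins (Amoeba used its own, simpler one-way function), not the paper's actual scheme:

```python
import hashlib
import hmac
from collections import namedtuple

# A capability carries the object id, the rights bits, and a check
# field derived from a secret that only the object's server knows.
Capability = namedtuple("Capability", "object_id rights check")

SECRET = b"server-private-key"  # hypothetical; known only to the server
READ, WRITE = 0x1, 0x2

def make_cap(object_id, rights):
    """Server mints a capability: the check binds object id and rights."""
    msg = f"{object_id}:{rights}".encode()
    check = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return Capability(object_id, rights, check)

def validate(cap):
    """Server recomputes the check; a client that flipped its own
    rights bits cannot produce a matching check without the secret."""
    msg = f"{cap.object_id}:{cap.rights}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(cap.check, expected)

cap = make_cap(42, READ)
print(validate(cap))                        # → True
forged = cap._replace(rights=READ | WRITE)  # client tries to add WRITE
print(validate(forged))                     # → False
```

The point of the design is that capabilities can then live in user space and be passed around freely, with no kernel-maintained rights table needed for protection.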
Student
This paper surveys distributed systems as of 1985.
What were the goals of these distributed systems?
What were the assumptions (in terms of workload
and environment) of these systems? What was the
state of the art for distributed systems at this time?
The authors have presented different approaches
employed in distributed systems as of 1985. The
focus is on 'distributed operating systems' that
ensure transparency to the user about the
existence of multiple independent processors. The
primary goal of these systems is to manage
multiple users and processes that may
simultaneously access the system and to
dynamically allocate processes to different
processors available in the system. Also these
systems manage file placement on multiple
processors and allow transparent access to remote
resources by employing communication primitives.
Additionally some of the other goals of these
systems are reliability, fault tolerance and
scalability.
The distributed systems presented in the paper
have been typically designed for use as a
computing resource in the university campus. The
user population, although substantial, is not
expected to be very large compared to the
available workstations such that each user can
have a dedicated workstation during his/her
session period, which would also handle the short
interactive user jobs. The systems were designed
to work for heterogeneous machines with varying
hardware, data type formats and network capacities.
Most distributed systems described in the paper
are designed based on the typical client/server
model with various specialized servers providing
services. These systems could typically scale up
to a few hundred machines, for instance the
Cambridge distributed computing system was
operating on 90 machines and Amoeba was configured
for a collection of 24 computers. Scalability of
these distributed systems across the wide-area
network was still being investigated. Also, due to
the low network bandwidths and high communication
overheads, the data rates between different
machines were on the order of a few Mbps.
Student
This paper provides an overview of distributed computing mechanisms
and describes four distributed computing systems as of 1985. Upon
reading the paper, it seems to me that the overall goal for
distributed systems in 1985 was to effectively take advantage of
increasingly available and capable microprocessor-based systems to
provide an acceptable alternative to a time-shared minicomputer
system. While high-level distributed system goals such as
transparency, incremental growth, and reliability were considered, it
seems that much work was focused on working with different models
that would allow a traditional operating system kernel (VMS, Unix) to
operate in a distributed manner. Other common goals included
simplification of experimentation with new mechanisms (for instance,
making file system support in V and Amoeba user-level
processes). Because of the idea that one needs to build a "truly
distributed" system from scratch, much effort went into communication
and alternative kernel models, instead of effort into
higher-level services. State-of-the-art systems focused efforts on
alternative kernel models that could be better distributed, such as
the work in Amoeba and Eden on capability-based objects.
The four systems presented in the paper had similar assumptions about
environment. They assumed groups of dozens of microprocessor
workstations connected via fairly high-speed custom networks, such as
Cambridge's 10/4MB ring. Of the four systems, only Eden was built on
top of "commodity" network (ethernet) and kernel (SunOS Unix)
technology - all the others were built directly on top of the
hardware. All systems used their own form of connectionless
client-server RPC instead of standard layered protocol stacks such as
UDP/IP or TCP/IP, presumably for performance reasons.