1. To use NAP/NASDs or to use the server configuration. Because I know
that the server configuration works, is well supported, etc., this is
the obvious choice for a business that is going to make a large
investment and needs the resulting hardware to work.
2. Assuming that NAP/NASDs are going to be used, the question
becomes whether to use one of TCP/IP, UDP/IP, or xxx/IP, or to use
xxx without IP under it. One choice not involving IP might be xxx/SCSI
or some derivative of SCSI. From a business perspective, this choice is
harder, since any implementation may be somewhat experimental. From a
development perspective, I would choose xxx/IP. I do not believe either
TCP or UDP is the best possible protocol for NAP/NASDs. But IP is so
low-level that it does not place any restrictions on what can be put on
top of it.
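To make the "xxx/IP" idea concrete, here is a minimal sketch of what the request encoding for a purpose-built storage protocol carried over IP might look like. The message layout, opcode values, and field widths below are all hypothetical, not from the paper:

```python
import struct

# Hypothetical wire format for a minimal block-storage request carried
# directly over IP (or over UDP while prototyping):
#   opcode (1 byte), device id (4 bytes), starting block (8 bytes),
#   block count (2 bytes), all big-endian.
REQUEST_FMT = ">BIQH"
OP_READ, OP_WRITE = 1, 2

def pack_read_request(device_id, block, count):
    """Encode a read request for `count` blocks starting at `block`."""
    return struct.pack(REQUEST_FMT, OP_READ, device_id, block, count)

def unpack_request(payload):
    """Decode a request back into its fields."""
    opcode, device_id, block, count = struct.unpack(REQUEST_FMT, payload)
    return {"opcode": opcode, "device": device_id,
            "block": block, "count": count}

msg = pack_read_request(device_id=7, block=123456, count=16)
```

The point is only that IP imposes nothing here: the protocol designer is free to pick whatever framing the storage workload needs.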
This "magical number" we talked about in class isn't just related to
performance. For instance, some computing environments may be willing to
sacrifice more performance for more flexibility with configurations and
distances. While other environments may not need that flexibility and
prefer to maximize the bandwidth.
I'd prefer to stay on the fence with this issue, but if I were to take
sides for the sake of this vote, I would fall to the networking side.
A major motivation for using an already standardized, globally available
protocol like TCP/IP is the ease of deployment and the level of
interoperability possible. In a heterogeneous global environment, this is
likely to be a more important issue than the performance benefit we get
out of custom-made protocols, considering ever-increasing network
bandwidth and processor speeds.
- reuse of naming and addressing (no need to develop another naming
and addressing system). In the case of NADs, where potentially every device
is on the net, naming and addressing is an important part of the system.
- it is not inconceivable that a well-tuned transport protocol on
top of IP will yield good performance. IP does not restrict the type of
transport protocol used on top of it. From the system development point of
view, even before development of such a performant transport layer is
finished, UDP and TCP might be used instead, and the other parts of the
distributed system that use NADs will be able to develop in parallel,
without waiting for testing and debugging of the tuned layer.
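The parallel-development argument amounts to hiding the transport behind an interface, so that stock UDP or TCP can stand in until the tuned layer exists. A sketch of that idea (the class names and the one-shot request API are my own assumptions, not from any of the papers):

```python
import socket
from abc import ABC, abstractmethod

class Transport(ABC):
    """Interface the rest of the system codes against. A tuned
    transport can later replace the stock one without API changes."""
    @abstractmethod
    def send_request(self, host, port, payload):
        """Send one request and return the reply bytes."""

class UDPTransport(Transport):
    """Stand-in transport using stock UDP while the tuned layer
    is still being tested and debugged."""
    def send_request(self, host, port, payload):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(1.0)
            s.sendto(payload, (host, port))
            data, _ = s.recvfrom(65535)
            return data
```

Code written against `Transport` never learns which implementation answered, which is exactly what lets the tuned layer arrive late.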
In these network-attached storage systems, I don't think scalability will
be a major issue; and if IP is chosen, one of the fundamental objectives
of network-attached storage systems, improving performance, would be
compromised. So my vote is for "NOT IP"; rather, my vote is NOT for IP.
Bottom line: It takes a lot of CPU resources to get the same performance from
software as from an already-existing, working piece of common hardware, and
the VISA paper doesn't make many claims about what new features they get by
putting the disk farther down the wire. So I say forget it.
In a LAN, it seems reasonable to use a specialized protocol because many of the
IP benefits are not needed.
The IP suite is fairly mature, and thus I don't expect to see
the performance improvements that would be needed to make it a reasonable
alternative for disk interfaces. Further, I think the WAN capabilities of
TCP/UDP/IP are wasted on this application. Even if, for some reason, you
wanted a disk array spread all over the planet, the security concerns of
putting a SCSI interface on the Internet are disturbing.
My biggest problem with this paper is one that was brought up in class:
the paper does not compare the right things. Their stated goal was to
run VISA at a performance close to native, local SCSI (something like
80%). There are numerous advantages to locating storage components away
from the system, and most users can justify the cost of distributed file
systems. I would have much rather seen VISA's performance compared to
something like AFS or NFS, where someone has decided that the benefits
of non-local storage outweigh the disadvantages and wants a system that
provides the least slowdown compared to a local solution.
Additionally, it seems that one of the main advantages of attaching
disks directly to the network is that they are not bound to a machine
and can be shared among multiple clients. But this seems difficult,
because each computer would still have its own local file system and
would need knowledge of the other clients' file systems in order to
access the appropriate blocks. In a traditional distributed FS, the
server attached to the disks lets the clients use an abstract
(perhaps object-based) interface that the server translates into
specific block requests. Allowing clients to use this simpler interface
reduces complexity and possibly network bandwidth (i.e. issue one request
to retrieve a file rather than issuing SCSI requests to gather all of the
file's fragmented blocks).
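The contrast between the two interfaces can be sketched with toy in-memory tables (the file name, block map, and block contents below are all made up for illustration):

```python
# Server-side metadata: file name -> its scattered block numbers.
BLOCK_MAP = {"report.txt": [17, 903, 4, 256]}
# The disk itself: block number -> contents.
DISK = {17: b"net", 903: b"work", 4: b"-att", 256: b"ached"}

def read_block(block):
    """Block-level interface: the client issues one request per block,
    so a fragmented file costs one round trip per fragment."""
    return DISK[block]

def read_file(name):
    """Object-level interface: one request; the server resolves the
    block layout locally and returns the assembled file."""
    return b"".join(DISK[b] for b in BLOCK_MAP[name])

# Block-level: the client must know the layout and issue 4 requests.
via_blocks = b"".join(read_block(b) for b in BLOCK_MAP["report.txt"])
# Object-level: a single request hides the fragmentation.
via_object = read_file("report.txt")
```

Both paths return the same bytes, but the object-level path keeps the block layout private to the server, which is the complexity and bandwidth argument above.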