Third Party Software used by Niagara

This is a list of the third party software currently used by the Niagara project. I've started adding comments about where this third party software is used, and what direction the use of that software is going to take.

Apache C/C++ XML Parser
Berkeley DB Library
GNU Common C++ Library
JTC ... Java-Like Threads in C++
W3C HTML and WWW Library
Software under consideration

Apache C/C++ XML Parser

Otherwise known as xerces-c. This is the ex-IBM XML Parser that has been taken over by the Apache XML effort.

Niagara uses Xerces-C for all its XML parsing duties. It also uses the Xerces-C in-memory (DOM?) representation for all its in-memory (but non-database) XML duties.

No major changes are planned for Xerces-c support in Niagara at present. Long term, it may be to Niagara's advantage to adopt its own XML representation. That representation would then be implemented on top of Xerces-c, or any other library that provides XML support. However, there are not many other XML parsers to compete with Xerces-c at the moment, which makes neutralizing the interface to XML a rather low priority. On the other hand, larger XML documents may require paging in parts of a document as needed, which would require layering XML properly. There is also a fairly big design task here if this is considered.

Xerces-c has a major synchronization bottleneck: it uses a single mutex to implement the atomic memory operations on the reference counts in all of its in-memory XML representation. That means that even separate threads with completely independent XML data structures may be contending over this mutex. If things are actually happening in parallel, this is expensive. Even if things are happening serially, we still pay the cost of locking and unlocking a mutex for simple reference counting operations. Long term, we probably want to look at enabling architecture-specific atomic memory operations to make this work better. Use of Bolo's testandset library is one possibility. In the short term, support will need to be added to Xerces-c so that it can use Shore Threads as a thread package / operating system for these synchronization primitives.
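
To make the contention argument concrete, here is a minimal sketch contrasting the two approaches. It is not Xerces-c code: the class names GlobalLockCount and AtomicCount are hypothetical, and it assumes a C++11 compiler with std::atomic standing in for whatever architecture-specific primitive (such as the testandset library) would actually be used.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical: every counter in the process shares ONE mutex, so two
// threads touching unrelated objects still contend with each other.
class GlobalLockCount {
public:
    void addRef() { std::lock_guard<std::mutex> g(lock()); ++count_; }
    // Returns true when the last reference has been dropped.
    bool release() { std::lock_guard<std::mutex> g(lock()); return --count_ == 0; }
    long value() const { std::lock_guard<std::mutex> g(lock()); return count_; }
private:
    static std::mutex& lock() { static std::mutex m; return m; }
    long count_ = 0;
};

// Hypothetical alternative: each object owns an atomic counter, so
// independent objects never wait on one another and no lock is taken.
class AtomicCount {
public:
    void addRef() { count_.fetch_add(1, std::memory_order_relaxed); }
    // fetch_sub returns the previous value; 1 means we dropped the last one.
    bool release() { return count_.fetch_sub(1, std::memory_order_acq_rel) == 1; }
    long value() const { return count_.load(std::memory_order_acquire); }
private:
    std::atomic<long> count_{0};
};
```

Both classes expose the same three operations, so a representation written against one could be switched to the other without touching its callers.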

Versions and Information:

Version 1.3 is in /p/niagara/repository.
Version 1.3 is in /p/niagara/s.
Version 1.4 is in /p/niagara/s.
Niagara compiles and runs with both versions 1.3 and 1.4.
Niagara uses version 1.4 as of May 4, 2001.
The /p/niagara/s version(s) of xerces-c have some Bolo hacks in them. One is a hack to the various Makefiles to make include files install properly in a conventional software setup, with a proper include directory. I also have some hacks in there to compile a statically linked version of the library. I should see about submitting those changes back to Apache.
The /p/niagara/s version(s) of xerces-c do not have libwww support compiled in, unlike the canned version in the repository. This means that URLs cannot be fetched by xerces-c directly. For the long term, that is an advantage, as URL fetching will need to be done by a Niagara component or subsystem, NOT by a parser. It also means that enabling URL fetching in xerces-c creates a hidden dependency of the system upon libwww.
Latest version is 1.4
Info and Distribution

Berkeley DB Library

Otherwise known as libdb. The Berkeley DB Library has grown from a DBM replacement into a small transaction processing system. It provides good support for smaller embedded databases and transactions where a full-blown database is not required or would be overkill.

In Niagara, libdb is used as a directly replaceable storage engine. The storage engine provides storage for the Data Manager and Index Manager components. In addition, libdb is meant to provide transactions. That is the theory, at least. It turns out that libdb transactions do not support multiple threads in a transaction, so libdb transactions cannot be used by Niagara. Libdb also has problems at the lock level: it does not provide any lock escalation, which means that large transactions require rebuilding or reconfiguring Niagara so that libdb has a larger lock table.

Libdb support is already modularized and hidden underneath the covers in Niagara. However, lots of the libdb header files and data structures are needlessly exposed by the libdb StorageManager interface in Niagara. This part needs to be rewritten (tweaked might be a better term) to encapsulate the libdb implementation better.
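
As a rough illustration of the encapsulation meant here, the sketch below shows an interface that exposes no libdb types at all. Everything in it is hypothetical (the method names, the InMemoryStorage stand-in, and the openStorage factory); it is not the existing Niagara StorageManager interface. A libdb-backed class would implement the same interface and include the libdb headers only in its own source file.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical interface: no libdb header or data structure appears here.
class StorageManager {
public:
    virtual ~StorageManager() = default;
    virtual void put(const std::string& key, const std::string& value) = 0;
    // Returns true and fills 'value' when the key exists.
    virtual bool get(const std::string& key, std::string& value) const = 0;
};

// Stand-in backend used only to keep this sketch self-contained.
class InMemoryStorage : public StorageManager {
public:
    void put(const std::string& key, const std::string& value) override {
        data_[key] = value;
    }
    bool get(const std::string& key, std::string& value) const override {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        value = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> data_;
};

// Factory: callers never name the concrete backend.
std::unique_ptr<StorageManager> openStorage() {
    return std::unique_ptr<StorageManager>(new InMemoryStorage());
}
```

With this shape, the Data Manager and Index Manager would compile against the interface alone, and swapping the backend would not force them to be rebuilt against different headers.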

Versions and Information:

Version 3.1.17 is in /p/niagara/repository.
Version 3.1.17 is available in /p/niagara/s.
Version 3.2.9 (without the two patches yet) is available in /p/niagara/s.
The source for 3.2.9 in /p/niagara/s is patched, but the patched version is not compiled and installed (8 May 2001).
Niagara currently works with either 3.1.17 or 3.2.9.
Latest version 3.2.9 (plus two patches)
Info and Distribution

GNU Common C++ Library

This is a C++ wrapper on top of pthreads, I/O, and sockets. It provides high level object oriented abstractions for a number of things.

In Niagara it is mainly used in the client-server communications system. That portion of Niagara uses CommonC++'s TCP communications classes, threads, and synchronization objects. CommonC++ and JTC interact OK because they both have Posix threads underneath, and because JTC has an "adoption" mechanism to deal with "foreign" threads.

In the future, direct support for the CommonC++ library will be removed from the Niagara code base. CommonC++ will still be used to implement things in one of many Niagara System Layers, but the code will no longer be directly dependent upon CommonC++.

Versions and Information:

Version 1.2 binary is installed in /p/niagara/repository.
Version 1.2 from sources in /p/niagara/s.
Version 1.3.3 from sources in /p/niagara/s.
Niagara compiles and runs with both version 1.2 and 1.3.3
Niagara uses 1.3.3 by default as of May 4, 2001.
Latest Version 1.4.2.
Info and Distribution
I don't know why we aren't using a newer version. It is on my list of things to try sometime.

Java-like Threads in C++ Library

JTC provides threads and synchronization objects that mimic those available in Java. It also provides other "convenience" things that are not Java like, such as reference counting support.

Initially, JTC was used by Niagara to lower transition costs when the move was made from the Java version to the C++ version. This eased porting of the Java version of the system to C++ by letting people use constructs familiar from Java in C++.

As of this writing, new use of the JTC library should be curtailed. Thread support in Niagara is going to be made thread-package-neutral, with a Niagara Threads and Synchronization layer providing the interface to actual thread packages. At the same time, we are going to concentrate support on one third-party threads package, and that will probably be the GNU CommonC++ library.
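
A thread-package-neutral layer of this kind could be as thin as the sketch below. The names (niagara_sketch, Mutex, ScopedLock) are hypothetical and do not come from the Niagara source; std::mutex is used only as a stand-in for whichever package (CommonC++, JTC, Shore Threads) sits underneath.

```cpp
#include <mutex>

namespace niagara_sketch {

// Hypothetical neutral mutex. Only this class knows which thread
// package is in use; the rest of the code sees lock() and unlock().
class Mutex {
public:
    void lock() { impl_.lock(); }
    void unlock() { impl_.unlock(); }
    bool tryLock() { return impl_.try_lock(); }
private:
    std::mutex impl_;  // stand-in; replace with the chosen package's mutex
};

// Scoped guard written purely against the neutral interface, so it
// does not change when the underlying package does.
class ScopedLock {
public:
    explicit ScopedLock(Mutex& m) : m_(m) { m_.lock(); }
    ~ScopedLock() { m_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    Mutex& m_;
};

}  // namespace niagara_sketch
```

Code that today uses JTC synchronization objects directly would instead take a ScopedLock, leaving the choice of package to a single header.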

Currently, most of the JTC features used by Niagara are generic thread and synchronization facilities that are easily replaceable. The one big problem area of JTC use is in the Query Engine. The QE uses JTC reference counting classes to do garbage collection, trying to discard unused portions of the XML data being evaluated as unused parts of tuples become known during query processing. This may work well, but it makes the source depend on JTC. There is also a large cost involved, as the JTC reference counting objects are fairly hefty (32 bytes?), often outweighing the object being reference counted. Although a JTC-like reference counting mechanism is simple to implement, it still imposes an obligation to implement reference counting where a different strategy might work better. Reference counts have costs associated with them, both in space and time, that may not be desired.
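
For a sense of how little machinery a JTC-like mechanism needs, here is a minimal intrusive reference count. It is a sketch under stated assumptions, not the JTC classes: RefCounted, Handle, and XmlFragment are hypothetical names, the count is deliberately not thread-safe, and the live counter exists only to make the lifetime behaviour observable.

```cpp
#include <utility>

// Hypothetical base class: the count lives inside the object itself.
// Not thread-safe; a real version would need an atomic or locked count.
class RefCounted {
public:
    RefCounted() : refs_(0) {}
    virtual ~RefCounted() {}
    void addRef() { ++refs_; }
    void release() { if (--refs_ == 0) delete this; }
    unsigned refs() const { return refs_; }
private:
    RefCounted(const RefCounted&);             // not copyable
    RefCounted& operator=(const RefCounted&);  // not assignable
    unsigned refs_;
};

// Hypothetical smart handle: the object is destroyed when the last
// handle referring to it goes away.
template <class T>
class Handle {
public:
    explicit Handle(T* p = nullptr) : p_(p) { if (p_) p_->addRef(); }
    Handle(const Handle& o) : p_(o.p_) { if (p_) p_->addRef(); }
    Handle& operator=(Handle o) { std::swap(p_, o.p_); return *this; }
    ~Handle() { if (p_) p_->release(); }
    T* operator->() const { return p_; }
    T* get() const { return p_; }
private:
    T* p_;
};

// Hypothetical payload standing in for a piece of XML data in a tuple.
struct XmlFragment : RefCounted {
    static int live;  // number of fragments currently alive
    XmlFragment() { ++live; }
    ~XmlFragment() { --live; }
};
int XmlFragment::live = 0;
```

Even in this small form, every copy of a handle touches the count, which is the time cost referred to above; the space cost is the counter (plus a virtual table pointer here) added to each counted object.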

As we transition to reducing dependencies on third-party software packages, the existing JTC Reference Counting classes will be used as a wrapper on top of the actual threads package being used. In addition, we want to look at alternative strategies for dealing with this storage problem at execution time. Generating tuples is one strategy, keeping everything in memory is another, and using persistent storage is yet another. This is an issue to be examined in detail in the future. Relying on some basic optimizations, once we have an optimizer, may provide a better long-term strategy than brute-force methods. In the meantime, Leonidas is starting to look at alternative methods for dealing with this problem. Long term, we may want to look at this kind of issue as a group to see what issues, performance concerns, and alternatives exist.

Versions and Information:

Version in /p/niagara/repository is unknown.
Version in /p/niagara/s is JTC-1.0.8
Version in /p/niagara/s is JTC-1.0.14
Niagara uses version 1.0.14 as of May 4, 2001.
Latest version 1.0.14
Info and Distribution
The repository binary install had include files that were modified from the originals, including some structure definitions. That may have been highly problematic.

W3C HTML and WWW Library

libwww is the W3C's generic HTTP and WWW support tool used by all of their WWW software. Its scope has been expanded beyond HTML to also include some XML support. It is the platform that sits underneath the W3C browser, client, server, robot, and other software development efforts. It provides facilities to fetch documents across the web.

This is used directly by one thing in the Niagara system, the Trigger Manager's Event Detector's File Change mechanism. It is used to retrieve document timestamps to see if the document has changed.

There is an indirect dependence upon libwww when Xerces-C is compiled with libwww support to do direct url fetching.

At the moment, libwww support in the File Change mechanism has been removed, and some simple code that retrieves timestamps directly has been put in its place. It is my intention to make use of libwww optional at that point, as libwww has a great deal of domain-specific knowledge about retrieving information across the WWW and parsing that information. Long term it is good to keep some support for libwww to allow rapid prototyping.
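
The kind of check involved can be sketched without libwww. The example below is not the Niagara replacement code; it assumes documents are fetched over HTTP and that the server supplies a Last-Modified header, and the function names lastModified and hasChanged are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Returns the Last-Modified value found in a raw HTTP response header
// block, or an empty string when the header is absent.
std::string lastModified(const std::string& headers) {
    std::istringstream in(headers);
    std::string line;
    const std::string name = "last-modified:";
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF endings
        if (line.size() < name.size()) continue;
        // Header names are case-insensitive, so compare in lower case.
        std::string prefix = line.substr(0, name.size());
        std::transform(prefix.begin(), prefix.end(), prefix.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        if (prefix != name) continue;
        std::size_t start = line.find_first_not_of(" \t", name.size());
        return start == std::string::npos ? std::string() : line.substr(start);
    }
    return std::string();
}

// Hypothetical change test: the document counts as changed when the
// server reports a timestamp different from the one seen last time.
bool hasChanged(const std::string& previous, const std::string& headers) {
    const std::string current = lastModified(headers);
    return !current.empty() && current != previous;
}
```

Comparing the timestamp strings for inequality avoids parsing dates at all, at the price of treating a missing header as "unchanged"; a real detector would have to decide how to handle that case.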

Long term, if libwww is used it should be well encapsulated so the system isn't directly dependent upon it. That will allow us to choose lighter-weight packages, either ones we write ourselves or ones from other sources. Or we can mix and match as we see fit.

Versions and Information:

Version 5.3.2 is in /p/niagara/repository.
Version 5.3.2 is in /p/niagara/s.
Version 5.3.2 is used by Niagara.
Latest Version is 5.3.2
Info and Distribution
Niagara source requires some hacks, such as #define HAVE_CONFIG_H to allow libwww include files to work properly. This problem should be investigated and fixed, or documented. Perhaps a fix could be submitted to the W3C.

Software Under Consideration

This section just provides a little information about software that people have spoken up about. From this point on we are no longer trying to accumulate random software packages. Any package that we do add needs to be considered for much more than the services it provides: interoperability, how readily it can be modified to suit our needs, how hard it is to hide under an interface, and other issues like those.


Last Modified: Mon Apr 2 17:46:21 CDT 2001

Bolo (Josef Burger) <bolo@cs.wisc.edu>