"There are only 10 kinds of people in this world, those who understand binary and those who don't!"

My Computer Science Career

I began working at IBM's Silicon Valley Lab in June of 2002, after completing my master's degree in computer science.  I work on DB2 for z/OS, which is IBM's mainframe version of DB2 (i.e., the original DB2).  This database, together with IMS (another IBM database), runs the majority of the Fortune 500 companies.  So, in a very real sense, if DB2 didn't exist, the world as we know it would not be the same.  I am currently the sole developer on DB2's IFC component.  This component delivers performance data to online monitors and in batch format, which allows customers to tune their databases, measure performance, and bill their own customers for use of the system.  I have also started work in Storage Manager and Agent Services.  Storage Manager is the internal code used to obtain and free heap storage.  Agent Services deals with TCB switching and connecting to the database.

Programming database systems is phenomenally fun and poses great challenges.  We focus on instruction counts and the details of the generated assembler code rather than on the object models found in application programming environments.  Currently, I can't imagine not continuing my career in DB2, but you never know what the future will hold.  I have past experience in networking, operating systems, and AI (although my networking and AI are getting a little rusty ;-)).  Click here to see my resume (MS Word).

Hope you enjoy this page and I'll try to add more interesting tidbits in the future.

Computer Science Projects and Research

Here is a list of my major projects in CS.  The hacking research is pretty cool stuff and should serve as a beginning text for wannabe hackers (or hackerz).  Note: All links are to PDF files unless otherwise specified.
 

Information Security

- Hacking: An Analysis of Current Methodology (2001).
- CIH Virus Talk in PowerPoint (2001).  For more information on CIH and other computer pests, see CERT.

Operating Systems

- WinFAM2k -- A Windows 2000 File Access Manager (2001).

Artificial Intelligence

- Building Decision Tree Ensembles:  A Comparative Analysis of Bagging, AdaBoost, and a Genetic Algorithm (2001).
- Evaluating Machine Learning Approaches for Aiding Probe Selection for Gene Expression Arrays (2002).  Appears in Proceedings of the 10th International Conference on Intelligent Systems for Molecular Biology (ISMB-2002).

Networking

- Using an Artificially Intelligent Congestion Avoidance Algorithm to Augment TCP Reno and TCP Vegas Performance (2002).

 

Computer Science Language Opinions


PL/X

This is an IBM proprietary language, so I can't say too much :-)  The language is an abstraction of IBM's z/Architecture.  It is best described as a cross between C and assembly.  The language is extremely powerful because assembly can be in-lined when straight PL/X does not suffice.  DB2 for z/OS is written primarily in PL/X, so much of the business world's data is accessed through PL/X code.


Java

Java was the first object-oriented language I learned.  Java is great for AI tasks because AI algorithms lend themselves well to object-oriented design.  The primary problem with Java is speed.  Compared to C/C++ programs, Java can be a slug because it is compiled to bytecode that the JVM must interpret (or JIT-compile) at run time.  Therefore, I use Java only when speed is not a consideration.  A big bonus of Java is the ease with which bugs can be found.  The error messages are explicit and you don't get stuck in "seg fault hell."  This is a great language for OOP beginners but not recommended for serious networking or OS work.
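
To illustrate the point about explicit error messages, here is a minimal sketch (the class name BoundsDemo and the method readPastEnd are made up for this example).  It reads one element past the end of an array, the kind of mistake that can silently corrupt memory in C/C++, and returns the message Java attaches to the resulting exception.  The exact wording of the message depends on the JVM version.

```java
public class BoundsDemo {
    /** Reads one slot past the end of the array and reports what Java says about it. */
    static String readPastEnd(int[] data) {
        try {
            int value = data[data.length]; // index is one past the last valid slot
            return "no error, read " + value;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The runtime names the offending index instead of corrupting memory.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd(new int[3]));
    }
}
```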

Tip: java -Xmx<size> (for example, java -Xmx512m) raises the max heap size for the JVM.  For garbage collection optimization, the JVM fixes the max heap size at startup and will fail with an OutOfMemoryError if you try to allocate more memory than this limit (even if you have more memory on the machine!)
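
A quick way to see what limit you are actually running under is to ask the runtime.  The sketch below is only an illustration (the class name HeapLimit and the helper maxHeapMiB are made up); it relies on the standard Runtime.getRuntime().maxMemory() call, which reports the most memory the JVM will attempt to use.

```java
public class HeapLimit {
    /** Returns the maximum amount of memory the JVM will attempt to use, in mebibytes. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once as "java HeapLimit" and once as "java -Xmx512m HeapLimit" should show the limit change; the reported figure may differ slightly from the value you pass.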


C/C++

C/C++ are compiled to machine code and put Java to shame with regard to performance.  C/C++ can be very difficult for the beginning programmer.  There is no bounds checking on arrays, so you can silently start overwriting memory.  In addition, you are responsible for cleaning up memory allocated on the heap, so long-running programs can run out of memory and crash if you don't delete all of your "newed" variables.  C/C++ compilers are also not all compatible.  Visual C++ is a completely different beast to tame than the gcc compiler.  I have also worked with code that won't compile on newer versions of gcc (so much for backward compatibility!)  I use C/C++ for any OS or networking projects.


Visual Basic

I am almost embarrassed to say that this was the first language I learned to any great degree.  VB is useful only for front ends, and even there it has major problems.  It doesn't support parameters for constructors, so you must violate OOP principles to instantiate and initialize an object.  VB doesn't support a "continue" statement, which would make loop code much easier to read.  When casting a Double to Integer, VB will actually round the value rather than truncate it (unbelievable).  While workarounds exist by the thousands, VB is plain clunky.  It is sufficient to say that VB is for beginners and front ends only.
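
To make the rounding complaint concrete, here is a small Java sketch of the two behaviors (the class and method names are made up for the example).  A Java cast drops the fractional part, while Math.rint rounds to the nearest whole number with ties going to the even neighbor.  As far as I know, that round-half-to-even rule is what VB's CInt conversion applies, but treat that as an assumption and check it against your VB version.

```java
public class RoundingDemo {
    /** What a C or Java style cast does: drop the fraction (truncate toward zero). */
    static int truncate(double x) {
        return (int) x;
    }

    /** Round to the nearest integer, ties to even (assumed here to mirror VB's CInt). */
    static int roundHalfEven(double x) {
        return (int) Math.rint(x);
    }

    public static void main(String[] args) {
        double[] samples = {2.5, 2.7, 3.5, -2.7};
        for (double x : samples) {
            System.out.println(x + ": truncate=" + truncate(x)
                    + ", roundHalfEven=" + roundHalfEven(x));
        }
    }
}
```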


Cache

This is a very unusual language that I programmed in at Epic Systems Corporation.  The language was designed for medical databases.  String is the only data type, which eliminates almost all type-specific errors.  In addition, saving data to a file is extremely simple and highly optimized.  This language isn't as powerful as C/C++, and I found myself calling out to the OS to run scripts to perform complex tasks.  In addition, it is very hard to read (perhaps more difficult than assembly).  Finally, Cache is interpreted, which makes it slow.