
Angry Unix Programmer

Segmentation fault
Debugger process has died
Sanity breaking.
May 28, 2005 2:47 PM CDT by psilord in category Codecraft

Hopscotch with the Devil

Today's post is about how not to write stupid code.

The other day I ran across this construct (but with far worse whitespace layout) in some code I was attempting to maintain:


if (func1() == OK) {
	/* lots of code 1 */
	if (func2() == OK) {
		/* lots of code 2 */
		if (func3() == OK) {
			/* lots of code 3 */
			if (func4() == OK) {
				/* lots of code 4 */
				if (func5() == OK) {
					/* lots of code 5 */
				} else {
					/* do some error condition 5 */
					return;
				}
			} else {
				/* do some error condition 4 */
				return;
			}
		} else {
			/* do some error condition 3 */
			return;
		}
	} else {
		/* do some error condition 2 */
		return;
	}
} else {
	/* do some error condition 1 */
	return;
}

Wherever a comment says code should be, imagine up to a few dozen lines in that block.

The code layout for this particular mental form is an utter disaster.

This is how it was rewritten:

if (func1() == NOT_OK) {
	/* do some error condition 1 */
	return;
}

/* lots of code 1 */

if (func2() == NOT_OK) {
	/* do some error condition 2 */
	return;
}

/* lots of code 2 */

if (func3() == NOT_OK) {
	/* do some error condition 3 */
	return;
}

/* lots of code 3 */

if (func4() == NOT_OK) {
	/* do some error condition 4 */
	return;
}

/* lots of code 4 */

if (func5() == NOT_OK) {
	/* do some error condition 5 */
	return;
}

/* lots of code 5 */

The first major problem in the first example is the notion of where the PC goes on the common execution path. In the first example, this is very explicitly defined and the common path leads the programmer into a deeper set of nested if statements. This explicit definition ends up directly opposing the implicit understanding of the common code execution path already imbued in any decent programmer of codes: the next basic block down in the file.

The second major problem is where the PC goes on the uncommon execution of the code. :) If the first function failed, it could be potentially hundreds to thousands of lines later that the PC lands, skipping a LOT of code on the way to somewhere that will be cumbersome to find, bouncing on the brace notwithstanding.

Here is a picture I drew of the code path execution of the first example as it would appear on a page of text.

Legend: '*' means a conditional branch, 'u' means an execution path in an uncommonly hit condition, '#s' means a starting point in the common execution path, '#e' means an end of the common execution path, 'v' means a place execution will hop in an uncommon execution path, '.' means a logical connection, 'c' means where the execution goes in a commonly hit condition, and '|','\' means a path of execution.

                                                            
#s                                                          
|                                                           
|                                                           
|                                                           
*c                                                          
| \                                                         
u  *c                                                       
.  | \                                                      
.  u  *c                                                    
.  .  | \                                                   
.  .  u  *c                                                 
.  .  .  | \                                                
.  .  .  u  *c                                              
.  .  .  .  | \                                             
.  .  .  .  u  |                                            
.  .  .  .  .  goto #e  [ This skip here is abhorrent. ]    
.  .  .  .  v                                               
.  .  .  .   \                                              
.  .  .  v    return                                        
.  .  .   \                                                 
.  .  v    return                                           
.  .   \                                                    
.  v    return                                              
.   \                                                       
v    return                                                 
 \                                                          
  return                                                    
                                                            
#e                                                          
|                                                           
|                                                           
|                                                           

This picture shows exactly the confusion: the uncommon 'u' code paths are pushed onto a mental stack as placeholders the programmer must remember to match to the corresponding 'v' blocks while going down the common path of execution from #s to the innermost branch. Then, from the innermost branch, the common execution path makes a terrible skip to #e, jumping over a lot of code that you have to determine isn't in #e's code path, since the 'v' blocks are so far away from their initiating conditionals.

In the second example, only in the exceptional cases of uncommon code paths does the path of the code deviate from the initial depth of execution.

Here is a picture I drew of the code path execution of the second example almost as it would appear on a page of text:

#s        
|         
|         
*u-return 
c         
|         
*u-return 
c         
|         
*u-return 
c         
|         
*u-return 
c         
|         
*u-return 
c         
|         
|         
#e        

Ah, much better and much more natural.... #s goes directly to #e and the programmer is free to ignore the uncommon conditions of the execution path. I said almost because the return blocks will be directly underneath the initiating conditionals on the page of text, but even so they are easy to skip in practice, and skipping them is a natural ability for programmers.

This is the hallmark of clear code--keeping the common path of execution on, or very near, the first depth (where higher depth is defined as going farther to the right on the page because of conditionals) line through the function.

The second code execution path tree is simply an isomorphism of the first tree. However, the mapping of that tree onto the 2D page of text and the preferred top to bottom execution is much, much clearer.
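To make the second form concrete, here is a minimal sketch in C that actually compiles. The names run_job, open_input, parse_input, and write_output are hypothetical stand-ins for func1() and friends, and the fail_at knob exists only so each error path can be exercised; none of it comes from the code being maintained.

```c
#include <stdio.h>

/* Hypothetical status codes, mirroring OK / NOT_OK in the listings above. */
enum status { OK, NOT_OK };

/* Hypothetical steps. Each one fails only when fail_at names it. */
static enum status open_input(int fail_at)   { return fail_at == 1 ? NOT_OK : OK; }
static enum status parse_input(int fail_at)  { return fail_at == 2 ? NOT_OK : OK; }
static enum status write_output(int fail_at) { return fail_at == 3 ? NOT_OK : OK; }

/*
 * Returns 0 on success, or the number of the step that failed.
 * The common path never leaves the first indentation level.
 */
int run_job(int fail_at)
{
	if (open_input(fail_at) == NOT_OK) {
		fprintf(stderr, "step 1 failed\n");
		return 1;
	}

	/* lots of code 1 */

	if (parse_input(fail_at) == NOT_OK) {
		fprintf(stderr, "step 2 failed\n");
		return 2;
	}

	/* lots of code 2 */

	if (write_output(fail_at) == NOT_OK) {
		fprintf(stderr, "step 3 failed\n");
		return 3;
	}

	/* lots of code 3 */

	return 0;
}
```

Calling run_job(0) walks straight from the top of the function to the final return, while run_job(2) leaves at the second check. Every error exit sits directly under the conditional that triggers it.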

Moral of the story: Aggressively minimize the depth and skip distance of a common path of execution.

End of Line.

May 24, 2005 1:24 AM CDT by psilord in category Idiocy

The Road to Hell Is Paved with Good Intentions

Before we are sucked screaming into the cesspool of reality concerning the implementations of shared libraries, here is a good cross section of why shared libraries are very desirable and often implemented in a multi-tasking operating system:

  1. Allow shared code between processes for smaller memory footprints of collections of programs in memory
  2. Increase performance of the paging system
  3. Allow the OS to upgrade system libraries (for bug fixes and improvements)
  4. Allow programs to have multiple implementations of an unchanging API alterable at runtime
  5. Allow a program to decide what functionality it should have loaded in memory at any given time, either because it is not known until runtime what said functionality must be, or because all of the functionality loaded simultaneously would be larger than the available physical RAM.
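Items 4 and 5 are the ones a program exercises directly, through the POSIX <dlfcn.h> interface (dlopen, dlsym, dlerror, dlclose). A minimal sketch, assuming a Linux box where the math library's soname is libm.so.6; the function name cos_via_dlopen is hypothetical, and on older glibc the program must also be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Load the math library at runtime and call cos() through a looked-up
 * pointer. Returns 0 on success and stores cos(x) in *result; returns -1
 * if the library or the symbol cannot be found.
 * Assumption: "libm.so.6" is the soname on this system (true on Linux/glibc).
 */
int cos_via_dlopen(double x, double *result)
{
	double (*cosine)(double);
	const char *err;
	void *handle;

	handle = dlopen("libm.so.6", RTLD_LAZY);
	if (handle == NULL) {
		fprintf(stderr, "dlopen: %s\n", dlerror());
		return -1;
	}

	dlerror(); /* clear any stale error before the lookup */
	*(void **)(&cosine) = dlsym(handle, "cos");
	err = dlerror();
	if (err != NULL) {
		fprintf(stderr, "dlsym: %s\n", err);
		dlclose(handle);
		return -1;
	}

	*result = cosine(x);
	dlclose(handle);
	return 0;
}
```

Nothing about cos() is known to the link line here: the library is chosen by name while the program runs, which is the whole appeal and, as it turns out, the whole problem.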

So, with all of these benefits, why do I stare longingly at my recently emptied bottle of Mezcal and contemplate a career change whenever I think about this topic? Because I know the truth.

Linux is by far the worst when it comes to the implementation of shared libraries (especially for number 3), but there are a few trouble areas that seem endemic to the use of shared libraries that affect any program that uses them under many OSes. I suspect it is because most people who develop these systems simply say, "Damn, that looks hard, I'll finish it later", which obviously means never.

Simply thinking about the legion of problems I've found with shared libraries across different OSes invokes severe cataplexy. This leaves me unable to muster the energy it would take to write the encyclopedic volumes of animosity and pure contempt for the monumental idiocy that surrounds this topic. Instead, I will only speak of a few encounters with the most foul and ruinous aberrations.

Oops! The crisis prevention center just picked up my call. I gotta go.

End of Line.

May 21, 2005 1:44 AM CDT by psilord in category Idiocy

G Plus Plus Minus One

I hate compilers.

I'm responsible for the porting of Condor to many different flavors and revisions of OS. It is a challenging job in most respects that Sisyphus would understand--though I do love it since it hones my technical skills for use in other areas of my life. I spend a lot of time with different revisions of the GNU Compiler Collection, the system programming APIs to a lot of OSes--especially Linux, and know a fair amount of how vendor compilers and C preprocessors do their job. The one pervading lesson that I have learned is that people who write compilers probably don't use them.

For example, good old GNU g++ likes to put -lstdc++ (among other things) at the end of the compile line like this (on a Redhat 7.2 x86 box while compiling "Hello World"):

Linux rh7.2 > g++ -v hello.C -o hello

[ snip tangential garbage ]

/usr/lib/gcc-lib/i386-redhat-linux/2.96/collect2 -m elf_i386 \
-dynamic-linker /lib/ld-linux.so.2 \
-o hello \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.o \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96 \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../.. \
/tmp/ccnZj0aB.o \
-lstdc++ -lm -lgcc -lc -lgcc \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o

One might think this is exactly what you want since you shouldn't have to figure out and supply that crap at the end of the line for symbol resolution. And you'd be right, you shouldn't have to figure it out.

But, here we teeter on the brink of idiocy. This is pretty much a catastrophic failure when you deal with binary compatibility between, as if it matters, Linux distributions (which I'll assume for the rest of this post). First off, my stupid little hello.C program requires both the gcc and C++ runtimes (in the rh72 example, there is no shared gcc runtime, but it will show up later in gcc's evolution) as shared libraries which tie the executable to specific versions of the compiler revision's libraries. See:

Linux rh7.2 > ldd ./hello
        libstdc++-libc6.2-2.so.3 => /usr/lib/libstdc++-libc6.2-2.so.3 (0x40033000)
        libm.so.6 => /lib/i686/libm.so.6 (0x40076000)
        libc.so.6 => /lib/i686/libc.so.6 (0x40099000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

The libraries as they stand implement all sorts of goo (to make things like dynamic_cast function) and for the most part have completely opaque implementations to the end user. However, those implementations have functions in them, and functions are the domain of the linker loader during executable runtime. This is where the problem begins to show itself.

It turns out, that if you take the above program and run it on a different Linux distribution, suppose SuSE 8.1, it'll work just fine.

Or does it?

If my little C++ program uses a tiny subset of C++, say no exceptions, run time type information, or STL, it'll probably work just fine. However, suppose I make my C++ program a little more complicated by adding in a correct use of dynamic_cast and recompile it on the rh7.2 box. What happens when I move it to the SuSE box?

Linux SuSE 8.1 > ./hello
./hello: relocation error: ./hello: undefined symbol: __dynamic_cast_2

Uh Oh! What happened? What happened was that the opaque runtime layer blew up because the dynamic linker loader couldn't figure out how to resolve this internal function at runtime. The function changed between internal revisions of the stdc++ runtime, so the stdc++ library the program was linked against and the library it found during execution on the different machine no longer agreed. That's right, my program could have been happily running for days until it decided to do a dynamic_cast and BAM it gets shot right between the eyes. This implies that maybe the rest of the program might be subtly producing incorrect information, or not, it is undefined. However, I only noticed this after adding a slightly more complex feature of C++ which turned on a mishmash of internal behavior.
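Failing only when the call is finally made is consistent with lazy binding, where the dynamic linker resolves a function symbol the first time it is called rather than at startup (glibc's loader can be told to resolve everything up front by setting the LD_BIND_NOW environment variable). A minimal sketch in C of probing for a symbol at startup instead of finding out days later; the name symbol_is_resolvable is hypothetical, and this is an illustration of the mechanism, not something that was tried on the machines above. Older glibc needs -ldl on the link line.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Returns 1 if `name` resolves in the running program or in the shared
 * libraries that were loaded with it, 0 otherwise.
 */
int symbol_is_resolvable(const char *name)
{
	int found;
	void *self;

	/* A NULL filename yields a handle for the main program itself. */
	self = dlopen(NULL, RTLD_NOW);
	if (self == NULL)
		return 0;

	dlerror();               /* clear any stale error */
	(void)dlsym(self, name); /* the value is irrelevant, only success matters */
	found = (dlerror() == NULL);

	dlclose(self);
	return found;
}
```

A program could call this early in main() for each runtime symbol it depends on and refuse to start if one is missing, trading a mid-run relocation error for an immediate and explainable failure.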

So, how do we fix this to achieve binary compatibility? Four options: 1) remove the dynamic_cast, 2) produce a statically linked executable, 3) statically link in only the gcc and C++ runtime libraries while leaving everything else dynamically linked, or 4) recompile. I definitely know option 2 is stupid since you can kiss goodbye NSS lookups beyond 'files'. Option 1 is appealing to me, but due to some strange twist of fate it isn't chosen. Option 4 is out of the question since not only would that mean I'd have to port 400,000 lines of often deeply magical code to a new compiler, but also the 9+ million lines of external third party libraries (like kerberos), to 28+ different architectures. Option 3 becomes the winner, mostly through forfeit of the other options.

So, let's try the obvious:

Linux rh7.2 > g++ -v hello.C -o hello -Wl,-Bstatic -lstdc++

[ snip extraneous junk ]

/usr/lib/gcc-lib/i386-redhat-linux/2.96/collect2 -m elf_i386 \
-dynamic-linker /lib/ld-linux.so.2 \
-o hello \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.o \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96 \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../.. \
/tmp/ccUrafey.o \
-Bstatic -lstdc++ -lstdc++ -lm -lgcc -lc -lgcc \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o

Linux rh7.2 > ldd ./hello
	not a dynamic executable

Oops. What happened? Well, if you look carefully, the stdc++ I added after the -Wl,-Bstatic is present, but then so are the compiler supplied libraries after it. Since -Wl,-Bstatic is a stateful flag, it turns off dynamic linking for everything after it, so not only do I get my requested static linkage of stdc++, I also get unrequested static linkage of libc and libm. Kiss NSS goodbye.

Ok, what if I get smart and turn back on dynamic linking at the very end of the link line? I would do this with the fool notion in my head that since I'm resolving all dependencies in libstdc++ statically with the object files beforehand, the compiler wouldn't bring in the dynamic version of the libstdc++ since it wouldn't be needed. Let's see what happens:

Linux rh7.2 > g++ -v hello.C -o hello -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic

[ snip extraneous junk ]

/usr/lib/gcc-lib/i386-redhat-linux/2.96/collect2 -m elf_i386 \
-dynamic-linker /lib/ld-linux.so.2 \
-o hello \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.o \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96 \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../.. \
/tmp/ccUrafey.o \
-Bstatic -lstdc++ -Bdynamic -lstdc++ -lm -lgcc -lc -lgcc \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o

Linux rh7.2 > ldd ./hello
        libstdc++-libc6.2-2.so.3 => /usr/lib/libstdc++-libc6.2-2.so.3 (0x40033000)
        libm.so.6 => /lib/i686/libm.so.6 (0x40076000)
        libc.so.6 => /lib/i686/libc.so.6 (0x40099000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Um. WTF! In a way, this is totally unexpected and now I have no idea what is actually in my executable. Do I have two competing versions of libstdc++? How do they interact while running in a binary compatible situation (two different versions of libstdc++ playing in the same process)? This is a catastrophe. This is about 50% of the insidious idiocy about this topic of which I speak.

Ok, I figured out this is terrible, so I figure I need to turn off the bringing in of the compiler defined libraries. I find an option: -nostdlib. Jeez. I hope you didn't need crt1.o or anything like that, since not only does this option get rid of the appended libstdc++ and friends, it gets rid of everything else supplied by the compiler as well. In short, there is absolutely no method of turning off the stdc++ and gcc runtime inclusion while still keeping enough low level objects (like crtn.o) there to produce an executable.

This leaves two options: 1) Only use gcc to link, or 2) write our own ld script which does the right thing.

Option 1 is laughable from a user's point of view. "You mean to tell me I cannot use g++ to link my objects when I not only compiled all of my software with it, but all of the documentation I have says to do it that way? How do I know I'm supplying the right libraries? Which libraries do I use for which revision of the compiler?"

Option 2 is laughable from a system programmer's point of view. "You mean I have to dig around in 28+ different architecture's compiler revision's interactions with the (potentially vendor) linker with an eye to the C++ features being currently used in a codebase constantly modified by 40 people and ensure I get the options correct? Oh, and it has to be maintainable by someone that isn't me and nonfragile in our build system?"

That damned if you do, damned if you don't bind is the other 50% of the idiocy. There is no good solution.

It gets even better. Since it was obvious to me that the stdc++ library tried to resolve that __dynamic_cast_2 symbol at runtime, if I manage to link the stdc++ statically through manually specifying the ld link line, what happens when it hits it at runtime? Let's try it:

Call the linker by hand fixing up the static linking of the stdc++ library but leaving dynamic libc and libm:

/usr/lib/gcc-lib/i386-redhat-linux/2.96/collect2 -m elf_i386 \
-dynamic-linker /lib/ld-linux.so.2 \
-o hello \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.o \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96 \
-L/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../.. \
hello.o \
-Bstatic -lstdc++ -Bdynamic -lm -lgcc -lc -lgcc \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o \
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o

Linux rh7.2 > ldd ./hello
        libm.so.6 => /lib/i686/libm.so.6 (0x40033000)
        libc.so.6 => /lib/i686/libc.so.6 (0x40056000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

That looks promising. The dynamic cast appears to function on the machine it was compiled on when linked in this fashion, which is a tad bit surprising. Let's see what happens when we move it to the SuSE 8.1 machine:

Hmm, it worked on the SuSE 8.1 machine. That's definitely surprising. It sure beats the hell out of me why without serious time investment.

Here is my line of reasoning which makes me not understand why it works: If the dynamic linker was wanting to load the __dynamic_cast_2 function at runtime before, it implies that the function wasn't there in the original link pass to create the executable and so therefore wouldn't be brought into the executable at all--which is why the dynamic linker loader was trying to find it at runtime. I was pretty sure the link pass to create the executable would not bring in the required object files and the program would segfault since there wasn't a fancy linker loader telling it something was wrong. So, why didn't it segfault?

Obviously the problem that started this whole thing was that the C++ ABIs changed radically a few times between gcc 2.96, found on the redhat 7.2 machine, and gcc 3.2.2, found on the SuSE 8.1 machine. Evolution of the compiler yadda yadda yadda. However, the thing that pisses me off is that the runtime of the language, and other compiler internals, are shared libraries at all. Sure, from the point of view of sharing text when running multiple programs it makes sense, but from a binary compatibility point of view it is a disaster. Why isn't it made easier to package the runtime statically into the binary? Why would I have to hand invoke the linker to do something that any reasonable person would have desired from the beginning?

This is the example of the insidious idiocy. More and more time is being spent to understand how to do something that should be simple or shouldn't have to be done at all.

I'm sure I'll get an itch and figure out the exact mechanism for why it ended up working (I already started poking it) and do another post explaining it in the future. But for now, the post traumatic stress disorder episode has passed and I am resting comfortably. The booze helps. It helps a lot.

I hate compilers.

End of Line.

May 20, 2005 1:07 AM CDT by psilord in category Apocryphal

Behold!

During lunch, I wandered over to Alan's office, which is next to mine. I mentioned that this blog generation code was basically finished (read: abandoned) and he could look at the results of my work. He fired up firefox, which took around 15 minutes because his machine was interacting with "the grid", and .....

"You used a center tag?", Alan said in a distasteful tone after dismissing the content and inspecting the html source. "Uh, yeah", I mumbled, "Isn't html circa 1994 good enough for you?". After an uncomfortably long stare from Alan, he casually mentioned that my life would be better if I used CSS.

Now, I've used CSS before--my poor excuse for a home page has some on it, but I never really understood it. This was simply because creating HTML is about as appealing to me as a bone marrow transplant without anesthetic. Never seriously reading any documentation and mostly goading others to write it for me probably had a big contribution as well... So, Alan began his explanation and, after a while, I actually began to like what CSS could do for you. It could alter the layout of a page very nicely and you don't have to change the HTML at all. With some simple span and div tags you could do anything.

SNAP OUT OF IT! What was I thinking?!? Here's reality: CSS is a sucking pile of feces. Even though it solved all of my problems, fixed up the layout to look nice, and was quite pleasurable to use, deep inside somewhere there is a rotten core. It is present simply because a human designed it and we all know how well humanity is doing of late. Sure, I might not see it and it is all peaches and roses right now--but mark my words there is a soaking pit of sanity destroying hell waiting to ensnare and shatter your mind. This is the nature of computers and the reason why I have slowly begun to loathe them.

A computer and the code it executes are the actualized representations of human desire. Vast intellect concentrates into computer hardware and software, but, in addition, a form of highly distilled idiocy congeals and hides in the reified thought. This isn't your run of the mill idiocy such as reading a book while driving and inattentively slamming into a bus full of school kids. Instead it is insidious idiocy. It is the kind of idiocy that when you encounter it, you wonder if someone actually sat down and fashioned that idiotic thing in just so a manner as to be a blighted red tide upon the seas of intellect.

Someone did.

It is the kind of idiocy that needs so much experience and wisdom to see, that once seen and understood, with realization often coming in an instant, you gravely reflect on the amount of your life that was wasted finding it. It is the idiocy that grows and can overwhelm both a project and/or a mind. It is the idiocy that every so often is actually required--at great cost to sanity, for smooth operations. This blog (I hate that word almost as much as any particular distribution of Unix) is mainly here to talk about these hidden pockets of monumental frustration and other things for which I care to babble.

Obviously, if I had a comment system, people would be queuing up and buying tickets to point out some stupidity that I wrote in code or said in these pages. While asserting their mental dominance over me, they'd probably get in a jab on how ugly the style of the pages is to get some self-gratification for free. In many and/or all cases, they'd be right and I'd be made a fool.

But to these people I say: I know the Emperor has no clothes.

End of Line.

May 18, 2005 4:32 AM CDT by psilord in category Apocryphal

The Birth of Angry Unix Programmer

I was walking back to work from a coffee shop with a friend of mine, Alan De Smet, explaining to him some detailed facet of unix shell programming that I truly despised, when after a while he mentioned that he'd like to see a blog of my rants about Unix, programming, life, and the other ridiculous garbage that flows from my pie hole. "A blog?", I said incredulously, "What do I look like? Some 14 year old goth with an exhibitionist fetish and a misdiagnosed case of ADD?"

He smiled in that usual way which caused me to think he was about to punch me in the face when he said, "Nah, I just like reading rants written by people who are experts in their field". Obviously, I became suspicious at that and pondered if I owed him a lot of money or something. "I don't know, Alan, I didn't think I was filled with enough bitterness for a blog", I hazarded, "And besides, no one cared what I thought about or said anyway as it stood and a blog would make that situation perceptibly worse."

Alan gave me a stark look of bewilderment and darkly intoned, "I'm going to kick you sqwah in the noots".

At that point, I decided I needed a blog.

I looked around and found some blog sites elsewhere, like livejournal, but decided that I liked the content in my web pages and on my hard drives. So, I figured it wouldn't be too hard to write a rudimentary blog construction script. I decided I had no need for comments--I could already predict the form they would take: "You Suck!", "It is "its", not "it's", idiot!", "I could have spent my time more effectively while choking on fish bones instead of reading this garbage!", and "Buuy Viiagr anow!!!". In addition, I determined that web design was for the weak and decided not to care. If a comment system existed, I'm sure it would find heavy use detailing exactly how I could make my pages better on the eyes. But frankly, if you wanted better on the eyes, walk away from your computer and go outside for a change. In fact, stop using a computer altogether since they do more damage to your self-esteem, let alone your eyes, than you realize.

End of Line.

