On the matter of indentation and whitespace.
                            aka
                     Coding Guidelines

                            Bolo


Introduction

     The  purpose of this note is to introduce you to a uni-
form scheme for indenting and formatting code.  This is  not
supposed to be a rigid standard that can not be broken; nei-
ther is it an attempt to put a straitjacket on you and  con-
fine  your  thinking.   What it is is an attempt to make the
code that everyone writes for the system easily  understand-
able,  at least at the obvious formatting level, by everyone
else.

     Code any way you like in your own personal code.   When
you are working on the system, please try to use the
following guidelines.  And that is EXACTLY what these are:
GUIDELINES.  If everyone follows them, the code is easy to read
for everyone.  And when you are trying to figure out what is
going  wrong, at least having a common style gets rid of one
layer of obfuscation.

Warning

     Don't be over-zealous about rewriting existing code  to
have  the  new  indentation.   It  wreaks havoc with the CVS
diffs, since you can't "look through"  a  formatting  change
easily.   Instead, apply the guidelines to new code that you
write, and to methods and class declarations that you
substantially rewrite.

     If  you  partially rewrite something and are also going
to add new things into it, please don't do the  rewrite  and
the  indentation changes at the same time.  Instead, just do
a set of formatting changes only to the  existing  code  and
check  that in with a note that the check is only a bunch of
formatting changes.  Afterwards, go add the new stuff to the
class.  It is a lot easier to deal with changes like that
when the format change is separate from the functional change.

Why have Coding Guidelines?

     One of the big  deals  about  indentation  is  that  it
allows  one  to  easily understand the structure of the code
without having to read it line  by  line.   Poorly  indented
code  is  like  reading  a book or paper with bad grammar or
structure.  You can read it, but  it  is  painful  and  time
consuming, not something that you do easily and enjoy.

     Why is this important?  Well, other people need to look
at your code.  And quite often it is  code  that  they  have
never  seen before, or that they rarely look at.  And on top
of it, they are probably looking at the  code  because  they
are  tracking  down a bug and trying to get it fixed.  Which
means that they have a lot of code to look at.  If they have
to  take a lot of time to understand your code, often curso-
rily, just to verify that it works correctly, it  cuts  down
tremendously  on  the  actual  time spent finding and fixing
bugs!  Instead, if they can just browse over your well writ-
ten and formatted code, verify that it seems to be doing the
right thing, then they can move on to the  next  thing  they
have to look at to track down the bug.  When you are looking
at a stack traceback of two or three items, this isn't a big
deal.   When  you  are  looking at 30 stack tracebacks, each
20-30 items deep, stuff like this is a big deal, and it  can
end  up  consuming a lot of time and enthusiasm and patience
of the person doing the debugging.

What is Coupling?

     In this note, and in some others I'll mention  coupling
a  lot.   But,  you  ask,  what the *^%^&# is it?  It is the
dependency of one portion of a system on another portion  of
a  system.   It  exists at several levels.  To be brief, the
levels are  include  file,  linking,  class  declaration  or
interface, and class definition or implementation.

     At  some  point  you  have  to have some sort of use of
other components of a system, or otherwise you don't have  a
system.   At  the  same time, you must be very careful about
what portions of components you expose to be seen by others.
As  the  amount  of exposed stuff increases, the coupling of
the system increases,  as  well  as  its  complexity.   This
affects  things  in  several  ways, and they are usually all
bad.

o    For example, excessive include-file coupling can mean
     that touching an innocuous include file somewhere
     causes the whole system to recompile.  Or it can mean
     that include files depend upon other include files to
     pull in the things that they themselves require.

o    Excessive coupling at link time means that you need  to
     drag  in  all  sorts  of libraries and objects that you
     don't really want or need.

o    Coupling at class declaration level means that the dec-
     laration  of  a  class  requires  exposing, via include
     files, the definition of  classes  that  the  class  in
     question  uses,  either in the definition of the class,
     or in the interface of the class.

o    Lastly, coupling at the implementation level is when  a
     class  uses another class.  To build a system, you need
     a certain amount of coupling at this  level.   However,
     the  interface  between  the  classes  needs to be well
     designed to avoid making the class couple to the imple-
     mentation of another class, instead of to the interface
     of another class.

     When coupling occurs, parts of a system  become  depen-
dent,  needlessly,  on  other parts.  It also means that you
can't test just a portion of the  system, because it is cou-
pled  too closely to the rest of the system and can't be run
in isolation.  In a word, coupling is  bad  and  you  should
really try to avoid it.


Guidelines

Here we go ...

o    Indents  are  for  normal 8 char tab stops.  It is what
     everyone has available.  It is what  all  the  printers
     and tools and everything use.  It shows enough
     separation that it is easy to match indent levels.

o    Don't indent at the file level because of namespaces.

o    For functions/methods, the open brace is  on  the  line
     start after the definition.

o    For normal control structures, such as if/while/else,
     the open brace follows the if/while/else.

o    Closing braces are on a line by themselves at the  same
     indent  level as the matching if/while/then/else state-
     ment.

o    The brace to start  a  function  or  method  definition
     should  be  on  a  line by itself, in the first column,
     following the function name  and  arguments.   The  end
     brace matches it, in the first column.
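
     As a rough illustration of the layout rules so far, the
     sketch below uses a made-up helper (sign_balance is not
     from any real system).  The function braces sit in the
     first column, the control-structure braces follow the
     statement, and each closing brace lines up with the
     statement that opened the block.

```cpp
#include <cassert>

/* Hypothetical helper: +1 for each non-negative value, -1 otherwise. */
int sign_balance(const int *values, int n)
{
        int     balance = 0;
        int     i = 0;

        assert(n >= 0);
        while (i < n) {
                if (values[i] >= 0) {
                        balance++;
                } else {
                        balance--;
                }
                i++;
        }
        return balance;
}
```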

o    The one place I recommend violating the previous guide-
     line is in the case of methods declared inside a  class
     declaration.   That  is  someplace that whitespace is a
     precious commodity.  In that case, put the brace  after
     the name and arguments.  The same goes for the ':'
     initializers in this case.  If you can  fit  everything
     nicely on one line, even better, do it!

o    Decl arguments are on the decl line, and if you have to
     introduce a line break, the following arguments  should
     match  the indentation of the first.  If it is a really
     long function name, such that the decls would wrap  any
     which way you try, break the first argument line and
     just indent everything a bit so that the decls fit on a
     line without wrapping.

o    Always  put  c preprocessor (cpp) commands in the first
     column.  Don't indent them.  Also, don't put the '#' in
     the  first  column  and indent the body of the cpp com-
     mand.

o    Separate the cpp command from its argument; don't just
     blast them together, as in #include"foobar.h".

o    Use  member  variable  ':' inits.  The ':' should be in
     the first column.  The inits should be in order of dec-
     laration, and specified one per line.
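
     One reason to keep the inits in declaration order is
     that C++ initializes members in the order they are
     declared, whatever order the init list uses.  A minimal
     sketch, assuming a made-up Range class, where _width
     relies on _lo and _hi being initialized first:

```cpp
#include <cassert>

/* Hypothetical class; the names are illustrative only. */
class Range {
        int     _lo;
        int     _hi;
        int     _width;         /* uses _lo and _hi, so declared last */

public:
        Range(int lo, int hi);
        int     width() const { return _width; }
};

Range::Range(int lo, int hi)
: _lo(lo),
  _hi(hi),
  _width(_hi - _lo)
{
        assert(lo <= hi);
}
```

     If _width were declared before _lo and _hi, it would be
     computed from uninitialized members even though the init
     list reads the same.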

o    Always indent stuff; don't just shove it all on one
     line, for example 'if (err) return err;'.  That does
     nothing to make the code, or its structure, easier to
     read and understand.  The indentation gives a hint that
     something is going on and needs to be looked at.  The
     one-line form can be used to good effect in a small
     function or method, where there isn't a lot to read.
     In a larger function or method, however, such a
     non-indented structure is something begging to be
     ignored.

o    Don't use extra spaces to separate tokens in the code,
     such as around parentheses in expressions and such.
     They actually make things more difficult to read.

o    I  recommend placing data members first in a class dec-
     laration.  Follow the data members with internal,  pri-
     vate  methods.   Follow  the internal methods with pro-
     tected methods.  And last, expose the user interface to
     the class, the public methods.

o    A  big  block comment is usually telling you that some-
     thing is important and should be read.  Lesser comments
     provide  info  about what is going on.  One liners give
     you a hint about something that isn't obvious.

o    Don't use big block comments often, especially in class
     definitions.  Or in the midst of code.  There is a con-
     stant battle going on trying to stuff  enough  informa-
     tion  on  a  "screen"  or  a  "page" so that people can
     encompass the code and understand it better.  Adding
     large comments in the middle just spreads the code
     farther apart and makes it more difficult to understand
     its entire structure.

o    Separate blocks of declarations from blocks of code.
     If code falls in with declarations it is often  glossed
     over as part of the declaration.

o    When you have a block of declarations, or sometimes even
     a single declaration, it is good to separate the
     declaration names from the declaration types.  Do this with
     a tab or two so all the names line up.   It  makes  the
     variable declarations easier to read.

o    Don't use the "C++" style of declaration modifiers that
     Stroustrup uses in his style.  To be brief, that
     groups  modifiers  with the declaration type instead of
     the declaration name.  Instead, do the normal "C" style
     of  declaring  where  modifiers,  such  as  & and * are
     grouped against the declared  name.   This  immediately
     raises  a  flag to a reader that something is different
     about the  declaration.   With  the  other  way,  these
     important hints are often lost in the noise.
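
     A small sketch of why the grouping matters.  In C++ the
     '*' binds to the declared name, not to the type, so only
     the first name in each declaration below is a pointer.
     The function and variable names are made up for the
     example.

```cpp
#include <cassert>
#include <type_traits>

/* Hypothetical check: reports which of the declared names are pointers. */
bool only_first_is_pointer()
{
        /* Type-grouped style: reads as if a1 and a2 were both pointers. */
        char*   a1 = 0, a2 = 0;

        /* Name-grouped style: the '*' visibly belongs to b1 alone. */
        char    *b1 = 0, b2 = 0;

        return std::is_pointer<decltype(a1)>::value &&
               !std::is_pointer<decltype(a2)>::value &&
               std::is_pointer<decltype(b1)>::value &&
               !std::is_pointer<decltype(b2)>::value;
}
```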

o    C++ public:, private:, and protected: keywords in class
     declarations should not be indented, so they stand  out
     clearly.

o    If you use gotos, the labels you use should be
     unindented.  Half an indent level works well in this case.

o    If you have friend  declarations  inside  of  a  class,
     unindent  them  a half indent.  Also you should explain
     why these classes are friends of the class in question.

o    It aids understanding of code considerably if you start
     all data members with an underscore.  When you see that
     there  is  no  doubt  about what is happening, or where
     that variable magically came from.

o    Often you have externally visible methods in  a  class,
     which  are  just  wrappers for internal methods that do
     the real work.  In this case, prefix the  name  of  the
     internal method with an underscore.

o    If it is non-obvious why an internal method exists, you
     could always prefix something to it to indicate why  it
     is internal.  Such a prefix can also make understanding
     easier, so people don't inadvertently use the internal
     method incorrectly.  For example, _unlocked_method() can
     indicate that the method assumes the caller in the
     class has provided locking as necessary.
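
     A minimal sketch of that naming, assuming a made-up
     Counter class protected by a std::mutex.  The public
     methods take the lock; the _unlocked_ method does the
     work and expects the lock to be held already.

```cpp
#include <cassert>
#include <mutex>

/* Hypothetical counter class, for illustration only. */
class Counter {
        std::mutex      _lock;
        int             _count;

        /* Caller must already hold _lock. */
        void    _unlocked_add(int n) { _count += n; }

public:
        Counter();
        void    add(int n);
        void    add_twice(int n);
        int     value();
};

Counter::Counter()
: _count(0)
{
}

void Counter::add(int n)
{
        std::lock_guard<std::mutex>     guard(_lock);

        _unlocked_add(n);
}

/* Takes the lock once, then reuses the internal method twice. */
void Counter::add_twice(int n)
{
        std::lock_guard<std::mutex>     guard(_lock);

        _unlocked_add(n);
        _unlocked_add(n);
}

int Counter::value()
{
        std::lock_guard<std::mutex>     guard(_lock);

        return _count;
}
```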

o    If you have a method or function that isn't
     implemented, don't just let it sit there and do nothing, or
     return that it succeeded.  If it returns an error, have
     it return the unimplemented error.  If it doesn't
     return  an error, crash the system.  It may not be ele-
     gant, but it will get your attention.  Otherwise,  peo-
     ple  will  wonder why in the world everything is appar-
     ently working but not producing the correct results.
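
     A sketch of both cases.  The error codes and function
     names are hypothetical; a real system would use its own
     error scheme.

```cpp
#include <cassert>
#include <cstdlib>

/* Hypothetical error codes. */
enum Error {
        ERR_OK = 0,
        ERR_UNIMPLEMENTED
};

/* Returns an error, so say plainly that nothing was done. */
Error compress_block(const char *, int, char *)
{
        return ERR_UNIMPLEMENTED;
}

/* Cannot return an error, so stop the program loudly. */
void flush_cache()
{
        std::abort();
}
```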

o    #defines are bad news.  They pollute the global  names-
     pace, and invade the context of all classes.  Then what
     happens is that people start using  them  because  they
     are  conveniently  there.   And, portions of the system
     become coupled together.  In C++, the best way to avoid
     #define  use  is  to  define  enumerations at the class
     level.  This firmly scopes that information,  and  also
     ensures that the correct values are being used.
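
     A minimal sketch of a class-level enumeration replacing
     #defines.  The Pager class and its constants are made up
     for the example.

```cpp
#include <cassert>

/* Hypothetical class; the names are illustrative only. */
class Pager {
        int     _npages;

public:
        /* Scoped constants instead of '#define PAGE_SIZE 4096'. */
        enum {
                PAGE_SIZE = 4096,
                MAX_PAGES = 64
        };

        Pager(int npages);
        int     bytes() const { return _npages * PAGE_SIZE; }
};

Pager::Pager(int npages)
: _npages(npages)
{
        assert(npages >= 0 && npages <= MAX_PAGES);
}
```

     Outside the class the names have to be spelled
     Pager::PAGE_SIZE, so they cannot collide with anything
     else called PAGE_SIZE.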

o    Global variables are another thing that is bad.  Global
     class instances are even  worse.   Why?   Well,  global
     variables  are unencapsulated state.  Global class dec-
     larations mean that a global constructor  needs  to  be
     run  for  a  class.  The ordering and error catching of
     those global class instances is all random.  You can't
     recover  gracefully  from errors, or ensure an ordering
     that works correctly.  They cause real problems  and  I
     encourage  not  using  them.   Instead,  they should be
     scoped inside a class at the very least.  However,  see
     the next entry ...

o    What  I  said  above  for global variables goes equally
     well for static class members, for all  the  same  rea-
     sons.   The  better thing to do, if you really need the
     equivalent of a class static, is to make a  class  that
     holds  all  the things that would be static in a class.
     Each instance of the class can have a reference to  the
     "holder"  class.  Doing things this way also guarantees
     that you can instantiate multiple, independent versions
     of the class and its holder class.  In some cases, where
     there are many instances and memory use becomes an issue,
     perhaps a class static pointing to the "holder" class
     is in order, or even a global variable, although that
     gives up the ability to have multiple independent
     versions.  But that is an optimization that can be done
     at a later date.

o    Eliminate include file coupling by ensuring that the
     include  file  for  a  class includes all include files
     that a class needs  for  its  own  declaration.   Don't
     include things that the class needs only for its
     implementation, though, because that exposes the implementation,
     or  parts  of it, to the outside world.  While that may
     not increase coupling, it certainly does  increase  the
     amount  of  work  the  compiler  has to do to compile a
     given file.  Multiply  that  by  the  number  of  other
     source files including that file, directly or
     indirectly, and it is a big overhead.

o    If you just need some classes in  the  interface  of  a
     class,  don't  #include  the  include  files  for those
     classes.  Instead, use C++'s ability to have a  forward
     declaration  for  a class.  You'll need to #include the
     proper include  files  in  the  implementation  of  the
     class, but at least all that junk won't need to be seen
     by the rest of the world.
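
     A single-file sketch of the idea; the comments mark
     where each part would live.  Report, Printer, and the
     file names are hypothetical.  The Report declaration
     only mentions Printer by reference, so a forward
     declaration is all it needs.

```cpp
#include <cassert>

/* ---- report.h (hypothetical): only the name Printer is needed ---- */
class Printer;                  /* forward declaration, no #include */

class Report {
        int     _lines;

public:
        Report(int lines);
        int     send_to(Printer &p) const;
};

/* ---- printer.h (hypothetical): included only by report.cpp ---- */
class Printer {
        int     _printed;

public:
        Printer();
        void    print_lines(int n) { _printed += n; }
        int     printed() const { return _printed; }
};

/* ---- printer.cpp and report.cpp: need the full definitions ---- */
Printer::Printer()
: _printed(0)
{
}

Report::Report(int lines)
: _lines(lines)
{
        assert(lines >= 0);
}

int Report::send_to(Printer &p) const
{
        p.print_lines(_lines);
        return _lines;
}
```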

o    Think hard about putting instances of one class in  the
     declaration  of  another  class.  When you do that, you
     create a coupling.  Sometimes it  is  not  really  bad,
     sometimes it is necessary, for example, with a template
     class.  Perhaps it is better to use a pointer  to  that
     class,  or  to use an implementation-only class to hold
     random information.  This way you don't need to  expose
     portions of the implementation to users of the class in
     question.

o    Be very careful about exposing data types used  in  the
     implementation  of  a  class  in  the interface of that
     class.  Especially data types provided by a third-party
     software package, or even the underlying operating sys-
     tem.  Doing that exposes users of your  class  to  that
     third party package or the OS.  It is better to declare
     your own types, system-wide if need be,  and  use  them
     instead.

o    Don't  optimize code prematurely by using inline direc-
     tives, or by coding things directly in the class  defi-
     nition.   Instead, place the implementation in the .cpp
     file.  If profiling shows that something  needs  to  be
     optimized, then that can be done, on-purpose, later on.

o    If you have code that is inline, or have moved
     something from a .cpp to a .h so it can be inline, do
     not put big chunks of code in the class definition.
     Declare them inline and put them after the class
     declaration in the include file.  And code them normally,
     just as they would be in the .cpp file; do not try to
     compact them.

o    If a method does any decision making, versus just  act-
     ing  as  a  dumb  accessor,  think several times before
     putting it inline or in the class declaration.  If  you
     ever  need  to  change  something, crunch, suddenly you
     need to recompile a larger amount of code.  Same  thing
     holds for constructors and destructors, especially if a
     class has pointers or other things that might  need  to
     be debugged in the future.

o    Don't bother with copying virtual keywords in method
     declarations from an inherited-from class that declares
     a method virtual.  In other words, the virtual declara-
     tion of a method should only exist in the  class  where
     the method is virtualized.

o    Factoring of code is important.  Even if it is only used
     in one place, and dragged in inline to  make  it  effi-
     cient,  factoring  of chunks of code can greatly add to
     the understanding of something.  If something  is  used
     multiple  times,  it is also an excellent candidate for
     factoring.

     If you find yourself copying code, don't.  This  is  an
     excellent indication that the code needs to be factored
     into a method or function so it can be used by multiple
     callers.

o    If  you  virtualize any methods in a class declaration,
     the destructor for that class must also be virtualized.
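
     The reason is that deleting a derived object through a
     base class pointer is undefined behavior unless the base
     destructor is virtual.  A minimal sketch with made-up
     classes; the bool flag exists only to make the effect
     visible.

```cpp
#include <cassert>

/* Hypothetical classes, for illustration only. */
class Shape {
public:
        virtual         ~Shape() { }
        virtual int     sides() const = 0;
};

class Square : public Shape {
        bool    &_destroyed;

public:
        Square(bool &destroyed);
        ~Square() { _destroyed = true; }
        int     sides() const { return 4; }
};

Square::Square(bool &destroyed)
: _destroyed(destroyed)
{
        _destroyed = false;
}

/* Deletes through the base pointer, as generic code typically does. */
void destroy(Shape *s)
{
        delete s;
}
```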

o    When  you  are  designing  or implementing a component,
     remember that the interface  is  everything.   Given  a
     good  interface,  you  can  write  an absolutely sucky,
     stupid implementation to prototype something and get it
     off  the  ground  and running.  When the implementation
     shows its limitations, there is only one place you
     will need to revisit to improve things, and that is the
     implementation of that component alone.  You won't  end
     up  changing everything that uses the class to adapt to
     its new implementation.

o    It is said that Everything  is  in  a  name.   That  is
     entirely  true when writing code.  A good name can make
     the difference between instantly understanding what  is
     going  on,  and  spending lots of time trying to under-
     stand what is going on.  Take the time and think  about
     a  good  name for what you are doing.  Don't name some-
     thing after how it does it, but rather  after  what  it
     does.

Newer Items

o    Something  to  mention is that I often do not re-indent
     some code.  Rather I leave it as it is and work in  its
     existing  indentation.   Typically  this is because the
     code needs to be totally redesigned and rewritten,  and
     there  just  isn't  the  time  to  do it.  So, you keep
     patching it and trying to make minor cleanups, once you
     are  aware  of  some  of the issues involved, that will
     make the eventual rewrite easier.

o    If a class has any complex state  associated  with  it,
     add  a print() method and an operator<<() to it.  If it
     is needed during debugging, perhaps even add an extern
     "C"  function  that  can  be  used from the debugger to
     print the object.  This print method makes it much eas-
     ier  to  determine  what  the  object  is doing, and to
     determine if it is in a  valid  state.   This  is  true
     whether you are using the debugger, trying to debug a
     problem, or whatever.

o    If you are going to make a class  printable,  give  the
     class an ostream &T::print(ostream &) const method.
     Also add an ostream &operator<<(ostream &, const  T  &)
     function that is used to print the class.  This way the
     output operator does not need to be  a  friend  of  the
     class.   It  also means the print method can be written
     as a normal method with access  to  class  members  and
     statics, instead of something else.
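
     A minimal sketch of the pattern, with a made-up Point
     class standing in for T.  The describe() helper is an
     extra assumption of mine, added only to show the output
     operator in use.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

using std::ostream;

/* Hypothetical class standing in for T. */
class Point {
        int     _x;
        int     _y;

public:
        Point(int x, int y);
        ostream &print(ostream &o) const;
};

Point::Point(int x, int y)
: _x(x),
  _y(y)
{
}

ostream &Point::print(ostream &o) const
{
        return o << "Point(" << _x << ", " << _y << ")";
}

/* Not a friend: it only uses the public print() method. */
ostream &operator<<(ostream &o, const Point &p)
{
        return p.print(o);
}

/* Hypothetical helper: the printed form as a string. */
std::string describe(const Point &p)
{
        std::ostringstream      s;

        s << p;
        return s.str();
}
```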

o    The  case  ...:  components  of  a switch statement are
     indented at the same level as the switch  itself.   The
     stanzas of the case are at the next indent level.

o    If  you  are  using  exceptions, be certain to firewall
     exceptions properly.  You need to do this when entering
     and leaving code that does not know about exceptions,
     such as libraries and system calls, and callbacks.   It
     may be necessary to convert an exception to an error so
     it can be passed through the  component  properly,  and
     then reconvert the error to an exception once exception
     land has been reached.  If this is not done,  the  code
     that doesn't know about exceptions, but that deals with
     errors quite well, will not be able to do normal  error
     handling and will appear broken.
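
     A rough sketch of such a firewall.  run_callback() is a
     stand-in I made up for a C library that knows nothing
     about exceptions; the error codes and the other names
     are hypothetical as well.

```cpp
#include <cassert>
#include <stdexcept>

/* Hypothetical error codes. */
enum { ERR_OK = 0, ERR_FAILED = 1 };

/* Stand-in for a C library that knows nothing about exceptions. */
extern "C" int run_callback(int (*cb)(void *), void *arg)
{
        return cb(arg);
}

/* C++ code that reports failure by throwing. */
void process(int *value)
{
        if (*value < 0)
                throw std::runtime_error("negative value");
        *value *= 2;
}

/* Firewall: no exception may escape into the C caller. */
extern "C" int process_callback(void *arg)
{
        try {
                process(static_cast<int *>(arg));
        } catch (...) {
                return ERR_FAILED;
        }
        return ERR_OK;
}

/* Back in exception land: turn the error into an exception again. */
void process_via_library(int *value)
{
        if (run_callback(process_callback, value) != ERR_OK)
                throw std::runtime_error("process failed");
}
```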

o    Do Not use exceptions.  Use errors instead.  If you are
     calling code that might generate an exception, you need
     to  generate a firewall to make certain errors are han-
     dled correctly.

o    If there are exceptions being used in  the  system,  be
     very  careful  to  wrap all stateful constructs in some
     sort of C++ scoped object so  that  the  state  can  be
     undone properly if an exception goes off.
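
     A minimal sketch of such a scoped object.  DepthGuard
     and risky() are made-up names; the "state" here is just
     a nesting depth, which the destructor restores whether
     the function returns normally or an exception goes off.

```cpp
#include <cassert>
#include <stdexcept>

/* Hypothetical guard: bumps a nesting depth and always restores it. */
class DepthGuard {
        int     &_depth;

        DepthGuard(const DepthGuard &);         /* not copyable */

public:
        DepthGuard(int &depth);
        ~DepthGuard() { _depth--; }
};

DepthGuard::DepthGuard(int &depth)
: _depth(depth)
{
        _depth++;
}

void risky(int &depth, bool fail)
{
        DepthGuard      guard(depth);

        if (fail)
                throw std::runtime_error("failed part way through");
}
```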

o    If exceptions are used, or it is possible that an
     exception can go through a chunk of code, it is
     necessary to ensure that all classes exposed to
     exceptions in this manner can cope with being
     interrupted at an arbitrary point.  This means they have
     to clean up whatever state they have and work properly
     in the face of arbitrary (non-planned) use.

o    Code  the  assignment operator so it works correctly in
     the case of self-assignment.  In most cases, you can do
     something like this:
          const A &A::operator=(const A &r)
          {
               if (this == &r)
                    return *this;
               /* the actual assignment */
               return *this;
          }

o    Don't  say  this->  all the time.  C++ does it for you!
     And, if you are trying to  overload  an  argument  name
     with something built into the class by this, don't even
     try.

o    Some guidelines for #if use.  Use #ifdef notyet if this
     is code that is intended to be used but isn't used yet.
     The notyet indicates it is for the future.  Use  #if  0
     and  #else  as appropriate to show various alternatives
     of a piece of code.  The one that is  currently  active
     should  compile,  and  it  should  be in a #if 1 as the
     first thing, or a #else as the last.  A comment in each
     stanza  may  be appropriate.  If you are disabling code
     because something else is broken, explain what is  bro-
     ken.

o    If you find something disgusting or wrong, mark it with
     an XXX mark and describe what  it  is  that  is  wrong.
     People can then see the XXX comments and know that they
     might need to check this thing out further.  It provides
     an indication that something is funny and may need to be
     looked at.  But it isn't such a big issue that it needs
     to be brought up in a design or other meeting ... well,
     maybe it does.  One typical use is a bug that can't  be
     fixed because of another problem.

o    Format code and comments to fit into 80 characters.  It
     is the "standard" size of printers, terminals,  xterms,
     and  everything  else.   If you want to be really nice,
     only use 79 characters so the  80th  character  doesn't
     cause a line wrap in emacs.

o    Sometimes  if  a wrapped line is short, and breaking it
     would just result in something equally ugly, leave  it.

o    Don't put multiple statements on a line.

o    Don't  use  //  comments  for block comments.  Only use
     them for single line comments.  An especially bad
     format to avoid is multiple indented // comments trailing
     the ends of lines.  It basically makes the comment
     impossible to edit easily without an editor that will
     edit the comment for you, and it also forces the
     comment to be very columnar and take up lots of vertical
     space.  If you have something to say about something,
     say it in a block comment before that something.

o    Do not do ad-hoc argument parsing.  Use getopt.

o    Don't  use  void  indiscriminately.   A  'void  *' is a
     pointer to an unknown data type.  You can't  do  arith-
     metic  on  a  'void  *'.   A 'char *' is a pointer to a
     chunk of memory.  You can do arithmetic on a 'char  *'.
     The  lesson on this is that you may want or need to use
     a 'void *' in your interfaces so pointers don't need to
     be  cast.   However, you should use 'char *' internally
     to point to areas of memory, since  the  areas  have  a
     length  and  you probably want to do pointer arithmetic
     on them.  The two types are completely  different,  use
     them correctly.
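
     A small sketch of the split.  The fill_at() helper is
     made up: it takes a 'void *' so callers need no cast,
     then switches to 'char *' internally to do the pointer
     arithmetic.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical helper: fill 'len' bytes starting 'offset' bytes into buf. */
void fill_at(void *buf, std::size_t offset, std::size_t len, char value)
{
        char    *p = static_cast<char *>(buf) + offset;
        char    *end = p + len;

        while (p < end) {
                *p = value;
                p++;
        }
}
```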

o    Don't  try  to  combine data values and error values in
     function returns.  If you need to return  error  codes,
     pass in a parameter (by reference) to return the value.
     Return the error code via the  function  return  value.
     Mixing the two is just bad news.  It makes it very dif-
     ficult to do uniform error  handling,  and  it  grossly
     limits your data types.
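
     A minimal sketch, assuming made-up error codes and a
     made-up average() function.  Every int, including -1, is
     a legitimate average, so there is no spare value left
     over to mean "error"; the error travels in the return
     value and the data in the reference parameter.

```cpp
#include <cassert>

/* Hypothetical error codes. */
enum Error {
        ERR_OK = 0,
        ERR_EMPTY
};

/* The result is written only when ERR_OK is returned. */
Error average(const int *values, int n, int &result)
{
        long    sum = 0;
        int     i;

        if (n <= 0)
                return ERR_EMPTY;
        for (i = 0; i < n; i++)
                sum += values[i];
        result = static_cast<int>(sum / n);
        return ERR_OK;
}
```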

o    Never  assume  that  there will be only one instance of
     your class.  Always implement the general  case.   Just
     because  you  can't  see a need for it now doesn't mean
     there isn't a need.  Implementing for the general  case
     will  also provide a cleaner solution that is easier to
     maintain and extend.

o    Don't use ********* comment lines  in  ordinary  usage.
     Stuff  like that should be reserved for BIG EYE OPENING
     DISASTER comments or other indications  that  something
     really important or difficult or bad is going on.

o    Only  deal  with those errors that you know how to han-
     dle.  Handle those errors.  Either pass other errors up
     to  the caller to deal with, or ABORT because you don't
     know what to do.  If you just let the error  slide  you
     will have big time problems tracking down what is going
     wrong.

o    Use streams and '<<' and '>>' operators when  possible.
     It isolates the formatting from type changes to the
     arguments and just makes it work.  It also gets rid  of
     the  overhead  of interpreting the format string, since
     the compiler can generate the format code directly.  If
     you  still  need  to  use  printf-like  formatting, use
     form() instead, and output to the I/O streams.

o    Don't use stdio I/O, use iostreams I/O.

o    C++ is NOT Pascal.  There can be multiple exits from  a
     function  or  method.  It can considerably increase the
     readability and comprehension of code to use this
     capability.  For example, return early when various
     conditions are not met, so that the "main body" of the
     function contains what really happens.  Compare that
     with nesting go and no-go if statements and burying
     the "proper" behavior of the function inside levels of
     indentation and control structure.
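
     A sketch of the early-return shape, using a made-up
     store_name() function and made-up error codes.  Each
     failed condition exits at once, and the real work sits
     at the outermost indent level at the end.

```cpp
#include <cassert>
#include <cstring>

/* Hypothetical error codes. */
enum Error {
        ERR_OK = 0,
        ERR_NULL,
        ERR_EMPTY,
        ERR_TOO_LONG
};

/* Hypothetical: copy a name into a fixed-size buffer. */
Error store_name(const char *name, char *out, std::size_t outlen)
{
        std::size_t     len;

        if (name == 0 || out == 0)
                return ERR_NULL;
        len = std::strlen(name);
        if (len == 0)
                return ERR_EMPTY;
        if (len >= outlen)
                return ERR_TOO_LONG;

        /* The main body: every precondition has been met. */
        std::memcpy(out, name, len + 1);
        return ERR_OK;
}
```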

o    With regards to complex control  structure  ...   If  a
     function  or  method  is complex enough it can simplify
     things to put all the "safety checks" and environmental
     changes  in  a wrapper function.  This divorces all the
     setup and shutdown complexity from  that  of  the  task
     being done, so that it is obvious what is being done.

Afterwards

     This  set  of  guidelines is trying to dump out as much
information as I can think about how to do things well.   It
actually  extends  well  past the area of formatting code to
design decisions about code,  overall  code  structure,  and
other issues.  At some point, I feel that I have barely grazed
the surface of the kinds of issues that you should  consider
when designing and writing code.  All that is above, and way
more, goes on in my head when I look at  code,  write  code,
design  code,  design systems, and do everything else that I
do.  I hope I have at least provided some idea of the things
you  should consider to do a better job designing and imple-
menting a system.

     I'll try to do a better job organizing this document in
the  future,  but  it was tough enough just trying to dig up
all the things to consider and get it written down :)