Natural Abstraction Levels

I have some observations to make about when to choose one language over another for solving a particular problem.

The short story is that the decision to use a certain language comes down to how closely the "natural abstraction level" of the language matches the "natural abstraction level" of the problem domain. The best language for a situation has as close to a 1:1 mapping from the problem domain entities to default language entities as you can get.

Here are extremes which starkly show the balance:

Obviously bad:
- Writing a hospital's record/billing system in Assembler.
  
  You'd spend so much time setting up record structures in memory, with consistent accessors and argument passing conventions, that you would basically have invented at LEAST a C dialect (via macros), which you would then use to build an object system. And all of this would come before you even started representing the problem domain in the codebase. Debugging... oops. :)
- Writing an x86 boot loader and initial device configuration system in Prolog.
  
  Um. I don't think I need to explain why this is simply a terrible idea.
Obviously good:
- Writing a compiler in ML.
  
  The basic entities of a compiler are found by looking at the definition of the language you are compiling. ML's awesome datatype definition system lets you deftly specify one-to-one mappings from the abstract compilation entities to ML-defined structures.
- Writing a device driver in C.
  
  Device drivers deal with hardware sized quantities that live at highly specific places in the address space. Luckily, that is EXACTLY what C was designed to manipulate.
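
To make the device driver case concrete, here is a minimal sketch of driver-style register access in C. The device, its register layout, and the names `uart_regs`, `UART_TX_READY`, and `uart_try_send` are all hypothetical, invented for illustration. On real hardware the pointer would be a fixed address taken from the device's datasheet; here the caller passes any `uart_regs` object, so the sketch can be exercised on an ordinary host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block for an imaginary UART; not a real device. */
typedef struct {
    volatile uint32_t status; /* bit 0: transmitter ready (assumed layout) */
    volatile uint32_t data;   /* low 8 bits: byte to transmit (assumed layout) */
} uart_regs;

#define UART_TX_READY 0x1u

/* Try to send one byte. Returns 1 if the byte was written, 0 if the
   device reported busy and nothing was written. */
int uart_try_send(uart_regs *uart, uint8_t byte)
{
    assert(uart != NULL);
    if ((uart->status & UART_TX_READY) == 0)
        return 0;          /* device busy */
    uart->data = byte;     /* a single store of a hardware sized quantity */
    return 1;
}
```

The problem domain entities (a 32-bit status register, a 32-bit data register, a bit mask) map directly onto default language entities (`uint32_t` fields, a pointer, a bitwise AND), with nothing extra built in between.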

The real question is not which language to choose, but how to classify the "natural abstraction level" of the problem domain you are working in, as well as the "natural abstraction level" of any particular language.

I suspect it has to do with how varied the types of the entities in the problem domain are. If you have just a few types, then you'd have a low "natural abstraction level" and a low level language like C would suffice.

If you have a VERY large range of possible types (or properties) of the entities in the problem domain, then you'd have a high "natural abstraction level" and a high level language would be the natural fit (consider the "typeless nature" of Scheme, especially in function arguments).

Now, most of the code people write exists to bridge the difference between the abstraction levels. The actual act of manipulating the entities is quite a small amount of code. If I choose a language with a lower abstraction level than the problem domain, then I must create an "abstraction infrastructure" in the lower level language to create the right 1:1 mapping from problem domain entities to newly minted abstracted entities in the language. If I have a language with a high natural abstraction level and I want to implement something needing an extremely low abstraction level, chances are I'm going to author an infrastructure to defeat the naturally high abstraction level, such as a C library which sits underneath a Python program.
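
As an illustration of such an "abstraction infrastructure", here is a minimal sketch of an arithmetic expression tree in C. In ML this would be roughly a single datatype declaration plus a pattern match; in C the tag, the union, and the constructors all have to be written by hand. The names `expr`, `expr_num`, `expr_bin`, and `expr_eval` are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-built tag telling us which variant an expression is. */
typedef enum { EXPR_NUM, EXPR_ADD, EXPR_MUL } expr_tag;

typedef struct expr {
    expr_tag tag;
    union {
        int num;                                        /* EXPR_NUM */
        struct { const struct expr *lhs, *rhs; } bin;   /* EXPR_ADD, EXPR_MUL */
    } u;
} expr;

/* Constructors: the infrastructure a higher level language gives for free. */
expr expr_num(int n)
{
    expr e;
    e.tag = EXPR_NUM;
    e.u.num = n;
    return e;
}

expr expr_bin(expr_tag tag, const expr *lhs, const expr *rhs)
{
    expr e;
    assert(tag == EXPR_ADD || tag == EXPR_MUL);
    assert(lhs != NULL && rhs != NULL);
    e.tag = tag;
    e.u.bin.lhs = lhs;
    e.u.bin.rhs = rhs;
    return e;
}

/* The part that actually manipulates the entities is comparatively small. */
int expr_eval(const expr *e)
{
    assert(e != NULL);
    switch (e->tag) {
    case EXPR_NUM: return e->u.num;
    case EXPR_ADD: return expr_eval(e->u.bin.lhs) + expr_eval(e->u.bin.rhs);
    case EXPR_MUL: return expr_eval(e->u.bin.lhs) * expr_eval(e->u.bin.rhs);
    }
    return 0; /* not reached for well-formed tags */
}
```

Building `(2 + 3) * 4` from `expr_num` and `expr_bin` and passing it to `expr_eval` yields 20. Everything above `expr_eval` is the bridge between the two abstraction levels rather than the problem itself.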

It is usually easier to create an "abstraction infrastructure" in a lower level language like C, in order to manipulate a higher level set of concepts, than it is to tear apart a high level language. That is why languages like C are popular and languages like ML are not.

Easier isn't a subjective thing; I can give a qualitative metric to it. It has to do with "what to leave out". :) In creating a higher level abstraction from a low abstraction language, I only create what is necessary and no more. The code becomes the explicit declaration of the capabilities, and the capabilities you left out are simply the stuff that wasn't written. Going the other way, one has to write a lot of code to REMOVE functionality from the language, and suddenly you have a lot of caveats like "well, you can call the C function from Scheme, but if it allocates memory I can't ever get rid of it because it isn't in the garbage collector's world view".

Instead of a small list of "here is what is legal" you end up with a BIG list of "here is what is illegal" that you have to shoehorn into a language which WANTS to do all of the things which are now illegal. The first is ALWAYS easier to reconcile in the mind of a programmer, simply because it is a smaller set to validate.
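
A minimal sketch of the "small list of what is legal" idea, assuming a hypothetical fixed-capacity stack of integers (the names `int_stack`, `stack_init`, `stack_push`, and `stack_pop` are made up for this example). The three functions are the entire set of capabilities; growing, random access, and iteration are not forbidden anywhere, they were just never written.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

typedef struct {
    int items[STACK_CAPACITY];
    size_t count;
} int_stack;

/* Legal operation 1: start with an empty stack. */
void stack_init(int_stack *s)
{
    assert(s != NULL);
    s->count = 0;
}

/* Legal operation 2: push a value. Returns 0 if the stack is full. */
int stack_push(int_stack *s, int value)
{
    assert(s != NULL);
    if (s->count == STACK_CAPACITY)
        return 0;
    s->items[s->count++] = value;
    return 1;
}

/* Legal operation 3: pop into *out. Returns 0 if the stack is empty. */
int stack_pop(int_stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->count == 0)
        return 0;
    *out = s->items[--s->count];
    return 1;
}
```

Validating this interface means checking three functions. The reverse situation would mean checking every built-in behavior of a higher level language against a list of prohibitions.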

Specification of Natural Abstraction Level

This comes in the form of three categories: Representation, Control, and Debugging. These categories represent what is available to you by default in either a language or a problem domain.

XXX in progress

Representation
- Hardware Sized Entities
  
  Entities which are exactly integral/real registers or memory locations.
- Simple Entities
  
  Integral and real variables, not necessarily hardware sized, but they can be. Arrays of integral and real variables are included in this section.
- Complex Entities
  
  Entities which by their very nature are complex. Examples: Perl's hash tables held in a single variable, Matlab's records.
- Aggregate Entities
  
  Groups of primitive data types and/or other aggregate entities.
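
As a rough sketch of how these representation categories look in one particular language, here is what C offers by default. The type and field names (`control_word`, `sample_set`, `reading_record`, `record_mean`) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware sized entity: exactly 32 bits, matching a register or memory word. */
typedef uint32_t control_word;

/* Simple entities: plain numeric variables and arrays of them. */
#define SAMPLE_COUNT 4
typedef double sample_set[SAMPLE_COUNT];

/* Aggregate entity: a group of primitives and other aggregates. */
typedef struct {
    control_word flags;
    sample_set readings;
    int patient_id;
} reading_record;

/* Complex entities (for example a hash table held in one variable) have no
   built-in C equivalent; they would have to be built from the pieces above. */

/* Average of the readings stored in one record. */
double record_mean(const reading_record *r)
{
    double sum = 0.0;
    size_t i;
    assert(r != NULL);
    for (i = 0; i < SAMPLE_COUNT; i++)
        sum += r->readings[i];
    return sum / SAMPLE_COUNT;
}
```

By this classification C covers the first, second, and fourth categories out of the box, which is consistent with placing its natural abstraction level low.
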
Control

Program flow constructs such as goto, looping, threading, exceptions, message passing, higher order functions, continuations.
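
A minimal sketch of a few of these control constructs as C provides them, assuming nothing beyond the standard library: a loop, a higher order function built from a function pointer, and a crude exception built from `setjmp`/`longjmp`. The function names are hypothetical. Continuations and message passing have no direct C construct and would need infrastructure of their own.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Higher order function: apply f to every element of xs in place. */
void map_ints(int *xs, size_t n, int (*f)(int))
{
    size_t i;
    assert(xs != NULL && f != NULL);
    for (i = 0; i < n; i++)   /* looping */
        xs[i] = f(xs[i]);
}

int twice(int x) { return 2 * x; }

/* Crude exception: jump back to a recovery point on division by zero. */
static jmp_buf recovery_point;

static int checked_div(int a, int b)
{
    if (b == 0)
        longjmp(recovery_point, 1);   /* the "throw" */
    return a / b;
}

/* Returns 1 and stores a / b in *out on success, or 0 (leaving *out
   untouched) if the "exception" fired. */
int try_div(int a, int b, int *out)
{
    assert(out != NULL);
    if (setjmp(recovery_point) != 0)
        return 0;                     /* the "catch" */
    *out = checked_div(a, b);
    return 1;
}
```

Calling `map_ints` with `twice` on the array `{1, 2, 3}` leaves `{2, 4, 6}`, and `try_div(7, 0, &q)` returns 0 without touching `q`.
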
Debugging

The ability to inspect and alter the entities and constructs of the above two categories.