Combinative and Coincident Architectures in Programming Language Constructs

Posted on Thu 02 September 2010 in Situation Theory

In my previous two posts, "Architecture of Information, some notes" and "More on Combinative and Coincident Information Architectures", I introduced John Perry and David Israel's notion of information architectures. In this post I would like to explore, as a rather straightforward exercise, how they can inform us about the semantics of some common programming language constructs.

Example 1: Variable declaration and assignment - a simple coincident architecture

Suppose that we have the following in the main function of a procedural programming language like C.

    int main(){
      // Statement 1: declare a character variable named c
      char c;
      // Statement 2: assign the character value 'd' to the
      // variable named c. 
      c = 'd';
      ...
    }

The scope of the two programming statements, "char c;" and "c = 'd';", is the function main. This is the relevant context in which to understand them.

The reflexive and incremental contents of these programming statements may be understood as follows:

Reflexive:

(1a) The fact that "c;" is preceded by "char " in the 1st statement "char c;" carries the information that the variable referred to by "c" in the statement is to be a variable of type char.

(1b) The fact that "c" is followed by " = " and then by "'d';" in the 2nd statement "c = 'd';" carries the information that the variable indicated by "c" in "c = 'd';" is to be assigned the character value 'd'.

Incremental:

(2a) The fact that "c;" is preceded by "char " in the 1st statement "char c;" carries the information that the variable c is to be of type char.

(2b) The fact that "c" is followed by " = " and then by "'d';" in the 2nd statement "c = 'd';" carries the information that the variable c is to be assigned the character value 'd'.

Now we consider the architectural contents. But what is the relationship between these two programming statements, and are their respective contents induced by this relationship or reflected by it?

The relevant architectural relationship between these statements in the program is that they both refer to the same variable c, and that the assignment is appropriate to its type. The architectural constraint is that any two instances of an identifier in the same scope refer to the same variable. The connecting fact is that these two instances are in the same scope, in this case the function main. We see that this architectural fact induces the compiler to assign the value 'd' to the variable c. That is, the architectural relationships between the programming statements, under the constraints imposed by the programming language compiler, induce a particular interpretation of the statements that determines the relationship between the character value, the variable, and the variable type. Hence it is a coincident architecture.

For example, we might say that the 2nd statement carries the information that the variable named c of type char is to be assigned the value 'd', or that the 1st statement carries the information that the variable to which the value 'd' is to be assigned is named c and is of type char.
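The same-scope constraint can be made concrete with a minimal sketch in C. The function names same_scope_value and shadowed_value are my own illustrative choices and are not part of the example above. The second function shows, as a contrast, what happens when the connecting fact fails: an inner block that redeclares c opens a new scope, so the two occurrences of "c" no longer name the same variable.

```c
/* Hypothetical helper: both occurrences of `c` sit in one scope,
   so the compiler treats them as the same variable. */
char same_scope_value(void) {
    char c;      /* statement 1: declare c as a char */
    c = 'd';     /* statement 2: same scope, hence the same variable c */
    return c;
}

/* Hypothetical helper: the inner block redeclares `c`, so the
   assignment inside it does not touch the outer variable. */
char shadowed_value(void) {
    char c = 'd';
    {
        char c = 'x';   /* a different variable that shadows the outer c */
        (void)c;        /* mark it as used to avoid a compiler warning */
    }
    return c;           /* the outer c still holds 'd' */
}
```

Called from main, both functions return 'd': the first because the declaration and the assignment share a scope, the second because the inner 'x' was assigned to a distinct variable.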

Example 2: a simple data structure

In a language like C, a structure is a named group of variables, constants, and other structures. The motivation behind such structures is the observation that one will typically have multiple items of data that are naturally grouped together; objects after all have many attributes that may be of interest.

For example, if a program needed to track the heights and weights of different persons, it is crucial that the data not be mistakenly associated with the wrong person; they should be bundled together, because the object that these data are about, the person whose weight and height are being tracked, provides the connection that makes these data usefully and meaningfully associated. The structure provides a clean and useful means of accomplishing this goal.

    // define a structure type
    struct person {
        char *name;
        double weight; // in kilos
        double height; // in meters
    };

    int main(void) {
        // declare two instances of type struct person
        struct person p1;
        struct person p2;

        // assign values to these instances
        p1.name = "Elwood";
        p1.weight = 92.5;
        p1.height = 6.2;

        p2.name = "Marja";
        p2.weight = 54.2;
        p2.height = 5.9;

        return 0;
    }

The above program first declares a structure with three constituent data members: a string of characters called name, and two real numbers called weight and height respectively. This declaration is followed by a declaration of two instances of this structure, p1 and p2 (for person 1 and person 2). Then we assign name, weight, and height values to each instance. The result of these declarations will be that the computer executing this program will reserve an appropriate amount of memory for each of these instances. The amount of memory is determined by a variable's type, and in this case the type is a program-defined structure. Then certain bit patterns representing values will be written to the reserved memory in an appropriate fashion.
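As a rough sketch of how the type determines the memory reserved, the following C fragment asks the compiler for the size of the structure and compares it with the sizes of its members. The helper names person_size and member_sum are hypothetical. The exact numbers depend on the platform and compiler, so none are assumed here; the only safe general claim is that the structure is at least as large as its members combined, since the compiler may insert padding.

```c
#include <stddef.h>

struct person {
    char *name;
    double weight; /* in kilos */
    double height; /* in meters */
};

/* Hypothetical helper: bytes reserved for one instance of struct person. */
size_t person_size(void) {
    return sizeof(struct person);
}

/* Hypothetical helper: combined size of the three members, ignoring
   any padding the compiler may add between or after them. */
size_t member_sum(void) {
    return sizeof(char *) + sizeof(double) + sizeof(double);
}
```

Note that the string itself is not stored inside the structure: name is a pointer, so only the pointer's bit pattern is written into the reserved memory.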

Now, there is a very intuitive semantic interpretation of this code that may tempt us to call this a combinative architecture. After all, the structure instance p2 refers to a person called Marja, whose weight is 54.2 kilograms and whose height is 5.9 meters, right? Isn't this exactly what Perry and Israel are talking about when they speak about file folders and labeled records? Well, yes...almost. A data structure is a way of combining different items of information together in a useful way, often in just the way that Perry and Israel are talking about. But, and this is a big but, this interpretation has nothing to do with what the program means to the compiler, or how it will be executed. Even the fact that weight and height are supposed to be in kilograms and meters is not something that can be enforced directly.
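To illustrate the point about units, here is a small sketch in which the two values are stored in the wrong members. The function make_swapped is a hypothetical name of my own. As far as the compiler is concerned both members are simply of type double, so the program compiles and runs exactly as before; the comments about kilos and meters carry no weight with it.

```c
struct person {
    char *name;
    double weight; /* intended to be in kilos */
    double height; /* intended to be in meters */
};

/* Hypothetical helper: builds an instance with the two values swapped.
   The compiler raises no objection, since both members are doubles. */
struct person make_swapped(void) {
    struct person p;
    p.name = "Marja";
    p.weight = 5.9;   /* the height value, stored as a weight */
    p.height = 54.2;  /* the weight value, stored as a height */
    return p;
}
```

Whether the resulting instance says anything true about Marja depends entirely on the programmer's intended interpretation, not on anything the language checks.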

Programmers code with their semantic interpretations of the code in mind, and the names and groupings of the variables are chosen accordingly, but the semantic relations with 'external' people and things like Elwood and Marja are not grounded in the language itself, but in the complex of causal relations that ensure that the information in the structure reflects the respective facts about Elwood and Marja. In short, it is a wholly different semantic level. As far as the program is concerned, the things to which the instance identifiers p1 and p2 refer are not people named Elwood and Marja, but sections of reserved computer memory storing various data.

In other words, our program is not all that different from that found in example 1. While a structure does gather together diverse sets of data into a single named structure, exactly like one of Perry and Israel's files, the semantics of a programming language simply aren't concerned with whatever is "out there". Programs induce their indicated contents, via the action of a compiler.

Is There a Combinative Architecture in There?

In short, yes, as we've already hinted. It is entirely possible to generate random programs that have no semantic relation to the rest of the world. But programs are written by programmers with particular intended interpretations in mind*. In order that these programs accomplish useful work (from the perspective of the programmer, or customer), the induced relationships between programming elements must have stable, meaningful, and coherent interpretations for the programmer and other users. The names of variables serve this purpose by suggesting intended uses to which a function, variable, structure, or object is to be put. Data structures and classes accomplish this goal by providing a stable, easily accessible, and meaningful grouping of elements that reflects the structure among the contents they indicate for their users. Thus the structure person brings together relevant properties of people so that, if it is used as intended, the fact that two properties belong to the same structure instance reflects the fact that the thing that the instance is indicating, modeling, or referring to has both of those properties.

The interesting aspect of this example is that the signals are simultaneously involved in two architectures and have two indicated contents: one that is induced in the compiler by the structural relations between the program's textual elements, and another in which the structural relations between program elements reflect the structural relationships of the object, in the world of the programmer and program user, that the program is intended to model, and which is made real or concrete by use.

*The semantic gap between a programmer's intended interpretation and that of the compiler is a frequent cause of errors, bugs, and security flaws.

REFERENCES

Israel, David, and John Perry. "Information and Architecture." In Proceedings of the Second Conference on Situation Theory and Its Applications, edited by Jon Barwise, Jean Mark Gawron, Gordon Plotkin, and Syun Tutiya, 147-160. Vol. 2. Stanford, CA: Center for the Study of Language and Information (CSLI), 1992.