# Introduction to Channel Theory

Posted on Mon 06 June 2011 in Situation Theory

The channel theory of Jerry Seligman and Jon Barwise [1] is intended to
help us understand information flows of the following sort: *a*'s being
*F* carries the information that *b* is *G*. For example, we might want
a general framework in which to understand how a piece of fruit's
bitterness may carry the information that it is toxic, or how a mountain
side's having a particular distribution of flora can carry information
about the local micro-climate, or how a war leader's generous
gift-giving may carry information about the success of a recent
campaign, or the sighting of a gull can carry the information that land
is near. A classification is a simple sort of data structure
classifying a set of particulars by a set of types. But as my examples
above were intended to show, classifications are not only intended to
model 'categorical' data, as usually construed.

**Def 1.** A *classification* is a triple \( A = \langle tok(A),
typ(A), \vDash_{A} \rangle\), where \( tok(A)\) is a set of tokens,
\( typ(A)\) is a set of types, and \( \vDash_{A}\) is a binary relation
between them such that for every token \( a \in tok(A)\) and every type
\( \alpha \in typ(A)\), \( a \vDash_{A} \alpha\) if and only if
\( a\) is of type \( \alpha\).
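
In code, a classification is nothing more than two sets and a relation between them. The following is a minimal sketch in Python; the fruit tokens and types are invented for illustration, echoing the bitterness example above.

```python
# A classification per Def 1, as plain Python data: a set of tokens, a set
# of types, and the relation "token ⊨ type" as a set of (token, type) pairs.
tokens = {"fruit1", "fruit2"}
types = {"bitter", "toxic"}
holds = {("fruit1", "bitter"), ("fruit1", "toxic")}  # fruit1 ⊨ bitter, fruit1 ⊨ toxic

def classifies(a, alpha):
    """True iff token a is of type alpha, i.e. a ⊨ alpha."""
    return (a, alpha) in holds

print(classifies("fruit1", "toxic"))   # True
print(classifies("fruit2", "bitter"))  # False
```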

One might remark that a classification is not much more than a table
whose attributes each have only two possible values, a sort of degenerate
relational database. However, unlike a record/row in a relational
database, channel theory treats each token as a first-class object.
Relational databases require keys to guarantee that each tuple is
unique, and key constraints to model relationships between records in
tables. By treating tokens as first class objects, we may model
relationships using an *infomorphism*:

**Def 2.** Let \( A\) and \( B\) be two classifications. An
*infomorphism* \( f : A \rightleftarrows B\) is a pair of functions
\( f = \langle f^{\wedge}, f^{\vee} \rangle\) with \( f^{\wedge}:
typ(A) \rightarrow typ(B)\) and \( f^{\vee}: tok(B) \rightarrow
tok(A)\), satisfying the following property: for every type
\( \alpha\) in \( A\) and every token \( b\) in \( B\),
\( b \vDash_{B} f^{\wedge}(\alpha)\) if and only if
\( f^{\vee}(b) \vDash_{A} \alpha\).
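
Since everything here is finite, the defining biconditional of Def 2 can be checked by brute force. A sketch, with the two classifications and both maps invented for illustration:

```python
# Checking the infomorphism condition of Def 2 by exhaustive search.
# A classification is encoded as (tok, typ, holds) with holds ⊆ tok × typ.

def is_infomorphism(A, B, f_up, f_down):
    """f_up : typ(A) -> typ(B) and f_down : tok(B) -> tok(A), given as dicts.
    Verify: b ⊨_B f_up(α)  iff  f_down(b) ⊨_A α, for all b in tok(B), α in typ(A)."""
    tokA, typA, holdsA = A
    tokB, typB, holdsB = B
    return all(
        ((b, f_up[alpha]) in holdsB) == ((f_down[b], alpha) in holdsA)
        for b in tokB
        for alpha in typA
    )

# Invented toy classifications:
A = ({"a1", "a2"}, {"F"}, {("a1", "F")})
B = ({"b1", "b2"}, {"G"}, {("b1", "G")})
print(is_infomorphism(A, B, {"F": "G"}, {"b1": "a1", "b2": "a2"}))  # True
print(is_infomorphism(A, B, {"F": "G"}, {"b1": "a2", "b2": "a1"}))  # False
```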

An infomorphism is more general than an isomorphism between
classifications, i.e. an isomorphism is a special case of an
infomorphism. For example, an infomorphism \( f : A
\rightleftarrows B\) between classifications *A* and *B* might map two
or more types in *A* onto a single type \( \beta\) in *B*,
provided that from *B*'s point of view those types are
indistinguishable, or more precisely, that for all tokens *b* in *B*
and all types \( \alpha\) and \( \alpha^{\prime}\) in *A* with
\( f^{\wedge}(\alpha) = f^{\wedge}(\alpha^{\prime}) = \beta\),
\( f^{\vee}(b) \vDash_{A} \alpha\) if and only if \( f^{\vee}(b)
\vDash_{A} \alpha^{\prime}\). Note that this requirement does not mean that
those types in *A* are not distinguishable in *A* (or more technically,
are not co-extensional in *A*). There may be tokens in *A* outside the
range of \( f^{\vee}\) for which, for example, \( a
\vDash_{A} \alpha\) but not \( a \vDash_{A}
\alpha^{\prime}\). A dual observation may be made about the tokens of
*B*. Two tokens of *B* may be mapped onto the same token in *A*,
provided that those tokens in *B* are indistinguishable with respect to
the set of types \( \beta\) in *B* for which there exists some
\( \alpha\) such that \( f^{\wedge}(\alpha) = \beta\).
Again, this does not mean these same tokens in *B* are wholly
indistinguishable in *B*. For example, there may be types outside the
range of \( f^{\wedge}\) classifying them differently. Thus, an
infomorphism may be thought of as a kind of view or filter into the
other classification.

It is actually rather difficult to find infomorphisms between arbitrary classifications; in many cases there will be none. This is as it should be: if infomorphisms were too easy to come by, they would not be particularly meaningful, while a more stringent notion would not be very applicable. However, two classifications may be combined in a fairly standard way. For example, we can add them together:

**Def 3.** Given two classifications A and B, the sum of A and B is the
classification A+B such that:

- \( tok(A + B)=tok(A)\times tok(B)\),
- \( typ(A + B)\) is the disjoint union of \( typ(A)\) and \( typ(B)\), given by \( \langle 0,\alpha \rangle\) for each type \( \alpha \in typ(A)\) and \( \langle 1,\beta \rangle\) for each type \( \beta \in typ(B)\), and
- for each token \( \langle a,b\rangle \in tok(A+B)\), \( \langle a,b \rangle \vDash_{A+B} \langle 0,\alpha \rangle\) iff \( a \vDash_{A} \alpha\), and \( \langle a,b\rangle \vDash_{A+B} \langle 1,\beta \rangle\) iff \( b \vDash_{B} \beta\).
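
Def 3 translates directly into set comprehensions. A sketch, continuing the toy encoding of a classification as a triple (tok, typ, holds), with invented example data:

```python
# The sum A + B of two classifications per Def 3.
def sum_classification(A, B):
    tokA, typA, holdsA = A
    tokB, typB, holdsB = B
    tok = {(a, b) for a in tokA for b in tokB}                 # tok(A+B) = tok(A) × tok(B)
    typ = {(0, al) for al in typA} | {(1, be) for be in typB}  # disjoint union of types
    holds = (
        {((a, b), (0, al)) for (a, b) in tok for al in typA if (a, al) in holdsA}
        | {((a, b), (1, be)) for (a, b) in tok for be in typB if (b, be) in holdsB}
    )
    return (tok, typ, holds)

# Invented toy data:
A = ({"a1", "a2"}, {"F"}, {("a1", "F")})
B = ({"b1"}, {"G"}, {("b1", "G")})
tok, typ, holds = sum_classification(A, B)
print((("a1", "b1"), (0, "F")) in holds)  # True: a1 ⊨ F, so ⟨a1,b1⟩ ⊨ ⟨0,F⟩
print((("a2", "b1"), (0, "F")) in holds)  # False: a2 ⊭ F
```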

**Remark.** For any two classifications A and B there exist
infomorphisms \( \varepsilon_{A} : A \rightleftarrows A+B\)
and \( \varepsilon_{B} : B \rightleftarrows A+B\) defined such
that \( \varepsilon_{A}^{\wedge}(\alpha) = \langle 0,\alpha \rangle\)
and \( \varepsilon_{B}^{\wedge}(\beta) = \langle 1,\beta \rangle\)
for all types \( \alpha \in typ(A)\) and \( \beta \in typ(B)\), and
\( \varepsilon_{A}^{\vee}(\langle a,b\rangle) = a\) and
\( \varepsilon_{B}^{\vee}(\langle a,b\rangle) = b\) for each token
\( \langle a,b\rangle \in tok(A+B)\).
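
The remark can be verified mechanically for a small example. The sketch below (toy data, invented names) builds the sum as in Def 3 and checks the Def 2 biconditional for \( \varepsilon_{A}\) directly:

```python
# Verify that ε_A, with ε_A^∧(α) = ⟨0,α⟩ and ε_A^∨(⟨a,b⟩) = a, satisfies
# the infomorphism condition with respect to the sum A + B.
A = ({"a1", "a2"}, {"F"}, {("a1", "F")})
B = ({"b1"}, {"G"}, {("b1", "G")})

tok_sum = {(a, b) for a in A[0] for b in B[0]}
holds_sum = (
    {((a, b), (0, al)) for (a, b) in tok_sum for al in A[1] if (a, al) in A[2]}
    | {((a, b), (1, be)) for (a, b) in tok_sum for be in B[1] if (b, be) in B[2]}
)

# ⟨a,b⟩ ⊨_{A+B} ⟨0,α⟩  iff  a ⊨_A α, for every token of the sum and type of A:
eps_A_ok = all(
    (((a, b), (0, al)) in holds_sum) == ((a, al) in A[2])
    for (a, b) in tok_sum
    for al in A[1]
)
print(eps_A_ok)  # True
```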

To see how this is useful, we turn now to Barwise and Seligman's notion of an information channel.

**Def 4.** A channel **C** is an indexed family of infomorphisms \(
\{ f_{i} : A_{i} \rightleftarrows C \}_{i \in I}\), each having as its
codomain a common classification C, called the *core* of the channel.

As it turns out, in a result known as the Universal Mapping Property of
Sums, given a binary channel **C** = \( \{ f : A \rightleftarrows
C, g : B \rightleftarrows C \}\), and the infomorphisms \(
\varepsilon_{A} : A \rightleftarrows A+B\) and \(
\varepsilon_{B} : B \rightleftarrows A+B\), there is a unique
infomorphism \( f+g : A+B \rightleftarrows C\) making the evident
diagram commute: \( f = (f+g) \circ \varepsilon_{A}\) and \( g = (f+g)
\circ \varepsilon_{B}\).

The result is general and can be applied to arbitrary channels and sums.

I still haven't exactly shown how this is useful. To do that, we introduce some inference rules that can be used to reason from the periphery of a channel to its core and back again.

A sequent \( \langle \Gamma ,\Delta \rangle\) is a pair of sets of types. A sequent \( \langle \Gamma ,\Delta \rangle\) is a sequent of a classification \( A\) if all the types in \( \Gamma\) and \( \Delta\) are in \( typ(A)\).

**Def 5.** Given a classification \( A\), a token \( a \in
tok(A)\) is said to *satisfy* a sequent \( \langle \Gamma,
\Delta \rangle\) of \( A\) if, whenever \( a \vDash_{A}
\alpha\) for every type \( \alpha \in \Gamma\), then
\( a \vDash_{A} \alpha\) for some type \( \alpha \in
\Delta\). If every \( a \in tok(A)\) satisfies \( \langle
\Gamma, \Delta \rangle\), then we say that \( \Gamma\)
*entails* \( \Delta\) in \( A\), written \( \Gamma
\vdash_{A} \Delta\), and \( \langle \Gamma, \Delta
\rangle\) is called a *constraint* of \( A\).
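
With finite classifications, Def 5 is again directly computable. A sketch, using an invented toy classification of switches (anticipating the flashlight example below):

```python
# Satisfaction and entailment per Def 5, for a classification (tok, typ, holds).
def satisfies(holds, a, gamma, delta):
    """Token a satisfies ⟨Γ,Δ⟩: if a ⊨ every α in Γ, then a ⊨ some α in Δ."""
    if all((a, al) in holds for al in gamma):
        return any((a, al) in holds for al in delta)
    return True

def entails(C, gamma, delta):
    """Γ ⊢_C Δ: every token of C satisfies ⟨Γ,Δ⟩, i.e. ⟨Γ,Δ⟩ is a constraint."""
    tok, typ, holds = C
    return all(satisfies(holds, a, gamma, delta) for a in tok)

# Invented toy data: each switch is either On or Off, never both.
switches = ({"s1", "s2"}, {"On", "Off"}, {("s1", "On"), ("s2", "Off")})
print(entails(switches, set(), {"On", "Off"}))  # True: every switch is On or Off
print(entails(switches, {"On", "Off"}, set()))  # True: no switch is both
print(entails(switches, {"On"}, {"Off"}))       # False: s1 is On but not Off
```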

Barwise and Seligman introduce two inference rules: *f*-Intro and
*f*-Elim. Given an infomorphism \( f : A \rightleftarrows C\) from a
classification *A* to a classification *C*, write \( \Gamma^{f}\) for
the image of \( \Gamma\) under \( f^{\wedge}\) and \( \Gamma^{-f}\) for
its inverse image:

\( f\text{-Intro: } \frac{\Gamma^{-f} \vdash_{A} \Delta^{-f}}{\Gamma \vdash_{C} \Delta}\)

\( f\text{-Elim: } \frac{\Gamma^{f} \vdash_{C} \Delta^{f}}{\Gamma \vdash_{A} \Delta}\)

The two rules have different properties. *f*-Intro preserves validity
but fails to preserve invalidity; *f*-Elim preserves invalidity but
fails to preserve validity. *f*-Elim is, however, valid precisely for
those tokens *a* in *A* for which there is a token *c* of *C* with
\( f^{\vee}(c) = a\).
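
These properties can be seen concretely. In the sketch below (all names and data invented), the maps \( f^{\wedge}(\mathrm{On}) = \mathrm{On}^{*}\), \( f^{\wedge}(\mathrm{Lit}) = \mathrm{Lit}^{*}\), \( f^{\vee}(c_1) = a_1\), \( f^{\vee}(c_2) = a_2\) form an infomorphism whose core validates a constraint that fails at the periphery for exactly the one token no core token maps onto:

```python
# f-Elim does not preserve validity: a core constraint, pulled back to the
# periphery, can fail for tokens not connected to the core.
def entails(C, gamma, delta):
    tok, typ, holds = C
    return all(
        (not all((a, g) in holds for g in gamma)) or any((a, d) in holds for d in delta)
        for a in tok
    )

# Core C: c1 is On* and Lit*; c2 is neither. So On* ⊢ Lit* holds in C.
C = ({"c1", "c2"}, {"On*", "Lit*"}, {("c1", "On*"), ("c1", "Lit*")})
# Periphery A: a1 = f∨(c1), a2 = f∨(c2); a3 lies outside the range of f∨
# and is On but not Lit.
A = ({"a1", "a2", "a3"}, {"On", "Lit"}, {("a1", "On"), ("a1", "Lit"), ("a3", "On")})

print(entails(C, {"On*"}, {"Lit*"}))  # True: valid in the core
print(entails(A, {"On"}, {"Lit"}))    # False: a3 violates the pulled-back sequent
```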

Suppose then that we have a channel. At the core is a classification of
flashlights, and at the periphery are classifications of bulbs and
switches. We can take the sum of the classifications of bulbs and
switches. We know that there are infomorphisms from these
classifications to the sum (and so this too makes up a channel), and
using *f*-Intro, we know that any sequents of the classifications of
bulbs and switches will still hold in the sum classification bulbs +
switches. But note that since the sum classification connects every
bulb token with every switch token, any sequents that might properly
hold *between* bulbs and switches will not hold in it. Similarly, all
the sequents holding in the classification bulbs + switches will hold
in the core of the flashlight channel. However, there will be
constraints in the core (namely those holding *between* bulbs and
switches) not holding in the sum classification bulbs + switches.

In brief: suppose that we know that a particular switch is in the On
position, and that it is a constraint of switches that a switch's being
in the On position precludes its being in the Off position. We can
project this constraint into the core of the flashlight channel
reliably. But in the core additional constraints hold (the ones we are
interested in). Suppose that in the core of the channel there is a
constraint that if a switch is On in a flashlight, then the bulb is Lit
in that flashlight. We would like to know that because *this* switch is
in the On position, a particular bulb will be Lit. How can we do it?
Using *f*-Elim we can pull the constraint of the core back to the sum
classification. But note that this constraint is *not valid* in the sum
classification: it fails precisely for those bulbs and switches that
are not connected in the channel. In this way, we can reason from local
information to a distant component of a system, but in so doing, we
lose the guarantee that our reasoning is valid; it is sound only for
the tokens actually connected by the channel.
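
The first step of this story can be replayed on toy data. The sketch below (classifications invented for illustration) projects the switches constraint "On and Off exclude each other" into the sum switches + bulbs, where *f*-Intro guarantees it still holds, while no genuine switch–bulb constraint holds in the sum:

```python
# A constraint of switches survives the passage into the sum
# switches + bulbs (f-Intro preserves validity).
def entails(C, gamma, delta):
    tok, typ, holds = C
    return all(
        (not all((a, g) in holds for g in gamma)) or any((a, d) in holds for d in delta)
        for a in tok
    )

switches = ({"s1", "s2"}, {"On", "Off"}, {("s1", "On"), ("s2", "Off")})
bulbs = ({"b1", "b2"}, {"Lit", "Dark"}, {("b1", "Lit"), ("b2", "Dark")})

# The sum per Def 3:
tok = {(s, b) for s in switches[0] for b in bulbs[0]}
typ = {(0, t) for t in switches[1]} | {(1, t) for t in bulbs[1]}
holds = (
    {((s, b), (0, t)) for (s, b) in tok for t in switches[1] if (s, t) in switches[2]}
    | {((s, b), (1, t)) for (s, b) in tok for t in bulbs[1] if (b, t) in bulbs[2]}
)
total = (tok, typ, holds)

print(entails(switches, {"On", "Off"}, set()))         # True in switches
print(entails(total, {(0, "On"), (0, "Off")}, set()))  # still True in the sum
# But the sum pairs every switch with every bulb, so no genuine
# switch–bulb constraint holds in it:
print(entails(total, {(0, "On")}, {(1, "Lit")}))       # False: (s1, b2) is On but Dark
```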

[1] Barwise, Jon, and Jerry Seligman. 1997. *Information Flow: The Logic
of Distributed Systems*. Cambridge tracts in theoretical computer
science 44. Cambridge: Cambridge University Press.