2.3 Enumerable sets

Another useful notion we will be using time and again is that of effective enumerability. As with effective decidability and effective computability, we will first explain what it means to be enumerable in principle, before moving on to what it means to be effectively so. Straight from the book:

A set $\Sigma$ is enumerable if its members can – at least in principle – be listed off in some order (a zero-th, first, second) with every member appearing on the list; repetitions are allowed, and the list may be infinite.

What this means is that you can give a (possibly infinite) list that contains every single member of $\Sigma$, and each member is guaranteed to appear a finite number of entries into this list. The finite case is fairly obvious. If $\Sigma=\emptyset$ (where $\emptyset$ denotes the empty set containing no elements) then trivially all elements of $\Sigma$ will appear on any list (say, the empty list containing no entries). If $\Sigma$ is larger, but still finite, we can imagine just going through each of the elements and listing them one by one. For example, $0, 1, 2, 3, 4, 5$ is an enumeration of the finite set $\{0,1,2,3,4,5\}$. Similarly, $0, 1, 5, 3, 4, 3, 0, 1, 2$ also enumerates this set, although with redundancies and not in a natural order.

The tricky case is infinite lists. The condition you need to pay special attention to in this instance is that each element of $\Sigma$ must appear a finite number of entries into the list. So, for example, the following lists each enumerate $\Sigma=\mathbb N$ (where $\mathbb N$ denotes the infinite set containing all natural numbers):

1. $0, 1, 2, 3, 4, 5,\ldots$
2. $1, 0, 3, 2, 5, 4,\ldots$
3. $0, 1, 2, 2, 3, 3, 3, 4, 4, 4, 4,\ldots$

Notice that in each case (assuming our patterns hold) we can determine exactly how many entries into the list a given number $n$ will appear. In 1. $n$ is in the $n^\text{th}$ position. In 2. $n$ appears $n+1$ entries down the list if $n$ is even, and $n-1$ entries down if $n$ is odd. In 3. the last copy of $n$ appears $\sum_{i=0}^n i=n(n+1)/2$ entries into the list. Note that to make the math easier, we start our counting at zero: thus, the left-most element listed is the “zero-th”, the next is the first, the next is the second and so on. Now 2. and 3. are contrived examples, but the point they make is that each $n$ appears a finite number of entries into the list, and we can tell exactly how far into the list it is. Contrast that with the following non-examples:

1. $0, 2, 4, 6,\ldots, 1, 3, 5, 7,\ldots$
2. $100, 99, 98, 97,\ldots$
3. $1, 9, 0, 26, 82, 0, 13,\ldots$

In 1., all the odd numbers seem to appear an infinite number of places into the list. This violates exactly the condition we care about. In 2. there’s still an obvious pattern, but any number greater than 100 doesn’t seem to appear at all. Finally, in 3. there’s no clear pattern to how the numbers are being listed. It is entirely possible that this is the beginning of some valid enumeration, but without more information it’s impossible to tell. So despite the fact that $\mathbb N$ is enumerable, none of these three lists is a valid way to enumerate it.
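To make the three valid enumerations of $\mathbb N$ concrete, here is a minimal Python sketch of each as a generator. The function names are my own, not the book's, and I am assuming the patterns continue the way they appear to. The helper `first_position` finds how far into a list a number first shows up; on a non-example it would simply never return, which is the whole problem with those lists.

```python
from itertools import count, islice

def natural_order():
    """List 1: 0, 1, 2, 3, ... (n sits at position n)."""
    yield from count(0)

def swapped_pairs():
    """List 2: 1, 0, 3, 2, 5, 4, ... (each even/odd pair is swapped)."""
    for even in count(0, 2):
        yield even + 1
        yield even

def growing_repeats():
    """List 3: 0, 1, 2, 2, 3, 3, 3, ... (0 once, then n repeated n times)."""
    yield 0
    for n in count(1):
        for _ in range(n):
            yield n

def first_position(enumeration, target):
    """Return the zero-based position at which target first appears.

    Loops forever if target never appears in the enumeration.
    """
    for position, value in enumerate(enumeration):
        if value == target:
            return position

print(list(islice(swapped_pairs(), 6)))
print(first_position(swapped_pairs(), 7))
```

Every call to `first_position` on these three generators terminates, because every natural number appears after finitely many entries.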

So hopefully that gives you a bit of an intuitive notion of the idea of enumerability. For the more formally-inclined, here is how this is defined mathematically:

The set $\Sigma$ is enumerable iff either $\Sigma$ is empty or else there is a surjective (onto) function $f:\mathbb N\rightarrow\Sigma$ (so that $\Sigma$ is the range of $f$). We say that such a function enumerates $\Sigma$.

The text proves that these two definitions are equivalent, but it’s fairly straightforward, so if you’re having trouble seeing it, I suggest sitting down and working out why these two versions of enumerability come out to the same thing. It should be similarly obvious that any subset of $\mathbb N$ (finite or infinite) is also enumerable. However:

Theorem 2.1 There are infinite sets that are not enumerable.

Proof: Consider the set $\mathbb B$ of infinite binary strings (ie: the set containing strings like “$011001011001\ldots$”). Obviously $\mathbb B$ is infinite. Suppose, for the purposes of contradiction (also known as reductio) that some enumerating function $f:\mathbb N\rightarrow \mathbb B$ does exist. Then, for example, $f$ will look something like:

$0\mapsto s_0:\underline{0}110010010\ldots\\1\mapsto s_1:1\underline{1}01001010\ldots\\2\mapsto s_2:10\underline{1}1101100\ldots\\3\mapsto s_3:000\underline{0}000000\ldots\\4\mapsto s_4:1110\underline{1}11000\ldots\\\ldots$

The exact values of $s_i$ aren’t important (as we will see) so this example will abstract to the general case. What we are going to do now is construct a new string, $t$, such that $t$ does not appear in the enumeration generated by $f$. We will do this by generating $t$ character-by-character. To determine the $n^\text{th}$ character of $t$, simply look at the $n^\text{th}$ character of $s_n$ and swap it. Thus, given our example enumeration above, the first 5 characters of $t$ would be “$10010\ldots$”, which we get by just this method (for convenience, the $n^\text{th}$ character of each $s_n$ has been underlined). Now all we have to do is notice that $t$ differs from each $s_i$ at precisely the $i^\text{th}$ position. As such, $t$ does not appear in the enumeration generated by $f$. Thus, $f$ is not an enumeration of $\mathbb B$, which contradicts our hypothesis that $f$ enumerates $\mathbb B$. Since $f$ was arbitrary, $\mathbb B$ is not enumerable.

QED
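The diagonal construction in the proof is mechanical enough to sketch in Python. We can only ever hold finite prefixes of the $s_n$, so this computes a prefix of $t$ rather than $t$ itself. The names `flip` and `diagonal_prefix` are mine, and the rows are the ten-character prefixes from the example enumeration above.

```python
def flip(bit):
    """Swap a binary character: '0' becomes '1' and '1' becomes '0'."""
    return "1" if bit == "0" else "0"

def diagonal_prefix(rows):
    """Build the first len(rows) characters of t from prefixes of s_0, s_1, ...

    rows[n] must contain at least n + 1 characters of s_n.
    """
    return "".join(flip(row[n]) for n, row in enumerate(rows))

# Ten-character prefixes of s_0 .. s_4 from the example enumeration.
example_rows = [
    "0110010010",
    "1101001010",
    "1011101100",
    "0000000000",
    "1110111000",
]

t_prefix = diagonal_prefix(example_rows)
print(t_prefix)

# t differs from every s_n at position n, so it cannot equal any of them.
for n, row in enumerate(example_rows):
    assert t_prefix[n] != row[n]
```

Nothing here depends on which strings were in the list, which is exactly why the argument works for any proposed enumeration.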

This gives us some interesting corollaries depending on how you want to interpret the set $\mathbb B$:

For example, a binary string $b\in\mathbb B$ can be thought of as the binary expansion of a real number $0\leq b\leq 1$ (ie: “$0010110111\ldots$” would represent $0.0010110111\ldots$ and “$0000000000\ldots$” would represent $0$). Some numbers have two such expansions (for instance $0.0111\ldots=0.1000\ldots$), but only enumerably many do, so the conclusion is unaffected. Thus we know that the real numbers in the interval $[0,1]$ are not enumerable (and so neither is the set of all real numbers $\mathbb R$).
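As a small illustration of reading a string as a binary expansion, here is a sketch that computes the exact value of a finite prefix. The function name `binary_prefix_value` is my own, and a finite prefix only approximates the real number from below, since the characters we haven't seen yet can only add to it.

```python
from fractions import Fraction

def binary_prefix_value(prefix):
    """Exact value of 0.b_0 b_1 b_2 ... truncated to the given prefix."""
    return sum(Fraction(int(bit), 2 ** (i + 1)) for i, bit in enumerate(prefix))

print(binary_prefix_value("0010110111"))
print(float(binary_prefix_value("0010110111")))
```

Using `Fraction` keeps the arithmetic exact, so no floating-point rounding sneaks into the example.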

Another way to think of $\mathbb B$ is that it is the set of sets of natural numbers. To see this, interpret a given string $b=b_0b_1b_2\ldots$ to be the set $b^\prime=\{n\mid b_n=1\}$, where a number $n\in b^\prime$ iff $b_n=1$ and $n\not\in b^\prime$ iff $b_n=0$. So for example, if $b=10101000111\ldots$ then $b^\prime=\{0,2,4,8,9,10,\ldots\}$. Thus, the set of sets of natural numbers (denoted $\mathcal P\mathbb N$) is also not enumerable.
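The correspondence between strings and sets of naturals runs in both directions, which a short sketch can show on finite prefixes. Both function names are hypothetical, and of course a prefix only tells us about membership for numbers below its length.

```python
def prefix_to_set(prefix):
    """Members of b' below len(prefix): every n with b_n = 1."""
    return {n for n, bit in enumerate(prefix) if bit == "1"}

def set_to_prefix(members, length):
    """The first `length` characters of the string describing a set."""
    return "".join("1" if n in members else "0" for n in range(length))

print(sorted(prefix_to_set("10101000111")))
print(set_to_prefix({0, 2, 4, 8, 9, 10}, 11))
```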

In later chapters we will learn the notion of a characteristic function: given a numerical property $P$, it is the function $f:\mathbb N\rightarrow\{0,1\}$ which maps $n\mapsto 0$ if $Pn$ holds and $n\mapsto 1$ if $\neg Pn$ holds. (This may seem backwards, since $0$ typically denotes $\texttt{False}$ and $1$ denotes $\texttt{True}$; however, we will see the reasons for this in due course.) If we consider an element $b=b_0b_1b_2\ldots\in\mathbb B$ to describe a characteristic function $b^\prime$ by $n\mapsto b_n$, then we can observe that the set of all characteristic functions is similarly non-enumerable.
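Since the $0$-means-holds convention is easy to trip over, here is a tiny sketch of a characteristic function in that style. The helper names and the choice of "is even" as the property are just for illustration.

```python
def characteristic(prop):
    """Characteristic function of a property, using the text's convention:
    0 when the property holds, 1 when it does not."""
    return lambda n: 0 if prop(n) else 1

def is_even(n):
    return n % 2 == 0

c_even = characteristic(is_even)

def function_prefix(c, length):
    """The first `length` characters of the binary string n -> c(n)."""
    return "".join(str(c(n)) for n in range(length))

print(function_prefix(c_even, 6))
```

Reading the output as the start of an element of $\mathbb B$ gives the string that describes this characteristic function.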

Next time we will finish up chapter 2 by discussing the limitations of what can be effectively enumerated by a computer program.

2 Decidability and enumerability

Here we go over some basic notions that will be crucial later.

2.1 Functions

As I imagine anyone reading this is aware (although it’s totally cool if you’re not… that’s why it’s called learning), a function $f:\Delta\rightarrow\Gamma$ is a rule $f$ that takes something from its domain $\Delta$ and turns it into something from its co-domain $\Gamma$. We will be dealing exclusively with total functions, which means that $f$ is defined for every element in $\Delta$. Or, more plainly, we can use anything in $\Delta$ as an argument for $f$ and have it make sense. This is contrasted with the notion of partial functions, which can have elements of the domain that $f$ isn’t designed to handle. We will not be using partial functions at any point in this book (or so it promises).

So, given a function $f:\Delta\rightarrow\Gamma$, some definitions:

The range of a function is the subset of $\Gamma$ that $f$ can possibly get to from elements of $\Delta$, ie: $\{f(x)|x\in\Delta\}$. In other words, the range is the set of all possible outputs of $f$.

$f$ is surjective iff for every $y\in\Gamma$ there is some $x\in\Delta$ such that $f(x)=y$. Equivalently, $f$ is surjective iff every member of its co-domain is a possible output of $f$ iff its co-domain and its range are identical. This property is also called onto.

$f$ is injective iff it maps every different element of $\Delta$ to a different element of $\Gamma$. Equivalently, $f$ is injective iff $x\neq y$ implies that $f(x)\neq f(y)$. This property is also called one-to-one because no two inputs share the same output.

$f$ is bijective iff it is both surjective and injective. Because $f$ is defined for every element of $\Delta$ (total), can reach every member of $\Gamma$ (surjective) and matches each thing to exactly one other thing (injective), an immediate corollary of this is that $\Delta$ and $\Gamma$ have the same number of elements. This is an important result that we will use quite often when discussing enumerability.
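For finite domains and co-domains, these definitions can be checked by brute force. The following is a minimal sketch with hypothetical helper names; it assumes the domain is given as a collection of distinct elements and that $f$ really does map the domain into the co-domain.

```python
def function_range(f, domain):
    """The set of all outputs f(x) for x in the domain."""
    return {f(x) for x in domain}

def is_surjective(f, domain, codomain):
    """Every member of the co-domain is an output of f."""
    return function_range(f, domain) == set(codomain)

def is_injective(f, domain):
    """No two distinct inputs share an output."""
    outputs = [f(x) for x in domain]
    return len(outputs) == len(set(outputs))

def is_bijective(f, domain, codomain):
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

# x mod 3 from {0..5} to {0,1,2}: onto, but not one-to-one.
print(is_surjective(lambda x: x % 3, range(6), range(3)))
print(is_injective(lambda x: x % 3, range(6)))

# x + 1 from {0,1,2} to {1,2,3}: both, so the two sets have the same size.
print(is_bijective(lambda x: x + 1, range(3), range(1, 4)))
```

None of this works for infinite sets, where we have to reason about the function rather than test it.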

2.2 Effective decidability, effective computability

Deciding is the idea of determining whether a property or a relation applies in a particular case. For example, if I ask you to evaluate the predicate “is red” against the term “Mars”, you would say yes. If I asked you to apply the predicate “halts in a finite number of steps” to the computer program $\texttt{while (true);}$ you would probably say no. In either case you have just decided that predicate.

Computing is the idea of applying a function to an argument and figuring out what the result is. If I give you the function $f(x)=x+1$ and the argument $x=3$ you would compute the value $4$. If I give you the function $f(x)=\text{the number of steps a computer program }x\text{ executes before halting}$ and, as its argument, the same computer program as above, you would conclude that the result is infinite. In both cases you have just computed that function.

What effectiveness comes down to is the notion of whether something can be done by a computer. Effective decidability is the condition that a property or relation can be decided by a computer in a finite number of operations. Effective computability is the condition that the result of a function applied to an argument can be calculated by a computer in a finite number of operations. For each notion, consider the two sets of two examples above. In each, the first is effectively decidable/computable and the second is not, for reasons I hope will eventually be clear.

This raises an obvious question: what is a computer? Or, more to the point, what can computers do exactly? For our purposes we will be using a generalized notion of computation called a Turing machine (named for its inventor, Alan Turing). Despite its name, a Turing machine is not actually a mechanical device, but rather a hypothetical one. Imagine you have an infinite strip of tape, extending forever in both directions. This tape is divided up into squares, each square containing either a zero or a one. Imagine also that you can walk up and down and look at the square you’re standing next to. You have four options at this point (and can decide which to take based on whether you’re looking at a zero or a one, as well as a condition called the “state” of the machine): you can either move to the square to your left, move to the square on your right, change the square you’re looking at to a zero, or change it to a one. It may surprise you, but the Turing machine I have just described is basically a computer, and can execute any algorithm that can be run on today’s state-of-the-art machines.
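To show how little machinery is involved, here is a rough Python sketch of the kind of machine just described. The rule format, the state names `"start"` and `"halt"`, and the `max_steps` guard are all my own assumptions for illustration; a real Turing machine has no step limit, and the book's formal definition may differ in its details.

```python
from collections import defaultdict

def run(program, tape_bits, start_state="start", max_steps=1000):
    """Run a machine of the kind described above.

    program maps (state, symbol) to (action, next_state), where action is
    one of "L", "R", "0", "1". The machine stops on reaching the state
    "halt" or when no rule applies. max_steps is a practical guard only.
    """
    tape = defaultdict(int, enumerate(tape_bits))  # unvisited squares hold 0
    position, state = 0, start_state
    for _ in range(max_steps):
        if state == "halt" or (state, tape[position]) not in program:
            break
        action, state = program[(state, tape[position])]
        if action == "L":
            position -= 1
        elif action == "R":
            position += 1
        else:
            tape[position] = int(action)  # write a 0 or a 1
    return state, position, dict(tape)

# Walk right over a block of ones, then turn the first zero into a one.
append_one = {
    ("start", 1): ("R", "start"),
    ("start", 0): ("1", "halt"),
}

state, position, tape = run(append_one, [1, 1, 1])
print(state, position, [tape[i] for i in range(4)])
```

If you read a block of $n$ ones as the number $n$, the little program `append_one` computes $n+1$, which is a first taste of a Turing machine computing a numerical function.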

In fact, throughout the history of computability theory, whenever a new model has been developed of what could be done algorithmically by a computer (such as $\lambda$-calculus, $\mu$-recursive functions, and even modern programming languages) it has turned out that each of these notions was equivalent to a Turing machine, as well as to each other. Thus, Alan Turing and Alonzo Church separately came up with what is now called the Church-Turing thesis (although the book only deals with Turing, hence “Turing’s thesis”):

Turing’s thesis: the numerical functions that are effectively computable in an informal sense (ie: where the answer can be arrived at by a step-by-step application of discrete, specific numerical operations, or “algorithmically”) are just those functions which are computable by a properly programmed Turing machine. Similarly, the effectively decidable properties and relations are just the numerical properties and relations which are decidable by a suitable Turing machine.

Of course we are unable to rigorously define an “intuitive” or “informal” notion of what could be computed, so Turing’s thesis could never be formally proven. However, all attempts to disprove it have been thoroughly rebuffed.

You might wonder, however, about just how long it might take such a simple machine to be able to solve complex problems. And you would be right to do so: Turing machines are notoriously hard to program, and take an enormous number of steps in order to solve most interesting problems. If we were to actually use such a Turing machine to try and get a useful answer to a question (as opposed to, say, writing a C++ program) it could very realistically take lifetimes to calculate. By what right, then, do we call this “effective”? Another objection to be raised might have to do with the idea of an infinite storage medium, which violates basic engineering principles of modern computer architectures.

Both of these objections can be answered at once: when we discuss computability, we are not so much interested in how practical it is to run a particular program. What interests us is to know what is computable in principle, rather than in practice. The reason for this is simple: when we discover a particular problem that cannot be solved by a Turing machine in a finite number of steps, this result is all the more surprising for the liberal attitude we’ve taken towards just how long we will let our programs run, or how much space we will allow them to take up.

One final note in this section. The way we’ve defined Turing machines, they operate on zeroes and ones. This of course reflects how our modern computers represent numbers (and hence why the Turing thesis refers to “numerical” functions, properties and relations). So how then can we effectively compute functions or decide properties of other things, such as truth values or sentences? This is simple. We basically encode such things into numbers and then perform numerical operations upon them.

For example, most programming languages encode the values of $\texttt{True}$ and $\texttt{False}$ as $1$ and $0$, respectively. We can do the same thing with strings. An immediate example is ASCII encoding of textual characters to numeric values which is standardized across virtually all computer architectures. Later in the book we will learn another, more mathematically rigorous way to do the same.
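Here is one way the ASCII idea could be pushed all the way to a single natural number per string. This is only a sketch of my own and not the coding scheme the book introduces later; the function names and the leading marker byte are assumptions made for the example.

```python
def encode(text):
    """Encode an ASCII string as a single natural number."""
    data = text.encode("ascii")
    # A leading 0x01 byte keeps leading NUL characters from being lost.
    return int.from_bytes(b"\x01" + data, "big")

def decode(number):
    """Recover the string from a number produced by encode."""
    data = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return data[1:].decode("ascii")

code = encode("0 = 1")
print(code)
print(decode(code))
```

Because both directions are mechanical, any property of strings can be restated as a numerical property of their codes, which is all the Turing thesis needs.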

The rest of this chapter is about the enumerability and effective enumerability of sets, but I’m going to hold off on talking about those until next time.

Gödel! #1 An Introduction to Gödel’s Theorems 1.0

I am going to rip off something Zach Weiner has been doing on his blog where he’s blogging his way through a few different textbooks. This sounds like an awesome way to get a better understanding out of stuff, so I am going to completely steal the idea from him, including the way he formats his titles (while giving him full credit as my inspiration) and blog my way through some of the textbooks I bought in university but never really actually bothered to crack open. Perhaps more fun for me than for you, but we’ll see how it goes.

The textbook I’m going to start off with is called An Introduction to Gödel’s Theorems written by Peter Smith. From the bits of it I’ve actually made use of, it’s a fairly detailed logic text while still being relatively accessible to anyone with a bit of background. Some of the concepts it seems to take for granted are elementary set theory, introductory logic, and basic computability theory. But most everything we’ll need seems to be covered in the text.

1 What Gödel’s Theorems say

1.1 Basic arithmetic

A lot of what is going to be covered in the text has to do with basic arithmetic, which is to say the natural numbers (0, 1, 2, etc…) and operations on them (addition, multiplication, etc…). Although this will all be fleshed out formally in a few chapters, the gist is that the natural numbers have a specific starting point, 0, each one has a unique successor, and every number falls into this sequence.

Our bigger concern is the notion of a formalized mathematics. In 1920 mathematician David Hilbert put forward a program to axiomatize all of mathematics into a finite set of simple and non-controversial mathematical statements. The goal was to lift mathematics up by its bootstraps and prove the completeness and consistency of mathematics from these axioms in order to leave zero doubt as to their correctness.

The idea would be that mathematics could be axiomatized into a theory $T$ (a theory is simply a collection of axioms) that would be (negation) complete, which is to say that for any sentence $\phi$, either $\phi$ or $\neg\phi$ would be provable in $T$.

Thus, in our case, we are considering how one could build a complete theory of basic arithmetic where we could prove (or disprove) conclusively the truth of any claim that could be expressed arithmetically. This is where Gödel’s Theorems come into play…

1.2 Incompleteness

… by basically shitting all over the idea. What mathematician Kurt Gödel was able to do in a 1931 paper was present a way to, given a theory $T$ which was strong enough to express arithmetic, construct a sentence $\textbf G_T$ such that neither $\textbf G_T$ nor $\neg\textbf G_T$ can be derived in $T$, yet we can show that if $T$ is consistent then $\textbf G_T$ will be true.

Thus, basic arithmetic in its most stripped-down form fails to be negation complete, which puts quite a damper on Hilbert’s program.

The specifics of how $\textbf G_T$ is actually constructed are the subject of most of the text, but the gist of it is this: $\textbf G_T$ encodes the sentence “$\textbf G_T$ is unprovable in $T$”. Thus, $\textbf G_T$ is true iff $T$ can’t prove it. Suppose then that $T$ is sound (ie: cannot prove a false sentence). Then if it were to prove $\textbf G_T$ it would prove a falsehood, which violates soundness. Thus $T$ does not prove $\textbf G_T$ and so $\textbf G_T$ is true. Thus, $\neg\textbf G_T$ is false, which means that $T$ can’t prove it either. Again, how $\textbf G_T$ is constructed is what we’ll be getting to, but this is the gist of Gödel’s First Theorem.

It should also be noted that there isn’t only one such sentence that renders $T$ incomplete. Suppose we decide to augment $T$ by adding $\textbf G_T$ to it, to create a new theory $U=T+\textbf G_T$. We will then be able to construct a new Gödel sentence, $\textbf G_U$ which will be true but unprovable in $U$. Since $U$ encompasses $T$, $\textbf G_U$ will also be unprovable in $T$ and we get a construction of an infinite number of unprovable-in-$T$ sentences.

Thus, arithmetic is not only incomplete, but indeed incompletable.

1.3 More incompleteness

This incompletability does not just affect arithmetic. In fact it will also affect any systems which could be used to represent arithmetic. For example, set theory can define the empty set $\emptyset$. Then, form the set $\{\emptyset\}$ containing the empty set, followed by the set containing both these sets $\{\emptyset,\{\emptyset\}\}$ and so on. We get a sequence of the form:

$\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\},\{\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\}\}$

where we can define 0 as $\emptyset$, 1 as the set containing 0, 2 as the set containing 0 and 1, and so on. The successor of $n$ is $n$ unioned with the set containing $n$ (ie: $n\cup\{n\}$), addition is defined as iterated succession, multiplication as iterated addition and all of a sudden you have a theory of arithmetic encompassed in set theory. By Gödel’s First Theorem, then, set theory is also incomplete.
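This set-theoretic encoding of the numbers can be played with directly using Python’s `frozenset`. This is just a sketch with made-up function names. It leans on the fact that the set standing for $k$ has exactly $k$ members, so `len` tells us how many times to iterate.

```python
def zero():
    """0 is the empty set."""
    return frozenset()

def successor(n):
    """The successor of n is n ∪ {n}."""
    return n | frozenset([n])

def numeral(k):
    """Build the set representing the ordinary integer k."""
    n = zero()
    for _ in range(k):
        n = successor(n)
    return n

def add(m, n):
    """Addition as iterated succession: take the successor of m, n times."""
    result = m
    for _ in range(len(n)):
        result = successor(result)
    return result

def multiply(m, n):
    """Multiplication as iterated addition: add m to zero, n times."""
    result = zero()
    for _ in range(len(n)):
        result = add(result, m)
    return result

print(add(numeral(2), numeral(3)) == numeral(5))
print(multiply(numeral(2), numeral(3)) == numeral(6))
```

A nice side effect of this construction is that “less than” is just set membership: `numeral(1) in numeral(2)` is true.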

1.4 Some implications?

This section deals with some philosophical implications of the First Theorem, but doesn’t delve into enough detail to be worth talking about.

1.5 The unprovability of consistency

Any worthwhile arithmetical theory will be able to prove the proposition $0\neq1$. Thus, any decent theory that proves $0=1$ will be inconsistent. Furthermore, an inconsistent theory can prove any proposition, including $0=1$, thus a theory of arithmetic $T$ is inconsistent iff $T$ proves $0=1$. Now, we’ve already established that we can encode facts about the provability of propositions in $T$, thus we have a way to encode the idea that $T$ can’t prove $0=1$, which is to say that $T$ can express its own consistency. We’ll call the sentence that expresses this $\textbf{Con}_T$.

From above, we’ve already seen that a consistent theory $T$ can’t prove $\textbf G_T$. Since $\textbf G_T$ is itself the sentence that expresses its own unprovability, we can then express (in $T$) that if $T$ is consistent then $\textbf G_T$ is unprovable by $\textbf{Con}_T\rightarrow\textbf G_T$.

Make sense?

According to the text, it turns out (although we’ve yet to see how) that this sentence $\textbf{Con}_T\rightarrow\textbf G_T$ is provable within theories with conditions only slightly stronger than those required for the First Theorem. However, $\textbf G_T$ must still be unprovable within such theories, and so $\textbf{Con}_T$ must also be unprovable, otherwise we would be able to get $\textbf G_T$ by simple modus ponens.

As such, we have Gödel’s Second Incompleteness Theorem: that nice theories that can express a sufficient amount of arithmetic can’t prove their own consistency.

1.6 More implications?

The key point that I took from this section is that since we’ve shown that a theory of arithmetic $T$ can’t prove its own completeness or consistency, then it certainly can’t prove the same for a richer theory $T^+$. This rudely defeats what remains of Hilbert’s program, as arithmetic isn’t capable of validating itself, let alone the rest of mathematics.

1.7 What’s next?

Obviously this is all pretty roughshod. In chapter 2 we go over a few basic notions that we will need to prove things in more detail. Chapter 3 discusses what is meant by an “axiomatized theory”. Chapter 4 introduces concepts specific to axiomatized theories of arithmetic. Chapters 5 and 6 give us some more direction as we head off towards formally proving… pause for dramatic music… Gödel’s First Incompleteness Theorem!