# 3 Axiomatized formal theories

Alright, cards on the table: I found this chapter kind of boring until around the end, so I’m just going to try and skim through it to get to the interesting parts.

## 3.1 Formalization as an ideal

Gödel’s theorems are statements about formal languages. Why do we care about formal languages? It’s pretty straightforward: formal languages allow us to ensure correctness by following specific rules regarding structure and syntax. In a natural language, such as English, you can have sentences with ambiguous meanings. For example:

“John knows a woman with a cat named Amy.”

could have two possible meanings: either John knows a woman who has a cat whose name is Amy, or John knows a woman named Amy who has a cat. This won’t upset us too much in our day-to-day lives. Usually the intended meaning can be inferred from context. But when we’re trying to prove something logically, this requires precision. Thus, we can use a formal language (such as first-order logic or even any given programming language) to create an unambiguous parsing of our sentences.

Even in proofs, mathematics or computer science, however, we don’t always use such rigorous formalization, as such precision can be tedious. What’s important in these cases is more to do with conveying an understanding of a concept. But the underpinnings of the formal language exists, and if one desired should be able to be spelled out with perfect rigour.

## 3.2 Formalized languages

Normally we are interested in interpreted languages, ie: ones with not just a syntax for deriving valid structures, but one where those structures have actual meaning. I could symbolize one version of the John/Amy sentence from above with $Kja\wedge Wa\wedge Ca$ but that sentence is meaningless unless I also inform you that $Kxy$ means “$x$ knows $y$“, $Wx$ means “$x$ is a woman”, $Cx$ means “$x$ has a cat”, and $j,a$ mean “John” and “Amy” respectively.

Thus, we will define a language $L$ as being a pair $\langle\mathcal L,\mathcal I\rangle$ where $\mathcal L$ is a syntactically defined system of expressions and $\mathcal I$ is an intended interpretation of those expressions.

Starting with $\mathcal L$: it will be based on a finite alphabet of symbols (I’m pretty sure you can get away with relaxing the requirement to an effectively enumerable alphabet, but the book says finite so we’ll go with finite). Some of these symbols will make up $\mathcal L$‘s logical vocabulary, for example: connectives, quantifiers, parentheses, identity… Other symbols will constitute $\mathcal L$‘s non-logical vocabulary: predicates and relations, functions, constants, variables… We will also need a system of rules for determining which sequences of symbols are well-formed formulae of $\mathcal L$ (referred to throughough the text as its wffs).

For example, in first-order logic, our logical vocabulary is $\{(,),\wedge,\vee,\neg,\rightarrow,\leftrightarrow,=,\forall,\exists\}$. Our non-logical vocabulary is a bit more complicated in that it needs to potentially address infinitely many variables. There are two ways to do this. The way I like to think of it is having $\{f^i_j,P^i_j|i,j\in\mathbb N\}$ which gives you an infinite set containing all of your variables $f^0_0,f^0_1,f^0_2,\ldots$, all of your $k$-place predicates $P^k_0,P^k_1,P^k_2,\ldots$ and all of your $k>0$-place functions $f^k_0,f^k_1,f^k_2,\ldots$. But, of course, this results in an infinite alphabet of symbols. The book chooses to accomplish this by having your variables be something like $x,x^\prime,x^{\prime\prime},\ldots$ which operates on a finite alphabet ($x$ and $^\prime$). In either case, we will typically just denote variables as $x,y,z$, predicates as $P,Q,R$, functions as $f,g,h$ and so on. As stated in the previous section, we don’t always need to use perfect rigour but it’s important to understand how it would be accomplished. The union of the logical and non-logical vocabularies form the language’s alphabet. To see how first-order logic determines its wffs, see wikipedia. It is important for our purposes that determining whether a sentence is a wff is effectively decidable.

We then use $\mathcal I$ to set the interpretation of our language. It cant do this by manually setting the truth conditions for each wff (there are too many of them). What it does, rather, is to determine the domain of quantification, the which terms are applied to particular predicates, and the rules for determining the truth of a sentence. For example, $\mathcal I$ might set the domain of quantification to the set of people, set the value constants $m,n$ to Socrates and Plato respectively, and the meaning of the predicate $F$ to mean “is wise”. Then $\mathcal I$ continues by indicating which predicates are true of terms predicates, for example that $F$ is true of both $m$ and $n$. Finally, $\mathcal I$ sets up rules for determining whether wffs are true. For example a wff of the form $A\wedge B$ is true iff $A$ is true and $B$ is true. This can be tedious but straightforward. Again, however, it is important that this process be an effectively decidable one.

## 3.3 Axiomatized formal theories

The following things are required to construct an axiomatized formal theory $T$:

(a) First, some wffs of our $T$‘s language are selected as its axioms. There must be an effective decision procedure to determine whether a given wff is an axiom of $T$. This doesn’t mean that $T$ must have a finite number of axioms: we can have an axiom schema which tells us that sentences of such-and-such a form are axioms. For example, in Zermelo-Fraenkel set theory any wff of the form $\forall z\exists y\forall x(x\in y\leftrightarrow(x\in z\wedge\phi))$ (where $\phi$ can be substituted for (essentially) any property) is an axiom, giving the theory infinitely many axioms.

(b) We also need some form of deductive apparatus or proof system in order to prove things in $T$. I’ve talked about proof systems before and demonstrated two: truth tables and Fitch-style calculus. The exact system used for $T$ is irrelevant so long as it is effectively decidable whether a derivation from premises $\varphi_1,\varphi_2,\ldots,\varphi_n$ to a conclusion $\psi$ is valid. Note that this is different from having an effective procedure to actually create this derivation. All is has to do is determine whether a given proof which has been provided is valid for teh system.

(c) Thus, given an axiomatized formal theory $T$, since we can effectively decide which wffs are $T$-axioms, and whether a derivation in $T$‘s proof system is valid, it follows that it is also decidable whether a given array of wffs forms a $T$-proof (ie: which proofs are sound in $T$).

(d) For the remainder of this book, when we discuss theories, what we really mean is axiomatized formal theories. Many times in logic “theory” simply means any collection of sentences, thus we must be careful to make this distinction here.

Next time we’ll finish up chapter 3 where it actually gets interesting.