First-order Logic and the Star-Free Languages
[DRAFT]
The syntax and semantics of first order logic
Given a signature, we can define a FO logic over that signature. To build such a logic and define how it can be interpreted over models in the signature, we need to define the syntax and semantics of FO. Defining this precisely tells us not only the range of grammars we can write, but also how these grammars compute languages.
Syntax
The syntax defines the range of statements we can write.
First, statements in FO choose from a set of variables \(\mathcal{X}=\{x,y,z,...,x_1,x_2,...\}\).
Additionally, for each relation in the signature, we have a set of atomic formulas that correspond to the relations in our model (plus equality).
Definition. (Atomic formulas)
- For any two variables \(x\) and \(y\), \(x=y\) is an atomic formula.
- For any two variables \(x\) and \(y\), \(x<y\) is an atomic formula.
- For any variable \(x\) and any unary relation \(R_\sigma\in\mathcal{S}\), \(\sigma(x)\) is an atomic formula.
Atomic formulas, as their name suggests, define the basic properties in the model to which the logic refers. Given our running example for \(\Sigma={a,b,c}\), some atomic formulas are \(a(x)\), \(b(z)\), \(x=y\), \(y<z\), etc.
The set of well-formed formulas (WFFs) in FO is then built out of atomic formulas connected by the logical connectives \(\neg\) and \(\lor\) and the quantifier \(\forall\).
Definition. (Well-formed formulas)
- If \(\varphi\) is an atomic formula, then \(\varphi\) is a WFF
- If \(\varphi\) is a WFF, then \((\neg\varphi)\) is a WFF
- If \(\varphi\) and \(\psi\) are WFFs, then \((\varphi\lor\psi)\) are WFFs
- If \(\varphi\) is a WFF for which \(x\) is a free variable, then \((\forall x)[\varphi]\) is a WFF
A free variable in an formula \(\varphi\) is a variable \(x\) such that \(...x...\) appears in \(\varphi\) without being preceded by \((\forall x)\) or \((\exists x)\) (i.e., it's not bound by a quantifier).
Note that this definition is recursive: the definition of WFF itself refers to WFFs. Thus, we can write an infinite number of WFFs by recursively applying the rules above. For example, \((\neg a(x))\) is built by starting with the atomic formula \(a(x)\) and plugging that in as \(\phi\) for rule (2.) in the definition of WFFs.
Take a moment to convince yourself of how this works by deriving the WFFs \(((\neg a(x))\lor b(x))\) and \((\forall x)[(b(x)\lor(\neg(c(x))))\).
Some notational sugar. The parentheses in the above definition make the composition of a WFF explicit, but they are often removed for readability when there is no danger of ambiguity. So, for example, \(((\neg a(x))\lor b(x))\) can be rewritten as \(\neg a(x)\lor b(x)\).
Finally, the other common logical connectives \(\rightarrow\) and \(\land\) and the quantifier \(\exists\) can be derived from the syntax we've already defined. Namely,
- \(\varphi\rightarrow\psi := \neg\varphi\lor\psi\)
- \(\varphi\land\psi := \neg(\neg\varphi\lor\neg\psi)\)
- \((\exists x)[\varphi] := \neg(\forall x)[\neg\varphi]\)
...where the notation '\(:=\)' means something like 'can be re-written as'.
Take a minute apply the above to construct our original example of \((\forall x)[a(x)\rightarrow\neg(\exists y)[b(y)\land x<y]]\).
Semantics
So far our FO WFFs are just strings of meaningless symbols. We have to define the semantics of our WFFs; that is, when a WFF is true in a model. We will do this recursively, following closely the definition above of the syntax of FO logic. In this way, for every sentence we can construct using our syntax, our semantics guarantees an interpretation for it.
Because this is a first-order logic, the variables \(x,y,z,\) etc. range over elements of the domain (i.e., positions in the string). Satisfaction is then dependent on how variables are assigned to positions into the string. As such, our notation for satisfaction must take into account this dependency.
Thus, satisfaction will be defined specifying the model \(M\) and some series \(i_1,i_2,...,i_k\) of \(k\) elements in \(D\). Accordingly, below \(\varphi(x_1,x_2,...,x_k)\) refers to a WFF with exactly \(k\) free variables \(x_1,x_2,...,x_k\). Essentially, we are going to plug each \(i_j\) into each variable \(x_j\).
Definition. (Satisfaction of well-formed formulas)
- a. \(M,i\models \sigma(x)\) iff \(i\in R_\sigma\)
b. \(M,i,j\models (x<y)\) iff \((i,j)\in R_<\)
c. \(M,i,j\models (x=y)\) iff \(i=j\) (that is, \(i\) and \(j\) are the same element in \(D\).) - \(M,i_1,i_2...,i_k\models \neg\varphi(x_1,x_2...,x_k)\) iff \(M,i_1,i_2,...,i_k\not\models \varphi(x_1,x_2...,x_k)\) (that is, \(M,i_1,i_2,...,i_k\) does not satisfy \(\varphi(x_1,x_2,...,x_k)\)).
- \(M,i_1,i_2,...,i_k,j_1,j_2,...,j_\ell\models \varphi(x_1,x_2,...,x_k)\land\psi(y_1,y_2,...,y_\ell)\) iff \(M,i_1,...,i_k\models \varphi(x_1,x_2,...,x_k)\) and \(M,j_1,j_2,...,j_k\models \psi(y_1,y_2,...,y_\ell).\)
- \(M,i_1,...,i_k\models (\forall x)[\varphi(x,y_1,y_2,...,y_k)]\) iff for all \(i\in D\), \(M,i,i_1,...,i_k\models\varphi(x,y_1,...,y_k).\)
This is a lot, so let us go through each. To illustrate, we'll use \(M=M(abac)\), depicted graphically below.
\[ \begin{array}{c} \begin{array}{|c|c|c|c|}\hline a & b & a & c \\ \hline \end{array}\\ \begin{array}{cccc} 1 & 2 & 3 & 4 \\ \end{array} \end{array} \]
First, notice that each part of the definition matches with the definition of the syntax of well-formed formulas above. Thus, for example, part (1) of the definition of the semantics defines the semantics for atomic formulas (i.e., from part (1) of the definition of the syntax).
So if our WFF is simply \(a(x)\), then \(M,1\models a(x)\) because \(1\in R_a\) in \(M\); conversely, \(M,2\not\models a(x)\) because \(2\not\in R_a\) in \(M\). Similarly, \(M,1,2,3\models(\neg b(x))\land (y<z)\) because \(M,1\models\neg b(x)\) (because \(1\not\in R_b\) and thus \(M,1\not\models b(x)\)) and \(M,2,3\models y<z\) (because \((2,3)\in R_<\)).
Part (4) of the definition, with the quantifier, is a bit different. Say we have a WFF \((\forall x)[a(y)\land y<x]\). Satisfaction here only requires specifying one element in the domain, specifically that which will correspond to \(y\), as \(x\) is bound by a quantifier. So let us check whether \(M,1\models(\forall x)[a(y)\land \neg( y<x )]\).
By the definition, \(M,1\models(\forall x)[a(y)\land \neg (x<y)]\) only in the case that for all \(i\in D\), \(M,i,1\models a(y)\land \neg x<y]\). In other words, no matter what element we assign \(x\) to, satisfaction goes through. One sure way to calculate this is via brute force, i.e. checking each element in \(D\) to make sure satisfaction holds in every case.1 So first we set \(i=1\). It holds that \(M,1,1\models a(y)\land \neg (x<y)]\): \(1\in P_a\) and it is not the case that \((1,1)\in P_<\). Next we set \(i=2\). It then holds that \(M,2,1\models a(y)\land \neg(x<y)]\): \(1\in P_a\) and it is not the case that \((2,1)\in P_<\). As is perhaps clear at this point, we really only have to check whether \(M,i,1\models\neg(x<y)\), as we always know that \(M,1\models a(y)\), the first part of the conjunct. Knowing this, it is clear that when \(i=3\) and \(i=4\), it is also the case that \(M,i,1\models\neg(x<y )\), as neither \((3,1)\) or \((4,1)\) are in \(P_<\).
As we have now checked each of (1,2,3,4) in \(D\), we are finished: for all \(i\in D\), \(M,i,1\models a(y)\land \neg(x<y)]\), and so \(M,1\models(\forall x)[a(y)\land \neg (x<y)]\).
Sentences
An important special case is when a WFF has no free variables---that is, for every variable, there is a quantifier that binds it. Such a WFF is called a sentence. This case is important because we can define satisfaction independent of any assignment of variables to elements in \(D\). That is, we can talk about whether a model satisfies a sentence directly.
Definition. (Satisfaction of a sentence). For a sentence \(\varphi=(\forall x)[\psi(x)]\), \(M\models\varphi\) iff for all \(i\in D,M,i\models\psi(x)\).
As an example, consider the WFF \[\varphi=(\forall x)\big[(\forall y)[\neg(c(x)\land a(y)\land x<y)]\big]\]
First, \(\varphi\) is a sentence because both of its variables \(x\) and \(y\) are bound by a quantifier. To see how a model can be evaluated for satisfaction of \(\varphi\), take \(M=M(abac)\) from above. For this \(M\), \(M\models\varphi\). To see why, take any two elements \(i,j\in D\) from the domain of \(M\) and substitute \(i\) for \(x\) and \(j\) for \(y\). If you evaluate the formula inside the quantifiers (i.e., \(\neg(c(x)\land a(y)\land x<y)\)), it will return true. (I leave it to the reader to work out the details themselves.)
To illustrate a model that does not satisfy \(\varphi\), consider \(M=M(abca)\).
\[ \begin{array}{c} \begin{array}{|c|c|c|c|}\hline a & b & c & a \\ \hline \end{array}\\ \begin{array}{cccc} 1 & 2 & 3 & 4 \\ \end{array} \end{array} \]
Try evaluating each pair \(i,j\in D\) against \(\varphi\). One pair will return false for the internal formula. Which is it?2
Summary
We have now learned a syntax that generates an infinite number of FO WFFs, and a semantics that can interpret any one of these WFFs. In particular, given a sentence with no free variables, we can directly check whether a model satisfies that sentence. Next, we will show how this allows a sentence to serve the function of a grammar that describes a formal language.
Exercises
- Convince yourself that \((\varphi\lor\psi)\equiv\neg(\neg\varphi\land\neg\psi)\); that is, that the former is logically equivalent to the latter. To do this, use the definition of the semantics to show that the former is true exactly in the cases that the latter is true. (If you have done a proof before, you can prove this via induction on the two definitions.)
- Convince yourself that \((\exists x)[\varphi(x)]\equiv\neg(\forall x)[\neg\varphi(x)]\).
- What are some more strings whose models satisfy the sentence \(\varphi\) above? Some that do not?
1 There can be more efficient ways to do this, if we know the formula ahead of time. But this exact procedure is guaranteed to work for any formula and any model.
2 It fails for \(i=3\) and \(j=4\), as \(i\) is a \(c\) and \(j\) is an \(a\) and \((i,j)\in R_<\).