First-order Logic and the Star-Free Languages
[DRAFT]
Introduction
Last time we learned about the difference between computable and uncomputable languages. This time, we're going to show how we can compute languages using first-order logic (FO).
Let's start with an actual natural language pattern. In Sarcee [cook78], [+anterior] sibilants can follow [-anterior] sibilants, but the reverse results in ungrammaticality. (Data from [cook78], cited in [heinz10]. Sibilants are highlighted in bold.)
| ʃítʃídzàʔ | 'my duck' |
| nāʃɣátʃ | 'I killed them again' |
| *sítʃídzàʔ |
In the above example, the [-anterior] [z] follows the [+anterior] [ʃ], but it is impossible for the [+anterior] [ʃ] to follow [s].
Let's study this abstractly as a formal language. Let \(L\) be the formal language over \(\Sigma=\{a,b,c\}\) of all strings such that no \(a\) ever precedes a \(b\), anywhere in the string. I.e.,
\[ L = \{accca,cccc,bcccb,bbccca,...\}\]
and
\[ \overline{L} = \{ab,acccb, ccaccccbc, aacccccccccb, .... \} \]
Thus, \(L\) above distills the precedence-based constraint in Sarcee, replacing [-anterior] consonants with \(a\) and [+anterior] consonants with \(b\) (and everything else with \(c\)).
We can describe \(L\) with the below FO statement that forbids a \(b\) from following an \(a\).
\[(\forall x)[a(x)\rightarrow\neg(\exists y)[b(y)\land x<y]]\]
For now, we'll forgo the formal details and interpret the statement informally. The variables \(x\) and \(y\) here range over positions (or indices) in a string. Thus, \(a(x)\) is interpreted to mean 'position \(x\) is occupied by the symbol \(a\).'
The entire statement, then, reads as follows:
For every position \(x\), if \(x\) is an \(a\), then there is no \(y\) such that \(y\) is a \(b\) and \(x\) precedes \(y\).
This logical statement thus specifies a constraint on strings: for every \(a\) in a string, it cannot precede a \(b\). As shown in detail below, this statement is either true or false for every string in \(\Sigma^*\), and there is a procedure for computing this. Thus, this logical statement specifies exactly the strings in \(L\): the strings in \(L\) are those for which the statement is true, and the strings not in \(L\) are those for which it is false.
The advantages of using logical statements as grammars are numerous. Among them are the following.
-
We can draw from a rich literature connecting logic and computation [buchi60],[mcnaughtonpapert71],[immerman80],[rogerspullum11].
-
We can study patterns over structures besides strings.
-
They allow us to directly connect formal language classes to classes of functions, through the technique of logical interpretations [engelfriethoogeboom01],[filiot15].
There are more, which we will learn about more later. For now, we will use FO as a primer for using logical statements as grammars for languages, and discuss how FO is connected to the star-free (SF) class of formal languages [mcnaughtonpapert71].
Before we define our logic, we have to define strings in a bit more detail. That is, it is necessary to encode all the information that is necessary to identify a particular string. To do this we use model theory, the mathematical structure of language.