If $L$ is some language and $\{ww^R \mid w \in L\}$ is a regular language then does $L$ have to be a regular language?
-
Yes, $L$ consists of the first halves of a regular language: https://cs.stackexchange.com/q/14192/4287 – Hendrik Jan May 12 '21 at 18:24
-
@HendrikJan I encourage you to put it in the answer box, even if the answer is short :) – nir shahar May 12 '21 at 20:32
-
@nirshahar Thanks, but please be my guest, and make the answer, preferably a little longer than my comment. – Hendrik Jan May 12 '21 at 20:34
-
Yes, $L$ consists of the first halves of context-free palindromes, https://cs.stackexchange.com/a/109637/91753. – John L. May 13 '21 at 13:24
-
@JohnL. Ha, that is funny. – Hendrik Jan May 13 '21 at 19:53
1 Answer
Let $A = \{ww^R \mid w \in L\}$, and suppose that $A$ is regular. What can we say about $L$?
For a word $w \in \Sigma^*$, let $E(w) = \{x : wx \in A\}$. Since $A$ is regular, there are only finitely many different $E(w)$.
Say that $w \in L$ is self-terminating if it has no extension in $L$. If $w \in L$ is self-terminating then $E(w)$ is finite, and the unique longest word it contains is $w^R$. Therefore if $w_1 \neq w_2$ are both self-terminating words in $L$, then $E(w_1) \neq E(w_2)$. Consequently, there are only finitely many self-terminating words in $L$. Let $L'$ result from removing them and all their prefixes. Thus every word in $L'$ has infinitely many extensions in $L'$.
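To make the definitions of $E(w)$ and "self-terminating" concrete, here is a minimal Python sketch on a small finite language. The sample language and the helper names (`mirror_language`, `residual`, `is_self_terminating`) are assumptions chosen for illustration; they are not part of the argument itself.

```python
def reverse(w):
    """Return the reversal w^R of a word."""
    return w[::-1]

def mirror_language(L):
    """Build A = { w w^R : w in L } for a finite set of words L."""
    return {w + reverse(w) for w in L}

def residual(A, w):
    """Compute E(w) = { x : wx in A } for a finite set of words A."""
    return {u[len(w):] for u in A if u.startswith(w)}

def is_self_terminating(L, w):
    """A word of L is self-terminating if no other word of L extends it."""
    return w in L and not any(u != w and u.startswith(w) for u in L)

# Hypothetical sample language, chosen only for illustration.
L = {"ab", "abb", "ba"}
A = mirror_language(L)  # {"abba", "abbbba", "baab"}

for w in sorted(L):
    E = residual(A, w)
    if is_self_terminating(L, w):
        # For a self-terminating word, the longest element of E(w) is w^R.
        print(w, sorted(E), "longest:", max(E, key=len), "w^R:", reverse(w))
    else:
        print(w, sorted(E), "(not self-terminating)")
```

In this sample, `abb` and `ba` are self-terminating, and in both cases the longest word of $E(w)$ is $w^R$, which is what lets us tell different self-terminating words apart by their sets $E(w)$.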
Say that $w \in L'$ is creative if there are two different letters $\sigma,\tau$ such that both $w\sigma$ and $w\tau$ have extensions in $L'$. If $w \in L'$ is creative then we can recover $w$ from $E(w)$ by taking the longest common suffix which appears infinitely often. This implies that there are only finitely many creative words in $L'$. Let $L''$ result from removing all creative words in $L'$ and their prefixes. Thus every word in $L''$ has infinitely many extensions in $L''$, and all of them are prefixes of a single $\omega$-word.
Let $x$ be an $\omega$-word infinitely many of whose prefixes lie in $L''$, and let $w_x$ be the shortest such prefix. We can recover $x$ from $E(w_x)$ in the following way: the first $\ell$ letters of $x$ are the reverse of the suffix of length $\ell$ of all long enough words in $E(w_x)$. It follows that $L''$ consists of prefixes of finitely many $\omega$-words.
Let $x$ be one such $\omega$-word. By the pigeonhole principle, there exist $n_1 < n_2$ such that $x_1 \ldots x_{n_1},x_1 \ldots x_{n_2} \in L$ and $E(x_1 \ldots x_{n_1}) = E(x_1 \ldots x_{n_2})$. Since all long enough words in the former start with $x_{n_1+1} \ldots x_{n_2}$, it follows that $x_{n_2+1} \ldots x_{n_2+(n_2-n_1)} = x_{n_1+1} \ldots x_{n_2}$, and continuing in this way, we see that $x$ is eventually periodic, say $x=yz^\omega$.
The set of lengths of the words in a regular language is eventually periodic. It follows that we can modify $L''$ to another language $L'''$, differing in finitely many words, such that $L'''$ also consists of prefixes of finitely many $\omega$-words, and furthermore there exists a modulus $m$ such that for each $a \in \{0,\ldots,m-1\}$ and each $\omega$-word, either all prefixes of length $nm+a$ are in $L'''$, or none of them. Up to finitely many words, for each $\omega$-word $x = yz^\omega$, these prefixes are of the form $y' (z')^*$.
Every two $\omega$-words in $L'''$ differ in some symbol. It follows that for each such word $x$, and for each $a$, we can use regular closure operations to isolate the corresponding subset of $A$, which is of the form $$ \{ ww^R \mid w \in y z^* \} = \{ y z^n (z^R)^n y^R \mid n \geq 0 \}. $$ We can further use regular closure operations to isolate $$ \{ z^n (z^R)^n \mid n \geq 0 \}. $$ Applying the pigeonhole principle to the sets $E'(z^n)$ (where $E'$ is the analog of $E$ for the new language), we see that $E'(z^{n_1}) = E'(z^{n_2})$ for some $n_1 < n_2$. Since $(z^R)^{n_1} \in E'(z^{n_1}) = E'(z^{n_2})$ and $(z^R)^{n_2} \in E'(z^{n_2}) = E'(z^{n_1})$, both $z^{n_2}(z^R)^{n_1}$ and $z^{n_1}(z^R)^{n_2}$ belong to the new language. They have the same length, and the new language contains at most one word of each length, so $z^{n_2}(z^R)^{n_1} = z^{n_1} (z^R)^{n_2}$. Comparing the $(n_1+1)$-th block of length $|z|$ on both sides gives $z = z^R$.
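The last step can be sanity-checked by brute force. The sketch below, over the assumed alphabet $\{a, b\}$ and with small bounds picked arbitrarily, tests that the identity $z^{n_2}(z^R)^{n_1} = z^{n_1}(z^R)^{n_2}$ with $n_1 < n_2$ holds exactly when $z$ is a palindrome. It is a finite check of the combinatorial claim only, not a replacement for the argument; the function names are hypothetical.

```python
from itertools import product

def reverse(w):
    """Return the reversal w^R of a word."""
    return w[::-1]

def swap_identity_holds(z, n1, n2):
    """Check whether z^{n2} (z^R)^{n1} == z^{n1} (z^R)^{n2}."""
    return z * n2 + reverse(z) * n1 == z * n1 + reverse(z) * n2

def identity_iff_palindrome(max_len=5, max_n=4):
    """Exhaustively compare the identity with 'z is a palindrome'
    for all z over {a, b} up to max_len and all 0 <= n1 < n2 <= max_n."""
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            z = "".join(letters)
            for n1 in range(max_n):
                for n2 in range(n1 + 1, max_n + 1):
                    if swap_identity_holds(z, n1, n2) != (z == reverse(z)):
                        return False
    return True

print(identity_iff_palindrome())
```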
We conclude that there exist $a,b \in \mathbb{N}$, words $w_1,\ldots,w_a,y_1,\ldots,y_b$, and palindromes $z_1,\ldots,z_b$, such that $$ L = \sum_{i=1}^a w_i + \sum_{j=1}^b y_j z_j^*. $$ In particular, $L$ is regular. The corresponding language $A$ is $$ A = \sum_{i=1}^a w_i w_i^R + \sum_{j=1}^b y_j (z_j z_j)^* y_j^R. $$
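As a small illustration of the final formula, the following Python sketch takes one hypothetical pair $y = ab$, $z = aba$ (with $z$ a palindrome) and checks, for the first few exponents, that $\{ww^R \mid w \in yz^*\}$ agrees with $y(zz)^*y^R$. The specific words and the bound `N` are assumptions made for the example.

```python
import re

def reverse(w):
    """Return the reversal w^R of a word."""
    return w[::-1]

# Hypothetical sample values; z must be a palindrome.
y, z = "ab", "aba"
assert z == reverse(z)

N = 6  # number of exponents n = 0, ..., N-1 to check

# { w w^R : w = y z^n } for n < N
mirrored = {w + reverse(w) for w in (y + z * n for n in range(N))}

# { y (zz)^n y^R } for n < N
closed_form = {y + (z + z) * n + reverse(y) for n in range(N)}

# The regular expression y (zz)* y^R
pattern = re.compile(
    re.escape(y) + "(?:" + re.escape(z + z) + ")*" + re.escape(reverse(y))
)

print(mirrored == closed_form)
print(all(pattern.fullmatch(s) for s in mirrored))
```

Because $z = z^R$, each word $y z^n (z^R)^n y^R$ collapses to $y z^{2n} y^R$, which is why a plain regular expression suffices here.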