The title says it all. The language $\{ a^{2^n} \mid n \in N\}$ looks quite simple. Yet I could not find a grammar that generates this language.

1 Answer

(Answer adapted from an almost identical question on StackOverflow, because it really belongs here.)

It's certainly possible to write a grammar for this language, but it won't be a context-free grammar. That's easy to demonstrate using the pumping lemma.

The pumping lemma states that for any context-free language, there is some integer $p$ such that any string $S$ in the language whose length is at least $p$ can be written as $uvxyz$, where $u$, $v$, $x$, $y$ and $z$ are strings and $vy$ is not empty, and for all integers $k \ge 0$, the string $uv^kxy^kz$ is also in the language.

It's clear that the lengths of the strings "pumped" by using successively larger values of $k$ form an arithmetic sequence. So for any string $S$ in the language whose length is at least $p$, there is some $n \ge 1$ (namely $n = |vy|$) such that for every integer $k \ge 0$ there is a string in the language whose length is $|S| + nk$. That is not the case for the language $\{ a^{2^n} \mid n \in N\}$: the lengths of its strings are the powers of two, and the gap between consecutive powers of two grows without bound, so an arithmetic sequence with step $n$ must eventually land strictly between two of them. So the language cannot be context-free.
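As a small sanity check of that argument (not a proof), here is a Python sketch that walks the arithmetic sequence $|S| + nk$ and reports the first length that is not a power of two. The function names `is_power_of_two` and `first_non_power_of_two` are made up for this illustration.

```python
def is_power_of_two(m: int) -> bool:
    """True for 1, 2, 4, 8, ... (a positive integer with a single set bit)."""
    return m > 0 and m & (m - 1) == 0


def first_non_power_of_two(length: int, step: int) -> int:
    """Return the smallest length + k*step (k >= 1) that is not a power of two.

    `length` plays the role of |S| and `step` the role of n = |vy|.
    The loop terminates because gaps between powers of two outgrow any fixed step.
    """
    if step < 1:
        raise ValueError("step must be at least 1, since vy is not empty")
    candidate = length + step
    while is_power_of_two(candidate):
        candidate += step
    return candidate


if __name__ == "__main__":
    # Pumping a^8 with |vy| = 8 gives lengths 16, 24, ...; 24 is the first miss.
    print(first_non_power_of_two(8, 8))
```

Whatever step is chosen, the sequence leaves the set of powers of two after a few terms, which is exactly what the pumping lemma forbids for a context-free language.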

In fact, since the alphabet of the language has only one symbol, Parikh's theorem implies that if the language were context-free, it would also be regular. It's even easier to demonstrate that the set of lengths of strings in a regular language must be ultimately periodic (unless the set is finite). A regular language corresponds to a Deterministic Finite State Automaton (DFA), and any DFA which accepts an infinite number of strings must include a cycle in its transition diagram. The cycle cannot have more steps than the automaton has states, so the lengths of the strings accepted must be ultimately periodic.
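To make the periodicity concrete, here is a minimal Python sketch of a DFA over a one-symbol alphabet. Such an automaton is just a successor function on states, so its run is a "tail" followed by a cycle. The example automaton, the list encoding `next_state`, and the function names are all invented for this illustration.

```python
def tail_and_period(next_state, start=0):
    """Follow the single path of a unary DFA until a state repeats.

    Returns (tail, period): the number of steps before entering the cycle
    and the length of the cycle.
    """
    first_seen = {}
    state, steps = start, 0
    while state not in first_seen:
        first_seen[state] = steps
        state = next_state[state]
        steps += 1
    tail = first_seen[state]
    return tail, steps - tail


def accepted_lengths(next_state, accepting, limit, start=0):
    """List every n <= limit such that a^n ends in an accepting state."""
    lengths = []
    state = start
    for n in range(limit + 1):
        if state in accepting:
            lengths.append(n)
        state = next_state[state]
    return lengths


if __name__ == "__main__":
    # Hypothetical 5-state automaton: 0 -> 1 -> 2 -> 3 -> 4 -> 2 (cycle 2, 3, 4).
    next_state = [1, 2, 3, 4, 2]
    accepting = {1, 3}
    print(tail_and_period(next_state))
    print(accepted_lengths(next_state, accepting, 12))
```

In this example the accepted lengths past the tail repeat with the cycle length, so they can never track the ever-widening gaps between powers of two.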

Constructing a non-context-free grammar for the language is not that difficult, but I don't know how useful it is.

The following is a Type 0 grammar (i.e. it's not context-sensitive either), but only because of the productions used to get rid of the metacharacters. The basic idea here is that we put start and end markers around the string (${\boldsymbol \langle}$ and ${\boldsymbol \rangle}$) and we have a "duplicator" ($\blacktriangleright$) which moves from left to right doubling the $a$'s; when it hits the end marker, it either turns into a back-shuttle ($\blacktriangleleft$) or it eats the end-marker and turns into a start-marker-destroyer ($\star$).

\begin{align*} \mathrm{Start}\Rightarrow &\ {\boldsymbol \langle} {\blacktriangleright} a{\boldsymbol \rangle}\\ {\blacktriangleright} a\Rightarrow &\ aa{\blacktriangleright}\\ {\blacktriangleright}{\boldsymbol \rangle}\Rightarrow &\ {\blacktriangleleft}{\boldsymbol \rangle}\\ {\blacktriangleright}{\boldsymbol \rangle}\Rightarrow &\ \star\\ a{\blacktriangleleft}\Rightarrow &\ {\blacktriangleleft} a\\ a\star\Rightarrow &\ \star a\\ {\boldsymbol \langle}{\blacktriangleleft}\Rightarrow &\ {\boldsymbol \langle}{\blacktriangleright}\\ {\boldsymbol \langle}\star\Rightarrow &\ \varepsilon \end{align*}
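The productions above can be checked mechanically. The following Python sketch does a breadth-first search over sentential forms. The ASCII encoding is an assumption made for this illustration only: `S` for Start, `<` and `>` for the markers, `R` for the duplicator, `L` for the back-shuttle and `*` for the start-marker-destroyer.

```python
from collections import deque

# The eight productions, in the same order as in the grammar above.
RULES = [
    ("S", "<Ra>"),
    ("Ra", "aaR"),
    ("R>", "L>"),
    ("R>", "*"),
    ("aL", "La"),
    ("a*", "*a"),
    ("<L", "<R"),
    ("<*", ""),
]


def generate(max_len):
    """Return all terminal strings with at most max_len a's, shortest first."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if form and set(form) == {"a"}:
            words.add(form)  # only terminals left: a generated word
            continue
        for lhs, rhs in RULES:
            pos = form.find(lhs)
            while pos != -1:
                new = form[:pos] + rhs + form[pos + len(lhs):]
                # The number of a's never decreases, so pruning here is safe.
                if new.count("a") <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return sorted(words, key=len)


if __name__ == "__main__":
    print([len(w) for w in generate(16)])
```

With these productions as written, the shortest derivable string is $aa$, because the duplicator makes one full pass before it can reach the end marker. If $N$ is meant to include $0$, the single string $a$ would need an extra production.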

rici
  • Your keycap graphics are almost illegible on my screen. There's a reason we don't write mathematics this way. – David Richerby Aug 30 '18 at 09:06
  • Great, thanks! There must be a method that turns algorithms into grammars and you are good at it. BTW I am a mathematician and not trained in compiler stuff. I bumped into this problem while preparing a lecture on inductive construction, closure operators, fixed points etc. Learned a lot from your answer. Thanks again. – mindconnect.cc Aug 30 '18 at 12:32
  • @DavidRicherby: I'll redo that part of the answer later today. Since MathJax rendering is completely illegible on my Android (a combination of the font size being wrong and an apparently well-known issue which causes text overlap), I tend to only use [cs.se] on my desktop, where the keycaps look nice. Perhaps "we" all have iPhones which render MathJax nicely. But surely the real solution is to fix the CSS. – rici Aug 30 '18 at 14:11
  • @joohee: there is no such algorithm afaik; in the end, it becomes a recreational programming puzzle, like turning an algorithm into a Turing machine. I don't regard myself as particularly good at either of these tasks, but over the years I've collected some examples and this is one of them. Formal language theory tends to stop at "a solution exists" and context-sensitive grammars are useless for practical parsing, so it's only to scratch the solve-a-puzzle itch. Anyway, glad it was helpful. – rici Aug 30 '18 at 14:20
  • @rici OK -- I've done an edit, though you might prefer to re-edit with symbols closer to the ones you originally chose. – David Richerby Aug 30 '18 at 14:32