Example Lex and Yacc Programs
Here are a number of short Lex and Yacc programs to demonstrate what
sorts of things you can do with Lex and Yacc. Several of these programs
are copied from the book lex & yacc by John R. Levine, Tony
Mason, and Doug Brown, published by O'Reilly & Associates, 1992.
- trivial.l is the simplest possible lex
program: it has no patterns except the "default" pattern that matches
everything, and has no rules except the "default" rule that prints the
matched text to stdout. In other words, it copies stdin to stdout.
- ex1.l, copied out of the "flex" man page,
replaces the word "username" wherever it
appears in the input with the username of the person who's logged in
right now; otherwise it copies stdin unaltered to stdout.
- ex2.l, also copied out of the "flex" man page,
counts the number of lines and characters in the input file.
Note that it uses the special symbol "." to match any character other
than a newline.
- caesar1.l implements the Caesar cipher: it
replaces every letter with the one three letters after it in
alphabetical order, wrapping around at Z. It uses character ranges
"[a-z]" and "[A-Z]", which match any lower-case letter and any
upper-case letter respectively.
- caesar2.l does the same thing, but with a
different (and briefer) approach: it uses range notation more cleverly
to distinguish which characters need to wrap and which don't.
- lexlongword.l finds the longest
word (defined as a contiguous string of upper and lower case letters)
in the input. Although the range notation "[a-zA-Z]" would match
any single letter, tacking on a "+" after it produces a regular
expression matching any sequence of one or more
letters, as long as possible (i.e. until it sees a non-letter).
Compare with clongword.c, clongword2.c, and clongword3.c.
- ch102.l, chapter 1 exercise 2 of the
O'Reilly book, categorizes a number of words as verbs and
nonverbs.
- ch103.l, also from the O'Reilly book,
categorizes words as verbs, adverbs, or
other words. Still a very limited vocabulary...
- ex3.l, copied out of the "flex" man page,
distinguishes keywords, integers, floats,
identifiers, operators, and comments in a simple Pascal-like language.
You don't need to understand everything that's going on in this one, but
it demonstrates the kind of thing you can do fairly quickly and simply
in lex.
- romans.l reads and interprets Roman
numerals. Note how much shorter and clearer this is than
solving the same problem in C. This program depends on lex preferring
the longest match among several alternative patterns, e.g. if it sees an
I, it doesn't apply the action for the "I" rule until it has first
checked that the next character isn't "V" or "X".
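To make the ex2.l description concrete, here is a minimal sketch of a line and character counter in the same spirit. It is an illustration, not the contents of ex2.l; the variable names and the flex-specific "%option noyywrap" line are assumptions.

```lex
%{
/* Illustrative sketch only; not the contents of ex2.l. */
#include <stdio.h>
static int num_lines = 0, num_chars = 0;   /* hypothetical counter names */
%}
%option noyywrap
%%
\n      { ++num_lines; ++num_chars; }   /* a newline is both a line and a character */
.       { ++num_chars; }                /* "." matches any character except newline */
%%
int main(void)
{
    yylex();    /* scan stdin until end of file */
    printf("# of lines = %d, # of chars = %d\n", num_lines, num_chars);
    return 0;
}
```

Assuming flex and a C compiler are installed, and the sketch is saved under a name such as count.l (a made-up file name), something like "flex count.l" followed by "cc lex.yy.c -o count" should build it.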
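The Caesar cipher item can be sketched along the following lines, using the two character ranges it mentions. This is a guess at the general shape of such a program, not the contents of caesar1.l; it assumes an ASCII character set and flex's "%option noyywrap".

```lex
%{
/* Illustrative sketch only; not the contents of caesar1.l. Assumes ASCII. */
#include <stdio.h>
%}
%option noyywrap
%%
[a-z]   { putchar('a' + (yytext[0] - 'a' + 3) % 26); }   /* shift by 3, wrapping z to c */
[A-Z]   { putchar('A' + (yytext[0] - 'A' + 3) % 26); }   /* same for upper case */
%%
int main(void)
{
    yylex();    /* non-letters match no rule, so the default rule echoes them unchanged */
    return 0;
}
```

A briefer variant, possibly closer to what caesar2.l does, might avoid the modulus by splitting the ranges instead: one rule for "[a-wA-W]" that adds 3 to the character, and one for "[x-zX-Z]" that subtracts 23.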
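For the longest-word item, a sketch of how the "[a-zA-Z]+" pattern might be put to work is given below. It is not the contents of lexlongword.l; the buffer size, variable names, and output format are all assumptions made for illustration.

```lex
%{
/* Illustrative sketch only; not the contents of lexlongword.l. */
#include <stdio.h>
#include <string.h>

#define MAX_WORD 1024                   /* hypothetical limit on stored word length */
static char longest[MAX_WORD] = "";
static size_t longest_len = 0;
%}
%option noyywrap
%%
[a-zA-Z]+   {
                /* yytext holds the whole run of letters; yyleng is its length */
                size_t len = (size_t)yyleng;
                if (len > longest_len && len < MAX_WORD) {
                    longest_len = len;
                    strcpy(longest, yytext);
                }
            }
.|\n        { /* discard everything that is not part of a word */ }
%%
int main(void)
{
    yylex();
    printf("longest word: %s (%lu letters)\n", longest, (unsigned long)longest_len);
    return 0;
}
```

Because "+" is greedy, each match is a maximal run of letters, so the action only has to compare lengths; when two words tie, this sketch keeps the first one it saw.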
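The longest-match behaviour described for romans.l can be illustrated with a sketch like the one below. It is not the contents of romans.l: the rule set, the running total, and the choice to print one value per input line are assumptions, and the sketch makes no attempt to reject malformed numerals.

```lex
%{
/* Illustrative sketch only; not the contents of romans.l. */
#include <stdio.h>
static int total = 0;                   /* value of the numeral read so far on this line */
%}
%option noyywrap
%%
I       { total += 1; }
IV      { total += 4; }
IX      { total += 9; }
V       { total += 5; }
X       { total += 10; }
XL      { total += 40; }
XC      { total += 90; }
L       { total += 50; }
C       { total += 100; }
CD      { total += 400; }
CM      { total += 900; }
D       { total += 500; }
M       { total += 1000; }
\n      { printf("%d\n", total); total = 0; }   /* end of line: report and reset */
.       { /* ignore any other character */ }
%%
int main(void)
{
    yylex();
    return 0;
}
```

Given the input line MCMXCVI, the scanner matches M, then CM (preferred over C alone because it is longer), then XC, V, and I, so the total is 1000 + 900 + 90 + 5 + 1 = 1996.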
Last modified: Tue Nov 19 12:00:51 EST 1996
Stephen Bloch / sbloch@boethius.adelphi.edu