Example Lex and Yacc Programs

Here are several short Lex and Yacc programs that demonstrate the sorts of things you can do with these tools. Several of them are copied from the book lex & yacc by John R. Levine, Tony Mason, and Doug Brown (O'Reilly & Associates, 1992).

trivial.l is the simplest possible lex program: it has no patterns except the "default" pattern, which matches any character not matched by another rule, and no rules except the "default" rule, which prints the matched text to stdout. In other words, it copies stdin to stdout.
ex1.l, copied out of the "flex" man page, replaces the word "username" wherever it appears in the input with the username of the person who's logged in right now; otherwise it copies stdin unaltered to stdout.
ex2.l, also copied out of the "flex" man page, counts the number of lines and characters in the input file. Note that it uses the special symbol "." to match any character other than a newline.
caesar1.l implements the Caesar cipher: it replaces every letter with the one three letters after it in alphabetical order, wrapping around at Z. It uses the character ranges "[a-z]" and "[A-Z]", which match any lower-case letter and any upper-case letter respectively.
caesar2.l does the same thing, but with a different (and briefer) approach: it uses range notation more cleverly to distinguish which characters need to wrap and which don't.
lexlongword.l finds the longest word (defined as a contiguous string of upper- and lower-case letters) in the input. The range notation "[a-zA-Z]" by itself matches any single letter; tacking a "+" onto it produces a regular expression that matches a sequence of one or more letters, taking as many as possible (i.e. stopping only when it sees a non-letter). Compare with clongword.c, clongword2.c, and clongword3.c.
ch102.l, chapter 1 exercise 2 of the O'Reilly book, categorizes a number of words as verbs and nonverbs.
ch103.l, also from the O'Reilly book, categorizes words into verbs, adverbs, and other words. Still a very limited vocabulary...
ex3.l, copied out of the "flex" man page, distinguishes keywords, integers, floats, identifiers, operators, and comments in a simple Pascal-like language. You don't need to understand everything that's going on in this one, but it demonstrates the kind of thing you can do fairly quickly and simply in lex.
romans.l reads and interprets Roman numerals. Note how much shorter and clearer this is than solving the same problem in C. This program depends on lex choosing the longest of several patterns that match at the current position: if it sees an "I", it doesn't apply the action for the "I" rule until it has first checked that the next character isn't "V" or "X".
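
The longest-match behavior that romans.l relies on can be sketched roughly as follows. This is not the original romans.l, only a minimal illustration written for flex: the "%option noyywrap" line is flex-specific, and the variable name "total" is made up for this sketch.

```lex
%{
/* Sketch only (not the original romans.l): add up the value of the
   Roman numeral on each input line and print it at the newline. */
#include <stdio.h>
int total = 0;   /* hypothetical name: value accumulated so far on this line */
%}
%option noyywrap
%%
IV      { total += 4; }
IX      { total += 9; }
XL      { total += 40; }
XC      { total += 90; }
CD      { total += 400; }
CM      { total += 900; }
I       { total += 1; }
V       { total += 5; }
X       { total += 10; }
L       { total += 50; }
C       { total += 100; }
D       { total += 500; }
M       { total += 1000; }
\n      { printf("%d\n", total); total = 0; }
.       { /* ignore any other character */ }
%%
int main(void)
{
    yylex();
    return 0;
}
```

Assuming the sketch is saved under a name such as roman.l and built with something like "flex roman.l && cc lex.yy.c", the input line MCMXCVI is split into M, CM, XC, V, I and prints 1996. The two-letter rules win over the one-letter rules only because lex prefers the longest match; the order of the rules would matter only between patterns matching the same length. The sketch does not reject malformed numerals such as IIII, and it prints a result only when it reaches a newline.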

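The range trick described for caesar2.l might look something like the sketch below. This is a guess at the general approach, not a copy of that file, and it assumes an ASCII-like character set in which the letters are contiguous.

```lex
%{
/* Sketch only (not the original caesar2.l): shift every letter three
   places forward, wrapping x, y, z around to a, b, c. Assumes ASCII. */
#include <stdio.h>
%}
%option noyywrap
%%
[a-wA-W]    { putchar(yytext[0] + 3); }   /* no wrap needed */
[x-zX-Z]    { putchar(yytext[0] - 23); }  /* wrap around: +3 - 26 */
%%
int main(void)
{
    yylex();
    return 0;
}
```

Anything that is not a letter falls through to lex's default rule and is copied to stdout unchanged, so punctuation, digits, and line breaks survive the cipher.
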

Last modified: Tue Nov 19 12:00:51 EST 1996

Stephen Bloch / sbloch@boethius.adelphi.edu