
Prolog as a Text-Hacking Language

"Which scripting language do you use?" once demanded a caption down the right hand side of the Dobbs Code Talk page. Perl? PHP? Python? No. If by "scripting", you mean quick-and-dirty programmatic text-hacking, I use SWI-Prolog. I'm going to solve a small file-conversion problem to explain how, walking through my program to show off some aspects of Prolog. One thing I want to emphasise, as I did in a posting about Yet More XML: with Prolog, is how convenient I find Prolog. Look! ['This', is, all, 'I', need, to, do, to, make, a, list, '!'].

I am not going to explain everything about Prolog, and shan't, for example, tell you how Prolog executes code and decides what order to do things in. I also shan't say anything about the cut: the exclamation mark you'll see in some of my code listings. (Roughly speaking, it tells Prolog to commit itself to one path of execution, and not go backtracking into alternatives.)

Rather, I just want to show Prolog at work on a typical small text-hacking problem, and demonstrate some coding techniques, as well as some of Prolog's quirks. The two most important points, I think, are: Prolog's convenience and conciseness when creating and acting on lists. Why, in comparison, does Java have to be so verbose? And: how to understand definitions declaratively, as in line_to_equations and lines_to_equations.

Introduction: Life and Life pattern files

This problem arose when I was preparing my posting Gliders, Hasslers, and the Toadsucker: Writing and Explaining a Structured Excel Life Game. In it, I describe how I wrote a program in my Excelsior spreadsheet-description language, to implement a Life game in Excel.

Perhaps you already know Life. It's a cellular automaton — probably the world's most famous. It runs on a rectangular grid, transforming a pattern of full and empty cells one generation at a time. Details are in Gliders, Hasslers, and the Toadsucker: Writing and Explaining a Structured Excel Life Game (and in many other places): there, you'll find links to references, and to Life simulators, including my spreadsheet.

I wanted to give people using my spreadsheet some example patterns, so that they could do something interesting straight away. Because these patterns live on a rectangular grid, and their cells can be either full or empty, one can code them as pictures using two symbols, like this:

...***...***.......
..............
.*....*.*....*
.*....*.*....*
.*....*.*....*
...***...***..
..............
...***...***..
.*....*.*....*
.*....*.*....*
.*....*.*....*
..............
...***...***..

That, by the way, is a "pulsar": it oscillates, reverting to this original every three cycles.

Such patterns are easy to type. But for my Excelsior program, I wanted them in a different representation, like this:

pulsar[7,4] = "n".
pulsar[8,4] = "n".
pulsar[9,4] = "n".
pulsar[13,4] = "n".
pulsar[14,4] = "n".
pulsar[15,4] = "n".
pulsar[5,6] = "n".
pulsar[10,6] = "n".
pulsar[12,6] = "n".
pulsar[17,6] = "n".
pulsar[5,7] = "n".
...
pulsar[15,16] = "n".

In the interests of boredom-reduction, I've omitted some rows, but the intent should be clear. Each line assigns "n" to an array element. The spreadsheet that I generated from these assignments uses the Wingdings font to display these arrays; and in Wingdings, "n" appears as a filled box. So the assignments initialise the full cells. The empty cells don't need assigning, because I represent them as blanks, which is a cell's default value in Excel.

The problem, then, is to read a sequence of lines, where each line is a sequence of dots and stars. The stars must be converted into assignments to elements of a two-dimensional array. The Y coordinate of each element is determined by its line number, and the X coordinate by its position within that line.

The Prolog program

I'm now going to show my Prolog program, then explain it piece by piece. This won't be an exhaustive introduction to Prolog — and I hope it won't be exhausting — I'm just going to use it to show off some aspects of the Prolog language and of SWI-Prolog.

First, some generalities. I wrote my program in Jan Wielemaker's SWI-Prolog. This is a free open-source Prolog that runs on Windows, Linux, and Macs. Like all Prologs, you can run it interactively, which is convenient for developing and debugging. And for teaching, I have to say: I wonder why more computer-science departments don't use interactive languages like Prolog and Lisp, rather than non-interactive ones like C++ and Java.

Predicates and clauses

My program is a sequence of predicate definitions. (Predicates are what Prolog has instead of subroutines and functions.) The definitions consist of clauses. Often, each clause processes a different combination of input arguments, so alternative clauses are like alternative "then" branches in an "if-then-elseif-then" statement.
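
As a minimal sketch of clauses acting as alternative branches, here is a two-clause predicate. The name cell_state is hypothetical and is not part of the program discussed here; each clause handles one possible input character.

```prolog
% cell_state(+Code, -State): hypothetical helper, not part of the program.
% Each clause plays the role of one branch of an if-then-elseif:
% the first handles a star, the second a dot.
cell_state( 0'*, full ).

cell_state( 0'., empty ).
```

Asking ?- cell_state( 0'*, State ). gives State = full, while a call with any other character code, such as 0'x, simply fails because no clause matches.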

I've separated different clauses for the same predicate by single blank lines, and different predicates by double blank lines. This isn't necessary — Prolog is free-format — but it helps readability.

I mentioned subroutines and functions. In conventional languages, "function" connotes something purer than "subroutine": something that, at its purest, has no side effects, and implements a mathematical function. In Prolog too, some predicates are purer than others. At its purest, a Prolog predicate corresponds to a predicate in first-order predicate logic, and implements a mathematical relation.
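
To illustrate what "implements a mathematical relation" can mean in practice, here is a small sketch using the standard list predicate append/3. The wrapper names join_demo and split_demo are hypothetical. The same predicate is used once "forwards", like a function, and once "backwards", to enumerate every pair of lists that stand in the relation.

```prolog
:- use_module( library(lists) ).

% join_demo(-Joined): use append/3 "forwards", like a function.
join_demo( Joined ) :-
  append( [a,b], [c], Joined ).

% split_demo(-Splits): use the same append/3 "backwards", collecting
% every way of splitting [a,b,c] into a front part and a back part.
split_demo( Splits ) :-
  findall( Front-Back, append( Front, Back, [a,b,c] ), Splits ).
```

Here join_demo gives [a,b,c], and split_demo gives the four splits []-[a,b,c], [a]-[b,c], [a,b]-[c] and [a,b,c]-[].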

Here, then, is my program:

convert_life_pattern( InputFile, PatternName ) :-
  read_file_to_lines( InputFile, Lines ),
  lines_to_equations( Lines, PatternName, 4, Eqns ),
  atom_concat( PatternName, '.exc', OutputFile ),
  tell( OutputFile ),
  flatten( Eqns, EqnsFlattened ),
  maplist( write_equation, EqnsFlattened ),
  told.


lines_to_equations( [], _PatternName, _Y, [] ) :- !.

lines_to_equations( [ Line | Lines ], PatternName, Y, [ Eqns | MoreEqns ] ) :-
  line_to_equations( Line, PatternName, 4, Y, Eqns ),
  YPlus1 is Y+1,
  lines_to_equations( Lines, PatternName, YPlus1, MoreEqns ).


line_to_equations( [], _PatternName, _X, _Y, [] ) :- !.

line_to_equations( [ 0'* | Rest ], PatternName, X, Y, [ Eqn | Eqns ] ) :- !,
  Eqn = eqn( PatternName, X, Y ),
  XPlus1 is X+1,
  line_to_equations( Rest, PatternName, XPlus1, Y, Eqns ).

line_to_equations( [ 0'. | Rest ], PatternName, X, Y, Eqns ) :-
  XPlus1 is X+1,
  line_to_equations( Rest, PatternName, XPlus1, Y, Eqns ).


write_equation( eqn( PatternName, X, Y ) ) :-
  format( '~w[~w,~w] = "n".~n', [PatternName,X,Y] ).


read_file_to_lines( File, Lines ) :-
  open( File, read, Stream, [] ),
  read_line_to_codes( Stream, Line ),
  read_stream_to_lines( Line, Stream, Lines ),
  close( Stream ).


read_stream_to_lines( end_of_file, _Stream, [] ) :- !.

read_stream_to_lines( Line, Stream, [ Line | Rest ] ) :-
  read_line_to_codes( Stream, NextLine ),
  read_stream_to_lines( NextLine, Stream, Rest ).

Running the program

I've made my program available so that you can run it, and all the example calls below, yourself: this may be useful if you're teaching or learning Prolog. This is how. First, download SWI-Prolog, as follows. Go to the SWI-Prolog home page. Click on the "Downloads" link in the left-hand navigation column. Go to the "Stable release" link in the SWI-Prolog downloads page. Then download an appropriate version. As a Windows user, the one I download is the second, currently "SWI-Prolog/XPCE 5.6.64 for Windows NT/2000/XP/Vista".

Next, download my zipped examples and unzip into a clean directory.

Next, click on the icon of the SWI-Prolog you installed. This will give you output similar to what's below:

Welcome to SWI-Prolog (Multi-threaded, 32 bits, Version 5.6.64)
Copyright (c) 1990-2008 University of Amsterdam.
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to redistribute it under certain conditions.
Please visit http://www.swi-prolog.org for details.

For help, use ?- help(Topic). or ?- apropos(Word).

1 ?-

Next, in SWI-Prolog, change to the directory you unzipped into. Do this by typing the command shown in bold below into the Prolog window, then hitting RETURN. Be sure not to omit the full stop. You'll need to replace the path shown by that to the directory you unzipped into: in paths, you can use forward slashes even on Windows:

1 ?- cd('c:/dobbs').
true.

2 ?-

Next, load my program:

2 ?- [convert_life_pattern].
% convert_life_pattern compiled 0.00 sec, 3,552 bytes
true.

3 ?-

And finally, invoke the main predicate:

3 ?- convert_life_pattern('pulsar.dots',pulsar).
true.

4 ?-

If all went well, this will have generated the file pulsar.exc, containing the array assignments shown in my introduction.

Running the goals

In my listings of interaction with Prolog, I've taken care to make examples that you can easily run yourself, by typing the parts of the interaction that are in bold. I've also put these into a file of Prolog goals (calls), convert_life_pattern_goals.pl . This is in the zip file.

To use it, you can copy the text of any goal, then paste it into the SWI-Prolog window by selecting Edit on the Prolog toolbar, then Paste.

On Windows, you can run the entire file by typing the following DOS command. Linux and Macs may allow something similar:

plcon < convert_life_pattern_goals.pl

The "plcon" command runs plcon.exe, which is a DOS-text-only version of the Prolog interpreter that runs inside its DOS window. It will display the goals' output (mainly, the results put into Prolog variables) in this window.

The program piece by piece

convert_life_pattern

The program's top-level predicate is the one you just invoked: convert_life_pattern. Now, there are two ways to read Prolog predicate definitions. One is imperatively: seeing the definition as a sequence of commands.

The other way is logically and declaratively. Here, we view the definition as a logical description of what the predicate has to do rather than how it does it. I showed an example or two of this in my posting about The Prolog Lightbulb Joke.

When input-output is involved, as in convert_life_pattern, the imperative way is usually easiest. So I'll show convert_life_pattern again, followed by its imperative reading. This is convert_life_pattern:

convert_life_pattern( InputFile, PatternName ) :-
  read_file_to_lines( InputFile, Lines ),
  lines_to_equations( Lines, PatternName, 4, Eqns ),
  atom_concat( PatternName, '.exc', OutputFile ),
  tell( OutputFile ),
  flatten( Eqns, EqnsFlattened ),
  maplist( write_equation, EqnsFlattened ),
  told.

And this is its imperative reading:

To convert the Life pattern in InputFile called PatternName:
  read InputFile to a list of lines in Lines,
  convert Lines to a list of equations in Eqns that assign to an array called PatternName,
  concatenate PatternName with the atom '.exc' to make the name OutputFile,
  open OutputFile and make it the current output stream,
  flatten the list Eqns to EqnsFlattened,
  apply write_equation to each element of EqnsFlattened,
  and close the current output stream.

line_to_equations

I'm now going to switch attention from the program's top level to its bottom level, and show how I process individual lines. Here again is the first line of the pulsar pattern and the output I want from it:

...***...***.......
pulsar[7,4] = "n".
pulsar[8,4] = "n".
pulsar[9,4] = "n".
pulsar[13,4] = "n".
pulsar[14,4] = "n".
pulsar[15,4] = "n".

So, each star must become an equation that assigns to the corresponding element of the array. This is what "line_to_equations" does. Its first argument is a line; its second argument is the name to use for the array, such as "pulsar"; and its third and fourth arguments are the X and Y array subscripts to use for the first character of the line.

In its fifth argument, it returns a list of data structures representing equations. Unlike functions in other languages, predicates don't have special syntax for "returning a result": if they need to, they do so via one or more of their arguments.

Now here are some example interactive calls to line_to_equations:

4 ?- line_to_equations( "..**..*", pat, 4,4, Eqns ).
Eqns = [eqn(pat, 6, 4), eqn(pat, 7, 4), eqn(pat, 10, 4)] .

5 ?- line_to_equations( "..**..*", pat, 1,3, Eqns ).
Eqns = [eqn(pat, 3, 3), eqn(pat, 4, 3), eqn(pat, 7, 3)] .

6 ?- line_to_equations( "..**..*", pat, 0,0, Eqns ).
Eqns = [eqn(pat, 2, 0), eqn(pat, 3, 0), eqn(pat, 6, 0)] .

7 ?- line_to_equations( "*.*", pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 2, 0)] .

8 ?- line_to_equations( "", pat, 0,0, Eqns ).
Eqns = [].

9 ?- line_to_equations( [], pat, 0,0, Eqns ).
Eqns = [].

10 ?- line_to_equations( [ 0'*, 0'., 0'* ], pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 2, 0)] .

Compound terms

To explain Prolog's responses, I want to refer back to my already-mentioned posting about The Prolog Lightbulb Joke. In it, I showed a route-finder that calculates how to get from one country to another, for example from Holland to Portugal. It returns its result as a list, like this:

[go(netherlands, belgium), go(belgium, france),
go(france, spain), go(spain, portugal)]

Each element of this list is a compound term representing one step of the route. It's like a structure or record, with two fields, distinguished by position rather than by name. The output from line_to_equations is similar, except that its compound terms (the things like eqn(pat, 1, 3)) have three fields.

I decided that the first field should be the name of the array to be assigned to, while the second and third fields are the X and Y coordinates of the array element. So eqn(pat,1,3) represents this assignment:

pat[1,3] = "n".
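
The record-like nature of compound terms can be checked directly. The sketch below uses the standard built-ins functor/3 and =.. (called "univ") to take an eqn term apart, and plain unification to pick out its fields by position. The predicate names eqn_parts and eqn_fields are hypothetical.

```prolog
% eqn_parts(+Eqn, -Name, -Arity, -AsList): Name and Arity describe the
% term's functor; AsList is the functor name followed by the fields.
eqn_parts( Eqn, Name, Arity, AsList ) :-
  functor( Eqn, Name, Arity ),
  Eqn =.. AsList.

% eqn_fields(+Eqn, -Array, -X, -Y): pick the fields out by position,
% using nothing more than unification in the clause head.
eqn_fields( eqn( Array, X, Y ), Array, X, Y ).
```

For eqn(pat,1,3), eqn_parts gives the name eqn, the arity 3, and the list [eqn,pat,1,3]; eqn_fields gives pat, 1 and 3.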

write_equation

Presumably, you are thinking, some other part of the program uses these "eqn" compound terms, and hence needs to know which order the fields are in? Indeed so. It's the little predicate "write_equation". Here it is again:

write_equation( eqn( PatternName, X, Y ) ) :-
  format( '~w[~w,~w] = "n".~n', [PatternName,X,Y] ).

Like convert_life_pattern, this does I/O, so I'll read it imperatively:

To write the equation represented by the compound term whose fields are PatternName, X, and Y:
  call "format" with the format string and arguments shown.

The built-in predicate "format"

This "format" built-in replaces each ~w in its format string by the corresponding element of the list given as its second argument. The ~n becomes a newline. Let these two calls demonstrate:

11 ?- write_equation( eqn(pulsar,0,1) ).
pulsar[0,1] = "n".
true.

12 ?- write_equation( eqn(glider,3,2) ).
glider[3,2] = "n".
true.
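
It can be handy to see what "format" produces without printing it. As a sketch, SWI-Prolog's three-argument format accepts atom(Text) as its first argument and delivers the result as an atom instead of writing it. The name equation_text is hypothetical, and the trailing newline is left out so that the result is easy to inspect.

```prolog
% equation_text(+Eqn, -Text): like write_equation, but builds the text
% as an atom instead of printing it. Hypothetical helper; the atom(...)
% output argument of format/3 is an SWI-Prolog feature.
equation_text( eqn( PatternName, X, Y ), Text ) :-
  format( atom(Text), '~w[~w,~w] = "n".', [PatternName,X,Y] ).
```

Calling equation_text( eqn(pulsar,7,4), Text ) binds Text to the atom 'pulsar[7,4] = "n".'.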

The built-in predicate "maplist"

While I'm on the subject of write_equation, let me show how I handle lists of equations. There is a built-in predicate, "maplist", that applies another predicate to a list. It comes in several arities (numbers of arguments), depending on how many lists you want to call it with. I'll call the two-argument variety, often used as here to apply some predicate to each element of a list for the sake of a side-effect:

13 ?- maplist( write_equation, [eqn(pulsar,0,1)] ).
pulsar[0,1] = "n".
true.

14 ?- maplist( write_equation, [eqn(pulsar,0,1),eqn(pulsar,2,3)] ).
pulsar[0,1] = "n".
pulsar[2,3] = "n".
true.

15 ?- maplist( write_equation, [] ).
true.
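
For comparison, here is a sketch of the three-argument variety, which relates corresponding elements of two lists rather than acting for a side-effect. It uses the standard built-in atom_length/2; the wrapper name name_lengths is hypothetical.

```prolog
:- use_module( library(apply) ).

% name_lengths(+Names, -Lengths): each element of Lengths is the
% length of the corresponding atom in Names.
name_lengths( Names, Lengths ) :-
  maplist( atom_length, Names, Lengths ).
```

With [pulsar,glider,toad] as the first argument, Lengths becomes [6,6,4].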

Character-code lists

I've now shown you how line_to_equations's result gets used. Let me now go back and look at its inputs and how it works. So look now again at these two calls:

7 ?- line_to_equations( "*.*", pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 2, 0)] .

10 ?- line_to_equations( [ 0'*, 0'., 0'* ], pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 2, 0)] .

These actually mean exactly the same thing. I said that the first argument to line_to_equations is an input, the line to be converted. It is a list of integer character codes. As a convenience, Prolog allows this to be written as a double-quoted string, "*.*". But that's just shorthand for the list [ 0'*, 0'., 0'* ], where each element is a character constant.

The list can even be written as [42,46,42], using the characters' ASCII codes. So the following call gives the same result as the two above:

16 ?- line_to_equations( [42,46,42], pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 2, 0)].
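
One caution if you try this on a newer system than the 5.6.64 used here: from SWI-Prolog 7 onwards, double-quoted text is by default read as a separate string type, not as a list of codes. The sketch below, with hypothetical predicate names, assumes such a version and switches the old behaviour back on for the file it appears in. It also shows atom_codes/2, which makes the conversion explicit regardless of the flag.

```prolog
% Read "..." as a list of character codes in the rest of this file.
:- set_prolog_flag( double_quotes, codes ).

% same_line(-Quoted, -Constants, -Numbers): three spellings of one list.
same_line( Quoted, Constants, Numbers ) :-
  Quoted    = "*.*",
  Constants = [ 0'*, 0'., 0'* ],
  Numbers   = [ 42, 46, 42 ].

% line_codes(+Atom, -Codes): explicit conversion from an atom.
line_codes( Atom, Codes ) :-
  atom_codes( Atom, Codes ).
```

All three results of same_line are the identical list [42,46,42], and so is the result of line_codes( '*.*', Codes ).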

Atoms

That takes care of the first argument to line_to_equations. Its second argument is an atom. This is a kind of string, but one that Prolog stores in an optimised way, such that all occurrences of the atom point at the same piece of storage. Atoms in our source code can always be delimited by single quotes, but if they happen to be alphanumeric identifiers beginning with a lower-case letter, such as "pat", you can omit the quotes.

Finally, the third and fourth arguments are X and Y values. They state the Y position of the line being converted, and the X offset of the first element in the list. These calls demonstrate their effect:

5 ?- line_to_equations( "..**..*", pat, 1,3, Eqns ).
Eqns = [eqn(pat, 3, 3), eqn(pat, 4, 3), eqn(pat, 7, 3)].

6 ?- line_to_equations( "..**..*", pat, 0,0, Eqns ).
Eqns = [eqn(pat, 2, 0), eqn(pat, 3, 0), eqn(pat, 6, 0)].

How line_to_equations works

Now that I've explained line_to_equations's arguments, how does it work? Here's the definition again:

line_to_equations( [], _PatternName, _X, _Y, [] ) :- !.

line_to_equations( [ 0'* | Rest ], PatternName, X, Y, [ Eqn | Eqns ] ) :- !,
  Eqn = eqn( PatternName, X, Y ),
  XPlus1 is X+1,
  line_to_equations( Rest, PatternName, XPlus1, Y, Eqns ).

line_to_equations( [ 0'. | Rest ], PatternName, X, Y, Eqns ) :-
  XPlus1 is X+1,
  line_to_equations( Rest, PatternName, XPlus1, Y, Eqns ).

Because line_to_equations does not have side effects, it's best to read it declaratively, like this:

The list of equations for an empty line is empty.

The list of equations for a line which is * followed by Rest, where the line's first character must go into (X,Y) of array PatternName, is Eqn followed by Eqns, if
  Eqn is the record (PatternName,X,Y),
  and the equations for Rest and PatternName and X+1 and Y are Eqns.

The list of equations for a line which is dot followed by Rest, where the line's first character must go into (X,Y) of array PatternName, is Eqns, if
  the equations for Rest and PatternName and X+1 and Y are Eqns.

Does that make it clear? Notice that the second clause recurses after building one equation, whereas the third clause ignores the first character and recurses immediately.

Arithmetic

I said that the third argument is the X offset of the first element in the list. As just shown, when line_to_equations recurses, it has to increment this. And here, we see one of Prolog's quirks. Why, in my code, did I write "XPlus1 is X+1", instead of just writing X+1 as an argument to the recursive calls of line_to_equations? In other words, why didn't I define it like this:

line_to_equations( [], _PatternName, _X, _Y, [] ) :- !.

line_to_equations( [ 0'* | Rest ], PatternName, X, Y, [ Eqn | Eqns ] ) :- !,
  Eqn = eqn( PatternName, X, Y ),
  line_to_equations( Rest, PatternName, X+1, Y, Eqns ).

line_to_equations( [ 0'. | Rest ], PatternName, X, Y, Eqns ) :-
  line_to_equations( Rest, PatternName, X+1, Y, Eqns ).

Well, let's try it. Below, I call the original line_to_equations, then load a version of my program where line_to_equations and lines_to_equations are redefined as above, then call line_to_equations again:

17 ?- line_to_equations( "****", pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 1, 0), eqn(pat, 2, 0), eqn(pat, 3, 0)] .

18 ?- [convert_life_pattern_2].
% convert_life_pattern_2 compiled 0.00 sec, -48 bytes
true.

19 ?- line_to_equations( "****", pat, 0,0, Eqns ).
Eqns = [eqn(pat, 0, 0), eqn(pat, 0+1, 0), eqn(pat, 0+1+1, 0), eqn(pat, 0+1+1+1, 0)] .

And, you will see, where the original version gives an X of 1 in the "eqn(pat, 1, 0)", the new version gives an X of 0+1. And where the original version gives an X of 2, the new version gives 0+1+1. What's going on?

The answer is intimately related to the fact that "go(belgium,france)" and "eqn(pat,0,0)" are compound terms. Watch how they behave when I pass them to Prolog's output predicate, "write":

20 ?- write( go(belgium,france) ).
go(belgium, france)
true.

21 ?- write( eqn(pat,0,0) ).
eqn(pat, 0, 0)
true.

22 ?- write( u+v ).
u+v
true.

23 ?- write( 1+2 ).
1+2
true.

24 ?- write( +(1,2) ).
1+2
true.

25 ?- 1+2 = +(1,2).
true.

To somebody accustomed to most other programming languages, the call to "write( 1+2 )" looks as though it should evaluate "1+2", thereby outputting 3. But Prolog is different. Like "go(belgium,france)" and "eqn(pat,0,0)", "1+2" is a compound term. Humans like to put + between its arguments when doing arithmetic, and so Prolog's implementors allow the same freedom. But this infix form is merely syntactic sugar, meaning exactly the same as "+(1,2)". Indeed, I've demonstrated that in the final goal above, where Prolog tells me the two are equal.

So, "+(1,2)" and "go(belgium,france)" both behave like records with two fields, and "1+2" behaves like "+(1,2)". Prolog won't do arithmetic automatically on it, but must be forced to with the "is" operator. That is why I needed to write "XPlus1 is X+1" and then pass XPlus1 as argument to the recursive call of line_to_equations.
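
The difference between building a term and evaluating it fits into a two-line sketch. The predicate name build_and_evaluate is hypothetical; "=" and "is" are standard.

```prolog
% build_and_evaluate(-Term, -Value): "=" only unifies, so Term stays
% the compound term 0+1+1; "is" evaluates its right-hand side.
build_and_evaluate( Term, Value ) :-
  Term = 0+1+1,
  Value is Term.
```

After a call, Term is still the unevaluated 0+1+1, exactly as in the redefined line_to_equations, while Value is the integer 2.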

Using minus to build pairs

To emphasise this, here's a call to the built-in "keysort" predicate. Its first argument is a list of pairs, and in its second argument, it returns a list of the pairs sorted by their first component:

26 ?- keysort( [ 8-a, 7-x, 3-foo, 2-fred, 4-go(belgium,france), 9-(1+2), 0-eqn(pat,0,0) ], Sorted ).
Sorted = [0-eqn(pat, 0, 0), 2-fred, 3-foo, 4-go(belgium, france), 7-x, 8-a, 9- (1+2)].

Notice two things. First, the 1+2 hasn't become a three. Second, Prolog hasn't complained at being asked to subtract a from 8, or x from 7, or fred from 2, or any of the others. Like 1+2, 8-a acts like a record with two fields. Actually, Prolog programmers often use "-" to build pairs in this way, almost as often as they use it in an "is" to denote subtraction!
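
A typical use of such pairs is to sort things by a computed key. The following sketch assumes SWI-Prolog's library(pairs), whose pairs_keys_values/3 and pairs_values/2 build and dismantle lists of Key-Value pairs. The predicate name sorted_by_length is hypothetical.

```prolog
:- use_module( library(apply) ).
:- use_module( library(pairs) ).

% sorted_by_length(+Atoms, -Sorted): pair each atom with its length,
% sort the pairs on that key, then throw the keys away.
sorted_by_length( Atoms, Sorted ) :-
  maplist( atom_length, Atoms, Lengths ),
  pairs_keys_values( Pairs, Lengths, Atoms ),
  keysort( Pairs, SortedPairs ),
  pairs_values( SortedPairs, Sorted ).
```

Given [pulsar,toad,hassler], the intermediate pairs are 6-pulsar, 4-toad and 7-hassler, and the result is [toad,pulsar,hassler].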

lines_to_equations

Let's now look at lines_to_equations, which calls line_to_equations. Here it is again:

lines_to_equations( [], _PatternName, _Y, [] ) :- !.

lines_to_equations( [ Line | Lines ], PatternName, Y, [ Eqns | MoreEqns ] ) :-
  line_to_equations( Line, PatternName, 4, Y, Eqns ),
  YPlus1 is Y+1,
  lines_to_equations( Lines, PatternName, YPlus1, MoreEqns ).

And here's a declarative reading:

The list of equations for an empty list of lines is empty.

The list of equations for a list of lines which is Line followed by Lines, where Line's characters must go into row Y of array PatternName, is Eqns followed by MoreEqns, if
  the equations for Line and PatternName and 4 and Y are Eqns,
  and the equations for Lines and PatternName and Y+1 are MoreEqns.

All this works in the same way as line_to_equations, and I don't think I need to explain it further. It's a straightforward recursive definition.
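
To see the shape of the result, here is a self-contained sketch that repeats the two definitions from the program and calls them on a tiny two-line pattern. The name two_line_demo is hypothetical, and the lines are written as explicit code lists so that the example does not depend on how double quotes are read.

```prolog
% Definitions copied from the program above.
line_to_equations( [], _PatternName, _X, _Y, [] ) :- !.

line_to_equations( [ 0'* | Rest ], PatternName, X, Y, [ Eqn | Eqns ] ) :- !,
  Eqn = eqn( PatternName, X, Y ),
  XPlus1 is X+1,
  line_to_equations( Rest, PatternName, XPlus1, Y, Eqns ).

line_to_equations( [ 0'. | Rest ], PatternName, X, Y, Eqns ) :-
  XPlus1 is X+1,
  line_to_equations( Rest, PatternName, XPlus1, Y, Eqns ).


lines_to_equations( [], _PatternName, _Y, [] ) :- !.

lines_to_equations( [ Line | Lines ], PatternName, Y, [ Eqns | MoreEqns ] ) :-
  line_to_equations( Line, PatternName, 4, Y, Eqns ),
  YPlus1 is Y+1,
  lines_to_equations( Lines, PatternName, YPlus1, MoreEqns ).


% two_line_demo(-Eqns): the line "*." above the line ".*", from row 0.
two_line_demo( Eqns ) :-
  lines_to_equations( [ [ 0'*, 0'. ], [ 0'., 0'* ] ], pat, 0, Eqns ).
```

The result is [[eqn(pat,4,0)],[eqn(pat,5,1)]]: one sublist per input line, with X starting at 4 on each line because that offset is fixed inside lines_to_equations.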

The built-in predicate "flatten"

This leaves me to explain the built-in predicates in convert_life_pattern, and read_file_to_lines. In convert_life_pattern, "flatten" is one of SWI-Prolog's handy list-processing built-ins. Here's an interactive demonstration:

27 ?- flatten( [], F ).
F = [].

28 ?- flatten( [ [1],2,[3] ], F ).
F = [1, 2, 3].

29 ?- flatten( [ [[[[1]]]],[[[2,[3]]]] ], F ).
F = [1, 2, 3].

You can see that "flatten" removes all but one level of square brackets — that is, it flattens the list passed as its first argument.

I used "flatten" because of a mismatch between the number of lists per line returned by the "natural" way of writing my code, and the structure of the list of equations I wanted the code to return. The point is that the easiest recursion scheme to code is the kind in line_to_equations and lines_to_equations. On each recursion, this appends elements to a list, ending up with a list of elements.

When we wrap one instance of this scheme inside another, as lines_to_equations in effect does, we end up with a list of lists of elements. In this case, of equations. There are well-known ways to avoid this and keep the list flat as it's built. Doing so is important for efficiency when building big lists. But my lists are so small that it doesn't matter: I can safely build them the easy way, then clean them up with "flatten".
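
For the curious, one of those well-known ways is Prolog's definite clause grammar (DCG) notation, which threads a single list through all the calls. What follows is only a sketch of that alternative, not the code used in this article; the names line_eqns, lines_eqns and flat_equations are hypothetical, and the cuts are left out for simplicity.

```prolog
% line_eqns//4: DCG counterpart of line_to_equations. Each
% [ eqn(...) ] adds one element directly to the single output list.
line_eqns( [], _PatternName, _X, _Y ) --> [].
line_eqns( [ 0'* | Rest ], PatternName, X, Y ) -->
  [ eqn( PatternName, X, Y ) ],
  { XPlus1 is X+1 },
  line_eqns( Rest, PatternName, XPlus1, Y ).
line_eqns( [ 0'. | Rest ], PatternName, X, Y ) -->
  { XPlus1 is X+1 },
  line_eqns( Rest, PatternName, XPlus1, Y ).

% lines_eqns//3: DCG counterpart of lines_to_equations.
lines_eqns( [], _PatternName, _Y ) --> [].
lines_eqns( [ Line | Lines ], PatternName, Y ) -->
  line_eqns( Line, PatternName, 4, Y ),
  { YPlus1 is Y+1 },
  lines_eqns( Lines, PatternName, YPlus1 ).

% flat_equations(+Lines, +PatternName, -Eqns): Eqns is already flat,
% so no call to flatten is needed. Rows start at 4, as in the program.
flat_equations( Lines, PatternName, Eqns ) :-
  phrase( lines_eqns( Lines, PatternName, 4 ), Eqns ).
```

For the two lines "*." and ".*", given as code lists, flat_equations returns [eqn(pat,4,4),eqn(pat,5,5)] directly.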

The built-in predicate "atom_concat"

In convert_life_pattern, atom_concat is a built-in that — as its name suggests — concatenates two atoms. Here is a demonstration:

30 ?- atom_concat( a, b, A ).
A = ab.

31 ?- atom_concat( pulsar, '.exc', A ).
A = 'pulsar.exc'.

32 ?- atom_concat( 'pulsar', '.exc', A ).
A = 'pulsar.exc'.

You see that I may write "pulsar" with or without single quotes. As I explained earlier, the single quotes around atoms can be omitted when they are alphanumeric identifiers beginning with a lower-case letter.
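
Like many Prolog built-ins, atom_concat can also be run "backwards": if the whole atom and one of its parts are known, it works out the other part. The sketch below uses this to strip an extension. The name pattern_name is hypothetical, and the '.dots' extension is simply the one used by the example file.

```prolog
% pattern_name(+InputFile, -PatternName): strip a '.dots' extension.
% Here the second and third arguments of atom_concat/3 are known,
% so it computes the first.
pattern_name( InputFile, PatternName ) :-
  atom_concat( PatternName, '.dots', InputFile ).
```

So pattern_name( 'pulsar.dots', Name ) gives Name = pulsar, and the call fails for a file name that does not end in '.dots'.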

The built-in predicates "tell" and "told"

Also in convert_life_pattern, the predicate "tell" redirects the current output stream so that it sends output to the file named by "tell"'s argument. This will make write_equation write to that file. The predicate "told" closes the file, and switches output back to the terminal. This example uses "write" to output one line, and "format" to output a second:

33 ?- tell(myfile), write('Line 1.\n'), format( 'Line ~w.~n', [2] ), told.
true.

It results in this file:

Line 1.
Line 2.
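
The same file can be written without touching the current output stream, by opening a stream explicitly and passing it to the stream-taking versions of "write" and "format". This is a sketch rather than what the program does; write_two_lines is a hypothetical name, and setup_call_cleanup/3 is an SWI-Prolog built-in used here so that the stream is closed even if writing fails.

```prolog
% write_two_lines(+File): write the same two lines as above, but
% through an explicit stream rather than the current output stream.
write_two_lines( File ) :-
  setup_call_cleanup(
    open( File, write, Stream ),
    ( write( Stream, 'Line 1.\n' ),
      format( Stream, 'Line ~w.~n', [2] )
    ),
    close( Stream ) ).
```

A call such as write_two_lines( myfile ) should leave the same two-line file as the tell and told version.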

read_file_to_lines

Finally, my predicate read_file_to_lines reads a file into a list of lines, where each line is a list of character codes. Do you remember that I showed how the calls below are equivalent? They demonstrate how a double-quoted thing that looks like a string constant is just a list of character codes. That is a very common way to hold text in Prolog:

?- line_to_equations( "*.*", pat, 0,0, Eqns ).
?- line_to_equations( [ 0'*, 0'., 0'* ], pat, 0,0, Eqns ).
?- line_to_equations( [42,46,42], pat, 0,0, Eqns ).

So read_file_to_lines reads a file into a list of lists of character codes. Here's an example:

34 ?- read_file_to_lines( 'pulsar.dots', Lines ).
Lines = [[46, 46, 46, 42, 42, 42, 46, 46|...], [46, 46, 46, 46, 46, 46, 46|...], [46, 42, 46, 46, 46, 46|...], [46, 42, 46, 46, 46|...], [46, 42, 46, 46|...], [46, 46, 46|...], [46, 46|...], [46|...], [...|...]|...].

By the way, the three dots in the above list are SWI-Prolog's way of truncating long output. They stand for the trailing elements of various sublists, which it doesn't want to display for fear of swamping the user.

And this is the source:

read_file_to_lines( File, Lines ) :-
  open( File, read, Stream, [] ),
  read_line_to_codes( Stream, Line ),
  read_stream_to_lines( Line, Stream, Lines ),
  close( Stream ).


read_stream_to_lines( end_of_file, _Stream, [] ) :- !.

read_stream_to_lines( Line, Stream, [ Line | Rest ] ) :-
  read_line_to_codes( Stream, NextLine ),
  read_stream_to_lines( NextLine, Stream, Rest ).

The built-in predicates "open", "close", and "read_line_to_codes"

My predicate read_file_to_lines depends on three built-ins. Of these, the "open" predicate creates a stream connected to an input file; and "close" closes it. In contrast to "tell" and "told", which act on an implicit current output stream that Prolog maintains, "open" and "close" make the stream explicit as shown in the next paragraph.

The other built-in, read_line_to_codes, reads a single line from a stream into a list. This example shows it reading the first line of pulsar.dots from a stream it's made by calling "open".

35 ?- open( 'pulsar.dots', read, Stream, [] ), read_line_to_codes( Stream, Line ).
Stream = '$stream'(1577028),
Line = [46, 46, 46, 42, 42, 42, 46, 46, 46|...].

The end-of-file marker

To explain read_file_to_lines and its auxiliary predicate read_stream_to_lines, I'll show another example. It does the same, but opens an empty file I've made, called empty.dots. Notice that this time, Line becomes not a list, but the atom "end_of_file":

36 ?- open( 'empty.dots', read, Stream, [] ), read_line_to_codes( Stream, Line ).
Stream = '$stream'(1577284),
Line = end_of_file.

How read_file_to_lines works

Armed with this knowledge, we can see how read_file_to_lines works. The key is that read_stream_to_lines gets passed the previous line of input in its first argument. It has to decide whether or not to incorporate this into the result it returns, and also whether to recurse and read more lines. It only does these if the previous line of input is not end-of-file.

Here's a declarative reading. Although we're doing input-output, this, here, seems easier than an imperative one.

The result of reading File is Lines, if:
  opening File gives Stream,
  and the next line from Stream is Line,
  and reading the rest of Stream including Line gives Lines
  (and we have closed Stream).


Reading the rest of Stream including end-of-file gives no lines.

Reading the rest of Stream including Line gives Line followed by Rest, if:
  reading the next line from Stream gives NextLine,
  and reading the rest of Stream including NextLine gives Rest.

Thinking declaratively, the first clause of read_stream_to_lines makes sense, because there can be nothing after an end-of-file. So the list of lines returned must be empty. Thinking imperatively, what this clause is saying is: "We've hit end-of-file, so stop and don't try reading any more".
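
The read-ahead scheme can be tried without any file on disk. The sketch below repeats read_stream_to_lines from the program and feeds it from an in-memory stream. It assumes SWI-Prolog 7 or later, where open_string/2 is available; text_to_lines is a hypothetical name.

```prolog
:- use_module( library(readutil) ).

% Copied from the program above.
read_stream_to_lines( end_of_file, _Stream, [] ) :- !.

read_stream_to_lines( Line, Stream, [ Line | Rest ] ) :-
  read_line_to_codes( Stream, NextLine ),
  read_stream_to_lines( NextLine, Stream, Rest ).


% text_to_lines(+Text, -Lines): like read_file_to_lines, but reading
% from in-memory text instead of a named file.
text_to_lines( Text, Lines ) :-
  setup_call_cleanup(
    open_string( Text, Stream ),
    ( read_line_to_codes( Stream, Line ),
      read_stream_to_lines( Line, Stream, Lines )
    ),
    close( Stream ) ).
```

With the two-line text '*.\n.*\n', Lines becomes [[42,46],[46,42]]. With empty text, the very first read returns end_of_file, the first clause applies at once, and Lines is [].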

This technique, of "reading one ahead of oneself", seems to be used a lot by Prolog programmers. There's no reason why it shouldn't also be used in other languages, but I have a suspicion — perhaps unfounded — that it isn't.