Comments

Next: Errors and efficiency
Up: Information representation
Previous: D
Back: to main list of student notes

Comments

The point of these exercises is to show that the same information can be represented in different ways, and that the representation affects the ease of doing things with it.

In (A), the places are ordered only by their position in the knowledge base. It is actually possible to write next_place and nth_point, but only by an advanced subterfuge which involves building a list of all the points, in the same way that the route-finder does, and then searching along it.

In order to insert a fact about the Body Shop, and preserve order, you have to insert it at the correct position in the knowledge base, leaving all the other clauses in the same relative position. The same goes for deleting a point. This is like a library where the books are catalogued only by their position on the shelf, and not by catalogue numbers written on the spine. When you have to insert a new book, it must go in the correct position, otherwise it will not be properly catalogued. And if there is not already a gap, this means that all the books on one side may have to shoved up to make room. The same is true here. If you want to add a point at run-time using adda or add, you have problems, because they can only add at the start or end of the database. It turns out in standard Prolog that the only way to add a new clause anywhere else is to delete all the clauses above where you want to add, insert the new one, and then replace all the deleted ones.

In (B), each point carries with it the address of its neighbour, and its relative position doesn't matter. Getting from one point to the next is easy, i.e. implementing next is easy:

    next_place( X, Y ) :- point( X, Y ).

If you insert a fact about the Body Shop, or delete one about All Souls, you have to ``tie up the ends'' so that the sequence is not broken. If you don't do this, the new point, even though there, is inaccessible. Some people have suggested that something similar happens when we know we know something, but can't recall it: the memory is there, but we have lost our pointers to it.

Writing nth_point requires moving from one point to the next as many times as necessary: if you can't see how now, you should by the time you've done the LINKS exercises in the final supplement.

This very general idea, of associating with one piece of data the address of the next, is used in computing under the general name of list processing, and we say that one piece of data (e.g. carfax) carries the address of, or points to the next (e.g. brasenose). In the previous supplement you encountered the lists returned from route. These are a very specific instance of list processing. When Prolog displays them they look as though they are a single object, but the computer actually stores them in an analogous way to (B), with one element pointing to the next.

In (C), each point carries with it a numeric tag. If you add or subtract one from the tag, you get the tag for a neighbouring point. Getting straight to the Nth point is much easier than in (A), (B), or (D). This association of numbers with pieces of data, and the quick indexing it allows, is used in computing: we say that the points are held as an array, each subscripted or indexed by a given number. Arrays in other programming languages are stored in an analogous way.

One non-computational example of tagging is Littlewoods. At the meat and cheese counter, you used to have to queue. In 1985, ``because of arguments about who was to be served next'', the system changed so that you had to collect a numbered ticket from a dispenser, and then wait until the number is called. The information about order is now held in the tickets, and not in the arrangement of the queuers. This has been called the difference between parking and marking by Fairthorne in Towards Information Retrieval.

Another is book indexes. In principle, a book could be re-written as a giant index, which would list for every word in the book, the locations in the text at which it appears. This would be much easier to use when searching for a specified word, but not very pleasant to read! Nevertheless, exactly the same information is present in both.

(D) is a tricky one. As in (C), each place has a numeric tag, but the differences between the tags of successive points are not the same. To write next_place(X,Y), you have to find the number for X, and then search for the least number bigger than it, using a method similar to that for nearest_gateway in Supplement 6. nth_point can only be implemented by repeating this process N times.

So the choice of representation is not trivial. Which representation you choose depends very much on how you will use the information.

Next: Errors and efficiency
Up: Information representation
Previous: D
Back: to main list of student notes

Jocelyn Paine
Tue Jun 4 18:10:48 BST 1996