[ Jocelyn Paine's Home Page | Free software | Publications | Algebraic Web specification | Report on Visit to University of Minho, May 2000 ]
-- web2.mod
-- We start with basic definitions for Web pages, URLs
-- and so forth.
module* PAGES
{
-- This module introduces the sort Page, denoting
-- the space of complete Web pages.
-- Other modules will define functions mapping
-- between Page and URLs, filenames, and content.
--
-- We shall probably only use terms of sort Page to
-- represent pages resident on our own server. We
-- can certainly use CafeOBJ to specify the behaviour
-- of pages on any server, but we can only have
-- control over our own (except perhaps via the
-- use of PUT or other methods for writing to remote
-- servers). So when using these modules
-- as components of the algebraic web site generator,
-- it's probably not useful to name pages elsewhere.
[ Page ]
}
module* CONTENT
{
-- This module introduces the notion of a mapping from
-- pages to their contents. We assume that every page's
-- content can be represented as a string. In
-- most cases, this will be HTML.
protecting( STRING )
protecting( PAGES )
op content : Page -> String
}
module* CONTENT'
{
-- This is an alternative version of CONTENT.
-- Though most pages contain HTML, not all do.
-- Each page has a so-called content type or
-- MIME type. A browser will either infer this
-- from the extension in the page's URL (e.g.
-- ".htm" and ".html" indicate HTML; ".gif" indicates
-- the GIF graphical format), or from a
-- 'Content-type' header sent back by the server
-- (e.g. the header "Content-type: text/html"
-- indicates HTML; the header "Content-type: image/gif"
-- indicates GIF). The browser uses the content
-- type to decide how to interpret and display
-- the page's content. E.g. a browser will
-- render the content "<H1>Header</H1>" as a bold
-- heading in some big font if the content type
-- is HTML; but if the content type is "text/plain",
-- the browser will display the HTML H1 tags verbatim
-- rather than interpreting them, and will probably use a
-- monospaced font such as Courier throughout.
--
-- Below, I've defined sort ContentType, and some
-- constants. Other modules might want to add other
-- constants to the sort. Does the web site generator
-- need ContentType? Probably, it's too early to say.
-- Instead of saying that the function 'content' maps
-- all pages to strings, should we define subsorts
-- of String for different content, as I've done below?
-- If so, how do we specify that, e.g. 'content'
-- will only map a page to an HTML if the page's
-- conteot-type is text/html?
protecting( STRING )
protecting( PAGES )
[ ContentType ]
[ Content < String ]
[ GIF < Content ]
[ HTML < Content ]
[ Text < Content ]
ops image/gif text/html text/plain : -> ContentType
op content-type : Page -> ContentType
op content : Page -> String
}
module* URLS
{
-- This module introduces the space of URLs.
-- For the moment, we ignore the internal structure
-- of these (e.g. the way they're divided by "/"
-- separators) and just represent them as strings.
-- In future, we might want not to ignore the
-- structure: perhaps represent each URL as a list of
-- elements, and have a function that converts
-- the list
-- [ e1 e2 ... e_n ]
-- to the string
-- e1 ++ "/" ++ e2 ++ "/" ++ ... ++ e_n.
--
-- We might also want to introduce functions
-- that normalise URLs (e.g. converting
-- "http://site/dir1//dir2" to "http://site/dir1/dir2")
-- and that test whether two URLs name the same page
-- (do this by comparing normalised forms, perhaps).
-- Some transformations (such as the above) can be
-- done without knowledge of the site whose pages the
-- URLs name, I think. Others will be site-dependent
-- (e.g. knowing that "/dir1/" denotes the default
-- page "/dir1/index.htm").
--
-- We do not assume that every element of sort URL
-- actually names a page.
--
-- The subsort SiteAbsoluteUrl is for URLs of pages
-- living on the current server. These are special
-- because the web site generator can create them and
-- know their location in its filestore. They
-- will not contain an "http://" prefix or a site
-- address, and will start with a "/" (e.g.
-- "/dir1/index.htm").
protecting( STRING )
[ URL < String ]
[ SiteAbsoluteURL < URL ]
}
module* PAGE-VS-URL
{
-- This module introduces mappings between the
-- space of pages and of URLs. The function
-- 'page' will be partial, since not every URL
-- actually names a page. 'url' can be total.
--
-- Should we use sort constraints to distinguish
-- URLs that do name pages from those that don't?
-- If so, how, given that the set of pages changes
-- with time?
protecting( PAGES + URLS )
op page : URL -> Page
op url : Page -> URL
}
module* FILENAMES
{
-- This module introduces the spaces of filenames
-- and directory names. We shall need these when
-- mapping URLs to locations in our servers filestore.
-- As with URLs, we might later want to represent the
-- internal structure of filenames rather than just
-- representing them as strings.
--
-- We define two sorts, Filename and DirectoryName.
-- Some operating systems treat directories as
-- special kinds of file, but it's probably cleaner
-- to think of the two as different.
--
-- Question: are these relative or absolute filenames,
-- and do we need to distinguish the two by subsorting?
--
-- I assume that directory names do not end in the
-- directory-separator character. This assumption is
-- important when prepending them to URLs to make
-- filenames.
protecting( STRING )
[ Filename < String ]
[ DirectoryName < String ]
}
module* URL-VS-FILENAME
{
-- This module introduces the idea of a mapping between
-- a page's URL and its location in the server's
-- filestore. Though we can talk about this for
-- any server, we only have control over it on
-- our own, so I've restricted it to SiteAbsoluteURL.
--
-- The mapping from URLs to filenames is total.
-- That from filenames to URLs will be partial,
-- since some files will be outside the region
-- of filestore which the server is allowed to
-- send pages from.
protecting( URLS + FILENAMES )
op filename : SiteAbsoluteURL -> Filename
op url : Filename -> SiteAbsoluteURL
}
module* DOCUMENT-ROOT-MAPPING
{
-- This module describes the most common way
-- of mapping between URLs and filenames. The
-- server's filestore has a 'document root'
-- directory. The server is allowed to make
-- any files in this directory or its subdirectories
-- visible as Web pages. To convert a URL
-- to a filename, it prepends the document
-- root, and then converts all "/" characters in the
-- URL to whatever character its operating
-- system uses as a directory separator (e.g.
-- "/" on Unix, "\" on Windows).
--
-- To describe your own server's document-root mapping,
-- protect this module and give equations defining
-- document-root and directory-separator. document-root
-- should be a directory name, without a trailing
-- directory separator. It should use "/" for directory
-- separators, even if your own system uses "\" or
-- some other character. This is merely because there seems
-- to be a bug in CafeOBJ which makes it pass
-- "\" characters in strings to Lisp incorrectly.
-- directory-separator should be the directory separator:
-- if this is "\", write it as "\\", which despite the
-- bug, does seem to get passed to Lisp OK.
protecting( URL-VS-FILENAME )
op document-root : -> DirectoryName
op directory-separator : -> String
op $substitute : String String -> String
-- Private operator which replaces all "/"'s in its
-- first argument by the second argument. The second
-- argument will be directory-separator, but we need
-- to pass it as a variable: I can't find a way to make
-- Lisp refer to terms other than via CafeOBJ
-- variables.
var U : SiteAbsoluteURL
eq filename( U ) = $substitute( document-root ++ U, directory-separator ) .
eq $substitute( URL:String, Sep:String ) = #! (substitute (char Sep 0) #\/ URL) .
}
-- Next, we consider how Web pages can be grouped into
-- collections. We need some utilities first, and then
-- define a particular kind of collection, the slideshow.
module* HTML-UTILITIES
{
-- This module defines some utilities for generating HTML.
-- We'll need them when linking pages within collections.
-- The <A HREF=_>_</A> operation places a link, as its name
-- suggests.
protecting( PAGE-VS-URL )
op <A HREF=_>_</A> : Page String -> String
op <A HREF=_>_</A> : URL String -> String
eq <A HREF= P:Page > S:String </A> = <A HREF= url(P) > S </A> .
eq <A HREF= U:URL > S:String </A> = "<A HREF=" ++ U ++ ">" ++ S ++ "</A>" . }
module* SLIDESHOW
{
-- This module describes a specific kind of collection of
-- pages, namely a 'slideshow'. This is examplified by the
-- MM presentation on my site at http://www.ifs.org.uk/~popx .
-- It's a sequence of pages intended for linear reading.
-- Each page except the last has a successor, and each page
-- except the first has a predecessor.
--
-- To use this module, define some extra pages p2 ... p_n .
-- Define the ordering with equations for 'succ' such that
-- last = succ(p_n), p_n = succ(p_n-1), ... p2 = succ(first).
-- and for 'pred' such that
-- first = pred(p2), p3 = pred(p2), ... p_n = pred(last).
-- Define the pages' titles with equations for 'title' such that
-- title(p_i) = ...The string between <TITLE> and </TITLE>
-- tags for page p_i. The generator will also
-- use this for the page's heading, placing
-- it betweem <H1> and </H1> tags...
-- Define the pages' bodies with equations for 'body' such that
-- body(p_i) = ...The content for page p_i that comes after
-- the heading...
-- Define the pages' URLs with equations for 'url' such that
-- url(p_i) = ...The site-absolute URL for page p_i...
-- Define the URL for the contents page (see below) with an equation
-- for 'url' such that
-- url(contents-page) = ...The site-absolute URL for the contents page...
--
-- Module CONTENT introduces the function 'content' from
-- a page to its complete contents. SLIDESHOWS defines 'content'
-- in terms of the page's title, body and links. After placing
-- the title in the page's <HEAD> section, it generates a
-- navigation bar linking to the page's successor and precessor
-- (or to one of these if the page is at the start or end). It
-- then reuses the title to generate the heading, as described
-- above. It then places the page's body, following it with
-- another copy of the navigation bar and the closing
-- </BODY> and </HTML> tags. SLIDESHOW also generates a
-- contents page, linking to each page in order. It places
-- a link to this in each navigation bar.
--
-- In real-life use, the user would want more control over
-- location of the navigation bar and whether, e.g. the 'Next'
-- link should appear before or after the 'Last' one, and
-- the words to be used. We could define these by providing
-- operations to construct templates for pages and their parts,
-- terms constructed from which this module interprets. For
-- example,
-- eq nav-bar = "Up" ++ UP-LINK ++
-- " Next " ++ NEXT-LINK ++ " Last " ++ LAST-LINK
-- or
-- eq nav-bar = "Inhoud" ++ UP-LINK ++
-- " Vorige pagina " ++ LAST-LINK ++ " Volgende pagina " ++ NEXT-LINK
-- where the capitalised operations get replaced by appropriate
-- links for each page.
--
-- Question: I haven't declared a sort to represent slideshows.
-- Would that be better?
--
-- Operations whose names begin with "$" are to be treated as
-- private to the module.
protecting( PAGES + CONTENT + PAGE-VS-URL + HTML-UTILITIES )
[ SequencePage < Page ]
[ StartPage < SequencePage ]
[ InteriorPage < SequencePage ]
[ EndPage < SequencePage ]
[ ContentsPage < Page ]
op succ : StartPage -> Page
op succ : InteriorPage -> Page
op pred : InteriorPage -> Page
op pred : EndPage -> Page
op first : -> StartPage
op last : -> EndPage
op body : Page -> String
op title : Page -> String
op contents-page : -> ContentsPage
op url : Page -> SiteAbsoluteURL
-- Specialise 'url' so that 'filename' can be reduced.
op $nav-bar : Page -> String
-- The navigation bar at the top and bottom of each sequence page.
op $contents : -> String
op $contents-from : Page -> String
-- $contents is the list of page titles to be used in the contents.
-- $contents-from(P) is the list from and including P.
eq content( contents-page ) = "<HTML><HEAD>" ++
"<TITLE>Contents</TITLE>" ++
"</HEAD><BODY>" ++
"<H1>Contents</H1>" ++
"<UL>" ++
$contents ++
"</UL>" ++
"</BODY></HTML>" .
eq content( P:SequencePage ) = "<HTML><HEAD>" ++
"<TITLE>" ++ title(P) ++ "</TITLE>" ++
"</HEAD><BODY>" ++
$nav-bar(P) ++
"<HR>" ++
"<H1>" ++ title(P) ++ "</H1>" ++
body(P) ++
"<HR>" ++
$nav-bar(P) ++
"</BODY></HTML>" .
eq $nav-bar( P:StartPage ) = <A HREF= contents-page > "Contents" </A> ++
<A HREF= succ(P) > "Next" </A> .
eq $nav-bar( P:EndPage ) = <A HREF= contents-page > "Contents" </A> ++
<A HREF= pred(P) > "Last" </A> .
eq $nav-bar( P:InteriorPage ) = <A HREF= contents-page > "Contents" </A> ++
<A HREF= succ(P) > "Next" </A> ++
<A HREF= pred(P) > "Last" </A> .
eq $contents = $contents-from( first ) .
eq $contents-from( P:EndPage ) = "<LI>" ++ <A HREF= P > title(P) </A> .
eq $contents-from( P:StartPage ) = "<LI>" ++ <A HREF= P > title(P) </A> ++ $contents-from( succ(P) ) .
eq $contents-from( P:InteriorPage ) = "<LI>" ++ <A HREF= P > title(P) </A> ++ $contents-from( succ(P) ) . }
-- Now we consider how to generate the pages in a collection
-- to file. We'll need utilities such as lists and methods
-- for writing to files first.
mod! LISTS ( ELT :: TRIV ) principal-sort List
{
-- This module defines the space of lists. It was originally
-- written by Razvan Diaconescu as part of the CafeOBJ examples.
-- We'll need it when constructing lists of all the pages
-- defined by a module.
[ NonEmptyList < List ]
op nil : -> List
op cons : Elt List -> NonEmptyList
op cons : Elt -> NonEmptyList
op car : NonEmptyList -> Elt
op cdr : NonEmptyList -> List
op _+_ : List List -> List
vars L L' : List
var E : Elt
eq car(cons(E,L)) = E .
eq cdr(cons(E,L)) = L .
eq nil + L = L .
eq cons(E,L) + L' = cons(E,(L + L')) .
}
make PAGE-LISTS ( LISTS(PAGES) * { sort List -> PageList } ) .
-- This defines the space of lists of pages.
module* WRITE-FILE
{
-- This module calls Lisp to define a 'function'
-- write-string--to-file(S,F) which writes the
-- string S to file F. It does return a value,
-- but we use it only for its side-effect, in WRITE-PAGE.
--
-- I use the subsort Void for the coarity of functions
-- derived from 'write-string-to-file', where it's only
-- the side-effect that's of interest. If I need to write
-- an explicit return value, I use the operation 'void' (or,
-- if in Lisp, an equivalent symbol).
--
-- I don't know exactly what #! does, since there's
-- no documentation about it. Looking at the libraries,
-- there are two ways to invoke Lisp, #! and #!!.
-- The second doesn't work here: I think #! discards
-- the sort tags from variables when passing them
-- to Lisp, whereas #!! keeps them. For 'write-string-to-file',
-- we don't need the tags, only the variables' values
-- as strings.
protecting( STRING )
[ Void < String ]
op write-string-to-file : String String -> Void
op void : -> Void
eq write-string-to-file( Content:String, Filename:String ) =
#! (let ( (
stream (open Filename :direction :output)
)
)
(write-string Content stream)
(close stream)
""
) .
eq void = "" .
}
module* WRITE-PAGE
{
-- This module defines a 'function' whose side-effect is
-- to write a page's content to its file. This only makes
-- sense when applied to pages whose filestore location
-- we know, namely those on our own server. We use it
-- in the Web-page generator below.
protecting( WRITE-FILE + PAGE-VS-URL + URL-VS-FILENAME + CONTENT )
op write-page : Page -> Void
eq write-page( P:Page ) = write-string-to-file( content(P), filename(url(P)) ) .
}
module* PAGE-COLLECTION
{
-- See GENERATOR, below.
protecting( PAGE-LISTS )
op all-pages : -> PageList
}
module* GENERATOR( PAGE-COLLECTION :: PAGE-COLLECTION )
{
-- This module defines the operation 'generate'. This
-- iterates over all pages in a PAGE-COLLECTION, obtaining
-- the content and filename for each, and calling
-- write.WRITE-FILE to write the content to the file.
-- Since we use it for its side-effects only, we
-- give it the coarity Void, as in WRITE-FILE.
--
-- The interface between 'generate' and the PAGE-COLLECTION
-- is via all-pages. Any PAGE-COLLECTION must define
-- all-pages to return a list of all the pages in
-- the collection.
--
-- The code for $generate is less efficient than it
-- need be. Since we don't want its result, there's
-- no need to use ++ to concatenate results together.
-- But I can't think of any other way to force
-- evaluation of both write-page and the recursive
-- call to $generate in the second equation.
protecting( WRITE-PAGE + PAGE-LISTS )
op generate : -> Void
op $generate : PageList -> Void
-- Private operation which writes each page
-- in its argument.
eq generate = $generate( all-pages ) .
eq $generate( nil ) = void .
eq $generate( cons( P:Page, Tail:PageList ) ) =
write-page( P ) ++ $generate( Tail ) .
}
module* SLIDESHOW-PAGE-COLLECTION( SLIDESHOW :: SLIDESHOW )
{
-- This module extends SLIDESHOW to define all-pages.
-- Conceptually, this is not part of the slideshow itself,
-- which is why I define it separately.
protecting( PAGE-COLLECTION )
op $the-sequence-from : Page -> PageList
-- Private operation which returns a list of all pages in
-- the slideshow, starting from its first argument.
eq all-pages = cons( contents-page, $the-sequence-from(first) ) .
eq $the-sequence-from( P:EndPage ) = cons( P, nil ) .
eq $the-sequence-from( P:StartPage ) = cons( P, $the-sequence-from( succ(P) ) ) .
eq $the-sequence-from( P:InteriorPage ) = cons( P, $the-sequence-from( succ(P) ) ) .
}
module! MM-SLIDESHOW
{
-- This module instantiates SLIDESHOW to define
-- a presentation about MM.
protecting( SLIDESHOW )
ops p2 p3 : -> InteriorPage
eq body( first ) = "This is an introduction to MM." .
eq body( p2 ) = "This is more about MM." .
eq body( p3 ) = "This is the third page in the presentation." .
eq body( last ) = "This is the final page." .
eq title( first ) = "Introduction" .
eq title( p2 ) = "Background" .
eq title( p3 ) = "Details" .
eq title( last ) = "Prospects" .
eq url(contents-page) = "/contents.html" .
eq url( first ) = "/first.html" .
eq url( p2 ) = "/p2.html" .
eq url( p3 ) = "/p3.html" .
eq url( last ) = "/last.html" .
eq succ( first ) = p2 .
eq succ( p2 ) = p3 .
eq succ( p3 ) = last .
eq pred( last ) = p3 .
eq pred( p3 ) = p2 .
eq pred( p2 ) = first .
}
module! MY-SERVER
{
-- This module extends DOCUMENT-ROOT-MAPPING to
-- define the directory separator and URL-to-filename
-- mapping used by my server. GENERATOR must have
-- the latter before it can evaluate 'filename'.
protecting( DOCUMENT-ROOT-MAPPING )
eq document-root = "/home/popx" .
eq directory-separator = "/" .
}
make MM-SLIDESHOW-GENERATOR ( GENERATOR(SLIDESHOW-PAGE-COLLECTION(MM-SLIDESHOW)) +
MY-SERVER
)
-- This defines a generator for the MM slideshow, using the
-- URL-to-filename mapping of MY-SERVER. Open the module
-- and reduce 'generate' (assuming that your directory
-- separator is "\" and the directory kb7\mm\cafeobj\ exists),
-- and you will obtain files containing all the pages.
22nd May 2000.
[ Jocelyn Paine's Home Page | Report on Visit to University of Minho, May 2000 ]
[ Jocelyn Paine's Home Page | Free software | Publications | Algebraic Web specification | Report on Visit to University of Minho, May 2000 ]