Compiling Spreadsheets to Prolog

[ Jocelyn Ireson-Paine's Home Page | Publications ]

There's been discussion in some groups recently about compiling spreadsheets into programs in other languages that run stand-alone. Different users want this for different reasons, which include: the compiled code will run on computers that don't have Excel; it can be run on a Web server; it's faster than Excel; it can be called as a subroutine or function from Pascal, Java, or whatever your favourite language is. There do exist packages that will compile spreadsheets: one that comes up from time to time is Visual Baler. But what about compiling into Prolog?

I'm going to tie this up with the topic of modularity. Authors from Herbert Simon through Dijkstra, Knuth, and Hoare onwards have emphasised that modularity is our main tool for handling complexity in programming. See, for example, my The Watchmaker and The Spreadsheet Factory for Simon's parable of how modularity helped the watchmaker, and for my own work on modular spreadsheet programming.

It shouldn't be necessary to mention the authors cited above - unless your name is Microsoft. Any self-respecting programmer has heard of them, and knows how important modularity is. Strange then, that it's a topic almost completely ignored by Microsoft in their relentless stream of updates to Excel.

Or almost ignored. Simon Peyton-Jones has an interesting paper on how to add user-defined functions to Excel. Functions are perhaps the most basic tool for modularity, which means more than just modules. Unfortunately, as he depicts in an accompanying slideshow, against the inertia of Microsoft, his proposal's chances are those of a bee on a battleship.

We can experiment with these ideas and see how we might relate Excel to Prolog.

The wrong way to do it?

First, here's what's probably the wrong way to do it. Look at this spreadsheet. This contains three Prolog facts down the left-hand column -

  man(socrates)
  man(fred)
  mortal(X) :- man(X)
In the next column, it contains a goal,
  :- mortal(X).
Using MM, we read out the three facts and assert them, then extract the goal, and call it. Easy.

Similarly, with a more complicated spreadsheet whose facts are:

  can_follow( birth, research ).
  can_follow( research, getting_published ).
  can_follow( getting_published, nobel_prize ).
  can_follow( research, not_getting_published ).
  can_follow( not_getting_published, losing_job ).
  can_follow( losing_job, no_money ).
  can_follow( no_money, eating_out_of_dustbins ).
  can_follow( no_money, eating_thesis_pages ).
  can_follow( eating_out_of_dustbins, poisoning ).
  can_follow( poisoning, death ).
  can_follow( eating_thesis_pages, choking ).
  can_follow( nobel_prize, lecture_tours ).
  can_follow( choking, death ).
  can_follow( lecture_tours, huge_dinners ).
  can_follow( huge_dinners, cholesterol_overdose ).
  can_follow( cholesterol_overdose, death ).

  life( [birth|Life] ) :- 
    life_from( birth, death, Life ).

  life_from( X, Y, [Y] ) :-
    can_follow( X, Y ).
  life_from( X, Y, [Z|RestLife] ) :-
    can_follow( X, Z ),
    life_from( Z, Y, RestLife ).
and whose goal is:
  [Life]^life(Life).
From this, we get the following Prolog session:
  
  Welcome to SWI-Prolog (Multi-threaded, Version 5.3.14)
  etc...

  1 ?- cd('d:/kb7/mm5').
  Yes

  2 ?- [paths].
  % paths compiled 0.00 sec, 916 bytes
  Yes

  3 ?- [includes].
  %  d:/kb7/mm5/paths.pl compiled 0.00 sec, 340 bytes
  %  grips grips2 compiled into grips2 0.01 sec, 34,620 bytes
  %   higher_order compiled into higher_order 0.01 sec, 3,072 bytes
  %   codegen compiled into codegen 0.00 sec, 1,700 bytes
  %  general(structures) compiled into structures 0.02 sec, 16,740 bytes
  %  general(polymorphism) compiled into polymorphism 0.01 sec, 4,204 bytes
  %  general(strings) compiled into strings 0.01 sec, 6,560 bytes
  %  general(maps) compiled into maps 0.01 sec, 17,608 bytes
  %  general(vectors) compiled into vectors 0.01 sec, 12,156 bytes
  %   gauss compiled into gauss 0.01 sec, 6,260 bytes
  %  general(matrices) compiled into matrices 0.02 sec, 17,552 bytes
  %  sdo(sdo) compiled into sdo 0.01 sec, 7,824 bytes
  %   general(bug) compiled into bug 0.00 sec, 2,032 bytes
  %  sdo(sdo_output) compiled into sdo_output 0.00 sec, 16,784 bytes
  %  sdo(pp) compiled into pp 0.00 sec, 4,332 bytes
  %  mm(cell_refs) compiled into cell_refs 0.01 sec, 6,020 bytes
  %    general(sets) compiled into sets 0.00 sec, 15,024 bytes
  %    mm(intervals) compiled into intervals 0.01 sec, 10,612 bytes
  %    mm(seqs) compiled into seqs 0.00 sec, 6,432 bytes
  %     mm(subscripted_attrs) compiled into subscripted_attrs 0.00 sec, 5,292 bytes
  %    mm(refs) compiled into refs 0.00 sec, 10,804 bytes
  %   mm(bases) compiled into bases 0.02 sec, 52,944 bytes
  %    mm(exprs) compiled into exprs 0.01 sec, 2,264 bytes
  %   mm(transforms) compiled into transforms 0.02 sec, 25,196 bytes
  %    mm(cells) compiled into cells 0.00 sec, 3,764 bytes
  %   mm(arrows) compiled into arrows 0.02 sec, 17,752 bytes
  %  mm(reformattings) compiled into reformattings 0.07 sec, 106,652 bytes
  %   mm(cell_dependencies) compiled into cell_dependencies 0.01 sec, 6,572 bytes
  %  mm(grammars) compiled into grammars 0.02 sec, 18,732 bytes
  %     mm(cell_names) compiled into cell_names 0.00 sec, 6,576 bytes
  %    mm(write_expr) compiled into write_expr 0.01 sec, 15,824 bytes
  %   mm(write_xml) compiled into write_xml 0.02 sec, 23,364 bytes
  %  mm(write_arrow) compiled into write_arrow 0.02 sec, 27,736 bytes
  %   mm(errors) compiled into errors 0.00 sec, 1,008 bytes
  %   mm(ss_lexer) compiled into ss_lexer 0.00 sec, 18,596 bytes
  %   mm(ss_parser) compiled into ss_parser 0.00 sec, 13,812 bytes
  %   mm(read_xml) compiled into read_xml 0.01 sec, 6,384 bytes
  %  mm(read_cells) compiled into read_cells 0.01 sec, 45,680 bytes
  %   mm(equations) compiled into equations 0.00 sec, 7,140 bytes
  %   mm(environments) compiled into environments 0.00 sec, 3,880 bytes
  %   mm(read_arrow) compiled into read_arrow 0.01 sec, 3,380 bytes
  %   mm(write_cells) compiled into write_cells 0.00 sec, 2,056 bytes
  %    mm(list_arrow) compiled into list_arrow 0.00 sec, 1,676 bytes
  %   mm(list_cells) compiled into list_cells 0.00 sec, 4,088 bytes
  %  mm mm_in_prolog compiled into mm_in_prolog 0.01 sec, 40,648 bytes
  % includes compiled 0.25 sec, 385,412 bytes
  Yes

  4 ?- solve_spreadsheet('prc1.xml').
  The solutions for mortal(_G1633) are [[socrates]]

  5 ?- solve_spreadsheet('prc11.xml').
  The solutions for life(_G3609) are [ [ [ birth,
        research,
        getting_published,
        nobel_prize,
        lecture_tours,
        huge_dinners,
        cholesterol_overdose,
        death
      ]
    ],
    [ [ birth,
        research,
        not_getting_published,
        losing_job,
        no_money,
        eating_out_of_dustbins,
        poisoning,
        death
      ]
    ],
    [ [ birth,
        research,
        not_getting_published,
        losing_job,
        no_money,
        eating_thesis_pages,
        choking,
        death
      ]
    ]
  ]

A better way to do it?

Now here's a different way to try it. This spreadsheet contains three function definitions. Each definition contains in its left hand corner, a specification of the function's name and arity, followed in the same column by a list of formals. The result is the final entry in this list. In the rest of the section, we have the function body. The result can be calculated directly from the formals, but as in f, can also call on any other cell, providing that all cells in the dependency chain are defined. With this, we get the output

  ?- prc2 becomes load(`'prc2.xml').

  ?- show( mm val prc2 ).
  object( 'Sheet1'::vector(0,0)..vector(3,19)
        ) 
        with Sheet1[0,0] = "Simple function: adds its arguments."
        with Sheet1[0,6] = "Function with intermediate expressions"
        with Sheet1[0,12] = "A resatisfiable function"
        with Sheet1[1,0] = "add/3"
        with Sheet1[1,1] = 1
        with Sheet1[1,2] = 2
        with Sheet1[1,3] = Sheet1[1,1]+Sheet1[1,2]
        with Sheet1[1,6] = "f/3"
        with Sheet1[1,7] = 1
        with Sheet1[1,8] = 2
        with Sheet1[1,9] = SIN(Sheet1[3,7])
        with Sheet1[1,12] = "within/3"
        with Sheet1[1,15] = between(Sheet1[1,13], Sheet1[1,14])
        with Sheet1[1,19] = ">1"
        with Sheet1[2,7] = Sheet1[1,7]*3
        with Sheet1[2,8] = Sheet1[1,8]*5
        with Sheet1[3,7] = Sheet1[2,7]/Sheet1[2,8].

  ?- compile_to_prolog_and_assert( mm val prc1, mm 'Sheet1'$b1 ).
  asserted add(_G4477, _G4493, _G4735):-_G4735 is _G4477+_G4493

  ?- add(1,2,X).
  X = 3 

  ?- compile_to_prolog_and_assert( mm val prc1, mm 'Sheet1'$b7 ).
  asserted f(_G4471, _G4487, _G5185):-_G5116 is _G4471*3/ (_G4487*5), SIN(_G5116, _G5185)

  134 ?- f(1,2,X).
  X = 0.29552 

  ?- listing(f).
  f(A, B, C) :-
	D is A*3/ (B*5),
	'SIN'(D, C).

  ?- listing('SIN').
  'SIN'(A, B) :-
	B is sin(A).

  ?- compile_to_prolog( mm val prc1, mm 'Sheet1'$b7, Defn ).
  Defn = f(_G4465, _G4481, _G5179):-_G5110 is _G4465*3/ (_G4481*5), 'SIN'(_G5110, _G5179)
  
  ?- compile_to_prolog( mm val prc1, mm 'Sheet1'$b7, Defn ), numbervars(Defn,1,_).
  Defn = f(B, C, D):-E is B*3/ (C*5), 'SIN'(E, D) ;;
 
  ?- compile_to_prolog_and_assert( mm val prc1, mm 'Sheet1'$b13 ).
  asserted within(_G4498, _G4514, _G4750):-between(_G4498, _G4514, _G4750)

  ?- within(1,3,X).
  X = 1 ;;
  X = 2 ;;
  X = 3 ;;

[ Jocelyn Ireson-Paine's Home Page | Publications ]