December 2010 Archives

In this series of postings, I'm going to explain how I modularise Excel spreadsheets. That is, how I break them into parts that can be separately documented, tested, and debugged; and how I share the parts between different spreadsheets, thus easing code reuse. My research on spreadsheet modularity takes ideas from a mathematical discipline called category theory, as well as from the work of Joseph Goguen and colleagues about how to modularise algebraic specification languages. This isn't as forbidding as it may sound, because much of it boils down to simple notions of mapping, text inclusion, and replacement of identifiers. But one important point is that I represent spreadsheets as text rather than graphically, coding them in a programming language that resembles the reverse-engineered listings I displayed in my recent posting How to Reveal Implicit Structure in Spreadsheets. To turn the text into spreadsheets, I use a tool called Excelsior. I've made Excelsior freely available, and to start the series, I'll show you how to install it for Windows.

First, download Excelsior from my Spreadsheet Parts download page and unzip it into a clean directory. Hereon, I'll assume you've named this c:\excelsior\ . The zip file contains several subdirectories: make sure your unzip command preserves their structure, and doesn't flatten them into the top-level directory.

You must now set three environment variables. I ask you to do so manually, rather than via an installer; and I don't change the Windows Registry. That way, you know exactly what the install will do to your computer.

This is how I set environment variables in Windows. If you already know how to set them, skip the rest of the paragraph. I click on Start, then on Control Panel halfway down the righthand column in the resulting pop-up. This opens a window with icons labelled "Accessibility Options", "Add Hardware", and so on: I am viewing it in what Microsoft call Classic View, rather than Category View. I double-click on the System icon, then on the Advanced tab in the window this opens, and then on the Environment Variables button near the bottom. This makes another pop-up, with two fields headed "User Variables" and "System variables". I click the "New" button below the "System variables" field, and a pop-up appears with a "Variable name" and a "Variable value" field. Then I type (e.g.) EXCEL_HOME into the "Variable name" field, and (e.g.) c:\Program Files\Microsoft Office\Office11 for "Variable value". I then click "OK", and then either add another variable, or click "OK" on the pop-up below. And that's it.

So now, set the first environment variable EXCEL_HOME to the directory your Excel is in. On my laptop, that's c:\Program Files\Microsoft Office\Office11 . Every time Excelsior successfully compiles a program, it will write out an intermediate file containing the formulae that make up the spreadsheet to be generated from it. It then runs excel.exe from EXCEL_HOME, passing it the name "interface.xls" as a command-line argument. This starts Excel and makes it open the file interface.xls which comes with Excelsior. A VBA macro in interface.xls then reads the intermediate file and copies it into a new spreadsheet, which will magically appear before you.

Second, set EXCELSIOR_HOME to the directory you unzipped into, namely c:\excelsior\ . Excelsior will look in here to find interface.xls and other files.

Third, set EXCELSIOR_TEMP to a scratch directory where Excelsior can put files such as the intermediate formula file. I use c:\windows\temp\ .

Having set these variables, you are ready to test Excelsior. Open a new MS-DOS window, and change directory to c:\excelsior\ . Then type the command

excelsior

You should get a summary of Excelsior's command-line syntax. If it doesn't work, perhaps the unzip has failed to copy some of the files, possibly DLLs that Excelsior uses.

But if it does work, you are now ready to compile a test spreadsheet. Type this:

excelsior -t c:\excelsior\template.xlt source_tests\test1.exc
This should display the messages:
Compiling source_tests\test1.exc.
Compiled source_tests\test1.exc.

Excelsior will then launch Excel as I've described. Excel will run its VBA macro to copy the formula, and may ask you to enable macros (because of the VBA code in interface.xls). There'll probably be a bit of delay caused by those irritating virus scans. And then your new spreadsheet should appear in front of you. It will contain the number 2000 in cell A1. In the rest of the series, I'll show you how to build spreadsheets that are more interesting, and how to use modules.


For a conference, I once drew a cartoon about spreadsheets. It showed a City gent staring despairingly at an Excel formula while behind him, Excel developers in Canary Wharf fretted. Like this:

In the cartoon, I'd assumed it was his own spreadsheet, but it could have been someone else's "legacy" spreadsheet that he was trying to decipher — and that's what I'm going to write about. My mind has been running on the topic since a colleague asked me to explain my work on what I call "structure discovery": making explicit the things an author knows about his spreadsheet but didn't write down.

To demonstrate, I've written a tiny cash-flow spreadsheet, pictured below. Its formulae are realistic, but I've reduced the number of rows and simplified the currency and date formats so that it doesn't take up too much space on the screen:

The first thing one needs is a listing of the spreadsheet's formulae. Oddly, Excel doesn't provide this; but below, I do so by reading the spreadsheet into a reverse-engineering tool I've developed. This parses the Excel file, extracts the formulae, associates them with their cells, and lists them. That's something that will be useful again later on, when I go on to transform the formulae to make them easier to read. But to start with, here are the formulae without any processing:

Cells
  Sheet1!A1:F12
where:
  Sheet1!B1 = "Simple cash-flow spreadsheet"
  Sheet1!B2 = "Demonstrates Jocelyn Ireson-Paine's reverse-engineering software"
  Sheet1!F4 = "Opening cash"
  Sheet1!F5 = 1000
  Sheet1!B7 = "Month"
  Sheet1!C7 = "Income"
  Sheet1!D7 = "Expenses"
  Sheet1!E7 = "Net profit"
  Sheet1!F7 = "Cash at start of period"
  Sheet1!B8 = 1
  Sheet1!C8 = 120
  Sheet1!D8 = 27
  Sheet1!E8 = Sheet1!C8-Sheet1!D8
  Sheet1!F8 = Sheet1!F5
  Sheet1!B9 = 2
  Sheet1!C9 = 120
  Sheet1!D9 = 41
  Sheet1!E9 = Sheet1!C9-Sheet1!D9
  Sheet1!F9 = Sheet1!F8+Sheet1!E8
  Sheet1!B10 = 3
  Sheet1!C10 = 150
  Sheet1!D10 = 64
  Sheet1!E10 = Sheet1!C10-Sheet1!D10
  Sheet1!F10 = Sheet1!F9+Sheet1!E9
  Sheet1!A12 = "Totals"
  Sheet1!C12 = SUM(Sheet1!C8:Sheet1!C10)
  Sheet1!D12 = SUM(Sheet1!D8:Sheet1!D10)
  Sheet1!E12 = SUM(Sheet1!E8:Sheet1!E10)

Having the formulae in a single listing I can print is nice, but it would be even better if the cell addresses were replaced by meaningful names. So suppose again that the cash-flow spreadsheet is someone else's that I'm trying to decipher, and suppose also that from its layout, cell colouring, and labels, I've guessed which cells belong together. For example, suppose I've guessed that cells B8:B10 are successive months, and that C8:C10 are successive incomes. I could summarise my guesses like this:

'Sheet1'!F5      is opening_cash.
'Sheet1'!B8:B10  is months.
'Sheet1'!C8:C10  is income.
'Sheet1'!D8:D10  is expenses.
'Sheet1'!E8:E10  is net_profit.
'Sheet1'!F8:F10  is cash_at_start_of_period.
'Sheet1'!D12     is total_expenses.
'Sheet1'!C12:C12 is total_income.
'Sheet1'!E12:E12 iso total_profit.

Equipped with such a mapping from addresses to names, one could then regard each group of cells as an array. For example, B8:B10 would consist of the elements months[1], months[2] and months[3]. Similarly, C8:C10 would consist of income[1], income[2], and income[3]. It's then fairly simple to rewrite each formula, replacing cell addresses in it by named array elements. I've implemented such transformations in my reverse-engineering software, and below, apply them to the cash-flow spreadsheet. Notice how much more intelligible it becomes:

Cells
  total_profit[1]
  total_expenses[1]
  total_income[1]
  Sheet1!A1:F12
  cash_at_start_of_period[1:3]
  net_profit[1:3]
  expenses[1:3]
  income[1:3]
  months[1:3]
  opening_cash[1]
where:
  Sheet1!B1 = "Simple cash-flow spreadsheet"
  Sheet1!B2 = "Demonstrates Jocelyn Ireson-Paine's reverse-engineering software"
  Sheet1!F4 = "Opening cash"
  Sheet1!B7 = "Month"
  Sheet1!C7 = "Income"
  Sheet1!D7 = "Expenses"
  Sheet1!E7 = "Net profit"
  Sheet1!F7 = "Cash at start of period"
  Sheet1!A12 = "Totals"
  cash_at_start_of_period[1] = opening_cash[1]
  cash_at_start_of_period[2] = cash_at_start_of_period[1]+net_profit[1]
  cash_at_start_of_period[3] = cash_at_start_of_period[2]+net_profit[2]
  expenses[1] = 27
  expenses[2] = 41
  expenses[3] = 64
  income[1] = 120
  income[2] = 120
  income[3] = 150
  months[1] = 1
  months[2] = 2
  months[3] = 3
  net_profit[1] = income[1]-expenses[1]
  net_profit[2] = income[2]-expenses[2]
  net_profit[3] = income[3]-expenses[3]
  opening_cash[1] = 1000
  total_expenses[1] = SUM(expenses[1]:expenses[3])
  total_income[1] = SUM(income[1]:income[3])
  total_profit[1] = SUM(net_profit[1]:net_profit[3])

Although such a listing is more intelligible than the original formulae, it can become tedious and very long if the spreadsheet has lots of repeated formulae such as those for net_profit. To overcome this, I've implemented a "run detector". This compares each formula with the ones next to it, looking to see whether it's either identical or differs by 1 in a cell address or constant. When it finds a run of two or more formulae that are thus related, it summarises them by factoring out the common parts and replacing these by a variable, written here as V0. Here's output from the run-detector applied to the cash-flow spreadsheet:

total_profit[1] = SUM(net_profit[1]:net_profit[3]).
total_expenses[1] = SUM(expenses[1]:expenses[3]).
total_income[1] = SUM(income[1]:income[3]).
Sheet1!A12 = "Totals".
Sheet1!B1 = "Simple cash-flow spreadsheet".
Sheet1!B2 = "Demonstrates Jocelyn Ireson-Paine's reverse-engineering software".
Sheet1!B7 = "Month".
Sheet1!C7 = "Income".
Sheet1!D7 = "Expenses".
Sheet1!E7 = "Net profit".
Sheet1!F4 = "Opening cash".
Sheet1!F7 = "Cash at start of period".
cash_at_start_of_period[1] = opening_cash[1].
cash_at_start_of_period[V0>1] = cash_at_start_of_period[V0-1]+net_profit[V0-1].
net_profit[V0] = income[V0]-expenses[V0].
expenses[1] = 27.
expenses[2] = 41.
expenses[3] = 64.
income[V0<3] = 120.
income[3] = 150.
months[V0] = V0.
opening_cash[1] = 1000.

In case this isn't clear, the equation months[V0] = V0 means that for all V0 within the bounds of the months table, the V0 month is V0. That is, the first element of the months table is 1, the second is 2, and so on. Similarly, the equations for cash_at_start_of_period say that the first element of cash_at_start_of_period is equal to opening_cash, and that each subsequent element is the value for the previous month plus that month's net profit. This, by the way, is a common pattern in spreadsheets: an initialisation formula followed by a run of formulae that pass information from one row to the next. In effect, it's recursion, but over space rather than time.

The run-detector also works on "raw" spreadsheet listings: those without name substitutions. Here it is, applied again to the cash-flow spreadsheet:

Sheet1!A12 = "Totals".
Sheet1!B1 = "Simple cash-flow spreadsheet".
Sheet1!B2 = "Demonstrates Jocelyn Ireson-Paine's reverse-engineering software".
Sheet1!B7 = "Month".
Sheet1!B[V1 in 8:10] = V1-7.
Sheet1!C7 = "Income".
Sheet1!C[8:9] = 120.
Sheet1!C10 = 150.
Sheet1![V0 in C:E]12 = SUM(Sheet1![V0]8:Sheet1![V0]10).
Sheet1!D7 = "Expenses".
Sheet1!D8 = 27.
Sheet1!D9 = 41.
Sheet1!D10 = 64.
Sheet1!E7 = "Net profit".
Sheet1!E[V1 in 8:10] = Sheet1!C[V1]-Sheet1!D[V1].
Sheet1!F4 = "Opening cash".
Sheet1!F5 = 1000.
Sheet1!F7 = "Cash at start of period".
Sheet1!F8 = Sheet1!F5.
Sheet1!F[V1 in 9:10] = Sheet1!F[V1-1]+Sheet1!E[V1-1].

I had to invent a notation for variables that range over row and column numbers. This is illustrated by the line Sheet1!B[V1 in 8:10] = V1-7. This says that Sheet1!B8 = 1, and that Sheet1!B9 = 2, and so on. Similarly, the left-hand side Sheet1![V0 in C:E]12 says that V0 ranges over column addresses C to E. And on its right-hand side, the subexpression Sheet1![V0]8 denotes Sheet1!C8 or Sheet1!D8 or Sheet1!E8.

Such a listing is useful because it helps discover runs of related formulae, and hence determine which cells are part of the same group. For example, it refers repeatedly to rows 8:10 and 9:10. If we didn't already know that the cells in these rows of each column were related, the listing would be a good hint. That isn't actually necessary in my cash-flow example, where the grouping is obvious by eyeballing the spreadsheet, but it can be very useful. I once worked on a 10,000-cell 200-sheet monster of a housing-finance spreadsheet which contained some 60 20×40 interrelated tables. The run-detector was invaluable in showing which formulae were related. 60 × 20×40 is 48,000 cells; I was able to reduce these to about 120 equations. I say 120, because each table tended to be defined by two sets of formulae: one set to initialise its first row, and one to progress down subsequent rows, in a similar way to the recursion mentioned above. The formulae within each set differed only by column address, and hence the run-detector could summarise them into one equation.

Finally, here is a spreadsheet that I've seeded with errors:

This spreadsheet looks like the one I showed earlier, but Excel is not happy with it, which it signals with the little green triangles in cells E9 and D12. It displays these when it thinks a formula is out of step with those around it. To find out what's wrong, I listed the formulae with name substitutions again; and three errors become apparent. The value for cash_at_start_of_period[2] is a hard-wired constant rather than a formula. The formula for net_profit[2] subtracts from income[1] and not income[2], an error hard to see by inspecting the spreadsheet because both cells have the same value. And the SUM making up total_expenses[1] has run off the end of its range:

Cells
  total_profit[1]
  total_expenses[1]
  total_income[1]
  Sheet1!A1:F12
  cash_at_start_of_period[1:3]
  net_profit[1:3]
  expenses[1:3]
  income[1:3]
  months[1:3]
  opening_cash[1]
where:
  Sheet1!B1 = "Simple cash-flow spreadsheet"
  Sheet1!B2 = "Demonstrates Jocelyn Ireson-Paine's reverse-engineering software"
  Sheet1!F4 = "Opening cash"
  Sheet1!B7 = "Month"
  Sheet1!C7 = "Income"
  Sheet1!D7 = "Expenses"
  Sheet1!E7 = "Net profit"
  Sheet1!F7 = "Cash at start of period"
  Sheet1!A12 = "Totals"
  cash_at_start_of_period[1] = opening_cash[1]
  cash_at_start_of_period[2] = 1093 Error
  cash_at_start_of_period[3] = cash_at_start_of_period[2]+net_profit[2]
  expenses[1] = 27
  expenses[2] = 41
  expenses[3] = 64
  income[1] = 120
  income[2] = 120
  income[3] = 150
  months[1] = 1
  months[2] = 2
  months[3] = 3
  net_profit[1] = income[1]-expenses[1]
  net_profit[2] = income[1]-expenses[2] Error
  net_profit[3] = income[3]-expenses[3]
  opening_cash[1] = 1000
  total_expenses[1] = SUM(expenses[1]:Sheet1!D11) Error
  total_income[1] = SUM(income[1]:income[3])
  total_profit[1] = SUM(net_profit[1]:net_profit[3])

In bigger spreadsheets, the "out-of-step" errors would also become evident when I list by runs. The spreadsheet here doesn't really have enough related formulae for this to happen. However, the following extract from a by-runs listing does show that something is amiss with net_profit:

net_profit[V0<3] = income[1]-expenses[V0].
net_profit[3] = income[3]-expenses[3].

All three kinds of error, by the way, have occurred in real spreadsheets. I've got repeated formulae wrong in my own spreadsheets, though I've always caught the error before the spreadsheets went live. I seem to remember horror stories of naïve spreadsheeters overtyping a formula with the number it has calculated, to "save the computer work"; and other anecdotes of the same being done to defraud. And errors in SUM ranges are notorious. SF author and critic David Langford once received only half the royalties he was due because a SUM in his publisher's royalties spreadsheet missed some cells. And the European Spreadsheet Risks Interest Group has a horror story (search for "$38K spreadsheet error") of a hospital's Medicare bad-debts spreadsheet whose SUM ranges were misprogrammed. Because of this, the hospital claimed $38,240 more to compensate it for bad debts than it should have.

Making existing spreadsheets easier to understand is one important aspect of spreadsheet safety, and is the main point of this posting. But it's also important to avoid bugs when writing new spreadsheets. I've worked on this too, and I'll finish with a summary of the work. It's based on a language I've developed called Excelsior, which can compile listings like those above into spreadsheets. This means that you can write meaningful identifiers rather than cell addresses. Separate layout statements in the program specify how these identifiers map to cells.

Excelsior also lets you write reuseable modules that can be shared between different spreadsheets. In other words, code reuse. That's something lacking from Excel, where to share code between spreadsheets, you have to copy cells from one spreadsheet and paste into another. And then, of course, you end up having to maintain several copies of your formulae.

The following links give more information on this work. In How to Document a Spreadsheet: an Exercise with Cash-Flow and Loans, I show how to develop, in Excelsior, a spreadsheet that represents several loans, and for each month, decides which is best to borrow from. The algorithm is fairly simple, but it's subtle, and it would be tricky to get right if coding directly in Excel. Having the formulae all available in front of you, with sensible cell names, makes it much easier.

A much longer exercise of the same kind is Gliders, Hasslers, and the Toadsucker: Writing and Explaining a Structured Excel Life Game. In it, I show how to develop a spreadsheet that implements a well-known cellular automaton, John Conway's game of Life. I wrote my article for maths teachers who might want to introduce secondary-school students to Life. Having the primary source code be an Excelsior listing, rather than a spreadsheet, makes it much easier to show students the equations that update the cellular automaton.

Rather different is the science-fiction generator at Ireson-Paine Spreadsheet Parts Repository Excel science-fiction generator. I based this on a jokey diagram in an old SF anthology, Gahan Wilson's The Science Fiction Horror Movie Pocket Computer from Harry Harrison and Brian Aldiss's The Year's Best Science Fiction No. 5. It's a familiar idea: a network of linked boxes where each box contains a plot event such as "Earth falls into the Sun" or "And so everybody dies". The links between boxes indicate which events can follow which. My spreadsheet implements this as a transition graph, which it recurses over to generate a story. I wrote the spreadsheet entirely in Excelsior; the recursion would have been very tricky to program if using only Excel.

Those examples use the idea that one can avoid bugs by generating spreadsheets from easily readable programs. Another way to avoid bugs is to build spreadsheets from prefabricated components. That's demonstrated in Spreadsheet Components, Google Spreadsheets, and Code Reuse, where I apply it to Google's online spreadsheet, enabling users to load library formulae into it in the same way they can load charts. The use of modules is an important part of spreadsheet safety, but much neglected.


Plain English

| No Comments | No TrackBacks

In some New Statesman literary competition, I once read the following quatrain:

I love my Kellog's All-Bran,
It fills me with elation;
It's quick and easy and ensures
diurnal defecation.
For those who don't know, I should explain that: All-Bran is a cereal that looks like twigs, designed to deter constipation; Kellogg's is a very very rich cereal company. And was founded, apparently, as the Battle Creek Toasted Corn Flake Company in 1906, by Will Keith Kellogg, being an outgrowth of his work with his brother John Harvey Kellogg at the Battle Creek Sanitarium following practices based on the Seventh-day Adventists. Thank you, Wikipedia.

That All-Bran poem was a submission to a competition that asked readers to praise some household product in the same way that the following quatrain praises Carnation evaporated milk:

Carnation Milk is the best in the land;
Here I sit with a can in my hand —
No tits to pull, no hay to pitch,
You just punch a hole in the son-of-a-bitch.

There are several versions of the Carnation poem. Here's another:

Carnation Milk is best in the land;
it comes in a little red-and-white can.
No tits to pull, no hay to pitch;
just poke a hole in the son-of-a-bitch.
That one, I found at Snopes.com's Carnation Slogan Contest, from their collection of pages which explain the history of famous advertising promotions, and debunk some urban legends about them.

I think I first came across the poem in David Ogilvy's book Confessions of an Advertising Man. He writes:

Unless you have some special reason to be solemn and pretentious, write your copy in the colloquial language which your customers use in everyday conversation. I have never acquired a sufficiently good ear for vernacular American to write it, but I admire copywriters who can pull it off, as in this unpublished pearl from a dairy farmer.

The Marketing Sleuth agree. In their blog posting The Best Language For Writing Copy, they say:

Don't say things like —
  • engage next-generation functionalities, or
  • enhance killer web-readiness, or
  • monetize visionary technologies, or
  • leveraging webinars to sell services
Or any bullshit like that. No one wants to know how clever you are. They just want to know what your product will do for him or her. And they want details in language they can understand.

But I must go. I'm just off to enhance my killer Java-hacking-readiness by demonetizing my wallet and leveraging caffeine into my bloodstream from a cup belonging to my local Summertown Costa. It shall engage my neurotransmitters, implementing within me an entire next-generation level of coding functionality. As my American friend Pat Nicholls used to say, bullcrap. And the bull didn't even need to eat All-Bran.


Plain Language

| No Comments | No TrackBacks

This morning, I was berating my brain. If I were supple enough, I'd have given it a good kick. Because it couldn't remember how to open and print to a Unicode file in Java. (You create a new FileOutputStream pointing at the file, then create an OutputStreamWriter from the FileOutputStream and the name of your encoding, and then create a PrintWriter from the OutputStreamWriter. While doing this, you have to keep in mind the difference between the "write" methods, which most output streams can execute, and the "print" methods, which fewer can. I don't know why.)

But then I decided that it isn't my brain that's at fault; it's Java. Because look how Java makes you define an associative table:

static {
  defaultPorts.put( "http" , new Integer(80) );
  defaultPorts.put( "shttp", new Integer(80) );
  defaultPorts.put( "https", new Integer(443) );   // ...
  UsesGenericSyntax.put( "http" , Boolean.TRUE );
  UsesGenericSyntax.put( "shttp" , Boolean.TRUE );
  UsesGenericSyntax.put( "https" , Boolean.TRUE ); // ...
}

Which is ludicrous, insane, crazy, preposterous, risible, ridiculous, farcical and absurd, as well as every other synonym you can think of. An associative table is a mapping, and a mapping is a function, and function is the most fundamental notion in mathematics. Also the simplest. Functions are sets of ordered pairs, so any decent language would let you write them using a notation whose primitives represent the primitive parts of a set of ordered pairs, clearly arrangeable to describe the whole.

It's analogous, or should be, to the control knobs on my cooker. The layout of the knobs is a scaled-down picture of the layout of the hotplates, and the set of possible rotations of each knob is a curved picture of the set of its hotplate's possible temperatures. The parts of the controls match the parts of what they control, and the arrangement of these controls matches the arrangement of the things they control. Except if I program in Java, when my cooker controls would look like this:

static {
  CookerControlPanel cp = 
    my_cooker.addCookerControlPanel( new CookerControlPanel("CookMaster CP141","UK version") );
  cp.setLanguage( "English" );
  cp.getLabelTextGenerator().setTextReadingDirection( TextReadingDirections.LEFT_TO_RIGHT );
  CookerHotplateControlKnobPosition front_left_pos, front_right_pos, back_left_pos, back_right_pos;
  front_left_pos = new CookerHotplateControlKnobPosition( new Integer(0), new Integer(0) );
  front_right_pos = new CookerHotplateControlKnobPosition( new Integer(6), new Integer(0) );
  back_left_pos = new CookerHotplateControlKnobPosition( new Integer(0), new Integer(6) );
  back_right_pos = new CookerHotplateControlKnobPosition( new Integer(6), new Integer(6) );
  CookerHotplateControlKnob front_left_knob = new CookerHotplateControlKnob( front_left_pos );
  front_left_knob.addLabel( "Front left" );
  CookerHotplateControlKnobTemperatureScale front_left_knob_temperature_scale = 
    new CookerHotplateControlKnobTemperatureScale( CookerHotplateControlKnobTemperatureScaleBaseScales.FAHRENHEIT, new Integer(50), new Integer(350) );
  front_left_knob.addTemperatureScale( front_left_knob_temperature_scale );
}
And on and on and on, until your brain turns to toast.

Moreover — returning to the associative table — why must I tell Java that 80 is an integer? Does it know an 80 that isn't? Why must I tell it that TRUE is a Boolean? Does it know TRUEs that aren't? Perhaps they're TRUES used by visitors from another universe whose logic is different from ours, so that when they come here, they have to switch. Like an internationalisation-resource-bundle, except that you carry with you not your time-zone and alphabet, but your universe's truth values.

I copied the associative-table code from a video called "Public Static Void". This was one of the keynote speeches at OSCON 2010, the O'Reilly Open Source Convention, and is by Rob Pike, co-designer of Google's Go programming language. Pike doesn't like the verbosity of Java and C++, and gives examples that demonstrate it, as well as ridiculing the spurious packaging around 80 and TRUE. At the beginning of the video, he quotes Lisp expert Dick Gabriel:

I'm always delighted by the light touch and stillness of early programming languages. Not much text; a lot gets done. Old programs read like quiet conversations between a well-spoken research worker and a well-studied mechanical colleague, not as a debate with a compiler. Who'd have guessed sophistication bought such noise?

I'm sure Gabriel was thinking of Lisp. And maybe also of APL, Algol 60, BCPL, Comit, Simula 67, Snobol, and even Fortran. But then, perhaps somewhere in the years that began with Pascal and Modula and Clu, and ended with C++ and Java, something went awry. What? And why?


I just found a copy of my favourite Asimov online. It was written half a century ago, but it is still the ultimate computer story: The Last Question.


Sketch of man using mobile phone, sayiong loudly 'HEY SIMON, IT'S JOHN ROGERS. I'm ON THE TRAIN AND EVERYONE KNOWS IT. WHAT IS THE AGREED EVALUATION FEE?'

I met this character on the train back from Paddington. When he saw me making a fair copy of the sketch by holding it up against the window to trace, he stalked over and hissed "Lose the name! Lose the name!" But really, he had already had a very easy way to not let me know his name. Cut out this drawing, glue onto card, and present to your next mobile-phone boor.


About this Archive

This page is an archive of entries from December 2010 listed from newest to oldest.

November 2010 is the previous archive.

January 2011 is the next archive.

Find recent content on the main index or look in the archives to find all content.