[ Jocelyn Ireson-Paine's Home Page | Publications | Dobbs Code Talk Index | Dobbs Blog Version ]

I C BB 2 E

I hate identifiers.

A bad dream last night dragged me back to a project I took over some years ago, a Web shop bungled by its original developers. Their code implemented both shopping baskets and the products customers put into them as arrays, using numeric constants as field selectors. So when reading it, you had to know that prod[1] was a product's name and prod[2] its price. If bask is a shopping basket, you had to know bask[1] to be the total price of its contents, bask[2][N] the N'th product therein, and bask[3] the delivery charge. (The developers dropped out of touch and out of Google. Perhaps they're farming pigs.)

To help myself understand the mess, I coded some access functions. Thus product_to_name(prod) returned the product name prod[1]; basket_to_total_price(bask) returned bask[1]; and basket_to_product(bask,N) returned bask[2][N]. I began to bring order to the arrays amidst the disarray.
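A minimal sketch of that layout and those access functions, in Python purely for illustration (the shop's real language isn't named here). The sample products and prices are made up, and index 0 is left as an unused placeholder so that the numbers line up with prod[1], bask[2][N] and friends.

```python
# Hypothetical sample data in the layout described above.
# Index 0 is an unused placeholder so the indices match the text
# (prod[1], prod[2], bask[1], bask[2][N], bask[3]). Prices are in pence.
kattomeat = [None, "Kattomeat", 85]
kitty_litter = [None, "Kitty Litter", 450]
bask = [None, 535, [None, kattomeat, kitty_litter], 200]


def product_to_name(prod):
    """Return the product's name, stored at prod[1]."""
    return prod[1]


def product_to_price(prod):
    """Return the product's price, stored at prod[2]."""
    return prod[2]


def basket_to_total_price(bask):
    """Return the total price of the basket's contents, stored at bask[1]."""
    return bask[1]


def basket_to_product(bask, n):
    """Return the n'th product in the basket, stored at bask[2][n]."""
    return bask[2][n]


print(product_to_name(basket_to_product(bask, 2)))  # Kitty Litter
print(basket_to_total_price(bask))                  # 535
```

The magic numbers still exist, but each now lives in exactly one place.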

But the (ex-) developers had written huge tracts of code for outputting baskets in copious styles: as bulleted lists, as tables, as PDFs for invoices, as transaction logs. Code sharing had not lain within their world view. The system went on and on selecting products from baskets then printing their prices; and I continually found myself having to type product_to_price(basket_to_product(bask,N)).

Should I therefore define this as a new function? But what to name it? The language had overloading, so should I reuse product_to_price, as in product_to_price(bask,N)? Or name it basket_to_product_price? Or basketed_product_to_price? Or product_in_basket_to_price? I sensed a returning listlessness, the familiar onset of Naming Fatigue.
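For concreteness, here is one of those candidates sketched in Python, which has no overloading, so the new function needs a name of its own. The choice of basket_to_product_price is arbitrary, the data is hypothetical, and index 0 is again an unused placeholder so the indices match the text.

```python
def product_to_price(prod):
    """Return the product's price, stored at prod[2]."""
    return prod[2]


def basket_to_product(bask, n):
    """Return the n'th product in the basket, stored at bask[2][n]."""
    return bask[2][n]


def basket_to_product_price(bask, n):
    """Price of the n'th product in the basket: the composition typed so often."""
    return product_to_price(basket_to_product(bask, n))


# Hypothetical basket: total 535 pence, two products, delivery charge 200 pence.
bask = [None, 535, [None, [None, "Kattomeat", 85], [None, "Kitty Litter", 450]], 200]

print(basket_to_product_price(bask, 1))  # 85
```

The body is a one-liner; all the effort goes into the name.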

Moreover, the implementation was more complicated than I've implied, having a level between products and baskets. Because a customer might purchase ten tins of Kattomeat, say, or (after the meal, presumably) ten sacks of Kitty Litter, there were what I suppose you'd call "product repetitions". These were arrays of which one element held a product, another the number of units bought, and the others data about bulk discounts. I needed access functions for these as well. But what to call them?
— Perhaps product_in_basket_to_price?
— Ouch, no can do: I used that earlier.
— Well then, product_repetition_sequence_to_unit_price?
— Hmmm, typing that all the time will be tedious, not to mention bringing on my RSI.
— What about bought_product_to_price?
— Well, no. Strictly speaking, the product isn't bought until the customer succeeds in paying for it.
— OK, putatively_bought_product_to_price?
— Let's not be silly.
— I know! I'll Joyce it and say prorep_to_price.
— That's not too long — but in six months' time, will I remember what a prorep is? S'what comments are for, I suppose.
— Oh hell oh bum, the Web shop also has data structures to represent product reports.
— I know! I'll get a job in China!
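Had I gone with prorep, a sketch in Python might look like this. The index positions are my assumption for illustration, since nothing above says which element held what, and the bulk-discount fields are left out. The docstring does the remembering that the name can't.

```python
# Assumed layout (hypothetical; the real index numbers are not given):
#   rep[1] = the product, rep[2] = units bought, rep[3:] = bulk-discount data
# Index 0 is an unused placeholder, as in prod[1] and prod[2].
kattomeat = [None, "Kattomeat", 85]   # prod[1] = name, prod[2] = price in pence
ten_tins = [None, kattomeat, 10]      # discount data omitted from this sketch


def prorep_to_product(rep):
    """Return the product held in a product repetition ("prorep")."""
    return rep[1]


def prorep_to_units(rep):
    """Return how many units of the product the customer put in the basket."""
    return rep[2]


def prorep_to_price(rep):
    """Return the unit price of the repeated product.

    A "prorep" is a product repetition: one product plus a unit count.
    It is not a product report.
    """
    return prorep_to_product(rep)[2]


print(prorep_to_units(ten_tins), prorep_to_price(ten_tins))  # 10 85
```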

Because of the Norman invasion and the fact that programmers are linguistically descended from the lords who dined on pork rather than the peasants who herded pigs, too many English programming words start with the same letters. Which usually seem to be "tran", "pro", "re" and even "rep". Chinese words can share common parts too; but they're shorter. So, I'm thinking, when you can write "product repetition" as 產品重複 (if the dictionaries tell me correctly), perhaps you don't need to agonise whether to abbreviate as prorep, prodrep, prod_rep or prodRep. Or word_vec_long_ptr_ProductRepetitionSequence_shop_module_private_type.

Chinese words are not only short, they're graceful and refined. This one means "identifier". It's intricate and elegant, so let's look at it close up:

標識符

Here's "report":

报告

A grounded cross upon a rectangle — isn't that a nice design? A number of Chinese characters started life as pictures. In this one, the rectangle depicts an open mouth. The cross is a cow, the curvy line thereleft being a horn; and the whole thing signifies a cow warning of an intruder. Warn, inform, tell, report.

I love these graphic little stories. Some may be mere etymythology. But they are fun. The character for "trouble", T. K. Ann's book Cracking the Chinese Puzzles says, began as an open hand next to a tiger. The open hand lets free a tiger into the countryside. Now that's trouble! Nowadays, it would be an open hand letting free a virus down the Net.

Here's my favourite. These two are "child" and the first character of the word "learn":

子  學

The "child" is a drawing of a baby with arms and head. Knowing this, you can believe the story in Wieger's book Chinese Characters that "learn" began as two hands reaching down to remove a cover from a child's head and thereupon cast light. The three-branched things top left and top right are hands — like Disney animators, ancient Chinese scribes drew only three fingers — and the whole, Wieger says, can be read as "the teacher dispels darkness from the mind of the disciple". Some wouldn't quite agree; but it's a nice story. Actually, to benefit AI programmers, the Committee For Character Reforms has approved a character for "machine learning". It replaces the child by R2D2.

But wish as I might, I'm not e-selling Terracotta Horse replicas in Xi'an; I'm code-checking the dogfood module in a last-minute pet requisites Web shop in Luton, a town that exists to make Milton Keynes look good. You can't divert your mind with character stories in English; and my GP has put me on steroids after a bout of Naming Exhaustion brought on by renovating a Fortran program whose most informative identifier was sqrik5. So I keep thinking: why have meaningful names at all? When manufactured objects like the components in my TV carry any purpose-designating label, it's invariably a garble of alphanumerics such as EN29F512-70JC NHGMSf 04020. But you can always look this up in a datasheet, thereby getting a full spec written in just about any language you can read, and the electronics engineers manage OK, don't they? Anyway, I'm fed up. I've an acre of code still to check, and the dog's dinner submodule keeps printing the Rat 'n Chaffinch Flavour Yummie Sticks as worming pills, but I've had enough and I'm giving up. I'm going to take a tip from Salman Rushdie, be honest about the difficulty of explaining code, and name all my functions p2c2e1. Or p2c2e2. Or p2c2e3. Whatever. Until I run out of functions to name.

Why p2c2e, you ask? From a handy little word that stands for Process Too Complicated To Explain. I first met it in the context "How the atom can be split using a toaster and a household drill? Well, I'm afraid that's a p2c2e" while reading a report of the 1989 Utah press conference on Cold Fusion, but in fact, it originates in Rushdie's novel Haroun and the Sea of Stories. But let him explain:

'Not so fast,' said Haroun, whose head was spinning, not only at the discovery that there really were Water Genies, that the Great Story Sea wasn't only a story, but also at the revelation that Rashid had quit, given up, buttoned his lip. 'I don't believe you,' he said to the Genie Iff. 'How did he send the message? I've been right with him almost all the time'.
'He sent it by the usual means,' Iff shrugged. 'A P2C2E.'
'And what is that?'
'Obvious,' said the Water Genie with a wicked grin. 'It's a Process Too Complicated To Explain.' Then he saw how upset Haroun was, and added: 'In this case, it involves Thought Beams. We tune in and listen to his thoughts. It's an advanced technology.'

Haroun has travelled to Earth's invisible second moon Kahani on a quest to undo damage done to the Great Story Sea by evil Khattam-Shud. Thus he will restore his father Rashid's ability to tell stories. Without this, Rashid can't continue his job as the greatest story-teller of them all, the Ocean of Notions, the Shah of Blah. And soon after learning the existence of P2C2Es, Haroun discovers where they all originate:

'Orders,' said Iff. 'All queries to be taken up with the Grand Comptroller.'
'Grand Comptroller of what?' Haroun wanted to know.
'Of the Processes Too Complicated To Explain, of course. At P2C2E House, Gup City, Kahani. All letters to be addressed to the Walrus.'
'Who's the Walrus?'
'You don't concentrate, do you?' Iff replied. 'At P2C2E House in Gup City there are many brilliant persons employed, but there is only one Grand Comptroller. They are the Eggheads. He is the Walrus.'

P2C2E House, I imagine as a cross between Kafka's Castle and Microsoft Corporation Headquarters, office muzak softly playing "they are the eggmen. I am the walrus" while Eggheads sit coding. But I can't help wondering: if these processes are Too Complicated To Explain, how on Earth do the Eggheads explain them to each other? My best guess is that they wrap up all their inexplicable concepts in the ultra-compact language Douglas Adams invented where "Ix" means "boy who is not able satisfactorily to explain what a Hrung is, nor why it should choose to collapse on Betelgeuse Seven". Now that's what I want to write my identifiers in!

Of course, like all successful words, P2C2E has its offshoots and descendants. These include:

But I must stop — it's time to heed Michael Covington's advice. In Some Coding Guidelines for Prolog, he cautions: though programmers once found it fashionable to abbreviate "to" as "2", thereby saving one vital character in names such as exe2bin, this is too confusing. Spelling correctly is hard enough; don't make those reading your code remember your creative misspellings too. And, he says, be warned by one regrettable program, which used the names menutwo, menutoo, menu2, and (probably by accident) mneu2. That program was written by my (ex-) Web shop developers.

~~~~~

Cartoon of programmer sadly staring at his diagram of messy over-complicated data structure. Above his head is a thought bubble containing a code comment: 'Data Structure Too Embarrassing To Explain'.