Baffling Buffalo Bison and Computer Language Abuse
English Abuse
The following is a correct English sentence:
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
I dictated the above sentence to a few of my friends and coworkers who are proficient English speakers. None of them understood the sentence well enough to respond. Most (5/6) thought I was spouting gibberish, and one person thought I was trying to play a trick on him.
How could this possibly be valid English and convey a thought? Three different meanings of the word “buffalo” are used:
- buffalo: Noun. A bison
- buffalo: Verb. To baffle
- Buffalo: Proper Noun. A city in New York
My friends didn’t parse the sentence for a few reasons. First, it is taking advantage of the homonyms of the word “buffalo”. Second, most people don’t use “buffalo” as a verb. The sentence just isn’t idiomatic English.
Making some key substitutions, and adding commas, we can change the sentence to the following:
Buffalo bison, Buffalo bison baffle, baffle Buffalo bison.
The sentence is still confusing, but one should be able to figure out what it means. It’s been about 10 years since I’ve had to diagram a sentence, so I’ll leave it as an exercise for the adventurous reader… or the enterprising Wikipedia-goer.
Coding Abuse
There are plenty of good examples of language abuse in computer languages. Operator overloading, macros, and template metaprogramming allow C++ to be bent into fantastical, and sometimes horrifying, constructions. For every potentially useful abuse like Boost.Parameter, there are 9 trillion unclear overloads of operator+().
“Code Complete” by Steve McConnell gives an abusive example that is legal in PL/I:
if if = then then then = else; else else = if;
I have a few personal favorites, such as the “Go Towards” operator in C:
unsigned int a = 10;
// Can be rewritten as while(a-- > 0)
while(a --> 0) {
    // Perform action.
}
This is cute, and may be tempting to use because it seems harmless. However, I’ve seen people confuse this with operator->(), which makes them spend unnecessary effort figuring out something inconsequential. This problem is especially bad in fonts that have no space between adjacent dashes.
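To make the point concrete, here is a minimal sketch (written as C++, though the loops themselves are plain C) showing that the “Go Towards” spelling is nothing more than a post-decrement followed by a greater-than comparison. The two function names are hypothetical helpers invented for this illustration.

```cpp
// Hypothetical helpers for illustration; neither name comes from a real library.

// Counts loop iterations using the "Go Towards" spelling.
// The compiler reads `a --> 0` as `(a--) > 0`.
unsigned int count_with_goes_to(unsigned int a) {
    unsigned int iterations = 0;
    while (a --> 0) {
        ++iterations;
    }
    return iterations;
}

// The same loop with conventional spacing.
unsigned int count_with_plain_decrement(unsigned int a) {
    unsigned int iterations = 0;
    while (a-- > 0) {
        ++iterations;
    }
    return iterations;
}
```

Both functions run their loop body the same number of times for any starting value, so the only thing the cute spelling changes is how long a reader stares at it.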
If you have any examples of abusive language tricks, I’d like to see them.
What is the lesson?
People are famously imperfect. If the purpose of code isn’t immediately obvious, they will not understand it. Even worse, they may incorrectly understand the code! This is the exact same problem that we run into when we are talking about Buffalo buffalo in English. Most normal people would never understand the sentence, so why say it?
Most programmers don’t set out to make their code Obfuscated C-worthy, but most languages offer you facilities to take shortcuts. Don’t get me wrong, there are instances where operator overloading can make code clearer. For example, BigInteger in Java would benefit from having its own arithmetic overloads. Not only would the code be easier to read, but the operators would act exactly the same as the normal integer arithmetic operations in Java. There would be no learning curve because everybody is already familiar with how arithmetic works in Java.
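Java has no operator overloading, so BigInteger sums are written as a.add(b). As a rough sketch of what the overloaded version could feel like, here is a toy C++ type whose operator+ does what a reader expects from integer addition. TinyBigInt is a hypothetical class made up for this example, and it assumes its input is a non-empty string of decimal digits.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Hypothetical toy type: a non-negative integer stored as decimal digits.
// Assumes the constructor receives a non-empty string of '0'..'9'.
class TinyBigInt {
public:
    explicit TinyBigInt(std::string digits) : digits_(std::move(digits)) {}

    // Schoolbook addition, walking both numbers from the least significant digit.
    friend TinyBigInt operator+(const TinyBigInt& a, const TinyBigInt& b) {
        std::string result;
        int carry = 0;
        auto i = a.digits_.rbegin();
        auto j = b.digits_.rbegin();
        while (i != a.digits_.rend() || j != b.digits_.rend() || carry != 0) {
            int sum = carry;
            if (i != a.digits_.rend()) sum += *i++ - '0';
            if (j != b.digits_.rend()) sum += *j++ - '0';
            result.push_back(static_cast<char>('0' + sum % 10));
            carry = sum / 10;
        }
        std::reverse(result.begin(), result.end());
        return TinyBigInt(result);
    }

    const std::string& str() const { return digits_; }

private:
    std::string digits_;
};
```

With this in place, TinyBigInt("999") + TinyBigInt("1") reads like ordinary arithmetic and yields a value whose str() is "1000". The overload earns its keep only because it matches what + already means for integers.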
Sometimes there is a clear advantage to language abuse, and it’s important to weigh the pluses and minuses of your decisions. Keep in mind that the normal user is willing to put up with only so much change before they drop your code in favor of something they can immediately read.
Readers are only human, and you need to be nice to them. Abusive syntax makes code readers work just a little bit harder. Sure, most are perfectly capable of figuring it out, but they aren’t reading your code to find neat language hacks. Consider the fact that other people will spend much more time reading your code than you will spend writing your code, which is especially true for popular projects.
Image from Flickr user PrairieDog
Creative Commons Attribution-Noncommercial license.
