Where are the Tools for Your Job?
Posted on May 16, 2008
Filed Under Quote
After taking an abstract algebra class, I decided to revisit something from my cryptography class and figure out how the Number Field Sieve works, since my final project was writing the quadratic sieve [side-note: if anyone wants the code for it, I'll throw it up on the site. Actual sieving was not a requirement, so it uses trial division. IIRC, it factors 20-digit numbers pretty quickly].
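For readers unfamiliar with the trial-division shortcut: a quadratic sieve needs to find numbers that factor completely over a small set of primes (the factor base). Without real sieving, each candidate can simply be divided by each prime in turn. The sketch below is only an illustration of that step, not the project code mentioned above; `smooth_exponents` and the toy factor base are hypothetical names made up for the example.

```python
def smooth_exponents(n, factor_base):
    """Return the exponent of each factor-base prime in n, or None if n
    does not factor completely over the factor base (illustrative helper).

    Assumes n is a positive integer.
    """
    exponents = []
    for p in factor_base:
        count = 0
        while n % p == 0:
            n //= p
            count += 1
        exponents.append(count)
    # Anything left over means n has a prime factor outside the base.
    return exponents if n == 1 else None


factor_base = [2, 3, 5, 7]  # assumed toy factor base
print(smooth_exponents(84, factor_base))  # 84 = 2^2 * 3 * 7, so it is smooth
print(smooth_exponents(22, factor_base))  # 11 is not in the base, so None
```

Trial division like this is simple but slow on large inputs, which is the cost that actual sieving is meant to avoid.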
In the introduction to the paper, I found:
- An interesting Knuth quote.
- An important lesson.
"Unfortunately we are unable to give a rigorous proof that [the listed runtime] is indeed the expected running time of the number field sieve. Consequently, this paper does not contain a rigorous mathematical result. In this context the following quote from Donald Knuth is of interest: `One of my mathematician friends told me he would be willing to recognize computer science as a worthwhile field of study, as soon as it contains 1000 deep theorems. This criterion should obviously be changed to include algorithms as well as theorems, say 500 deep theorems and 500 deep algorithms.’
The present paper describes a deep algorithm for the solution of a fundamental problem, and it depends on techniques that have not been of traditional use in this area. We therefore trust that it is of interest to theoretical computer scientists, and that they will appreciate the challenge posed by its rigorous running time analysis."
"The Number Field Sieve": Lenstra, Lenstra Jr., Manasse, Pollard. 1990. Copyright ACM.
The interesting Knuth chestnut aside, it’s important to note that even famous mathematicians draw inspiration from different sources. This is the best argument for perpetually continuing your education, no matter your field of study: the more you know, the more tools you can use.
Integer factorization is ostensibly a number theory problem, but the tools for the job were found in abstract algebra. Where are the tools for your job?
How are you sure that an object-oriented database doesn’t solve your problem better if you’ve never learned how to use one? How are you sure that you shouldn’t be writing your web application with Ruby on Rails instead of PHP? Maybe Erlang solves your scaling problems much more easily than Java. Maybe those Lisp jerks were right all along.
A lot of different solutions have been introduced by a lot of smart people. There may be a good one you haven’t tried.
How Reddit Will (Maybe) Save Software Development
Posted on May 12, 2008
Filed Under Programming, Software Development
Or, This Started as a Diatribe About Bad Programming Books, and Turned Into Beating a Dead Horse.
Decades after The Mythical Man Month examined the management of software development, projects are still failing at an alarming rate. Some estimates say that as few as 34% of software engineering undertakings are successful. Not only that, but big projects fail, and fail spectacularly, like the FBI’s case management system debacle. The whole project was scrapped after it was discovered that you can’t build software the way the Egyptians built pyramids: draw a triangle blueprint and whip the slaves until it’s all in place. Since then, effective management and work styles somehow shifted, and the FBI didn’t keep up.
Edit: As pointed out by Greg Molyneux, the pyramids were not built by slaves. Was EVERYTHING I learned in history class in middle school a lie?
Programming failures are due to a lack of rigor at all levels. The requirements suck, so the design sucks, so the interaction between modules sucks. These problems will persist until development practices assume that the developers and designers are making mistakes, and do everything possible to correct those mistakes early. This will be done, not by a slick methodology from a book, but rather by a karma-based social programming application.
To strengthen my case, I want to look first at mathematics.
Why Does Math Work?
Mathematics, despite its difficulty, continues its inexorable advance. Conjectures that have withstood the test of time are falling, one by one; just ask Fermat and Poincaré. New ground is constantly blazed, and even tired subjects, such as number theory, are finding new applications and new life.
First, let’s get the obvious difference out of the way. Mathematics is largely a theoretical field. Sure, we can often apply it to the real world, but where’s the fun in that? The goal is not to produce products for ordinary people, but rather to further mathematics as a subject.
Also, only those who are fully interested in mathematics are mathematicians. This does not hold true for software development, as trade programmers exist. I’ve met them. These programmers simply bang out code day to day as a living, and are not thoroughly passionate about programming.
Now, let’s see what math does right:
Excellent Peer Review System: Mathematics is advanced through proofs. These proofs are carefully perused by other mathematicians. Even better, there is an incentive to find mistakes: when an error is discovered in a published paper, the discoverer can write a paper on the subject. This ups the yearly publication count by 1, making the mathematician seem more valuable. It’s like karma!
At the end of the day, the effect is clear. The only mathematics that survives has seen a thousand eyes, and subtle flaws are discovered in due time. For example, Bertrand Russell discovered a contradiction in set theory near the turn of the twentieth century. This caused set theory to be ripped up from the floorboards and rebuilt on top of new assumptions that did not fall victim to the same flaw. The paradox is now known, unsurprisingly, as Russell’s Paradox.
Open source software also has this same advantage of many eyes, but open source software also does not do everything for everybody. Yet.
So How Can We Fix Software Engineering?
In one sense, we may not be able to. People are not usually programming for a decade before they enter the field, so it turns out that some people were just never meant to be programmers, and it is too late!
Other people will always code without "reading the manual" on their coding methodology. They will use the waterfall method. They will take eXtreme Programming as a blanket excuse to code without thinking. They will adopt the Rational Unified Process, but forget about that testing thing because it takes too much time. It could be impossible for collectives of the uninformed to produce good work.
The best, on the other hand, do not need to be regulated. They know better than anybody else about their own personal shortcomings (whether they realize it or not), and they use them to their own advantage. They know they are likely to make breaking changes to their own code, so they use version control. They know that they are going to write bugs, so they do everything they can to automate their finding. They already have a library of old code that is tested and works, so they can finish projects faster.
So to fix development for the rest of us, we need to approach software engineering like we would approach mathematics: with rigor. We need to treat it as if it is an incredibly difficult subject. We need a plan of attack, and there needs to be an excellent review system. However, I strongly suspect that most developers will not willingly go along with the rigor associated with, well, rigor. They may give it good lip service, but when it comes time to write the 60th test case for their physics simulation, they just might start cutting a corner or two. So what will the next programming methodology have to do better? One of the following:
- Simulate rigor from the unrigorous.
- Encourage rigor.
The Next Big Thing in software engineering methodologies may even do both. How could it accomplish this? Incentive! There has to be an incentive for programmers to do rigorous work, which is why I think that the Next Big Thing will come, not with a book, but with a slick collaboration GUI that encourages/rewards rigorous activities with karma. If Reddit has taught me anything, it is that people will do anything for something as worthless as a karma point. Of course, this would need more serious study before anything was actually produced, but we’re talking theoretically at this point.
What The GUI Will Need
The program would best be served by a karma system, similar to Reddit. The program will need to keep track of (and reward) the activities most associated with rigorous programming.
I haven’t really worked out the details (being elbow-deep in another project), and there are many details to work out. Ideally, the GUI will be able to reward good design, good requirements, good code, and good tests.
I just want to give you a quick example of how the program can work: rewarding unit tests. When tests are coded well, they tell you when things break, and they tell you when things are (likely) working. It is generally well-agreed that programmers should write unit tests for their code. Despite all of this, the actual writing often falls by the wayside at the 50th unit test for your extremely boring file parsing function.
The program should award you points for the number of unit tests you write for anybody’s code. Other programmers are human, and can often simply forget to perform these measures. They could even document the areas that still need work if they are fatigued by the process. They shouldn’t be penalized for something that should easily be added by another programmer. After all, if the function that they wrote is any good, they will be rewarded with karma for the function.
Obviously, this can be gamed. You can simply write 500 unit tests for the "add()" function in another programmer’s class. But not when the karma system comes into play! If other developers find that your unit test additions are worthless, they can downmod the useless unit tests, and they can flag you as abusive.
Why won’t the developers spend all of their time writing unit tests instead of adding to the development effort? It can be claimed that if they are adding the unit tests and getting upmodded that they are contributing to the development effort, but I will not dodge the question.
Why will they not game the system? At the end of the day, they have a manager, usually in their same office. The managers will be able to see who is abusing the system, who is getting the most downmodded votes, and will be able to read comments and get explanations from their team. They will be able to walk down the hall and say, "Hey, dummy! Write the binary parsing utility you’re assigned and quit spamming unit tests!"
This system has the benefit that, over the long haul, the manager gets to see who has contributed the most to the project, who has the most controversial changes, and who has contributed the least. They will get to see the comments that other developers have left, and they will get to see who is not fitting into the team.
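To make the idea a bit more concrete, here is a minimal sketch of the bookkeeping such a tool might do: developers submit unit tests, teammates upmod, downmod, or flag them, and the manager gets a per-developer summary. Everything in it is an assumption for illustration, including the names `Contribution` and `KarmaBoard` and the scoring rule of one point per vote; the details described above have not actually been worked out.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Contribution:
    """One unit test (or other artifact) submitted by a developer."""
    author: str
    description: str
    upmods: int = 0
    downmods: int = 0
    abuse_flags: int = 0

    def karma(self):
        # Assumed scoring rule: every vote is worth exactly one point.
        return self.upmods - self.downmods


@dataclass
class KarmaBoard:
    """Hypothetical ledger of everything the team has submitted."""
    contributions: list = field(default_factory=list)

    def submit(self, author, description):
        item = Contribution(author, description)
        self.contributions.append(item)
        return item

    def manager_report(self):
        """Total karma and abuse flags per developer, highest karma first."""
        totals = defaultdict(lambda: {"karma": 0, "flags": 0})
        for item in self.contributions:
            totals[item.author]["karma"] += item.karma()
            totals[item.author]["flags"] += item.abuse_flags
        return sorted(totals.items(), key=lambda kv: kv[1]["karma"], reverse=True)


board = KarmaBoard()
useful = board.submit("alice", "test parser rejects truncated header")
useful.upmods += 3
spam = board.submit("bob", "500th test for add()")
spam.downmods += 4
spam.abuse_flags += 1

for developer, stats in board.manager_report():
    print(developer, stats)
```

A real tool would also need comments, history, and some guard against vote trading, none of which is modeled here.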
Why Am I Telling You Instead of Making It Myself?
There are a few reasons.
- I’m sick and tired of using bad software.
- It will never actually work in practice for a lot of teams.
I will use this project to learn Ruby + Ruby on Rails… after the few big projects on The List already, but if someone else wants to hack up their own version, awesome! I’d like to see what is done with it.
Why will it not work in practice? Because politics, unfortunately, plays into programming decisions in some companies. As an intern, I was shocked to find out that this was true. It could be hard to give the thumbs-down to someone’s code on a Friday afternoon at 4:00 when you know that in a month, they will be your manager.
Not only that, but the workplace is not like the internet. When you flame somebody within the office, they could theoretically come to your cubicle and napalm you in the face. Not pretty.
The Reddit Programming System can only really work if you have a team that understands that everyone makes mistakes, and whose members can just be adults when someone corrects them. I believe that it can help small-to-medium sized teams, but the teams that would do well with the Reddit Programming System may already be the teams that are doing well with other methodologies.
I have no reason to suspect that it will help teams that aren’t performing.
Talking to People, not Computers
Posted on May 9, 2008
Filed Under Quote
It’s possible to program a computer in English. It’s also possible to make an airplane controlled by reins and spurs. - John McCarthy, 196x.
This is pretty thought-provoking for being a flippant remark on computer language design. I don’t 100% agree with the sentiment of this quote (which is hard to do in the face of the developer of Lisp), but it certainly raises some good issues.
The less written in a language to complete a task, the better the language. English could be used to program computers, but the result would be quite unwieldy. Why? To even talk about mathematics and algorithms out loud, mathematicians and computer scientists need to create precise, exact definitions of every single building block that they use. English doesn’t cut it, even amongst native speakers. There would need to be a big huge conference of the hive minds of language design to sit and decide what each of the notions would be and how they would be expressed in English, and even then the language would still suck for a lot of programming tasks.
The consequences of vague language are disastrous: proofs are wrong, programs are wrong, and people punch each other in the face. McCarthy has very real concerns: if you told your robot that it should kill time, your wall clock might be in very real danger. There are much better means of specifying algorithms and mathematics, and these should be used when describing algorithms and mathematics.
However…
As has been pointed out before [shameless self-reference], one of the main reasons that we have programming languages is to show other people how we talk to the computer. We don’t program in Lisp or Python or C++ or Fortran because computers explicitly speak these languages. They need to be compiled or interpreted down to machine language. If we wanted to speak directly to computers, we should be using the machine language. The reason we don’t is so that we can share our code easily with others, and easily organize our own thoughts. Our languages are designed directly for humans to convey mathematical and algorithmic thought.
Likewise, we don’t steer airplanes by manually moving the flaps and pouring the fuel into the engines. We have a human interface that flies the airplane for us, that humans can talk to, and humans can listen to. The actual act of flying has nothing to do with moving a joystick or flipping a switch, but it is a handy metaphor.
English is obviously not the right language to manipulate math. However, SQL is little more than structured English, and it acts excellently as a data storage/retrieval language. Even those with no programming experience can get the idea of what a SQL query is attempting to accomplish BECAUSE it is simply structured English.
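As a small illustration of how closely a query can read like a structured English sentence, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `employees` table and its rows are made up for the example.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical employees table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "Engineering", 120), ("Grace", "Engineering", 110), ("Bob", "Sales", 90)],
)

# Read aloud: "select the name from employees where the department is
# Engineering, ordered by salary, descending."
query = """
    SELECT name
    FROM employees
    WHERE department = 'Engineering'
    ORDER BY salary DESC
"""
names = [row[0] for row in conn.execute(query)]
print(names)
```

The query says what is wanted and leaves the how to the database, which is a large part of why it reads so naturally.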
It all comes down to “the right tool for the right job.”