Working Through the “OO’s Small Classes and Short Methods” Exercise
Posted on May 26, 2008
Filed Under Computer Science, Programming, Software Development |
In "Perfecting OO’s Small Classes and Short Methods ", Andrew Binstock describes the constraints of an Object Oriented exercise by Jeff Bay.
I may be three weeks late on the issue, but I had to wait to find the time to write a project using these restrictions. After all, one experiment is worth a thousand blog entries, and I wanted to scout out the territory before I mounted my high horse.
1. Use only one level of indentation per method. If you need more than one level, you need to create a second method and call it from the first. This is one of the most important constraints in the exercise.
2. Don’t use the ‘
else’ keyword. Test for a condition with an if-statement and exit the routine if it’s not met. This prevents if-else chaining; and every routine does just one thing. You’re getting the idea.3. Wrap all primitives and
strings. This directly addresses “primitive obsession.” If you want to use an integer, you first have to create a class (even an inner class) to identify it’s true role. So zip codes are an object not an integer, for example. This makes for far clearer and more testable code.4. Use only one dot per line. This step prevents you from reaching deeply into other objects to get at fields or methods, and thereby conceptually breaking encapsulation.
5. Don’t abbreviate names. This constraint avoids the procedural verbosity that is created by certain forms of redundancy—if you have to type the full name of a method or variable, you’re likely to spend more time thinking about its name. And you’ll avoid having objects called
Orderwith methods entitledshipOrder(). Instead, your code will have more calls such asOrder.ship().6. Keep entities small. This means no more than 50 lines per class and no more than 10 classes per package. The 50 lines per class constraint is crucial. Not only does it force concision and keep classes focused, but it means most classes can fit on a single screen in any editor/IDE.
7. Don’t use any classes with more than two instance variables. This is perhaps the hardest constraint. Bay’s point is that with more than two instance variables, there is almost certainly a reason to subgroup some variables into a separate class.
8. Use first-class collections. In other words, any class that contains a collection should contain no other member variables. The idea is an extension of primitive obsession. If you need a class that’s a subsumes the collection, then write it that way.
9. Don’t use setters, getters, or properties. This is a radical approach to enforcing encapsulation. It also requires implementation of dependency injection approaches and adherence to the maxim “tell, don’t ask.”
These constraints are tough when you try to adhere to all of them at once. They have also received their fair share of criticism and teasing from the mainstream blogging community.
These constraints are the object oriented version of the classic functional challenge: write [program] without variables. All of your functions must have only one parameter. Performing the exercise is apparently supposed to force a light bulb to turn on, and I wanted to see if it did.
What Project Did You Choose?
I chose to rewrite my implementation of Simon Funk’s algorithm for the Netflix Prize . I chose this for a few different reasons:
- It would show weaknesses in the constraints . There are a few specific areas that I thought might be a ‘gotcha!’, which I will describe later.
- It is performance-critical : The algorithm is training-based, and I wanted to see if these constraints slow down the algorithm at all.
- It is a hell of a memory hog : my implementation sucks up over a gig of RAM when it has all of the prize data cached.
- It is a typical example of programming that I enjoy : mathematical and processing large volumes of data.
If this isn’t a torture test of the exercise, I don’t know what is!
The constraints of the exercise hint towards Java (between the mention of packages and properties), but I chose C++ for this task. I’m starting a job at a C++-specific workplace in a few weeks, so you can deal with it. To make the exercise more Java-like, I pulled a Boost and kept each of the class definitions in a single .h file. This also allowed me to more easily compare myself to the "Keep classes under 50 lines" constraint, as I didn’t have to work around what this really meant.
Working With the Constraints in Practice
1. Use only one level of indentation per method. This was one of the easiest constraints to work with, and it had an interesting effect on the code: it made it more AND less readable at the same time. Let’s take my top-level static method for loading the dataset:
static Dataset load_dataset(const Directory& dir){ Dataset dataset; for(MovieID id = MovieID::MIN_ID; id != MovieID::MAX_ID; ++id){ add_file_to_dataset(create_filename(dir, id), dataset, id); } return dataset; }
On one hand, this is very clear. It’s easy to see exactly how the code is broken down, and the code can quickly be read. However, this does come with a negative effect: you have to look in more places to fully estimate the complexity of the code.
As with all ideas, this constraint works well in moderation. There is absolutely no reason to create a new method for the inner loop of bubble sort! There is also nothing wrong with separating code out into its logical units, and this constraint helps encourage this.
2. Don’t use the ‘else ’ keyword. This constraint was useless and irritating. This issue only came up a few times in this particular project, but I could imagine the nightmare of writing any kind of user interface programming. I would have much preferred being forced to only call methods from within if and else statements, like so:
if(condition){ perform_work(); } else{ perform_other_work(); }
3. Wrap all primitives and string s. All of them? Really? Damn.
The immediate effect of this was painful. I spent a lot of up-front time writing out wrappers for Filename s, Directory s, SVDCoefficient s, CustomerID s, and MovieID s, etc. I think that this was the constraint that caused the development effort to drag on longer than my first implementation of this program.
However, once all of the up-front coding was done, it started to pay off big time. First, you get improved runtime checking. I was able to add some automatic checking to the primitives, which gave me much better runtime error detection. This helped catch programming errors that gave me invalid ratings, for instance. I could have done a lot more with this: automated file checking, for instance.
This also helps in static-typed languages. For instance, if I were to show you the following interface, I bet that you would be able to immediately figure out how to use it:
RatingPrediction SVDApproximation::prediction(const CustomerID&, const MovieID&);
Even better, if you mix up the parameters, the compiler is going to tell you immediately, and you’re a <Meta>-t away from having working code.
If you didn’t wrap the types and the IDs were both unsigned ints, you could switch the parameters, still have valid ids , and spend 15 minutes debugging. You get static type-checking and sanity checks on your values at runtime just by taking the time to write out the classes.
However, avoiding primitive obsession becomes ridiculous at a point . For array indexes, I used the size_t variable, which is used in C++ to indicate a size or an array index. There is absolutely no gain to wrapping something like array indexes. Don’t go overboard with this crap.
4. Use only one dot per line. This wasn’t an issue. I could see it being more of an issue in a different kind of program, but I did not run into chaining problems, so I can not comment on them.
5. Don’t abbreviate names. I’m not very big on abbreviation, so this was not an issue. I had to expand some things like "num" to "number", but this was the easiest constraint to follow, as I already follow it.
6. Keep entities small. I did bump up against the 50-line limit a few times, but usually this was because there was something else I could refactor. I could foresee the need to ignore this constraint, but in my case it helped keep me honest about what my entities did, and helped logical grouping.
7. Don’t use any classes with more than two instance variables. Aha! I knew that I’d catch the constraints somewhere!
The way that this particular algorithm is designed, the training is performed by looping through the Dataset . The Dataset is, at heart, a std::vector<RatingTuple> . Each rating is a 3-tuple consisting of a MovieID , a CustomerID , and a RatingValue .
However, it doesn’t make sense to pair down the data further. The only added value that can be gained is following the constraints of the exercise… there is nothing else to find here! I also thought about breaking the MovieID , CustomerID , and RatingValue s into separate collections, but this approach is also flawed. The strongest relation here is between the tuples and not between the objects of the same type. One could store a map from a std::pair<CustomerID, MovieID> to a RatingValue , but this is just an added complexity.
The message here is very clear, and very useful: logically group related variables. This particular constraint proves too restrictive sometimes, so use it as a guideline, not a rule.
8. Use first-class collections. This particular constraint wasn’t very different from the primitive obsession constraint, and had all of the same benefits.
9. Don’t use setters, getters, or properties. I found that this was the hardest constraint to use, and it was sometimes impossible.
For the vast majority of my classes, I found that the constructors became workhorses when they might not have before. I also found that operator overloading was imperative to this constraint failing to drive me nuts. Overloaded operators can be considered a violation of encapsulation, but this is a violation that falls into the hands of the library writers, not the library users. I am comfortable with that. I also did not abuse operators for this purpose, and only overloaded sane operations: the adding of two RatingValue s, for instance, and the comparisons of some other types.
Going back to the RatingTuple approach, the strongest association of the data also forces the violation of this constraint. When iterating through the data, the algorithm can’t work if it doesn’t know the rating, it doesn’t work if it doesn’t know a matrix index corresponding to the CustomerID , and it doesn’t work if it doesn’t know the matrix index corresponding to the MovieID .
However, for most of the rest of my classes, it was possible to follow the constraint, and I found that it led to terser classes: they only ended up with the methods that they needed, not methods that I thought they needed.
What Was The Damage?
- 655 total lines . It’s hard to compare it to the original project, because a) the other project has extra features, b) the other project was written using the traditional (.h, .cpp) pairings, and c) I don’t care.
- 20 files
- 18 classes (if I took the time to separate them into proper namespaces, they would be in a few different "packages". I wouldn’t say I violated the constraint).
- 1 macro file
- 1 .cpp file
Final Thoughts
This exercise has some good constraints, some bad constraints, and some bland constraints. I will summarize them in list form for those of you who skipped to the end. If you are wondering why, scroll back up to the appropriate section, because I already said so once.
- Good :
- Use one level of indentation per method.
- Wrap all primitives and strings.
- First-class collections.
- Bad:
- Don’t use the `else’ keyword.
- Don’t use a class with more than two instance variables.
- Blah:
- Use one dot per line.
- Don’t abbreviate.
- Keep entities small.
- Don’t use getters, setters, or properties.
The exercise only has a few good constraints, but I think that it is important to work through the ones you truly hate in order to appreciate the ones that help code readability. Fragmenting the methods into their smallest cohesive units also makes them easier to test, which is what this exercise certainly encourages.
Popularity: 19% [?]
Comments
4 Responses to “Working Through the “OO’s Small Classes and Short Methods” Exercise”
Leave a Reply

Hey Jake,
you might want to check out how your setup handles code. This is a prototype as I see it:
RatingPrediction SVDApproximation::prediction(const CustomerID&, const MovieID&);
Something’s escaping things when they don’t need to be escaped. I suspect that the tag has something to do with it, but I’m not sure.
Great post! Thanks for doing the exercise and sharing your results.
I certainly agree with your “good.” As to the “bad,” I think limiting a class to no more than two instance variables is a salutary exercise, as it tends to force reconsideration of whether an object really shows cohesion or whether it’s doing several things that should be broken out separately.
Cheers!
Jake,
Interesting, and cool to see that you actually took the time to try this experiment. But, one thing I would say is that it’s not really possible for you to judge the value of these constraints on your own. My main complaint about the “object calisthenics” exercise was that the guidelines favoured “proper OO” and conciseness over readability and simplicity.
For you, the “one level of identation per method” and “wrap all primitives” guidelines might seem like good constraints, it’s hard for you to judge because you don’t have to experience reading the code without understanding it. Someone not familiar with the code might have to spend more time exploring the code to figure it out.
On the other hand, there is no clear-cut answer to these things. Some people prefer nice abstractions, other people just want to see what the code is actually doing. I don’t mind the abstractions when I know I can trust them (e.g. the Python standard library) but am less comfortable with code that hasn’t been as thoroughly tested. YMMV.
Anyways, great article. Good point about the advantage of wrapping primitives when you’re using a language with explicit types — I hadn’t thought about that.
Jake!
Very interesting post. I read a little bit about wrapping all primitives and removing getters and setters in this article: http://www.javaworld.com/javaworld/jw-01-2004/jw-0102-toolbox.html It’s a good read; check it out.
-Steve