<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>So Jake Says &#187; Object oriented</title>
	<atom:link href="http://www.jakevoytko.com/blog/tag/object-oriented/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jakevoytko.com/blog</link>
	<description>Ye Olde Computer Science Blogge</description>
	<lastBuildDate>Sun, 17 Jan 2010 15:16:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Working Through the &#8220;OO&#8217;s Small Classes and Short Methods&#8221; Exercise</title>
		<link>http://www.jakevoytko.com/blog/2008/05/26/working-through-the-oos-small-classes-and-short-methods-exercise/</link>
		<comments>http://www.jakevoytko.com/blog/2008/05/26/working-through-the-oos-small-classes-and-short-methods-exercise/#comments</comments>
		<pubDate>Mon, 26 May 2008 04:00:13 +0000</pubDate>
		<dc:creator>Jake</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[abbreviate]]></category>
		<category><![CDATA[else keyword]]></category>
		<category><![CDATA[first-class collections]]></category>
		<category><![CDATA[getters]]></category>
		<category><![CDATA[instance variables]]></category>
		<category><![CDATA[Object oriented]]></category>
		<category><![CDATA[one dot per line]]></category>
		<category><![CDATA[one level of indentation]]></category>
		<category><![CDATA[properties]]></category>
		<category><![CDATA[setters]]></category>
		<category><![CDATA[short methods]]></category>
		<category><![CDATA[small classes]]></category>
		<category><![CDATA[small entities]]></category>
		<category><![CDATA[Wrap primitives]]></category>

		<guid isPermaLink="false">http://www.jakevoytko.com/blog/?p=81</guid>
		<description><![CDATA[In &#8220;Perfecting OO&#8217;s Small Classes and Short Methods&#8221;, Andrew Binstock describes the constraints of an Object Oriented exercise by Jeff Bay. I may be three weeks late on the issue, but I had to wait to find the time to write a project using these restrictions. After all, one experiment is worth a thousand [...]]]></description>
			<content:encoded><![CDATA[<p>In &#8220;<a href="http://binstock.blogspot.com/2008/04/perfecting-oos-small-classes-and-short.html">Perfecting OO&#8217;s Small Classes and Short Methods</a>&#8221;, Andrew Binstock describes the constraints of an Object Oriented exercise by Jeff Bay.</p>
<p>I may be three weeks late on the issue, but I had to wait to find the time to write a project using these restrictions. After all, one experiment is worth a thousand blog entries, and I wanted to scout out the territory before I mounted my high horse.</p>
<p>What were the restrictions?</p>
<blockquote><p><strong><a href="http://binstock.blogspot.com/2008/04/perfecting-oos-small-classes-and-short.html">From the Article</a> :</strong></p>
<p>1. <strong>Use only one level of indentation per method.</strong> If you need more than one level, you need to create a second method and call it from the first. This is one of the most important constraints in the exercise.</p>
<p>2. <strong>Don’t use the ‘<code>else</code> ’ keyword.</strong> Test for a condition with an if-statement and exit the routine if it’s not met. This prevents if-else chaining; and every routine does just one thing. You’re getting the idea.</p>
<p>3. <strong>Wrap all primitives and <code>string</code> s. </strong> This directly addresses “primitive obsession.” If you want to use an integer, you first have to create a class (even an inner class) to identify it’s true role. So zip codes are an object not an integer, for example. This makes for far clearer and more testable code.</p>
<p>4. <strong>Use only one dot per line.</strong> This step prevents you from reaching deeply into other objects to get at fields or methods, and thereby conceptually breaking encapsulation.</p>
<p>5. <strong>Don’t abbreviate names. </strong> This constraint avoids the procedural verbosity that is created by certain forms of redundancy—if you have to type the full name of a method or variable, you’re likely to spend more time thinking about its name. And you’ll avoid having objects called <code>Order</code> with methods entitled <code>shipOrder()</code> . Instead, your code will have more calls such as<code> Order.ship()</code> .</p>
<p>6. <strong>Keep entities small. </strong> This means no more than 50 lines per class and no more than 10 classes per package. The 50 lines per class constraint is crucial. Not only does it force concision and keep classes focused, but it means most classes can fit on a single screen in any editor/IDE.</p>
<p>7. <strong>Don’t use any classes with more than two instance variables.</strong> This is perhaps the hardest constraint. Bay’s point is that with more than two instance variables, there is almost certainly a reason to subgroup some variables into a separate class.</p>
<p>8. <strong>Use first-class collections. </strong> In other words, any class that contains a collection should contain no other member variables. The idea is an extension of primitive obsession. If you need a class that’s a subsumes the collection, then write it that way.</p>
<p>9. <strong>Don’t use setters, getters, or properties.</strong> This is a radical approach to enforcing encapsulation. It also requires implementation of dependency injection approaches and adherence to the maxim “tell, don’t ask.”</p></blockquote>
<p>These constraints are tough when you try to adhere to all of them at once. They have also received their fair share of <a href="http://dubroy.com/blog/2008/05/06/if-this-is-object-calisthenics-i-think-ill-stay-on-the-couch/">criticism</a> and <a href="http://weblog.raganwald.com/2008/05/narcissism-of-small-code-differences.html">teasing</a> from the mainstream blogging community.</p>
<p>These constraints are the object oriented version of the classic functional challenge: <em>write [program] without variables. All of your functions must have only one parameter.</em> Performing the exercise is apparently supposed to force a light bulb to turn on, and I wanted to see if it did.</p>
<h2>What Project Did You Choose?</h2>
<p>I chose to rewrite my implementation of Simon Funk&#8217;s <a href="http://sifter.org/~simon/journal/20061211.html">algorithm</a> for the <a href="http://www.netflixprize.com/">Netflix Prize</a>. I chose this for a few different reasons:</p>
<ol>
<li><strong>It would show weaknesses in the constraints</strong>: there are a few specific areas that I thought might be a &#8216;gotcha!&#8217;, which I will describe later.</li>
<li><strong>It is performance-critical</strong>: the algorithm is training-based, and I wanted to see if these constraints slow down the algorithm at all.</li>
<li><strong>It is a hell of a memory hog</strong>: my implementation sucks up over a gig of RAM when it has all of the prize data cached.</li>
<li><strong>It is a typical example of programming that I enjoy</strong>: mathematical and processing large volumes of data.</li>
</ol>
<p>If this isn&#8217;t a torture test of the exercise, I don&#8217;t know what is!</p>
<p>The constraints of the exercise hint towards Java (between the mention of packages and properties), but I chose C++ for this task. I&#8217;m starting a job at a C++-specific workplace in a few weeks, so you can deal with it. To make the exercise more Java-like, I pulled a <a href="http://www.boost.org/">Boost</a> and kept each of the class definitions in a single .h file. This also allowed me to more easily compare myself to the &#8220;Keep classes under 50 lines&#8221; constraint, as I didn&#8217;t have to work around what this really meant.</p>
<h2>Working With the Constraints in Practice</h2>
<p>1. <strong>Use only one level of indentation per method.</strong> This was one of the easiest constraints to work with, and it had an interesting effect on the code: it made it more AND less readable at the same time. Let&#8217;s take my top-level static method for loading the dataset:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">static</span> Dataset load_dataset<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">const</span> Directory<span style="color: #000040;">&amp;</span> dir<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
    Dataset dataset<span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>MovieID id <span style="color: #000080;">=</span> MovieID<span style="color: #008080;">::</span><span style="color: #007788;">MIN_ID</span><span style="color: #008080;">;</span> id <span style="color: #000040;">!</span><span style="color: #000080;">=</span> MovieID<span style="color: #008080;">::</span><span style="color: #007788;">MAX_ID</span><span style="color: #008080;">;</span> <span style="color: #000040;">++</span>id<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
        add_file_to_dataset<span style="color: #008000;">&#40;</span>create_filename<span style="color: #008000;">&#40;</span>dir, id<span style="color: #008000;">&#41;</span>, dataset, id<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #008000;">&#125;</span>
    <span style="color: #0000ff;">return</span> dataset<span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>On one hand, this is very clear. It&#8217;s easy to see exactly how the code is broken down, and the code can quickly be read. However, this does come with a negative effect: you have to look in more places to fully estimate the complexity of the code.</p>
<p>As with all ideas, this constraint works well in moderation. There is absolutely no reason to create a new method for the inner loop of bubble sort! There is also nothing wrong with separating code out into its logical units, and this constraint helps encourage this.</p>
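<p>To make the trade-off concrete, here is a minimal sketch of the refactoring this constraint forces. It is not taken from my project: <code>Row</code>, <code>Table</code>, <code>sum_row</code>, and <code>sum_table</code> are hypothetical names invented for the illustration.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of these names come from the original project.
using Row = std::vector<int>;
using Table = std::vector<Row>;

// The former inner loop, extracted so it sits at one level of indentation.
int sum_row(const Row& row) {
    int total = 0;
    for (int value : row) {
        total += value;
    }
    return total;
}

// The outer loop delegates to the helper, so it also stays at one level.
int sum_table(const Table& table) {
    int total = 0;
    for (const Row& row : table) {
        total += sum_row(row);
    }
    return total;
}

// Small self-check; call it from main() to exercise the sketch.
void check_sum_table() {
    assert(sum_table({{1, 2}, {3, 4}}) == 10);
}
```

<p>Each function reads cleanly on its own, but you now have to visit two places to see that the whole thing is a nested loop.</p>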
<p>2. <strong>Don’t use the ‘<code>else</code> ’ keyword. </strong> This constraint was useless and irritating. This issue only came up a few times in this particular project, but I could imagine the nightmare of writing any kind of user interface programming. I would have much preferred being forced to only call methods from within if and else statements, like so:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>condition<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
    perform_work<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span>
<span style="color: #0000ff;">else</span><span style="color: #008000;">&#123;</span>
    perform_other_work<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>
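<p>For comparison, the early-exit style that the constraint actually asks for looks roughly like the sketch below. The function <code>describe_rating</code> and its labels are hypothetical, made up only to show the shape.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical example: classify a star rating without the else keyword.
// Each condition returns early, so no branch needs an else.
std::string describe_rating(int stars) {
    if (stars < 1 || stars > 5) {
        return "invalid";
    }
    if (stars >= 4) {
        return "liked";
    }
    return "disliked";
}

// Small self-check; call it from main() to exercise the sketch.
void check_describe_rating() {
    assert(describe_rating(0) == "invalid");
    assert(describe_rating(5) == "liked");
    assert(describe_rating(2) == "disliked");
}
```

<p>This works well when every branch can return, and gets awkward as soon as both branches need to fall through to shared code afterwards.</p>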

<p>3. <strong>Wrap all primitives and <code>string</code> s. </strong> All of them? Really? Damn.</p>
<p>The immediate effect of this was painful. I spent a lot of up-front time writing out wrappers for <code>Filename</code>s, <code>Directory</code>s, <code>SVDCoefficient</code>s, <code>CustomerID</code>s, <code>MovieID</code>s, and so on. I think that this was the constraint that caused the development effort to drag on longer than my first implementation of this program.</p>
<p>However, once all of the up-front coding was done, it started to pay off big time. First, you get improved runtime checking. I was able to add some automatic checking to the primitives, which gave me much better runtime error detection. This helped catch programming errors that gave me invalid ratings, for instance. I could have done a lot more with this: automated file checking, for instance.</p>
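<p>A minimal sketch of what such a checked wrapper can look like follows. This is an assumed layout, not the class from my project: the 1 to 5 range reflects Netflix star ratings, and the helper <code>rejects</code> exists only for the demonstration.</p>

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical wrapper; the real RatingValue class may look different.
class RatingValue {
public:
    // The constructor is the single place where the range is enforced.
    explicit RatingValue(int stars) : stars_(stars) {
        if (stars < 1 || stars > 5) {
            throw std::out_of_range("rating must be between 1 and 5");
        }
    }

    bool operator==(const RatingValue& other) const {
        return stars_ == other.stars_;
    }

private:
    int stars_;
};

// Demonstration helper: true when construction rejects the value.
bool rejects(int stars) {
    try {
        RatingValue attempt(stars);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Small self-check; call it from main() to exercise the sketch.
void check_rating_value() {
    assert(!rejects(5));
    assert(rejects(9));
}
```

<p>Because the check lives in the constructor, a bad value fails at the point where it is created rather than somewhere deep in the training loop.</p>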
<p>This also helps in static-typed languages. For instance, if I were to show you the following interface, I bet that you would be able to immediately figure out how to use it:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">RatingPrediction SVDApproximation<span style="color: #008080;">::</span><span style="color: #007788;">prediction</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">const</span> CustomerID<span style="color: #000040;">&amp;</span>, <span style="color: #0000ff;">const</span> MovieID<span style="color: #000040;">&amp;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre></div></div>

<p>Even better, if you mix up the parameters, the compiler is going to tell you immediately, and you&#8217;re a &lt;Meta&gt;-t away from having working code.</p>
<p>If you didn&#8217;t wrap the types and the IDs were both unsigned ints, you could switch the parameters, <em>still have valid IDs</em>, and spend 15 minutes debugging. You get static type-checking and sanity checks on your values at runtime just by taking the time to write out the classes.</p>
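<p>The compile-time half of that claim can be sketched as follows, assuming a C++17 compiler. The wrappers are deliberately bare, and the free function <code>prediction</code> is only a stand-in for <code>SVDApproximation::prediction</code> that returns a placeholder value.</p>

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical minimal wrappers; the originals are not shown in this post.
class CustomerID {
public:
    explicit CustomerID(unsigned value) : value_(value) {}
private:
    unsigned value_;
};

class MovieID {
public:
    explicit MovieID(unsigned value) : value_(value) {}
private:
    unsigned value_;
};

// Stand-in for SVDApproximation::prediction; returns a placeholder value.
double prediction(const CustomerID&, const MovieID&) { return 3.0; }

// The correct argument order is callable; the swapped order is not,
// because neither wrapper converts to the other.
static_assert(std::is_invocable_v<decltype(&prediction), CustomerID, MovieID>);
static_assert(!std::is_invocable_v<decltype(&prediction), MovieID, CustomerID>);

// Small self-check; call it from main() to exercise the sketch.
void check_prediction() {
    assert(prediction(CustomerID(1), MovieID(2)) == 3.0);
}
```

<p>With two bare <code>unsigned</code> parameters, both orders would compile and the mistake would only show up as bad predictions.</p>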
<p>However, <strong>avoiding primitive obsession becomes ridiculous at a point</strong>. For array indexes, I used the <code>size_t</code> type, which C++ uses to represent sizes and array indexes. There is absolutely no gain to wrapping something like array indexes. Don&#8217;t go overboard with this crap.</p>
<p>4. <strong>Use only one dot per line. </strong> This wasn&#8217;t an issue. I could see it being more of an issue in a different kind of program, but I did not run into chaining problems, so I cannot comment on them.</p>
<p>5. <strong>Don’t abbreviate names. </strong> I&#8217;m not very big on abbreviation, so this was not an issue. I had to expand some things like &#8220;num&#8221; to &#8220;number&#8221;, but this was the easiest constraint to follow, as I already follow it.</p>
<p>6. <strong>Keep entities small. </strong> I did bump up against the 50-line limit a few times, but usually this was because there was something else I could refactor. I could foresee the need to ignore this constraint, but in my case it helped keep me honest about what my entities did, and helped logical grouping.</p>
<p>7. <strong>Don’t use any classes with more than two instance variables. </strong> Aha! I knew that I&#8217;d catch the constraints somewhere!</p>
<p>The way that this particular algorithm is designed, the training is performed by looping through the <code>Dataset</code>. The <code>Dataset</code> is, at heart, a <code>std::vector&lt;RatingTuple&gt;</code>. Each rating is a 3-tuple consisting of a <code>MovieID</code>, a <code>CustomerID</code>, and a <code>RatingValue</code>.</p>
<p>However, it doesn&#8217;t make sense to pare down the data further. The only added value that can be gained is following the constraints of the exercise&#8230; there is nothing else to find here! I also thought about breaking the <code>MovieID</code>s, <code>CustomerID</code>s, and <code>RatingValue</code>s into separate collections, but this approach is also flawed. The strongest relation here is within each tuple, not between the objects of the same type. One could store a map from a <code>std::pair&lt;CustomerID, MovieID&gt;</code> to a <code>RatingValue</code>, but this is just an added complexity.</p>
<p>The message here is very clear, and very useful: logically group related variables. This particular constraint proves too restrictive sometimes, so use it as a guideline, not a rule.</p>
<p>8. <strong>Use first-class collections.</strong> This particular constraint wasn&#8217;t very different from the primitive obsession constraint, and had all of the same benefits.</p>
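<p>Here is a rough sketch of a first-class collection in this style. It is simplified on purpose: the real <code>RatingTuple</code> holds the wrapped ID and rating types, while this version uses plain integers, and the <code>add</code> and <code>for_each</code> methods are my assumptions for the example.</p>

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in: plain integers replace the wrapped types from the post.
struct RatingTuple {
    unsigned movie;
    unsigned customer;
    int rating;
};

// First-class collection: the vector is the only member variable.
class Dataset {
public:
    void add(const RatingTuple& tuple) { ratings_.push_back(tuple); }

    // Callers hand in the work to do instead of pulling out the vector.
    template <typename Visitor>
    void for_each(Visitor visit) const {
        for (const RatingTuple& tuple : ratings_) {
            visit(tuple);
        }
    }

private:
    std::vector<RatingTuple> ratings_;
};

// Small self-check; call it from main() to exercise the sketch.
void check_dataset() {
    Dataset dataset;
    dataset.add({1, 10, 4});
    dataset.add({2, 10, 5});
    int total = 0;
    dataset.for_each([&total](const RatingTuple& tuple) { total += tuple.rating; });
    assert(total == 9);
}
```

<p>Note that the tuple in this sketch carries three members, which is exactly the spot where constraint 7 had to give.</p>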
<p>9. <strong>Don’t use setters, getters, or properties.</strong> I found that this was the hardest constraint to use, and it was sometimes impossible.</p>
<p>For the vast majority of my classes, I found that the constructors became workhorses when they might not have before. I also found that operator overloading was essential to keeping this constraint from driving me nuts. Overloaded operators can be considered a violation of encapsulation, but this is a violation that falls into the hands of the library writers, not the library users. I am comfortable with that. I also did not abuse operators for this purpose, and only overloaded sane operations: the adding of two <code>RatingValue</code>s, for instance, and the comparisons of some other types.</p>
<p>Going back to the <code>RatingTuple</code> approach, the strongest association of the data also forces the violation of this constraint. When iterating through the data, the algorithm can&#8217;t work if it doesn&#8217;t know the rating, it doesn&#8217;t work if it doesn&#8217;t know a matrix index corresponding to the <code>CustomerID</code>, and it doesn&#8217;t work if it doesn&#8217;t know the matrix index corresponding to the <code>MovieID</code>.</p>
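<p>As a standalone sketch of the idea (the member layout and the use of <code>double</code> are assumptions, not the code from my project), a getter-free value type with a few sane operators might look like this:</p>

```cpp
#include <cassert>

// Hypothetical sketch; the post's RatingValue is not shown, so this layout is assumed.
class RatingValue {
public:
    explicit RatingValue(double value) : value_(value) {}

    // Arithmetic and comparison live on the wrapper itself,
    // so no caller needs a getter to reach the raw number.
    RatingValue operator+(const RatingValue& other) const {
        return RatingValue(value_ + other.value_);
    }
    bool operator<(const RatingValue& other) const {
        return value_ < other.value_;
    }
    bool operator==(const RatingValue& other) const {
        return value_ == other.value_;
    }

private:
    double value_;
};

// Small self-check; call it from main() to exercise the sketch.
void check_rating_operators() {
    assert(RatingValue(2.0) + RatingValue(3.0) == RatingValue(5.0));
    assert(RatingValue(1.0) < RatingValue(2.0));
}
```

<p>The raw number never leaves the class; everything a caller needs is expressed as an operation on whole <code>RatingValue</code>s.</p>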
<p>However, for most of the rest of my classes, it was possible to follow the constraint, and I found that it led to terser classes: they only ended up with the methods that they needed, not methods that I thought they needed.</p>
<h2>What Was The Damage?</h2>
<ul>
<li><strong>655 total lines</strong>. It&#8217;s hard to compare it to the original project, because a) the other project has extra features, b) the other project was written using the traditional (.h, .cpp) pairings, and c) I don&#8217;t care.</li>
<li><strong>20 files</strong></li>
<li><strong>18 classes </strong> (if I took the time to separate them into proper namespaces, they would be in a few different &#8220;packages&#8221;. I wouldn&#8217;t say I violated the constraint).</li>
<li><strong>1 macro file</strong></li>
<li><strong>1 .cpp file</strong></li>
</ul>
<h2>Final Thoughts</h2>
<p>This exercise has some good constraints, some bad constraints, and some bland constraints. I will summarize them in list form for those of you who skipped to the end. If you are wondering why, scroll back up to the appropriate section, because I already said so once.</p>
<ul>
<li><strong>Good</strong> :
<ul>
<li>Use one level of indentation per method.</li>
<li>Wrap all primitives and strings.</li>
<li>First-class collections.</li>
</ul>
</li>
<li style="text-align: left;"><strong>Bad:</strong>
<ul>
<li style="text-align: left;">Don&#8217;t use the &#8216;else&#8217; keyword.</li>
<li style="text-align: left;">Don&#8217;t use a class with more than two instance variables.</li>
</ul>
</li>
<li style="text-align: left;"><strong>Blah:</strong>
<ul>
<li style="text-align: left;">Use one dot per line.</li>
<li style="text-align: left;">Don&#8217;t abbreviate.</li>
<li style="text-align: left;">Keep entities small.</li>
<li style="text-align: left;">Don&#8217;t use getters, setters, or properties.</li>
</ul>
</li>
</ul>
<p>The exercise only has a few good constraints, but I think that it is important to work through the ones you truly hate in order to appreciate the ones that help code readability. Fragmenting the methods into their smallest cohesive units also makes them easier to test, which is what this exercise certainly encourages.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jakevoytko.com/blog/2008/05/26/working-through-the-oos-small-classes-and-short-methods-exercise/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
