<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>So Jake Says &#187; Fast inverse square root</title>
	<atom:link href="http://www.jakevoytko.com/blog/tag/fast-inverse-square-root/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jakevoytko.com/blog</link>
	<description>Ye Olde Computer Science Blogge</description>
	<lastBuildDate>Sun, 17 Jan 2010 15:16:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Quake 3&#8217;s Fast Inverse Square Root Function</title>
		<link>http://www.jakevoytko.com/blog/2008/01/28/quake-3s-fast-square-root-function/</link>
		<comments>http://www.jakevoytko.com/blog/2008/01/28/quake-3s-fast-square-root-function/#comments</comments>
		<pubDate>Mon, 28 Jan 2008 05:25:47 +0000</pubDate>
		<dc:creator>Jake</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Fast inverse square root]]></category>
		<category><![CDATA[Iterative]]></category>
		<category><![CDATA[Newton-Raphson]]></category>
		<category><![CDATA[Quake 3]]></category>

		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2008/01/28/quake-3s-fast-square-root-function/</guid>
		<description><![CDATA[Note: This is not meant to be an authoritative mathematical description, and I&#8217;m pretty late to the party. I was experimenting with the code, and am scratching an itch. For a far superior description, please look at Chris Lomont&#8217;s excellent analysis. The Infamous Code x2 = number * 0.5F; y = number; i = * [...]]]></description>
			<content:encoded><![CDATA[<p><em>Note: This is not meant to be an authoritative mathematical description, and I&#8217;m pretty late to the party. I was experimenting with the code, and am scratching an itch. For a far superior description, please look at Chris Lomont&#8217;s <a href="http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf">excellent analysis</a>.</em></p>
<h2>The Infamous Code</h2>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">x2 <span style="color: #000080;">=</span> number <span style="color: #000040;">*</span> <span style="color:#800080;">0.5F</span><span style="color: #008080;">;</span>
y  <span style="color: #000080;">=</span> number<span style="color: #008080;">;</span>
i  <span style="color: #000080;">=</span> <span style="color: #000040;">*</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000ff;">long</span> <span style="color: #000040;">*</span> <span style="color: #008000;">&#41;</span> <span style="color: #000040;">&amp;</span>y<span style="color: #008080;">;</span>
i  <span style="color: #000080;">=</span> <span style="color: #208080;">0x5f3759df</span> <span style="color: #000040;">-</span> <span style="color: #008000;">&#40;</span> i <span style="color: #000040;">&gt;&gt;</span> <span style="color: #0000dd;">1</span> <span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
y  <span style="color: #000080;">=</span> <span style="color: #000040;">*</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000ff;">float</span> <span style="color: #000040;">*</span> <span style="color: #008000;">&#41;</span> <span style="color: #000040;">&amp;</span>i<span style="color: #008080;">;</span>
y  <span style="color: #000080;">=</span> y <span style="color: #000040;">*</span> <span style="color: #008000;">&#40;</span> threehalfs <span style="color: #000040;">-</span> <span style="color: #008000;">&#40;</span> x2 <span style="color: #000040;">*</span> y <span style="color: #000040;">*</span> y <span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #666666;">// y  = y * ( threehalfs - ( x2 * y * y ) );</span></pre></div></div>
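<p>For anyone who wants to run the excerpt, here is a minimal sketch of it wrapped in a complete function. The name <code>rsqrt_sketch</code> is my own, and I have swapped the pointer casts for <code>memcpy</code> and <code>long</code> for <code>uint32_t</code>, on the assumption that <code>float</code> is a 32-bit IEEE 754 value. It is not the original Quake 3 source.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Hypothetical name; a sketch of the excerpt, not the original source. */
float rsqrt_sketch(float number)
{
    const float threehalfs = 1.5F;
    float x2 = number * 0.5F;
    float y = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);             /* read the bits of the float */
    i = 0x5f3759dfu - (i >> 1);           /* initial guess from the constant */
    memcpy(&y, &i, sizeof y);             /* treat the bits as a float again */
    y = y * (threehalfs - (x2 * y * y));  /* one Newton-Raphson step */
    return y;
}
```

<p>Calling <code>rsqrt_sketch(4.0F)</code> should give a value close to 0.5. The copy through <code>memcpy</code> is a design choice: it avoids reading a <code>float</code> through a pointer to a different type.</p>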

<h2>The Function to Model</h2>
<p><img src="http://www.jakevoytko.com/blog/wp-content/uploads/2008/01/invsqrt.png" alt="Inverse Square Root" /></p>
<h2>How Did the Authors Think of This?</h2>
<p>Interestingly, this method is nearly identical to one from a mathematical text called &#8220;<a href="http://www.amazon.com/gp/redirect.html?ie=UTF8&amp;location=http%3A%2F%2Fwww.amazon.com%2FIntroduction-Numerical-Analysis-Kendall-Atkinson%2Fdp%2F0471624896%3Fie%3DUTF8%26s%3Dbooks%26qid%3D1201497518%26sr%3D8-1&amp;tag=jakvoyshom-20&amp;linkCode=ur2&amp;camp=1789&amp;creative=9325">An Introduction to Numerical Analysis</a>&#8221;, where there is an application exercise to compute the square root of a number, taking advantage of the storage of floating point numbers.</p>
<p>My Numerical Methods book for this semester contains the full derivation of the method from &#8220;An Introduction to Numerical Analysis&#8221;. It uses linear interpolation to produce an initial guess for Newton&#8217;s Method, and the result is provably accurate to under 4.7E-14 after four iterations. Chris Lomont&#8217;s paper goes into much more detail about how the &#8220;Quake 3&#8221; authors likely chose a suitable constant. Linear interpolation gives a fairly good guess, but it&#8217;s possible to take advantage of the way floating point numbers are stored to get a much better one. As you can see below, the guess isn&#8217;t linear, but actually fits the curve very well without any iterations of the Newton-Raphson method.</p>
<p>The initial guess is very good. How good? It nearly overlaps the function. The guess is added in red:</p>
<p><img src="http://www.jakevoytko.com/blog/wp-content/uploads/2008/01/invsqrt_vs_constant.png" alt="Inverse Square Root With Constant" /></p>
<p>I was going to compare the output of the Quake 3 method with the real output, but it was difficult to find a view where there was any noticeable difference, so suffice it to say that it is very close.</p>
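<p>To put a rough number on how good the guess is, here is a small sketch that computes only the initial guess and skips the Newton-Raphson step. The helper names <code>rsqrt_guess</code> and <code>guess_residual</code> are hypothetical, and the code assumes a 32-bit IEEE 754 <code>float</code>. If the guess were perfect, <code>number * y * y</code> would be exactly 1, so the residual shows how far off the guess is.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Initial guess only, with no Newton-Raphson step. Hypothetical helper. */
float rsqrt_guess(float number)
{
    uint32_t i;
    float y;

    memcpy(&i, &number, sizeof i);  /* read the bits of the float */
    i = 0x5f3759dfu - (i >> 1);     /* same constant as the Quake 3 code */
    memcpy(&y, &i, sizeof y);       /* treat the bits as a float again */
    return y;
}

/* If y were exactly 1/sqrt(number), number * y * y would be exactly 1. */
float guess_residual(float number)
{
    float y = rsqrt_guess(number);
    return number * y * y - 1.0F;
}
```

<p>For an input of 4, the guess comes out a little over 0.48, against a true value of 0.5.</p>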
<h2>Some of the Math</h2>
<p>We are trying to find a quick approximation for the function <img src='/blog/wp-content/plugins/latexrender/pictures/49183d94955ae5740aa3ce519cf8b009_2.94444pt.gif' title='$y = x^{\frac{-1}{2}}$' alt='$y = x^{\frac{-1}{2}}$'  style="vertical-align:-2.94444pt;" >. This can be rearranged as <img src='/blog/wp-content/plugins/latexrender/pictures/54d39f0f031540c16765d5ff80000bd9.gif' title='$0 = y^{-2} - x$' alt='$0 = y^{-2} - x$'  align=absmiddle>. We want to find the roots of <img src='/blog/wp-content/plugins/latexrender/pictures/a68a2511f13494e2ba44cc046def78dd.gif' title='$F(y) = y^{-2} - x$' alt='$F(y) = y^{-2} - x$'  align=absmiddle>, which are +/- 1/<img src='/blog/wp-content/plugins/latexrender/pictures/f52a893986e9de0715d949144c08ca0b_4.05008pt.gif' title='$\sqrt(x)$' alt='$\sqrt(x)$'  style="vertical-align:-4.05008pt;" >; the positive root is the value we are after. Note that <img src='/blog/wp-content/plugins/latexrender/pictures/04f26fcbdfc34c1cd8d0688f942a15a8_3.5pt.gif' title='$F\prime(y) = -2 * y^{-3}$' alt='$F\prime(y) = -2 * y^{-3}$'  style="vertical-align:-3.5pt;" >.</p>
<h2>Newton-Raphson Method</h2>
<p>Back in the days of Newton, all math had to be calculated by hand. Since it was often impossible to calculate the exact value of many results, approximations were needed.</p>
<p>The Newton-Raphson method is used to quickly approximate function roots. The basic idea is that we start off with a guess that we think is very close to the value of the root. We then take the tangent line to the function f(x) at that guess. Provided that f(x) is differentiable there and the tangent isn&#8217;t horizontal, we follow the tangent line to the X-axis, and the point where it crosses becomes our next guess. We then take the derivative at this new point and again follow the tangent line to the X-axis. Rinse and repeat until you get the precision you need.</p>
<p>It is important to note that this method doesn&#8217;t always work: it is not guaranteed to converge, and in fact, you could continue calculating intersections <em>ad infinitum </em>and never get any closer to having the right answer! Therefore, it is important in this instance to try to optimize the initial guess to have as little error as possible.</p>
<p>As with all good methods, this one has an easy-to-remember formula.  For a current approximation, <img src='/blog/wp-content/plugins/latexrender/pictures/d7084ce258ffe96f77e4f3647b250bbf_2.49998pt.gif' title='$x_n$' alt='$x_n$'  style="vertical-align:-2.49998pt;" >, we find <img src='/blog/wp-content/plugins/latexrender/pictures/bfa03e1b73d4cba50a3eef37c4f20d57_3.5pt.gif' title='$x(n+1)$' alt='$x(n+1)$'  style="vertical-align:-3.5pt;" > by:</p>
<p><img src='/blog/wp-content/plugins/latexrender/pictures/06cb14fcd30adbedf4908f6e5af49555.gif' title='$x_{n+1} = x_{n} - \frac{f(x_{n})}{f\prime(x_{n})}$' alt='$x_{n+1} = x_{n} - \frac{f(x_{n})}{f\prime(x_{n})}$'  align=absmiddle></p>
<p>For those interested, the derivation can be found here. For the mildly interested, it is derived by cutting off the Taylor Series of the function after its first-order term and solving for the root of what is left.</p>
<p>So to do the Newton-Raphson approximation on a differentiable function, we need one thing:</p>
<ol>
<li>An initial guess of the root. The closer, the better.</li>
</ol>
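<p>As a rough illustration of the method described above, here is a minimal generic sketch. The names <code>newton_sketch</code>, <code>example_f</code> and <code>example_fprime</code> are hypothetical, and the example function x&#178; - 2 is only a stand-in picked because its root is easy to check. The function and its derivative are passed in as function pointers.</p>

```c
/* Repeats x_{n+1} = x_n - f(x_n) / f'(x_n) a fixed number of times.
 * Hypothetical helper: a sketch, with no convergence check. */
double newton_sketch(double (*f)(double), double (*fprime)(double),
                     double guess, int iterations)
{
    double x = guess;
    for (int n = 0; n < iterations; n++) {
        double slope = fprime(x);
        if (slope == 0.0) {
            break;  /* horizontal tangent: it never reaches the X-axis */
        }
        x = x - f(x) / slope;
    }
    return x;
}

/* Stand-in example: f(x) = x^2 - 2, which has a root at sqrt(2). */
double example_f(double x)      { return x * x - 2.0; }
double example_fprime(double x) { return 2.0 * x; }
```

<p>Calling <code>newton_sketch(example_f, example_fprime, 1.0, 6)</code> starts from a guess of 1 and should land very close to the square root of 2. A fixed iteration count keeps the sketch short; it does not detect the cases where the method fails to converge.</p>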
<h2>Actual Iterative Derivation</h2>
<p>One small nitpick I had with Chris Lomont&#8217;s paper was that it skipped the actual derivation of the iterative function, so here it is:</p>
<blockquote><p><img src='/blog/wp-content/plugins/latexrender/pictures/f5949838267d957b802074b070c17e9f.gif' title='$y_{n+1} = y_{n} - \frac{f(y_{n})}{f\prime(y_{n})}$' alt='$y_{n+1} = y_{n} - \frac{f(y_{n})}{f\prime(y_{n})}$'  align=absmiddle><br />
<img src='/blog/wp-content/plugins/latexrender/pictures/f9908df14b5c1e7f06684410f25b2002.gif' title='$y_{n+1} = y_{n} - \frac{y_{n}^{-2} - x}{-2*y_{n}^{-3}}$' alt='$y_{n+1} = y_{n} - \frac{y_{n}^{-2} - x}{-2*y_{n}^{-3}}$'  align=absmiddle><br />
<img src='/blog/wp-content/plugins/latexrender/pictures/a62806df07c758d1d7f049e224e52870.gif' title='$y_{n+1} = y_{n} + y_{n}^{-2} * \frac{y_{n}^{3}}{2} - \frac{x*y_{n}^{3}}{2}$' alt='$y_{n+1} = y_{n} + y_{n}^{-2} * \frac{y_{n}^{3}}{2} - \frac{x*y_{n}^{3}}{2}$'  align=absmiddle><br />
<img src='/blog/wp-content/plugins/latexrender/pictures/28f4071f9a8337fda53f615073bd4605.gif' title='$y_{n+1} = y_{n} + \frac{y_{n}}{2} - \frac{x*y_{n}^{3}}{2}$' alt='$y_{n+1} = y_{n} + \frac{y_{n}}{2} - \frac{x*y_{n}^{3}}{2}$'  align=absmiddle><br />
<img src='/blog/wp-content/plugins/latexrender/pictures/f09d9a2922ad41eff6ae66d44b1eac44.gif' title='$y_{n+1} = \frac{3y_{n}}{2} - \frac{x*y_{n}^{3}}{2}$' alt='$y_{n+1} = \frac{3y_{n}}{2} - \frac{x*y_{n}^{3}}{2}$'  align=absmiddle><br />
<img src='/blog/wp-content/plugins/latexrender/pictures/9e891513a659c3cb3b31004103aa1d9f.gif' title='$y_{n+1} = y_{n} * (1.5 - (x/2) y_{n}^{2})$' alt='$y_{n+1} = y_{n} * (1.5 - (x/2) y_{n}^{2})$'  align=absmiddle></p></blockquote>
<p>When we use a single variable, <em>y</em>, for both &#8220;<img src='/blog/wp-content/plugins/latexrender/pictures/c3a8057857fabfcea20140f7c90a76a7_2.94444pt.gif' title='$y_{n}$' alt='$y_{n}$'  style="vertical-align:-2.94444pt;" >&#8221; and &#8220;<img src='/blog/wp-content/plugins/latexrender/pictures/319afcef79efd357fc57aba5ad0dc553_3.33333pt.gif' title='$y_{n+1}$' alt='$y_{n+1}$'  style="vertical-align:-3.33333pt;" >&#8221; (since the function uses the same variable to store the current and next guess), we find the following:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">y <span style="color: #000080;">=</span> y <span style="color: #000040;">*</span> <span style="color: #008000;">&#40;</span><span style="color:#800080;">1.5</span> <span style="color: #000040;">-</span> <span style="color: #008000;">&#40;</span>x<span style="color: #000040;">/</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> y <span style="color: #000040;">*</span> y<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre></div></div>

<p>Which looks awfully familiar.</p>
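<p>To see the derived update at work without the bit trick, here is a small sketch that applies it repeatedly to a deliberately rough starting guess. The names <code>rsqrt_step</code> and <code>rsqrt_iterate</code> are hypothetical, and I use <code>double</code> purely for convenience. Here x is the input and y is the current guess for 1/&#8730;x.</p>

```c
/* One step of y_{n+1} = y_n * (1.5 - (x/2) * y_n^2). Hypothetical helper. */
double rsqrt_step(double x, double y)
{
    return y * (1.5 - (x / 2.0) * y * y);
}

/* Applies the step a fixed number of times, starting from the given guess. */
double rsqrt_iterate(double x, double guess, int iterations)
{
    double y = guess;
    for (int n = 0; n < iterations; n++) {
        y = rsqrt_step(x, y);
    }
    return y;
}
```

<p>Starting from a guess of 0.4 for x = 4, the first three steps give roughly 0.472, 0.4977 and 0.49998, closing in on the true value of 0.5. A better starting guess, like the one from the constant, needs fewer steps.</p>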
]]></content:encoded>
			<wfw:commentRss>http://www.jakevoytko.com/blog/2008/01/28/quake-3s-fast-square-root-function/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>
