<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Fun With String Searching</title>
	<atom:link href="http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/</link>
	<description>Ye Olde Computer Science Blogge</description>
	<lastBuildDate>Mon, 18 Jan 2010 23:12:49 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: eran</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-9905</link>
		<dc:creator>eran</dc:creator>
		<pubDate>Tue, 16 Jun 2009 10:57:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-9905</guid>
		<description>Excellent presentation!
The hardest thing in math is to explain things simply, not just to understand them, and in this you are a winner.
As said before, and more relevant to my case, Rabin-Karp is the best for multi-pattern searches (see Wikipedia).</description>
		<content:encoded><![CDATA[<p>Excellent presentation!<br />
The hardest thing in math is to explain things simply, not just to understand them, and in this you are a winner.<br />
As said before, and more relevant to my case, Rabin-Karp is the best for multi-pattern searches (see Wikipedia).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Preslav Rachev</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-9848</link>
		<dc:creator>Preslav Rachev</dc:creator>
		<pubDate>Sun, 19 Oct 2008 19:03:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-9848</guid>
		<description>A truly great post. Thanks for the good explanation. My Data Structures professor could not explain that topic properly. Since I am mostly using C#, I tend to avoid the C/C++ way of thinking, but after I read your post, I felt really comfortable.

Thanks once again</description>
		<content:encoded><![CDATA[<p>A truly great post. Thanks for the good explanation. My Data Structures professor could not explain that topic properly. Since I am mostly using C#, I tend to avoid the C/C++ way of thinking, but after I read your post, I felt really comfortable.</p>
<p>Thanks once again</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: josh</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-1702</link>
		<dc:creator>josh</dc:creator>
		<pubDate>Thu, 13 Dec 2007 00:41:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-1702</guid>
		<description>As written, Rabin-Karp is basically the linear search except you&#039;re comparing 4 characters (sort of) at a time.  Without bothering to read up on it, I imagine it could win out if you&#039;re searching a fixed text for a large number of substrings.  (Assuming you build an actual hash table.)

IIRC both Boyer-Moore and KMP optimize the other way, for finding one string in many texts or at least with a much larger haystack than needle.  That&#039;s the common case, but not the only case.</description>
		<content:encoded><![CDATA[<p>As written, Rabin-Karp is basically the linear search except you&#8217;re comparing 4 characters (sort of) at a time.  Without bothering to read up on it, I imagine it could win out if you&#8217;re searching a fixed text for a large number of substrings.  (Assuming you build an actual hash table.)</p>
<p>IIRC both Boyer-Moore and KMP optimize the other way, for finding one string in many texts or at least with a much larger haystack than needle.  That&#8217;s the common case, but not the only case.</p>
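<p>A minimal sketch of the multi-pattern case described above, in Python rather than the post's C++. It assumes every pattern is non-empty and has the same length; the name find_any, the base 256, and the modulus 1_000_003 are illustrative choices, not taken from the post's sources. Each pattern's hash goes into a table once, and the text is then scanned a single time with a rolling hash.</p>

```python
def find_any(text, patterns):
    """Return (index, pattern) for the earliest match of any pattern, or None.

    Assumes all patterns are non-empty and share one length.
    """
    m = len(patterns[0])
    base, mod = 256, 1_000_003  # illustrative constants

    def poly_hash(s):
        h = 0
        for ch in s:
            h = (h * base + ord(ch)) % mod
        return h

    # Hash table: hash value -> set of patterns with that hash.
    table = {}
    for p in patterns:
        table.setdefault(poly_hash(p), set()).add(p)

    high = pow(base, m - 1, mod)  # weight of the window's leading character
    window = poly_hash(text[:m])
    for start in range(len(text) - m + 1):
        bucket = table.get(window)
        if bucket is not None:
            # Equal hashes only suggest a match, so compare the real text.
            candidate = text[start:start + m]
            if candidate in bucket:
                return start, candidate
        if start + m != len(text):
            # Roll the window: drop text[start], append text[start + m].
            window = ((window - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return None
```

<p>Each text position costs one table lookup regardless of how many patterns there are, which is where the advantage over running a single-pattern search once per pattern would come from.</p>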
]]></content:encoded>
	</item>
	<item>
		<title>By: Andreas Bernauer</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-1688</link>
		<dc:creator>Andreas Bernauer</dc:creator>
		<pubDate>Wed, 12 Dec 2007 13:51:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-1688</guid>
		<description>I could improve Rabin-Karp by replacing the hash function with the simplest one: just adding all characters. 

Of course, this results in a lot of hash equalities where there is no match, but the extra brute_force_match calls still cost less than the overhead for the real hash value.</description>
		<content:encoded><![CDATA[<p>I could improve Rabin-Karp by replacing the hash function with the simplest one: just adding all characters. </p>
<p>Of course, this results in a lot of hash equalities where there is no match, but the extra brute_force_match calls still cost less than the overhead for the real hash value.</p>
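<p>A minimal sketch of the additive-hash variant described above, in Python rather than the post's C++. The name additive_rabin_karp is hypothetical, and a plain slice comparison stands in for the post's brute_force_match. The window hash is just the sum of character codes, so rolling it forward takes one addition and one subtraction.</p>

```python
def additive_rabin_karp(text, pattern):
    """Return the index of the first match of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    target = sum(map(ord, pattern))
    window = sum(map(ord, text[:m]))
    for start in range(n - m + 1):
        # Equal sums only suggest a match (any anagram collides),
        # so confirm with a direct comparison.
        if window == target and text[start:start + m] == pattern:
            return start
        if start + m != n:
            # Roll the window: add the incoming character, drop the outgoing one.
            window += ord(text[start + m]) - ord(text[start])
    return -1
```

<p>For example, searching "abcab" for "cab" hits two false positives ("abc" and "bca" have the same sum) before the real match at index 2.</p>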
]]></content:encoded>
	</item>
	<item>
		<title>By: Andreas Bernauer</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-1687</link>
		<dc:creator>Andreas Bernauer</dc:creator>
		<pubDate>Wed, 12 Dec 2007 13:02:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-1687</guid>
		<description>Hi Jake, thanks for the post.  I found it interesting and checked out the sources you provided to play a little bit. I found two bugs you might be interested in:

- You read in the files with getline(), which discards the newlines. This has two effects: (a) the reported match position does not help you find the string in the file, and (b) Boyer-Moore isn&#039;t really &quot;tortured&quot;, as the match is already at the beginning of the file. You can see this from your posted results: the match is already at position 110,154 of 11,015,556 characters. Using read() to read the file avoids both problems and also speeds up reading the large files:
	string result;
	static const int buffer_size = 1 </description>
		<content:encoded><![CDATA[<p>Hi Jake, thanks for the post.  I found it interesting and checked out the sources you provided to play a little bit. I found two bugs you might be interested in:</p>
<p>- You read in the files with getline(), which discards the newlines. This has two effects: (a) the reported match position does not help you find the string in the file, and (b) Boyer-Moore isn&#8217;t really &#8220;tortured&#8221;, as the match is already at the beginning of the file. You can see this from your posted results: the match is already at position 110,154 of 11,015,556 characters. Using read() to read the file avoids both problems and also speeds up reading the large files:<br />
	string result;<br />
	static const int buffer_size = 1</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chief</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-1681</link>
		<dc:creator>Chief</dc:creator>
		<pubDate>Wed, 12 Dec 2007 07:58:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-1681</guid>
		<description>I&#039;d be interested to know how your algorithms perform if all the multiply operations in Rabin-Karp are replaced with bit shifts.

So,
ALPHABET_SIZE = 256;
x = ... y </description>
		<content:encoded><![CDATA[<p>I&#8217;d be interested to know how your algorithms perform if all the multiply operations in Rabin-Karp are replaced with bit shifts.</p>
<p>So,<br />
ALPHABET_SIZE = 256;<br />
x = &#8230; y</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chii</title>
		<link>http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/comment-page-1/#comment-1679</link>
		<dc:creator>Chii</dc:creator>
		<pubDate>Wed, 12 Dec 2007 07:29:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.jakevoytko.com/blog/2007/12/11/fun-with-string-searching/#comment-1679</guid>
		<description>The third algorithm is very interesting! I wonder why it works...</description>
		<content:encoded><![CDATA[<p>The third algorithm is very interesting! I wonder why it works&#8230;</p>
]]></content:encoded>
	</item>
</channel>
</rss>
