<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jonas K. Sekamane &#187; technology</title>
	<atom:link href="http://jonas.sekamane.com/tag/technology/feed/" rel="self" type="application/rss+xml" />
	<link>http://jonas.sekamane.com</link>
	<description>...</description>
	<lastBuildDate>Fri, 13 Aug 2010 22:33:06 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Mandag Morgen RSS – courtesy of Feed43</title>
		<link>http://jonas.sekamane.com/2010/08/mandag-morgen-rss-courtesy-of-feed4/</link>
		<comments>http://jonas.sekamane.com/2010/08/mandag-morgen-rss-courtesy-of-feed4/#comments</comments>
		<pubDate>Fri, 13 Aug 2010 22:29:42 +0000</pubDate>
		<dc:creator>Jonas K. Sekamane</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Feed43]]></category>
		<category><![CDATA[hack]]></category>
		<category><![CDATA[mandagmorgen.dk]]></category>
		<category><![CDATA[mm.dk]]></category>
		<category><![CDATA[Ponyfish]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://jonas.sekamane.com/?p=116</guid>
		<description><![CDATA[I consume through Google Reader.
I used to question the benefits of RSS.
That was then, and this is now.
And now I hardly read an online article unless it has somehow made its way through my RSS Reader. The work required to stay up to date otherwise is just too tedious.
Luckily it has now become customary to [...]]]></description>
			<content:encoded><![CDATA[<p>I consume through Google Reader.<a href="http://feed43.com/"><img class="alignright" title="Feed43" src="http://feed43.com/res/logo.gif" alt="" width="177" height="45" /></a><br />
I used to question the benefits of RSS.<br />
That was then, and this is now.</p>
<p>And now I hardly read an online article unless it has somehow made its way through my RSS Reader. The work required to stay up to date otherwise is just too tedious.</p>
<p>Luckily it has now become customary to provide an RSS feed along with one&#8217;s content. But some, for some mysterious reason, still don&#8217;t follow this code of practice. For those rare instances <a href="http://feed43.com/"><strong>Feed43</strong></a> is a <em>&#8220;free online service that converts any web page to an RSS feed on the fly.&#8221;</em></p>
<p>Using this service I created RSS feeds for <a title="Mandag Morgen Danmark" href="https://www.mm.dk/">mm.dk</a> and <a title="Mandag Morgen Norge" href="http://mandagmorgen.no/">mandagmorgen.no</a>:</p>
<ul>
<li><strong><a href="http://feed43.com/mm-dk.xml">http://feed43.com/mm-dk.xml</a></strong></li>
<li><strong><a href="http://feed43.com/mandagmorgen-no.xml">http://feed43.com/mandagmorgen-no.xml</a></strong></li>
</ul>
<h3>The technical setup</h3>
<p>I tried a couple of other services first, some of them much easier to set up. But as far as I could tell, none of them could be customized (free of charge) to capture all the articles. On mm.dk the markup for the top article and &#8220;<em>ugens graf</em>&#8221; (the graph of the week) is slightly different from the rest of the articles, and the URLs do not look similar enough. For those two reasons the other services came up short. If the page you want to convert is more uniform, I would recommend using <a title="Ponyfish - Create RSS feeds" href="https://www.ponyfish.com/">Ponyfish</a>.</p>
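<p>But using Feed43 was quite straightforward once I got the syntax, which comes down to two placeholders: the <em>parameter</em> ({%}), which captures the text it matches, and the <em>joker</em> ({*}), which skips over it.</p>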
<p><strong>Global Search Pattern:</strong></p>
<pre>&lt;div id="content"&gt;<span style="color: #ff0000;">{%}</span>&lt;div id="sidebarLeft"</pre>
<p><strong>Item (repeatable) Search Pattern</strong><strong>:</strong></p>
<pre>&lt;h<span style="color: #ff0000;">{*}</span>&gt;&lt;a href="<span style="color: #ff0000;">{%}</span>"&gt;<span style="color: #ff0000;">{%}</span>&lt;/a&gt;&lt;/h<span style="color: #ff0000;">{*}</span>&gt;
<span style="color: #ff0000;">{*}</span>
&lt;p class="content"&gt;<span style="color: #ff0000;">{%}</span>&lt;/p&gt;</pre>
<p>You use the Global Search Pattern to limit the area of interest, so to speak. In the case of mm.dk I took advantage of the three-column design, limiting my search to the middle column. You then use the Item Search Pattern to narrow things down further, so that you are left with only the parameters (title, URL, description). The pattern is searched for repeatedly until there are no more matching items. I wanted the headings (&lt;h1&gt; and &lt;h2&gt;), and with the joker I was able to catch both. The paragraph was more straightforward because of the class name. And out comes:</p>
<pre><span style="color: #ff0000;">{%1} = </span>https://www.mm.dk/drejebog-til-et-dronningemord
<span style="color: #ff0000;">{%2} = </span>Drejebog til et&amp;nbsp;dronningemord
<span style="color: #ff0000;">{%3} = </span>&lt;strong&gt;MM Perspektiv: &lt;/strong&gt; &lt;span&gt;De danske medier&lt;/span&gt; har med sine
       spinkelt underbyggede vedvarende angreb på Lene Espersen, undergravet
       deres&amp;nbsp;troværdighed.</pre>
<p>Beautiful.</p>
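<p>For anyone who wants to see the idea in code, here is a minimal Python sketch of how the two patterns could work together. It is not Feed43&#8217;s actual implementation: the helper names (pattern_to_regex, extract_items), the sample page and the lenient whitespace handling are all assumptions made for illustration. Only the two pattern strings are taken from above.</p>

```python
import re

# The two patterns from the post. {%} is a parameter (captured),
# {*} is a joker (skipped).
GLOBAL_PATTERN = '<div id="content">{%}<div id="sidebarLeft"'
ITEM_PATTERN = '<h{*}><a href="{%}">{%}</a></h{*}>\n{*}\n<p class="content">{%}</p>'

# Hypothetical page, invented for this example (not the real mm.dk markup).
SAMPLE_PAGE = """
<div id="sidebarRight">ignored</div>
<div id="content">
<h1><a href="https://example.com/first-article">First article</a></h1>
<p class="content">Top story summary.</p>
<h2><a href="https://example.com/second-article">Second article</a></h2>
<p class="byline">By someone</p>
<p class="content">Second summary.</p>
</div>
<div id="sidebarLeft">ignored</div>
"""


def pattern_to_regex(pattern):
    """Translate a Feed43-style pattern into a compiled regular expression."""
    parts = []
    for token in re.split(r"(\{%\}|\{\*\}|\s+)", pattern):
        if not token:
            continue  # re.split yields empty strings between adjacent separators
        if token == "{%}":
            parts.append("(.*?)")  # parameter: capture as little as possible
        elif token == "{*}":
            parts.append(".*?")  # joker: skip as little as possible
        elif token.isspace():
            parts.append(r"\s*")  # assumption: whitespace in a pattern is optional
        else:
            parts.append(re.escape(token))  # everything else must match literally
    return re.compile("".join(parts), re.DOTALL)


def extract_items(page, global_pattern, item_pattern):
    """Return one (url, title, description) tuple per item found on the page."""
    # Step 1: the global pattern narrows the page down to the region of interest.
    region_match = pattern_to_regex(global_pattern).search(page)
    if region_match is None:
        return []
    region = region_match.group(1)
    # Step 2: the item pattern is applied repeatedly until nothing more matches.
    return pattern_to_regex(item_pattern).findall(region)


items = extract_items(SAMPLE_PAGE, GLOBAL_PATTERN, ITEM_PATTERN)
for url, title, description in items:
    print(url, "|", title, "|", description)
```

<p>In this sketch the joker inside the heading tag lets the same pattern match both the &lt;h1&gt; and the &lt;h2&gt; item, and the joker on its own line skips whatever sits between a heading and its paragraph.</p>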
]]></content:encoded>
			<wfw:commentRss>http://jonas.sekamane.com/2010/08/mandag-morgen-rss-courtesy-of-feed4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Crowds teach computers to read the scanned text</title>
		<link>http://jonas.sekamane.com/2009/09/crowds-teach-computers-to-read-the-scanned-text/</link>
		<comments>http://jonas.sekamane.com/2009/09/crowds-teach-computers-to-read-the-scanned-text/#comments</comments>
		<pubDate>Wed, 16 Sep 2009 19:21:24 +0000</pubDate>
		<dc:creator>Jonas K. Sekamane</dc:creator>
				<category><![CDATA[Innovative Business Models]]></category>
		<category><![CDATA[crowdsourcing]]></category>
		<category><![CDATA[digitize]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[reCAPTCHA]]></category>
		<category><![CDATA[scanned text]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://jonas.sekamane.com/?p=40</guid>
		<description><![CDATA[From the Google Acquires reCAPTCHA article at Mashable.com:
Why exactly does Google want to own this technology?
&#8230; many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books. Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, [...]]]></description>
			<content:encoded><![CDATA[<p>From the <a title="Google Acquires reCAPTCHA" href="http://mashable.com/2009/09/16/google-acquires-recaptcha/"><strong>Google Acquires reCAPTCHA</strong></a> article at Mashable.com:</p>
<blockquote><p><strong>Why exactly does Google want to own this technology?</strong></p>
<p style="margin-bottom: 1em; margin-top: 0px; margin-right: 0px; margin-left: 0px; line-height: 1.5; text-align: justify; padding: 0px;">&#8230; many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books. Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.</p>
<p style="margin-bottom: 1em; margin-top: 0px; margin-right: 0px; margin-left: 0px; line-height: 1.5; text-align: justify; padding: 0px;">&#8230; those 100,000+ captcha forms are now Google-powered, with the data being used to improve Google’s ability to digitize old books and newspapers to make them Web searchable. It makes a lot of sense, and gives Google yet another strategic advantage over would-be competitors.</p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://jonas.sekamane.com/2009/09/crowds-teach-computers-to-read-the-scanned-text/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
