Thursday, May 28, 2009

Retrieving information from RSS in .NET

In my earlier post I gave an overview of RSS (Really Simple Syndication) feeds and how any user can use them as a source of information. Today I am going to talk about how you, as a developer, can use RSS feeds to fetch information for your application. What you can retrieve depends on what the feed itself provides.

Before we start retrieving information from RSS, let’s take a high-level look at the RSS format. RSS is an XML-based data structure, as shown in the sample below:

<rss version="2.0">
<channel>
<title>Composed Crap</title>
<link>http://composedcrap.blogspot.com</link>
<description></description>
<item>
<title>RSS for .NET Developers</title>
<link>http://composedcrap.blogspot.com/2009/05/rss-for-.Net-developers.html</link>
<description>... content of the RSS feed... </description>
</item>
</channel>
</rss>


For more details on the RSS structure, see the RSS 2.0 specification or search online; detailed descriptions are easy to find.
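
To make that structure concrete, here is a minimal sketch of reading such a document by hand with LINQ to XML (System.Xml.Linq, available from .NET 3.5). The class and method names (ManualRssSketch, ReadItems) are made up for illustration and are not part of any toolkit. It assumes a plain RSS 2.0 feed, where the item elements carry no XML namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ManualRssSketch
{
    // Returns "title | link" for every <item> element in the document.
    public static List<string> ReadItems(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);
        return doc.Descendants("item")
            .Select(item =>
                // The (string) cast yields null when the element is missing.
                ((string)item.Element("title") ?? "").Trim()
                + " | "
                + ((string)item.Element("link") ?? "").Trim())
            .ToList();
    }

    public static void Main()
    {
        // Same shape as the sample feed shown earlier in this post.
        string sample =
            @"<rss version=""2.0"">
                <channel>
                  <title>Composed Crap</title>
                  <link>http://composedcrap.blogspot.com</link>
                  <item>
                    <title>RSS for .NET Developers</title>
                    <link>http://composedcrap.blogspot.com/2009/05/rss-for-.Net-developers.html</link>
                  </item>
                </channel>
              </rss>";

        foreach (string line in ReadItems(sample))
        {
            Console.WriteLine(line);
        }
    }
}
```

This works for simple feeds, but every extra field, namespace, or feed format means more hand-written code.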

Because the information in an RSS feed is in XML format, it can be parsed easily to fetch the required details. You don’t have to take on the pain of parsing it yourself, though. Just download the ASP.NET RSS Toolkit from http://www.codeplex.com/ASPNETRSSToolkit and you are ready to go.

Once you have downloaded the toolkit, add a reference to the RssToolkit library in the .NET project where you want to retrieve information from the feed.

Here is the code for retrieving the feed items from a particular feed. I am using the Composed Crap feed as an example:

// At the top of the file (RssItem lives in the RssToolkit.Rss namespace):
using System;
using RssToolkit.Rss;

// Retrieve the RSS from the site and load it into an RssDocument
RssDocument rss = RssDocument.Load(new Uri("http://composedcrap.blogspot.com/feeds/posts/default"));
// For each item in the RSS, retrieve the description and write it to the console
foreach (RssItem item in rss.Channel.Items)
{
    Console.WriteLine(item.Description);
}


Many RSS providers do not put the complete text of an item in its description. If you want the complete text in that case, check for the <content:encoded> tag in the XML source of the feed, which you can see by viewing the page source of the feed in a browser. Some sites keep the description short but still publish the complete content through this tag, so that it can be used by other applications and not only shown in a feed reader.

Note: If a site does not provide the complete description of its feed items, that does not guarantee the content will be available through the <content:encoded> tag. If the site provides the complete information in neither place, you cannot retrieve the complete details using the RSS Toolkit alone.
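
Instead of reading the page source by eye, you can run the same check in code. This is a rough sketch using LINQ to XML rather than the toolkit. The names ContentEncodedCheck and CountItemsWithFullContent are hypothetical, and it assumes an RSS 2.0 feed that uses the standard content module namespace.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ContentEncodedCheck
{
    // Namespace URI of the RSS "content" module, which defines <content:encoded>.
    private static readonly XNamespace Content =
        "http://purl.org/rss/1.0/modules/content/";

    // Counts the <item> elements that carry a non-empty <content:encoded> child.
    public static int CountItemsWithFullContent(XDocument feed)
    {
        return feed.Descendants("item")
            .Count(item => !string.IsNullOrEmpty(
                (string)item.Element(Content + "encoded")));
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: pass a feed URL or file path as the only argument.");
            return;
        }

        // XDocument.Load accepts a URI string, so a feed URL works here.
        XDocument feed = XDocument.Load(args[0]);
        int total = feed.Descendants("item").Count();
        int full = CountItemsWithFullContent(feed);
        Console.WriteLine("{0} of {1} items include <content:encoded>", full, total);
    }
}
```

If the count is zero, the feed only offers what is in its description elements.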

To retrieve the information from the <content:encoded> tag, you first need to add an Encode property to the RSS Toolkit.

  1. Open the RssToolkit solution from the Source folder of the downloaded toolkit.
  2. Open the RssItem.cs file and add the following property:
    // Maps <content:encoded> (RSS content module namespace) to this property
    private string _encode;

    [XmlElement("encoded", Namespace = "http://purl.org/rss/1.0/modules/content/")]
    public string Encode
    {
        get { return _encode; }
        set { _encode = value; }
    }
  3. Build the RssToolkit project and update the reference in the .NET project that references the RssToolkit library.
  4. After updating the reference, you will be able to access the Encode property on the RssItem object, which provides the complete content of the item.
    RssToolkit.Rss.RssDocument rss = RssToolkit.Rss.RssDocument.Load(new System.Uri("http://www.bollywoodz.net/feed/"));
    foreach (RssItem item in rss.Channel.Items)
    {
        Console.WriteLine(item.Encode);
    }
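
To see why the attribute in step 2 is enough, here is a small standalone sketch that does not need the toolkit at all. It assumes the toolkit fills RssItem through XmlSerializer, which the [XmlElement] attribute suggests but which I have not confirmed in the toolkit source. SketchItem and EncodeMappingSketch are hypothetical stand-ins, not toolkit types.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the toolkit's RssItem, reduced to two fields.
[XmlRoot("item")]
public class SketchItem
{
    [XmlElement("description")]
    public string Description { get; set; }

    // Same attribute as in step 2: local name "encoded" in the content module namespace.
    [XmlElement("encoded", Namespace = "http://purl.org/rss/1.0/modules/content/")]
    public string Encode { get; set; }
}

public static class EncodeMappingSketch
{
    // Deserializes a single <item> fragment into a SketchItem.
    public static SketchItem Parse(string itemXml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SketchItem));
        using (StringReader reader = new StringReader(itemXml))
        {
            return (SketchItem)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string itemXml =
            @"<item xmlns:content=""http://purl.org/rss/1.0/modules/content/"">
                <description>Short summary</description>
                <content:encoded><![CDATA[<p>The complete post body</p>]]></content:encoded>
              </item>";

        SketchItem item = Parse(itemXml);
        Console.WriteLine(item.Description);
        Console.WriteLine(item.Encode);
    }
}
```

The serializer matches on the element's local name plus its namespace URI, not on the "content:" prefix, so the mapping keeps working even if a feed declares that namespace under a different prefix.
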
If you are looking to use information from any RSS feed in your .NET application, go ahead and start using the RSS Toolkit for .NET, following the details above.
