Thursday, August 27, 2009

YQL hack to get DZone popular articles on 1 page

The problem with the DZone feed is that each entry links to the DZone page rather than to the article itself, so there is no way to jump straight to the article and skip the DZone page. By the end of this article you will have a simple YQL-based hack that puts the DZone popular links on one page, each linked to the original blog page.
Note: if you are not interested in how I got there, jump to the Demo.
STEP-1
Using YQL, first get the popular links with the feed table (use the YQL console).
The URL of the popular links feed is http://feeds.dzone.com/dzone/frontpage
So my YQL query will be:
select * from feed where url='http://feeds.dzone.com/dzone/frontpage'

You will get the details of all popular articles, linked to DZone’s page (same as in the feed).
I am only interested in the links, so my YQL will be (replace * with link):
select link from feed where url='http://feeds.dzone.com/dzone/frontpage'
STEP-2
For each link retrieved in Step 1, fetch the HTML page behind it and get the title node using an XPath query.
From a simple DOM inspection I can see that the title link is wrapped in a DIV element whose class name is ‘ldTitle’.
select * from html where url in (select link from feed where url='http://feeds.dzone.com/dzone/frontpage') and xpath='//div[@class="ldTitle"]/a'
Select XML as the format and grab the REST URL, which will look like this:
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%20in%20(select%20link%20from%20feed%20where%20url%3D'http%3A%2F%2Ffeeds.dzone.com%2Fdzone%2Ffrontpage')%20and%0A%20%20%20%20%20%20xpath%3D'%2F%2Fdiv%5B%40class%3D%22ldTitle%22%5D%2Fa'&format=xml
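
The REST URL above is essentially the YQL query, URL-encoded and appended to the public endpoint along with a format parameter. As a rough sketch of how the same URL could be built in JavaScript rather than copied from the console: buildYqlUrl and YQL_ENDPOINT are hypothetical names of mine, and the output will not contain the stray line break (%0A) that the console inserts.

```javascript
// Sketch only: buildYqlUrl is a hypothetical helper, not part of YQL.
// The endpoint is the one that appears in the REST URL from the console.
var YQL_ENDPOINT = 'http://query.yahooapis.com/v1/public/yql';

function buildYqlUrl(query, format, callback) {
  // encodeURIComponent leaves * ' ( ) untouched, as the console URL does.
  var url = YQL_ENDPOINT + '?q=' + encodeURIComponent(query) +
            '&format=' + encodeURIComponent(format);
  if (callback) {
    // Optional JSONP callback name.
    url += '&callback=' + encodeURIComponent(callback);
  }
  return url;
}

var query = "select * from html where url in " +
  "(select link from feed where url='http://feeds.dzone.com/dzone/frontpage') " +
  "and xpath='//div[@class=\"ldTitle\"]/a'";

var restUrl = buildYqlUrl(query, 'xml');
```

Passing a third argument appends a callback parameter to the same URL.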

STEP-3
Now it is time for some tweaking. If I get this data as XML, I have to parse it. If I get it as JSON, JavaScript does all the parsing for me. That's why I love JSON.
But if I get everything as JSON, I have to regenerate the DOM with JavaScript code, and I am too lazy to do even that. So I want the result wrapped in JSON with the DOM section left as it is (i.e. in XML).
I can do this by adding a callback argument to the REST URL above while keeping format=xml.
So my REST URL will be:

http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%20in%20(select%20link%20from%20feed%20where%20url%3D'http%3A%2F%2Ffeeds.dzone.com%2Fdzone%2Ffrontpage')%20and%0A%20%20%20%20%20%20xpath%3D'%2F%2Fdiv%5B%40class%3D%22ldTitle%22%5D%2Fa'%0A%0A&format=xml&callback=loadlist

My callback is named ‘loadlist’, so let's implement it.
It looks like this:
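function loadlist(data)
{
    var content_div = document.getElementById('content');
    // With format=xml plus a callback, the response is a JSON wrapper whose
    // top-level "results" array holds each matched node as an XML string.
    if (!data || !data.query || !data.results)
    {
        content_div.innerHTML = "Error: failed to load";
    }
    else
    {
        var html = "";
        // Each entry is already anchor markup, so append it as-is.
        for (var i = 0; i < data.results.length; i++)
        {
            html = html + data.results[i] + "<br/>";
        }
        content_div.innerHTML = html;
    }
}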
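
For the callback to fire, the REST URL has to be loaded as a script so that the response, which is a call to loadlist, gets executed by the browser. A minimal sketch of one way to do that, assuming a hypothetical helper named loadJsonp (my own name, not something from YQL) that takes the document as a parameter:

```javascript
// Sketch only: loadJsonp is a hypothetical helper.
// `doc` is normally the page's `document`; it is passed in explicitly
// so the function does not depend on a global.
function loadJsonp(url, doc) {
  var script = doc.createElement('script');
  script.type = 'text/javascript';
  script.src = url; // the browser fetches this and runs the returned callback call
  // Prefer <head>; fall back to the root element if the page has none.
  var parent = doc.getElementsByTagName('head')[0] || doc.documentElement;
  parent.appendChild(script);
  return script;
}
```

In a page this would be called as loadJsonp(url, document) after loadlist is defined, where url is the callback REST URL from Step 3. A plain script tag whose src is that URL, placed after the callback definition, should have the same effect.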


To see the full source code, visit this page and view its source. I have also implemented an extra function, ‘track’, for some purpose.

