G Duy 's Blog: August 2009

Thursday, August 27, 2009

YQL hack to get DZone popular articles on 1 page

The problem with the DZone feed is that every entry links to the DZone page, not to the article itself, so there is no way to jump straight to the article. By the end of this post you will have a simple YQL-based hack that puts the popular DZone links on one page, each linked to the original blog post.
Note: if you are not interested in how I got there, jump to the Demo.
STEP-1
First, use YQL's feed table to get the popular links (try it in the YQL console).
The URL of the popular links feed is http://feeds.dzone.com/dzone/frontpage
So my YQL query will be:
select * from feed where url='http://feeds.dzone.com/dzone/frontpage'

You will get the details of all the popular articles, each linked to DZone's page (the same as in the feed).
I am only interested in the links, so my YQL becomes (replace * with link):
select link from feed where url='http://feeds.dzone.com/dzone/frontpage'
STEP-2
For each link retrieved in step 1, fetch the HTML page behind it and pull out the title node with an XPath query.
A simple DOM inspection shows that the title node is wrapped in a DIV element whose class name is ‘ldTitle’.
select * from html where url in (select link from feed where url='http://feeds.dzone.com/dzone/frontpage') and xpath='//div[@class="ldTitle"]/a'
Select XML as the format and grab the REST URL,
which will look like this:

http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%20in%20(select%20link%20from%20feed%20where%20url%3D'http%3A%2F%2Ffeeds.dzone.com%2Fdzone%2Ffrontpage')%20and%0A%20%20%20%20%20%20xpath%3D'%2F%2Fdiv%5B%40class%3D%22ldTitle%22%5D%2Fa'&format=xml
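
The REST URL above is just the YQL query percent-encoded into the q parameter. As a rough sketch of that encoding step (the helper name build_yql_url is my own, and the exact set of characters the YQL console leaves unescaped is an assumption), it could be reproduced like this:

```python
from urllib.parse import quote

# Endpoint taken from the REST URL shown above.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, fmt="xml"):
    """Hypothetical helper: percent-encode a YQL query into a REST URL."""
    # Assumption: parentheses, single quotes and '*' are left unescaped,
    # as they appear to be in the URL above; everything else is encoded.
    encoded = quote(query, safe="()'*")
    return YQL_ENDPOINT + "?q=" + encoded + "&format=" + fmt


query = (
    "select * from html where url in "
    "(select link from feed where url='http://feeds.dzone.com/dzone/frontpage') "
    "and xpath='//div[@class=\"ldTitle\"]/a'"
)
url = build_yql_url(query)
print(url)
```

This only builds the string; it does not call the service.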

STEP-3
Now it is time to do some tweaking. If I get this data as XML, I have to parse it. If I get it as JSON, JavaScript does all the parsing for me. That's why I love JSON.
But if I get everything as JSON, I have to regenerate the DOM with JavaScript code, and I am too lazy to do even that. So I want the result wrapped in JSON with the DOM section left as it is (i.e. in XML).
I can do this by adding a callback argument to the REST URL above.
So my REST URL will be

My callback is named ‘loadlist’, so let's implement it.
It looks like this.

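// Callback invoked with the YQL response.
function loadlist(data)
{
    var content_div = document.getElementById('content');
    if (!data.query)
    {
        // No query section means the request failed.
        content_div.innerHTML = "error.. failed to load";
    }
    else
    {
        var i = 0;
        var html = "";
        // Each result is already an HTML fragment, so just join them.
        for (i = 0; i < data.query.count; i++)
        {
            html = html + data.results[i] + "<br/>";
        }
        content_div.innerHTML = html;
    }
}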

To see the full source code, visit this page and view its source. I have also implemented an extra function, ‘track’, for another purpose.

DEMO

Sunday, August 23, 2009

7 things I did on my blog, to make it a better place to read

Readability and accessibility are very important for a blog. You need to give your readers a few good reasons to come to your site. Now that technology has moved on so far, it is worth taking a fresh look at our blogs and thinking about a few additions. This blog has all 7 of the things I am going to discuss, and I consider it accessible in every form: a good theme for readability, easy to share, optimized for speed, accessible on mobile, translatable into the reader's own language and, last but not least, available to listen to.

1. Choose a better theme.
This is the most basic thing. A clean and neat theme is very important for a blog. Black text on a white background is the most appropriate choice. Do not add too many flashy advertisements or too much flashy content: they not only distract and confuse your readers, they also slow down loading. This article might help you choose a better theme.

2. Make it easy to share & comment.
Blogging is a form of social media, so it is very important to interact well with your readers. The comment form should offer several options, and users should not be forced to log in to leave a comment. If you don't want unwanted content posted by your readers, you can turn on the “comment approval” option.
These days people love to share blog posts on various channels. Add buttons like addthis, dzone, tweetmeme (blogger, wordpress), digg (blogger, wordpress) etc. to make sharing easy.

3. Optimize for speed.
Too many buttons, a poor theme and gadgets can all slow down the loading time of your blog. Here are 3 things you can do to improve performance.

i. Put all unobtrusive scripts at the bottom of the page.
As you know, most gadgets and buttons are based on JavaScript, so adding too many of them can slow down your blog.

A JavaScript block freezes the loading of the rest of the content until the script has been completely loaded. So you can bring the readable content up much earlier by putting most of the JavaScript blocks at the bottom of the page. The question is, how do you decide what can go at the bottom? Any script that only executes once the entire page is loaded can be moved to the bottom of the page. In more technical terms, any unobtrusive code can be kept at the bottom of the page.

ii. Use a CDN (for better loading time).
One way to get a free CDN is Google App Engine. Read this article.

iii. Run tests like YSlow or Page Speed.
YSlow and Page Speed are plug-ins for Firefox (on top of Firebug). They run a test on a website and tell you what can be optimized. The list they produce is long and you cannot easily achieve everything on it, but even if you manage 50% of it your website will be much faster than most web pages. If you really want to see a highly optimized web page, look at the Yahoo home page. It has lots of content, yet its loading time is surprisingly low.

4. Automated podcast (sometimes listening is better).
If you are the kind of blogger who writes long, descriptive posts, an automated podcast gives your readers the option to “listen” rather than read. Odiogo offers a free, automated way to turn your blog text into a podcast. It has a built-in text-to-speech feature that narrates your blog.

5. Automatically translatable.
Automated translation is far from perfect, but it can be really handy for foreign readers. Put the Google Translate widget on your blog so that users can translate it easily.

6. Mobile readable.
Many people prefer to read blogs on their mobile. You can use Mobify to create a mobile version of your blog, and it takes only 3 easy steps.

7. Add some fun.
On my blog you can drag the sidebar gadgets around. This kind of functionality lets readers play with your blog.

Thanks for reading. I would love to see your comments.

Sunday, August 9, 2009

Top 11 Language Concepts That Every Developer Should Know

There are a few fundamental things we have invented in programming languages.
Each was invented at a different point in time, usually by one particular language, and was later adopted by many other languages.
I am going to list the ones that have had the biggest impact.

DataType
The first and foremost thing we ever invented is the data type. Computers were just dumb machines that could only understand binary sequences. A binary sequence made no sense until it was grouped to form a DataType.
As you know, a DataType groups a binary sequence together so that it represents some entity in the mathematical world or the real world. It all depends on the interpretation.
A DataType is not just a grouping of bits but also the set of operations it possesses. I mean, a DataType definition does not end with its binary layout; it also includes the operations that can be performed on those entities.
Later we evolved these DataTypes into more complex forms. We used mathematics to build complex data structures like List, Stack, Queue etc.
Today almost every programming language has, directly or indirectly, the concept of a data type.
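
To make the "it all depends on the interpretation" point concrete, here is a small sketch using Python's struct module. The four bytes are an arbitrary example of my own choosing; the same bits are read once as an integer and once as a floating point number.

```python
import struct

# Four raw bytes (little-endian). On their own they mean nothing.
raw = bytes([0x00, 0x00, 0x80, 0x3F])

# The data type decides how the same bits are interpreted.
as_int = struct.unpack("<I", raw)[0]    # unsigned 32-bit integer
as_float = struct.unpack("<f", raw)[0]  # 32-bit IEEE 754 float

print(as_int)    # 1065353216
print(as_float)  # 1.0
```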

Pointers
With the invention of computers, we also invented the concept of loading and storing data in hardware. A computer accesses memory by placing bit patterns on a set of wires (the address bus). This led to the invention of pointers in our high-level programming languages.
Pointers became especially popular in the C programming language, and they are one of the most popular programming concepts ever invented. Other than C, pointers are supported by languages like C++, C#, Fortran and Pascal. A few dialects of BASIC also support pointers.
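
Python does not expose pointers in the language itself, but its ctypes module can imitate C-style ones. The following is only an illustrative sketch of the idea (a variable that stores the address of another variable), not how you would normally write Python.

```python
import ctypes

value = ctypes.c_int(5)       # a C int stored somewhere in memory
ptr = ctypes.pointer(value)   # a pointer holding the address of that int

# Writing through the pointer changes the original variable.
ptr.contents.value = 42

print(value.value)                   # 42
print(hex(ctypes.addressof(value)))  # the address; differs on every run
```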

Structured Programming
Most programs in the early days relied completely on GOTO statements, which was a real mess and made programmers' lives hell. That is where we invented "structured programming", another fundamental programming concept. Through it we brought in something called functions, and subsequently we realized the power of abstraction.

OOP
I think OOP (Object Oriented Programming) is one of the most popular and longest-lasting fundamental concepts in the history of programming languages. OOP is an umbrella concept, and we have brought many ideas under it. Concepts like data hiding, inheritance, abstraction and polymorphism (static & dynamic) were just the beginning. OOP is available, directly or indirectly, in almost all modern languages. Languages like C++, Java & C# have taken it a long way.

Regular Expression
Regular expressions provide a concise and flexible means of identifying patterns in a string. They are used for searching for and replacing particular patterns in a string. If your favourite programming language supports regular expressions and you are still only thinking about learning them, now is the time to go and learn. Regular expressions are supported by almost all popular programming languages. They have also become the standard language of many find-and-replace system utilities; the Unix utility grep is one of the best-known regular-expression-based tools. Regular expressions became so popular that several programming languages made them part of their syntax. Languages like Perl, Ruby and Tcl embraced regular expressions as a built-in language feature.
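
As a small illustration of searching and replacing with a pattern, here is a sketch using Python's re module. The sample text and the date pattern are made up for the example.

```python
import re

text = "Posted on 2009-08-27, updated on 2009-08-30."

# Three groups: year, month, day.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Searching: find every date in the string.
dates = date_pattern.findall(text)

# Replacing: rewrite each date as day/month/year using the groups.
rewritten = date_pattern.sub(r"\3/\2/\1", text)

print(dates)      # [('2009', '08', '27'), ('2009', '08', '30')]
print(rewritten)  # Posted on 27/08/2009, updated on 30/08/2009.
```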

SQL
With the invention of the relational database, where everything is stored in the form of tables, SQL-type languages evolved. SQL was mainly developed for querying data, updating data, and creating and modifying schemas. It became so popular that it was extended to make procedural SQL.
Following the popularity of SQL, Google recently brought out GQL to abstract their Bigtable (a non-relational database). SQL-like syntax was also borrowed by Yahoo in YQL, to query data from anywhere on the internet. LINQ in C# is also inspired by SQL.
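
The four jobs mentioned above (schema creation, schema modification, data update and data query) can be tried out with Python's built-in sqlite3 module. This is only a sketch; the posts table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Schema creation
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, votes INTEGER)")
conn.executemany(
    "INSERT INTO posts (title, votes) VALUES (?, ?)",
    [("YQL hack", 12), ("Language concepts", 40), ("Blog tips", 7)],
)

# Data update
conn.execute("UPDATE posts SET votes = votes + 1 WHERE title = ?", ("YQL hack",))

# Schema modification
conn.execute("ALTER TABLE posts ADD COLUMN author TEXT")

# Data query
popular = conn.execute(
    "SELECT title, votes FROM posts WHERE votes > 10 ORDER BY votes DESC"
).fetchall()
print(popular)  # [('Language concepts', 40), ('YQL hack', 13)]
conn.close()
```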

Managed Heap
The managed heap, or smart pointers, was another revolutionary concept, which started as a hack on top of the OOP features (classes) of C++. Microsoft used it heavily in a technology called COM. Smart pointers went a long way towards solving the problem of memory leaks.
Automatic memory management of this kind is the default language semantic in programming languages like Java & C#. It was also adopted by languages like VB.NET and Managed C++.
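
The core idea, that an object is released automatically once nothing refers to it any more, can be observed from Python. This is a minimal sketch that leans on CPython's reference counting (other implementations may release the object later); the Resource class and the events list are made up for the illustration.

```python
import gc
import weakref


class Resource:
    """Hypothetical object whose release we want to observe."""
    def __init__(self, name):
        self.name = name


events = []
first_ref = Resource("buffer")
# Ask to be notified when the object is reclaimed.
weakref.finalize(first_ref, events.append, "buffer released")

second_ref = first_ref   # a second reference to the same object
del first_ref            # one reference gone, the object must survive
gc.collect()
still_alive = (events == [])

del second_ref           # last reference gone, the runtime frees it
gc.collect()
print(still_alive, events)
```

Nobody calls a delete or free function here; the runtime does the bookkeeping, which is exactly the work a smart pointer takes over in C++.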

XPATH
XPath is another programming concept, developed to address nodes in a DOM tree, and it became a preferred way to access XML-formatted data. It is another programming paradigm you should be aware of if by any chance you work with XML.
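
Here is a small sketch of an XPath-style lookup using Python's xml.etree.ElementTree, which supports only a limited subset of XPath. The sample document is invented; the ‘ldTitle’ class name is borrowed from the DZone post above.

```python
import xml.etree.ElementTree as ET

# A made-up, well-formed page fragment.
page = """<html><body>
  <div class="ldTitle"><a href="http://example.com/original-post">Original post</a></div>
  <div class="sidebar"><a href="http://example.com/other">Other link</a></div>
</body></html>"""

root = ET.fromstring(page)

# Every <a> that is a direct child of a <div> whose class is 'ldTitle'.
links = root.findall(".//div[@class='ldTitle']/a")
hrefs = [a.get("href") for a in links]

print(hrefs)  # ['http://example.com/original-post']
```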

Duck Typing
The term “duck typing” was popularized by the Python community, though the concept is older and existed in a few languages before Python. In duck typing, the programmer is concerned with just those aspects of an object that are used, rather than with the type of the object itself.
Let's understand this. Say we have a real-life object, Shape, which knows how to draw itself (with a method draw). In OOP, we enforce this by creating an interface, something like IDraw, and anything that can be drawn on the screen must be of type IDraw (i.e. it must implement IDraw). In duck typing, an object can be drawn on the screen as long as it has a draw method, irrespective of its type. Duck typing removes the dependency on a common interface definition, which is typically shared by client and server modules in OOP languages. The disadvantage is that the programmer cannot find out at compile time that an object lacks the draw method. But wait: Python has no compile-time type check at all, so such problems can only be found at runtime.
One thing I am sure about is that duck typing is a very risky deal when you are building big (cathedral-like) software. But it is very good when you are writing a few quick and dirty lines of code.
Duck typing is supported by Python, JavaScript (and similar languages) & C# (surprised? read this nice example).
Duck typing helps a lot in JavaScript; in fact, the concept of “JSON-based AJAX” is completely based on duck typing.
Duck typing is a very controversial concept, and many OOP lovers hate it.
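
The draw example can be sketched in Python as follows. The class names are made up; note that nothing inherits from a common IDraw-style interface, and that the missing method only shows up when the code runs.

```python
class Circle:
    def draw(self):
        return "drawing a circle"


class Button:
    def draw(self):
        return "drawing a button"


class Sound:
    def play(self):  # no draw method here
        return "playing a sound"


def render(item):
    # No type check: anything that has a draw() method is accepted.
    return item.draw()


results = [render(Circle()), render(Button())]
print(results)

# The problem with Sound is only discovered at runtime.
try:
    render(Sound())
except AttributeError as err:
    failure = str(err)
    print("runtime error:", failure)
```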

Closure
Some languages (e.g. JavaScript) allow you to define a function inside another function. A closure is the scope that the inner function has access to. The coolest part is that this scope remains valid even after the outer function has returned.
Here is a nice example of a closure (in JavaScript). The inner function dofading still has access to ‘Div_InClosureScope’ even after Fade has returned.

function Fade(id)
{
    var Div_InClosureScope = document.getElementById(id);
    var level = 0;
    function dofading()
    {
        var hex = level.toString(16);
        Div_InClosureScope.style.backgroundColor = '#ff' + hex + hex + hex + hex;
        if (level < 15)
        {
            level++;
            setTimeout(dofading, 100);
        }
    }
    setTimeout(dofading, 10);
}

In more general terms, a closure is a special scope provided to a particular instance of a function. In OOP, member functions enjoy something like a closure (the data members are in the closure scope). The C programming language does not have the concept of closures at all. It is a very simple and straightforward language.
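
The same behaviour can be seen in Python. In this minimal sketch (make_counter is a made-up name), the inner function keeps using count after make_counter has returned, and each call to make_counter gets its own separate scope.

```python
def make_counter():
    count = 0  # lives in the closure scope

    def increment():
        nonlocal count  # refer to the variable of the outer function
        count += 1
        return count

    return increment


counter = make_counter()  # make_counter has returned at this point...
first = counter()         # ...but count is still alive
second = counter()

other = make_counter()    # a new instance with its own count
print(first, second, other())  # 1 2 1
```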

Yield
I first found this technique in Python. It is not something new, but it can confuse most programmers from a C/C++ background. The technique somehow stores the state of an iterator (I will explain later) and returns a different result each time, which is something C/C++ programmers are not used to. A C/C++ programmer tends to assume that a function is a stateless machine which returns one result for a given set of arguments, no matter how many times you call it.

A Python function can hand control back to its caller in 2 ways: with a return statement or with a yield statement. A return statement ends the execution of the function (same as in C/C++). A yield statement, on the other hand, suspends the execution of the function and stores its state, so that when it is invoked again, execution resumes from the same point.

Let's take the example of a typical Fibonacci number generator.

# An endless generator
def fibonacci():
    i = j = 1
    while True:
        r, i, j = i, j, i + j  # respective assignment
        yield r

for rabbits in fibonacci():
    print(rabbits, end=" ")
    if rabbits > 100:
        break

Output
1 1 2 3 5 8 13 21 34 55 89 144

Python certainly supports a C++-like return statement, but yield is the most efficient choice for generators like this one. The naive recursive ways of writing a Fibonacci number generator are really inefficient (though they look simple).
The yield statement is also supported by C#. Here is a nice post about yield in C#. Do not miss it!

I love this article, "Life after loops". Somehow, it is connected to this blog post.

Saturday, August 1, 2009

We Code writers are better than Literary Writers

We write to make it readable.
Yes, we spend 20% of our coding time producing working code and the remaining 80% beautifying and indenting it. We love to make our code more and more readable.

We don't beat around the bush.
We write very specifically. We hate unnecessary details. We try to communicate the message (the solution) as early as possible.

We love to keep it simple.
We code writers don't try to complicate things. We feel much better when we see that our solution is simple. That's why we say KISS (keep it simple, stupid!).

We don't repeat information.
Yes, we believe in the DRY (don't repeat yourself) principle. We represent each piece of information in only one place. Many novice developers make the mistake of representing the same information two or three times in their code, and that causes serious problems.

We don't count pages.
We developers (true developers) feel very productive when we delete LOC (lines of code). Most managers in our company think that LOC is the true measure of our productivity, but we don't. We love to delete code, and our most productive time is spent deleting it.

We are open.
As innovators, we present our ideas even when we are scared that they might be stolen. We don't think hiding an idea makes any kind of sense.

We are least bothered when it comes to plagiarism.
Unlike literary writers, we are less concerned about plagiarism. We love it when someone copies our code, and we take it as good news. In fact, the whole open source philosophy is based on it. We just ask for our name to be mentioned in the code, but if it isn't, we don't mind.
