
Tuesday, September 19, 2006

studying google adsense code ...

PLEASE READ ME FIRST
This post is "a joke", so I hope you'll read it as an ironic and lighthearted post about Google scripts and Google developers, who are really more skilled than me :)
I found an error in my "best" script and then went looking for the problem inside the AdSense code, because together with that code my script generated strange redirect errors (the AdSense code also escapes objects and numbers without casting them to strings, while my encodeURIComponent implementation didn't call the toString() method before parsing ... I know, I did a stupid thing :P )


studying code ...
Three days ago I posted about an unknown problem using my JSL at the top of this blog.

The first point is ... sorry blogspot, you didn't cause any problem to JSL

The second point is this one: Google AdSense ... have you seen the code?
To find the problem, and then the solution, I downloaded the AdSense code and looked inside it after a "manual code beautifier operation".
That's what I think about the AdSense code:
  • absolutely "undebuggable": to keep it lightweight (about 7Kb as a local file, maybe 2Kb when compressed online with gz), variable and function names are really hard to understand
  • it contains a few errors: Opera and Firefox always show something in the JS console
  • it doesn't use "perfectly" optimized code

Let's analyze the AdSense JS code, using this page as a reference: google adsense beautified code.

The good thing is that all the AdSense code is inside an anonymous function, so no other script on the page will be modified ... none except another Google AdSense script, because the script uses window to create a big list of window.google_* variables.
This shouldn't be a problem, but if you use a script that does a for in loop over the window object, you need to remember that every property for which /^google_/.test(param) is true should be left as is.
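
Just to show what I mean, here is a minimal sketch of such a loop. The helper name keysExceptGoogle and the fakeWindow object are my own inventions for this example (a plain object stands in for window so it can run anywhere); nothing here comes from the AdSense script.

```javascript
// Hypothetical helper: collects the property names of an object,
// skipping everything AdSense may have added (google_* names).
function keysExceptGoogle(obj) {
    var names = [];
    for (var param in obj) {
        if (/^google_/.test(param)) continue; // leave AdSense globals as they are
        names.push(param);
    }
    return names;
}

// A plain object stands in for window (assumption made for this sketch).
var fakeWindow = {
    google_ad_width: 728,
    myCounter: 1,
    google_ad_height: 90,
    title: "blog"
};
var result = keysExceptGoogle(fakeWindow);
```

With this object, result holds only the two names that don't start with google_.
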
It's time to look at the internal function code, starting with optimizations.
As you can read on many lines, for example on line 6, every if / else if / else uses curly brackets ... even when they're not necessary (a single statement after the condition).
But if you look at line 295, someone correctly uses an "if" without braces ... who did this? Maybe not the same developer?
Since omitting braces is valid for single-statement bodies, a lot of braces could be removed to shrink this script, adding a ";" char where needed.

// example with function B (starting from line 5)
function B(b){
    if(typeof encodeURIComponent=="function")
        return encodeURIComponent(b);
    else
        return escape(b);
}

With that I've only removed 2 chars from the size of the script, but hey ... that function should be different!

// another example with function B
function B(b){
    return typeof encodeURIComponent=="function"?encodeURIComponent(b):escape(b)
}
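
A quick sketch to check that the two forms of B behave the same. The names B1, B2, samples and allSame are mine, used only so that both versions can live side by side in one file; it assumes an environment where escape is still available as a fallback.

```javascript
// Brace-free if/else form.
function B1(b) {
    if (typeof encodeURIComponent == "function")
        return encodeURIComponent(b);
    else
        return escape(b);
}

// Ternary form.
function B2(b) {
    return typeof encodeURIComponent == "function" ? encodeURIComponent(b) : escape(b);
}

// Compare both forms on a few sample strings (sample values are arbitrary).
var samples = ["a b", "x&y=1", "plain"];
var allSame = true;
for (var i = 0; i < samples.length; i++) {
    if (B1(samples[i]) !== B2(samples[i])) allSame = false;
}
```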

Another example is on line 153, where there is a ternary operation for var h but not for var q.

var h=a.google_ad_region==b?"":a.google_ad_region,q=j?j.indexOf("_0ads")>0:false;

A few lines later (159) we can read another "unoptimized" piece of code that could be written in one line.

a.google_num_0ad_slots=!a.google_num_0ad_slots||a.google_num_0ad_slots+1>1?1:a.google_num_0ad_slots+1;

However, I wonder when this piece of code would be useful, because google_num_0ad_slots is not set anywhere else in this script, so maybe it's defined by other scripts.
Whether this is true or not, the if, else and then if does something like this:
if google_num_0ad_slots is not defined, or is null, or is 0, it becomes 1; in every other case it becomes google_num_0ad_slots + 1 and is then set back to 1 if that is greater than 1, so it can end at 0 or below only when it started below zero.

if(!a.google_num_0ad_slots||++a.google_num_0ad_slots>1)a.google_num_0ad_slots=1;
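
To convince myself that the one-line ternary and the one-line if really do the same thing, here is a small sketch that runs both on a few starting values. The wrapper names viaTernary and viaIf and the list of inputs are only for this test, they are not in the AdSense script.

```javascript
// One-line ternary form, wrapped in a function so it can be tested.
function viaTernary(a) {
    a.google_num_0ad_slots = !a.google_num_0ad_slots || a.google_num_0ad_slots + 1 > 1 ? 1 : a.google_num_0ad_slots + 1;
    return a.google_num_0ad_slots;
}

// One-line if form.
function viaIf(a) {
    if (!a.google_num_0ad_slots || ++a.google_num_0ad_slots > 1) a.google_num_0ad_slots = 1;
    return a.google_num_0ad_slots;
}

// Arbitrary starting values, including "not defined" and negative numbers.
var inputs = [undefined, null, 0, 1, 5, -1, -3];
var pairs = [];
for (var i = 0; i < inputs.length; i++) {
    pairs.push([
        viaTernary({ google_num_0ad_slots: inputs[i] }),
        viaIf({ google_num_0ad_slots: inputs[i] })
    ]);
}
```

Every pair holds two identical values: 1 for anything that starts undefined, null, 0 or positive, and the starting value plus one for negative numbers.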

The "if/else and then if again" blocks I've just optimized appear on different lines of the script, but in some cases there is only an if/else (e.g. line 170, where there is no "greater than 1" check).
In these cases the code could be

if(!a.google_num_ad_slots)a.google_num_ad_slots=0;
++a.google_num_ad_slots;

or should be this one

if(!++a.google_num_ad_slots)a.google_num_ad_slots=1;

only if the parameter is always initialized with 0 or a greater value.
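
A small sketch of why the starting value matters. It wraps the two-statement form (with the increment applied to the same property) and the one-statement form in functions; the names bumpTwoSteps and bumpOneStep are hypothetical, chosen only for this example.

```javascript
// Two-statement form: default to 0, then increment.
function bumpTwoSteps(a) {
    if (!a.google_num_ad_slots) a.google_num_ad_slots = 0;
    ++a.google_num_ad_slots;
    return a.google_num_ad_slots;
}

// One-statement form: increment first, fall back to 1 when the result is falsy.
function bumpOneStep(a) {
    if (!++a.google_num_ad_slots) a.google_num_ad_slots = 1;
    return a.google_num_ad_slots;
}

var fromUndefined = [bumpTwoSteps({}), bumpOneStep({})];
var fromZero = [bumpTwoSteps({ google_num_ad_slots: 0 }), bumpOneStep({ google_num_ad_slots: 0 })];
var fromTwo = [bumpTwoSteps({ google_num_ad_slots: 2 }), bumpOneStep({ google_num_ad_slots: 2 })];
var fromMinusOne = [bumpTwoSteps({ google_num_ad_slots: -1 }), bumpOneStep({ google_num_ad_slots: -1 })];
```

Starting from nothing, 0 or 2 both forms agree, but starting from -1 the first one gives 0 and the second one gives 1, so the short form is safe only when the value never goes below zero.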

Another little optimization could be done on function F (line 73).
Since the AdSense code optimization is based on short var names, I think repeating the same object property prefix so many times is not as good as a dedicated params array would be:

function F(b){
    // property names without the shared "google_" prefix
    var a=[
        "ad_frameborder","ad_format","page_url","language","gl","country","region","city","hints","safe",
        "encoding","ad_output","max_num_ads","ad_channel","contents","alternate_ad_url","alternate_color",
        "color_bg","color_text","color_link","color_url","color_border","color_line","adtest","kw_type",
        "kw","num_radlinks","max_radlink_len","rl_filtering","rl_mode","rt","ad_type","image_size","feedback",
        "skip","page_location","referrer_url","ad_region","ad_section","bid","cpa_choice","cust_age","cust_gender",
        "cust_interests","cust_id","cust_job","cust_u_url"
    ],l=a.length;
    // walk the list backwards and reset each google_* property
    while(l)b["google_"+a[--l]]=null;
}

This way all properties are simple to add to or remove from the list, and "b.google_" is present just once.
However, if an optimized while turns out to be slower with a really big array, using o.param1=o.param2=o.paramN=null instead of "=a" for each param would do the same thing.
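
Here is a reduced sketch of both ideas on a tiny list, so the effect is easy to see. The function name resetParams, the three-name list and the sample objects are assumptions made for this example only.

```javascript
// Same technique as F, but the list of names is passed in (hypothetical variant).
function resetParams(b, names) {
    var l = names.length;
    while (l) b["google_" + names[--l]] = null;
}

// Array-driven reset on a sample object.
var target = { google_ad_format: "728x90", google_kw: "js", other: 1 };
resetParams(target, ["ad_format", "kw", "language"]);

// Chained assignment: one null shared by several properties.
var o = { google_ad_format: "728x90", google_kw: "js" };
o.google_ad_format = o.google_kw = null;
```

In both cases the google_* properties end up null, and anything without the prefix (other, in the first object) is not touched.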

The AdSense script uses a lot of in-function returns, a style I don't like very much (but it's only my opinion, and using only one return isn't necessarily a better way to write functions).
For example, there's a "special" function I've seen that's not good enough for me: the x function (line 293).

function x(b,a){
    var d=a.documentElement,r=z(b,a,"location"),g=1,e=1;
    if(!r&&b.google_ad_width&&b.google_ad_height){
        if(b.innerHeight){
            g=b.innerWidth;
            e=b.innerHeight
        }
        else if(d&&d.clientHeight){
            g=d.clientWidth;
            e=d.clientHeight
        }
        else if(a.body){
            g=a.body.clientWidth;
            e=a.body.clientHeight
        }
        r=(e>2*b.google_ad_height||g>2*b.google_ad_width)
    }
    return !r
}

If you look at the original version you can see that if z is true (r in my version), the function returns false.
So if !r (when r is not true) it's possible to do the other operations inside the first if condition.
At the end of the first if you can assign a boolean value directly, without the second if and the second in-function return.
So if r is true, the final return value is false. That holds for the first check as well as for the second; in every other case, when r is not true, the returned value will be true (not false).

This is the way I usually like to return a boolean value from a function or method using only a single return (cleaner, imho) at the end of the function.
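
To try the single-return version outside a browser, here is a sketch with fake objects. The z function below is only a stub that always returns false (I don't reproduce what the real z does), and fakeDocument, smallWindow and largeWindow are invented values for the test.

```javascript
// Stub standing in for AdSense's z(); the real one is not reproduced here.
function z(b, a, name) {
    return false;
}

// Single-return version of x, as in the post.
function x(b, a) {
    var d = a.documentElement, r = z(b, a, "location"), g = 1, e = 1;
    if (!r && b.google_ad_width && b.google_ad_height) {
        if (b.innerHeight) {
            g = b.innerWidth;
            e = b.innerHeight;
        } else if (d && d.clientHeight) {
            g = d.clientWidth;
            e = d.clientHeight;
        } else if (a.body) {
            g = a.body.clientWidth;
            e = a.body.clientHeight;
        }
        // true when the viewport is more than twice the ad in either direction
        r = (e > 2 * b.google_ad_height || g > 2 * b.google_ad_width);
    }
    return !r;
}

// Fake document and windows (assumed values, just for the test).
var fakeDocument = { documentElement: null, body: null };
var smallWindow = { google_ad_width: 728, google_ad_height: 90, innerWidth: 800, innerHeight: 100 };
var largeWindow = { google_ad_width: 728, google_ad_height: 90, innerWidth: 1600, innerHeight: 900 };

var smallResult = x(smallWindow, fakeDocument);
var largeResult = x(largeWindow, fakeDocument);
```

With the small window neither size is more than twice the ad size, so x returns true; with the large one the height check passes, r becomes true and x returns false.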

We're getting to the end of this post; there's only one more function I haven't understood ... the C function (line 286).
As you can see, the C function receives 3 parameters, none of which is used; the A function is called and true is returned.
Do you think it's useful? I think that the A function, which doesn't take any input parameters and doesn't return anything, should return true and be used directly on line 325.

b.onerror=A;

... adding return true to the A function ... then nobody needs the C function (but maybe it was created for future implementations).
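
A minimal sketch of the idea, assuming that the body of A can stay as it is (I leave it as a comment here) and using a plain object in place of window so it runs anywhere. In a browser, a window.onerror handler that returns true tells the browser that the error has been handled.

```javascript
// A with the return value the wrapper C used to provide.
function A() {
    // ... whatever A already does would stay here ...
    return true;
}

// Plain object standing in for window (assumption for this sketch).
var b = {};
b.onerror = A;

// Simulate the call with the three arguments C used to receive
// (arbitrary sample values); A simply ignores them.
var handled = b.onerror("some message", "some url", 1);
```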

The absolute last thing I want to say to the Google AdSense script developers is this:
why do you optimize the code in this way but then use a separate "var" for every temporary function variable?

Look at line 319, inside the function E ... wouldn't something like this have been better ...

var b=window,a=document,d=a.location,g=a.referrer,e=null;

??? it's the same with A, D and other functions ...

2 comments:

Anonymous said...

Hey Andrea.
The thing is that Google AdSense was not intended to be optimized. It is just compressed with that JavaScript Compressor, so there is a feeling that stuff is optimized, but not all the way.

I have made a dig into source code and uncompressed it. But then I found original code in open sources. I can find a link if you wish.

Anonymous said...

And about ternary operator...

Valerio Proietti doesn't approve usage of it: "Too hax0r", he says.

So mootools is written without them, which is a real shame :(