
Tuesday, April 14, 2009

Essential Selector - Cross Browser LightWeight Selector Engine

Thanks to new DOM methods recently introduced in the most common browsers, e.g. querySelectorAll, we will hopefully no longer need full libraries to implement common CSS selectors. Nowadays this could be the foundation for any kind of selector engine, but we are still stuck with old-fashioned browsers a la Internet Explorer 6 or 7, both far from W3C standards and with the slowest JavaScript engines in the browser panorama.

At the same time, whatever great selector API/engine we have at our disposal, the most used selectors are really few:

  • #id

  • .class

  • tag

  • tag.class



The reasons behind this are various, but in my opinion the most valid one is that Web Developers use CSS selectors in the same way they write CSS files. Since CSS has become a practical standard only recently, thanks to full CSS 2.1 support in Internet Explorer (N.B. other browsers have been working on CSS 3 for ages ...), our CSS files and our selectors will stay that simple for a long time.

Accordingly, and since we have some intermediate and cool methods such as getElementsByClassName, all we need is a basic selector engine able to retrieve nodes in the fastest possible way.

Of course, if querySelectorAll is present, this method will be a must, but what if it is not available?

The Sizzle library is one of the most famous selector engines so far, but we need "to move" 4KB of minified and gzipped code (not that much, but often more than necessary) to obtain something simple, especially if the selectors listed above are the only ones we use in our project.

The Essential Selector Library

Maybe it sounds obvious, but to cover the first 3 selectors in the list all we need is the fast getElementById, the standard getElementsByTagName, and the non-standard getElementsByClassName, which is easy to implement for old browsers. querySelectorAll? Superfluous in this case, but obviously still welcome! The 4 selectors above are the only ones optimized for performance in my latest tiny library: about 1KB minified and gzipped, suited for libraries and/or GUI development.
You can have a look directly at my repository to see what will perform truly fast in every browser and what will perform in a reasonable time.
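
To make the idea concrete, here is a minimal sketch of how the four simple selectors could be recognized and dispatched to the native DOM methods. It is not the library's actual source: the names classifySelector, listToArray and simpleSelect are hypothetical, the document is passed in as the doc argument, and anything more complex is simply handed to querySelectorAll, which is assumed to exist.

```javascript
// Hypothetical helpers for illustration only, not the Essential Selector source.

// Recognize the four "essential" selector shapes with plain regular expressions.
function classifySelector(selector) {
  var m;
  if ((m = /^#([\w-]+)$/.exec(selector))) return { type: "id", id: m[1] };
  if ((m = /^\.([\w-]+)$/.exec(selector))) return { type: "class", className: m[1] };
  if ((m = /^([a-zA-Z][\w-]*)$/.exec(selector))) return { type: "tag", tag: m[1] };
  if ((m = /^([a-zA-Z][\w-]*)\.([\w-]+)$/.exec(selector))) {
    return { type: "tagClass", tag: m[1], className: m[2] };
  }
  return { type: "complex" };
}

// Copy an array-like node list into a real array.
function listToArray(list) {
  var out = [], i;
  for (i = 0; i < list.length; i++) out.push(list[i]);
  return out;
}

// Dispatch each simple selector to the cheapest native method.
function simpleSelect(selector, doc) {
  var s = classifySelector(selector), node, nodes, re, out, i;
  switch (s.type) {
    case "id":
      node = doc.getElementById(s.id);
      return node ? [node] : [];
    case "class":
      return listToArray(doc.getElementsByClassName(s.className));
    case "tag":
      return listToArray(doc.getElementsByTagName(s.tag));
    case "tagClass":
      // Narrow by tag first, then filter on a whitespace-delimited class name.
      nodes = doc.getElementsByTagName(s.tag);
      re = new RegExp("(?:\\s|^)" + s.className + "(?:\\s|$)");
      out = [];
      for (i = 0; i < nodes.length; i++) {
        if (re.test(nodes[i].className)) out.push(nodes[i]);
      }
      return out;
    default:
      // Assumption: a native querySelectorAll is available for everything else.
      return listToArray(doc.querySelectorAll(selector));
  }
}
```

In a page this would be called as simpleSelect("div.item", document). Passing the document explicitly is only a convenience of the sketch, so that the same code can be exercised against a stub object.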

Essential Selector Philosophy

#id, .class, tag, and tag.class selectors will be fast in every browser, while more complex selectors will be browser dependent. The main focus is on the most used selectors, but if you decide to use a more specific one:

// CSS selector example
$e("div ul.myclass p");

recent browsers will perform in about 1 millisecond, while old browsers will apply a clean CSS modification at runtime. This means that these browsers will have approximately the same delay for a selector like "div p" and "div p #content ul li.testcase" but at least, if the selector is compatible with the browser CSS engine, the result will be the same for every browser.
Moreover, due to the small size of the library, buggy browsers will not be perfectly supported. As an example, there is a version of Opera which does not understand className in upper case ... well, this is not our problem, it is a browser-specific bug, so the browser vendor should solve it. The same goes for other weird cases ... come on, we cannot consider every alpha/beta/unstable/intermediate/old version, so if the CSS works, the browser will respond as expected.
This is the philosophy behind this simple selector engine, where a search like

$e("div p")

will make sense, while another one like

$e("div[class^=whatever]")

will not, because the chosen selector is not yet that standard.

Summary

Do you want a small-footprint selector engine which works OK in everyday environments? Try out the Essential library, otherwise I just gave you a valid, fully compatible alternative ;)

8 comments:

Unknown said...

That's a clever solution.

You can speed it up a bit by identifying the target tagName.

e.g.

For this selector:

p.example a

The target tagName is "a", so you can change the getElementsByTagName("*") to getElementsByTagName(tagName). It should be fairly simple to parse out the target tagName.
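
As a rough illustration of that suggestion, the target tagName can be taken from the right-most part of the selector. The function name targetTagName is hypothetical, and the sketch assumes a single selector (no comma list) with no whitespace inside attribute values:

```javascript
// Hypothetical helper: return the tag name of the right-most compound
// selector, or "*" when that part does not start with a tag name.
function targetTagName(selector) {
  // Split on combinators: ">", "+", "~" or plain whitespace (descendant).
  var parts = selector.replace(/^\s+|\s+$/g, "").split(/\s*[>+~]\s*|\s+/);
  var last = parts[parts.length - 1];
  var m = /^[a-zA-Z][\w-]*/.exec(last);
  return m ? m[0] : "*";
}
```

With this, "p.example a" gives "a", while a selector ending in ".example" or "#content" falls back to "*".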

kangax said...

I see you generate a random token with -

`"__" + String(Math.random()).substring(2) + "__"`

I usually use a bit shorter alternative -

'_' + (Math.random()+'').slice(2);

If you care about shaving off those few extra characters, of course : )

Andrea Giammarchi said...

Sorry kangax, I made a mistake on Android and clicked the wrong link. You also wrote:
It seems to me that `new RegExp("\\b" + split.shift() + "\\b")` will erroneously match values like "foo-bar" when asked for "foo" or "bar".

AFAIK, regex like this is considered to be standard - `new RegExp("(?:\\s|^)" + value + "(?:\\s|$)")`

It's much easier to catch errors like that if you have unit tests ;)
I agree, I did not think that much about that RegExp and I'll fix it in a few minutes.
About slice vs substring, that operation is performed once in the script lifetime, so I do not think it will affect performance that much.
I removed "0." from the Math.random() to avoid problems with the counter name but it could be a single char like "a", the random was only to avoid possible conflicts :)
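
For readers following the RegExp discussion, this small sketch compares the two patterns on a className string. The function names are hypothetical, and the class name is assumed to contain no regular expression metacharacters:

```javascript
// Word-boundary version: "-" is not a word character, so "\b" also
// matches between "foo" and "-bar".
function hasClassWordBoundary(className, value) {
  return new RegExp("\\b" + value + "\\b").test(className);
}

// Whitespace-delimited version: the value must be surrounded by
// whitespace or by the start/end of the string.
function hasClassWhitespace(className, value) {
  return new RegExp("(?:\\s|^)" + value + "(?:\\s|$)").test(className);
}
```

hasClassWordBoundary("foo-bar", "foo") is true, which is the false positive kangax describes, while hasClassWhitespace("foo-bar", "foo") is false and hasClassWhitespace("a foo b", "foo") is true.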

Andrea Giammarchi said...

@dean, that was simple. I implemented a quick and simple "resolver" for cases like that and performance is actually good.

@kangax, I updated the standard regexp plus I used slice for the random, happy now?

both inside the code now :D

turnando said...

Hey, Dean Edwards posted about a MS IE "feature" that throws an exception for document.createStyleSheet() when you try it in a document that has 30 style sheets.

http://dean.edwards.name/weblog/2010/02/bug85/

I'm going to see if I can solidify the IE version of your custom querySelectorAll with a try/catch that re-uses an existing stylesheet. Don't know if it will be OK to add/remove from an existing stylesheet?

turnando said...

Hello Andrea,
I'm liking the Essential selector, it is so lightweight.

I refactored the custom querySelectorAll function for IE to take into consideration the limit of 30 stylesheets per document (and a limit of 4095 rules per stylesheet).

It needs a few tweaks but the essence of it is that it tries document.createStyleSheet() and if that fails it uses the first stylesheet in the document that is not disabled and which does not fail when addRule() is used on it (it will fail if the stylesheet has 4095 rules).

I need to make one more change so that the query code is adding a single rule to the stylesheet (like for the FF branch of code) and after I do that I will test it and give it to you.

turnando said...

Me again.

I refactored the branch of code for delegating to the native querySelectorAll so that it would give consistent results with the custom querySelectorAll impls that are used for old browsers.

The problem I'm solving can be seen when using a context object to scope the query. The W3C spec for the selector API says this about using querySelectorAll on an element that is not the root document element: "Even though the method is invoked on an element, selectors are still evaluated in the context of the entire document."

http://www.w3.org/TR/selectors-api/

John Resig does a better job than me of explaining the issue:

http://ejohn.org/blog/thoughts-on-queryselectorall/#postcomment

Basically, we want a JavaScript selector library to apply a selector to the DOM below the scoping context element (when a scoping element is used). The Essential library does do that for the custom querySelectorAll but when delegating to the native querySelectorAll it does not take this scoping into consideration and the selector ends up getting evaluated against the root document element.

I refactored Essential to scope the selector to the scoping context object before delegating to a native querySelectorAll call. In a nutshell, it does it by adding a temporary marker attribute to the scoping element and then prepending an attribute selector to the original selector. I have another version that will alternatively use the id of the scoping element (if it exists) to be less intrusive, but I'm sticking with this because I'm going for compact code:


if (div.querySelectorAll) { // If querySelectorAll is available then we don't need our custom selectors
    div = null;
    return function (selector, ctxtElement) {
        var results;
        if (ctxtElement) {
            ctxtElement.setAttribute(mrkr, "");
            selector = "[" + mrkr + "] " + selector; // orig selector is now scoped to ctxt element with an attribute selector
        }
        results = toArray.call((ctxtElement || doc).querySelectorAll(selector));
        if (ctxtElement) {
            ctxtElement.removeAttribute(mrkr);
        }
        return results;
    };
}
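
One caveat with the snippet above: the marker is prepended to the whole selector string, so with a comma-delimited list such as "p, ul li" only the first selector ends up scoped. A possible workaround, sketched here with the hypothetical name scopeSelectors, is to prefix every selector in the list. The plain split on "," is an assumption that holds only when no comma appears inside attribute values or functional pseudo-classes:

```javascript
// Hypothetical helper: prefix each selector of a comma-delimited list
// with the marker attribute selector.
function scopeSelectors(selector, mrkr) {
  var parts = selector.split(","), i;
  for (i = 0; i < parts.length; i++) {
    // Trim surrounding whitespace, then scope this selector.
    parts[i] = "[" + mrkr + "] " + parts[i].replace(/^\s+|\s+$/g, "");
  }
  return parts.join(", ");
}
```

For example, scopeSelectors("p, ul li", "data-m") returns "[data-m] p, [data-m] ul li".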

turnando said...

Hi Andrea,
I see from your vice-versa library (which includes this Essential Selector library) that the code is covered by an MIT style license:

http://code.google.com/p/vice-versa/

I don't see an MIT license statement in the Essential source code, but it is your intent to license it with an MIT license, right?

I apologize for all the wordy comments, I talk too much... but here is something that might help people who are looking at the code for the first time. Here is what the $e selector function does, in my words, after it is refactored to properly scope selectors used with the native querySelectorAll (I renamed the context object argument from HTMLElement to ctxtElement in this explanation):

Returns an array of DOM Elements matching the given CSS selector(s). The returned array will not contain duplicates if $e.duplicated is set to false.

The selector arg is a comma-delimited list of CSS selectors.

The ctxtElement arg is an optional context for the search. The search will be conducted only on children of this element and if no context element is provided then the entire document will be searched. When the ctxtElement arg is present then the selector is applied relative to the ctxtElement (NOT relative to the document root).

The selectors can be contextual. If they are contextual AND the ctxtElement arg is present then the contextual selectors should be written relative to the ctxtElement context object, NOT relative to the document root.

If there is a native querySelectorAll function then this call delegates to it. If a context element is provided and the call is being delegated to the native querySelectorAll then the delegation is made only after automatically scoping the given selector(s) so that the native function will be applying selectors relative to the ctxtElement instead of relative to the root document.

If there is no native querySelectorAll function then the following logic is used:

If a selector is non-contextual and one of these simple types of selector then a very simple search will be performed:
id selector: #fooId
nodeName selector: div
className selector: .fooClass
tagClass selector: div.fooClass

If a selector is contextual or is non-contextual but is not one of the simple types of selector that are listed above then a more complex search will be performed using a custom querySelectorAll function. This custom querySelectorAll function works by temporarily applying a "marker" CSS style to elements that match the given selector(s) and then searching the document for elements with the "marker" style.