Thursday, October 04, 2007

A "little bastard" BUG using JSmin parsers

Update
It seems I found a valid fix for "strange" regular expressions. I suppose MyMin will be available tomorrow, while packed.it has just been updated with the latest client version.

Have fun with packed.it and please tell me if some source cannot be parsed.





--------------------------------------
While I was testing every kind of JavaScript source and CSS with my latest creation, packed.it, I found a bug in the JSmin logic, which is partially recoded inside the MyMin project as its JavaScript minifier (MyMinCSS and MyMinCompressor seem to work perfectly).

This bug seems to be present in every version linked from the JSmin page.

I found this bug while parsing jQuery UI and its CSS with my new service, where one JavaScript file has a single, simple problem:

function(){return /reg exp with this ' char/;}


Bye bye JSmin: the ' char inside the regexp is parsed as the start of a string, and an error is raised.

The problem is reproducible with a source that contains only these 3 chars (so a source that begins and ends with them):

/'/

That's a normal regexp ... at this point I can be happy to have found the first packed.it and MyMin project bug, and the coolest thing is that it is inside a part of the project that's not really mine :D
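To show why a quote inside a regexp is enough to break things, here is a toy scanner. It is NOT the JSmin code, just a minimal sketch under the assumption that the parser knows about string literals but not about regexp literals; the name naiveScan is made up for this example.

```javascript
// Toy scanner, NOT JSmin's real code: it only knows about string literals,
// so a quote inside a regular expression literal looks like a string start.
function naiveScan(source) {
  for (var i = 0; i < source.length; i++) {
    var c = source.charAt(i);
    if (c === "'" || c === '"') {
      // look for the matching closing quote, skipping escaped characters
      var j = i + 1;
      while (j < source.length && source.charAt(j) !== c) {
        if (source.charAt(j) === "\\") j++;
        j++;
      }
      if (j >= source.length) {
        throw new Error("Unterminated string literal at index " + i);
      }
      i = j; // jump to the closing quote and keep scanning
    }
  }
  return true;
}
```

With this sketch, naiveScan("/'/") throws because the quote never finds its closing partner, while a source with a real, balanced string passes.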

To work around this problem, just use
return new RegExp
or just assign the regexp to a var before returning it ... it seems to be the only bug I found while parsing every kind of source. Did anyone solve it?
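The two workarounds can be sketched like this. The function names are only for illustration; the pattern is the one from the failing example above. Keep in mind that with the RegExp constructor the pattern is a string, so any backslash in it has to be doubled.

```javascript
// Original form that trips the minifier: a regexp literal right after return
function original() {
  return /reg exp with this ' char/;
}

// Workaround 1: build the same regexp through the RegExp constructor
function viaConstructor() {
  return new RegExp("reg exp with this ' char");
}

// Workaround 2: assign the literal to a variable, then return the variable
function viaVariable() {
  var re = /reg exp with this ' char/;
  return re;
}
```

All three functions return a regexp with the same source, so the behaviour of the code does not change, only the way the regexp is written.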

I know it's not so simple, because a char by char parser cannot simply know whether a slash is for a division, a comment or a regexp.

Do You have any suggestion to solve this problem? It will be really appreciated! :-)
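One possible direction, only as a rough sketch: guess the meaning of a slash from what comes right before and after it. The helper classifySlash, the punctuator list and the keyword list below are my assumptions for this example, not code taken from JSmin or MyMin, and this is a heuristic rather than a real grammar based parser.

```javascript
// Hypothetical helper, not part of JSmin or MyMin: guesses what the slash
// at source[index] means by looking at its neighbours.
var REGEXP_KEYWORDS = /^(?:return|typeof|instanceof|in|new|delete|void|throw|case|do|else)$/;

function classifySlash(source, index) {
  var next = source.charAt(index + 1);
  if (next === "/" || next === "*") return "comment";

  // walk back over whitespace to the previous significant character
  var i = index - 1;
  while (i >= 0 && /\s/.test(source.charAt(i))) i--;
  if (i < 0) return "regexp"; // nothing before the slash: cannot be a division

  var prev = source.charAt(i);
  // after these punctuators an expression is expected, so "/" opens a regexp
  if ("(,=:[!&|?{};".indexOf(prev) !== -1) return "regexp";

  // after a word, check whether it is a keyword rather than an operand
  if (/[\w$]/.test(prev)) {
    var end = i + 1;
    while (i >= 0 && /[\w$]/.test(source.charAt(i))) i--;
    var word = source.slice(i + 1, end);
    return REGEXP_KEYWORDS.test(word) ? "regexp" : "division";
  }

  // anything else (")", "]", a closing quote ...) is treated as an operand
  return "division";
}
```

The keyword check is what covers the return case from this post. The sketch still guesses wrong in some situations, for example a regexp that directly follows the closing parenthesis of an if condition, so it only reduces the problem and does not solve it.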

8 comments:

kentaromiura said...

That's not true, a parser can parse anything that respects its BNF (Backus-Naur form).
If it is legal JavaScript then it is parsable :D I wrote a compiler once and I remember that it is not simple, but it is feasible ;D

Andrea Giammarchi said...

It seems You missed the word "simply".

I've never said it's not possible, I just said this problem seems to be present in every JSmin parser, and with the JSmin code logic it's not simple to solve (and I've just started to write another personal parser for the MyMin project :D)

kentaromiura said...

quick fix :
instead of jsmin ("",ImVeryBastard,3)

use
ImVeryBastard.replace(/(\s|=|\()\/[^\/\n]*?('|")[^\/]*?\/(g|i|m|gi|gm|gim|gmi|im|ig|img|igm|mi|mg|mgi|mig)?/,
"($&)")

Andrea Giammarchi said...

that increases output size :/

kentaromiura said...

not my fault :P
in the meantime you could write another better parser ...
:pheasant:

Andrea Giammarchi said...

Increasing means changing the output.

If You have a regexp inside a string, some saved information for example, it will be modified by the replace.

A regexp is not the solution, at least not the one to solve this problem in JSmin, which is a char by char parser.

mrclay said...

Fix costs 2 bytes: wrap the expression in ():

return (/'/);

Works fine. :)

Andrea Giammarchi said...

So your suggestion is to add 2 more characters to get around a bug in a minifier ... that sounds illogical to me. I would not change perfectly valid, error free code for a minifier by adding more bytes. Just a personal opinion :P