Thursday, April 28, 2011

What techniques do JavaScript compression libraries use to minimize file size?

This question is meant to be more on the theory side of the subject: besides stripping whitespace, what other techniques are used for JavaScript compression?

From Stack Overflow
  • I can name a few that are found within Google Web Toolkit compiler:

    • Inlining of method calls
    • Dead code elimination
    • Variable renaming/source obfuscation
      • that is, rewriting long variable names to short ones, and so on

    Almost all of them require parsing the JavaScript (i.e., they go beyond pure lexical analysis).
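As a rough illustration of inlining and dead code elimination, here is a hand-written before/after pair. It is only a sketch of the idea, not actual GWT output; the names `square`, `areaBefore`, `areaAfter` and `DEBUG` are made up for this example.

```javascript
// Before: a small helper that is called from one place,
// plus a branch guarded by a constant that is always false.
function square(x) {
  return x * x;
}

function areaBefore(side) {
  var DEBUG = false;
  if (DEBUG) {
    // Dead code: DEBUG is a constant false, so this never runs.
    console.log('side = ' + side);
  }
  return square(side);
}

// After: the body of square() is inlined at its call site
// and the unreachable DEBUG branch is removed.
function areaAfter(side) {
  return side * side;
}
```

Both versions return the same value for every input. Proving that is what needs a parser: the tool has to know that `DEBUG` never changes and that `square` has no side effects.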

    Daniel Lew : That dead code part scares me, how does GWT know that code is dead when JS has a global scope? Or is GWT designed for compiling all your JS at once, and doesn't work if you execute it on only one of your multiple JS files?
    Miguel Ping : GWT is a monolithic compiler, so by adding some restrictions to what you can do within Java, it can accurately determine whether a method is called or not (similar to what you can do with the Eclipse IDE).
  • Off the top of my head...

    • Tokenizes local variables and then renames each one to a minimally sized name.
    • Removes tons of whitespace.
    • Removes unnecessary braces (for example, a single statement after an if-statement can drop its braces and end with a single semicolon).
    • Removes unnecessary semicolons (for example, right before a closing brace '}').

    My most commonly used minifier is the YUI Compressor, and as they state, it's open source, so you can look for yourself at exactly what it does. (I'm not sure what they mean by "micro-optimizations", probably a bunch of rare cases that gain you a character or two.)
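The items in the list above can be sketched with a before/after pair. The second function is minified by hand to show the idea and is not actual YUI Compressor output; `clampBefore` and `clampAfter` are hypothetical names.

```javascript
// Readable source: long parameter names, braces around every branch.
function clampBefore(value, lowerBound, upperBound) {
  if (value < lowerBound) {
    return lowerBound;
  }
  if (value > upperBound) {
    return upperBound;
  }
  return value;
}

// Hand-minified equivalent: parameters renamed to a, b, c; whitespace removed;
// braces dropped around single-statement branches; no semicolon before the final '}'.
function clampAfter(a,b,c){if(a<b)return b;if(a>c)return c;return a}
```

Only the parameters are renamed here. The function name stays as it is because other scripts may call it from the global scope.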

  • Most of the compressors use a combination of different techniques:

    • stripping whitespace
    • compressing the file with a compression algorithm (gzip, deflate)
    • most of the space is saved by renaming internal variables and functions to shorter names, e.g.:

    This function:

    function func (variable) {
      var temp = 2 * variable;
      return temp;
    }
    

    will become:

    function func (a) {
      var b = 2 * a;
      return b;
    }
    
    • Dean Edwards' Packer uses some internal compression. The script is decompressed when it is loaded on the page.
    • All the usual techniques for making program code shorter:
      • delete unused code
      • function inlining
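The "decompress on load" idea can be sketched with a toy dictionary packer. This is not how Dean Edwards' Packer is actually implemented. `pack`, `unpack` and the `~N~` token format are assumptions made up for this example, and the word list is chosen by hand.

```javascript
// Toy dictionary packer: replaces each listed word with a token such as ~0~.
// Assumptions: the source contains no '~' character and every word is a plain
// identifier or keyword (no digits-only words, no '~').
function pack(source, words) {
  var packed = source;
  words.forEach(function (word, index) {
    packed = packed.split(word).join('~' + index + '~');
  });
  return { packed: packed, words: words };
}

// Reverses pack() by looking each token up in the word list.
function unpack(bundle) {
  return bundle.packed.replace(/~(\d+)~/g, function (match, index) {
    return bundle.words[Number(index)];
  });
}

var source = 'function func(variable){var temp=2*variable;return temp}';
var bundle = pack(source, ['function', 'variable', 'return', 'temp']);
var restored = unpack(bundle);

// Evaluate the restored text, roughly what a packed script does when it loads.
var func = new Function('return (' + restored + ')')();
console.log(func(21)); // 42
```

A scheme like this only pays off when the same words repeat often enough to outweigh the word list and the decoder. The unpacking step also costs time on every page load.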
    Pop Catalin : Will become: function f(a){return 2*a;} ;) from 72 chars down to 27
    Hippo : If you want to use func in the global JavaScript scope, I would leave the name as it is... ;-)
  • Renaming and reordering code so that the gzip compressor gets better results.

    For example (not a very clever one):

    Original code:

    function mul(mul1, mul2)
    {
     return mul1 * mul2;
    }
    
    function print(str)
    {
      // do something 
    }
    
    function add(add1, add2)
    {
     return add1 + add2;
    }
    

    Modified code:

    function mul(a,b)
    {
     return a * b;
    }
    
    function add(a, b)
    {
     return a + b;
    }
    
    function print(str)
    {
      // do something 
    }
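One way to check whether a rewrite like this helps is to gzip both versions and compare the sizes. The sketch below assumes Node.js and its built-in `zlib` module; `gzippedSize` is a hypothetical helper name. On files this small, gzip's fixed header and trailer can dominate, so the numbers only become meaningful on realistic files.

```javascript
var zlib = require('zlib');

// Returns the size in bytes of the gzip-compressed form of a string.
function gzippedSize(source) {
  return zlib.gzipSync(Buffer.from(source, 'utf8')).length;
}

var original = [
  'function mul(mul1, mul2)', '{', ' return mul1 * mul2;', '}',
  'function print(str)', '{', '  // do something', '}',
  'function add(add1, add2)', '{', ' return add1 + add2;', '}'
].join('\n');

var modified = [
  'function mul(a,b)', '{', ' return a * b;', '}',
  'function add(a, b)', '{', ' return a + b;', '}',
  'function print(str)', '{', '  // do something', '}'
].join('\n');

// Print raw and gzipped byte counts for each version.
console.log('original:', Buffer.byteLength(original), 'raw,', gzippedSize(original), 'gzipped');
console.log('modified:', Buffer.byteLength(modified), 'raw,', gzippedSize(modified), 'gzipped');
```

The intuition is that gzip replaces repeated byte sequences with short back-references, so placing similar functions next to each other and reusing the same parameter names gives it more repetition to exploit.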
    
    Rob : I'm down voting this answer as you didn't really answer the question, namely what *techniques* are being used.
    Shay Erlichmen : sorry Rob, I didn't read the entire question. New answer, hope I do better this time :)
    Rob : @Shay - Yes that's better.
    Shay Erlichmen : It's just a sample of the idea; of course, you can only perform operations that won't affect the result.
