About Me

My photo
Author of Groovy modules: GBench, GProf, Co-author of a Java book パーフェクトJava, Game Developer at GREE

Thursday, October 13, 2011

A way to write Groovy's AST easily

It is hard to write AST

Groovy provides a DSL to generate AST for implementation of AST transformation. But the DSL is complicated and hard to learn. Let's assume that you want to generate AST of GString:
"Hi, $name."
In this case, the DSL will be the following. Who can remember this stuff as many as the syntax?
gString 'Hi, $name.', {
    strings {
        constant 'Hi, '
        constant '.'
    }
    values {
        variable 'name'
    }
}

AST is also can be generated from code, but it does not benefit from the compiler checking (Joachim Baumann explains about it) and performance is less than the DSL. Please see the following benchmark:
@Grab('com.googlecode.gbench:gbench:0.2.2')
import gbench.BenchmarkBuilder
import org.codehaus.groovy.ast.builder.AstBuilder

def benchmarks = new BenchmarkBuilder().run {
    'DSL to AST' {
        new AstBuilder().buildFromSpec {
            gString 'Hi, $name.', {
                strings {
                    constant 'Hi, '
                    constant '.'
                }
                values {
                    variable 'name'
                }
            }
        }
    }
    'Code to AST' {
        new AstBuilder().buildFromString('"Hi, $name"')
    }
}
benchmarks.prettyPrint()

            user system cpu    real

DSL to AST     0      0   0  339918
Code to AST    0      0   0 2076590
So at the end of the day, you will cheat implementation code or test code of the DSL while writing AST. I believe that you can easily imagine it is a hard work.

How can we write the DSL easily?

I created a library that automatically generates the DSL from code to resolve the problem. The library, named AstSpecBuilder is now available here. The usage is very simple, just pass code to build method:
import astspecbuilder.*

def spec = new AstSpecBuilder().build('"Hi, $name."')

def expectedSpec =  '''\
block {
    returnStatement {
        gString 'Hi, $name.', {
            strings {
                constant 'Hi, '
                constant '.'
            }
            values {
                variable 'name'
            }
        }
    }
}
'''
assert expectedSpec == spec
It also can generate the DSL from AST, or rather the method in the above example is just a shortcut for the following code:
import astspecbuilder.*

def ast = new AstBuilder.buildFromString('"Hi, $name."')
def spec = new AstSpecBuilder().build(ast)

The indent string is 4 spaces by default but it has an option to change it. In the following example, the DSL will be indented with a tab:
def spec = new AstSpecBuilder(indent: '\t').build('"foo"')

def expectedSpec = '''\
block {
\treturnStatement {
\t\tconstant 'foo'
\t}
}
'''
assert expectedSpec == spec

Wednesday, July 20, 2011

GBench 0.2.0 released

I released GBench 0.2.0 today. GBench is a benchmarking framework for Groovy. This framework allows you to easily benchmark for Groovy programs by providing two powerful features, an AST transformation and a builder.


What's New
  • Added support for CPU time.
  • Improved the builder API.
    • Added better (Groovier) syntax to add benchmark target code blocks.
                ----
                run {
              with 'label', {
              }
          }
          ----
                ->
                ----
                run {
              label {
              }
          }
          ----
    • Added new options for run().
      • "average", "idle", "trim". See its javadoc for usage. Thanks to Yasuharu Nakano (author of GroovyServ) for providing the source code.
      • "repeat". This option is an alternative to "time" option, but "time" option still available for backward compatibility.
    • Added new APIs.
      • sum(), average(), prettyPrint(). See their javadoc for usage.
  • Changed versioning scheme.
        YY.MM.DD
        ->
        Major.Minor.Micro


Resolved Issues
  • The name of the system property to specify @Benchmark's default handling contains the old domain.
        "groovybenchmark.sf.net.defaulthandle"
        ->
        "gbench.defaulthandle"


Examples
  • AST transformation Example:

  • Builder Example:

Please try and let me know your feedback (via this blog's comment, the project's issue tracking system, or Twitter:@nagai_masato). Your feedback helps GBench to continue to impove.

Sunday, July 10, 2011

GBench available on Maven Central!

I'm happy to announce that GBench 11.07.05 is now available on the Maven Central Repository. It means that you can use GBench via Grape. Please try GBench and let me know your feedback!



Monday, July 4, 2011

GBench project moved to Google Code!

I moved GBench project hosting over Google Code. 


http://code.google.com/p/gbench/


GBench is a benchmarking framework for Groovy. 
Please go to the site and try GBench!

Sunday, July 3, 2011

Groovy Quiz: How many Elvises are there?

How many Elvises are there?:


[]?.size?:[].@size?:[]*.size?:[].&size?':[].size?':[]

Saturday, July 2, 2011

Poison for Groovy Part 2

Unfortunately, I had success in generating a new poison for Groovy today (I already found another poison. Please read my previous post). 


There are three steps to generate it:


1. Get a Windows machine which has Groovy is installed in it. This is a VERY HARD mission.
2. Set JAVA_OPTS environment variable with a double-quoted string has a space character:
----
> set JAVA_OPTS=-Daprop="a value"
----
3. Run groovy:
----
> groovy -h
----


Then you'll get the follwing error:
----
value"" was unexpected at this time.
----


I already reported two issues related to this bug and provided patches for them (See GROOVY-4910 and GANT-126). I hope these issues will be resolved soon, Windows-specific issues are always left unresolved even if they have patches, though. 


Groovy is a cross-platform language that runs on a virtual machine, right?

Saturday, June 25, 2011

GBench = @Benchmark Annotation + BenchmarkBuilder

I released GBench, a benchmarking framework for Groovy. Now GBench has two features, @Benchmark Annotation and BenchmarkBuilder. @Benchmark Annotation is an annotation that allows to measure execution time of methods without modifying production code and is already published. Please read the previous post for more detail information. BenchmarkBuilder is a convenient builder to write benchmark code easily and now appears for the first time.


The following code is an example to get benchmark string concatenation by repeating them 1000 times:
----
def strings = ['GBench', ' = ', '@Benchmark Annotation', ' + ', 'BenchmarkBuilder']
def benchmark = new BenchmarkBuilder()
benchmark.run times: 1000, {
    with '+=', { 
        def s = ''
        for (string in strings) {
            s += string    
        }
    }
    with 'concat', { 
        def s = ''
        for (string in strings) {
            s.concat string    
        }
    }
    with 'string builder', {
        def sb = new StringBuilder()    
        for (string in strings) {
            sb << string    
        }
        sb.toString() 
    }
    with 'join', {
        strings.join()
    }
}
println benchmark
----


The output will be like this:
----
                 time
+=             18197705
concat         7669621
string builder 9420539
join           5248348
----

Of course you can sort the results:
----
println benchmark.sort({ lhs, rhs -> lhs.time <=> rhs.time })
----


----
                 time
join            5248348
concat          7669621
string builder  9420539
+=             18197705
----

and also you can handle the results as you want!:
----
new File('benchmark.csv').withWriter { file ->
    file.writeLine 'label,time(ms)'
    benchmark.sort().each { bm ->
        file.writeLine "${bm.label},${bm.time / 1000000}"
    }
}
----


----
> cat benchmark.csv
label,time(ms)
join,5.248348
concat,7.669621
string builder,9.420539
+=,18.197705
----


For now, GBench can measure only wall-clock time but I'm planning to also support CPU time and user time in a future release.


You can download GBench from here. Please try and let me know your feedback!

Friday, June 10, 2011

@Benchmark annotation for Groovy gets a big update!

I released @Benchmark annotation for Groovy v11.06.11. This release includes some big changes and a big bug fix:


- Changes
  - Added classes / methods information to benchmark results
  - New option to customize handling of benchmark results
  - Supported class annotation


- Bug fixes
  - Fixed a problem that not working with methods of classes


Added classes / methods information to benchmark results


You can get benchmarked methods and their classes information as benchmark results. For example with the following code,
----
package foo


class Foo {
    @Benchmark
    def foo() {
    }
}
----


the output will be:
----
foo.Foo java.lang.Object foo(): xxx ns
----


New option to customize handling of benchmark results


Added "value" option to customize handling of benchmark results instead of poor options, "prefix" and "suffix". There are three ways to assign value to it:


- Use handler classes
- Use closures
- Use a system property


In the case of using handler classes, create classes that implement Benchmark.BenchmarkHandler interface and add two methods, handle() and getInstance() to them:
----
class MyHandler implements Benchmark.BenchmarkHandler {
    static def instance = new MyHandler()
    static MyHandler getInstance() {
        instance    
    }
    void handle(klass, method, time) {
        println("${method} of ${klass}: ${(time/1000000) as long} ms")
    }    
}
----


Yes, singleton classes as the above example can be writtern shorter by using @Singleton annotation in Groovy :-)
----
@Singleton
class MyHandler implements Benchmark.BenchmarkHandler {
    void handle(klass, method, time) {
        println("${method} of ${klass}: ${(time/1000000) as long} ms")
    }
}
----


In the end, assign handler classes to @Benchmark:
----
@Benchmark(MyHandler.class)
def foo() {
}
----


Since Groovy 1.8, you can also use closures instead of handler classes. With closures, you just need to assign closures that handle benchmark results:
----
@Benchmark({println("${method} of ${klass}: ${(time/1000000) as long} ms")})
def foo() {
}
----


And you can replace the default handling operation with a system property, "groovybenchmark.sf.net.defaulthandle":
----
> groovy -cp groovybenchmark-11.06.11.jar 
-Dgroovybenchmark.sf.net.defaulthandle="println(method + ' of ' + klass + ': ' + ((time/1000000) as long) + ' ms')" foo\Foo.groovy
----


In the cases of the examples, the outputs will be:
----
java.lang.Object foo() of foo.Foo: xxx ms
----
  
Supported class annotation


By annotating classes, you can get benchmark results of all the methods of the classes:
----
package foo


@Benchmark
class Foo {
    def foo() {
    }
    def bar() {
    }
}
----


In the case of the example, you will get the benchmark results of foo() and bar() methods:
----
foo.Foo java.lang.Object foo(): xxxx ns
foo.Foo java.lang.Object bar(): xxxx ns
----


And this means, now you've got a power to benchmark all the methods of your program without modifying code and profiling tools! Because Groovy 1.8 provides a compilation customizer to apply annotations to all the classes:
----
// BenchmarkGroovyc.groovy
import groovybenchmark.sf.net.Benchmark


import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer
import org.codehaus.groovy.tools.FileSystemCompiler


def cc = new CompilerConfiguration()
cc.addCompilationCustomizers(new ASTTransformationCustomizer(Benchmark))
new FileSystemCompiler(cc).commandLineCompile(args)
----


----
> groovy -cp groovybenchmark-11.06.11.jar BenchmarkGroovyc.groovy MyApp.groovy
> java -cp .;%GROOVY_HOME%\embeddable\groovy-all-1.8.0.jar;groovybenchmark-11.06.11.jar MyApp 
----


----
Xxx xxxx foo(): xxx ns
Xxx xxxx bar(): xxx ns
Xxx xxxx baz(): xxx ns
MyApp java.lang.Object main(java.lang.Object): xxx ns
----


Please try and let me know your feedback!

Tuesday, May 31, 2011

@Benchmark annotation for Groovy has been updated to fix bugs

Today I updated @Benchmark annotation for Groovy to resolve the following (embarrassing) bugs:


- java.lang.VerifyError occurs with Groovy 1.7
- java.lang.UnsupportedClassVersionError occurs with JVM 1.5


Now I have made sure that @Benchmark works with Groovy 1.7/1.8 and JVM 1.5/1.6. 
If you are new to @Benchmark, please read the last post.

Saturday, May 28, 2011

I released @Benchmark annotation for Groovy!

@Benchmark annotation allows to measure execution time of methods without adding code. The following example is the standard use of @Benchmark:
----
@Benchmark
def method() {
    // an operation takes 1 ms
}


method()
----
In this case, "1000000" will be printed out in the standard output. @Benchmark express the time in nanoseconds.


There are two options to customize the output format:
----
@Benchmark(prefix = 'execution time of method(): ', suffix = ' ns')
def method() {
    // an operation takes 1 ms
}


method()
----
Then the output will be "execution time of method(): 1000000 ns".


@Benchmark uses this.println(value) for printing out. So you can redirect the outputs by setting an appropriate writer for the purpose to the "out" property:
----
def results = new StringBuffer()
this.setProperty("out", new StringBufferWriter(results))


// method calls


this.setProperty("out", null)


results.toString().eachLine { result ->
    println "${result.asType(int.class) / 1000000} ms"
}
----


You can download it from here.

Friday, May 27, 2011

Groovy Mascot Tournament - Mr.Groovy

I've been thinking that Groovy should have a mascot.
One idea that I have is "Mr. Groovy":


 Gr--
  oo-;
   v 
 / y  \


Something good, isn't it?


Sorry, it's a joke... but I really want a cool mascot for Groovy.

Sunday, May 22, 2011

Exploring Groovy 1.8 - Is the code really optimized?


Groovy's biggest issue is performance. In 1.8, optimizations for integer operations and direct method calls are done as the beginning of further performance improvements. Both of the optimizations has the same concept, "Try to compile as it is written". It does not mean the Groovy compiler had failed syntax analysis. Groovy source code are dramatically converted on compilation time to make possible flexible metaprogramming (I mentioned about it in my previous post, "Inside GDK"). The problem is there is no exception. I'll take a method to calculate a Fibonacci number as example:
----
int fib(int n) {
    if (n < 2) return n
    return fib(n - 1) + fib(n - 2)
}
----

The code is not needed to receive benefits of metaprogramming, but it is converted like the following code in 1.7 or the earlier versions(in the actual code,  Groovy APIs appear and it is little more complicated but I omit them because they are not important in this post). All of the operations/methods are replaced to reflections and boxing and unboxing are done to the integers repeatedly:
----
int fib(int n) {
    // Get the Method objects
    if (lessThanMethod.invoke(Integer.valueOf(n), Integer.valueOf(2))) 
        return Integer.valueOf(n).intValue();
    return 
        ((Integer) plusMethod.invoke(
            fibMethod.invoke(
                minusMethod.invoke(Integer.valueOf(n), Integer.valueOf(1)
            ),
            fibMethod.invoke(
                minusMethod.invoke(Integer.valueOf(n), Integer.valueOf(2)
            )
        )).intValue();
    );
----

It will be optimized as the following in 1.8. This is the meaning of "Try to compile as it is written" that I said in the beginning:
----
int fib(int n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}
----

By the way, the story is not ended yet. Because, the requirements to receive benefits of the optimizations are much harder than we expect. Obeying the following warnings that are written in the release notes is not enough:
1. If you want the optimization of integer operations, must not mix int type and Integer type in a expression
2. If you want the optimization of direct method calls, the method must be done on "this" and all of the argument types must be primitive or final.

See the following table. This is what I compare the execution time between 1.8.0 and 1.7.10 with methods in different ways to calculate a Fibonacci number:
Execution time of Fibonacci(30) in nanoseconds
Rank
Algorithm / Style
v1.8
v1.7
Improvement
1
Iteration + Groovy-style for loop
49720
46276
-6.93%
Down :-(
2
Iteration + Java/C-style for loop
129360
77824
-39.84%
Down :-(
3
Recursion + if + return
45080820
548522821
+1116.75%
Up :-)
4
Recursion + ternary operator
120489259
538476119
+346.91%
Up :-)
5
Recursion + if-else
259945537
546381516
+110.19%
Up :-)

The result shows us three things:
1. The performances of iteration versions were slightly worse
2. The performances of recursion versions were highly improved
3. The performance gaps among the coding style got wider

Those reasons are simple. Most of the cases failed both or one of the optimizations. In the table, all of the cases except for recursion + if + return version failed. For example, recursion + if + else version is this:
----
int fib(int n) {
    if (n < 2) n
    else fib(n - 1) + fib(n - 2)
}
----

The code will be converted like the following (I'll repeat that it slightly differs from the actual code to be exact):
----
int fib(int n) {
    if (n < 2) return n;
    else return 
        ((Integer) plusMethod.invoke(
            fibMethod.invoke(Integer.valueOf(n - 1)),
            fibMethod.invoke(Integer.valueOf(n - 2))
        )).intValue();
    );
}
----

That small difference affects success of the optimizations and make the more than five times difference in performance. I guess this problem will be fixed in a future release, but at least, it is not time to rejoice yet.

By the way, is the code really optimized?


(The code list that I used for the calculation)

Iteration + Groovy-style for loop:
----
int fib(int n) {

    int a = 0
    int b = 1
    int c
    for (int i in 2..n) {
        c = a + b
        a = b
        b = c
    }
    b
}
----

Iteration + Java/C-style for loop:
----
int fib(int n) {
    int a = 0
    int b = 1
    int c
    for (int i = 2; i <= n; i++) {
        c = a + b
        a = b
        b = c        
    }
    b
}
----

Recursion + if + return:
----
int fib(int n) {
    if (n < 2) return n
    return fib(n - 1) + fib(n - 2)
}
----

Recursion + ternary operator:
----
int fib(int n) {
    n < 2 ? n : fib(n - 1) + fib(n - 2)  
}
----

Recursion + if-else:
----
int fib(int n) {
    if (n < 2) n
    else fib(n - 1) + fib(n - 2)
}
----