How many s-expression formats are there for Ruby?
Posted by matijs 04/11/2012 at 13h34
Once upon a time, there was only UnifiedRuby, a cleaned-up representation of the Ruby AST.
Now, what do we have?
-
RubyParser before version 3; this is the UnifiedRuby format:
RubyParser.new.parse "foobar(1, 2, 3)" # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
-
RubyParser version 3:
Ruby18Parser.new.parse "foobar(1, 2, 3)" # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
Ruby19Parser.new.parse "foobar(1, 2, 3)" # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
-
Rubinius; this is basically the UnifiedRuby format, but using Arrays.
"foobar(1,2,3)".to_sexp # => [:call, nil, :foobar, [:arglist, [:lit, 1], [:lit, 2], [:lit, 3]]]
-
RipperRubyParser; a wrapper around Ripper producing UnifiedRuby:
RipperRubyParser::Parser.new.parse "foobar(1,2,3)" # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
How do these fare with the new Ruby 1.9 syntax? Let’s try the new hash literal syntax. RubyParser before version 3 and Rubinius (even in 1.9 mode) can’t parse it.
-
RubyParser 3:
Ruby19Parser.new.parse "{a: 1}" # => s(:hash, s(:lit, :a), s(:lit, 1))
-
RipperRubyParser:
RipperRubyParser::Parser.new.parse "{a: 1}" # => s(:hash, s(:lit, :a), s(:lit, 1))
And what about stabby lambdas?
-
RubyParser 3:
Ruby19Parser.new.parse "->{}" # => s(:iter, s(:call, nil, :lambda), 0, nil)
-
RipperRubyParser:
RipperRubyParser::Parser.new.parse "->{}"
# => s(:iter, s(:call, nil, :lambda, s(:arglist)),
#      s(:masgn, s(:array)), s(:void_stmt))
That looks like a big difference, but this is just the degenerate case. When the lambda has some arguments and a body, the difference is minor:
-
RubyParser 3:
Ruby19Parser.new.parse "->(a){foo}"
# => s(:iter, s(:call, nil, :lambda),
#      s(:lasgn, :a), s(:call, nil, :foo))
-
RipperRubyParser:
RipperRubyParser::Parser.new.parse "->(a){foo}"
# => s(:iter, s(:call, nil, :lambda, s(:arglist)),
#      s(:lasgn, :a), s(:call, nil, :foo, s(:arglist)))
So, what’s the conclusion? For parsing Ruby 1.9 syntax, there are really only two options: RubyParser and RipperRubyParser. The latter stays closer to the UnifiedRuby format, but the difference is small.
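To make "the difference is small" concrete: in the non-degenerate examples above, the two outputs differ only in the s(:arglist, ...) wrapper around call arguments. The following is a minimal sketch of removing that wrapper. It works on plain nested Arrays (as in the Rubinius output) so that it needs no gems, and splice_arglist is a hypothetical helper name, not part of RubyParser or RipperRubyParser. It does not attempt to cover the degenerate "->{}" case, where the trees also differ in other nodes.

```ruby
# Hypothetical helper (not part of any library mentioned here):
# rewrites a UnifiedRuby-style tree, given as plain nested Arrays, so that
# the arguments of every :call node are spliced directly into that node
# instead of being wrapped in a trailing [:arglist, ...] node.
def splice_arglist(node)
  # Leaves such as symbols, numbers and nil are returned unchanged.
  return node unless node.is_a?(Array)

  # Convert the children first, so nested calls are handled too.
  children = node.map { |child| splice_arglist(child) }

  last = children.last
  if children.first == :call && last.is_a?(Array) && last.first == :arglist
    # Drop the wrapper node and append its contents to the call.
    children[0...-1] + last[1..-1]
  else
    children
  end
end

# The RipperRubyParser result for "->(a){foo}" from above, written as Arrays.
with_arglist = [:iter,
                [:call, nil, :lambda, [:arglist]],
                [:lasgn, :a],
                [:call, nil, :foo, [:arglist]]]

p splice_arglist(with_arglist)
```

Applied to the Array form of the "foobar(1, 2, 3)" tree, the same helper moves the three s(:lit, ...) nodes up into the :call node, which is the shape RubyParser 3 produces.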
RubyParser’s results are a little neater, so RipperRubyParser should probably conform to the same format. Reek can then be updated to use the cleaner format, and use either library for parsing.