Hacking parse.y
   Tatsuhiro UJIHISA
Me

• Ruby experience: 4 years
 • Rails application
 • Data mining tool
• Learning English here: 5 months
• Looking for a job!
Me


• Presentations in Japan
 • Kansai Ruby Workshop
 • RubyKaigi2008, 2009
This is my first English
    presentation.
Hacking parse.y
Fixing the Ruby parser to understand Ruby
 • Introducing new syntax
  • {:key :-) "value"}
  • 'symbol
  • ++i
  • def A#b(c)
MRI Inside

• MRI (Matz Ruby Implementation)
• $ ruby -v
  ruby 1.9.2dev (2009-08-05 trunk 24397) [i386-darwin9.7.0]


• Written in C
 • array.c, vm.c, gc.c, etc...
ruby 1.8 vs 1.9

• ~1.8
 • Parser: parse.y
 • Evaluator: eval.c
• 1.9~
 • Parser: parse.y
 • Evaluator: YARV (vm*.c)
Matz said

• Ugly: eval.c and parse.y
 RubyConf2006
• The original evaluator has now
 been entirely replaced by YARV
MRI Parser

• MRI uses yacc
  (parser generator for C)
• parse.y is processed by bison:
  bison -d -o y.tab.c parse.y
  sed -f ./tool/ytab.sed -e "/^#/s!y.tab.c!parse.c!" y.tab.c > parse.c.new
  ...
parse.y

• One of the darkest sides of MRI
• $ wc -l *{c,h,y} | sort -n
  ...
 9261 io.c
 10350 parse.y
 16352 parse.c # (automatically generated)
 183370 total
(Broad) Parser

• Lexer (yylex)
 • Bytes → Symbols
• Parser (yyparse)
 • Symbols → Syntax Tree
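To see the two stages on a small input, Ruby's bundled Ripper library (a second front end generated from the same parse.y) can print both the token stream and the tree. This is only an illustration on a stock Ruby: the input string is made up, and the exact shape of the output varies between Ruby versions.

```ruby
require 'ripper'

src = "1 + 2"

# Stage 1, lexer: characters to tokens (position, type, text, ...).
tokens = Ripper.lex(src)
token_types = tokens.map { |tok| tok[1] }
p token_types

# Stage 2, parser: tokens to a tree, shown here as nested arrays.
tree = Ripper.sexp(src)
p tree
```

The token types correspond loosely to the %token names in parse.y, and the tree nodes to the dispatch calls in the grammar actions.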
Tokens in Lexer
%token tUPLUS         /* unary+ */
%token tUMINUS        /* unary- */
%token tPOW           /* ** */
%token tCMP           /* <=> */
%token tEQ            /* == */
%token tEQQ           /* === */
%token tNEQ           /* != */
%token tGEQ           /* >= */
%token tLEQ           /* <= */
%token tANDOP tOROP   /* && and || */
%token tMATCH tNMATCH /* =~ and !~ */
%token tDOT2 tDOT3    /* .. and ... */
%token tAREF tASET    /* [] and []= */
%token tLSHFT tRSHFT  /* << and >> */
%token tCOLON2        /* :: */
%token tCOLON3        /* :: at EXPR_BEG */
%token <id> tOP_ASGN  /* +=, -= et... */
%token tASSOC         /* => */
%token tLPAREN        /* ( */
%token tLPAREN_ARG    /* ( */
%token tRPAREN        /* ) */
%token tLBRACK        /* [ */
%token tLBRACE        /* { */
%token tLBRACE_ARG    /* { */
%token tSTAR          /* * */
%token tAMPER         /* & */
%token tLAMBDA        /* -> */
%token tSYMBEG tSTRING_BEG tXSTRING_... tWORDS_BEG tQWORDS_BEG
%token tSTRING_DBEG tSTRING_DVAR tST...
(detour)

• MRI: parse.y (10350 lines)
• JRuby: src/org/jruby/parser/{DefaultRubyParser.y, Ruby19Parser.y}
  (1886, 2076 lines)
• Rubinius: lib/ruby_parser.y (1795 lines)
Case 1:
             :-)

• Hash literal
  {:key => 'value'}
  {:key :-) 'value'}
• :-) is just an alias of =>
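As a baseline before any patching, a stock Ruby can be asked whether each spelling parses. The sketch below assumes an unmodified interpreter and uses Ripper.sexp, which returns nil when the source has a syntax error.

```ruby
require 'ripper'

# Ripper.sexp returns nil when the source does not parse.
arrow  = Ripper.sexp("{:key => 'value'}")
smiley = Ripper.sexp("{:key :-) 'value'}")

puts "=>  parses: #{!arrow.nil?}"
puts ":-) parses: #{!smiley.nil?}"
```

On a patched build the second line would be expected to change, since the lexer would hand the parser the same token for both spellings.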
Mastering “Colon”
Colons in Ruby

• A::B, ::C
• :symbol, :"sy-m-bol"
• a ? b : c
• {a: b}
• when 1: something (in 1.8)
static int
parser_yylex(struct parser_params *parser) {
    ...
    switch (c = nextc()) {
      ...
      case '#': /* it's a comment */
      ...
      case ':':
        c = nextc();
        if (c == ':') {
            if (IS_BEG() ||...
      ...
    }
    ... (about 1300 lines)
How does the parser deal
    with colons?

• :: → tCOLON2 or tCOLON3
 • tCOLON2 Net::URI
 • tCOLON3 ::Kernel
lex_state
enum lex_state_e {
    EXPR_BEG,        /* ignore newline, +/- is a sign. */
    EXPR_END,        /* newline significant, +/- is an operator. */
    EXPR_ENDARG,     /* ditto, and unbound braces. */
    EXPR_ARG,        /* newline significant, +/- is an operator. */
    EXPR_CMDARG,     /* newline significant, +/- is an operator. */
    EXPR_MID,        /* newline significant, +/- is an operator. */
    EXPR_FNAME,      /* ignore newline, no reserved words. */
    EXPR_DOT,        /* right after `.' or `::', no reserved words */
    EXPR_CLASS,      /* immediate after `class', no here document. */
    EXPR_VALUE       /* alike EXPR_BEG but label is disallowed. */
};
case ':':
  c = nextc();
  if (c == ':') {
      if (IS_BEG() ||
          lex_state == EXPR_CLASS ||
          (IS_ARG() && space_seen)) {
          lex_state = EXPR_BEG;
          return tCOLON3;
      }
      lex_state = EXPR_DOT;
      return tCOLON2;
  }
...
  if (lex_state == EXPR_END ||
      lex_state == EXPR_ENDARG ||
      (c != -1 && ISSPACE(c))) {
      pushback(c);
      lex_state = EXPR_BEG;
      return ':';
  }
  switch (c) {
    case '\'':
      lex_strterm = NEW_STRTERM(str_ssym, c, 0);
      break;
    case '"':
      lex_strterm = NEW_STRTERM(str_dsym, c, 0);
      break;
    default:
      pushback(c);
      break;
  }
  lex_state = EXPR_FNAME;
  return tSYMBEG;
How does the parser deal
with colons? (summary)
• :: → tCOLON2 or tCOLON3
• EXPR_END, EXPR_ENDARG, or followed by a space → ':'
• otherwise → tSYMBEG
 • :' → str_ssym
 • :" → str_dsym
So,
• :-) → tASSOC
• :: → tCOLON2 or tCOLON3
• EXPR_END, EXPR_ENDARG, or followed by a space → ':'
• otherwise → tSYMBEG
 • :' → str_ssym
 • :" → str_dsym
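The decision order in this list can be written out as a toy Ruby model. It is a sketch, not the C code: colon_token, BEG_STATES and ARG_STATES are hypothetical names, the state lists are my assumption about what IS_BEG() and IS_ARG() expand to, and placing the ":-)" check first is simply how the list above orders it.

```ruby
# Assumed expansions of the IS_BEG() and IS_ARG() macros (not shown in the talk);
# EXPR_CLASS is listed because the C code tests it alongside IS_BEG().
BEG_STATES = [:EXPR_BEG, :EXPR_MID, :EXPR_VALUE, :EXPR_CLASS].freeze
ARG_STATES = [:EXPR_ARG, :EXPR_CMDARG].freeze

# rest:  the source text right after the ':' that was just read
# state: the current lex_state, as a Symbol
def colon_token(rest, state, space_seen: false)
  return :tASSOC if rest.start_with?("-)")          # new rule: ":-)"

  if rest.start_with?(":")                          # "::"
    if BEG_STATES.include?(state) || (ARG_STATES.include?(state) && space_seen)
      return :tCOLON3                               # ::Kernel
    end
    return :tCOLON2                                 # Net::URI
  end

  c = rest[0]
  if state == :EXPR_END || state == :EXPR_ENDARG || (c && c =~ /\s/)
    return :":"                                     # a plain ':' token
  end

  :tSYMBEG                                          # :symbol, :'...', :"..."
end

p colon_token('-) "value"}', :EXPR_END, space_seen: true)
p colon_token(":Kernel", :EXPR_BEG)
p colon_token(":URI", :EXPR_END)
p colon_token("symbol", :EXPR_BEG)
```

The model leaves out the lex_state updates and the pushback calls that the real branch performs.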
:-)
Case 2:
    Lisp Like Symbol
• Symbol Literal
 :vancouver
 'vancouver
• Ad-hoc
 p :a, :b
 p 'a, 'b
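Why this one is ad hoc becomes visible when a stock Ruby lexes the two spellings. The sketch below uses the bundled Ripper library; the variable names are only for illustration.

```ruby
require 'ripper'

# Collect [type, text] pairs, dropping positions and other fields.
colon_form = Ripper.lex("p :a, :b").map { |tok| [tok[1], tok[2]] }
quote_form = Ripper.lex("p 'a, 'b").map { |tok| [tok[1], tok[2]] }

p colon_form
p quote_form

# In the quote form, everything between the two quotes is one string.
string_body = quote_form.find { |type, _| type == :on_tstring_content }
p string_body
```

So a lexer that wants 'a to be a symbol has to decide, at the opening quote, that no matching closing quote is meant, which is the condition the next slides leave open.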
Single Quote
(in parser_yylex)
...
case '\'':
      lex_strterm = NEW_STRTERM(str_squote, '\'', 0);
      return tSTRING_BEG;
...
Single Quote
(in parser_yylex)
...
case '\'':
      if (??? condition ???) {
           lex_state = EXPR_FNAME;
           return tSYMBEG;
      }
      lex_strterm = NEW_STRTERM(str_squote, '\'', 0);
      return tSTRING_BEG;
...
(loop
  (lambda (p 'good)))
Case 3: Pre-increment
Operator

• ++i
• i = i.succ
  (NOT i = i + 1)
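A short check on a stock Ruby shows both halves of this slide: what ++i does today, and why succ is a broader choice than + 1. The values are made up for illustration.

```ruby
i = 1
j = ++i            # today this is +(+i): two unary pluses, no assignment
p [i, j]

# succ is defined beyond numbers, so "i = i.succ" also covers strings.
p 1.succ
p "az".succ

# "i = i + 1" would not: String#+ rejects an Integer.
plus_one_failed =
  begin
    "az" + 1
    false
  rescue TypeError
    true
  end
p plus_one_failed
```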
Lexer
@@ -685,6 +685,7 @@ static void token_info_pop(struct parser_params*, const char *token);
 %type <val> program reswords then do dot_or_colon
 %*/
 %token tUPLUS /* unary+ */
+%token tINCR /* ++var */
 %token tUMINUS /* unary- */
 %token tPOW    /* ** */
 %token tCMP    /* <=> */
               (Actually there are more trivial fixes)
regenerate id.h

• id.h is generated automatically
 from parse.y during make
• $ rm id.h
 $ make
parser example
variable     : tIDENTIFIER
        |   tIVAR
        |   tGVAR
        |   tCONSTANT
        |   tCVAR
        |   keyword_nil {ifndef_ripper($$ = keyword_nil);}
        |   keyword_self {ifndef_ripper($$ = keyword_self);}
        |   keyword_true {ifndef_ripper($$ = keyword_true);}
        |   keyword_false {ifndef_ripper($$ = keyword_false);}
        |   keyword__FILE__ {ifndef_ripper($$ = keyword__FILE__);}
        |   keyword__LINE__ {ifndef_ripper($$ = keyword__LINE__);}
        |   keyword__ENCODING__ {ifndef_ripper($$ = keyword__ENCODING_
        ;
lhs     : variable
         {
         /*%%%*/
        if (!($$ = assignable($1, 0))) $$ = NEW_BEGIN(0);
         /*%
        $$ = dispatch1(var_field, $1);
         %*/
         }
     | primary_value '[' opt_call_args rbracket
         {
         /*%%%*/
        $$ = aryset($1, $3);
         /*%
        $$ = dispatch2(aref_field, $1, escape_Qundef($3));
         %*/
         }
  ...
BNF (part)
program    : compstmt

compstmt   : stmts opt_terms

stmts      : none
           | stmt
           | stmts terms stmt

stmt       : kALIAS fitem fitem
           | kALIAS tGVAR tGVAR
           ...
           | expr

expr       : kRETURN call_args
           | kBREAK call_args
           ...
           | '!' command_call
           | arg

arg        : lhs '=' arg
           | var_lhs tOP_ASGN arg
           | primary_value '[' aref_args ']' tOP...
           ...
           | arg '?' arg ':' arg
           | primary

primary    : literal
           | strings
           ...
           | tLPAREN_ARG expr ')'
           | tLPAREN compstmt ')'
           ...
           | kREDO
           | kRETRY
Assign
stmt : ...
 | mlhs '=' command_call
     {
     /*%%%*/
         value_expr($3);
         $1->nd_value = $3;
         $$ = $1;
     /*%
         $$ = dispatch2(massign, $1, $3);
     %*/
     }
mlhs
mlhs: mlhs_basic | ...
mlhs_basic: mlhs_head | ...
mlhs_head: mlhs_item ',' | ...
mlhs_item: mlhs_node | ...
mlhs_node: variable {
  $$ = assignable($1, 0); }
Method call
block_command        : block_call
| block_call '.' operation2 command_args
    {
    /*%%%*/
        $$ = NEW_CALL($1, $3, $4);
    /*%
        $$ = dispatch3(call, $1, ripper_id2sym('.'),
        $$ = method_arg($$, $4);
    %*/
    }
Mix!
var_ref: ...
| tINCR variable
    {
    /*%%%*/
        $$ = assignable($2, 0);
        $$->nd_value = NEW_CALL(gettable($$->nd_vid),
                                rb_intern("succ"), 0);
    /*%
        $$ = dispatch2(unary, ripper_intern("++@"), $2);
    %*/
    }
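The rule above builds by hand the tree that the parser already produces for i = i.succ: an assignment whose value is a call to succ on the same variable. That target shape can be inspected on a stock Ruby with Ripper; this is an illustration only, and the exact nesting may differ between Ruby versions.

```ruby
require 'ripper'

tree = Ripper.sexp("i = i.succ")
stmt = tree[1][0]       # first statement of the program

p stmt[0]               # node type of the statement
p stmt[1]               # left-hand side
p stmt[2]               # right-hand side
```

The left-hand side shows up as a var_field node, the same event name used by dispatch1(var_field, $1) in the lhs rule.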
++ruby
Case 4:
         def A#b

• A#b
 instance method b of class A
• A.b
 class method b of class A
A#b
class A    def A.b
  def b      ...
    ...    end
  end
end
A#b
def A#b    def A.b
  ...        ...
end        end
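For reference, this is what the two definitions mean in today's syntax. The class and the return values are made up; on a stock Ruby the proposed def A#b form is not available, because # starts a comment.

```ruby
class A
end

# What "def A#b" is meant to express: an instance method of A.
class A
  def b
    :instance
  end
end

# Existing syntax for a class (singleton) method of A.
def A.b
  :singleton
end

p A.new.b
p A.b
```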
#
(in parser_yylex)
case '#':                 /* it's a comment */
 /* no magic_comment in shebang line */
 if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
     if (comment_at_top(parser)) {
            set_file_encoding(parser, lex_p, lex_pend);
     }
 }
 lex_p = lex_pend;
#
(in parser_yylex)
case '#':                 /* it's a comment */
 c = nextc();
 pushback(c);
 if(lex_state == EXPR_END && ISALNUM(c)) return '#';
 /* no magic_comment in shebang line */
 if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
     if (comment_at_top(parser)) {
            set_file_encoding(parser, lex_p, lex_pend);
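The added check can be modelled as a small predicate. This is a sketch under assumptions: hash_starts_comment? is a hypothetical name, the state is passed as a Symbol, and ISALNUM is taken to mean ASCII letters and digits.

```ruby
# Returns true when '#' should still begin a comment under the patched rule,
# false when it would be returned to the parser as a '#' token.
def hash_starts_comment?(state, next_char)
  alnum = next_char.to_s.match?(/\A[A-Za-z0-9]\z/)
  !(state == :EXPR_END && alnum)
end

p hash_starts_comment?(:EXPR_END, "b")   # def A#b
p hash_starts_comment?(:EXPR_END, " ")   # x = 1 # note
p hash_starts_comment?(:EXPR_BEG, "b")   # #b at the start of a line
```

One consequence worth testing against a real patched build: a comment written directly after an expression with no space before the text would no longer be treated as a comment.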
Primary
primary: literal | ...
       | k_def singleton dot_or_colon {lex_state = EXPR_FNAME;} fname
           {
               in_single++;
               lex_state = EXPR_END; /* force for args */
           /*%%%*/
               local_push(0);
           /*%
           %*/
           }
         f_arglist
         bodystmt
         k_end
           {
           /*%%%*/
               NODE *body = remove_begin($8);
               reduce_nodes(&body);
               $$ = NEW_DEFS($2, $5, $7, body);
               fixpos($$, $2);
               local_pop();
           /*%
               $$ = dispatch5(defs, $2, $3, $5, $7, $8);
           %*/
               in_single--;
           }
| k_def cname '#' {lex_state = EXPR_FNAME;} fname
    {
        $<id>$ = cur_mid;
        cur_mid = $5;
        in_def++;
    /*%%%*/
        local_push(0);
    /*%
    %*/
    }
  f_arglist
  bodystmt
  k_end
    {
    /*%%%*/
        NODE *body = remove_begin($8);
        reduce_nodes(&body);
        $$ = NEW_DEFN($5, $7, body, NOEX_PRIVATE);
        fixpos($$, $7);
        fixpos($$->nd_defn, $7);
        $$ = NEW_CLASS(NEW_COLON3($2), $$, 0);
        nd_set_line($$, $<num>6);
        local_pop();
    /*%
        $$ = dispatch4(defi, $2, $5, $7, $8);
    %*/
        in_def--;
        cur_mid = $<id>6;
    }
Reference
Ruby




Minero AOKI, Yukihiro MATSUMOTO
"Ruby Hacking Guide"


HTML Version is available
Reference


• My blog
 http://ujihisa.blogspot.com
• All patches I showed are there
end
Appendix:
Imaginary Numbers
• Matz wrote a patch in
 [ruby-dev:38843]
• translation:
 [ruby-core:24730]
• It won't be accepted
Appendix:
 Imaginary Numbers

> 3i
=> (0 + 3i)
> 3i.class
=> Complex
Appendix 2:
  I'm looking for a job!
• ujihisa at gmail com
• Ruby, Rails, Merb, Sinatra, etc
• C, JavaScript, Vim script,
 HTML, Java, Haskell, Scheme
• Fluent in Japanese

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Hacking Parse.y with ujihisa

  • 1. Hacking parse.y Tatsuhiro UJIHISA
  • 2. Me • Ruby experience: 4 years • Rails application • Data mining tool • Learning English here: 5 mths • Looking for a job!
  • 3. Me • Presentations in Japan • Kansai Ruby Workshop • RubyKaigi2008, 2009
  • 4.
  • 5. This is my first English presentation.
  • 6. Hacking parse.y Fixing ruby parser to understand ruby • Introducing new syntax • {:key :-) "value"} • 'symbol • ++i • def A#b(c)
  • 7. MRI Inside • MRI (Matz Ruby Implementation) • $ ruby -v ruby 1.9.2dev (2009-08-05 trunk 24397) [i386-darwin9.7.0] • Written in C • array.c, vm.c, gc.c, etc...
  • 8. ruby 1.8 vs 1.9 • ~1.8 • Parser: parse.y • Evaluator: eval.c • 1.9~ • Parser: parse.y • Evaluator: YARV (vm*.c)
  • 9. Matz said • Ugly: eval.c and parse.y (RubyConf2006) • The original evaluator has since been replaced entirely by YARV
  • 10. MRI Parser • MRI uses yacc (parser generator for C)
      bison -d -o y.tab.c parse.y
      sed -f ./tool/ytab.sed -e "/^#/s!y.tab.c!parse.c!" y.tab.c > parse.c.new
      ...
  • 11. parse.y • One of the darkest sides
      $ wc -l *{c,h,y} | sort -n
      ...
        9261 io.c
       10350 parse.y
       16352 parse.c # (automatically generated)
      183370 total
  • 12. (Broad) Parser • Lexer (yylex) • Bytes → Symbols • Parser (yyparse) • Symbols → Syntax Tree
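The two stages on this slide can be observed from Ruby itself with the standard library's Ripper, which is generated from the same parse.y. This is a minimal sketch, not part of the talk; the helper names `token_types` and `syntax_tree` are made up for the illustration.

```ruby
require "ripper"

# Lexer stage: source text -> token stream.
# Each Ripper.lex entry starts with [[line, column], token_type, token_text].
def token_types(source)
  Ripper.lex(source).map { |token| token[1] }
end

# Parser stage: token stream -> syntax tree, shown as nested arrays.
def syntax_tree(source)
  Ripper.sexp(source)
end

p token_types("a = 1")
p syntax_tree("a = 1")
```

Whitespace shows up as its own token type in the lexer output, while the tree keeps only the assignment structure.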
  • 13. Tokens in Lexer
      %token tUPLUS /* unary+ */
      %token tUMINUS /* unary- */
      %token tPOW /* ** */
      %token tCMP /* <=> */
      %token tEQ /* == */
      %token tEQQ /* === */
      %token tNEQ /* != */
      %token tGEQ /* >= */
      %token tLEQ /* <= */
      %token tANDOP tOROP /* && and || */
      %token tMATCH tNMATCH /* =~ and !~ */
      %token tDOT2 tDOT3 /* .. and ... */
      %token tAREF tASET /* [] and []= */
      %token tLSHFT tRSHFT /* << and >> */
      %token tCOLON2 /* :: */
      %token tCOLON3 /* :: at EXPR_BEG */
      %token <id> tOP_ASGN /* +=, -= et [truncated on slide]
      %token tASSOC /* => */
      %token tLPAREN /* ( */
      %token tLPAREN_ARG /* ( */
      %token tRPAREN /* ) */
      %token tLBRACK /* [ */
      %token tLBRACE /* { */
      %token tLBRACE_ARG /* { */
      %token tSTAR /* * */
      %token tAMPER /* & */
      %token tLAMBDA /* -> */
      %token tSYMBEG tSTRING_BEG tXSTRING_ [truncated on slide]
             tWORDS_BEG tQWORDS_BEG
      %token tSTRING_DBEG tSTRING_DVAR tST [truncated on slide]
  • 14. (detour) • MRI: parse.y (10350 lines) • JRuby: src/org/jruby/parser/{DefaultRubyParser.y, Ruby19Parser.y} (1886, 2076 lines) • Rubinius: lib/ruby_parser.y (1795 lines)
  • 15. Case 1: :-) • Hash literal {:key => 'value'} {:key :-) 'value'} • :-) is just an alias of =>
  • 17. Colons in Ruby • A::B, ::C • :symbol, :"sy-m-bol" • a ? b : c • {a: b} • when 1: something (in 1.8)
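For reference, the colon forms listed on this slide look like this in running code. The module `Outer`, its constant, and the method `colon_forms` are placeholders invented for the sketch. The 1.8-only `when 1:` form is left out because later versions reject it.

```ruby
module Outer
  Inner = 42
end

# Collects one value per colon form so they can be compared side by side.
def colon_forms
  {
    :scoped    => Outer::Inner,                  # A::B, scope resolution
    :top_level => ::Outer::Inner,                # ::C, lookup from the top level
    :symbol    => :symbol,                       # plain symbol literal
    :quoted    => :"sy-m-bol",                   # quoted symbol literal
    :ternary   => (1 < 2 ? "yes" : "no"),        # a ? b : c
    :hash      => {a: 1}                         # 1.9 label syntax
  }
end

p colon_forms
```

Every one of these starts from the same `case ':'` branch in the lexer, which is why that branch needs the lexer state to decide what it is looking at.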
  • 18.
      static int
      parser_yylex(struct parser_params *parser)
      {
        ...
        switch (c = nextc()) {
          ...
          case '#': /* it's a comment */
          ...
          case ':':
            c = nextc();
            if (c == ':') {
              if (IS_BEG() ||...
          ...
        }
        ...
      (about 1300 lines)
  • 19. How does parser deal with colon? • :: → tCOLON2 or tCOLON3 • tCOLON2 Net::URI • tCOLON3 ::Kernel
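The difference between the two tokens is visible in the tree the parser builds. The sketch below uses Ripper from the standard library; `first_node_type` is a helper invented here. The node names it prints come from Ripper, not from the talk.

```ruby
require "ripper"

# Returns the node type Ripper assigns to the first statement of +source+.
def first_node_type(source)
  Ripper.sexp(source)[1][0][0]
end

p first_node_type("Net::URI")  # scoped lookup: the tCOLON2 case
p first_node_type("::Kernel")  # top-level lookup: the tCOLON3 case
```

On the Ruby versions this was checked against, the first call reports a `const_path_ref` node and the second a `top_const_ref` node.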
  • 20. lex_state
      enum lex_state_e {
        EXPR_BEG,    /* ignore newline, +/- is a sign. */
        EXPR_END,    /* newline significant, +/- is an operator. */
        EXPR_ENDARG, /* ditto, and unbound braces. */
        EXPR_ARG,    /* newline significant, +/- is an operator. */
        EXPR_CMDARG, /* newline significant, +/- is an operator. */
        EXPR_MID,    /* newline significant, +/- is an operator. */
        EXPR_FNAME,  /* ignore newline, no reserved words. */
        EXPR_DOT,    /* right after `.' or `::', no reserved words. */
        EXPR_CLASS,  /* immediate after `class', no here document. */
        EXPR_VALUE   /* alike EXPR_BEG but label is disallowed. */
      };
  • 21.
      case ':':
        c = nextc();
        if (c == ':') {
          if (IS_BEG() || lex_state == EXPR_CLASS ||
              (IS_ARG() && space_seen)) {
            lex_state = EXPR_BEG;
            return tCOLON3;
          }
          lex_state = EXPR_DOT;
          return tCOLON2;
        }
  • 22.
        ...
        if (lex_state == EXPR_END || lex_state == EXPR_ENDARG ||
            (c != -1 && ISSPACE(c))) {
          pushback(c);
          lex_state = EXPR_BEG;
          return ':';
        }
        switch (c) {
          case '\'':
            lex_strterm = NEW_STRTERM(str_ssym, c, 0);
            break;
          case '"':
            lex_strterm = NEW_STRTERM(str_dsym, c, 0);
            break;
          default:
            pushback(c);
            break;
        }
        lex_state = EXPR_FNAME;
        return tSYMBEG;
  • 23. How does parser deal with colon? (summary) • :: → tCOLON2 or tCOLON3 • EXPR_END, EXPR_ENDARG, or followed by whitespace → ':' • otherwise → tSYMBEG • :' → str_ssym • :" → str_dsym
  • 24. So, • :-) → tASSOC • :: → tCOLON2 or tCOLON3 • EXPR_END, EXPR_ENDARG, or followed by whitespace → ':' • otherwise → tSYMBEG • :' → str_ssym • :" → str_dsym
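The decision table above can be modelled in a few lines of Ruby. This is a simplified sketch of the C code on the earlier slides, not the patch itself. The method `lex_colon`, its `smiley:` switch, and the state lists for `IS_BEG()` and `IS_ARG()` are assumptions made for the illustration. It leaves out the state updates and the `str_ssym`/`str_dsym` string terminators.

```ruby
# Assumed expansions of the IS_BEG() and IS_ARG() macros.
BEG_STATES = [:EXPR_BEG, :EXPR_MID, :EXPR_VALUE, :EXPR_CLASS].freeze
ARG_STATES = [:EXPR_ARG, :EXPR_CMDARG].freeze

# rest: the characters that follow the ':' the lexer has just consumed.
# smiley: hypothetical switch that enables the ":-)" rule from slide 24.
def lex_colon(rest, lex_state, space_seen: false, smiley: false)
  return :tASSOC if smiley && rest.start_with?("-)")

  if rest.start_with?(":")
    top_level = BEG_STATES.include?(lex_state) ||
                lex_state == :EXPR_CLASS ||
                (ARG_STATES.include?(lex_state) && space_seen)
    return top_level ? :tCOLON3 : :tCOLON2
  end

  c = rest[0] # nil plays the role of EOF (-1 in the C code)
  if lex_state == :EXPR_END || lex_state == :EXPR_ENDARG || (c && c =~ /\s/)
    return :":"
  end

  :tSYMBEG
end

p lex_colon(":URI", :EXPR_ARG)                    # scoped constant
p lex_colon(":Kernel", :EXPR_BEG)                 # top-level constant
p lex_colon(" c", :EXPR_END)                      # ternary colon
p lex_colon("symbol", :EXPR_BEG)                  # symbol literal
p lex_colon("-) 'value'", :EXPR_ARG, smiley: true)
```

Checking for `-)` before anything else mirrors the order of the bullets on slide 24. Without that rule the same input falls through to tSYMBEG.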
  • 25. :-)
  • 26. Case 2: Lisp Like Symbol • Symbol Literal :vancouver 'vancouver • Ad-hoc p :a, :b p 'a, 'b
  • 27. Single Quote (in parser_yylex)
      ...
      case '\'':
        lex_strterm = NEW_STRTERM(str_squote, '\'', 0);
        return tSTRING_BEG;
      ...
  • 28. Single Quote (in parser_yylex)
      ...
      case '\'':
        if (??? condition ???) {
          lex_state = EXPR_FNAME;
          return tSYMBEG;
        }
        lex_strterm = NEW_STRTERM(str_squote, '\'', 0);
        return tSTRING_BEG;
      ...
  • 29. (loop (lambda (p 'good)))
  • 30. Case 3: Pre-increment Operator • ++i • i = i.succ (NOT i = i + 1)
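The choice of `succ` over `+ 1` matters for anything that is not a number. The sketch below shows this with stock Ruby. `pre_increment` and `stock_double_plus` are names invented here, and the first only imitates what the patched `++i` would assign back to `i`.

```ruby
# What the patched ++i would compute before assigning the result back.
def pre_increment(value)
  value.succ
end

# In unpatched Ruby, ++i is two unary pluses, so the value comes back unchanged.
def stock_double_plus(i)
  ++i
end

p pre_increment(41)
p pre_increment("az")
p stock_double_plus(5)
```

`"az".succ` moves on to the next string, whereas `"az" + 1` would raise a TypeError, so a `succ`-based increment works for every class that defines `succ`.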
  • 31. Lexer
      @@ -685,6 +685,7 @@ static void token_info_pop(struct parser_params*, const char *token);
       %type <val> program reswords then do dot_or_colon
       %*/
       %token tUPLUS /* unary+ */
      +%token tINCR /* ++var */
       %token tUMINUS /* unary- */
       %token tPOW /* ** */
       %token tCMP /* <=> */
      (Actually there are more trivial fixes)
  • 32. regenerate id.h • id.h is generated automatically from parse.y during make
      $ rm id.h
      $ make
  • 33. parser example
      variable : tIDENTIFIER
               | tIVAR
               | tGVAR
               | tCONSTANT
               | tCVAR
               | keyword_nil {ifndef_ripper($$ = keyword_nil);}
               | keyword_self {ifndef_ripper($$ = keyword_self);}
               | keyword_true {ifndef_ripper($$ = keyword_true);}
               | keyword_false {ifndef_ripper($$ = keyword_false);}
               | keyword__FILE__ {ifndef_ripper($$ = keyword__FILE__);}
               | keyword__LINE__ {ifndef_ripper($$ = keyword__LINE__);}
               | keyword__ENCODING__ {ifndef_ripper($$ = keyword__ENCODING_ [truncated on slide]
               ;
  • 34.
      lhs : variable
            {
            /*%%%*/
              if (!($$ = assignable($1, 0))) $$ = NEW_BEGIN(0);
            /*%
              $$ = dispatch1(var_field, $1);
            %*/
            }
          | primary_value '[' opt_call_args rbracket
            {
            /*%%%*/
              $$ = aryset($1, $3);
            /*%
              $$ = dispatch2(aref_field, $1, escape_Qundef($3));
            %*/
            }
          ...
  • 35. BNF (part)
      program  : compstmt
      compstmt : stmts opt_terms
      stmts    : none
               | stmt
               | stmts terms stmt
      stmt     : kALIAS fitem fitem
               | kALIAS tGVAR tGVAR
               | expr
      expr     : kRETURN call_args
               | kBREAK call_args
               | kREDO
               | kRETRY
               | '!' command_call
               | arg
      arg      : lhs '=' arg
               | var_lhs tOP_ASGN arg
               | primary_value '[' aref_args ']' tOP [truncated on slide]
               | arg '?' arg ':' arg
               | primary
      primary  : literal
               | strings
               | tLPAREN_ARG expr ')'
               | tLPAREN compstmt ')'
  • 36. Assign
      stmt : ...
           | mlhs '=' command_call
             {
             /*%%%*/
               value_expr($3);
               $1->nd_value = $3;
               $$ = $1;
             /*%
               $$ = dispatch2(massign, $1, $3);
             %*/
             }
  • 37. mlhs
      mlhs       : mlhs_basic | ...
      mlhs_basic : mlhs_head | ...
      mlhs_head  : mlhs_item ',' | ...
      mlhs_item  : mlhs_node | ...
      mlhs_node  : variable { $$ = assignable($1, 0); }
  • 38. Method call
      block_command : block_call
                    | block_call '.' operation2 command_args
                      {
                      /*%%%*/
                        $$ = NEW_CALL($1, $3, $4);
                      /*%
                        $$ = dispatch3(call, $1, ripper_id2sym('.'), [truncated on slide]
                        $$ = method_arg($$, $4);
                      %*/
                      }
  • 39. Mix!
      var_ref : ...
              | tINCR variable
                {
                /*%%%*/
                  $$ = assignable($2, 0);
                  $$->nd_value = NEW_CALL(gettable($$->nd_vid), rb_intern("succ"), 0);
                /*%
                  $$ = dispatch2(unary, ripper_intern("++@"), $2);
                %*/
                }
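The action above builds an assignment node whose value is a call to `succ` on the same variable. That is the same shape stock Ruby produces for `i = i.succ`. The sketch below inspects that shape with Ripper. `desugared_increment_tree` is a helper invented for the illustration. Ripper's node names differ from the internal NEW_CALL and assignable nodes, so the comparison is structural only.

```ruby
require "ripper"

# Parses "<name> = <name>.succ" and returns the node for that one statement.
def desugared_increment_tree(variable_name)
  Ripper.sexp("#{variable_name} = #{variable_name}.succ")[1][0]
end

assign = desugared_increment_tree("i")
p assign[0]     # node type of the whole statement
p assign[2][0]  # node type of the right-hand side
p assign
```

The outer node is an assignment and its right-hand side is a method call, matching the pair of `assignable(...)` and `NEW_CALL(...)` in the grammar action.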
  • 41. Case 4: def A#b • A#b instance method b of class A • A.b class method b of class A
  • 42. A#b
      class A              def A.b
        def b                ...
          ...              end
        end
      end
  • 43. A#b
      def A#b              def A.b
        ...                  ...
      end                  end
  • 44. # (in parser_yylex)
      case '#': /* it's a comment */
        /* no magic_comment in shebang line */
        if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
          if (comment_at_top(parser)) {
            set_file_encoding(parser, lex_p, lex_pend);
          }
        }
        lex_p = lex_pend;
  • 45. # (in parser_yylex)
      case '#': /* it's a comment */
        c = nextc();
        pushback(c);
        if (lex_state == EXPR_END && ISALNUM(c)) return '#';
        /* no magic_comment in shebang line */
        if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
          if (comment_at_top(parser)) {
            set_file_encoding(parser, lex_p, lex_pend);
        ...
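The extra condition in the patched branch can be written as a small predicate. This is a sketch of that one `if` only. `hash_is_token?` is a name invented here, and ISALNUM is assumed to mean ASCII letters and digits.

```ruby
# True when the patched lexer would return '#' as a token instead of
# treating the rest of the line as a comment.
# next_char: the character right after '#', or nil at end of input.
def hash_is_token?(lex_state, next_char)
  lex_state == :EXPR_END && !next_char.to_s.match(/\A[A-Za-z0-9]\z/).nil?
end

p hash_is_token?(:EXPR_END, "b")  # as in "def A#b"
p hash_is_token?(:EXPR_END, " ")  # as in "x # comment"
p hash_is_token?(:EXPR_BEG, "b")  # '#' at the start of an expression
```

One consequence of this rule is that a comment written with no space after the `#`, directly after an expression, would probably stop being read as a comment wherever the lexer is in EXPR_END at that point.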
  • 46. Primary
      primary : literal
              | ...
              | k_def singleton dot_or_colon {lex_state = EXPR_FNAME;} fname
                {
                  in_single++;
                  lex_state = EXPR_END; /* force for args */
                /*%%%*/
                  local_push(0);
                /*%
                %*/
                }
                f_arglist
                bodystmt
                k_end
                {
                /*%%%*/
                  NODE *body = remove_begin($8);
                  reduce_nodes(&body);
                  $$ = NEW_DEFS($2, $5, $7, body);
                  fixpos($$, $2);
                  local_pop();
                /*%
                  $$ = dispatch5(defs, $2, $3, $5, $7, $8);
                %*/
                  in_single--;
                }
  • 47.
              | k_def cname '#' {lex_state = EXPR_FNAME;} fname
                {
                  $<id>$ = cur_mid;
                  cur_mid = $5;
                  in_def++;
                /*%%%*/
                  local_push(0);
                /*%
                %*/
                }
                f_arglist
                bodystmt
                k_end
                {
                /*%%%*/
                  NODE *body = remove_begin($8);
                  reduce_nodes(&body);
                  $$ = NEW_DEFN($5, $7, body, NOEX_PRIVATE);
                  fixpos($$, $7);
                  fixpos($$->nd_defn, $7);
                  $$ = NEW_CLASS(NEW_COLON3($2), $$, 0);
                  nd_set_line($$, $<num>6);
                  local_pop();
                /*%
                  $$ = dispatch4(defi, $2, $5, $7, $8);
                %*/
                  in_def--;
                  cur_mid = $<id>6;
                }
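Read as source code, the new rule wraps an ordinary method definition in a class node rooted at the top level. In stock Ruby the closest equivalent is reopening `::A` and defining `b` inside it. The sketch below is that assumed equivalent, inferred from the NEW_CLASS(NEW_COLON3(...)) call rather than confirmed by the talk, with `def A.b` beside it for comparison.

```ruby
class A
end

# Assumed expansion of the patched syntax `def A#b ... end`:
# reopen the top-level class and define an instance method there.
class ::A
  def b
    :instance_method
  end
end

# Existing syntax for comparison: a singleton (class-level) method.
def A.b
  :class_method
end

p A.new.b
p A.b
```

The two definitions live side by side because one belongs to instances of A and the other to the A object itself.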
  • 48.
  • 50. Reference • My blog http://ujihisa.blogspot.com • All patches I showed are there
  • 51. end
  • 52. Appendix: Imaginary Numbers • Matz wrote a patch in [ruby-dev:38843] • translation: [ruby-core:24730] • It won't be accepted
  • 53. Appendix: Imaginary Numbers > 3i => (0 + 3i) > 3i.class => Complex
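Without the patch, the same value can be built through `Kernel#Complex`. This is a small sketch; the helper `imaginary` is a name made up here.

```ruby
# Builds the purely imaginary number n*i using the built-in Complex().
def imaginary(n)
  Complex(0, n)
end

z = imaginary(3)
p z
p z.class
```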
  • 54. Appendix 2: I'm looking for a job! • ujihisa at gmail com • Ruby, Rails, Merb, Sinatra, etc • C, JavaScript, Vim script, HTML, Java, Haskell, Scheme • Fluent in Japanese