Robert Gawron

"Tak naprawdę, człowiek posiada tylko życiorys."

Archive for January 22, 2010

Domain Specific Language in Ruby used as a Virtual Machine

Introduction

In this post I will describe my attempts to build virtual machine, that executes source byte-code as files written in DSL (Domain Specific Language). This DSL is built on the top of other popular language.

Couple of months ago I created a small programming language, that was compiled to assembler. Compiled files could been executed by virtual machine. There were two versions of VM, in Python and in C (created by czarodziej). I take this assembler and (tiny) virtual machine and tried to build another version, as described below.

I was confused was what language should I use as a base for DSL. It should be flexible, expressive and with good support for metaprogramming. It couldn’t be Java or PHP for sure! ;-) I considered Lisp, Ruby, Erlang and Python. My choice was Ruby.

Some people might say that it’s not DSL at all, that it’s in place where DSL shouldn’t be used or that all is wrong. I didn’t worry about those questions and treat it as an interesting experiment, but for sure it could be made better.

It isn’t useless theory, I’ve seen those languages in my previous jobs (used to creating specialized tests scenarios or gathering in formations). The way of accessing data through Django’s ORM is also sort of DSL.

Details of implementation

Part of assembler’s syntax is invalid in Ruby, so I made small preprocessor, that runs before code is loaded and makes it valid. It’s second purpose is to modify this code to the form that is more expressive in Ruby. It’s coded in DSLPreprocesor class.

This is how assembly file looks before preprocesing:
bar:
    int 0
ret

foo:
    call bar
    ret

main:
    push 13
    call foo
    ret
and this is how it looks after, I’ve added indents to make code more readable:
@bar = Proc.new{
    int 0
}

@foo = Proc.new{
    call @bar
}

@main = Proc.new{
    push 13
    call @foo
}

We have to implement in Ruby all mnemonics, like call, add in form, that makes state and behavior of running virtual machine exactly like in original version. Then we have to evaluate processed file (there are only variables, so nothing will happened), and then run @main object.

You might be wondering why I didn’t use normal functions instead of Ruby’s mods, well, I needed to tread functions as objects (see implementation of call in DSLVirtualMachine).

You can obtain more information by analyzing source code (see bellow) or by confrontation with original version, you can also check project’s website.

   1  #!/usr/bin/env ruby 
   2  
   3  class DSLVirtualMachine
   4      def initialize(file_name = String.new)
   5          @preprocesor = DSLPreprocesor.new file_name
   6          @stack = Array.new
   7          @memmory = Array.new
   8          @stack_ptr = [0]
   9          @memmory_ptr = [0]
  10      end
  11  
  12      def run()
  13          eval @preprocesor.parse()     # dynamic code loading
  14          @main.call()                  # code executing
  15      end
  16  
  17      private
  18  
  19      def alu(operation)
  20          parser = { 'add' => lambda { @stack.pop + @stack.pop },
  21                     'sub' => lambda { @stack.pop - @stack.pop },
  22                     'div' => lambda { @stack.pop / @stack.pop },
  23                     'mul' => lambda { @stack.pop * @stack.pop },
  24                     'gt'  => lambda { @stack.pop > @stack.pop }, }
  25          push parser[operation].call()
  26      end
  27  
  28      def call(function)
  29          @memmory_ptr.push @memmory.length
  30          function.call()
  31          @memmory_ptr.pop
  32      end
  33  
  34      def push(element)
  35          @stack.push element
  36      end
  37  
  38      def load(offset)
  39          memmory_ptr = @memmory_ptr.last
  40          @stack.push @memmory[offset+memmory_ptr]
  41      end
  42  
  43      def store(offset)
  44          memmory_ptr = @memmory_ptr.last
  45          @memmory[offset+memmory_ptr] = @stack.pop
  46      end
  47  
  48      def get_if_condition()
  49          return @stack.pop
  50      end
  51  
  52      def int(interrupt)
  53          parser = [ lambda { p @stack.pop },
  54                     lambda { push gets.to_i } ]
  55          parser[interrupt].call()
  56      end
  57  end
  58  
  59  
  60  class DSLPreprocesor
  61      attr_accessor :fname
  62  
  63      def initialize(fname = String.new)
  64          @fname = fname
  65          # input token => output token, order does matter
  66          @parser = {
  67              # remove comments
  68              /^([^;]*);.*$/  => 
  69                          lambda { |u| "#{u}" },
  70              # convert jump to label into 'if' statement
  71              # it's very LAME!
  72              /^[\s]*jz ([a-z0-9]*)$/ => 
  73                          lambda { |u| "if (get_if_condition)" },
  74              /^a[0-9]:*$/ => 
  75                          lambda { |u| "end" },
  76  
  77              # convert function label to declaration 
  78              # of proc in ruby ruby
  79              /^([a-z]*[0-9]*):$/  => 
  80                          lambda { |u| "@#{u} = Proc.new{" },
  81              # convert end of function in assembler 
  82              # to end of proc in in ruby 
  83              /^[\s]*ret$/  => 
  84                          lambda { |u| "}" },
  85              # 
  86              /^[\s]*call ([a-z0-9]*)$/  =>
  87                          lambda { |u| "call @#{u}" },
  88  
  89              # treat ALU operation in consistent way
  90              /^[\s]*(add|sub|mul|div|gt|lt|eq)$/ => 
  91                          lambda { |u| "alu '#{u}'" },
  92          }
  93      end
  94  
  95      def parse()
  96          file = File.readlines(self.fname)
  97          return file.map { |l| parse_mnemonic(l) }.join('')
  98      end
  99  
 100      private
 101      def parse_mnemonic(mnemonic)
 102          @parser.map { |k,v| mnemonic.gsub!(k){ v.call($1) } }
 103          return mnemonic
 104      end
 105  end
 106  
 107  # TODO use optparse instead of raw ARGV[0]
 108  vm = DSLVirtualMachine.new ARGV[0]
 109  vm.run()
 110  

Results

My implementation is incomplete and in some places incompatible with original, but I reached the point and build it. It becomes overcomplicated due to fact that syntax wasn’t created to be compatible with Ruby’s syntax. Second problem was my lack of experience with Ruby — to be honest I don’t know this language at all.

It would be interesting to count SLOC, but I didn’t have tool to do that (I don’t have space on HDD to install it) and implementation in Ruby is incomplete. I count all lines in source files, for C implementation it was ~600, for Python ~250, for DSL+Ruby ~110.

Case studies and materials about DSL