This is an English translation of Yugui's blog post, originally written in Japanese.


Dec. 14th, 2006

Compare Ruby Callable Objects, Part 1



Ruby has several kinds of callable objects: Method, UnboundMethod and Proc.

Continuation is slightly different, but pretty close, since it is also an object that remembers an execution context.

In "The Ruby Way", the author says that this is not surprising. But I was surprised.

They also behave slightly differently from one another. There ain't no justice. It might be handy, though.


This time I'd like to give a brief overview.



Normal method

We usually define a method with the def keyword.


class C
  def greeting(arg)
    puts "C#greeting received #{arg}"
  end

  def iterator
    yield 'iterator 1st'
    yield 'iterator 2nd'
    yield 'iterator 3rd'
  end

  local = 1
  def ref_local
    puts local
  end
end

obj = C.new

# We can call it normally.
obj.greeting 1     # => C#greeting received 1

# Ruby checks number of arguments.
obj.greeting 1, 2
  # => ArgumentError: wrong number of arguments (2 for 1)

# Ruby can call a method with a block. Good feature. Pretty good!
obj.iterator do |item|
  puts item
end
  # => iterator 1st
  #    iterator 2nd
  #    iterator 3rd


# We cannot access local variables defined outside the def block.
# In JavaScript, we can. So this feels a bit awkward.
obj.ref_local
  # => NameError: undefined local variable or method 'local' for #<C:0x5e1b8>

Method Object

A Ruby method can be turned into an object. That object is called a Method object, and it is callable.

Method objects are created by Object#method, which takes the method name as a string or a symbol.


greeting = obj.method(:greeting)
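
As a small self-contained sketch of Object#method: the name can be given either as a symbol or as a string, and both give an equivalent Method object. The Greeter class is a made-up stand-in, not the class C above, and it returns a string instead of printing so the result is easy to check.

```ruby
# Greeter is a hypothetical class used only for this illustration.
class Greeter
  def greeting(arg)
    "Greeter#greeting received #{arg}"
  end
end

obj = Greeter.new

by_symbol = obj.method(:greeting)    # name given as a symbol
by_string = obj.method("greeting")   # name given as a string

by_symbol.class                  # => Method
by_symbol.receiver.equal?(obj)   # => true (the Method remembers its receiver)
by_symbol == by_string           # => true (same receiver, same definition)
by_string.call(1)                # => "Greeter#greeting received 1"
```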

We can call methods without parentheses. But that is exactly why it's impossible to create a Method object just by writing obj.greeting, as you would in JavaScript or Python.

There is no natural syntax for referring to a method as an object. That is one of the reasons Ruby's methods are not first-class objects.

We cannot use greeting in the same way as a normal method. This greeting is a local variable, but Ruby looks up the name in the namespace for methods, so a NoMethodError occurs.

In Ruby, variables and methods don't share the same namespace.


greeting(1)  # => NoMethodError: undefined method `greeting' for main:Object

On the other hand, if you write greeting with no arguments, no error occurs, not even one about the wrong number of arguments.

As you know, Ruby treats that as an expression consisting only of the local variable greeting.


greeting     # => #<Method: C#greeting>
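
To see the two namespaces side by side, here is a minimal sketch in which a method and a local variable share the name greeting. The method and the string are invented for this illustration. The form of the expression decides which one Ruby picks.

```ruby
# A method named greeting (hypothetical, for illustration only).
def greeting(arg)
  "method got #{arg}"
end

# A local variable with the same name. It does not replace the method.
greeting = "I am a local variable"

with_parens = greeting(1)   # name + parenthesized argument: a method call
bare        = greeting      # bare name: the local variable

with_parens   # => "method got 1"
bare          # => "I am a local variable"
```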

We use call or [] to invoke a Method object. The [] method is defined to invoke the object, so it looks nearly the same as a normal method call with parentheses.


greeting.call 1 # => C#greeting received 1
greeting[1]     # => C#greeting received 1

Ruby checks the number of arguments.


greeting[1, 2]  # => ArgumentError: wrong number of arguments (2 for 1)

Calling it with a block is also possible.


iterator = obj.method(:iterator)
iterator.call do |item|
  puts item
end
  # => iterator 1st
  #    iterator 2nd
  #    iterator 3rd

Calling Syntax

Not being able to invoke it in the normal way is annoying. Matz has also been concerned about that.

For instance, there were attempts to make call optional. It went back and forth in Ruby 1.9.



parse.y version 1.372

Changelog: parse.y version 1.372. When a local variable is followed by parentheses, Ruby implicitly converts it to a call.


greeting("v 1.372")    # => C#greeting received v 1.372
greeting "v 1.372"     # => C#greeting received v 1.372

It's even possible to omit the parentheses. Oh, fantastic!



parse.y version 1.382

Changelog: parse.y version 1.382. The implicit conversion was stopped. Now the local variable also needs parentheses around it.


(greeting)("v 1.382")   # => C#greeting received v 1.382

I guess this syntax came from function pointers in C, because MRI (Matz's Ruby Implementation) is written in K&R style. In fact, neither pair of parentheses can be omitted.


greeting(1)   # => undefined method `greeting' for main:Object (NoMethodError)

(greeting) 1  # => parse error



In the end, dropped again

Changelog: parse.y version 1.442. This syntax was removed.


(greeting)(1)  # => parse error, expecting `$'



Comparison to __send__

The calling syntax for Method objects is not cool. But in fact, this doesn't hurt much in Ruby.

Because we don't usually use Method objects in the first place.

If we want to switch which method is called, it's easy to hold the message and send it to the object with the __send__ method.


msg1 = [:greeting, 1]
msg2 = [:greeting, "Hello"]
msg3 = [:inspect]

obj.__send__(*msg1) # => C#greeting received 1
obj.__send__(*msg2) # => C#greeting received Hello
obj.__send__(*msg3) # => "#<C:0x2aa33c>"

A Method object holds the code of the method. This is an advantage.

Once you create a Method object, you can use the code of the method even if the original method is later redefined or removed.


greeting = obj.method(:greeting)
class C
  def greeting(arg)
     puts "Yet another ruby hacker"
  end
end

obj.greeting("test")  # => Yet another ruby hacker
greeting.call("test") # => C#greeting received test

class C
  remove_method :greeting
end

obj.greeting("test")
    # => NoMethodError: undefined method `greeting' for #<C:0x2aa33c>
greeting.call("test")
    # => C#greeting received test

On the other hand, the __send__ method holds only the message (the method name) for the object and dispatches it every time. So if you redefine the method, the result is affected.

A Method object holds the code of the method, so no method dispatch happens at call time.

Dispatch had already happened when Object#method was called. Hence the previous definition is executed, even after the method is redefined.



UnboundMethod

A Method is an object that pairs a method definition with a receiver. An UnboundMethod is what you get when you pull the receiver out.

A Method object is close to a delegate in C#. An UnboundMethod object is closest to a pointer to member function in C++. In C++ you build something like Method on top of a member function pointer, as Boost's functors do.

An UnboundMethod is created by Module#instance_method or Method#unbind.


u_greeting = greeting.unbind
u_iterator = C.instance_method(:iterator)

self is around every corner in a Ruby program. Ruby always sends a message to an object, even when the code looks like a function call. So it's impossible to execute an UnboundMethod without a receiver.

UnboundMethod has the call and [] methods. But they are probably leftovers from Ruby 1.6, when UnboundMethod was a subclass of Method. A TypeError occurs if you call call or [] anyway.

So there is no choice: you call it after setting the receiver with UnboundMethod#bind.

From there you can use it the same way as before, because UnboundMethod#bind returns a Method object.


iterator2 = u_iterator.bind(C.new)
iterator2.call do |item|
  puts item
end

You cannot bind an UnboundMethod to an instance of a different class.

If it were possible to copy a method from one class to another by using UnboundMethod, I guess that would be useful. But in fact,


u_iterator.bind(OtherClass.new)

this raises a TypeError.
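
A self-contained sketch of that restriction follows. Dog, Puppy and Robot are hypothetical classes made up for the example. Binding to an instance of the original class or of one of its subclasses is accepted, while an unrelated class is rejected.

```ruby
# Hypothetical classes, used only to illustrate UnboundMethod#bind.
class Dog
  def speak
    "#{self.class} speaks"
  end
end

class Puppy < Dog; end   # a subclass of Dog
class Robot; end         # unrelated to Dog

u_speak = Dog.instance_method(:speak)

from_dog = u_speak.bind(Dog.new).call     # => "Dog speaks"
from_sub = u_speak.bind(Puppy.new).call   # => "Puppy speaks"

# Binding to an instance of an unrelated class raises TypeError.
error = begin
  u_speak.bind(Robot.new)
  nil
rescue TypeError => e
  e
end
```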

Well, copying may be risky anyway, because methods implemented in C are strict about type checking.

Choosing between Method and UnboundMethod is almost obvious.

Method is used to carry self around. UnboundMethod is used to decide self at the point of execution.

You need an instance to create a Method object, but you can create an UnboundMethod object from a Class object alone.
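
The difference can be sketched like this, with a hypothetical Counter class: the Method keeps working on the instance it was taken from, while the UnboundMethod gets its self only when it is bound.

```ruby
# Counter is a made-up class for this illustration.
class Counter
  def initialize(start)
    @count = start
  end

  def bump
    @count += 1
  end
end

a = Counter.new(0)
b = Counter.new(100)

bound   = a.method(:bump)                  # needs an instance; a travels with it
unbound = Counter.instance_method(:bump)   # needs only the class; no self yet

r1 = bound.call             # => 1   (acts on a)
r2 = bound.call             # => 2   (still a)
r3 = unbound.bind(b).call   # => 101 (self chosen at this point: b)
r4 = unbound.bind(a).call   # => 3   (self chosen at this point: a)
```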

By the way, Method and UnboundMethod are both implemented as struct METHOD inside Ruby. The only difference is that UnboundMethod doesn't use the pointer for self.



Normal Proc

Okay, next is Proc.

A Proc object holds a code snippet. It is usually created from a block, by Proc.new or Kernel#lambda. For the purposes of this overview they can be treated as synonyms.

They are methods which transform a given block into a Proc object.

Inside Ruby, there is a small difference between a Proc object and a block that is executed by a method call.

For example, in older versions of Ruby, they differ in non-local jump behaviour when break or retry is executed.

This document offers detailed information about Proc.

Proc is unlike Method or UnboundMethod. There is no underlying method, and its tie to any particular object is weak... But the most important difference is that it's a closure.


def create_closure
  counter = 0
  Proc.new { p counter += 1 }
end

c = create_closure
c.call     # => 1
c.call     # => 2
c.call     # => 3
....

After create_closure returns, the scope that the local variable counter belonged to no longer exists.

But the Proc object remembers the local variable. You can access it from inside the Proc object until the Proc object itself goes away.

A Proc object keeps the context for local variables, in the state it was in when the Proc object was created. Each call to create_closure produces a new, independent context.


c1 = create_closure
c1.call     # => 1
c1.call     # => 2
c1.call     # => 3

c2 = create_closure
c2.call     # => 1
c2.call     # => 2
c1.call     # => 4
c1.call     # => 5
c1.call     # => 6
c1.call     # => 7
c2.call     # => 3
c1.call     # => 8
c2.call     # => 4

The counter that c1 refers to and the counter that c2 refers to are totally different things.

They are like instance variables.

So, what is the benefit? Currying?

Come to think of it, we don't often write code that manipulates Proc objects directly.

But here is an example: suppose you want two objects to share a variable, but don't want to use global variables or instance variables.


def create_twin
 shared = 0
 return [
   Proc.new { p shared += 1 },
   Proc.new { p shared += 1 }
 ]
end

dee, dum = create_twin
hikaru, kaoru = create_twin

dee.call    # => 1
dum.call    # => 2; dee and dum share the shared

hikaru.call # => 1
kaoru.call  # => 2; hikaru and kaoru share the other shared

This technique is more common in Perl than in Ruby.

One of the techniques for defining private methods in JavaScript can be seen as an application of it.

If you work with Ajax, you'll come to appreciate closures through experience.

(2006-11-17: ma2 suggested a bug in the above code. Now fixed.)

Proc takes arguments just like Method. They are passed along to the block parameters like this:


sum = 0
acc = Proc.new{|num| p sum += num}

acc.call(3) # => 3
acc.call(2) # => 5



Block Syntax

I think we usually get a Proc by implementing a method that takes a block, rather than by writing Proc.new.


def iterator(a, b, &block)
 ...
end

If you write a method in the above style and call it with a block, the block (code snippet) is converted to a Proc object and assigned to the block parameter block.
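
A minimal sketch of that conversion is shown here. The capture method is a name invented for the example. It simply returns whatever arrived in its block parameter, so the Proc can be inspected and called later.

```ruby
# capture is a hypothetical helper: it hands back its block as an object.
def capture(&block)
  block
end

doubler  = capture { |n| n * 2 }
no_block = capture               # called without a block

doubler.class      # => Proc
doubler.call(21)   # => 42
no_block           # => nil (no block given, so the parameter is nil)
```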

Here is an example of a method which evaluates the block in the context of the eigenclass of a particular object.


def singleton_class_eval(&block)
 (class << self; self end).class_eval(&block)
end

In the above example, the method just passes the block on to class_eval, so it may not seem very useful. But once you have a block as an object, you can handle it however you like.
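
For what it's worth, here is how singleton_class_eval might be used. This is only a sketch: the Widget class and the shout method are made up for the example. A def inside the block lands in the eigenclass, so only that one object gets the method.

```ruby
# Widget is a hypothetical class hosting the helper from the text.
class Widget
  def singleton_class_eval(&block)
    (class << self; self end).class_eval(&block)
  end
end

special = Widget.new
plain   = Widget.new

# The def inside the block defines a method on special's eigenclass only.
special.singleton_class_eval do
  def shout
    "only this widget can shout"
  end
end

special.shout                 # => "only this widget can shout"
plain.respond_to?(:shout)     # => false
```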

We can have a big dream.



DSL

Rails uses this approach a lot: it receives Proc objects, keeps them in instance variables, and invokes them when needed.

As I said at RubyKaigi2006, the reason we can write declarative code in Rails is that it separates the time when the definition is executed from the time when the code block is executed.

For example, the :if option which can be added to a validation:


class UserRegistration < ActiveRecord::Base
 validates_presence_of :phone, :if => Proc.new{|reg| reg.stage >= 1}
 ....
end

If the application has several pages for user registration and you don't want validates_presence_of :phone to apply on the first page, you can write it like this, provided the page number has been assigned to the stage attribute beforehand.

ActiveRecord holds the Proc object which is passed as an argument, and invokes it with the instance of UserRegistration each time the validation of UserRegistration runs.

Proc.new is executed when the class is defined. But the body of the block is executed later.
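
The same split between definition time and execution time can be sketched without Rails. TinyModel, requires and Registration below are invented names, and this is not how ActiveRecord is actually implemented. It only shows a Proc being stored while the class body runs and called later with an instance.

```ruby
# TinyModel is a hypothetical stand-in, NOT ActiveRecord.
class TinyModel
  # Per-class list of [attribute, condition Proc or nil].
  def self.checks
    @checks ||= []
  end

  # Runs while the class body is evaluated; only stores the Proc.
  def self.requires(attr, options = {})
    checks << [attr, options[:if]]
  end

  # Runs later, per instance; this is when the stored Proc is called.
  def errors
    self.class.checks.each_with_object([]) do |(attr, condition), list|
      next if condition && !condition.call(self)
      list << "#{attr} is missing" if public_send(attr).nil?
    end
  end
end

class Registration < TinyModel
  attr_accessor :phone, :stage
  requires :phone, :if => Proc.new { |reg| reg.stage >= 1 }
end

reg = Registration.new
reg.stage = 0
first_page = reg.errors    # => [] (the Proc returned false, check skipped)
reg.stage = 1
second_page = reg.errors   # => ["phone is missing"]
```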



Calling Notation

A Proc object has [] and call to run the code held inside it. This is the same as Method and UnboundMethod. A Proc object also has Proc#yield.

[] and call check the number of arguments, just as Method and UnboundMethod do.

But yield doesn't check the number of arguments.

This is the same as executing a yield statement in a method with a block. For example,


100.times do
 print "Hello"
end

Integer#times passes the counter value to the block. But sometimes the block doesn't use it, as here.

I guess this may be the reason yield doesn't check the number of arguments. If I always had to write 100.times do |i|, I would hate it.
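
The leniency of the yield statement can be seen in a small sketch. The give_three method is a made-up name. The blocks below take fewer or more parameters than the three values yielded, and none of them raises an error.

```ruby
# give_three is a hypothetical method that always yields three values.
def give_three
  yield 1, 2, 3
end

none   = give_three { :ignored }                      # block takes no parameters
first  = give_three { |a| a }                         # extra values are dropped
padded = give_three { |a, b, c, d| [a, b, c, d] }     # missing ones become nil

none     # => :ignored
first    # => 1
padded   # => [1, 2, 3, nil]
```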

Proc#yield has a good spec.



How to use Proc

As already mentioned, the most typical use of Proc is implementing a method that takes a block.

As in the create_twin example above, a Proc as a closure gives us some peace of mind.

That is because, once execution leaves the context in which the Proc was created, the local variables the Proc refers to cannot be referenced by anything except the Proc.

So there is no need to worry about them being overwritten.

After create_twin returns, the shared variable cannot be accessed by anything except the pair of Procs.

Technically, there is a loophole. But no problem: using it requires explicit code.
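
One such loophole, and I am assuming this is the kind meant here, is Proc#binding. In current Ruby versions the Binding it returns offers local_variable_get and local_variable_set, so the captured variable can be reached, but only by code that says so very explicitly. This create_twin returns the counts instead of printing them.

```ruby
# Same idea as create_twin in the text, returning values instead of printing.
def create_twin
  shared = 0
  [Proc.new { shared += 1 }, Proc.new { shared += 1 }]
end

dee, dum = create_twin
dee.call   # => 1
dum.call   # => 2

# The loophole: reach into the closure through the Proc's binding.
peeked = dee.binding.local_variable_get(:shared)   # => 2
dee.binding.local_variable_set(:shared, 100)
after = dum.call                                   # => 101
```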

Doing the same with instance variables is not so easy, because any instance variable can be accessed from anywhere in the instance.

Name collisions between a class and its subclasses are a real problem. It remains possible that Ruby 2.0 will solve it. But we can do nothing at present.

If you write a library and never know who will use it or how they will extend it, you can use closures over outer-scope variables instead.



Continuation

Once you understand the concept of Continuation, it's quite natural. But it's very difficult to explain.

I think it's easier to learn continuations in Scheme than in Ruby.

I couldn't use Ruby's Continuation either before reading Programming Language SCHEME.

Ruby supports Continuation. I think this is one of its distinctive features.

If I'm not mistaken, the Parrot developers said that they implemented continuations with porting Ruby in mind.

But there are few cases where it's used effectively. Matz also said "I could implement Continuation, so I just did." He doesn't seem to support it actively.

Sasada is not enthusiastic about supporting Continuation in YARV. So it might not be supported in Ruby 2.0.

I hope that doesn't happen. So I always ask Sasada to please support Continuation whenever I see him.

A Continuation is an object which represents the rest of the process, from here to the end. You can use Kernel#callcc to create Continuation objects.


cont = nil
callcc {|c| cont = c }
puts :ok
exit

Now, let's consider the above code.


# puts :ok
# exit

callcc evaluates its block, passing as the block argument a Continuation object that represents the above code.

In this case, the block stores the object in cont.

What is it good for? Well, if you call the Continuation object, you can execute the steps after callcc again at any time. The context for local variables is also kept.

The above code has an exit to stop the program, just to keep the explanation of "the process from here" simple.

But in practice, things are not so simple. That is where Continuation is in its element.

Okay, how about this code? I wrote it for my presentation at RubyKaigi, and later fixed a bug.


@cont = []
ActiveRecord::Base.transaction do
 catch :save_tx do
   collection.each do |item|
     ....
     callcc{|c| @cont.push c; throw :save_tx} if something?
     ....
   end
 end
end
unless @cont.empty?
 ActiveRecord::Base.transaction do
   @cont.pop.call
 end
end

ActiveRecord::Base.transaction is a method which begins a transaction, then evaluates the block, and commits when the block finishes.

But when a certain condition holds, don't you want to commit at once and then continue the process as if nothing had happened?

The above code does exactly that: it commits at once and then resumes. The catch :save_tx part corresponds roughly to a labeled statement like this:


save_tx:
 for (Object item : collection) {
    ....
 }

Confusing? Yeah, I think so. If you are not an expert, I recommend reading the library that turns CGI into FastCGI, written by Matz.



How to call

You can call a Continuation object with [] or .call.

It's callable and holds a context. So Continuation is a member of the Method/Proc family.

But it's very different from them. Basically, Continuation#call never returns control to its caller.

Instead, the arguments to .call or [] become the value of callcc when execution resumes there: a single argument as it is, several arguments as an Array.

This makes it possible to distinguish between callcc being executed for the first time and returning from Continuation#call.

It also makes it possible to pass some information to the environment you return to.

If you know C, I think it's like the return value of setjmp.
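
Here is a rough sketch of that setjmp-like behaviour. It assumes a CRuby where the continuation library can still be loaded; recent versions print a warning that callcc is obsolete. The demo method and its variable names are made up. The return value of callcc tells the first pass apart from the passes that come back through Continuation#call.

```ruby
require 'continuation'   # callcc lives in this library on Ruby 1.9 and later

# demo is a hypothetical method; it records every value callcc evaluates to.
def demo
  seen   = []   # mutated in place, so entries survive each jump back
  holder = {}   # keeps the Continuation reachable after a jump

  # First pass: callcc returns the block's value (:first_pass).
  # Later passes: it returns whatever was given to Continuation#call.
  value = callcc { |c| holder[:cont] = c; :first_pass }
  seen << value

  case seen.size
  when 1 then holder[:cont].call(:resumed)   # one argument: passed as is
  when 2 then holder[:cont].call(1, 2)       # several arguments: an Array
  end

  seen
end

result = demo   # => [:first_pass, :resumed, [1, 2]]
```

Note that neither call to the Continuation returns: control jumps back to the callcc line each time, and demo only returns once, after the third pass.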



How to use Continuation

Once you get used to Continuation, it's a natural concept. A Continuation is the process from here to the end. It's useful when you want to do something else temporarily and set aside the process from here to the end.

You can also do that with Thread. But I think Continuation is more natural than Thread. In fact, it's possible to implement threads with Continuation.

Implementing preemptive threads that way is messy, though.

I think there is no difficulty in choosing between Continuation and Method/UnboundMethod/Proc. Continuation has a character all its own.



Announcement

Well, this has been introductory material, not the main story.

But it got long, so I decided to publish it first.

Next time, I'll look at these callable objects from a different point of view.