Thursday, February 28, 2008

The Myth of Duck Typing

Ruby aficionados swear by what they call "duck typing" and I figured, since I'd criticized C# in comparison with Ruby, it was only fair to return the favor.

Duck typing isn't new. Only the term is -- apparently coined in the last ten years, to describe one variant of dynamic typing. But the underlying principles have been around for a long time (it's Lisp's fiftieth anniversary this year).

But Ruby takes it to an extreme that is, to put it bluntly, unhealthy for quality code. The Pickaxe book (also known as Programming Ruby) has this to say (emphasis mine):

You'll have noticed that in Ruby we don't declare the types of variables or methods--everything is just some kind of object.
. . .
If you've come to Ruby from a language such as C# or Java, where you're used to giving all your variables and methods a type, you may feel that Ruby is just too sloppy to use to write "real" applications.

It isn't.
. . .
[O]nce you use Ruby for a while, you['ll] realize that dynamically typed variables actually add to your productivity in many ways. (p. 365)
and
If you want to write your programs using the duck typing philosophy, you really only need to remember one thing: an object's type is determined by what it can do, not by its class. (p. 370)
. . .
You don't need to check the type of the arguments. If they support [the method you're calling], everything will just work. If they don't, your method will throw an exception anyway.
. . .
Now sometimes you may want more than this style of laissez-faire programming. ... (p. 371)
and it goes on to explain that all you need to do is call respond_to? to see if an object responds to the method you're calling.

And therein lies the problem. I have seen Ruby code littered with calls to respond_to?, and Rails itself calls it about 200 times. To use Dave Thomas' term, it's sloppy.

Here's a typical use of respond_to?
def render(obj)
  case
  when obj.respond_to?(:to_html)
    return obj.to_html
  when obj.respond_to?(:to_json)
    return json_to_html(obj.to_json)
  when obj.respond_to?(:to_s)
    return html_encode(obj.to_s)
  else
    # Not realistic, since all objects respond to to_s
    raise "can't render"
  end
end
Gee, wouldn't it be great if the language handled this automatically? What if we invented the idea of an interface, which you could use to declare a set of methods that you expect to be handled by an object? What if you could then specify that a variable or method parameter had to respond to the interface, with an appropriate type exception (uh, let's call it an interface exception) being thrown if the entire interface wasn't supported by the object? And what if we added the idea of method overloading, so that we could have alternate versions of methods that expect parameters that respond to different interfaces? And what if the compiler could check variables passed to methods to see if they support the requested interfaces and give you compile-time errors instead of runtime errors? Gee, C# does all of that.

None of this is fundamentally incompatible with Ruby. The advantages to dynamic typing that are mentioned are, in fact, advantages. It is hugely useful to not have to specify the type of an object when it doesn't matter. But there is certainly a problem with not being able to specify what you expect in an object when it does matter. And it does, way too often.

There is nothing in the concept of dynamic typing or duck typing that requires that interfaces (or even classes) cannot be specified. But, instead, Ruby pushes work onto programmers that the compiler and runtime system ought to be doing automatically. In Ruby, it seems to me that duck typing is dynamic typing with blinders on. It results in bloated, sloppy, repetitive code. Personally, I'd much rather see code like this:
def render(IConvertsToHTML obj)
  return obj.to_html
end

def render(IConvertsToJSON obj)
  return json_to_html(obj.to_json)
end

def render(IConvertsToString obj)
  return html_encode(obj.to_s)
end
The appropriate method can be decided dynamically at runtime -- the decision is not made at compile time. But, once inside the methods, support for the specified interface could be guaranteed. Also, notice that there is no need for an extra method to handle the unrealistic error case that I tossed in to make this point, though we could certainly add any number of additional methods for additional things we want to convert. It would not be hard to invent a syntax for easy specification of interfaces as well as on-the-fly specification of required methods for a variable declaration.
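
For comparison, here is a minimal sketch of how the overloaded render methods above could be approximated in today's Ruby. Everything named here (the Interface struct, the RENDERERS table, and the html_encode and json_to_html helpers) is a hypothetical stand-in, not part of Ruby or Rails. The check still happens at runtime through respond_to?, only in one place instead of at every call site.

```ruby
# Hypothetical sketch: an "interface" is a named list of required methods.
Interface = Struct.new(:name, :required_methods) do
  # True when obj responds to every method the interface requires.
  def satisfied_by?(obj)
    required_methods.all? { |m| obj.respond_to?(m) }
  end
end

# Stand-in helpers (assumptions; the post never defines them).
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

def html_encode(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

def json_to_html(json)
  "<pre>#{html_encode(json)}</pre>"
end

# Ordered table: the first interface an object satisfies wins.
RENDERERS = [
  [Interface.new(:IConvertsToHTML,   [:to_html]), ->(obj) { obj.to_html }],
  [Interface.new(:IConvertsToJSON,   [:to_json]), ->(obj) { json_to_html(obj.to_json) }],
  [Interface.new(:IConvertsToString, [:to_s]),    ->(obj) { html_encode(obj.to_s) }]
].freeze

def render(obj)
  _, handler = RENDERERS.find { |interface, _| interface.satisfied_by?(obj) }
  raise TypeError, "no renderer for #{obj.class}" unless handler
  handler.call(obj)
end

# Two example classes, each supporting a different "interface".
class Widget
  def to_html
    "<b>widget</b>"
  end
end

class Settings
  def to_json
    '{"debug": true}'
  end
end

puts render(Widget.new)    # <b>widget</b>
puts render(Settings.new)  # <pre>{&quot;debug&quot;: true}</pre>
puts render("a < b")       # a &lt; b
```

Because the table is ordered, an object that supports both to_html and to_json is always rendered through to_html. Supporting a new conversion means adding one row rather than another when branch.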

If the Ruby powers that be want it, Ruby can have all the advantages of dynamic typing and all the advantages of interfaces combined with great compiler support. What's the disadvantage?

Update 2/28: Changed example code.

Update 3/2: Follow-up posted.

15 comments:

geezusfreeek2 said...

I agree that this use of respond_to? is horrible, but this is not a good demonstration of duck typing. It's not duck typing if you have to check to make sure it quacks. It should either quack or it shouldn't. If it quacks, then it works. If it doesn't quack, it should throw its own exception. That is, you shouldn't have to throw your own if it doesn't quack. That's done for you already. Also, this whole deal where you try a different method if the first doesn't work… that seems to be some tight coupling, which is kind of making it miss the whole point.

Roy Leban said...

Clearly, I didn't provide the best example. I was trying to keep it short and I threw in the exception as the third option to show how that shouldn't be necessary with optional typing. I've updated the post with a better example.

The point isn't about respond_to? (A purist could even argue that respond_to? shouldn't exist -- we should just try the method in question and rescue if it throws an exception).

The problem is the inability to specify requirements for variables and parameters when they are known. People use respond_to? to work around this deficiency.

I don't want my code throwing exceptions at runtime, especially on a server. I don't want the "or it doesn't" half of the "if it quacks" statement. I want the language, the system, and the compiler to make me more efficient and to help me avoid stupid mistakes.

Jorge Ortiz said...

You say:

"And what if the compiler could check variables passed to methods to see if they support the requested interfaces and give you compile-time errors instead of runtime errors? ... None of this is fundamentally incompatible with Ruby."

Compile-time type-checking is indeed fundamentally incompatible with Ruby, because Ruby allows you to change the type of an object at run-time. With Ruby, you can't know the run-time type of an object at compile-time.

Consider code like:

def myMethod(MyInterface i)
...
end

obj = eval "MyClass.new"

myMethod(obj)

We know that the type of "obj" is "MyClass", so we can determine whether or not MyClass implements MyInterface. But what if the string passed to "eval" is read in from the command-line or from a file? How would you determine the type of obj at compile-time?

tante said...

I personally come from a Python background but I guess it's somewhat similar so I'll respond here.

If I'm not mistaken, respond_to? is the check whether a certain object has a certain method/attribute.

Using that is just not very good style from a duck typing background. There's the saying "It's easier to ask for forgiveness than for permission" and it applies here:

You don't check first and call then, that leads to ugly code.

if you have an instance i with three serializers "to_html", "to_json" and "to_xml" for example, you'd just call the serializer you need at that moment:
If you need json, call the appropriate serializer, if there is an issue, you get the exception that tells you what is wrong. If you want things like a hierarchy of serializers as "If json doesn't work, gimme XML and if that fails gimme the string", wrap that in a function and just call that.

The idea of the necessity of checking before calling comes usually from people that have a strong Java/C# background, because they feel they have to do manually what in Java/C# the compiler does. But that is not how you program dynamic languages, that is trying to make them Java. Which they ain't.

pragdave said...

To be fair, I don't really advocate respond_to in the book: I just say that it's available, and show how you could use it. At the end of the section, I say "However, before going down this path, make sure you're getting a real benefit---it's a lot of extra code to write and to maintain."

I personally use respond_to very rarely.


Dave

Roy Leban said...

Thanks for the feedback, everybody! A couple of thoughts.

Jorge: I'm not saying Ruby should disallow changing object types at runtime or that it should disallow having completely untyped variables. But when I know I need a particular interface (not a class), why can't I tell the compiler so it can enforce it? Isn't it the job of compilers to take work off of programmers?

And as for those strings read in from the command line that get evaled? Great idea for a dev tool. Bad idea for a server. Ever hear of "SQL Injection"?

Tante: Here's a challenge. Write your version of render(obj) that doesn't use respond_to and has the same functionality: 1. If the object can convert itself to HTML use that; 2. If the object can convert itself to JSON, use that; 3. Otherwise, generate HTML from the object using to_s; 4. Don't hide any errors by masking exceptions other than those necessary to determine 1 & 2. Bonus points if you make it easy to add extra cases in the future, like generating HTML from XML that the object provides.

My version of this is pretty ugly. If you write a good example, I'll add it to the post.

Dave: You're right. I didn't mean to imply that you supported using respond_to? regularly. Sorry. That section does have one of my favorite footnotes. I also use respond_to? rarely, but I would like not to use it at all because every time I do use it, the compiler could have done it for me.

commons_guy said...

"And as for those strings read in from the command line that get evaled? Great idea for a dev tool. Bad idea for a server. Ever hear of "SQL Injection"?"

$SAFE should cover you in many cases. Bear in mind that some Ruby templating engines use eval() to achieve their ends.

"Here's a challenge. Write your version of render(obj) that doesn't use respond_to and has the same functionality"

I'd call a render(obj) method that isn't told what it's supposed to be rendering a code smell. You've created a straw man, then are trying to make your point by claiming it's made of straw.

Moreover, your alternate implementation is indeterminate for objects that implement multiple interfaces (e.g., IConvertsToHTML and IConvertsToJSON).

But, since you insist...and forgive the probably awful formatting from Blogger's edit pane:

def send_or_nil(obj, msg)
  begin
    obj.send(msg)
  rescue NoMethodError
    nil
  end
end

def render(obj)
  result = nil

  [:to_html, :to_json].each do |m|
    result = send_or_nil(obj, m)
    break if result
  end

  result ? result : obj.to_s
end

You might consider moving the array of symbols somewhere that'll be easier to find. You could open Object and implement send_or_nil() on it instead of having it in whatever class implements render().

Jorge Ortiz said...

"But when I know I need a particular interface (not a class), why can't I tell the compiler so it can enforce it?"

Because any such type-checking at compile-time would be useless. Suppose class C implements interface I at compile-time. At run-time, however, I dynamically remove a method from class C, such that it no longer implements I. Then all your compile-time type-checking is rendered useless.
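
The scenario described here can be reproduced in a few lines. This is only an illustrative sketch: the Quacker module and its implemented_by? check are hypothetical names standing in for "interface I", and Duck stands in for "class C".

```ruby
# Hypothetical "interface": a list of methods an object must respond to.
module Quacker
  REQUIRED = [:quack].freeze

  def self.implemented_by?(obj)
    REQUIRED.all? { |m| obj.respond_to?(m) }
  end
end

class Duck
  def quack
    "Quack!"
  end
end

duck = Duck.new
before = Quacker.implemented_by?(duck)   # the check passes here

# Later, at runtime, some other code removes the method from the class.
Duck.class_eval { remove_method :quack }

after = Quacker.implemented_by?(duck)    # the same object no longer conforms

puts "before: #{before}, after: #{after}"

begin
  duck.quack
rescue NoMethodError => e
  puts "call failed with #{e.class}"
end
```

Any check made before the call to remove_method says nothing about the object afterwards, which is the core of the objection.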

I'm not against compile-time type-checking. I just think it is fundamentally incompatible with Ruby's goals and philosophy. (It's one of the reasons I don't use Ruby.)

Roy Leban said...

commons_guy: I don't agree it's a strawman. This is a real case. I have a list of objects (of unknown types) and render them. Many of the classes I create support to_html, but not all of them. Why shouldn't I be able to handle it gracefully?

Your implementation below doesn't have the same functionality as my examples. Make it match and I'll gladly post it in an update with proper formatting (why doesn't Blogger support the pre tag in comments?). And you're using nil as a magic value. What if the methods we're calling return nil as a valid result? Neither of the examples has a problem with that.
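
One possible way around the magic nil, sketched under assumptions: look the method up with Object#public_method first and rescue only the lookup, so a nil result stays a valid result and errors raised inside the method are not masked. The try_call helper and the Blank and Broken classes are hypothetical, and the json_to_html and html_encode conversions are left out, as they were in the version being discussed.

```ruby
# Returns [found, result]. Only the method lookup is rescued, never the call.
def try_call(obj, msg)
  begin
    handler = obj.public_method(msg)
  rescue NameError
    return [false, nil]    # the object has no such public method
  end
  [true, handler.call]     # errors raised inside the method propagate
end

def render(obj)
  [:to_html, :to_json].each do |msg|
    found, result = try_call(obj, msg)
    return result if found # nil counts as a legitimate result
  end
  obj.to_s
end

# A class whose to_html legitimately returns nil.
class Blank
  def to_html
    nil
  end
end

# A class whose to_html has a bug that raises NoMethodError internally.
class Broken
  def to_html
    nil.upcase
  end
end

p render(Blank.new)  # nil, not the to_s fallback
p render(42)         # "42"

begin
  render(Broken.new)
rescue NoMethodError
  puts "the bug inside to_html surfaced instead of being swallowed"
end
```

The [found, result] pair plays the role of the sentinel. A dedicated marker object would work just as well.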

Jim said...

Jorge: Just wanted to point out that you cannot change the type of an object at runtime. You can change its methods, which is unrelated to duck-typing. Using reflection, I can do that in Java. You can change the variable's type, except that variables don't have type. Only objects have type.

Rob: I agree that this type of respond_to? chaining is wrong. But I think that a better technique would be to define a render method and call obj.render, and let the object do its rendering. But that's just me.

Jim said...

Rob: I also forgot to mention that if you want interfaces, then you should be using a mixin. So to implement obj.render, I'd want to implement a mixin for the different types of rendering.
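
A rough sketch of what that could look like, with hypothetical names throughout (HtmlRenderable, TextRenderable, Article, Note). Each mixin assumes the including class supplies the underlying conversion and adds a render method on top of it.

```ruby
# Mixin for classes that can produce HTML themselves via #to_html.
module HtmlRenderable
  def render
    to_html
  end
end

# Mixin for classes that only offer #to_s; the text is escaped for HTML.
module TextRenderable
  def render
    to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
  end
end

class Article
  include HtmlRenderable

  def initialize(title)
    @title = title
  end

  def to_html
    "<h1>#{@title}</h1>"
  end
end

class Note
  include TextRenderable

  def initialize(text)
    @text = text
  end

  def to_s
    @text
  end
end

items = [Article.new("Ducks"), Note.new("1 < 2")]
puts items.map(&:render)

# The mixin doubles as a queryable "interface" marker.
puts items.first.is_a?(HtmlRenderable)
```

The caller only ever sends render, and a check like is_a?(HtmlRenderable) is available when a caller really does need to ask which contract an object claims to support.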

Jim said...

Final comment: the idea of a compile time is incompatible with Ruby as currently implemented!

And if you are concerned about code being changed, you should be running tests, which IMHO, are the only way to ensure that your code doesn't modify itself and remove a needed method at runtime. Like I said, Java can do it via reflection, even C can do it via twiddling with the dispatch table.

Gabriel C. said...

I can't resist to bring this to notice: Duck typing done right.
Scala is very promising :)

Mike Witters said...

Good post.

I started in Smalltalk which I loved for a long time. When Java started on the scene I saw the benefits of strong typing that you support in your post. But after using Java for the past 7 years, I am loving the comeback of dynamic typing languages like Scala.

This is really one of those arguments that is totally subjective and hence can't be won.

To each his/her own.

Mike Koss said...

The main distinction between Duck typing and interfaces is that objects can guarantee that they respond to a "contract"; i.e., a whole collection of methods, rather than a single one. I also think your example is a little artificial - I don't generally want the compiler arbitrarily choosing among multiple interfaces to an object. And even when it does, you're most likely going to have conditional code based on the type of interface you got.

Real systems are generally built with specific implementations in mind; certainly they have to be tested against specific implementations. I have always felt that building "pluggable" interfaces that work across unknown implementations is an elusive goal. The only places it's been done successfully are when a) one side of the implementation has an established standard against which to test, and b) the interfaces are kept very simple.

I think Microsoft's experiment with OLE (and Apple's experiment with AppleScript) has shown the fragility of trying to integrate highly complex objects exposing a myriad of interfaces.

By the way, your example brought to mind the elegant way that Prototype (javascript library) handled returning an interface to the XMLHttpRequest object:

var Ajax = {
  getTransport: function() {
    return Try.these(
      function() {return new XMLHttpRequest()},
      function() {return new ActiveXObject('Msxml2.XMLHTTP')},
      function() {return new ActiveXObject('Microsoft.XMLHTTP')}
    ) || false;
  },

  activeRequestCount: 0
}

var Try = {
  these: function() {
    var returnValue;

    for (var i = 0, length = arguments.length; i < length; i++) {
      var lambda = arguments[i];
      try {
        returnValue = lambda();
        break;
      } catch (e) {}
    }

    return returnValue;
  }
}

Note that it's important to control the order in which each interface is selected - a built-in language feature would not necessarily control that for you. I think the Try.these() paradigm is a good one for wrapping Duck typed objects of unknown origin.
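
For readers following the Ruby side of the discussion, a loose Ruby analogue of Try.these might look like the sketch below. The try_these name is hypothetical, and unlike the JavaScript version, which swallows every exception, this one rescues only StandardError.

```ruby
# Calls each attempt in order and returns the first result that does not raise.
# Returns nil when every attempt fails.
def try_these(*attempts)
  attempts.each do |attempt|
    begin
      return attempt.call
    rescue StandardError
      next  # this attempt failed; fall through to the next one
    end
  end
  nil
end

value = try_these(
  -> { Integer("not a number") },  # raises ArgumentError, so it is skipped
  -> { Integer("42") },            # first attempt that succeeds
  -> { 0 }                         # never reached
)

puts value  # 42
```

As in the Prototype code, the caller fixes the order of the attempts explicitly.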
