Monday, March 24, 2008

Dates vs. Quality

Haven't posted in a while because we are in ship mode. Fortunately, it's not the 25 hours a day / 8 days a week no-sleeping no-eating ship mode, but it's hectic nonetheless. At the end, it always seems to come down to meeting a date versus delivering the quality that you want.

What's the best way to avoid problems?

  • Work at a small company (or in a small group) so you can set your own dates. Large organizations tend to be top down and impose schedules without any idea of what is actually happening.
  • When you start out, set the date based on what you hope to accomplish. Or, set the date first, then pick a feature set you can implement in the time available. Or, iterate back and forth. But, whatever you do, don't set them independently of each other. I know this isn't news to anybody reading this blog, but it still seems to get ignored a lot.
  • Be willing to cut anything -- anything -- if you have to make your date. Even features that are almost done. You can always ship them later. If you have an irresistible force and an immovable object, the dev team explodes.
  • Be willing to change the date if you can't get the quality you want (my wife and I slipped our wedding date because we weren't going to have the quality we wanted on the original date we picked).
  • Stay calm. No matter what, you will be dealing with the issue at the end.
They say, on the Internet, nobody knows you're a dog. Similarly, nobody has any idea what you were going to ship or (hopefully) when you were going to ship. They only know what you actually do ship. And, if you don't ship in the first place, it's pretty hard to ship updates.

Wednesday, March 12, 2008

To Fork or Not to Fork

... that is the question. You're working on a big project and you start a new version, but you have to keep the old version alive. Do you fork?

I have to say that I hate this question. Neither answer is right and it's a bigger problem today with internet services than it was with monolithic apps shipped on CD. You might deploy a new rev of the old version at any time to fix a security hole.

If you fork, maintaining the old version can be a royal pain. Bug fixes don't automatically roll back (or forward) and maintaining two independent copies of the source code on dev machines is always confusing.

If you don't fork, you end up with lots of places where you're checking which version it is, and plenty of spots where it's easy to screw up, like config files and data files. You'll probably end up forking individual files and classes. Do your new classes inherit from the old versions of the equivalent classes? Either way you answer that, it can play havoc with inheritance. And you need a way to build or deploy the old version without it being corrupted by the new version.
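What that sprinkling looks like in practice can be sketched in a few lines (a hypothetical example; CONFIG_VERSION and the parse_* methods are invented for illustration):

```ruby
# Hypothetical sketch: maintaining v1 and v2 in one codebase means version
# checks scattered through config handling, data handling, and the UI.
CONFIG_VERSION = 2

def parse_flat_config(text)
  { "mode" => text.strip }                  # v1: flat key/value file
end

def parse_sectioned_config(text)
  { "main" => { "mode" => text.strip } }    # v2: sectioned file
end

def load_config(text)
  # One of many sprinkled checks -- every caller has to know both shapes exist.
  if CONFIG_VERSION >= 2
    parse_sectioned_config(text)
  else
    parse_flat_config(text)
  end
end

load_config("debug")  # => {"main"=>{"mode"=>"debug"}}
```

Every one of those branches is a place where a fix applied to one version can silently miss the other.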

Most of us use source code control systems. Why don't they help? Source code control systems are really pretty trivial systems. I've written two myself and neither was brilliant software. Just to note: both systems were built more than ten years ago and, each time, we had needs that couldn't be met with off-the-shelf systems, of which there were far fewer. Both systems are long dead.

It seems to me that source code control systems are built as if the biggest problem is the database. Well, certainly that's a big problem, but it's pretty much a solved problem at this point. The real problem is the interface to the database (and by that I mean any GUI as well as any shell commands).

The Missing Features

There are at least two large features that source code control systems seem to be lacking.

Semi-fork. Why can't I start a fork for a codebase and specify, on a file-by-file basis, which files get forked? Whenever I edit a file, I can choose whether that file gets forked or not. Even better, drop it below the file level. If the source code control system is integrated into the development environment, why can't I pick on a class-by-class or method-by-method basis? Instead of having if (v2) or #if V2 sprinkled throughout my code, why not build this ability deep into the dev environment? Allowing the user to switch between showing the different versions isn't rocket science -- even Microsoft Word can toggle between showing different versions of a document.

Lightweight fork. I haul a laptop back and forth from work and I have all my current code on it, including stuff that I haven't checked in yet. I've hauled hard disks and flash drives as well, but syncing is a pain and, of course, it's "cheating" as far as source code control goes, which means that I can't check in when I'm using my home machine because the records on my work machine will get screwed up. The alternative is to enlist twice and check out code at work and at home independently. But, if I do that, I can't easily take work home that I haven't checked in. And I really don't want to check in code I'm in the middle of working on. It might not be tested -- hey, it might not even compile at the moment. A fork would solve the problem but forks are heavyweight and, generally, can only be done by an administrator.

Why can't I have a lightweight fork that I can create at any time? In fact, whenever I start making changes, it could instantly be a lightweight fork. I can commit that fork and I have a backup that I can also get on any other machine. When I commit the changes into the main branch of the codebase, the fork is automatically removed (though it's remembered for historical purposes). The underlying systems probably support this already. We just need the UI to catch up.

Moving Forward

It seems to me that the feature set of source code control systems hasn't changed much in a long time. What's your favorite missing feature in source code control systems? What features do constantly updated, always-live services need? And how do we move source code control systems forward so they incorporate features like these?

Thursday, March 6, 2008

Software to Die for

Software development teams track a lot of statistics. Lines of code, tests, bugs, etc. But there was one project I worked on with a very unusual statistic: deaths.

Fortunately, the death count was zero for the project -- nobody had died during the development. But that wasn't true of every project the company had.

The company in question was a power company and I was a usability consultant on a large software project. At power companies, people get electrocuted and succumb to carbon monoxide poisoning. Things like that happen. Obviously, the goal is always zero deaths and nothing the company ever said implied otherwise. Still, they tracked the statistic for every project, including software projects.

Not all of the developers got it (some were too wrapped up in database schemas, business rules, and UI widgets), but it certainly put my work in perspective. My job was to make sure that the software did its job quickly and effectively, enabling the employees to do their jobs, so customers would have power and nobody would die. Not all of our software can make the difference between life and death, but, sometimes, I think maybe developers ought to act like it does.

Tuesday, March 4, 2008

What's a Dev Blog?

I know a developer who reads more than 150 blogs. Yet he says he doesn't plan to follow thisDev. He says dev blogs aren't worth reading for experienced developers -- there's just too little useful information.

I think he has a point with many dev blogs. But, I'm not trying to teach about software development. I'm writing for my peers. I plan to write about what interests me, with the hope that it will interest and be useful to others. I hope to spark discussions like The Myth of Duck Typing and Quacking Again did. I plan to write about:

  • Things that bother me, frustrate me, and drive me crazy in the development tools that we all use, as well as things that I think could be better.
  • Things that I like in those same development tools. Although the criticism is more interesting and provocative, I'll give kudos too.
  • Major issues such as architecture, performance, scale, etc.
  • Tips and tricks that I've learned recently. Even though I've been a software developer for more than 30 years, I'm still learning. Mistakes can be costly -- you might as well learn from mine. Plus, I want a place to save new tricks I learn and this seems as good a place as any.
In general, I am not planning to write about:
  • What code I've been writing. It's probably no more interesting than yours.
  • What my code looks like. Hey, of course I love my code. You probably love yours.
  • How you should write code. This isn't a "learn to be a developer" blog.
  • Why you should switch from [name language one] to [name language two]. Also known as why [language name] is the best language ever invented. I've already criticized and praised both C# and Ruby. Every language has advantages and disadvantages. Knowing them helps us pick the right tool for the job. And the tools we use can always stand improvement.
I won't always be right. But I'll try not to be boring. I hope it's useful to you and I hope I learn something myself.

Is this the right mix? I'm interested in your thoughts.

Sunday, March 2, 2008

Quacking Again - More On Duck Typing

I hadn't intended to write about Ruby again so soon, but my last post stirred up some controversy. So, I thought I'd present some better examples and try to accurately represent all of the positions. (Don't worry, I'll get back to criticizing C#, JavaScript and other languages shortly.)

Some people thought that my example was a "strawman". I disagree. If you have a set of objects and you don't know their types, you might not know their capabilities. Isn't that part of the point of dynamic typing? I frequently implement to_html in my classes and I frequently implement methods like name, long_name, full_name, etc. It's helpful if code which uses methods like these can fall back to other methods. Saying "you should just implement to_html everywhere" is not an acceptable answer. I'm busy trying to render a web page -- I don't want to have to go through the rest of my system, including classes I didn't write, and implement to_html everywhere just in case I happen to get a particular type of object.
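For instance, a caller that just wants something displayable could fall back through the name variants above (a sketch; display_name and the sample classes are invented for illustration):

```ruby
# Sketch: fall back through name-like methods, preferring the most specific.
# display_name, Person, and City are invented for illustration.
def display_name(obj)
  [:full_name, :long_name, :name].each do |m|
    return obj.send(m) if obj.respond_to?(m)
  end
  obj.to_s  # every object responds to to_s, so this is the final fallback
end

class Person
  def full_name
    "Ada Lovelace"
  end
end

class City
  def name
    "London"
  end
end

display_name(Person.new)  # => "Ada Lovelace"
display_name(City.new)    # => "London"
display_name(42)          # => "42"
```

The point is that the fallback logic lives in one caller; demanding that every class in the system implement every variant is the unacceptable alternative.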

Some people thought I was saying interface declarations should be required. No, no, no. I'm just opposed to discarding information that would help me write better code.

Some people argued that Ruby doesn't have a compiler. Why should I care? That's an implementation detail that should be irrelevant to anyone using the language. C# could be interpreted and Ruby could be compiled.

Do interfaces allow for compile-time checking or runtime checking? The answer is, to some extent, both. But my point was not about exceptions -- it was about writing better code. The key points are:

  • Don't discard useful information known by the developer because there's no way to specify it in the language. At a minimum, comment it (much Ruby code, and code in general, doesn't do this). But, even better, formalize it.
  • When an exception does occur, raise it at the highest level possible. The deeper an exception is raised, the harder it is to figure out what the problem is. Avoid the irony of a deep exception for a condition known by the developer when they first wrote the code.
  • Let the compiler and runtime do as much work for the programmer as possible. This is DRYness at its best.
Do you disagree with any of those points?

In the examples below, I have tried to write each example in the best way possible. All have the same functionality, except as noted. Although I know Ruby fans usually omit parens after method calls that don't take arguments, I've included them in the interests of clarity (and I prefer them myself). If you have a way to improve one of the examples, let me know or add a comment, and I'll update the example.

You can decide which way you prefer. I know which one I'd like to see.

Example 1: Using respond_to?
def render1(obj)
    case
    when obj.respond_to?(:to_html)
        return obj.to_html()
    when obj.respond_to?(:to_json)
        return json_to_html(obj.to_json())
    when obj.respond_to?(:to_s)
        return html_encode(obj.to_s())
    end
end
Example 2: Using rescue NoMethodError
This example does not work properly if any other NoMethodError occurs within the called methods. Any such errors would be masked.
def render2(obj)
    return obj.to_html()
rescue NoMethodError => e
    begin
        return json_to_html(obj.to_json())
    rescue NoMethodError => e
        return html_encode(obj.to_s())
    end
end
Example 3: Using rescue NoMethodError and checking the exception message
This example does not work properly if any other NoMethodError for the same method occurs within the called methods (which is a possibility if to_html calls to_html on contained objects). Any such errors would be masked.
def render3(obj)
    return obj.to_html()
rescue NoMethodError => e
    # Warning: assumes a particular format of the exception message
    raise if (e.message !~ /method `to_html'/)
    begin
        return json_to_html(obj.to_json())
    rescue NoMethodError => e
        # Warning: assumes a particular format of the exception message
        raise if (e.message !~ /method `to_json'/)
        return html_encode(obj.to_s())
    end
end
Example 4: Using a common send_in_order method
This also fixes the limitation of Example 3. It would break if Object.send gets overridden.
# send_in_order sends a series of methods, in order, to an object.
# Returns the method that was successful and the result of the method call.
# Known limitation: all of the methods must take the same parameters.
# Could be modified to handle that, but not necessary for this example.
class Object
    def send_in_order(methods, *params)
        methods.each { |m|
            begin
                result = self.send(m, *params)
                return m, result
            rescue NoMethodError => e
                # Warning: assumes a particular format of the exception message and backtrace
                if (e.message !~ Regexp.new("method `" + m.to_s() + "'") ||
                        e.backtrace[0] !~ /in `send'/ ||
                        e.backtrace[1] !~ /in `send_in_order'/)
                    raise
                end
            end
        }
    end
end

def render4(obj)
    method_called, result = obj.send_in_order([:to_html, :to_json, :to_s])

    case method_called
    when :to_html
        return result
    when :to_json
        return json_to_html(result)
    when :to_s
        return html_encode(result)
    end
end

Example 5: Using interfaces (with made-up syntax)
Yes, the syntax below for defining interfaces isn't even close to fully thought out.
interface IConvertsToHtml
    supports to_html()
end

interface IConvertsToJson
    supports to_json()
end

def render5(IConvertsToHtml obj)
    return obj.to_html()
end

def render5(IConvertsToJson obj) < (IConvertsToHtml obj)
    return json_to_html(obj.to_json())
end

def render5(obj) # Always lowest priority
    return html_encode(obj.to_s())
end

Thursday, February 28, 2008

The Myth of Duck Typing

Ruby aficionados swear by what they call "duck typing" and I figured, since I'd criticized C# in comparison with Ruby, it was only fair to return the favor.


Duck typing isn't new. Only the term is new, apparently coined within the last ten years to describe one variant of dynamic typing. The underlying principles have been around for a long time (it's Lisp's fiftieth anniversary this year). But Ruby takes duck typing to an extreme that is, to put it bluntly, unhealthy for quality code. The Pickaxe book (also known as Programming Ruby) has this to say (emphasis mine):
You'll have noticed that in Ruby we don't declare the types of variables or methods--everything is just some kind of object. . . . If you've come to Ruby from a language such as C# or Java, where you're used to giving all your variables and methods a type, you may feel that Ruby is just too sloppy to use to write "real" applications. It isn't. . . . [O]nce you use Ruby for a while, you['ll] realize that dynamically typed variables actually add to your productivity in many ways. (p. 365)
and
If you want to write your programs using the duck typing philosophy, you really only need to remember one thing: an object's type is determined by what it can do, not by its class. (p. 370) . . . You don't need to check the type of the arguments. If they support [the method you're calling], everything will just work. If they don't, your method will throw an exception anyway. . . . Now sometimes you may want more than this style of laissez-faire programming. ... (p. 371)
and it goes on to explain that all you need to do is call respond_to? to see if an object responds to the method you're calling. And therein lies the problem. I have seen Ruby code littered with calls to respond_to? -- Rails calls it about 200 times. To use Dave Thomas' term, it's sloppy. Here's a typical use of respond_to?:
def render(obj)
    case
    when obj.respond_to?(:to_html)
        return obj.to_html
    when obj.respond_to?(:to_json)
        return json_to_html(obj.to_json)
    when obj.respond_to?(:to_s)
        return html_encode(obj.to_s)
    else
        # Not realistic, since all objects respond to to_s
        raise "can't render"
    end
end
Gee, wouldn't it be great if the language handled this automatically?

What if we invented the idea of an interface, which you could use to declare a set of methods that you expect to be handled by an object? What if you could then specify that a variable or method parameter had to respond to the interface, with an appropriate type exception (uh, let's call it an interface exception) being thrown if the entire interface wasn't supported by the object? And what if we added the idea of method overloading, so that we could have alternate versions of methods that expect parameters that respond to different interfaces? And what if the compiler could check variables passed to methods to see if they support the requested interfaces and give you compile-time errors instead of runtime errors?

Gee, C# does all of that.

None of this is fundamentally incompatible with Ruby. The advantages to dynamic typing that are mentioned are, in fact, advantages. It is hugely useful to not have to specify the type of an object when it doesn't matter.

But there is certainly a problem with not being able to specify what you expect in an object when it does matter. And it does, way too often. There is nothing in the concept of dynamic typing or duck typing that requires that interfaces (or even classes) cannot be specified. But, instead, Ruby pushes work onto programmers that the compiler and runtime system ought to be doing automatically. In Ruby, it seems to me that duck typing is dynamic typing with blinders on. It results in bloated, sloppy, repetitive code. Definitely not DRY code! Personally, I'd much rather see code like this:
def render(IConvertsToHTML obj)
    return obj.to_html
end

def render(IConvertsToJSON obj)
    return json_to_html(obj.to_json)
end

def render(IConvertsToString obj)
    return html_encode(obj.to_s)
end
The appropriate method can be decided dynamically at runtime — the decision is not made at compile time. But, once inside the methods, support for the specified interface could be guaranteed. Also, notice that there is no need for an extra method to handle the unrealistic error case that I tossed in to make this point, though we could certainly add any number of additional methods for additional things we want to convert.

It would not be hard to invent a syntax for easy specification of interfaces as well as on-the-fly specification of required methods for a variable declaration. If the Ruby powers that be want it, Ruby can have all the advantages of dynamic typing and all the advantages of interfaces combined with great compiler support. What's the disadvantage?

Update 2/28: Changed example code.

Update 3/2: Follow-up posted.

Monday, February 25, 2008

What is Null?

C# got null wrong.

I really like C#. I think it's one of the best things that Microsoft has ever done. The .NET Framework is also very solid. And Visual Studio is a mature, solid development environment. As a group, they make application development much easier.

But C# got a number of things wrong and one of those things, in particular, has been bugging me recently. Every object in the system inherits from the Object class and that means that you can count on all objects supporting a minimum set of methods. The minimum set is Equals, GetHashCode, GetType, ReferenceEquals, and ToString. Unfortunately, null is not an object, so null doesn't respond to any of these methods. This is most obvious with

null.ToString();
which generates a runtime exception. Oddly,
if (null == null) ...
compiles and runs properly, but it shouldn't. Neither side of the comparison supports Equals, so the compiler must be special-casing this. There are other special cases as well: (string + null) and (null + string) both return the string.

But why doesn't null respond to ToString? The obvious answer is that it's not an object -- it's a special "magic" value. But magic values are almost never a good idea and I don't see why this case is an exception. The fact that null is a magic value is a compiler implementation detail which has nothing to do with how I want to use it as a programmer.

Among recent languages, Ruby, which has plenty of flaws of its own, got this one right. Nil is a singleton instance of NilClass. But they also made a minor mistake: NilClass doesn't respond to the empty? method, meaning that you can't use s.empty? unless you know for sure that the variable contains a string. Since Ruby's classes are open, I can fix this and I do (and don't worry -- there's plenty of time to talk about Ruby flaws in the future).
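Because Ruby's classes are open, the patch is only a few lines (one way to do it, not necessarily the author's exact fix; treating nil as empty is an assumption):

```ruby
# Reopen NilClass so nil.empty? works, treating nil like an empty string.
# One possible patch; returning true for nil is an assumption.
class NilClass
  def empty?
    true
  end
end

s = nil
s.empty?        # => true, instead of raising NoMethodError
"".empty?       # => true
"abc".empty?    # => false
```

With the patch in place, code like s.empty? works whether the variable holds a string or nil.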

Can C# fix this? Well, there are some problems. If they just change it, any code that was written that relied upon the Exception being thrown will break. That wouldn't be good. But, it seems to me they could fix it by throwing the Exception anyway, then continuing on (and returning the empty string) if the Exception isn't caught. Since the uncaught exception would terminate the application, the worst that would happen is that some apps which would have crashed will keep going. They could do the same thing with the other Object methods as well.