Saturday, April 5, 2008

Real World Bottlenecks

Yesterday, I wrote about the problem at the Democratic caucuses and asked what would be a good algorithm for tabulating a lot of votes quickly with relatively little technology.

First, an update on the parameters. 320 people voting for 19 delegates -- 9 men plus 9 women, plus 1 extra delegate of either gender (the candidate with the highest remaining vote count). Plus 9 alternates, in a similar manner. This means we needed to tally more than 6,000 individual votes, split into two halves, for men and women. Voting was by candidate number.
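To make the selection rule concrete, here is a minimal sketch of how it could be applied once the totals are known. The function name `pick_delegates`, the dictionary layout, and the tie-breaking rule (lower candidate number wins) are my own assumptions for illustration; the actual caucus rules for ties may well differ.

```python
def pick_delegates(men, women, per_gender=9):
    """Pick the top `per_gender` men and women, plus one extra of either gender.

    `men` and `women` map candidate number -> vote count (hypothetical layout).
    Candidate numbers are assumed to be unique across both groups.
    """
    def ranked(counts):
        # Highest vote count first; ties broken by lower candidate
        # number (an assumption, not a documented caucus rule).
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    men_ranked = ranked(men)
    women_ranked = ranked(women)

    chosen_men = [number for number, _ in men_ranked[:per_gender]]
    chosen_women = [number for number, _ in women_ranked[:per_gender]]

    # The extra delegate is whoever has the most votes among everyone
    # not already chosen, regardless of gender.
    remaining = men_ranked[per_gender:] + women_ranked[per_gender:]
    extra = ranked(dict(remaining))[0][0] if remaining else None
    return chosen_men, chosen_women, extra


# Tiny made-up example: 2 delegates per gender instead of 9.
men_votes = {1: 50, 2: 40, 3: 30, 4: 10}
women_votes = {11: 60, 12: 45, 13: 35, 14: 5}
chosen_men, chosen_women, extra = pick_delegates(men_votes, women_votes, per_gender=2)
```

The same function could be run a second time on the leftover candidates to pick the alternates, again assuming they are chosen "in a similar manner."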

What happened is that teams of people entered the ballots into very simple spreadsheets -- basically, just the candidate numbers. This was good because it was both simple and parallelizable. Like most developers, I type numbers very quickly. I was paired with another software guy (one person has to read, one has to tally) and we breezed through 85 ballots in 30 minutes (935 entries, an average of about an entry every 2 seconds). But the fact that we did 85 ballots is indicative of a problem -- there weren't enough laptops and teams, and other teams weren't as fast. Still, all the data entry was done in under an hour.

Then, something inexplicable happened. I don't actually know what. Given all the data in spreadsheets, it should be a simple matter to pull it all into one spreadsheet, perform a series of COUNTIF formulas for each of the potential candidate numbers and then sort the results by total. Even if that spreadsheet hadn't been created in advance, this is like a 5 minute operation. Instead, it took more than an hour to turn all of the tallies into results. The people doing the totaling vanished into some other room, so I don't know what happened. Beforehand, I heard something about Microsoft Access being used, but I don't know why it would have been. Access isn't the best application to use when what you want to do is count data, especially when equivalent data has been entered in multiple columns.
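For comparison, here is roughly what that merge, count and sort step looks like outside a spreadsheet. This is only a sketch: the function name `tally` and the nested-list layout (one list per team, one list of candidate numbers per ballot) are assumptions on my part, not how the caucus data was actually stored.

```python
from collections import Counter


def tally(sheets):
    """Merge per-team sheets and count the votes for each candidate number.

    `sheets` is a list of sheets; each sheet is a list of ballots; each
    ballot is a list of candidate numbers (a hypothetical layout).
    """
    counts = Counter()
    for sheet in sheets:
        for ballot in sheet:
            # Equivalent to one COUNTIF per candidate number, done in one pass.
            counts.update(ballot)
    # Sort by total, highest first; ties ordered by candidate number.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up data from two data-entry teams, three votes per ballot.
team_a = [[3, 7, 12], [7, 12, 5]]
team_b = [[7, 3, 5], [12, 7, 1]]
results = tally([team_a, team_b])
```

The point is that the order in which team sheets arrive doesn't matter, so nothing about this step should force the totaling to wait on a single person or a single tool.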

Net result: it took more than two hours for the results of the vote to be known.

Everybody was well-intentioned, but I think there were a number of factors contributing to a less-than-desirable end result:

  • Unreasonable restrictions (initial statements that computers couldn't be used, or that no more than one computer could be used).
  • A single plan, for a situation that didn't happen (200 candidates), rather than multiple plans for different situations. The plan was inflexible.
  • Not enough parallelism.
  • A final step that was overly complicated.
And it's those last two that are the biggest problems -- no matter how much we optimize the system, if we have a bottleneck somewhere, all our optimizations will be for naught.

Of course, it would be much simpler if Washington used a primary, with systems that are already in-place to count votes, but that's a whole 'nother story.

1 comment:

Norman said...

Wonderful analysis, Roy!

Poor planning, lack of parallelism and ungainly data merging are all serious problems, but sadly the two biggest problems are:

1) No one wants to or has the time to head up a credentialing effort, so the task usually falls on "the willing" rather than "the able".

2) Lessons go unlearned. I was shocked last week to hear our Caucus Chair express pride at how quickly and efficiently the LD Caucus went this year. Evidently two hours to process credentials and over two hours to count votes is a job well done, according to those in charge. My sense is that most of the Caucus attendees, like you, had a very different take on how quick and efficient the Caucus process was.

Since I was appointed Chair of the Technology Committee for the State Party, I have a vested interest in establishing "best practices", if not coming up with a system that we can use in the future.

Naturally, allocating delegates through a Primary would "solve" the Credentials part of the Caucus delay problem. You should encourage your readers to contact their Democratic State Committee members and urge them in that direction. The State Party Chair is, or at least was, dead set against one, so the more Democrats that make our voices heard the better. I can tell you that your State Committee man, who had the pleasure of your company during your LD Caucus, not only voted for us allocating delegates via a primary this year, but made an impassioned plea for a primary on behalf of the Technology Committee as well, but he (I) was in the minority.
