Less Code Is Better

By Paul Barry
May 10, 2011 | Comments: 9

Head First Python contains examples of Python's list comprehensions, a technique that lets me take code like this:

    new_list = []
    for thing in some_list:
        new_list.append(do_something(thing))

and turn it into shorter code, like this:

    new_list = [do_something(thing) for thing in some_list]

Note that both code fragments assume the existence of a list called some_list and a function called do_something(). List comprehensions are an example of Python's functional programming facilities, whereas the three lines of "standard code" are an example of Python's imperative programming facilities. Both work, of course, and depending on your point of view, one technique may appeal to you over the other.
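To make the comparison concrete, here is a minimal runnable sketch of both fragments side by side. The contents of some_list and the body of do_something() are placeholders invented purely for illustration; any list and any one-argument function would do.

```python
# Hypothetical stand-ins so that both fragments can run.
some_list = [1, 2, 3]


def do_something(thing):
    """Placeholder transformation: double the value."""
    return thing * 2


# Imperative version: build the new list one step at a time.
loop_result = []
for thing in some_list:
    loop_result.append(do_something(thing))

# List comprehension version: describe the new list in a single expression.
comprehension_result = [do_something(thing) for thing in some_list]

print(loop_result == comprehension_result)  # True
```

Both versions call do_something() once per item, in the same order, so they build the same list.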

I'm a big fan of the "less code is better" principle, in that I firmly believe the number of bugs in my code is directly related to the number of lines of code I write. Any technique that lets me write less code always gets a big thumbs-up from me, and using list comprehensions lets me write less code. To illustrate, let's take a real-world example of where using a list comprehension can have a dramatic impact.

Imagine I've designed a web application that has an HTML form element that allows my users to select an age from a drop-down list. I'm building my web application using the popular Django web framework, which has lots of built-in goodness for dynamically creating HTML. In fact, Django can dynamically generate an HTML form directly from a model definition, which is way cool. When I use code like this to define my data model, Django takes it and dynamically generates a form:

    from django.db import models 

    class StudentData(models.Model):
        name = models.CharField(max_length=50, 
                                verbose_name="Student's Name")
        age = models.IntegerField(verbose_name="Student's Age")

As well as the self-explanatory arguments to the field definitions, there's also an argument called choices that lets me specify a list of tuples, which Django uses to restrict the allowed input values. If I create a list like this:

    _ALLOWED_AGES = [(4, '4'), (5, '5'),
                     (6, '6'), (7, '7'),
                     (8, '8'), (9, '9'),
                     (10, '10'), (11, '11'),
                     (12, '12'), (13, '13'),
                     (14, '14'), (15, '13'),
                     (16, '16'), (17, '17'),
                     (18, '18')]

I can then change my model definition for age to look like this:

    age = models.IntegerField(verbose_name="Student's Age", 
                              choices=_ALLOWED_AGES)

Django will now use this list to create an HTML drop-down list within the form. The first element in each tuple is the value stored in the model, whereas the second element is displayed within the form's drop-down list. Pretty easy, eh?
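To picture how the two elements of each tuple are used, here is a rough sketch in plain Python. The render_select() helper is hypothetical and written only for illustration; it is not Django's code, and Django's real form rendering does considerably more than this.

```python
from html import escape


def render_select(name, choices):
    """Build a <select> element from (stored_value, label) pairs.

    Hypothetical helper for illustration only; this is not how Django
    itself is implemented.
    """
    options = [
        '<option value="{}">{}</option>'.format(escape(str(value)), escape(label))
        for value, label in choices
    ]
    return '<select name="{}">\n{}\n</select>'.format(
        escape(name), "\n".join(options)
    )


print(render_select("age", [(4, "4"), (5, "5")]))
```

The first element of each tuple ends up in the value attribute (what gets stored), while the second becomes the text the user sees in the list.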

Well... yes, it is easy... but look at all that extra code. Count the lines added: 9. That's nine places where I could potentially introduce an error into my code. Remember: more code means more errors. Did you notice that I have a typo in the string value associated with 15? It is incorrectly coded as '13', not '15'. Hard to spot, isn't it?

Let's improve things--and by improve I mean "reduce the possibility of error"--by writing some code to generate the list, instead of simply defining it. I'll start with an empty list which I then populate with the generated tuples using a standard for loop:

    _ALLOWED_AGES = []
    for each in range(4,19):
        _ALLOWED_AGES.append((each, str(each)))

Which gets me down to 3 lines of code, which isn't bad. Of course, I can rewrite this using a list comprehension:

    _ALLOWED_AGES = [(each, str(each)) for each in range(4,19)]

Which gets me down to just 1 line of code, which I think is pretty cool.
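If you want to convince yourself that the two generated versions really are interchangeable, a quick check along these lines will do. The names ages_from_loop and ages_from_comprehension are used here only to keep the two results apart.

```python
# Three-line version: start empty and append each generated tuple.
ages_from_loop = []
for each in range(4, 19):
    ages_from_loop.append((each, str(each)))

# One-line version: the same tuples from a list comprehension.
ages_from_comprehension = [(each, str(each)) for each in range(4, 19)]

print(ages_from_loop == ages_from_comprehension)  # True
print(len(ages_from_comprehension))               # 15
print(ages_from_comprehension[0])                 # (4, '4')
print(ages_from_comprehension[-1])                # (18, '18')
```

Note that range(4, 19) stops at 18, so the generated list covers the same ages, 4 through 18, as the hand-typed one, and every label is derived from its value, so the two can never disagree.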

Now, ask yourself this question: which would you rather maintain? Nine lines of code, three lines of code or just one line of code?

To conclude, and to shamelessly paraphrase a famous Canadian singer/songwriter: Less code is better. Bumper stickers should be issued!


Comments: 9

Sorry, PJ. Gotta disagree with you. More code is not necessarily better (that would be the exact opposite of your thesis). The best code is the code that is easiest to comprehend and maintain by a third party.
Your list comprehension code may be great stuff that really shows off your skills with Python, and it's fine to do that if YOU are the only one that will ever see those lines of code again. However, it reminds me too much of the many programs I have had to debug in my career. Not only was the code, in many cases, not immediately obvious, but, in many cases, the original programmer didn't have a clue what the code was originally meant to do.
I'm not a Python programmer (obviously), but the 3 lines of code are pretty self-explanatory, whereas I would have no idea where to begin with the "list comprehension" code. I would go with the 3 lines in a production program. Trust me.

Thanks for the comment, Frank. I totally agree with your sentiment, in that easy-to-understand code is to be preferred. However, I'd counter that within the Python world (and within most any programming community) it is always better to use the idioms that are popular within that community (as opposed to the ones you are comfortable with). Among Python folk, list comprehensions are well understood, as they are in all the purely functional programming languages. Of course, I can appreciate that at first glance the list comprehension looks like "line noise", but it really isn't to someone familiar with Python. Thanks again for the comment. You make a good point: clearly written code is to be preferred. --Paul.

Totally agree with everything that has been said. First off, if you debug Python code, it means that you know the language, so its idioms should be familiar to you. I love Python.

I've come to believe that preferred code style may have a neurological basis. Some people prefer to work from memory, others by sight; eye scanning may focus on color or shape, and other factors may apply. Experience has a lot to do with it. The kinds of style that each programmer remembers as having been a win are important, even if the memory is not statistically accurate.

The bottom line is that code should be easy to read (as well as correct and fast) but that depends to a significant degree on the reader. Trying to establish "the one true style" is pointless.

IMNSHO, shorter, idiomatic code is almost always not better. Often, it is very pleasing to the writer. Concise, well-written code that avoids cleverness should be more important than even performance.

When I first started out 35 years ago, I used to shudder when I heard grey beards say they didn't bother to figure Lisp or FORTRAN "code" out, they just rewrote it. Today, I do pretty much the same thing when working on Perl and C programs.

I'll take easy to read over idiomatic every time.

I find this discussion fascinating.

Everyone acts like "easiest" is universally defined. I don't know that list comprehensions are so idiomatic. In real language, they would be expressed as: taking a group of items A, make a group of items B with these properties. This seems to me to be a "how" description. The for loop stuff seems more procedural, meaning it focuses on the little details rather than the task.

Does this dichotomy seem realistic?

Great discussion, folks!

I'll expand and comment on a few of the points, if I may?

I think it is important to stress that it would be incorrect to think that all idiomatic code is "bad", as it is possible to write bad code in almost any programming language with or without language-specific idioms.

Are some idioms in some languages "bad" and hard to understand? Well... sometimes they are and sometimes they aren't. It depends.

Are language-specific idioms to be avoided? I don't think so. In fact, I'd be pretty annoyed if a developer took my Python list comprehension and re-wrote it as a "for" loop, as that - in my mind - would not make it easier to read (again, to me, as a Python programmer).

List comprehensions in Python have been around for a long time and there are good descriptions of them in every Python book that I've read. The list comprehension is a description of what has to happen to one list in transforming it into another, with an emphasis on the "what". The for loop is more concerned with the "how" of doing the transformation, where the programmer is telling the interpreter how to do things. With the list comprehension, the interpreter is being told what is required, not how to go about doing things in a step-by-step fashion. It's not unlike regex technologies, where something like '^(\d+)$' is a specification of what is to be found, as opposed to an algorithm to use when finding something. As I said in my original post, some people will have an easier time with one technique over the other...
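To illustrate the regex comparison, here is a small sketch that puts the pattern '^(\d+)$' next to a hand-written loop. Both function names are made up for this example, and the two are only shown to agree on the sample strings below; they are not claimed to be equivalent for every possible input (a trailing newline or non-ASCII digits, for instance, are treated differently).

```python
import re


def is_all_digits_regex(text):
    """The "what": the string should consist of one or more digits."""
    return re.match(r'^(\d+)$', text) is not None


def is_all_digits_loop(text):
    """The "how": walk the string and check each character in turn."""
    if not text:
        return False
    for char in text:
        if char not in "0123456789":
            return False
    return True


for sample in ["42", "4a2", ""]:
    print(repr(sample), is_all_digits_regex(sample), is_all_digits_loop(sample))
```

On these samples, both report True for "42" and False for the other two.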

[As an aside, another example of a technology that concentrates on specifying the "what" (and not the "how") is SQL.]
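As a quick sketch of the SQL point, the example below uses Python's built-in sqlite3 module with an in-memory database. The student table and its three rows are invented for illustration; the query states which rows are wanted and leaves the searching and sorting to the database engine.

```python
import sqlite3

# Throwaway in-memory database with a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ann", 9), ("Bob", 14), ("Cy", 17)],
)

# The query says *what* is wanted; how to find it is left to the engine.
teen_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE age >= 13 ORDER BY name"
    )
]
conn.close()

print(teen_names)  # ['Bob', 'Cy']
```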

For those folk interested in the origins of list comprehensions, they are related to a similar construct from Math theory. Don't ask me which bit, as my eyes tend to glaze over (like a lot of people) when I hear the words "Math" and "theory" in the same sentence. ;-)

--Paul.

"As an aside, another example of a technology that concentrates on specifying the "what" (and not the "how") is SQL."

And it is the embodiment of less code. It's not appropriate for games and such, but for most applications, when combined with a high-core/SSD machine, one can define highly normalized databases that just fly, with little client code needed. Yum.

Terse code should not be confused with the concept of "creating less code". The stated example is terse code and really is harder to read and maintain even in Python.

Less code, in my opinion, is the concept of thinking through what you are going to code and writing it in as straightforward, maintainable and compact a way as you reasonably can.

I have often seen functions and sections of code that were written just plain wrong and were lengthier than they should have been for the functionality being achieved.