Do you know where your bugs are hiding in your codebase? Sometimes bugs can be hard to spot, but other times a quick look at a code file can reveal patterns that suggest where they may be lurking. This post is intended to help you spot some of those red flags that may indicate code problems.

These red flags are similar to code smells, but a code smell usually connotes design issues, and require understanding what the code does to suss out. Red flags are signs that you can read just by looking at the code.

Spotting red flags in your code

Taken to extremes, anything that makes the reader spend time thinking about the code itself, instead of what the code is intended to do, can be considered a red flag. Ideally, code should be written to remove as many barriers to understanding as possible. Code segments that are harder to read are more likely to harbor bugs, so tread carefully when you encounter these eight red flags.

Red Flag 1. Visual mechanics

While it’s important to be concerned about high-level design issues in your code, the nitty-gritty of construction matters, too. Keeping your code clean and up-to-date is an important part of maintenance.

In many ways, reading code is just like reading natural language. There are rules and conventions that have evolved over time to make things easier for the reader to understand. Code that doesn’t follow these conventions is harder to read, and is less likely to be read. Clean code will be read more often and will be glossed over less.

Red Flag 2. Inconsistent formatting

The most obvious part of code visuals is indentation and structure, so this could be considered a subset of Number 1. But consistent formatting, especially indentation and spacing, is so important to making the code easier to be read that we gave it a Red Flag of its own.

A monstrosity like this is a breeding ground for bugs:

if ($phase == 1) {
   menu_setup();
}

else if ($phase == 2){database_setup();}
if ( 3 == $phase )
   {
      run_cleanup();
   }
else { get_user_input(); }

Did you catch that both else and elseif are used here? Is that intentional? Could $phase be changed by database_setup()? Who knows?

Here’s how this segment could be rewritten to avoid the red flags:

if ( $phase == 1 ) {
   menu_setup();
}
elseif ( $phase == 2 ) {
   database_setup();
}
elseif ( $phase == 3 ) {
   run_cleanup();
}
else {
   get_user_input();
}

Or maybe:

if ($phase==1)
{
   menu_setup();
}
elseif ($phase==2)
{
   database_setup();
}
elseif ($phase==3)
{
   run_cleanup();
}
else
{
   get_user_input();
}

Either one of those reformatted versions are better than the original. There are plenty of arguments over how to handle indentation, brace placement, tabs vs. spaces, and how tight to have parentheses, and I’m not going to address them here. The key is that it should be consistent throughout your codebase, whatever you choose.

Red Flag 3. Incorrect English in comments

There’s a famous saying that “Programs must be written for people to read, and only incidentally for machines to execute.” This is even more true when working with comments in the code.

When you write comments, use complete English sentences. Don’t forget capital letters and punctuation. Write as if you were writing an email to someone else, because that’s pretty much what you are doing.

I found an example of this in a programming book recently. One of the code samples had a block of comments that read:

# random.random() produces numbers
# uniformly between 0 and 1
# it's the random function we'll use
# most often

Unfortunately, we have to waste brain cycles on figuring out what the writer meant, with no clues from capitalization or punctuation. How much easier it would have been to read:

# The random.random() function produces
# numbers uniformly between 0 and 1.
# It's the random function we'll use
# most often.

Red Flag 4. Misspelled identifiers

Spelling also matters, both in comments and in the identifiers used. When you come across a misspelled word, you have to stop and figure out what the original intent was.

The worst kind of misspelling is when the programmer created an alternate spelling so they could have two variables named the same thing. I’ve found code that had two index variables, named index and indx:


for my $index ( 'A'..'Z' ) {
   for my $indx ( 1..10 ) {
      $totals[$indx][$index] = ...

Is that third line correct? Or should it have been $totals[$index][$indx] instead?
Instead, make existing variable names more descriptive:


for my $column_index ( 'A'..'Z' ) {
   for my $row_index ( 1..10 ) {
      $totals[$row_index][$column_index] = ...

Red Flag 5. Cut & paste patterns

Sometimes when a programmer gets a piece of code working, they will cut and paste the working code and then modify key elements of it in the new location. This can lead to code like this:


# Add current paycheck and adjustments to totals.
$tot{'salary'} += $curr{'salary'}+$adj{'salary'};
$tot{'federal'} += $curr{'federal'}+$adj{'federal'};
$tot{'state'} += $curr{'state'}+$adj{'state'};
$tot{'medical'} += $curr{'state'}+$adj{'medical'};

Blocks of cut-and-paste code like this lull the reader into thinking they know what the code is doing. Because the first few lines of code make sense, the reader assumes that the rest of the code is also correct. Instead, the reader should be more alert to problems in cut-and-pasted code.

This code is a mess to read. It’s hard to pick out patterns in the code. Putting the common parts of the code into columns would make it easier to see the patterns.


$tot{'salary'}  += $curr{'salary'}  + $adj{'salary'};
$tot{'federal'} += $curr{'federal'} + $adj{'federal'};
$tot{'state'}   += $curr{'state'}   + $adj{'state'};
$tot{'medical'} += $curr{'state'}   + $adj{'medical'};

Now it’s much easier to read, making it easier to tell that the line for $tot{'medical'} is using $curr{'state'}.

If the code is rewritten like this, the intent is much clearer and it’s less prone to errors:


for my $category ( 'salary', 'state', 'federal', 'medical' ) {
   $tot{ $category } += $curr{ $category } + $adj{ $category }
}

Red Flag 6. Numbered variables instead of distinguishing between similar ones

Numbered variables can also be red flags because they make it harder for the programmer to distinguish between similar variables that differ by only one character, and force them to spend mental cycles keeping track of what the various numbers mean.

Numbering variables usually happens when previously working code has to be extended, and the programmer doing the extending chose the simplest approach that could work.

Take this code that totals up the sizes of files:


while ( my $file = get_next_file() ) {
   $size += $file->bytes;
}
print "$size bytes used.";

So far, so good. But if the programmer later has to extend the code to count up the number of blocks used by the file and number of files overall, it turns into this:


while ( my $file = get_next_file() ) {
   $size += $file->bytes();
   $size2 += $file->blocks_used();
   $size3++;
}
print "$size bytes used in $size3 blocks in $size2 files.";

I sure hope that that print statement has the right variables matched up. Oops, it doesn’t.

Red Flag 7. Excessive nesting and overly large functions

Putting too much code within a logical grouping of code such as a function can hide bugs. To get a visual sense of code complexity, look at how much indenting there is. If you see more than three layers of indentation, there’s probably too much going on:


function user_checkup() {
   for my $user ( get_all_users() ) {
      if ( $user->type == ADMIN ) {
         for my $ticket ( $user->get_all_support_tickets() ) {
            if ( $ticket->status == CLOSED ) {
               for my $email ( $ticket->recipients ) {
                  send_email_to_recipient( $ticket, $email );
                     ... etc etc
               }
               ... etc etc
            }
            elsif ( $ticket->status == PENDING ) {
               ... etc etc
            }
            ... etc etc
         }
      }
      elsif ( $user->type == USER ) {
         ... etc
      }
      ... etc
      if ( $user->type != ADMIN ) {
         # Do some special case for non-admins.
         # etc etc etc...
      }
   }
}

There’s just far too much happening in the above block of code. Notably, it probably didn’t start out like that. Maybe it began as a quick routine to check up on an admin’s status queue, but then kept growing and adding more cases and functionality. Before long you’ve got a function hundreds of lines long with way too much going on.

These sorts of too long and too complex functions are difficult to understand all at once with our mistake-prone human brains. Maybe there’s a $total in one section that gets used somewhere else by accident. Maybe something modifies a $ticket in one place that affects something else in another.

Red Flag 8. Functions with long parameter lists

While we’re talking about things too complex to wrap your head around, functions with long parameter lists can also be a nightmare. If you’re using a dynamic language like Perl or PHP where parameters are not type-checked, it’s even more important.

Consider a function like this:


function generate_report( $user, $dbhandle, $emailaddr,
   $include_summary, $omit_hyperlinks, $filename )

Check how these functions are called very carefully. It’s all too easy to get a couple of arguments out of order, and you might not even notice in certain cases. For example, if $filename and $omit_hyperlinks are passed in the incorrect order, the mistake might not be so easy to discover. If you pass a $filename with a value of “0” and a $omit_hyperlinks with a value of “report.txt,” that’s not something that might get discovered right away. “0” is a valid filename, and “report.txt” evaluates to a Boolean “true” for the value of the $omit_hyperlinks flag.

Searching for red flags in your code

Keep an eye out for these sorts of patterns in the code you work with, and be watchful for bugs that might be lurking within. You just might find a problem that nobody has yet discovered.

Just as important, be wary of red flags in your own code. If you find yourself taking the false shortcut of not being consistent in your formatting, stop and take the time to do it right. When you find yourself with functions too large to wrap your head around, stop and break them up into smaller chunks. It’s never fun to look back at some bad code and wonder, “Who wrote this garbage?” … only to realize it was you.

 

Andy Lester has been involved with open source for two decades, contributing many modules to Perl's CPAN as well as speaking at open source conferences and user groups around the country. He's the author of the search tool ack and his book Land The Tech Job You Love is published by Pragmatic Bookshelf. View posts by .

Interested in writing for New Relic Blog? Send us a pitch!