Wednesday, May 27, 2015

It's crazy how easily bugs can stay hidden

Sometimes it's really surprising how little a bug can actually affect the functionality of an application.

A situation that I ran into today involved a stack object inside a C++ application.


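class Stack {
public:
    typedef Item* pointer;
    pointer clear(void);   // detach and return the whole chain
    pointer pop(void);     // detach and return the top item
    void push(pointer);    // link a new item on top
private:
    pointer m_top;         // current top of the stack
};
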
This is a stack of pointers. Each Item has an Item::next() function, which returns a pointer to the next item on the stack.
Push sets pointer->m_next = this->m_top, and then m_top = pointer.
Pop does the equivalent of m_top = m_top->next(), and returns the previous m_top.
Clear returns m_top and sets m_top to nullptr.
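
Here's a minimal sketch of those operations in plain C++, with no atomics and no assembly. The Item struct, the null check in pop(), and the empty() helper are assumptions made for illustration; the real code looks nothing like this.

```cpp
#include <cassert>

// Hypothetical Item: the post only says it has next() and an m_next link.
struct Item {
    Item* m_next = nullptr;
    Item* next() const { return m_next; }
};

// Single-threaded sketch of the stack described above.
class Stack {
public:
    typedef Item* pointer;

    // Link the new item in front of the current top.
    void push(pointer p) {
        p->m_next = m_top;
        m_top = p;
    }

    // Detach and return the top item, or nullptr if the stack is empty.
    pointer pop() {
        pointer old = m_top;
        if (old != nullptr) {
            m_top = old->next();
        }
        return old;
    }

    // Detach the whole chain and return its head; the stack is left empty.
    pointer clear() {
        pointer old = m_top;
        m_top = nullptr;
        return old;
    }

    // Hypothetical helper, only here to make the example checkable.
    bool empty() const { return m_top == nullptr; }

private:
    pointer m_top = nullptr;
};

inline void example_usage() {
    Item a, b;
    Stack s;
    s.push(&a);
    s.push(&b);               // b is now on top, with b.next() == &a
    assert(s.pop() == &b);    // pop returns the previous top
    assert(s.clear() == &a);  // clear hands back whatever chain is left
    assert(s.empty());
}
```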

Pretty simple stuff.

Consider, though, that the actual implementation of this stack is hand-written assembly, for two reasons: 1) to squeeze every ounce of performance out of the application, and 2) to get atomic, thread-safe operation.

Even better, Windows offers something called an SList (an interlocked singly linked list).

Anyway, suffice to say, the details of this data structure are anything but trivial.
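
To give a rough feel for the atomic side, here is a sketch of push and clear using std::atomic instead of assembly. This is a swapped-in technique for illustration, not the code from the application, and the AtomicStack name is hypothetical. Pop is left out on purpose: a naive compare-and-swap pop is exposed to the ABA problem, which is a large part of why the real thing is not trivial.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical Item with just the intrusive link.
struct Item {
    Item* m_next = nullptr;
};

// Sketch only: push and clear on a lock-free intrusive stack.
class AtomicStack {
public:
    // Retry until m_top is swung from the value we linked against to p.
    void push(Item* p) {
        Item* old = m_top.load(std::memory_order_relaxed);
        do {
            p->m_next = old;  // on a failed CAS, old holds the fresh top
        } while (!m_top.compare_exchange_weak(old, p,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));
    }

    // Take the whole chain in one atomic step; the stack is left empty.
    Item* clear() {
        return m_top.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<Item*> m_top{nullptr};
};

inline void example_atomic_usage() {
    Item a, b;
    AtomicStack s;
    s.push(&a);
    s.push(&b);
    Item* head = s.clear();          // the entire chain comes back at once
    assert(head == &b);
    assert(head->m_next == &a);
    assert(s.clear() == nullptr);    // nothing left behind
}
```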

Here's where the bug comes in.

Our implementation of Stack::clear(), depending on the target platform, actually had the same implementation as pop(). The obvious consequence is that clear() didn't ACTUALLY empty the list. It only pulled off the first item!
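
The effect is easy to reproduce in a small single-threaded sketch. BuggyStack, size(), and left_after_clear() are hypothetical names made up for this example; the real bug lived in the platform-specific assembly.

```cpp
#include <cstddef>

// Hypothetical Item, as described earlier in the post.
struct Item {
    Item* m_next = nullptr;
    Item* next() const { return m_next; }
};

// Stand-in for the affected platform's build of the stack.
class BuggyStack {
public:
    typedef Item* pointer;

    void push(pointer p) {
        p->m_next = m_top;
        m_top = p;
    }

    pointer pop() {
        pointer old = m_top;
        if (old != nullptr) {
            m_top = old->next();
        }
        return old;
    }

    // The bug: clear() does exactly what pop() does.
    pointer clear() { return pop(); }

    // Hypothetical helper: count the items still linked from m_top.
    std::size_t size() const {
        std::size_t n = 0;
        for (pointer p = m_top; p != nullptr; p = p->next()) {
            ++n;
        }
        return n;
    }

private:
    pointer m_top = nullptr;
};

// Returns how many items are still on the stack after clear().
inline std::size_t left_after_clear() {
    Item a, b, c;
    BuggyStack s;
    s.push(&a);
    s.push(&b);
    s.push(&c);
    s.clear();
    return s.size();  // 2 here; a correct clear() would leave 0
}
```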

This code has been in use for over ten years! In all that time in production, no one noticed a problem with it. How crazy is that?

Talk about an obvious bug having almost no effect :-)