Crashing Your Way to Great Legacy C Tests

Adding tests to legacy C or C++ code can be a challenge. Code not designed to be tested won’t naturally be testable. Dependencies will be unmanaged and invisible. Getting that first test written will hurt, a lot. Don’t despair! The first test is the hardest, but subsequent tests are much easier.

Knowing what to do and what to expect when you start adding tests to your legacy code can ease the journey. This article will give you an idea of what to expect when getting that first bit of C or C++ into the test harness.

Michael Feathers suggests this algorithm for adding tests to legacy code.

  1. Identify change points.
  2. Find test points.
  3. Break dependencies.
  4. Write tests.
  5. Make changes and refactor.

There’s another process that is especially helpful and works with Michael’s algorithm. Here’s the scenario:

You want to test some existing legacy code. The function your test needs to call is part of an interwoven mass of C data structures and functions. Just getting this function-call data-structure free-for-all compiled in the test harness is a big deal. Getting it to run can be an even bigger deal. It’s not obvious what data the legacy code function uses, so what needs to be initialized is not obvious either. You can crash your way to discovering what needs to be initialized.

The crash test method starts with an empty test case and a legacy code function you want to test. The algorithm, expressed in C, looks like this:

void addNewLegacyCtest()
{
  makeItCompile();
  makeItLink();
  while (runCrashes())
  {
    findRuntimeDependency();
    fixRuntimeDependency();
  }
  makeTestMoreMeaningful();
}

makeItCompile()

makeItCompile() usually starts with an #include for the function’s header file and a call to the target function. A long list of compilation errors is your reward. It’s best to attack the first error, as it is the likely cause of 101 others. Add one include file at a time. Continue attacking that first error. Your goal is to get to the minimum set of #includes that results in a clean compile. Don’t get discouraged.

A shortcut for a really bad dependency mess is to copy the #includes from a production code file that calls the target function. You’ll get a clean compile, but a messy include list. Once the test is up and running, try to prune the list of includes.

To call the function under test you will also have to provide data structures and parameters to the function. During makeItCompile(), feel free to use null pointers and simple literal data values. Use memset() to bulk-fill data structures with zeros. Later you will have to do more than feed the code under test meaningless inputs, but by plugging dependencies with null pointers and simple data you can get to a clean compile sooner. Later, when the test runs, a crash in the test runner will lead you to the data that is causing the problem (at least sometimes).
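
Here is a minimal sketch of what that first test might look like. The function process_order(), its OrderContext structure, and the header order_processing.h are invented for this illustration; the test is just a plain C function so it fits any harness.

#include <stddef.h>
#include <string.h>
#include "order_processing.h"  /* hypothetical header for the code under test */

void test_ProcessOrder_FirstCrashTest(void)
{
  OrderContext context;
  memset(&context, 0, sizeof(context));  /* bulk-fill the structure with zeros */

  /* meaningless inputs: a zeroed structure, a null pointer, and a literal,
     just enough to get a clean compile and a first run */
  int result = process_order(&context, NULL, 0);
  (void)result;  /* no meaningful checks yet; surviving the call is the goal */
}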

The less committed TDDers might get discouraged by this process, resulting in early termination of makeItCompile() via exit(FAILURE). Don’t give up; exit(FAILURE) is a last resort.

Once makeItCompile() returns, makeItLink() starts immediately.

makeItLink()

With the web of includes in place, and parameter and global dependencies plugged with NULL pointers and other meaningless data, you are ready for your next reward: link errors. makeItLink() can be quite involved. The unresolved externals need to be resolved either by linking in parts of the production code or by providing Test Doubles. makeItLink() might also result in an exit(FAILURE) for those looking for an excuse not to write their unit tests.
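
For example, suppose the linker reports unresolved externals for flash_write() and log_error(), two functions the code under test calls whose real implementations would drag in hardware drivers. The names are hypothetical, but the shape of the fix is typical: add simple Test Doubles to the test build to satisfy the linker.

#include <stddef.h>
#include <stdint.h>

/* Link-time Test Doubles: stand-ins that resolve the externals without
   pulling the real drivers into the test build. */
int flash_write(uint32_t address, const void * data, size_t size)
{
  (void)address;
  (void)data;
  (void)size;
  return 0;  /* pretend every write succeeds */
}

void log_error(const char * message)
{
  (void)message;  /* swallow errors for now; a later double could record them */
}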

runCrashes()

Once the executable test runner is built link-error-free, the most likely outcome is for the test runner to crash. The crashes are caused by the uninitialized and improperly initialized data left dangling earlier. You are right on track! Hang in there! The crashes are leading you right to the runtime dependencies.

Stay in the loop as long as runCrashes() is TRUE. In the loop you find and fix the runtime dependencies as you knit together the needed global data and parameters to make the code under test happy. Getting runCrashes() to transition to FALSE is a major breakthrough. The function under test is running in the test harness! Let’s look a little deeper at the find and fix processes.

findRuntimeDependency()

If you have a debugger, fire it up and visit the crash site. Inspecting for clues will likely yield the root cause of the illegal access. If you don’t have a debugger, it’s time to get one. You can also inspect the input data to find obvious problem initializations. Single stepping through the code under test can also be revealing.
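
If a debugger is not handy, a few temporary guard checks placed just before the call can point at the dangling data. This sketch continues the hypothetical example from makeItCompile() and adds assert() from the standard library; the checks fail fast, with a file and line number, instead of letting the code under test crash somewhere deep inside.

#include <assert.h>

void test_ProcessOrder_FirstCrashTest(void)
{
  OrderContext context;
  memset(&context, 0, sizeof(context));

  assert(context.config != NULL);          /* fails here: the config was never wired up */
  assert(context.notifyCallback != NULL);  /* fails here: the callback is still NULL */

  process_order(&context, NULL, 0);
}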

findRuntimeDependency() returns when the root cause of the crash is discovered. Michael suggests that we lean on the compiler as a way to check the extent of a compile-time dependency. findRuntimeDependency() leans on the execution environment, letting the running program, and the OS support for illegal memory access, guide the way to the runtime dependencies.

This technique is valuable for embedded and non-embedded software, but embedded software only benefits when memory is managed so that illegal memory access is detected. Without illegal memory access detection your code may happily run off into the weeds, only to bring turmoil later.

Crash testing your way to good test cases is a good reason to run embedded software tests in the development environment. If you want to see some other reasons to run your embedded software tests in the development environment, look at TDD for Embedded Software.

fixRuntimeDependency()

With the root cause in focus, the missing initialization should be more clear. The cause might be uninitialized global data, a missing function pointer, or some other unexpected value. Figure out what to initialize next and go into a series of makeItCompile() and makeItLink() operations. Once this runtime dependency is resolved, the most likely outcome is another crash. The good news is that eventually, when all the runtime dependency holes are plugged, the crashes stop. Then new tests come quickly.
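
A hypothetical example: say the debugger shows the crash came from a zeroed global configuration structure whose callback pointer was still NULL. A small helper, called at the top of the test, plugs that runtime dependency. The names systemConfig, maxRetries, notifyCallback, and stub_notify are invented for this illustration.

static void stub_notify(int event)
{
  (void)event;  /* harmless callback to stand in for the real one */
}

static void plugRuntimeDependencies(void)
{
  memset(&systemConfig, 0, sizeof(systemConfig));
  systemConfig.maxRetries = 1;                /* smallest value that keeps the code happy */
  systemConfig.notifyCallback = stub_notify;  /* was NULL, the cause of the crash */
}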

If you have bitten off too big a test target, it may take a long time and many crashes to get a clean run. You can always reconsider and pick an easier target.

makeTestMoreMeaningful()

Take a deep breath. The hard part is done! Now put some meaning into the tests.

Usually after addNewLegacyCtest() the natural next step is to copyPasteTweakTheLastTest() until you can’t think of any more tests. But I highly recommend against the following algorithm for adding new legacy C tests:

void addMoreLegacyCtests()
{
  while (!testsAreSufficientForCurrentNeeds())
  {
    copyPasteTweakTheLastTest();
  }
}

copyPasteTweakTheLastTest()

Once a first legacy test is running there is usually great relief and rejoicing. Developers immediately envision a new test, involving a simple tweaking of the input and the checks. So, copyPasteTweakTheLastTest() is the natural thing to do. This is all well and good, but copyPasteTweakTheLastTest() must be followed by refactoring, or the test cases will immediately become a mess.

The tests need to be readable, maintainable, and self-documenting. A mass of duplication does not meet that need.
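
To make the duplication concrete, here is what a couple of copy-paste-tweaked tests tend to look like, continuing the hypothetical order-processing example (add_item(), order_total(), and defaultConfig are invented names, and plain assert() stands in for your harness’s checks). Only the quantity and the expected total differ between the tests, but you have to read every line to see that.

void test_OrderTotal_SingleItem(void)
{
  OrderContext context;
  memset(&context, 0, sizeof(context));
  context.config = &defaultConfig;
  context.notifyCallback = stub_notify;
  add_item(&context, "widget", 1);
  assert(order_total(&context) == 100);
}

void test_OrderTotal_ThreeItems(void)
{
  OrderContext context;
  memset(&context, 0, sizeof(context));
  context.config = &defaultConfig;
  context.notifyCallback = stub_notify;
  add_item(&context, "widget", 3);
  assert(order_total(&context) == 300);
}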

I suggest this as an alternative algorithm:

void addMoreLegacyCtests()
{
  while (!testsAreSufficientForCurrentNeeds())
  {
    copyPasteTweakTheLastTest();
    while (!testDifferencesAreEvident())
    {
      inspectTheCopyPastedTests();
      if (setupStepsAreSimilar())
      {
        extractAndParameterizeTheCommonSetup();
        extractAndParameterizeTheCommonAssertions();        
      } 
      else
        ;//Maybe a new test case group is needed
    }
  }
}

testsAreSufficientForCurrentNeeds()

In testsAreSufficientForCurrentNeeds() you decide if the current tests are adequate, as the name suggests. (I love self-documenting code.) While deciding, ask:

  • How should the inputs be varied?
  • What should be checked to verify the operation of the code under test?
  • What other tests are needed?

testDifferencesAreEvident()

The first time through, testDifferencesAreEvident() almost always results in a FALSE return value. copyPasteTweakTheLastTest() provides a new test case, which you can be happy about. But the first few times copyPasteTweakTheLastTest() is executed, the resulting test cases are filled with duplication. The valuable tweaks are hidden in the mass of initialization code and a bank of assertions, obscuring the differences between test cases. You are not being paid per line of code, are you? So clean it up. After a few laps around this loop, each copyPasteTweakTheLastTest() creates a new concise and readable test case.

Why the big deal about keeping tests clean? Tests have to be kept clean because in a few days (or hours) each test will require a lot of work to understand. At the time you copyPasteTweakTheLastTest() the differences are clear to you, but to no one else. And they won’t be clear to you for long. So that is the best time to reduce the duplication and improve the readability of the tests. The only way testDifferencesAreEvident() returns TRUE is when the duplication has been refactored into shared test data and helper functions.

extractAndParameterizeTheCommonSetup()

In this activity common setup is extracted into shared test case variables and helper functions. Tests are run after each change. Sometimes extractAndParameterizeTheCommonSetup() results in new initialization code that the production code could use. You have created a useful utility and should consider calling promoteTestSetupCodeToProductionCode().
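
Continuing the hypothetical example from above, the duplicated setup can be pulled into a shared test case variable and a helper so that each test shows only its tweak:

static OrderContext context;  /* shared test case variable */

static void givenAnOrderOf(const char * item, int quantity)
{
  memset(&context, 0, sizeof(context));
  context.config = &defaultConfig;
  context.notifyCallback = stub_notify;
  add_item(&context, item, quantity);
}

void test_OrderTotal_SingleItem(void)
{
  givenAnOrderOf("widget", 1);
  assert(order_total(&context) == 100);
}

void test_OrderTotal_ThreeItems(void)
{
  givenAnOrderOf("widget", 3);
  assert(order_total(&context) == 300);
}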

extractAndParameterizeTheCommonAssertions()

This is just like extractAndParameterizeTheCommonSetup(). Maybe this description should be refactored like this:

  extractAndParameterizeCommon(setup);
  extractAndParameterizeCommon(assertions);
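
The same treatment applies to the checks. If every test in the hypothetical example verifies the same group of output fields (itemCount and errorCode are invented names), a small assertion helper keeps each test down to a line or two:

static void thenTheOrderIs(int expectedTotal, int expectedItemCount)
{
  assert(order_total(&context) == expectedTotal);
  assert(context.itemCount == expectedItemCount);
  assert(context.errorCode == 0);
}

void test_OrderTotal_ThreeItems(void)
{
  givenAnOrderOf("widget", 3);
  thenTheOrderIs(300, 3);
}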

Your hard work is rewarded. After a couple of spins through this cycle, testDifferencesAreEvident() will repeatedly return TRUE. You will have the tools to add new test scenarios rapidly because you are building on clean, refactored test code. I’ll provide an example in a future post.

When you have enough tests: exit(SUCCESS);

Thanks to my friends in Seoul for helping me articulate this process! Have I been doing too much C lately?

8 thoughts on “Crashing Your Way to Great Legacy C Tests”

  1. The algorithm you present here looks like TDD.

    Knowing that it will crash and accepting that as progress relieved us a lot. Without it we might have exit(FAILURE) 🙂 Thank you for clarifying this process and for your guidance, which kept us on the right track.

  2. Nice algorithm, James! I’m especially glad that you put the refactor bit in at the end. I’ve left my share of those Copy-Paste-Tweak tests in my wake. I now understand how hard they make the code to change. You ought to be able to read only one or two lines of code to understand what the test does. OK, so we are talking C here, so maybe 5 or 6 lines!

  3. Thanks, James,

    It was a great time for us to work with you and to build a feasible solution for our own environments. We couldn’t have done it without you.

    In this article, you quite clearly summarized a lot of the lessons learned that we figured out. Thanks. 😉

    The following are my ideas.

    1) Adding top-most algorithm

    In my opinion, why don’t you put the top-most algorithm at the beginning of this article, something like the one below:

    attackLegacyCTests() // active and aggressive attitude
    {
      addNewLegacyCtest();
      addMoreLegacyCtests();
    }

    2) Describing makeTestMoreMeaningful()

    You skipped describing makeTestMoreMeaningful(). But in this step, I think there are some noticeable points that we usually pass over.


    makeTestMoreMeaningful()

    Don’t use arbitrary, meaningless names such as Test1() or FirstTest(). The names of tests should show exactly what each test is supposed to do. If it is hard to give a simple and concrete name to a test, it might have more than one responsibility. If so, it’s time to break it up by the SingleResponsibilityPrinciple.

    3) Refactoring the long addMoreLegacyCtests() algorithm

    I think that extracting the refactoring steps into another function makes this clearer. 🙂

    addMoreLegacyCtests()
    {
      while (!testsAreSufficientForCurrentNeeds())
      {
        copyPasteTweakTheLastTest();
        refactorTests();
      }
    }

    refactorTests() // or keepTestsClean()
    {
      while (!testDifferencesAreEvident())
      {
        inspectTheCopyPastedTests();
        if (setupStepsAreSimilar())
        {
          extractAndParameterizeTheCommonSetup();
          extractAndParameterizeTheCommonAssertions();
        }
        else
          ; //Maybe a new test case group is needed
      }
    }

    Thanks again, James. 😀

  4. Rick

    Your top-level algorithm nicely shows that the first test is very different from subsequent tests. I like your explanation of makeTestMoreMeaningful() and the application of SRP.

    Nice job refactoring my refactoring steps!

    James

