Accessing static Data and Functions in Legacy C — Part 2

Maybe you read Part 1 of this article. If you did, you’ll know it concerns adding tests to legacy code (legacy code is code without tests). You will also know that the code has file scope functions and data that we want to test directly.

My opinion on accessing the private parts of well designed code is that you do not need to. You can test well designed code through its public interface. Take it as a sign that the design is deteriorating when you cannot find a way to fully test a module through its public interface.

Part 1 showed how to #include the code under test in the test file to gain access to the private parts, a pragmatic thing to do when wrestling untested code into a test harness. This article shows another technique that may have an advantage over the one shown in Part 1. Including the code under test in a test case can only be done once in a test build. What if you need access to the hidden parts in two test cases? You can’t. The second #include defines every global function and data item in the code under test a second time, and that causes multiple definition errors at link time.

This article shows how to create a test access adapter to overcome that problem.

We’d like to get IsLeapYear() under test. IsLeapYear() is a static function from Date.c introduced in Part 1. I’d like to write this test, but IsLeapYear is hidden, so we get compilation and/or link errors.

TEST(Date, regular_non_leap_year)
{
    CHECK_FALSE(IsLeapYear(1954));
    CHECK_FALSE(IsLeapYear(2013));
}

You get compilation errors because Date.h has no declarations for the hidden functions and data. If you overcame those errors by adding declarations to the test file, or by tolerating the related warnings, the linker would reward you with unresolved external reference errors.

The solution is pretty straightforward: write the test using a test access adapter function:

TEST(Date, regular_non_leap_year)
{
    CHECK_FALSE(CallPrivate_IsLeapYear(1954));
    CHECK_FALSE(CallPrivate_IsLeapYear(2013));
}

CallPrivate_IsLeapYear() is a global function declared in DateTestAccess.h like this:

#ifndef DATE_TEST_ADAPTOR_INCLUDED
#define DATE_TEST_ADAPTOR_INCLUDED
 
#include "Date.h"
 
bool CallPrivate_IsLeapYear(int year);
 
#endif

CallPrivate_IsLeapYear() is implemented in DateTestAccess.c like this:

#include "Date.c"
 
bool CallPrivate_IsLeapYear(int year)
{
    return IsLeapYear(year);
}
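To try the mechanics without the real Date.c, here is a minimal single-file sketch. The body of IsLeapYear() is an assumption (the usual Gregorian rule), standing in for whatever Date.c really contains, and a plain assert() stands in for CHECK_FALSE so no test framework is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* ---- Stand-in for Date.c (assumed contents, not the real file). ----
 * In the real DateTestAccess.c this section is just: #include "Date.c" */
static bool IsLeapYear(int year)
{
    /* Gregorian rule: divisible by 4, except centuries not divisible by 400 */
    if (year % 400 == 0)
        return true;
    if (year % 100 == 0)
        return false;
    return year % 4 == 0;
}

/* ---- The test access adapter. ----
 * It lives in the same translation unit as the static function, so it can
 * call it. It has external linkage, so any number of test files can use it. */
bool CallPrivate_IsLeapYear(int year)
{
    return IsLeapYear(year);
}

/* Plain-C stand-in for the TEST(Date, regular_non_leap_year) body */
void test_regular_non_leap_year(void)
{
    assert(!CallPrivate_IsLeapYear(1954));
    assert(!CallPrivate_IsLeapYear(2013));
}
```

Because Date.c is included in exactly one place, the adapter file, the rest of the test build only ever sees the CallPrivate_ declarations.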

You could add similar accessors for private data to DateTestAccess.h:

#ifndef DATE_TEST_ADAPTOR_INCLUDED
#define DATE_TEST_ADAPTOR_INCLUDED
 
#include "Date.h"
 
bool CallPrivate_IsLeapYear(int year);
 
const int * GetPrivate_nonLeapYearDaysPerMonth(void);
 
#endif

Implemented like this:

const int * GetPrivate_nonLeapYearDaysPerMonth(void)
{
    return nonLeapYearDaysPerMonth;
}
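A test can then walk the hidden table through the returned pointer. The sketch below assumes nonLeapYearDaysPerMonth is a static const array of 12 ints holding the usual month lengths; the real definition is in Date.c and may differ. The test function name is made up for illustration.

```c
#include <assert.h>

/* ---- Stand-in for Date.c (assumed shape of the hidden table). ---- */
static const int nonLeapYearDaysPerMonth[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

/* ---- Accessor from DateTestAccess.c ---- */
const int * GetPrivate_nonLeapYearDaysPerMonth(void)
{
    return nonLeapYearDaysPerMonth;
}

/* Hypothetical test, written with assert() instead of a test framework */
void test_non_leap_year_table_adds_up_to_365(void)
{
    const int * days = GetPrivate_nonLeapYearDaysPerMonth();
    int total = 0;

    for (int month = 0; month < 12; month++)
        total += days[month];

    assert(total == 365);
    assert(days[1] == 28); /* February */
}
```

Returning a pointer to const lets the test read the table but not modify it by accident.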

There are some variations of this that could be helpful. If the code under test has problematic #include dependencies, you could #define the symbols that are needed in DateTestAccess.c and also #define the include guard symbol of the problem header, which prevents the real header's contents from being included.
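Here is a rough single-file sketch of that variation. Everything target-specific in it is hypothetical: a header Rtc.h guarded by RTC_H_INCLUDED, a macro RTC_READ_YEAR() that reads a hardware register, and a hidden function CurrentYear() that uses it. None of these come from the real Date.c. The header's contents are pasted inline so you can see what the preprocessor does with them.

```c
#include <assert.h>

/* 1. Test-side definitions, placed BEFORE the code under test is included. */
#define RTC_H_INCLUDED                  /* hypothetical guard of the problem header */
static int fakeRtcYear = 2013;          /* value the test controls */
#define RTC_READ_YEAR() (fakeRtcYear)   /* replaces the symbol the code needs */

/* 2. What the preprocessor sees when the code under test includes the
 *    hypothetical Rtc.h. The guard is already defined, so the body is skipped. */
#ifndef RTC_H_INCLUDED
#define RTC_H_INCLUDED
#define RTC_READ_YEAR() (*(volatile int *)0x40001000) /* target-only register read */
#endif

/* 3. Stand-in for the rest of the code under test: a hidden function
 *    that depends on the symbol from the problem header. */
static int CurrentYear(void)
{
    return RTC_READ_YEAR();
}

/* 4. The test access adapter */
int CallPrivate_CurrentYear(void)
{
    return CurrentYear();
}

/* Hypothetical test, written with assert() instead of a test framework */
void test_current_year_comes_from_the_fake(void)
{
    fakeRtcYear = 1954;
    assert(CallPrivate_CurrentYear() == 1954);
}
```

The price is that the adapter file has to supply every symbol the skipped header would have provided, so this works best when the code under test uses only a few of them.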

I don’t love doing any of this, except when I do. That is limited to when it solves the problem of getting hidden code under test without modifying the code under test. A plus is that the code under test does not know it is being tested, and that’s a good thing. Neither of these approaches is a long term solution. They are pragmatic steps toward getting code under test so that it can be safely refactored and have new functionality test-driven into it.

14 thoughts on “Accessing static Data and Functions in Legacy C — Part 2

  1. Nice. I’ve done similar things in a pinch. Marked things testonly_* and even told the compiler to hide them for a production build.

  2. If you are talking about conditional compilation, I try to avoid it for test access (and most things). #if TESTING is a test smell. I use different builds. For example, Date.c would go into a production code library, while DateTestAccess.c would be with other test code and compiled but not put in the production code library. The test files are linked explicitly as .o files with test main. Public names in the tests override things in the library.

  3. “accessing private parts of well designed code, […] you do not need to”
    Say that you have a rather complex algorithm in 15 steps (new code, not legacy). In the final program, there is only need for one externally callable function that executes all the 15 steps. No need to expose the intermediate steps to the outside world. You’re saying that you are not writing separate tests for all the steps? Couldn’t it be quite daunting to come up with test data that covers the algorithm in its whole, rather than the steps separately? And then when such a test fails, it wouldn’t tell you which step broke.
    How would you solve this?

  4. Hello Gauthier

    Do you have a real problem that illustrates the code you are talking about? In the abstract, it is difficult to come up with a satisfying reply to the problem you pose. But I’ll give it a shot anyway.

    If the client code only needs the one API function, then that API function should be all that it can see. This does not mean that the algorithm is fully implemented in that one module or class. If there are complex things going on in the algorithm that cannot be conveniently tested through the client’s API then the code is telling you it is too complex. Helpers would be extracted to make the code testable. The helpers don’t need to be visible to the client, but they are there.

    On the other hand, let’s say the code can be conveniently and fully tested through the public API. Then there may be no need to separate out and make visible the inner workings.

    Take a car as an analogy. The driver of the car knows to press and release the accelerator, read the tachometer, speedometer, and check engine light. That is the API. It does not mean that the drive train should not be made of separate modules such as fuel injector, manifold, …, transmission. Open the hood and there is access to the different parts. Each part would be independently tested or verified. I suspect that assembled engines are tested before installing into the car. Note the engine is not welded to the frame or the transmission. They are separate components.

    Take an integrator as an example. Input comes from some source and is fed to the integrator. Internally the integrator keeps a FIFO of the values used in its calculations. One client of the integrator puts values into the integrator, another client takes integrated results out. I could test the FIFO’s various boundary states indirectly through the public API but it will be pretty inconvenient to do so. That would lead me to extract the FIFO and test it independently even though the client of the integrator does not need to directly use the FIFO.

    If you have a real example in mind, let’s explore it.

  5. I do not have a real example that would be better than your simplified example of an integrator. Once you have extracted the FIFO handling to its own module, its functions are now accessible to other modules, aren’t they? In practice, the FIFO handling that was supposed to be hidden, became accessible.

    This is what makes me itch a bit. To test something it seems you have to make it visible to all the other modules in your code base. I understand that these other modules do not need to use the FIFO, but they could. This feels like an issue to me: I would like to make some functions static and not publish any API to them, but still test them separately.

  6. The user of the integrator has no knowledge of the implementation. The instance of the FIFO is hidden, so outsiders cannot get at the integrator’s copy. Just like you cannot get to your fuel injector directly while you drive your car.

    You probably have some other facilities to help in the hiding. If you program in Java or C#, you can use package scope. If you program in C you can have public header files, kept separate from the C source, as well as essentially private header files. One convention is to have the public headers under an ‘include’ directory and the private headers in the ‘src’ tree with the production code. People would have to not have the ‘src’ directory on their include path. In C++ you can do similarly and use the PIMPL pattern.

    Design involves tradeoffs. I am not willing to trade away the tests.

  7. For the record, it’s not so much hiding code from other people as from other modules that interests me. TDD seems like it would have me break up reasonably sized modules into many tiny ones, resulting in a number of files that is harder to manage. But that may rather be a smell about software architecture.
    Hiding the header files (C) seems like a viable solution, thanks for the hint. And I do hear you about tradeoffs!

  8. Hi,

    My approach in such cases is to replace every static in the code under test with my own PRIVATE macro, defined like this:
    #ifdef TESTING
    #define PRIVATE
    #else
    #define PRIVATE static
    #endif

    That works quite fine for me, but I do in fact have to change the code under test. This can also be done with find and replace, with the disadvantage that I do not have such clearly defined access adaptors as in your approach.

  9. Hi again Gauthier

    The problem is not reasonably sized modules. The problem is big spaghetti code modules. Many people worry about “ravioli code”, but from what I see, spaghetti code is the bigger problem.

  10. Hi dt

    That works too, with the side effect of possible name clashes. I’d prefer to do that with a forced include or a command line switch to avoid adding the conditional compilation pollution to my production code. With a forced include or a command line definition, you could probably just define static to be nothing.

  11. Gauthier Östervall made a comment about wanting to expose variables used in a complex operation. I think you mentioned exposing the inner workings of the complex operation using helper functions and hiding those functions using the file hierarchy. That may work for some situations, but what about the situation where you need the code to run as fast as possible on a slow processor? A function call will slow things down considerably.

    Also, dt mentioned removing the static keyword with a #define directive. This is fine if it is OK for the memory address of a specific variable to change because of the removal of the static keyword. It would not be a good idea if you are doing unit tests of the code as a whole. In that situation, you want the variables to stay at the same production code address. Testing the production code as a whole using some type of unit test software framework will allow you to find bugs resulting from the interaction of the code as a whole.

    I think a better way would be to always store all file level variables in a static structure type variable and then create a pointer to that variable in a section of memory that is not used by the production code. That pointer can be used by unit test code to test the production code. If you are working with old code, the static variables can be combined into a structure type.

    Here’s one way to do it:
    #define EMIT_PRAGMA(x) _Pragma(#x)
    #define UNIT_TEST_MACRO(Type,Variable,SectionName) \
    EMIT_PRAGMA(DATA_SECTION( p ## Variable ,"SectionName")) \
    Type * p ## Variable = &Variable

    The only issue is that a compiler supporting the C99 standard or later has to be used (because of the _Pragma() preprocessing operator).

    What is your opinion?

  12. The approach I am showing is for adding tests to legacy code without changing the legacy code. Once tests are in place, code structure can be improved.

    “A function call will slow things down considerably”

    Hopefully that is more of an exception than a rule. For off-target testing, that performance concern would definitely not be relevant.

    “I think a better way…”

    Interesting approach, but would mean changing code to write a test. I’m not saying I will never do that, I am saying I prefer to not change existing code to get it under test. Moving all file scope variables into a structure is a change to the legacy code I’d not want to do without tests in place already. If you already practice putting all file scope variables into a struct, then this would work quite nicely. It is basically the same idea, adding a level of indirection, as the accessor functions I described.

  13. Hi James,
    I’m quite late for this great post, but after reading it I’m confused about static functions usage. According to the post, I understand static functions should be avoided when possible, and defined as helper functions with their respective public API. In this way we could test them through their API. Is that correct?

  14. Hi Santiago, It’s never too late to learn something new 🙂 Thank you for your question.

    In a TDD and Refactoring environment, static functions are created through extract-method refactoring. The functionality is already tested. So the fact that the function is in-line or extracted does not change that it is tested. I would not say “avoid static functions”. Instead, I’d say avoid static functions that cannot be fully tested through the public interface.

    Sometimes, when you extract a function, you discover that the function actually should be public but in a different module/file. Then, move it and write tests directly for the now-public function. Then it is likely the function is being tested twice. Once indirectly from its old location and now directly in its new home module/file. You may find other places in the code that need that same function, further reducing duplication and improving separation of concerns. In my book, this happens to the TimeService.

    Thanks for commenting!
