C/C++ force-run init code

I’m working (in Microsoft Visual C++ 2005) on some tools for a large MMORPG that my company is developing. One thing I’m working on is a tool that scans the codebase, finds various “magic” tokens in the code, parses the marked code, and auto-generates various snippets of code. (For instance, something that automatically finds enums and generates arrays of strings, to make enum-to-string conversion automatic.)

Anyhow, I find myself with some snippets of code that I need to force-run at game startup time. I’m trying to come up with some way to do that forcing without any actual call being made in some global init() function.

Now, the code is mostly in C, but we have some C++, so I figured I could do something tricky with C++ initializers by declaring a class whose constructor takes a function pointer as one of its arguments, and then defining a single dummy object of that class at file scope with its argument being the function I want to have executed. So at program startup time, all global objects will be constructed, and presto, my code will be run.
Now, that works fine when I put it into a source file, compile that file into a project, and run the project. BUT, when I put it into a source file, compile that file into a library, and link the library into the project, it does NOT work, presumably because the linker strips out what it views as an unneeded variable.
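For reference, here is a minimal sketch of that constructor trick. The names AutoRunner, autoRunner_##func and checkAutoRunHappened are made up for illustration, and the macro body is only one guess at what an AUTO_RUN could expand to; it is not the actual tool output.

```cpp
#include <cassert>

// Hypothetical helper class: its constructor simply calls the function it is given.
class AutoRunner {
public:
    explicit AutoRunner(int (*f)()) { f(); }
};

// Hypothetical macro: defines one file-scope AutoRunner per function.
// The ## paste gives each object a distinct name.
#define AUTO_RUN(func) static AutoRunner autoRunner_##func(func)

// Zero-initialized before any constructor of a global object runs.
static int gSomeVariable = 0;

static int myFunc()
{
    gSomeVariable++;
    return 0;
}

AUTO_RUN(myFunc);

// Call this at the top of main() to confirm that myFunc already ran
// while the file-scope objects were being constructed.
void checkAutoRunHappened() { assert(gSomeVariable == 1); }
```

This only demonstrates the single-project case that already works; it does nothing about the object file being left out when it sits in a library.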

So, I’m looking for two things:
(1) A different way to force code to run at startup time, one that can ideally be done entirely within a single file. That is, in foo.c, I put:



int myFunc()
{
  gSomeVariable++;
  return 0;
}

AUTO_RUN(myFunc);


and presto, myFunc gets run at startup time

or

(2) A way to force a specific variable, or alternatively all the functions and variables in a single file, to get linked even if the linker thinks they are not being used.

Thanks

After some testing, I got this (your option #2) to work by using some arcane linker options—this is under g++/ld, but MSVC++ probably has similar settings. Look for a linker option to specify a starting “undefined reference to <symbol>” (-u <symbol> in ld). This will force creation of the object <symbol> (this is probably the name of your class object with some attached stuff, like a prepended underscore or something).

Here’s my code:


//====================================================================
// myinit.h, the header for the library

class myInit {
public:
  myInit( const char* x ); // just for debugging
  myInit( void (*f)() );
};

//====================================================================
// myinit.cpp, the code compiled to a dll
// (this contains the two initializer objects as well)

#include <iostream>

#include "myinit.h"

using namespace std;

myInit::myInit( const char* x ) { cout << "myinit(" << x << ")." << endl; }
myInit::myInit( void (*f)() ) { cout << "myinit(void(*)())." << endl; (*f)(); }

void dothis() { cout << "dothis() done.\n"; }

myInit A("myinit.cpp");
myInit B(dothis);

//====================================================================
// inittest.cpp, containing main()

#include <iostream>
#include "myinit.h"

using namespace std;

int main( int argc, char** argv ) {
  cout << "In main()." << endl;
  return -1;
}

//====================================================================
// make commands

// Create the library libmyinit.dll from myinit.cpp
// ... the -fpic is not necessary on my system; YMMV
g++ -c myinit.cpp # g++ -fpic -c myinit.cpp
g++ -o libmyinit.dll -shared myinit.o

// Link to the library
// ... the commented-out version doesn't create A or B from myinit.cpp
# g++ -v -g -o inittest.exe inittest.cpp -L . -lmyinit
// ... but this version, by forcing definition of _A, creates them both,
// at least for me
// (note the leading underscore; I could also have used __imp__A).
g++ -v -Xlinker -u -Xlinker _A -g -o inittest.exe inittest.cpp -L . -lmyinit

Hope that’s somewhat readable.

Hmm. I’ll poke around and see if I can find the equivalent linker option. Since I’m autogenerating the relevant function, perhaps I can also auto-generate a linker command file or something like that. I’m definitely trying to find a way to make everything work such that these automatic commands can be added to new files and projects as simply and mistake-proof-ly as possible…

In any case, thanks.
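For what it's worth, the MSVC counterpart of ld's -u is the linker's /INCLUDE:symbol option, and it can also be requested from source with #pragma comment(linker, ...). Below is a rough sketch, not a tested recipe for this project. The names forceLink_foo, runFixups, gFixupsDone and autoRunResult are all hypothetical. The marker is given C linkage so its linker-level name is predictable (a leading underscore on 32-bit x86, none on x64). The pragma has to sit in code that is already being linked, for example a header the main project includes, so this does not keep everything inside one file. An autogenerating tool could perhaps emit that header as well.

```cpp
#include <cassert>

// ---- would live in the library source file (e.g. the autogenerated one) ----
int gFixupsDone = 0;
int runFixups() { return ++gFixupsDone; }
int autoRunResult = runFixups();   // the startup code we want the linker to keep

// Hypothetical marker variable with C linkage (no C++ name decoration).
extern "C" { int forceLink_foo = 0; }

// ---- would live in a header that the executable's own code includes ----
#if defined(_MSC_VER)
  #if defined(_M_IX86)
    // 32-bit x86 prefixes C symbol names with an underscore.
    #pragma comment(linker, "/include:_forceLink_foo")
  #else
    #pragma comment(linker, "/include:forceLink_foo")
  #endif
#endif

// Call from main() to confirm the startup code ran exactly once.
void checkFixupsRan() { assert(gFixupsDone == 1 && autoRunResult == 1); }
```

On compilers other than MSVC the pragma is skipped entirely, so there the -u approach shown earlier is still needed.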

The C++ standard claims that any object of static lifetime, whose creation or destruction has side effects, cannot be dead-stripped by the linker.

Your description is slightly ambiguous, but it sure sounds like it falls under this description, and Microsoft’s compiler is getting this wrong.

However, you might try “insisting” a little bit. Make your global variable a static member of the class, instead of a “free floating” global variable. Perhaps this will remind the compiler of its responsibilities in this regard.

Alternatively, if the functions you’re trying to run at startup have or can have return values, you could skip the class altogether and just declare a global variable of the appropriate type assigned to the function result. In your example


int auto_run = myFunc();

at global scope in your source file ought to be enough, no class necessary. Although it might trigger the same bug, so you might want to do something like:


class IWishIDidntNeedThisClass
{
public:
       static int auto_run;
};

int IWishIDidntNeedThisClass::auto_run = myFunc();
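Put together in one file, the two variants look roughly like this. It is a sketch only: gSomeVariable and myFunc follow the example in the question, myFunc is assumed to return the updated counter, and checkBothRan is a made-up helper for verifying the result.

```cpp
#include <cassert>

int gSomeVariable = 0;

int myFunc()
{
    gSomeVariable++;
    return gSomeVariable;
}

// Variant 1: a plain global whose initializer calls the function.
int auto_run = myFunc();

// Variant 2: the same idea, using a static data member instead.
class IWishIDidntNeedThisClass
{
public:
    static int auto_run;
};

int IWishIDidntNeedThisClass::auto_run = myFunc();

// Call from main(): both initializers have run by then, in definition order.
void checkBothRan() { assert(gSomeVariable == 2); }
```

Within a single source file the initializers run in the order they are defined, so the global gets its value first and the static member second.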

Is this a static library or a DLL?

Does the standard say anything about which modules in a library must be included by the linker, though? My recollection is that it does not – that while it most certainly would be improper for an implementation to strip static objects from an included object file, it is not improper for an implementation to fail to include an object file when there are no symbolic references to it.

The OP’s problem might be solved, somewhat inelegantly, by ensuring that the initializing object is defined within an object file that is guaranteed to be referenced, if such exists in his environment.

It’s a static library.

The assign-the-function-result-to-a-global-variable trick works, IF the global variable is in a different file. Which is my fallback position, but I’d rather figure out how to get it to work all within a single file.
That’s also the problem with trying to ensure that that file contains something else important… that entire file is basically one autogenerated function to do some load-time data fixup. Ideally, no one else ever needs to know anything about that file.

I don’t have my copy of the standard available at the moment. This might be a gray area, but I interpret the clause I mentioned earlier as specifically covering the case you mention: There is a symbolic reference to this object file: the global variable it defines. That’s pretty much the definition of global. And since it has side effects in the code that is included, it’s not dead code and can’t be stripped.

I don’t think most linkers discard whole modules, anyway; I’m pretty sure they’re going to make their decisions on a symbol-by-symbol basis. The “module” is just a segmenting scheme for the user; as far as the compiler is concerned, this could all be in one long file.

Of course, what it should do is irrelevant to solving the problem if that’s not what it actually does. There might not be a good solution to the OP’s problem in this compiler.

At this point, I’d say just call the init function from your main code, and live with the fact you’re exposing it. Given the peculiarities of making this work, it’s clearly a compiler gray area (or just plain wrong by some clause I didn’t find), so your fellow developers aren’t likely to know the details. I’d be inclined to say that “hiding” this call might be more of an issue from a design standpoint than the couple lines of comment you’d need to explain why this random init function is lying around somewhere.

I don’t know the standard, but g++/ld appears to behave the same way (I recompiled my earlier example into a static library to check), so it’s not just Microsoft.

After thinking about the problem a bit more, I tend to agree. Hiding behavior in constructors far away from the main entry point doesn’t sound like it’s good self-documenting practice, but I don’t know MaxTheVool’s reasons for avoiding something like an init() call.