A Code Back-Porting Journey: From C++17 to pincrt (Paradigms for rebasing recent C++ code onto the outdated pincrt)

I recently finished a round of refactoring: porting a third-party project written to the C++17 standard back to the outdated pincrt environment. More precisely, I was integrating VANS, a recent NVM simulator, with zsim. During the migration I learned to handle many implementation issues related to the C++ STL, and I would like to share them here.

Brief Glossary

zsim: a fast and accurate core simulator whose runtime is based on pincrt. It is my primary experiment tool, but for simplicity it does not directly support any detailed DDR memory model.
VANS: an Optane-oriented NVM memory simulator. It is written in C++17 style and is claimed to give the most accurate results against actual Optane NVM hardware. I want to integrate it with zsim to simulate application behavior on a core+NVM system.
pincrt: the basic C runtime specialized for Pin tools such as zsim, derived from STLport and now “maintained” by Intel. It also includes some C++ STL support. Based on the official manual (only 11 pages long) and my hands-on experience, pincrt supports only a limited subset of C++11 and nothing from more recent standards. It supports neither exceptions nor RTTI (run-time type information).

Motivation

I want to integrate VANS with zsim to test a research idea on an NVM system (e.g., for a specific workload, check the total energy consumption and the latency distribution).
I want to practice my C++ skills by reviewing a complete and well-written C++17 project.
Most tutorials tell people how to migrate old-fashioned C++98 code to a recent standard (like C++20). I would like to do the reverse, answering these questions: What are those new features? Why do we need such syntactic sugar? Is it practical? How is it implemented (in a simple way)?

Rebasing process

The first time I copied the VANS source files directly into zsim and tried compiling, I saw 30K+ lines of errors… So let’s fix them properly. I categorized pincrt’s missing language features into loosely-coupled and closely-coupled ones, and I show some of each below with related code examples.

Loosely-coupled features

By loosely-coupled features, I mean those that need little or no change to the VANS source files themselves. I collect the missing pieces (function implementations, constants, …) in a single file, missingfunc.h, and include it at the beginning of each file to minimize the intrusion. Keeping the original VANS code intact is obviously preferable.

String to number conversion

In modern C++ (since C++11), this conversion is (not very) gracefully done by the std::stoX family of functions. E.g., std::stol converts a std::string to a long. In the other direction, number-to-string conversion is handled gracefully by the overloaded std::to_string, also introduced in C++11.

In the ancient C++ times, there were alternative ways to achieve this. See https://stackoverflow.com/questions/7663709/how-can-i-convert-a-stdstring-to-int

For simple conversions I adopt the method from the link above and add an inline function in namespace std that does std::istringstream([a const char*]) >> [your target long variable];. The other numeric conversions are done the same way. More generally, a stringstream can convert a string into a number of any type.
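
A minimal sketch of the stringstream approach, written so that it also compiles as C++98. The namespace compat and the names to_long and to_str are placeholders chosen for this illustration (the post injects its helpers into namespace std instead), and the assertion stands in for the exception that std::stol would throw.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace compat {  // placeholder namespace for this sketch

// Stand-in for std::stol(s), without the pos/base arguments.
inline long to_long(const std::string& s) {
    long value = 0;
    std::istringstream in(s);
    in >> value;
    assert(!in.fail());  // std::stol would throw std::invalid_argument here
    return value;
}

// Stand-in for std::to_string: works for any type that has operator<<.
template <typename T>
inline std::string to_str(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

}  // namespace compat
```

For integers, to_str produces the same text as std::to_string. For floating-point values the formatting differs, since std::to_string uses "%f"-style output.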

For complex conversions, std::stoX can also return the position pos of the first unparsed character and parse in a given radix through the base parameter (see https://en.cppreference.com/w/cpp/string/basic_string/stol). VANS uses this form in one place. There I replace it with sscanf, which gives more control over the string format, but sadly in an intrusive way.

I would like to see a brief but complete implementation of std::stoX someday.
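
One way the pos and base arguments could be approximated is to wrap the C function strtol. This is only a sketch under the assumption that pincrt's C library provides strtol and errno, which I have not verified. The name str_to_long and the namespace compat are hypothetical, and assertions replace the exceptions of the real std::stol.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <string>

namespace compat {  // placeholder namespace for this sketch

// Approximates std::stol(str, pos, base) on top of strtol.
inline long str_to_long(const std::string& str, std::size_t* pos = 0, int base = 10) {
    const char* begin = str.c_str();
    char* end = 0;
    errno = 0;
    long value = std::strtol(begin, &end, base);
    assert(end != begin);     // no digits parsed: std::invalid_argument in std::stol
    assert(errno != ERANGE);  // out of range: std::out_of_range in std::stol
    if (pos) {
        // Number of characters consumed, including skipped leading whitespace.
        *pos = static_cast<std::size_t>(end - begin);
    }
    return value;
}

}  // namespace compat
```

With base 16, strtol accepts an optional 0x prefix, so parsing "0x1fzz" yields 31 and leaves pos at 4.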

Missing methods in vector, map and unordered_map

Pincrt does not implement many methods in std::vector, std::map and std::unordered_map.

at: To index into a container we usually use the [] operator; the STL also provides at(), which throws an exception when the index or key does not exist. To implement it, we add the method in our header to the derived containers that zsim already provides (g_vector, g_map, …). Every exception throw is replaced with an assertion, since pincrt does not handle exceptions:

template <typename K, typename V> class g_map : public std::map<K, V, std::less<K>, StlGlobAlloc<std::pair<const K, V> > > {
public:
    // Like std::map::at, but asserts instead of throwing std::out_of_range.
    V& at(const K& key) {
        assert(this->count(key));
        return this->find(key)->second;
    }
};

vector::data returns a pointer to the underlying C-style array of a vector, given that a vector is stored in a contiguous memory area. It can be implemented by return (T*) this->_M_start;
unordered_map::reserve pre-sizes the map for an expected number of elements to avoid repeated rehashing. I did not find a good way to implement it, but an empty method is okay if we do not care too much about performance.
emplace is also a performance-oriented method. It constructs a new element in place inside the container, without unnecessary copies: programmers pass the constructor arguments of the new element directly to emplace. I only heard of this method today, and I do not like having programmers hint at memory movement through a completely new method; it adds complexity to code migration. Again, replacing it with insert is harmless.
std::array is a container new in C++11 that wraps a C-style array. Here I simply revert the VANS code to a traditional C-style array.
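
The items above can be sketched roughly as follows. The class name compat_vector and the function add_entry are hypothetical; zsim's real g_vector additionally plugs in its own allocator. Unlike the _M_start version, data() here goes through operator[] so that it does not depend on the internals of a particular STL implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical derived vector that adds the methods missing under pincrt.
template <typename T>
class compat_vector : public std::vector<T> {
public:
    compat_vector() {}
    explicit compat_vector(std::size_t n, const T& v = T()) : std::vector<T>(n, v) {}

    // at(): bounds check by assertion instead of std::out_of_range.
    T& at(std::size_t i) {
        assert(i < this->size());
        return (*this)[i];
    }

    // data(): pointer to the contiguous storage, or null when empty.
    T* data() { return this->empty() ? 0 : &(*this)[0]; }
};

// m.emplace(key, value) rewritten as an insert of a pair.
// Returns true if the key was newly inserted.
inline bool add_entry(std::map<int, std::string>& m, int key, const std::string& value) {
    return m.insert(std::make_pair(key, value)).second;
}
```

Like emplace, insert leaves an existing entry untouched when the key is already present, so the rewrite keeps the same observable behavior and only costs an extra copy.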

To add a custom method, I ended up deriving a new class from the original STL container. Is there an easier way to modify an STL class directly?

Tuple and unpacking

C++11 introduces std::tuple to hold a fixed number of values of arbitrary types, generalizing the original std::pair. Since pincrt does not include tuple, I checked the VANS code and found that it only uses tuples of 2 and 3 elements. Therefore I created a new triple class, which is essentially a std::pair<A, std::pair<B, C>>, and defined first, second and third methods for element access. A make_triple helper, analogous to make_pair, packs a triple. To unpack a tuple, the STL provides both std::get<index> and std::tie. For pincrt we can implement the former easily, or modify the VANS code to use triple’s own accessors.
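
A minimal sketch of such a triple, kept C++98-compatible. The names triple, make_triple, first, second and third come from the description above; the namespace compat and the helper struct triple_get are assumptions of this sketch, and the actual class in the repository may look different.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace compat {  // placeholder namespace for this sketch

template <typename A, typename B, typename C>
class triple {
public:
    triple() {}
    triple(const A& a, const B& b, const C& c) : data_(a, std::pair<B, C>(b, c)) {}
    A& first() { return data_.first; }
    B& second() { return data_.second.first; }
    C& third() { return data_.second.second; }
private:
    std::pair<A, std::pair<B, C> > data_;  // nested pairs hold the three values
};

template <typename A, typename B, typename C>
triple<A, B, C> make_triple(const A& a, const B& b, const C& c) {
    return triple<A, B, C>(a, b, c);
}

// get<I>(t): select the element through one helper specialization per index.
template <int I, typename A, typename B, typename C> struct triple_get;
template <typename A, typename B, typename C> struct triple_get<0, A, B, C> {
    typedef A type;
    static A& from(triple<A, B, C>& t) { return t.first(); }
};
template <typename A, typename B, typename C> struct triple_get<1, A, B, C> {
    typedef B type;
    static B& from(triple<A, B, C>& t) { return t.second(); }
};
template <typename A, typename B, typename C> struct triple_get<2, A, B, C> {
    typedef C type;
    static C& from(triple<A, B, C>& t) { return t.third(); }
};

template <int I, typename A, typename B, typename C>
typename triple_get<I, A, B, C>::type& get(triple<A, B, C>& t) {
    return triple_get<I, A, B, C>::from(t);
}

}  // namespace compat
```

With this in place, a call such as std::get<2>(t) in the original code becomes compat::get<2>(t), or simply t.third().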

The register keyword

I learned the register keyword as an undergraduate but have never used it since that class. C++11 deprecates it and C++17 finally removes it. Although it is another example of “programmers control data movement”, register is defined in a better way than emplace. Sad about the loss… Now our C++17 compiler emits warnings for the register variables in zsim’s mtrand.h, so we have to add the -Wno-register flag to CXXFLAG in SConstruct.

Get the maximal value of a type

This can be done with the static method std::numeric_limits<YourType>::max(), available since C++98. C++11 makes it constexpr so that it can be used in more places, e.g., the enum definition in VANS’s common.h. For pincrt, thankfully, we can use the equivalent XXX_MAX macros such as ULONG_MAX. Note that max() lives in the header limits while ULONG_MAX lives in climits. This example shows both the flexibility and the historical baggage of the old C++ STL.
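
A small illustration of the difference. The enumerator name kNoDeadline is made up for this sketch and is not taken from VANS's common.h.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// With C++11 and later, max() is constexpr and can initialize an enumerator:
//   enum { kNoDeadline = std::numeric_limits<unsigned long>::max() };
// Before C++11, max() is an ordinary function call, so the macro is used instead:
enum { kNoDeadline = ULONG_MAX };
```

Both forms denote the same value; only the macro is a constant expression in pre-C++11 code.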

Closely-coupled features

Unfortunately, in many cases we have to modify a lot of code (taking its context into account) to adapt to pincrt. There is no standard recipe.

Exceptions

Exception handling (try, catch, throw) has to be changed case by case: since exceptions influence the execution flow, I have to check the surrounding code before modifying it. My overall principle is to replace throw with assert, which aborts the program and still shows an error message.
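
As a rough illustration of this principle, the sketch below replaces a throw with a helper that prints a message and aborts. The helper fatal and the function next_slot are hypothetical examples, not code from VANS.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: report the error and abort, standing in for `throw`.
inline void fatal(const char* msg) {
    std::fprintf(stderr, "fatal: %s\n", msg);
    std::abort();
}

// Original style (C++17 build):
//   if (size == 0) throw std::runtime_error("queue size must be non-zero");
// Rewritten without exceptions:
inline unsigned next_slot(unsigned head, unsigned size) {
    if (size == 0) fatal("next_slot: queue size must be non-zero");
    return (head + 1) % size;
}
```

Compared with a plain assert, an explicit helper like this keeps the check active even when NDEBUG is defined. Code that used catch to recover from an error cannot be converted this way and needs an explicit error return instead.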

Another unordered_map issue

C++11 allows brace-enclosed initialization of containers. For example, one class definition contains std::vector<std::string> a = {"a", "b", "c"} to make class initialization easier. I have not found a good way to replace it in-line. Instead I push_back the elements one by one in the class constructor.
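
A possible variation on the push_back approach is to keep the values in a static C array and copy them with the iterator-range assign, which already exists in C++98. The class name command_table and the member names are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class command_table {  // hypothetical class for illustration
public:
    command_table() {
        // C++11 original: std::vector<std::string> names = {"a", "b", "c"};
        static const char* const kNames[] = {"a", "b", "c"};
        const std::size_t n = sizeof(kNames) / sizeof(kNames[0]);
        names.assign(kNames, kNames + n);  // each const char* converts to std::string
    }
    std::vector<std::string> names;
};
```

This keeps the list of values in one place, close to how the brace-enclosed initializer reads, at the cost of moving it from the member declaration into the constructor.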

Shared pointers

C++11 provides std::shared_ptr as its “smart pointer”. VANS uses it heavily, but pincrt does not support it. I tried many ways to work around this, and none worked. Then I found that the STL’s shared pointers were adopted from Boost. So I replaced every std::shared_ptr with boost::shared_ptr and included boost/shared_ptr.hpp. Remember to configure the Boost include path correctly in SConstruct. This solves all the shared_ptr problems.

make_shared is trickier. Directly including Boost’s make_shared.hpp causes more problems, since Boost uses C++11 allocators, which pincrt does not have. This looked like a dead end. Finally I read Boost’s code and found that the allocation behavior (and more) can be controlled by macros, which solved the pointer issues… phew.

The configuration that worked for make_shared is:

#define BOOST_NO_CXX11_ALLOCATOR
#define BOOST_NO_CXX11_RVALUE_REFERENCES
#define BOOST_NO_EXCEPTION
namespace std {
    class bad_alloc : public std::exception {}; // Define an empty placeholder class
}
#include <boost/make_shared.hpp>

The snippet above shows the flexibility of Boost: its behavior can be controlled through options, just like many other Linux utilities. But this creates a heavy workload for Boost developers and makes the code less readable. What would be a better way? I do not know.

Lambda function

VANS uses lots of lambda functions in its code. It even uses a 2-D matrix of functors for its state transitions. Basically, VANS keeps a lambda (of type std::function) in the base_request class, to call back the target function when a request finishes. It also took me many days to work around the lambda function type.

Finally my professor told me how to do it: for each callback function, create a functor (a function class) and overload its operator() with the body of the target callback. If the lambda captures any variables (e.g., VANS captures this many times in ait.cpp and rmw.cpp), add them to the new functor’s constructor parameter list. Since all the functors derive from one BaseCallbackFunction class, we use it as the type of the callback fields in base_request. This is the moment when I realized the beauty of C++’s complexity.
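
The functor technique can be sketched as follows. BaseCallbackFunction and base_request are the names mentioned above, but their contents here are simplified stand-ins; buffer_model, ReadDoneCallback, the callback signature and the finish method are all assumptions made for this illustration.

```cpp
#include <cassert>

// Common base class, so a request can hold any callback through one pointer type.
class BaseCallbackFunction {
public:
    virtual ~BaseCallbackFunction() {}
    virtual void operator()(unsigned long addr) = 0;
};

// Simplified stand-in for a request; the callback used to be a std::function.
struct base_request {
    unsigned long addr;
    BaseCallbackFunction* callback;
    base_request(unsigned long a, BaseCallbackFunction* cb) : addr(a), callback(cb) {}
    void finish() { if (callback) (*callback)(addr); }
};

// Hypothetical component that originally passed a lambda capturing `this`.
class buffer_model {
public:
    buffer_model() : completed(0), last_addr(0) {}
    void on_read_done(unsigned long addr) { ++completed; last_addr = addr; }
    unsigned completed;
    unsigned long last_addr;
};

// Replaces: [this](unsigned long addr) { this->on_read_done(addr); }
// The captured `this` becomes a constructor parameter stored in a member.
class ReadDoneCallback : public BaseCallbackFunction {
public:
    explicit ReadDoneCallback(buffer_model* owner) : owner_(owner) {}
    virtual void operator()(unsigned long addr) { owner_->on_read_done(addr); }
private:
    buffer_model* owner_;
};
```

One design consequence is ownership: a std::function owns a copy of its lambda, whereas a raw BaseCallbackFunction pointer does not, so the functor object must outlive every request that refers to it.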

Result

This post shows only part of the rebasing problems and tricks. It took me almost 7 nights in one week to identify and fix all the mismatches. Finally zsim is capable of simulating NVM operations. The detailed method of zsim integration is another story. The final repository is at https://github.com/leepoly/zsim/tree/vans. I found that such a code migration can dramatically improve your understanding of C++ syntactic sugar: why it exists and how it is implemented. If you have better ways to do the conversion, please comment.