CRTP "interface" for delete operator?

Pages: 12
I am using a third party library that includes the source code which I need to alter. The main reason is that I need to manage the memory used by the library using a custom memory manager. While considering my options, I was curious if it is possible to write a simple CRTP interface (say... ICustomDeleter) that uses customized a delete method. The point is to call a template sizeof calculator when an object is deleted to eliminate the need for size fields for each memory allocation. I can alternatively accomplish this with virtual functions but I thought the template function would be easier to manage in the long run.

This seems impossible however since the CRTP is essentially a base class and as I understand it, delete operator overloads are only for a specific class (the overload would only be called if delete is called on the base class).
Last edited on
There are so many ways to implement what you describe, but since I don't have much context of your actual needs I must be quite general without further exchange.

To your point about CRTP, I'm not entirely sure what you intend, but it seems you may be thinking of an "intrusive" solution?

A great deal of custom memory management can be done externally to the target or subject. For example, there's an entire paradigm of custom memory allocation associated with some "smart pointer" classes (not necessarily from the STL). However, for a common example, consider std::shared_ptr using make_shared<>. This technique allocates an object that is unified with the node (saving time and space). Without this technique, the node uses a pointer to the target (the parameter type to shared_ptr). This is a two part design, with separate allocations for the node and the parameterized type. However, make_shared allocates one block of memory for the node sufficient to accommodate the storage for the target type, then uses in place new and in place delete to handle construction/destruction of the object, while the node's otherwise "natural" destruction takes are of the actual deallocation.

It is further, then, possible to take this notion to the extent of allocating the node/object pair from a block of RAM (like an array), which is even faster (though not using make_shared).

This "external" method is more common than what you described.

However, to the point of CRTP - it is a useful and interesting tool, but a related yet differing strategy I find even more generally useful is policy based design, where the parameter to a template becomes the base from which the class is derived. The technique allows one to build a resulting class from a collection of options - a static way of morphing the result at compile time, avoiding virtual base classes.

Ultimately what your strategy may really require is the size of the object, not necessarily its entire type, which is then a type of integer parameter to a template, a much simpler thing to implement.

So, let's exchange more specifically about what you require.
Last edited on
I admit I don't fully understand the goal, but recall that CRTP supplies a unique base class to each derived class.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#include <iostream>
#include <new>
#include <type_traits>

template <typename Derived>
struct custom_deleter_base {
  void operator delete(void*) 
  {
      std::cout << "I'm here!\n";
  }
  
protected:
  ~custom_deleter_base() = default;
};

struct foo : custom_deleter_base<foo>
{};

int main()
{
    std::aligned_storage_t<sizeof(foo), alignof(foo)> storage;
    foo const* x = new (&storage) foo;
    delete (x);
}

Edit to fix main function
Last edited on
I am new to memory management so feel free to point out bad ideas if they exist. I am sure that everyone considers efficiency to be a goal of custom memory management but with me it is particularly important because of the complexity and wide range of computing devices that will eventually use what I am building. Both memory and speed need to be optimized.

My basic idea for memory management is to allocate large blocks of memory for dynamic allocations and static allocations independently. The memory manager tracks usage and makes changes between sessions (learning over time so that memory use is optimized).

Static allocations (allocations for non-array objects) are done in multiples of the size of a pointer... (8 nowadays) where there is a class that keeps track of statistics and unused memory buckets of multiples of 8 that are used by the program. I designed it this way to minimize the needed memory (no overhead for saving allocated block sizes) and to minimize the time it takes to allocate memory (pulling an unused memory block off a stack). This is the type of allocation that inspired this discussion since I need to know the size of the memory block being deleted in order to push the "de-allocated" memory block onto the correct stack (and adjust appropriate statistics when necessary)

Dynamic allocations (for arrays where the size is not known until allocated) use a fairly traditional approach except that instead of a linked list of sizes, I use a variation on a red-black (AA) tree to cut down on big O for finding an appropriately sized block.
Last edited on
I've written custom allocation support along the lines of your description myself, in C++, many years ago (and I still use that library).

I'd recommend you study the basic notion of inplace new and inplace delete, then apply them along the lines of make_shared, but where storage is provided from the bulk resource you mention (like an array of objects). If you're object does not require shared (reference counted) containment, then you only need consider a customized interpretation of the unique_ptr design. Where you do require shared containment you'll want to allocate the node and the object as unified whole, while still taken from a bulk resource.

This doesn't require or even imply the use of CRTP, but of more conventional templates and likely a class (or family of classes, as appropriate) representing the storage resource, and a "smart pointer" metaphor to implement RAII on that storage. In my own work some attention was paid to thread load storage for threaded support, particularly where de-allocation came from a different thread than the allocation. This allowed for an approach that eliminated the need to lock the storage resources for allocation, since each thread operated it's own, and subsequently to 'queue' deallocations 'after the fact' even though object destruction occurred "in real time" as expected (destruction is not deferred, but deallocation could be).

This places no requirements upon the classes targeted by this approach, no more than the requirements imposed on classes for use with unique_ptr or shared_ptr. Indeed, if such requirements existed they'd be imposed by the design of make_shared or make_unique, and they're not. An examination of the source for make_shared is informative (as is make_unique) in the technique, and can easily be expanded for a paradigm where bulk pre-allocated storage is the source, rather than a malloc like resource.

At the risk of over simplification, imagine one allocates a raw block of RAM on a void *, or perhaps a char * for the convenience of byte level measurement. If a class' 'sizeof' were 100 bytes, then using inplace new and inplace delete allows for construction and destruction of objects placed at each 100 byte boundary within the block of RAM, without ever calling new directly. The counterpart to "make_shared" or "make_unique" would perform the construction, placing the object where it should exist. Conversely, the destructor of the RAII manager (shared_ptr or unique_ptr in the case of the standard library) would selectively choose a custom destruction and subsequent deallocation when leaving scope, where destruction and deallocation are two separate steps within that destructor.




Last edited on
I should probably add that I am not really familiar with the specifics of several of the things you reference. While I get the basics of shared and smart pointers, I am not familiar with the specifics that you refer to in terms of how they work (though I can certainly research them). Most of my understanding of C++ and computer science in general is self taught, though I have taken general computer science classes in addition to a class in data structures and algorithms. I know the language pretty well but I am not very familiar with any libraries other than Qt (which I decided to use for this project because of its cross-platform GUI and extensive collection of other classes that are useful for my needs).

All I am saying is that I could use a starting point to learn about strategies you are suggesting.

EDIT: For example, in your later paragraphs (and elsewhere), you use the term "in place". Not sure what this means. As mentioned I don't know the std namespace well enough to know how make_shared or make_unique handle/manage memory. Don't worry about oversimplifying things. With the exception of explaining language basics, it usually helps.

EDIT2: Keep in mind that if possible, I want to minimize modifying 3rd party library code (I am currently using two third party libraries). If I am to maximize speed, I need to override the general delete operator and somehow make it size aware if possible.

EDIT3: I probably should have brought up that last point sooner as it is the main reason that this is all an issue. Managing my own libraries and classes isn't really a big deal, especially after learning some things from your suggestions. It is the incorporation of third party libraries that is the source of the issues here... if possible, I want to avoid rewriting large portions of these libraries.
Last edited on
My apologies, while reading your post I had the impression of a seasoned engineer so I chose a terse, dense reply. In reality most of us are self taught no matter what degrees we got, with the possible exception of the most recent graduates, and their days as autodidacts are fast upon them, too.

What is more formally called "in place syntax" applies to "new" declarations.

For any type T, new T() creates an object from the heap, calling a construct and optionally taking parameters.

Given a raw block of memory, say void * v = malloc( 1024 ):

::new( v ) T();

This calls the constructor for T, having placed the object at the location v, without other allocation.

This technique is used in the make_shared approach for shared_ptr. That would be used thus:

std::shared_ptr<T> p = std::make_shared<T>();

Where optional parameters for T's constructor are placed typically inside the parenthesis. This fashions a new T inside a node "owned" by p. Subsequently p is used as any raw pointer for access, as in p->x. When p falls from scope, deletion is automatic. However, the object T is constructed in place, not by a separate "new T()", but by "new(v) T()", where v is position immediately after the node structure, itself allocated to accommodate all the space required by T. This creates one allocation for two C++ structures.

Subsequently, the destructor can be called explicitly, assuming rp is a raw pointer of type T, with:

rp->~T();

While this calls the destructor, it takes no action with regard to deallocating storage. That would be handled separately.

In modern C++ there is a paradigm of exclusive use of either shared_ptr, unique_ptr or scoped_ptr instead of raw pointers. As such, there is seldom any use for an explicit call to delete. Despite copious tutorials invoking the standard use of "new" and "delete", that usage is consider an older style more associated with pre-C++11. There is virtually zero overhead using unique_ptr or scoped_ptr, as they are uncommonly lightweight considering their function. They have certain restrictions appropriate for their use cases (unique_ptr can be moved but not copied).

std::unique_ptr<T> p = make_unique<T>();

Is effectively the same as:

T * p = new T();

With the exception of the exact type of p. p->x or p->foo() is the same for either case. The main difference is there is no use for "delete p;" in the first case, because unique_ptr<T> performs destruction upon release from scope.

std::scoped_ptr is nearly identical in most respects, but does not allow moves or copies (it guards its charge very jealously).

std::shared_ptr functions similarly with the expanded feature of reference counted ownership. It is particularly suited for threaded design where "ownership" is not explicitly known. That is:

1
2
3
std::shared_ptr<T> p = make_shared<T>();
std::shared_ptr<T> p2 = p;
std::shared_ptr<T> p3 = p;


This creates what is conceptually three pointers to the same T. In reality, while p->x and p2->x and p3->x all point to the same instance of T, they exert "ownership" (deletion, ultimately) through a node structure which counts how many of these shared pointers are "holding" that instance. It will not matter when any of these pointers fall from scope, even from other threads. The underlying instance will be maintained until the last of these fall from scope. Deletion is assured at that point.

At one point in early versions of shared_ptr, it was compatible with the standard use of new. This was valid in older versions from boost (from which shared_ptr was taken), so this was valid:

std::shared_ptr<T> p = new T();

Modern compilers generally don't allow this anymore without configuration. For a while (through the pre-C++11 era), make_shared was considered an optional performance enhancement. The source can still be examined in the current boost source for it.

If you have no interest in shared containment (it's slower), an all inline smart pointer is so lightweight as to nearly evaporate into nothing but awareness and automatic deletion.

A naive example would be:
1
2
3
4
5
6
7
8
9
10
template <typename T> spointer
{
 private:
 T * p;
 public:
 spointer( T *i ) : p( i ){};
 ~spointer() { if ( i ) delete i; }

 T* operator ->() { return p; }
};


It's major failing is ignoring assignment or copy construction. Alexandrescu would caution against any named member functions. There is no "get" function to get the pointer, only the -> operator. The std::scoped_ptr denies copies or assignments, while unique_ptr offers to move them, leaving the source "empty" (nullptr). This is necessary to avoid deleting the underlying instance owned by the smartpointer multiple times (crashing). std::shared_ptr offers copies through reference counting. Alexendrescu offers a good treatise in his 2001 book, "Modern C++", on the notion of a conversion operator to T *, so approach with caution. It is an option, but with thorns. In reality all copying of pointers, even in C, is fraught with peril. Ownership is unclear without serious attempt at clarity. That is the origin of reference counting. Deep copy, creating a duplicate instance, is an alternative, but unattractive.


If you contemplate the in place construction and explicit call of the destructor from an RAII manager (that is, a smart pointer), you should come to the realization that these techniques impose virtually zero demands upon the target classes they manage. If "new" and "delete" work on a struct or class, these techniques are generally applicable without alteration to the classes.

Since such a "smart pointer" is fully aware of the type (it's size, etc), you are in complete control of customized allocation/deallocation strategy.

I have not reviewed the current versions of shared_ptr to verify this issue. The last I worked on this in earnest was when I created a smart pointer library now used in high performance real time application work in the pre-C++11 era. However, the reason I wrote a new smart pointer in the first place was the discovery that shared_ptr could not be used with genuinely custom allocation strategies. It was an insidious problem, because 99% of the work was completed in my case, using the parameter clearly named as if custom allocation were possible. The problem wasn't allocation, though, it was deletion. For whatever reason, the original shared_ptr from that era would not support the notion of custom deletion. To use shared_ptr, I would have to alter the source. Since that was from boost at the time, it was an option to explore, but ultimately the design requirements included new features not found in shared_ptr, so a new smart pointer was created, including one similar to unique_ptr (before even auto_ptr existed).

That issue may have been resolved in recent versions of shared pointer - research would likely say by searching for custom allocation articles/papers regarding std::shared_ptr in modern C++.

If you want some research, a work by Alexandrescu from 2001 discusses a policy based smart pointer design called loki (or the loki library). Although it is older, the code and work is still quite relevant on the subject, particularly with advice on how to create the interface for smart pointers. It does not, however, include variadic templates, but that is actually a simple read to catch up and is pertinent for creating a counterpart to make_shared (it nearly becomes trivial, you'll need maybe an hour, if that, to read and then implement the techniques).

The CRTP you reference is also discussed in Alexndrescu's book, but policy based design is a related notion. Instead of base definition of CRTP, policy based templates simply inherit from the parameter, as in:

1
2
3
4
5
6

template< T, B > class A : public B
{
 T v;
};


B is called a policy.

If B is similarly declared, it may also have a policy (or policies), going as deep as required.

Loki constructs smart pointers using policy based design in order to select behavior as required.
Last edited on
Wow. Thanks Niccolo for that in depth explanation. I believe I followed it without difficulty. So basically I can create my own "wrapper" for pointers with inline access and they can use in place construction after using a custom allocator and then use whatever function I wish to use in order to "deallocate" the pointer when the pointer is no longer needed.

Since my application will be multi-threaded, I will probably need to create a shared/reference counted version as well.

EDIT: Interestingly, one library already has a reference counted base class for almost everything it uses so all I have to do is modify that. Qt on the other hand will be a bit more challenging. Actually, since it uses a lot of PIMPL based classes, I may opt for another GUI if I can find one that is as handy.
Last edited on
@primem0ver,

I'm glad you found that useful.

In place construction and explicit call of the destructor was originally intended for use where objects must be explicitly placed at a particular location, where some hardware maps memory for I/O.

That can be done more primitively in a C style sense where constructors and destructors are obviated, and one merely uses a pointer to some arbitrary location in memory, but then one must either explicitly deny standard construction or personally chose not to use them (perilous at best in some C++ design). The in place construction stands in for an explicit call to the constructor functions, countered by the syntax for explicit call of the destructor.

I would caution against the wording "deallocate the pointer", but I easily understand what you mean. Pedantically, the deallocation is upon what the pointer is pointing to, not the pointer itself ;)

Sorry, it's just a minor detail.

I've used Qt, and I have to say I'm not a fan. I know it's quite popular, and I use it when I write plugins for 3DS Max and a few other AutoDesk products (they use Qt now - didn't a while back).

Instead I can heartily recommend WxWidgets. It's robust, been around a long, long while and can produce excellent results on most of the same platforms. Like Qt, it does try to do more than is really required, but then both come from a time long before C++11, so some of that is historical.

What you describe is known as an instrusive reference counting paradigm (it requires derivation from the reference counting base). The unified node/object structure created by "make_shared" is an answer to that.

The intrusive approach was long preferred for high performance, because of the single allocation feature (a node pointing to an object is a two part design, but it isn't intrusive - it places no requirements on the target).

The "make_shared" approach offers the same benefit as the old (very old) intrusive pattern but without the intrusion.

WxWidgets also comes from the pre-C++11 era (pre-C++03, too). It has been well updated, and is currently maintained. It is slightly faster at GUI because it does not paint it's controls, it uses the OS native controls.

I'd suggest you search for "loki alexandrescu" - while his book is still available, the loki library (updated since the book) is online, more readable than boost or standard library code, and is a fair example of design.

Really, though, all you need is to review std::atomic, probably just for the unsigned integer. That makes for a thread protected reference counter automatically.

Loki might not help with the unified node/object paradigm (which is highly related to all custom allocation of the same), but the rest of a smart pointer design is well discussed. I'm sure there are more recent treatise on the subject, sure, but it's one I know of. I've not seen recent updates. The original(s) did not have variadic template parameters, but that's a quick study. It makes the counterpart to implementing "make_shared" trivial where forwarding the constructor parameters is required.

I assume I'll hear from you as you dive in. Best of luck.
Last edited on
@Niccolo

Slightly OT but since you mentioned it, I might as well ask. There are two things about Qt (beyond its comprehensiveness) that I would be hard pressed to do without. So I am curious if WxWidgets has these (or similar) features as well.

1) I really like the signals and slots feature of Qt for two reasons. It is similar to windows message handling that is used in .NET (events and delegates) which makes programming a UI very straightforward and simple. More importantly, it allows me to call a slot (function) using a string name. My project involves plugins that use XML defined UI objects and corresponding handlers. I can specify a string function name in XML and connect it directly to a slot call in Qt. This is an important functionality required by my project and I would need the same ability in WxWidgets.

2) Does WxWidgets have something like QtCreator for designing the UI? While not absolutely necessary, this would be a very helpful time-saver.
Last edited on
One additional thing I would need is the ability to create custom controls. Can I do that in WxWidgets?
WxWidgets is quite good for custom controls.

It also uses at least 2 (maybe 3) paradigms for messages. The "traditional" (from MFC) define tables, which are still supported by all but deprecated.

Another is a "bind" like facility, which binds messages to response functions without message ID tables.

Because of the "bind" like message approach, it would be trivial to arrange binds by text configuration, but I'm not sure if they duplicated this Qt concept already or not. Bind is, essentially, a delegate.

Signals and slots have become a feature of the C++ standard library, taken from boost.

However, one of the things Qt does is implement reflection, and that's not in WxWidgets.

There are UI designers (at least 2) for WxWidgets.



Ok. In the course of designing these pointer classes that have been discussed, I have run into a problem I am not sure how to solve (or if it can be).

Lets say that I want to create an in place constructor for some unknown type T. Now lets say that the constructor for type T requires a parameter list. I want to create an in place constructor for a list of parameters that are of an unknown length and type. Is it possible to do this? Example:

1
2
3
4
5
6
7
// where p is of type T*

MyPointerClass::CreateObject(...) // accepts unknown parameter list
{
	p = (T*)memoryManager.GetNew(sizeof(T));
	::new(p) T(...) // passes unknown parameter list
}
Last edited on
Look into variadic templates.

Yes, there's an answer there.

Oh, and this is also the purpose of the && (reference to reference) added to C++ of late, a subject sometimes referred to as "perfect forwarding" of parameters.

A hint from a sample I have in my own archives:



1
2
 template< typename ST, typename A, typename... Args > 
 inline bool  New( ProMemory::NodeAllocator< ST, A > const & a, Args &&... args ); 



Also...look into std::forward< Args >( args )...
Last edited on
Thanks for the pointers. (har har...;)
Last edited on
The syntax for an in place new operator as I can find documentation is not being accepted (I am using VS 2017). What am I doing wrong?

1
2
3
4
5
// inside class
static MyGlobalObject* _singleton;

void* p = malloc(size); //some calculated size that will hold a block of data, starting with _singleton)
_singleton = new (p) MyGlobalObject();  // gives error: function "operator new" cannot be called with the given argument list 



My first thought, albeit not well informed by context otherwise, is to ensure global scope with ::new( p )
Don't forget to #include <new>.
Thanks mbozzi. That was the issue. I find it strange that a keyword needs a header (even if technically it is a function).
I am new to memory management
and yet, you think you can write a better general purpose memory manager than what comes with the compiler.
Static allocations (allocations for non-array objects) are done in multiples of the size of a pointer... (8 nowadays)
So if you allocate a 32-bit int, it will waste 4 bytes.
Both memory and speed need to be optimized.
Are you sure? did you measure? How much space do you need to save? How much space will your memory manager save over the stock manager? How much faster will it be? Will these changes save more memory and time than other changes?

I urge you to just write the application code with the stock memory manager. When you're done, profile the code and make whatever optimizations are called for. If space is an issue, concentrate on compressing your data. If you still need more time and space, that is the time to consider rewriting the memory manager.
Pages: 12