How to better pass a lambda in C++. Part 1.
0. Motivation
Sometimes we use “std::function” like a magic box and don’t really care what’s happening under the hood. That gets costly in complex apps that use it intensively. To make an informed decision about where to use it, when to switch to a regular function, or when to look at other options, we need to understand how it really works.
1. Let’s refresh things.
We will compare 2 ways of passing a “std::function” object in C++:
// 1. pass-by-value
void fun(std::function<int(void)> f);
// 2. pass-by-const-ref
void fun(const std::function<int(void)>& f);
Important: in the 2nd case we can’t omit the “const” specifier if we want to call the function with an r-value, because a temporary cannot bind to a non-const l-value reference. For example:
#include <cstdio>
#include <functional>

void fun_call(const std::function<int(void)>& f) { f(); }

int main (int argc, const char* argv[])
{
    // in this case the argument is a temporary (r-value).
    // ok with "const&"; without "const" this would be an error,
    // because a temporary can't bind to a non-const reference
    fun_call(std::function<int(void)>([]{ printf("lambda rv call\n"); return 0; }));

    std::function<int(void)> i = []() -> int { printf("lambda lv call\n"); return 0; };
    fun_call(i); // ok - pass l-value by ref
}
// outputs
// lambda rv call
// lambda lv call
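The binding rule can also be observed with a pair of overloads. This is only a small sketch: `which_overload` and `check_reference_binding` are hypothetical names introduced for illustration, and the return values 1 and 2 are arbitrary markers.

```cpp
#include <cassert>
#include <functional>

// Hypothetical overload pair, used only to observe which reference kind binds.
// Returns 1 for the non-const reference, 2 for the const reference.
int which_overload(std::function<int(void)>&) { return 1; }
int which_overload(const std::function<int(void)>&) { return 2; }

// Exercises the call forms discussed above.
void check_reference_binding() {
    std::function<int(void)> named = [] { return 0; };

    // A named object is an l-value: the non-const reference is preferred.
    assert(which_overload(named) == 1);

    // An explicit temporary is an r-value: only the const reference can bind.
    assert(which_overload(std::function<int(void)>([] { return 0; })) == 2);

    // A bare lambda is first converted to a temporary std::function,
    // so it also ends up in the const-reference overload.
    assert(which_overload([] { return 0; }) == 2);
}
```

The last call is worth noticing: passing a lambda directly to a parameter of type `const std::function<...>&` still constructs a temporary “std::function” object at the call site.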
So what’s the difference, and which version should we use? To answer this question we need to remember that in C++ a lambda is implemented as a function object.
Let’s check 2 cases:
- In the first case, let’s define a function that accepts its argument by const reference:
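#include <cstdio>
#include <cstdlib>
#include <functional>

void call_fun(const std::function<void(void)>& f) {
    f();
}

int main (int argc, const char* argv[])
{
    // define a lambda and store it in a std::function
    std::function<void(void)> no_capture = []() {
        printf("no_capture %d", rand()%30);
    };
    // pass it by const reference
    call_fun(no_capture);
}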
Let’s check what we get by compiling it to LLVM IR:
$clang++ -O1 -S -emit-llvm -std=c++11 ../main.cpp
You can read more about LLVM IR in the LLVM Language Reference Manual.
As output we get:
define i32 @main(i32 %0, i8** nocapture readnone %1) local_unnamed_addr #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
; 1. first - allocate object of type "std::function" on stack
%3 = alloca %"class.std::__1::function", align 16
; 2. construct the std::function from the lambda (the lambda itself is not invoked here)
call fastcc void @"_ZNSt3__18functionIFvvEEC1IZ4mainE3$_0vEET_"(%"class.std::__1::function"* nonnull %3)
; 3. pass pointer to function object to call_fun and invoke it.
invoke void @_Z8call_funRKNSt3__18functionIFvvEEE(%"class.std::__1::function"* nonnull align 16 dereferenceable(48) %3)
; 4. destructor of functor
call void @_ZNSt3__18functionIFvvEED1Ev(%"class.std::__1::function"* nonnull %3) #24
ret i32 0
}
As you can see, it internally creates a function object (functor) and passes a pointer to it to “call_fun”.
- In the second case, let’s define a function that accepts its argument by value:
#include <cstdio>
#include <cstdlib>
#include <functional>

void call_fun(std::function<void(void)> f) {
    f();
}

int main (int argc, const char* argv[])
{
    std::function<void(void)> no_capture = []() {
        printf("no_capture %d", rand()%30);
    };
    // pass it by value
    call_fun(no_capture);
}
Let’s check the result:
define i32 @main(i32 %0, i8** nocapture readnone %1) local_unnamed_addr #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
; first allocation of std::function
%3 = alloca %"class.std::__1::function", align 16
; second(the one we'll pass to function as copy)
%4 = alloca %"class.std::__1::function", align 16
%5 = getelementptr inbounds %"class.std::__1::function", %"class.std::__1::function"* %3, i64 0, i32 0, i32 0, i32 0, i64 0
; construct the first arg
call fastcc void @"_ZNSt3__18functionIFvvEEC1IZ4mainE3$_0vEET_"(%"class.std::__1::function"* nonnull %3)
; call the copy constructor to initialize the second object from the first
invoke void @_ZNSt3__18functionIFvvEEC1ERKS2_(%"class.std::__1::function"* nonnull %4, %"class.std::__1::function"* nonnull align 16 dereferenceable(48) %3)
; invoke call_fun with the copy (you can see that %4 points to the copied std::function).
invoke void @_Z8call_funNSt3__18functionIFvvEEE(%"class.std::__1::function"* nonnull %4)
to label %7 unwind label %10
; destruct both of them
call void @_ZNSt3__18functionIFvvEED1Ev(%"class.std::__1::function"* nonnull %4) #25
call void @_ZNSt3__18functionIFvvEED1Ev(%"class.std::__1::function"* nonnull %3) #25
ret i32 0
}
We can also compile it to an actual binary and inspect the output with Hopper Disassembler.
2. Summary.
That’s the difference. Each time you pass it by value, you get a copy of the std::function object.
Of course, the compiler may optimize away such a stateless std::function and avoid creating any actual objects (you can check this yourself by passing the “-O3” optimization flag instead of “-O1”).
But if your lambda has captured variables (it has state), everything captured by value is copied along with it, which means calls to copy constructors and destructors. If the captured objects aren’t just PODs, they may involve heap allocations and non-trivial destructors, and this can cause performance issues in your code.
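The cost described above can be made visible with a payload that counts its own copies. The sketch below is an illustration only: `CopyCounter`, `call_by_value`, `call_by_const_ref` and the helper functions are hypothetical names, not part of the code analysed earlier.

```cpp
#include <cassert>
#include <functional>

// Hypothetical payload that counts how many times it is copied.
struct CopyCounter {
    static int copies;
    CopyCounter() = default;
    CopyCounter(const CopyCounter&) { ++copies; }
    CopyCounter(CopyCounter&&) noexcept {}
};
int CopyCounter::copies = 0;

void call_by_value(std::function<void(void)> f) { f(); }
void call_by_const_ref(const std::function<void(void)>& f) { f(); }

// Number of payload copies caused by one pass-by-value call.
int copies_when_passed_by_value() {
    CopyCounter payload;
    std::function<void(void)> f = [payload]() { (void)payload; };
    CopyCounter::copies = 0;  // ignore copies made while building f
    call_by_value(f);         // copies the std::function and its capture
    return CopyCounter::copies;
}

// Number of payload copies caused by one pass-by-const-ref call.
int copies_when_passed_by_const_ref() {
    CopyCounter payload;
    std::function<void(void)> f = [payload]() { (void)payload; };
    CopyCounter::copies = 0;  // ignore copies made while building f
    call_by_const_ref(f);     // only a reference is passed
    return CopyCounter::copies;
}

void check_copy_counts() {
    assert(copies_when_passed_by_value() >= 1);
    assert(copies_when_passed_by_const_ref() == 0);
}
```

Passing by value copies the captured payload at least once per call, while passing by const reference copies nothing. With a real payload such as a large string or container, each of those copies may also mean a heap allocation.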
3. Here is one more example.
Let’s imagine we have a code block like this:
int int_val = 10;
std::string some_str = "abc";
std::function<int(void)> m_lambda = [int_val, &some_str]() -> int {
    printf("some_str: %s\n", some_str.c_str());
    return int_val;
};
How could this be implemented without the “std::function” template?
It could be something like this (it’s just the basic idea, not a real implementation):
#include <cstdio>
#include <string>

struct maybe_lambda {
    maybe_lambda(int i_val, std::string& s_val):
        int_val(i_val),
        some_str(s_val)
    {
    }

    // to pass the object by value we need
    // a copy constructor
    maybe_lambda(const maybe_lambda& other):
        int_val(other.int_val),
        some_str(other.some_str)
    {
        printf("copy constructor called\n");
    }

    // move constructor and copy-assignment operator are omitted
    // for simplicity

    // the call operator of a non-mutable lambda is const
    int operator()() const {
        printf("some_str: %s\n", some_str.c_str());
        return int_val;
    }

    // captured variables become
    // members of the functor
    int int_val;
    std::string& some_str;
};

// here we take the object by value
// this will call the copy constructor
void call_fun(maybe_lambda l) {
    l();
}

int main (int argc, const char* argv[])
{
    int int_val = 10;
    std::string some_str = "abc";

    // create the functor
    maybe_lambda some_functor(int_val, some_str);
    some_functor();

    // pass the functor by value:
    // we actually pass a copy, so its captured members
    // are copied too
    call_fun(some_functor);
}
As you can see for yourself, no magic is happening here. The compiler generates something like this for us.
The output is:
some_str: abc
copy constructor called
some_str: abc
The copy constructor is called for our functor, and the same happens with “std::function”. If you’ve captured some “heavy” object by value (for example, a JSON document), it gets copied. You can avoid this by passing the “std::function” by reference or by capturing the JSON by reference. In the latter case, take care that your lambda doesn’t outlive the object it references.
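To see the difference between the two capture modes, we can let the lambda report the address of the object it holds. This is a minimal sketch under stated assumptions: a `std::string` stands in for the “heavy” object, and `Probe`, `copy_sees_original` and the other helper names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>

using Probe = std::function<const std::string*(void)>;

// Hypothetical helper: receives the std::function by value (so the functor
// and everything it captured by value is copied) and reports whether the
// copy still points at the caller's original object.
bool copy_sees_original(Probe f, const std::string& original) {
    return f() == &original;
}

bool by_value_capture_sees_original() {
    std::string heavy(1024, 'x');  // stand-in for a "heavy" object
    // The closure owns its own copy of the string.
    Probe probe = [heavy]() { return &heavy; };
    return copy_sees_original(probe, heavy);
}

bool by_reference_capture_sees_original() {
    std::string heavy(1024, 'x');
    // The closure stores only a reference. This is safe here only
    // because `heavy` outlives `probe` and every copy of it.
    Probe probe = [&heavy]() { return &heavy; };
    return copy_sees_original(probe, heavy);
}

void check_capture_modes() {
    assert(!by_value_capture_sees_original());
    assert(by_reference_capture_sees_original());
}
```

With capture by reference, even a copied “std::function” keeps pointing at the original string, so the 1024 characters are never duplicated. The price is the lifetime rule noted in the comments: once `heavy` is destroyed, calling any copy of `probe` is undefined behaviour.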
I’ve even seen constructions like this:
using Func = std::function<void(void)>;
std::shared_ptr<Func> p = std::make_shared<Func>([]{
///....
});
But if you find yourself writing code like the above, you should take a rest :)
I hope you’ll find this post useful.
Note: If you’ve found any errors in this post or you have any questions, click on the email button and drop me a message :)