<section class="dex_guide">
<section class="dex_document">
folly::fibers is an async C++ framework, which uses fibers for parallelism.
Fibers (or coroutines) are lightweight application threads. Multiple fibers can run on top of a single system thread. Unlike system threads, all context switching between fibers happens explicitly. Because of this, every such context switch is very fast (about 200 million fiber context switches can be made per second on a single CPU core).
folly::fibers implements a task manager (FiberManager), which executes scheduled tasks on fibers. It also provides some fiber-compatible synchronization primitives.
...
folly::EventBase evb;
auto& fiberManager = folly::fibers::getFiberManager(evb);
folly::fibers::Baton baton;

fiberManager.addTask([&]() {
  std::cout << "Task 1: start" << std::endl;
  baton.wait();
  std::cout << "Task 1: after baton.wait()" << std::endl;
});

fiberManager.addTask([&]() {
  std::cout << "Task 2: start" << std::endl;
  baton.post();
  std::cout << "Task 2: after baton.post()" << std::endl;
});

evb.loop();
...
This would print:
Task 1: start
Task 2: start
Task 2: after baton.post()
Task 1: after baton.wait()
It's very important to note that both tasks in this example were executed on the same system thread. Task 1 was suspended by the baton.wait() call. Task 2 then started and called baton.post(), resuming Task 1.
The only real downside to using fibers is the need to keep a pre-allocated stack for every fiber being run. That either makes your application use a lot of memory (if you have many concurrent tasks and each of them uses a large stack) or creates a risk of stack overflow bugs (if you try to reduce the stack size).
We believe these problems can be addressed (and we provide some tooling for that), as the fibers library is used in many critical applications at Facebook (mcrouter, TAO, Service Router). However, it's important to be aware of the risks and be ready to deal with stack issues if you decide to use the fibers library in your application.
std::functions used for callbacks, context objects to be passed between callbacks etc.) </section><section class="dex_document">
Let's take a look at this basic example:
...
folly::EventBase evb;
auto& fiberManager = folly::fibers::getFiberManager(evb);
folly::fibers::Baton baton;

fiberManager.addTask([&]() {
  std::cout << "Task: start" << std::endl;
  baton.wait();
  std::cout << "Task: after baton.wait()" << std::endl;
});

evb.loop();

baton.post();
std::cout << "Baton posted" << std::endl;

evb.loop();
...
This would print:
Task: start
Baton posted
Task: after baton.wait()
What makes a fiber-task different from any other task run on e.g. a folly::EventBase is the ability to suspend such a task without blocking the system thread. So how do you suspend a fiber-task?
fibers::Baton is the core synchronization primitive which is used to suspend a fiber-task and notify when the task may be resumed. fibers::Baton supports two basic operations: wait() and post(). Calling wait() on a Baton will suspend the current fiber-task until post() is called on the same Baton.
Please refer to Baton for more detailed documentation.
fibers::Baton is the only native synchronization primitive of the folly::fibers library. All other synchronization primitives provided by folly::fibers are built on top of fibers::Baton.
Let's say we have some existing library which provides a classic callback-style asynchronous API.
void asyncCall(Request request, folly::Function<void(Response)> cb);
If we use folly::fibers we can just make an async call from a fiber-task and wait until callback is run:
fiberManager.addTask([]() {
  ...
  Response response;
  fibers::Baton baton;

  asyncCall(request, [&](Response r) mutable {
    response = std::move(r);
    baton.post();
  });
  baton.wait();

  // Now response holds response returned by the async call
  ...
});
Using fibers::Baton directly is generally error-prone. To make the task above simpler, folly::fibers provides the fibers::await function.
With fibers::await, the code above transforms into:
fiberManager.addTask([]() {
  ...
  auto response = fibers::await([&](fibers::Promise<Response> promise) {
    asyncCall(request, [promise = std::move(promise)](Response r) mutable {
      promise.setValue(std::move(r));
    });
  });

  // Now response holds response returned by the async call
  ...
});
The callback passed to fibers::await is executed immediately, and then the fiber-task is suspended until the fibers::Promise is fulfilled. When the fibers::Promise is fulfilled with a value or an exception, the fiber-task is resumed and fibers::await returns that value (or throws the exception, if an exception was used to fulfill the Promise).
fiberManager.addTask([]() {
  ...
  try {
    auto response = fibers::await([&](fibers::Promise<Response> promise) {
      asyncCall(request, [promise = std::move(promise)](Response r) mutable {
        promise.setException(std::runtime_error("Await will re-throw me"));
      });
    });
    assert(false); // We should never get here
  } catch (const std::exception& e) {
    // Compare the message contents, not the char pointers
    assert(std::string(e.what()) == "Await will re-throw me");
  }
  ...
});
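For intuition, the contract described above (run the callback right away, then wait for the promise, then return the value or re-throw the exception) can be mimicked with the standard library alone. This is only a sketch and not how fibers::await is implemented: blockingAwait, awaitValue and awaitException are hypothetical names, and the helper blocks the calling system thread through std::future instead of suspending a fiber-task.

```cpp
#include <cassert>
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper, not part of folly: runs `func` immediately with a
// std::promise, then blocks the calling thread until the promise is fulfilled.
template <typename T, typename F>
T blockingAwait(F&& func) {
  std::promise<T> promise;
  std::future<T> future = promise.get_future();
  std::forward<F>(func)(std::move(promise)); // executed right away
  return future.get(); // returns the value or re-throws the stored exception
}

// Fulfills the promise with a value.
inline int awaitValue() {
  return blockingAwait<int>([](std::promise<int> p) { p.set_value(42); });
}

// Fulfills the promise with an exception and reports what was re-thrown.
inline std::string awaitException() {
  try {
    blockingAwait<int>([](std::promise<int> p) {
      p.set_exception(std::make_exception_ptr(
          std::runtime_error("Await will re-throw me")));
    });
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "no exception";
}
```

Calling awaitValue() yields the value passed to set_value, and awaitException() returns the message of the exception that came back out of the wait.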
If fibers::Promise is not fulfilled, fibers::await will throw a std::logic_error.
fiberManager.addTask([]() {
  ...
  try {
    auto response = fibers::await([&](fibers::Promise<Response> promise) {
      // We forget about the promise
    });
    assert(false); // We should never get here
  } catch (const std::logic_error& e) {
    ...
  }
  ...
});
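The standard library has a closely related rule, which may help in remembering this one: if a std::promise is destroyed without being fulfilled, std::future::get() throws std::future_error, which derives from std::logic_error. The sketch below demonstrates that standard behaviour only. It does not use folly, and forgetPromise is a hypothetical name.

```cpp
#include <cassert>
#include <future>
#include <stdexcept>

// Hypothetical demo, standard library only: drops a promise without
// fulfilling it and reports whether get() threw a std::logic_error.
inline bool forgetPromise() {
  std::future<int> future;
  {
    std::promise<int> promise;
    future = promise.get_future();
    // We forget about the promise: it is destroyed unfulfilled here.
  }
  try {
    future.get();
  } catch (const std::logic_error&) {
    // std::future_error (broken_promise) derives from std::logic_error.
    return true;
  }
  return false;
}
```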
Please refer to await for more detailed documentation.
Most code written with folly::fibers should not need to use fibers::Baton or fibers::await directly. These primitives should only be used to integrate with other asynchronous APIs which are not fibers-compatible.
Let's say we have some existing library which provides a Future-based asynchronous API.
folly::Future<Response> asyncCallFuture(Request request);
The good news is that folly::Future is already fibers-compatible. You can simply write:
fiberManager.addTask([]() {
  ...
  auto response = asyncCallFuture(request).get();

  // Now response holds response returned by the async call
  ...
});
Calling get() on a folly::Future object will only suspend the calling fiber-task. It won't block the system thread, letting it process other tasks.
Following the explanations above we may wrap an existing asynchronous API in a function:
Response fiberCall(Request request) {
  return fibers::await([&](fibers::Promise<Response> promise) {
    asyncCall(request, [promise = std::move(promise)](Response r) mutable {
      promise.setValue(std::move(r));
    });
  });
}
We can then call it from a fiber-task:
fiberManager.addTask([]() { ... auto response = fiberCall(request); ... });
But what happens if we call fiberCall not from within a fiber-task, but directly from a system thread? Here another important feature of fibers::Baton (and thus of all other folly::fibers synchronization primitives built on top of it) comes into play. Calling wait() on a fibers::Baton within a system thread context just blocks the thread until post() is called on the same fibers::Baton.
What this means is that you no longer need to write separate code for synchronous and asynchronous APIs. If you use only folly::fibers synchronization primitives for all blocking calls inside of your synchronous function, it automatically becomes asynchronous when run inside a fiber-task.
Classic asynchronous APIs (same applies to folly::Future-based APIs) generally rely on copying/moving-in input arguments and often require you to copy/move in some context variables into the callback. E.g.:
...
Context context;

asyncCall(request, [request, context](Response response) mutable {
  doSomething(request, response, context);
});
...
Fibers-compatible APIs look more like synchronous APIs, so you can actually pass input arguments by reference and you don't have to think about passing context at all. E.g.
fiberManager.addTask([]() {
  ...
  Context context;

  auto response = fiberCall(request);

  doSomething(request, response, context);
  ...
});
The same logic applies to fibers::await. Since the fibers::await call blocks until the promise is fulfilled, it's safe to pass everything by reference.
So should you just run all the code inside a fiber-task? Not exactly.
Similarly to system threads, every fiber-task has some stack space assigned to it. Stack usage goes up with the number of nested function calls and objects allocated on the stack. folly::fibers implementation only supports fiber-tasks with fixed stack size. If you want to have many fiber-tasks running concurrently - you need to reduce the amount of stack assigned to each fiber-task, otherwise you may run out of memory.
However, if you know that some function never suspends a fiber-task, you can use fibers::runInMainContext to safely call it from a fiber-task, without any risk of running out of stack space of the fiber-task.
Result useALotOfStack() {
  char buffer[1024*1024];
  ...
}

...
fiberManager.addTask([]() {
  ...
  auto result = fibers::runInMainContext([&]() {
    return useALotOfStack();
  });
  ...
});
...
fibers::runInMainContext will switch to the stack of the system thread (main context), run the functor passed to it and then switch back to the fiber-task stack.
Remember that it's fine to use fibers::runInMainContext in general-purpose functions (those which may be called both from a fiber-task and from outside a fiber-task). When called in a non-fiber-task context, fibers::runInMainContext simply executes the passed functor right away.
Besides fibers::runInMainContext, some other functions in folly::fibers also execute some of the passed functors on the main context. E.g. the functor passed to fibers::await is executed on the main context, and the finally-functor passed to FiberManager::addTaskFinally is also executed on the main context. Relying on this can help you avoid extra fibers::runInMainContext calls (and avoid extra context switches).
Consider the following example:
...
folly::EventBase evb;
auto& fiberManager = folly::fibers::getFiberManager(evb);
std::mutex lock;
folly::fibers::Baton baton;

fiberManager.addTask([&]() {
  std::lock_guard<std::mutex> lg(lock);
  baton.wait();
});

fiberManager.addTask([&]() {
  std::lock_guard<std::mutex> lg(lock);
});

evb.loop();
// We won't get here :(
baton.post();
...
The first fiber-task will grab the lock and then suspend, waiting on a fibers::Baton. Then the second fiber-task will run and try to grab the same lock. Unlike system threads, a fiber-task can only be suspended explicitly, so the whole system thread will be blocked waiting on the lock, and we end up with a deadlock.
There are generally two ways to solve this problem. Ideally we would re-design the program to never hold any locks while a fiber-task is suspended. However, if we are absolutely sure we need that lock, the folly::fibers library provides some fiber-task-aware lock implementations (e.g. TimedMutex).
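To make the difference concrete, here is a toy, single-threaded model of a task-aware lock. It is only a sketch of the general idea (a contended acquire parks the waiting task instead of blocking the thread) and says nothing about how TimedMutex is actually implemented. ToyLoop, ToyTaskLock and runToyLockDemo are hypothetical names, and tasks are modelled as plain continuations rather than real fibers.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded run queue standing in for the fiber loop.
struct ToyLoop {
  std::deque<std::function<void()>> ready;
  void run() {
    while (!ready.empty()) {
      auto task = std::move(ready.front());
      ready.pop_front();
      task();
    }
  }
};

// Toy task-aware lock: a contended acquire parks the continuation instead of
// blocking the thread that runs the loop.
class ToyTaskLock {
 public:
  explicit ToyTaskLock(ToyLoop& loop) : loop_(loop) {}

  void acquire(std::function<void()> onAcquired) {
    if (!held_) {
      held_ = true;
      onAcquired();
    } else {
      waiters_.push_back(std::move(onAcquired));
    }
  }

  void release() {
    if (waiters_.empty()) {
      held_ = false;
    } else {
      // Hand the lock over and make the first waiter runnable again.
      loop_.ready.push_back(std::move(waiters_.front()));
      waiters_.pop_front();
    }
  }

 private:
  ToyLoop& loop_;
  bool held_ = false;
  std::deque<std::function<void()>> waiters_;
};

inline std::vector<std::string> runToyLockDemo() {
  std::vector<std::string> log;
  ToyLoop loop;
  ToyTaskLock lock(loop);
  std::function<void()> resumeTask1; // stands in for the baton being posted

  loop.ready.push_back([&] {
    lock.acquire([&] {
      log.push_back("task 1: locked, suspending");
      resumeTask1 = [&] {
        log.push_back("task 1: resumed, unlocking");
        lock.release();
      };
    });
  });
  loop.ready.push_back([&] {
    log.push_back("task 2: wants lock");
    lock.acquire([&] {
      log.push_back("task 2: locked");
      lock.release();
    });
  });

  loop.run(); // returns: task 2 is parked, the thread is not blocked
  log.push_back("loop idle");
  loop.ready.push_back(resumeTask1); // the "post"
  loop.run();
  return log;
}
```

In this model the first loop.run() returns even though task 2 could not take the lock, which is exactly what the std::mutex version above cannot do.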
</section><section class="dex_document">
All of the features of the folly::fibers library are actually built on top of a single synchronization primitive called Baton. fibers::Baton is a fiber-specific version of folly::Baton. It only supports two basic operations: wait() and post(). Whenever wait() is called on the Baton, the current thread or fiber-task is suspended until post() is called on the same Baton. wait() does not suspend the thread or fiber-task if post() was already called on the Baton. Please refer to Baton for more detailed documentation.
Baton is thread-safe, so wait() and post() can be (and should be :) ) called from different threads or fiber-tasks.
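The wait()/post() contract described here can be modelled for plain system threads with a mutex and a condition variable. The sketch below is an assumption-laden illustration, not folly's implementation: ToyBaton, postThenWait and waitForWorker are hypothetical names, and the model only covers the thread case, not fiber suspension.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical thread-only model of the wait()/post() contract.
class ToyBaton {
 public:
  void post() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      posted_ = true;
    }
    cv_.notify_one();
  }

  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    // Returns at once if post() was already called.
    cv_.wait(lock, [this] { return posted_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool posted_ = false;
};

// post() before wait(): wait() must not block.
inline bool postThenWait() {
  ToyBaton baton;
  baton.post();
  baton.wait();
  return true;
}

// wait() and post() called from different threads.
inline int waitForWorker() {
  ToyBaton baton;
  int result = 0;
  std::thread worker([&] {
    result = 7;
    baton.post();
  });
  baton.wait(); // blocks this thread until the worker posts
  worker.join();
  return result;
}
```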
Code that uses fibers::Baton for synchronization becomes asynchronous when used in fiber context.
fibers::Baton also supports wait with timeout.
fiberManager.addTask([=]() {
  auto baton = std::make_shared<folly::fibers::Baton>();
  auto result = std::make_shared<Result>();

  fiberManager.addTask([=]() {
    *result = sendRequest(...);
    baton->post();
  });

  // baton is a std::shared_ptr, so timed_wait() is reached through ->
  bool success = baton->timed_wait(std::chrono::milliseconds{10});
  if (success) {
    // request successful
    ...
  } else {
    // handle timeout
    ...
  }
});
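The example above keeps the Baton in a std::shared_ptr because the waiter may time out and move on while the other task is still about to post. A standard-library sketch of the same ownership pattern follows. SharedSignal, post, tryWaitFor and latePostIsSafe are hypothetical names and the code does not use folly. It only illustrates why a late post stays safe when the poster co-owns the state.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>

// Hypothetical shared state standing in for a heap-allocated baton.
struct SharedSignal {
  std::mutex mutex;
  std::condition_variable cv;
  bool posted = false;
};

inline void post(const std::shared_ptr<SharedSignal>& signal) {
  {
    std::lock_guard<std::mutex> guard(signal->mutex);
    signal->posted = true;
  }
  signal->cv.notify_one();
}

// Returns true if posted within `timeout`, false on timeout.
inline bool tryWaitFor(const std::shared_ptr<SharedSignal>& signal,
                       std::chrono::milliseconds timeout) {
  std::unique_lock<std::mutex> lock(signal->mutex);
  return signal->cv.wait_for(lock, timeout, [&] { return signal->posted; });
}

// The waiter gives up and drops its reference; the late poster still holds
// one, so posting after the timeout touches valid memory.
inline bool latePostIsSafe() {
  auto signal = std::make_shared<SharedSignal>();
  std::shared_ptr<SharedSignal> posterCopy = signal; // owned by the "request"
  bool inTime = tryWaitFor(signal, std::chrono::milliseconds{10});
  signal.reset();   // the waiter is done with the signal
  post(posterCopy); // late post: the state is still alive
  return !inTime && posterCopy->posted;
}
```

Had the signal lived on the waiter's stack and been captured by reference, the late post would write to an object that no longer exists.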
Unlike wait(), when using the timed_wait() API it's generally not safe to pass fibers::Baton by reference. You have to make sure that the task which fulfills the Baton is either cancelled in case of timeout or has shared ownership of the Baton.
As you could see from previous examples, the easiest way to create a new fiber-task is to call addTask():
fiberManager.addTask([]() { ... });
It is important to remember that addTask() is not thread-safe, i.e. it can only be safely called from the thread which is running the folly::FiberManager loop.
If you need to create a fiber-task from a different thread, you have to use addTaskRemote():
folly::EventBase evb;
auto& fiberManager = folly::fibers::getFiberManager(evb);

std::thread t([&]() {
  fiberManager.addTaskRemote([]() {
    ...
  });
});

evb.loopForever();
addTaskFinally() is useful when you need to run some code on the main context at the end of a fiber-task.
fiberManager.addTaskFinally(
  [=]() {
    ...
    return result;
  },
  [=](Result&& result) {
    callUserCallbacks(std::move(result), ...);
  }
);
Of course you could achieve the same by calling fibers::runInMainContext(), but addTaskFinally() reduces the number of fiber context switches:
fiberManager.addTask([=]() {
  ...
  folly::fibers::runInMainContext([&]() {
    // Switched to main context
    callUserCallbacks(std::move(result), ...);
  });
  // Switched back to fiber context

  // On fiber context we realize there's no more work to be done.
  // Fiber-task is complete, switching back to main context.
});
addTask() and addTaskRemote() create detached fiber-tasks. If you need to know when a fiber-task is complete and/or get a return value from it, addTaskFuture() / addTaskRemoteFuture() can be used.
folly::EventBase evb;
auto& fiberManager = folly::fibers::getFiberManager(evb);

std::thread t([&]() {
  auto future1 = fiberManager.addTaskRemoteFuture([]() {
    ...
  });
  auto future2 = fiberManager.addTaskRemoteFuture([]() {
    ...
  });

  auto result1 = future1.get();
  auto result2 = future2.get();
  ...
});

evb.loopForever();
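The division of labour in the example above (any thread may enqueue, only the loop thread runs tasks, and a future carries the result back) can be pictured with a small standard-library model. This is a sketch under stated assumptions, not a description of folly's internals: ToyRemoteQueue, enqueueWithFuture, drain and sumFromAnotherThread are hypothetical names, and a mutex-protected queue is simply the most obvious way to model a thread-safe hand-off.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical model: other threads enqueue work, the loop thread drains it.
class ToyRemoteQueue {
 public:
  // Safe to call from any thread; returns a future for the task's result.
  template <typename F>
  auto enqueueWithFuture(F func) -> std::future<decltype(func())> {
    using Result = decltype(func());
    // packaged_task is move-only, so share it to fit into std::function.
    auto task =
        std::make_shared<std::packaged_task<Result()>>(std::move(func));
    std::future<Result> future = task->get_future();
    {
      std::lock_guard<std::mutex> guard(mutex_);
      tasks_.push([task] { (*task)(); });
    }
    return future;
  }

  // Must only be called from the thread that owns the loop.
  void drain() {
    std::queue<std::function<void()>> local;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      std::swap(local, tasks_);
    }
    while (!local.empty()) {
      local.front()();
      local.pop();
    }
  }

 private:
  std::mutex mutex_;
  std::queue<std::function<void()>> tasks_;
};

inline int sumFromAnotherThread() {
  ToyRemoteQueue queue;
  std::future<int> f1;
  std::future<int> f2;
  std::thread producer([&] {
    f1 = queue.enqueueWithFuture([] { return 1; });
    f2 = queue.enqueueWithFuture([] { return 2; });
  });
  producer.join();
  queue.drain(); // the "loop thread" runs both tasks
  return f1.get() + f2.get();
}
```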
All the listed synchronization primitives are built using fibers::Baton. Please check their source code for detailed documentation.
</section><section class="dex_document">
Similarly to system threads, every fiber-task has some stack space assigned to it. Stack usage goes up with the number of nested function calls and objects allocated on the stack. folly::fibers implementation only supports fiber-tasks with fixed stack size. If you want to have many fiber-tasks running concurrently - you need to reduce the amount of stack assigned to each fiber-task, otherwise you may run out of memory.
The stack size used for every fiber-task is part of the FiberManager configuration. But how do you pick the right stack size?
First of all you need to figure out the maximum number of concurrent fiber-tasks your application may have. E.g. if you are writing a Thrift service you will probably have a single fiber-task for every request in flight (but remember that e.g. fibers::collectAll and some other synchronization primitives may create extra fiber-tasks). It's very important to get that number first, because if you need at most 100 concurrent fiber-tasks, even 1MB stacks will result in at most 100MB used for fiber stacks. On the other hand, if you need 100,000 concurrent fiber-tasks, even 16KB stacks will result in 1.6GB peak memory usage just for fiber stacks.
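The arithmetic behind these two figures is a single multiplication, which is worth keeping next to the configuration so the budget stays visible. The sketch below checks both numbers at compile time. peakStackBytes, kKiB and kMiB are hypothetical names, and the calculation covers the stacks only, not any other per-fiber overhead.

```cpp
#include <cassert>
#include <cstdint>

// Peak memory needed for fiber stacks alone, in bytes.
constexpr std::uint64_t peakStackBytes(std::uint64_t maxConcurrentTasks,
                                       std::uint64_t stackBytesPerTask) {
  return maxConcurrentTasks * stackBytesPerTask;
}

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;

// 100 tasks with 1MB stacks: 100MB in total.
static_assert(peakStackBytes(100, 1 * kMiB) == 100 * kMiB, "100 x 1MB");

// 100,000 tasks with 16KB stacks: 1,638,400,000 bytes, i.e. about 1.6GB.
static_assert(peakStackBytes(100000, 16 * kKiB) == 1638400000ULL,
              "100,000 x 16KB");
```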
folly::fibers also supports recording stack usage (it can be enabled via the recordStackEvery option of FiberManager). When enabled, the stack of each fiber-task will be filled with magic values. Later, a linear search can be performed to find the boundary of the unused stack space.
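The technique can be illustrated on an ordinary byte buffer. The sketch below is a model of the idea only: the fill byte, the function names (makePaintedStack, simulateUsage, highWaterMark) and the assumption that the stack grows from the high end of the buffer towards the low end are all ours, not taken from the folly source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kMagic = 0xAB; // hypothetical fill pattern

// Fills a fresh "stack" with the magic byte.
inline std::vector<std::uint8_t> makePaintedStack(std::size_t size) {
  return std::vector<std::uint8_t>(size, kMagic);
}

// Simulates a task that used `used` bytes. The model assumes a stack that
// grows downwards, so usage starts at the high end of the buffer.
inline void simulateUsage(std::vector<std::uint8_t>& stack, std::size_t used) {
  assert(used <= stack.size());
  for (std::size_t i = stack.size() - used; i < stack.size(); ++i) {
    stack[i] = 0x00; // anything that is not the magic byte
  }
}

// Linear scan from the low end: the first overwritten byte marks the deepest
// point the task reached.
inline std::size_t highWaterMark(const std::vector<std::uint8_t>& stack) {
  std::size_t untouched = 0;
  while (untouched < stack.size() && stack[untouched] == kMagic) {
    ++untouched;
  }
  return stack.size() - untouched;
}
```

A scan like this can under-report if the deepest bytes written happen to equal the fill pattern, so the result is best treated as an estimate with some margin added.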
By default every fiber-task stack is allocated with a special guard page next to it (this can be controlled via the useGuardPages option of FiberManager). If a stack overflow happens, this guard page will be accessed, which will result in an immediate segmentation fault.
</section><section class="dex_document">
The folly::fibers library doesn't implement its own event system. Instead it allows fibers::FiberManager to work with any other event system by implementing the fibers::LoopController interface.
The easiest way to create a fibers::FiberManager attached to a folly::EventBase is by using the fibers::getFiberManager function:
folly::EventBase evb;
auto& fiberManager = folly::fibers::getFiberManager(evb);

fiberManager.addTask([]() {
  ...
});

evb.loop();
Such a fibers::FiberManager will be automatically destroyed when the folly::EventBase is destroyed.
If the fibers::FiberManager has any outstanding fiber-tasks when the folly::EventBase is being destroyed, it will keep running the event loop until all those tasks are finished.</section><section class="dex_document">
folly::fibers provides some GDB extensions which can be very useful for debugging. To load them simply run the following in GDB console:
fbload folly_fibers
You can use $get_fiber_manager_map_evb() and $get_fiber_manager_map_vevb() to get folly::EventBase => fibers::FiberManager and folly::VirtualEventBase => fibers::FiberManager mappings respectively:
(gdb) fbload stl
(gdb) p $get_fiber_manager_map_evb()
$2 = std::unordered_map with 2 elements = {
  [0x7fffffffda80] = std::unique_ptr<folly::fibers::FiberManager> containing 0x7ffff5c22a00,
  [0x7fffffffd850] = std::unique_ptr<folly::fibers::FiberManager> containing 0x7ffff5c22800
}
This will only list fibers::FiberManagers created using the fibers::getFiberManager() function.
Given a pointer to a fibers::FiberManager you can get a list of all its active fibers:
(gdb) p *((folly::fibers::FiberManager*)0x7ffff5c22800)
$4 = folly::fibers::FiberManager = {
  0x7ffff5d23380 = folly::fibers::Fiber = {
    state = Awaiting immediate,
    backtrace available = true
  }
}
The fiber-print-limit command can be used to change the maximum number of fibers printed for a fibers::FiberManager (the default value is 100).
(gdb) fiber-print-limit 10
New fiber limit for FiberManager printer set to 10
Given a pointer to a fibers::Fiber which is running some fiber-task, you can get its current state:
(gdb) p *((folly::fibers::Fiber*)0x7ffff5d23380)
$5 = folly::fibers::Fiber = {
  state = Awaiting immediate,
  backtrace available = true
}
Every fibers::Fiber which is suspended (and so has its backtrace available) can be activated. To activate a fiber-task you can either use the fiber GDB command, passing a fibers::Fiber pointer to it:
(gdb) fiber 0x7ffff5d23380
Fiber 140737317581696 activated. You can call 'bt' now.
or simply call activate() on a fibers::Fiber object:
(gdb) p ((folly::fibers::Fiber*)0x7ffff5d23380)->activate()
$6 = "Fiber 0x7ffff5d23380 activated. You can call 'bt' now."
Once a fiber-task is activated you can explore its stack using the bt and frame commands, just like for a regular thread.
(gdb) bt
#1 0x00000000005497e9 in folly::fibers::FiberImpl::deactivate() (this=0x7ffff5d233a0) at buck-out/dbg/gen/folly/fibers/fibers_core#default,headers/folly/fibers/BoostContextCompatibility.h:105
#2 0x000000000054996d in folly::fibers::FiberManager::deactivateFiber(folly::fibers::Fiber*) (this=0x7ffff5c22800, fiber=0x7ffff5d23380) at buck-out/dbg/gen/folly/fibers/fibers_core#default,headers/folly/fibers/FiberManagerInternal-inl.h:103
#3 0x0000000000548b91 in folly::fibers::Fiber::<lambda()>::operator()(void) (__closure=0x7ffff59ffb20) at folly/fibers/Fiber.cpp:175
#4 0x0000000000548d78 in folly::fibers::Fiber::preempt(folly::fibers::Fiber::State) (this=0x7ffff5d23380, state=folly::fibers::Fiber::AWAITING_IMMEDIATE) at folly/fibers/Fiber.cpp:185
#5 0x000000000043bcc6 in folly::fibers::FiberManager::runInMainContext<FiberManager_nestedFiberManagers_Test::TestBody()::<lambda()>::<lambda()> >(<unknown type in /mnt/fio0/andrii/fbsource/fbcode/buck-out/dbg/gen/folly/fibers/test/fibers_test, CU 0x31b, DIE 0x111bdb>) (this=0x7ffff5c22800, func=<unknown type in /mnt/fio0/andrii/fbsource/fbcode/buck-out/dbg/gen/folly/fibers/test/fibers_test, CU 0x31b, DIE 0x111bdb>) at buck-out/dbg/gen/folly/fibers/fibers_core#default,headers/folly/fibers/FiberManagerInternal-inl.h:459
#6 0x00000000004300f3 in folly::fibers::runInMainContext<FiberManager_nestedFiberManagers_Test::TestBody()::<lambda()>::<lambda()> >(<unknown type in /mnt/fio0/andrii/fbsource/fbcode/buck-out/dbg/gen/folly/fibers/test/fibers_test, CU 0x31b, DIE 0xf7caa>) (func=<unknown type in /mnt/fio0/andrii/fbsource/fbcode/buck-out/dbg/gen/folly/fibers/test/fibers_test, CU 0x31b, DIE 0xf7caa>) at buck-out/dbg/gen/folly/fibers/fibers_core#default,headers/folly/fibers/FiberManagerInternal.h:551
#7 0x0000000000422101 in FiberManager_nestedFiberManagers_Test::<lambda()>::operator()(void) const (__closure=0x7ffff5d23450) at folly/fibers/test/FibersTest.cpp:1537
...
To deactivate a previously activated fiber-task and switch back to the stack of the current thread, simply use the fiber-deactivate GDB command:
(gdb) fiber-deactivate
Fiber de-activated.
</section></section>