proxygen: folly::detail::TurnSequencer< Atom >

#include <TurnSequencer.h>

Public Types
enum	TryWaitResult { TryWaitResult::SUCCESS, TryWaitResult::PAST, TryWaitResult::TIMEDOUT }

Public Member Functions
	TurnSequencer (const uint32_t firstTurn=0) noexcept

bool	isTurn (const uint32_t turn) const noexcept
	Returns true iff a call to waitForTurn(turn, ...) won't block.

void	waitForTurn (const uint32_t turn, Atom< uint32_t > &spinCutoff, const bool updateSpinCutoff) noexcept

template<class Clock = std::chrono::steady_clock, class Duration = typename Clock::duration>
TryWaitResult	tryWaitForTurn (const uint32_t turn, Atom< uint32_t > &spinCutoff, const bool updateSpinCutoff, const std::chrono::time_point< Clock, Duration > *absTime=nullptr) noexcept

void	completeTurn (const uint32_t turn) noexcept
	Unblocks a thread running waitForTurn(turn + 1).

uint8_t	uncompletedTurnLSB () const noexcept

Private Types
enum	: uint32_t { kTurnShift = 6, kWaitersMask = (1 << kTurnShift) - 1, kMinSpins = 20, kMaxSpins = 2000 }

Private Member Functions
uint32_t	futexChannel (uint32_t turn) const noexcept

uint32_t	decodeCurrentSturn (uint32_t state) const noexcept

uint32_t	decodeMaxWaitersDelta (uint32_t state) const noexcept

uint32_t	encode (uint32_t currentSturn, uint32_t maxWaiterD) const noexcept

Private Attributes
Futex< Atom >	state_

Detailed Description

template<template< typename > class Atom>
struct folly::detail::TurnSequencer< Atom >

A TurnSequencer allows threads to order their execution according to a monotonically increasing (with wraparound) "turn" value. The two operations provided are to wait for turn T, and to move to the next turn. Every thread that is waiting for T must have arrived before that turn is marked completed (for MPMCQueue only one thread waits for any particular turn, so this is trivially true).
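
To make the wait/complete protocol concrete, here is a minimal teaching model. It is not folly's implementation: it swaps the futex and spin logic for a std::mutex and std::condition_variable, and the names ToyTurnSequencer and runInTurnOrder are hypothetical. It only illustrates that a thread waiting for turn T proceeds once turn T - 1 has been completed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical teaching model; folly's TurnSequencer uses a futex instead.
class ToyTurnSequencer {
 public:
  explicit ToyTurnSequencer(uint32_t firstTurn = 0) : current_(firstTurn) {}

  // Blocks until the current turn equals `turn`.
  void waitForTurn(uint32_t turn) {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return current_ == turn; });
  }

  // Marks `turn` as done and releases whoever waits for turn + 1.
  void completeTurn(uint32_t turn) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(current_ == turn);
      current_ = turn + 1;  // unsigned arithmetic wraps around naturally
    }
    cv_.notify_all();
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  uint32_t current_;
};

// Starts `n` threads in reverse turn order. Each thread appends its turn
// number once its turn arrives. Returns the order in which turns ran.
inline std::vector<uint32_t> runInTurnOrder(uint32_t n) {
  ToyTurnSequencer seq;
  std::vector<uint32_t> order;
  std::vector<std::thread> threads;
  for (uint32_t i = n; i-- > 0;) {
    threads.emplace_back([&seq, &order, i] {
      seq.waitForTurn(i);
      order.push_back(i);  // serialized: only the turn holder writes
      seq.completeTurn(i);
    });
  }
  for (auto& t : threads) {
    t.join();
  }
  return order;
}
```

Even though the threads are started in reverse, runInTurnOrder(4) yields the turns in ascending order, because each thread blocks until its predecessor calls completeTurn.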

TurnSequencer's state_ holds 26 bits of the current turn (shifted left by 6), along with a 6 bit saturating value that records the maximum waiter minus the current turn. Wraparound of the turn space is expected and handled. This allows us to atomically adjust the number of outstanding waiters when we perform a FUTEX_WAKE operation. Compare this strategy to sem_t's separate num_waiters field, which isn't decremented until after the waiting thread gets scheduled, during which time more enqueues might have occurred and made pointless FUTEX_WAKE calls.
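
The packing described above can be sketched as a few free functions. The constants mirror kTurnShift and kWaitersMask from this class; the function names packState, unpackTurn and unpackMaxWaiterDelta are hypothetical and chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint32_t kTurnShift = 6;
constexpr uint32_t kWaitersMask = (1u << kTurnShift) - 1;  // 0x3f

// Packs a turn (upper 26 bits) and a waiter delta (lower 6 bits) into one
// word. The delta saturates at 63; the turn wraps modulo 2^26.
inline uint32_t packState(uint32_t turn, uint32_t maxWaiterDelta) {
  return (turn << kTurnShift) | std::min(kWaitersMask, maxWaiterDelta);
}

// Recovers the (wrapped) turn from a packed state word.
inline uint32_t unpackTurn(uint32_t state) {
  return state >> kTurnShift;
}

// Recovers the saturated waiter delta from a packed state word.
inline uint32_t unpackMaxWaiterDelta(uint32_t state) {
  return state & kWaitersMask;
}
```

Because both fields live in a single 32-bit word, one compare-and-swap can advance the turn and adjust the waiter delta together.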

TurnSequencer uses futex() directly. It is optimized for the case in which the highest awaited turn is at most 32 turns ahead of the current turn. We use the FUTEX_WAIT_BITSET variant, which lets us embed 32 separate wakeup channels in a single futex. See http://locklessinc.com/articles/futex_cheat_sheet for a description.

We only need to keep exact track of the delta between the current turn and the maximum waiter for the 32 turns that follow the current one, because waiters at turn t+32 will be awoken at turn t. At that point they can then adjust the delta using the higher base. Since we need to encode waiter deltas of 0 to 32 inclusive, we use 6 bits. We actually store waiter deltas up to 63, since that might reduce the number of CAS operations a tiny bit.
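
The channel selection can be illustrated in isolation. The sketch below follows the same formula as futexChannel; the names channelFor and sharesChannel are hypothetical helpers introduced here.

```cpp
#include <cassert>
#include <cstdint>

// One bit per channel, chosen by the low 5 bits of the turn.
inline uint32_t channelFor(uint32_t turn) {
  return 1u << (turn & 31);
}

// Two turns share a channel exactly when they differ by a multiple of 32,
// which is why a waiter for turn t + 32 is also woken at turn t.
inline bool sharesChannel(uint32_t a, uint32_t b) {
  return channelFor(a) == channelFor(b);
}
```

For example, turns 7 and 39 map to the same bit, so a wake for turn 7 also wakes a thread waiting for turn 39; that thread then sees it is early and re-registers as a waiter.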

To avoid some futex() calls entirely, TurnSequencer uses an adaptive spin cutoff before waiting. The overheads (and convergence rate) of separately tracking the spin cutoff for each TurnSequencer would be prohibitive, so the actual storage is passed in as a parameter and updated atomically. This also lets the caller use different adaptive cutoffs for different operations (read versus write, for example). To avoid contention, the spin cutoff is only updated when requested by the caller.
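
As a worked illustration of the adaptation, the sketch below isolates the arithmetic used by tryWaitForTurn into pure functions. The constants match kMinSpins and kMaxSpins; the names spinTarget and nextSpinCutoff are hypothetical, and the real code applies the update with an atomic store or a single compare-and-swap rather than returning a value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint32_t kMinSpins = 20;
constexpr uint32_t kMaxSpins = 2000;

// Target cutoff after observing `tries` iterations before the turn arrived.
// Hitting kMaxSpins means spinning was pointless, so fall back to kMinSpins;
// otherwise allow twice the observed count, clamped to [kMinSpins, kMaxSpins].
inline uint32_t spinTarget(uint32_t tries) {
  if (tries >= kMaxSpins) {
    return kMinSpins;
  }
  return std::min<uint32_t>(
      kMaxSpins, std::max<uint32_t>(kMinSpins, tries * 2));
}

// New cutoff given the previous one. A previous value of 0 means "not yet
// initialized", so the target is adopted directly. Otherwise the cutoff
// moves one eighth of the way toward the target (signed difference).
inline uint32_t nextSpinCutoff(uint32_t prev, uint32_t tries) {
  const uint32_t target = spinTarget(tries);
  if (prev == 0) {
    return target;
  }
  return prev + static_cast<int32_t>(target - prev) / 8;
}
```

With a previous cutoff of 100, observing 5 tries gives a target of 20 and a new cutoff of 90, while observing 90 tries gives a target of 180 and a new cutoff of 110.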

Definition at line 72 of file TurnSequencer.h.

Member Enumeration Documentation

template<template< typename > class Atom>

anonymous enum : uint32_t

private

Enumerator
kTurnShift	kTurnShift counts the bits that are stolen to record the delta between the current turn and the maximum waiter. It needs to be big enough to record wait deltas of 0 to 32 inclusive. Waiters more than 32 in the future will be woken up 32*n turns early (since their BITSET will hit) and will adjust the waiter count again. We go a bit beyond and let the waiter count go up to 63, which is free and might save us a few CAS operations.
kWaitersMask
kMinSpins	The minimum spin count that we will adaptively select.
kMaxSpins	The maximum spin count that we will adaptively select, and the spin count that will be used when probing to get a new data point for the adaptation.

Definition at line 222 of file TurnSequencer.h.

   enum : uint32_t {
     kTurnShift = 6,
     kWaitersMask = (1 << kTurnShift) - 1,
     kMinSpins = 20,
     kMaxSpins = 2000,
   };

template<template< typename > class Atom>

enum folly::detail::TurnSequencer::TryWaitResult

strong

Enumerator
SUCCESS
PAST
TIMEDOUT

Definition at line 82 of file TurnSequencer.h.

   enum class TryWaitResult { SUCCESS, PAST, TIMEDOUT };


Constructor & Destructor Documentation

template<template< typename > class Atom>

folly::detail::TurnSequencer< Atom >::TurnSequencer ( const uint32_t firstTurn = 0 )

inline explicit noexcept

Definition at line 73 of file TurnSequencer.h.

   explicit TurnSequencer(const uint32_t firstTurn = 0) noexcept
       : state_(encode(firstTurn << kTurnShift, 0)) {}


Member Function Documentation

template<template< typename > class Atom>

void folly::detail::TurnSequencer< Atom >::completeTurn ( const uint32_t turn )

inline noexcept

Unblocks a thread running waitForTurn(turn + 1).

Definition at line 195 of file TurnSequencer.h.

Referenced by folly::detail::SingleElementQueue< T, Atom >::dequeueImpl(), and folly::detail::SingleElementQueue< T, Atom >::enqueueImpl().

                                                   {
     uint32_t state = state_.load(std::memory_order_acquire);
     while (true) {
       DCHECK(state == encode(turn << kTurnShift, decodeMaxWaitersDelta(state)));
       uint32_t max_waiter_delta = decodeMaxWaitersDelta(state);
       uint32_t new_state = encode(
           (turn + 1) << kTurnShift,
           max_waiter_delta == 0 ? 0 : max_waiter_delta - 1);
       if (state_.compare_exchange_strong(state, new_state)) {
         if (max_waiter_delta != 0) {
           detail::futexWake(
               &state_, std::numeric_limits<int>::max(), futexChannel(turn + 1));
         }
         break;
       }
       // failing compare_exchange_strong updates first arg to the value
       // that caused the failure, so no need to reread state_
     }
   }
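
The state change performed by the compare-and-swap above can be viewed as a pure function of the old state word. The sketch below is an illustration under that framing; the Transition struct and completeTurnTransition are hypothetical names, and the real method additionally retries on CAS failure and issues the futex wake.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kTurnShift = 6;
constexpr uint32_t kWaitersMask = (1u << kTurnShift) - 1;

// Result of completing the turn encoded in a state word.
struct Transition {
  uint32_t newState;  // state word for turn + 1
  bool wake;          // true if some waiter was recorded, so a wake is needed
};

// Advances the turn by one and shrinks the waiter delta by one, since the
// furthest waiter is now one turn closer to the current turn.
inline Transition completeTurnTransition(uint32_t state) {
  const uint32_t turn = state >> kTurnShift;
  const uint32_t delta = state & kWaitersMask;
  const uint32_t newDelta = delta == 0 ? 0 : delta - 1;
  return Transition{((turn + 1) << kTurnShift) | newDelta, delta != 0};
}
```

A delta of zero means no thread has registered as a waiter, which is the case that lets completeTurn skip the wake call entirely.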

template<template< typename > class Atom>

uint32_t folly::detail::TurnSequencer< Atom >::decodeCurrentSturn ( uint32_t state ) const

inline private noexcept

Definition at line 252 of file TurnSequencer.h.

Referenced by folly::detail::TurnSequencer< std::atomic >::isTurn(), and folly::detail::TurnSequencer< std::atomic >::tryWaitForTurn().

                                                              {
     return state & ~kWaitersMask;
   }

template<template< typename > class Atom>

uint32_t folly::detail::TurnSequencer< Atom >::decodeMaxWaitersDelta ( uint32_t state ) const

inline private noexcept

Definition at line 256 of file TurnSequencer.h.

Referenced by folly::detail::TurnSequencer< std::atomic >::completeTurn(), and folly::detail::TurnSequencer< std::atomic >::tryWaitForTurn().

                                                                 {
     return state & kWaitersMask;
   }

template<template< typename > class Atom>

uint32_t folly::detail::TurnSequencer< Atom >::encode	(	uint32_t	currentSturn,
		uint32_t	maxWaiterD
	)		const

inline private noexcept

Definition at line 260 of file TurnSequencer.h.

Referenced by folly::detail::TurnSequencer< std::atomic >::completeTurn(), and folly::detail::TurnSequencer< std::atomic >::tryWaitForTurn().

                                                                              {
     return currentSturn | std::min(uint32_t{kWaitersMask}, maxWaiterD);
   }

template<template< typename > class Atom>

uint32_t folly::detail::TurnSequencer< Atom >::futexChannel ( uint32_t turn ) const

inline private noexcept

Returns the bitmask to pass to futexWait or futexWake when communicating about the specified turn.

Definition at line 248 of file TurnSequencer.h.

Referenced by folly::detail::TurnSequencer< std::atomic >::completeTurn(), and folly::detail::TurnSequencer< std::atomic >::tryWaitForTurn().

                                                       {
     return 1u << (turn & 31);
   }

template<template< typename > class Atom>

bool folly::detail::TurnSequencer< Atom >::isTurn ( const uint32_t turn ) const

inline noexcept

Returns true iff a call to waitForTurn(turn, ...) won't block.

Definition at line 77 of file TurnSequencer.h.

                                                   {
     auto state = state_.load(std::memory_order_acquire);
     return decodeCurrentSturn(state) == (turn << kTurnShift);
   }

template<template< typename > class Atom>

template<class Clock = std::chrono::steady_clock, class Duration = typename Clock::duration>

TryWaitResult folly::detail::TurnSequencer< Atom >::tryWaitForTurn	(	const uint32_t	turn,
		Atom< uint32_t > &	spinCutoff,
		const bool	updateSpinCutoff,
		const std::chrono::time_point< Clock, Duration > *	absTime = nullptr
	)

inline noexcept

Blocks the current thread until turn has arrived. If updateSpinCutoff is true, this will spin for up to kMaxSpins tries before blocking and will adjust spinCutoff based on the results; otherwise it will spin for at most spinCutoff spins. Returns SUCCESS if the wait succeeded, PAST if the turn is in the past, or TIMEDOUT if absTime is not nullptr and is reached before the turn arrives.

Definition at line 109 of file TurnSequencer.h.

Referenced by folly::detail::TurnSequencer< std::atomic >::waitForTurn().

                             {
     uint32_t prevThresh = spinCutoff.load(std::memory_order_relaxed);
     const uint32_t effectiveSpinCutoff =
         updateSpinCutoff || prevThresh == 0 ? kMaxSpins : prevThresh;
 
     uint32_t tries;
     const uint32_t sturn = turn << kTurnShift;
     for (tries = 0;; ++tries) {
       uint32_t state = state_.load(std::memory_order_acquire);
       uint32_t current_sturn = decodeCurrentSturn(state);
       if (current_sturn == sturn) {
         break;
       }
 
       // wrap-safe version of (current_sturn >= sturn)
       if (sturn - current_sturn >= std::numeric_limits<uint32_t>::max() / 2) {
         // turn is in the past
         return TryWaitResult::PAST;
       }
 
       // the first effectiveSpinCutoff tries are spins, after that we will
       // record ourself as a waiter and block with futexWait
       if (tries < effectiveSpinCutoff) {
         asm_volatile_pause();
         continue;
       }
 
       uint32_t current_max_waiter_delta = decodeMaxWaitersDelta(state);
       uint32_t our_waiter_delta = (sturn - current_sturn) >> kTurnShift;
       uint32_t new_state;
       if (our_waiter_delta <= current_max_waiter_delta) {
         // state already records us as waiters, probably because this
         // isn't our first time around this loop
         new_state = state;
       } else {
         new_state = encode(current_sturn, our_waiter_delta);
         if (state != new_state &&
             !state_.compare_exchange_strong(state, new_state)) {
           continue;
         }
       }
       if (absTime) {
         auto futexResult = detail::futexWaitUntil(
             &state_, new_state, *absTime, futexChannel(turn));
         if (futexResult == FutexResult::TIMEDOUT) {
           return TryWaitResult::TIMEDOUT;
         }
       } else {
         detail::futexWait(&state_, new_state, futexChannel(turn));
       }
     }
 
     if (updateSpinCutoff || prevThresh == 0) {
       // if we hit kMaxSpins then spinning was pointless, so the right
       // spinCutoff is kMinSpins
       uint32_t target;
       if (tries >= kMaxSpins) {
         target = kMinSpins;
       } else {
         // to account for variations, we allow ourself to spin 2*N when
         // we think that N is actually required in order to succeed
         target = std::min<uint32_t>(
             kMaxSpins, std::max<uint32_t>(kMinSpins, tries * 2));
       }
 
       if (prevThresh == 0) {
         // bootstrap
         spinCutoff.store(target);
       } else {
         // try once, keep moving if CAS fails.  Exponential moving average
         // with alpha of 7/8
         // Be careful that the quantity we add to prevThresh is signed.
         spinCutoff.compare_exchange_weak(
             prevThresh, prevThresh + int(target - prevThresh) / 8);
       }
     }
 
     return TryWaitResult::SUCCESS;
   }
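
The wrap-safe "turn is in the past" test in the listing above relies on unsigned subtraction. The sketch below isolates that comparison; isBehind is a hypothetical name, and, as in the listing, the equal case is assumed to have been handled before this check is made.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Treats `target` as being in the past when the unsigned distance from
// `current` forward to `target` covers at least half of the 32-bit range.
// The subtraction wraps modulo 2^32, so the test stays correct when the
// turn counter overflows.
inline bool isBehind(uint32_t target, uint32_t current) {
  return target - current >= std::numeric_limits<uint32_t>::max() / 2;
}
```

For instance, with current at 0xFFFFFFC0 and target at 0x00000040, the difference wraps to 0x80, so the target is correctly seen as slightly ahead rather than far behind.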

template<template< typename > class Atom>

uint8_t folly::detail::TurnSequencer< Atom >::uncompletedTurnLSB ( ) const

inline noexcept

Returns the least significant byte of the current uncompleted turn. The full 32-bit turn cannot be recovered.

Definition at line 217 of file TurnSequencer.h.

                                               {
     return uint8_t(state_.load(std::memory_order_acquire) >> kTurnShift);
   }

template<template< typename > class Atom>

void folly::detail::TurnSequencer< Atom >::waitForTurn	(	const uint32_t	turn,
		Atom< uint32_t > &	spinCutoff,
		const bool	updateSpinCutoff
	)

inline noexcept

See tryWaitForTurn. Requires that turn is not a turn in the past.

Definition at line 86 of file TurnSequencer.h.

Referenced by folly::detail::SingleElementQueue< T, Atom >::dequeueImpl(), and folly::detail::SingleElementQueue< T, Atom >::enqueueImpl().

                                             {
     const auto ret = tryWaitForTurn(turn, spinCutoff, updateSpinCutoff);
     DCHECK(ret == TryWaitResult::SUCCESS);
   }

Member Data Documentation

template<template< typename > class Atom>

Futex<Atom> folly::detail::TurnSequencer< Atom >::state_

private

This holds both the current turn and the highest waiting turn, stored as (current_turn << 6) | min(63, max(waited_turn - current_turn)).

Definition at line 244 of file TurnSequencer.h.

Referenced by folly::detail::TurnSequencer< std::atomic >::completeTurn(), folly::detail::TurnSequencer< std::atomic >::isTurn(), folly::detail::TurnSequencer< std::atomic >::tryWaitForTurn(), and folly::detail::TurnSequencer< std::atomic >::uncompletedTurnLSB().

The documentation for this struct was generated from the following file:

proxygen/folly/folly/detail/TurnSequencer.h
