group2 0.1.0
CSE 125 Group 2
group2::perf Namespace Reference

Namespaces

namespace  shotlog

Classes

struct  PerScopeStats
 Per-scope, all-thread atomic counters. More...
struct  NetworkCounters
 Per-tick network counters maintained by the network code. More...
struct  Snapshot
 Globally-visible snapshot returned to the aggregator callback. More...
class  ScopeTimer
 RAII scoped timer. More...

Typedefs

using ScopeId = std::uint16_t
 Dense small id used to index the global stats table.

Functions

void initParallelFromEnv ()
 Initialize from environment.
template<class Iter, class Fn>
void parallelFor (Iter begin, Iter end, Fn &&fn)
 Call fn(*it) for every element in [begin, end).
ScopeId registerScope (const char *name)
 Register (or look up) a scope name and return its dense id.
const char * scopeName (ScopeId id)
 Returns the human-readable name a ScopeId was registered with, or "" if id is out of range.
std::size_t scopeCount ()
 Returns the highest registered id + 1.
void recordSample (ScopeId id, std::uint64_t ticks) noexcept
 Recording entry point — public so unit tests can invoke it directly without a real ScopeTimer.
void tickEnd (std::uint64_t tickWallNs) noexcept
 Tick boundary marker — call once per server tick() end.
void initFromEnv ()
 Initialize from environment variables.
void startAggregator (std::function< void(const Snapshot &)> cb)
 Spawn the 1 Hz aggregator thread.
void stopAggregator ()
 Stop the aggregator and join its thread. Idempotent.
NetworkCounters & net ()
 Network counter accessor. Hot-path code increments these directly.
std::uint64_t ticksToNs (std::uint64_t ticks) noexcept
 Convenience: convert SDL performance-counter ticks to nanoseconds.

Variables

std::atomic< bool > parallelEnabled {true}
 Master switch for parallel execution.
constexpr std::size_t k_parallelThreshold = 64
 Minimum items below which parallelFor runs sequentially even when the master switch is on.
std::atomic< bool > enabled {false}
 Master switch.
constexpr std::size_t k_maxScopes = 64
 Compile-time caps.
constexpr std::size_t k_histogramBuckets = 32
 Histogram bucket count.
constexpr ScopeId k_invalidScope = static_cast<ScopeId>(-1)

Typedef Documentation

◆ ScopeId

using group2::perf::ScopeId = std::uint16_t

Dense small id used to index the global stats table.

Function Documentation

◆ initFromEnv()

void group2::perf::initFromEnv ( )

Initialize from environment variables.

Call once at startup, before the first scope is hit.

GROUP2_SERVER_PROFILE=1 → enable sampling + 1 Hz log line
GROUP2_SERVER_PROFILE_CSV=path → also write CSV rows to path

Idempotent.
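A typical launch under profiling might look like the following (a sketch; the `./server` binary name is a placeholder, not part of the documented API):

```shell
# Enable sampling plus the 1 Hz log line, and mirror rows to a CSV file.
export GROUP2_SERVER_PROFILE=1
export GROUP2_SERVER_PROFILE_CSV=/tmp/perf.csv
# ./server   <- placeholder for the actual server binary
```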

◆ initParallelFromEnv()

void group2::perf::initParallelFromEnv ( )
inline

Initialize from environment.

Idempotent.

Default ON (PR-8). GROUP2_SERVER_PARALLEL=0 flips it off for diagnostics / A-B comparison; any other value (or unset) leaves it on.

◆ net()

NetworkCounters & group2::perf::net ( )
inline

Network counter accessor. Hot-path code increments these directly.

◆ parallelFor()

template<class Iter, class Fn>
void group2::perf::parallelFor ( Iter begin,
Iter end,
Fn && fn )
inline

Call fn(*it) for every element in [begin, end).

Routes through TBB when (a) available, (b) the runtime flag is on, and (c) the input range is large enough to amortize dispatch cost. Otherwise sequential.

◆ recordSample()

void group2::perf::recordSample ( ScopeId id,
std::uint64_t ticks )
noexcept

Recording entry point — public so unit tests can invoke it directly without a real ScopeTimer.

◆ registerScope()

ScopeId group2::perf::registerScope ( const char * name)

Register (or look up) a scope name and return its dense id.

First call for a given name is O(n) over already-registered scopes; subsequent calls are cached at the call site. Thread-safe.

◆ scopeCount()

std::size_t group2::perf::scopeCount ( )

Returns the highest registered id + 1.

◆ scopeName()

const char * group2::perf::scopeName ( ScopeId id)

Returns the human-readable name a ScopeId was registered with, or "" if id is out of range.

Used by the aggregator's logger.

◆ startAggregator()

void group2::perf::startAggregator ( std::function< void(const Snapshot &)> cb)

Spawn the 1 Hz aggregator thread.

Calls cb(snap) once per second on a dedicated thread. Call it at most once. cb runs on the aggregator thread, so do not touch ECS / non-thread-safe state from inside it.

◆ stopAggregator()

void group2::perf::stopAggregator ( )

Stop the aggregator and join its thread. Idempotent.

◆ tickEnd()

void group2::perf::tickEnd ( std::uint64_t tickWallNs)
noexcept

Tick boundary marker — call once per server tick() end.

◆ ticksToNs()

std::uint64_t group2::perf::ticksToNs ( std::uint64_t ticks)
inlinenoexcept

Convenience: convert SDL performance-counter ticks to nanoseconds.

Variable Documentation

◆ enabled

std::atomic< bool > group2::perf::enabled {false}

Master switch.

Toggled at process startup based on GROUP2_SERVER_PROFILE. When false, ScopeTimer ctor early-outs after a single relaxed atomic load.

◆ k_histogramBuckets

std::size_t group2::perf::k_histogramBuckets = 32
inlineconstexpr

Histogram bucket count.

Buckets are log2-spaced over the SDL performance-counter tick scale (resolution typically 100 ns). The bucket index is the sample's most-significant-bit position, derived via __builtin_clzll; bucket 0 holds the smallest measurable values.

◆ k_invalidScope

ScopeId group2::perf::k_invalidScope = static_cast<ScopeId>(-1)
inlineconstexpr

◆ k_maxScopes

std::size_t group2::perf::k_maxScopes = 64
inlineconstexpr

Compile-time caps.

Both caps bound the size of the global stats table: at ~512 B per PerScopeStats, all 64 scopes fit comfortably in L2, keeping hot-scope counters cache-resident.

◆ k_parallelThreshold

std::size_t group2::perf::k_parallelThreshold = 64
inlineconstexpr

Minimum items below which parallelFor runs sequentially even when the master switch is on.

Avoids paying TBB dispatch overhead for trivially-small work where sequential is faster.

◆ parallelEnabled

std::atomic<bool> group2::perf::parallelEnabled {true}
inline

Master switch for parallel execution.

Defaults ON as of PR-8. Earlier benches (idle-bot loadtest, pre-PR-7) suggested defaulting off because the synthetic test's per-item work was too small. With AI bots actually moving + PR-7 (collision/movement parallel) + PR-8 (per-component-type parallel serialization) the per-item work is meaningful and the 16-core box pays clear dividends:

N=100, AI: tick p99 1.57 ms (off) → 0.39 ms (on)
N=300, AI: tick p99 12 ms (off) → 1.57 ms (on)
N=500, AI: tick p99 50+ ms (off) → 3.15 ms (on), when the OS gives us CPU

Below the k_parallelThreshold element-count, parallelFor short-circuits to sequential anyway, so small inputs still win.

Kill switch: GROUP2_SERVER_PARALLEL=0 flips back to sequential without rebuilding — useful for diff bisection if a regression appears.