group2 0.1.0
CSE 125 Group 2
Namespaces | |
| namespace | shotlog |
Classes | |
| struct | PerScopeStats |
| Per-scope, all-thread atomic counters. More... | |
| struct | NetworkCounters |
| Per-tick network counters maintained by the network code. More... | |
| struct | Snapshot |
| Globally-visible snapshot returned to the aggregator callback. More... | |
| class | ScopeTimer |
| RAII scoped timer. More... | |
Typedefs | |
| using | ScopeId = std::uint16_t |
| Dense small id used to index the global stats table. | |
Functions | |
| void | initParallelFromEnv () |
| Initialize from environment. | |
| template<class Iter, class Fn> | |
| void | parallelFor (Iter begin, Iter end, Fn &&fn) |
| Call fn(*it) for every element in [begin, end). | |
| ScopeId | registerScope (const char *name) |
| Register (or look up) a scope name and return its dense id. | |
| const char * | scopeName (ScopeId id) |
| Returns the human-readable name a ScopeId was registered with, or "" if id is out of range. | |
| std::size_t | scopeCount () |
| Returns the highest registered id + 1. | |
| void | recordSample (ScopeId id, std::uint64_t ticks) noexcept |
| Recording entry point — public so unit tests can invoke it directly without a real ScopeTimer. | |
| void | tickEnd (std::uint64_t tickWallNs) noexcept |
| Tick boundary marker — call once per server tick() end. | |
| void | initFromEnv () |
| Initialize from environment variables. | |
| void | startAggregator (std::function< void(const Snapshot &)> cb) |
| Spawn the 1 Hz aggregator thread. | |
| void | stopAggregator () |
| Stop the aggregator and join its thread. Idempotent. | |
| NetworkCounters & | net () |
| Network counter accessor. Hot-path code increments these directly. | |
| std::uint64_t | ticksToNs (std::uint64_t ticks) noexcept |
| Convenience: convert SDL performance-counter ticks to nanoseconds. | |
Variables | |
| std::atomic< bool > | parallelEnabled {true} |
| Master switch for parallel execution. | |
| constexpr std::size_t | k_parallelThreshold = 64 |
| Minimum items below which parallelFor runs sequentially even when the master switch is on. | |
| std::atomic< bool > | enabled {false} |
| Master switch. | |
| constexpr std::size_t | k_maxScopes = 64 |
| Compile-time caps. | |
| constexpr std::size_t | k_histogramBuckets = 32 |
| Histogram bucket count. | |
| constexpr ScopeId | k_invalidScope = static_cast<ScopeId>(-1) |
| using group2::perf::ScopeId = std::uint16_t |
Dense small id used to index the global stats table.
| void group2::perf::initFromEnv | ( | ) |
Initialize from environment variables.
Call once at startup, before the first scope is hit.
GROUP2_SERVER_PROFILE=1 → enable sampling + 1 Hz log line
GROUP2_SERVER_PROFILE_CSV=path → also write CSV rows to path
Idempotent.
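A minimal standalone sketch of the parsing rules described above. The struct and function names here are hypothetical, not the real `group2::perf` internals; idempotence is shown via `std::call_once`, which is one plausible way to implement it.

```cpp
#include <atomic>
#include <cstdlib>
#include <cstring>
#include <mutex>
#include <string>

// Hypothetical config holder; the real implementation may store these
// directly in atomics / globals instead.
struct ProfileConfig {
    bool enabled = false;   // GROUP2_SERVER_PROFILE=1
    std::string csvPath;    // GROUP2_SERVER_PROFILE_CSV=path ("" if unset)
};

inline ProfileConfig parseProfileEnv() {
    ProfileConfig cfg;
    const char* p = std::getenv("GROUP2_SERVER_PROFILE");
    cfg.enabled = (p != nullptr && std::strcmp(p, "1") == 0);
    if (const char* csv = std::getenv("GROUP2_SERVER_PROFILE_CSV"))
        cfg.csvPath = csv;
    return cfg;
}

// One way to satisfy "Idempotent.": only the first call does any work.
inline void initFromEnvOnce() {
    static std::once_flag once;
    std::call_once(once, [] { (void)parseProfileEnv(); /* apply config */ });
}
```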
| void group2::perf::initParallelFromEnv | ( | ) |
inline
Initialize from environment.
Idempotent.
Default ON (PR-8). GROUP2_SERVER_PARALLEL=0 flips it off for diagnostics / A-B comparison; any other value (or unset) leaves it on.
| NetworkCounters & group2::perf::net | ( | ) |
inline
Network counter accessor. Hot-path code increments these directly.
template<class Iter, class Fn>
| void group2::perf::parallelFor | ( | Iter begin, Iter end, Fn &&fn | ) |
inline
Call fn(*it) for every element in [begin, end).
Routes through TBB when (a) available, (b) the runtime flag is on, and (c) the input range is large enough to amortize dispatch cost. Otherwise sequential.
| void group2::perf::recordSample | ( | ScopeId id, std::uint64_t ticks | ) |
noexcept
Recording entry point — public so unit tests can invoke it directly without a real ScopeTimer.
| ScopeId group2::perf::registerScope | ( | const char * | name | ) |
Register (or look up) a scope name and return its dense id.
First call for a given name is O(n) over already-registered scopes; subsequent calls are cached at the call site. Thread-safe.
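The register-or-lookup behavior and the call-site caching pattern can be sketched like this. The function names are hypothetical stand-ins; the real implementation's locking and storage may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>

using ScopeId = std::uint16_t;
constexpr std::size_t k_maxScopes = 64;
constexpr ScopeId k_invalidScope = static_cast<ScopeId>(-1);

// First call for a name does a linear O(n) scan of already-registered
// scopes under a lock; repeat names return the existing id.
inline ScopeId registerScopeSketch(const char* name) {
    static std::mutex m;
    static const char* names[k_maxScopes];
    static std::size_t count = 0;
    std::lock_guard<std::mutex> lock(m);
    for (std::size_t i = 0; i < count; ++i)
        if (std::strcmp(names[i], name) == 0) return static_cast<ScopeId>(i);
    if (count == k_maxScopes) return k_invalidScope;  // table is full
    names[count] = name;  // caller must pass a string with static lifetime
    return static_cast<ScopeId>(count++);
}

// Typical call-site caching: a function-local static runs the O(n)
// registration once, so later calls skip the scan entirely.
inline ScopeId tickScopeId() {
    static const ScopeId id = registerScopeSketch("server.tick");
    return id;
}
```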
| std::size_t group2::perf::scopeCount | ( | ) |
Returns the highest registered id + 1.
| const char * group2::perf::scopeName | ( | ScopeId | id | ) |
Returns the human-readable name a ScopeId was registered with, or "" if id is out of range.
Used by the aggregator's logger.
| void group2::perf::startAggregator | ( | std::function< void(const Snapshot &)> | cb | ) |
Spawn the 1 Hz aggregator thread.
Calls cb(snap) once per second on a dedicated thread. Safe to call once. cb runs on the aggregator thread, so do not touch ECS / non-thread-safe state from inside it.
| void group2::perf::stopAggregator | ( | ) |
Stop the aggregator and join its thread. Idempotent.
| void group2::perf::tickEnd | ( | std::uint64_t tickWallNs | ) |
noexcept
Tick boundary marker — call once per server tick() end.
| std::uint64_t group2::perf::ticksToNs | ( | std::uint64_t ticks | ) |
inline noexcept
Convenience: convert SDL performance-counter ticks to nanoseconds.
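The conversion is `ns = ticks * 1e9 / frequency`, where the frequency comes from `SDL_GetPerformanceFrequency()` at runtime. In this standalone sketch the frequency is a parameter instead, and a 128-bit intermediate (a GCC/Clang extension, and an assumption about the real code) guards against overflow for large tick counts.

```cpp
#include <cstdint>

// ticks-per-second (freq) would come from SDL_GetPerformanceFrequency()
// in the real implementation; it is passed in here to stay standalone.
inline std::uint64_t ticksToNsSketch(std::uint64_t ticks,
                                     std::uint64_t freq) noexcept {
    return static_cast<std::uint64_t>(
        (static_cast<unsigned __int128>(ticks) * 1'000'000'000u) / freq);
}
```

With a 10 MHz counter this reproduces the ~100 ns resolution mentioned under k_histogramBuckets: one tick maps to 100 ns.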
| std::atomic< bool > group2::perf::enabled {false} |
Master switch.
Toggled at process startup based on GROUP2_SERVER_PROFILE. When false, ScopeTimer ctor early-outs after a single relaxed atomic load.
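The early-out pattern can be sketched as below. `profilingEnabled` stands in for `perf::enabled`, and the destructor writes the elapsed nanoseconds to a caller-supplied slot instead of calling the real `recordSample`; everything else about the shape (one relaxed load, then bail) follows the description above.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

std::atomic<bool> profilingEnabled{false};  // stand-in for perf::enabled

// When disabled, the constructor does a single relaxed atomic load and
// nothing else, so instrumented scopes are nearly free in production.
class ScopeTimerSketch {
public:
    explicit ScopeTimerSketch(std::uint64_t* out) : out_(out) {
        if (!profilingEnabled.load(std::memory_order_relaxed)) return;
        active_ = true;
        start_ = std::chrono::steady_clock::now();
    }
    ~ScopeTimerSketch() {
        if (!active_) return;
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                      std::chrono::steady_clock::now() - start_).count();
        *out_ = static_cast<std::uint64_t>(ns);  // real code: recordSample()
    }
private:
    std::uint64_t* out_;
    bool active_ = false;
    std::chrono::steady_clock::time_point start_;
};
```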
| constexpr std::size_t group2::perf::k_histogramBuckets = 32 |
inline constexpr
Histogram bucket count.
Buckets are log2-spaced over the SDL performance-counter tick scale (resolution typically 100 ns). Index = __builtin_clzll-derived msb position; bucket 0 holds the smallest measurable values.
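A sketch of the bucketing math described above: the index is the position of the most significant set bit (via `__builtin_clzll`, a GCC/Clang builtin), clamped to the bucket count. The exact clamping and zero handling are assumptions about the real implementation.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t k_histogramBuckets = 32;

// log2-spaced bucket index: msb position of the tick count, clamped so
// huge outliers land in the last bucket; bucket 0 holds the smallest
// measurable values (0 and 1 tick).
inline std::size_t bucketIndex(std::uint64_t ticks) noexcept {
    if (ticks == 0) return 0;
    std::size_t msb = 63u - static_cast<std::size_t>(__builtin_clzll(ticks));
    return msb < k_histogramBuckets ? msb : k_histogramBuckets - 1;
}
```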
| constexpr std::size_t group2::perf::k_maxScopes = 64 |
inline constexpr
Compile-time caps.
Both caps bound the size of the global stats table: PerScopeStats is ~512 B per scope, so the full 64-scope table is ~32 KiB, small enough for hot scopes to coexist in L2.
| constexpr std::size_t group2::perf::k_parallelThreshold = 64 |
inline constexpr
Minimum items below which parallelFor runs sequentially even when the master switch is on.
Avoids paying TBB dispatch overhead for trivially-small work where sequential is faster.
| std::atomic< bool > group2::perf::parallelEnabled {true} |
inline
Master switch for parallel execution.
Defaults ON as of PR-8. Earlier benches (idle-bot loadtest, pre-PR-7) suggested defaulting off because the synthetic test's per-item work was too small. With AI bots actually moving + PR-7 (collision/movement parallel) + PR-8 (per-component-type parallel serialization) the per-item work is meaningful and the 16-core box pays clear dividends:
N=100, AI: tick p99 1.57 ms (off) → 0.39 ms (on)
N=300, AI: tick p99 12 ms (off) → 1.57 ms (on)
N=500, AI: tick p99 50+ ms (off) → 3.15 ms when OS gives CPU
Below the k_parallelThreshold element-count, parallelFor short-circuits to sequential anyway, so small inputs still win.
Kill switch: GROUP2_SERVER_PARALLEL=0 flips back to sequential without rebuilding — useful for diff bisection if a regression appears.