Professional Services Automated Build & Test

A layered testing methodology for a real-time game

Call of Orion is a Python/Arcade space-survival game with thousands of moving parts — physics, combat AI, pathfinding, save/load, and a live render loop. Keeping it stable takes a deliberate test pyramid: a wide base of fast unit tests, a middle tier of headless integration and performance gates that assert real frame-rate thresholds, and a narrow top of multi-minute soak runs — all bug-focused and green before anything merges.


2,767	Fast unit tests
466	Integration & soak
3,233	Total (zero failures)
~2.5 min	Fast-suite runtime

Why a pyramid

A game's logic and its render loop fail in different ways, so they need different tests. Pure logic (damage routing, inventory math, A* pathfinding, save round-trips) is covered by a fast suite of 2,767 unit tests that runs in about 2.5 minutes and pinpoints regressions precisely. Anything that depends on a real Arcade window (frame timing, GPU rendering, full-scene behaviour) moves up into headless integration and performance tests. And slow-burn problems like memory growth or frame-rate decay only show up under sustained load, so they get their own soak tier.

The test pyramid

Three tiers plus a lint gate: many fast, focused tests at the base; a few broad, slow ones at the top.

What each tier covers

Tier	Scope	What it proves
Fast unit	Isolated logic — no window	Player physics, weapons & melee arcs, asteroids, alien AI, pickups, blueprints, shields, damage routing, buildings, ship modules & AI-pilot behaviour, drones & A* pathing, inventory math, fog of war, and save/restore round-trips all behave exactly as specified.
Integration + performance	Real Arcade window (headless)	Full-frame FPS holds above threshold across all three zones, trade and combat scenes, AI-pilot fleets and station shields; GPU rendering microbenchmarks and all six resolution presets stay within budget.
Soak / endurance	5-minute sustained sessions	FPS and resident memory (RSS) stay flat over time — no leaks, no frame-rate decay — across idle, combat churn, dialogue, station-shield cycles, and Star-Maze pressure.
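
As an illustration of what "stay flat over time" can mean in practice, here is a minimal sketch of a flatness check a soak test might apply to FPS and RSS samples collected during a session. The function names, the 20% comparison windows, and the 10% tolerances are assumptions made for this example, not values from the Call of Orion suite.

```python
from statistics import mean


def window_means(samples, fraction=0.2):
    """Return (mean of the first slice, mean of the last slice) of a series."""
    n = max(1, int(len(samples) * fraction))
    return mean(samples[:n]), mean(samples[-n:])


def rss_is_flat(rss_samples, max_growth=0.10):
    """True if late-run memory is at most max_growth above early-run memory."""
    early, late = window_means(rss_samples)
    return late <= early * (1.0 + max_growth)


def fps_is_flat(fps_samples, max_decay=0.10):
    """True if late-run FPS is at most max_decay below early-run FPS."""
    early, late = window_means(fps_samples)
    return late >= early * (1.0 - max_decay)
```

Comparing the start of the run with the end, rather than asserting on a single sample, keeps one noisy frame or a brief allocation spike from failing a five-minute session, while a steady leak or decay still trips the check.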

Fast by default, slow on demand. The default pytest run executes only the fast unit suite; the window-bound integration and soak tests are opt-in, because a shared Arcade window pollutes other tests' window-size math and each one is comparatively slow. Developers get a tight feedback loop locally; the full multi-hour suite runs as the pre-merge gate.
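
One common way to get "fast by default, slow on demand" in pytest is a conftest.py that skips marked tests unless a command-line flag is given. The sketch below follows that pattern; the marker names (integration, soak) and the flags (--run-integration, --run-soak) are hypothetical, since the project's actual configuration is not shown here.

```python
import pytest

# Hypothetical tier markers and the flags that enable them.
OPT_IN_TIERS = {
    "integration": "--run-integration",
    "soak": "--run-soak",
}


def pytest_addoption(parser):
    # Register one boolean flag per opt-in tier.
    for tier, flag in OPT_IN_TIERS.items():
        parser.addoption(flag, action="store_true", default=False,
                         help=f"also run tests marked '{tier}'")


def pytest_configure(config):
    # Declare the markers so pytest does not warn about unknown marks.
    for tier in OPT_IN_TIERS:
        config.addinivalue_line("markers", f"{tier}: opt-in {tier} tier test")


def pytest_collection_modifyitems(config, items):
    # Skip every test in a tier whose flag was not passed.
    for tier, flag in OPT_IN_TIERS.items():
        if config.getoption(flag):
            continue
        skip = pytest.mark.skip(reason=f"needs {flag}")
        for item in items:
            if tier in item.keywords:
                item.add_marker(skip)
```

With something like this in place, a plain pytest run executes only unmarked tests, and a test joins a slower tier by carrying @pytest.mark.integration or @pytest.mark.soak.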

The quality gate

Lint → fast unit → integration/performance → soak. Each result is recorded; only an all-green run merges.
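
The gate sequence can be sketched as a small runner that executes each stage in order and records its result. This is an illustration only: the stage commands in EXAMPLE_STAGES are placeholders (the marker names are hypothetical), and stopping at the first failing stage is an assumption of this sketch rather than documented behaviour of the real gate.

```python
import subprocess
import time

# Placeholder commands; the project's real invocations are not given.
EXAMPLE_STAGES = [
    ("lint", ["ruff", "check", "."]),
    ("fast unit", ["pytest"]),
    ("integration/performance", ["pytest", "-m", "integration"]),
    ("soak", ["pytest", "-m", "soak"]),
]


def run_gate(stages, runner=subprocess.run):
    """Run (name, argv) stages in order; stop at the first failure.

    Returns a list of (name, passed, seconds) for the stages that ran.
    """
    results = []
    for name, argv in stages:
        start = time.monotonic()
        passed = runner(argv).returncode == 0
        results.append((name, passed, time.monotonic() - start))
        if not passed:
            break  # assumption: later, slower tiers are not worth running
    return results


def all_green(results, stages):
    """True only if every stage ran and every stage passed."""
    return len(results) == len(stages) and all(ok for _, ok, _ in results)
```

Calling run_gate(EXAMPLE_STAGES) and then all_green(...) on the result gives a single merge/no-merge answer, while the returned list keeps the per-stage outcomes and durations for the written record.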

Linting is treated as a bug gate, not style policing: the rule set is deliberately narrow, targeting the failure classes that have actually caused crashes (undefined names, variables used before assignment, mutable default arguments, and loop-variable closure bugs) without drowning the signal in whitespace nits. Every full cycle is written up with totals, durations, and any anomalies, so the suite's health is auditable over time.
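
Two of these failure classes are easy to miss in review because the code looks correct. The sketch below shows each bug next to its usual fix; the function names are invented for illustration and do not come from the game's code. In Ruff and flake8-bugbear these patterns are reported as B006 and B023, though which linter the project uses is not stated here.

```python
def add_item_buggy(item, bag=[]):
    # Mutable default: the same list object is reused on every call.
    bag.append(item)
    return bag


def add_item_fixed(item, bag=None):
    # Use None as the sentinel and build a fresh list per call.
    if bag is None:
        bag = []
    bag.append(item)
    return bag


def make_callbacks_buggy():
    # Each lambda closes over the variable i, not its value at that moment,
    # so all of them see the final value once the loop has finished.
    callbacks = []
    for i in range(3):
        callbacks.append(lambda: i)
    return callbacks


def make_callbacks_fixed():
    # Binding i as a default argument captures the value per iteration.
    callbacks = []
    for i in range(3):
        callbacks.append(lambda i=i: i)
    return callbacks
```

In the buggy versions, a second call to add_item_buggy still sees the item from the first call, and every callback returns 2; neither raises an error at the point where the mistake was made, which is why a static check is a better fit than a runtime test.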

Want a test suite that catches regressions before your users do?

Layered, bug-focused testing — from fast unit checks to performance and endurance gates — built for whatever you ship.

keith.estanol@rpg4you.com