SYNTHETIC SURVEILLANCE
AMBIENT.AI
AI-generated CCTV footage, built to demo and stress-test behavioral security-detection models.
A series of simulated CCTV “incident sequence” reels built to demonstrate Ambient.ai’s behavioral-detection platform. A single threat actor moves through an escalating breach across five surveillance stages — reconnaissance, perimeter breach, transit, entry, and interior climax — each captured as if by a different security camera on one coherent site, and each catching a behavioral precursor before the threat reaches an occupied area. The implicit promise: with the system deployed, the final stage never happens. Every frame is AI-generated, letting us stage scenarios that would be dangerous, costly, or impossible to film with real actors and cameras — produced safely, with no one ever at risk. The pipeline combined tightly art-directed image generation, manual image editing, AI video generation, compositing, masking, and effects work.
The flagship “Planned Breach”: one threat actor across five cameras — vehicle-gate recon, a badge-reader tailgate, the secured wing, the executive suite entry, then the interior climax.
The wider camera network across the corporate campus.
“The Campus Intrusion”: street-side recon, a weak fence junction, and transit toward an occupied wing.
The surrounding campus cameras.
Galleries, exhibit storage, and after-hours interiors for cultural institutions.
Server floors and the interior infrastructure that surrounds them.
Factory floors, raw-materials storage, and loading docks.
Control rooms, access gates, and hardened perimeters for critical infrastructure.
Hospital-site cameras — entrance, pharmacy access corridors, parking structure, and restricted clinical areas.
Banking-hall, ATM, and lot environments.
Every reel is fully synthetic — no cameras, no actors, no one ever at risk. Getting footage this directed out of today's models meant treating generative AI like a film set: a tightly art-directed, multi-stage pipeline with a human craftsman in the loop at every step.
Stills were generated with GPT Image 2, then art-directed and retouched with Nano Banana Pro and Nano Banana 2. We used Photoshop for frame-level manual editing and After Effects to composite scenes together and build the surveillance look: timecode overlays, sensor grain, and the lens character that sells each shot as genuine CCTV.
Video generation at this level of direction is hard: the models still aren't fully steerable, and pulling a specific action, blocking, and camera angle out of them takes real persistence. We approached it like directing a shoot: deliberate prompting, many takes, and relentless selection until the behavior on screen matched the brief.
Each shot began as contact sheets — multiple generated options to lock the framing, lighting, and look before any motion. From the chosen direction we moved to a first-pass animation, then a refined motion pass to dial in timing, pacing, and the precise behavioral beat each stage needed to read. A few of those option boards:




Generations rarely came out clean. When a take nailed one element but broke another, we cut the usable parts from multiple generations, masked out artifacts and inconsistencies, and rotoscoped key elements to keep the subject reading clearly at the top of frame — a frame-by-frame finishing pass layered on top of the AI.
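For readers curious about the idea behind that finishing pass, here is a minimal sketch of mask-based compositing: a per-pixel mask decides which of two takes contributes to each pixel. It is an illustration only. The actual work was done by hand in After Effects and Photoshop, and the names `composite`, `take_a`, `take_b`, and `mask` are hypothetical, as is the tiny grayscale frame format.

```python
from typing import List

# Hypothetical frame format: rows of grayscale pixel values in [0, 1].
Frame = List[List[float]]


def composite(take_a: Frame, take_b: Frame, mask: Frame) -> Frame:
    """Blend two takes per pixel.

    A mask value of 1.0 keeps the pixel from take_a, 0.0 keeps the pixel
    from take_b, and values in between feather the edge between them.
    """
    if not (len(take_a) == len(take_b) == len(mask)):
        raise ValueError("frames and mask must have the same height")
    out: Frame = []
    for row_a, row_b, row_m in zip(take_a, take_b, mask):
        if not (len(row_a) == len(row_b) == len(row_m)):
            raise ValueError("frames and mask must have the same width")
        out.append([m * a + (1.0 - m) * b
                    for a, b, m in zip(row_a, row_b, row_m)])
    return out


# Assumed example data: take_a has the usable subject, take_b the clean background.
take_a = [[0.8, 0.8], [0.8, 0.8]]
take_b = [[0.2, 0.2], [0.2, 0.2]]
mask = [[1.0, 0.0], [0.5, 0.0]]  # 0.5 is a feathered edge pixel

result = composite(take_a, take_b, mask)
```

In practice the mask itself is where the labor goes: rotoscoping is essentially drawing that mask by hand, frame after frame, so the subject stays cleanly separated from whatever sits behind it.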
The escalation sequences follow a single threat actor across five different cameras, lighting conditions, and environments. Holding that identity consistent — wardrobe, build, and demeanor from the perimeter all the way to the interior climax — was its own discipline, solved through reference-driven editing and careful continuity at every stage.



