Case study

Live camera streaming for multiple stores with Nginx, FFmpeg, and Supervisord

Sometimes the right streaming platform is not really a platform.

This project started with a practical requirement: provide live camera feeds from multiple supply stores to internal staff through separate web cabinets, without introducing a heavy video stack, without recording, and without unnecessary abstractions.

The goal was simple:

  • connect multiple store locations
  • expose live camera feeds in a browser
  • start and stop streams on demand from the backend
  • keep the system operationally simple
  • leave room to add more stores later

The final setup used a minimal stack:

  • FFmpeg for ingest
  • Nginx as the RTMP and HLS layer
  • Supervisord to start streaming processes on demand
  • HLS for browser playback
  • RTMP as the intermediate transport layer

No transcoding was needed. The cameras already produced streams in the required codec, so the system could stay focused on transport and delivery rather than video processing.

Requirements

The initial rollout covered:

  • 4 store locations
  • 2 cameras per location
  • 8 camera feeds total

The number of connected stores is expected to grow.

From the start, the system needed to support:

  • multiple separate store contexts
  • live-only streaming
  • backend-controlled stream lifecycle
  • browser-friendly playback
  • minimal operational overhead

Recording was explicitly out of scope for this phase. This helped keep the architecture smaller and more predictable.

Why this approach

There was no need for a full video platform.

The workload did not require:

  • transcoding into multiple bitrate ladders
  • long-term storage
  • DVR playback
  • complex access federation
  • large-scale public distribution

What it needed was a reliable, understandable path from camera feed to browser playback.

That made a lightweight stack a better fit than a heavier media platform.

The design principle was straightforward:

solve the actual delivery problem, not the hypothetical future platform problem.

Architecture

At a high level, the flow looks like this:

  flowchart LR
    A[Store cameras] --> B[FFmpeg ingest]
    B --> C[Nginx RTMP]
    C --> D[HLS output]
    D --> E[Browser clients]
    F[Backend] --> G[Supervisord]
    G --> B

Components

FFmpeg

FFmpeg is responsible for ingesting the camera feed and pushing it into the streaming pipeline.

In this setup it is used as a practical transport tool, not as a transcoding engine. Since all cameras already output in a compatible codec, there is no need to spend CPU on re-encoding.

That simplifies the system significantly:

  • lower CPU usage
  • fewer moving parts
  • less latency introduced by processing
  • easier operational reasoning
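As a sketch, a single ingest job under these assumptions (the camera URL, credentials, and stream name are illustrative placeholders) might look like:

```shell
# Relay one RTSP camera feed into the local Nginx RTMP endpoint.
# -c copy forwards the camera's existing codec without re-encoding.
ffmpeg -rtsp_transport tcp \
  -i "rtsp://user:pass@camera-host:554/stream1" \
  -c copy \
  -f flv "rtmp://localhost/live/store1_cam1"
```

The `-c copy` flag is what keeps this a transport tool rather than a transcoding engine: FFmpeg remuxes the stream into FLV for RTMP without touching the video itself.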

Nginx

Nginx handles two roles here:

  • RTMP as the intermediate streaming layer
  • HLS output for browser playback

Each stream is written into its own HLS directory based on stream name. That keeps output separation clear and makes it easier for the backend or frontend layer to map streams to the correct store or cabinet.

This structure also keeps delivery predictable:

  • one stream name
  • one output path
  • one browser-facing playback target
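A minimal nginx.conf sketch for this layout, assuming the nginx-rtmp-module (the paths, ports, and application name are illustrative, not the project's actual values):

```nginx
rtmp {
    server {
        listen 1935;
        application live {
            live on;
            record off;                 # live-only, no recording
            hls on;
            hls_path /var/www/hls;
            hls_nested on;              # one directory per stream name
            hls_fragment 4s;
            hls_playlist_length 20s;
        }
    }
}

http {
    server {
        listen 8080;
        location /hls {
            types {
                application/vnd.apple.mpegurl m3u8;
                video/mp2t ts;
            }
            root /var/www;
            add_header Cache-Control no-cache;  # playlists must not be cached
        }
    }
}
```

With `hls_nested on`, a stream published as `store1_cam1` ends up under its own directory, giving the frontend one predictable playback target per stream.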

Supervisord

Streams are not meant to run all the time.

They are started and stopped from the backend on demand, and Supervisord is used as the process manager that actually launches the FFmpeg jobs.

This is an important part of the design.

Instead of building a larger orchestration layer too early, the project uses a simple and proven process supervisor:

  • backend decides when a stream should start
  • Supervisord launches the corresponding process
  • FFmpeg ingests and pushes into Nginx
  • browser clients consume HLS output

This keeps control logic simple and avoids turning stream lifecycle into a separate infrastructure problem.
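A Supervisord program definition for one stream could be sketched like this (the program name, command, and camera URL are placeholders):

```ini
; One supervised program per camera stream.
[program:stream_store1_cam1]
command=ffmpeg -rtsp_transport tcp -i rtsp://camera-host/stream1 -c copy -f flv rtmp://localhost/live/store1_cam1
autostart=false        ; the backend decides when the stream starts
autorestart=true       ; restart FFmpeg if it exits unexpectedly
stopsignal=TERM
```

`autostart=false` is the key line: the process exists as a definition, but it only runs when the backend asks for it.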

Why RTMP plus HLS

This combination exists for a reason.

RTMP as an intermediate layer

RTMP works well as a stable internal handoff format between ingest and delivery layers.

In this use case it provides a practical intermediate transport path:

  • easy for FFmpeg to publish
  • easy for Nginx to receive
  • simple to structure around stream names
  • good fit for internal relay architecture

The point here is not modernity. It is usefulness.

HLS for browser playback

Browsers do not consume RTSP or RTMP directly in a practical, portable way for this kind of internal web workflow.

HLS gives the frontend a delivery format that is much easier to work with in browser-based cabinets.

That means staff can access streams through the existing web interface without requiring special desktop software or direct camera connectivity.
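On the cabinet side, playback can be sketched as a standard HLS setup: hls.js where Media Source Extensions are needed, and the native player on Safari. The playlist URL and element id are illustrative.

```html
<!-- Minimal HLS playback sketch for a cabinet page. -->
<video id="cam" controls muted></video>
<script src="https://cdn.jsdelivr.net/npm/hls.js@1"></script>
<script>
  const url = "/hls/store1_cam1/index.m3u8";
  const video = document.getElementById("cam");
  if (window.Hls && Hls.isSupported()) {
    const hls = new Hls();
    hls.loadSource(url);
    hls.attachMedia(video);
  } else {
    video.src = url; // Safari plays HLS natively
  }
</script>
```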

Why no transcoding

This was one of the most important simplifications.

All cameras were the same model family and already emitted video in the required codec. Because of that, the stack did not need to spend compute on format conversion.

This changes the whole system profile.

Without transcoding:

  • CPU is no longer the main concern
  • network becomes the dominant resource
  • stream startup stays simpler
  • operational footprint stays smaller
  • capacity planning is easier
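Because the relay copies streams rather than re-encoding them, capacity planning reduces to simple bandwidth arithmetic. A back-of-envelope sketch, where the per-stream bitrate and viewer counts are assumptions, not measured values:

```python
# Bandwidth estimate for a stream-copy relay: CPU is negligible,
# so inbound and outbound network throughput are the limits to plan for.
STREAMS = 8             # 4 stores x 2 cameras
BITRATE_MBPS = 4.0      # assumed average bitrate of one HD H.264 feed

ingest_mbps = STREAMS * BITRATE_MBPS            # cameras -> relay, all streams active

VIEWERS_PER_STREAM = 3                          # assumed concurrent staff viewers
egress_mbps = ingest_mbps * VIEWERS_PER_STREAM  # relay -> browsers

print(f"ingest ~{ingest_mbps} Mbps, egress ~{egress_mbps} Mbps")
```

Adding a store changes these numbers linearly, which is exactly why growth here is a throughput question rather than an architecture question.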

That was exactly the right tradeoff for this project.

The system is meant to relay live video efficiently, not to transform it.

Backend-controlled lifecycle

Another practical requirement was that streams should not run permanently.

Streams are enabled and disabled from the backend as needed through the cabinet interface. This matters for both operational control and resource usage.

That model makes sense when:

  • not all feeds need to be active all the time
  • users open streams on demand
  • infrastructure should stay focused on real usage
  • process lifecycle should remain tied to application logic

In this design, the application stays in charge of intent, and Supervisord stays in charge of execution.
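That split can be sketched with Supervisord's XML-RPC interface, which exposes `startProcess` and `stopProcess`. The socket URL, program naming scheme, and store/camera identifiers below are assumptions for illustration:

```python
# Backend-side stream control: the application expresses intent,
# Supervisord executes it by starting or stopping the FFmpeg process.
from xmlrpc.client import ServerProxy

SUPERVISOR_URL = "http://127.0.0.1:9001/RPC2"  # assumes [inet_http_server] is enabled


def program_name(store_id: int, camera_id: int) -> str:
    """Map a store/camera pair to its Supervisord program name."""
    return f"stream_store{store_id}_cam{camera_id}"


def start_stream(store_id: int, camera_id: int) -> None:
    """Ask Supervisord to launch the FFmpeg job for this camera."""
    server = ServerProxy(SUPERVISOR_URL)
    server.supervisor.startProcess(program_name(store_id, camera_id))


def stop_stream(store_id: int, camera_id: int) -> None:
    """Ask Supervisord to terminate the FFmpeg job for this camera."""
    server = ServerProxy(SUPERVISOR_URL)
    server.supervisor.stopProcess(program_name(store_id, camera_id))
```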

That separation is simple and useful.

Scaling beyond the first stores

The current stage is small enough to keep the system easy to reason about:

  • 4 stores
  • 8 cameras
  • HD feeds
  • live-only access

But the structure is intentionally ready for more stores.

The important thing is that growth here is mostly about:

  • more stream definitions
  • more managed FFmpeg processes
  • more network throughput
  • continued discipline in naming and stream isolation
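Since growth is mostly "more stream definitions," onboarding a store can stay mechanical: generate the Supervisord sections from an inventory. A sketch, with field names and the RTMP endpoint as assumptions:

```python
# Generate one Supervisord program section per camera from an inventory entry,
# keeping stream naming and output separation consistent as stores are added.

def program_section(store_id: int, camera_id: int, rtsp_url: str) -> str:
    name = f"stream_store{store_id}_cam{camera_id}"
    command = (
        f"ffmpeg -rtsp_transport tcp -i {rtsp_url} "
        f"-c copy -f flv rtmp://localhost/live/{name}"
    )
    return (
        f"[program:{name}]\n"
        f"command={command}\n"
        "autostart=false\n"
        "autorestart=true\n"
    )
```

Because every stream follows the same naming convention, the HLS output path, the cabinet mapping, and the process definition all stay derivable from the same two identifiers.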

It is not yet a problem that requires a media platform rewrite.

This matters because many systems get overbuilt before they get used.

A minimal stack is often the more scalable decision at the beginning, because it keeps the operational model clear while actual usage patterns are still forming.

Why this was a good fit

This approach works well when the requirements look like this:

  • internal live streaming
  • browser playback
  • known camera format
  • no transcoding
  • no recording
  • moderate scale
  • clear backend ownership of stream lifecycle

It is especially useful when simplicity is a feature, not a compromise.

The stack remains easy to explain:

  • camera feed in
  • FFmpeg publishes
  • Nginx distributes
  • HLS is consumed in browser
  • Supervisord manages process lifecycle

That is a much easier system to operate than a large video platform introduced too early.

Where this approach does not fit

This design is not universal.

A different solution would make more sense if the system needed:

  • recording and retention
  • many codec or bitrate variants
  • large public audience distribution
  • advanced access control at the media layer
  • adaptive multi-bitrate packaging at scale
  • deeper analytics around playback quality

Those requirements push the system toward a different class of architecture.

That was not the case here.

Operational lessons

A few practical lessons from this setup are reusable.

1. Minimal is often enough

If cameras already emit the right codec and browser playback can be solved through HLS, there is no reason to add transcoding or a heavier media layer by default.

2. Backend control simplifies resource use

Starting streams on demand instead of running everything permanently helps keep the system closer to real user behavior.

3. Process supervision does not need to be complicated

Supervisord is a perfectly reasonable choice when what you need is controlled start/stop behavior for a known process model.

4. Browser playback changes the architecture

The main reason for HLS here is not theoretical protocol preference. It is practical web delivery.

5. Simplicity improves maintainability

A small, understandable stack is easier to extend than an overbuilt stack introduced before real constraints appear.

Deliverables

  • Architecture design for live-only camera streaming
  • Stream ingest setup with FFmpeg
  • RTMP relay layer with Nginx
  • HLS output structure per stream
  • Backend-controlled stream lifecycle
  • Supervisord-based process management
  • Per-store cabinet integration model
  • Stream naming and output separation strategy
  • Browser-compatible delivery path
  • Initial rollout for multiple stores and camera feeds

Delivery time

The initial implementation was delivered as a focused integration effort for the first production rollout.

The key output was not just stream transport itself, but a practical operating model:

  • stream start from backend
  • predictable per-stream output
  • browser-compatible delivery
  • minimal stack without a heavy video platform

Final takeaway

This project did not need a video platform.

It needed a working streaming path for real operations.

Using Nginx, FFmpeg, RTMP, HLS, and Supervisord, the result was a small and understandable system that:

  • serves live camera feeds in browser
  • supports multiple store locations
  • starts streams on demand
  • stays simple enough to operate and extend

That is often the better engineering choice.

Not the biggest stack.
The right one.