A Guide to Camera Video and Control Protocols

Historically, video cameras have presented an analogue, or at least wired, interface to deliver a video signal to a display or recorder. Dedicated wired interfaces are often carried on coaxial cable (e.g. analogue composite for standard definition or serial digital HD-SDI for higher-definition video imagery). However, many modern cameras are IP-capable, using standard networks (typically Ethernet) to transport video images from camera to viewer. The advantages of this approach are manifold.

Existing network infrastructure, especially if cabling and switches support at least Gigabit Ethernet (GbE), can be used, reducing installation costs. Video imagery can be made available to multiple consumers using established network techniques such as multicasting. Finally, a network-based architecture is inherently scalable, and can even work over greater distances than those covered by a local area network, using satellite links, for example.

 

Camera video

Transport

One of the most popular protocols for sending camera video over a network is the Real-Time Streaming Protocol (RTSP). Many cameras are RTSP-capable, as are widely available video streaming viewers such as VLC. RTSP allows the viewer to interrogate the camera for the media streams and formats it supports and subsequently to initiate video streaming.

Control messages and responses are handled via a conventional TCP connection, while the video stream itself, which uses RTP (Real-Time Transport Protocol), may be delivered by UDP unicast, UDP multicast or interleaved over the TCP connection. Quality-of-service monitoring is handled by the companion RTCP protocol, while sequence numbers, timestamps and the video payload itself are carried by RTP.
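To illustrate the control side of the exchange, the sketch below opens the RTSP TCP connection and issues OPTIONS and DESCRIBE requests, the latter returning an SDP description of the available streams. The camera address and stream path are hypothetical; substitute those of a real camera.

```python
import socket

# Hypothetical camera address and stream path; substitute your own.
CAMERA_HOST = "192.168.1.64"
RTSP_PORT = 554
STREAM_URL = f"rtsp://{CAMERA_HOST}:{RTSP_PORT}/stream1"

def rtsp_request(sock, method, url, cseq, extra_headers=""):
    """Send one RTSP request over the TCP control connection and return the reply."""
    request = (
        f"{method} {url} RTSP/1.0\r\n"
        f"CSeq: {cseq}\r\n"
        f"{extra_headers}"
        "\r\n"
    )
    sock.sendall(request.encode("ascii"))
    return sock.recv(4096).decode("ascii", errors="replace")

with socket.create_connection((CAMERA_HOST, RTSP_PORT), timeout=5) as sock:
    # OPTIONS: ask which RTSP methods the camera supports.
    print(rtsp_request(sock, "OPTIONS", STREAM_URL, cseq=1))
    # DESCRIBE: ask for an SDP description of the available media streams.
    print(rtsp_request(sock, "DESCRIBE", STREAM_URL, cseq=2,
                       extra_headers="Accept: application/sdp\r\n"))
```

A full client would go on to issue SETUP and PLAY requests to start the RTP delivery negotiated in the SDP; in practice a library or viewer such as VLC handles this exchange automatically.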

One essential feature supported by RTSP is the media name, which defines the characteristics of the video stream to be provided, or selects which sensor is to be used on a multi-sensor camera.
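The media name typically appears as the path component of the RTSP URL. The sketch below, using OpenCV's FFmpeg backend to handle the RTSP/RTP session, shows how different media names might select different sensors or stream qualities; the URLs and path names are hypothetical and vary by manufacturer.

```python
import cv2  # OpenCV's FFmpeg backend handles the RTSP/RTP session internally

# Hypothetical media names: a multi-sensor camera might expose separate paths
# for its daylight and thermal sensors, or for full- and reduced-resolution streams.
DAYLIGHT_URL = "rtsp://192.168.1.64:554/daylight/full"
THERMAL_URL = "rtsp://192.168.1.64:554/thermal/full"

capture = cv2.VideoCapture(DAYLIGHT_URL, cv2.CAP_FFMPEG)
if not capture.isOpened():
    raise RuntimeError("Could not open RTSP stream - check the URL and media name")

ok, frame = capture.read()  # frames arrive as decoded BGR images
if ok:
    print("Received frame with shape:", frame.shape)
capture.release()
```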

 

Codecs

The camera video payload is encoded according to published standards.  Encoding typically takes advantage of redundancy in video images, maintaining acceptable perceived video quality while significantly reducing the bandwidth required for transmission.  Bandwidth efficiency is particularly important when multiple video streams are to be delivered over shared local or especially wide area networks.  The most common coding schemes are Advanced Video Coding (AVC, or H.264/MPEG-4 Part 10) and its successor High Efficiency Video Coding (HEVC, or H.265/MPEG-H Part 2). 

Hardware support for both encoding and decoding in many graphics cards or integrated graphics means that the complex computations required can be efficiently implemented.
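As a quick way to confirm which codec a camera is actually delivering before committing decoder resources, the stream can be probed. A minimal sketch using ffprobe (part of FFmpeg), assuming it is installed and the hypothetical camera URL is reachable:

```python
import subprocess

# Hypothetical camera URL; substitute your own.
STREAM_URL = "rtsp://192.168.1.64:554/stream1"

# Ask ffprobe to report only the codec name of the first video stream.
result = subprocess.run(
    [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=codec_name",
        "-of", "default=noprint_wrappers=1:nokey=1",
        STREAM_URL,
    ],
    capture_output=True, text=True, timeout=30,
)

codec = result.stdout.strip()  # typically "h264" (AVC) or "hevc" (H.265)
print("Camera is delivering:", codec or "unknown")
```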

ONVIF

The Open Network Video Interface Forum (ONVIF) standard was developed by a forum of industry suppliers and is aimed at interoperability between cameras and software applications that display camera video.  As far as video streaming is concerned, it uses RTSP and is in that sense a superset of it.  However, it adds other facilities not included in RTSP, such as camera configuration and a discovery mechanism that eliminates the need for consumers of camera video to know the addresses of each camera and its supported media strings.  ONVIF compatibility is an important consideration when evaluating cameras for use in a multi-sensor system.
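The ONVIF discovery mechanism is built on WS-Discovery: a SOAP Probe message is multicast on the local network and each camera replies with its device service address. A minimal sketch of that probe is shown below, assuming the standard WS-Discovery multicast group 239.255.255.250 on port 3702; a production system would parse the ProbeMatch replies rather than simply printing the senders.

```python
import socket
import uuid

# WS-Discovery multicast group and port used for ONVIF device discovery.
MULTICAST_GROUP = ("239.255.255.250", 3702)

# Minimal WS-Discovery Probe asking for ONVIF NetworkVideoTransmitter devices.
probe = f"""<?xml version="1.0" encoding="UTF-8"?>
<e:Envelope xmlns:e="http://www.w3.org/2003/05/soap-envelope"
            xmlns:w="http://schemas.xmlsoap.org/ws/2004/08/addressing"
            xmlns:d="http://schemas.xmlsoap.org/ws/2005/04/discovery"
            xmlns:dn="http://www.onvif.org/ver10/network/wsdl">
  <e:Header>
    <w:MessageID>uuid:{uuid.uuid4()}</w:MessageID>
    <w:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</w:To>
    <w:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</w:Action>
  </e:Header>
  <e:Body>
    <d:Probe><d:Types>dn:NetworkVideoTransmitter</d:Types></d:Probe>
  </e:Body>
</e:Envelope>"""

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(3.0)
sock.sendto(probe.encode("utf-8"), MULTICAST_GROUP)

# Each ONVIF camera on the local network should answer with a ProbeMatch
# containing its device service address (XAddrs).
try:
    while True:
        reply, sender = sock.recvfrom(65535)
        print("ProbeMatch received from", sender[0])
except socket.timeout:
    pass
finally:
    sock.close()
```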

GigE Vision

Another widely used method for transporting camera video is GigE Vision, a standard which has its roots in industrial automation. It uses Ethernet as its transport medium but has its own methods for camera control and video encoding. It is a licensed, non-open standard and is less widely adopted than RTSP or ONVIF.

 

Camera control

Many cameras support a variable field of view (zoom), normally implemented as either optical-only or combined optical/digital zoom. Some cameras further include a pan/tilt platform (pan being a rotation in the horizontal plane, tilt a rotation in the vertical plane), allowing the camera’s field of view to be controlled and directed towards a specific object or location. Though these pan, tilt and zoom (PTZ) functions can, as a minimum, be controlled by the operator using a conventional joystick, it is often required that they be controllable by a third-party system. By such means, the camera can be sent, for example, through a sequence of zoom levels and directions (a camera tour) or can be directed towards a specific object (slew-to-cue), the object having potentially been detected by another sensor such as a direction finder, acoustic detector, radar or scanning camera.
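To make slew-to-cue concrete, the sketch below converts a cue position reported by another sensor into the pan and tilt angles the camera must adopt. It assumes a simple local east/north/up frame centred on the camera; a real installation would also account for mounting offsets and alignment errors.

```python
import math

def slew_to_cue(east_m: float, north_m: float, up_m: float) -> tuple[float, float]:
    """Convert a cue position (metres, in a local east/north/up frame centred on
    the camera) into the pan and tilt angles needed to point at it.

    Returns (pan, tilt) in degrees: pan is a clockwise bearing from north,
    tilt is positive above the horizontal."""
    pan = math.degrees(math.atan2(east_m, north_m)) % 360.0
    horizontal_range = math.hypot(east_m, north_m)
    tilt = math.degrees(math.atan2(up_m, horizontal_range))
    return pan, tilt

# Example: a radar plot 1200 m east, 800 m north and 30 m above the camera.
pan_deg, tilt_deg = slew_to_cue(1200.0, 800.0, 30.0)
print(f"Slew camera to pan {pan_deg:.1f} deg, tilt {tilt_deg:.1f} deg")
```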

Pelco-D

Camera control interfaces are normally network-based, though some legacy implementations may use serial communications. The Pelco-D camera control protocol is one such legacy protocol, widely used by PTZ cameras in security and surveillance applications. Reflecting its serial-line origins, it is a very compact protocol, requiring only a few bytes to be sent to the camera to initiate a specific action. However, Pelco-D commands can just as well be sent across a network interface as short packets as across a serial interface.
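As an illustration of how compact the protocol is, the sketch below builds the standard seven-byte Pelco-D frame (sync byte, camera address, two command bytes, two data bytes and a checksum) for a "pan right" command. Sending it over UDP to port 4001 is purely illustrative; the address, port and transport depend on the camera or serial-to-network converter in use.

```python
import socket

def pelco_d_frame(address: int, command1: int, command2: int,
                  data1: int, data2: int) -> bytes:
    """Build a 7-byte Pelco-D frame: sync, address, cmd1, cmd2, data1, data2, checksum.
    The checksum is the low byte of the sum of bytes 2-6."""
    body = bytes([address, command1, command2, data1, data2])
    checksum = sum(body) & 0xFF
    return bytes([0xFF]) + body + bytes([checksum])

# "Pan right" for camera address 1: command2 bit 1 set, pan speed 0x20 (mid-range).
frame = pelco_d_frame(address=0x01, command1=0x00, command2=0x02,
                      data1=0x20, data2=0x00)
print(frame.hex(" "))  # ff 01 00 02 20 00 23

# Illustrative transport only: many installations wrap Pelco-D in UDP or TCP
# via a serial-to-network converter; the address and port here are hypothetical.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(frame, ("192.168.1.64", 4001))
sock.close()
```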
