To IP or not to IP: Lessons from AoIP broadcast installations

Kypros Christodoulides
Technical Sales & Support, SC Media Canada

The Client’s Dilemma

The most common question I get asked by broadcast project leaders planning new audio installations is whether they should choose a “cutting edge” Audio-over-IP mixing console or a more “traditional” one. In terms of Calrec consoles, this is basically a choice between…

  a)  A latest-generation IP Core with native AES67/ST2110-30 I/O routing, or…

  b)  A “tested-and-true” Bluefin2 Core with a Hydra2 I/O network.

This blog entry summarizes my personal insights into this critical decision, and it includes some of the lessons I learned from large-scale AoIP broadcast installations.

Many Possibilities

The evolution of networked audio I/O systems can be viewed as an accelerating progression from simple point-to-point interconnects (Analog, then AES3, then MADI) to more sophisticated I/O-sharing systems. The latter started out as proprietary solutions developed by individual manufacturers, and were based on either Layer 2 (“Ethernet”, e.g. CobraNet, EtherSound, Hydra2, etc.) or Layer 3 (“IP”, e.g. Livewire, Dante, etc.) of the OSI model. To achieve interoperability between devices from a broad range of manufacturers, the IP-based solutions are currently converging on non-proprietary open standards like AES67 and ST2110-30.

Every modern broadcasting system is bound to include several of these interconnect types, but the main system architecture is typically designed around one. While Calrec’s AoIP systems are fully compliant with AES67 and ST2110-30, our recent deployments of both AoIP and “traditional” systems have also included Dante interfaces (primarily for intercom links), MADI links (for older equipment), and Waves SoundGrid I/Os (for real-time effects processing on external PCs).

The Lure of IP

IP-based solutions for broadcast have been gaining momentum for many good reasons.

The use of “off-the-shelf” IP components (network switches, SFPs, etc.) offers major economies of scale, increases a system’s versatility, and allows devices from many manufacturers to exchange audio and video streams in a flexible manner.

The speed and density of standard IP components have grown exponentially over the years, while the cost of these components keeps falling. We are now at a point where passing multiple ultra-high-definition video signals and full-resolution immersive audio through IP networks is technically very achievable and not particularly expensive.

Aided by the emergence of standards like PTP (Precision Time Protocol, for sample-accurate synchronization of audio and video devices), and NMOS IS-04 / IS-05 (standardized device discovery and connection management), IP systems based on AES67/ST2110 are well on the way to becoming a dominant technology in broadcast.

Many IP-based processing engines (e.g. Calrec’s ImPulse IP core) also offer new advantages that were absent from earlier generations:

  • they can be physically separated from the control surfaces and from the I/O boxes (since control and audio travel over standard IP),

  • they can be fully virtualized (physical control surfaces are not always necessary),

  • they can drive multiple independent surfaces at the same time, and…

  • they pack a lot more processing power than earlier generations.

    For example, Calrec’s ImPulse Core can drive up to 4 independent Apollo or Artemis consoles at the same time, each able to mix 1122 input channels with full processing!

Traditional Console Advantages

On the other side of the coin, “traditional” mixing consoles still have advantages that make them appropriate, and even preferable, in certain applications. Some proprietary I/O networking protocols (e.g. Calrec’s Hydra2) are mature technologies that guarantee…

  • ease of use,

  • reliable auto-discovery,

  • very low latency,

  • fast deployment, and…

  • straightforward diagnostics.

    These are the advantages inherent to a fully deterministic Layer 2 network built and controlled by a single manufacturer.

    To interface with IP devices where necessary, Calrec’s H2-IP Gateway units can translate Hydra2 to/from AES67/ST2110-30 audio streams (including label transfer and mic preamp controls). And Modular I/O frames on the Hydra2 network can include I/O cards with a multitude of other formats such as MADI, Dante, SDI, etc. This is illustrated on the diagram below.

    Furthermore, stand-alone systems (e.g. OB trucks) typically don’t benefit from the same economies of scale as large broadcast centres. Traditional consoles often prove more cost-effective in these applications.

Lessons Learned

For anyone with a traditional audio engineering background (such as myself), AoIP systems present a new challenge and a steep learning curve. Having commissioned both traditional and AoIP consoles over the years, I’ve gained some insights into the real-world challenges involved. The most poignant lessons came from deploying large-scale AoIP systems with multiple mixing consoles in full ST2110 environments (including both audio and video essences):

  • Stable PTP is the most critical factor to getting any AoIP device to work right. This can be a challenge when the Grand Master Clock generator and PTP distribution through the primary and secondary networks are managed by other teams (e.g. Video, IT).
  • Not all devices support audio streams with high channel counts. Calrec AoIP equipment supports up to 80 channels of uncompressed 24-bit/48kHz audio per stream, but some manufacturers don’t even support 8-channel streams! This becomes an issue when you’re trying to transfer a decent number of audio channels between devices: it takes a lot of time to create and manage a large number of small streams! It’s an operational challenge as well, because operators are forced to deal with a huge routing matrix with hundreds of tiny streams instead of a smaller routing matrix with just a handful of high-channel-count streams.
  • Not all devices support 0.125ms packet times (ST2110-30 conformance levels B and C)! This one was a big surprise, since 0.125ms packets are the “de facto” standard in broadcast. Sure, 1ms packets (conformance level A) will work, but they have much higher latency and can only transfer 8 audio channels per stream.
  • Signal-flow thinking doesn’t apply to AoIP system diagnostics. I’ve been well aware of this fact for some time now, but undoing decades of practice doesn’t come easy: during a recent AoIP project I caught myself using a “good signal path” strategy to diagnose a transmission problem between devices (this doesn’t work in the IP world!).
  • Audio-only IP installations (AES67) are easier than Audio-and-Video-over-IP (ST2110). These standards are practically interchangeable from the audio point of view, but in ST2110 systems the most critical system components (e.g. PTP GMs and network switches) are often dictated and controlled by the Video and IT teams.
  • Sometimes a problem is outside of your control. Example: one persistent issue I encountered in a recent ST2110 installation turned out to be a stuck IGMP subscription on one of the client’s network switches. It took an IT specialist to solve this one.
  • Solving IP system problems takes time!
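
The channel-count and packet-time points above come down to simple arithmetic, which the short Python sketch below works through. It assumes 24-bit linear PCM (3 bytes per sample) at 48kHz, as quoted above. The function names and the 256-channel transfer are hypothetical, chosen only for illustration. An 80-channel stream at 0.125ms carries a 1440-byte audio payload, which still fits inside a standard 1500-byte Ethernet frame once roughly 40 bytes of IPv4, UDP and RTP headers are added. That this is where the 80-channel figure comes from is my own reading, not something stated by the manufacturer.

```python
import math

SAMPLE_RATE_HZ = 48_000   # sample rate quoted in the text
BYTES_PER_SAMPLE = 3      # 24-bit linear PCM


def samples_per_packet(packet_time_ms: float) -> int:
    """Audio samples per channel carried in one packet."""
    return round(SAMPLE_RATE_HZ * packet_time_ms / 1000)


def payload_bytes(channels: int, packet_time_ms: float) -> int:
    """Size of the raw audio payload of one packet (headers excluded)."""
    return channels * samples_per_packet(packet_time_ms) * BYTES_PER_SAMPLE


def streams_needed(total_channels: int, max_channels_per_stream: int) -> int:
    """Number of streams required to move total_channels between devices."""
    return math.ceil(total_channels / max_channels_per_stream)


if __name__ == "__main__":
    # Hypothetical transfer of 256 channels between two devices.
    total = 256
    for label, chans, ptime in [("80 ch @ 0.125 ms", 80, 0.125),
                                ("8 ch @ 1 ms", 8, 1.0)]:
        print(f"{label}: {payload_bytes(chans, ptime)} payload bytes/packet, "
              f"{streams_needed(total, chans)} streams for {total} channels")
```

With these assumptions, the same 256 channels need 4 streams on equipment that handles 80 channels per stream, but 32 streams on equipment limited to 8 channels per stream. This is the difference between a compact routing matrix and a sprawling one.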


In the final analysis, my answer to the Client’s Dilemma depends on who is asking the question:

From a Management perspective, AoIP solutions make perfect sense now, especially for large-scale systems involving multiple audio consoles and other equipment from diverse manufacturers. The resulting systems are more versatile, scalable, and “future-proof” than traditional ones, and the infrastructure cost decreases with growing system size and with time.

From an Engineering perspective, AoIP solutions are significantly more challenging to design, deploy, and troubleshoot than traditional systems. They require careful planning, close coordination with IT departments, sophisticated diagnostic equipment, and (for those with traditional broadcast audio backgrounds) a lot of learning. I generally advise clients to allow 3x more time to deploy an AoIP system, and this may be an underestimate if video is also involved.

From an Operational perspective, AoIP solutions are a mixed bag. They certainly offer additional flexibility: creating audio streams is like running “virtual MADI fibers” from device to device! But they also complicate operation in some ways: the audio streams must be set up, labelled, and organized to be operational, and because they are virtual, they can change from day to day and from show to show. Most audio operators are comfortable with patching audio channels inside large pipelines (e.g. MADI), but with AoIP the pipelines themselves can be created, patched, and deleted on the fly. This added versatility brings a layer of complexity (and extra work) that some operators may be hesitant to embrace.