Professional audio over digital networks

Choice of network technology

The network has to ensure that the audio samples transmitted by the source are delivered to the correct destination, in the correct order, and in a timely fashion.

The following is an overview of the three ways of doing that which were in common use at the beginning of this century, followed by our new Flexilink technology. Each of the three is headed by the name used in academic research papers in the mid-1980s, with an alternative description in parentheses.

Synchronous transfer mode (stream-based)

This is the technology used for digital telephony (ISDN).

The network is star-connected, i.e. it consists of a number of "switches" (the word used within the telecoms industry for a telephone exchange) linked to each other and to end equipment (telephones, faxes, modems) by point-to-point connections.

On each link, one "frame" is sent every 125 µs. A "channel" is one byte within the frame, so each channel carries 64 kbit/s; on faster links there are more channels because the link carries more bytes in 125 µs. Channels are identified by their position in the frame, and routes are set up in the switches in terms such as "channel 3 (the third byte in each frame) on this input is to be sent out as channel 12 on that output". The memory in which these routes are stored in the switch is called a "routing table".
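The frame and routing-table mechanism can be sketched in a few lines of Python. This is only an illustration: the function names, the port labels "A" and "B" and the 2.048 Mbit/s link rate are assumptions made for the example, and it ignores any bytes that a real link reserves for framing or signalling.

```python
FRAMES_PER_SECOND = 8000  # one frame every 125 microseconds


def channels_per_frame(link_bit_rate):
    """Number of one-byte channels that fit in one 125 us frame."""
    return link_bit_rate // (8 * FRAMES_PER_SECOND)


def switch_frames(routing_table, input_frames, output_sizes):
    """Copy each routed byte to its place in the correct outgoing frame.

    routing_table maps (input port, input channel) to
    (output port, output channel); channels are numbered from 1,
    so channel 3 is the third byte of the frame.
    """
    outputs = {port: bytearray(size) for port, size in output_sizes.items()}
    for (in_port, in_ch), (out_port, out_ch) in routing_table.items():
        outputs[out_port][out_ch - 1] = input_frames[in_port][in_ch - 1]
    return outputs


# "channel 3 on this input is to be sent out as channel 12 on that output"
table = {("A", 3): ("B", 12)}
frames_in = {"A": bytes(range(32))}      # third byte has the value 2
frames_out = switch_frames(table, frames_in, {"B": 32})
print(channels_per_frame(2_048_000))     # 32 channels on a 2.048 Mbit/s link
print(frames_out["B"][11])               # 2, now in the twelfth byte
```

Note that the per-byte work is a plain table lookup and copy, with no header to examine, which is why it is easily done in hardware.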

The processes of setting up a call and of transferring the data are quite separate. To make a call, "signalling" messages are sent to the local switch specifying the number to be called; software works out how the call can best be routed through the network to reach its destination, and makes the necessary entries in the routing table in each switch, as a result of which the switch's hardware copies each data byte to the correct place in the correct outgoing frame.

To terminate a call, a signalling message is sent telling the software in each switch that the channels involved in the call are no longer required and can be used for another call. It also tells the equipment at the other end of the call that the call has finished.
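The separation between call control and data transfer might look something like the following sketch, in which the signalling software only ever touches the routing table and the list of free channels. The class and method names are hypothetical, and refusing a call when no channel is free is an assumption about how a switch would behave rather than something stated above.

```python
class Switch:
    """Call-control bookkeeping for one switch (illustrative only)."""

    def __init__(self, channels_per_output):
        # channels_per_output: {output port: number of channels in its frame}
        self.free = {port: set(range(1, n + 1))
                     for port, n in channels_per_output.items()}
        self.routing_table = {}

    def set_up(self, in_port, in_channel, out_port):
        """Reserve an output channel and enter the route; None if refused."""
        if not self.free[out_port]:
            return None                      # no capacity left on this link
        out_channel = min(self.free[out_port])
        self.free[out_port].remove(out_channel)
        self.routing_table[(in_port, in_channel)] = (out_port, out_channel)
        return out_channel

    def clear_down(self, in_port, in_channel):
        """Remove the route and make the output channel available again."""
        out_port, out_channel = self.routing_table.pop((in_port, in_channel))
        self.free[out_port].add(out_channel)


sw = Switch({"B": 2})
print(sw.set_up("A", 1, "B"))   # 1
print(sw.set_up("A", 2, "B"))   # 2
print(sw.set_up("A", 3, "B"))   # None: both channels are in use
sw.clear_down("A", 1)
print(sw.set_up("A", 3, "B"))   # 1: the released channel is reused
```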

Advantages of this system for digital audio include

The main disadvantage is

Packet transfer mode (packet-based)

This is the technology used for data services, particularly Internet Protocol (IP).

The network consists of a number of "subnetworks" connected to each other by "routers"; the topology of each subnetwork is immaterial. Routers can also be connected to each other by point-to-point links.

The unit of transmission is a "datagram" which is rather like a postal packet: it consists of the data to be sent (typically 500 to 1500 bytes) and various pieces of "header" information including the sender's and destination addresses.

The sender transmits the datagram on its subnetwork. If the destination is not on the same subnetwork, a router must copy the packet to an adjacent subnetwork (chosen for being in some sense "closer" to the destination), and this process is repeated until the packet arrives at its destination.

If a stream of data is to be sent, each instalment is made up into a packet and the appropriate header added; the header must include a sequence number or a timestamp so that the receiving end can extract the data in the correct order, because a packet that happens to go by a faster route than the one in front may overtake it. The total amount of header information for an audio packet is typically 80 to 90 bytes, so large packets must be used if the overheads are not to be disproportionate.
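The trade-off between overhead and packet size can be put into numbers. The sketch below takes 85 bytes as the header size (the middle of the range quoted above) and, purely as an assumed example, a stereo stream of 24-bit samples at 48 kHz; the function names are illustrative.

```python
def packet_efficiency(payload_bytes, header_bytes=85):
    """Fraction of the transmitted bytes that is audio data."""
    return payload_bytes / (payload_bytes + header_bytes)


def packetisation_delay_ms(payload_bytes, sample_rate_hz=48_000,
                           bytes_per_sample=3, channels=2):
    """Time needed to collect enough samples to fill one packet."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return 1000 * payload_bytes / bytes_per_second


for payload in (288, 1440):
    print(payload,
          round(packet_efficiency(payload), 2),
          round(packetisation_delay_ms(payload), 2))
```

With these assumptions a 288-byte payload holds 1 ms of audio and is about 77% efficient, whereas a 1440-byte payload is about 94% efficient but cannot be sent until 5 ms of audio has been collected. The large packets needed to keep the overheads down therefore add delay before the first sample even reaches the network.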

At each router, the same kind of routing decision that ISDN makes once, at call set-up, has to be made for every packet.

Because there is no reservation of capacity, a router may find it has more traffic to send on a particular output than the output can take. Initially the packets waiting to be output will be queued up in a buffer, and when the queue gets long the routing software may be able to start sending packets for some destinations on a different output, but eventually the buffer can fill up and some packets will then be lost.
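A toy simulation shows how the loss arises. It models a single router output with a fixed-size buffer that discards arriving packets once it is full; the function name, the tick-based timing and the numbers are assumptions for illustration, and the re-routing onto a different output mentioned above is not modelled.

```python
def simulate_output_queue(arrivals_per_tick, capacity_per_tick, buffer_size):
    """Return (sent, dropped, still_queued) for one router output.

    arrivals_per_tick: packets arriving for this output in each time step
    capacity_per_tick: packets the output can transmit in each time step
    buffer_size:       maximum number of packets that can wait in the queue
    """
    queued = sent = dropped = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            if queued < buffer_size:
                queued += 1          # packet joins the queue
            else:
                dropped += 1         # buffer full: packet is lost
        transmitted = min(capacity_per_tick, queued)
        queued -= transmitted
        sent += transmitted
    return sent, dropped, queued


# Three packets arrive per tick but only two can be sent: the queue grows,
# the buffer fills on the third tick, and packets start to be lost.
print(simulate_output_queue([3, 3, 3, 3], capacity_per_tick=2, buffer_size=4))
# (8, 2, 2)
```

Even the packets that are not lost spend a varying time in the queue, so their arrival times at the destination are irregular.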

(There is a protocol, RSVP, defined to allow capacity to be reserved for traffic such as live audio, but it is not widely implemented and has some significant drawbacks, including that there is no guarantee that the traffic will be sent along the route on which the reservation has been made.)

This system works quite well for data traffic; its main advantage for digital audio is

whereas the disadvantages include

Asynchronous transfer mode (cell-based)

Asynchronous transfer mode (ATM) was intended as a halfway house between the other two technologies, able to carry both live media and datagram traffic on a single network. The proponents of ATM technology (such as the ATM Forum, which has since been incorporated into the Broadband Forum) really ought to have found a better term for the networks, particularly given the confusion with networks of cash dispensers. The term "Broadband ISDN" was occasionally used, and is a better description, because the service offered is similar to the original ISDN but with a wide range of throughput (or "bandwidth") per call instead of a fixed 56 or 64 kbit/s.

The network structure was thus similar to that for synchronous transfer mode, but the unit of transmission was a "cell" consisting of 48 data bytes preceded by a 5-byte "header". Most of the header was taken up with a "virtual channel identifier" which was used, instead of the position in the frame, as an index into the routing table.
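The difference from the synchronous case is only in what is used as the key for the routing-table lookup. The following sketch represents a cell as a (virtual channel identifier, payload) pair and omits the other header fields; the names and port labels are hypothetical, and the rewriting of the identifier at each switch reflects usual ATM practice rather than anything stated above. It also works out how long it takes to fill one cell with audio, assuming 24-bit samples at 48 kHz.

```python
CELL_PAYLOAD_BYTES = 48


def switch_cell(routing_table, in_port, cell):
    """Look up the cell's identifier and forward it with the outgoing one.

    routing_table maps (input port, incoming identifier) to
    (output port, outgoing identifier).
    """
    vci, payload = cell
    out_port, out_vci = routing_table[(in_port, vci)]
    return out_port, (out_vci, payload)


def cell_fill_time_us(sample_rate_hz=48_000, bytes_per_sample=3, channels=1):
    """Time to collect the 48 bytes of audio that fill one cell."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return 1e6 * CELL_PAYLOAD_BYTES / bytes_per_second


table = {("A", 37): ("B", 102)}
payload = bytes(CELL_PAYLOAD_BYTES)
out_port, (out_vci, _) = switch_cell(table, "A", (37, payload))
print(out_port, out_vci)                 # B 102
print(round(cell_fill_time_us(), 1))     # 333.3 (one channel)
```

Because a cell holds only 16 such samples of a single channel, it can be sent after about a third of a millisecond, in contrast to the several milliseconds needed to fill a large packet.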

The process of setting up and clearing down calls was separated from that of data transfer in the same way as in ISDN. The signalling messages used for call set-up also conveyed information about the type of data to be transmitted, so that before deciding to accept a call the destination could know what kind of traffic it would carry and to which standards it conformed. For PCM audio, for example, the message also specified how many channels, the sample rate, the number of bits per sample, etc.

For each call one of several "service classes" could be chosen, including "constant bit rate" (CBR) for a reserved-capacity ISDN-like service not restricted to multiples of 56 or 64 kbit/s, and "unspecified bit rate" (UBR) for a best effort service suitable for datagram traffic.
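The way the two service classes share a link can be illustrated as follows. This is a simplified sketch, not the admission procedure defined by the ATM standards: the class name, the 10 Mbit/s link and the rule that UBR traffic may use whatever capacity has not been reserved are assumptions, and the cell header overhead is ignored.

```python
class Link:
    """Capacity bookkeeping for one link carrying CBR and UBR calls."""

    def __init__(self, capacity_bps):
        self.capacity_bps = capacity_bps
        self.reserved_bps = 0

    def admit_cbr(self, rate_bps):
        """Accept a CBR call only if its full rate can be reserved."""
        if self.reserved_bps + rate_bps > self.capacity_bps:
            return False
        self.reserved_bps += rate_bps
        return True

    def release_cbr(self, rate_bps):
        """Return a cleared call's capacity to the pool."""
        self.reserved_bps -= rate_bps

    def ubr_available_bps(self):
        """Capacity left over for best-effort traffic."""
        return self.capacity_bps - self.reserved_bps


stereo_pcm_bps = 48_000 * 24 * 2          # 2 304 000 bit/s
link = Link(10_000_000)
print([link.admit_cbr(stereo_pcm_bps) for _ in range(5)])
# [True, True, True, True, False]
print(link.ubr_available_bps())           # 784000
```

The rate requested need not be a multiple of 64 kbit/s, and a call that cannot be given its full rate is refused at set-up rather than being allowed to disturb the calls already in progress.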

Advantages of this system for digital audio included

The main disadvantages were


Flexilink

Our new Flexilink technology combines the best features of all the above technologies. It offers a "synchronous" service with guaranteed low latency for live media, using variable-sized packets, and an "asynchronous" service for best-effort traffic that uses all the bandwidth not occupied by the synchronous packets.
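The idea of best-effort traffic filling the gaps left by the synchronous packets can be illustrated with a short sketch. It is not a description of how Flexilink itself schedules traffic: the function name, the notion of a fixed number of bytes per scheduling interval and the packet sizes are all assumptions made for the example.

```python
def share_interval(capacity_bytes, sync_packets, async_queue):
    """Split one interval's capacity between the two services.

    sync_packets: sizes of the synchronous packets due in this interval;
                  their capacity is assumed to have been reserved in advance
    async_queue:  sizes of waiting best-effort packets, oldest first
    Returns (async sizes sent, async sizes still waiting, bytes unused).
    """
    sync_bytes = sum(sync_packets)
    if sync_bytes > capacity_bytes:
        raise ValueError("synchronous traffic exceeds the reserved capacity")
    remaining = capacity_bytes - sync_bytes
    pending = list(async_queue)
    sent = []
    # Send best-effort packets in order for as long as the next one fits.
    while pending and pending[0] <= remaining:
        size = pending.pop(0)
        remaining -= size
        sent.append(size)
    return sent, pending, remaining


print(share_interval(1500, [288, 144], [600, 400, 200]))
# ([600, 400], [200], 68)
```

In this example the synchronous packets are always carried in full, and the best-effort packets take up the capacity that is left without ever delaying them.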

More details are available here.

Advantages of this system for digital audio include



Copyright ©2001,2007,2014 Nine Tiles