The Challenge of Audio Synchronization in Wireless Headphones

Bluetooth headphones are becoming increasingly popular for their convenience and portability. According to the Washington Post, 81 million pairs of such headphones were expected to have been purchased in the United States by 2023, compared to 39 million wired pairs. However, building and configuring a pair of wireless headphones consisting of two independent devices presents a significant challenge in terms of audio synchronization. What methods do we use to achieve perfect harmony, and what does the famous split from the truck commercial have to do with it?

Synchronization issue – from cars to Bluetooth headphones

We all remember the Volvo commercial in which Jean-Claude Van Damme performed “The Epic Split” between the mirrors of a pair of reversing trucks 11 years ago. We start with it because it's an excellent example, taken from outside the digital world, of the problem of synchronizing two independent devices, in this case done manually by two drivers.

In doing so, the truck drivers had to pay attention to two key issues:

a) The trucks had to stay parallel and travel at the same constant speed.
b) The drivers had to communicate with the actor and with each other.

How does this analogy apply to Bluetooth technology in wireless headphones?

a) The sound must be played at a constant speed and synchronized.
b) The headphones must “communicate” with both the phone and each other.

Constant audio playback speed between wireless headphones

The problem of keeping the speed of both trucks constant is not trivial. A vehicle speedometer is accurate only to within about 1 km/h. Then there is the uncertainty associated with measuring the angular velocity of the axle and not knowing the exact diameter of the wheel. If the trucks were to drift apart, Van Damme would slip off the mirrors of the vehicles and take a spectacular fall.

In the world of audio, the playback speed of both headphones is theoretically constant because the code and the hardware on which it runs are identical. The processor executes instructions according to a clock frequency, usually supplied externally. The clock source is typically a quartz resonator based on the piezoelectric effect. Its oscillation frequency depends on the shape of the crystal and the electrodes, as well as on the temperature. Due to imperfections in the manufacturing process and different operating conditions (e.g. one earphone staying in the shade), there are slight differences in the frequency of the generated signal. Over time, this difference causes the sound in one earpiece to run ahead of the sound in the other. The user will initially perceive this as directionality (the impression that the sound source is off to one side of the head), and later as audio asynchrony that prevents further listening.

To solve this problem, we usually measure the actual amount of audio being processed, such as bytes or clock edges (moments when the clock signal changes state). By comparing the two values between the headphones, we can make adjustments to the actual playback speed.
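The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production algorithm: the function names, the 48 kHz sample rate and the sample counts are all made up for the example, and real firmware would typically exchange such counters over the headphone-to-headphone link and apply the correction gradually.

```python
def drift_ppm(local_samples: int, remote_samples: int) -> float:
    """Relative clock difference in parts per million.

    Both counters are assumed to cover the same real-time interval.
    A positive result means the local headphone is running fast.
    """
    if remote_samples <= 0:
        raise ValueError("remote_samples must be positive")
    return (local_samples - remote_samples) / remote_samples * 1e6


def offset_ms(local_samples: int, remote_samples: int, sample_rate_hz: int) -> float:
    """How far the local headphone is ahead of the remote one, in milliseconds."""
    return (local_samples - remote_samples) * 1000 / sample_rate_hz


def corrected_rate_hz(nominal_rate_hz: float, local_samples: int, remote_samples: int) -> float:
    """Playback rate the local headphone should use to match the remote one."""
    if local_samples <= 0:
        raise ValueError("local_samples must be positive")
    return nominal_rate_hz * remote_samples / local_samples


if __name__ == "__main__":
    # Hypothetical counters after about 1000 seconds of playback at 48 kHz.
    local, remote = 48_000_960, 48_000_000
    print(f"drift:  {drift_ppm(local, remote):.1f} ppm")
    print(f"offset: {offset_ms(local, remote, 48_000):.1f} ms")
    print(f"target: {corrected_rate_hz(48_000, local, remote):.2f} Hz")
```

With these illustrative numbers, a clock difference of roughly 20 parts per million has already put one headphone about 20 ms ahead of the other, which is why the correction has to run continuously rather than once at the start of playback.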

Figure 1: Representation of two square waveforms at different frequencies. It can be seen that the successive edges slowly drift apart over time.


What is snooping in communication between headphones?

We can imagine Jean-Claude Van Damme as the stunt director. Although we can't see it in the commercial, it's easy to imagine a situation where he's the one giving commands to the drivers, such as “Start”, “Faster”, “Slower”, “Closer” or “Farther”. In the absence of wireless communication, all participants rely on voice communication. The director must send the command and the drivers must acknowledge receipt of the command. The command will reach both drivers, but only the one who is closer will have to acknowledge it. If only one driver hears the command, he must relay it to the other driver. If it is the closer driver who does not hear the command, the other driver must acknowledge its receipt for them. This rather complex logic is designed to reduce the communication between the stunt director and the drivers, and to leave the repetition to the drivers.

An analogous situation exists with headphones. Both need to monitor the communication because there is no time to transmit the entire audio stream between them. Since the phone maintains a connection with only one headphone (the primary), a technique called snooping is used. With this method, the secondary headphone, which has all the information about the connection, listens in on the communication and receives the audio data directly. The snooping headphone knows when the primary headphone should acknowledge a packet, and if the primary does not, the snooping headphone can do so on its behalf.

This behavior reduces the number of retransmissions. Over their own wireless link, the headphones can also exchange missing packets, reducing audio distortion.
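The acknowledgement and relay rules described above can be written down as a small decision function. The sketch below only mirrors the logic from this article; the names `PacketPlan` and `plan_packet` are hypothetical, and actual products implement snooping with vendor-specific protocols that are considerably more involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PacketPlan:
    acknowledged_by: Optional[str]  # "primary", "secondary", or None
    relay_to: Optional[str]         # headphone that still needs the packet, if any
    phone_retransmits: bool         # True when nobody acknowledged the packet


def plan_packet(primary_received: bool, secondary_received: bool) -> PacketPlan:
    """Decide who acknowledges a packet and whether it must be relayed."""
    if primary_received and secondary_received:
        # Normal case: the primary acknowledges, nothing to relay.
        return PacketPlan("primary", None, False)
    if primary_received:
        # The snooping headphone missed it, so the primary relays the packet.
        return PacketPlan("primary", "secondary", False)
    if secondary_received:
        # The secondary acknowledges on behalf of the primary and relays.
        return PacketPlan("secondary", "primary", False)
    # Neither headphone heard the packet: the phone has to send it again.
    return PacketPlan(None, None, True)


if __name__ == "__main__":
    for primary in (True, False):
        for secondary in (True, False):
            print(primary, secondary, plan_packet(primary, secondary))
```

Only the last case costs the phone a retransmission; in the other three the packet is acknowledged and any gap is filled over the link between the headphones.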

Figure 2: Snooping diagram. The red arrow is the connection between the primary headphone and the phone. The snooping headphone (secondary) does not maintain its own connection with the phone, and when communication is needed, it acts as the primary headphone, so the yellow arrow next to the phone starts at the same place as the red arrow. Communication between headphones (headphones connection) is marked in blue.

LE Audio – the new standard for audio streaming

A new standard for Bluetooth audio streaming called LE Audio has recently emerged. It streams over Bluetooth Low Energy, which uses much less power than Classic Bluetooth, allowing devices to run longer without recharging the battery.

The phone streams audio to two headphones simultaneously, without the need for snooping. The audio stream packets are sent with a fixed and known period (e.g., every 10 milliseconds the headphones receive the next 10-millisecond part of the recording), which makes the devices much easier to synchronize. In addition, LE Audio replaces the legacy Subband Codec (SBC) with the new Low Complexity Communication Codec (LC3), which delivers higher-quality sound to headphone users. In other words, there has been a significant improvement in sound detail and clarity.
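To see why a fixed, known period simplifies synchronization, consider the following sketch. It assumes that both headphones share a common time reference and agree on a fixed delay between a frame's scheduled slot and the moment it is played; the function names, the 48 kHz sample rate and the 40 ms delay are illustrative assumptions, not values taken from the standard.

```python
FRAME_INTERVAL_US = 10_000  # 10 ms frames, as in the example above


def samples_per_frame(sample_rate_hz: int, frame_interval_us: int = FRAME_INTERVAL_US) -> int:
    """Number of audio samples carried by one frame."""
    return sample_rate_hz * frame_interval_us // 1_000_000


def render_time_us(anchor_us: int, frame_index: int, delay_us: int,
                   frame_interval_us: int = FRAME_INTERVAL_US) -> int:
    """Moment (on the shared time base) at which a frame should be played.

    anchor_us is the slot time of frame 0; delay_us is the agreed
    playback delay (hypothetical value chosen by the example).
    """
    return anchor_us + frame_index * frame_interval_us + delay_us


if __name__ == "__main__":
    print(samples_per_frame(48_000))            # samples in one 10 ms frame
    # Both headphones evaluate the same expression independently.
    left = render_time_us(1_000_000, 3, 40_000)
    right = render_time_us(1_000_000, 3, 40_000)
    print(left, right, left == right)
```

Because the playback moment depends only on the frame index and values known to both sides, each headphone can compute it on its own without first negotiating with the other.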

The new standard has significantly increased the bandwidth of the connection, allowing more data to be transmitted – up to 6 independent streams. The sound can also be managed more precisely. Instead of stereo and mono modes, we get surround sound or transmission of multiple language versions simultaneously. A mode for transmitting audio to all synchronized devices in the transmission area has also been added.

Due to the complexity of the new standard and the need for Bluetooth radios that support it, LE Audio is currently available in only a small number of phones and headphones, which means that its development is taking place right in front of our eyes.

Summary – Comarch's role in the implementation of headphones supporting new versions of Bluetooth

Audio stream synchronization is a key challenge for wireless headphones and determines the ability to listen to long recordings. Headphone-to-headphone communication makes it possible to improve the quality of audio playback by reducing the number of lost packets. The introduction of the new LE Audio standard will solve the problems of the current implementation of True Wireless Stereo (TWS), which means stereophonic audio playback on headphones without maintaining a wired connection.

What is Comarch's role in the development of Bluetooth-based audio devices? Our team of engineers can boast successful implementations of headphones using this technology, among the many solutions we have delivered. The devices we have prepared have been introduced into mass production and sales with the expert support of Comarch specialists. We are also currently working on the implementation of audio synchronization for wireless headphones.

You can read more about Comarch's experience in this area in the Case Study: Comarch Experience in Bluetooth.