Linux setup for real-time audio production

In this article I’ll show you how to set up your Linux box as an excellent basis for an audio production workstation. I’ll walk through how I tweaked my system, the problems I encountered in the process, and the options and trade-offs along the way.

Let me start with the hardware specs of my laptop: a Dell Latitude E6320 with an Intel i5-2520M CPU, 4 GiB RAM and a 320 GiB 7200 rpm HDD, running Arch Linux, plus an Alesis IO|2 USB audio interface. Despite the relatively modest specs, you can do a lot with a PC like this when it comes to audio production on Linux.

A standard, general-purpose Linux distribution always needs some extra tuning, as it is configured for rather different use cases. Workloads like general computing, multimedia or even gaming do not impose the strict real-time requirements that audio processing does. This additional configuration is therefore indispensable: you need to tweak your system so that it operates smoothly under heavy real-time audio processing workloads. The ultimate goal is a snappy system with minimal audio latency and no xruns whatsoever.

What are these wretched xruns?

An xrun is an abbreviation for buffer under- or overrun (hence the ‘x’). Xruns can happen in all sorts of real-time applications, such as video streaming, process control and, obviously, audio production. Each of these processes needs some kind of input and output buffering to process the data.

In audio production, the data stream arriving from the audio interface lands in the input audio buffer and the audio processing callback is fired. If your system is unable to process this data in the short time before the next chunk arrives, the incoming data is lost – this is the buffer overrun condition. A similar situation may occur when your system is too slow to fill the output audio buffer for the audio interface on time: the buffer is not filled (or not completely filled) and the data stream is not contiguous. This is the buffer underrun condition. Either can happen, sometimes even both. When it does, you get a corrupted audio stream and an xrun.

For time-critical systems, these buffers should be as short as possible to reduce system latency. However, short buffers make the processing use more resources, as the processing callback fires much more often. If the buffer is small and the CPU is too slow, you’ll get xruns.

Of course, the general rule for reducing xruns is to make the buffers bigger. That’s a perfectly suitable solution for non-real-time processing, like mixing or mastering, where higher latency is not a problem. For live work, however, like guitar processing or tracking, latency should be as low as possible.

As you can see, the audio buffer length is a compromise between audio latency and system load. With reasonably powerful hardware (see my PC specs above) and proper system configuration, audio latency can be inaudible and the system load low enough to run many audio tracks and real-time effects at the same time on your Linux system.

System configuration explained

In the old days, the Linux kernel needed to be patched to achieve the low-latency operation necessary for real-time audio processing. However, that is NOT necessary with current kernels that have kernel preemption enabled. This mechanism allows kernel code to be preempted almost anywhere, except in sections that explicitly disable preemption or local interrupts. This change made the Linux kernel more responsive – a process running slow, long-running kernel code (e.g. a syscall or a driver) no longer blocks other kernel-mode tasks until it finishes its job; it gets preempted and other kernel tasks are scheduled to run. This reduced the kernel latency, as the Arch Linux Wiki states:

the stock Arch kernel with CONFIG_PREEMPT=y can operate with worst case latency up to 10ms

Some hardware configurations may introduce much higher latency than that. If you think a custom kernel with real-time patches will help you, go ahead and try it, but in most cases (if not all) you don’t need it. I’d treat it as a last resort, once I’ve run out of other options.

I use a stock, unmodified kernel from the Arch Linux repositories and it works perfectly. Of course, as I said before, some general system tweaking is needed to make audio production smooth and xrun-free.

First of all, you need the Jack real-time audio server. It provides real-time audio processing capabilities for Linux and an implementation of the Jack API for audio applications. Two versions are available at the time of writing, Jack1 and Jack2 (a.k.a. jackdmp). I use version 2, as it supports multiple processors (SMP) and allows connecting audio applications without gaps in the audio processing. See jackaudio.org for more details.

A very cool thing about Jack2 is that it basically works out of the box, well, at least on Arch. The Arch Linux package sets up the system limits for the audio group automatically. It also enables members of the audio group to use the rtc (real-time clock) and hpet (High Precision Event Timer) devices for high-performance timing, so no further intervention is needed. Just add yourself to the ‘audio’ group and relogin.
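
For example (takes effect after you log in again):

    # add the current user to the 'audio' group
    sudo usermod -aG audio "$USER"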

However, if you happen to use a distro other than Arch, you may have to check the /etc/security/limits.conf file or the /etc/security/limits.d/ directory for files that configure the pam_limits module to allow the maximum realtime priority of 99 for the audio group. Here’s a quick hint on how to add the correct config if you haven’t got one already:
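
    # /etc/security/limits.d/99-audio.conf (the file name is an example –
    # any file in limits.d, or limits.conf itself, will do)
    @audio   -   rtprio    99
    @audio   -   memlock   unlimited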

Real-time applications may ask the operating system to lock parts or all of their memory space using the mlock() or mlockall() system calls. Memory pages locked by these calls are guaranteed to stay in RAM – they won’t be paged out (swapped). Because we are going to use applications that rely on memory locking, e.g. Jack2, Guitarix or Ardour4, we don’t want to impose any limits on it. Thus the “memlock” line sets the limit on the amount of locked-in memory, in KiB, to unlimited.

Why do we need to do that, anyway? Well, allowing an unprivileged user process to raise its priority to the effective maximum and to lock unlimited amounts of memory can be a major security problem: such a process can effectively starve other processes to death (CPU- and memory-wise). Therefore, these privileges are granted only to members of the ‘audio’ group. Consult the limits.conf manpage for more details.
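
After logging back in, you can quickly verify that the limits are in effect:

    # should print 99 and 'unlimited', respectively
    ulimit -r
    ulimit -l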

The permissions for the high-performance timing devices are set using the /usr/lib/udev/rules.d/40-hpet-permissions.rules file:
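
    # contents as shipped on my Arch system – verify on your distro
    KERNEL=="rtc0", GROUP="audio"
    KERNEL=="hpet", GROUP="audio"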

These devices need to be accessible to audio group members, since they provide the necessary timing features. Depending on the application, the rtc0, HPET or even TSC timer may be used.

Please remember to check these settings, because various distros may configure them differently. On Arch Linux, though, the Jack2 package sets all of this up out of the box.

The next thing usually tweaked at this point is the system swappiness. It is adjusted via the sysctl parameter vm.swappiness, which sets how aggressively “anonymous” memory pages are swapped out. Long story short, the kernel divides memory pages into filesystem-backed pages (stored in the filesystem cache) and anonymous pages with no filesystem backing, i.e. runtime data. This parameter sets the balance of swapping priority between the two. Audio production keeps a lot of runtime data in memory and requires real-time operation, so it is desirable to make your system swap out anonymous pages far less eagerly.

The default value of 60 is good for a general-purpose system, but real-time audio software needs to access its memory without unnecessary delays, so it is advisable to decrease the system swappiness. A value of 10 is a good choice, as advised by the Arch Wiki.
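
You can apply it on the fly and make it persistent like so (the drop-in file name is my choice):

    # apply immediately:
    sudo sysctl vm.swappiness=10

    # persist across reboots – /etc/sysctl.d/90-audio.conf:
    vm.swappiness = 10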

Many tutorials on the web suggest tweaking the fs.inotify.max_user_watches parameter as well. This is the maximum number of files that can be watched for changes through the inotify API. It is not directly related to real-time audio processing; however, the Arch Linux wiki advises raising it to 524288.
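
This one can live in the same sysctl drop-in file:

    # /etc/sysctl.d/90-audio.conf
    fs.inotify.max_user_watches = 524288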

A word (or more) about timers

Real-time audio processing needs very precise timing, and any modern PC has at least a couple of hardware timers to choose from, so it is crucial to select the best one for Jack.

The TSC – Time Stamp Counter – may tick at the highest frequency of the three, but it might not be reliable enough for real-time audio. As it’s the CPU clock-cycle counter, it is affected by CPU clock frequency changes, and it may even stop when the CPU enters a deep energy-saving mode (a.k.a. a deep C-state). Modern CPUs, however, calibrate the TSC so that it ticks at a constant rate regardless of frequency or C-state; look for the “constant_tsc” and “nonstop_tsc” flags in /proc/cpuinfo. The counter is cheap to read with the RDTSC assembly instruction, but it may need some extra calculations to enhance reliability, and it has to be polled periodically to guarantee precise timing.
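
You can check for these flags like so:

    # prints each invariant-TSC flag your CPU advertises
    grep -ow -e constant_tsc -e nonstop_tsc /proc/cpuinfo | sort -u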

The HPET – High Precision Event Timer – is a special multichannel, high-frequency timer chip built into the motherboard chipset. As a successor to the older ACPI PM timer (which of course can still be present on some older motherboards), it was designed to be very stable and precise. It is a programmable clock event device with many functions, such as the ability to fire an interrupt precisely when needed (e.g. every few milliseconds). Being a separate chip, it is not adversely influenced by varying CPU frequency. It is more reliable than the TSC, but more costly to access.

The RTC device is the oldest of the three. It runs at the lowest frequency, but it can trigger interrupts once per second, periodically at “a frequency that can be set to any power-of-2 multiple in the range 2 Hz to 8192 Hz” (see man 4 rtc), or after a previously set interval. This is a legacy chip – a last resort when the HPET or TSC are not available, which is a very rare case these days.

For me, a few trial-and-error sessions revealed that using the HPET instead of the default Jack timer (which ultimately uses the system timer, backed by the TSC by default) gives far fewer xruns. Even though my TSC is constant and non-stop, it seems to be affected by CPU throttling/frequency scaling. As I don’t want to disable CPU throttling and scaling, I use the HPET timer in my configuration.
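
As a sanity check, you can also see which clocksource the kernel itself is using (this is the kernel’s timekeeping source, related to but distinct from Jack’s clock option):

    cat /sys/devices/system/clocksource/clocksource0/current_clocksource
    cat /sys/devices/system/clocksource/clocksource0/available_clocksource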

However, if you don’t care about CPU energy saving, just set your CPU governor to ‘performance’. This disables CPU throttling and frequency scaling – your CPU will run at full speed regardless of the system load. This usually helps with xruns, but on my laptop, when using HPET, the audio performance of the whole platform seems unaffected by the CPU power-saving features.
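
A minimal sketch, assuming the cpupower utility is installed:

    # set all cores to the 'performance' governor
    sudo cpupower frequency-set -g performance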

Configuring and running Jack

With all said tweaks applied, we can configure and start the Jack2 audio server. My audio interface is a USB 1.1 audio-class device and runs at a 48kHz sampling rate with 24-bit samples. It has two input channels with switchable sensitivity/impedance for guitar, line or microphone input. The card is equipped with a digital input (optionally enabled) and a digital output as well.

My initial Jack2 configuration is like so:
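
    # full command, assembled from the options explained below
    jackd -v -R -c hpet -d alsa -d hw:1 -r48000 -p256 -n3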

The command line parameters:

  • -v – start in verbose mode, logging much more information to the console
  • -R – use realtime scheduling
  • -c hpet – use the HPET timer; see the timers section for details
  • -d alsa – use the ALSA backend; any option after this one configures the backend only
  • -d hw:1 – use the “hw:1” ALSA device, which is the Alesis IO|2 interface in my setup
  • -r48000 – set a 48kHz sampling rate
  • -p256 – set the buffer length to 256 samples
  • -n3 – use 3 periods, i.e. 3 buffers of 256 samples each

Realtime scheduling is a must for low-latency audio processing. I set the ALSA backend because I’m using a USB audio device; if I had a FireWire device, I would use FFADO as the backend. The sample rate must match the native sample rate of the device in use, 48kHz in this case. Now, the buffer length and the number of periods are the crucial settings for achieving low latency without xruns. In my setup, the initial value was 256 samples with 3 buffers; the jackd manpage states that 3 periods must be used for USB devices. With these settings I had 5.3ms latency per period (256 samples / 48000 Hz ≈ 5.3ms) – which was very nice – but the number of xruns was unacceptable, leading to audio glitches, clicks, pops, noise and other unpleasant artefacts every now and again.

Something was still wrong – evidently something was preventing the USB controller from delivering or fetching the data in time. It was time to review my hardware configuration.

The problem of IRQ sharing

Linux has an easy way of inspecting hardware interrupt assignments and statistics – just display the contents of the /proc/interrupts file:
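
    # abridged and reconstructed for illustration – your counts,
    # IRQ numbers and controller names will differ
    $ cat /proc/interrupts
               CPU0       CPU1
     16:     125234      87321   IO-APIC-fasteoi   ehci_hcd:usb1
     17:    8904231     764532   IO-APIC-fasteoi   ehci_hcd:usb2, mmc0, wlp2s0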

You may see the potential problem right away – this PC has two embedded USB controllers, but one of them shares IRQ 17 with several other devices, namely the wireless adapter (wlp2s0) and a card reader (mmc0). As a side note, you may want to look at the IRQs from a different perspective – use the ‘lspci -vvv’ command to see which hardware devices are routed to which IRQs. The ‘-vvv’ switch makes the output very verbose, so most of it is omitted here for brevity:
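
    # abridged, illustrative – only the 'Interrupt:' lines matter here
    $ lspci -vvv
    00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
            Interrupt: pin A routed to IRQ 17
    ...
    02:00.0 Network controller: <your WLAN adapter>
            Interrupt: pin A routed to IRQ 17
    ...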

Let’s now see the USB device topology with lsusb:
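
    # abridged – the Alesis vendor/product ID is masked here
    $ lsusb
    Bus 002 Device 003: ID xxxx:xxxx Alesis io|2
    Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub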

The “Alesis” device is the sound adapter, and the lsusb output shows it’s connected to Bus 2. Unfortunately, my PC has two USB ports and both are wired to this bus; I guess the other controller is routed to the docking station connector on this PC.

With all this data, the situation is now clear: the IRQ line shared between the three devices is the culprit. As the USB controller handling Bus 2 was serving a real-time audio device, the number of interrupts was quite high, and the other devices on that line still needed to be serviced as well.

Since I don’t need a wireless network connection when I’m making music, the easiest solution was to disable the WLAN with the hardware switch my laptop has. Of course, you’ll get the same result if you disable your WLAN with NetworkManager or even ‘ip link set <YOUR_WLAN_DEVICE> down’.
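
For example (wlp2s0 is the interface name from my system – yours may differ):

    # via NetworkManager:
    nmcli radio wifi off
    # or directly:
    sudo ip link set wlp2s0 down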

This reduced the xrun count a fair bit, yet something was still in the way. I noticed that the xrun count increased every 2 seconds. That was really interesting – what could slow down my PC exactly every 2 seconds? As it happened, ‘conky’ was running at the time, and guess what its refresh rate was 😉? Killing it solved the problem and my system became completely xrun-free at last!

So, here’s your top tip for pro audio – check your IRQs, disable the WLAN and kill all non-essential processes.

But what do you do when the device hogging the shared IRQ line can’t be disabled, or you simply don’t want to disable it? There’s a solution for this as well.

Any modern kernel with CONFIG_PREEMPT enabled should have the CONFIG_IRQ_FORCED_THREADING option enabled as well. When activated with the ‘threadirqs’ kernel command line parameter, it runs the IRQ handlers in separate kernel threads. The priority of these threads can be adjusted to achieve minimal latency and reduce the xrun count on a pro audio system. A very handy script called ‘rtirq’, written by Rui Nuno Capela (http://www.rncbc.org/), adjusts the priorities of the IRQ threads that are crucial for audio production. The default configuration should work out of the box, yet it is always advisable to check and adjust rtirq’s configuration to your needs. See http://alsa.opensrc.org/Rtirq and/or https://fedoraproject.org/wiki/JACK_Audio_Connection_Kit#rtirq for more details.
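
Here’s a sketch of both pieces, assuming GRUB as the bootloader and the stock rtirq configuration paths – check your distro’s defaults:

    # 1) /etc/default/grub – append 'threadirqs' to the kernel command line,
    #    then regenerate the config: grub-mkconfig -o /boot/grub/grub.cfg
    GRUB_CMDLINE_LINUX_DEFAULT="quiet threadirqs"

    # 2) /etc/rtirq.conf – IRQ thread names to prioritize, highest first
    RTIRQ_NAME_LIST="rtc snd usb i8042"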

Conclusion

This is how I got my setup configured for smooth, xrun-free audio processing. Be advised, however, that it is not universal. Sometimes you’ll get your real-time audio system running smoothly right from the start and sometimes you’ll have to spend several hours tuning it to achieve the same result. It all depends on the hardware configuration. A slight change (like the different WLAN card in my second Dell E6320) can drastically alter how your system behaves with regard to real-time audio.

I wanted to give you some general rules and configuration guidelines to get you going. I really hope this article is useful and will help you become a new (open source 😉) rock star in a jiffy.

As always, I strongly encourage you to comment on the articles! Stay tuned for more stuff.

Cheers!

Cristos.
