Reverse engineering USB audio gear

2021-09-19 | Olivia Mackintosh

Earlier this year I was helping to reverse engineer the USB audio interface on the Pioneer DJM line of DJ mixers in order to add Linux support. I’d like to share my process, the knowledge I picked up along the way, and some technical details.

Background

I’m a record collector and amateur DJ / hacker and picked up a used DJM-750 around 3 years ago without really thinking too much about the audio interface and whether it would work with the Linux kernel.

Over time the mixer became the central hub for my desktop audio needs, living on my desk alongside my decks. I wanted to use it as a general-purpose audio interface since it has 8 input and 8 output channels and supports a sample rate of up to 96 kHz. Unfortunately, although the mixer was detected by my Linux distribution, ALSA (the Linux audio subsystem) did not detect an audio device and I could not use any of the audio IO. Bummer. The only option, I thought, was to reverse engineer it and add support myself.

Pioneer does provide several drivers for Windows that allow you to use their Rekordbox software, but it’s designed to be locked down and to require a subscription. Even then, it doesn’t allow you to access the IO directly, use it with other software, or use it with Windows as a generic audio interface. The proprietary driver is still useful for reverse engineering efforts, though.

Preliminary investigations

Often we can learn a lot about devices by using standard tools. In particular, lsusb -v lists every device on every bus along with all of its “descriptors”. This is a technical term meaning “a specification in the firmware of the device about how the device communicates”. It usually contains things like endpoint ID, transfer type, vendor ID, product ID and so on. Many devices expose multiple “interfaces”, each with its own descriptor. Interfaces are a bit like ports in networking: they just allow frames to be separated.
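
To make the descriptor idea concrete, here is a minimal C sketch of the standard 9-byte interface descriptor that lsusb -v decodes for each interface. The field layout and the 0xff vendor-specific class code come from the USB 2.0 specification; the struct and function names are mine, for illustration only, and nothing here is taken from a DJM-750 capture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USB_DT_INTERFACE       0x04  /* bDescriptorType for an interface */
#define USB_DT_INTERFACE_SIZE  9     /* fixed length of the descriptor */
#define USB_CLASS_VENDOR_SPEC  0xff  /* bInterfaceClass: vendor specific */

/* Field names follow the USB 2.0 specification. */
struct iface_desc {
    uint8_t bLength;
    uint8_t bDescriptorType;
    uint8_t bInterfaceNumber;
    uint8_t bAlternateSetting;
    uint8_t bNumEndpoints;
    uint8_t bInterfaceClass;
    uint8_t bInterfaceSubClass;
    uint8_t bInterfaceProtocol;
    uint8_t iInterface;
};

/* Returns 0 on success, -1 if buf does not hold an interface descriptor. */
static int parse_iface_desc(const uint8_t *buf, size_t len,
                            struct iface_desc *out)
{
    assert(out != NULL);
    if (buf == NULL || len < USB_DT_INTERFACE_SIZE)
        return -1;
    if (buf[0] < USB_DT_INTERFACE_SIZE || buf[1] != USB_DT_INTERFACE)
        return -1;

    out->bLength            = buf[0];
    out->bDescriptorType    = buf[1];
    out->bInterfaceNumber   = buf[2];
    out->bAlternateSetting  = buf[3];
    out->bNumEndpoints      = buf[4];
    out->bInterfaceClass    = buf[5];
    out->bInterfaceSubClass = buf[6];
    out->bInterfaceProtocol = buf[7];
    out->iInterface         = buf[8];
    return 0;
}

/* Non-zero when the interface uses the vendor-specific class code. */
static int iface_is_vendor_specific(const struct iface_desc *d)
{
    return d->bInterfaceClass == USB_CLASS_VENDOR_SPEC;
}
```

As a made-up example, the bytes 09 04 00 01 02 ff 00 00 00 would parse as interface 0, alternate setting 1, with two endpoints and a vendor-specific class.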

Tools

The silver lining is that I could set up a Windows virtual machine, install the Pioneer driver, and sniff the USB traffic between the mixer and the virtual machine to figure out what is going on. For that, I used two things:

  1. The usbmon kernel module with Wireshark
  2. OpenVizsla: an FPGA-based USB analyzer. I used ViewSB developed by Qyriad and Kate Temkin as the frontend since Wireshark doesn’t yet work with OpenVizsla. I do want to develop a driver for Wireshark support though, since the OpenVizsla hardware is really quite cheap compared to things like the TotalPhase Beagle 480.

Hardware capture

The protocol analyzer sits between the host PC and the mixer. It is also connected to a PC that receives the analysis data (this can be the same computer as the host):

    Host PC     <---
                    OpenVizsla Card <--> DJM-750
    Analysis PC <---

USB Specification

USB 2.0 is not just a protocol but a complete specification, covering everything from the physical cable and connector up to the structure of the frames and packets transmitted between host and device. For the purposes of reverse engineering the data that is sent, we only need to focus on the higher layers.

For the DJM-750, we look to the high-speed mode. In full-speed/low-speed, the frame period is 1 ms and is fixed because it’s used as a timing reference for isochronous transfers. More on isochronous later. In high-speed, each frame is divided into 8 microframes, each with a duration of 125 µs. Thus, 8 microframes add up to 1 ms. This is important because in a mixed device tree of HS and FS/LS devices, all of the communication between hubs is done at HS. It’s still possible to retain the crucial timing, though, since there is a common denominator.
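
The timing arithmetic can be sketched in a few lines of C. This assumes one isochronous packet per frame or microframe, which I have not confirmed for the DJM-750; the names are illustrative. With 8 microframes per millisecond there are 8000 microframes per second, so a 96 kHz stream would carry 12 sample frames per microframe, while 44.1 kHz does not divide evenly and the leftover has to be spread across packets.

```c
#include <assert.h>
#include <stdint.h>

#define USB_FS_FRAMES_PER_SEC       1000u  /* one frame every 1 ms */
#define USB_HS_MICROFRAMES_PER_SEC  8000u  /* one microframe every 125 us */

struct packet_rate {
    uint32_t whole;     /* sample frames carried in every interval */
    uint32_t remainder; /* extra sample frames per second to distribute */
};

/*
 * Split a sample rate across a number of transfer intervals per second.
 * Assumes exactly one packet per interval (an assumption, see above).
 */
static struct packet_rate frames_per_interval(uint32_t sample_rate_hz,
                                              uint32_t intervals_per_sec)
{
    struct packet_rate r;

    assert(intervals_per_sec != 0);
    r.whole = sample_rate_hz / intervals_per_sec;
    r.remainder = sample_rate_hz % intervals_per_sec;
    return r;
}
```

Calling frames_per_interval(96000, USB_HS_MICROFRAMES_PER_SEC) gives 12 whole sample frames and no remainder; the same rate over full-speed 1 ms frames gives 96.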

Isochronous transfer

Audio is naturally time-sensitive, so audio interfaces generally use isochronous transfers, which are continuous and periodic in nature. This is in contrast to the interrupt transfers used for mice and keyboards and the bulk transfers used for mass storage.
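
The transfer type shows up in each endpoint descriptor, so it can be read straight out of a descriptor dump. The sketch below decodes the bmAttributes and bEndpointAddress fields as laid out in the USB 2.0 specification; the enum and function names are my own, and the example values in the note afterwards are illustrative, not DJM-750 captures.

```c
#include <assert.h>
#include <stdint.h>

/* bmAttributes bits 1..0 of a standard endpoint descriptor. */
enum usb_xfer_type {
    USB_XFER_CONTROL     = 0,
    USB_XFER_ISOCHRONOUS = 1,
    USB_XFER_BULK        = 2,
    USB_XFER_INTERRUPT   = 3
};

/* bmAttributes bits 3..2, only meaningful for isochronous endpoints. */
enum usb_iso_sync {
    USB_ISO_SYNC_NONE     = 0,
    USB_ISO_SYNC_ASYNC    = 1,
    USB_ISO_SYNC_ADAPTIVE = 2,
    USB_ISO_SYNC_SYNC     = 3
};

static enum usb_xfer_type ep_xfer_type(uint8_t bmAttributes)
{
    return (enum usb_xfer_type)(bmAttributes & 0x03);
}

static enum usb_iso_sync ep_iso_sync(uint8_t bmAttributes)
{
    /* The sync bits are undefined for non-isochronous endpoints. */
    assert(ep_xfer_type(bmAttributes) == USB_XFER_ISOCHRONOUS);
    return (enum usb_iso_sync)((bmAttributes >> 2) & 0x03);
}

/* Bit 7 of bEndpointAddress: 1 = IN (device to host), 0 = OUT. */
static int ep_is_in(uint8_t bEndpointAddress)
{
    return (bEndpointAddress & 0x80) != 0;
}
```

For instance, a bmAttributes of 0x05 decodes as an asynchronous isochronous endpoint, and a bEndpointAddress of 0x86 is endpoint 6 in the IN direction.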

What is vendor specific?

Discovered values

Input Values
------------
Control Tone LINE	0x0000
Control Tone CD/LINE	0x0001 (check)
Control Tone PHONO	0x0003
Post Fader		0x0006
Cross Fader A		0x0007
Cross Fader B		0x0008
MIC			0x0009
AUX			0x000d
REC OUT			0x000a
NONE			0x000f

Channel Mask
------------
0x0100 = Channel 1
0x0200 = Channel 2
0x0300 = Channel 3
0x0400 = Channel 4
...and so on.

This can be represented as:

n << 8

where n is the channel number


The actual value can be found with a bitwise OR:

CH 2 Control Tone PHONO:
	0x0200 | 0x0003 = 0x0203

For each Pioneer device we need to know:
* The number of channels *
* The supported input types for each channel *

	* = in quirks-table.h already


Representing Input Values as #defines
--------------------------------------

#define PIONEER_CTRLTONE_LINE	0x0000
#define PIONEER_CTRLTONE_CDLINE	0x0001
#define PIONEER_CTRLTONE_PHONO	0x0003
#define PIONEER_POSTFADER	0x0006
#define PIONEER_XFADER_A	0x0007
#define PIONEER_XFADER_B	0x0008
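#define PIONEER_MIC		0x0009
/* Remaining values from the table above; names follow the same pattern. */
#define PIONEER_REC_OUT		0x000a
#define PIONEER_AUX		0x000d
#define PIONEER_NONE		0x000f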
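
Putting the channel number and the input value together can be sketched as a small C helper. The #defines are repeated from the notes above so the snippet stands alone; the three function names are hypothetical and only illustrate the n << 8 | input arithmetic, not how the value is actually sent to the device.

```c
#include <assert.h>
#include <stdint.h>

#define PIONEER_CTRLTONE_LINE    0x0000
#define PIONEER_CTRLTONE_CDLINE  0x0001
#define PIONEER_CTRLTONE_PHONO   0x0003
#define PIONEER_POSTFADER        0x0006
#define PIONEER_XFADER_A         0x0007
#define PIONEER_XFADER_B         0x0008
#define PIONEER_MIC              0x0009

/* Hypothetical helper: channel number in the high byte, input in the low. */
static uint16_t pioneer_input_value(unsigned int channel, uint16_t input)
{
    assert(channel >= 1 && channel <= 0xff);
    assert(input <= 0x00ff);
    return (uint16_t)((channel << 8) | input);
}

/* Hypothetical helpers for splitting a sniffed value back apart. */
static unsigned int pioneer_value_channel(uint16_t value)
{
    return value >> 8;
}

static uint16_t pioneer_value_input(uint16_t value)
{
    return value & 0x00ff;
}
```

pioneer_input_value(2, PIONEER_CTRLTONE_PHONO) yields 0x0203, matching the worked example above, and the two splitting helpers are handy when reading values back out of a capture.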

Adding support to the kernel

We need to pay attention to the following files:

  - sound/usb/quirks-table.h: describes the vendor-specific interfaces
  - sound/usb/quirks.c: sets the sample rate
  - sound/usb/mixer_quirks.c: adds alsamixer controls that send specific values to the device

Kernel contributions

Further reading

Creative Commons Licence

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License