Audio Bus interface

This section covers the communication interfaces used to transfer audio data. Microphones generate the audio, while speakers deliver it. In the past, microphones and speakers were analog sensing devices whose chip output was an analog signal. However, analog signals are prone to interference on a PCB and are never a preferred solution. So, modern microphones/speakers now include a lot of analog logic on the chip itself, along with digital circuitry that drives out the audio signal in digital binary format (a string of 1's and 0's).

There are many different standards for transmitting digital audio data from one place to another. Some formats, such as I2S, TDM, and PDM, are typically used for inter-IC communication on the same PC board. Others, such as S/PDIF and Ethernet AVB, are primarily used for data connections from one PCB to another through cabling.

Audio Bandwidth: Typical audio data is sampled at 8 kHz to 192 kHz, with 8-bit to 32-bit resolution. For 8 kHz sampling at 8-bit resolution, a single channel needs a bit clock of 8k*8 = 64 kHz; at the high end, 192k*32 = 6.144 MHz per channel (about 12.3 MHz for stereo). As such, most of these interfaces don't need to run faster than a few MHz to low tens of MHz.
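The bit-clock arithmetic above can be captured in a few lines. This is just a sketch of the calculation; the function name is illustrative, not from any real API.

```python
def i2s_bit_clock_hz(sample_rate_hz, bits_per_sample, channels=2):
    """Minimum bit-clock (BCLK) frequency needed to carry the audio stream."""
    return sample_rate_hz * bits_per_sample * channels

# Low end: 8 kHz sampling, 8-bit, mono -> 64 kHz
print(i2s_bit_clock_hz(8_000, 8, channels=1))

# High end: 192 kHz sampling, 32-bit, stereo -> 12.288 MHz
print(i2s_bit_clock_hz(192_000, 32))
```

Note that the per-channel figure at the high end (192k*32 = 6.144 MHz) doubles for stereo, which is why stereo I2S bit clocks can exceed 12 MHz.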

Technical doc by Analog Devices about I2S, TDM and PDM => https://www.analog.com/media/en/technical-documentation/technical-articles/MS-2275.pdf

I2S:

I2S stands for Inter-IC Sound, and is the most common digital audio format used for audio data transfer between ICs. The I2S standard was introduced by Philips Semiconductors (now NXP) in 1986 and was revised in 1996.

Philips Semiconductors became NXP, so the spec is maintained by them here: https://www.nxp.com/docs/en/user-manual/UM11732.pdf

The attached diagram shows the waveform/connections.

FIXME ATTACH DIAGRAM --

 

I2S devices can be in either Master or Slave mode. The Master always drives the Clock and Channel Select. Data can be driven by either the Master or the Slave. There are 3 wires for I2S:

  1. Clock => SCK (Serial Clock) or BCLK (Bit Clock). The Master drives SCK.
  2. Channel Select => WS (Word Select) or LRCK (Left/Right Clock) or FS (Frame Sync). I2S can carry 1 or 2 channels of data. The channel is selected via this line, and it also serves to synchronize the frame (i.e. it indicates the start/end of the current frame). 0 = Left channel, 1 = Right channel. It's better to think of WS as "Frame Sync": when there are more than 2 channels, WS becomes a pulse indicating the start of a new frame. The Master drives WS.
  3. Data => SD (Serial Data), also called SDATA, SDIN or SDOUT. Either the Master or the Slave may drive SD. However, SD is unidirectional - either the Master or the Slave drives it, and it has to be configured beforehand as an input or an output.

Apart from these 3 wires, there are variations where the following lines are added. These signals are not part of the spec, but are added by certain companies for their products (and are commonly supported).

  1. Master clock => MCK (typically 256 x LRCK). Commonly included for synchronizing the internal operation of the analog/digital converters. The Master drives MCK (along with SCK and WS).
  2. Separate data lines for input and output => instead of 1 shared SD pin, we have 2 data pins with fixed direction: SDIN and SDOUT.
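The wire-level behavior described above can be sketched in software. Below is a minimal (not driver-quality) model of how one stereo I2S frame is serialized onto SD: WS low for the left channel, WS high for the right, MSB first. The function name is illustrative; classic I2S also delays data by one BCLK after each WS transition, which is ignored here for brevity.

```python
def serialize_i2s_frame(left, right, bits=16):
    """Return a list of (ws, sd) pairs, one per BCLK cycle, MSB first."""
    frame = []
    for ws, sample in ((0, left), (1, right)):   # WS low = left, high = right
        for i in reversed(range(bits)):          # shift out MSB first
            frame.append((ws, (sample >> i) & 1))
    return frame

frame = serialize_i2s_frame(0b1010_0000_0000_0001, 0xFFFF)
print(len(frame))        # 32: 2 channels x 16 BCLK cycles
print(frame[0])          # (0, 1): WS low (left), MSB of the left sample
```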

 

TDM:

When we talk about I2S, we are usually talking about the legacy mode explained above. I2S can also operate in a Time Division Multiplexing (TDM) mode. We use this when we want to transfer more than 2 channels of data, which legacy I2S does not support. We could have multiple data input or output pins sharing one set of SCK and WS; that solves the problem of transmitting multiple channels, but it increases the number of pins in the system. Instead we want to use a single data line for transferring the audio from all channels. This is where TDM comes in. There is no standard for TDM interfaces, so ICs have their own slightly different flavors of TDM implementation. Since TDM has no spec, a common implementation includes all 5 signals used in I2S above: BCLK, LRCK, SDIN, SDOUT and optional MCK.

A TDM data stream can carry as many as sixteen channels of data and has a data/clock configuration similar to that of I2S. Each channel of data uses a slot on the data bus that is 1/Nth the width of the frame, where N is the number of channels being transferred. A TDM frame clock is often implemented as a single bit-wide pulse (rather than I2S's alternating high and low for left and right channels). So, at the start of a WS pulse, we send a fixed width of data for all N channels one after the other (called a frame), and at the start of the next WS pulse, we send the next set of samples for all N channels, and so on. Thus we avoid having multiple data lines. Clock rates are higher here, in the multi-MHz range (although < 25 MHz), since multiple channels share a single data line and more bandwidth is needed.
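The slot layout above can be sketched as follows. This is a hypothetical model of TDM framing (real TDM hardware differs per vendor, since there is no spec); the function name is illustrative.

```python
def pack_tdm_frame(samples, slot_bits=32):
    """Time-multiplex one sample per channel into a single frame bitstream."""
    bits = []
    for sample in samples:                       # slot 0, slot 1, ..., slot N-1
        for i in reversed(range(slot_bits)):     # MSB first within each slot
            bits.append((sample >> i) & 1)
    return bits

# 8 channels x 32-bit slots = 256 BCLK cycles per frame. At 48 kHz that is
# 48_000 * 256 = 12.288 MHz bit clock, still under the ~25 MHz practical limit.
frame = pack_tdm_frame([0] * 7 + [1], slot_bits=32)
print(len(frame))        # 256
```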


TDM is commonly used for a system with multiple sources feeding one input, or one source driving multiple devices. In the former case, all TDM sources share a common data bus. Each source must be configured to drive the bus only during its appropriate slot, and tri-state its driver while the other devices drive the other slots.

 

PDM:

PDM is another variation on I2S, where 2 channels are supported using only 2 wires: Clock and Data. Two PDM sources drive a common Data Out (DOUT) line which feeds into the PDM receiver's DIN line. A clock generated by the system master is shared by the two slave devices, which use alternate edges of the clock to output their data on the common signal line.

PDM basics: PDM is Pulse Density Modulation. Here the audio amplitude is represented on a single data line by modulating the density of 1's and 0's. When more 1's are transmitted, it indicates a higher amplitude; when more 0's are transmitted, it indicates a lower amplitude.
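A minimal first-order sigma-delta modulator shows how amplitude maps to pulse density. This is a sketch of the principle, not any real part's modulator (which is typically higher order); the function name is illustrative. Input samples are in [-1.0, 1.0], output is a 1-bit stream.

```python
def pdm_modulate(samples):
    """First-order sigma-delta: quantize with error feedback, return 1-bit stream."""
    bits = []
    qe = 0.0                       # running quantization error
    for x in samples:
        if x >= qe:
            bit, out = 1, 1.0      # output a '1' (represents +1.0)
        else:
            bit, out = 0, -1.0     # output a '0' (represents -1.0)
        qe += out - x              # accumulate the quantizer's error
        bits.append(bit)
    return bits

bits = pdm_modulate([0.5] * 1000)  # constant +0.5 input
print(sum(bits) / len(bits))       # density of 1's: 0.75, i.e. (0.5 + 1) / 2
```

A constant zero input produces an alternating 1,0,1,0 stream (density 0.5), which is why "silence" on a PDM line is half ones, not all zeros.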

https://users.ece.utexas.edu/~bevans/courses/rtdsp/lectures/10_Data_Conversion/AP_Understanding_PDM_Digital_Audio.pdf

A PDM-based architecture differs from I2S and TDM in that the decimation filter is in the receiving IC rather than the transmitting IC. The output of the source is the raw high-sample-rate modulated data, such as the output of a sigma-delta modulator, rather than decimated data, as it is in I2S.
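The receiver-side decimation step can be sketched with a crude boxcar (moving-average) filter that turns the high-rate 1-bit PDM stream into lower-rate PCM words. Real receivers use multi-stage CIC/FIR filters; this illustrative function only shows the principle of density -> amplitude.

```python
def pdm_to_pcm(bits, decimation=64):
    """Average each block of `decimation` PDM bits into one PCM sample in [-1, 1]."""
    pcm = []
    for i in range(0, len(bits) - decimation + 1, decimation):
        ones = sum(bits[i:i + decimation])
        pcm.append(2.0 * ones / decimation - 1.0)   # density of 1's -> amplitude
    return pcm

print(pdm_to_pcm([1] * 64))       # all ones      -> [1.0] (full-scale positive)
print(pdm_to_pcm([1, 0] * 32))    # half ones     -> [0.0] (mid-scale, silence)
```

With `decimation=64`, a 3.072 MHz PDM stream yields 48 kHz PCM, a typical ratio for digital microphones.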

 


 

MEMS Microphone:

MEMS (Micro-Electro-Mechanical Systems) microphones come in a package (just like other chips). MEMS microphones may be analog or digital. Analog ones produce an output voltage that is proportional to the instantaneous air pressure level. They have just 3 pins: VDD, Gnd and an output voltage pin. This output pin is connected to a microcontroller on the PCB, which processes the analog output and produces bits. In a digital microphone, the whole circuitry to process the analog output and produce a digital bitstream is put in a single package, which gives much better performance in terms of noise immunity.

Basic intro: https://www.edn.com/basic-principles-of-mems-microphones/

Digital Microphone Teardown: https://www.signalessence.com/blog/mems-microphone-teardown

There are 2 separate dies inside a digital MEMS microphone:

  1. Microphone die: This die has the microphone acoustic sensor. It is made on the same semiconductor production lines that are used for making transistors. Layers of different materials are deposited on top of a silicon wafer and the unwanted material is then etched away, creating a moveable membrane and a fixed backplate over a cavity in the base wafer. The sensor backplate is a stiff perforated structure that allows air to move easily through it, while the membrane is a thin solid structure that flexes in response to the change in air pressure caused by sound waves. MEMS microphones need to have a hole in their package to allow sound to reach the acoustic sensor.
  2. ASIC die: This die is a digital/analog circuit that has a charge pump to place a fixed charge on the microphone membrane. The ASIC then measures the voltage variations caused when the capacitance between the membrane and the fixed backplate changes due to the motion of the membrane in response to sound waves. An ADC inside the ASIC converts the voltage variations into digital format for processing and/or transmission. Finally the output is sent out in digital format using I2S, TDM or PDM. PDM data is captured by a separate microcontroller/processor, which then decides what to do with it. Typically it converts PDM into PCM via dedicated hardware filters or in software, and the audio is finally played on speakers.

 
