EmbDev.net

Forum: µC & Digital Electronics Proof of concept: ATmega MP3 decoder


by Horst (Guest)


Here it comes - my implementation of an MP3 decoder on an 8-bit AVR.

My main motivation was to find out whether there's a lower-bitrate 
alternative to four-bit ADPCM that provides better quality than the 
terrible-sounding two- or three-bit ADPCM variants.
While considering the feasibility, I was wondering whether an 8-bit 
platform providing 10,000,000 multiply operations per second at a 
20 MHz clock is capable of handling an MP3 frame every 72 ms (576 
samples at 8 kHz) in real time.
Well - it is.
Of course, you won't get a full-blown MP3 decoder supporting every 
option and bitrate running on an 8-bit AVR. But it's well suited to 
playing back some speech samples or simple melodies, and a 64 Mbit 
serial flash provides enough capacity to store more than an hour of 
sound at 16 kbit/s.
Unfortunately I couldn't find a way to keep the RAM requirements 
below 4 kBytes, so an ATmega with at least 8 kBytes of RAM is needed.
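
To put rough numbers on the above, here is a back-of-the-envelope sketch. It uses the stream parameters given further down in this post (8 kHz, 16 kbit/s) and the 576 samples per frame of MPEG-2/2.5 Layer III. The 2-cycle multiply and reading "64 MBit" as 8 MiB are assumptions, and all names are made up for illustration.

```python
# Rough budget for the decoder described in this post.
SAMPLE_RATE_HZ = 8000
BITRATE_BPS = 16000
SAMPLES_PER_FRAME = 576          # MPEG-2/2.5 Layer III (LSF)
CPU_CLOCK_HZ = 20_000_000
CYCLES_PER_MUL = 2               # assumption: AVR hardware multiply = 2 cycles
FLASH_BITS = 64 * 1024 * 1024    # assumption: "64 MBit" means 8 MiB

# Integer arithmetic keeps the frame figures exact.
frame_period_ms = 1000 * SAMPLES_PER_FRAME // SAMPLE_RATE_HZ            # 72
frame_bytes = SAMPLES_PER_FRAME * BITRATE_BPS // (8 * SAMPLE_RATE_HZ)   # 144
mul_per_second = CPU_CLOCK_HZ // CYCLES_PER_MUL
mul_per_frame = mul_per_second * SAMPLES_PER_FRAME // SAMPLE_RATE_HZ    # 720000
playtime_min = FLASH_BITS / BITRATE_BPS / 60

print(f"one frame: {frame_bytes} bytes every {frame_period_ms} ms")
print(f"multiply budget per frame: {mul_per_frame}")
print(f"flash capacity: about {playtime_min:.0f} minutes of audio")
```

The multiply budget is an upper bound only: it ignores loads, stores, Huffman decoding and everything else the decoder has to do in the same 72 ms.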

And yes - we all know that there are plenty of ways to do this much 
better, quicker, easier and so on.
No need to point that out here, please.
It was fun for me, and I like doing this kind of thing in assembly 
language.

The current implementation plays back a single-channel, 16 kbit/s MP3 
stream with 8 kHz sampling frequency (the stream has to be encoded 
according to MPEG 2.5 Layer 3/LSF).
As this is a proof of concept, there are some prerequisites and 
restrictions to respect.
The decoder can only handle specially crafted MP3 files.
It doesn't handle any ID3 tags (neither V1 nor V2.x).
It doesn't support the MP3 bit reservoir, so it requires a strictly 
forward-sequential stream without any back-references.
To focus on the essential parts of the decoding process, only long 
blocks are handled so far.
If the MP3 stream contains short or mixed blocks, they will just be 
skipped.
Fortunately some older versions of the free LAME MP3 encoder offer the 
options needed to encode an MP3 stream without the unsupported 
features.
I recommend using the following parameters for a single-channel source 
file with 8 kHz sampling frequency:

lame -b 16 --cbr --noshort --nores -q 0 -t <source file> <output file>

This will produce an MP3 file with 16 kbit/s (-b 16) constant bitrate 
(--cbr), no short blocks (--noshort), no use of the bit reservoir 
(--nores), with the best quality (-q 0) and without the LAME tag 
embedded (-t). The output file doesn't contain any proprietary data. 
Any MP3 decoder should be able to play it back as long as it supports 
MPEG 2.5.
I've successfully used LAME version 3.90.3 for Windows (this version 
is a little hard to find and has therefore been included in the project 
folder).
Apparently the LAME encoder always inserts two short blocks at the 
beginning of the MP3 stream, even if instructed not to.
Although the decoder will silently skip these frames, they waste 
memory. You can just delete them (simply cut the first 288 bytes, i.e. 
two frames of 144 bytes, from the MP3 file). To avoid losing the first 
part of the audio, prepend 150 ms of silence to the source (this is 
recommended anyway, even if you don't touch the short blocks).
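
For checking the encoder output and cutting the two leading frames on the PC, something like the following could be used. This is only a sketch and not part of the project: the function names are made up, and the lookup tables cover just the MPEG 2.5 Layer III case from the standard MPEG audio frame header. It only looks at the 4-byte header, so it cannot tell whether a frame uses the bit reservoir or short blocks (that information sits in the side info).

```python
FRAME_BYTES = 144  # one frame at 16 kbit/s and 8 kHz, never padded

# MPEG-2/2.5 Layer III bitrate table (kbit/s); index 0 = free, 15 = invalid
BITRATES_KBPS = [None, 8, 16, 24, 32, 40, 48, 56, 64,
                 80, 96, 112, 128, 144, 160, None]
# MPEG 2.5 sampling rates (Hz); index 3 is reserved
SAMPLE_RATES_HZ = [11025, 12000, 8000, None]


def parse_header(hdr: bytes) -> dict:
    """Decode the fields of a 4-byte MPEG 2.5 Layer III frame header."""
    if len(hdr) < 4 or hdr[0] != 0xFF or (hdr[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync (ID3v2 tag or junk in front?)")
    version = (hdr[1] >> 3) & 0x03   # 0 = MPEG 2.5
    layer = (hdr[1] >> 1) & 0x03     # 1 = Layer III
    if version != 0 or layer != 1:
        raise ValueError("not an MPEG 2.5 Layer III frame")
    return {
        "bitrate_kbps": BITRATES_KBPS[hdr[2] >> 4],
        "sample_rate_hz": SAMPLE_RATES_HZ[(hdr[2] >> 2) & 0x03],
        "mono": (hdr[3] >> 6) == 0x03,
    }


def is_supported(hdr: bytes) -> bool:
    """True for a mono, 16 kbit/s, 8 kHz MPEG 2.5 Layer III header."""
    try:
        f = parse_header(hdr)
    except ValueError:
        return False
    return (f["bitrate_kbps"] == 16 and f["sample_rate_hz"] == 8000
            and f["mono"])


def strip_leading_frames(mp3: bytes, n_frames: int = 2) -> bytes:
    """Cut the first n_frames frames (2 * 144 = 288 bytes by default)."""
    rest = mp3[n_frames * FRAME_BYTES:]
    if not is_supported(rest[:4]):
        raise ValueError("data after the cut does not start with a "
                         "supported frame header")
    return rest
```

Typical use would be to read the LAME output with `open(name, "rb").read()`, pass it through `strip_leading_frames()` and write the result to a new file.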


I've used AVR Studio 4 for the project.
The assembly code provides three major options for handling the MP3 
input and the decoded output and has been set up to run on an 
ATmega1284P at 16 MHz (16.384 MHz would exactly match the 8 kHz 
sampling frequency). Unfortunately the internal 8 MHz clock isn't 
sufficient, but you could try setting the OSCCAL register to a higher 
value to gain some more processing power (I haven't tried this myself 
yet).
Before you can include your own MP3 file in the code, it has to be 
converted using some bin2inc utility.
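
If no bin2inc tool is at hand, a few lines of script do the job. The sketch below is an assumption about what such a tool produces (plain `.db` lines for the AVR assembler), not the utility used for this project. It keeps an even number of bytes per line because the assembler stores `.db` data in 16-bit flash words and pads lines with an odd byte count.

```python
def bin2inc(data: bytes, per_line: int = 16) -> str:
    """Turn binary data into AVR assembler .db lines (hypothetical helper)."""
    if per_line <= 0 or per_line % 2:
        raise ValueError("per_line must be a positive even number")
    if len(data) % 2:
        data += b"\x00"  # pad explicitly so the last line is even, too
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append(".db " + ", ".join(f"0x{b:02X}" for b in chunk))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    import sys
    # usage: python bin2inc.py input.mp3 > mp3data.inc
    with open(sys.argv[1], "rb") as f:
        sys.stdout.write(bin2inc(f.read()))
```

The resulting file can then be pulled into the source with `.include` below a label of your choice.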

1. You can just use the AVR Studio simulator to play with the code; no 
real AVR hardware is needed.
I've implemented and debugged the decoder this way.
The simulator provides a function to write the output of a parallel port 
into a file.
To activate it, use the menu "Debug"->"AVR Simulator Options"->"Stimuli 
and logging" and enter an output file name for PORTA. When the code 
runs, the decoded samples will be written into the output file in a 
special format.
To make use of it I've provided two small Gawk scripts for conversion.
"gawk -f convout.awk simulator_output_file >samples.txt" will 
convert the special format to a text file containing lines with 16-bit 
hexadecimal values. You can also use this script to convert other output 
data; it was quite helpful for array dumps during troubleshooting.
"gawk -f mkbin.awk -v BINMODE=3 samples.txt >samples.raw" then writes 
the samples as binary PCM data.
I've used Audacity to import the raw data. It should recognize the main 
parameters itself (signed 16-bit PCM, big-endian, one channel); you just 
have to enter the sampling frequency (8000 Hz).
If you want to go deeper into the code (perhaps to add some of the 
missing features), I strongly recommend getting the source code of 
libmad/madplay and the Helix decoder, getting them to run on your PC and 
using them as a reference.
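
For those without Gawk, the second conversion step (hex text to raw PCM) can be approximated as below. This is not a port of mkbin.awk; it is a sketch that assumes samples.txt holds exactly one 16-bit hexadecimal value per line. The WAV writer is an extra that is not part of the project; WAV stores samples little-endian, so the bytes are swapped first.

```python
import wave


def hex_lines_to_pcm(text: str) -> bytes:
    """Convert lines of 16-bit hex values into big-endian 16-bit PCM."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        out += (int(line, 16) & 0xFFFF).to_bytes(2, "big")
    return bytes(out)


def swap_bytes(pcm: bytes) -> bytes:
    """Swap the two bytes of every 16-bit sample (big <-> little endian)."""
    out = bytearray(len(pcm))
    out[0::2] = pcm[1::2]
    out[1::2] = pcm[0::2]
    return bytes(out)


def write_wav(target, pcm_big_endian: bytes, rate_hz: int = 8000) -> None:
    """Write mono 16-bit PCM to a WAV file (path or file object)."""
    with wave.open(target, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate_hz)
        w.writeframes(swap_bytes(pcm_big_endian))


if __name__ == "__main__":
    with open("samples.txt") as f:
        pcm = hex_lines_to_pcm(f.read())
    with open("samples.raw", "wb") as f:
        f.write(pcm)                  # same layout Audacity imports above
    write_wav("samples.wav", pcm)     # plays in any ordinary audio player
```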

2. Play back an MP3 file included directly in the code.
For this option a high-impedance speaker or headphones have to be 
connected to pins OC1A and OC1B via a simple RC low-pass filter. Try 47 
ohms and 2.2 or 4.7 uF. To block DC, connect a 100 uF capacitor in 
series. Of course, this is just the simplest design; higher-order 
low-pass filters will provide better quality. Playback starts after 
reset.
Keep in mind that 8-bit PWM audio only provides a small dynamic range, 
so optimized/normalized full-scale source material is recommended.
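
To get a feeling for the suggested filter values, the corner frequencies can be estimated with f = 1/(2*pi*R*C). The sketch below ignores the loading of the filter by the speaker, and the 32 ohm headphone impedance used for the DC blocking capacitor is purely an assumed example value.

```python
import math


def rc_cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """-3 dB corner frequency of a first-order RC stage."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


R_FILTER_OHM = 47.0
for c_uf in (2.2, 4.7):
    fc = rc_cutoff_hz(R_FILTER_OHM, c_uf * 1e-6)
    print(f"low pass, 47 ohms / {c_uf} uF: about {fc:.0f} Hz")

# High pass formed by the 100 uF series capacitor and the load.
LOAD_OHM = 32.0  # assumption: typical headphone impedance, not from the post
print(f"DC block, 100 uF / {LOAD_OHM:.0f} ohms: "
      f"about {rc_cutoff_hz(LOAD_OHM, 100e-6):.0f} Hz")
```

Both low-pass corners lie well below the 4 kHz Nyquist limit of the 8 kHz stream, so the larger capacitor trades treble for less PWM carrier residue.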

3. Play back an MP3 file read from serial SPI flash.
For this option you need the output circuitry described under option 
2.
The PIND.3 input has to be tied to logic H or L to select the mode (H 
for upload, L for playback, see below).
Additionally a 64 Mbit serial flash chip is required. I've used a 
Winbond W25Q64BV (it's available in a PDIP package). Connect it to the 
controller's SPI interface, using the controller's /SS output as /CS on 
the flash chip. Note that the W25Q64BV only supports a supply voltage of 
3.6 V max.
If you own a programmer for the serial flash, you can use it to store an 
MP3 file (starting at address 0).
Alternatively, a quick & dirty mini terminal has been added to the code 
to upload an MP3 file into the serial flash. For this option an RS232 
interface has to be attached to USART0.
In either case a two-byte value (low byte first) holding the number of 
MP3 frames has to be inserted at the beginning of the MP3 file (to 
calculate it, just divide the length of the file by the frame size of 
144 bytes).
If you decide to use the built-in upload terminal, connect your PC to 
the RS232 interface using 115k2 8n2 and no flow control (I recommend 
TeraTerm). Keep PIND.3 at logic H and reset the controller. The built-in 
terminal should respond and erase the current flash memory content. When 
prompted, start the upload in BINARY mode (plain file upload, no 
transfer protocol like X- or Z-Modem). Once the upload has finished, put 
a logic L on PIND.3 and reset the controller, and playback should start.
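
Preparing the flash image (frame count in front, low byte first) and estimating the upload time can be sketched as follows. The helper names are made up, rejecting files that are not a multiple of 144 bytes is a choice made for this sketch, and the time estimate only counts the raw serial transfer at 11 bits per byte (1 start, 8 data, 2 stop).

```python
import struct

FRAME_BYTES = 144  # frame size at 16 kbit/s and 8 kHz


def make_flash_image(mp3: bytes) -> bytes:
    """Prepend the 16-bit little-endian frame count to the MP3 data."""
    if len(mp3) % FRAME_BYTES:
        raise ValueError("length is not a multiple of 144 bytes "
                         "(ID3 tag or truncated frame?)")
    n_frames = len(mp3) // FRAME_BYTES
    if n_frames > 0xFFFF:
        raise ValueError("frame count does not fit into two bytes")
    return struct.pack("<H", n_frames) + mp3


def upload_seconds(n_bytes: int, baud: int = 115200,
                   bits_per_byte: int = 11) -> float:
    """Raw transfer time for 8n2 framing (1 start + 8 data + 2 stop bits)."""
    return n_bytes * bits_per_byte / baud


if __name__ == "__main__":
    with open("speech.mp3", "rb") as f:        # hypothetical file name
        image = make_flash_image(f.read())
    with open("speech_flash.bin", "wb") as f:  # hypothetical file name
        f.write(image)
    print(f"{len(image)} bytes, about {upload_seconds(len(image)):.0f} s "
          f"to upload at 115k2 8n2")
```

A two-byte count allows up to 65535 frames, which is more than an 8 MiB flash can hold at 144 bytes per frame.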

There are several additional flags to control whether a particular 
function is optimized for speed or for lower code flash memory usage. 
They basically switch between inserting fully expanded macros (most of 
them multiply operations) and plain subroutine calls. Depending on the 
controller's clock frequency you can save approx. half of the code 
memory (the size of the tables won't be reduced), and there's still room 
to save a few more bytes because, for instance, currently unrolled loops 
remain unrolled.

It would be interesting to see whether the decoder part of other speech 
compression codecs could also be implemented on an ATmega, perhaps an 
even more efficient one such as Wideband AMR/G.722.2 (maybe only a 
subset of the codec modes for a start).
Any volunteers?

by Horst M. (horst)


Read the final part of the story here: 
Post "Software MP3 decoder for ATmega/ATxmega"
