Decoding and playing an audio stream using libavcodec, libavformat and libao

libavcodec and libavformat are pretty awesome when it comes to decoding/demuxing audio streams. They provide a codec- and format-agnostic API that processes containers, extracts metadata, and decodes streams one packet at a time into PCM samples.

Unfortunately, libav is not very easy on beginners. It’s very difficult to find up-to-date code examples, tutorials, or documentation that goes beyond bare API references. So after playing with it for a while, I decided to blog about my audio decoding venture; maybe it will help someone get into programming with libav more easily.

I used the equally awesome libao to send the decoded PCM samples to the audio device. It is both cross-platform and very elegantly simple to use.

#include <stdio.h>
#include <stdlib.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <ao/ao.h>

void die(const char* message)
{
	fprintf(stderr, "%s\n", message);
	exit(1);
}

int main(int argc, char* argv[])
{
	if (argc < 2) {
		die("Please provide the file path as the first argument");
	}

	const char* input_filename = argv[1];

	// Call this once at startup so that libavformat registers all the
	// muxers, demuxers and protocols.
	av_register_all();

	// A media container
	AVFormatContext* container = 0;

	if (avformat_open_input(&container, input_filename, NULL, NULL) < 0) {
		die("Could not open file");
	}

	if (av_find_stream_info(container) < 0) {
		die("Could not find file info");
	}

	int stream_id = -1;

	// Find the first audio stream. This search may not be necessary
	// if you can guarantee that the container contains only the desired
	// audio stream.
	int i;
	for (i = 0; i < container->nb_streams; i++) {
		if (container->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
			stream_id = i;
			break;
		}
	}

	if (stream_id == -1) {
		die("Could not find an audio stream");
	}

	// Extract some metadata
	AVDictionary* metadata = container->metadata;

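	// av_dict_get() returns NULL when a tag is missing, so check each
	// entry before dereferencing it
	AVDictionaryEntry* artist_tag = av_dict_get(metadata, "artist", NULL, 0);
	AVDictionaryEntry* title_tag = av_dict_get(metadata, "title", NULL, 0);
	const char* artist = artist_tag ? artist_tag->value : "Unknown artist";
	const char* title = title_tag ? title_tag->value : "Unknown title";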

	fprintf(stdout, "Playing: %s - %s\n", artist, title);

	// Find the appropriate codec and open it
	AVCodecContext* codec_context = container->streams[stream_id]->codec;

	AVCodec* codec = avcodec_find_decoder(codec_context->codec_id);

	if (!codec) {
		die("Could not find a decoder for the audio stream");
	}

	if (avcodec_open(codec_context, codec) < 0) {
		die("Could not open the needed codec");
	}

	// Initialize libao for playback
	ao_initialize();

	int driver = ao_default_driver_id();

	// The format of the decoded PCM samples
	ao_sample_format sample_format;
	sample_format.bits = 16;
	sample_format.channels = 2;
	sample_format.rate = 44100;
	sample_format.byte_format = AO_FMT_NATIVE;
	sample_format.matrix = 0;

	ao_device* device = ao_open_live(driver, &sample_format, NULL);

	if (!device) {
		die("Could not open the audio device");
	}

	AVPacket packet;
	int buffer_size = AVCODEC_MAX_AUDIO_FRAME_SIZE;
	int8_t buffer[AVCODEC_MAX_AUDIO_FRAME_SIZE];

	while (1) {

		buffer_size = AVCODEC_MAX_AUDIO_FRAME_SIZE;

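		// Read one packet into `packet`
		if (av_read_frame(container, &packet) < 0) {
			break;	// End of stream. Done decoding.
		}

		// Skip packets that belong to other streams
		if (packet.stream_index != stream_id) {
			av_free_packet(&packet);
			continue;
		}

		// Decode from `packet` into the buffer. A negative return value
		// is an error; zero only means no samples were produced.
		if (avcodec_decode_audio3(codec_context, (int16_t*)buffer, &buffer_size, &packet) < 0) {
			av_free_packet(&packet);
			break;	// Error in decoding
		}

		// Send the buffer contents to the audio device
		if (buffer_size > 0) {
			ao_play(device, (char*)buffer, buffer_size);
		}

		// Release the packet data allocated by av_read_frame
		av_free_packet(&packet);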
	}

	avcodec_close(codec_context);
	av_close_input_file(container);

	ao_close(device);
	ao_shutdown();

	fprintf(stdout, "Done playing. Exiting...\n");

	return 0;
}

For the sake of simplicity, I have assumed that each PCM sample is 16 bits wide, that the sample rate is 44.1K samples/second, and that the channel count is 2. In order to extract this information from the container, you’ll have to read at least one packet first so that it becomes available. You could decode that packet and then continue from the next one, or seek back to the beginning of the stream afterwards and start from there, like in the following code:

ao_sample_format sample_format;

AVPacket dummy_packet;
av_read_frame(container, &dummy_packet);
av_free_packet(&dummy_packet);

if (codec_context->sample_fmt == AV_SAMPLE_FMT_U8) {
	sample_format.bits = 8;
} else if (codec_context->sample_fmt == AV_SAMPLE_FMT_S16) {
	sample_format.bits = 16;
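} else if (codec_context->sample_fmt == AV_SAMPLE_FMT_S32) {
	sample_format.bits = 32;
} else {
	// Other formats (float, for example) would need converting first
	die("Unsupported sample format");
}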

sample_format.channels = codec_context->channels;
sample_format.rate = codec_context->sample_rate;
sample_format.byte_format = AO_FMT_NATIVE;
sample_format.matrix = 0;

// Now seek back to the beginning of the stream
av_seek_frame(container, stream_id, 0, AVSEEK_FLAG_ANY);
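
The three fields filled in above (bits, channels and rate) also tell you how much PCM data the audio device consumes per second, which is handy when sizing buffers or estimating how long a decoded chunk will play. Here is a minimal sketch of that arithmetic. The helper names are my own invention for illustration and are not part of libav or libao, and it assumes that bits is a multiple of 8.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of libav or libao. */

/* Bytes needed for one frame, i.e. one sample for every channel. */
size_t pcm_bytes_per_frame(int bits, int channels)
{
	return (size_t)(bits / 8) * (size_t)channels;
}

/* Bytes of PCM data consumed per second of playback. */
size_t pcm_bytes_per_second(int bits, int channels, int rate)
{
	return pcm_bytes_per_frame(bits, channels) * (size_t)rate;
}

/* Playback time in whole milliseconds of a buffer of num_bytes bytes.
 * Returns 0 if the format describes no data at all. */
size_t pcm_duration_ms(size_t num_bytes, int bits, int channels, int rate)
{
	size_t per_second = pcm_bytes_per_second(bits, channels, rate);

	if (per_second == 0) {
		return 0;
	}
	return num_bytes * 1000 / per_second;
}
```

For 16-bit stereo at 44100 Hz, pcm_bytes_per_second(16, 2, 44100) works out to 2 * 2 * 44100 = 176400 bytes, so a buffer of that size holds exactly one second of audio.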

If you run into any trouble understanding the API, you should dig into the header files and check out the interface first-hand. They are very well commented.

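And don’t forget to link against avcodec, avformat and ao (with gcc, for example, by passing -lavformat -lavcodec -lavutil -lao; avutil is needed on some setups because av_dict_get lives there).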

Happy coding!


2 Responses to Decoding and playing an audio stream using libavcodec, libavformat and libao

  1. SiT says:

    I’ve found that the samples in the doc section of libav which show basic usage of the lib’s a very good entry point. They reflect the latest api changes too, so you don’t have to scream because someone uses old deprecated or non-existing methods 🙂

  2. andrey says:

    thank you for manual
