mirror of
https://github.com/alsa-project/alsa-lib.git
synced 2025-10-29 05:40:25 -04:00
259 lines
12 KiB
Text
259 lines
12 KiB
Text
/*! \page pcm PCM (digital audio) interface
|
|
|
|
<P>Although abbreviation PCM stands for Pulse Code Modulation, we are
|
|
understanding it as general digital audio processing with volume samples
|
|
generated in continuous time periods.</P>
|
|
|
|
<P>Digital audio is the most commonly used method of representing
|
|
sound inside a computer. In this method sound is stored as a sequence of
|
|
samples taken from the audio signal using constant time intervals.
|
|
A sample represents volume of the signal at the moment when it
|
|
was measured. In uncompressed digital audio each sample require one
|
|
or more bytes of storage. The number of bytes required depends on number
|
|
of channels (mono, stereo) and sample format (8 or 16 bits, mu-Law, etc.).
|
|
The length of this interval determines the sampling rate. Commonly used
|
|
sampling rates are between 8kHz (telephone quality) and
|
|
48kHz (DAT tapes).</P>
|
|
|
|
<P>The physical devices used in digital audio are called the
|
|
ADC (Analog to Digital Converter) and DAC (Digital to Analog Converter).
|
|
A device containing both ADC and DAC is commonly known as a codec.
|
|
The codec device used in a Sound Blaster cards is called a DSP which
|
|
is somewhat misleading since DSP also stands for Digital Signal Processor
|
|
(the SB DSP chip is very limited when compared to "true" DSP chips).</P>
|
|
|
|
<P>Sampling parameters affect the quality of sound which can be
|
|
reproduced from the recorded signal. The most fundamental parameter
|
|
is sampling rate which limits the highest frequency that can be stored.
|
|
It is well known (Nyquist's Sampling Theorem) that the highest frequency
|
|
that can be stored in a sampled signal is at most 1/2 of the sampling
|
|
frequency. For example, an 8 kHz sampling rate permits the recording of
|
|
a signal in which the highest frequency is less than 4 kHz. Higher frequency
|
|
signals must be filtered out before feeding them to ADC.</P>
|
|
|
|
<P>Sample encoding limits the dynamic range of a recorded signal
|
|
(difference between the faintest and the loudest signal that can be
|
|
recorded). In theory the maximum dynamic range of signal is number_of_bits *
|
|
6dB. This means that 8 bits sampling resolution gives dynamic range of
|
|
48dB and 16 bit resolution gives 96dB.</P>
|
|
|
|
<P>Quality has price. The number of bytes required to store an audio
|
|
sequence depends on sampling rate, number of channels and sampling
|
|
resolution. For example just 8000 bytes of memory is required to store
|
|
one second of sound using 8kHz/8 bits/mono but 48kHz/16bit/stereo takes
|
|
192 kilobytes. A 64 kbps ISDN channel is required to transfer a
|
|
8kHz/8bit/mono audio stream in real time, and about 1.5Mbps is required
|
|
for DAT quality (48kHz/16bit/stereo). On the other hand it is possible
|
|
to store just 5.46 seconds of sound in a megabyte of memory when using
|
|
48kHz/16bit/stereo sampling. With 8kHz/8bits/mono it is possible to store
|
|
131 seconds of sound using the same amount of memory. It is possible
|
|
to reduce memory and communication costs by compressing the recorded
|
|
signal but this is beyond the scope of this document. </P>
|
|
|
|
\section pcm_general_overview General overview
|
|
|
|
ALSA uses the ring buffer to store outgoing (playback) and incoming (capture,
|
|
record) samples. There are two pointers being mantained to allow
|
|
a precise communication between application and device pointing to current
|
|
processed sample by hardware and last processed sample by application.
|
|
The modern audio chips allow to program the transfer time periods.
|
|
It means that the stream of samples is divided to small chunks. Device
|
|
acknowledges to application when the transfer of a chunk is complete.
|
|
|
|
\section pcm_transfer Transfer methods in unix environments
|
|
|
|
In the unix environment, data chunk acknowledges are received via standard I/O
|
|
calls or event waiting routines (poll or select function). To accomplish
|
|
this list, the asynchronous notification of acknowledges should be listed
|
|
here. The ALSA implementation for these methods is described in
|
|
the \ref alsa_transfers section.
|
|
|
|
\subsection pcm_transfer_io Standard I/O transfers
|
|
|
|
The standard I/O transfers are using the read (see 'man 2 read') and write
|
|
(see 'man 2 write') C functions. There are two basic behaviours of these
|
|
functions - blocked and non-blocked (see the O_NONBLOCK flag for the
|
|
standard C open function - see 'man 2 open'). In non-blocked behaviour,
|
|
these I/O functions never stops, they return -EAGAIN error code, when no
|
|
data can be transferred (the ring buffer is full in our case). In blocked
|
|
behaviour, these I/O functions stop and wait until there is a room in the
|
|
ring buffer (playback) or until there are a new samples (capture). The ALSA
|
|
implementation can be found in the \ref alsa_pcm_rw section.
|
|
|
|
\subsection pcm_transfer_event Event waiting routines
|
|
|
|
The poll or select functions (see 'man 2 poll' or 'man 2 select' for further
|
|
details) allows to receive the acknowledges from the device while
|
|
application can wait to events from other sources (like keyboard, screen,
|
|
network etc.), too. The select function is old and deprecated in modern
|
|
applications, so the ALSA library does not support it. The implemented
|
|
transfer routines can be found in the \ref alsa_transfers section.
|
|
|
|
\subsection pcm_transfer_async Asynchronous notification
|
|
|
|
ALSA driver and library knows to handle the asynchronous notifications over
|
|
the SIGIO signal. This signal allows to interrupt application and transfer
|
|
data in the signal handler. For further details see the sigaction function
|
|
('man 2 sigaction'). The section \ref pcm_async describes the ALSA API for
|
|
this extension. The implemented transfer routines can be found in the
|
|
\ref alsa_transfers section.
|
|
|
|
\section pcm_open_behaviour Blocked and non-blocked open
|
|
|
|
The ALSA PCM API uses a different behaviour when the device is opened
|
|
with blocked or non-blocked mode. The mode can be specified with
|
|
\a mode argument in \link ::snd_pcm_open() \endlink function.
|
|
The blocked mode is the default (without \link ::SND_PCM_NONBLOCK \endlink mode).
|
|
In this mode, the behaviour is that if the resources have already used
|
|
with another application, then it blocks the caller, until resources are
|
|
free. The non-blocked behaviour (with \link ::SND_PCM_NONBLOCK \endlink)
|
|
doesn't block the caller in any way and returns -EBUSY error when the
|
|
resources are not available. Note that the mode also determines the
|
|
behaviour of standard I/O calls, returning -EAGAIN when non-blocked mode is
|
|
used and the ring buffer is full (playback) or empty (capture).
|
|
The operation mode for I/O calls can be changed later with
|
|
the \link snd_pcm_nonblock() \endlink function.
|
|
|
|
\section pcm_async Asynchronous mode
|
|
|
|
There is also possibility to receive asynchronous notification after
|
|
specified time periods. You may see the \link ::SND_PCM_ASYNC \endlink
|
|
mode for \link ::snd_pcm_open() \endlink function and
|
|
\link ::snd_async_add_pcm_handler() \endlink function for further details.
|
|
|
|
\section pcm_handshake Handshake between application and library
|
|
|
|
The ALSA PCM API design uses the states to determine the communication
|
|
phase between application and library. The actual state can be determined
|
|
using \link ::snd_pcm_state() \endlink call. There are these states:
|
|
|
|
\par SND_PCM_STATE_OPEN
|
|
The PCM device is in the open state. After the \link ::snd_pcm_open() \endlink open call,
|
|
the device is in this state. Also, when \link ::snd_pcm_hw_params() \endlink call fails,
|
|
then this state is entered to force application calling
|
|
\link ::snd_pcm_hw_params() \endlink function to set right communication
|
|
parameters.
|
|
|
|
\par SND_PCM_STATE_SETUP
|
|
The PCM device has accepted communication parameters and it is waiting
|
|
for \link ::snd_pcm_prepare() \endlink call to prepare the hardware for
|
|
selected operation (playback or capture).
|
|
|
|
\par SND_PCM_STATE_PREPARE
|
|
The PCM device is prepared for operation. Application can use
|
|
\link ::snd_pcm_start() \endlink call, write or read data to start
|
|
the operation.
|
|
|
|
\par SND_PCM_STATE_RUNNING
|
|
The PCM device is running. It processes the samples. The stream can
|
|
be stopped using the \link ::snd_pcm_drop() \endlink or
|
|
\link ::snd_pcm_drain \endlink calls.
|
|
|
|
\par SND_PCM_STATE_XRUN
|
|
The PCM device reached overrun (capture) or underrun (playback).
|
|
You can use the -EPIPE return code from I/O functions
|
|
(\link ::snd_pcm_writei() \endlink, \link ::snd_pcm_writen() \endlink,
|
|
\link ::snd_pcm_readi() \endlink, \link ::snd_pcm_readi() \endlink)
|
|
to determine this state without checking
|
|
the actual state via \link ::snd_pcm_state() \endlink call. You can recover from
|
|
this state with \link ::snd_pcm_prepare() \endlink,
|
|
\link ::snd_pcm_drop() \endlink or \link ::snd_pcm_drain() \endlink calls.
|
|
|
|
\par SND_PCM_STATE_DRAINING
|
|
The device is in this state when application using the capture mode
|
|
called \link ::snd_pcm_drain() \endlink function. Until all data are
|
|
read from the internal ring buffer using I/O routines
|
|
(\link ::snd_pcm_readi() \endlink, \link ::snd_pcm_readn() \endlink),
|
|
then the device stays in this state.
|
|
|
|
\par SND_PCM_STATE_PAUSED
|
|
The device is in this state when application called
|
|
the \link ::snd_pcm_pause() \endlink function until the pause is released.
|
|
Not all hardware supports this feature. Application should check the
|
|
capability with the \link ::snd_pcm_hw_params_can_pause() \endlink.
|
|
|
|
\par SND_PCM_STATE_SUSPENDED
|
|
The device is in the suspend state provoked with the power management
|
|
system. The stream can be resumed using \link ::snd_pcm_resume() \endlink
|
|
call, but not all hardware supports this feature. Application should check
|
|
the capability with the \link ::snd_pcm_hw_params_can_resume() \endlink.
|
|
In other case, the calls \link ::snd_pcm_prepare() \endlink,
|
|
\link ::snd_pcm_drop() \endlink, \link ::snd_pcm_drain() \endlink can be used
|
|
to leave this state.
|
|
|
|
\section pcm_formats PCM formats
|
|
|
|
The full list of formats present the \link ::snd_pcm_format_t \endlink type.
|
|
The 24-bit linear samples uses 32-bit physical space, but the sample is
|
|
stored in low three bits. Some hardware does not support processing of full
|
|
range, thus you may get the significative bits for linear samples via
|
|
\link ::snd_pcm_hw_params_get_sbits \endlink function. The example: ICE1712
|
|
chips support 32-bit sample processing, but low byte is ignored (playback)
|
|
or zero (capture). The function \link ::snd_pcm_hw_params_get_sbits() \endlink
|
|
returns 24 in the case.
|
|
|
|
\section alsa_transfers ALSA transfers
|
|
|
|
There are two methods to transfer samples in application. The first method
|
|
is the standard read / write one. The second method, uses the direct audio
|
|
buffer to communicate with the device while ALSA library manages this space
|
|
itself. You can find example of all communication schemes for playback
|
|
in \ref example_test_pcm "Sine-wave generator example".
|
|
|
|
\subsection alsa_pcm_rw Read / Write transfer
|
|
|
|
There are two versions of read / write routines. The first expects the
|
|
interleaved samples at input, and the second one expects non-interleaved
|
|
(samples in separated buffers) at input. There are these functions for
|
|
interleaved transfers: \link ::snd_pcm_writei \endlink,
|
|
\link ::snd_pcm_readi \endlink. For non-interleaved transfers, there are
|
|
these functions: \link ::snd_pcm_writen \endlink and \link ::snd_pcm_readn
|
|
\endlink.
|
|
|
|
\subsection alsa_mmap_rw Direct Read / Write transfer (via mmaped areas)
|
|
|
|
There are two functions for this kind of transfer. Application can get an
|
|
access to memory areas via \link ::snd_pcm_mmap_begin \endlink function.
|
|
This functions returns the areas (single area is equal to a channel)
|
|
containing the direct pointers to memory and sample position description
|
|
in \link ::snd_pcm_channel_area_t \endlink structure. After application
|
|
transfers the data in the memory areas, then it must be acknowledged
|
|
the end of transfer via \link ::snd_pcm_mmap_commit() \endlink function
|
|
to allow the ALSA library update the pointers to ring buffer. This sort of
|
|
communication is also called "zero-copy", because the device does not require
|
|
to copy the samples from application to another place in system memory.
|
|
|
|
\par
|
|
|
|
If you like to use the compatibility functions in mmap mode, there are
|
|
read / write routines equaling to standard read / write transfers. Using
|
|
these functions discards the benefits of direct access to memory region.
|
|
See the \link ::snd_pcm_mmap_readi() \endlink,
|
|
\link ::snd_pcm_writei() \endlink, \link ::snd_pcm_readn() \endlink
|
|
and \link ::snd_pcm_writen() \endlink functions.
|
|
|
|
\section pcm_examples Examples
|
|
|
|
The full featured examples with cross-links:
|
|
|
|
\par Sine-wave generator
|
|
\ref example_test_pcm "example code"
|
|
\par
|
|
This example shows various transfer methods for the playback direction.
|
|
|
|
\par Latency measuring tool
|
|
\ref example_test_latency "example code"
|
|
\par
|
|
This example shows the measuring of minimal latency between capture and
|
|
playback devices.
|
|
|
|
*/
|
|
|
|
/**
|
|
* \example ../test/pcm.c
|
|
* \anchor example_test_pcm
|
|
*/
|
|
/**
|
|
* \example ../test/latency.c
|
|
* \anchor example_test_latency
|
|
*/
|