rtf_filter(3) filters a chunk of data


#include <rtfilter.h>

unsigned int rtf_filter(hfilter filt, const void *x,
void *y, unsigned int ns);


This function applies the filter referenced by filt to ns samples read from the array pointed to by x and writes the filtered data into the array pointed to by y. Both arrays must be made of values whose type corresponds to the type specified at the creation of the filter. In addition, the two arrays must not overlap; failing to comply leads to undefined results.

The number of elements in each array must equal ns multiplied by the number of channels processed (specified at the creation of the filter). The arrays must be packed by channel with the following pattern:

S1C1 S1C2 ... S1Ck S2C1 S2C2 ... S2Ck ... SnsC1 SnsC2 ... SnsCk


where SiCj refers to the data value of the i-th sample of the j-th channel, and k refers to the number of channels specified at the creation of the filter.
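
The layout described above can be sketched with a small indexing helper. This is only an illustration: sample_index() and extract_channel() are hypothetical names that are not part of rtfilter, the data type is assumed to be float, and indices are zero-based here whereas the notation above counts from 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of rtfilter): position of sample i,
 * channel j (both zero-based) in an array packed by channel. */
size_t sample_index(size_t i, size_t j, size_t nch)
{
    assert(j < nch);
    return i * nch + j;
}

/* Copy channel j out of a packed float buffer holding ns samples
 * of nch channels each. out must have room for ns values. */
void extract_channel(const float *data, size_t ns, size_t nch,
                     size_t j, float *out)
{
    for (size_t i = 0; i < ns; i++)
        out[i] = data[sample_index(i, j, nch)];
}
```

With 3 channels, the value of the second sample of the third channel (S2C3) sits at sample_index(1, 2, 3), that is, at offset 5.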


Returns the number of samples written to the array pointed to by y. For most filters, this value is always equal to ns. This is however not the case for a downsampling filter, for which the number of samples returned may vary from one call to the next.


On platforms that support SIMD instructions, rtf_filter() is implemented in two versions: a normal one and one using the SIMD instruction set, which runs nearly 4x faster than the normal one when processing float data. The SIMD version is selected automatically at runtime if the following conditions are met (otherwise, the implementation falls back to the normal version):

The input x and the output y are aligned on a 16-byte boundary (128 bits).
The sample strides (the size of the data type multiplied by the number of channels) of the input and output are multiples of 16 bytes.
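
The two conditions can be verified from the calling code before the buffers are handed to rtf_filter(). The sketch below is only an illustration of the conditions as stated here: simd_conditions_met() is a hypothetical helper, not an rtfilter function, and it does not reflect how the library performs its own runtime selection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check (not part of rtfilter): returns 1 when both
 * documented conditions hold, 0 otherwise.
 * in_size and out_size are the sizes in bytes of one input and one
 * output value; nch is the number of channels. */
int simd_conditions_met(const void *x, const void *y,
                        size_t in_size, size_t out_size, size_t nch)
{
    assert(in_size > 0 && out_size > 0 && nch > 0);

    /* Condition 1: both buffers start on a 16-byte boundary. */
    if ((uintptr_t)x % 16 != 0 || (uintptr_t)y % 16 != 0)
        return 0;

    /* Condition 2: both sample strides are multiples of 16 bytes. */
    return (in_size * nch) % 16 == 0 && (out_size * nch) % 16 == 0;
}
```

For float input and output, 4 channels give a stride of 16 bytes and pass the check, while 3 channels give 12 bytes and fail it.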

The first condition is easily met by allocating x and y with a memory allocation function that guarantees a given alignment (for example, posix_memalign(3) on POSIX platforms).
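
A minimal allocation sketch using posix_memalign(3) follows. It assumes a POSIX platform and float data; alloc_aligned_floats() and is_aligned16() are hypothetical names introduced for illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: 1 if p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return (uintptr_t)p % 16 == 0;
}

/* Hypothetical helper: allocate room for ns samples of nch float
 * channels on a 16-byte boundary. Returns NULL on failure.
 * The buffer must be released with free(). */
float *alloc_aligned_floats(size_t ns, size_t nch)
{
    void *ptr = NULL;

    assert(ns > 0 && nch > 0);
    if (posix_memalign(&ptr, 16, ns * nch * sizeof(float)) != 0)
        return NULL;
    return ptr;
}
```

Both x and y would be allocated this way, as two separate buffers, since the arrays must not overlap.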

The second condition is met if the number of channels is chosen carefully. Given the speedup obtained with the SIMD version, it is often worthwhile to add unused channels to the input and output (when possible) just to make the strides multiples of 16 bytes (for example, by always using a multiple of 4 channels when dealing with real-valued float data).
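
The padding can be computed rather than chosen by hand. The sketch below rounds a channel count up to the next value whose stride is a multiple of 16 bytes; padded_channel_count() is a hypothetical helper, not part of rtfilter, and the filter would then have to be created with the padded count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest channel count >= nch for which
 * sample_size * count is a multiple of 16 bytes.
 * sample_size is the size in bytes of one value (4 for float). */
size_t padded_channel_count(size_t nch, size_t sample_size)
{
    size_t n = nch;

    assert(nch > 0 && sample_size > 0);
    /* Terminates: any multiple of 16 channels satisfies the test. */
    while ((n * sample_size) % 16 != 0)
        n++;
    return n;
}
```

For example, 5 float channels are padded to 8, which gives a stride of 32 bytes; the 3 extra channels carry unused data.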