The hardware and audio data requirements for the Audio classification IV are listed in the table:

Microphone	The microphone must be physically connected, correctly recognized by the operating system as an audio capture device, and enabled in Axxon One (see Microphone). Depending on the microphone type (analog, USB, or IP microphone), you must ensure a stable power supply and adequate signal levels
Stream	The stream must be mono (single channel). Stereo or multi-channel streams must be downmixed to mono before being fed to the detector
Audio data format	The data format is 16-bit signed integer PCM (little-endian). Other formats (32-bit float, A-law, μ-law, and so on) require conversion on the source side
Sample rate	Supported sampling rates are 8000, 16000, 32000, and 48000 Hz. The recommended sampling rate is 16000 Hz. Lower rates reduce recognition quality; higher rates are redundant and increase the load on transmission and processing channels
Programmable gain amplifier	The device must be equipped with a programmable gain amplifier (PGA) with small gain adjustment steps (1 dB) and low self-noise. This allows precise adjustment of the microphone's sensitivity without introducing additional nonlinear distortion
Digital signal processing (DSP)	All audio processing algorithms that could distort the original signal and interfere with recognition must be disabled in the hardware and drivers: automatic gain control (AGC); noise suppression/cancellation; beamforming; acoustic echo cancellation (AEC); equalizers (EQ) and high-pass filters (HPF); de-esser; spatial sound and virtual surround. The device must provide a way to disable each feature (physical switch, driver settings, or control panel). Built-in audio processing interferes with the operation of the analytical algorithms
Sound pressure level (SPL)	The sound pressure level of the target event, measured at the microphone installation point with a sound level meter (Class 2 according to IEC 61672), must meet the following operating conditions. In a noisy environment (production facility, busy street, room with operating equipment): 80–82 dB or higher. In a quiet environment (office, meeting room, living room): 58–60 dB or higher. If the signal level does not meet these requirements, the system can fail to detect events or generate false alarms
Operating system	Windows OS: Use the standard audio API and redirect the stream via WASAPI or ASIO. In the audio control panel, go to Device Properties → Additional device properties → Enhancements and select the Disable all enhancements checkbox. If the tab is missing, check the settings in your sound card driver utility. Linux OS: Use ALSA-compatible drivers with basic support. If you use PulseAudio or PipeWire, ensure they are configured for "transparent" mode (flat volumes, noise reduction disabled, and attenuation disabled). If you work with embedded codecs, refer to documented Device Tree Source (DTS) examples to ensure that audio data is transmitted without additional processing
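
The Stream, Audio data format, and Sample rate requirements can be illustrated with a short source-side conversion. This is a minimal sketch and not part of Axxon One. It assumes the source delivers frames of floating-point samples in the range [-1.0, 1.0] and that averaging the channels is an acceptable downmix. The function names are hypothetical and chosen for illustration only.

```python
import struct

# Sampling rates listed in the "Sample rate" row of the table.
SUPPORTED_RATES = (8000, 16000, 32000, 48000)


def downmix_to_mono(frames):
    """Average the channels of every frame (assumed downmix method).

    frames: sequence of tuples, one float per channel, each in [-1.0, 1.0].
    """
    return [sum(frame) / len(frame) for frame in frames]


def float_to_pcm16le(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit signed little-endian PCM bytes."""
    ints = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # guard against overshoot
        ints.append(int(round(clipped * 32767)))
    # "<" selects little-endian byte order, "h" a signed 16-bit integer.
    return struct.pack("<%dh" % len(ints), *ints)


def prepare_for_detector(frames, sample_rate):
    """Validate the sampling rate, then return mono 16-bit PCM bytes."""
    if sample_rate not in SUPPORTED_RATES:
        raise ValueError("unsupported sampling rate: %d Hz" % sample_rate)
    return float_to_pcm16le(downmix_to_mono(frames))
```

For example, `prepare_for_detector([(1.0, 1.0), (0.0, 0.0)], 16000)` returns four bytes: two mono samples of two bytes each. Resampling is deliberately left out of the sketch; a stream at an unsupported rate is rejected instead of being converted.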