Master Module

class sudio.core.Master(input_device_index=None, output_device_index=None, data_format=sudio.io.SampleFormat.SIGNED16, nperseg=500, noverlap=None, window='hann', NOLA_check=True, input_dev_sample_rate=44100, input_dev_nchannels=2, input_dev_callback=None, output_dev_nchannels=2, output_dev_callback=None, buffer_size=30, audio_data_directory='./sudio/')

Bases: object

The Master class is responsible for managing audio data streams, applying windowing, and handling input/output devices. This class provides various methods for processing, recording, and playing audio data, with functionality to handle different audio formats, sample rates, and device configurations.

Parameters

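  • input_device_index : int, optional

    Index of the input audio device. If None, uses the system’s default input device.

  • output_device_index : int, optional

    Index of the output audio device. If None, uses the system’s default output device.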

  • data_format : SampleFormat, default=SampleFormat.SIGNED16

    Audio sample format (e.g., FLOAT32, SIGNED16, UNSIGNED8).

  • nperseg : int, default=500

    Number of samples per segment for windowing and processing.

  • noverlap : int, optional

    Number of overlapping samples between segments. If None, defaults to half of nperseg.

  • window : str, float, tuple, or ndarray, default='hann'

    Window function type for signal processing:

    • String: scipy.signal window type (e.g., ‘hamming’, ‘blackman’)

    • Float: Beta parameter for Kaiser window

    • Tuple: Window name with parameters

    • ndarray: Custom window values

    • None: Disables windowing

  • NOLA_check : bool, default=True

    Whether to verify the Nonzero Overlap Add (NOLA) constraint.

  • input_dev_sample_rate : int, default=44100

    Input device sample rate in Hz. Behavior depends on the value:

    • If a specific value is provided: uses the given sample rate

    • If None: automatically selects the default sample rate of the input device (if one exists)

    • If the selected rate is unsupported: raises an error

    • Recommended range: typically between 8000 Hz and 96000 Hz

  • input_dev_nchannels : int, default=2

    Number of input device channels

  • input_dev_callback : callable, optional

    Custom callback for input device processing

  • output_dev_nchannels : int, default=2

    Number of output device channels

  • output_dev_callback : callable, optional

    Custom callback for output device processing

  • buffer_size : int, default=30

    Size of the audio stream buffer

  • audio_data_directory : str, default='./sudio/'

    Directory for storing audio data files
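As a rough illustration of how nperseg and noverlap interact, the sketch below splits a sample sequence into overlapping segments in plain Python. The helper name split_segments is hypothetical and is not part of sudio; it only mirrors the documented default of noverlap being half of nperseg, and it drops any trailing samples that do not fill a whole segment.

```python
def split_segments(samples, nperseg=500, noverlap=None):
    """Split samples into segments of nperseg samples that share noverlap samples.

    Hypothetical helper for illustration only; not part of the sudio API.
    """
    if noverlap is None:
        noverlap = nperseg // 2  # documented default: half of nperseg
    if not 0 <= noverlap < nperseg:
        raise ValueError("noverlap must satisfy 0 <= noverlap < nperseg")
    hop = nperseg - noverlap  # distance between the starts of adjacent segments
    last_start = len(samples) - nperseg
    return [samples[i:i + nperseg] for i in range(0, last_start + 1, hop)]


segments = split_segments(list(range(10)), nperseg=4)
print(segments)  # four segments, each starting 2 samples after the previous one
```

With nperseg=4 and the default overlap of 2, consecutive segments share half of their samples, which is the layout an overlap-add reconstruction relies on.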

Notes

  • Various methods are available for audio processing, pipeline management, and device control. Refer to individual method docstrings for details.

  • The class uses multi-threading for efficient audio stream management.

  • Window functions are crucial for spectral analysis and should be chosen carefully.

  • NOLA constraint ensures proper reconstruction in overlap-add methods.

  • Custom callbacks allow for flexible input/output handling but require careful implementation.
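To make the NOLA note more concrete, here is a minimal pure-Python sketch of the check, assuming the usual definition: the squared window, summed over all copies shifted by the hop size, must stay above a small tolerance at every sample position. The function names hann and satisfies_nola are illustrative and are not sudio functions.

```python
import math


def hann(n):
    """Periodic Hann window of length n, built from the standard cosine formula."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def satisfies_nola(window, noverlap, tol=1e-10):
    """Return True if the shifted, squared copies of window never sum to ~0."""
    nperseg = len(window)
    hop = nperseg - noverlap
    if hop <= 0:
        raise ValueError("noverlap must be smaller than the window length")
    # Samples whose indices are equal modulo hop overlap after shifting,
    # so their squared values are accumulated into the same bin.
    sums = [0.0] * hop
    for i, w in enumerate(window):
        sums[i % hop] += w * w
    return min(sums) > tol


print(satisfies_nola(hann(500), 250))  # True: half overlap covers every sample
print(satisfies_nola(hann(500), 0))    # False: the window starts at zero
```

The second call fails because, without overlap, the zero at the start of the Hann window leaves one sample position with no energy to reconstruct from.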

CACHE_INFO_SIZE = 32
BUFFER_TYPE = '.su'
start()

Starts the audio input and output streams and launches any registered threads.

Returns:

  • self (Master) – Returns the instance of the Master class for method chaining.

add_file(filename, sample_format=sudio.io.SampleFormat.UNKNOWN, nchannels=None, sample_rate=None, safe_load=True)

Adds an audio file to the database with optional format and parameter adjustments.

Supports WAV, FLAC, VORBIS, and MP3 file formats.

Parameters:

filename : str

Path/name of the audio file to add.

sample_format : SampleFormat, optional

Desired sample format. By default, the format is detected automatically or the master’s format is used.

nchannels : int, optional

Number of audio channels. Defaults to file’s original channel count.

sample_rate : int, optional

Desired sample rate. Defaults to file’s original rate.

safe_load : bool, default True

If True, modifies file to match master object’s audio attributes.

Returns:

AudioWrap

Wrapped audio file with processed metadata and data.

Raises:

ImportError

If safe_load is True and file’s channel count exceeds master’s channels.

add(record, safe_load=True)

Adds audio data to the local database from various input types.

Supports adding:

  • AudioWrap objects

  • Audio file paths (MP3, WAV, FLAC, VORBIS)

  • AudioMetadata records

Parameters:

record : Union[AudioWrap, str, AudioMetadata]

The audio record to add to the database.

safe_load : bool, default True

If True, ensures record matches master object’s audio specifications.

Returns:

AudioWrap

Processed and wrapped audio record.

Raises:

ImportError

If safe_load is True and record’s channel count exceeds master’s.

TypeError

If record is not a supported type.

Notes:

  • Uses cached files for optimized memory management

  • Execution time may vary based on cached file state

Examples:

master = sudio.Master()
audio = master.add('./alan kujay.mp3')

master1 = sudio.Master()
audio1 = master1.add(audio)

master.echo(audio[:10])
master1.echo(audio1[:10])
recorder(record_duration, name=None)

Record audio for a specified duration. This method captures audio input for a given duration and stores it as a new record in the Master object’s database.

Parameters:

record_duration : float

The duration of the recording in seconds.

name : str, optional

A custom name for the recorded audio. If None, a timestamp-based name is generated.

Returns:

AudioWrap instance

Notes:

  • The recording uses the current audio input settings of the Master object (sample rate, number of channels, etc.).

  • The recorded audio is automatically added to the Master’s database and can be accessed later using the provided or generated name.

  • This method temporarily modifies the internal state of the Master object to facilitate recording. It restores the previous state after recording is complete.

Examples:

Record for 5 seconds with an auto-generated name

>>> recorded_audio = master.recorder(5)

Record for 10 seconds with a custom name

>>> recorded_audio = master.recorder(10, name="my_recording")

Use the recorded audio

>>> master.echo(recorded_audio)
load(name, safe_load=True, series=False)

Loads a record from the local database. Loading a record that was previously loaded returns a wrapped version of the named record.

Parameters:
  • name (str) – record name

  • safe_load (bool) – Whether to load the record safely. If enabled, the record is loaded from the local database according to the master settings, such as the frame rate (default: True).

  • series (bool) – Return the record as a series (default: False).

Return type:

Union[AudioWrap, AudioMetadata]

Returns:

(optional) AudioWrap object, AudioMetadata

get_record_info(record)

Retrieves metadata for a given record.

Parameters:

record (Union[str, AudioWrap]) – The record (str, or AudioWrap) whose info is requested.

Return type:

dict

Returns:

Information about the saved record as a dict with the keys ‘frameRate’, ‘sizeInByte’, ‘duration’, ‘nchannels’, ‘nperseg’, and ‘name’.

syncable(*target, nchannels=None, sample_rate=None, sample_format=sudio.io.SampleFormat.UNKNOWN)

Prepares a list of targets to be synchronized. Determines whether each target can be synced with the specified properties.

Parameters:
  • target – Targets to check, given as wrapped objects.

  • nchannels (int) – Number of channels (default: None); if the value is None, the target will be compared to the ‘self’ properties.

  • sample_rate (int) – Sample rate (default: None); if the value is None, the target will be compared to the ‘self’ properties.

  • sample_format (SampleFormat) – Sample format (default: SampleFormat.UNKNOWN); if the value is None, the target will be compared to the ‘self’ properties.

Returns:

only objects that need to be synchronized.
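The filtering idea behind syncable can be sketched in a few lines: keep only the targets whose properties differ from the requested ones. The Record dataclass and the needs_sync function below are hypothetical stand-ins written for this illustration; they are not sudio classes, and the sample format comparison is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a wrapped record; not a sudio class."""
    name: str
    nchannels: int
    sample_rate: int


def needs_sync(targets, nchannels, sample_rate):
    """Return only the targets whose properties differ from the requested ones."""
    return [t for t in targets
            if t.nchannels != nchannels or t.sample_rate != sample_rate]


records = [Record("a", 2, 44100), Record("b", 1, 44100), Record("c", 2, 48000)]
out_of_sync = needs_sync(records, nchannels=2, sample_rate=44100)
print([r.name for r in out_of_sync])  # ['b', 'c']
```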

sync(*targets, nchannels=None, sample_rate=None, sample_format=sudio.io.SampleFormat.UNKNOWN, output='wrapped')

Synchronizes audio across multiple records. Synchronizes targets in the AudioWrap object format with the specified properties.

Parameters:
  • targets – Records to sync, given as wrapped objects.

  • nchannels (int) – Number of channels (default: None); if the value is None, the target will be synced to the ‘self’ properties.

  • sample_rate (int) – Sample rate (default: None); if the value is None, the target will be synced to the ‘self’ properties.

  • sample_format (SampleFormat) – Sample format (default: SampleFormat.UNKNOWN); if the value is None, the target will be synced to the ‘self’ properties.

  • output (str) – Output type: ‘wrapped’, ‘series’, or ‘ndarray_data’ (default: ‘wrapped’).

Returns:

synchronized objects.

del_record(record)

Deletes a record from the local database.

Parameters:

record (Union[str, AudioMetadata, AudioWrap]) – Record to delete (str, AudioMetadata, or AudioWrap).

export(obj, file_path='./', format=sudio.io.FileFormat.UNKNOWN, quality=0.5, bitrate=128)

Exports a record to a file in WAV, MP3, FLAC, or VORBIS format. The output format can be specified either through the format argument or derived from the file extension in the file_path. If a file extension (‘.wav’, ‘.mp3’, ‘.flac’, or ‘.ogg’) is included in file_path, it takes precedence over the format argument. If no extension is provided, the format argument is used, defaulting to WAV if set to FileFormat.UNKNOWN. The exported file is saved at the specified file_path.

Parameters:
  • obj (Union[str, AudioMetadata, AudioWrap]) – Record to export (str, AudioMetadata, or AudioWrap).

    • str: Path to a file to be loaded and exported.

    • AudioMetadata: A metadata object containing audio data.

    • AudioWrap: Objects that wrap or generate the audio data.

  • file_path (str) – Path to save the exported file (default: ‘./’).

    • A new filename can be specified at the end of the path.

    • If a valid file extension (‘.wav’, ‘.mp3’, ‘.flac’, or ‘.ogg’) is provided, it determines the output format, overriding the format argument.

    • If no extension is included and the path is set to ‘./’, the name of the record is used.

  • format (FileFormat) – Output format (FileFormat.WAV, FileFormat.MP3, FileFormat.FLAC, or FileFormat.VORBIS). Defaults to FileFormat.UNKNOWN, which results in WAV being chosen unless a valid extension is provided in file_path.

  • quality (float) – Quality setting for encoding (default: 0.5).

    • For WAV: Ignored

    • For MP3: Converted to scale 0-9 (0 highest, 9 lowest)

    • For FLAC: Converted to scale 0-8 (0 fastest/lowest, 8 slowest/highest)

    • For VORBIS: Used directly (0.0 lowest, 1.0 highest)

  • bitrate (int) – Bitrate for MP3 encoding in kbps (default: 128). Only used if the format is MP3.

Returns:

None

Raises:

  • TypeError: Raised if obj is not one of the expected types (str, AudioMetadata, or AudioWrap).

  • ValueError: Raised if an unsupported format is provided.
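The precedence rule described for export (a recognized file extension first, then the format argument, then WAV as the fallback) can be sketched as follows. The Format enum is a hypothetical stand-in for sudio.io.FileFormat, and resolve_format is an illustrative helper rather than sudio's actual implementation; matching extensions case-sensitively is an assumption made here for simplicity.

```python
from enum import Enum
from pathlib import Path


class Format(Enum):
    """Hypothetical stand-in for sudio.io.FileFormat."""
    UNKNOWN = 0
    WAV = 1
    MP3 = 2
    FLAC = 3
    VORBIS = 4


# Extensions listed in the export documentation and the format each one selects.
EXTENSIONS = {".wav": Format.WAV, ".mp3": Format.MP3,
              ".flac": Format.FLAC, ".ogg": Format.VORBIS}


def resolve_format(file_path, fmt=Format.UNKNOWN):
    """Pick the output format: a known extension wins, then fmt, then WAV."""
    suffix = Path(file_path).suffix
    if suffix in EXTENSIONS:
        return EXTENSIONS[suffix]
    return Format.WAV if fmt is Format.UNKNOWN else fmt


print(resolve_format("./song.mp3", Format.FLAC))  # Format.MP3
print(resolve_format("./song", Format.FLAC))      # Format.FLAC
print(resolve_format("./"))                       # Format.WAV
```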

get_record_names()

Returns a list of record names in the local database.

Return type:

list

get_nperseg()

Returns the number of samples per segment.

get_nchannels()

Returns the number of audio channels.

get_sample_rate()

Returns the sample rate of the master instance’s core processor.

stream(record, block_mode=False, safe_load=False, on_stop=None, loop_mode=False, use_cached_files=True, stream_mode=StreamMode.optimized)

Streams a record with optional loop and safe load modes.

Note

Cached files are used alongside the audio data to reduce dynamic memory usage and improve performance, so the execution time of the audio data storage methods can vary with the state of the cached files.

Note

The recorder can only capture normal (non-optimized) streams.

Parameters:
  • record (Union[str, AudioMetadata, AudioWrap]) – Record to stream (str, AudioMetadata, or AudioWrap).

  • block_mode (bool) – Whether to block the stream (default: False).

  • safe_load (bool) – Whether to safely load the record (default: False). If enabled, the audio file is loaded and modified according to the ‘Master’ attributes (such as the frame rate and number of channels).

  • on_stop (callable) – Callback for when the stream stops (default: None).

  • loop_mode (bool) – Whether to enable loop mode (default: False).

  • use_cached_files – Whether to use cached files (default: True).

  • stream_mode (StreamMode) – Streaming mode (default: StreamMode.optimized).

Return type:

StreamControl

Returns:

A StreamControl object

mute()

Mutes the master main stream.

unmute()

Unmutes the master main stream.

is_muted()

Checks if the audio stream is muted.

echo(*args, enable=None, main_output_enable=False)

Play audio or manage audio streaming output functionality.

Provides flexible audio playback and echo control with multiple input types and configuration options.

Parameters:

*args : str or AudioMetadata or AudioWrap

Audio sources to play:

  • File path (str)

  • AudioMetadata object

  • AudioWrap object

Multiple sources can be passed simultaneously.

enable : bool, optional

Controls real-time echo behavior when no specific audio is provided:

  • None (default): Toggle echo on/off

  • True: Enable echo

  • False: Disable echo

main_output_enable : bool, default False

Determines whether to maintain the main audio stream’s output during playback. Helps prevent potential audio feedback.

Behavior:

  • With no arguments: Manages real-time echo state

  • With audio sources: Plays specified audio through default output

  • Fallback to web audio if system audio is unavailable

  • Supports multiple audio playbacks

Note:

If system audio fails, it’ll try playing through your web browser using HTML5 audio (if you’re in a supported environment like Jupyter).
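The three-way meaning of the enable argument (None toggles, True enables, False disables) can be captured in a small state function. This is only a sketch of the documented behavior; next_echo_state is a hypothetical name and does not exist in sudio.

```python
def next_echo_state(current, enable=None):
    """Return the new echo state given the current one and the enable argument."""
    if enable is None:
        return not current  # no explicit request: flip the current state
    return bool(enable)  # an explicit True/False wins over the current state


state = False
state = next_echo_state(state)                # toggled on
state = next_echo_state(state, enable=True)   # stays on
state = next_echo_state(state, enable=False)  # forced off
print(state)  # False
```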

Examples:

>>> master = sudio.Master()
>>> master.add('audio1.ogg')
>>> master.add('audio2.ogg')
>>> # Play multiple audio sources
>>> master.echo('audio1', 'audio2.wav')
>>> # Toggle real-time echo
>>> master.echo(enable=True)
disable_echo()

Disables the echo functionality.

wrap(record)

Wraps a record as an AudioWrap.

Parameters:

record (Union[str, AudioMetadata]) – Record to wrap (str or AudioMetadata).

prune_cache()

Retrieve a list of unused or orphaned cache files in the audio data directory.

Scans the audio data directory and identifies cache files that are no longer referenced in the local database, helping manage file system resources.

Returns:

list

Absolute file paths of cache files not associated with any current audio record.

Notes:

  • Compares existing files against local database records

  • Filters out currently used cache files

  • Useful for identifying potential cleanup candidates
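A minimal sketch of the comparison described above: list the cache files in a directory and keep those that no record references. The helper find_orphans and its referenced argument are hypothetical; only the '.su' suffix comes from the BUFFER_TYPE attribute documented for the class.

```python
import os
import tempfile


def find_orphans(directory, referenced, suffix=".su"):
    """List absolute paths of cache files in directory that are not referenced.

    Hypothetical helper; referenced is a set of absolute paths still in use.
    """
    orphans = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.abspath(os.path.join(directory, entry))
        if entry.endswith(suffix) and path not in referenced:
            orphans.append(path)
    return orphans


# Demonstrate on a throwaway directory with one used and one stale cache file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("used.su", "stale.su", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    in_use = {os.path.abspath(os.path.join(tmp, "used.su"))}
    orphan_names = [os.path.basename(p) for p in find_orphans(tmp, in_use)]

print(orphan_names)  # ['stale.su']
```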

clean_cache(max_retries=3, retry_delay=0.1)

Cleans the audio cache by removing cached files. Cached files are used alongside the audio data to reduce dynamic memory usage and improve performance, so the execution time of the audio data storage methods can vary with the state of the cached files.

The function implements retry logic for handling permission errors and ensures files are properly deleted across different operating systems.
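The retry behavior can be pictured with the sketch below, which assumes that max_retries counts deletion attempts and that retry_delay is a pause in seconds between them. The helper remove_with_retry is illustrative only and is not the library's implementation.

```python
import os
import tempfile
import time


def remove_with_retry(path, max_retries=3, retry_delay=0.1):
    """Try to delete path, retrying on PermissionError. Returns True on success."""
    for attempt in range(max_retries):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            return True  # already gone, nothing left to clean
        except PermissionError:
            # Typically another handle still holds the file (common on Windows).
            if attempt < max_retries - 1:
                time.sleep(retry_delay)
    return False


fd, tmp_path = tempfile.mkstemp(suffix=".su")
os.close(fd)
removed = remove_with_retry(tmp_path)
print(removed, os.path.exists(tmp_path))  # True False
```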

is_started()

Checks if the audio input and output streams are started.

get_window()

Retrieves the current window configuration.

Returns:

dict or None: A dictionary containing window information if available, or None if not set.

  • ’type’: str, the type of window.

  • ’window’: window data.

  • ’overlap’: int, the overlap value.

  • ’size’: int, the size of the window.

disable_std_input()

Disables standard input stream by acquiring the main stream’s lock object.

enable_std_input()

Enables standard input stream by clearing the main stream’s lock.

add_pipeline(pip, name=None, process_type=PipelineProcessType.MAIN, channel=None)

Adds a new processing pipeline.

Parameters:

  • pip (obj): Pipeline object or array of defined pipelines.

Note

In PipelineProcessType.MULTI_STREAM process type, pip must be an array of defined pipelines. The size of the array must be the same as the number of input channels.

  • name (str): Indicates the name of the pipeline.

  • process_type (PipelineProcessType): Type of processing pipeline (default: PipelineProcessType.MAIN). it can be:

  • PipelineProcessType.MAIN: Processes input data and passes it to activated pipelines (if any exist).

  • PipelineProcessType.BRANCH: Represents a branch pipeline with optional channel parameter.

  • PipelineProcessType.MULTI_STREAM: Represents a multi_stream pipeline mode. Requires an array of pipelines.

  • channel (obj): None or [0 to self.nchannel].

The input data passed to the pipeline is a NumPy array of shape (self.nchannel, 2, self._nperseg) when channel is None, where the second axis holds the 2 windows in 1 frame, or of shape (2, self._nperseg) for mono data. In mono mode (self.nchannel = 1 or mono mode activated), channel must be None.

Note

The pipeline must process the data and return it to the core with the same dimensions as the input.
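To illustrate the shape-preserving requirement without depending on NumPy, the sketch below models a frame as nested lists with dimensions (nchannels, 2, nperseg) and applies a gain to every sample. Both apply_gain and shape are hypothetical helpers; with the real NumPy input, an element-wise operation such as data * gain keeps the shape in the same way.

```python
def apply_gain(frame, gain):
    """Scale every sample in a nested-list frame, keeping its nesting intact."""
    if isinstance(frame, list):
        return [apply_gain(item, gain) for item in frame]
    return frame * gain


def shape(frame):
    """Return the dimensions of a regular nested list, outermost first."""
    dims = []
    while isinstance(frame, list):
        dims.append(len(frame))
        frame = frame[0]
    return tuple(dims)


# Stand-in for a stereo frame: nchannels=2, 2 windows per frame, nperseg=4.
frame = [[[1.0] * 4 for _ in range(2)] for _ in range(2)]
processed = apply_gain(frame, 0.5)
print(shape(frame), shape(processed))  # (2, 2, 4) (2, 2, 4)
```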

set_pipeline(stream)

Sets the main processing pipeline.

disable_pipeline()

Disables the current processing pipeline.

clear_pipeline()

Clears all pipeline data.

set_window(window='hann', noverlap=None, NOLA_check=True)

Configures the window function for audio processing.

Parameters:
  • window (object) – The window function (default: ‘hann’).

  • noverlap (int) – Number of overlapping samples between segments (default: None).

  • NOLA_check (bool) – Perform the NOLA check (default: True).

get_sample_format()

Returns the sample format of the master instance.

Return type:

SampleFormat

static get_default_input_device_info()

Returns information about the default input audio device.

Return type:

AudioDeviceInfo

static get_default_output_device_info()

Returns information about the default output audio device.

Return type:

AudioDeviceInfo

static get_device_count()

Returns the number of available audio devices.

Return type:

int

static get_device_info_by_index(index)

Returns information about a specific audio device by index.

Parameters:

index (int) – The index of the audio device (int).

Return type:

AudioDeviceInfo

static get_input_devices()

Returns a list of available input devices.

static get_output_devices()

Returns a list of available output devices.