Preprocess (vital_sqi.preprocess)
Signal preprocessing utilities: tapering and smoothing, removal of flat/constant regions, and time- or beat-based segmentation.
Tapering & smoothing
- vital_sqi.preprocess.preprocess_signal.scale_pattern(s, window_size)
Scales or resamples the signal to a specified window size for comparison with a template.
- Parameters:
s (np.ndarray) – Input signal as a 1D array of floats.
window_size (int) – The desired size of the output signal.
- Returns:
Resampled and smoothed signal to match the desired window size.
- Return type:
np.ndarray
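To illustrate what resampling to a fixed window size involves, here is a minimal sketch using linear interpolation with NumPy. It is not the library's implementation: the helper name `resample_to_window` is hypothetical, and the smoothing step mentioned above is left out.

```python
import numpy as np

def resample_to_window(s, window_size):
    """Linearly resample a 1D signal to exactly `window_size` samples.

    Hypothetical helper for illustration only; scale_pattern may use a
    different resampling method and also smooths its output.
    """
    s = np.asarray(s, dtype=float)
    # Place the old and new samples on a common [0, 1] axis.
    old_x = np.linspace(0.0, 1.0, num=len(s))
    new_x = np.linspace(0.0, 1.0, num=window_size)
    return np.interp(new_x, old_x, s)

beat = np.sin(np.linspace(0, np.pi, 80))   # an 80-sample "beat"
scaled = resample_to_window(beat, 100)     # stretched to 100 samples
print(scaled.shape)
```

Once a beat and a template have the same length, they can be compared sample by sample.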
- vital_sqi.preprocess.preprocess_signal.smooth_signal(s, window_len=5, window='flat')
Smooths the signal using a specified window.
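A flat window usually corresponds to a plain moving average. The sketch below shows that idea with NumPy, under the assumption that 'flat' means equal weights. The helper name `moving_average` is hypothetical, and the library's edge handling may differ.

```python
import numpy as np

def moving_average(s, window_len=5):
    """Smooth a 1D signal with an equal-weight (flat) window.

    Hypothetical helper for illustration; not the library's code.
    """
    s = np.asarray(s, dtype=float)
    kernel = np.ones(window_len) / window_len
    # mode="same" keeps the output length equal to the input length;
    # samples near the edges are averaged against implicit zeros.
    return np.convolve(s, kernel, mode="same")

noisy = np.array([0.0, 1.0] * 10)          # alternating 0/1 pattern
smoothed = moving_average(noisy, window_len=5)
print(smoothed[2:-2])
```

Away from the edges the alternating pattern collapses towards its mean, which is the intended effect of the smoothing.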
- vital_sqi.preprocess.preprocess_signal.taper_signal(s, window=None, shift_min_to_zero=True)
Applies a tapering window to the signal and optionally shifts the minimum value to zero.
- Parameters:
s (np.ndarray) – Input signal as a 1D array of floats.
window (np.ndarray, optional) – Window shape to apply, defaults to Tukey window if None.
shift_min_to_zero (bool, optional) – If True, shifts the signal minimum value to zero.
- Returns:
Tapered and optionally shifted signal.
- Return type:
np.ndarray
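The following sketch shows what tapering with a Tukey window looks like, written with NumPy only. The helper names, the `alpha=0.5` shape parameter, and the order of operations (shift first, then multiply by the window) are assumptions made for illustration and are not taken from the library.

```python
import numpy as np

def tukey_window(n, alpha=0.5):
    """Tukey (tapered cosine) window of length n; alpha is an assumed default."""
    if n <= 1 or alpha <= 0:
        return np.ones(n)
    alpha = min(alpha, 1.0)
    x = np.arange(n) / (n - 1)          # sample positions in [0, 1]
    w = np.ones(n)
    rise = x < alpha / 2                # cosine ramp up at the start
    fall = x > 1 - alpha / 2            # cosine ramp down at the end
    w[rise] = 0.5 * (1 + np.cos(np.pi * (2 * x[rise] / alpha - 1)))
    w[fall] = 0.5 * (1 + np.cos(np.pi * (2 * (x[fall] - 1) / alpha + 1)))
    return w

def taper(s, window=None, shift_min_to_zero=True):
    """Hypothetical stand-in for taper_signal, for illustration only."""
    s = np.asarray(s, dtype=float)
    if shift_min_to_zero:
        s = s - s.min()                 # assumption: shift before windowing
    if window is None:
        window = tukey_window(len(s))
    return s * window

segment = 2.0 + np.sin(np.linspace(0, 2 * np.pi, 200))
tapered = taper(segment)
print(tapered[0], tapered[-1])
```

Both ends of the tapered segment are pulled to zero while the middle of the signal is left unchanged apart from the shift.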
Removal utilities
Signal processing utilities for removing noise and interpolating missing data.
- vital_sqi.preprocess.removal_utilities.get_start_end_points(start_cut_pivot, end_cut_pivot, length_df)
Determines the start and end points for each retained signal segment.
- Parameters:
start_cut_pivot (array-like) – Array of starting points of removed segments.
end_cut_pivot (array-like) – Array of corresponding ending points of removed segments.
length_df (int) – Length of the original signal.
- Returns:
Arrays of start and end milestones for retained segments.
- Return type:
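The relationship between removed and retained segments can be sketched as an interval complement. The helper `retained_intervals` below is hypothetical and assumes half-open index ranges (end exclusive); the library's exact boundary convention may differ.

```python
import numpy as np

def retained_intervals(start_cut, end_cut, length):
    """Return (starts, ends) of the spans left after removing the cuts.

    Hypothetical helper for illustration; assumes sorted, non-overlapping
    cuts expressed as half-open [start, end) index ranges.
    """
    start_cut = np.asarray(start_cut, dtype=int)
    end_cut = np.asarray(end_cut, dtype=int)
    # A retained span starts at 0 or where a removed span ends, and
    # ends where the next removed span starts or at the signal end.
    starts = np.concatenate(([0], end_cut))
    ends = np.concatenate((start_cut, [length]))
    keep = ends > starts                # drop empty spans at the edges
    return starts[keep], ends[keep]

starts, ends = retained_intervals([100, 500], [200, 650], 1000)
print(starts.tolist(), ends.tolist())   # [0, 200, 650] [100, 500, 1000]
```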
- vital_sqi.preprocess.removal_utilities.interpolate_signal(s, missing_index, missing_len, method='arima', lag_ratio=10)
Interpolates missing signal segments using ARIMA.
- Parameters:
s (pd.DataFrame) – Signal with first column as pd.Timestamp and second as float.
missing_index (list or array-like) – Starting indices of missing segments.
missing_len (list or array-like) – Lengths of missing segments corresponding to each starting index.
method (str, optional) – Interpolation method; only ‘arima’ is supported (default).
lag_ratio (int, optional) – Multiplier for the ARIMA lag window size (default 10).
- Returns:
Signal with interpolated segments.
- Return type:
pd.DataFrame
- vital_sqi.preprocess.removal_utilities.remove_invalid_smartcare(s, info, output_signal=True)
Filters out invalid signal samples based on Smartcare oximeter data.
- Parameters:
s (pd.DataFrame) – Signal with first column as pd.Timestamp and second as float.
info (pd.DataFrame) – Info containing “SPO2_PCT”, “PERFUSION_INDEX”, and “PULSE_BPM” columns.
output_signal (bool, optional) – If True, returns processed signal along with milestones.
- Returns:
Processed signal (optional) and DataFrame of retained segment milestones.
- Return type:
- vital_sqi.preprocess.removal_utilities.remove_unchanged(s, sampling_rate, duration=10, output_signal=True)
Removes flat (unchanged) segments of the signal considered as noise.
- Parameters:
- Returns:
Processed signal (optional) and DataFrame of retained segment milestones.
- Return type:
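Detecting flat segments can be sketched as finding runs of identical consecutive values that last at least a given time. The helper `find_flat_runs` below is hypothetical. It assumes `duration` is a minimum run length in seconds and that "unchanged" means exactly equal samples, neither of which is confirmed by the documentation above.

```python
import numpy as np

def find_flat_runs(values, sampling_rate, duration=10):
    """Return (start, end) index pairs of constant runs, end exclusive.

    Hypothetical helper for illustration; only runs lasting at least
    `duration` seconds are reported.
    """
    values = np.asarray(values, dtype=float)
    min_len = int(duration * sampling_rate)
    # Indices where a sample differs from its predecessor start a new run.
    change = np.flatnonzero(np.diff(values) != 0) + 1
    run_starts = np.concatenate(([0], change))
    run_ends = np.concatenate((change, [len(values)]))
    long_enough = (run_ends - run_starts) >= min_len
    return list(zip(run_starts[long_enough].tolist(),
                    run_ends[long_enough].tolist()))

# 50 varying samples, 30 flat samples, then 20 varying samples.
signal = np.concatenate((np.arange(50), np.full(30, 7.0), np.arange(20)))
runs = find_flat_runs(signal, sampling_rate=10, duration=2)
print(runs)   # [(50, 80)]
```

The reported runs are the spans to remove; the retained segment milestones are the gaps between them.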
- vital_sqi.preprocess.removal_utilities.trim_signal(s, sampling_rate, duration_left=300, duration_right=300)
Trims noise from the beginning and end of the signal.
- Parameters:
- Returns:
Trimmed signal.
- Return type:
pd.DataFrame
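Trimming by duration amounts to converting each duration into a sample count and slicing. The sketch below works on a plain array and assumes both durations are given in seconds; `trim_edges` is a hypothetical helper, not the library function.

```python
import numpy as np

def trim_edges(values, sampling_rate, duration_left=300, duration_right=300):
    """Drop the first and last stretches of a signal.

    Hypothetical helper for illustration; durations are assumed to be
    in seconds.
    """
    values = np.asarray(values)
    left = int(duration_left * sampling_rate)
    right = int(duration_right * sampling_rate)
    # An explicit stop index avoids the values[left:-0] pitfall,
    # which would return an empty array when right == 0.
    return values[left:len(values) - right]

signal = np.arange(1000)
trimmed = trim_edges(signal, sampling_rate=10, duration_left=5, duration_right=20)
print(len(trimmed), trimmed[0], trimmed[-1])   # 750 50 799
```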
Segment splitter
Splits long recordings into segments, with optional overlap, either by duration or by beat.
- vital_sqi.preprocess.segment_split.save_segment(segment_list, segment_name='segment', save_file_folder=None, save_image=False, save_img_folder=None)
Saves segments of waveforms to .csv files and optionally plots them to image files.
- Parameters:
segment_list (list) – List of segments (arrays or DataFrames).
segment_name (str, optional) – Base filename for saved files (default is “segment”).
save_file_folder (str, optional) – Directory to save .csv files (default is current working directory).
save_image (bool, optional) – If True, saves images of each segment (default is False).
save_img_folder (str, optional) – Directory to save image files (default is current working directory).
- Return type:
None
- vital_sqi.preprocess.segment_split.split_segment(s, sampling_rate, split_type=0, duration=30.0, overlapping=0, peak_detector=6, wave_type='PPG')
Splits a long signal into segments based on time or beat, with optional overlap.
- Parameters:
s (pd.DataFrame) – Signal data with timestamps as the first column and signal values as the second.
sampling_rate – Sampling rate of the signal, in Hz.
split_type (int, optional) – 0: split by time; 1: split by beat (default is 0).
duration (float, optional) – Segment length, in seconds if split_type=0 or in beats if split_type=1 (default is 30).
overlapping (float or int, optional) – Overlap between consecutive segments, in seconds. Used only when split_type=0 and ignored for beat-based splitting (default is 0).
peak_detector (int, optional) – Peak detector used for beat-based segmentation, an integer from 1 to 7 (default is 6, the vitalDSP detector).
wave_type (str, optional) – Type of signal, either ‘PPG’ or ‘ECG’ (default is ‘PPG’).
- Returns:
segments (list) – List of segmented DataFrames.
milestones (pd.DataFrame) – DataFrame containing start and end indices of each segment.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from vital_sqi.common.utils import generate_timestamp
>>> from vital_sqi.preprocess.segment_split import split_segment
>>> s = np.arange(100000)
>>> timestamps = generate_timestamp(None, 100, len(s))
>>> df = pd.DataFrame({'time': timestamps, 'signal': s})
>>> segments, milestones = split_segment(df, sampling_rate=100, duration=5)
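To make the interaction between duration and overlapping concrete, the sketch below computes time-based segment boundaries in plain Python. `split_by_duration` is a hypothetical helper. It assumes that each window starts `duration - overlapping` seconds after the previous one and that a trailing partial window is dropped, which may not match the library's behaviour.

```python
def split_by_duration(n_samples, sampling_rate, duration=30.0, overlapping=0):
    """Return (start, end) sample indices of each window, end exclusive.

    Hypothetical helper for illustration; not the library's code.
    """
    window = int(duration * sampling_rate)
    step = window - int(overlapping * sampling_rate)
    if window <= 0 or step <= 0:
        raise ValueError("duration must be positive and exceed overlapping")
    milestones = []
    # Stop once a full window no longer fits in the signal.
    for start in range(0, n_samples - window + 1, step):
        milestones.append((start, start + window))
    return milestones

# 100000 samples at 100 Hz, 5-second windows, no overlap.
plain = split_by_duration(100000, sampling_rate=100, duration=5)
# Same signal with 1 second of overlap between consecutive windows.
overlapped = split_by_duration(100000, sampling_rate=100, duration=5, overlapping=1)
print(len(plain), plain[0], plain[-1])
print(len(overlapped), overlapped[1])
```

With overlap, each window still spans 500 samples but starts only 400 samples after the previous one, so the same recording yields more segments.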