Rule engine (vital_sqi.rule)

Three components live here: two classifier families and a small threshold-math module.

  • Rule / RuleSet — apply pre-computed accept-band thresholds, segment by segment. Used by vital_sqi.pipeline.pipeline_functions.classify_segments().

  • vital_sqi.rule.auto_threshold — strategies for deriving those thresholds from the observed SQI distribution. Both the programmatic pipeline and the web app’s Inspect view share these helpers.

  • classify_segments_robust() — a separate non-rule-based three-regime classifier (clean / bimodal / heavy-noise) that needs no pre-calibrated thresholds at all.

Auto-threshold strategies

Strategies for picking accept-band thresholds from an SQI distribution.

Two policies live here:

  • quantile_band(): pick a fixed quantile window around the observed distribution. This is the classic auto_mode=True from vital_sqi.pipeline.pipeline_functions.classify_segments(): trim the tail below the lower_pct quantile and the tail above the upper_pct quantile (the bottom 5% and top 5% with the defaults 0.05 and 0.95).

  • tuned_bands(): given a set of SQI columns, derive a per-column quantile that targets a joint accept rate. Assumes rules are independent (a common simplifying assumption for orthogonal SQIs); the per-rule keep-rate is target ** (1/n) so the product of the n independent keep-rates equals the target.

Both strategies share a degenerate-band guard that returns None when the observed distribution is too narrow to produce a meaningful rule. Callers should drop those columns from the rule set rather than build a band that rejects every segment.

This module is deliberately UI-free: it only deals with numbers. The Inspect view and classify_segments both consume it.

class vital_sqi.rule.auto_threshold.Band(column, lower, upper, quantile_lo, quantile_hi, note='')[source]

Bases: object

An accept band (lower, upper) plus diagnostic provenance.

Parameters:
column: str
lower: float
note: str = ''
quantile_hi: float
quantile_lo: float
upper: float
property width: float

vital_sqi.rule.auto_threshold.DEGENERATE_BAND_HALF_WIDTH = 1e-06

Bands narrower than this collapse to “reject everything” under percentile-based auto-mode. This is the same value that vital_sqi.calibration.threshold_estimator uses for its epsilon guard.

vital_sqi.rule.auto_threshold.per_rule_quantile(target_accept_rate, n_rules)[source]

Symmetric per-rule trim that yields target_accept_rate jointly.

Under the independence approximation, the joint accept rate is the product of per-rule accept rates. Solving:

target = keep ** n_rules
keep   = target ** (1 / n_rules)
trim   = 1 - keep              # split symmetrically across both tails
lower_pct = trim / 2

For target=0.85, n_rules=5 this gives a per-rule keep-rate of ~0.968 → bands at p1.6/p98.4. Much more forgiving than the legacy p5/p95 (which on 5 independent rules expects ~60% joint accept).
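
The arithmetic above can be checked with a few lines of plain Python. This is a minimal sketch of the formula only, not the library's implementation: the name per_rule_quantile_sketch and the clamping epsilon are assumptions made for illustration.

```python
def per_rule_quantile_sketch(target_accept_rate: float, n_rules: int) -> float:
    """Return the lower-tail trim per rule under the independence approximation."""
    if n_rules < 1:
        raise ValueError("n_rules must be at least 1")
    # Clamp to the open interval (0, 1); this epsilon is an illustrative choice.
    eps = 1e-9
    target = min(max(target_accept_rate, eps), 1.0 - eps)
    keep = target ** (1.0 / n_rules)  # per-rule keep-rate
    trim = 1.0 - keep                 # total mass trimmed per rule
    return trim / 2.0                 # split symmetrically across both tails


lower_pct = per_rule_quantile_sketch(0.85, 5)
# Joint accept rate of five independent rules that each keep p5..p95.
legacy_joint = (0.95 - 0.05) ** 5

print(f"lower_pct = {lower_pct:.4f}, upper_pct = {1 - lower_pct:.4f}")
print(f"legacy joint accept = {legacy_joint:.3f}")
```

Because the keep-rate is an n-th root of the target, adding rules widens each band: the more rules that must all pass, the less each one is allowed to trim.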

Parameters:
  • target_accept_rate (float) – Desired fraction of segments that should pass all rules. Clamped to (0, 1); values at the extremes give degenerate bands.

  • n_rules (int) – Number of independent rules in the set. 1 returns the symmetric split corresponding to a single quantile pair.

Returns:

lower_pct (the upper quantile is 1 - lower_pct).

Return type:

float

vital_sqi.rule.auto_threshold.quantile_band(column, values, *, lower_pct=0.05, upper_pct=0.95)[source]

Compute an accept band from the empirical lower/upper quantiles.

Parameters:
  • column (str) – SQI name (only used in the returned Band for diagnostics).

  • values (Sequence[float]) – Observed SQI values; NaN / inf are dropped before quantile computation.

  • lower_pct (float) – Lower quantile (e.g. 0.05 for p5). Must be in [0, 0.5).

  • upper_pct (float) – Upper quantile (e.g. 0.95 for p95). Must be in (0.5, 1].

Returns:

None when fewer than 2 finite values are available, or when the resulting band is narrower than DEGENERATE_BAND_HALF_WIDTH. Callers should treat both as “this SQI cannot contribute a useful rule”.

Return type:

Band or None
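
A rough stand-alone sketch of the behaviour described above, using only the standard library. The helper names, the linear-interpolation quantile, and the exact comparison against the degenerate width are assumptions; the real function returns a Band rather than a tuple.

```python
import math
from typing import Optional, Sequence, Tuple

# Mirrors DEGENERATE_BAND_HALF_WIDTH; how the library compares against it is assumed.
DEGENERATE_WIDTH = 1e-6


def _quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted sequence."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def quantile_band_sketch(
    values: Sequence[float], lower_pct: float = 0.05, upper_pct: float = 0.95
) -> Optional[Tuple[float, float]]:
    # Drop NaN and +/-inf before computing quantiles.
    finite = sorted(v for v in values if math.isfinite(v))
    if len(finite) < 2:
        return None
    lower = _quantile(finite, lower_pct)
    upper = _quantile(finite, upper_pct)
    # Degenerate-band guard: a near-constant column cannot form a useful rule.
    if upper - lower < DEGENERATE_WIDTH:
        return None
    return lower, upper


band = quantile_band_sketch([0.1, 0.4, 0.5, 0.6, 0.9, float("nan")])
constant = quantile_band_sketch([1.0] * 50)  # None: every value is identical
```

Callers would then skip any column for which the result is None instead of building a rule from it.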

vital_sqi.rule.auto_threshold.strictest_columns(per_rule_rejects, *, mad_multiplier=3.0)[source]

Return rule names whose rejection count is an upward outlier.

Used by the Inspect view’s “Drop strictest rule” button.

Flags counts above median + mad_multiplier × MAD, a modified Z-score style test, rather than the parametric mean + k·std, because the latter is blown up by the very outlier we’re trying to detect. The classic modified Z-score from Iglewicz & Hoaglin (1993) flags a sample as an outlier when its rescaled deviation 0.6745 * (x - median) / MAD exceeds 3.5, which is about 5.2 MADs in raw units. Here the cutoff is applied in raw MAD units with a default mad_multiplier of 3.0 (a rescaled score of about 2.0), so it flags more readily than the classic convention; raise mad_multiplier if the UI should surface only obviously-strict rules without nagging on borderline cases.

Returns an empty list when fewer than 3 rules are supplied (with only 2 points one is always the “outlier”) or when every rule rejects roughly the same number of segments (MAD == 0).

Parameters:
  • per_rule_rejects (dict[str, int]) – {rule_name: n_segments_rejected}.

  • mad_multiplier (float) – How many MADs above the median a count must be to count as an outlier. Defaults to 3.0 (about 2σ for normally distributed counts, roughly the 98th percentile).

Return type:

List[str]
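
The median-plus-MAD test described above can be sketched as follows. This is an illustrative re-implementation under the documented behaviour, not the library source; the function name and the sample rule names are made up for the example.

```python
from statistics import median
from typing import Dict, List


def strictest_columns_sketch(
    per_rule_rejects: Dict[str, int], mad_multiplier: float = 3.0
) -> List[str]:
    # With fewer than 3 rules there is no meaningful notion of an outlier.
    if len(per_rule_rejects) < 3:
        return []
    counts = list(per_rule_rejects.values())
    med = median(counts)
    # Median absolute deviation: a spread estimate that ignores extreme values.
    mad = median([abs(c - med) for c in counts])
    if mad == 0:
        return []
    cutoff = med + mad_multiplier * mad
    # Only upward outliers (unusually many rejections) are reported.
    return [name for name, c in per_rule_rejects.items() if c > cutoff]


rejects = {"kurtosis": 12, "skewness": 10, "snr": 11, "perfusion": 13, "entropy": 95}
flagged = strictest_columns_sketch(rejects)
```

In this sample the median is 12 and the MAD is 1, so the cutoff is 15 and only the rule with 95 rejections is flagged.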

vital_sqi.rule.auto_threshold.tuned_bands(column_values, *, target_accept_rate=0.85)[source]

Per-column accept bands sized to hit target_accept_rate jointly.

Degenerate columns are dropped silently — they don’t count towards n_rules, so the per-rule quantile is recomputed only over the columns that actually contribute a band. This keeps the auto-tune sensible when half the catalogue is constant.

Two-pass algorithm:

  1. Filter to columns whose p5/p95 band is non-degenerate (cheap sanity check; bands narrower than that won’t survive any tighter trim either).

  2. Compute the per-rule quantile from the surviving count, then compute each column’s actual band at that quantile.

Parameters:
  • column_values (dict[str, Sequence[float]]) – Mapping from SQI column name to its observed values.

  • target_accept_rate (float) – Desired joint accept rate in (0, 1).

Returns:

One entry per surviving column, in iteration order of the input.

Return type:

list of Band
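
The two-pass algorithm can be sketched in plain Python as below. It is a simplified illustration under stated assumptions: the helper _band, the interpolation scheme, the MIN_WIDTH comparison, and the tuple return type are all hypothetical stand-ins for the library's Band-based code.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

MIN_WIDTH = 1e-6  # stand-in for DEGENERATE_BAND_HALF_WIDTH (comparison rule assumed)


def _band(values: Sequence[float], lo_q: float, hi_q: float) -> Optional[Tuple[float, float]]:
    finite = sorted(v for v in values if math.isfinite(v))
    if len(finite) < 2:
        return None

    def q(p: float) -> float:
        # Linearly interpolated quantile on the sorted finite values.
        pos = p * (len(finite) - 1)
        i = math.floor(pos)
        j = min(i + 1, len(finite) - 1)
        return finite[i] + (pos - i) * (finite[j] - finite[i])

    lower, upper = q(lo_q), q(hi_q)
    return None if upper - lower < MIN_WIDTH else (lower, upper)


def tuned_bands_sketch(
    column_values: Dict[str, Sequence[float]], target_accept_rate: float = 0.85
) -> List[Tuple[str, float, float]]:
    # Pass 1: keep only columns whose p5/p95 band is non-degenerate.
    survivors = [c for c, v in column_values.items() if _band(v, 0.05, 0.95) is not None]
    if not survivors:
        return []
    # Pass 2: size the per-rule trim from the surviving count only.
    keep = target_accept_rate ** (1.0 / len(survivors))
    lower_pct = (1.0 - keep) / 2.0
    bands = []
    for c in survivors:
        band = _band(column_values[c], lower_pct, 1.0 - lower_pct)
        if band is not None:
            bands.append((c, band[0], band[1]))
    return bands


columns = {
    "snr": [i / 100 for i in range(101)],
    "flat": [1.0] * 101,  # constant column, dropped in pass 1
    "kurt": [i / 10 for i in range(101)],
}
bands = tuned_bands_sketch(columns)
```

Here the constant column does not count towards n_rules, so the per-rule keep-rate is 0.85 ** (1/2) rather than 0.85 ** (1/3).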

Single-SQI rule

Class Rule contains thresholds and corresponding labels of an SQI. Labels are either ‘accept’ or ‘reject’.

class vital_sqi.rule.rule_class.Rule(name, rule=None)[source]

Bases: object

A class to represent and manage threshold-based rules for Signal Quality Indices (SQI).

name

The name of the SQI rule.

Type:

str

rule

The rule definition, containing thresholds, boundaries, and labels.

Type:

dict or None

load_def(source=None):

Loads rule definitions from a specified source.

update_def(op_list, value_list, label_list):

Updates rule definitions based on provided lists of operators, values, and labels.

save_def(file_path, file_type="json", overwrite=False):

Saves the current rule definition to a specified file path.

apply_rule(x):

Applies the rule to an input x, returning the appropriate label.

write_rule():

Returns a string representation of the rule for display purposes.

apply_rule(x)[source]

Apply the rule to an SQI value and return its quality label.

The rule stores a sorted boundaries array and a parallel labels array built from the "def" entries. Lookup is O(log n) via bisect.bisect_left:

  • If x equals a boundary exactly, the label at the boundary position is returned (handles closed-interval endpoints).

  • Otherwise bisect_left locates the interval [boundaries[i-1], boundaries[i]) containing x and returns labels[i*2].

For the standard four-element calibrated rule encoding the open interval (lower, upper):

boundaries = [lower, upper]
labels     = ["reject", "accept", "reject", ...]
so: x <= lower → “reject”, lower < x < upper → “accept”, x >= upper → “reject”.

Parameters:

x (float) – The SQI value to evaluate.

Returns:

"accept", "reject", or None if no label is defined for the interval containing x.

Return type:

str or None
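
To make the bisect-based lookup concrete, here is a small sketch. The label layout is an assumption chosen for illustration (even index 2*i for the interval below boundaries[i], odd index 2*i + 1 for the boundary point itself) and may differ from the arrays the library builds internally; lookup_label is a hypothetical name.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


def lookup_label(
    x: float, boundaries: Sequence[float], labels: List[Optional[str]]
) -> Optional[str]:
    """Look up a label in O(log n).

    Assumed layout: len(labels) == 2 * len(boundaries) + 1, where
    labels[2*i] covers the open interval just below boundaries[i]
    (or above the last boundary) and labels[2*i + 1] covers the
    point x == boundaries[i].
    """
    i = bisect_left(boundaries, x)
    if i < len(boundaries) and boundaries[i] == x:
        return labels[2 * i + 1]  # x sits exactly on a boundary
    return labels[2 * i]          # x falls strictly inside an interval


# Open accept interval (0.2, 0.8): both endpoints are rejected.
boundaries = [0.2, 0.8]
labels = ["reject", "reject", "accept", "reject", "reject"]

results = [lookup_label(x, boundaries, labels) for x in (0.1, 0.2, 0.5, 0.8, 0.9)]
```

Storing a separate label for each boundary point is what lets the same lookup express open, closed, or half-open intervals without branching on the operator.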

load_def(source=None)[source]

Loads rule definitions from a specified source.

Parameters:

source (str, optional) – The file path to load rule definitions from (default is None).

save_def(file_path, file_type='json', overwrite=False)[source]

Saves the rule definition to a specified file.

Parameters:
  • file_path (str) – The path to save the rule definition.

  • file_type (str, optional) – The format to save the file in (default is “json”).

  • overwrite (bool, optional) – If True, allows overwriting existing files (default is False).

update_def(op_list, value_list, label_list)[source]

Updates rule definitions with new operators, threshold values, and labels.

Parameters:
  • op_list (list of str) – List of operators for the rule (e.g., [“<=”, “>”]).

  • value_list (list of float) – List of threshold values corresponding to each operator.

  • label_list (list of str) – List of labels (“accept” or “reject”) corresponding to each threshold.

Raises:

ValueError – If invalid operator, value, or label is provided.

Examples

>>> rule = Rule("test_sqi")
>>> rule.load_def("../resource/rule_dict.json")
>>> rule.update_def(op_list=["<=", ">"],
...                 value_list=[5, 5],
...                 label_list=["accept", "reject"])
>>> print(rule.rule['def'])
[{'op': '>', 'value': '10', 'label': 'reject'},
{'op': '>=', 'value': '3', 'label': 'accept'},
{'op': '<', 'value': '3', 'label': 'reject'},
{'op': '<=', 'value': 5, 'label': 'accept'},
{'op': '>', 'value': 5, 'label': 'reject'}]

write_rule()[source]

Returns a string representation of the rule.

Returns:

A string representation of the rule for display.

Return type:

str

Composite rule set

RuleSet class for managing a set of SQI rules and building a decision flowchart.

class vital_sqi.rule.ruleset_class.RuleSet(rules)[source]

Bases: object

A class to manage a set of rules for Signal Quality Indices (SQI) and execute decision flow based on the provided rules.

rules

A dictionary where keys are rule order (int) and values are Rule instances.

Type:

dict

export_rules():

Exports the rules as a flowchart.

execute(value_df):

Executes the rules on a single-row DataFrame and returns a decision.

execute(value_df)[source]

Execute the rule set on a single-row DataFrame and return a decision.

Rules are evaluated in ascending integer key order. This is a linear early-exit scan — not recursive. The first rule that returns "reject" causes immediate return without evaluating subsequent rules. Only when every rule returns "accept" is the overall decision "accept".

To minimise average evaluation cost, place the most discriminative or cheapest-to-compute rules at the lowest integer keys so they are checked first.

Parameters:

value_df (pd.DataFrame) – A single-row DataFrame. Every rule.name used by this RuleSet must appear as a column.

Returns:

"accept" if all rules pass, "reject" as soon as any rule fails.

Return type:

str

Raises:
  • TypeError – If value_df is not a pd.DataFrame.

  • ValueError – If value_df does not have exactly one row.

  • KeyError – If a rule’s SQI name is absent from value_df.
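
The early-exit scan can be illustrated without pandas. In this sketch a plain mapping stands in for the single-row DataFrame and each rule is a (name, callable) pair; both are simplifications, and execute_sketch and band_rule are hypothetical names. Treating any non-"accept" label as a rejection is also an assumption.

```python
from typing import Callable, Dict, List, Mapping, Tuple

RuleEntry = Tuple[str, Callable[[float], str]]


def execute_sketch(rules: Dict[int, RuleEntry], row: Mapping[str, float]) -> str:
    # Evaluate in ascending key order and stop at the first failing rule.
    for order in sorted(rules):
        name, apply_rule = rules[order]
        if name not in row:
            raise KeyError(name)
        if apply_rule(row[name]) != "accept":
            return "reject"
    return "accept"


calls: List[str] = []  # records which rules were actually evaluated


def band_rule(lower: float, upper: float, tag: str) -> Callable[[float], str]:
    def apply(x: float) -> str:
        calls.append(tag)
        return "accept" if lower < x < upper else "reject"
    return apply


rules = {
    2: ("kurtosis", band_rule(0.0, 5.0, "kurtosis")),
    1: ("snr", band_rule(10.0, 40.0, "snr")),
}
decision = execute_sketch(rules, {"snr": 3.0, "kurtosis": 2.0})
```

The snr rule has the lowest key, fails first, and short-circuits the scan, so the kurtosis rule is never evaluated for this row.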

export_rules()[source]

Generates a flowchart representing the rule execution order.

Returns:

The generated flowchart in string format.

Return type:

str

Robust auto-classifier

Robust SQI segment classifier.

Implements a three-regime quality classification strategy:
  • clean : most segments are good (rank-IQR threshold)

  • bimodal : clear good/bad split (GMM with Bhattacharyya fallback)

  • heavy_noise : most segments are bad (conservative accept)

Public entry point:

result = classify_segments_robust(sqis_df, sqi_names)
# result.decisions   — list[str] "accept"/"reject"
# result.scores      — np.ndarray float in [0,1]
# result.regime      — str  "clean" | "bimodal" | "heavy_noise"
# result.regime_info — dict with diagnostic fields
# result.file_flagged — bool

Ported and simplified from frequency_resonance/src/core/signal_processing/sqi_scorer.py.

class vital_sqi.rule.robust_classifier.RobustResult(decisions: 'List[str]', scores: 'np.ndarray', regime: 'str', regime_info: 'dict' = <factory>, file_flagged: 'bool' = False)[source]

Bases: object

Parameters:
decisions: List[str]
file_flagged: bool = False
regime: str
regime_info: dict
scores: ndarray

vital_sqi.rule.robust_classifier.classify_segments_robust(sqis_df, sqi_names=None, config=None)[source]

Classify signal segments using a rank+IQR consensus score and automatic regime detection.

Parameters:
  • sqis_df (pd.DataFrame) – One row per segment, one column per SQI. May contain inf/NaN.

  • sqi_names (list of str, optional) – Columns to use. Defaults to all non-index columns.

  • config (dict, optional) –

    Override default thresholds:

      • clean_threshold (default 0.7)

      • bad_threshold (default 0.3)

      • heavy_noise_quantile (default 0.8)

Return type:

RobustResult