ADR-020: Parallel Threshold Scanning Command#

Status: Accepted

Date: 2025-10-27

Context#

Following ADR-019 (Serial Threshold Scanning), we need to support a parallel scanning mode in which the thresholds of multiple channels are stepped simultaneously in a coordinated fashion.

Difference from Serial:

  • Serial: One channel at a time (slower, simpler logic)

    • Natural order: scan ch1 fully, then ch2, then ch3

    • Outer loop: channels; inner loop: threshold values

    • Clear separation reduces noise

  • Parallel: All channels step together (faster, coordinated)

    • Step together: all channels at step 0, then step 1, etc.

    • Outer loop: steps; inner loop: channels

    • Captures cross-channel effects at each step

    • Requires all channels to have the same number of steps
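
The two loop orders above can be sketched as follows. This is an illustration only: `serial_order` and `parallel_order` are hypothetical helper names, not functions from the haniwers code base, and the threshold values are made up.

```python
from typing import Dict, List, Tuple

Ranges = Dict[int, List[int]]  # channel id -> threshold values (hypothetical shape)


def serial_order(ranges: Ranges) -> List[Tuple[int, int]]:
    """Outer loop over channels, inner loop over threshold values."""
    return [(ch, vth) for ch, values in ranges.items() for vth in values]


def parallel_order(ranges: Ranges) -> List[Dict[int, int]]:
    """Outer loop over steps, inner loop over channels."""
    num_steps = len(next(iter(ranges.values())))
    return [
        {ch: values[step_idx] for ch, values in ranges.items()}
        for step_idx in range(num_steps)
    ]


ranges = {1: [195, 200, 205], 2: [295, 300, 305]}
print(serial_order(ranges))    # one (channel, threshold) pair per measurement
print(parallel_order(ranges))  # one {channel: threshold} mapping per measurement
```

In the serial order each entry touches a single channel, whereas in the parallel order each entry sets every channel before one shared measurement.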

Use Cases:

  • Characterize how threshold changes on one channel affect others

  • Simultaneous multi-channel optimization

  • Faster characterization (single measurement per step)

  • Statistical analysis of inter-channel correlations

Key Requirement: All channels must have identical step counts

  • If ch1 has 21 steps but ch2 has 19 steps → error

  • Enforced at CLI validation layer (fail fast)

  • Users can adjust nsteps so that all channels have the same step count
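
A minimal sketch of such a fail-fast check is shown below. It is not the project's validate_threshold_ranges(); `check_step_counts` and its dictionary input are assumptions made for illustration, and only the error wording is borrowed from the validation error example in this ADR.

```python
from typing import Dict, List


def check_step_counts(ranges: Dict[int, List[int]]) -> None:
    """Raise ValueError unless every channel has the same number of steps.

    Hypothetical helper: `ranges` maps channel id -> list of threshold values.
    """
    counts = {ch: len(values) for ch, values in ranges.items()}
    if len(set(counts.values())) > 1:
        raise ValueError(
            "Parallel scanning requires all channels to have same number of steps. "
            f"Got: {counts}."
        )


# Matching step counts pass silently.
check_step_counts({1: [195, 200, 205], 2: [295, 300, 305]})

# Mismatched step counts are rejected before any scan starts.
try:
    check_step_counts({1: [195, 200, 205], 2: [300, 305]})
except ValueError as exc:
    print(exc)
```

Including the per-channel counts in the message lets the user see at a glance which channel needs its nsteps adjusted.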

Decision#

We implement parallel threshold scanning with strict validation and coordinated stepping:

Layer 1: CLI Command Handler#

File: src/haniwers/v1/cli/threshold.py

Function: parallel()

Responsibilities:

  • Parse CLI arguments (same as serial: --thresholds, --nsteps, --step, --duration)

  • Load configuration and apply CLI overrides

  • Validate that all channels have compatible threshold ranges

  • Delegate to business logic layer

  • Display results to user

Validation (NEW for parallel):

# Step 3: Validate threshold ranges are compatible
try:
    validate_threshold_ranges(cfg.sensors, scan_type="parallel")
except ValueError as e:
    typer.echo(f"[ERROR] {e}", err=True)
    raise typer.Exit(code=1)

Layer 2: Business Logic Module#

File: src/haniwers/v1/threshold/parallel.py

Function: run_parallel_threshold_scan()

Responsibilities:

  • Implement parallel (coordinated) scanning algorithm

  • Manage device lifecycle

  • Execute coordinated threshold stepping loops

  • Collect data once per step (all channels active)

  • Save results to per-channel CSV files

  • Error handling (if any channel fails, skip that step)

Algorithm:

1. Build all_ranges: {ch: [vth1, vth2, ...]} for each sensor
2. Validate (precondition): all ranges have the same length
3. For each step_idx in range(num_steps):
   a. Get threshold_values: {ch: threshold_at_step} for all channels
   b. Apply all thresholds simultaneously
   c. If all succeeded: collect data once for all channels,
      then save results for each channel
   d. If any failed: skip data collection and saving, log a warning
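
The algorithm can be sketched in Python as below. This is a simplified sketch, not the actual run_parallel_threshold_scan(): `ScanSummary`, `run_parallel_sketch`, and the two callables are hypothetical stand-ins for the real result type, device writer, and data collection.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ScanSummary:
    """Hypothetical result container (not the project's ParallelScanResult)."""

    collected: Dict[int, Dict[int, int]] = field(default_factory=dict)  # step -> {ch: vth}
    skipped_steps: List[int] = field(default_factory=list)


def run_parallel_sketch(
    all_ranges: Dict[int, List[int]],
    apply_threshold: Callable[[int, int], bool],
    collect: Callable[[], None],
) -> ScanSummary:
    # Precondition: every channel has the same number of steps.
    lengths = {len(values) for values in all_ranges.values()}
    if len(lengths) != 1:
        raise ValueError("all ranges must have the same length")
    num_steps = lengths.pop()

    summary = ScanSummary()
    for step_idx in range(num_steps):
        threshold_values = {ch: values[step_idx] for ch, values in all_ranges.items()}
        # Thresholds are applied one after another here; every channel is attempted.
        results = [apply_threshold(ch, vth) for ch, vth in threshold_values.items()]
        if all(results):
            collect()  # one measurement covering all channels
            summary.collected[step_idx] = threshold_values
        else:
            summary.skipped_steps.append(step_idx)  # no data for this step
    return summary


def fake_apply(ch: int, vth: int) -> bool:
    """Simulated writer that fails for channel 2 at threshold 300."""
    return not (ch == 2 and vth == 300)


measurements: List[int] = []
summary = run_parallel_sketch(
    {1: [195, 200, 205], 2: [295, 300, 305]},
    fake_apply,
    lambda: measurements.append(1),
)
print(summary.skipped_steps, len(measurements))
```

In this run the simulated failure on channel 2 causes the whole middle step to be skipped, so no channel gets data for it.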

Consequences#

Positive#

  • Efficiency: One measurement per step (vs. serial: one measurement per channel per step)

    • Serial: 3 channels × 21 steps = 63 measurements

    • Parallel: 21 measurements of the same per-measurement duration, each covering all channels

  • Coordinated Data: Captures simultaneous multi-channel response

    • Can analyze cross-channel correlations

    • More realistic detector behavior

    • Useful for inter-channel threshold balancing

  • Independent from Serial: Separate implementation

    • No shared code paths between the two modes

    • Can evolve independently

    • Clear responsibilities

  • Strict Validation: Fail-fast on configuration errors

    • User sees clear error message immediately

    • Prevents confusing “missing data for channel X at step Y” errors

    • Guides users to fix configuration (adjust nsteps)
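
The measurement counts quoted under Efficiency work out as follows. The sketch assumes a per-measurement duration of 5 time units (the --duration value used in the usage example) and ignores any per-step overhead such as writing thresholds; `measurement_count` is a hypothetical helper.

```python
def measurement_count(num_channels: int, num_steps: int, mode: str) -> int:
    """Number of measurements needed to cover every channel at every step."""
    if mode == "serial":
        return num_channels * num_steps  # one measurement per channel per step
    if mode == "parallel":
        return num_steps  # one shared measurement per step
    raise ValueError(f"unknown mode: {mode}")


duration = 5  # assumed per-measurement duration
for mode in ("serial", "parallel"):
    n = measurement_count(3, 21, mode)
    print(f"{mode}: {n} measurements, {n * duration} time units of acquisition")
```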

Negative#

  • Configuration Constraint: All channels must have the same step count

    • Users cannot have ch1 with 10 steps and ch2 with 5 steps

    • Workaround: set all channels to a common value (e.g., 10)

    • Reduced flexibility vs. serial

  • Error Handling Complexity: One channel failure skips step

    • If ch2 write fails at step 5, no data collected at step 5

    • Alternative: partial data collection (rejected - causes confusion)

    • Current approach: conservative, favors data consistency

  • Validator Function: New validate_threshold_ranges() in helpers/validator.py

    • Added complexity to validator module

    • Only used by parallel (not general utility)

    • Could extract to threshold-specific module later

Alternatives Considered#

1. Allow Varying Step Counts (Flexible Parallelism)#

Decision: REJECTED

  • ✓ More flexible: ch1 with 21 steps, ch2 with 10 steps OK

  • ✓ Better for heterogeneous channels

  • ✗ Complex implementation: step-wise iteration must handle channels that have already run out of steps

  • ✗ Confusing results: missing data in some CSVs

  • ✗ Hard to debug: unclear which channel stopped at which step

  • ✗ Error-prone: subtle bugs in step iteration
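
The missing-data concern can be illustrated with a short sketch. It uses itertools.zip_longest to step two channels with different step counts together; the threshold values are made up and this is not code from the project.

```python
from itertools import zip_longest

ch1 = [190, 195, 200, 205, 210]  # 5 steps
ch2 = [295, 300, 305]            # 3 steps

# zip_longest pads the shorter channel with None once it runs out of steps.
rows = list(zip_longest(ch1, ch2))
for step_idx, (vth1, vth2) in enumerate(rows):
    print(step_idx, vth1, vth2)
```

From step 3 onward ch2 has no threshold, so every consumer of the results would need to handle the gaps, which is the complexity this alternative was rejected for.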

2. Partial Data Collection on Failure#

Decision: REJECTED

  • ✓ Recovers from some failures

  • ✗ Inconsistent results: ch1 has step 5 data, ch2 doesn’t

  • ✗ Hard to analyze: CSV has missing rows

  • ✗ Silent failures: user might not notice missing channel

3. Combine into Single Function (run_threshold_scan with mode parameter)#

Decision: REJECTED

  • ✓ Less code duplication

  • ✗ Violates Single Responsibility Principle

  • ✗ Hard to test: mixed logic

  • ✗ Harder to maintain: different algorithms in one function

  • ✗ Future: can’t deprecate one mode without affecting the other

Implementation Status#

COMPLETE - All core implementation tasks finished (2025-10-27)

  • Plan documented in planning/cli-threshold-scan.md

  • Validation function validate_threshold_ranges() added to helpers/validator.py

  • Created src/haniwers/v1/threshold/parallel.py with run_parallel_threshold_scan()

  • Added parallel() command to src/haniwers/v1/cli/threshold.py

  • Wrote unit tests for run_parallel_threshold_scan() (12 tests: stepping, result files, error handling)

  • Wrote integration tests for the threshold parallel command (3 tests)

  • Tested validation error messages (4 validation tests)

  • Tested with mock device (all 19 tests passing)

  • Pending: test with real hardware (optional, awaiting lab access)

Testing Strategy#

IMPLEMENTED - 19 tests, 100% passing

Unit Tests - Stepping Algorithm#

File: tests/v1/unit/threshold/parallel/test_stepping.py

  • test_generate_threshold_values_basic() - threshold = center + step*size

  • test_generate_threshold_values_clipping() - clip to [1, 1023]

  • test_all_channels_step_together() - verify coordinated stepping
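
The behavior these tests describe can be sketched as below. The symmetric range of 2 × nsteps + 1 values is an assumption inferred from the examples in this ADR (nsteps 10 giving 21 steps), and clamping out-of-range values, rather than dropping them, is also an assumption; `threshold_values_sketch` is a hypothetical name.

```python
from typing import List

VTH_MIN, VTH_MAX = 1, 1023  # clipping bounds named in the test description


def threshold_values_sketch(center: int, nsteps: int, step_size: int) -> List[int]:
    """Return center + k * step_size for k in -nsteps..nsteps, clamped to the bounds."""
    raw = (center + k * step_size for k in range(-nsteps, nsteps + 1))
    return [min(max(value, VTH_MIN), VTH_MAX) for value in raw]


values = threshold_values_sketch(center=200, nsteps=8, step_size=5)
print(len(values), values[0], values[-1])
```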

Unit Tests - Result Files#

File: tests/v1/unit/threshold/parallel/test_result_files.py

  • test_csv_files_created_per_channel() - one file per channel

  • test_csv_append_mode() - header once, data appended

  • test_audit_log_created() - threshold_operations.csv created
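
The "header once, data appended" behavior could look like the following sketch built on the standard csv module. The column names and the `append_row` helper are hypothetical; the project's actual CSV layout is not specified in this ADR.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict

FIELDS = ["step", "vth", "counts"]  # hypothetical column names


def append_row(path: Path, row: Dict[str, int]) -> None:
    """Append one row, writing the header only when the file is new or empty."""
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "ch1.csv"
    append_row(out, {"step": 0, "vth": 160, "counts": 42})
    append_row(out, {"step": 1, "vth": 165, "counts": 37})
    print(out.read_text())
```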

Unit Tests - Validation#

File: tests/v1/unit/threshold/parallel/test_validation.py

  • test_matched_nsteps_passes() - all channels with same nsteps → pass

  • test_mismatched_nsteps_fails() - mismatched nsteps → ValueError

  • test_error_message_shows_breakdown() - error shows step counts: {1: 8, 2: 10}

  • test_validation_not_called_for_serial() - serial mode skips validation

Unit Tests - Error Handling#

File: tests/v1/unit/threshold/parallel/test_error_handling.py

  • test_clear_error_message() - validator error is clear and helpful

  • test_validation_called_before_scan() - validation happens at orchestrator

  • test_skipped_steps_tracked() - ParallelScanResult tracks skipped steps

  • test_error_messages_aggregated() - all errors collected in result

  • test_apply_threshold_retry_returns_bool() - retry logic never raises

  • test_apply_threshold_success_returns_true() - success returns True
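
The contract checked by the last two tests (a bool result, never an exception) can be sketched as follows. `apply_with_retry` and its `max_attempts` parameter are hypothetical; the real retry helper in the project may have a different name and signature.

```python
from typing import Callable


def apply_with_retry(write: Callable[[], None], max_attempts: int = 3) -> bool:
    """Call write() up to max_attempts times.

    Returns True as soon as one attempt succeeds and False if all attempts fail.
    Exceptions raised by write() are caught, so the caller only ever sees a bool.
    """
    for _ in range(max_attempts):
        try:
            write()
            return True
        except Exception:
            continue  # try again until attempts are exhausted
    return False


attempts = {"n": 0}


def flaky_write() -> None:
    """Simulated device write that fails twice before succeeding."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise IOError("simulated write failure")


print(apply_with_retry(flaky_write))
```

Returning a bool keeps the stepping loop simple: it only needs to check whether every channel reported success before collecting data.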

Integration Tests#

File: tests/v1/integrations/threshold/test_parallel_scan.py

  • test_basic_parallel_scan_with_mock() - full end-to-end with mock device

  • test_parallel_scan_with_failing_device() - error recovery with simulated failures

  • test_parallel_scan_produces_output_files() - verifies all expected output files created

Test Coverage Results#

| Component | Target | Achieved |
|---|---|---|
| run_parallel_threshold_scan() | 90%+ | ✅ Complete with comprehensive tests |
| parallel() CLI handler | 80%+ | ✅ Integration tests verify CLI flow |
| validate_threshold_ranges() | 100% | ✅ 100% (validation critical) |
| Overall | 80%+ | 19/19 tests passing (100%) |

References#

Implementation Guidance:

  • planning/cli-threshold-scan.md: Parallel Threshold Scanning Flow (code structure section)

Validation Function:

  • src/haniwers/v1/helpers/validator.py: validate_threshold_ranges(sensors, scan_type="parallel")

Related Source Files:

  • src/haniwers/v1/cli/threshold.py: CLI command entry point

  • src/haniwers/v1/threshold/parallel.py: NEW - business logic module

  • src/haniwers/v1/threshold/writer.py: apply_threshold() API

  • src/haniwers/v1/config/model.py: SensorConfig.threshold_range() method

Example Configuration (All channels have same nsteps):

[sensors.ch1]
id = 1
center = 200
nsteps = 8        # All channels MUST have same nsteps
step_size = 5

[sensors.ch2]
id = 2
center = 300
nsteps = 8        # ← MUST match ch1
step_size = 5

[sensors.ch3]
id = 3
center = 250
nsteps = 8        # ← MUST match ch1, ch2
step_size = 5

Usage Example:

# Parallel scan - all channels step together
haniwers-v1 threshold parallel \
  --thresholds "1:200;2:300;3:250" \
  --nsteps 8 \
  --step 5 \
  --duration 5 \
  --mock

# Results (all channels advance one step together; each keeps its own center):
# Step 0: ch1 vth=160, ch2 vth=260, ch3 vth=210; collect data once, save for ch1/ch2/ch3
# Step 1: ch1 vth=165, ch2 vth=265, ch3 vth=215; collect data once, save for ch1/ch2/ch3
# ...
# Step 8: ch1 vth=200, ch2 vth=300, ch3 vth=250 (centers); collect data once, save for ch1/ch2/ch3
# ...
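
The per-channel thresholds at a given step can be derived from the --thresholds string as sketched below. `parse_thresholds` is a hypothetical helper, the "channel:center" pairs separated by semicolons are read off the command above, and the sketch assumes each channel starts at center - nsteps × step.

```python
from typing import Dict


def parse_thresholds(spec: str) -> Dict[int, int]:
    """Parse 'ch:center;ch:center' into {channel: center} (hypothetical parser)."""
    centers: Dict[int, int] = {}
    for item in spec.split(";"):
        ch, center = item.split(":")
        centers[int(ch)] = int(center)
    return centers


centers = parse_thresholds("1:200;2:300;3:250")
nsteps, step = 8, 5
for step_idx in (0, 1, 8):
    offset = (step_idx - nsteps) * step  # assumed: step 0 sits nsteps * step below center
    print(step_idx, {ch: center + offset for ch, center in centers.items()})
```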

Validation Error Example:

$ haniwers-v1 threshold parallel --thresholds "1:200;2:300" --nsteps 10 --step 5

# Error: Parallel scanning requires all channels to have same number of steps.
# Got: {1: 21, 2: 19}.
# Ensure all channels have same center, nsteps, and step_size configuration.

Notes#

  • Independent Implementation: Serial and Parallel are separate

    • Different algorithms (no shared code)

    • Can be tested independently

    • Can be deprecated separately in future

  • Validation at CLI Level: Fail-fast approach

    • Better than discovering missing data during analysis

    • Clear error messages guide users to fix config

    • Validator could be reused by other tools (currently used only by parallel mode)

  • Efficiency Gain: Real benefit from parallelism

    • Multi-channel correlation analysis

    • Time savings for characterization

    • More realistic detector representation

  • Conservative Error Handling: Skip step on failure

    • Ensures CSV consistency (all channels present or absent)

    • Simplifies analysis (no missing data surprises)

    • Alternative: retry logic (future enhancement)