Implement Deterministic Simulation Testing Framework #7

Open
opened 2025-06-04 12:24:09 +00:00 by dannym · 0 comments
Owner

Summary

We need a deterministic simulation framework to test Eve Relay under controlled conditions, especially for CCNs. The goal is reproducible testing that lets us catch bugs and verify correctness across complex scenarios.

Why This Matters

  • Unit tests and integration tests can't reproduce intermittent failures reliably
  • Need to verify CCN isolation and security properties
  • Complex multi-client scenarios are hard to test manually
  • Want to catch race conditions and edge cases

Core Requirements

Must Be Deterministic

  • Same inputs = same outputs, always
  • No external dependencies (real network, system clock, etc.)
  • Reproducible test runs with seed values
  • Controlled event ordering
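
As a minimal sketch of what "same inputs = same outputs" means in practice, the snippet below draws every pseudo-random decision from a PRNG built from the test seed alone. `runTrace` and `sameTrace` are hypothetical names used for illustration, not part of Eve Relay.

```go
package main

import "math/rand"

// runTrace is a hypothetical stand-in for one simulation run: it draws n
// pseudo-random decisions from a PRNG that depends only on the given seed.
func runTrace(seed int64, n int) []int {
	// Use a private generator, never the global (possibly time-seeded) source.
	rng := rand.New(rand.NewSource(seed))
	trace := make([]int, n)
	for i := range trace {
		trace[i] = rng.Intn(1000)
	}
	return trace
}

// sameTrace reports whether two traces are identical element by element.
func sameTrace(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}
```

A determinism check for the real framework could follow the same shape: run a scenario twice with one seed and fail if the recorded traces differ.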

Architecture Components

Simulation Orchestrator

  • Manages discrete event scheduling
  • Controls simulated time progression
  • Drives test scenarios from scripts
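
One possible shape for the orchestrator is a priority queue of events keyed by simulated time, sketched below with `container/heap`. The type and method names (`Orchestrator`, `After`, `Run`) are assumptions for illustration. Ties on the timestamp are broken by insertion order so that the event order never depends on heap internals.

```go
package main

import "container/heap"

// event is one scheduled callback.
type event struct {
	at  int64  // simulated time in milliseconds
	seq uint64 // insertion order, used as a deterministic tie-breaker
	run func()
}

// eventQueue implements heap.Interface, ordered by (at, seq).
type eventQueue []*event

func (q eventQueue) Len() int { return len(q) }
func (q eventQueue) Less(i, j int) bool {
	if q[i].at != q[j].at {
		return q[i].at < q[j].at
	}
	return q[i].seq < q[j].seq
}
func (q eventQueue) Swap(i, j int) { q[i], q[j] = q[j], q[i] }
func (q *eventQueue) Push(x any)   { *q = append(*q, x.(*event)) }
func (q *eventQueue) Pop() any {
	old := *q
	n := len(old)
	e := old[n-1]
	old[n-1] = nil
	*q = old[:n-1]
	return e
}

// Orchestrator owns the simulated clock and the pending events.
type Orchestrator struct {
	now   int64
	seq   uint64
	queue eventQueue
}

// Now returns the current simulated time in milliseconds.
func (o *Orchestrator) Now() int64 { return o.now }

// After schedules run to fire delay milliseconds of simulated time from now.
func (o *Orchestrator) After(delay int64, run func()) {
	o.seq++
	heap.Push(&o.queue, &event{at: o.now + delay, seq: o.seq, run: run})
}

// Run drains the queue, jumping the clock straight to each event's timestamp.
// Callbacks may schedule further events while the loop is running.
func (o *Orchestrator) Run() {
	for o.queue.Len() > 0 {
		e := heap.Pop(&o.queue).(*event)
		o.now = e.at
		e.run()
	}
}
```

Because the clock only moves when an event is popped, a scenario that spans hours of simulated time still finishes as fast as the callbacks execute.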

Simulated Network

  • Fake network layer between clients and relay
  • Controllable latency, reordering, partitions
  • Message delivery control
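
The network layer could be sketched as a function from "send at time t" to "deliver at time t + latency", with the latency drawn from a seeded PRNG and partitions modeled as blocked pairs. Everything here (`SimNetwork`, `Delivery`, the millisecond units) is a hypothetical design, and the sketch assumes `maxLatency >= minLatency`. Reordering falls out naturally: two messages sent back to back can receive different latencies.

```go
package main

import "math/rand"

// Delivery describes when and where a message should arrive.
type Delivery struct {
	From, To string
	Payload  string
	At       int64 // simulated delivery time in milliseconds
}

// SimNetwork decides the fate of each message deterministically from its seed.
type SimNetwork struct {
	rng        *rand.Rand
	minLatency int64
	maxLatency int64
	blocked    map[[2]string]bool // directed pairs that are partitioned
}

// NewSimNetwork assumes maxLatency >= minLatency (both in milliseconds).
func NewSimNetwork(seed, minLatency, maxLatency int64) *SimNetwork {
	return &SimNetwork{
		rng:        rand.New(rand.NewSource(seed)),
		minLatency: minLatency,
		maxLatency: maxLatency,
		blocked:    make(map[[2]string]bool),
	}
}

// Partition blocks traffic between a and b in both directions.
func (n *SimNetwork) Partition(a, b string) {
	n.blocked[[2]string{a, b}] = true
	n.blocked[[2]string{b, a}] = true
}

// Heal removes a partition between a and b.
func (n *SimNetwork) Heal(a, b string) {
	delete(n.blocked, [2]string{a, b})
	delete(n.blocked, [2]string{b, a})
}

// Send returns the planned delivery, or false if the message is dropped
// because the two endpoints are currently partitioned.
func (n *SimNetwork) Send(now int64, from, to, payload string) (Delivery, bool) {
	if n.blocked[[2]string{from, to}] {
		return Delivery{}, false
	}
	jitter := n.rng.Int63n(n.maxLatency - n.minLatency + 1)
	at := now + n.minLatency + jitter
	return Delivery{From: from, To: to, Payload: payload, At: at}, true
}
```

In a full framework the returned `Delivery.At` would be handed to the orchestrator's scheduler rather than acted on directly.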

Simulated Clients

  • Each client has its own keypair
  • Can send EVENT, REQ, CLOSE messages
  • Receives relay responses
  • Actions driven by orchestrator
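
A simulated client might look roughly like the sketch below. It makes two loud assumptions that the issue does not confirm: that EVENT, REQ and CLOSE are framed as JSON arrays with the verb first, and that an Ed25519 key from the Go standard library is an acceptable stand-in for whatever signature scheme the relay really uses. The event body is deliberately incomplete (no id, timestamp or signature), and `SimClient` and its methods are hypothetical names.

```go
package main

import (
	"crypto/ed25519"
	"encoding/hex"
	"encoding/json"
	"math/rand"
)

// SimClient is a hypothetical simulated client with a deterministic keypair.
type SimClient struct {
	Name string
	Pub  ed25519.PublicKey
	priv ed25519.PrivateKey
}

// NewSimClient derives the keypair from the seed only, so the same seed
// always yields the same identity. Ed25519 is a stand-in (assumption).
func NewSimClient(name string, seed int64) *SimClient {
	rng := rand.New(rand.NewSource(seed))
	keySeed := make([]byte, ed25519.SeedSize)
	rng.Read(keySeed) // (*rand.Rand).Read always fills the slice
	priv := ed25519.NewKeyFromSeed(keySeed)
	return &SimClient{Name: name, Pub: priv.Public().(ed25519.PublicKey), priv: priv}
}

// PubHex returns the public key as lowercase hex.
func (c *SimClient) PubHex() string { return hex.EncodeToString(c.Pub) }

// Sign signs msg with the client's private key.
func (c *SimClient) Sign(msg []byte) []byte { return ed25519.Sign(c.priv, msg) }

// frame encodes a message as a JSON array (assumed wire format).
func frame(parts ...any) string {
	b, err := json.Marshal(parts)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// Event builds a simplified EVENT message; a real event would need more fields.
func (c *SimClient) Event(kind int, content string) string {
	return frame("EVENT", map[string]any{
		"pubkey":  c.PubHex(),
		"kind":    kind,
		"content": content,
	})
}

// Req opens a subscription with the given filter.
func (c *SimClient) Req(subID string, filter map[string]any) string {
	return frame("REQ", subID, filter)
}

// Close ends a subscription.
func (c *SimClient) Close(subID string) string { return frame("CLOSE", subID) }
```

`encoding/json` writes map keys in sorted order, so the encoded bytes are stable across runs, which matters once messages feed into hashes or recorded traces.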

Relay Under Test

  • Actual Eve Relay binary
  • Needs hooks for deterministic behavior:
    • Seeded PRNG for internal IDs
    • Controllable internal scheduling
    • State injection for specific test scenarios
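
One way to provide such hooks, assuming the relay code can be constructed in-process with injected dependencies (the issue leaves the amount of instrumentation open), is to route every clock read and every ID through a small struct. `Hooks`, `Clock` and `SimClock` are hypothetical names; nothing here reflects the relay's actual internals.

```go
package main

import (
	"encoding/hex"
	"math/rand"
	"time"
)

// Clock is the only way instrumented code should read the time.
type Clock interface {
	Now() time.Time
}

// SimClock is a manually advanced clock starting at the Unix epoch.
type SimClock struct{ t time.Time }

// Now returns the current simulated time.
func (c *SimClock) Now() time.Time { return c.t }

// AdvanceMillis moves simulated time forward by ms milliseconds.
func (c *SimClock) AdvanceMillis(ms int64) {
	c.t = c.t.Add(time.Duration(ms) * time.Millisecond)
}

// Hooks bundles the sources of nondeterminism the relay would depend on.
type Hooks struct {
	Clock Clock
	RNG   *rand.Rand
}

// NewSimHooks returns hooks backed by a simulated clock and a seeded PRNG.
// The clock is also returned directly so the test can advance it.
func NewSimHooks(seed int64) (*Hooks, *SimClock) {
	clock := &SimClock{t: time.Unix(0, 0).UTC()}
	return &Hooks{Clock: clock, RNG: rand.New(rand.NewSource(seed))}, clock
}

// NewID produces an internal ID (16 hex characters) from the seeded PRNG.
func (h *Hooks) NewID() string {
	b := make([]byte, 8)
	h.RNG.Read(b)
	return hex.EncodeToString(b)
}
```

A production build would pass hooks backed by the system clock and a properly random source; only the simulation swaps in the deterministic versions.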

State Verifier

  • Records all events and state changes
  • Compares actual vs expected outcomes
  • Reports violations
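
The verifier can be sketched as an append-only log plus invariant checks that run over it. The example below checks a single invariant, that an event is only delivered to receivers in the CCN it was published in; the `Record` fields and the idea of tagging each delivery with both CCNs are assumptions made for illustration.

```go
package main

import "fmt"

// Record is one observed delivery from the relay to a client.
type Record struct {
	Step        int    // position in the run, starting at 1
	EventID     string // identifier of the delivered event
	EventCCN    string // CCN the event was published in
	Receiver    string // client that received it
	ReceiverCCN string // CCN the receiver is authenticated to
}

// Verifier records deliveries and checks invariants over the full log.
type Verifier struct {
	log []Record
}

// RecordDelivery appends one delivery to the log.
func (v *Verifier) RecordDelivery(eventID, eventCCN, receiver, receiverCCN string) {
	v.log = append(v.log, Record{
		Step:        len(v.log) + 1,
		EventID:     eventID,
		EventCCN:    eventCCN,
		Receiver:    receiver,
		ReceiverCCN: receiverCCN,
	})
}

// Violations returns one message per delivery that crossed a CCN boundary.
func (v *Verifier) Violations() []string {
	var out []string
	for _, r := range v.log {
		if r.EventCCN != r.ReceiverCCN {
			out = append(out, fmt.Sprintf(
				"step %d: event %s from CCN %s reached %s in CCN %s",
				r.Step, r.EventID, r.EventCCN, r.Receiver, r.ReceiverCCN))
		}
	}
	return out
}
```

Keeping the step number in each record ties a violation back to an exact point in the deterministic run, which is what makes it reproducible from the seed.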

Test Scenarios to Support

CCN Basic Flow

  1. Setup: Start relay, create 10 clients with keypairs
  2. Invitations: Test various invitation patterns (chain invites, concurrent invites)
  3. Authentication: Clients connect and authenticate to CCN
  4. Messaging: Text notes, reactions, concurrent sends
  5. Subscriptions: REQ filtering, event delivery
  6. Edge Cases: Invalid events, unauthorized access attempts
  7. Disconnections: Client drops, reconnects, catches up
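
A scenario like this could be expressed as plain data that the orchestrator replays. The sketch below generates the "chain invites" pattern from step 2, where each client invites the next one, which then authenticates. The `Step` fields, the action names and the `client-N` naming are all hypothetical.

```go
package main

import "fmt"

// Step is one scripted action at a point in simulated time.
type Step struct {
	At     int64  // simulated time in milliseconds
	Actor  string // client performing the action
	Action string // hypothetical action name, e.g. "invite" or "auth"
	Arg    string // action argument, e.g. the invitee
}

// ChainInvites returns a script in which client-0 invites client-1,
// client-1 invites client-2, and so on. Each invitee authenticates half
// a spacing interval after being invited.
func ChainInvites(n int, spacing int64) []Step {
	if n < 2 {
		return nil // a chain needs at least two clients
	}
	steps := make([]Step, 0, 2*(n-1))
	for i := 0; i+1 < n; i++ {
		inviter := fmt.Sprintf("client-%d", i)
		invitee := fmt.Sprintf("client-%d", i+1)
		at := int64(i) * spacing
		steps = append(steps,
			Step{At: at, Actor: inviter, Action: "invite", Arg: invitee},
			Step{At: at + spacing/2, Actor: invitee, Action: "auth"},
		)
	}
	return steps
}
```

For the 10 clients in the setup step, `ChainInvites(10, 100)` yields 18 steps. Concurrent invites could be generated the same way by giving several steps the same `At` value.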

Security Testing

  • CCN boundary enforcement (events don't leak between CCNs)
  • Authentication bypass attempts
  • Malformed event handling
  • Timing attack scenarios

Stress Testing

  • Many concurrent clients
  • High event volume
  • Complex filter combinations
  • Resource leak detection

"Mad Science" Testing

  • Bizarre event sequences
  • Events with thousands of tags
  • Rapid connect/disconnect cycles
  • Complex nested REQ filters

Implementation Plan

Phase 1: Basic Framework

  • Simulation orchestrator with discrete event scheduler
  • Simulated clock implementation
  • Basic simulated client that can connect/send/receive
  • Simple test scenario runner

Phase 2: Full Client Simulation

  • Network simulation layer with controllable properties
  • State recording and verification
  • CCN invitation and authentication flow

Phase 3: Advanced Testing

  • Complex scenario scripting
  • Fault injection capabilities
  • Performance metrics collection
  • Multi-CCN testing

Phase 4: Reporting

  • Automated report generation (Markdown format in reports/)
  • Failure reproduction tooling

Technical Notes

  • Language: Go (good concurrency model for this)
  • Client modeling: Each client as a goroutine, stepped by the orchestrator so that Go's scheduler cannot change the event order
  • Deterministic seeding: All PRNGs seeded from test seed
  • No external I/O: Relay must run in isolation

Report Format

Generated reports should go in reports/report-YYYYMMDD-XXX.md with:

  • Test summary (pass/fail counts)
  • Failed scenario details with logs
  • Performance metrics
  • Reproduction instructions (seed values)
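
A report writer along these lines might be enough for a first version. It assumes that `XXX` in the file name is a zero-padded run counter, which the issue does not specify, and it omits performance metrics. `Report`, `Failure` and `ReportPath` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Failure describes one failed scenario and how to reproduce it.
type Failure struct {
	Scenario string
	Seed     int64
	Log      string
}

// Report is the summary of one simulation run.
type Report struct {
	Passed   int
	Failed   int
	Failures []Failure
}

// ReportPath builds reports/report-YYYYMMDD-XXX.md, treating XXX as a
// zero-padded run counter (an assumption).
func ReportPath(year, month, day, run int) string {
	return fmt.Sprintf("reports/report-%04d%02d%02d-%03d.md", year, month, day, run)
}

// Markdown renders the summary followed by one section per failure.
func (r Report) Markdown() string {
	var b strings.Builder
	b.WriteString("# Simulation report\n\n")
	fmt.Fprintf(&b, "Passed: %d, failed: %d\n", r.Passed, r.Failed)
	for _, f := range r.Failures {
		fmt.Fprintf(&b, "\n## %s\n\nReproduce with seed %d.\n\n%s\n", f.Scenario, f.Seed, f.Log)
	}
	return b.String()
}
```

The date is passed in as numbers rather than read from the system clock, in keeping with the rule that the simulation itself has no external dependencies.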

Questions/Decisions Needed

  1. How much relay instrumentation do we need for deterministic behavior?
  2. Should we simulate crypto operations or mock them?
  3. What's the minimum viable first version?

References

Taking inspiration from TigerBeetle's state machine testing and Antithesis-style deterministic exploration, but adapted for our needs.


This replaces manual testing of complex scenarios and gives us confidence in relay correctness under all conditions we can think of (and some we can't).

dannym added the priority: high, type: feature, effort: large labels 2025-06-04 13:46:42 +00:00
Reference: Arx/Eve-Relay#7