---
name: testing-with-boc
description: "Write tests for bocpy Behavior-Oriented Concurrency code. Use when: writing pytest tests for @when behaviors, Cown scheduling, send/receive messaging, cown grouping, chained behaviors, exception propagation. Covers parameter-count rules, module-level class requirements, and the send/receive assertion pattern."
---

# Testing with Behavior-Oriented Concurrency (BOC)

This skill describes how to write tests for code that uses the `bocpy` library for
behavior-oriented concurrency. BOC schedules work as **behaviors** — decorated
functions that run once all required **cowns** (concurrently-owned data) are
available. Testing BOC programs requires specific patterns because behaviors
execute asynchronously on worker interpreters.

> **Design guidance:** if you find yourself reaching for `time.sleep`,
> `threading.Event`, polling loops, or `wait_for_*` helpers inside a
> behavior or in test code that drives one, stop and read
> `thinking-in-boc` first. The right answer is almost always to express
> the dependency through the cown graph, not through a classical
> synchronization primitive.

## Key Concepts

| Concept | Description |
|---------|-------------|
| `Cown(value)` | A concurrently-owned wrapper. Behaviors receive exclusive temporal access to the cown's `.value`. |
| `@when(*cowns)` | Decorator that schedules the function as a behavior. The decorator replaces the function with a `Cown` holding the return value. **The decorated function must have exactly as many parameters as there are arguments to `@when`.** |
| `send(tag, contents)` | Sends a cross-interpreter message with the given tag. |
| `receive(tags, timeout)` | Blocks until a message with a matching tag arrives (or times out). Returns `(TIMEOUT, None)` on timeout. |
| `TIMEOUT` | Sentinel string returned as the tag by `receive` when a timeout elapses. |
| `wait(timeout)` | Blocks until all scheduled behaviors have completed. |

### Critical rule: parameter count must match `@when` argument count

The number of parameters on the decorated function **must exactly equal** the
number of arguments passed to `@when`. A mismatch causes unspecified behavior
that can crash the worker interpreter. Because the behavior never completes, the
test will hang forever unless `receive` is called with a timeout.

```python
# CORRECT — 1 @when arg, 1 function param
@when(x)
def good(x):
    return x.value * 2

# WRONG — 1 @when arg, 2 function params (default args count!)
@when(x)
def bad(x, factor=2):       # will crash — factor is an extra param
    return x.value * factor

# FIX — capture extra values via closure, not default args
factor = 2
@when(x)
def fixed(x):               # 1 param matches 1 @when arg
    return x.value * factor  # factor captured from enclosing scope
```

### Do not use the `def _(c, x=x)` loop-capture idiom

A common Python idiom for snapshotting a loop variable is to bind it as a
default argument:

```python
for i, c in enumerate(cowns):
    @when(c)
    def _(c, i=i):          # unnecessary AND breaks @when
        send("done", i)
```

**You don't need this with `@when`.** The transpiler rewrites the call site as
`whencall('__behavior__N', (c,), (i,))`, snapshotting captures into a tuple at
schedule time. There is no late-binding hazard to defend against — just
reference the loop variable directly:

```python
for i, c in enumerate(cowns):
    @when(c)
    def _(c):
        send("done", i)     # i is captured by value at schedule time
```

Adding `i=i` to the signature actively breaks the behavior. The transpiler
treats every name in the signature as a behavior parameter and discards the
default, so the worker sees a function with an extra positional arg that the
runtime never supplies. See the "Inspecting Transpiler Output" section of
`.github/copilot-instructions.md` for how to use `export_module.py` to
confirm exactly which names are parameters and which are captures.

If you do want a fresh scope per iteration (e.g. to avoid sharing mutable
state between iterations), use a helper function:

```python
def _schedule(c, i):                # fresh scope per iteration
    @when(c)
    def _(c):
        send("done", i)

for i, c in enumerate(cowns):
    _schedule(c, i)
```

### Critical rule: classes and functions must be declared at module level

Behaviors run in separate sub-interpreters. The transpiler exports the module so
workers can import it, which means **any class or function referenced inside a
`@when` behavior must be defined at module level**. A class defined inside a test
method or local function cannot be resolved by the worker and will crash.

```python
# CORRECT — class at module level
class Accumulator:
    def __init__(self):
        self.items = []
    def add(self, item):
        self.items.append(item)

def test_accumulator(self):
    acc = Cown(Accumulator())

    @when(acc)
    def _(a):
        a.value.add(42)       # Accumulator is importable ✓

# WRONG — class inside test method
def test_accumulator_bad(self):
    class Accumulator:        # local class — worker can't import it
        ...
    acc = Cown(Accumulator())

    @when(acc)
    def _(a):
        a.value.add(42)       # will crash
```

## Project Test Setup

- Tests use **pytest** and live in the `test/` directory.
- Install test dependencies: `pip install -e .[test]`
- Run: `pytest -vv`

## Pattern 1 — Assert Inside a Behavior via `send`/`receive`

Because behaviors run asynchronously, you **cannot** assert directly in the test
body after scheduling a behavior. Instead, use `send` to ship the result out of
the behavior and `receive` in the test to collect and verify it.

```python
from bocpy import Cown, when, send, receive, drain, TIMEOUT, wait

RECEIVE_TIMEOUT = 10

class TestExample:
    @classmethod
    def teardown_class(cls):
        wait()  # drain the scheduler after all tests

    def receive_asserts(self, count=1):
        """Helper: collect `count` assertion messages and fail on mismatch.

        Uses a timeout so that if a behavior never fires (e.g. due to a
        parameter-count mismatch in @when) the test fails quickly instead
        of hanging forever. The "assert" queue is always drained before
        returning so leftover messages from a failing test do not leak
        into subsequent tests in CI.
        """
        failed = None
        timed_out = False
        try:
            for _ in range(count):
                result = receive("assert", RECEIVE_TIMEOUT)
                if result[0] == TIMEOUT:
                    timed_out = True
                    break
                _, (actual, expected) = result
                if failed is None and actual != expected:
                    failed = (actual, expected)
        finally:
            drain("assert")

        assert not timed_out, (
            "Timed out waiting for an 'assert' message from a behavior. "
            "Check that every @when arg count matches the decorated "
            "function's parameter count."
        )
        if failed is not None:
            actual, expected = failed
            assert actual == expected, f"expected {expected!r}, got {actual!r}"

    def test_double(self):
        x = Cown(3)

        @when(x)
        def result(x):
            return x.value * 2

        @when(result)
        def _(r):
            send("assert", (r.value, 6))

        self.receive_asserts()
```

### Why this pattern?

`@when` returns immediately — the behavior hasn't executed yet. The test thread
must block on `receive("assert")` to synchronize with the behavior's completion.
Calling `wait()` in `teardown_class` ensures any remaining work finishes before
the next test class starts.

## Pattern 2 — Testing Nested / Chained Behaviors

Behaviors can schedule further behaviors. Use multiple `send` calls and
`receive_asserts(count)` to verify each step in the chain.

```python
def test_nested(self):
    x = Cown(1)

    @when(x)
    def step1(x):
        x.value *= 2          # x is now 2

        @when(x)
        def step2(x):
            x.value *= 3      # x is now 6

        return step2

    @when(x, step1)
    def check(x, s):
        send("assert", (x.value, 2))

        @when(x, s.value)     # s.value is the inner Cown
        def deep_check(x, _):
            send("assert", (x.value, 6))

    self.receive_asserts(2)
```

## Pattern 3 — Multi-Cown Coordination

Pass multiple cowns to `@when` to atomically operate on several pieces of data.
The scheduler guarantees deadlock-free acquisition.

```python
def test_transfer(self):
    x = Cown(100)
    y = Cown(0)

    @when(x, y)
    def _(x, y):
        y.value += 50
        x.value -= 50

    @when(x)
    def _(x):
        send("assert", (x.value, 50))

    @when(y)
    def _(y):
        send("assert", (y.value, 50))

    self.receive_asserts(2)
```

## Pattern 4 — Cown Grouping

When you have a dynamic number of cowns (e.g., a list), you can pass them to
`@when` as a **list** (or slice) rather than individual arguments. Inside the
behavior, that parameter is delivered as a `list[Cown]` — each element is an
acquired cown whose `.value` you can read or write.

You can mix single cowns and groups freely in any order. Each distinct argument
to `@when` becomes its own parameter in the decorated function:

| `@when(...)` arguments | Behavior parameters |
|------------------------|---------------------|
| `@when(list_of_cowns)` | `(group: list[Cown])` |
| `@when(cowns[:9], cowns[9])` | `(group: list[Cown], single: Cown)` |
| `@when(cowns[0], cowns[1:])` | `(single: Cown, group: list[Cown])` |
| `@when(cowns[:4], cowns[4], cowns[5:])` | `(g0: list[Cown], single: Cown, g1: list[Cown])` |
| `@when(cowns[0], cowns[1:9], cowns[9])` | `(s0: Cown, group: list[Cown], s1: Cown)` |

### Full group example

```python
from bocpy import Cown, when, send, receive

cowns = [Cown(i) for i in range(10)]  # values 0..9, sum = 45

# All cowns as a single group
@when(cowns)
def group_sum(group: list[Cown[int]]):
    return sum(c.value for c in group)

# Group + single cown
@when(cowns[:9], cowns[9])
def group_then_single(group: list[Cown[int]], single: Cown[int]):
    return sum(c.value for c in group) + single.value

# Single cown + group
@when(cowns[0], cowns[1:])
def single_then_group(single: Cown[int], group: list[Cown[int]]):
    return single.value + sum(c.value for c in group)

# Group + single + group
@when(cowns[:4], cowns[4], cowns[5:])
def group_single_group(g0: list[Cown[int]], single: Cown[int], g1: list[Cown[int]]):
    return sum(c.value for c in g0) + single.value + sum(c.value for c in g1)
```

### Testing grouped results

The results are all cowns, so use the same `send`/`receive` pattern. You can
itself pass a list of result cowns as a group to `@when`:

```python
def test_cown_grouping(self):
    expected = 45
    results = [group_sum, group_then_single, single_then_group, group_single_group]

    @when(results)
    def check(results: list[Cown]):
        for r in results:
            send("assert", (r.value, expected))

    self.receive_asserts(len(results))
```

### Key rules for grouping

- Pass a **list** (or slice) of cowns to `@when` — the behavior receives the
  corresponding parameter as `list[Cown]`.
- Pass a **single cown** — the parameter receives that `Cown` directly.
- You can **interleave** singles and groups in any order. The positional mapping
  between `@when(...)` arguments and the decorated function's parameters is 1:1.
- Type-annotate grouped parameters as `list[Cown[T]]` for clarity.

## Pattern 5 — Exception Propagation

If a behavior raises, the exception is captured in the returned cown's `.value`
**and** the cown's `.exception` flag is set to `True`. This lets downstream
behaviors distinguish a thrown exception from a value that just happens to be
an `Exception` instance returned normally.

```python
def test_exception_in_behavior(self):
    x = Cown(1)

    @when(x)
    def bad(x):
        x.value /= 0          # ZeroDivisionError

    @when(bad)
    def _(b):
        send("assert", (b.exception, True))
        send("assert", (isinstance(b.value, ZeroDivisionError), True))
        b.value = None         # writing .value clears the exception flag

    self.receive_asserts(2)


def test_returned_exception_is_not_flagged(self):
    """An Exception object *returned* from a behavior is just a value."""
    x = Cown(1)

    @when(x)
    def returns_exc(x):
        return ValueError("not really an error")

    @when(returns_exc)
    def _(r):
        send("assert", (r.exception, False))
        send("assert", (isinstance(r.value, ValueError), True))

    self.receive_asserts(2)
```

Notes:

- Writing `cown.value = ...` from inside a behavior **clears** `.exception`.
- `cown.exception` is also writable inside a behavior, in case you want to
  manually mark or unmark a cown as carrying an error.
- Always assert on `.exception` before `isinstance(.value, Exception)` —
  otherwise a behavior that legitimately returns an `Exception` will be
  indistinguishable from one that raised.

## Pattern 6 — Noticeboard

The noticeboard is a global key-value store (up to 64 keys) that behaviors can
read and write **without** acquiring any cowns. Writes are non-blocking; reads
return a snapshot taken once per behavior execution.

| Function | Purpose |
|----------|---------|
| `notice_write(key, value)` | Non-blocking write. |
| `notice_update(key, fn, default=None)` | Atomic read-modify-write. `fn` and `default` must be picklable. Returning `REMOVED` deletes the entry. |
| `notice_delete(key)` | Non-blocking delete. |
| `noticeboard()` | Read-only mapping — snapshot of the noticeboard, cached for the duration of the current behavior. |
| `notice_read(key, default=None)` | Convenience: one key from the snapshot. |

### Key rule: snapshot per behavior

Within a single behavior, `noticeboard()` and `notice_read()` always return
data from the **same** snapshot — even if other behaviors write in the
meantime. To see a write made by another behavior, schedule a follow-up
behavior (typically by chaining via a cown returned from `@when`).

```python
def test_noticeboard_roundtrip(self):
    x = Cown(0)

    @when(x)
    def step1(x):
        notice_write("greeting", "hello")

    # The chain on `step1` ensures step2 runs *after* the write has been
    # applied and step2's snapshot sees it.
    @when(x, step1)
    def step2(x, _):
        send("assert", (notice_read("greeting"), "hello"))

    self.receive_asserts()
```

### Atomic update

`notice_update` runs `fn(current_value)` on the scheduler and writes the
result back atomically. Lambdas and closures are **not** picklable — use a
module-level function (optionally wrapped with `functools.partial`) or an
`operator` function.

```python
from functools import partial
from operator import add

def _bump(n, by):
    return n + by

class TestCounter:
    @classmethod
    def teardown_class(cls):
        wait()

    def test_atomic_increment(self):
        x = Cown(0)

        @when(x)
        def init(x):
            notice_write("count", 0)

        @when(x, init)
        def bump(x, _):
            notice_update("count", partial(_bump, by=5))
            notice_update("count", partial(add, 3))

        @when(x, bump)
        def check(x, _):
            send("assert", (notice_read("count"), 8))

        receive_asserts()
```

### Delete via `REMOVED`

Returning the `REMOVED` sentinel from a `notice_update` callback deletes the
entry. `notice_delete(key)` is the direct form.

```python
def _drop_if_zero(n):
    return REMOVED if n == 0 else n - 1

def test_remove_via_update(self):
    x = Cown(0)

    @when(x)
    def init(x):
        notice_write("lives", 1)

    @when(x, init)
    def tick(x, _):
        notice_update("lives", _drop_if_zero)   # 1 -> 0
        notice_update("lives", _drop_if_zero)   # 0 -> REMOVED

    @when(x, tick)
    def check(x, _):
        send("assert", ("lives" in noticeboard(), False))

    self.receive_asserts()
```

### Common noticeboard pitfalls

| Pitfall | Fix |
|---------|-----|
| Reading a value back inside the **same** behavior that wrote it | The snapshot was taken at the start of the behavior. Chain a follow-up `@when` to observe the write. |
| Passing a lambda or closure to `notice_update` | They are not picklable. Use a module-level function with `functools.partial`, or an `operator` function. |
| Asserting in the test body that `noticeboard()` contains a key | Read inside a behavior and `send` the result out — `noticeboard()` and `notice_read()` outside any behavior return a snapshot that is never refreshed. |
| Writing more than 64 distinct keys | Excess writes are dropped with a logged warning — they do **not** raise. Keep tests within the limit (and `notice_delete` keys you no longer need). |

## Pattern 7 — Parameterized Tests

Use `@pytest.mark.parametrize` to sweep inputs. Each invocation gets its own
cowns so tests are isolated.

```python
@pytest.mark.parametrize("n", [1, 10, 15])
def test_fibonacci(self, n):
    result = fib_parallel(n)
    expected = fib_sequential(n)

    @when(result)
    def _(r):
        send("assert", (r.value, expected))

    self.receive_asserts()
```

## Pattern 8 — Testing `send`/`receive` Messaging Directly

For code that uses the lower-level messaging API without behaviors:

```python
from bocpy import send, receive, TIMEOUT

def test_basic_messaging():
    send("tag", "payload")
    tag, value = receive("tag", 1)
    assert tag != TIMEOUT
    assert value == "payload"

def test_receive_timeout():
    tag, value = receive("tag", 0.1)
    assert tag == TIMEOUT
    assert value is None

def test_timeout_with_after_callback():
    tag, value = receive("tag", 0.1, lambda: ("fallback", 42))
    assert tag == "fallback"
    assert value == 42
```

## Pattern 9 — Complex Objects in Cowns

Mutable objects (e.g., class instances) work inside cowns. Behaviors mutate them
in-place under exclusive access.

```python
class Counter:
    def __init__(self):
        self.n = 0
    def increment(self):
        self.n += 1

def test_object_in_cown(self):
    c = Cown(Counter())

    for _ in range(10):
        @when(c)
        def _(c):
            c.value.increment()

    @when(c)
    def _(c):
        send("assert", (c.value.n, 10))

    self.receive_asserts()
```

## Common Pitfalls

| Pitfall | Fix |
|---------|-----|
| **Parameter count mismatch in `@when`** | The decorated function must have **exactly** as many parameters as `@when` arguments. A mismatch crashes the worker. Use closure variables instead of default arguments to capture extra values. |
| **Classes/functions defined inside a test** | Behaviors run in sub-interpreters that import the module. Define all classes and functions used in behaviors at **module level** so workers can resolve them. |
| Asserting in the test body right after `@when` | The behavior hasn't run yet. Use `send`/`receive` to synchronize. |
| `receive` without a timeout | If a behavior crashes silently, the test hangs forever. Always pass a timeout (e.g. `RECEIVE_TIMEOUT = 10`) and assert the result is not `TIMEOUT`. |
| Forgetting `wait()` in teardown | Pending behaviors may leak into the next test class. Always call `wait()` in `teardown_class`. |
| Reading `cown.value` outside a behavior | A cown must be acquired first. Read values inside `@when` or use `send`/`receive`. |
| Using default arguments to capture loop variables | Default args add parameters, breaking the arg-count rule. Use a closure variable instead: `val = i` on a separate line before `@when`. |
| Mismatched `receive_asserts` count | The count must match the exact number of `send("assert", ...)` calls expected. |
| Non-XIData-compatible objects in cowns across interpreters | Stick to built-in types or objects that support cross-interpreter data. |
| Test function names with uppercase letters (N802) | Test names must be lowercase. E.g., `test_t_equals_transpose`, **not** `test_T_equals_transpose`, even when testing a property like `.T`. |
| Assigning `Cown(m)` to an unused variable (F841) | When the return value isn't needed (e.g., releasing a resource), use bare `Cown(m)` without assignment. |
| Using single quotes (Q000) | The project enforces `inline-quotes = double`. Use `"nan"` not `'nan'`. |
| Multi-line class docstring formatting (D205/D209) | Summary line, then blank line, then body. Closing `"""` on its own line. |
| Placing `# noqa: B023` on the `def` line instead of the violation line | `# noqa: B023` must go on the line that **references** the loop variable, not the `def _(a):` line above it. |