# @aituber-onair/noise

![@aituber-onair/noise logo](https://raw.githubusercontent.com/shinshin86/aituber-onair/main/packages/noise/images/aituber-onair-noise.png)

AITuber OnAir Noise is a context-aware response rewrite engine for disturbing predictable LLM phrasing without changing the meaning of the reply.

Do not let AI responses end in predictable harmony.

It is designed for AI VTubers and AI character streams where a response can feel too clean, too agreeable, or too neatly summarized. The package detects predictability, builds structured friction parameters, asks an LLM for multiple rewrite candidates, and selects the candidate that best preserves the character while avoiding a predictable landing.

Noise is not just a rewrite engine: it is a deviation orchestration engine. Research across conversation analysis, improv theory, humor theory, and field analysis of successful AI VTubers converges on one formula (see `docs/design-research.md`):

> Pleasant unpredictability = (established pattern) x (deviation shipped with a
> simultaneous "this is play" marker) x (safe target) x (relational license) x
> (return to pattern). Remove any factor and the same output flips from charm
> to malfunction.

So in addition to rewriting, Noise schedules when deviation is allowed (rhythm), decides how much deviation the relationship has earned (relationship capital), refuses to disturb sincere moments (sincerity gate), certifies teasing as play (play markers), reuses shared memories as running gags (gag ledger), and learns from audience reactions (reaction loop).

## Why this exists

LLMs are trained on the average of a huge amount of text, and preference tuning (RLHF) pushes them further toward replies that are safe, agreeable, and neatly summarized. That is fine for an assistant, but for an AI character stream it produces **predictable harmony** (予定調和): the same temperature every time, a tidy closing every time, and an audience that gets bored.

Human conversation is engaging precisely because it does *not* go to plan: a retort, a pause, a deliberately withheld reaction, a callback to an old joke.

The hard part is **not** generating disruption; an LLM can do that. The hard part is that whether a broken expectation reads as *charm* or as *malfunction* does not live in the text; it lives in the receiver. The same blunt line reads as an "endearing gap" from a beloved regular character and as "rude" from a stranger.

So Noise is less a text generator and more a controller: it manages **when, how far, and toward whom** a reply may deviate, and learns from how the audience reacts.

## How it works (one turn)

After the LLM produces a draft reply, Noise runs this pipeline (the same one the browser sample visualizes under "ノイズの判断を見る", "See Noise's decision"):

1. **Diagnose**: is this draft too predictable? Detect clean closings, over-apology, over-agreement, etc., and score it.
2. **Three gates: may we disrupt at all?**
   - **Sincerity gate**: if the viewer is making a serious or vulnerable bid, stop everything (failed uptake of a sincere moment is the worst violation).
   - **Relationship capital**: unlock stronger interventions (teasing, callbacks) only as the bond grows.
   - **Rhythm**: rest right after a disruption, because constant disruption becomes a new predictable style.
3. **Plan**: choose which interventions to use, limited to what the gates allow.
4. **Generate & score candidates**: ask the LLM for several rewrites and score them on predictability reduction, character preservation, genericity, and whether a play marker is present.
5. **Select & quality-check**: pick the strongest safe candidate; reject over-corrections.
6. **Learn**: record what was actually applied; later `reportReaction()` feeds the audience response back, raising or lowering how far Noise will push next time and promoting well-received moments into the gag ledger.

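
The gate step in the pipeline above can be pictured as a short chain of early returns. The following is a minimal sketch, not the package's implementation: `sketchGates`, `TurnState`, the reason strings, and every threshold here are hypothetical and chosen only to illustrate the ordering (sincerity first, then rhythm, then the predictability score, with relationship capital deciding how far a tilt may go).

```typescript
// Illustrative sketch only: names, thresholds, and gate order are assumptions,
// not the internals of @aituber-onair/noise.
interface TurnState {
  sincereBid: boolean; // the viewer is making a serious or vulnerable bid
  relationshipCapital: number; // 0-1, how much license the bond has earned
  turnsSinceLastTilt: number; // in-character turns since the last deviation
  predictabilityScore: number; // 0-1, from diagnosing the draft
}

type GateDecision =
  | { disrupt: false; reason: 'sincerity' | 'rhythm' | 'not_predictable' }
  | { disrupt: true; allowTeasing: boolean };

// Hypothetical limits used only by this sketch.
const SKETCH_LIMITS = {
  cooldownTurns: 2,
  scoreThreshold: 0.45,
  teasingCapital: 0.55,
};

function sketchGates(state: TurnState): GateDecision {
  // A sincere moment stops everything before any other check runs.
  if (state.sincereBid) {
    return { disrupt: false, reason: 'sincerity' };
  }
  // Rest right after a disruption so deviation stays an event.
  if (state.turnsSinceLastTilt < SKETCH_LIMITS.cooldownTurns) {
    return { disrupt: false, reason: 'rhythm' };
  }
  // A draft that is not predictable does not need disturbing.
  if (state.predictabilityScore < SKETCH_LIMITS.scoreThreshold) {
    return { disrupt: false, reason: 'not_predictable' };
  }
  // Stronger (teasing) interventions unlock only with enough capital.
  return {
    disrupt: true,
    allowTeasing: state.relationshipCapital >= SKETCH_LIMITS.teasingCapital,
  };
}
```

Ordering the checks this way means the cheapest and most important refusal (a sincere moment) never depends on the other signals being available.
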
In short: **keep the character's "form", choreograph when and how far to break it, and always return to form.** The laughter-flavored reaction signals are not there to make the AI tell jokes; they are the sensor that measures whether a deviation (a bet) actually paid off.

## Basic Usage

```ts
import { createContaminator } from '@aituber-onair/noise';

const contaminator = createContaminator({
  intensity: 0.42,
  mode: 'performer',
  chat: {
    provider: 'openai',
    options: {
      apiKey: process.env.OPENAI_API_KEY!,
      model: 'gpt-4o-mini',
    },
  },
});

const result = await contaminator.contaminate({
  systemPrompt: 'You are a strange AI VTuber.',
  messages: [{ role: 'user', content: 'Thanks for the stream!' }],
  draft:
    'Thank you for coming today. It was a very fun stream. Please look forward to the next one.',
  streamContext: {
    currentSituation: 'The stream is ending too neatly.',
  },
  seed: 'ending-1',
  constraints: {
    preserveCodeBlocks: true,
    preserveUrls: true,
    preserveNumbers: true,
    maxAddedChars: 120,
  },
});

console.log(result.text);
console.log(result.diagnosis);
console.log(result.plan);
console.log(result.applied);
console.log(result.quality);
```

## Conditional Usage

Noise does not have to run on every LLM reply. In a production stream, a common pattern is to diagnose the draft first, then rewrite only when the response is likely to land too safely:

```ts
import {
  createContextFingerprint,
  createContaminator,
  diagnosePredictability,
} from '@aituber-onair/noise';

// Assumes `contaminator`, `systemPrompt`, `messages`, `streamContext`,
// and `llmReply` are already defined by the app.
const context = createContextFingerprint({
  systemPrompt,
  messages,
  streamContext,
});

const diagnosis = diagnosePredictability({
  draft: llmReply,
  context,
});

// Rewrite only when the draft scores as predictable enough.
const shouldUseNoise = diagnosis.score >= 0.45;

const finalReply = shouldUseNoise
  ? (
      await contaminator.contaminate({
        systemPrompt,
        messages,
        draft: llmReply,
        streamContext,
      })
    ).text
  : llmReply;
```

This makes Noise behave like a post-generation effect: use it for overly safe closings, repeated phrasing, forced positivity, and stream situations where a flat response would weaken the character. Skip it for precise announcements, system messages, and high-stakes text.

## Browser Example

This package includes a browser lab for trying LLM-based rewrites and adaptive memory providers.

```sh
npm -w @aituber-onair/noise run example:noise-sample
```

## Deviation Orchestration

### Rhythm: platform -> tilt -> platform

A deviation only reads as an event against a stretch of normal, in-character turns. The built-in rhythm controller skips noise right after a tilt (cooldown) and can require platform turns before tilting:

```ts
const contaminator = createContaminator({
  rhythm: {
    minPlatformTurns: 2, // in-character turns required before a tilt
    cooldownTurns: 2, // in-character turns enforced after a tilt
    tiltThreshold: 0.45, // diagnosis score needed to tilt
    forcedTiltAfter: 8, // tilt anyway after this many flat turns
  },
});
```

When a turn is skipped, `contaminate()` returns the draft unchanged with `result.skipped` describing why (`'cooldown'`, `'platform'`, `'low_predictability'`, `'repair'`, or `'sincerity'`). Pass `forceTilt: true` in the input to bypass the rhythm gate.

### Relationship capital

The same tease that charms an established audience alienates a new one. Pass `relationshipCapital` (0-1) per call (derived from any bond system, for example kizuna points) and Noise caps both the effective mode and the intervention vocabulary:

- `stranger` (< 0.25): phrasing-level edits only (`subtle`).
- `acquaintance` (< 0.55): + soft disagreement, dispreferred shape, length violation (`performer`).
- `regular` (< 0.8): + contrarian reframe, callbacks, boke bait, status seesaw (`inversion`).
- `companion` (>= 0.8): + tsukkomi, withheld uptake (`chaotic`).

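
The tier thresholds listed above amount to a small lookup over ascending upper bounds. Here is a minimal sketch under stated assumptions: `sketchResolveTier` and the `TIER_ROWS` table are hypothetical names for illustration and are not the package's `resolveRelationshipTier()`; only the threshold values and the tier and mode names come from the list above. Treating out-of-range or non-finite input as the lowest tier is also an assumption.

```typescript
// Illustrative lookup; the thresholds mirror the README list, everything else
// (names, clamping, NaN handling) is an assumption of this sketch.
type Tier = 'stranger' | 'acquaintance' | 'regular' | 'companion';
type Mode = 'subtle' | 'performer' | 'inversion' | 'chaotic';

interface TierRow {
  upperBound: number; // exclusive upper bound on relationshipCapital
  tier: Tier;
  maxMode: Mode; // strongest mode this tier may use
}

const TIER_ROWS: readonly TierRow[] = [
  { upperBound: 0.25, tier: 'stranger', maxMode: 'subtle' },
  { upperBound: 0.55, tier: 'acquaintance', maxMode: 'performer' },
  { upperBound: 0.8, tier: 'regular', maxMode: 'inversion' },
];

function sketchResolveTier(capital: number): { tier: Tier; maxMode: Mode } {
  // Non-finite input falls back to 0 so bad data never unlocks teasing.
  const safe = Number.isFinite(capital) ? capital : 0;
  const clamped = Math.min(1, Math.max(0, safe));
  for (const row of TIER_ROWS) {
    if (clamped < row.upperBound) {
      return { tier: row.tier, maxMode: row.maxMode };
    }
  }
  // Anything at or above 0.8 is the highest tier.
  return { tier: 'companion', maxMode: 'chaotic' };
}
```

Failing closed on bad input matters here because the highest tier unlocks the riskiest interventions.
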
```ts
const result = await contaminator.contaminate({
  systemPrompt,
  messages,
  draft,
  relationshipCapital: 0.7,
});

console.log(result.gates.relationship.tier); // 'regular'
```

### Sincerity gate

When recent user messages carry a sincere bid (distress, a serious consultation, a heavy life event), all noise is suppressed before any other processing. Failed uptake of a sincere moment is the worst possible violation. Disable with `sincerityGate: false` if the app handles this elsewhere.

### Play markers

Benign violation theory: a violation must be decoded as play at the same moment it lands. Teasing-class interventions (`tsukkomi`, `withheld_uptake`, `boke_bait`, `status_seesaw`, `contrarian_reframe`) require a playful marker (laughter token, exaggeration, self-tease) in the same reply; candidates without one are penalized and flagged with a `missing_play_marker` issue.

### Gag ledger and callbacks

Callbacks (resurfacing a shared past moment) are the highest-value, lowest-risk surprise: they are unexpected and prove memory at the same time.

```ts
await contaminator.recordMoment({
  summary: 'The viewer exploded a pudding in the fridge',
  source: 'user',
});
// Later turns may plan a `callback` intervention with that moment as material.
```

Moments are also promoted automatically when a tilt gets a positive reaction.

### Reaction loop

Every deviation is a bet; feed the observed result back:

```ts
const reaction = await contaminator.reportReaction({ signal: 'laughter' });
// 'laughter' | 'positive' | 'neutral' | 'silence' | 'pushback' | 'discomfort'
```

Positive signals widen the violation budget and promote the latest tilt into the gag ledger. Negative signals shrink the budget and schedule repair turns during which noise stays off.

Subscribe to lifecycle events via `onNoiseEvent` (`tilt_applied`, `noise_skipped`, `repair_advised`, `moment_recorded`, `callback_used`) to let the app stage reactions: solo AI chaos is nonsense, chaos with a visible reactor is comedy.

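
The budget feedback described above can be sketched as a small pure state update. This is an illustration under assumptions, not the package's implementation: `sketchApplyReaction`, `BudgetState`, the step sizes, the repair-turn counts, and the choice to treat `silence` as mildly negative are all hypothetical. Only the six signal names come from the API comment above.

```typescript
// Illustrative feedback loop; all numbers and names other than the signal
// strings are assumptions of this sketch.
type ReactionSignal =
  | 'laughter'
  | 'positive'
  | 'neutral'
  | 'silence'
  | 'pushback'
  | 'discomfort';

interface BudgetState {
  budget: number; // 0-1, how far the next deviation may go
  repairTurns: number; // upcoming turns during which noise stays off
}

// Hypothetical effect table: budget delta and repair turns per signal.
const SKETCH_EFFECTS: Record<ReactionSignal, { delta: number; repair: number }> = {
  laughter: { delta: 0.1, repair: 0 },
  positive: { delta: 0.05, repair: 0 },
  neutral: { delta: 0, repair: 0 },
  silence: { delta: -0.05, repair: 0 },
  pushback: { delta: -0.1, repair: 1 },
  discomfort: { delta: -0.2, repair: 2 },
};

function sketchApplyReaction(
  state: BudgetState,
  signal: ReactionSignal,
): BudgetState {
  const effect = SKETCH_EFFECTS[signal];
  // Keep the budget inside [0, 1] so repeated signals cannot run away.
  const budget = Math.min(1, Math.max(0, state.budget + effect.delta));
  // A new negative signal never shortens a repair period already scheduled.
  const repairTurns = Math.max(state.repairTurns, effect.repair);
  return { budget, repairTurns };
}
```

Returning a new state instead of mutating makes it straightforward to persist the result in whichever memory store the app uses.
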
### Positioning: why the vocabulary sounds like comedy

Noise is **not** a library for making an AI character do comedy. The goal is unchanged: keep LLM replies from converging to the safe, average landing. The comedy-flavored vocabulary (reactions like "it got laughs", boke/tsukkomi interventions, the gag ledger) exists for three structural reasons:

1. **Every deviation is a bet, and the payoff lives in the audience.** Whether a broken expectation reads as charm or as malfunction is not a property of the text; expectancy violations theory shows it is decided by the receiver's appraisal. An engine that injects deviation without observing reception is an open-loop controller: it cannot know whether to push further or pull back. `reportReaction()` is that sensor, and the violation budget is the feedback loop. The API signals themselves are neutral (`laughter` / `positive` / `neutral` / `silence` / `pushback` / `discomfort`).
2. **Humor research is borrowed as measurement science, not as a goal.** The most developed body of knowledge about when a norm violation lands as *pleasure* instead of offense is humor theory (benign violation theory, boke/tsukkomi as a grammar for certifying deviation as play). Noise uses it to keep deviations safe, the same way it uses conversation analysis for response shapes; neither makes the output a joke.
3. **In a live stream, laughter is the most observable proxy for "the deviation was accepted."** You cannot directly measure "the audience appraised the violation positively", but you can literally count 草 and w (Japanese chat shorthand for laughter) in chat. That is why the browser lab labels its reaction buttons in streamer terms (ウケた / スベった, roughly "it landed" / "it bombed"): it is the sample's translation into its own context, not the library's purpose. Likewise the gag ledger is at heart a *shared-memory callback* device: resurfacing a moment the audience lived through proves memory and deepens the relationship; being funny is optional.

## Rewrite Modes

`mode` controls how far Noise may move the response away from a predictable landing:

- `subtle`: small edits that remove obvious polish.
- `performer`: character-safe live-stream phrasing.
- `bold`: stronger streamer judgment and clearer live tension.
- `inversion`: reverses the expected emotional landing while preserving facts.
- `chaotic`: the largest coherent disruption, with self-repair and unfinished edges.

## Design

Noise works after an LLM has already produced a draft. It is independent from conversation-loop detectors such as `@aituber-onair/manneri`: those tools can watch the conversation flow before generation, while Noise watches the response landing after generation.

The engine pipeline:

- `createContextFingerprint()` reads the persona, recent messages, and optional `streamContext`.
- `diagnosePredictability()` classifies why the draft feels too safe, generic, or over-polished.
- `assessSincerity()`, `resolveRelationshipTier()`, and `decideRhythm()` gate whether this turn may deviate at all, and how far.
- `buildInterventionPlan()` and `buildFrictionParameters()` turn the diagnosis into structured instructions such as grounding in recent comments, reducing over-apology, adding streamer judgment, dispreferred response shape, boke/tsukkomi moves, status seesaw, or a callback from the gag ledger.
- `generateRewriteCandidates()` asks an LLM for multiple candidates from those structured parameters, each with a self-reported typicality so selection can prefer the distribution tail.
- `evaluateRewriteCandidates()` checks predictability reduction, context grounding, specificity, persona preservation, meaning preservation, aggression risk, ungrounded detail risk, genericity (stock phrases and near-repeats of the character's own recent outputs), play markers, and whether the final sentence (the highest-value surprise position) actually changed.
- `selectBestCandidate()` returns the strongest safe candidate.

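
The last two pipeline steps above (evaluate, then select) can be pictured as "filter out unsafe candidates, then take the highest weighted score". The sketch below is only an illustration of that shape: `sketchScore`, `sketchSelectBest`, the field names, the weights, the safety cut-offs, and the fallback to the unchanged draft are all assumptions, and it covers only a subset of the criteria listed above.

```typescript
// Illustrative scoring and selection; weights, cut-offs, and names are
// assumptions of this sketch, not the package's evaluation logic.
interface SketchCandidate {
  text: string;
  predictabilityReduction: number; // 0-1, higher is better
  personaPreservation: number; // 0-1, higher is better
  meaningPreservation: number; // 0-1, higher is better
  aggressionRisk: number; // 0-1, lower is better
  genericity: number; // 0-1, lower is better
  needsPlayMarker: boolean; // a teasing-class intervention was used
  hasPlayMarker: boolean; // the reply carries a playful marker
}

function sketchScore(c: SketchCandidate): number {
  let score =
    0.4 * c.predictabilityReduction +
    0.3 * c.personaPreservation +
    0.3 * c.meaningPreservation -
    0.5 * c.aggressionRisk -
    0.3 * c.genericity;
  // Teasing without a play marker is penalized rather than rejected outright.
  if (c.needsPlayMarker && !c.hasPlayMarker) {
    score -= 0.25;
  }
  return score;
}

function sketchSelectBest(candidates: SketchCandidate[], draft: string): string {
  // Safety first: drop candidates that are too aggressive or drift in meaning.
  const safe = candidates.filter(
    (c) => c.aggressionRisk <= 0.5 && c.meaningPreservation >= 0.6,
  );
  if (safe.length === 0) {
    return draft; // assumption: keep the original draft when nothing is safe
  }
  return safe.reduce((best, c) => (sketchScore(c) > sketchScore(best) ? c : best))
    .text;
}
```

Separating the hard safety filter from the soft score keeps a very surprising but unsafe candidate from winning on points alone.
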
The full intervention vocabulary:

| Intervention | What it does |
| --- | --- |
| `ground_in_recent_comment` | Reference something a viewer actually said |
| `add_streamer_judgment` | Make a streamer-side decision |
| `soft_disagreement` | Replace clean agreement with a warm reservation |
| `contrarian_reframe` | Reverse the expected emotional landing |
| `self_repair` | Live-speech self-correction mid-flow |
| `unfinished_margin` | Leave the final thought slightly open |
| `reduce_over_apology` | Drop service-style apology tone |
| `reduce_over_agreement` | Weaken automatic acceptance |
| `increase_specificity` | Add a concrete anchor |
| `acknowledge_tension` | Name the visible trouble |
| `break_clean_closing` | Avoid a tidy goodbye |
| `callback` | Resurface a gag-ledger moment as a running gag |
| `dispreferred_shape` | Human-shaped hedged/grudging (dis)agreement |
| `boke_bait` | Plant a correctable absurdity inviting the audience retort |
| `tsukkomi` | Sharp but clearly playful retort to the absurd part |
| `withheld_uptake` | Deadpan past the expected reaction once |
| `status_seesaw` | Brief confident stance, immediately self-mocked |
| `response_length_violation` | Strikingly short reply where a paragraph was expected |

Noise does not import or depend on Manneri. If an app has external knowledge about the stream, pass it as plain `streamContext`; Noise treats it as ordinary runtime context, not as a package-specific integration.

This package does not depend on any LLM SDK.
You can use the built-in `@aituber-onair/chat` integration for OpenAI, OpenAI-compatible, Gemini, Claude, OpenRouter, xAI, Kimi, DeepSeek, Mistral, and Gemini Nano providers:

```ts
const contaminator = createContaminator({
  chat: {
    provider: 'claude',
    options: {
      apiKey: process.env.CLAUDE_API_KEY!,
      model: 'claude-3-5-haiku-latest',
    },
  },
});
```

You can also use a custom adapter:

```ts
const contaminator = createContaminator({
  model: {
    async generate({ system, prompt }) {
      const response = await fetch('/api/rewrite', {
        method: 'POST',
        // Declare the JSON body so the server parses it correctly.
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ system, prompt }),
      });
      // Surface HTTP errors instead of parsing an error page as a rewrite.
      if (!response.ok) {
        throw new Error(`Rewrite request failed: ${response.status}`);
      }
      const json = await response.json();
      return json.text;
    },
  },
});
```

If none of `chat`, `llm`, or `model` is provided, `contaminate()` throws. Noise no longer falls back to local rule-based rewriting because that can change a character's personality too easily.

## Safety

By default, code blocks, URLs, and numbers are protected before rewriting and restored after rewriting. The safety guard also avoids mutating high-stakes medical, legal, and financial text.

The purpose of this package is not to make the AI more human or to break facts. It only disturbs the way a reply lands when it is becoming too predictable.

## Quality Report

Every rewrite returns a `quality` report:

```ts
if (!result.quality.passed) {
  console.warn(result.quality.issues);
}
```

The report is intentionally conservative. It flags outputs that are still too predictable, that are too aggressive for the character, that over-explain the noise, or that add details not present in the draft or recent conversation.

## Adaptive Memory

Noise can keep a small memory of predictable response patterns. The memory does not store the full conversation by default. It tracks repeated closings, repeated phrases, the character's own recent responses (for the genericity penalty), recently used rewrite directives, and topic-level loops so later plans can avoid collapsing into the same style of rewrite.

It also persists the deviation orchestration state: the rhythm counters, the violation budget learned from reactions, and the gag ledger of memorable moments. Without a configured store, the same state still works in-memory for the lifetime of the contaminator instance, so the rhythm controller and reaction loop function out of the box.

The root package exports an environment-independent in-memory store:

```ts
import {
  InMemoryNoiseMemoryStore,
  createContaminator,
} from '@aituber-onair/noise';

const store = new InMemoryNoiseMemoryStore();

const contaminator = createContaminator({
  memory: {
    scopeId: 'stream-session',
    store,
  },
});
```

For browsers, import the web provider:

```ts
import { LocalStorageNoiseMemoryStore } from '@aituber-onair/noise/web';

const store = new LocalStorageNoiseMemoryStore();
```

For Node.js, import the node provider:

```ts
import { JsonFileNoiseMemoryStore } from '@aituber-onair/noise/node';

const store = new JsonFileNoiseMemoryStore({
  filePath: './noise-memory.json',
});
```

`detectNoiseRuntime()` can detect `browser`, `node`, or `unknown`, but the recommended production style is to import `@aituber-onair/noise/web` or `@aituber-onair/noise/node` explicitly. This keeps browser bundles from pulling in Node.js modules.

## Streaming

`createContaminationStream()` uses the Web-standard `TransformStream` API. The current MVP buffers the full text and contaminates it on flush so the engine can rewrite with enough context.
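
The buffer-then-rewrite-on-flush behavior described above can be sketched with the standard `TransformStream` constructor. This is a minimal illustration and not the source of `createContaminationStream()`: `sketchBufferedRewriteStream` and its `rewrite` callback are hypothetical names, and in a real app the callback would presumably wrap a call such as `contaminator.contaminate(...)`.

```typescript
// Illustrative sketch of "buffer everything, rewrite once on flush".
// The function name and the rewrite callback are assumptions of this sketch.
function sketchBufferedRewriteStream(
  rewrite: (fullText: string) => Promise<string>,
): TransformStream<string, string> {
  let buffer = '';
  return new TransformStream<string, string>({
    // Collect chunks without emitting anything yet.
    transform(chunk) {
      buffer += chunk;
    },
    // When the writer closes, rewrite the whole text and emit it once.
    async flush(controller) {
      controller.enqueue(await rewrite(buffer));
    },
  });
}
```

The trade-off is latency: nothing reaches the reader until the upstream text is complete, which is what gives the rewrite step the full reply as context.
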