# Expo AI Runtime

## Compact Implementation Architecture

## 1. Product Definition

Expo AI Runtime is a mobile-native AI provider layer for Expo and React Native.

It gives app developers one TypeScript API for:

- Apple Foundation Models
- Android Gemini Nano through AICore / ML Kit GenAI
- Local bring-your-own-model runtimes such as LiteRT-LM
- Optional cloud model fallback
- Streaming
- Sessions
- Structured output
- Capability detection
- Privacy-aware provider routing

This should not start as a full agent framework. The first version should solve the hardest mobile-native problem: safely calling the best available native or local model from a React Native app.

---

## 2. Positioning

### Short positioning

A mobile-native AI runtime for Expo and React Native.

### Expanded positioning

Expo AI Runtime unifies system on-device models, local custom models, and cloud AI providers behind one TypeScript API, with native capability detection and privacy-aware fallback routing.

### What it is not in v1

- Not a full LangChain replacement
- Not a full agent framework
- Not a vector database
- Not a hosted AI backend
- Not a model marketplace at launch

### What it may become

A React Native-native equivalent to parts of LangChain, Vercel AI SDK, and Genkit, but focused on:

- iOS Foundation Models
- Android AICore / Gemini Nano
- Local model execution
- Mobile device capability detection
- Offline-first AI behavior
- App privacy boundaries

---

## 3. Core Architecture

```mermaid
flowchart TD
  App[Expo / React Native App] --> API[ExpoAI TypeScript API]

  API --> Router[Provider Router]
  API --> Capabilities[Capability Registry]
  API --> Sessions[Session Manager]
  API --> Schema[Structured Output Layer]
  API --> Privacy[Privacy Metadata]
  API --> Errors[Normalized Error Layer]

  Router --> IOS[iOS Native Adapter]
  Router --> Android[Android Native Adapter]
  Router --> Local[Local BYOM Adapter]
  Router --> Cloud[Cloud Fallback Adapter]

  IOS --> AppleFM[Apple Foundation Models]
  Android --> AICore[ML Kit GenAI Prompt API]
  AICore --> GeminiNano[Gemini Nano via AICore]
  Local --> LiteRT[LiteRT-LM]
  Cloud --> Backend[Private AI Backend]
  Backend --> Providers[OpenAI / Gemini / Anthropic / Bedrock]
```

---

## 4. Package Structure

Use a plugin-oriented architecture so the core stays small.

```txt
packages/
  expo-ai-core/
    src/
      index.ts
      ExpoAI.ts
      types.ts
      errors.ts
      provider-router.ts
      capability-registry.ts
      session-manager.ts
      structured-output.ts
      privacy.ts

  expo-ai-apple-foundation-models/
    ios/
      ExpoAIAppleModule.swift
      AppleFoundationModelsAdapter.swift
      AppleSessionStore.swift
      AppleErrorMapper.swift
    src/
      index.ts

  expo-ai-android-aicore/
    android/
      src/main/java/expo/modules/ai/aicore/
        ExpoAICoreModule.kt
        AICoreGeminiNanoAdapter.kt
        AndroidSessionStore.kt
        AndroidErrorMapper.kt
    src/
      index.ts

  expo-ai-litert-lm/
    ios/
    android/
    src/
      index.ts
      model-manager.ts

  expo-ai-cloud/
    src/
      index.ts
      cloud-provider.ts

  expo-ai-evals/
    src/
      eval-runner.ts
    fixtures/
```

For an MVP, this can begin as a monorepo with only:

```txt
expo-ai-core
expo-ai-apple-foundation-models
expo-ai-android-aicore
expo-ai-cloud
```

Defer `expo-ai-litert-lm` until after the system-model bridge works.

---

## 5. Provider Model

```ts
export type ExpoAIProvider =
  | 'system-preferred'
  | 'apple-foundation-models'
  | 'apple-private-cloud-compute'
  | 'android-aicore-gemini-nano'
  | 'litert-lm'
  | 'cloud'
  | 'none';
```

Default provider priority:

```ts
export const defaultProviderPriority: ExpoAIProvider[] = [
  'apple-foundation-models',
  'apple-private-cloud-compute',
  'android-aicore-gemini-nano',
  'litert-lm',
  'cloud',
];
```

Provider categories:

| Category      | Provider                     | Purpose                                  |
| ------------- | ---------------------------- | ---------------------------------------- |
| System model  | Apple Foundation Models      | Preferred iOS native model path          |
| System model  | Android AICore / Gemini Nano | Preferred Android native model path      |
| Private cloud | Apple Private Cloud Compute  | Apple-managed larger model path          |
| Local BYOM    | LiteRT-LM                    | Downloaded or bundled local models       |
| Cloud         | Private backend              | Reliable fallback and advanced reasoning |

---

## 6. Public TypeScript API

The public API should be simple and capability-first.

```ts
import { ExpoAI } from '@expo-ai/core';

const result = await ExpoAI.generate({
  prompt: 'Summarize this note in five bullets.',
  fallback: 'cloud',
});

console.log(result.text);
```

### Core API

```ts
// Type arguments are inferred from types used elsewhere in this document;
// ExpoAIAvailability and GenerateObjectResult are placeholder names.
export declare namespace ExpoAI {
  export function getAvailability(): Promise<ExpoAIAvailability>;
  export function getCapabilities(): Promise<ExpoAICapabilities>;
  export function listProviders(): Promise<ExpoAIProvider[]>;

  export function generate(options: GenerateOptions): Promise<GenerateResult>;
  export function stream(options: GenerateOptions): AsyncIterable<ExpoAIStreamEvent>;

  export function createSession(options?: CreateSessionOptions): Promise<ExpoAISession>;

  export function generateObject(options: GenerateObjectOptions): Promise<GenerateObjectResult>;

  export function summarize(options: SummarizeOptions): Promise<GenerateResult>;
  export function rewrite(options: RewriteOptions): Promise<GenerateResult>;
  export function proofread(options: ProofreadOptions): Promise<GenerateResult>;
}
```

---

## 7. Capability Detection

Every app-facing feature should check capabilities at runtime.
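In app code, that check could be a small gate evaluated before a feature's UI is rendered. The sketch below is illustrative only: `CapabilityFlags` is a trimmed subset of the capability fields defined in this section, and `FeatureRequirement`, `canShowFeature`, and `extractScreenNeeds` are hypothetical names, not part of the proposed API.

```typescript
// Subset of the capability flags defined in this section.
type CapabilityFlags = {
  available: boolean;
  isOnDevice: boolean;
  supportsStreaming: boolean;
  supportsStructuredOutput: boolean;
};

// Hypothetical: the flags a feature needs before its UI is shown.
type FeatureRequirement = Array<keyof Omit<CapabilityFlags, 'available'>>;

// Returns true only when a provider is available and every required flag is set.
function canShowFeature(caps: CapabilityFlags, required: FeatureRequirement): boolean {
  if (!caps.available) return false;
  return required.every((flag) => caps[flag]);
}

// Example: an "extract fields" screen that needs on-device structured output.
const extractScreenNeeds: FeatureRequirement = ['isOnDevice', 'supportsStructuredOutput'];
```

A screen would call `canShowFeature` with the result of `ExpoAI.getCapabilities()` and hide or disable the feature when it returns `false`. The full capability shape follows.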
```ts
export type ExpoAICapabilities = {
  available: boolean;
  provider: ExpoAIProvider;

  isOnDevice: boolean;
  isSystemManagedModel: boolean;
  sendsPromptOffDevice: boolean;

  supportsTextGeneration: boolean;
  supportsStreaming: boolean;
  supportsSessions: boolean;
  supportsStructuredOutput: boolean;
  supportsTools: boolean;
  supportsImageInput: boolean;
  supportsSpeechInput: boolean;
  supportsSummarization: boolean;
  supportsRewrite: boolean;
  supportsProofreading: boolean;
  supportsBringYourOwnModel: boolean;
  supportsModelDownload: boolean;

  contextWindow?: number;
  reasonUnavailable?: ExpoAIUnavailableReason;
};
```

Unavailable reasons:

```ts
export type ExpoAIUnavailableReason =
  | 'unsupported_os_version'
  | 'unsupported_device'
  | 'model_not_downloaded'
  | 'model_initializing'
  | 'apple_intelligence_disabled'
  | 'aicore_unavailable'
  | 'aicore_initializing'
  | 'unsupported_bootloader_state'
  | 'missing_dependency'
  | 'provider_not_configured'
  | 'unknown';
```

---

## 8. Session Model

The Expo layer owns the cross-platform session abstraction.
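Where a provider has no native session, owning the abstraction means the TypeScript layer has to emulate one. A minimal sketch of that idea, assuming an emulated session simply replays the instructions and earlier turns as a prompt prefix; `Turn`, `buildEmulatedPrompt`, and the `User:` / `Model:` line format are hypothetical choices made for illustration.

```typescript
// Hypothetical record of one exchange kept by an emulated session.
type Turn = { role: 'user' | 'model'; text: string };

// Builds a single prompt string that carries instructions and prior turns,
// for providers that only accept a flat prompt.
function buildEmulatedPrompt(
  instructions: string | undefined,
  history: Turn[],
  prompt: string,
): string {
  const lines: string[] = [];
  if (instructions) lines.push(`Instructions: ${instructions}`);
  for (const turn of history) {
    lines.push(`${turn.role === 'user' ? 'User' : 'Model'}: ${turn.text}`);
  }
  lines.push(`User: ${prompt}`);
  return lines.join('\n');
}
```

An emulated session would append each completed exchange to `history`, and `reset()` would clear it. A real implementation would also need to trim old turns to stay inside the provider's context window.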
```ts
export type CreateSessionOptions = {
  instructions?: string;
  provider?: ExpoAIProvider;
  fallback?: 'none' | 'cloud' | 'any';
  temperature?: number;
  maxOutputTokens?: number;
  metadata?: Record<string, unknown>;
};
```

```ts
// Type arguments are inferred from types used elsewhere in this document;
// GenerateObjectResult is a placeholder name.
export type ExpoAISession = {
  id: string;
  provider: ExpoAIProvider;

  generate(options: SessionGenerateOptions): Promise<GenerateResult>;
  stream(options: SessionGenerateOptions): AsyncIterable<ExpoAIStreamEvent>;
  generateObject(options: SessionGenerateObjectOptions): Promise<GenerateObjectResult>;

  reset(): Promise<void>;
  dispose(): Promise<void>;
};
```

Native mapping:

| Concept           | iOS                              | Android                    | Cloud               |
| ----------------- | -------------------------------- | -------------------------- | ------------------- |
| Session           | Native Foundation Models session | Native or emulated session | Conversation/thread |
| Instructions      | Native instructions              | Prompt prefix              | System message      |
| Streaming         | Native if available              | Native or emulated         | Provider stream     |
| Structured output | Native if available              | JSON prompt + validation   | JSON mode/tools     |
| Tools             | Native if available              | Emulated                   | Provider tools      |

---

## 9. Native Module Boundary

Keep native methods primitive. Put ergonomics in TypeScript.
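One example of that split is streaming: the native side only emits events, and TypeScript turns them into an async iterable. The sketch below is an assumption about how such a wrapper might look, not the module's actual code. `StreamEvent` is a simplified stand-in for the stream event type in this section, and `Subscribe` and `tokensFromEvents` are hypothetical names.

```typescript
// Simplified stand-in for the native stream events described in this section.
type StreamEvent =
  | { type: 'token'; text: string }
  | { type: 'done' }
  | { type: 'error'; message: string };

// Hypothetical subscription function: registers a listener, returns an unsubscribe callback.
type Subscribe = (listener: (event: StreamEvent) => void) => () => void;

// Converts pushed events into a pull-based async stream of text chunks.
async function* tokensFromEvents(subscribe: Subscribe): AsyncGenerator<string> {
  const queue: StreamEvent[] = [];
  let wake: (() => void) | null = null;

  const unsubscribe = subscribe((event) => {
    queue.push(event);
    if (wake) {
      wake();
      wake = null;
    }
  });

  try {
    while (true) {
      // Wait until the listener has queued at least one event.
      if (queue.length === 0) {
        await new Promise<void>((resolve) => {
          wake = resolve;
        });
      }
      const event = queue.shift()!;
      if (event.type === 'token') yield event.text;
      else if (event.type === 'error') throw new Error(event.message);
      else return;
    }
  } finally {
    // Runs on completion, error, or early exit from a for-await loop.
    unsubscribe();
  }
}
```

The `finally` block is where a real wrapper would also call the native cancel method, so that a consumer breaking out of the loop stops generation on the native side.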
```ts
// Type arguments are inferred from types used elsewhere in this document;
// ExpoAIAvailability is a placeholder name.
NativeExpoAI.getAvailability(): Promise<ExpoAIAvailability>;

NativeExpoAI.getCapabilities(): Promise<ExpoAICapabilities>;

NativeExpoAI.generate(
  options: NativeGenerateOptions
): Promise<GenerateResult>;

NativeExpoAI.createSession(
  options: NativeCreateSessionOptions
): Promise<{ sessionId: string; provider: ExpoAIProvider }>;

NativeExpoAI.generateInSession(
  sessionId: string,
  options: NativeGenerateOptions
): Promise<GenerateResult>;

NativeExpoAI.disposeSession(
  sessionId: string
): Promise<void>;
```

Streaming should use native events:

```ts
NativeExpoAI.startStreaming(
  requestId: string,
  options: NativeGenerateOptions
): Promise<void>;

NativeExpoAI.cancelStreaming(
  requestId: string
): Promise<void>;
```

Event shape:

```ts
export type ExpoAIStreamEvent =
  | { requestId: string; type: 'start' }
  | { requestId: string; type: 'token'; text: string }
  | { requestId: string; type: 'error'; error: ExpoAIError }
  | { requestId: string; type: 'done'; result: GenerateResult };
```

---

## 10. iOS Adapter

Primary native provider:

```txt
Apple Foundation Models
```

Responsibilities:

- Check OS and model availability.
- Detect Apple Intelligence disabled or unavailable states.
- Create and store native sessions.
- Support generation.
- Support streaming where available.
- Support structured output where available.
- Map native errors into `ExpoAIError`.
- Return privacy metadata.

Implementation files:

```txt
ios/
  ExpoAIAppleModule.swift
  AppleFoundationModelsAdapter.swift
  AppleSessionStore.swift
  AppleErrorMapper.swift
```

Pseudo-shape:

```swift
import ExpoModulesCore

public class ExpoAIAppleModule: Module {
  public func definition() -> ModuleDefinition {
    Name("ExpoAIApple")

    // Reports whether the system model can be used on this device right now.
    AsyncFunction("getAvailability") { () async throws in
      return try await AppleFoundationModelsAdapter.getAvailability()
    }

    // Runs a single generation request through the adapter.
    AsyncFunction("generate") { (options: [String: Any]) async throws in
      return try await AppleFoundationModelsAdapter.generate(options)
    }
  }
}
```

---

## 11. Android Adapter

Primary native provider:

```txt
ML Kit GenAI Prompt API
Gemini Nano
AICore
```

Responsibilities:

- Check AICore readiness.
- Detect unsupported devices.
- Detect initializing / model unavailable states.
- Create or emulate sessions.
- Support prompt-based generation.
- Support task helpers such as summarize/rewrite/proofread where available.
- Map native errors into `ExpoAIError`.
- Return privacy metadata.

Implementation files:

```txt
android/
  src/main/java/expo/modules/ai/aicore/
    ExpoAICoreModule.kt
    AICoreGeminiNanoAdapter.kt
    AndroidSessionStore.kt
    AndroidErrorMapper.kt
```

Pseudo-shape:

```kotlin
import expo.modules.kotlin.modules.Module
import expo.modules.kotlin.modules.ModuleDefinition

class ExpoAICoreModule : Module() {
  override fun definition() = ModuleDefinition {
    Name("ExpoAICore")

    // Reports whether AICore and the model are ready, initializing, or unsupported.
    AsyncFunction("getAvailability") {
      AICoreGeminiNanoAdapter.getAvailability(appContext.reactContext)
    }

    // Runs a single generation request through the adapter.
    AsyncFunction("generate") { options: Map<String, Any> ->
      AICoreGeminiNanoAdapter.generate(appContext.reactContext, options)
    }
  }
}
```

Android should never assume AICore is ready. It should return a retryable unavailable state when the model or service is still initializing.

---

## 12. LiteRT-LM Adapter

LiteRT-LM should be the preferred bring-your-own-model runtime, but not part of the first MVP unless the system provider bridge is already stable.

Responsibilities:

- Manage model files.
- Verify model compatibility.
- Handle downloads and checksums.
- Initialize runtime.
- Run inference.
- Surface memory and performance errors.
- Support provider metadata.

Model config:

```ts
export type LocalModelConfig = {
  id: string;
  runtime: 'litert-lm';
  source: 'bundled' | 'remote' | 'huggingface' | 'file';
  uri?: string;
  checksum?: string;
  sizeBytes?: number;
  license?: string;
};
```

Usage:

```ts
await ExpoAI.generate({
  provider: 'litert-lm',
  model: {
    id: 'google/gemma-mobile',
    runtime: 'litert-lm',
    source: 'huggingface',
  },
  prompt: 'Summarize this document.',
});
```

---

## 13. Cloud Fallback

Cloud fallback should exist in v1 as an adapter interface, even if the implementation starts as a private backend stub.
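One possible reading of "adapter interface plus stub" is sketched here. All names (`CloudRequest`, `CloudResponse`, `CloudAdapter`, `createStubCloudAdapter`) are hypothetical and exist only for illustration; the real request and result types would come from the core package.

```typescript
// Hypothetical request and response shapes for the cloud path.
type CloudRequest = { prompt: string };
type CloudResponse = { text: string; provider: 'cloud' };

// Hypothetical contract the provider router would depend on.
interface CloudAdapter {
  isConfigured(): boolean;
  generate(request: CloudRequest): Promise<CloudResponse>;
}

// Stub that satisfies the contract without any network access,
// so routing and fallback logic can be built and tested first.
function createStubCloudAdapter(cannedText: string): CloudAdapter {
  return {
    isConfigured: () => true,
    generate: async (_request) => ({ text: cannedText, provider: 'cloud' }),
  };
}
```

Swapping the stub for a real backend client later would then only touch the adapter, not the router or the public API.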
```ts
export type CloudProviderConfig = {
  endpoint: string;
  headers?: Record<string, string>;
  provider?: 'openai' | 'gemini' | 'anthropic' | 'bedrock' | 'custom';
};
```

Fallback behavior:

```ts
await ExpoAI.generate({
  prompt: 'Extract the risks from this proposal.',
  fallback: 'cloud',
});
```

Routing rules:

1. Try the requested provider.
2. If unavailable and fallback is allowed, try the next provider.
3. If prompt is sensitive and cloud fallback is disabled, fail locally.
4. Return provider metadata with every result.

---

## 14. Privacy Model

Every result should identify where processing happened.

```ts
export type ExpoAIPrivacyInfo = {
  provider: ExpoAIProvider;
  isOnDevice: boolean;
  sendsPromptOffDevice: boolean;
  privacyMode:
    | 'on-device'
    | 'apple-private-cloud-compute'
    | 'third-party-cloud'
    | 'unknown';
};
```

Recommended UI copy:

| Mode                        | Copy                                            |
| --------------------------- | ----------------------------------------------- |
| On-device                   | Processed privately on this device.             |
| Apple Private Cloud Compute | Processed using Apple Private Cloud Compute.    |
| Third-party cloud           | Processed using a configured cloud AI provider. |

Do not silently fall back to third-party cloud for sensitive workflows unless the app has disclosed and enabled this behavior.

---

## 15. Structured Output

Expose structured output from the start.
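For providers without native guided generation, this comes down to validating the model's text against the schema and retrying when it fails. The sketch below assumes a deliberately tiny validator that only checks required keys and string/array types; a real implementation would more likely use a full JSON Schema validator. `ObjectSchema`, `validateJson`, and `generateWithRepair` are hypothetical names.

```typescript
type FieldSchema = { type: 'string' } | { type: 'array' };
type ObjectSchema = { properties: Record<string, FieldSchema>; required: string[] };
type Validation =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; reason: string };

// Checks that raw model text parses to an object with the required, correctly typed fields.
function validateJson(raw: string, schema: ObjectSchema): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, reason: 'not valid JSON' };
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return { ok: false, reason: 'not a JSON object' };
  }
  const value = parsed as Record<string, unknown>;
  for (const key of schema.required) {
    if (!(key in value)) return { ok: false, reason: `missing required field "${key}"` };
  }
  for (const [key, field] of Object.entries(schema.properties)) {
    if (!(key in value)) continue;
    const typeOk = field.type === 'array' ? Array.isArray(value[key]) : typeof value[key] === 'string';
    if (!typeOk) return { ok: false, reason: `field "${key}" should be ${field.type}` };
  }
  return { ok: true, value };
}

// Calls the model, validates, and re-prompts with the failure reason up to maxAttempts times.
async function generateWithRepair(
  callModel: (prompt: string) => Promise<string>,
  prompt: string,
  schema: ObjectSchema,
  maxAttempts = 2,
): Promise<Record<string, unknown>> {
  let currentPrompt = prompt;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = validateJson(await callModel(currentPrompt), schema);
    if (result.ok) return result.value;
    currentPrompt = `${prompt}\n\nThe previous answer was rejected: ${result.reason}. Reply with JSON only.`;
  }
  throw new Error('structured output failed validation');
}
```

Capping the number of attempts keeps on-device latency bounded; once the cap is hit, the caller can decide whether a fallback provider is allowed.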
```ts
const result = await ExpoAI.generateObject({
  prompt: 'Extract project name, budget, timeline, and risks.',
  schema: {
    type: 'object',
    properties: {
      projectName: { type: 'string' },
      budget: { type: 'string' },
      timeline: { type: 'string' },
      risks: {
        type: 'array',
        items: { type: 'string' },
      },
    },
    required: ['projectName', 'timeline', 'risks'],
  },
  fallback: 'cloud',
});
```

Implementation strategy:

| Provider                | Strategy                                           |
| ----------------------- | -------------------------------------------------- |
| Apple Foundation Models | Native guided generation where available           |
| Android AICore          | JSON prompt + validation + repair retry            |
| LiteRT-LM               | JSON prompt + validation + repair retry            |
| Cloud                   | Provider-native JSON mode or tools where available |

---

## 16. Error Model

Normalize all provider errors.

```ts
export class ExpoAIError extends Error {
  code: ExpoAIErrorCode;
  provider: ExpoAIProvider;
  retryable: boolean;
  fallbackRecommended: boolean;
  nativeMessage?: string;
}
```

```ts
export type ExpoAIErrorCode =
  | 'UNAVAILABLE'
  | 'UNSUPPORTED_DEVICE'
  | 'MODEL_NOT_READY'
  | 'MODEL_DOWNLOAD_REQUIRED'
  | 'USER_SETTING_REQUIRED'
  | 'INVALID_PROMPT'
  | 'CONTEXT_WINDOW_EXCEEDED'
  | 'SAFETY_BLOCKED'
  | 'RATE_LIMITED'
  | 'CANCELLED'
  | 'TIMEOUT'
  | 'NATIVE_PROVIDER_ERROR'
  | 'UNKNOWN';
```

---

## 17. Evaluation Harness

Add an eval package early, even if it is simple.

Purpose:

- Compare provider output quality.
- Verify schema validity.
- Track latency.
- Track fallback frequency.
- Test privacy boundaries.
- Detect SDK regressions.

Structure:

```txt
expo-ai-evals/
  fixtures/
    summarize.json
    rewrite.json
    extract.json
    safety.json
  src/
    runEvalSuite.ts
    scoreSchemaValidity.ts
    compareProviders.ts
```

Eval result:

```ts
export type EvalResult = {
  provider: ExpoAIProvider;
  testName: string;
  passed: boolean;
  latencyMs: number;
  usedFallback: boolean;
  schemaValid?: boolean;
  errorCode?: ExpoAIErrorCode;
};
```

---

## 18. MVP Scope

Build first:

```txt
@expo-ai/core
@expo-ai/apple-foundation-models
@expo-ai/android-aicore
@expo-ai/cloud
```

MVP API:

```ts
ExpoAI.getAvailability();
ExpoAI.getCapabilities();
ExpoAI.listProviders();
ExpoAI.generate();
ExpoAI.createSession();
ExpoAI.generateObject();
ExpoAI.summarize();
ExpoAI.rewrite();
```

MVP features:

- iOS system model provider
- Android system model provider
- Cloud fallback adapter
- Capability detection
- Normalized errors
- Privacy metadata
- Basic structured output
- Basic streaming if native support is available

Defer:

- LiteRT-LM
- model catalog
- model downloads
- local RAG
- native tools
- agent loop
- long-term memory
- Genkit/LangChain bridges

---

## 19. Phase Roadmap

### Phase 1: System Provider MVP

Deliver:

- Core TypeScript API
- iOS Foundation Models adapter
- Android AICore / ML Kit adapter
- Capability detection
- Generate
- Session creation
- Error normalization
- Privacy metadata

### Phase 2: Cloud Fallback + Structured Output

Deliver:

- Private backend cloud adapter
- Provider routing
- `generateObject`
- JSON validation and repair loop
- Provider metadata on every result

### Phase 3: Streaming

Deliver:

- Native event streaming
- AsyncIterable wrapper
- Cancellation
- Timeout handling

### Phase 4: Task APIs

Deliver:

- summarize
- rewrite
- proofread
- image input where available
- provider-specific feature mapping

### Phase 5: LiteRT-LM BYOM

Deliver:

- LiteRT-LM provider
- model file manager
- remote model download
- checksum verification
- model compatibility metadata

### Phase 6: Local Context / RAG

Deliver:

- local search abstraction
- app-managed local index
- `generateWithContext`
- future platform-specific search adapters

### Phase 7: Tools + Agent Loop

Deliver:

- tool registry
- native iOS tool support where available
- JS-emulated Android tools
- tool-call validation
- loop limits
- permissions

### Phase 8: Backend Framework Bridges

Deliver optional integrations:

- Genkit backend bridge
- LangChain backend bridge
- Vercel AI SDK backend bridge

---

## 20. Key Engineering Principles

1. Capabilities over assumptions

   Every feature should be runtime-detected.

2. Providers over platforms

   iOS and Android can each support multiple providers.

3. Stable TypeScript API, replaceable native adapters

   Native SDKs will change faster than the public API.

4. Privacy metadata on every result

   The app should always know whether inference happened on-device or off-device.

5. Cloud fallback must be explicit

   Never silently send sensitive prompts to third-party cloud providers.

6. Start with generation, not agents

   Agents, tools, RAG, and memory should come after the provider layer is reliable.

7. BYOM is separate from system models

   Apple Foundation Models and Android AICore are system-model providers. LiteRT-LM is the first BYOM path.

---

## 21. First Build Checklist

### Repository

- Create monorepo.
- Add package workspaces.
- Add example Expo app.
- Add TypeScript build.
- Add native iOS module.
- Add native Android module.

### Core API

- Define provider types.
- Define capability types.
- Define error types.
- Implement provider router.
- Implement cloud fallback interface.
- Implement structured output validator.

### iOS

- Add Expo module.
- Add availability check.
- Add generate method.
- Add session store.
- Map native errors.

### Android

- Add Expo module.
- Add ML Kit / AICore dependency.
- Add availability check.
- Add generate method.
- Add retryable AICore initialization state.
- Map native errors.

### Example App

- Show capability card.
- Show generate prompt box.
- Show provider used.
- Show privacy mode.
- Show fallback state.
- Show error details.

### Tests

- Unit test provider routing.
- Unit test error mapping.
- Unit test schema validation.
- Add basic device integration tests.
- Add golden prompt fixtures.

---

## 22. Recommended v1 Name

Package naming options:

```txt
@expo-ai/core
@expo-ai/runtime
@stewmore/expo-ai-runtime
expo-ai-runtime
expo-local-ai
```

Best working name:

```txt
expo-ai-runtime
```

Why:

- Broader than “foundation models”
- Does not imply Apple-only behavior
- Allows cloud, local, and system providers
- Leaves room for future agent/runtime features

---

## 23. Summary

Build this as a mobile-native AI runtime, not a full agent framework.

The first milestone should prove:

- iOS can call Apple Foundation Models through Expo.
- Android can call Gemini Nano through AICore / ML Kit through Expo.
- The same TypeScript API works across both.
- Apps can detect capabilities before showing features.
- Results include provider and privacy metadata.
- Cloud fallback is explicit and controllable.
- Structured output works consistently enough for app features.

Once that foundation is stable, expand into LiteRT-LM, model downloads, local context, tools, and backend framework bridges.
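The "explicit and controllable" fallback milestone can be made concrete with a small routing function that follows the routing rules in section 13. This is a sketch under stated assumptions rather than the router's actual design: `RoutableProvider` is a trimmed provider list, and `RouteRequest`, `allowFallback`, `allowCloud`, and `selectProvider` are hypothetical names. Here `allowCloud` stands for "the app has disclosed and enabled cloud processing".

```typescript
type RoutableProvider =
  | 'apple-foundation-models'
  | 'android-aicore-gemini-nano'
  | 'litert-lm'
  | 'cloud';

// Hypothetical routing input.
type RouteRequest = {
  requested?: RoutableProvider; // undefined means "walk the priority list"
  allowFallback: boolean;
  allowCloud: boolean; // must be true before any prompt may leave the device
};

// Picks the first usable provider, or 'none' so the caller can fail locally.
function selectProvider(
  request: RouteRequest,
  priority: RoutableProvider[],
  isAvailable: (provider: RoutableProvider) => boolean,
): RoutableProvider | 'none' {
  const { requested } = request;
  // Rule 1: the requested provider goes first.
  // Rule 2: other providers are considered only when fallback is allowed.
  const candidates = requested
    ? [requested, ...(request.allowFallback ? priority.filter((p) => p !== requested) : [])]
    : priority;

  for (const candidate of candidates) {
    // Rule 3: never route to cloud unless it was explicitly enabled.
    if (candidate === 'cloud' && !request.allowCloud) continue;
    if (isAvailable(candidate)) return candidate;
  }
  return 'none';
}
```

Returning `'none'` instead of throwing leaves the caller free to map the outcome to a normalized error and to attach provider metadata either way (rule 4).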