[Back to AI Framework Overview](../../README.md) | [All Providers](../README.md)

# @memberjunction/ai-gemini

MemberJunction AI provider for Google Gemini models. Provides LLM, image generation, and embedding capabilities, supporting the Gemini 2.5, Gemini 3, and Flash model families with native multimodal support, thinking/reasoning, and streaming.

## Architecture

```mermaid
graph TD
    A["GeminiLLM<br/>(Chat Provider)"] -->|extends| B["BaseLLM<br/>(@memberjunction/ai)"]
    C["GeminiImageGenerator<br/>(Image Provider)"] -->|extends| D["BaseImageGenerator<br/>(@memberjunction/ai)"]
    A -->|wraps| E["GoogleGenAI<br/>(@google/genai)"]
    C -->|wraps| E
    A -->|provides| F["Chat + Streaming"]
    A -->|provides| G["Thinking/Reasoning<br/>Budget Control"]
    C -->|provides| H["Image Generation<br/>+ Editing + Variations"]
    B -->|registered via| I["@RegisterClass"]
    D -->|registered via| I

    style A fill:#7c5295,stroke:#563a6b,color:#fff
    style C fill:#7c5295,stroke:#563a6b,color:#fff
    style B fill:#2d6a9f,stroke:#1a4971,color:#fff
    style D fill:#2d6a9f,stroke:#1a4971,color:#fff
    style E fill:#2d8659,stroke:#1a5c3a,color:#fff
    style F fill:#b8762f,stroke:#8a5722,color:#fff
    style G fill:#b8762f,stroke:#8a5722,color:#fff
    style H fill:#b8762f,stroke:#8a5722,color:#fff
    style I fill:#b8762f,stroke:#8a5722,color:#fff
```

## Features

### LLM (GeminiLLM)

- **Chat Completions**: Full conversational AI with system instructions
- **Streaming**: Real-time response streaming with chunk processing
- **Thinking/Reasoning**: Configurable thinking budget for Gemini 2.5+ and thinking levels for Gemini 3+
- **Multimodal Input**: Native support for text, images, audio, video, and file inputs
- **Message Alternation**: Automatic handling of Gemini's role alternation requirements
- **Safety Handling**: Detection and reporting of content blocking with detailed safety ratings
- **Effort Level Mapping**: Maps MJ effort levels (1-100) to Gemini thinking budgets (0-24576)

### Image Generation (GeminiImageGenerator)

- **Text-to-Image**: Generate images using Gemini image models
- **Image Editing**: Edit existing images using multimodal context
- **Image Variations**: Create variations of existing images
- **Resolution Control**: Support for sizes up to 4K (3840x2160)
- **Style and Quality**: Configurable style and quality parameters

### Embeddings (GeminiEmbedding)

- **Multimodal Embeddings**: Map text, images, video, audio, and PDF into one shared 3072-dimensional vector space
- **Cross-Modal Retrieval**: Embed text and media together so a text query can match an image, audio, or video
- **Text and Batch**: Single and batch text embedding

## Installation

```bash
npm install @memberjunction/ai-gemini
```

## Usage

### Chat Completion

```typescript
import { GeminiLLM } from "@memberjunction/ai-gemini";

const llm = new GeminiLLM("your-google-api-key");

const result = await llm.ChatCompletion({
  model: "gemini-2.5-flash",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing." },
  ],
  temperature: 0.7,
});
```

### Streaming with Thinking

```typescript
// Reuses the `llm` instance created in the Chat Completion example.
const result = await llm.ChatCompletion({
  model: "gemini-2.5-pro",
  messages: [{ role: "user", content: "Solve this math problem step by step." }],
  effortLevel: "75",
  streaming: true,
  streamingCallbacks: {
    OnContent: (content) => process.stdout.write(content),
  },
});

console.log("Thinking:", result.data.choices[0].message.thinking);
```

### Image Generation

```typescript
import { GeminiImageGenerator } from "@memberjunction/ai-gemini";

const generator = new GeminiImageGenerator("your-google-api-key");

const result = await generator.GenerateImage({
  prompt: "A futuristic city at night",
  model: "gemini-3-pro-image-preview",
  size: "2048x2048",
});
```

### Embeddings

```typescript
import { GeminiEmbedding } from "@memberjunction/ai-gemini";

const embedding = new GeminiEmbedding("your-google-api-key");

// Text (3072-dim vector)
const text = await embedding.EmbedText({ text: "a golden retriever in the snow" });

// Multimodal: text + image fused into ONE vector (cross-modal retrieval)
const multimodal = await embedding.EmbedContent({
  content: [
    { type: "text", content: "product photo:" },
    { type: "image_url", content: "", mimeType: "image/png" },
  ],
});

console.log(multimodal.vector.length); // 3072
```

## Thinking Budget / Effort Level

The provider maps MJ effort levels to Gemini's thinking system:

| Effort Level | Gemini 2.5 (Budget) | Gemini 3+ (Level) |
|--------------|---------------------|-------------------|
| 1-5 (Flash only) | 0 (disabled) | MINIMAL |
| 1-33 | 1024-4096 | LOW |
| 34-66 | 4097-12288 | MEDIUM |
| 67-100 | 12289-24576 | HIGH |

## Supported Parameters

| Parameter | Supported | Notes |
|-----------|-----------|-------|
| temperature | Yes | Default 0.5 |
| topP | Yes | Nucleus sampling |
| topK | Yes | Top-K sampling |
| seed | Yes | Deterministic outputs |
| stopSequences | Yes | Custom stop sequences |
| effortLevel | Yes | Maps to thinking budget/level |
| responseFormat | Yes | JSON and text modes |
| streaming | Yes | Real-time streaming |
| frequencyPenalty | No | Not supported by Gemini |
| presencePenalty | No | Not supported by Gemini |
| minP | No | Not supported by Gemini |

## Class Registration

- `GeminiLLM` -- Registered via `@RegisterClass(BaseLLM, 'GeminiLLM')`
- `GeminiImageGenerator` -- Registered via `@RegisterClass(BaseImageGenerator, 'GeminiImageGenerator')`
- `GeminiEmbedding` -- Registered via `@RegisterClass(BaseEmbeddings, 'GeminiEmbedding')`

## Dependencies

- `@memberjunction/ai` - Core AI abstractions
- `@memberjunction/global` - Class registration
- `@google/genai` - Google GenAI SDK
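
## Effort Level Mapping Sketch

The effort-level table in the Thinking Budget / Effort Level section can be read as a pair of lookup functions. The following is a minimal sketch, not the provider's actual implementation. The band boundaries come from the table. The helper names (`effortToThinkingBudget`, `effortToThinkingLevel`), the `isFlash` flag, and the linear interpolation inside each band are assumptions made for illustration, and the real provider may pick budgets differently.

```typescript
// Hypothetical helpers illustrating the effort-level table.
// Band edges are taken from the README table; interpolating linearly
// inside a band is an ASSUMPTION, not confirmed provider behavior.
type ThinkingLevel = "MINIMAL" | "LOW" | "MEDIUM" | "HIGH";

interface EffortBand {
  minEffort: number;
  maxEffort: number;
  minBudget: number;
  maxBudget: number;
  level: ThinkingLevel;
}

const EFFORT_BANDS: EffortBand[] = [
  { minEffort: 1, maxEffort: 33, minBudget: 1024, maxBudget: 4096, level: "LOW" },
  { minEffort: 34, maxEffort: 66, minBudget: 4097, maxBudget: 12288, level: "MEDIUM" },
  { minEffort: 67, maxEffort: 100, minBudget: 12289, maxBudget: 24576, level: "HIGH" },
];

// Effort levels are passed as strings in the usage examples (e.g. "75"),
// so parse first and clamp to the documented 1-100 range.
function parseEffort(effortLevel: string): number {
  const parsed = Number.parseInt(effortLevel, 10);
  if (Number.isNaN(parsed)) {
    throw new Error(`Invalid effort level: ${effortLevel}`);
  }
  return Math.min(100, Math.max(1, parsed));
}

function findBand(effort: number): EffortBand {
  const band = EFFORT_BANDS.find(
    (b) => effort >= b.minEffort && effort <= b.maxEffort,
  );
  if (!band) {
    throw new Error(`No band for effort level ${effort}`);
  }
  return band;
}

// Gemini 2.5 style: numeric thinking budget (0 disables thinking).
function effortToThinkingBudget(effortLevel: string, isFlash: boolean): number {
  const effort = parseEffort(effortLevel);
  if (isFlash && effort <= 5) {
    return 0; // "1-5 (Flash only)" row: thinking disabled
  }
  const band = findBand(effort);
  const fraction = (effort - band.minEffort) / (band.maxEffort - band.minEffort);
  return Math.round(band.minBudget + fraction * (band.maxBudget - band.minBudget));
}

// Gemini 3+ style: discrete thinking level.
function effortToThinkingLevel(effortLevel: string, isFlash: boolean): ThinkingLevel {
  const effort = parseEffort(effortLevel);
  if (isFlash && effort <= 5) {
    return "MINIMAL";
  }
  return findBand(effort).level;
}
```

Under these assumptions, an `effortLevel` of `"75"` lands in the 67-100 band and resolves to the `HIGH` level, while a Flash model at effort `"3"` gets a budget of 0. Keeping the bands in a single array means both the numeric budget and the discrete level are derived from the same table.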