# Gemma

This folder is organized into several categories, each focusing on a specific aspect of working with Gemma models:

* [Inference and serving](#inference-and-serving) : How to load, run and deploy Gemma models for inference
* [Prompting](#prompting) : Explore various prompting techniques
* [RAG (Retrieval Augmented Generation)](#rag) : How to build RAG systems with Gemma
* [Finetuning](#finetuning) : Dive into finetuning Gemma models for specific tasks and domains
* [Alignment](#alignment) : Techniques for aligning Gemma models
* [Evaluation](#evaluation) : How to evaluate Gemma models
* [Agentic AI](#agentic-ai) : How to build intelligent agents using Gemma models

## Inference and serving

| Notebook Name | Description |
| :--- | --- |
| [[Gemma_1]Basics_with_HF.ipynb]([Gemma_1]Basics_with_HF.ipynb) | Load, run, finetune and deploy Gemma using [Hugging Face](https://huggingface.co/). |
| [[Gemma_1]Common_use_cases.ipynb]([Gemma_1]Common_use_cases.ipynb) | Illustrate some common use cases for Gemma. |
| [[Gemma_1]Inference with Flax/NNX](https://flax.readthedocs.io/en/latest/guides/gemma.html) | Gemma 1 inference with the Flax/NNX framework (links to the Flax documentation). |
| [[Gemma_1]Inference_on_TPU.ipynb]([Gemma_1]Inference_on_TPU.ipynb) | Basic inference of Gemma with JAX/Flax on TPU. |
| [[Gemma_1]Using_with_Ollama.ipynb]([Gemma_1]Using_with_Ollama.ipynb) | Run Gemma models using [Ollama](https://www.ollama.com/). |
| [[Gemma_1]Using_with_OneTwo.ipynb]([Gemma_1]Using_with_OneTwo.ipynb) | Integrate Gemma with [Google OneTwo](https://github.com/google-deepmind/onetwo). |
| [[Gemma_1]data_parallel_inference_in_jax_tpu.ipynb]([Gemma_1]data_parallel_inference_in_jax_tpu.ipynb) | Parallel inference of Gemma with JAX/Flax on TPU. |
| [[Gemma_2]Constrained_generation.ipynb]([Gemma_2]Constrained_generation.ipynb) | Constrained generation with Gemma models using [LlamaCpp](https://github.com/abetlen/llama-cpp-python/) and [Guidance](https://github.com/guidance-ai/guidance/tree/main/). |
| [[Gemma_2]Deploy_in_Vertex_AI.ipynb]([Gemma_2]Deploy_in_Vertex_AI.ipynb) | Deploy a Gemma model using [Vertex AI](https://cloud.google.com/vertex-ai). |
| [[Gemma_2]Deploy_with_vLLM.ipynb]([Gemma_2]Deploy_with_vLLM.ipynb) | Deploy a Gemma model using [vLLM](https://github.com/vllm-project/vllm). |
| [[Gemma_2]Game_Design_Brainstorming.ipynb]([Gemma_2]Game_Design_Brainstorming.ipynb) | Use Gemma to brainstorm ideas during game design using Keras. |
| [[Gemma_2]Gradio_Chatbot.ipynb]([Gemma_2]Gradio_Chatbot.ipynb) | Build a chatbot with Gemma and Gradio. |
| [[Gemma_2]Guess_the_word.ipynb]([Gemma_2]Guess_the_word.ipynb) | Play a word guessing game with Gemma using Keras. |
| [[Gemma_2]Keras_Quickstart.ipynb]([Gemma_2]Keras_Quickstart.ipynb) | Gemma 2 pre-trained 9B model quickstart tutorial with Keras. |
| [[Gemma_2]Keras_Quickstart_Chat.ipynb]([Gemma_2]Keras_Quickstart_Chat.ipynb) | Gemma 2 instruction-tuned 9B model quickstart tutorial with Keras. Referenced in this [blog](https://developers.googleblog.com/en/fine-tuning-gemma-2-with-keras-hugging-face-update/). |
| [[Gemma_2]Synthetic_data_generation.ipynb]([Gemma_2]Synthetic_data_generation.ipynb) | Synthetic data generation with Gemma 2. |
| [[Gemma_2]Using_Gemini_and_Gemma_with_RouteLLM.ipynb]([Gemma_2]Using_Gemini_and_Gemma_with_RouteLLM.ipynb) | Route Gemma and Gemini models using [RouteLLM](https://github.com/lm-sys/RouteLLM/). |
| [[Gemma_2]Using_with_LLM_Comparator.ipynb](Gemma/[Gemma_2]Using_with_LLM_Comparator.ipynb) | Compare Gemma with another LLM using [LLM Comparator](https://github.com/pair-code/llm-comparator/). |
| [[Gemma_2]Using_with_Langfun_and_LlamaCpp.ipynb]([Gemma_2]Using_with_Langfun_and_LlamaCpp.ipynb) | Leverage [Langfun](https://github.com/google/langfun) to seamlessly integrate natural language with programming using Gemma 2 and [LlamaCpp](https://github.com/ggerganov/llama.cpp). |
| [[Gemma_2]Using_with_Langfun_and_LlamaCpp_Python_Bindings.ipynb]([Gemma_2]Using_with_Langfun_and_LlamaCpp_Python_Bindings.ipynb) | Leverage [Langfun](https://github.com/google/langfun) for smooth language-program interaction with Gemma 2 and [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). |
| [[Gemma_2]Using_with_LlamaCpp.ipynb]([Gemma_2]Using_with_LlamaCpp.ipynb) | Run Gemma models using [LlamaCpp](https://github.com/abetlen/llama-cpp-python/). |
| [[Gemma_2]Using_with_Llamafile.ipynb]([Gemma_2]Using_with_Llamafile.ipynb) | Run Gemma models using [Llamafile](https://github.com/Mozilla-Ocho/llamafile/). |
| [[Gemma_2]Using_with_LocalGemma.ipynb]([Gemma_2]Using_with_LocalGemma.ipynb) | Run Gemma models using [Local Gemma](https://github.com/huggingface/local-gemma/). |
| [[Gemma_2]Using_with_Mesop.ipynb]([Gemma_2]Using_with_Mesop.ipynb) | Integrate Gemma with [Google Mesop](https://google.github.io/mesop/). |
| [[Gemma_2]Using_with_Ollama_Python.ipynb]([Gemma_2]Using_with_Ollama_Python.ipynb) | Run Gemma models using [Ollama Python library](https://github.com/ollama/ollama-python). |
| [[Gemma_2]Using_with_SGLang.ipynb]([Gemma_2]Using_with_SGLang.ipynb) | Run Gemma models using [SGLang](https://github.com/sgl-project/sglang/). |
| [[Gemma_2]Using_with_Xinference.ipynb]([Gemma_2]Using_with_Xinference.ipynb) | Run Gemma models using [Xinference](https://github.com/xorbitsai/inference/). |
| [[Gemma_2]Using_with_mistral_rs.ipynb]([Gemma_2]Using_with_mistral_rs.ipynb) | Run Gemma models using [mistral.rs](https://github.com/EricLBuehler/mistral.rs/). |
| [[Gemma_2]for_Japan_using_Transformers_and_PyTorch.ipynb]([Gemma_2]for_Japan_using_Transformers_and_PyTorch.ipynb) | [Gemma 2 for Japan](https://blog.google/intl/ja-jp/company-news/technology/gemma-2-2b/) |
| [[Gemma_2]on_Groq.ipynb]([Gemma_2]on_Groq.ipynb) | Leverage the free Gemma 2 9B IT model hosted on [Groq](https://groq.com/) (super fast speed). |
| [[Gemma_3]Inference_images_and_videos.ipynb]([Gemma_3]Inference_images_and_videos.ipynb) | Inference on images and videos using the Gemma 3 4B IT model. |
| [[Gemma_3]Using_with_Ollama_Python_Inference_with_Images.ipynb]([Gemma_3]Using_with_Ollama_Python_Inference_with_Images.ipynb) | Run inference with images on Gemma 3 using [Ollama Python library](https://github.com/ollama/ollama-python). |
| [[Gemma_3]Using_with_Transformersjs.ipynb]([Gemma_3]Using_with_Transformersjs.ipynb) | Run Gemma 3 with [Transformers.js](https://github.com/huggingface/transformers.js). |
| [[Gemma_3]Activation_Hacking.ipynb]([Gemma_3]Activation_Hacking.ipynb) | Examine and modify internal states, including the residual stream, MLP activations, and attention mechanisms. |
| [[Gemma_3]Chess.ipynb]([Gemma_3]Chess.ipynb) | Gemma \| Chess: Learn, Analyze, and Discover a New Dimension! |
| [[Gemma_3]Gradio_LlamaCpp_Chatbot.ipynb]([Gemma_3]Gradio_LlamaCpp_Chatbot.ipynb) | Build a chatbot with the Gemma 3 QAT text model using Llama.cpp and Gradio. |
| [[Gemma_3n]Audio_understanding_with_HF.ipynb]([Gemma_3n]Audio_understanding_with_HF.ipynb) | Run Gemma 3n with audio input. |
| [[Gemma_3n]Multimodal_understanding_with_HF.ipynb]([Gemma_3n]Multimodal_understanding_with_HF.ipynb) | Run Gemma 3n with image + audio input. |
| [[Gemma_3n]MatFormer_Lab.ipynb]([Gemma_3n]MatFormer_Lab.ipynb) | Run Gemma 3n with MatFormers and Mix-n-Match. |

## Prompting

| Notebook Name | Description |
| :--- | --- |
| [[Gemma_1]Advanced_Prompting_Techniques.ipynb]([Gemma_1]Advanced_Prompting_Techniques.ipynb) | Illustrate advanced prompting techniques with Gemma. |
| [[Gemma_2]LangChain_chaining.ipynb]([Gemma_2]LangChain_chaining.ipynb) | Illustrate LangChain chaining with Gemma. |
| [[Gemma_2]Prompt_chaining.ipynb]([Gemma_2]Prompt_chaining.ipynb) | Illustrate prompt chaining and iterative generation with Gemma. |
| [[Gemma_3]In-context_Learning.ipynb]([Gemma_3]In-context_Learning.ipynb) | Demonstrate in-context learning with the Gemma 3 long context window. |

## RAG

| Notebook Name | Description |
| :--- | --- |
| [[Gemma_1]Minimal_RAG.ipynb]([Gemma_1]Minimal_RAG.ipynb) | Minimal example of building a RAG system with Gemma using [Google UniSim](https://github.com/google/unisim) and [Hugging Face](https://huggingface.co/). |
| [[Gemma_1]RAG_with_ChromaDB.ipynb]([Gemma_1]RAG_with_ChromaDB.ipynb) | Build a Retrieval Augmented Generation (RAG) system with Gemma using [ChromaDB](https://www.trychroma.com/) and [Hugging Face](https://huggingface.co/). |
| [[Gemma_2]RAG_LlamaIndex.ipynb]([Gemma_2]RAG_LlamaIndex.ipynb) | RAG example with [LlamaIndex](https://www.llamaindex.ai/) using Gemma. |
| [[Gemma_2]RAG_PDF_Search_in_multiple_documents_on_Colab.ipynb]([Gemma_2]RAG_PDF_Search_in_multiple_documents_on_Colab.ipynb) | RAG PDF Search in multiple documents using Gemma 2 2B on Google Colab. |
| [[Gemma_2]Using_with_Elasticsearch_and_LangChain.ipynb]([Gemma_2]Using_with_Elasticsearch_and_LangChain.ipynb) | Example of using Gemma with [Elasticsearch](https://www.elastic.co/elasticsearch/), [Ollama](https://www.ollama.com/) and [LangChain](https://www.langchain.com/). |
| [[Gemma_2]Using_with_Firebase_Genkit_and_Ollama.ipynb]([Gemma_2]Using_with_Firebase_Genkit_and_Ollama.ipynb) | Example of using Gemma with [Firebase Genkit](https://firebase.google.com/docs/genkit/) and [Ollama](https://www.ollama.com/). |
| [[Gemma_2]Using_with_LangChain.ipynb]([Gemma_2]Using_with_LangChain.ipynb) | Examples of using Gemma with [LangChain](https://www.langchain.com/). |
| [[Gemma_3]Local_Agentic_RAG.ipynb]([Gemma_3]Local_Agentic_RAG.ipynb) | Build local Agentic RAG without any external APIs using [FastEmbed](https://github.com/qdrant/fastembed), [Ollama- Gemma3](https://ollama.com/models), and [Qdrant Vector database](https://cloud.qdrant.io). |
| [[Gemma_3]RAG_with_EmbeddingGemma.ipynb]([Gemma_3]RAG_with_EmbeddingGemma.ipynb) | Build a simple RAG system with [EmbeddingGemma](https://ai.google.dev/gemma/docs/embeddinggemma). |

## Finetuning

| Notebook Name | Description |
| :--- | --- |
| [[Gemma_1]Finetune_distributed.ipynb]([Gemma_1]Finetune_distributed.ipynb) | Chat with Gemma 7B and finetune it so that it generates responses in a pirate's tone. |
| [[Gemma_1]Finetune_with_LLaMA_Factory.ipynb]([Gemma_1]Finetune_with_LLaMA_Factory.ipynb) | Finetune Gemma using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory). |
| [[Gemma_1]Finetune_with_XTuner.ipynb]([Gemma_1]Finetune_with_XTuner.ipynb) | Finetune Gemma using [XTuner](https://github.com/InternLM/xtuner). |
| [[Gemma_2]Custom_Vocabulary.ipynb]([Gemma_2]Custom_Vocabulary.ipynb) | Demonstrate how to use the custom vocabulary "<unused[0-98]>" tokens in Gemma. |
| [[Gemma_2]Finetune_with_Axolotl.ipynb]([Gemma_2]Finetune_with_Axolotl.ipynb) | Finetune Gemma using [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl). |
| [[Gemma_2]Finetune_with_CALM.ipynb]([Gemma_2]Finetune_with_CALM.ipynb) | Finetune Gemma using [CALM](https://github.com/google-deepmind/calm). |
| [[Gemma_2]Finetune_with_Function_Calling.ipynb]([Gemma_2]Finetune_with_Function_Calling.ipynb) | Finetune Gemma for function calling using [PyTorch/XLA](https://github.com/pytorch/xla). |
| [[Gemma_2]Finetune_with_JORA.ipynb]([Gemma_2]Finetune_with_JORA.ipynb) | Finetune Gemma using [JORA](https://github.com/aniquetahir/JORA). |
| [[Gemma_2]Finetune_with_LitGPT.ipynb]([Gemma_2]Finetune_with_LitGPT.ipynb) | Finetune Gemma using [LitGPT](https://github.com/Lightning-AI/litgpt). |
| [[Gemma_2]Finetune_with_Torch_XLA.ipynb]([Gemma_2]Finetune_with_Torch_XLA.ipynb) | Finetune Gemma using [PyTorch/XLA](https://github.com/pytorch/xla). |
| [[Gemma_2]Finetune_with_Unsloth.ipynb]([Gemma_2]Finetune_with_Unsloth.ipynb) | Finetune Gemma using [Unsloth](https://unsloth.ai/blog/gemma). |
| [[Gemma_2]Translator_of_Old_Korean_Literature.ipynb]([Gemma_2]Translator_of_Old_Korean_Literature.ipynb) | Use Gemma to translate old Korean literature using Keras. |
| [[Gemma_3]Full_Model_Finetune_using_HF.ipynb]([Gemma_3]Full_Model_Finetune_using_HF.ipynb) | Full model fine-tune on a mobile game NPC dataset using Hugging Face Transformers and TRL. |
| [[Gemma_3n]Finetuned_LoRA_Unsloth_on_Mental_Health_dataset.ipynb]([Gemma_3n]Finetuned_LoRA_Unsloth_on_Mental_Health_dataset.ipynb) | Finetune the Gemma-3N (4B) model locally using [Unsloth](https://unsloth.ai/blog/gemma) on mental health counseling conversations to create an emotional first aid assistant. |

## Alignment

| Notebook Name | Description |
| :--- | --- |
| [[Gemma_2]Aligning_DPO.ipynb]([Gemma_2]Aligning_DPO.ipynb) | Demonstrate how to align a Gemma model using DPO (Direct Preference Optimization) with [Hugging Face TRL](https://huggingface.co/docs/trl/en/index). |

## Evaluation

| Notebook Name | Description |
| :--- | --- |
| [[Gemma_2]evaluation.ipynb]([Gemma_2]evaluation.ipynb) | Demonstrate how to use Eleuther AI's LM evaluation harness to perform model evaluation on Gemma. |

## Agentic AI

| Notebook Name | Description |
| :--- | :--- |
| [[Gemma_2]Agentic_AI.ipynb]([Gemma_2]Agentic_AI.ipynb) | Demonstrate how to build an Agentic AI using Gemma 2. |
| [[Gemma_2]Function_Calling_with_Groq_Langchain.ipynb]([Gemma_2]Function_Calling_with_Groq_Langchain.ipynb) | Demonstrate how to create a simple agent using LangChain and Groq with Gemma 2. |
| [[Gemma_3]Meme_Generator.ipynb]([Gemma_3]Meme_Generator.ipynb) | Meme generator using the Gemma 3 4B IT model. |
| [[Gemma_3]Function_Calling_Routing_and_Monitoring_using_Gemma_Google_Genai.ipynb]([Gemma_3]Function_Calling_Routing_and_Monitoring_using_Gemma_Google_Genai.ipynb) | Implement and monitor an Agentic RAG workflow. |
| [[Gemma_3]Function_Calling_with_HF.ipynb]([Gemma_3]Function_Calling_with_HF.ipynb) | Demonstrate how to use function calling with Gemma 3 using [Hugging Face](https://huggingface.co/). |
| [[Gemma_3]Function_Calling_with_HF_document_summarizer.ipynb]([Gemma_3]Function_Calling_with_HF_document_summarizer.ipynb) | Demonstrate how to build a document summarizer using function calling with Gemma 3 and Hugging Face. |
| [[Gemma_3]Local_Agentic_RAG.ipynb]([Gemma_3]Local_Agentic_RAG.ipynb) | Build local Agentic RAG without any external APIs using [FastEmbed](https://github.com/qdrant/fastembed), [Ollama- Gemma3](https://ollama.com/models), and [Qdrant Vector database](https://cloud.qdrant.io). |