{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generating images with AI\n",
    "\n",
    "This notebook demonstrates how to use OpenAI DALL-E 3 to generate images, in combination with other LLM features like text and embedding generation.\n",
    "\n",
    "Here, we use Chat Completion to generate a random image description and DALL-E 3 to create an image from that description, showing the image inline.\n",
    "\n",
    "Lastly, the notebook asks the user to describe the image. The embedding of the user's description is compared to the original description, using Cosine Similarity, and returning a score from 0 to 1, where 1 means exact match."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    },
    "polyglot_notebook": {
     "kernelName": "csharp"
    },
    "tags": [],
    "vscode": {
     "languageId": "polyglot-notebook"
    }
   },
   "outputs": [],
   "source": [
    "// Usual setup: importing Semantic Kernel SDK and SkiaSharp, used to display images inline.\n",
    "\n",
    "#r \"nuget: Microsoft.SemanticKernel, 1.23.0\"\n",
    "#r \"nuget: System.Numerics.Tensors, 8.0.0\"\n",
    "#r \"nuget: SkiaSharp, 2.88.3\"\n",
    "\n",
    "#!import config/Settings.cs\n",
    "#!import config/Utils.cs\n",
    "#!import config/SkiaUtils.cs\n",
    "\n",
    "using Microsoft.SemanticKernel;\n",
    "using Microsoft.SemanticKernel.TextToImage;\n",
    "using Microsoft.SemanticKernel.Embeddings;\n",
    "using Microsoft.SemanticKernel.Connectors.OpenAI;\n",
    "using System.Numerics.Tensors;"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setup, using three AI services: images, text, embedding\n",
    "\n",
    "The notebook uses:\n",
    "\n",
    "* **OpenAI Dall-E 3** to transform the image description into an image\n",
    "* **text-embedding-ada-002** to compare your guess against the real image description\n",
    "\n",
    "**Note:**: For Azure OpenAI, your endpoint should have DALL-E API enabled."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    },
    "polyglot_notebook": {
     "kernelName": "csharp"
    },
    "vscode": {
     "languageId": "polyglot-notebook"
    }
   },
   "outputs": [],
   "source": [
    "using Kernel = Microsoft.SemanticKernel.Kernel;\n",
    "\n",
    "#pragma warning disable SKEXP0001, SKEXP0010\n",
    "\n",
    "// Load OpenAI credentials from config/settings.json\n",
    "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n",
    "\n",
    "// Configure the three AI features: text embedding (using Ada), chat completion, image generation (DALL-E 3)\n",
    "var builder = Kernel.CreateBuilder();\n",
    "\n",
    "if(useAzureOpenAI)\n",
    "{\n",
    "    builder.AddAzureOpenAITextEmbeddingGeneration(\"text-embedding-ada-002\", azureEndpoint, apiKey);\n",
    "    builder.AddAzureOpenAIChatCompletion(model, azureEndpoint, apiKey);\n",
    "    builder.AddAzureOpenAITextToImage(\"dall-e-3\", azureEndpoint, apiKey);\n",
    "}\n",
    "else\n",
    "{\n",
    "    builder.AddOpenAITextEmbeddingGeneration(\"text-embedding-ada-002\", apiKey, orgId);\n",
    "    builder.AddOpenAIChatCompletion(model, apiKey, orgId);\n",
    "    builder.AddOpenAITextToImage(apiKey, orgId);\n",
    "}\n",
    "   \n",
    "var kernel = builder.Build();\n",
    "\n",
    "// Get AI service instance used to generate images\n",
    "var dallE = kernel.GetRequiredService<ITextToImageService>();\n",
    "\n",
    "// Get AI service instance used to extract embedding from a text\n",
    "var textEmbedding = kernel.GetRequiredService<ITextEmbeddingGenerationService>();"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generate a (random) image with DALL-E 3\n",
    "\n",
    "**genImgDescription** is a Semantic Function used to generate a random image description. \n",
    "The function takes in input a random number to increase the diversity of its output.\n",
    "\n",
    "The random image description is then given to **Dall-E 3** asking to create an image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    },
    "polyglot_notebook": {
     "kernelName": "csharp"
    },
    "tags": [],
    "vscode": {
     "languageId": "polyglot-notebook"
    }
   },
   "outputs": [],
   "source": [
    "#pragma warning disable SKEXP0001\n",
    "\n",
    "var prompt = @\"\n",
    "Think about an artificial object correlated to number {{$input}}.\n",
    "Describe the image with one detailed sentence. The description cannot contain numbers.\";\n",
    "\n",
    "var executionSettings = new OpenAIPromptExecutionSettings \n",
    "{\n",
    "    MaxTokens = 256,\n",
    "    Temperature = 1\n",
    "};\n",
    "\n",
    "// Create a semantic function that generate a random image description.\n",
    "var genImgDescription = kernel.CreateFunctionFromPrompt(prompt, executionSettings);\n",
    "\n",
    "var random = new Random().Next(0, 200);\n",
    "var imageDescriptionResult = await kernel.InvokeAsync(genImgDescription, new() { [\"input\"] = random });\n",
    "var imageDescription = imageDescriptionResult.ToString();\n",
    "\n",
    "// Use DALL-E 3 to generate an image. OpenAI in this case returns a URL (though you can ask to return a base64 image)\n",
    "var imageUrl = await dallE.GenerateImageAsync(imageDescription.Trim(), 1024, 1024);\n",
    "\n",
    "await SkiaUtils.ShowImage(imageUrl, 1024, 1024);"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Let's play a guessing game\n",
    "\n",
    "Try to guess what the image is about, describing the content.\n",
    "\n",
    "You'll get a score at the end 😉"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    },
    "polyglot_notebook": {
     "kernelName": "csharp"
    },
    "tags": [],
    "vscode": {
     "languageId": "polyglot-notebook"
    }
   },
   "outputs": [],
   "source": [
    "// Prompt the user to guess what the image is\n",
    "var guess = await InteractiveKernel.GetInputAsync(\"Describe the image in your words\");\n",
    "\n",
    "// Compare user guess with real description and calculate score\n",
    "var origEmbedding = await textEmbedding.GenerateEmbeddingsAsync(new List<string> { imageDescription } );\n",
    "var guessEmbedding = await textEmbedding.GenerateEmbeddingsAsync(new List<string> { guess } );\n",
    "var similarity = TensorPrimitives.CosineSimilarity(origEmbedding.First().Span, guessEmbedding.First().Span);\n",
    "\n",
    "Console.WriteLine($\"Your description:\\n{Utils.WordWrap(guess, 90)}\\n\");\n",
    "Console.WriteLine($\"Real description:\\n{Utils.WordWrap(imageDescription.Trim(), 90)}\\n\");\n",
    "Console.WriteLine($\"Score: {similarity:0.00}\\n\\n\");\n",
    "\n",
    "//Uncomment this line to see the URL provided by OpenAI\n",
    "//Console.WriteLine(imageUrl);"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".NET (C#)",
   "language": "C#",
   "name": ".net-csharp"
  },
  "language_info": {
   "file_extension": ".cs",
   "mimetype": "text/x-csharp",
   "name": "C#",
   "pygments_lexer": "csharp",
   "version": "11.0"
  },
  "polyglot_notebook": {
   "kernelInfo": {
    "defaultKernelName": "csharp",
    "items": [
     {
      "aliases": [],
      "name": "csharp"
     }
    ]
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}