{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyMKwAc7P1yf8hiu4NNcryRo",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/olafthiele/gtd23/blob/main/Demo_Sprechen_1_coqui.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Sprechen 1 - coqui mit Thorsten\n",
        "\n",
        "Dieses Notebook erzeugt Aussagen der Stimme Thorsten mit der coqui-Lib:\n",
        "https://github.com/coqui-ai/TTS\n",
        "\n",
        "Thorstens Stimme: https://www.thorsten-voice.de\n",
        "\n",
        "---\n",
        "\n",
        "Allgemein:\n",
        "\n",
        "Dieses Notebook läuft auf den Servern von Google, es verändert nichts auf Ihrem lokalen Rechner. Alle Eingaben werden Beendigung gelöscht.\n",
        "\n",
        "Colab Notebooks enthalten Text- oder Codeblöcke. Codeblöcke werden durch SHIFT-ENTER ausgeführt (oder Play-Symbol oben links klicken).\n",
        "\n",
        "Solange ein Block ausgeführt wird, wird nicht mehr die Nr [3] des Blocks, sondern ein Kreis angezeigt.\n",
        "\n",
        "Sie können beliebig Daten verändern, das geschieht nur für Sie."
      ],
      "metadata": {
        "id": "D7gmoHPYVWiH"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cChXLDqgqBWs"
      },
      "outputs": [],
      "source": [
        "!pip install tts\n",
        "!apt install espeak-ng \n",
        "# gleiches Installationsprinzip, grundlegend andere Technik Torch"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Vorhandene Modelle anzeigen, String kopieren, um es zu nutzen\n",
        "!tts --list_models"
      ],
      "metadata": {
        "id": "dDwPknncqL6C"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Hilfe anzeigen\n",
        "# !tts -h\n",
        "# Test erstellen, dadurch wird das Modell in /root/.local/ gespeichert\n",
        "!tts --text \"Dies ist ein Test.\" --model_name \"tts_models/de/thorsten/tacotron2-DCA\""
      ],
      "metadata": {
        "id": "WGEsszx_r6vu"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "import IPython.display as ipd\n",
        "ipd.Audio('tts_output.wav') "
      ],
      "metadata": {
        "id": "2F2_PMkPw_bJ"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Lauf mit Griffin Lim Algo ohne Vocoder - metallisch\n",
        "!tts --text \"Die G T D ist die beste Konferenz der Welt.\" \\\n",
        "    --model_path /root/.local/share/tts/tts_models--de--thorsten--tacotron2-DCA/model_file.pth \\\n",
        "    --config_path /root/.local/share/tts/tts_models--de--thorsten--tacotron2-DCA/config.json \\\n",
        "    --out_path metallisch.wav"
      ],
      "metadata": {
        "id": "cvY7oaQ_sBGz"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Audio anhören können\n",
        "import IPython.display as ipd\n",
        "ipd.Audio('metallisch.wav') # load a local WAV file"
      ],
      "metadata": {
        "id": "G7uEmC2WsfXj"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Lauf mit Vocoder WaveGAN\n",
        "!tts --text \"Die G T D ist die beste Konferenz der Welt.\" --model_name \"tts_models/de/thorsten/tacotron2-DCA\"\n",
        "ipd.Audio('tts_output.wav') # load a local WAV file"
      ],
      "metadata": {
        "id": "1VPCsxrCs6TK"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Beispiel ohne Punkt\n",
        "!tts --text \"Die German Testing Days sind super.\" --model_name \"tts_models/de/thorsten/vits\"\n",
        "ipd.Audio('tts_output.wav') # load a local WAV file"
      ],
      "metadata": {
        "id": "pw8v3JgxwSpU"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [],
      "metadata": {
        "id": "gWSSVgVbDBeG"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# AWS Polly als synthetische Stimme"
      ],
      "metadata": {
        "id": "1hHkTJGVZ-eh"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import requests\n",
        "import IPython.display as ipd\n",
        "\n",
        "# Define the Flask API URL\n",
        "api_url = 'http://85.209.51.176:7777/tts'   # Replace with the actual API URL\n",
        "\n",
        "# Define the message to be sent\n",
        "message = 'Guten Tag auf den German Testing Days'\n",
        "\n",
        "import IPython.display as ipd\n",
        "import base64\n",
        "\n",
        "# Send a POST request to the API endpoint\n",
        "response = requests.post(api_url, json={'message': message})\n",
        "\n",
        "# Check if the request was successful\n",
        "if response.status_code == 200:\n",
        "    # Retrieve the audio base64 string from the response\n",
        "    audio_base64 = response.json()['speech']\n",
        "    \n",
        "    # Convert the base64 string to audio and play it\n",
        "    audio_data = base64.b64decode(audio_base64)\n",
        "    audio_element = ipd.Audio(data=audio_data, autoplay=True)\n",
        "    ipd.display(audio_element)\n",
        "else:\n",
        "    print('Error:', response.json()['error'])\n"
      ],
      "metadata": {
        "id": "gjGUXMZLzQHd"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}