{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [], "authorship_tag": "ABX9TyMKwAc7P1yf8hiu4NNcryRo", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "<a href=\"https://colab.research.google.com/github/olafthiele/gtd23/blob/main/Demo_Sprechen_1_coqui.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" ] }, { "cell_type": "markdown", "source": [ "# Sprechen 1 - coqui mit Thorsten\n", "\n", "Dieses Notebook erzeugt Aussagen der Stimme Thorsten mit der coqui-Lib:\n", "https://github.com/coqui-ai/TTS\n", "\n", "Thorstens Stimme: https://www.thorsten-voice.de\n", "\n", "---\n", "\n", "Allgemein:\n", "\n", "Dieses Notebook läuft auf den Servern von Google, es verändert nichts auf Ihrem lokalen Rechner. Alle Eingaben werden Beendigung gelöscht.\n", "\n", "Colab Notebooks enthalten Text- oder Codeblöcke. Codeblöcke werden durch SHIFT-ENTER ausgeführt (oder Play-Symbol oben links klicken).\n", "\n", "Solange ein Block ausgeführt wird, wird nicht mehr die Nr [3] des Blocks, sondern ein Kreis angezeigt.\n", "\n", "Sie können beliebig Daten verändern, das geschieht nur für Sie." ], "metadata": { "id": "D7gmoHPYVWiH" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cChXLDqgqBWs" }, "outputs": [], "source": [ "!pip install tts\n", "!apt install espeak-ng \n", "# gleiches Installationsprinzip, grundlegend andere Technik Torch" ] }, { "cell_type": "code", "source": [ "# Vorhandene Modelle anzeigen, String kopieren, um es zu nutzen\n", "!tts --list_models" ], "metadata": { "id": "dDwPknncqL6C" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Hilfe anzeigen\n", "# !tts -h\n", "# Test erstellen, dadurch wird das Modell in /root/.local/ gespeichert\n", "!tts --text \"Dies ist ein Test.\" --model_name \"tts_models/de/thorsten/tacotron2-DCA\"" ], "metadata": { "id": "WGEsszx_r6vu" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "import IPython.display as ipd\n", "ipd.Audio('tts_output.wav') " ], "metadata": { "id": "2F2_PMkPw_bJ" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Lauf mit Griffin Lim Algo ohne Vocoder - metallisch\n", "!tts --text \"Die G T D ist die beste Konferenz der Welt.\" \\\n", " --model_path /root/.local/share/tts/tts_models--de--thorsten--tacotron2-DCA/model_file.pth \\\n", " --config_path /root/.local/share/tts/tts_models--de--thorsten--tacotron2-DCA/config.json \\\n", " --out_path metallisch.wav" ], "metadata": { "id": "cvY7oaQ_sBGz" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Audio anhören können\n", "import IPython.display as ipd\n", "ipd.Audio('metallisch.wav') # load a local WAV file" ], "metadata": { "id": "G7uEmC2WsfXj" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Lauf mit Vocoder WaveGAN\n", "!tts --text \"Die G T D ist die beste Konferenz der Welt.\" --model_name \"tts_models/de/thorsten/tacotron2-DCA\"\n", "ipd.Audio('tts_output.wav') # load a local WAV file" ], "metadata": { "id": "1VPCsxrCs6TK" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Beispiel ohne Punkt\n", "!tts --text \"Die German Testing Days sind super.\" --model_name \"tts_models/de/thorsten/vits\"\n", "ipd.Audio('tts_output.wav') # load a local WAV file" ], "metadata": { "id": "pw8v3JgxwSpU" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [], "metadata": { "id": "gWSSVgVbDBeG" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "# AWS Polly als synthetische Stimme" ], "metadata": { "id": "1hHkTJGVZ-eh" } }, { "cell_type": "code", "source": [ "import requests\n", "import IPython.display as ipd\n", "\n", "# Define the Flask API URL\n", "api_url = 'http://85.209.51.176:7777/tts' # Replace with the actual API URL\n", "\n", "# Define the message to be sent\n", "message = 'Guten Tag auf den German Testing Days'\n", "\n", "import IPython.display as ipd\n", "import base64\n", "\n", "# Send a POST request to the API endpoint\n", "response = requests.post(api_url, json={'message': message})\n", "\n", "# Check if the request was successful\n", "if response.status_code == 200:\n", " # Retrieve the audio base64 string from the response\n", " audio_base64 = response.json()['speech']\n", " \n", " # Convert the base64 string to audio and play it\n", " audio_data = base64.b64decode(audio_base64)\n", " audio_element = ipd.Audio(data=audio_data, autoplay=True)\n", " ipd.display(audio_element)\n", "else:\n", " print('Error:', response.json()['error'])\n" ], "metadata": { "id": "gjGUXMZLzQHd" }, "execution_count": null, "outputs": [] } ] }