{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import panel as pn\n",
    "\n",
    "from panel.widgets import SpeechToText, GrammarList\n",
    "\n",
    "pn.extension()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `SpeechToText` widget controls the speech recognition service of the browser by wrapping the [HTML5 `SpeechRecognition` API](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition).\n",
    "\n",
    "The functionality is **experimental** and **only supported by Chrome and a few other browsers**. See https://caniuse.com/speech-recognition or https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition#Browser_compatibility for an up-to-date list of browsers supporting the SpeechRecognition API. Even in Chrome the `grammars`, `interim_results` and `max_alternatives` parameters are not yet supported.\n",
    "\n",
    "On some browsers, like Chrome, using Speech Recognition on a web page involves a server-based recognition engine. **Your audio is sent to a web service for recognition processing, so it won't work offline**. Whether this is secure and confidential enough for your use case is up to you to evaluate.\n",
    "\n",
    "#### Parameters:\n",
    "\n",
    "* **``results``** (List\\[Dict\\]): The results recognized. A list of Dictionaries.\n",
    "* **``value``** (str): The most recent SpeechRecognition result as a string.\n",
    "\n",
    "* **``lang``** (str): The language of the current SpeechRecognition service in BCP 47 format. For example 'en-US'.\n",
    "* **``continuous``** (bool): Controls whether continuous results are returned for each recognition, or only a single result. Defaults to False.\n",
    "* **``interim_results``** (boolean): Controls whether interim results should be returned (True) or not (False.) Interim results are results that are not yet final (e.g. the RecognitionResult.is_final property is False).\n",
    "* **``max_alternatives``** (int): Sets the maximum number of RecognitionAlternatives provided per result. A number between 1 and 5. The default value is 1.\n",
    "* **``service_uri``** (str): Specifies the location of the speech recognition service used by the current SpeechRecognition service to handle the actual recognition. The default is the user agent's default speech service.\n",
    "* **``grammars``** (GrammarList): A GrammarList object that represents the grammars that will be understood by the current SpeechRecognition service.\n",
    "\n",
    "* **``started``** (boolean): Returns True if the Speech Recognition Service is started and False otherwise.\n",
    "* **``audio_started``** (boolean): Returns True if the Audio is started and False otherwise\n",
    "* **``sound_started``** (boolean): Returns True if the Sound is started and False otherwise\n",
    "* **``speech_started``** (boolean): Returns True if the the User has started speaking and False otherwise\n",
    "\n",
    "* **``button_hide``** (bool): Whether to show (False) or hide (True) the toggle start/ stop button.\n",
    "* **``button_type``** (str): One of 'default', 'primary', 'success', 'warning', 'danger' and 'light'.\n",
    "* **``button_not_started``** (str): The text to show on the button when the SpeechRecognition service is NOT started. If '' a *muted microphone* icon is shown.\n",
    "* **``button_started``** (str): The text to show on the button when the SpeechRecognition service is started. If '' a *muted microphone* icon is shown.\n",
    "\n",
    "##### Events\n",
    "\n",
    "Events are transient boolean parameters which when set to true trigger an event and then immediately revert to False.\n",
    "\n",
    "* **``start``** (bool): Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.\n",
    "* **``stop``** (bool): Stops the speech recognition service from listening to incoming audio, and attempts to return a list of RecognitionResult using the audio captured so far.\n",
    "* **``abort``** (bool): Stops the speech recognition service from listening to incoming audio, and doesn't attempt to return a list of RecognitionResult.\n",
    "\n",
    "#### Properties\n",
    "\n",
    "* **``results_deserialized``** (List\\[RecognitionResult\\]): The results recognized. A list of RecognitionResult objects.\n",
    "* **``results_as_html``** (str): Returns the `results` formatted as html. Convenience property for ease of use.\n",
    "\n",
    "### Grammar\n",
    "\n",
    "A set of words or patterns of words that we want the speech recognition service to recognize.\n",
    "\n",
    "For example\n",
    "\n",
    "```python\n",
    "grammar = Grammar(\n",
    "    src='#JSGF V1.0; grammar colors; public <color> = aqua | azure | beige;',\n",
    "    weight=0.7\n",
    ")\n",
    "```\n",
    "\n",
    "Wraps the [HTML SpeechGrammar API](https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammar).\n",
    "\n",
    "#### Parameters:\n",
    "\n",
    "* **``src``** (str): A set of words or patterns of words that we want the recognition service to recognize. Defined using JSpeech Grammar Format. See https://www.w3.org/TR/jsgf/.\n",
    "* **``uri``** (str): An uri pointing to the definition. If src is available it will be used. Otherwise uri. The uri will be loaded on the client side only.\n",
    "* **``weight``** (float): The weight of the grammar. A number in the range 0–1. Default is 1.\n",
    "\n",
    "### GrammarList\n",
    "\n",
    "A list of Grammar objects containing words or patterns of words that we want the recognition service to recognize.\n",
    "\n",
    "For example\n",
    "\n",
    "```python\n",
    "grammar = '#JSGF V1.0; grammar colors; public <color> = aqua | azure | beige | bisque ;'\n",
    "grammar_list = GrammarList()\n",
    "grammar_list.add_from_string(grammar, 1)\n",
    "```\n",
    "\n",
    "Wraps the [HTML 5 SpeechGrammarList API](https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList).\n",
    "\n",
    "#### Methods:\n",
    "\n",
    "* **``add_from_string``** (src: str, weight: float): Takes a grammar src and weight and adds it to the GrammarList as a new Grammar object. The new Grammar object is returned.\n",
    "* **``add_from_uri``** (uri: str, weight: float): Takes a grammar uri and weight and adds it to the GrammarList as a new Grammar object. The new Grammar object is returned.\n",
    "\n",
    "### RecognitionAlternative\n",
    "\n",
    "The RecognitionAlternative represents a word or sentence that has been recognised by the speech recognition service.\n",
    "\n",
    "Wraps the [HTML5 SpeechRecognitionAlternative API](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionAlternative).\n",
    "\n",
    "#### Methods:\n",
    "\n",
    "* **``confidence``** (float): A numeric estimate between 0 and 1 of how confident the speech recognition system is that the recognition is correct.\n",
    "* **``transcript``** (str): The transcript of the recognised word or sentence.\n",
    "\n",
    "\n",
    "### RecognitionResult\n",
    "\n",
    "The Result represents a single recognition match, which may contain multiple RecognitionAlternative objects.\n",
    "\n",
    "Wraps the [HTML5 SpeechRecognitionResult API](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionResult).\n",
    "\n",
    "\n",
    "#### Methods:\n",
    "\n",
    "* **``is_final``** (boolean): A Boolean that states whether this result is final (True) or not (False) — if so, then this is the final time this result will be returned; if not, then this result is an interim result, and may be updated later on.\n",
    "* **``alternatives``** (List\\[RecognitionResult\\]): The list of the n-best alternatives.\n",
    "\n",
    "___"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "speech_to_text_basic = SpeechToText(button_type=\"light\")\n",
    "\n",
    "pn.Row(speech_to_text_basic.controls(['value'], jslink=False), speech_to_text_basic)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the most recent result we can simply access the `value` parameter:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "speech_to_text_basic.value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more detailed results including the confidence level access the `results` parameter:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "speech_to_text_basic.results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Advanced Example\n",
    "\n",
    "We start by instantiating a `SpeechToText` object with a `GrammarList`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "grammar_list = GrammarList()\n",
    "\n",
    "src = \"#JSGF V1.0; grammar colors; public <color> = aqua | azure | beige | bisque | black | blue | brown | chocolate | coral | crimson | cyan | fuchsia | ghostwhite | gold | goldenrod | gray | green | indigo | ivory | khaki | lavender | lime | linen | magenta | maroon | moccasin | navy | olive | orange | orchid | peru | pink | plum | purple | red | salmon | sienna | silver | snow | tan | teal | thistle | tomato | turquoise | violet | white | yellow ;\"\n",
    "grammar_list.add_from_string(src, 1)\n",
    "\n",
    "speech_to_text = SpeechToText(button_type=\"light\", grammars=grammar_list, height=50)\n",
    "\n",
    "controls = speech_to_text.controls(jslink=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we create a callback which will render the `results_as_html`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@pn.depends(speech_to_text)\n",
    "def results(results):\n",
    "    return pn.pane.HTML(speech_to_text.results_as_html, width=100, margin=(0, 15, 0, 15))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally we compose this into an `app`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "app = pn.Row(controls, speech_to_text, results)\n",
    "app.servable()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try changing some of the parameters. For example changing the `continuous` parameter will keep the SpeechRecognition service open which lets you say multiple statements after each other."
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}