{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Speech Recognition using Python.ipynb", "provenance": [], "authorship_tag": "ABX9TyOV6ZOV/eO+H5GybBKIl3a0", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "id": "HeljZdx2oSCe", "colab_type": "text" }, "source": [ "# Speech Recognition using Python\n", "\n", "\n", "## In this tutorial, I will develop a speech recognition system in Python using the necessary libraries.\n", "\n", "![alt text](https://miro.medium.com/max/720/1*ukSjlyyW7YQlA29Z_u75LA.jpeg)\n", "\n", "Before getting started, there are some tools that you need to download and install to complete this tutorial.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "UQ9BYp4ZodwQ", "colab_type": "text" }, "source": [ "## Necessary tools required for the tutorial:\n", "\n", "* **Python** → Just go to your command prompt and type `python -V` (capital V). When you hit enter, it will print the installed version of Python; make sure it says `Python 3.8.1`. At the time of writing, that is the latest version of Python for Windows (I literally downloaded it yesterday). If you have some other version, just uninstall it and [download](https://www.python.org/downloads/) a new one. Just do it, I will let you know the reason later.\n", "\n", "Now, in the second case, you might get a message something like `'python' is not recognized as an internal or external command`. Without further ado, take a deep breath and download Python from [here](https://www.python.org/downloads/).\n", "\n", "* **Visual Studio Code** → This is one of my favorite Python code editors. Just download it to get a better programming experience with its rich features. Don’t worry, I’m not advertising (I wish I was). 
Anyway, one of the things I like best about VS Code is that it has a built-in terminal. Every time you want to install an external Python library, you can use that built-in terminal to do the installation, which saves you a trip to the command prompt. To download Visual Studio Code, please use this [link](https://code.visualstudio.com/download). If you get stuck somewhere, don’t worry, I got you. Follow the [YouTube video](https://www.youtube.com/watch?v=E9U-EBG8jVk&t=88s) to get a better understanding.\n", "\n", "* **Speech Recognition library** → Now that you have installed the IDE (Visual Studio Code) and spent some time with it, it’s time to install some libraries inside the Visual Studio Code terminal. Just type `pip install SpeechRecognition`. To read more about this library, the [documentation](https://pypi.org/project/SpeechRecognition/) is always available.\n", "\n", "\n", "![alt text](https://miro.medium.com/max/1042/1*xWnbNjFo13JYFcyGYtQJOw.png)\n", "\n", "* **PyAudio library** → Basically, it is a cross-platform [audio input/output library](https://pypi.org/project/PyAudio/). Since we are dealing with audio in this tutorial, and we want to use our device’s microphone for input, we need to install it. Now, installing the library might be tremendously hard, especially on Windows. Believe me, it took me 5 straight hours to install it. It is not always as simple as typing `pip install PyAudio` in the command prompt. If you are lucky, that might work, but if you are an unlucky individual like me, you will get an error message as shown below:\n", "\n", "![alt text](https://miro.medium.com/max/1094/1*ex34dlkQVbohBQ3iSTUaeg.png)\n", "\n", "So I somehow discovered an alternate way to install the library. I could do this with the help of a [YouTube](https://www.youtube.com/watch?v=AKymlea8sYM) video. Special thanks to the creator of this video. 
Please see the above video for a successful installation.\n", "\n", "If you are not OK with watching a video, don’t worry, I got you. Just follow the steps given below:\n", "\n", "* Download the [PyAudio](https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio) file from the “[Unofficial Windows Binaries for Python Extension Packages](https://www.lfd.uci.edu/~gohlke/pythonlibs/)”, thanks to [Christoph Gohlke](https://www.lfd.uci.edu/~gohlke/), [Laboratory for Fluorescence Dynamics](https://www.lfd.uci.edu/), [University of California](https://www.uci.edu/), Irvine. To access the link, just click [here](https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio). Also, if your version of Python is something like 3.8.1, then download the file named “**PyAudio-0.2.11-cp38-cp38-win32.whl**”, the second one from the top. The whole point here is that the PyAudio file must match your Python: `cp38` in the file name stands for Python 3.8 and `win32` for a 32-bit installation.\n", "\n", "\n", "* Open “**Run**” from the Windows search bar (or ask Cortana or Google for help). Type `%appdata%` in the Run box. It will look something like this:\n", "\n", "![alt text](https://miro.medium.com/max/399/1*YrRC_GGaMGH0U5XzXVi0dg.png)\n", "\n", "* As soon as you click OK in the Run box, it will take you inside the app data folder of your computer. It will look something like this:\n", "\n", "![alt text](https://miro.medium.com/max/994/1*9LX4xVZb4WFqn5ID94ch7Q.png)\n", "\n", "Go back until you see “Local”, “LocalLow”, and “Roaming”. Click on **Local**.\n", "\n", "* Click on ***Programs → Python → Python38-32 → Scripts → Paste the downloaded file (PyAudio)*** inside the Scripts folder.\n", "\n", "## To summarize\n", "\n", "`C:\\Users\\tanup\\AppData\\Local\\Programs\\Python\\Python38-32\\Scripts` (this is in my case).\n", "\n", "![alt text](https://miro.medium.com/max/1057/1*UjNjCDyozTSXFQyou5MPYg.png)\n", "\n", "* And this is an important step. 
Here you need to select the file and then get the path from the bar above, as shown below:\n", "\n", "![alt text](https://miro.medium.com/max/1215/1*G0yu9p50QXDVsVyXm_BAvg.png)\n", "\n", "Copy the above path; we will need it in the next step.\n", "\n", "* The last and easiest step now is to install the library manually. Just go to your command prompt and type the command below.\n", "\n", "```\n", "pip install C:\\Users\\tanup\\AppData\\Local\\Programs\\Python\\Python38-32\\Scripts\\PyAudio-0.2.11-cp38-cp38-win32.whl\n", "```\n", "\n", "Don’t freak out when you see the command; it’s nothing but `pip install` followed by the path where your PyAudio file is stored. That is why I told you to copy the path of the file.\n", "\n", "![alt text](https://miro.medium.com/max/1412/1*QLHdQG-c4FUEoHHMYUTYQg.png)\n", "\n", "Once you see the success message, congratulations, you have successfully installed the library the hard way. If you want to double-check the installation, type IDLE in the Windows search bar, open it, and then type `import pyaudio`. 
If you have successfully installed the library, then you will get no errors.\n", "\n", "\n", "![alt text](https://miro.medium.com/max/969/1*HVug4leKCKYBf316XC_2kg.png)\n", "\n", "___\n", "\n", "## It’s time to code now\n", "\n", "Finally, it’s time for us to code.\n", "\n", "I would like to give credit to the author of the code, because I got it from a [GitHub Repository](https://github.com/umangahuja1/Youtube/blob/master/Python_Extras/speech.py).\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "metadata": { "id": "pmUKZ3lcsk3E", "colab_type": "code", "colab": {} }, "source": [ "import speech_recognition as sr\n", "\n", "r = sr.Recognizer()\n", "# Use the default microphone as the audio source\n", "with sr.Microphone() as source:\n", "    print(\"Speak Anything :\")\n", "    audio = r.listen(source)\n", "    try:\n", "        # Send the recorded audio to the Google Web Speech API\n", "        text = r.recognize_google(audio)\n", "        print(\"You said : {}\".format(text))\n", "    except sr.UnknownValueError:\n", "        # The speech could not be understood\n", "        print(\"Sorry could not recognize what you said\")\n", "    except sr.RequestError as e:\n", "        # The service could not be reached (e.g. no internet connection)\n", "        print(\"Could not request results : {}\".format(e))" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "nDPZSXtYq3Lm", "colab_type": "text" }, "source": [ "I will provide a brief explanation of the code:\n", "\n", "* First, import the library `speech_recognition`.\n", "* Then we need to recognize the speech. We need not write the recognizer from scratch, thanks to the library. All we need to do is create an instance of the `Recognizer` class from `speech_recognition` by calling `sr.Recognizer()`.\n", "* There are many methods for recognizing speech from an audio source. Choose the one that fits your needs (just kidding). Just use the one I’m showing. Some of the most common methods are:\n", "\n", "    * `recognize_bing()` : Microsoft Bing Speech\n", "    * `recognize_google()` : Google Web Speech\n", "    * `recognize_google_cloud()` : Google Cloud Speech\n", "    * `recognize_houndify()` : Houndify Speech\n", "    * `recognize_ibm()` : IBM Speech\n", "    * `recognize_wit()` : Wit.ai Speech\n", "\n", "In this tutorial, I will use the second method, i.e. `recognize_google()`. 
But note that calling it on its own will cause an error, because you need to provide an audio source. To get the audio source, you have to listen for it, right? The whole point is to listen to what the user says and then display it. So in the code, the method `listen` is used.\n", "\n", "* Once you have done that, as shown above, the next step is to print the output. Once the system has listened to the user’s input, it should display the output in a readable format.\n", "\n", "* Last but not least, executing the program is the only pending task. I bet that if you have followed every line of this tutorial and the whole installation process, you will not get any errors. But if you do get any errors, you know what to do: post them in the contacts section and I’ll have a look. If everything works, then let’s see the output.\n", "\n", "\n", "![alt text](https://miro.medium.com/max/1294/1*kuJF-A6K_-GdSYaz5-y8dA.png)\n", "\n", "Now, I don’t know if you noticed this or not: the moment you execute this program, the microphone icon appears at the bottom corner of your laptop’s display, meaning that it’s listening (isn’t it freaky?). It pops up when the program is executed, and then it disappears.\n", "\n", "![alt text](https://miro.medium.com/max/552/1*ayJv32Dg-hFNPeiyH0QWJw.jpeg)\n", "\n", "Now, this is not a great project or a great invention; all we are doing is using the right libraries at the right time. Still, it is a good start for those who did not know that libraries can work wonders. Also, you have learned how to manually install a library. I hope you have enjoyed reading this tutorial and learned something new today. Stay tuned for more updates. Until then, goodbye. Have a nice day.\n", "\n", "___\n", "\n", "\n", "\n" ] } ] }