{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "b35fe87b", "metadata": {}, "source": [ "# Rock - Paper - Scissors - Lizard - Spock\n", "\n", "Welcome to [Pycon 23](https://pycon.it/) Beginners' Day! In this workshop you will learn the basics of programming in Python by developing several versions of the classic Rock-Paper-Scissors game from scratch. From the classic version to the **top** version, in which our program uses Machine Learning to recognize our move from the webcam, everything is at your fingertips." ] }, { "attachments": {}, "cell_type": "markdown", "id": "78e5f376", "metadata": {}, "source": [ "## Task 1\n", "\n", "In the next cells we will study:\n", "\n", "* what a variable is in Python\n", "* which types of variables Python offers\n", "* how to take input from the user\n", "* how to create a list with the possible game choices\n", "* how to generate the computer's move randomly" ] }, { "cell_type": "code", "execution_count": null, "id": "6cfa68db", "metadata": {}, "outputs": [], "source": [ "integer=10\n", "boolean=True\n", "number_float=0.13\n", "string='pycon'" ] }, { "cell_type": "code", "execution_count": null, "id": "129aa8da", "metadata": {}, "outputs": [], "source": [ "print('Hello World :)')" ] }, { "cell_type": "code", "execution_count": null, "id": "a62875ca", "metadata": {}, "outputs": [], "source": [ "print(integer)" ] }, { "cell_type": "code", "execution_count": null, "id": "aeb5dade", "metadata": {}, "outputs": [], "source": [ "print(string,integer)" ] }, { "cell_type": "code", "execution_count": null, "id": "568556bb", "metadata": {}, "outputs": [], "source": [ "result = input('Give me a number: ')" ] }, { "cell_type": "code", "execution_count": null, "id": "a957db01", "metadata": {}, "outputs": [], "source": [ "print(result)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "b277808f", "metadata": {}, "outputs": [], "source": [ 
"user_action = input(\"Enter a choice (rock, paper, scissors): \")" ] }, { "cell_type": "code", "execution_count": null, "id": "687209b0", "metadata": {}, "outputs": [], "source": [ "import random\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3546541d", "metadata": {}, "outputs": [], "source": [ "possible_actions = [\"rock\", \"paper\", \"scissors\"]\n", "computer_action = random.choice(possible_actions)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "36523ce3", "metadata": {}, "outputs": [], "source": [ "print(f\"\\nYou chose {user_action}, computer chose {computer_action}.\\n\")\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f9faebb5", "metadata": {}, "source": [ "## Task 2\n", "\n", "In this cell we will compare the moves of the computer and the player to figure out **who won** and display an appropriate message." ] }, { "cell_type": "code", "execution_count": null, "id": "90af3eea", "metadata": {}, "outputs": [], "source": [ "if user_action == computer_action:\n", "    print(f\"Both players selected {user_action}. It's a tie!\")\n", "elif user_action == \"rock\":\n", "    if computer_action == \"scissors\":\n", "        print(\"Rock smashes scissors! You win!\")\n", "    else:\n", "        print(\"Paper covers rock! You lose.\")\n", "elif user_action == \"paper\":\n", "    if computer_action == \"rock\":\n", "        print(\"Paper covers rock! You win!\")\n", "    else:\n", "        print(\"Scissors cuts paper! You lose.\")\n", "elif user_action == \"scissors\":\n", "    if computer_action == \"paper\":\n", "        print(\"Scissors cuts paper! You win!\")\n", "    else:\n", "        print(\"Rock smashes scissors! You lose.\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f9182a9f", "metadata": {}, "source": [ "### Task 2a - We repeat the game rounds to make a real game\n", "\n", "In this cell we will use a Python **loop** (specifically a `while` loop) to **play an indefinite number of rounds**. 
In particular, inside the `while` loop we will repeat everything we have done so far for a single round:\n", "- take a choice as input from the user\n", "- generate the computer move\n", "- compare moves\n", "- show an output\n", "\n", "We will add one more operation: **we ask the user whether they want to play again and, if not, we exit the game loop**." ] }, { "cell_type": "code", "execution_count": null, "id": "47580fd9", "metadata": {}, "outputs": [], "source": [ "while True:\n", "    user_action = input(\"Enter a choice (rock, paper, scissors): \")\n", "    possible_actions = [\"rock\", \"paper\", \"scissors\"]\n", "    computer_action = random.choice(possible_actions)\n", "    print(f\"\\nYou chose {user_action}, computer chose {computer_action}.\\n\")\n", "\n", "    if user_action == computer_action:\n", "        print(f\"Both players selected {user_action}. It's a tie!\")\n", "    elif user_action == \"rock\":\n", "        if computer_action == \"scissors\":\n", "            print(\"Rock smashes scissors! You win!\")\n", "        else:\n", "            print(\"Paper covers rock! You lose.\")\n", "    elif user_action == \"paper\":\n", "        if computer_action == \"rock\":\n", "            print(\"Paper covers rock! You win!\")\n", "        else:\n", "            print(\"Scissors cuts paper! You lose.\")\n", "    elif user_action == \"scissors\":\n", "        if computer_action == \"paper\":\n", "            print(\"Scissors cuts paper! You win!\")\n", "        else:\n", "            print(\"Rock smashes scissors! You lose.\")\n", "\n", "    play_again = input(\"Play again? (y/n): \")\n", "    if play_again.lower() != \"y\":\n", "        break" ] }, { "attachments": {}, "cell_type": "markdown", "id": "dd16cb05", "metadata": {}, "source": [ "## Task 3: Optimizations in the code\n", "\n", "Now that we have a basic version of the game where we can play against the computer for as many rounds as we like, let's be a little more **pro**.\n", "\n", "In the next cells we will implement a series of refactorings that make our code more maintainable and readable." 
] }, { "attachments": {}, "cell_type": "markdown", "id": "2dfe73b6", "metadata": {}, "source": [ "### Task 3a: Let's create an enum\n", "\n", "In this cell we are going to generalize the concept of \"action\" by creating a class that **inherits** the behavior of Python's `IntEnum`" ] }, { "cell_type": "code", "execution_count": null, "id": "0ec721b6", "metadata": {}, "outputs": [], "source": [ "from enum import IntEnum\n", "\n", "class Action(IntEnum):\n", "    Rock = 0\n", "    Paper = 1\n", "    Scissors = 2" ] }, { "cell_type": "code", "execution_count": null, "id": "689b925b", "metadata": {}, "outputs": [], "source": [ "print('Action.Rock == Action.Rock',Action.Rock == Action.Rock)\n", "print('Action.Rock == Action(0)',Action.Rock == Action(0))\n", "print('Action(0)',Action(0))" ] }, { "attachments": {}, "cell_type": "markdown", "id": "92eea47a", "metadata": {}, "source": [ "### Task 3b: Let's use functions to optimize the code\n", "\n", "Through the use of functions we divide our main program into \"blocks\" of code that can be called at any time we need them. 
In particular, our game can be divided into 3 phases:\n", "\n", "- Let the user play -> `get_user_selection()`\n", "- Let the computer play -> `get_computer_selection()`\n", "- Decide who won -> `determine_winner(user_action, computer_action)`" ] }, { "cell_type": "code", "execution_count": null, "id": "08c5263b", "metadata": {}, "outputs": [], "source": [ "# First version: the choices in the prompt are hard-coded\n", "def get_user_selection():\n", "    user_input = input(\"Enter a choice (rock[0], paper[1], scissors[2]): \")\n", "    selection = int(user_input)\n", "    action = Action(selection)\n", "    return action\n", "\n", "\n", "# Second version (replaces the first): the prompt is built from the enum,\n", "# so it stays correct when we add new moves\n", "def get_user_selection():\n", "    choices = [f\"{action.name}[{action.value}]\" for action in Action]\n", "    choices_str = \", \".join(choices)\n", "    selection = int(input(f\"Enter a choice ({choices_str}): \"))\n", "    action = Action(selection)\n", "    return action" ] }, { "cell_type": "code", "execution_count": null, "id": "9a592ca5", "metadata": {}, "outputs": [], "source": [ "def get_computer_selection():\n", "    selection = random.randint(0, len(Action) - 1)\n", "    action = Action(selection)\n", "    return action" ] }, { "cell_type": "code", "execution_count": null, "id": "20cccf2d", "metadata": {}, "outputs": [], "source": [ "def determine_winner(user_action, computer_action):\n", "    if user_action == computer_action:\n", "        print(f\"Both players selected {user_action.name}. It's a tie!\")\n", "    elif user_action == Action.Rock:\n", "        if computer_action == Action.Scissors:\n", "            print(\"Rock smashes scissors! You win!\")\n", "        else:\n", "            print(\"Paper covers rock! You lose.\")\n", "    elif user_action == Action.Paper:\n", "        if computer_action == Action.Rock:\n", "            print(\"Paper covers rock! You win!\")\n", "        else:\n", "            print(\"Scissors cuts paper! You lose.\")\n", "    elif user_action == Action.Scissors:\n", "        if computer_action == Action.Paper:\n", "            print(\"Scissors cuts paper! You win!\")\n", "        else:\n", "            print(\"Rock smashes scissors! 
You lose.\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "4ca7428d", "metadata": {}, "source": [ "With these functions in place, we can write a single function that contains all the game logic and that we can invoke (or call) every time we want to start a new game:\n", "\n", "- `start_game()`\n" ] }, { "cell_type": "code", "execution_count": null, "id": "8ebfa484", "metadata": {}, "outputs": [], "source": [ "def start_game():\n", "    while True:\n", "        try:\n", "            user_action = get_user_selection()\n", "        except ValueError:\n", "            # Raised when the input is not a number or is outside the enum's values\n", "            range_str = f\"[0, {len(Action) - 1}]\"\n", "            print(f\"Invalid selection. Enter a value in range {range_str}\")\n", "            continue\n", "\n", "        computer_action = get_computer_selection()\n", "        determine_winner(user_action, computer_action)\n", "\n", "        play_again = input(\"Play again? (y/n): \")\n", "        if play_again.lower() != \"y\":\n", "            break" ] }, { "cell_type": "code", "execution_count": null, "id": "c1473a6c", "metadata": {}, "outputs": [], "source": [ "start_game()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6955e8bc", "metadata": {}, "source": [ "### Task 3c: Let's create a dictionary with the winning moves\n", "\n", "Let's create a dictionary where we will have a key/value pair for every possible move. 
In particular:\n", "- the **key** will be the action specified in our `Action` class\n", "- the **value** will be **a list** containing the actions of class `Action` that *lose* against the move specified as key" ] }, { "cell_type": "code", "execution_count": null, "id": "96893f8d", "metadata": {}, "outputs": [], "source": [ "victories = {\n", "    Action.Rock: [Action.Scissors],  # Rock beats scissors\n", "    Action.Paper: [Action.Rock],  # Paper beats rock\n", "    Action.Scissors: [Action.Paper]  # Scissors beats paper\n", "}" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c3205d3c", "metadata": {}, "source": [ "### Task 3d: Let's use the dictionary and the `in` operator to simplify the checks" ] }, { "cell_type": "code", "execution_count": null, "id": "dce77a8a", "metadata": {}, "outputs": [], "source": [ "def determine_winner(user_action, computer_action):\n", "    print(f\"You chose {user_action.name}. The computer chose {computer_action.name}.\")\n", "    defeats = victories[user_action]\n", "    if user_action == computer_action:\n", "        print(f\"Both players selected {user_action.name}. It's a tie!\")\n", "    elif computer_action in defeats:\n", "        print(f\"{user_action.name} beats {computer_action.name}! You win!\")\n", "    else:\n", "        print(f\"{computer_action.name} beats {user_action.name}! 
You lose.\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e8edf419", "metadata": {}, "outputs": [], "source": [ "start_game()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ee1b1835", "metadata": {}, "source": [ "### Task 3e: Let's add the other moves: `lizard` and `spock`\n", "\n", "It is important to note that, thanks to the optimizations already done, **adding new moves comes *almost* for free!**" ] }, { "cell_type": "code", "execution_count": null, "id": "60f6e2d5", "metadata": {}, "outputs": [], "source": [ "class Action(IntEnum):\n", "    Rock = 0\n", "    Paper = 1\n", "    Scissors = 2\n", "    Lizard = 3\n", "    Spock = 4\n", "\n", "victories = {\n", "    Action.Scissors: [Action.Lizard, Action.Paper],\n", "    Action.Paper: [Action.Spock, Action.Rock],\n", "    Action.Rock: [Action.Lizard, Action.Scissors],\n", "    Action.Lizard: [Action.Spock, Action.Paper],\n", "    Action.Spock: [Action.Scissors, Action.Rock]\n", "}" ] }, { "attachments": {}, "cell_type": "markdown", "id": "67881ce8", "metadata": {}, "source": [ "### Task 3f: Let's make the game more *catchy* via ASCII art\n", "\n", "We will create two new dictionaries:\n", "- in `ascii_action` we will put the ASCII art of the moves\n", "- in `ascii_result` we will put the ASCII art of the possible results" ] }, { "cell_type": "code", "execution_count": null, "id": "4422813d", "metadata": {}, "outputs": [], "source": [ "ascii_action = {\n", "    Action.Scissors: r\"\"\"\n", " _____ _\n", " / ___| (_)\n", " \\ `--. ___ _ ___ ___ ___ _ __ ___\n", " `--. 
\\/ __| / __/ __|/ _ \\| '__/ __|\n", " /\\__/ / (__| \\__ \\__ \\ (_) | | \\__ \\\\\n", " \\____/ \\___|_|___/___/\\___/|_| |___/\n", " \"\"\",\n", " Action.Paper: r\"\"\"\n", " ______\n", " | ___ \\\n", " | |_/ /_ _ _ __ ___ _ __\n", " | __/ _` | '_ \\ / _ \\ '__|\n", " | | | (_| | |_) | __/ |\n", " \\_| \\__,_| .__/ \\___|_|\n", " | |\n", " |_|\n", " \"\"\",\n", " Action.Rock: r\"\"\"\n", " ______ _\n", " | ___ \\ | |\n", " | |_/ /___ ___| | __\n", " | // _ \\ / __| |/ /\n", " | |\\ \\ (_) | (__| <\n", " \\_| \\_\\___/ \\___|_|\\_\\\n", "\n", " \"\"\",\n", " Action.Lizard: r\"\"\"\n", " _ _ _\n", " | | (_) | |\n", " | | _ __________ _ _ __ __| |\n", " | | | |_ /_ / _` | '__/ _` |\n", " | |___| |/ / / / (_| | | | (_| |\n", " \\_____/_/___/___\\__,_|_| \\__,_|\n", " \"\"\",\n", " Action.Spock: r\"\"\"\n", " _____ _\n", " / ___| | |\n", " \\ `--. _ __ ___ ___| | __\n", " `--. \\ '_ \\ / _ \\ / __| |/ /\n", " /\\__/ / |_) | (_) | (__| <\n", " \\____/| .__/ \\___/ \\___|_|\\_\\\\\n", " | |\n", " |_|\n", " \"\"\"\n", "}\n", "\n", "COMPUTER_WIN=-1\n", "HUMAN_WIN=1\n", "DROW=0\n", "ascii_result = {\n", " COMPUTER_WIN: r\"\"\"\n", " _____ ________ _________ _ _ _____ ___________\n", "/ __ \\ _ | \\/ || ___ \\ | | |_ _| ___| ___ \\\\\n", "| / \\/ | | | . . || |_/ / | | | | | | |__ | |_/ /\n", "| | | | | | |\\/| || __/| | | | | | | __|| /\n", "| \\__/\\ \\_/ / | | || | | |_| | | | | |___| |\\ \\\n", " \\____/\\___/\\_| |_/\\_| \\___/ \\_/ \\____/\\_| \\_|\n", "\n", "\n", " _ _ _____ _ _ _____ _ _ _\n", "| | | |_ _| \\ | |/ ___| | | | |\n", "| | | | | | | \\| |\\ `--. | | | |\n", "| |/\\| | | | | . ` | `--. \\ | | | |\n", "\\ /\\ /_| |_| |\\ |/\\__/ / |_|_|_|\n", " \\/ \\/ \\___/\\_| \\_/\\____/ (_|_|_)\n", "\n", " \"\"\",\n", " HUMAN_WIN: r\"\"\"\n", " _ _ _ ____ ___ ___ _ _\n", "| | | | | | | \\/ | / _ \\ | \\ | |\n", "| |_| | | | | . . |/ /_\\ \\| \\| |\n", "| _ | | | | |\\/| || _ || . 
` |\n", "| | | | |_| | | | || | | || |\\ |\n", "\\_| |_/\\___/\\_| |_/\\_| |_/\\_| \\_/\n", "\n", "\n", " _ _ _____ _ _ _____ _ _ _\n", "| | | |_ _| \\ | |/ ___| | | | |\n", "| | | | | | | \\| |\\ `--. | | | |\n", "| |/\\| | | | | . ` | `--. \\ | | | |\n", "\\ /\\ /_| |_| |\\ |/\\__/ / |_|_|_|\n", " \\/ \\/ \\___/\\_| \\_/\\____/ (_|_|_)\n", "\n", "\n", " __\n", " / _|\n", " | |_ ___ _ __ _ __ _____ __\n", " | _/ _ \\| '__| | '_ \\ / _ \\ \\ /\\ / /\n", " _ _ _| || (_) | | | | | | (_) \\ V V / _ _ _\n", "(_|_|_)_| \\___/|_| |_| |_|\\___/ \\_/\\_/ (_|_|_)\n", "\n", " \"\"\",\n", " DROW: r\"\"\"\n", " _ _ _\n", " | | (_) | |\n", " __ _ | |_ _ ___ __| | __ _ __ _ _ __ ___ ___\n", " / _` | | __| |/ _ \\/ _` | / _` |/ _` | '_ ` _ \\ / _ \\\\\n", "| (_| | | |_| | __/ (_| | | (_| | (_| | | | | | | __/\n", " \\__,_| \\__|_|\\___|\\__,_| \\__, |\\__,_|_| |_| |_|\\___|\n", " __/ |\n", " |___/\n", " ___ _ _ __\n", " / / | | | (_) \\ \\\\\n", "| || |__ _____ __ | |__ ___ _ __ _ _ __ __ _ | |\n", "| || '_ \\ / _ \\ \\ /\\ / / | '_ \\ / _ \\| '__| | '_ \\ / _` || |\n", "| || | | | (_) \\ V V / | |_) | (_) | | | | | | | (_| || |\n", "| ||_| |_|\\___/ \\_/\\_/ |_.__/ \\___/|_| |_|_| |_|\\__, || |\n", " \\_\\ __/ /_/\n", " |___/ \"\"\"\n", "}" ] }, { "attachments": {}, "cell_type": "markdown", "id": "75f2c6d5", "metadata": {}, "source": [ "After that we will create two functions to easily display actions and results in ASCII art:\n", "- `display_action`\n", "- `display_results`" ] }, { "cell_type": "code", "execution_count": null, "id": "c96921e2", "metadata": {}, "outputs": [], "source": [ "def display_action(action):\n", " print(ascii_action[action])\n", "\n", "def display_result(result):\n", " print(ascii_result[result])" ] }, { "cell_type": "code", "execution_count": null, "id": "a5138c08", "metadata": {}, "outputs": [], "source": [ "display_action(Action.Spock)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c1b643e3", "metadata": {}, "source": [ "To use these 
functions we will also need to modify the `determine_winner` function, which now also returns the result of the round:" ] }, { "cell_type": "code", "execution_count": null, "id": "252c6e45", "metadata": {}, "outputs": [], "source": [ "def determine_winner(user_action, computer_action):\n", "    print(\"You chose\")\n", "    display_action(user_action)\n", "    print(\"The computer chose\")\n", "    display_action(computer_action)\n", "    defeats = victories[user_action]\n", "    if user_action == computer_action:\n", "        display_result(DROW)\n", "        return DROW\n", "    elif computer_action in defeats:\n", "        display_result(HUMAN_WIN)\n", "        return HUMAN_WIN\n", "    else:\n", "        display_result(COMPUTER_WIN)\n", "        return COMPUTER_WIN" ] }, { "cell_type": "code", "execution_count": null, "id": "7a7ea9d5", "metadata": {}, "outputs": [], "source": [ "start_game()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "5e478a36", "metadata": {}, "source": [ "### We keep the score of every round\n", "\n", "A victory message for a single round is no longer enough. We want to play a full game and see who comes out ahead, user or computer, after N rounds. Now we can have a real game against the computer and decide how long it lasts!" 
] }, { "cell_type": "code", "execution_count": null, "id": "ec5766f6", "metadata": {}, "outputs": [], "source": [ "def print_game_results(game_results):\n", "    total = len(game_results)\n", "    # Share of ties, player wins and computer wins, as percentages\n", "    pct_tied = game_results.count(DROW) / total * 100\n", "    pct_player_wins = game_results.count(HUMAN_WIN) / total * 100\n", "    pct_computer_wins = game_results.count(COMPUTER_WIN) / total * 100\n", "\n", "    print(f\"There were {pct_tied}% tied games\")\n", "    print(f\"the player won {pct_player_wins}% of games\")\n", "    print(f\"the computer won {pct_computer_wins}% of games\")\n", "    print(f\"in a total of {total} games\")\n", "\n", "def start_game(num_games=1):\n", "    game_results=[]\n", "    counter=0\n", "    while True:\n", "        try:\n", "            user_action = get_user_selection()\n", "        except ValueError:\n", "            range_str = f\"[0, {len(Action) - 1}]\"\n", "            print(f\"Invalid selection. Enter a value in range {range_str}\")\n", "            continue\n", "\n", "        computer_action = get_computer_selection()\n", "        game_results.append(determine_winner(user_action, computer_action))\n", "        counter+=1\n", "\n", "        if counter>=num_games:\n", "            break\n", "    print_game_results(game_results)\n", "    return game_results\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "1c55861d", "metadata": {}, "outputs": [], "source": [ "game_results=start_game(5)\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "aae3e00d", "metadata": {}, "source": [ "### We use a graphical interface!\n", "\n", "In the next cell we're going to use a Jupyter feature that allows us to create a drop-down menu on the fly (after all, this is an HTML page, isn't it?) 
and to associate a behavior with the item chosen from the menu!\n", "\n", "Related concepts:\n", "- list comprehension\n", "- `widgets.Dropdown`" ] }, { "cell_type": "code", "execution_count": null, "id": "a6b41097", "metadata": {}, "outputs": [], "source": [ "import ipywidgets as widgets\n", "from IPython.display import display\n", "\n", "# One (label, value) pair per move, built with a list comprehension\n", "options=[(action.name,action.value) for action in Action]\n", "menu = widgets.Dropdown(\n", "    options=options,\n", "    description='Choose:')\n", "output = widgets.Output(layout={'border': '1px solid black'})\n", "\n", "def on_button_clicked(b):\n", "    output.clear_output()\n", "    with output:\n", "        computer_action = get_computer_selection()\n", "        determine_winner(Action(menu.value), computer_action)\n", "\n", "button = widgets.Button(description=\"Play!\", button_style='success', icon='check')\n", "button.on_click(on_button_clicked)\n", "box = widgets.VBox([menu, button, output])\n", "\n", "display(box)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d95e966e", "metadata": {}, "source": [ "## Time to use ML!\n", "\n", "In the following cells we will use Machine Learning to train a predictive model able to recognize the user's move from a photo of their hand taken with the webcam.\n", "\n", "Let's install the necessary libraries and import them:" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2065bc78", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "id": "735944ea", "metadata": {}, "outputs": [], "source": [ "!pip install numpy\n", "!pip install opencv-python\n", "!pip install mediapipe\n", "!pip install requests" ] }, { "cell_type": "code", "execution_count": null, "id": "451dc433", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import mediapipe as mp\n", "import cv2" ] }, { "cell_type": "code", "execution_count": null, "id": "6c0a4085", "metadata": {}, "outputs": [], "source": [ "import requests\n", "url = 
\"https://raw.githubusercontent.com/ntu-rris/google-mediapipe/main/data/gesture_train.csv\"\n", "\n", "# If repo is private - we need to add a token in header:\n", "\n", "\n", "resp = requests.get(url)\n", "\n", "with open('./gesture_train.csv', 'wb') as f:\n", " f.write(resp.content)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "182454a8", "metadata": {}, "outputs": [], "source": [ "# Define default camera intrinsic\n", "img_width = 640\n", "img_height = 480\n", "intrin_default = {\n", " 'fx': img_width*0.9, # Approx 0.7w < f < w https://www.learnopencv.com/approximate-focal-length-for-webcams-and-cell-phone-cameras/\n", " 'fy': img_width*0.9,\n", " 'cx': img_width*0.5, # Approx center of image\n", " 'cy': img_height*0.5,\n", " 'width': img_width,\n", "}\n", "class GestureRecognition:\n", " def __init__(self):\n", "\n", " # 11 types of gesture 'name':class label\n", " self.gesture = {\n", " 'fist':0,'one':1,'two':2,'three':3,'four':4,'five':5,'six':6,\n", " 'rock':7,'spiderman':8,'yeah':9,'ok':10,\n", " }\n", "\n", " # Load training data\n", " file = np.genfromtxt('./gesture_train.csv', delimiter=',')\n", " # Extract input joint angles\n", " angle = file[:,:-1].astype(np.float32)\n", " # Extract output class label\n", " label = file[:, -1].astype(np.float32)\n", " # Use OpenCV KNN\n", " self.knn = cv2.ml.KNearest_create()\n", " self.knn.train(angle, cv2.ml.ROW_SAMPLE, label)\n", "\n", "\n", "\n", " def eval(self, angle):\n", " # Use KNN for gesture recognition\n", " data = np.asarray([angle], dtype=np.float32)\n", " ret, results, neighbours ,dist = self.knn.findNearest(data, 3)\n", " idx = int(results[0][0]) # Index of class label\n", "\n", " return list(self.gesture)[idx] # Return name of class label\n", "\n", "\n", "class MediaPipeHand:\n", " def __init__(self, static_image_mode=True, max_num_hands=1,\n", " model_complexity=1, intrin=None):\n", " self.max_num_hands = max_num_hands\n", " if intrin is None:\n", " self.intrin = intrin_default\n", " 
else:\n", " self.intrin = intrin\n", "\n", " # Access MediaPipe Solutions Python API\n", " mp_hands = mp.solutions.hands\n", " # help(mp_hands.Hands)\n", "\n", " # Initialize MediaPipe Hands\n", " # static_image_mode:\n", " # For video processing set to False:\n", " # Will use previous frame to localize hand to reduce latency\n", " # For unrelated images set to True:\n", " # To allow hand detection to run on every input images\n", "\n", " # max_num_hands:\n", " # Maximum number of hands to detect\n", "\n", " # model_complexity:\n", " # Complexity of the hand landmark model: 0 or 1.\n", " # Landmark accuracy as well as inference latency generally\n", " # go up with the model complexity. Default to 1.\n", "\n", " # min_detection_confidence:\n", " # Confidence value [0,1] from hand detection model\n", " # for detection to be considered successful\n", "\n", " # min_tracking_confidence:\n", " # Minimum confidence value [0,1] from landmark-tracking model\n", " # for hand landmarks to be considered tracked successfully,\n", " # or otherwise hand detection will be invoked automatically on the next input image.\n", " # Setting it to a higher value can increase robustness of the solution,\n", " # at the expense of a higher latency.\n", " # Ignored if static_image_mode is true, where hand detection simply runs on every image.\n", "\n", " self.pipe = mp_hands.Hands(\n", " static_image_mode=static_image_mode,\n", " max_num_hands=max_num_hands,\n", " model_complexity=model_complexity,\n", " min_detection_confidence=0.5,\n", " min_tracking_confidence=0.5)\n", "\n", " # Define hand parameter\n", " self.param = []\n", " for i in range(max_num_hands):\n", " p = {\n", " 'keypt' : np.zeros((21,2)), # 2D keypt in image coordinate (pixel)\n", " 'joint' : np.zeros((21,3)), # 3D joint in camera coordinate (m)\n", " 'class' : None, # Left / right / none hand\n", " 'score' : 0, # Probability of predicted handedness (always>0.5, and opposite handedness=1-score)\n", " 'angle' : np.zeros(15), 
# Flexion joint angles in degree\n", " 'gesture' : None, # Type of hand gesture\n", " 'rvec' : np.zeros(3), # Global rotation vector Note: this term is only used for solvepnp initialization\n", " 'tvec' : np.asarray([0,0,0.6]), # Global translation vector (m) Note: Init z direc to some +ve dist (i.e. in front of camera), to prevent solvepnp from wrongly estimating z as -ve\n", " 'fps' : -1, # Frame per sec\n", " # https://github.com/google/mediapipe/issues/1351\n", " # 'visible' : np.zeros(21), # Visibility: Likelihood [0,1] of being visible (present and not occluded) in the image\n", " # 'presence': np.zeros(21), # Presence: Likelihood [0,1] of being present in the image or if its located outside the image\n", " }\n", " self.param.append(p)\n", "\n", "\n", " def result_to_param(self, result, img):\n", " # Convert mediapipe result to my own param\n", " img_height, img_width, _ = img.shape\n", "\n", " # Reset param\n", " for p in self.param:\n", " p['class'] = None\n", "\n", " if result.multi_hand_landmarks is not None:\n", " # Loop through different hands\n", " for i, res in enumerate(result.multi_handedness):\n", " if i>self.max_num_hands-1: break # Note: Need to check if exceed max number of hand\n", " self.param[i]['class'] = res.classification[0].label\n", " self.param[i]['score'] = res.classification[0].score\n", "\n", " # Loop through different hands\n", " for i, res in enumerate(result.multi_hand_landmarks):\n", " if i>self.max_num_hands-1: break # Note: Need to check if exceed max number of hand\n", " # Loop through 21 landmark for each hand\n", " for j, lm in enumerate(res.landmark):\n", " self.param[i]['keypt'][j,0] = lm.x * img_width # Convert normalized coor to pixel [0,1] -> [0,width]\n", " self.param[i]['keypt'][j,1] = lm.y * img_height # Convert normalized coor to pixel [0,1] -> [0,height]\n", "\n", " # Ignore it https://github.com/google/mediapipe/issues/1320\n", " # self.param[i]['visible'][j] = lm.visibility\n", " # self.param[i]['presence'][j] = 
lm.presence\n", "\n", " if result.multi_hand_world_landmarks is not None:\n", " for i, res in enumerate(result.multi_hand_world_landmarks):\n", " if i>self.max_num_hands-1: break # Note: Need to check if exceed max number of hand\n", " # Loop through 21 landmark for each hand\n", " for j, lm in enumerate(res.landmark):\n", " self.param[i]['joint'][j,0] = lm.x\n", " self.param[i]['joint'][j,1] = lm.y\n", " self.param[i]['joint'][j,2] = lm.z\n", "\n", " # Convert relative 3D joint to angle\n", " self.param[i]['angle'] = self.convert_joint_to_angle(self.param[i]['joint'])\n", " # Convert relative 3D joint to camera coordinate\n", " self.convert_joint_to_camera_coor(self.param[i], self.intrin)\n", "\n", " return self.param\n", "\n", "\n", " def convert_joint_to_angle(self, joint):\n", " # Get direction vector of bone from parent to child\n", " v1 = joint[[0,1,2,3,0,5,6,7,0,9,10,11,0,13,14,15,0,17,18,19],:] # Parent joint\n", " v2 = joint[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],:] # Child joint\n", " v = v2 - v1 # [20,3]\n", " # Normalize v\n", " v = v/np.linalg.norm(v, axis=1)[:, np.newaxis]\n", "\n", " # Get angle using arcos of dot product\n", " angle = np.arccos(np.einsum('nt,nt->n',\n", " v[[0,1,2,4,5,6,8,9,10,12,13,14,16,17,18],:],\n", " v[[1,2,3,5,6,7,9,10,11,13,14,15,17,18,19],:])) # [15,]\n", "\n", " return np.degrees(angle) # Convert radian to degree\n", "\n", "\n", " def convert_joint_to_camera_coor(self, param, intrin, use_solvepnp=True):\n", " # MediaPipe version 0.8.9.1 onwards:\n", " # Given real-world 3D joint centered at middle MCP joint -> J_origin\n", " # To estimate the 3D joint in camera coordinate -> J_camera = J_origin + tvec,\n", " # We need to find the unknown translation vector -> tvec = [tx,ty,tz]\n", " # Such that when J_camera is projected to the 2D image plane\n", " # It matches the 2D keypoint locations\n", "\n", " # Considering all 21 keypoints,\n", " # Each keypoints will form 2 eq, in total we have 42 eq 3 unknowns\n", " # 
Since the equations are linear wrt [tx,ty,tz]\n", " # We can solve the unknowns using linear algebra A.x = b, where x = [tx,ty,tz]\n", "\n", " # Consider a single keypoint (pixel x) and joint (X,Y,Z)\n", " # Using the perspective projection eq:\n", " # (x - cx)/fx = (X + tx) / (Z + tz)\n", " # Similarly for pixel y:\n", " # (y - cy)/fy = (Y + ty) / (Z + tz)\n", " # Rearranging the above linear equations by keeping constants to the right hand side:\n", " # fx.tx - (x - cx).tz = -fx.X + (x - cx).Z\n", " # fy.ty - (y - cy).tz = -fy.Y + (y - cy).Z\n", " # Therefore, we can factor out the unknowns and form a matrix eq:\n", " # [fx 0 (x - cx)][tx] [-fx.X + (x - cx).Z]\n", " # [ 0 fy (y - cy)][ty] = [-fy.Y + (y - cy).Z]\n", " # [tz]\n", "\n", " idx = [i for i in range(21)] # Use all landmarks\n", "\n", " if use_solvepnp:\n", " # Method 1: OpenCV solvePnP\n", " fx, fy = intrin['fx'], intrin['fy']\n", " cx, cy = intrin['cx'], intrin['cy']\n", " intrin_mat = np.asarray([[fx,0,cx],[0,fy,cy],[0,0,1]])\n", " dist_coeff = np.zeros(4)\n", "\n", " ret, param['rvec'], param['tvec'] = cv2.solvePnP(\n", " param['joint'][idx], param['keypt'][idx],\n", " intrin_mat, dist_coeff, param['rvec'], param['tvec'],\n", " useExtrinsicGuess=True)\n", " # Add tvec to all joints\n", " param['joint'] += param['tvec']\n", "\n", " else:\n", " # Method 2:\n", " A = np.zeros((len(idx),2,3))\n", " b = np.zeros((len(idx),2))\n", "\n", " A[:,0,0] = intrin['fx']\n", " A[:,1,1] = intrin['fy']\n", " A[:,0,2] = -(param['keypt'][idx,0] - intrin['cx'])\n", " A[:,1,2] = -(param['keypt'][idx,1] - intrin['cy'])\n", "\n", " b[:,0] = -intrin['fx'] * param['joint'][idx,0] \\\n", " + (param['keypt'][idx,0] - intrin['cx']) * param['joint'][idx,2]\n", " b[:,1] = -intrin['fy'] * param['joint'][idx,1] \\\n", " + (param['keypt'][idx,1] - intrin['cy']) * param['joint'][idx,2]\n", "\n", " A = A.reshape(-1,3) # [8,3]\n", " b = b.flatten() # [8]\n", "\n", " # Use the normal equation AT.A.x = AT.b to minimize the sum of the sq 
diff btw left and right sides\n", " x = np.linalg.solve(A.T @ A, A.T @ b)\n", " # Add tvec to all joints\n", " param['joint'] += x\n", "\n", "\n", "\n", " def forward(self, img):\n", "\n", " # Extract result\n", " result = self.pipe.process(img)\n", "\n", " # Convert result to my own param\n", " param = self.result_to_param(result, img)\n", "\n", " return param\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "00e39b7b", "metadata": {}, "outputs": [], "source": [ "import io\n", "\n", "try:\n", " from google.colab.output import eval_js\n", " colab = True\n", "except:\n", " colab = False\n", "\n", "# colab=False\n", "\n", "if colab:\n", " from IPython.display import display, Javascript\n", " from google.colab.output import eval_js\n", " from base64 import b64decode\n", " from PIL import Image as PIL_Image\n", "\n", "\n", " def take_photo(quality=0.8):\n", " js = Javascript('''\n", " async function takePhoto(quality) {\n", " const div = document.createElement('div');\n", " const capture = document.createElement('button');\n", " capture.textContent = 'Capture';\n", " div.appendChild(capture);\n", "\n", " const video = document.createElement('video');\n", " video.style.display = 'block';\n", " const stream = await navigator.mediaDevices.getUserMedia({video: true});\n", "\n", " document.body.appendChild(div);\n", " div.appendChild(video);\n", " video.srcObject = stream;\n", " await video.play();\n", "\n", " // Resize the output to fit the video element.\n", " google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);\n", "\n", " // Wait for Capture to be clicked.\n", " await new Promise((resolve) => capture.onclick = resolve);\n", "\n", " const canvas = document.createElement('canvas');\n", " canvas.width = video.videoWidth;\n", " canvas.height = video.videoHeight;\n", " canvas.getContext('2d').drawImage(video, 0, 0);\n", " stream.getVideoTracks()[0].stop();\n", " div.remove();\n", " return canvas.toDataURL('image/jpeg', 
quality);\n", " }\n", " ''')\n", " display(js)\n", " data = eval_js('takePhoto({})'.format(quality))\n", " # Decode the base64 payload of the data URL returned by the browser\n", " binary = b64decode(data.split(',')[1])\n", "\n", "\n", " image = PIL_Image.open(io.BytesIO(binary))\n", " image_np = np.array(image)\n", "\n", " # with open(filename, 'wb') as f:\n", " # f.write(binary)\n", " return image_np\n", "else:\n", " import cv2\n", " def take_photo(filename='photo.jpg', quality=0.8):\n", " cam = cv2.VideoCapture(0)\n", "\n", " cv2.namedWindow(\"test\")\n", "\n", " while True:\n", " # Grab a frame from the webcam\n", " ret, frame = cam.read()\n", " if not ret:\n", " cam.release()\n", " cv2.destroyAllWindows()\n", " raise RuntimeError(\"failed to grab frame\")\n", " cv2.imshow(\"test\", frame)\n", "\n", " k = cv2.waitKey(1)\n", " if k%256 == 27 or k%256 == 32:\n", " # ESC or SPACE pressed: keep the current frame\n", " break\n", "\n", " cam.release()\n", "\n", " cv2.destroyAllWindows()\n", "\n", " # Convert the image from BGR color (which OpenCV uses) to RGB color (which MediaPipe expects)\n", " img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)\n", " # Flip the image horizontally for a mirror (selfie) view\n", " img = cv2.flip(img, 1)\n", "\n", " # To improve performance, optionally mark image as not writeable to pass by reference\n", " img.flags.writeable = False\n", "\n", " return img" ] }, { "cell_type": "code", "execution_count": null, "id": "739775c9", "metadata": {}, "outputs": [], "source": [ "def start_game(num_games=1):\n", " game_results=[]\n", " counter=0\n", " # Load mediapipe hand class\n", " pipe = MediaPipeHand(static_image_mode=True, max_num_hands=1)\n", " # Load gesture recognition class\n", " gest = GestureRecognition()\n", " while True:\n", " try:\n", " img = take_photo()\n", "\n", " # # Show the image which was just taken.\n", " # plt.imshow(img)\n", " # Run the hand-landmark model to extract keypoints\n", " param = pipe.forward(img)\n", " # Evaluate gesture for all hands\n", "\n", " # Reset the move so an unrecognized gesture is never scored\n", " action = None\n", " for p in param:\n", " if p['class'] is not None:\n", " p['gesture'] = gest.eval(p['angle'])\n", " # print(p['class'])\n", " # 
print(p['gesture'])\n", "\n", " # Map the recognized hand gesture to a game move\n", " if p['gesture']=='fist':\n", " action = Action.Rock\n", " elif p['gesture']=='five':\n", " action = Action.Paper\n", " elif (p['gesture']=='three') or (p['gesture']=='yeah'):\n", " action = Action.Scissors\n", " elif p['gesture']=='rock':\n", " action = Action.Lizard\n", " elif p['gesture']=='four':\n", " action = Action.Spock\n", " if action is not None:\n", " computer_action = get_computer_selection()\n", " game_results.append(determine_winner(action, computer_action))\n", " counter+=1\n", " print_game_results(game_results)\n", "\n", " if counter>=num_games:\n", " break\n", " except Exception as err:\n", " # Errors will be thrown if the user does not have a webcam or if they do not\n", " # grant the page permission to access it.\n", " print(str(err))\n", " raise\n", "\n", " pipe.pipe.close()" ] }, { "cell_type": "code", "execution_count": null, "id": "310fada0", "metadata": {}, "outputs": [], "source": [ "start_game(num_games=5)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1" }, "vscode": { "interpreter": { "hash": "a868160dc497589bad58e6ca4f95a34f6f09fb234f71a91b012eccf9b5d92496" } } }, "nbformat": 4, "nbformat_minor": 5 }