{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# OpenData CSV Database\n", "\n", "By running the code cell below, you can browse the CSV data files available from the CERN OpenData portal, the CMS doc database and the cms-opendata-education GitHub organisation. These files are listed in a single file called 'csvDatabase.csv'. If you notice that a file is missing, please add it to the database.\n", "\n", "Currently, this notebook lets you filter the files by:\n", "- filename\n", "- number of events (rows in the file)\n", "- files that contain a parameter:\n", "  - M (invariant mass)\n", "  - (px, py, pz) (momentum components)\n", "  - eta (pseudorapidity)\n", "- events with a specific number of decay products:\n", "  - One particle\n", "  - Two particles\n", "  - Four particles" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# -----------------------------------------------------------------------------\n", "''' This program can be used to filter a csv-database full of csv-files that\n", "contain data for CMS OpenData in education material. This program is helpful\n", "if you want to search for a specific kind of datafile, for example a file that\n", "contains data of more than 100 000 collision events.\n", "\n", "To use this program, just run the code and start searching for files.\n", "\n", "NOTE! This program only works on platforms that support ipywidgets. 
At the\n", "time this was written, the widgets did not work in JupyterLab by default.\n", "This program should work fine at least in Jupyter Notebook and MyBinder.\n", "This code does not validate user input, so the input must be correct or\n", "the program will crash.\n", "\n", "If you have any questions, please contact the author.\n", "Author: Juha Teuho, juha.teuho@gmail.com\n", "'''\n", "\n", "# -----------------------------------------------------------------------------\n", "# Import modules and set display options.\n", "# -----------------------------------------------------------------------------\n", "import ipywidgets as widgets\n", "import pandas as pd\n", "from IPython.display import display\n", "from ipywidgets import interact, interactive, fixed, interact_manual, Layout\n", "\n", "pd.options.display.max_colwidth = 80\n", "\n", "# -----------------------------------------------------------------------------\n", "\n", "\n", "\n", "# -----------------------------------------------------------------------------\n", "''' Read the csv file and save a few data columns to variables. 
\n", "    The file must contain columns 'Name', 'n-events', 'ParentDataset', \n", "    'Source' and 'Parameters'\n", "'''\n", "file = pd.read_csv(\"csvDatabase.csv\")\n", "try:\n", "    filenames = file['Name']\n", "    events = file['n-events']\n", "    parent = file['ParentDataset']\n", "    sources = file['Source']\n", "    parameters = file['Parameters']\n", "except KeyError:\n", "    print(\"Invalid file!\")\n", "\n", "# -----------------------------------------------------------------------------\n", " \n", " \n", " \n", "# -----------------------------------------------------------------------------\n", "# Initialize variables and widgets.\n", "\n", "# Global lists that track the displayed buttons and their row indices.\n", "buttons = []\n", "indeces = []\n", "\n", "# Create text widget\n", "t = widgets.Text()\n", "\n", "# Create widgets for event filter.\n", "min_event = widgets.Text(value=\"0\",description='Min:')\n", "max_event = widgets.Text(value=\"999999\",description='Max:')\n", "search_button = widgets.Button(description=\"Search\",button_style='success')\n", "event_widgets = [min_event, max_event, search_button]\n", "\n", "# Create a checkbox widget with the given description.\n", "def make_cb(description):\n", "    return widgets.Checkbox(value=False,description=description,disabled=False)\n", "\n", "# Create checkboxes for some parameters and filters.\n", "checkboxes = [make_cb('M'), make_cb('(px,py,pz)'), make_cb('η'),\n", "              make_cb('One particle'),make_cb('Two particles'), \n", "              make_cb('Four particles')]\n", "''' Define the parameters that are used to filter the files.\n", "    Invariant mass: if file contains parameter 'M'\n", "    Momentum: if file contains 'py' or 'py1'. 
\n", "              Not using \"px\", because there were inconsistencies with \n", "              those in the datafiles.\n", "    Pseudorapidity: if file contains 'eta'\n", "    One particle: if file contains 'Q'.\n", "    Two particles: if file contains 'Q2' but not 'Q4'\n", "    Four particles: if file contains 'Q4'\n", "'''\n", "checkbox_params = [['M'],['py','py1'],\n", "                   ['eta','eta1'],['Q'],\n", "                   ['Q2','Q4',True],['Q4']]\n", "\n", "# Widget to display and hide program output.\n", "out = widgets.Output(layout={'border': '1px solid black'})\n", "# -----------------------------------------------------------------------------\n", "\n", "\n", "\n", "# -----------------------------------------------------------------------------\n", "''' FUNCTIONS\n", "The functions are divided into five parts:\n", "    1: name search\n", "    2: event search\n", "    3: parameter search\n", "    4: button operation\n", "    5: other\n", "'''\n", "# -----------------------------------------------------------------------------\n", "\n", "\n", "\n", "# -----------------------------------------------------------------------------\n", "# 1. 
Name search functions\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def handle_submit(sender):\n", "    ''' Called when the user presses ENTER in the name search field.\n", "    '''\n", "    namesearch(t.value)\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def namesearch(text):\n", "    ''' Searches for names according to the user input.\n", "        Prints the corresponding filenames as buttons.\n", "    '''\n", "    hide()\n", "    # Redraw the search field and the results inside the output widget.\n", "    with out:\n", "        display_name_widgets()\n", "        for i in range(len(filenames)):\n", "            if text.lower() in filenames[i].lower():\n", "                create_buttons(i)\n", "# -----------------------------------------------------------------------------\n", " \n", "# -----------------------------------------------------------------------------\n", "def display_name_widgets():\n", "    print(\"Enter the filename or part of it and press ENTER\")\n", "    print(\"See more information about a data file by clicking its name.\")\n", "    print(\"Note that the file information will appear at the end of the output!\")\n", "    display(t)\n", "    t.on_submit(handle_submit)\n", "# -----------------------------------------------------------------------------\n", " \n", "# -----------------------------------------------------------------------------\n", "# 2. 
Event search functions\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def search_events(sender):\n", "    ''' Search for the number of events according to the user input.\n", "        Prints the corresponding filenames and event number as buttons.\n", "    '''\n", "    hide()\n", "    # Redraw the filter widgets and the results inside the output widget.\n", "    with out:\n", "        display_event_widgets()\n", "        for i in range(len(events)):\n", "            if int(min_event.value) <= events[i] <= int(max_event.value):\n", "                create_buttons(i, True)\n", "# -----------------------------------------------------------------------------\n", " \n", "# -----------------------------------------------------------------------------\n", "def display_event_widgets():\n", "    print(\"Choose the minimum and maximum number of events and click 'Search'.\")\n", "    print(\"See more information about a data file by clicking its name.\")\n", "    print(\"Note that the file information will appear at the end of the output!\")\n", "    for widget in event_widgets:\n", "        display(widget)\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "# 3. 
Parameter search functions\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "\n", "#\n", "def check_params(paramlist):\n", "    ''' Checks if chosen parameters are in the files.\n", "        Returns a list of indices in the original csv-file.\n", "        At those indices the parameter conditions are met.\n", "        \n", "        Input: paramlist (1, 2 or 3 elements)\n", "            paramlist[0]: a pattern to search for in the 'Parameters' column\n", "            paramlist[1]: (optional) a second pattern to search for\n", "            paramlist[2]: (optional) True if a row should match only when\n", "                          it contains paramlist[0] but not paramlist[1].\n", "                          By default a row matches if it contains either.\n", "        Output: match\n", "            list of row indices in 'parameters' where the conditions were met\n", "    '''\n", "    params = parameters\n", "    param1 = paramlist[0]\n", "    try:\n", "        param2 = paramlist[1]\n", "    except IndexError:\n", "        param2 = None\n", "    try:\n", "        inverse = paramlist[2]\n", "    except IndexError:\n", "        inverse = False\n", "    match = []\n", "    if param2 and not inverse:\n", "        for i in range(len(params)):\n", "            if param1 in params[i] or param2 in params[i]:\n", "                match.append(i)\n", "    elif param2 and inverse:\n", "        for i in range(len(params)):\n", "            if param1 in params[i] and param2 not in params[i]:\n", "                match.append(i)\n", "    else:\n", "        for i in range(len(params)):\n", "            if param1 in params[i]:\n", "                match.append(i)\n", "    return match\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def parameter_search(change):\n", "    ''' Search for the parameter according to the user selection.\n", "        Prints the corresponding filenames as buttons.\n", "    '''\n", "    hide()\n", "    # Redraw the checkboxes inside the output widget.\n", "    with out:\n", "        display_checkboxes()\n", "    chosen = []\n", "    for i in range(len(checkboxes)):  # Loop over all checkboxes\n", "        if checkboxes[i].value:  # If 
checkbox is checked\n", "            chosen_box = check_params(checkbox_params[i])\n", "            if chosen:\n", "                chosen = [value for value in chosen if value in chosen_box]\n", "            else:\n", "                chosen = chosen_box\n", "    if chosen:\n", "        # Show the matching files inside the output widget.\n", "        with out:\n", "            for index in chosen:\n", "                create_buttons(index)\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def display_checkboxes():\n", "    print(\"Filter by parameter. For the number of particles, choose only one at a time to avoid errors.\")\n", "    print(\"See more information about a data file by clicking its name.\")\n", "    print(\"Note that the file information will appear at the end of the output!\")\n", "    for box in checkboxes:\n", "        display(box)\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "# 4. Button operation functions\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def create_buttons(i, print_events=False):\n", "    ''' Creates and displays a button for row i of the database.\n", "    '''\n", "    global buttons\n", "    global indeces\n", "    if print_events:\n", "        des = filenames[i] + \" \" + str(events[i])\n", "        newbutton = widgets.ToggleButton(layout=Layout(width='initial'),\n", "                                         description=des)\n", "    else:\n", "        newbutton = widgets.ToggleButton(layout=Layout(width='initial'),\n", "                                         description=filenames[i])\n", "    display(newbutton)\n", "    # If button is clicked, call function 'button_click'\n", "    newbutton.observe(button_click, names='value')\n", "    # Append buttons to list to know which button was clicked\n", "    buttons.append(newbutton)\n", "    # Append indices to know which files correspond to buttons\n", "    indeces.append(i)\n", "# 
-----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def button_click(change):\n", "    ''' When a button is clicked, it prints out the information corresponding \n", "        to that file.\n", "    '''\n", "    global buttons\n", "    for i in range(len(buttons)):\n", "        if buttons[i].value:\n", "            buttons[i].value = False\n", "            # Print at the end of the output widget.\n", "            with out:\n", "                print_info(i)\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "# 5. Other functions\n", "# -----------------------------------------------------------------------------\n", " \n", "# -----------------------------------------------------------------------------\n", "def hide():\n", "    ''' Clears all output and global lists\n", "    '''\n", "    # Without the global declarations the assignments below would only\n", "    # create local variables and the global lists would never be cleared.\n", "    global buttons\n", "    global indeces\n", "    out.clear_output()\n", "    buttons = []\n", "    indeces = []\n", "# -----------------------------------------------------------------------------\n", " \n", "# -----------------------------------------------------------------------------\n", "def print_info(i):\n", "    ''' Prints the database row that corresponds to button i.\n", "    '''\n", "    print(\"Filename:\",filenames[indeces[i]])\n", "    print(\"Events:\",events[indeces[i]])\n", "    print(\"Parent dataset:\",parent[indeces[i]])\n", "    print(\"Source:\",sources[indeces[i]])\n", "    print(\"Parameters:\",parameters[indeces[i]],\"\\n\")\n", "# -----------------------------------------------------------------------------\n", "\n", "# -----------------------------------------------------------------------------\n", "def dropdown_menu(change):\n", "    ''' Updates the user interface when a value is selected in the \n", "        dropdown menu.\n", "    '''\n", "    if change['type'] == 'change' and change['name'] == 'value':\n", "        if change['new'] == 'Name':\n", "            hide()\n", "            with out:\n", "                display_name_widgets()\n", "        elif change['new'] == 
'Events':\n", "            hide()\n", "            with out:\n", "                display_event_widgets()\n", "            search_button.on_click(search_events)\n", "        elif change['new'] == 'Parameters':\n", "            hide()\n", "            with out:\n", "                display_checkboxes()\n", "            for box in checkboxes:\n", "                box.observe(parameter_search, names='value')\n", "# -----------------------------------------------------------------------------\n", "\n", "\n", "\n", "# -----------------------------------------------------------------------------\n", "''' Create the user interface and display it \n", "'''\n", "w = widgets.Dropdown(options=['Name', 'Events', 'Parameters'],\n", "                     value=None, description='Filter by:')\n", "w.observe(dropdown_menu)\n", "display(w)\n", "display(out)\n", "# -----------------------------------------------------------------------------\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }