{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Gathering system data - Python for System Administrators \n", "\n", "## Goals:\n", " - Gathering System Data with multiplatform and platform-dependent tools\n", " - Get infos from files, /proc, /sys\n", " - Capture command output\n", " - Use psutil to get IO, CPU and memory data\n", " - Parse files with a strategy\n", " \n", "## Non-goals for this lesson:\n", " - use with, yield or pipes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modules" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import psutil\n", "import glob\n", "import sys\n", "import subprocess\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#\n", "# Our code is p3-ready\n", "#\n", "from __future__ import print_function, unicode_literals" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def grep(needle, fpath):\n", " \"\"\"A simple grep implementation\n", "\n", " goal: open() is iterable and doesn't\n", " need splitlines()\n", " goal: comprehension can filter lists\n", " \"\"\"\n", " return [x for x in open(fpath) if needle in x]\n", "\n", "# Do we have localhost?\n", "grep(\"localhost\", \"/etc/hosts\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#The psutil module is very nice\n", "import psutil\n", "\n", "#Works on Windows, Linux and MacOS\n", "psutil.cpu_percent() " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#And its output is very easy to manage\n", "ret = psutil.disk_io_counters()\n", "print(ret)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Exercise: Which other informations \n", "# does psutil provide? \n", "# Use this cell and the tab-completion jupyter functionalities." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Exercise\n", "def multiplatform_vmstat(count):\n", " # Write a vmstat-like function printing every second:\n", " # - cpu usage%\n", " # - bytes read and written in the given interval\n", " # Hint: use psutil and time.sleep(1)\n", " # Hint: use this cell or try on ipython and *then* write the function\n", " # using %edit vmstat.py\n", " for i in range(count):\n", " raise NotImplementedError\n", " print(cpu_usage, bytes_rw)\n", "\n", "multiplatform_vmstat(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%load course/multiplatform_vmstat.py\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Run your vmstat implementation.\n", "\n", "multiplatform_vmstat(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#\n", "# subprocess\n", "#\n", "# The check_output function returns the command stdout\n", "from subprocess import check_output\n", "\n", "# It takes a *list* as an argument!\n", "out = check_output(\"ping -w1 -c1 www.google.com\".split())\n", "\n", "# and returns a string\n", "print(out)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# If you want to stream command output, use subprocess.Popen\n", "# and check carefully subprocess documentation!" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def sh(cmd, shell=False, timeout=0):\n", " \"\"\"\"Returns an iterable output of a command string\n", " checking...\n", " \"\"\"\n", " from sys import version_info as python_version\n", " if python_version < (3, 3): # ..before using..\n", " if timeout:\n", " raise ValueError(\"Timeout not supported until Python 3.3\")\n", " output = check_output(cmd.split(), shell=shell)\n", " else:\n", " output = check_output(cmd.split(), shell=shell, timeout=timeout)\n", " return output.splitlines()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Exercise:\n", "# implement a multiplatform pgrep-like function.\n", "def ppgrep(program):\n", " \"\"\"\n", " A multiplatform pgrep-like function.\n", " Prints a list of processes executing 'program'\n", " @param program - eg firefox, explorer.exe\n", " \n", " Hint: use subprocess, os and list-comprehension\n", " eg. items = [x for x in a_list if 'firefox' in x] \n", " \"\"\"\n", " raise NotImplementedError" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%load course/pgrep.py\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Parsing /proc\n", "\n", "Linux /proc filesystem is a cool place to get data\n", "\n", "In the next example we'll see how to get:\n", " - thread informations;\n", " - disk statistics;\n", " \n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Parsing /proc - 1\n", "def linux_threads(pid):\n", " \"\"\"Retrieving data from /proc\n", " \"\"\"\n", " from glob import glob\n", " # glob emulates shell expansion of * and ?\n", " path = \"/proc/{}/task/*/status\".format(pid)\n", " \n", " \n", " # pick a set of fields to gather\n", " t_info = ('Pid', 'Tgid', 'voluntary') # this is a tuple!\n", " for t in glob(path):\n", " # ... and use comprehension to get \n", " # intersting data.\n", " t_info = [x \n", " for x in open(t) \n", " if x.startswith(t_info)] # startswith accepts tuples!\n", " print(t_info)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# If you're on linux try linux_threads\n", "pid_of_init = 1 # or systemd ?\n", "linux_threads(pid_of_init)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# On linux /proc/diskstats is the source of I/O infos\n", "disk_l = grep(\"sda\", \"/proc/diskstats\")\n", "print(''.join(disk_l))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# To gather that data we put the header in a multiline string\n", "from course import diskstats_headers as headers\n", "print(*headers, sep='\\n')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Take the 1st entry (sda), split the data...\n", "disk_info = disk_l[0].split()\n", "# ... and tie them with the header\n", "ret = zip(headers, disk_info)\n", "\n", "# On py3 we need to iterate over the generators\n", "print(list(ret))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Try to mangle ret\n", "print('\\n'.join(str(x) for x in ret))\n", "# Exercise: trasform ret in a dict." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# We can create a reusable commodity class with\n", "from collections import namedtuple\n", "\n", "# using the imported `headers` as attributes\n", "# like the one provided by psutil\n", "DiskStats = namedtuple('DiskStat', headers)\n", "\n", "# ... and disk_info as values\n", "dstat = DiskStats(*disk_info)\n", "print(dstat.device, dstat.writes_ms)\n", "\n", "# Homework: check further features with\n", "# help(collections)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Exercise\n", "# Write the following function \n", "def linux_diskstats(partition):\n", " \"\"\"Print every second I/O information from /proc/diskstats\n", " \n", " @param: partition - eg sda1 or vdx1\n", " \n", " Hint: use the above `grep` function\n", " Hint: use zip, time.sleep, print() and *magic\n", " \"\"\"\n", " diskstats_headers = ('reads reads_merged reads_sectors reads_ms'\n", " ' writes writes_merged writes_sectors writes_ms'\n", " ' io_in_progress io_ms_weight').split()\n", " \n", " while True:\n", " raise NotImplementedError\n", " print(values, sep=\"\\t\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load course/linux_diskstats.py\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Using check_output with split() doesn't always work\n", "from os import makedirs\n", "makedirs('/tmp/course/b l a n k s') # , exist_ok=True) this on py3\n", " \n", "check_output('ls \"/tmp/course/b l a n k s\"'.split())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# You can use\n", "from shlex import split\n", "# and\n", "cmd = split('dir -a \"/tmp/course/b l a n k s\"')\n", "check_output(cmd)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## zip on py3 is a generator \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# zip_iterables():\n", "\"\"\"The zip method joins list elements pairwise\n", " like a zip fastener\n", "\"\"\"\n", "from sys import version_info as python_version\n", "a_list = [0, 1, 2, 3]\n", "b_list = [\"a\", \"b\", \"c\", \"d\"]\n", "zipper = zip(a_list, b_list)\n", "print(zipper)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "if python_version >= (3,):\n", " zipper = list(zipper)\n", "assert zipper == [(0, \"a\"), (1, \"b\"), (2, \"c\"), (3, \"d\")]" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }