{ "metadata": { "name": "", "signature": "sha256:466e416e19b02efbe5b294c7a517ac9bf46377f3f3064a9cb7ba844a7e3c74ee" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "code", "collapsed": false, "input": [ "%load_ext watermark\n", "%watermark -d -v -a 'Sebastian Raschka' " ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Sebastian Raschka 10/12/2014 \n", "\n", "CPython 3.4.2\n", "IPython 2.3.1\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Assigning Majority Mood Labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After classification (regarding the [web app](http://sebastianraschka.com/Webapps/musicmood.html)), a user can leave feedback whether he or she agrees with or not to add a new mood label to the databse. The goal is to collect multiple opinions for particular songs and update the ground truth label by majority rule." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Adding a new majority rule column to the database\n", "import sqlite3\n", "\n", "conn = sqlite3.connect('../../dataset/training/growing_training.sqlite')\n", "c = conn.cursor()\n", "c.execute('ALTER TABLE moodtable ADD COLUMN majoritymood TEXT;')\n", "conn.commit()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Get the majority-rule-based mood label\n", "\n", "from collections import Counter\n", "\n", "test1 = 'happy,happy,sad,happy,sad'\n", "\n", "count = Counter(test1.split(','))\n", "print(count.most_common(1))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[('happy', 3)]\n" ] } ], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "# Update the database\n", "\n", "c.execute('SELECT rowid FROM moodtable')\n", "rowids = c.fetchall()\n", "for i in rowids:\n", " c.execute('SELECT mood from moodtable WHERE rowid=?;', i)\n", " mood = c.fetchone()\n", " mood = Counter(mood[0].split(',')).most_common(1)[0][0]\n", " query = 'UPDATE moodtable SET majoritymood=? WHERE rowid=?;'\n", " data = [mood, i[0]]\n", " c.execute(query, data)\n", "conn.commit()\n", "conn.close()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "# Show the results\n", "\n", "import pandas as pd\n", "import sqlite3\n", "\n", "conn = sqlite3.connect('../../dataset/training/growing_training.sqlite')\n", "print(pd.read_sql('SELECT * FROM moodtable', conn).tail())\n", "conn.close()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " artist title \\\n", "1330 florence the machine dog days are over \n", "1331 ed sheeran fire \n", "1332 dream theater breaking all illusions \n", "1333 coldplay ink \n", "1334 u2 it's a beautiful day \n", "\n", " lyrics mood majoritymood \n", "1330 Happiness, it hurt like a train on a track\\nCo... happy happy \n", "1331 The rain won't stop falling\\nIt's harder than ... sad sad \n", "1332 With the sun in place, there's a test of faith... happy happy \n", "1333 Got a tattoo that said \"2gether thru life\"\\r\\n... happy happy \n", "1334 The heart is a bloom\\nShoots up through stony ... happy happy \n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Randomly select a happy or sad song" ] }, { "cell_type": "code", "collapsed": false, "input": [ "conn = sqlite3.connect('../../dataset/training/growing_training.sqlite')\n", "\n", "sql = \"SELECT artist,title FROM moodtable WHERE majoritymood='happy' ORDER BY RANDOM() LIMIT 1;\"\n", "cursor = conn.cursor()\n", "cursor.execute(sql)\n", "result = cursor.fetchone()\n", "artistname = result[0]#result[1].decode('utf-8')\n", "songtitle = result[1]#result[2].decode('utf-8')\n", "print(artistname, songtitle)\n", "\n", "conn.close()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "faith hill red umbrella\n" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "my_dir = '/Users/sebastian/github/www.sebastianraschka.com/Webapps/musicmood_pythonanywhere_local/\n", "\n", "le = pickle.load(open(os.path.join(my_dir, 'lyrics_label_encoder.p'), 'rb'))\n", "tfidf = pickle.load(open(os.path.join(my_dir, 'lyrics_tfidf_noporter.p'), 'rb'))\n", "clf = pickle.load(open(os.path.join(my_dir, 'lyrics_clf.p'), 'rb')) # trained on no porter" ], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }