{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Multiplying Numpy Arrays: dot vs ```__mul__```\n", "\n", "Data Science has everything to do with linear algebra.\n", "\n", "When we want to do a weighted sum, we can put the weights in a row vector, and what they multiply in a column vector. \n", "\n", "Assigning weights, usually iteratively, in response to back propagation, is at the heart of machine learning, from logistic regression to neural networks.\n", "\n", "Lets go over the basics of creating row and column vectors, such that dot products become possible.\n", "\n", "You will find np.dot(A, B) works the same as A.dot(B) when it comes to numpy arrays." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "one_dimensional = np.array([1,1,1,2,3,3,3,3,3])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "one_dimensional" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "one_dimensional.shape # not yet rows & columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "one_dimensional.reshape((9,-1)) # let numpy figure out how many columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "one_dimensional # still the same" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "one_dimensional.ndim" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "two_dimensional = one_dimensional.reshape(1,9) # recycle same name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "two_dimensional.shape # is now 2D even if just the one row" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "two_dimensional.ndim" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class M:\n", " \"\"\"Symbolic representation of multiply, add\"\"\"\n", " def __init__(self, s):\n", " self.s = str(s)\n", " def __mul__(self, other):\n", " return M(self.s + \" * \" + other.s) # string\n", " def __add__(self, other):\n", " return M(self.s + \" + \" + other.s)\n", " def __repr__(self):\n", " return self.s\n", " \n", "#Demo\n", "one = M(1)\n", "two = M(2)\n", "print(one * two)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "A,B,C = map(M, ['A','B','C']) # create three M type objects" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "m_array = np.array([A,B,C]) # put them in a numpy array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_array.dtype # infers type (Object)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "m_array = m_array.reshape((-1, len(m_array))) # make this 2 dimensional" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_array.shape # transpose works for > 1 dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_array.T # stand it up (3,1) vs (1,3) shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_array.dot(m_array.T) # row dot column i.e. self * self.T" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "m_array.T[1,0] = M('Z') # transpose is not a copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_array # original has changes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_array * m_array # dot versus element-wise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### LAB CHALLENGE:\n", "\n", "Create two arrays of compatiable dimensions and form their dot product.\n", "\n", "```numpy.random.randint``` is a good source of random numbers (for data)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from pandas import Series" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = Series(np.arange(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### LAB CHALLENGE:\n", "\n", "Does the pandas Series have a dot product method? Check it out!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }