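{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Manipulating data in Python\n", "\n", "In Python, you can manipulate your data in various ways." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function genfromtxt in module numpy.lib.npyio: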

genfromtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None, encoding='bytes')
    Load data from a text file, with missing values handled as specified. If the filename\n", " extension is .gz or .bz2, the file is first decompressed. Note\n", " that generators must return byte strings in Python 3k. The strings\n", " in a list or produced by a generator are treated as lines.\n", " dtype : dtype, optional\n", " Data type of the resulting array.\n", " If None, the dtypes will be determined by the contents of each\n", " column, individually.\n", " comments : str, optional\n", " The character used to indicate the start of a comment.\n", " All the characters occurring on a line after a comment are discarded\n", " delimiter : str, int, or sequence, optional\n", " The string used to separate values. By default, any consecutive\n", " whitespaces act as delimiter. An integer or sequence of integers\n", " can also be provided as width(s) of each field.\n", " skiprows : int, optional\n", " skiprows was removed in numpy 1.10. Please use skip_header instead.\n", " skip_header : int, optional\n", " The number of lines to skip at the beginning of the file.\n", " skip_footer : int, optional\n", " The number of lines to skip at the end of the file.\n", " converters : variable, optional\n", " The set of functions that convert the data of a column to a value.\n", " The converters can also be used to provide a default value\n", " for missing data: converters = {3: lambda s: float(s or 0)}.\n", " missing : variable, optional\n", " missing was removed in numpy 1.10. Please use missing_values\n", " instead.\n", " missing_values : variable, optional\n", " The set of strings corresponding to missing data.\n", " filling_values : variable, optional\n", " The set of values to be used as default when the data are missing.\n", " usecols : sequence, optional\n", " Which columns to read, with 0 being the first. 
For example,\n", " usecols = (1, 4, 5) will extract the 2nd, 5th and 6th columns.\n", " names : {None, True, str, sequence}, optional\n", " If names is True, the field names are read from the first line after\n", " the first skip_header lines. This line can optionally be proceeded\n", " by a comment delimeter. If names is a sequence or a single-string of\n", " comma-separated names, the names will be used to define the field names\n", " in a structured dtype. If names is None, the names of the dtype\n", " fields will be used, if any.\n", " excludelist : sequence, optional\n", " A list of names to exclude. This list is appended to the default list\n", " ['return','file','print']. Excluded names are appended an underscore:\n", " for example, file would become file_.\n", " deletechars : str, optional\n", " A string combining invalid characters that must be deleted from the\n", " names.\n", " defaultfmt : str, optional\n", " A format used to define default field names, such as \"f%i\" or \"f_%02i\".\n", " autostrip : bool, optional\n", " Whether to automatically strip white spaces from the variables.\n", " replace_space : char, optional\n", " Character(s) used in replacement of white spaces in the variables\n", " names. 
By default, use a '_'.\n", " case_sensitive : {True, False, 'upper', 'lower'}, optional\n", " If True, field names are case sensitive.\n", " If False or 'upper', field names are converted to upper case.\n", " If 'lower', field names are converted to lower case.\n", " unpack : bool, optional\n", " If True, the returned array is transposed, so that arguments may be\n", " unpacked using x, y, z = loadtxt(...)\n", " usemask : bool, optional\n", " If True, return a masked array.\n", " If False, return a regular array.\n", " loose : bool, optional\n", " If True, do not raise errors for invalid values.\n", " invalid_raise : bool, optional\n", " If True, an exception is raised if an inconsistency is detected in the\n", " number of columns.\n", " If False, a warning is emitted and the offending lines are skipped.\n", " max_rows : int, optional\n", " The maximum number of rows to read. Must not be used with skip_footer\n", " at the same time. If given, the value must be at least 1. Default is\n", " to read the entire file.\n", " \n", " .. versionadded:: 1.10.0\n", " encoding : str, optional\n", " Encoding used to decode the inputfile. Does not apply when fname is\n", " a file object. The special value 'bytes' enables backward compatibility\n", " workarounds that ensure that you receive byte arrays when possible\n", " and passes latin1 encoded strings to converters. Override this value to\n", " receive unicode arrays and pass strings as input to converters. If set\n", " to None the system default is used. The default value is 'bytes'.\n", " \n", " .. versionadded:: 1.14.0\n", " \n", " Returns\n", " -------\n", " out : ndarray\n", " Data read from the text file. 
If usemask is True, this is a\n", " masked array.\n", " \n", " See Also\n", " --------\n", " numpy.loadtxt : equivalent function when no data is missing.\n", " \n", " Notes\n", " -----\n", " * When spaces are used as delimiters, or when no delimiter has been given\n", " as input, there should not be any missing data between two fields.\n", " * When the variables are named (either by a flexible dtype or with names,\n", " there must not be any header in the file (else a ValueError\n", " exception is raised).\n", " * Individual values are not stripped of spaces by default.\n", " When using a custom converter, make sure the function does remove spaces.\n", " \n", " References\n", " ----------\n", " .. [1] NumPy User Guide, section I/O with NumPy\n", " _.\n", " \n", " Examples\n", " ---------\n", " >>> from io import StringIO\n", " >>> import numpy as np\n", " \n", " Comma delimited file with mixed dtype\n", " \n", " >>> s = StringIO(\"1,1.3,abcde\")\n", " >>> data = np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),\n", " ... ('mystring','S5')], delimiter=\",\")\n", " >>> data\n", " array((1, 1.3, 'abcde'),\n", " dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', '|S5')])\n", " \n", " Using dtype = None\n", " \n", " >>> s.seek(0) # needed for StringIO example only\n", " >>> data = np.genfromtxt(s, dtype=None,\n", " ... names = ['myint','myfloat','mystring'], delimiter=\",\")\n", " >>> data\n", " array((1, 1.3, 'abcde'),\n", " dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', '|S5')])\n", " \n", " Specifying dtype and names\n", " \n", " >>> s.seek(0)\n", " >>> data = np.genfromtxt(s, dtype=\"i8,f8,S5\",\n", " ... names=['myint','myfloat','mystring'], delimiter=\",\")\n", " >>> data\n", " array((1, 1.3, 'abcde'),\n", " dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', '|S5')])\n", " \n", " An example with fixed-width columns\n", " \n", " >>> s = StringIO(\"11.3abcde\")\n", " >>> data = np.genfromtxt(s, dtype=None, names=['intvar','fltvar','strvar'],\n", " ... delimiter=[1,3,5])\n", " >>> data\n", " array((1, 1.3, 'abcde'),\n", " dtype=[('intvar', '<i8'), ('fltvar', '<f8'), ('strvar', '|S5')])\n", " \n", " >>> import numpy as np\n", " >>> import matplotlib.pyplot as plt\n", " >>> from scipy.optimize import curve_fit\n", " \n", " >>> def func(x, a, b, c):\n", " ... 
return a * np.exp(-b * x) + c\n", " \n", " Define the data to be fit with some noise:\n", " \n", " >>> plt.ylabel('y')\n", " >>> plt.legend()\n", " >>> plt.show()\n", "\n" ] } ], "source": [ "help(curve_fit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now overplot the best-fit line to the data:" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "text/plain": [ "