{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It might seem that we have only seen one type of data so far: *numbers*. However, Python actually has *two* types of numbers -- *integers* and *floating point numbers* (or *floats*, for short). We have seen both already! Understanding the differences between integers and floats is crucial for any data scientist, as we'll see in this section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Two Types of Numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python recognizes two types of numbers: *integers* and *floating point numbers* (or *floats* for short).\n", "An {dterm}`integer` is a number without decimals. For instance:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "42" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "-7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A {dterm}`float` is a number that *does* have decimals (even if that decimal component is zero), like:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```{margin}\n", "The term 'float' comes from \"floating point number\", since they're represented by the computer as a value and the position of the decimal point -- so the decimal point can 'float' to any position in the value.\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3.14159265" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "42.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When floats get really big or really small they might be printed in scientific notation. You can write floats in scientific notation, too." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1000000000000000000.0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1e18" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Tip**\n", "\n", "You can place underscores in large numbers to make them easier to read -- Python will ignore them. For instance, it is hard to read `10000000000`, but somewhat easier to read `10_000_000_000`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we write a number, Python automatically determines whether it is a *float* or an *integer*. We can see the type that Python has determined using the `type` function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(42)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(-7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(3.14159265)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(42.0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use the type function to ask the type of a variable:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 42\n", "type(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = 7\n", "z = 2\n", "type(y + z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Types and Arithmetic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Every value in Python has a type. 
Since an expression results in a value, we can ask about its type:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(1 + 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we see above, adding two integers results in an integer, and the same holds for subtraction. However, if we add an integer and a float, the result will be a float:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(1 + 3.1415)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This makes sense: the result of `1 + 3.1415` is `4.1415`, and so Python treats it as a float because it is a decimal number. But consider this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(1.2 + 3.8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mathematically speaking, the result of $1.2 + 3.8$ is just $5$, which has no fractional component. But Python treats the result as a float instead of an integer! This might surprise you at first, but Python is following a simple rule here: if the result of arithmetic *could* be a decimal number, the result is a float.\n", "\n", "Let's put that to the test. Suppose we add two integers. The result cannot have a decimal component, so it will be an integer. But if we add an integer and a float, the result *could* be a decimal number, depending on the exact numbers used. Therefore the result will be a float.\n", "\n", "```{margin}\n", "\n", "Why is the result of combining a `float` and an `int` always a `float`? Python doesn't want to make you guess what type something is, so it wouldn't be nice to have the result sometimes be an `int` and sometimes a `float`. 
If we have to choose between the two, the best choice is clearly `float`.\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you try:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question**:\n", " Suppose we perform the division `6/3`. What is the type of the result?\n", "\n", "
Answer: `float`. Always and forever.
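
The rule can be checked directly with `type`. The sketch below reuses expressions from this section; the variable names are only for illustration and do not appear elsewhere in the text.

```python
# Types of a few arithmetic results, following the rule described above.
int_plus_int = type(1 + 4)          # two integers
int_plus_float = type(1 + 3.1415)   # an integer and a float
float_plus_float = type(1.2 + 3.8)  # two floats whose sum is a whole number
int_div_int = type(6 / 3)           # division of two integers

print(int_plus_int)
print(int_plus_float)
print(float_plus_float)
print(int_div_int)
```

Only the first expression produces an `int`; every other result is a `float`, including the division.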
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since dividing two integers *could* result in a decimal number, the result is *always* a `float`, even when the answer is mathematically-speaking an integer." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Tip**\n", "\n", "If you find this rule confusing, you can replace it with these two equivalent rules instead:\n", "\n", "1. When two numbers are combined, with one of them being a float, the result is a float.\n", "2. Dividing two numbers results in a float, even if both numbers are integers." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conversions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes we *know* that something Python thinks is a float should be an integer, or *vice versa*. For instance, we have seen that `6/3` will be a float, even though we know that (mathematically-speaking) the result has no decimal place. We can *convert* a float to an integer using the `int` function, like so:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "int(4/2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Likewise, if we want to convert an integer to a float, we use the `float` function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "float(42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose you try to convert a number like `3.14` to an integer. What do you think will happen?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "int(3.14)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It looks like Python is rounding the number -- but be careful. 
To be precise, Python is rounding the number *towards* zero:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "int(3.9999)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "int(-2.9999)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you try:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question**:\n", " Which of the two is bigger? `int(-3.9999)` or `int(-4.0001)`?\n", "\n", "
Answer: `int(-3.9999)`
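
One way to see the difference between rounding towards zero and rounding down is to compare `int` with `math.trunc` and `math.floor` from the standard library. Neither `math` function is used elsewhere in this section; they appear here only for comparison.

```python
import math

a = int(-3.9999)   # drops the fractional part, moving towards zero
b = int(-4.0001)   # likewise

print(a, b, a > b)

# math.trunc also drops the fractional part, so it agrees with int on floats.
print(math.trunc(-3.9999), math.trunc(3.9999))

# math.floor always rounds down, so it differs from int for negative numbers.
print(math.floor(-3.9999), math.floor(3.9999))
```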
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since Python rounds the numbers *towards* zero, `int(-3.9999)` will evaluate to `-3` while `int(-4.0001)` will evaluate to `-4` and we know that `-3>-4`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Integers and Floats *Redux*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are some important differences between integers and floats. First, integers can be *arbitrarily* large, while floats can *overflow*. For instance, let's compute $2^{10{,}000}$, first using integers, and then using floats.\n", "\n", "With integers, we write `2**10_000`. The result will be an integer (and a big one, too):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "2**10_000" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's try it with floats. We can do this by writing `2.0**10_000`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "2.0**10_000" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`OverflowError`! This is Python's way of telling us that the result of the expression is too big for Python to compute using floats." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```{margin}\n", "\n", "Without getting too deep into the details, the reason Python can represent huge integers, but not huge floats, has to do with *memory*. Python allows integers to take up as much memory as they want, but requires floats to take up a specific, fixed amount of memory. In the case of an overflow, it would require more memory than allowed to compute the result of the expression. 
Python limits the memory given to a float in order to ensure that doing computations with floats is very fast.\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Floating point numbers are also of a fixed precision, meaning that only so many digits can be stored. If we try to compute or store a number with too many decimal places (say, more than 16), Python will \"forget\" the digits beyond a certain point:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "0.12345678901234567890123" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's another example. $1/(2^{10{,}000})$ is a very small decimal number, but it isn't zero. It is too small for Python to calculate using floats, however:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1/(2**10000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because floats lack some precision, small arithmetic errors called *floating point errors* can result from float operations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# should be 251, exactly!\n", "2.51 * 100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This also means that sometimes something that *should* be zero doesn't seem to be zero, but instead appears to be a super small number." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(3.0 * 1.2) - 3.6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This might look like a bug in Python, but it isn't! This is an inherent limitation of *all* programming languages which use floating point numbers (which is most of them). It usually isn't that big of an issue, as long as you're aware of the problem and are careful. 
For instance, you should get an uneasy feeling when you write code like `int(2.51 * 100)`, because it may not behave the way you'd first expect:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# \"should\" be 251!\n", "int(2.51 * 100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you try:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question**:\n", " Python supports a `round()` function that (in simple terms) rounds a decimal to the nearest integer, like we do by hand, so 3.7 gets rounded to 4 and 3.1 gets rounded to 3. With this in mind, answer the following question:\n", "Are `int(2.51*100)` and `round(2.51*100)` equivalent?\n", "\n", "
Answer: No, they are not equivalent.
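
The comparison can be run directly. This is a minimal sketch; the names `product`, `truncated`, and `rounded` are illustrative.

```python
product = 2.51 * 100        # stored as a float slightly below 251

truncated = int(product)    # drops the fractional part
rounded = round(product)    # rounds to the nearest integer

print(product)
print(truncated, rounded)
print(truncated == rounded)
```

When a calculation is expected to land on a whole number, `round` is usually the safer conversion, since `int` discards everything after the decimal point.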
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we saw earlier, Python evaluates `2.51*100` to `250.99999999999997`.\n", "\n", "- `int(250.99999999999997)` evaluates to `250`\n", "- `round(250.99999999999997)` evaluates to `251`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "- Every value in Python has a type -- these are called **data types**.\n", "- We can find the type of a value or expression by calling the `type` function on it.\n", "- Python has two basic number types: `float` and `int`.\n", "- Division, or any expression that involves a float, results in a float." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }