{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Learning about spatial data and maps for archaeology (and other things)\n", "\n", "### Spatial Thinking and Skills Exercise 1 for Archaeology of Scotland\n", "\n", "#### Made by Rachel Opitz, Archaeology, University of Glasgow\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Archaeologists regularly work with maps and data about where sites, samples and objects are found. We ask lots of questions that have a spatial component. Which Bronze Age cairns are close to the coast in England? In this excavation, is bone found inside a building or outside in the yard? In archaeology space and place matter. It's important to learn to work with spatial data and maps in order to succeed in a variety of careers in archaeology and heritage management.\n", "\n", "The aim of this exercise is for you to:\n", "* learn to make very simple static maps\n", "* learn to ask simple questions using spatial data. This is sometimes referred to as 'writing queries'.\n", "* start thinking about the importance of spatial relationships and data in archaeology. \n", "\n", "To start working with spatial data and maps, you need to put together your toolkit. You're currently working inside something called a jupyter notebook, which will be a key part of your spatial analysis toolkit. It's a place to keep notes, pictures, code and maps together. You can add tools and data into your jupyter notebook and then use them to ask spatial questions and make maps and visualisations that help answer those questions. \n", "\n", "Work your way down the page, reading the notes and comments at each step and then running it's code to see the results. In jupyter notebooks you hit 'Ctrl+Enter' to run each bit of code. Anything written with a # symbol in front of it is a comment. Be sure to read these!\n", "\n", "### Let's get started... Remember, hit 'Ctrl'+'Enter' to run the code in any cell in the page." ] }, { "cell_type": "markdown", "metadata": { "scrolled": true, "slideshow": { "slide_type": "slide" } }, "source": [ "![The map that came to life](https://c1.staticflickr.com/4/3017/2863068137_055aef279a_b.jpg)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### We'll start by adding some of the tools we will need. They're not quite like these tools...\n", "\n", "![They're not quite like these tools...](https://facetsarchaeology.files.wordpress.com/2016/07/dsc01218.jpg?w=664&h=429)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Start loading your tools by telling the notebook to 'import' them (from the internet).\n", "\n", "%matplotlib inline\n", "import pandas as pd\n", "import requests\n", "import fiona\n", "import geopandas as gpd\n", "import ipywidgets as widgets\n", "import bokeh\n", "\n", "# These are what we call prerequisites. They are basic toosl you need to get started.\n", "# Pandas manipulate data. Geo-pandas manipulate geographic data. They're also black and white and like to eat bamboo... \n", "# You need these to manipulate your data!\n", "# Fiona helps with geographic data.\n", "# Requests are for asking for things. It's good to be able to ask for things.\n", "# ipywidgets supports interactivity. \n", "# Matplotlib is your tool for drawing graphs and basic maps. You need this!\n", "\n", "\n", "# Remember to click inside this box and hit Ctrl+Enter to make things happen!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Now that we have the basic tools loaded up we need some data. This data is from the Linlithgow Carmetlite Monastery Cemetery excavations, and can be downloaded from the Archaeological Data Service (ADS).\n", "\n", "The [ADS](http://archaeologydataservice.ac.uk/) is an archaeological archive that provides data on an open access basis. You can learn more about the Linlithgow excavations dataset, which is part of the 'Medieval Monastic Cemeteries of Britain (1050-1600)' Project [here](http://archaeologydataservice.ac.uk/archives/view/cemeteries_ahrb_2005/index.cfm). " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "url = 'https://raw.githubusercontent.com/ropitz/spatialarchaeology/master/data/linlithgow_burials_Features.json'\n", "# This is where I put the data. It's in a format called geojson, used to represent geometry (shapes) and attributes (text).\n", "# Geojson is a common format for spatial data, especially if it is being shared online.\n", "\n", "request = requests.get(url)\n", "# Please get me the data at that web address (url)\n", "\n", "b = bytes(request.content)\n", "# I will use the letter 'b' to refer to the data, like a nickname. \n", "# In this step, I am reading the stuff on the page the url (web address ) points to\n", "\n", "with fiona.BytesCollection(b) as f:\n", " crs = f.crs\n", " linlithgow_burials = gpd.GeoDataFrame.from_features(f, crs=crs)\n", " print(linlithgow_burials.head())\n", "# In this step I will use the fiona tool to wrap up all the data from 'b' into a tidy package. \n", "# Then I check the coordinate system (crs) listed in the file\n", "# and print out the first few lines of the file so I can check everything looks ok. \n", "# Don't worry if you don't understand all the details of this part!\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Does that look right? \n", "\n", "You should have a bunch of information that describes the shapes of lines. These are the outlines of the shape of each burial and a 'stick figure' type skeleton for some of them of the contexts from Linlithgow. You should also have a bunch of descriptions and information about the burials archaeology. Importatly you should be able to spot the column names: Descriptio, objectid, shape_area, shape_leng, SU, definition, finds_note, formation, geometry, interpret. Each column contains a different type of information. Note that SU (stratigraphic unit) = context. \n", "\n", "Spatial data by itself isn't that useful. If we just had a bunch of lines and no descriptions of them we couldn't say much about the archaeological features at Linlithgow's cemetery. It's the combination of spatial and descriptive data that is interesting. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Right now you have your spatial and descritive data as a table. \n", "# It's hard to read spatial data as just a list of numbers and understand the shapes that are being described.\n", "# Let's visualise the data as a map to better understand the spatial information. \n", "\n", "linlithgow_map1 = linlithgow_burials.plot(column='AT', cmap='Accent', edgecolor='grey', figsize=(15, 15));\n", "\n", "# Let's break down that command.\n", "# 'plot' means draw me a map showing the geometry of each feature in my data. \n", "# We want to control things like the color of different types of burials on our map. \n", "# I used the pastel colorscale command (cmap stands for 'colour map') \n", "# and asked it to draw the polygons differently based on the type of burial. \n", "# The 'AT' column, you can see in the table, lists the different types of burials.\n", "# I also told it to make my figure 15x15 in size (figsize).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Well done! You just made your first archaeological map. It shows all the burials excavated at Carmelite Linlithgow. \n", "\n", "This is good, but what if you only want to look at one kind of burial? We can select specific types of burials from within our dataset by searching (aka querying) for them. \n", "\n", "How do we know what kind of burials we have? Looking at what's inside the data describing all those shapes on the map should help. \n", "\n", "Start by printing out our data in a tidy way. Just type its name..." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "linlithgow_burials\n", "# Typing the name of any dataset will print it out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In archaeology we often talk about the dates at which different things happened, and when activities started and ended. Sometimes we are not sure when something happened, so we give a range of dates with an early guess and a late guess. Look at the 'e_date' (earliest date) and 'l_date' (latest date) columns in the table, and you'll see the date ranges guessed for each burial." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Say you only want to look at the burials from before 1400, the first ones. Pandas use square brackets [] to make selections. \n", "# Here we select all the rows (.loc) where the column 'L_DATE' has a value less than 1400. < means 'less than' in code\n", "\n", "linlithgow_burials.loc[linlithgow_burials['L_DATE']<1400]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# You should have a table view of the data and if you look at the 'L_Date' column, you should only see dates earlier than 1400.\n", "# If we want to see this result as a map, we just add the .plot command to the end.\n", "\n", "linlithgow_burials.loc[linlithgow_burials['L_DATE'] <1400].plot(column='AT', cmap='Accent', figsize=(15, 15))\n", "\n", "# Note that I've used many of the same commands that I used before to control the color of the features and the map size." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Try and do the same thing for burials that are earlier than 1500\n", "linlithgow_burials.loc[(linlithgow_burials['L_DATE']<=1500) & (linlithgow_burials['L_DATE'] >= 1400)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Remember to draw it as a map!\n", "linlithgow_burials.loc[(linlithgow_burials['L_DATE']<=1500) & (linlithgow_burials['L_DATE'] >= 1400)].plot(column='AT', cmap='Accent', figsize=(15, 15))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Let's save these selections of 'pre1400' and 'pre1500' so we can use them again.\n", "# I've given them names here. These are now 'named variables'\n", "pre1400 = linlithgow_burials.loc[linlithgow_burials['L_DATE'] <1400]\n", "pre1500 = linlithgow_burials.loc[(linlithgow_burials['L_DATE']<=1500) & (linlithgow_burials['L_DATE'] >= 1400)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "#Test your named variable by printing it out again, calling it by its name.\n", "pre1400" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So far these searches have been about the attributes of our data, or the way each context has been described. You could try selecting on different attributes to ask your own questions, following the pattern of commands we used above. For example, you could search for a specific type of burial, or for burials later than a certain date. \n", "\n", "We can also ask questions about spatial relationships between contexts or about the real-world location of our contexts. For example, we could try and find out the location of our whole data. To describe the location of the whole dataset, we might draw a box around all the features. This is called a 'bounding box'. Let's find the bounding box, or real world location and extent of our data. We use the command 'total_bounds' to ask this question. Things 'in bounds' are inside the box." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "pre1400.total_bounds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You should see a set of coordinates that represent the location of your data in the real world. They are in a coordinate system called OSBG. OSBG is one of the most common coordinate systems in the UK. You can learn more about coordinate systems [here](https://www.e-education.psu.edu/natureofgeoinfo/c2_p10.html) and [here](https://www.e-education.psu.edu/natureofgeoinfo/c2_p11.html).\n", "\n", "Why does this matter? Well, if you wanted to get out a map and find the location of this cemetery so you could go visit the place, you would need the coordinates. If you wanted to tell someone else where they were, or tell a planner the area where they should not build a road, you would need the coordiantes to do so." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Now do the same thing for the pre-1500 burials. The results should be similar, but not identical.\n", "# Take a minute and think about why this would be the case.\n", "pre1500.total_bounds" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# So far we have been asking questions about groups of features of different types. \n", "# We can also ask spatial questions about single contexts or burials.\n", "# Now we will select a single burial by the context number assigned to it.\n", "pre1500_182 = pre1500.loc[pre1500['CONTEXT'] == 182]\n", "pre1500_182\n", "pre1500_182.plot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "#Now we will select another specific burial.\n", "pre1500_201 = pre1500.loc[pre1500['CONTEXT'] == 201]\n", "pre1500_201\n", "pre1500_201.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### One of these burials is immature (a young person) while one is identified as an older person.\n", " \n", "What kinds of spatial questions can we ask by comparing individual burials? We might ask if they are they different shapes or sizes, or facing different directions. Look at the maps and try and spot any differences. Orientation is easy enough, but it's hard to compare size when the burials are on different maps. Putting the individual burials we want to compare on the same map will make it easier." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# To do this we have to provide a list of the values we are interested in seeing on the map, in square brackets []\n", "pre1500_both= pre1500.loc[pre1500['CONTEXT'].isin([201,182])]\n", "pre1500_both\n", "pre1500_both.plot(column='CONTEXT', cmap='Accent', figsize=(15, 15))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "What does this map suggest? Are there any size, shape or orientation differences in these burials? What might similarities or differences mean?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Now try another question. What if we wanted to find all the contexts defined as infant burials? \n", "\n", "Look in the 'AT' column. Infants are defined as 'INFANT (0-5)'. In many archaeological situations infants are buried differently. Perhaps they are in a separate area, or always close to an adult. These are interesting questions to investigate spatially." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "linlithgow_infants = linlithgow_burials[linlithgow_burials['AT'].str.contains('INFANT')]\n", "linlithgow_infants\n", "# The command .str.contains means that we want all the contexts where the word 'infant' appears anywhere in the AT column.\n", "# It doesn't have to be an exact match, which is useful as archaeological data is often a little inconsistent or untidy." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Now create a map of all the burials of infants.\n", "linlithgow_infants.plot(column='CONTEXT', cmap='Accent', edgecolor='grey', figsize=(15, 15))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What if we wanted to know about burials that were near infant burials? Let's construct a new query.\n", "How close is close? Let's say 0.5meters. This will be a two step process...\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# First we use the 'buffer' command to expand the size of each line and make it 0.5m thick\n", "# Doing this defines the area within 0.5 of each infant burial.\n", "linlithgow_infants_close = linlithgow_infants.buffer(0.5)\n", "linlithgow_infants_close.plot(cmap='Accent', edgecolor='grey', figsize=(15, 15))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Now we plot the intersection between the buffered infant burials shapes and all the other burials shapes.\n", "# This result should return all the burials that physically overlap the area within 0.5 of infant burials.\n", "linlithgow_burials.union(linlithgow_infants_close).plot(cmap='Accent', edgecolor='grey', figsize=(15, 15))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### What do we conclude? Are infants buried close to other burials? To adults?\n", "\n", "### This ends the tutorial. You can practice writing queries (asking questions of your data) by playing around in this notebook. Try changing values or searching for different types of burials or their dates. You'll be doing this in class during your next practical!\n", "\n", "Hopefully you learned to:\n", "* make very simple static maps\n", "* ask simple questions using spatial data. This is sometimes referred to as 'writing queries'.\n", "* start thinking about the importance of spatial relationships and data in archaeology. \n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }