{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Unit 2.2 Data Compression, Images\n",
"> Lab will perform alterations on images, manipulate RGB values, and reduce the number of pixels. College Board requires you to learn about Lossy and Lossless compression. \n",
"- toc: true\n",
"- image: /images/python.png\n",
"- categories: []\n",
"- type: ap\n",
"- week: 25"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Enumerate \"Data\" Big Idea from College Board \n",
"> Note the big ideas and vocabulary you observe below, and talk about them with a partner ...\n",
"- \"Data compression is the reduction of the number of bits needed to represent data\"\n",
"- \"Data compression is used to save transmission time and storage space.\"\n",
"- \"lossy data can reduce data but the original data is not recovered\"\n",
"- \"lossless data lets you restore and recover\"\n",
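"\n",
"The two definitions above can be checked with a few lines of Python. This is a minimal sketch and not part of the lab code: it assumes `zlib` from the standard library as the lossless example, and the rounding step is only a toy stand-in for lossy compression.\n",
"\n",
"```python\n",
"import zlib\n",
"\n",
"original = bytes([200, 201, 199, 200] * 1000)  # 4000 bytes of similar pixel-like values\n",
"\n",
"# Lossless: fewer bytes are stored, and decompressing restores every byte\n",
"packed = zlib.compress(original)\n",
"restored = zlib.decompress(packed)\n",
"print(len(original), len(packed), restored == original)\n",
"\n",
"# Lossy (toy stand-in): round each value down to a multiple of 16, then compress\n",
"rounded = bytes((value // 16) * 16 for value in original)\n",
"packed_lossy = zlib.compress(rounded)\n",
"print(len(packed_lossy), rounded == original)  # the original detail cannot be recovered\n",
"```\n",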
"\n",
"The [Image Lab Project](https://csp.nighthawkcodingsociety.com/starter/rgb/) contains a plethora of College Board Unit 2 data concepts. Working with Images provides many opportunities for compression and analyzing size."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Image Files and Size\n",
"> Here are some [image files](https://github.com/nighthawkcoders/nighthawk_csp/tree/master/starter/static/img). Download these files and load them into an `images` directory under **_notebooks** in your Blog. \n",
" \n",
"- [Clouds Impression](https://github.com/nighthawkcoders/APCSP/blob/master/_notebooks/images/clouds-impression.png)\n",
"- [Lassen Volcano](https://github.com/nighthawkcoders/APCSP/blob/master/_notebooks/images/lassen-volcano.jpg)\n",
"- [Green Square](https://github.com/nighthawkcoders/APCSP/blob/master/_notebooks/images/green-square-16.png)\n",
"\n",
"Describe some of the metadata and considerations when managing image files. Describe how these relate to Data Compression ...\n",
"- File Type, PNG and JPG are two types used in this lab\n",
"- Size, height and width, number of pixels\n",
"- Visual perception, lossy compression"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Displaying images in Python Jupyter notebook\n",
"> Python Libraries and Concepts used for Jupyter and Files/Directories\n",
"\n",
"### IPython \n",
"> Supports visualization of data in Jupyter notebooks. The visualization is specific to the notebook view; for the web, it needs to be converted to HTML.\n",
"\n",
"### pathlib\n",
"> File paths are written differently on Windows than on Mac and Linux. This can cause problems in a project as you work and deploy on different Operating Systems (OS's); pathlib is a solution to this problem. \n",
"- What are commands you use in terminal to access files?\n",
"- What are the commands you use in Windows terminal to access files?\n",
"- What are some of the major differences?\n",
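"\n",
"As a starting point for the questions above, here is a minimal sketch of how pathlib hides the separator difference. The file name is taken from the image list earlier in this Blog; `PurePosixPath` and `PureWindowsPath` are standard pathlib classes used here only to show both styles on one machine.\n",
"\n",
"```python\n",
"from pathlib import Path, PurePosixPath, PureWindowsPath\n",
"\n",
"# The / operator joins path parts using the separator of the current OS\n",
"image_file = Path('images') / 'green-square-16.png'\n",
"print(image_file.name, image_file.suffix)\n",
"\n",
"# Pure paths show how the same parts are written on each OS family\n",
"parts = ('images', 'green-square-16.png')\n",
"print(PurePosixPath(*parts))    # forward slash, as on Mac and Linux\n",
"print(PureWindowsPath(*parts))  # backslash, as on Windows\n",
"```\n",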
"\n",
"Provide what you observed, struggled with, or learned while playing with this code.\n",
"- Why is path a big deal when working with images?\n",
"- How do the metadata fields source and label relate to Unit 5 topics?\n",
"- Look up IPython. Why is it interesting in Jupyter Notebooks for both Pandas and Images?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import Image, display\n",
"from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f\n",
"\n",
"# prepares a series of images\n",
"def image_data(path=Path(\"images/\"), images=None):  # path of static images is defaulted\n",
"    if images is None:  # default image\n",
"        images = [\n",
"            {'source': \"Peter Carolin\", 'label': \"Clouds Impression\", 'file': \"clouds-impression.png\"},\n",
"            {'source': \"Peter Carolin\", 'label': \"Lassen Volcano\", 'file': \"lassen-volcano.jpg\"}\n",
"        ]\n",
"    for image in images:\n",
"        # File to open\n",
"        image['filename'] = path / image['file']  # file with path\n",
"    return images\n",
"\n",
"def image_display(images):\n",
"    for image in images:\n",
"        display(Image(filename=image['filename']))\n",
"\n",
"\n",
"# Run this as standalone tester to see sample data printed in Jupyter terminal\n",
"if __name__ == \"__main__\":\n",
"    # display the image supplied as a parameter\n",
"    green_square = image_data(images=[{'source': \"Internet\", 'label': \"Green Square\", 'file': \"green-square-16.png\"}])\n",
"    image_display(green_square)\n",
"\n",
"    # display default images from image_data()\n",
"    default_images = image_data()\n",
"    image_display(default_images)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading and Encoding Images (2 implementations follow)\n",
"\n",
"### PIL (Python Imaging Library)\n",
"> [Pillow or PIL](https://pillow.readthedocs.io/en/stable/) provides the ability to work with images in Python. [Geeks for Geeks](https://www.geeksforgeeks.org/working-images-python/?ref=lbp) shows some ideas on working with images. \n",
"\n",
"\n",
"### base64\n",
"> Image formats (JPG, PNG) are often called **Binary File formats**; it is difficult to embed their raw bytes in text-based content such as HTML. Thus, [base64](https://en.wikipedia.org/wiki/Base64) regroups binary data, 3 bytes (24 bits) at a time, into four 6-bit Base64 digits, each written as a printable ASCII character. Base64 is used to transport and embed binary images in textual assets such as HTML and CSS.\n",
"- How is Base64 similar or different to Binary and Hexadecimal?\n",
"- Translate first 3 letters of your name to Base64.\n",
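"\n",
"To check your translation, here is a minimal sketch using Python's standard `base64` module. The sample text `Man` is only an illustration, not part of the lab code; swap in the first three letters of your own name.\n",
"\n",
"```python\n",
"import base64\n",
"\n",
"raw = b'Man'                     # three 8-bit bytes = 24 bits\n",
"encoded = base64.b64encode(raw)  # four 6-bit digits, one ASCII character each\n",
"print(encoded.decode())          # TWFu\n",
"\n",
"# Show the regrouping: the same 24 bits split as 3 x 8 and as 4 x 6\n",
"bits = ''.join(format(byte, '08b') for byte in raw)\n",
"print([bits[i:i + 8] for i in range(0, 24, 8)])\n",
"print([bits[i:i + 6] for i in range(0, 24, 6)])\n",
"\n",
"# Decoding restores the original bytes exactly\n",
"print(base64.b64decode(encoded) == raw)\n",
"```\n",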
"\n",
"\n",
"### numpy\n",
"> [Numpy](https://numpy.org/) is described as \"The fundamental package for scientific computing with Python\". In the Image Lab, a Numpy array is created from the image data in order to simplify access and change to the RGB values of the pixels, converting pixels to grey scale.\n",
"\n",
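"The grey scale conversion can also be written with whole-array operations instead of a pixel-by-pixel loop. This is a minimal sketch under assumptions: the 2x2 `pixels` array is hypothetical sample data, not an image from the lab, and the lab code itself uses a loop.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# A hypothetical 2x2 RGB image: red, green, blue, and white pixels\n",
"pixels = np.array([[[255, 0, 0], [0, 255, 0]],\n",
"                   [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)\n",
"\n",
"# Average the three channels of every pixel at once (integer division);\n",
"# widen to 16 bits first so the sum cannot overflow 8 bits\n",
"average = pixels.astype(np.uint16).sum(axis=2) // 3\n",
"\n",
"# Copy the average into R, G, and B to get a grey image of the same shape\n",
"grey = np.stack([average] * 3, axis=2).astype(np.uint8)\n",
"print(grey[0, 0], grey[1, 1])\n",
"```\n",
"\n",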
"\n",
"### io, BytesIO\n",
"> Input and Output (I/O) is a fundamental of all Computer Programming. I/O buffering is a technique used to optimize I/O operations: data is held in a temporary area of memory (the buffer) while it waits to be processed. With large quantities of data, the buffer is how many frames of input are currently queued. In this lab, `BytesIO` is an in-memory buffer that holds the image bytes, and a very large picture can lag while it is processed.\n",
"- Where have you been a consumer of buffering? \n",
"- From your consumer experience, what effects have you experienced from buffering? \n",
"- How do these effects apply to images?\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Structures, Imperative Programming Style, and working with Images\n",
"> Introduction to creating metadata and manipulating images. Look at each procedure and explain the purpose and results of this program. Add any insights or challenges as you explored this program.\n",
"- Does this code seem like a series of steps being performed?\n",
"- Describe the Grey Scale algorithm in English or pseudocode.\n",
"- Describe scale image. What are the pixel counts before and after scaling for the three images?\n",
"- Is scale image a type of compression? If so, line it up with the College Board terms described above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML, display\n",
"from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f\n",
"from PIL import Image as pilImage  # as pilImage is used to avoid conflicts\n",
"from io import BytesIO\n",
"import base64\n",
"import numpy as np\n",
"\n",
"# prepares a series of images\n",
"def image_data(path=Path(\"images/\"), images=None):  # path of static images is defaulted\n",
"    if images is None:  # default image\n",
"        images = [\n",
"            {'source': \"Internet\", 'label': \"Green Square\", 'file': \"green-square-16.png\"},\n",
"            {'source': \"Peter Carolin\", 'label': \"Clouds Impression\", 'file': \"clouds-impression.png\"},\n",
"            {'source': \"Peter Carolin\", 'label': \"Lassen Volcano\", 'file': \"lassen-volcano.jpg\"}\n",
"        ]\n",
"    for image in images:\n",
"        # File to open\n",
"        image['filename'] = path / image['file']  # file with path\n",
"    return images\n",
"\n",
"# Large image scaled to baseWidth of 320\n",
"def scale_image(img):\n",
"    baseWidth = 320\n",
"    scalePercent = (baseWidth/float(img.size[0]))\n",
"    scaleHeight = int((float(img.size[1])*float(scalePercent)))\n",
"    scale = (baseWidth, scaleHeight)\n",
"    return img.resize(scale)\n",
"\n",
"# PIL image converted to base64\n",
"def image_to_base64(img, format):\n",
"    with BytesIO() as buffer:\n",
"        img.save(buffer, format)\n",
"        return base64.b64encode(buffer.getvalue()).decode()\n",
"\n",
"# Set Properties of Image, Scale, and convert to Base64\n",
"def image_management(image):  # image is one dictionary from image_data()\n",
"    # Image open return PIL image object\n",
"    img = pilImage.open(image['filename'])\n",
"\n",
"    # Python Image Library operations\n",
"    image['format'] = img.format\n",
"    image['mode'] = img.mode\n",
"    image['size'] = img.size\n",
"    # Scale the Image\n",
"    img = scale_image(img)\n",
"    image['pil'] = img\n",
"    image['scaled_size'] = img.size\n",
"    # Scaled HTML: base64 text embedded in an img tag as a data URI\n",
"    image['html'] = '<img src=\"data:image/png;base64,%s\">' % image_to_base64(image['pil'], image['format'])\n",
"\n",
"# Create Grey Scale Base64 representation of Image\n",
"def image_management_add_html_grey(image):\n",
"    # Reuse the scaled PIL image object stored by image_management\n",
"    img = image['pil']\n",
"    format = image['format']\n",
"\n",
"    img_data = img.getdata()  # Reference https://www.geeksforgeeks.org/python-pil-image-getdata/\n",
"    image['data'] = np.array(img_data)  # PIL image to numpy array\n",
"    image['gray_data'] = []  # list for pixel data converted to gray scale\n",
"\n",
"    # 'data' is a list of RGB data, the list is traversed and each pixel is replaced by its gray scale average\n",
"    for pixel in image['data']:\n",
"        # create gray scale of image, ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/\n",
"        average = (pixel[0] + pixel[1] + pixel[2]) // 3  # average pixel values and use // for integer division\n",
"        if len(pixel) > 3:\n",
"            image['gray_data'].append((average, average, average, pixel[3]))  # PNG format, keep the alpha value\n",
"        else:\n",
"            image['gray_data'].append((average, average, average))\n",
"        # end for loop for pixels\n",
"\n",
"    img.putdata(image['gray_data'])\n",
"    image['html_grey'] = '<img src=\"data:image/png;base64,%s\">' % image_to_base64(img, format)\n",
"\n",
"\n",
"# Jupyter Notebook Visualization of Images\n",
"if __name__ == \"__main__\":\n",
"    # Load the default list of image dictionaries\n",
"    images = image_data()\n",
"\n",
"    # Display meta data, scaled view, and grey scale for each image\n",
"    for image in images:\n",
"        image_management(image)\n",
"        print(\"---- meta data -----\")\n",
"        print(image['label'])\n",
"        print(image['source'])\n",
"        print(image['format'])\n",
"        print(image['mode'])\n",
"        print(\"Original size: \", image['size'])\n",
"        print(\"Scaled size: \", image['scaled_size'])\n",
"\n",
"        print(\"-- original image --\")\n",
"        display(HTML(image['html']))\n",
"\n",
"        print(\"--- grey image ----\")\n",
"        image_management_add_html_grey(image)\n",
"        display(HTML(image['html_grey']))\n",
"    print()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Structures and OOP\n",
"> Most data structures classes require Object Oriented Programming (OOP). Since this class is lined up with a College Course, OOP will be talked about often. Functionality in the remainder of this Blog is the same as the prior implementation. Highlight some of the key differences you see between imperative and OOP styles.\n",
"- Read imperative and object-oriented programming on Wikipedia\n",
"- Consider how data is organized in the two examples, in relation to procedures\n",
"- Look at Parameters in Imperative and Self in OOP\n",
"\n",
"## Additionally, review all the imports in these three demos. Create a definition of their purpose, specifically these ...\n",
"- PIL\n",
"- numpy\n",
"- base64"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML, display\n",
"from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f\n",
"from PIL import Image as pilImage  # as pilImage is used to avoid conflicts\n",
"from io import BytesIO\n",
"import base64\n",
"import numpy as np\n",
"\n",
"\n",
"class Image_Data:\n",
"\n",
"    def __init__(self, source, label, file, path, baseWidth=320):\n",
"        self._source = source    # variables with self prefix become part of the object\n",
"        self._label = label\n",
"        self._file = file\n",
"        self._filename = path / file  # file with path\n",
"        self._baseWidth = baseWidth\n",
"\n",
"        # Open image and scale to needs\n",
"        self._img = pilImage.open(self._filename)\n",
"        self._format = self._img.format\n",
"        self._mode = self._img.mode\n",
"        self._originalSize = self._img.size\n",
"        self.scale_image()\n",
"        self._html = self.image_to_html(self._img)\n",
"        self._html_grey = self.image_to_html_grey()\n",
"\n",
"\n",
"    @property\n",
"    def source(self):\n",
"        return self._source\n",
"\n",
"    @property\n",
"    def label(self):\n",
"        return self._label\n",
"\n",
"    @property\n",
"    def file(self):\n",
"        return self._file\n",
"\n",
"    @property\n",
"    def filename(self):\n",
"        return self._filename\n",
"\n",
"    @property\n",
"    def img(self):\n",
"        return self._img\n",
"\n",
"    @property\n",
"    def format(self):\n",
"        return self._format\n",
"\n",
"    @property\n",
"    def mode(self):\n",
"        return self._mode\n",
"\n",
"    @property\n",
"    def originalSize(self):\n",
"        return self._originalSize\n",
"\n",
"    @property\n",
"    def size(self):\n",
"        return self._img.size\n",
"\n",
"    @property\n",
"    def html(self):\n",
"        return self._html\n",
"\n",
"    @property\n",
"    def html_grey(self):\n",
"        return self._html_grey\n",
"\n",
"    # Large image scaled to baseWidth of 320\n",
"    def scale_image(self):\n",
"        scalePercent = (self._baseWidth/float(self._img.size[0]))\n",
"        scaleHeight = int((float(self._img.size[1])*float(scalePercent)))\n",
"        scale = (self._baseWidth, scaleHeight)\n",
"        self._img = self._img.resize(scale)\n",
"\n",
"    # PIL image converted to base64 and embedded in an img tag as a data URI\n",
"    def image_to_html(self, img):\n",
"        with BytesIO() as buffer:\n",
"            img.save(buffer, self._format)\n",
"            return '<img src=\"data:image/png;base64,%s\">' % base64.b64encode(buffer.getvalue()).decode()\n",
"\n",
"    # Create Grey Scale Base64 representation of Image\n",
"    def image_to_html_grey(self):\n",
"        img_grey = self._img.copy()  # copy so the scaled colour image is left unchanged\n",
"        pixels = np.array(img_grey.getdata())  # PIL image to numpy array\n",
"\n",
"        grey_data = []  # list for pixel data converted to gray scale\n",
"        # 'pixels' is a list of RGB data, the list is traversed and each pixel is replaced by its gray scale average\n",
"        for pixel in pixels:\n",
"            # create gray scale of image, ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/\n",
"            average = (pixel[0] + pixel[1] + pixel[2]) // 3  # average pixel values and use // for integer division\n",
"            if len(pixel) > 3:\n",
"                grey_data.append((average, average, average, pixel[3]))  # PNG format, keep the alpha value\n",
"            else:\n",
"                grey_data.append((average, average, average))\n",
"            # end for loop for pixels\n",
"\n",
"        img_grey.putdata(grey_data)\n",
"        return self.image_to_html(img_grey)\n",
"\n",
"\n",
"# prepares a series of images, provides expectation for required contents\n",
"def image_data(path=Path(\"images/\"), images=None):  # path of static images is defaulted\n",
"    if images is None:  # default image\n",
"        images = [\n",
"            {'source': \"Internet\", 'label': \"Green Square\", 'file': \"green-square-16.png\"},\n",
"            {'source': \"Peter Carolin\", 'label': \"Clouds Impression\", 'file': \"clouds-impression.png\"},\n",
"            {'source': \"Peter Carolin\", 'label': \"Lassen Volcano\", 'file': \"lassen-volcano.jpg\"}\n",
"        ]\n",
"    return path, images\n",
"\n",
"# turns data into objects\n",
"def image_objects():\n",
"    id_Objects = []\n",
"    path, images = image_data()\n",
"    for image in images:\n",
"        id_Objects.append(Image_Data(source=image['source'],\n",
"                                     label=image['label'],\n",
"                                     file=image['file'],\n",
"                                     path=path,\n",
"                                     ))\n",
"    return id_Objects\n",
"\n",
"# Jupyter Notebook Visualization of Images\n",
"if __name__ == \"__main__\":\n",
"    for ido in image_objects():  # ido is an Image Data Object\n",
"\n",
"        print(\"---- meta data -----\")\n",
"        print(ido.label)\n",
"        print(ido.source)\n",
"        print(ido.file)\n",
"        print(ido.format)\n",
"        print(ido.mode)\n",
"        print(\"Original size: \", ido.originalSize)\n",
"        print(\"Scaled size: \", ido.size)\n",
"\n",
"        print(\"-- scaled image --\")\n",
"        display(HTML(ido.html))\n",
"\n",
"        print(\"--- grey image ---\")\n",
"        display(HTML(ido.html_grey))\n",
"\n",
"    print()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Hacks\n",
"> Early Seed award\n",
"- Add this Blog to your own Blogging site.\n",
"- In the Blog add a Happy Face image.\n",
"- Have Happy Face Image open when Tech Talk starts, running on localhost. Don't tell anyone. Show to Teacher.\n",
"\n",
"> AP Prep\n",
"- In the Blog add notes and observations on each code cell that request an answer.\n",
"- In blog add College Board practice problems for 2.2\n",
"- Choose 2 images, one that will more likely result in lossy data compression and one that is more likely to result in lossless data compression. Explain.\n",
"\n",
"> Project Addition\n",
"- If your project has images in it, try to implement an image change that has a purpose. (Ex. An item that has been sold out could become gray scale)\n",
"\n",
"> Pick a programming paradigm and solve some of the following ...\n",
"- Numpy, manipulating pixels. As opposed to Grey Scale treatment, pick a couple of other types like red scale, green scale, or blue scale. We want you to be manipulating pixels in the image.\n",
"- Binary and Hexadecimal reports. Convert and produce pixels in binary and Hexadecimal and display.\n",
"- Compression and Sizing of images. Look for insights into compression Lossy and Lossless. Look at PIL library and see if there are other things that can be done.\n",
"- There are many effects you can do as well with PIL. Blur the image or write Meta Data on screen, aka Title, Author and Image size.\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "8b82d5009c68ba5675978267e2b13a671f2a7143d61273c5a3813c97e0b2493d"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}