{
"metadata": {
"name": "",
"signature": "sha256:9b05ca5a3cd95989b7fb52040aeb5f36f1ffd803397f804ec61b361d290dc501"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Multiple Labels from a Segmented Image"
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Objective"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example is inspired by a real task in the lab. We obtained a carefully segmented image of gold nanoparticles and wanted to quantify various aspects of each group (i.e. single particles, dimers, trimers and large clusters). For example, how does the eccentricity vary from one subgroup to the next? The color labels were pre-assigned, and utilities to juggle the various species were imported into pyparty."
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Environment Setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Configure notebook style (see NBCONFIG.ipynb), add imports and paths. The **%run** magic used below **requires IPython 2.0 or higher.**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%run NBCONFIG.ipynb"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Populating the interactive namespace from numpy and matplotlib\n"
]
},
{
"html": [
""
],
"metadata": {},
"output_type": "display_data",
"text": [
""
]
}
],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First let's look at the test data, which we have prelabeled from a previous analysis."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from pyparty.data import nanolabels, nanogold\n",
"\n",
"NANOLABELS = nanolabels()\n",
"ax1, ax2 = splot(1,2)\n",
"\n",
"showim(nanogold(), ax1, title='SEM nanoparticles')\n",
"showim(NANOLABELS, ax2, 'spectral', \n",
" title='size-segmented nanoparticles');\n",
"\n",
"print 'unique colors:', np.unique(NANOLABELS)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"unique colors: [ 0. 1. 2. 3. 4.]\n"
]
},
{
"metadata": {},
"output_type": "display_data",
"svg": [
"\n",
"\n",
"\n",
"\n"
],
"text": [
""
]
}
],
"prompt_number": 2
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Working with Masks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The segmented image has five unique labels, with 0 being the background. The first task is to split the segmented image into masks for each of the particle categories. ``pyparty`` provides a utility to simplify this task, **multi_mask(labeledimage, names, ignore=0)**:\n",
"\n",
"- labeledimage : Image whose labels are colored uniquely by particle group (see above)\n",
"- names (optional) : Names for each category. If omitted, the actual pixel values of the image are used.\n",
" - For example, \"singles\" instead of 1.0, which is the pixel value in the image\n",
"- ignore : Pixel value to ignore\n",
" - **Defaults to 0**, which is a black background. \n",
"- astype : Container with which to output (names, masks); default is tuple\n",
" - (name1, mask1) vs. {name1 : mask1}\n",
" - If None, returns a generator. **I recommend the OrderedDict**\n",
"\n",
"This is best illustrated with an example."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from pyparty.multi import multi_mask\n",
"from collections import OrderedDict\n",
"\n",
"NAMES = ('singles', 'dimers', 'trimers', 'clusters')\n",
"\n",
"axes = splot(2,2, figsize=(10,6))\n",
"\n",
"masks = multi_mask(NANOLABELS, *NAMES, astype=OrderedDict)\n",
"\n",
"for idx, (name, image) in enumerate(masks.items()):\n",
" showim(image, axes[idx], 'gray', title=name)\n",
"\n",
"type(masks), masks.keys()[0], masks.values()[0].shape"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"(collections.OrderedDict, 'singles', (614, 1012))"
]
},
{
"metadata": {},
"output_type": "display_data",
"svg": [
"\n",
"\n",
"\n",
"\n"
],
"text": [
""
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see that the return of *multi_mask()* is an OrderedDict of boolean arrays, each one corresponding to a subgroup of interest. For certain tasks, the masks are sufficient; for example, the fraction of the image occupied by dimers is the ratio of white (True) pixels in the dimer mask to the total number of pixels:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"L, W = NANOLABELS.shape\n",
"# np.sum counts the True pixels over the whole 2D mask\n",
"dimer_area = np.sum(masks['dimers'])\n",
"\n",
"print 'Dimer coverage: %.1f%%' % \\\n",
" (100.0 * ( float(dimer_area) / (L * W)) )"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Dimer coverage: 6.7%\n"
]
}
],
"prompt_number": 4
},
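{
"cell_type": "markdown",
"metadata": {},
"source": [
"The idea behind per-label masks and the coverage calculation can be sketched with plain NumPy. This is only an illustration under stated assumptions: `toy_labels` is a made-up array (not the nanoparticle image), and the names `toy_masks` and `toy_coverage` are hypothetical. It is not meant to show how `multi_mask` is implemented."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"from collections import OrderedDict\n",
"\n",
"# Hypothetical toy labeled image: 0 = background, 1 and 2 = particle groups\n",
"toy_labels = np.array([[0, 1, 1, 0, 0],\n",
"                       [0, 1, 0, 0, 2],\n",
"                       [0, 0, 0, 2, 2],\n",
"                       [0, 0, 0, 2, 2]])\n",
"\n",
"# One boolean mask per non-background label value\n",
"toy_masks = OrderedDict(\n",
"    (value, toy_labels == value)\n",
"    for value in np.unique(toy_labels) if value != 0)\n",
"\n",
"# Coverage = True pixels in the mask / total pixels in the image\n",
"toy_coverage = OrderedDict(\n",
"    (value, 100.0 * mask.sum() / mask.size)\n",
"    for value, mask in toy_masks.items())\n",
"\n",
"for value, percent in toy_coverage.items():\n",
"    print('label %d coverage: %.1f%%' % (value, percent))"
],
"language": "python",
"metadata": {},
"outputs": []
},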
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For more complex tasks, working directly with masks can be cumbersome. Therefore, we turn to a simple Canvas container object called **MultiCanvas**."
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Working with MultiCanvas"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See the related [MultiCanvas tutorial](http://nbviewer.ipython.org/github/hugadams/pyparty/blob/master/examples/Notebooks/multi_tutorial.ipynb?create=1). We will focus on the constructor, *MultiCanvas.from_labeled()*, which reads a labeled image into a container of names/canvases. The options of *from_labeled* are:\n",
"\n",
"- **ignore**: 0\n",
" - Same as multi_mask()\n",
"- **neighbors** : 4 or 8 (4)\n",
" - Connectivity of labels\n",
"- **pmankwargs** : args passed directly to ParticleManager.from_labels()\n",
" - For example, pmin=10\n",
"- **maximum** : 10; total number of unique labels an image may have before an error is raised.\n",
" - **This is a failsafe.** If a user passes a labeled image with 500 distinct colors, construction will be extremely slow, mostly due to the slowness of `canvas.from_labels()`. multi_mask should still be fairly fast. Raise the maximum at your discretion!\n",
" \n",
"One should note that we can easily get the aforementioned directly from the MultiCanvas via:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from pyparty import MultiCanvas\n",
"\n",
"mc = MultiCanvas.from_labeled(NANOLABELS, *NAMES)\n",
"mc"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stderr",
"text": [
"No handlers could be found for logger \"pyparty.tools.manager\"\n"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"MultiCanvas (0xabd508c): \n",
" singles - Canvas (0xaa81d4c) : 614 X 1012 : 1162 particles\n",
" dimers - Canvas (0xaa817dc) : 614 X 1012 : 269 particles \n",
" trimers - Canvas (0xabd532c) : 614 X 1012 : 46 particles \n",
" clusters - Canvas (0xabd529c) : 614 X 1012 : 40 particles "
]
}
],
"prompt_number": 5
},
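{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an aside on the **neighbors** option listed above: 4-connectivity joins pixels that share an edge, while 8-connectivity also joins pixels that touch only at a corner. The cell below is a minimal pure-Python sketch of that difference. `toy_binary` and `count_components` are hypothetical names introduced for illustration, and this is not presented as the routine pyparty uses to label particles."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Hypothetical binary image: two foreground pixels touching only at a corner\n",
"toy_binary = [[1, 0, 0],\n",
"              [0, 1, 0],\n",
"              [0, 0, 0]]\n",
"\n",
"def count_components(image, neighbors=4):\n",
"    # Count connected foreground regions with 4- or 8-connectivity (flood fill)\n",
"    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]\n",
"    if neighbors == 8:\n",
"        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]\n",
"    rows, cols = len(image), len(image[0])\n",
"    seen = set()\n",
"    count = 0\n",
"    for r in range(rows):\n",
"        for c in range(cols):\n",
"            if not image[r][c] or (r, c) in seen:\n",
"                continue\n",
"            # New unvisited foreground pixel: start a new region\n",
"            count += 1\n",
"            seen.add((r, c))\n",
"            stack = [(r, c)]\n",
"            while stack:\n",
"                y, x = stack.pop()\n",
"                for dy, dx in steps:\n",
"                    ny, nx = y + dy, x + dx\n",
"                    if (0 <= ny < rows and 0 <= nx < cols\n",
"                            and image[ny][nx] and (ny, nx) not in seen):\n",
"                        seen.add((ny, nx))\n",
"                        stack.append((ny, nx))\n",
"    return count\n",
"\n",
"print('4-connectivity: %d regions' % count_components(toy_binary, 4))\n",
"print('8-connectivity: %d regions' % count_components(toy_binary, 8))"
],
"language": "python",
"metadata": {},
"outputs": []
},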
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's use MultiCanvas to do the following:\n",
"\n",
"*Plot the distribution of areas in each particle subgroup (i.e. pie chart or histogram). How does this compare to multiples of the **mean single particle** area?* "
]
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"Step 1: Shared plotting parameters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before plotting, let's set some shared parameters like colors, as well as compute the mean singles area $\\mu$. "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"mc.set_colors('r','g','y', 'magenta')\n",
"mc.mycolors"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
"{'clusters': 'magenta', 'dimers': 'g', 'singles': 'r', 'trimers': 'y'}"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Take out singles, take mean\n",
"mu = np.mean(mc['singles'].area)\n",
"'Mean single particle area: %s pixels' % round(mu,1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 7,
"text": [
"'Mean single particle area: 74.6 pixels'"
]
}
],
"prompt_number": 7
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"Step 2: Pie Chart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While the histogram and pie chart are both built into the MultiCanvas, the histogram takes more keywords to look nice. Therefore, let's do the pie chart first. We will use the keyword \"*autopct = percent*\" to display the percentage of each species (autopct also accepts *count*, *both*, or any valid pie autopct value)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"ax1, ax2 = splot(1, 2, figsize=(10,5))\n",
"\n",
"chartkwds = {'autopct':'percent', 'shadow':True}\n",
"\n",
"mc.pie(ax1, **chartkwds);\n",
"mc.pie(ax2, attr='area', explode=(0,0,0,0.1), **chartkwds);"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"svg": [
"\n",
"\n",
"\n",
"\n"
],
"text": [
""
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" **We see that although clusters only account for 2.6% of the particle count, they make up over 10% of the coverage!**"
]
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"Step 3: The stacked histogram (sans clusters)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because there are very few clusters, but they are large in area, it's tough to get a nice histogram with them included. Therefore, we will drop them (they are the last entry, at index 3):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"del mc['clusters']\n",
"mc.names"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
"['singles', 'dimers', 'trimers']"
]
}
],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we merely make the histogram:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"BINS = 30\n",
"YMAX = XMAX = 300\n",
"\n",
"mc.hist(attr='area', bins=BINS);\n",
"\n",
"# Add vlines at 1, 2 and 3 times the mean single-particle area\n",
"plt.vlines((mu, 2*mu, 3*mu), 0, YMAX, linestyles='--')\n",
"\n",
"# Add text to plot\n",
"amu = '$A_\\mu$'\n",
"textkwds = {'color':'blue', 'bbox':{'facecolor':'gray', 'alpha':.5}}\n",
"\n",
"plt.text(25, 275, 'x < %s' % amu, **textkwds)\n",
"plt.text(80, 275, '%s < x < 2%s' % (amu, amu), **textkwds)\n",
"plt.text(158, 275, '2%s < x < 3%s' % (amu,amu), **textkwds)\n",
"plt.text(250, 200, 'x > 3%s' % amu, **textkwds)\n",
"\n",
"plt.xlim(0,XMAX)\n",
"plt.ylim(0,YMAX);"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"svg": [
"\n",
"\n",
"\n",
"\n"
],
"text": [
""
]
}
],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see that the single-particle distribution is nicely centered around the mean single-particle area. However, the histogram can be misleading because it displays the **number** of particles rather than their relative share of the surface coverage. *For this, a pie chart was more informative.*"
]
},
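{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference between share by count and share by area can be made concrete with a small NumPy sketch. The numbers are made up for illustration (they are not the nanoparticle measurements), and `toy_areas`, `count_share` and `area_share` are hypothetical names: a few large particles can be a small fraction of the count yet a large fraction of the coverage."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"# Hypothetical per-particle areas in pixels for two made-up groups\n",
"toy_areas = {'small': np.full(8, 10.0), 'large': np.full(2, 60.0)}\n",
"\n",
"total_count = float(np.sum([len(a) for a in toy_areas.values()]))\n",
"total_area = float(np.sum([a.sum() for a in toy_areas.values()]))\n",
"\n",
"# Share of each group by particle count versus by covered area\n",
"count_share = dict((k, 100.0 * len(a) / total_count) for k, a in toy_areas.items())\n",
"area_share = dict((k, 100.0 * a.sum() / total_area) for k, a in toy_areas.items())\n",
"\n",
"for name in sorted(toy_areas):\n",
"    print('%s: %.0f%% of particles, %.0f%% of area'\n",
"          % (name, count_share[name], area_share[name]))"
],
"language": "python",
"metadata": {},
"outputs": []
},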
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"Miscellaneous: Patch plots"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"mc['singles'].patchshow(gcolor='red', title='Singles patches');"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"svg": [
"\n",
"\n",
"\n",
"