Overview
Cell morphology in general, and cell shape in particular, provides a readout of a cell's organizational and physiological state. In evaluating cell health or state, we frequently make inferences from the outline and texture of the cell’s membrane. This is easy to do cell by cell but becomes substantially harder for a population of hundreds or thousands of cells.
One of the data products of the Allen Institute for Cell Science is a large corpus of segmented cell and nuclear shapes. Segmentation is a necessary step in isolating individual cells from our high-magnification fields of cells. The segmentations also provide a simple way to characterize cell and nuclear shape, being nothing more than True/False (cell/not-cell or nucleus/not-nucleus) matrices.
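To make the True/False matrix idea concrete, here is a minimal sketch using NumPy. The ellipsoids are synthetic stand-ins, not real segmentations, and the helper name `make_ellipsoid_mask`, the array sizes, and the radii are all assumptions chosen for illustration.

```python
import numpy as np

def make_ellipsoid_mask(shape=(32, 48, 48), radii=(10, 18, 14)):
    """Return a boolean (z, y, x) array that is True inside a centered ellipsoid.

    Hypothetical helper: real segmentations would be loaded from disk instead.
    """
    z, y, x = np.indices(shape)
    center = [(s - 1) / 2 for s in shape]
    dist = (((z - center[0]) / radii[0]) ** 2
            + ((y - center[1]) / radii[1]) ** 2
            + ((x - center[2]) / radii[2]) ** 2)
    return dist <= 1.0

cell_mask = make_ellipsoid_mask()                    # stand-in "cell"
nucleus_mask = make_ellipsoid_mask(radii=(5, 8, 7))  # stand-in "nucleus"

# Simple shape descriptors fall straight out of the boolean array.
volume_voxels = int(cell_mask.sum())                 # number of True voxels
occupied = np.argwhere(cell_mask)                    # (n_voxels, 3) coordinates
extent = occupied.max(axis=0) - occupied.min(axis=0) + 1  # bounding box (z, y, x)
```

Because each mask is just an array of booleans, it can be flattened into a single row of a data matrix, which is the form a decomposition such as PCA expects.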
We find that our cells have more variation in shape than might naively be assumed, requiring twenty components to capture half the shape variation, while nuclear shape is more stereotyped. The primary modes of variation for cell shape are how much of a waist a given cell has and how much it leans to the side or stands up straight. The primary modes of variation for nuclear shape are how much the nucleus presents as pancaked/squashed or presents as cylindrical. To help build intuition about shape across our population, let's take a look at a decomposition that tells us mean shapes and breaks out shape variation along a couple of axes.
Sections below mix code and visualizations using the common Jupyter Notebook data science toolkit.
Jupyter.org provides introductions that need no installation and let you try out this powerful ecosystem.
Dive in to the What shape are our hiPS Cells? Jupyter notebook
First we need to get Jupyter set up.
With Jupyter running, you can paste the code below and work through to the bottom of the notebook, or use our published GitHub version.
Code for setup
Code to load the cells
What do our cells look like?
We’ll need to define some plotting helper functions first and then we can plot our cells.
Code for plotting
Our cells look good in stripes! The stripes also help us see their contours.
So we’ve got a very nice number of cells loaded....
Let’s find their principal components.
We’ll do this over a number of batches and use multithreading, but it will still take some time.
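As a rough illustration of fitting principal components batch by batch, here is a minimal sketch that assumes scikit-learn's `IncrementalPCA`, which may or may not be what the notebook itself uses. The `random_mask_batch` function is a hypothetical stand-in for loading a batch of segmentations, the array sizes are arbitrary, and multithreading is left out to keep the sketch short.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)

def random_mask_batch(n_cells, shape=(8, 16, 16)):
    """Hypothetical stand-in for loading one batch of boolean segmentations."""
    z, y, x = np.indices(shape)
    center = np.array([(s - 1) / 2 for s in shape])
    masks = np.empty((n_cells,) + shape, dtype=bool)
    for i in range(n_cells):
        # Random ellipsoid radii give each fake "cell" a different shape.
        radii = rng.uniform(0.5, 1.0, size=3) * center
        masks[i] = (((z - center[0]) / radii[0]) ** 2
                    + ((y - center[1]) / radii[1]) ** 2
                    + ((x - center[2]) / radii[2]) ** 2) <= 1.0
    return masks

n_components = 20
ipca = IncrementalPCA(n_components=n_components)

for _ in range(5):
    batch = random_mask_batch(64)
    # Flatten each 3D mask into one row; PCA needs a 2D (cells, voxels) matrix.
    flat = batch.reshape(len(batch), -1).astype(np.float32)
    # Each batch must contain at least n_components samples.
    ipca.partial_fit(flat)

print(ipca.components_.shape)  # one row per component, one column per voxel
```

Fitting incrementally means only one batch of flattened masks needs to be in memory at a time, which matters when each cell contributes many thousands of voxels.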
Code for principal components
How well do those principal components do at explaining the variance?
Code for plotting variance
These components capture a moderate share of the natural variation seen in the cell outlines. It is a fair bit easier to capture nuclear variance than cell shape variance, likely because the nucleus is a more stereotyped shape.
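One way to put a number on "how much variation is captured" is to accumulate the per-component explained-variance ratios and ask how many components are needed to reach a target fraction. The sketch below does this with NumPy; the function name `components_for_variance` is hypothetical and the ratio values are made up purely for illustration, not taken from our data.

```python
import numpy as np

def components_for_variance(explained_variance_ratio, target=0.5):
    """Smallest number of leading components whose cumulative ratio reaches
    `target`, or None if the supplied components never reach it."""
    cumulative = np.cumsum(explained_variance_ratio)
    reached = np.nonzero(cumulative >= target)[0]
    return int(reached[0]) + 1 if reached.size else None

# Made-up ratios for illustration only: a slowly decaying "cell-like"
# spectrum and a faster decaying "nucleus-like" spectrum.
cell_ratios = np.array([0.10, 0.08, 0.06, 0.05, 0.04, 0.03])
nucleus_ratios = np.array([0.30, 0.15, 0.10, 0.05, 0.03, 0.02])

cell_needed = components_for_variance(cell_ratios)        # not reached in 6
nucleus_needed = components_for_variance(nucleus_ratios)  # reached quickly
```

A more stereotyped shape concentrates its variance in the first few components, so the cumulative curve crosses the target sooner.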
What do the principal components look like?
Let’s see how they are distributed by calculating them for a subset of the cells we trained on. Since the information content levels off and it would be cluttered to look at all 20 components, I’ll limit this to the first 6.
Code for visualization of principal components
None of those distributions is strongly bimodal or otherwise far from normal. That means we can get a sense of each component's effect on overall cell shape by reconstructing a cell from the mean shape plus one component set to its 10th or 90th percentile value.
Let’s create reconstructions of the mean cell and nucleus with component perturbations.
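The perturbation idea can be sketched as follows: project the data onto the first few components, find the 10th and 90th percentile of each component's scores, and add that much of the component to the mean shape. This is a minimal NumPy sketch on random toy data, with PCA done through an SVD; the names `perturbed_shape` and `reconstructions` and the 0.5 threshold used to turn the reconstruction back into a True/False mask are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in data: 200 "cells", each a flattened 6x6 soft mask in [0, 1).
n_cells, shape = 200, (6, 6)
data = rng.random((n_cells, shape[0] * shape[1]))

mean_shape = data.mean(axis=0)
centered = data - mean_shape

# Rows of vt are the principal components (unit length, mutually orthogonal).
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:6]

# Scores: how much of each component every cell contains.
scores = centered @ components.T                     # shape (n_cells, 6)
low, high = np.percentile(scores, [10, 90], axis=0)  # each has shape (6,)

def perturbed_shape(index, value, threshold=0.5):
    """Mean shape plus `value` times one component, thresholded to a mask."""
    flat = mean_shape + value * components[index]
    return (flat >= threshold).reshape(shape)

# For each component, a (10th percentile, 90th percentile) pair of masks.
reconstructions = {
    i: (perturbed_shape(i, low[i]), perturbed_shape(i, high[i]))
    for i in range(len(components))
}
```

Comparing the two masks in each pair shows which voxels a component switches on or off as its score moves from low to high.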
Code for principal component reconstruction with perturbations
With reconstructions in hand, the last task is to visualize them.
Code for principal component reconstruction visualization
The cell components exhibit more interpretable variation, as expected. The first cell component describes whether the cell has a narrow or straight waist, the second (perhaps) whether the cell spreads at the bottom, the third is a rotational artifact, and the higher components become difficult to interpret.
The first nuclear component may indicate a sharp waist. The second and third nuclear components are cylinder versus sphere and pancake versus sphere, respectively. As with the cell components, higher-order nuclear components become harder to interpret.