geoplot.sankey(*args, projection=None, start=None, end=None, path=None, hue=None, categorical=False, scheme=None, k=5, cmap='viridis', vmin=None, vmax=None, legend=False, legend_kwargs=None, legend_labels=None, legend_values=None, legend_var=None, extent=None, figsize=(8, 6), ax=None, scale=None, limits=(1, 5), scale_func=None, **kwargs)
A geospatial Sankey diagram (flow map).
| Parameters: | |
|---|---|
| Returns: | The axis object with the plot on it. |
| Return type: | AxesSubplot or GeoAxesSubplot instance |
Examples
A Sankey diagram is a type of plot for visualizing flow through a network (a graph). Minard’s diagram of Napoleon’s ill-fated invasion of Russia is a classic example. Use a Sankey diagram when you wish to show movement within a network: traffic load on a road network, for example, or typical airport traffic patterns.
This plot type is unusual amongst geoplot types in that it is meant for two columns of geography,
resulting in a slightly different API. A basic sankey specifies data, start points, end points, and,
optionally, a projection.
import geoplot as gplt
import geoplot.crs as gcrs
gplt.sankey(mock_data, start='origin', end='destination', projection=gcrs.PlateCarree())
However, Sankey diagrams need additional geospatial context to be interpretable. In this case (and for the remainder of the examples) we will provide this by overlaying world geometry.
ax = gplt.sankey(mock_data, start='origin', end='destination', projection=gcrs.PlateCarree())
ax.coastlines()
This function is very seaborn-like in that the usual df argument is optional. If geometries are provided
as independent iterables it can be dropped.
ax = gplt.sankey(projection=gcrs.PlateCarree(), start=network['from'], end=network['to'])
ax.set_global()
ax.coastlines()
You may be wondering why the lines are curved. By default, the paths followed by the plot are the actual shortest paths between the two points, in the spherical sense. These are great circle paths, and their length is known as the great circle distance. We can see this clearly in an orthographic projection.
ax = gplt.sankey(projection=gcrs.Orthographic(), start=network['from'], end=network['to'],
extent=(-180, 180, -90, 90))
ax.set_global()
ax.coastlines()
ax.outline_patch.set_visible(True)
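To make the idea of a shortest path "in the spherical sense" concrete, here is a minimal sketch that interpolates points along the great circle between two coordinates using only the standard library. The helper name great_circle_points is hypothetical and is not part of geoplot or cartopy; it only illustrates the geometry and is not how the library draws its paths.

```python
import math

def great_circle_points(lon1, lat1, lon2, lat2, n=5):
    """Return n (lon, lat) points, in degrees, along the great circle
    from (lon1, lat1) to (lon2, lat2). Hypothetical helper for illustration.
    Requires n >= 2; undefined for exactly antipodal endpoints."""
    def to_xyz(lon, lat):
        # Convert degrees on the unit sphere to a 3D unit vector.
        lam, phi = math.radians(lon), math.radians(lat)
        return (math.cos(phi) * math.cos(lam),
                math.cos(phi) * math.sin(lam),
                math.sin(phi))

    a, b = to_xyz(lon1, lat1), to_xyz(lon2, lat2)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    omega = math.acos(dot)  # central angle between the endpoints
    if omega == 0:
        return [(lon1, lat1)] * n

    points = []
    for i in range(n):
        t = i / (n - 1)
        # Spherical linear interpolation between the two unit vectors.
        wa = math.sin((1 - t) * omega) / math.sin(omega)
        wb = math.sin(t * omega) / math.sin(omega)
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        z = max(-1.0, min(1.0, z))  # guard against rounding error
        points.append((math.degrees(math.atan2(y, x)),
                       math.degrees(math.asin(z))))
    return points
```

For two points on the same parallel, say great_circle_points(-60, 45, 60, 45, n=3), the midpoint lies north of 45 degrees latitude, which is why such a path looks curved on a flat projection like PlateCarree.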
To plot using a different distance metric, pass it as an argument to the path parameter. Awkwardly, cartopy
crs objects (not geoplot ones) are required.
import cartopy.crs as ccrs
ax = gplt.sankey(projection=gcrs.PlateCarree(), start=network['from'], end=network['to'],
path=ccrs.PlateCarree())
ax.set_global()
ax.coastlines()
One of the most powerful sankey features is that if your data has custom paths, you can use those instead
with the path parameter.
gplt.sankey(dc, path=dc.geometry, projection=gcrs.AlbersEqualArea(), scale='aadt',
limits=(0.1, 10))
The hue parameter colorizes paths based on data.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to', path=ccrs.PlateCarree(),
hue='mock_variable')
ax.set_global()
ax.coastlines()
cmap changes the colormap.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='mock_variable', cmap='RdYlBu')
ax.set_global()
ax.coastlines()
legend adds a legend.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='mock_variable', cmap='RdYlBu',
legend=True)
ax.set_global()
ax.coastlines()
Pass keyword arguments to the legend with legend_kwargs. This is often necessary for positioning.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='mock_variable', cmap='RdYlBu',
legend=True, legend_kwargs={'bbox_to_anchor': (1.4, 1.0)})
ax.set_global()
ax.coastlines()
Specify custom legend labels with legend_labels.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='mock_variable', cmap='RdYlBu',
legend=True, legend_kwargs={'bbox_to_anchor': (1.25, 1.0)},
legend_labels=['Very Low', 'Low', 'Average', 'High', 'Very High'])
ax.set_global()
ax.coastlines()
Change the number of bins with k.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='mock_variable', cmap='RdYlBu',
legend=True, legend_kwargs={'bbox_to_anchor': (1.25, 1.0)},
k=3)
ax.set_global()
ax.coastlines()
Change the binning scheme with scheme.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='mock_variable', cmap='RdYlBu',
legend=True, legend_kwargs={'bbox_to_anchor': (1.25, 1.0)},
k=3, scheme='equal_interval')
ax.set_global()
ax.coastlines()
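For intuition about what k and an equal-interval scheme do to a continuous variable, the following is a rough sketch in plain Python. The function equal_interval_bins is hypothetical; it is not geoplot's implementation, and the library's handling of values that fall exactly on a bin boundary may differ.

```python
def equal_interval_bins(values, k=5):
    """Split the data range into k equal-width bins (hypothetical helper).

    Returns (edges, bins): the upper edge of each bin, and the 0-based
    bin index assigned to each value.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    edges = [lo + width * (i + 1) for i in range(k)]
    bins = []
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in one bin
        else:
            # The maximum value would compute to index k, so clamp it
            # into the last bin.
            idx = min(int((v - lo) / width), k - 1)
        bins.append(idx)
    return edges, bins
```

With k=3 and the values 0 through 9, the bin edges land at 3, 6 and 9, so each path would receive one of three colors from the colormap.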
If your variable of interest is already categorical, specify categorical=True to
use the labels in your dataset directly.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
hue='above_meridian', cmap='RdYlBu',
legend=True, legend_kwargs={'bbox_to_anchor': (1.2, 1.0)},
categorical=True)
ax.set_global()
ax.coastlines()
scale can be used to enable linewidth as a visual variable.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
scale='mock_data',
legend=True, legend_kwargs={'bbox_to_anchor': (1.2, 1.0)},
color='lightblue')
ax.set_global()
ax.coastlines()
By default, the line widths will be scaled according to the data, with the limits default of (1, 5) shown in the
signature above setting the scale for the smallest and largest values. Adjust this using the limits parameter.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
scale='mock_data', limits=(1, 3),
legend=True, legend_kwargs={'bbox_to_anchor': (1.2, 1.0)},
color='lightblue')
ax.set_global()
ax.coastlines()
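As a rough illustration of how a limits pair could translate data values into line widths, here is a small sketch. It assumes the smallest value maps to the lower limit and the largest to the upper limit, with linear interpolation in between; that mapping is an assumption made for this example, and scale_linewidths is a hypothetical helper rather than a geoplot function.

```python
def scale_linewidths(values, limits=(1, 5)):
    """Linearly map values onto the range given by limits.

    Assumption for illustration: min(values) -> limits[0] and
    max(values) -> limits[1].
    """
    lo, hi = limits
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # No spread in the data: give every line the lower limit.
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]
```

Under that assumption, scale_linewidths([10, 60, 110], limits=(1, 3)) gives widths of 1, 2 and 3, while the default limits give 1, 3 and 5.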
The default scaling function is a linear one. You can change the scaling function to whatever you want by
specifying a scale_func input. This should be a factory function of two variables which, when given the
minimum and maximum of the dataset (in that order), returns a scaling function which will be applied to the rest of the data.
def trivial_scale(minval, maxval):
    def scalar(val):
        return 2
    return scalar
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
scale='mock_data', scale_func=trivial_scale,
legend=True, legend_kwargs={'bbox_to_anchor': (1.1, 1.0)},
color='lightblue')
ax.set_global()
ax.coastlines()
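A less trivial factory might compress a heavily skewed variable. The sketch below follows the same (minval, maxval) factory shape as trivial_scale and returns a logarithmic scaler. The name log_scale and the output range of 1 to 5 are choices made for this example, not library defaults, and the function assumes strictly positive data.

```python
import math

def log_scale(minval, maxval):
    """Factory returning a logarithmic scaler (hypothetical example).

    Assumes minval and maxval are strictly positive. The returned
    function maps minval to 1 and maxval to 5.
    """
    log_min, log_max = math.log(minval), math.log(maxval)

    def scalar(val):
        if log_max == log_min:
            return 1.0  # no spread in the data
        # Position of val between min and max on a log axis, in [0, 1].
        frac = (math.log(val) - log_min) / (log_max - log_min)
        return 1.0 + 4.0 * frac
    return scalar
```

It could then be passed in the same way as above, for example scale_func=log_scale. For data spanning 1 to 100, a value of 10 sits halfway along the log axis and so receives a width of about 3.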
In case more than one visual variable is used, control which one appears in the legend using legend_var.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
start='from', end='to',
scale='mock_data',
legend=True, legend_kwargs={'bbox_to_anchor': (1.1, 1.0)},
hue='mock_data', legend_var="hue")
ax.set_global()
ax.coastlines()