where $n$ is the number of observations, $w_{ij}$ is the cell of binary matrix, and $\bar{y}$ is the sample mean.
from libpysal.examples import load_example
import geopandas as gpd
guerry_loc = load_example('Guerry')
guerry = gpd.read_file(guerry_loc.get_path('guerry.geojson'))
print(guerry.shape)
guerry.head()
Example not available: Guerry Example not downloaded: Chicago parcels Example not downloaded: Chile Migration Example not downloaded: Spirals (85, 24)
dept | Region | Dprtmnt | Crm_prs | Crm_prp | Litercy | Donatns | Infants | Suicids | MainCty | ... | Infntcd | Dntn_cl | Lottery | Desertn | Instrct | Prsttts | Distanc | Area | Pop1831 | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | E | Ain | 28870 | 15890 | 37 | 5098 | 33120 | 35039 | 2 | ... | 60 | 69 | 41 | 55 | 46 | 13 | 218.372 | 5762 | 346.03 | MULTIPOLYGON (((801150.00000 2092615.00000, 80... |
1 | 2 | N | Aisne | 26226 | 5521 | 51 | 8901 | 14572 | 12831 | 2 | ... | 82 | 36 | 38 | 82 | 24 | 327 | 65.945 | 7369 | 513.00 | MULTIPOLYGON (((729326.00000 2521619.00000, 72... |
2 | 3 | C | Allier | 26747 | 7925 | 13 | 10973 | 17044 | 114121 | 2 | ... | 42 | 76 | 66 | 16 | 85 | 34 | 161.927 | 7340 | 298.26 | MULTIPOLYGON (((710830.00000 2137350.00000, 71... |
3 | 4 | E | Basses-Alpes | 12935 | 7289 | 46 | 2733 | 23018 | 14238 | 1 | ... | 12 | 37 | 80 | 32 | 29 | 2 | 351.399 | 6925 | 155.90 | MULTIPOLYGON (((882701.00000 1920024.00000, 88... |
4 | 5 | E | Hautes-Alpes | 17488 | 8174 | 69 | 6962 | 23076 | 16171 | 1 | ... | 23 | 64 | 79 | 35 | 7 | 1 | 320.280 | 5549 | 129.10 | MULTIPOLYGON (((886504.00000 1922890.00000, 88... |
5 rows × 24 columns
from libpysal.weights.contiguity import Queen
import splot
from esda.moran import Moran
y = guerry['Crm_prp'].values
w = Queen.from_dataframe(guerry)
w.transform = 'r'
moran = Moran(y, w)
moran.I
0.26355334031776523
from esda.geary import Geary
geary = Geary(guerry["Crm_prp"], w)
print(f"The Geary's C is {geary.C}")
print(f"The simulated pseudo p-value is {geary.p_sim}")
The Geary's C is 0.6778380960702041 The simulated pseudo p-value is 0.001
where $w_{ij}(d)$ is the binary weight assigned based on the distance bin.
import pandas as pd
from pysal.lib import weights
pts = guerry.centroid
xys = pd.DataFrame({"X": pts.x, "Y": pts.y})
min_thr = weights.util.min_threshold_distance(xys)
min_thr
C:\Users\noibar\AppData\Local\Temp\ipykernel_6356\1545389804.py:3: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation. pts = guerry.centroid
96726.14323874969
from esda.getisord import G
w_db = weights.DistanceBand.from_dataframe(guerry, min_thr)
gao = G(guerry["Crm_prp"], w_db)
print(
"Getis & Ord G: %.3f | Pseudo P-value: %.3f" % (gao.G, gao.p_sim)
)
Getis & Ord G: 0.049 | Pseudo P-value: 0.001
C:\Users\noibar\anaconda3\envs\geo_env\lib\site-packages\esda\getisord.py:167: RuntimeWarning: overflow encountered in longlong_scalars EG2DEN = ((sum(y) ** 2 - sum(y2)) ** 2) * n * (n - 1) * (n - 2) * (n - 3)
The concept of bivariate spatial correlation is complex and often misinterpreted. It is typically considered to be the correlation between one variable and the spatial lag of another variable, as originally implemented in the precursor of GeoDa (e.g., as described in Anselin, Syabri, and Smirnov 2002). However, this does not take into account the inherent correlation between the two variables. More precisely, the bivariate spatial correlation is between $x_i$ and $\Sigma_j w_{ij}y_{j}$, but does not take into account the correlation between $x_i$ and $y_i$, i.e., between the two variables at the same location. <...> As a result, this statistic is often interpreted incorrectly, as it may overestimate the spatial aspect of the correlation that instead may be due mostly to the in-place correlation.
from esda.moran import Moran_BV
x = guerry['Litercy'].values
y = guerry['Crm_prp'].values
moran_bv = Moran_BV(y, x, w)
print(f"Bivariate Moran's I: {round(moran_bv.I, 3)}")
Bivariate Moran's I: -0.316
import scipy
import matplotlib.pyplot as plt
import numpy as np
plt.scatter(x,y)
plt.plot(np.unique(x), np.poly1d(np.polyfit(x, y, 1))(np.unique(x)))
r, p = scipy.stats.pearsonr(x, y)
from splot.esda import plot_moran_bv_simulation, plot_moran_bv
import matplotlib.pyplot as plt
plot_moran_bv(moran_bv)
plt.show()
C:\Users\noibar\anaconda3\envs\geo_env\lib\site-packages\splot\_viz_esda_mpl.py:628: FutureWarning: `shade` is now deprecated in favor of `fill`; setting `fill=True`. This will become an error in seaborn v0.14.0; please update your code. sbn.kdeplot(moran_bv.sim, shade=shade, color=color, ax=ax, **kwargs)
from esda.moran import Moran_BV_matrix
from splot.esda import moran_facet
guerry_vars = guerry[['Crm_prs', 'Litercy', 'Donatns', 'Wealth']]
matrix = Moran_BV_matrix(guerry_vars, w)
moran_facet(matrix)
fig = plt.gcf()
fig.savefig('moran_facet.png')
plt.close()
from IPython.display import Image
Image(filename='moran_facet.png')