Merging Data
There are two ways to combine datasets in geopandas – attribute joins and spatial joins.
In an attribute join, a GeoSeries or GeoDataFrame is combined with a regular pandas Series or DataFrame based on a common variable. This is analogous to normal merging or joining in pandas.
In a spatial join, observations from two GeoSeries or GeoDataFrames are combined based on their spatial relationship to one another.
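As a rough illustration of what "combined based on their spatial relationship" means, the sketch below joins points to the polygons that contain them. The `zones` and `places` frames and their column names are hypothetical toy data, not the datasets used in the rest of this page, and the call relies on the default behaviour of `gpd.sjoin` (an inner join on intersection).

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: two square "zones" and three points.
zones = gpd.GeoDataFrame(
    {"zone": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)
places = gpd.GeoDataFrame(
    {"place": ["p1", "p2", "p3"]},
    geometry=[Point(0.5, 0.5), Point(2.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# Spatial join: each point picks up the attributes of the zone it intersects.
# The default is an inner join, so p3 (outside both zones) is dropped.
joined = gpd.sjoin(places, zones)
print(joined[["place", "zone"]])
```

No shared key column is needed here: the match is made purely on geometry, which is what distinguishes this from an attribute join.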
In the following examples, we use these datasets:
# Assumes geopandas is already imported: import geopandas as gpd
In [1]: world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
In [2]: cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
# For attribute join
In [3]: country_shapes = world[['geometry', 'iso_a3']]
In [4]: country_names = world[['name', 'iso_a3']]
# For spatial join
In [5]: countries = world[['geometry', 'name']]
In [6]: countries = countries.rename(columns={'name':'country'})
Attribute Joins
Attribute joins are accomplished using the merge method. In general, it is recommended to use the merge method called from the spatial dataset. That said, the stand-alone merge function will work if the GeoDataFrame is the left argument; if a DataFrame is the left argument and a GeoDataFrame is the right argument, the result will no longer be a GeoDataFrame.
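The effect of argument order on the result type can be checked with a small sketch. The `shapes` and `names` frames and the `code` column below are hypothetical stand-ins for the datasets on this page, and the stand-alone function is assumed to be `pandas.merge`.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical toy data sharing a "code" column.
shapes = gpd.GeoDataFrame(
    {"code": ["AAA", "BBB"]},
    geometry=[Point(0, 0), Point(1, 1)],
)
names = pd.DataFrame({"code": ["AAA", "BBB"], "name": ["Alpha", "Beta"]})

# Method called on the spatial dataset: the result stays a GeoDataFrame.
via_method = shapes.merge(names, on="code")

# Stand-alone function with the GeoDataFrame on the left: also a GeoDataFrame.
via_function = pd.merge(shapes, names, on="code")

# DataFrame on the left: the result is a plain DataFrame,
# even though it still carries a geometry column.
df_left = pd.merge(names, shapes, on="code")

for label, result in [
    ("shapes.merge(names)", via_method),
    ("pd.merge(shapes, names)", via_function),
    ("pd.merge(names, shapes)", df_left),
]:
    print(label, "->", type(result).__name__)
```

Calling the method on the spatial dataset avoids having to remember which position preserves the GeoDataFrame type.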
For example, consider the following merge, which adds full names to a GeoDataFrame that initially has only ISO codes for each country by merging it with a pandas DataFrame.
# `country_shapes` is a GeoDataFrame with country shapes and ISO codes
In [7]: country_shapes.head()
Out[7]:
                                            geometry  iso_a3
0  POLYGON ((61.21081709172574 35.65007233330923,...     AFG
1  (POLYGON ((16.32652835456705 -5.87747039146621...     AGO
2  POLYGON ((20.59024743010491 41.85540416113361,...     ALB
3  POLYGON ((51.57951867046327 24.24549713795111,...     ARE
4  (POLYGON ((-65.50000000000003 -55.199999999999...     ARG
# `country_names` is a DataFrame with country names and ISO codes
In [8]: country_names.head()