geopandas.GeoDataFrame

class geopandas.GeoDataFrame(*args, geometry=None, crs=None, **kwargs)

A GeoDataFrame object is a pandas.DataFrame that has a column with geometry. In addition to the standard DataFrame constructor arguments, GeoDataFrame also accepts the following keyword arguments:

Parameters
crs : value (optional)

Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. “EPSG:4326”) or a WKT string.

geometry : str or array (optional)

If str, column to use as geometry. If array, will be set as ‘geometry’ column on GeoDataFrame.

See also

GeoSeries

Series object designed to store shapely geometry objects

Examples

Constructing GeoDataFrame from a dictionary.

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1                 geometry
0  name1  POINT (1.00000 2.00000)
1  name2  POINT (2.00000 1.00000)

Notice that the inferred dtype of ‘geometry’ columns is geometry.

>>> gdf.dtypes
col1          object
geometry    geometry
dtype: object

Constructing GeoDataFrame from a pandas DataFrame with a column of WKT geometries:

>>> import pandas as pd
>>> d = {'col1': ['name1', 'name2'], 'wkt': ['POINT (1 2)', 'POINT (2 1)']}
>>> df = pd.DataFrame(d)
>>> gs = geopandas.GeoSeries.from_wkt(df['wkt'])
>>> gdf = geopandas.GeoDataFrame(df, geometry=gs, crs="EPSG:4326")
>>> gdf
    col1          wkt                 geometry
0  name1  POINT (1 2)  POINT (1.00000 2.00000)
1  name2  POINT (2 1)  POINT (2.00000 1.00000)
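
The geometry parameter can also be a column name, and crs accepts more than authority strings. The following is a minimal sketch of both, assuming geopandas and shapely are installed; the column name 'pts' and the sample values are hypothetical and chosen only for illustration.

```python
import geopandas
from shapely.geometry import Point

# 'pts' is a hypothetical column name used only in this sketch.
d = {"col1": ["name1", "name2"], "pts": [Point(1, 2), Point(2, 1)]}

# A str value for `geometry` selects an existing column as the geometry column.
gdf = geopandas.GeoDataFrame(d, geometry="pts", crs="EPSG:4326")

# An integer EPSG code is another input accepted by pyproj.CRS.from_user_input().
gdf_int = geopandas.GeoDataFrame(d, geometry="pts", crs=4326)

print(gdf.geometry.name)       # name of the active geometry column
print(gdf.crs == gdf_int.crs)  # both inputs describe the same CRS
```

Because the column is selected by name, it keeps that name rather than being renamed to 'geometry'.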
__init__(*args, geometry=None, crs=None, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(*args[, geometry, crs])

Initialize self.

abs()

Return a Series/DataFrame with absolute numeric value of each element.

add(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator add).

add_prefix(prefix)

Prefix labels with string prefix.

add_suffix(suffix)

Suffix labels with string suffix.

affine_transform(matrix)

Return a GeoSeries with geometries transformed by the given affine transformation matrix.

agg([func, axis])

Aggregate using one or more operations over the specified axis.

aggregate([func, axis])

Aggregate using one or more operations over the specified axis.

align(other[, join, axis, level, copy, …])

Align two objects on their axes with the specified join method.

all([axis, bool_only, skipna, level])

Return whether all elements are True, potentially over an axis.

any([axis, bool_only, skipna, level])

Return whether any element is True, potentially over an axis.

append(other[, ignore_index, …])

Append rows of other to the end of caller, returning a new object.

apply(func[, axis, raw, result_type, args])

Apply a function along an axis of the DataFrame.

applymap(func[, na_action])

Apply a function to a DataFrame elementwise.

asfreq(freq[, method, how, normalize, …])

Convert TimeSeries to specified frequency.

asof(where[, subset])

Return the last row(s) without any NaNs before where.

assign(**kwargs)

Assign new columns to a DataFrame.

astype(dtype[, copy, errors])

Cast a pandas object to the specified dtype.

at_time(time[, asof, axis])

Select values at particular time of day (e.g., 9:30AM).

backfill([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='bfill'.

between_time(start_time, end_time[, …])

Select values between particular times of the day (e.g., 9:00-9:30 AM).

bfill([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='bfill'.

bool()

Return the bool of a single element Series or DataFrame.

boxplot([column, by, ax, fontsize, rot, …])

Make a box plot from DataFrame columns.

buffer(distance[, resolution])

Returns a GeoSeries of geometries representing all points within a given distance of each geometric object.

clip([lower, upper, axis, inplace])

Trim values at input threshold(s).

combine(other, func[, fill_value, overwrite])

Perform column-wise combine with another DataFrame.

combine_first(other)

Update null elements with value in the same location in other.

compare(other[, align_axis, keep_shape, …])

Compare to another DataFrame and show the differences.

contains(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that contains other.

convert_dtypes([infer_objects, …])

Convert columns to best possible dtypes using dtypes supporting pd.NA.

copy([deep])

Make a copy of this object’s indices and data.

corr([method, min_periods])

Compute pairwise correlation of columns, excluding NA/null values.

corrwith(other[, axis, drop, method])

Compute pairwise correlation.

count([axis, level, numeric_only])

Count non-NA cells for each column or row.

cov([min_periods, ddof])

Compute pairwise covariance of columns, excluding NA/null values.

covered_by(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that is entirely covered by other.

covers(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that is entirely covering other.

crosses(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that crosses other.

cummax([axis, skipna])

Return cumulative maximum over a DataFrame or Series axis.

cummin([axis, skipna])

Return cumulative minimum over a DataFrame or Series axis.

cumprod([axis, skipna])

Return cumulative product over a DataFrame or Series axis.

cumsum([axis, skipna])

Return cumulative sum over a DataFrame or Series axis.

describe([percentiles, include, exclude, …])

Generate descriptive statistics.

diff([periods, axis])

First discrete difference of element.

difference(other[, align])

Returns a GeoSeries of the points in each aligned geometry that are not in other.

disjoint(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry disjoint to other.

dissolve([by, aggfunc, as_index, level, …])

Dissolve geometries within groupby into a single observation.

distance(other[, align])

Returns a Series containing the distance to aligned other.

div(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

divide(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

dot(other)

Compute the matrix multiplication between the DataFrame and other.

drop([labels, axis, index, columns, level, …])

Drop specified labels from rows or columns.

drop_duplicates([subset, keep, inplace, …])

Return DataFrame with duplicate rows removed.

droplevel(level[, axis])

Return DataFrame with requested index / column level(s) removed.

dropna([axis, how, thresh, subset, inplace])

Remove missing values.

duplicated([subset, keep])

Return boolean Series denoting duplicate rows.

eq(other[, axis, level])

Get Equal to of dataframe and other, element-wise (binary operator eq).

equals(other)

Test whether two objects contain the same elements.

estimate_utm_crs([datum_name])

Returns the estimated UTM CRS based on the bounds of the dataset.

eval(expr[, inplace])

Evaluate a string describing operations on DataFrame columns.

ewm([com, span, halflife, alpha, …])

Provide exponential weighted (EW) functions.

expanding([min_periods, center, axis])

Provide expanding transformations.

explode([column])

Explode multi-part geometries into multiple single geometries.

ffill([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='ffill'.

fillna([value, method, axis, inplace, …])

Fill NA/NaN values using the specified method.

filter([items, like, regex, axis])

Subset the dataframe rows or columns according to the specified index labels.

first(offset)

Select initial periods of time series data based on a date offset.

first_valid_index()

Return index for first non-NA/null value.

floordiv(other[, axis, level, fill_value])

Get Integer division of dataframe and other, element-wise (binary operator floordiv).

from_dict(data[, geometry, crs])

Construct GeoDataFrame from dict of array-like or dicts by overriding the DataFrame.from_dict method with geometry and crs.

from_features(features[, crs, columns])

Alternate constructor to create GeoDataFrame from an iterable of features or a feature collection.

from_file(filename, **kwargs)

Alternate constructor to create a GeoDataFrame from a file.

from_postgis(sql, con[, geom_col, crs, …])

Alternate constructor to create a GeoDataFrame from a sql query containing a geometry column in WKB representation.

from_records(data[, index, exclude, …])

Convert structured or record ndarray to DataFrame.

ge(other[, axis, level])

Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).

geom_almost_equals(other[, decimal, align])

Returns a Series of dtype('bool') with value True if each aligned geometry is approximately equal to other.

geom_equals(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry equal to other.

geom_equals_exact(other, tolerance[, align])

Return True for all geometries that equal aligned other to a given tolerance, else False.

get(key[, default])

Get item from object for given key (ex: DataFrame column).

groupby([by, axis, level, as_index, sort, …])

Group DataFrame using a mapper or by a Series of columns.

gt(other[, axis, level])

Get Greater than of dataframe and other, element-wise (binary operator gt).

head([n])

Return the first n rows.

hist([column, by, grid, xlabelsize, xrot, …])

Make a histogram of the DataFrame’s columns.

idxmax([axis, skipna])

Return index of first occurrence of maximum over requested axis.

idxmin([axis, skipna])

Return index of first occurrence of minimum over requested axis.

infer_objects()

Attempt to infer better dtypes for object columns.

info([verbose, buf, max_cols, memory_usage, …])

Print a concise summary of a DataFrame.

insert(loc, column, value[, allow_duplicates])

Insert column into DataFrame at specified location.

interpolate(distance[, normalized])

Return a point at the specified distance along each geometry.

intersection(other[, align])

Returns a GeoSeries of the intersection of points in each aligned geometry with other.

intersects(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that intersects other.

isin(values)

Whether each element in the DataFrame is contained in values.

isna()

Detect missing values.

isnull()

Detect missing values.

items()

Iterate over (column name, Series) pairs.

iterfeatures([na, show_bbox, drop_id])

Returns an iterator that yields feature dictionaries that comply with __geo_interface__.

iteritems()

Iterate over (column name, Series) pairs.

iterrows()

Iterate over DataFrame rows as (index, Series) pairs.

itertuples([index, name])

Iterate over DataFrame rows as namedtuples.

join(other[, on, how, lsuffix, rsuffix, sort])

Join columns of another DataFrame.

keys()

Get the ‘info axis’ (see Indexing for more).

kurt([axis, skipna, level, numeric_only])

Return unbiased kurtosis over requested axis.

kurtosis([axis, skipna, level, numeric_only])

Return unbiased kurtosis over requested axis.

last(offset)

Select final periods of time series data based on a date offset.

last_valid_index()

Return index for last non-NA/null value.

le(other[, axis, level])

Get Less than or equal to of dataframe and other, element-wise (binary operator le).

lookup(row_labels, col_labels)

Label-based “fancy indexing” function for DataFrame.

lt(other[, axis, level])

Get Less than of dataframe and other, element-wise (binary operator lt).

mad([axis, skipna, level])

Return the mean absolute deviation of the values over the requested axis.

mask(cond[, other, inplace, axis, level, …])

Replace values where the condition is True.

max([axis, skipna, level, numeric_only])

Return the maximum of the values over the requested axis.

mean([axis, skipna, level, numeric_only])

Return the mean of the values over the requested axis.

median([axis, skipna, level, numeric_only])

Return the median of the values over the requested axis.

melt([id_vars, value_vars, var_name, …])

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

memory_usage([index, deep])

Return the memory usage of each column in bytes.

merge(*args, **kwargs)

Merge two GeoDataFrame objects with a database-style join.

min([axis, skipna, level, numeric_only])

Return the minimum of the values over the requested axis.

mod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator mod).

mode([axis, numeric_only, dropna])

Get the mode(s) of each element along the selected axis.

mul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

multiply(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

ne(other[, axis, level])

Get Not equal to of dataframe and other, element-wise (binary operator ne).

nlargest(n, columns[, keep])

Return the first n rows ordered by columns in descending order.

notna()

Detect existing (non-missing) values.

notnull()

Detect existing (non-missing) values.

nsmallest(n, columns[, keep])

Return the first n rows ordered by columns in ascending order.

nunique([axis, dropna])

Count distinct observations over requested axis.

overlaps(other[, align])

Returns True for all aligned geometries that overlap other, else False.

pad([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='ffill'.

pct_change([periods, fill_method, limit, freq])

Percentage change between the current and a prior element.

pipe(func, *args, **kwargs)

Apply func(self, *args, **kwargs).

pivot([index, columns, values])

Return reshaped DataFrame organized by given index / column values.

pivot_table([values, index, columns, …])

Create a spreadsheet-style pivot table as a DataFrame.

pop(item)

Return item and drop from frame.

pow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator pow).

prod([axis, skipna, level, numeric_only, …])

Return the product of the values over the requested axis.

product([axis, skipna, level, numeric_only, …])

Return the product of the values over the requested axis.

project(other[, normalized, align])

Return the distance along each geometry nearest to other.

quantile([q, axis, numeric_only, interpolation])

Return values at the given quantile over requested axis.

query(expr[, inplace])

Query the columns of a DataFrame with a boolean expression.

radd(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator radd).

rank([axis, method, numeric_only, …])

Compute numerical data ranks (1 through n) along axis.

rdiv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

reindex([labels, index, columns, axis, …])

Conform Series/DataFrame to new index with optional filling logic.

reindex_like(other[, method, copy, limit, …])

Return an object with matching indices as other object.

relate(other[, align])

Returns the DE-9IM intersection matrices for the geometries.

rename([mapper, index, columns, axis, copy, …])

Alter axes labels.

rename_axis([mapper, index, columns, axis, …])

Set the name of the axis for the index or columns.

rename_geometry(col[, inplace])

Renames the GeoDataFrame geometry column to the specified name.

reorder_levels(order[, axis])

Rearrange index levels using input order.

replace([to_replace, value, inplace, limit, …])

Replace values given in to_replace with value.

representative_point()

Returns a GeoSeries of (cheaply computed) points that are guaranteed to be within each geometry.

resample(rule[, axis, closed, label, …])

Resample time-series data.

reset_index([level, drop, inplace, …])

Reset the index, or a level of it.

rfloordiv(other[, axis, level, fill_value])

Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).

rmod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator rmod).

rmul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator rmul).

rolling(window[, min_periods, center, …])

Provide rolling window calculations.

rotate(angle[, origin, use_radians])

Returns a GeoSeries with rotated geometries.

round([decimals])

Round a DataFrame to a variable number of decimal places.

rpow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator rpow).

rsub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator rsub).

rtruediv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

sample([n, frac, replace, weights, …])

Return a random sample of items from an axis of object.

scale([xfact, yfact, zfact, origin])

Returns a GeoSeries with scaled geometries.

select_dtypes([include, exclude])

Return a subset of the DataFrame’s columns based on the column dtypes.

sem([axis, skipna, level, ddof, numeric_only])

Return unbiased standard error of the mean over requested axis.

set_axis(labels[, axis, inplace])

Assign desired index to given axis.

set_crs([crs, epsg, inplace, allow_override])

Set the Coordinate Reference System (CRS) of the GeoDataFrame.

set_flags(*[, copy, allows_duplicate_labels])

Return a new object with updated flags.

set_geometry(col[, drop, inplace, crs])

Set the GeoDataFrame geometry using either an existing column or the specified input.

set_index(keys[, drop, append, inplace, …])

Set the DataFrame index using existing columns.

shift([periods, freq, axis, fill_value])

Shift index by desired number of periods with an optional time freq.

simplify(*args, **kwargs)

Returns a GeoSeries containing a simplified representation of each geometry.

skew([xs, ys, origin, use_radians])

Returns a GeoSeries with skewed geometries.

slice_shift([periods, axis])

Equivalent to shift without copying data.

sort_index([axis, level, ascending, …])

Sort object by labels (along an axis).

sort_values(by[, axis, ascending, inplace, …])

Sort by the values along either axis.

squeeze([axis])

Squeeze 1 dimensional axis objects into scalars.

stack([level, dropna])

Stack the prescribed level(s) from columns to index.

std([axis, skipna, level, ddof, numeric_only])

Return sample standard deviation over requested axis.

sub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

subtract(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

sum([axis, skipna, level, numeric_only, …])

Return the sum of the values over the requested axis.

swapaxes(axis1, axis2[, copy])

Interchange axes and swap values axes appropriately.

swaplevel([i, j, axis])

Swap levels i and j in a MultiIndex on a particular axis.

symmetric_difference(other[, align])

Returns a GeoSeries of the symmetric difference of points in each aligned geometry with other.

tail([n])

Return the last n rows.

take(indices[, axis, is_copy])

Return the elements in the given positional indices along an axis.

to_clipboard([excel, sep])

Copy object to the system clipboard.

to_crs([crs, epsg, inplace])

Transform geometries to a new coordinate reference system.

to_csv([path_or_buf, sep, na_rep, …])

Write object to a comma-separated values (csv) file.

to_dict([orient, into])

Convert the DataFrame to a dictionary.

to_excel(excel_writer[, sheet_name, na_rep, …])

Write object to an Excel sheet.

to_feather(path[, index, compression])

Write a GeoDataFrame to the Feather format.

to_file(filename[, driver, schema, index])

Write the GeoDataFrame to a file.

to_gbq(destination_table[, project_id, …])

Write a DataFrame to a Google BigQuery table.

to_hdf(path_or_buf, key[, mode, complevel, …])

Write the contained data to an HDF5 file using HDFStore.

to_html([buf, columns, col_space, header, …])

Render a DataFrame as an HTML table.

to_json([na, show_bbox, drop_id])

Returns a GeoJSON representation of the GeoDataFrame as a string.

to_latex([buf, columns, col_space, header, …])

Render object to a LaTeX tabular, longtable, or nested table/tabular.

to_markdown([buf, mode, index, storage_options])

Print DataFrame in Markdown-friendly format.

to_numpy([dtype, copy, na_value])

Convert the DataFrame to a NumPy array.

to_parquet(path[, index, compression])

Write a GeoDataFrame to the Parquet format.

to_period([freq, axis, copy])

Convert DataFrame from DatetimeIndex to PeriodIndex.

to_pickle(path[, compression, protocol, …])

Pickle (serialize) object to file.

to_postgis(name, con[, schema, if_exists, …])

Upload GeoDataFrame into PostGIS database.

to_records([index, column_dtypes, index_dtypes])

Convert DataFrame to a NumPy record array.

to_sql(name, con[, schema, if_exists, …])

Write records stored in a DataFrame to a SQL database.

to_stata(path[, convert_dates, write_index, …])

Export DataFrame object to Stata dta format.

to_string([buf, columns, col_space, header, …])

Render a DataFrame to a console-friendly tabular output.

to_timestamp([freq, how, axis, copy])

Cast to DatetimeIndex of timestamps, at beginning of period.

to_wkb([hex])

Encode all geometry columns in the GeoDataFrame to WKB.

to_wkt(**kwargs)

Encode all geometry columns in the GeoDataFrame to WKT.

to_xarray()

Return an xarray object from the pandas object.

touches(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that touches other.

transform(func[, axis])

Call func on self producing a DataFrame with transformed values.

translate([xoff, yoff, zoff])

Returns a GeoSeries with translated geometries.

transpose(*args[, copy])

Transpose index and columns.

truediv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

truncate([before, after, axis, copy])

Truncate a Series or DataFrame before and after some index value.

tshift([periods, freq, axis])

Shift the time index, using the index’s frequency if available.

tz_convert(tz[, axis, level, copy])

Convert tz-aware axis to target time zone.

tz_localize(tz[, axis, level, copy, …])

Localize tz-naive index of a Series or DataFrame to target time zone.

union(other[, align])

Returns a GeoSeries of the union of points in each aligned geometry with other.

unstack([level, fill_value])

Pivot a level of the (necessarily hierarchical) index labels.

update(other[, join, overwrite, …])

Modify in place using non-NA values from another DataFrame.

value_counts([subset, normalize, sort, …])

Return a Series containing counts of unique rows in the DataFrame.

var([axis, skipna, level, ddof, numeric_only])

Return unbiased variance over requested axis.

where(cond[, other, inplace, axis, level, …])

Replace values where the condition is False.

within(other[, align])

Returns a Series of dtype('bool') with value True for each aligned geometry that is within other.

xs(key[, axis, level, drop_level])

Return cross-section from the Series/DataFrame.
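
Many of the geometric methods listed above operate element-wise on the active geometry column. The sketch below illustrates a few of them under stated assumptions: the coordinates are made up, and EPSG:3857 is chosen arbitrarily as a projected CRS so that distances are expressed in metres.

```python
import geopandas
from shapely.geometry import Point, Polygon

# Illustrative data only; the values carry no real-world meaning.
gdf = geopandas.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(0, 0), Point(10, 0)],
    crs="EPSG:3857",  # projected CRS with units in metres
)

# buffer() returns a GeoSeries of polygons around each point.
buffered = gdf.buffer(1.0)

# Spatial predicates return a boolean Series aligned with the index.
square = Polygon([(-1, -1), (1, -1), (1, 1), (-1, 1)])
inside = gdf.within(square)

# distance() to a single geometry is evaluated for each row.
dist = gdf.distance(Point(0, 0))

# to_crs() returns a new GeoDataFrame with reprojected geometries.
gdf_wgs84 = gdf.to_crs("EPSG:4326")

print(inside.tolist())
print(dist.tolist())
print(gdf_wgs84.crs)
```

The original frame is left unchanged by these calls; each one returns a new Series, GeoSeries, or GeoDataFrame.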

Attributes

T

area

Returns a Series containing the area of each geometry in the GeoSeries expressed in the units of the CRS.

at

Access a single value for a row/column label pair.

attrs

Dictionary of global attributes of this dataset.

axes

Return a list representing the axes of the DataFrame.

boundary

Returns a GeoSeries of lower dimensional objects representing each geometry’s set-theoretic boundary.

bounds

Returns a DataFrame with columns minx, miny, maxx, maxy values containing the bounds for each geometry.

cascaded_union

Deprecated: Return the unary_union of all geometries.

centroid

Returns a GeoSeries of points representing the centroid of each geometry.

columns

The column labels of the DataFrame.

convex_hull

Returns a GeoSeries of geometries representing the convex hull of each geometry.

crs

The Coordinate Reference System (CRS) represented as a pyproj.CRS object.

cx

Coordinate based indexer to select by intersection with bounding box.

dtypes

Return the dtypes in the DataFrame.

empty

Indicator whether DataFrame is empty.

envelope

Returns a GeoSeries of geometries representing the envelope of each geometry.

exterior

Returns a GeoSeries of LinearRings representing the outer boundary of each polygon in the GeoSeries.

flags

Get the properties associated with this pandas object.

geom_type

Returns a Series of strings specifying the Geometry Type of each object.

geometry

Geometry data for GeoDataFrame.

has_sindex

Check the existence of the spatial index without generating it.

has_z

Returns a Series of dtype('bool') with value True for features that have a z-component.

iat

Access a single value for a row/column pair by integer position.

iloc

Purely integer-location based indexing for selection by position.

index

The index (row labels) of the DataFrame.

interiors

Returns a Series of lists representing the inner rings of each polygon in the GeoSeries.

is_empty

Returns a Series of dtype('bool') with value True for empty geometries.

is_ring

Returns a Series of dtype('bool') with value True for features that are closed.

is_simple

Returns a Series of dtype('bool') with value True for geometries that do not cross themselves.

is_valid

Returns a Series of dtype('bool') with value True for geometries that are valid.

length

Returns a Series containing the length of each geometry expressed in the units of the CRS.

loc

Access a group of rows and columns by label(s) or a boolean array.

ndim

Return an int representing the number of axes / array dimensions.

shape

Return a tuple representing the dimensionality of the DataFrame.

sindex

Generate the spatial index.

size

Return an int representing the number of elements in this object.

style

Returns a Styler object.

total_bounds

Returns a tuple containing minx, miny, maxx, maxy values for the bounds of the series as a whole.

type

Return the geometry type of each geometry in the GeoSeries.

unary_union

Returns a geometry containing the union of all geometries in the GeoSeries.

values

Return a NumPy representation of the DataFrame.
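
The geometric attributes above are computed per geometry, except total_bounds, which summarises the whole frame. A minimal sketch, assuming geopandas and shapely are installed; the two squares are made-up data and no CRS is set, so results are in plain coordinate units.

```python
import geopandas
from shapely.geometry import Polygon

# Two illustrative squares; coordinates are invented for this sketch.
gdf = geopandas.GeoDataFrame(
    {"name": ["unit", "double"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 2), (4, 2), (4, 4), (2, 4)]),
    ],
)

print(gdf.area.tolist())       # area of each geometry
print(gdf.length.tolist())     # perimeter length of each geometry
print(gdf.bounds)              # DataFrame with minx, miny, maxx, maxy
print(gdf.total_bounds)        # bounds of all geometries taken together
print(gdf.centroid.iloc[0])    # centroid of the first geometry
print(gdf.geom_type.tolist())  # geometry type name of each row
```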