geopandas.GeoDataFrame.to_arrow
- GeoDataFrame.to_arrow(*, index=None, geometry_encoding='WKB', interleaved=True, include_z=None)
Encode a GeoDataFrame to GeoArrow format.
See https://geoarrow.org/ for details on the GeoArrow specification.
This function returns a generic Arrow data object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_stream__ method). This object can then be consumed by any Arrow implementation that supports this protocol.
Added in version 1.0.
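As a rough illustration of what "implementing the Arrow PyCapsule Protocol" means for a consumer, the sketch below checks for the `__arrow_c_stream__` method by duck typing. The helper `supports_arrow_stream` and the class `FakeArrowTable` are hypothetical names used only for this illustration; they are not part of GeoPandas or of any Arrow library.

```python
def supports_arrow_stream(obj) -> bool:
    """Return True if obj exposes a callable __arrow_c_stream__ method."""
    return callable(getattr(obj, "__arrow_c_stream__", None))


class FakeArrowTable:
    """Hypothetical stand-in for a producer object (illustration only)."""

    def __arrow_c_stream__(self, requested_schema=None):
        # A real producer would return a PyCapsule wrapping an Arrow C stream.
        raise NotImplementedError("illustration only")


print(supports_arrow_stream(FakeArrowTable()))  # True
print(supports_arrow_stream([1, 2, 3]))         # False
```

A real consumer would go on to call the method and import the resulting stream; this sketch stops at detection.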
- Parameters:
- index : bool, default None
If True, always include the dataframe’s index(es) as columns in the output. If False, the index(es) will not be written. If None, the index(es) will be included as columns in the output, except for a RangeIndex, which is stored as metadata only.
- geometry_encoding : {‘WKB’, ‘geoarrow’}, default ‘WKB’
The GeoArrow encoding to use for the data conversion.
- interleaved : bool, default True
Only relevant for ‘geoarrow’ encoding. If True, the geometries’ coordinates are interleaved in a single fixed size list array. If False, the coordinates are stored as separate arrays in a struct type.
- include_z : bool, default None
Only relevant for ‘geoarrow’ encoding (for WKB, the dimensionality of the individual geometries is preserved). If False, return 2D geometries. If True, include the third dimension in the output (if a geometry has no third dimension, the z-coordinates will be NaN). By default, the dimensionality is inferred from the input geometries. Note that this inference can be unreliable with empty geometries; for a guaranteed result, specify the keyword explicitly.
- Returns:
- ArrowTable
A generic Arrow table object with geometry columns encoded to GeoArrow.
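The two coordinate layouts selected by `interleaved`, and the NaN padding described for `include_z=True`, can be pictured with plain Python lists. This is only a conceptual sketch under the descriptions above: it does not build Arrow arrays, it is not how GeoPandas implements the conversion, and the function names are hypothetical.

```python
import math

points = [(1.0, 2.0), (2.0, 1.0)]


def interleaved_layout(coords):
    """Flatten coordinates into a single buffer: [x0, y0, x1, y1, ...]."""
    return [value for coord in coords for value in coord]


def separated_layout(coords, names=("x", "y", "z")):
    """Store each dimension in its own list, keyed by dimension name."""
    n_dims = len(coords[0])
    return {names[i]: [c[i] for c in coords] for i in range(n_dims)}


def with_z(coords):
    """Pad 2D coordinates with a NaN z-value; leave 3D coordinates as-is."""
    return [c if len(c) == 3 else (c[0], c[1], math.nan) for c in coords]


print(interleaved_layout(points))          # [1.0, 2.0, 2.0, 1.0]
print(separated_layout(points))            # {'x': [1.0, 2.0], 'y': [2.0, 1.0]}
print(interleaved_layout(with_z(points)))  # [1.0, 2.0, nan, 2.0, 1.0, nan]
```

In the interleaved case a consumer reads fixed-size chunks from one buffer; in the separated case each dimension can be read as its own contiguous array.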
Examples
>>> import geopandas
>>> from shapely.geometry import Point
>>> data = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(data)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> arrow_table = gdf.to_arrow()
>>> arrow_table
<geopandas.io._geoarrow.ArrowTable object at ...>
The returned data object needs to be consumed by a library implementing the Arrow PyCapsule Protocol. For example, wrapping the data as a pyarrow.Table (requires pyarrow >= 14.0):
>>> import pyarrow as pa
>>> table = pa.table(arrow_table)
>>> table
pyarrow.Table
col1: string
geometry: binary
----
col1: [["name1","name2"]]
geometry: [[0101000000000000000000F03F0000000000000040,01010000000000000000000040000000000000F03F]]
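The binary values in the geometry column are WKB (well-known binary) shown as hexadecimal. To make the default ‘WKB’ encoding less opaque, the following sketch decodes those two values by hand using only the standard library. The helper `decode_wkb_point` is a hypothetical name and handles 2D points only; in practice a library such as shapely would be used for parsing.

```python
import struct


def decode_wkb_point(hex_string):
    """Decode a 2D WKB Point given as a hex string into an (x, y) tuple."""
    raw = bytes.fromhex(hex_string)
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    endian = "<" if raw[0] == 1 else ">"
    # Bytes 1-4 hold the geometry type code; 1 means a 2D Point.
    (geom_type,) = struct.unpack(endian + "I", raw[1:5])
    if geom_type != 1:
        raise ValueError(f"expected a 2D Point (type 1), got type {geom_type}")
    # Bytes 5-20 hold x and y as two 64-bit floating point numbers.
    x, y = struct.unpack(endian + "dd", raw[5:21])
    return (x, y)


print(decode_wkb_point("0101000000000000000000F03F0000000000000040"))  # (1.0, 2.0)
print(decode_wkb_point("01010000000000000000000040000000000000F03F"))  # (2.0, 1.0)
```

The decoded coordinates match the POINT (1 2) and POINT (2 1) geometries in the original GeoDataFrame.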