Reference Data API

Open In Colab

This guide provides detailed instructions on how to use the gfw-api-python-client to access reference data, specifically geographic regions. The Reference Data API offers access to static datasets, currently focusing on Exclusive Economic Zones (EEZs), Marine Protected Areas (MPAs), and Regional Fisheries Management Organizations (RFMOs). Here is a Jupyter Notebook version of this guide with more usage examples.

Note: See the Datasets, Reference Data Caveats, and Terms of Use pages in the GFW API documentation for details on GFW data, API licenses, and rate limits.

Prerequisites

Getting Started

To interact with the Regions endpoints, you first need to instantiate the gfw.Client and then access the references resource:

import os

import gfwapiclient as gfw


access_token = os.environ.get(
    "GFW_API_ACCESS_TOKEN",
    "<OR_PASTE_YOUR_GFW_API_ACCESS_TOKEN_HERE>",
)

gfw_client = gfw.Client(
    access_token=access_token,
)

The gfw_client.references object provides methods to retrieve different types of geographic regions. Each of these methods returns a result object, which offers convenient ways to access the data as Pydantic models using .data() or as pandas DataFrames using .df().

Tip: Use IPython or Python 3.11+ with python -m asyncio to run gfw-api-python-client code interactively, as these environments support executing async / await expressions directly in the console.

Retrieving Exclusive Economic Zones (EEZs)

To get a list of available Exclusive Economic Zone (EEZ) regions, use the get_eez_regions() method:

eez_regions_result = await gfw_client.references.get_eez_regions()

Access the list of EEZ region as Pydantic models

eez_regions_data = eez_regions_result.data()
eez_region = eez_regions_data[-1]
print((eez_region.id, eez_region.dataset))
print(eez_region.model_dump())

Output:

(48999, 'public-eez-areas')
{'id': 48999, 'label': 'Overlapping claim Peñón de Vélez de la Gomera: Spain / Morocco', 'iso3': None, 'dataset': 'public-eez-areas'}

Access the EEZ regions as a DataFrame

eez_regions_df = eez_regions_result.df()
print(eez_regions_df.info())
print(eez_regions_df[["id", "dataset"]].head())

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 285 entries, 0 to 284
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   id          285 non-null    int64
 1   label       285 non-null    object
 2   iso3        234 non-null    object
 3   dataset     285 non-null    object
dtypes: int64(1), object(3)
memory usage: 17.9+ KB

Retrieving Marine Protected Areas (MPAs)

To get a list of available Marine Protected Area (MPA) regions, use the get_mpa_regions() method:

mpa_regions_result = await gfw_client.references.get_mpa_regions()

Access the list of MPA region as Pydantic models

mpa_regions_data = mpa_regions_result.data()
mpa_region = mpa_regions_data[-1]
print((mpa_region.id, mpa_region.dataset))
print(mpa_region.model_dump())

Output:

('555799979', 'public-mpa-all')
{'id': '555799979', 'label': 'NAF Marine Protected Area - Marine Protected Area', 'name': None, 'dataset': 'public-mpa-all'}

Access the MPA regions as a DataFrame

mpa_regions_df = mpa_regions_result.df()
print(mpa_regions_df.info())
print(mpa_regions_df[["id", "dataset"]].head())

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16591 entries, 0 to 16590
Data columns (total 4 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   id       16591 non-null  object
 1   label    16591 non-null  object
 2   name     0 non-null      object
 3   dataset  16591 non-null  object
dtypes: object(4)
memory usage: 518.6+ KB

Retrieving Regional Fisheries Management Organizations (RFMOs)

To get a list of available Regional Fisheries Management Organization (RFMO) regions, use the get_rfmo_regions() method:

rfmo_regions_result = await gfw_client.references.get_rfmo_regions()

Access the list of RFMO region as Pydantic models

rfmo_regions_data = rfmo_regions_result.data()
rfmo_region = rfmo_regions_data[-1]
print((rfmo_region.id, rfmo_region.dataset))
print(rfmo_region.model_dump())

Output:

('BOBP-IGO', 'public-rfmo')
{'id': 'WCPFC', 'label': 'WCPFC', 'rfb': None, 'dataset': 'public-rfmo'}
{'id': 'BOBP-IGO', 'label': 'BOBP-IGO', 'rfb': None, 'dataset': 'public-rfmo', 'ID': 'BOBP-IGO'}

Access the RFMO regions as a DataFrame

rfmo_regions_df = rfmo_regions_result.df()
print(rfmo_regions_df.info())
print(rfmo_regions_df[["id", "dataset"]].head())

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 42 entries, 0 to 41
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   id       42 non-null     object
 1   label    42 non-null     object
 2   rfb      0 non-null      object
 3   dataset  42 non-null     object
 4   ID       42 non-null     object
dtypes: object(5)
memory usage: 1.8+ KB

Next Steps

Explore the Usage Guides and Workflow Guides for other API resources to understand how you can combine reference data with dynamic information about events, vessels, and more. Check out the following resources: