# Bulk Download API
This guide provides detailed instructions on how to use the [gfw-api-python-client](https://github.com/GlobalFishingWatch/gfw-api-python-client) to access the [Bulk Download API](https://globalfishingwatch.org/our-apis/documentation#bulk-download-api), which is designed to support workflows that require bulk access to data, including integration with platforms and tools used by data engineers and researchers. Here is a [Jupyter Notebook](https://github.com/GlobalFishingWatch/gfw-api-python-client/blob/develop/notebooks/usage-guides/bulk-downloads-api.ipynb) version of this guide with more usage examples.
> **Note:** See the [Datasets](https://globalfishingwatch.org/our-apis/documentation#api-dataset), [Data Caveats](https://globalfishingwatch.org/our-apis/documentation#data-caveat), [SAR (Synthetic-Aperture Radar) Data Caveats](https://globalfishingwatch.org/our-apis/documentation#sar-fixed-infrastructure-data-caveats), and [Terms of Use](https://globalfishingwatch.org/our-apis/documentation#terms-of-use) pages in the [GFW API documentation](https://globalfishingwatch.org/our-apis/documentation#introduction) for details on GFW data, API licenses, and rate limits.
## Prerequisites
- Before using the `gfw-api-python-client`, ensure it is installed (see the [Getting Started](../getting-started) guide) and that you have obtained an API access token from the [Global Fishing Watch API portal](https://globalfishingwatch.org/our-apis/tokens).
## Getting Started
To interact with the Bulk Download endpoints, you first need to instantiate the `gfw.Client` and then access the `bulk_downloads` resource:
```python
import time
import os
import gfwapiclient as gfw

access_token = os.environ.get(
    "GFW_API_ACCESS_TOKEN",
    "",
)

gfw_client = gfw.Client(
    access_token=access_token,
)
```
The `gfw_client.bulk_downloads` object provides methods to:
- Create bulk reports based on specific filters and spatial parameters.
- Monitor the generation status of previously created bulk reports.
- Get a signed URL to download the data, metadata, and region geometry (in GeoJSON format) files of a previously created bulk report.
- Query the data records of a previously created bulk report in JSON format.
These methods return a `result` object, which offers convenient ways to access the data as Pydantic models using `.data()` or as pandas DataFrames using `.df()`.
> **Tip:** Use [IPython](https://ipython.readthedocs.io/en/stable/) or Python 3.11+ with `python -m asyncio` to run `gfw-api-python-client` code interactively, as these environments support executing `async` / `await` expressions directly in the console.
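
The examples in this guide use top-level `await`, which works in the environments mentioned in the tip but not in a plain Python script. A minimal sketch of the usual workaround is to wrap the calls in a coroutine and run it with `asyncio.run()`. Here `fetch_report_status` is a hypothetical stand-in for an awaited client call, not part of `gfw-api-python-client`:

```python
import asyncio


async def fetch_report_status() -> str:
    """Hypothetical stand-in for an awaited client call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return "pending"


async def main() -> str:
    # In a real script, the awaited gfw_client calls from this guide go here.
    status = await fetch_report_status()
    return status


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping all client calls inside one `main()` coroutine means the script starts a single event loop, which is usually what you want for a batch job.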
## Create a Bulk Report (`create_bulk_report`)
The `create_bulk_report()` method allows you to create a bulk report based on specified filters and spatial parameters. The `name` parameter is mandatory. Please [learn more about creating a bulk report here](https://globalfishingwatch.org/our-apis/documentation#create-a-bulk-report) and [check its data caveats here](https://globalfishingwatch.org/our-apis/documentation#data-caveat) and [here](https://globalfishingwatch.org/our-apis/documentation#sar-fixed-infrastructure-data-caveats).
```python
timestamp = int(time.time() * 1000)
dataset = "public-fixed-infrastructure-data:latest"
region_dataset = "public-eez-areas"
region_id = "8466" # Argentinian Exclusive Economic Zone
name = f"{dataset.split(':')[0]}_{region_dataset}__{region_id}_{timestamp}"

create_bulk_report_result = await gfw_client.bulk_downloads.create_bulk_report(
    name=name,
    dataset=dataset,
    region={
        "dataset": region_dataset,
        "id": region_id,
    },
    filters=["label = 'oil'", "label_confidence = 'high'"],
)
```
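The `filters` argument in the call above is a list of expression strings. When the conditions come from configuration, a small helper can assemble them. This is only a sketch: `build_equality_filters` is a hypothetical helper, and it assumes the `field = 'value'` form shown above is the only form needed. It does not escape quotes or validate field names.

```python
def build_equality_filters(conditions: dict[str, str]) -> list[str]:
    """Turn {"field": "value"} pairs into "field = 'value'" strings."""
    return [f"{field} = '{value}'" for field, value in conditions.items()]


filters = build_equality_filters({"label": "oil", "label_confidence": "high"})
print(filters)
# ["label = 'oil'", "label_confidence = 'high'"]
```

The resulting list can be passed as the `filters` argument of `create_bulk_report()`.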
### Access Create a Bulk Report Result as Pydantic models
```python
create_bulk_report_data = create_bulk_report_result.data()
print((
    create_bulk_report_data.id,
    create_bulk_report_data.name,
    create_bulk_report_data.status,
    create_bulk_report_data.created_at,
))
```
**Output:**
```
('c5e32895-4374-41d2-8b2e-ac414ed6757f',
'public-fixed-infrastructure-data_public-eez-areas__8466_1768085547174',
'pending',
datetime.datetime(2026, 1, 10, 22, 52, 30, 9000, tzinfo=TzInfo(0)))
```
### Access Create a Bulk Report Result as a DataFrame
```python
create_bulk_report_df = create_bulk_report_result.df()
print(create_bulk_report_df.info())
print(create_bulk_report_df.head())
```
**Output:**
```
RangeIndex: 1 entries, 0 to 0
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 1 non-null object
1 name 1 non-null object
2 file_path 1 non-null object
3 format 1 non-null object
4 filters 1 non-null object
5 geom 1 non-null object
6 status 1 non-null object
7 owner_id 1 non-null int64
8 owner_type 1 non-null object
9 created_at 1 non-null datetime64[ns, UTC]
10 updated_at 1 non-null datetime64[ns, UTC]
11 file_size 0 non-null object
dtypes: datetime64[ns, UTC](2), int64(1), object(9)
memory usage: 228.0+ bytes
```
## Get Bulk Report by ID (`get_bulk_report_by_id`)
The `get_bulk_report_by_id()` method retrieves the metadata and status of a previously created bulk report based on the provided bulk report ID. The `id` parameter is mandatory. Please [learn more about getting a bulk report by ID here](https://globalfishingwatch.org/our-apis/documentation#get-bulk-report-by-id) and [check its data caveats here](https://globalfishingwatch.org/our-apis/documentation#data-caveat) and [here](https://globalfishingwatch.org/our-apis/documentation#sar-fixed-infrastructure-data-caveats).
> **Important:** A bulk report can take **several minutes or hours** to generate. We recommend using this method to poll the status of a previously created bulk report until its status is `done` or `failed`.
```python
bulk_report_result = await gfw_client.bulk_downloads.get_bulk_report_by_id(
    id=create_bulk_report_data.id
)
```
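Because a report moves from `pending` to `done` or `failed` over time, a script typically repeats the call above at an interval. The following is a minimal polling sketch, not the library's own mechanism: `wait_for_terminal_status` and `make_fake_fetcher` are hypothetical helpers, and the default interval and timeout are illustrative assumptions. The fake fetcher only simulates a status sequence so the sketch can be exercised without network access.

```python
import asyncio
import time
from typing import Awaitable, Callable, Iterable

TERMINAL_STATUSES = {"done", "failed"}


async def wait_for_terminal_status(
    fetch_status: Callable[[], Awaitable[str]],
    interval_seconds: float = 60.0,
    timeout_seconds: float = 6 * 60 * 60,
) -> str:
    """Poll fetch_status() until it returns "done" or "failed"."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = await fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next poll would land past the deadline.
        if time.monotonic() + interval_seconds > deadline:
            raise TimeoutError(f"report still {status!r} after {timeout_seconds} s")
        await asyncio.sleep(interval_seconds)


def make_fake_fetcher(statuses: Iterable[str]) -> Callable[[], Awaitable[str]]:
    """Build a coroutine function that replays a fixed status sequence."""
    remaining = iter(statuses)

    async def fetch_status() -> str:
        return next(remaining)

    return fetch_status
```

With the client, `fetch_status` could be a small coroutine that awaits `gfw_client.bulk_downloads.get_bulk_report_by_id(id=...)` and returns `.data().status`. Choose an interval that respects the rate limits described in the GFW API documentation.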
### Access Get Bulk Report by ID Result as Pydantic models
```python
bulk_report_data = bulk_report_result.data()
print((
    bulk_report_data.id,
    bulk_report_data.name,
    bulk_report_data.status,
    bulk_report_data.created_at,
))
```
**Output:**
```
('c5e32895-4374-41d2-8b2e-ac414ed6757f',
'public-fixed-infrastructure-data_public-eez-areas__8466_1768085547174',
'pending',
datetime.datetime(2026, 1, 10, 22, 52, 30, 9000, tzinfo=TzInfo(0)))
```
### Access Get Bulk Report by ID Result as a DataFrame
```python
bulk_report_df = bulk_report_result.df()
print(bulk_report_df.info())
print(bulk_report_df.head())
```
**Output:**
```
RangeIndex: 1 entries, 0 to 0
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 1 non-null object
1 name 1 non-null object
2 file_path 1 non-null object
3 format 1 non-null object
4 filters 1 non-null object
5 geom 1 non-null object
6 status 1 non-null object
7 owner_id 1 non-null int64
8 owner_type 1 non-null object
9 created_at 1 non-null datetime64[ns, UTC]
10 updated_at 1 non-null datetime64[ns, UTC]
11 file_size 0 non-null object
dtypes: datetime64[ns, UTC](2), int64(1), object(9)
memory usage: 228.0+ bytes
```
## Get All Bulk Reports Created by User or Application (`get_all_bulk_reports`)
The `get_all_bulk_reports()` method retrieves a list of the **metadata and status** of previously created bulk reports based on specified pagination, sorting, and filtering criteria. Please [learn more about getting all bulk reports created by a user or application here](https://globalfishingwatch.org/our-apis/documentation#get-all-bulk-reports-by-user) and [check its data caveats here](https://globalfishingwatch.org/our-apis/documentation#data-caveat) and [here](https://globalfishingwatch.org/our-apis/documentation#sar-fixed-infrastructure-data-caveats).
```python
bulk_reports_result = await gfw_client.bulk_downloads.get_all_bulk_reports(
    status="done",
)
```
### Access All Created Bulk Reports Result as Pydantic models
```python
bulk_reports_data = bulk_reports_result.data()
bulk_report_item = bulk_reports_data[-1]
print((
    bulk_report_item.id,
    bulk_report_item.name,
    bulk_report_item.status,
    bulk_report_item.created_at,
))
```
**Output:**
```
('0c0cada1-72dd-4fb0-bdf6-7fe8c7fdb1e3',
'sar-fixed-infrastructure-data-20241207-region-1',
'done',
datetime.datetime(2025, 12, 7, 10, 3, 12, 371000, tzinfo=TzInfo(0)))
```
### Access All Created Bulk Reports Result as a DataFrame
```python
bulk_reports_df = bulk_reports_result.df()
print(bulk_reports_df.info())
print(bulk_reports_df.head())
```
**Output:**
```
RangeIndex: 7 entries, 0 to 6
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 7 non-null object
1 name 7 non-null object
2 file_path 7 non-null object
3 format 7 non-null object
4 filters 7 non-null object
5 geom 4 non-null object
6 status 7 non-null object
7 owner_id 7 non-null int64
8 owner_type 7 non-null object
9 created_at 7 non-null datetime64[ns, UTC]
10 updated_at 7 non-null datetime64[ns, UTC]
11 file_size 7 non-null float64
dtypes: datetime64[ns, UTC](2), float64(1), int64(1), object(8)
memory usage: 804.0+ bytes
```
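This guide does not state the order of the returned list, so when a later step needs the newest finished report it may be safer to select it explicitly rather than rely on position. A small sketch follows. `latest_done_report` is a hypothetical helper, and the records are stand-ins that only mimic the `id`, `status`, and `created_at` attributes printed above:

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def latest_done_report(reports):
    """Return the most recently created report with status "done", or None."""
    done = [report for report in reports if report.status == "done"]
    return max(done, key=lambda report: report.created_at, default=None)


# Stand-in records; real items would come from bulk_reports_result.data().
reports = [
    SimpleNamespace(id="a", status="done", created_at=datetime(2025, 12, 7, tzinfo=timezone.utc)),
    SimpleNamespace(id="b", status="pending", created_at=datetime(2026, 1, 10, tzinfo=timezone.utc)),
    SimpleNamespace(id="c", status="done", created_at=datetime(2025, 12, 20, tzinfo=timezone.utc)),
]
print(latest_done_report(reports).id)  # c
```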
## Get Bulk Report File Download URL (`get_bulk_report_file_download_url`)
The `get_bulk_report_file_download_url()` method retrieves a **signed URL** that points to a **downloadable file** hosted on Global Fishing Watch's cloud infrastructure. Use it to **download a file** (i.e., `"DATA"`, `"README"`, or `"GEOM"`) of a previously created bulk report. The `id` parameter is mandatory. Please [learn more about getting a bulk report file download URL here](https://globalfishingwatch.org/our-apis/documentation#download-bulk-report-url-file) and [check its data caveats here](https://globalfishingwatch.org/our-apis/documentation#data-caveat) and [here](https://globalfishingwatch.org/our-apis/documentation#sar-fixed-infrastructure-data-caveats).
```python
bulk_report_file_download_url_result = (
    await gfw_client.bulk_downloads.get_bulk_report_file_download_url(
        id=bulk_reports_data[0].id, file="DATA"
    )
)
```
### Access Get Bulk Report File Download URL Result as Pydantic models
```python
bulk_report_file_download_url_data = bulk_report_file_download_url_result.data()
print(bulk_report_file_download_url_data.url)
```
**Output:**
```
'https://storage.googleapis.com/gfw-api-bulk-pro-us-central1/705f2f9a-f695-43f1-a4bf-7746f3deb091/data.json.gz?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=api-bulk-pro%40gfw-production.iam.gserviceaccount.com%2F20260110%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20260110T225232Z&X-Goog-Expires=60&X-Goog-SignedHeaders=host&X-Goog-Signature=481a4ff7244b7286f303b37bb7941c291a26d1e3502debdb7611b8cb2d5edf37bc7aa0287b15a11c2f69f72e88791da3f76873a2fd7d08f911691c35ee8e095b825615510de8256f8cd275211997141e026837e118d86e01c026c457dc1f47d43ff2cb07131c3d21e7908c847bf1e3d87cd4773f02e8e4512a7c15e93799de186b9ea004be50cd3e53292f01e9393595a81c42cc3686f65d280f4f16076759da4722c17c2a6a698393c919cdd083402421a1bbf425b618244b3a9b30e48b770a9dc7f9eed8e63af04f8e31f0b6723fdf76fa7262ded89e7a375fbaea3b031bf29db22b1961878facd79c92d633ab6aa2309c0ce3982104d9835058ecd829bee8'
```
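The sample URL above carries `X-Goog-Date` and `X-Goog-Expires=60` query parameters. In Google Cloud Storage V4 signed URLs these give the signing time and the lifetime in seconds, so this particular URL was valid for about one minute. The sketch below reads the expiry time from such a URL. `signed_url_expiry` is a hypothetical helper, and it assumes the API keeps returning V4-signed Cloud Storage URLs in this shape.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def signed_url_expiry(url: str) -> datetime:
    """Return the UTC time at which a V4-signed Cloud Storage URL expires."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(query["X-Goog-Date"][0], "%Y%m%dT%H%M%SZ")
    lifetime = timedelta(seconds=int(query["X-Goog-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime


# Shortened form of the URL above; bucket and signature are placeholders.
example_url = (
    "https://storage.googleapis.com/example-bucket/data.json.gz"
    "?X-Goog-Date=20260110T225232Z&X-Goog-Expires=60&X-Goog-Signature=placeholder"
)
print(signed_url_expiry(example_url).isoformat())
# 2026-01-10T22:53:32+00:00
```

In practice this means requesting the URL immediately before downloading, and requesting a fresh one if the download has to be retried later.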
### Access Get Bulk Report File Download URL Result as a DataFrame
```python
bulk_report_file_download_url_df = bulk_report_file_download_url_result.df()
print(bulk_report_file_download_url_df.info())
print(bulk_report_file_download_url_df.iloc[0]["url"])
```
**Output:**
```
RangeIndex: 1 entries, 0 to 0
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 url 1 non-null object
dtypes: object(1)
memory usage: 140.0+ bytes
```
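Once you have the URL, the file still has to be fetched. The following sketch uses only the standard library. `download_file` and `read_gzip_text` are hypothetical helpers, and the gzip step assumes the `"DATA"` file is gzip-compressed, as the `data.json.gz` name in the sample URL suggests. Other files may use a different format.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, destination: Path) -> Path:
    """Stream the response body at `url` into `destination`."""
    with urllib.request.urlopen(url) as response, destination.open("wb") as out:
        shutil.copyfileobj(response, out)
    return destination


def read_gzip_text(path: Path, encoding: str = "utf-8") -> str:
    """Decompress a gzip file and return its contents as text."""
    with gzip.open(path, "rt", encoding=encoding) as handle:
        return handle.read()
```

A call such as `download_file(bulk_report_file_download_url_data.url, Path("data.json.gz"))` would save the file locally. Streaming with `shutil.copyfileobj` avoids loading a large report into memory, whereas `read_gzip_text` reads the whole file at once and is only suitable for small reports.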
## Query Bulk Fixed Infrastructure Data Report (`query_bulk_fixed_infrastructure_data_report`)
The `query_bulk_fixed_infrastructure_data_report()` method retrieves the **data records** of a previously created **fixed infrastructure data** (i.e., `public-fixed-infrastructure-data:latest` dataset) bulk report in JSON format based on specified pagination, sorting, and inclusion criteria. The `id` parameter is mandatory. Please [learn more about querying bulk fixed infrastructure data reports in JSON format here](https://globalfishingwatch.org/our-apis/documentation#get-data-in-json-format) and [check its data caveats here](https://globalfishingwatch.org/our-apis/documentation#data-caveat) and [here](https://globalfishingwatch.org/our-apis/documentation#sar-fixed-infrastructure-data-caveats).
```python
bulk_fixed_infrastructure_data_report_result = (
    await gfw_client.bulk_downloads.query_bulk_fixed_infrastructure_data_report(
        id=bulk_reports_data[0].id
    )
)
```
### Access Query Bulk Fixed Infrastructure Data Report Result as Pydantic models
```python
bulk_fixed_infrastructure_data_report_data = (
    bulk_fixed_infrastructure_data_report_result.data()
)
bulk_fixed_infrastructure_data_report_item = bulk_fixed_infrastructure_data_report_data[-1]
print((
    bulk_fixed_infrastructure_data_report_item.structure_id,
    bulk_fixed_infrastructure_data_report_item.lat,
    bulk_fixed_infrastructure_data_report_item.lon,
    bulk_fixed_infrastructure_data_report_item.label,
    bulk_fixed_infrastructure_data_report_item.label_confidence,
))
```
**Output:**
```
('1051638', -53.0895574340617, -67.32289149541135, 'oil', 'high')
```
### Access Query Bulk Fixed Infrastructure Data Report Result as a DataFrame
```python
bulk_fixed_infrastructure_data_report_result_df = (
    bulk_fixed_infrastructure_data_report_result.df()
)
print(bulk_fixed_infrastructure_data_report_result_df.info())
print(bulk_fixed_infrastructure_data_report_result_df.head())
```
**Output:**
```
RangeIndex: 1238 entries, 0 to 1237
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 detection_id 1237 non-null object
1 detection_date 1238 non-null datetime64[ns]
2 structure_id 1238 non-null object
3 lat 1238 non-null float64
4 lon 1238 non-null float64
5 structure_start_date 1238 non-null datetime64[ns]
6 structure_end_date 7 non-null datetime64[ns]
7 label 1238 non-null object
8 label_confidence 1238 non-null object
dtypes: datetime64[ns](3), float64(2), object(4)
memory usage: 87.2+ KB
```
## Next Steps
Explore the [Usage Guides](index) and [Workflow Guides](../workflow-guides/index) for other API resources to understand how you can combine the bulk data access of the Bulk Download API with gridded reports, vessel information, event data, and more. Check out the following resources:
- [4Wings API](4wings-api)
- [Vessels API](vessels-api)
- [Events API](events-api)
- [Insights API](insights-api)
- [Datasets API](datasets-api)
- [Reference Data API](references-data-api)