
Exploring the vessel presence layer in `gfwr`
Source:vignettes/articles/gfw_vessel_presence.Rmd
gfw_vessel_presence.RmdOverview
The gfw_ais_presence() function provides gridded vessel
presence data from Global Fishing Watch’s 4Wings
Map Visualization API. Global Fishing Watch’s vessel presence layer
includes all vessel types, fishing or not, and summarizes their presence
in hours, based on the data transmitted by their AIS transponders.
In this vignette we will explore several ways to call this function, and the filters that have been implemented in our APIs to support the exploration of this layer.
Setup
To get started, first load the gfwr package and some of
the packages we will use
## ℹ Loading gfwr
library(dplyr)
library(tidyr)
library(ggplot2)
library(rnaturalearth)
library(rnaturalearthdata)
library(glue)
library(viridis)
library(forcats)
library(biscale)
library(patchwork)We will fetch data for January-March 2024. Remember the date
intervals in gfwr include the start of the interval and
exclude the last date of the interval:
start_date <- '2024-01-01' # will be included
end_date <- '2024-04-01' # will be excluded. search will be up to 2024-03-31Handling pre-built regions: MPAs, RFMOs, and EEZs
gfw_ais_presence() was designed to provide data for a
specific region, offering users the ability to select from multiple
built-in region types by specifying a specific Exclusive Economic Zone
(EEZ), Marine Protected Area (MPA), or Regional Fisheries Management
Organization (RFMO).
Note: The use of a region is mandatory, as the API is not designed to handle global requests
The list of available regions for each type, and their
label and id, can be accessed with the
gfw_regions() function.
eez_regions <- gfw_regions(region_source = 'EEZ')
eez_regions
## # A tibble: 285 × 5
## iso label id GEONAME POL_TYPE
## <chr> <chr> <dbl> <chr> <chr>
## 1 ASM American Samoa 8444 United States Exclusive Eco… 200NM
## 2 SHN Ascension 8379 British Exclusive Economic … 200NM
## 3 COK Cook Islands 8446 New Zealand Exclusive Econo… 200NM
## 4 FLK Falkland / Malvinas Islands 8389 Overlapping claim Falkland … Overlap…
## 5 PYF French Polynesia 8440 French Exclusive Economic Z… 200NM
## 6 PCN Pitcairn 8439 British Exclusive Economic … 200NM
## 7 SHN Saint Helena 8380 British Exclusive Economic … 200NM
## 8 WSM Samoa 8445 Samoan Exclusive Economic Z… 200NM
## 9 TON Tonga 8448 Tongan Exclusive Economic Z… 200NM
## 10 SHN Tristan da Cunha 8382 British Exclusive Economic … 200NM
## # ℹ 275 more rowsgfwr also includes the gfw_region_id()
function to get the label and id for a
specific region using the region argument. For EEZs,
region corresponds to the name or the country or the ISO3
code. Note that, for some countries, the name will return multiple
regions. For RFMOs, region corresponds to the RFMO
abbreviation (e.g. "ICCAT") and for MPAs it refers to the
name of the MPA.
To fetch the numeric code of the Senegal EEZ, let’s use
gfw_region_id()
# Use gfw_region_id function to get EEZ code for Senegal
senegal_eez_code <- gfw_region_id(region = "Senegal", region_source = "EEZ")
senegal_eez_code
## # A tibble: 2 × 5
## iso3 label id GEONAME POL_TYPE
## <chr> <chr> <dbl> <chr> <chr>
## 1 NA Joint regime area: Senegal / Guinea-Bissau 48964 Joint regime … Joint r…
## 2 SEN Senegal 8371 Senegalese Ex… 200NMThe results show the EEZ and a Joint regime area. We will pick the
EEZ code, 8371.
Calling the function
The gfw_ais_presence() function allows users to specify
multiple criteria to customize the data they download, including the
date range, spatial and temporal resolution, and grouping variables. See
the documentation for gfw_ais_presence() or the GFW
APIs for more info about these parameter options.
Spatial resolution can be LOW = 0.1 degree or HIGH = 0.01 degree,
vp_senegal <- gfw_ais_presence(spatial_resolution = "LOW",
temporal_resolution = "MONTHLY",
start_date = start_date,
end_date = end_date,
region_source = "EEZ",
region = 8371)
vp_senegal
## # A tibble: 4,173 × 5
## Lat Lon `Time Range` `Vessel IDs` `Vessel Presence Hours`
## <dbl> <dbl> <chr> <dbl> <dbl>
## 1 14.5 -18 2024-02 57 100
## 2 11.6 -18.1 2024-01 21 24
## 3 11.4 -19.3 2024-01 9 12
## 4 13.8 -20 2024-03 5 6
## 5 13 -17.5 2024-01 24 61
## 6 14.6 -18.5 2024-03 48 52
## 7 15.4 -18.5 2024-02 26 30
## 8 11.9 -17.7 2024-02 58 66
## 9 11.8 -18.9 2024-02 10 11
## 10 11.7 -19.6 2024-02 4 4
## # ℹ 4,163 more rowsWithout grouping variables, the function will return a number of
vessel IDs present (for a definition of
Vessel ID see our vessel
identity vignette) and the total vessel presence hours for each cell
(lat, lon).
The Time Range column will be expressed in the temporal
units of the temporal resolution selected. In this example,
MONTHLY will create a Time Range expressed in months:
YYYY-MM
Explore other temporal resolution and how the results vary.
vp_senegal <- gfw_ais_presence(spatial_resolution = "LOW",
temporal_resolution = "YEARLY",
start_date = start_date,
end_date = end_date,
region_source = "EEZ",
region = 8371)
vp_senegal
## # A tibble: 1,445 × 5
## Lat Lon `Time Range` `Vessel IDs` `Vessel Presence Hours`
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 15.3 -17.5 2024 20 31
## 2 11.8 -18.7 2024 31 34
## 3 11.9 -17.9 2024 105 129
## 4 14.1 -16.4 2024 4 81
## 5 15.1 -19.8 2024 12 13
## 6 11.2 -19.5 2024 19 22
## 7 15.9 -18.6 2024 67 80
## 8 15.7 -16.7 2024 1 2
## 9 13.7 -18.4 2024 71 84
## 10 16 -18.9 2024 27 36
## # ℹ 1,435 more rowsGrouping variables
The outputs of gfw_ais_presence() can be grouped by
FLAG, GEARTYPE, FLAGANDGEARTYPE,
MMSI or VESSEL_ID. This will create extra
grouping columns, and the number of vessel presence hours will be
expressed accordingly.
vp_senegal_flag <- gfw_ais_presence(spatial_resolution = "LOW",
temporal_resolution = "MONTHLY",
group_by = "FLAG",
start_date = start_date,
end_date = end_date,
region_source = "EEZ",
region = 8371)
vp_senegal_flag |> count(flag) |> arrange((desc(n)))
## # A tibble: 84 × 2
## flag n
## <chr> <int>
## 1 MHL 2918
## 2 LBR 2900
## 3 PAN 2653
## 4 MLT 2076
## 5 SGP 1897
## 6 BHS 1697
## 7 HKG 1639
## 8 NOR 1285
## 9 ATG 1270
## 10 PRT 1261
## # ℹ 74 more rowsNote that these results are grouped by
Vessel ID, which is not the same as grouping by number of
vessels. Check our vessel
identity vignette for more information.
Grouping by MMSI will group the results at the MMSI scale, which can correspond to individual vessels, but this is not always the case.
vp_senegal_MMSI <- gfw_ais_presence(spatial_resolution = "LOW",
temporal_resolution = "MONTHLY",
group_by = "MMSI",
start_date = start_date,
end_date = end_date,
region_source = "EEZ",
region = 8371)
vp_senegal_MMSI
## # A tibble: 79,918 × 7
## Lat Lon `Time Range` mmsi `Entry Timestamp` `Exit Timestamp`
## <dbl> <dbl> <chr> <dbl> <dttm> <dttm>
## 1 14.4 -20 2024-02 314573000 2024-02-16 13:00:00 2024-02-17 03:00:00
## 2 12.2 -17.9 2024-03 352003666 2024-03-19 00:00:00 2024-03-19 20:00:00
## 3 14.5 -18.6 2024-01 228032600 2024-01-16 23:00:00 2024-01-17 16:00:00
## 4 12.9 -19.2 2024-03 538009957 2024-03-22 05:00:00 2024-03-23 04:00:00
## 5 13 -18.9 2024-03 440006000 2024-02-18 17:00:00 2024-03-16 05:00:00
## 6 14.7 -17.4 2024-03 787777888 2024-03-30 10:00:00 2024-03-31 23:00:00
## 7 13.9 -17.8 2024-03 538002884 2024-03-17 05:00:00 2024-03-18 21:00:00
## 8 12.8 -18 2024-03 248128000 2024-03-10 23:00:00 2024-03-31 14:00:00
## 9 14 -18.3 2024-02 538006075 2024-02-19 14:00:00 2024-02-20 10:00:00
## 10 15.9 -19 2024-02 563106700 2024-01-31 03:00:00 2024-02-01 02:00:00
## # ℹ 79,908 more rows
## # ℹ 1 more variable: `Vessel Presence Hours` <dbl>Finally, grouping by Vessel ID not only returns the
Vessel IDs of the active vessels in the area, it also
returns all the identity details about the vessels. Knowing this can
help a lot in workflows that need detailed information about vessel
identity, gears, and characteristics.
vp_senegal_vesselID <- gfw_ais_presence(spatial_resolution = "LOW",
temporal_resolution = "MONTHLY",
group_by = "VESSEL_ID",
start_date = start_date,
end_date = end_date,
region_source = "EEZ",
region = 8371)
vp_senegal_vesselID
## # A tibble: 79,936 × 16
## Lat Lon `Time Range` `Vessel ID` Flag `Vessel Name` `Entry Timestamp`
## <dbl> <dbl> <chr> <chr> <chr> <chr> <dttm>
## 1 15.2 -17.8 2024-01 25bfd0a97-7… DNK PAWIKAN 2024-01-15 16:00:00
## 2 11.1 -19.1 2024-01 696b92039-9… MHL PERLY 2024-01-30 16:00:00
## 3 12.4 -18.5 2024-03 ad009e811-1… BHS AUTUMN STREAM 2024-03-02 19:00:00
## 4 12.7 -17.4 2024-01 89da242f9-9… PAN MSC SUEZ 2024-01-12 08:00:00
## 5 12.8 -18.8 2024-01 bab37e213-3… MHL NORDIC ORION 2024-01-01 00:00:00
## 6 14.6 -17.8 2024-01 882c3d526-6… THA KANCHANA NAR… 2024-01-01 23:00:00
## 7 14.2 -18.7 2024-02 a19c4747d-d… PRT BRITTA OLDEN… 2024-02-25 16:00:00
## 8 11.3 -19 2024-02 63913b575-5… MLT DORADO LNG 2024-02-27 03:00:00
## 9 15.5 -17.6 2024-03 cd9d9eb63-3… LBR ASL NEPTUNE 2024-03-21 17:00:00
## 10 15.3 -17.8 2024-03 93e609520-0… LBR KRITI CAPTAIN 2024-03-12 21:00:00
## # ℹ 79,926 more rows
## # ℹ 9 more variables: `Exit Timestamp` <dttm>, `Gear Type` <chr>,
## # `Vessel Type` <chr>, MMSI <dbl>, IMO <dbl>, CallSign <chr>,
## # `First Transmission Date` <dttm>, `Last Transmission Date` <dttm>,
## # `Vessel Presence Hours` <dbl>The columns include Lat, Lon, Time Range, Vessel ID, Flag, Vessel Name, Entry Timestamp, Exit Timestamp, Gear Type, Vessel Type, MMSI, IMO, CallSign, First Transmission Date, Last Transmission Date, Vessel Presence Hours.
vp_senegal_vesselID |> count(`Gear Type`)
## # A tibble: 17 × 2
## `Gear Type` n
## <chr> <int>
## 1 BUNKER 274
## 2 CARGO 40261
## 3 CARRIER 2020
## 4 DRIFTING_LONGLINES 343
## 5 FISHING 438
## 6 GEAR 26
## 7 INCONCLUSIVE 394
## 8 OTHER 27947
## 9 OTHER_PURSE_SEINES 60
## 10 PASSENGER 1352
## 11 POLE_AND_LINE 16
## 12 PURSE_SEINES 54
## 13 PURSE_SEINE_SUPPORT 24
## 14 SEISMIC_VESSEL 414
## 15 SET_LONGLINES 43
## 16 TRAWLERS 5891
## 17 TUNA_PURSE_SEINES 379
vp_senegal_vesselID |> count(`Vessel Type`)
## # A tibble: 9 × 2
## `Vessel Type` n
## <chr> <int>
## 1 BUNKER 274
## 2 CARGO 40329
## 3 CARRIER 2020
## 4 FISHING 7631
## 5 GEAR 30
## 6 OTHER 27862
## 7 PASSENGER 1352
## 8 SEISMIC_VESSEL 414
## 9 SUPPORT 24Mapping vessel presence with ggplot2
Before mapping let’s define a theme using ggplot2
# Map theme with dark background
map_theme <- ggplot2::theme_minimal() +
ggplot2::theme(
panel.border = element_blank(),
legend.position = "bottom", legend.box = "vertical",
legend.key.height = unit(3, "mm"),
legend.key.width = unit(15, "mm"),
legend.text = element_text(color = "#848b9b", size = 8),
legend.title = element_text(face = "bold", color = "#363c4c", size = 8, hjust = 0.5),
plot.title = element_text(face = "bold", color = "#363c4c", size = 10),
plot.subtitle = element_text(color = "#363c4c", size = 10),
axis.title = element_blank(),
axis.text = element_text(color = "#848b9b", size = 8)
)
map_speed_light <- viridis::turbo(3, begin = 0.5)And let’s map the original vessel presence dataset for January-March 2024:
vp_senegal
## # A tibble: 1,445 × 5
## Lat Lon `Time Range` `Vessel IDs` `Vessel Presence Hours`
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 15.3 -17.5 2024 20 31
## 2 11.8 -18.7 2024 31 34
## 3 11.9 -17.9 2024 105 129
## 4 14.1 -16.4 2024 4 81
## 5 15.1 -19.8 2024 12 13
## 6 11.2 -19.5 2024 19 22
## 7 15.9 -18.6 2024 67 80
## 8 15.7 -16.7 2024 1 2
## 9 13.7 -18.4 2024 71 84
## 10 16 -18.9 2024 27 36
## # ℹ 1,435 more rowsWe can use ggplot2 and geom_tile to plot
the data.
vp_senegal |>
ggplot() +
geom_tile(aes(x = Lon,
y = Lat,
fill = `Vessel Presence Hours`)) +
geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
coord_sf(xlim = c(min(vp_senegal$Lon), max(vp_senegal$Lon)),
ylim = c(min(vp_senegal$Lat), max(vp_senegal$Lat))) +
scale_fill_gradientn(
transform = 'log10',
colors = map_speed_light,
na.value = NA,
labels = scales::comma) +
labs(title = "Vessel Presence hours in the Senegalese EEZ",
subtitle = glue("{start_date} to {end_date}"),
fill = "Vessel presence hours (log)") +
map_theme
Using the speed filter
gfw_ais_vessel_presence() supports filtering the vessels
by speed range (in knots) in the following categories :
- <2 – Less than 2 knots
- 2-4 – 2 to 4 knots
- 4-6 – 4 to 6 knots
- 6-10 – 6 to 10 knots
- 10-15 – 10 to 15 knots
- 15-25 – 15 to 25 knots
- >25 – Greater than 25 knots
The filter syntax is adding the category:
filter_by = "speed = '<2'"
Using the filter will subset the activity raster to the activity that happened in the speed range:
eez_vessel_presence_speed <- gfw_ais_presence(
spatial_resolution = "LOW",
temporal_resolution = "MONTHLY",
group_by = "FLAG",
filter_by = "speed = '6-10'",
start_date = start_date,
end_date = end_date,
region = 8371,
region_source = "EEZ"
)
eez_vessel_presence_speed
## # A tibble: 9,499 × 6
## Lat Lon `Time Range` flag `Vessel IDs` `Vessel Presence Hours`
## <dbl> <dbl> <chr> <chr> <dbl> <dbl>
## 1 13.9 -17.2 2024-02 SEN 3 4
## 2 15 -17.7 2024-03 BHS 1 2
## 3 13.8 -17.5 2024-02 NLD 1 2
## 4 14.6 -17.8 2024-03 NOR 1 1
## 5 12.7 -17.5 2024-02 PAN 1 3
## 6 14.1 -18.5 2024-03 BLZ 1 1
## 7 12.9 -17.7 2024-02 LUX 1 1
## 8 13 -18 2024-01 PAN 1 1
## 9 15.9 -17.3 2024-01 SGP 1 1
## 10 12.3 -17.4 2024-02 ESP 2 2
## # ℹ 9,489 more rowsNote: The output won’t have an indication of the speed filter that was used, so a recommendation is to add a column
speedto the output before merging with other speed bins. This is also a reason why using logical clauses likefilter_by = "speed = '6-10' AND speed = '10-15'"is not very useful if you want to be able to keep the bins apart.
The speed filter will be a subset of the overall vessel presence


Retrieving multiple speed classes
To retrieve data binned in multiple (or all) speed classes, we can loop across the desired speed categories. Let’s make another example in the Mexico EEZ.
id <- gfw_region_id("Mexico") |>
filter(POL_TYPE == "200NM") |>
dplyr::pull(id)
speed_categories <- c("<2",
"2-4",
"4-6",
"6-10",
"10-15",
"15-25",
">25")
# we create the filters for each
sp <- paste0("speed = '" , speed_categories, "'")
vp_all_speeds <- purrr::map(sp,
~gfw_ais_presence(
spatial_resolution = "LOW",
temporal_resolution = "MONTHLY",
group_by = "VESSEL_ID",
filter_by = .x,
start_date = start_date,
end_date = end_date,
region = id,
region_source = "EEZ"))
# adding a speed column to help aggregate the data
vp_all_speeds <- purrr::map2(vp_all_speeds,
speed_categories,
~mutate(.x, speed = .y)) |>
bind_rows()
# reorganizing factor levels
vp_all_speeds <- vp_all_speeds |>
mutate(speed = as.factor(speed)) |>
mutate(speed = forcats::fct_relevel(speed, c("<2",
"2-4",
"4-6",
"6-10",
"10-15",
"15-25",
">25")))
vp_all_speeds |>
ggplot() +
geom_tile(aes(x = Lon,
y = Lat,
fill = `Vessel Presence Hours`)) +
geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
coord_sf(xlim = c(min(vp_all_speeds$Lon),
max(vp_all_speeds$Lon)),
ylim = c(min(vp_all_speeds$Lat),
max(vp_all_speeds$Lat))) +
scale_fill_gradientn(
transform = 'log10',
colors = map_speed_light,
na.value = NA,
labels = scales::comma) +
facet_wrap(~speed) +
labs(title = "Vessel Presence hours by speed categories",
subtitle = glue("{start_date} to {end_date}"),
fill = "Presence hours (log10)") +
map_theme
Since the request retrieved the data grouped by
vessel ID, we have gear types, vessel types and other
identity markers that can help filter and refine the visualization.
vp_all_speeds |>
filter(`Vessel Type` %in% c("FISHING", "CARGO"),
speed %in% c("6-10", "10-15")) |>
ggplot() +
geom_tile(aes(x = Lon,
y = Lat,
fill = `Vessel Presence Hours`)) +
geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
coord_sf(xlim = c(min(vp_all_speeds$Lon),
max(vp_all_speeds$Lon)),
ylim = c(min(vp_all_speeds$Lat),
max(vp_all_speeds$Lat))) +
scale_fill_gradientn(
transform = 'log10',
colors = map_speed_light,
na.value = NA,
labels = scales::comma) +
facet_grid(speed~`Vessel Type`) +
labs(title = "Vessel Presence hours in Mexico's EEZ",
subtitle = paste("For 6-10 knots and 10-15 knots and cargo vs. fishing vessel types"),
fill = "Presence hours (log10)") +
map_theme
Bivariate plots of speed and vessel presence hours
Vessel presence and speed may be better visualized using a bivariate plot, to distinguish areas with different levels of vessel density from areas where vessels transit at high or low speeds.
vp_all_speeds <- vp_all_speeds |>
mutate(speed_nb = case_when(
speed == "<2" ~ 2,
speed == "2-4" ~ 4,
speed == "4-6" ~ 6,
speed == "6-10" ~ 10,
speed == "10-15" ~ 15,
speed == "15-25" ~ 25,
speed == ">25" ~ 50))
biscale_speeds <- vp_all_speeds |>
group_by(Lat, Lon, speed_nb) |>
summarize(vessel_presence_hours = sum(`Vessel Presence Hours`)) |>
biscale::bi_class(x = vessel_presence_hours,
y = speed_nb, style = "quantile")
speed_map <- biscale_speeds |>
ggplot() +
geom_tile(aes(x = Lon,
y = Lat,
fill = bi_class)) +
geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
coord_sf(xlim = c(min(biscale_speeds$Lon),
max(biscale_speeds$Lon)),
ylim = c(min(biscale_speeds$Lat),
max(biscale_speeds$Lat))) +
bi_scale_fill(pal = "DkBlue2") +
theme_minimal() +
theme(legend.position = "none")
p_legend <- bi_legend(pal = "DkBlue2",
xlab = "Presence hours",
ylab = "Speed",
dim = 3,
size = 12)
p_combo <- (speed_map + p_legend)
p_combo