Skip to contents

Overview

The gfw_ais_presence() function provides gridded vessel presence data from Global Fishing Watch’s 4Wings Map Visualization API. Global Fishing Watch’s vessel presence layer includes all vessel types, fishing or not, and summarizes their presence in hours, based on the data transmitted by their AIS transponders.

In this vignette we will explore several ways to call this function, and the filters that have been implemented in our APIs to support the exploration of this layer.

Setup

To get started, first load the gfwr package and some of the packages we will use

##  Loading gfwr

We will fetch data for January-March 2024. Remember the date intervals in gfwr include the start of the interval and exclude the last date of the interval:

start_date <- '2024-01-01' # will be included
end_date <- '2024-04-01'   # will be excluded. search will be up to 2024-03-31

Handling pre-built regions: MPAs, RFMOs, and EEZs

gfw_ais_presence() was designed to provide data for a specific region, offering users the ability to select from multiple built-in region types by specifying a specific Exclusive Economic Zone (EEZ), Marine Protected Area (MPA), or Regional Fisheries Management Organization (RFMO).

Note: The use of a region is mandatory, as the API is not designed to handle global requests

The list of available regions for each type, and their label and id, can be accessed with the gfw_regions() function.

eez_regions <- gfw_regions(region_source = 'EEZ')
eez_regions
## # A tibble: 285 × 5
##    iso   label                          id GEONAME                      POL_TYPE
##    <chr> <chr>                       <dbl> <chr>                        <chr>   
##  1 ASM   American Samoa               8444 United States Exclusive Eco… 200NM   
##  2 SHN   Ascension                    8379 British Exclusive Economic … 200NM   
##  3 COK   Cook Islands                 8446 New Zealand Exclusive Econo… 200NM   
##  4 FLK   Falkland / Malvinas Islands  8389 Overlapping claim Falkland … Overlap…
##  5 PYF   French Polynesia             8440 French Exclusive Economic Z… 200NM   
##  6 PCN   Pitcairn                     8439 British Exclusive Economic … 200NM   
##  7 SHN   Saint Helena                 8380 British Exclusive Economic … 200NM   
##  8 WSM   Samoa                        8445 Samoan Exclusive Economic Z… 200NM   
##  9 TON   Tonga                        8448 Tongan Exclusive Economic Z… 200NM   
## 10 SHN   Tristan da Cunha             8382 British Exclusive Economic … 200NM   
## # ℹ 275 more rows

gfwr also includes the gfw_region_id() function to get the label and id for a specific region using the region argument. For EEZs, region corresponds to the name or the country or the ISO3 code. Note that, for some countries, the name will return multiple regions. For RFMOs, region corresponds to the RFMO abbreviation (e.g. "ICCAT") and for MPAs it refers to the name of the MPA.

To fetch the numeric code of the Senegal EEZ, let’s use gfw_region_id()

# Use gfw_region_id function to get EEZ code for Senegal
senegal_eez_code <- gfw_region_id(region = "Senegal", region_source = "EEZ")
senegal_eez_code
## # A tibble: 2 × 5
##   iso3  label                                         id GEONAME        POL_TYPE
##   <chr> <chr>                                      <dbl> <chr>          <chr>   
## 1 NA    Joint regime area: Senegal / Guinea-Bissau 48964 Joint regime … Joint r…
## 2 SEN   Senegal                                     8371 Senegalese Ex… 200NM

The results show the EEZ and a Joint regime area. We will pick the EEZ code, 8371.

Calling the function

The gfw_ais_presence() function allows users to specify multiple criteria to customize the data they download, including the date range, spatial and temporal resolution, and grouping variables. See the documentation for gfw_ais_presence() or the GFW APIs for more info about these parameter options.

Spatial resolution can be LOW = 0.1 degree or HIGH = 0.01 degree,

vp_senegal <- gfw_ais_presence(spatial_resolution = "LOW",
                               temporal_resolution = "MONTHLY",
                               start_date = start_date,
                               end_date = end_date,
                               region_source = "EEZ",
                               region = 8371)
vp_senegal
## # A tibble: 4,173 × 5
##      Lat   Lon `Time Range` `Vessel IDs` `Vessel Presence Hours`
##    <dbl> <dbl> <chr>               <dbl>                   <dbl>
##  1  14.5 -18   2024-02                57                     100
##  2  11.6 -18.1 2024-01                21                      24
##  3  11.4 -19.3 2024-01                 9                      12
##  4  13.8 -20   2024-03                 5                       6
##  5  13   -17.5 2024-01                24                      61
##  6  14.6 -18.5 2024-03                48                      52
##  7  15.4 -18.5 2024-02                26                      30
##  8  11.9 -17.7 2024-02                58                      66
##  9  11.8 -18.9 2024-02                10                      11
## 10  11.7 -19.6 2024-02                 4                       4
## # ℹ 4,163 more rows

Without grouping variables, the function will return a number of vessel IDs present (for a definition of Vessel ID see our vessel identity vignette) and the total vessel presence hours for each cell (lat, lon).

The Time Range column will be expressed in the temporal units of the temporal resolution selected. In this example, MONTHLY will create a Time Range expressed in months: YYYY-MM

Explore other temporal resolution and how the results vary.

vp_senegal <- gfw_ais_presence(spatial_resolution = "LOW",
                               temporal_resolution = "YEARLY",
                               start_date = start_date,
                               end_date = end_date,
                               region_source = "EEZ",
                               region = 8371)
vp_senegal
## # A tibble: 1,445 × 5
##      Lat   Lon `Time Range` `Vessel IDs` `Vessel Presence Hours`
##    <dbl> <dbl>        <dbl>        <dbl>                   <dbl>
##  1  15.3 -17.5         2024           20                      31
##  2  11.8 -18.7         2024           31                      34
##  3  11.9 -17.9         2024          105                     129
##  4  14.1 -16.4         2024            4                      81
##  5  15.1 -19.8         2024           12                      13
##  6  11.2 -19.5         2024           19                      22
##  7  15.9 -18.6         2024           67                      80
##  8  15.7 -16.7         2024            1                       2
##  9  13.7 -18.4         2024           71                      84
## 10  16   -18.9         2024           27                      36
## # ℹ 1,435 more rows

Grouping variables

The outputs of gfw_ais_presence() can be grouped by FLAG, GEARTYPE, FLAGANDGEARTYPE, MMSI or VESSEL_ID. This will create extra grouping columns, and the number of vessel presence hours will be expressed accordingly.

vp_senegal_flag <- gfw_ais_presence(spatial_resolution = "LOW",
                                    temporal_resolution = "MONTHLY",
                                    group_by = "FLAG",
                                    start_date = start_date,
                                    end_date = end_date,
                                    region_source = "EEZ",
                                    region = 8371)
vp_senegal_flag |> count(flag) |> arrange((desc(n)))
## # A tibble: 84 × 2
##    flag      n
##    <chr> <int>
##  1 MHL    2918
##  2 LBR    2900
##  3 PAN    2653
##  4 MLT    2076
##  5 SGP    1897
##  6 BHS    1697
##  7 HKG    1639
##  8 NOR    1285
##  9 ATG    1270
## 10 PRT    1261
## # ℹ 74 more rows

Note that these results are grouped by Vessel ID, which is not the same as grouping by number of vessels. Check our vessel identity vignette for more information.

Grouping by MMSI will group the results at the MMSI scale, which can correspond to individual vessels, but this is not always the case.

vp_senegal_MMSI <- gfw_ais_presence(spatial_resolution = "LOW",
                                    temporal_resolution = "MONTHLY",
                                    group_by = "MMSI",
                                    start_date = start_date,
                                    end_date = end_date,
                                    region_source = "EEZ",
                                    region = 8371)
vp_senegal_MMSI
## # A tibble: 79,918 × 7
##      Lat   Lon `Time Range`      mmsi `Entry Timestamp`   `Exit Timestamp`   
##    <dbl> <dbl> <chr>            <dbl> <dttm>              <dttm>             
##  1  14.4 -20   2024-02      314573000 2024-02-16 13:00:00 2024-02-17 03:00:00
##  2  12.2 -17.9 2024-03      352003666 2024-03-19 00:00:00 2024-03-19 20:00:00
##  3  14.5 -18.6 2024-01      228032600 2024-01-16 23:00:00 2024-01-17 16:00:00
##  4  12.9 -19.2 2024-03      538009957 2024-03-22 05:00:00 2024-03-23 04:00:00
##  5  13   -18.9 2024-03      440006000 2024-02-18 17:00:00 2024-03-16 05:00:00
##  6  14.7 -17.4 2024-03      787777888 2024-03-30 10:00:00 2024-03-31 23:00:00
##  7  13.9 -17.8 2024-03      538002884 2024-03-17 05:00:00 2024-03-18 21:00:00
##  8  12.8 -18   2024-03      248128000 2024-03-10 23:00:00 2024-03-31 14:00:00
##  9  14   -18.3 2024-02      538006075 2024-02-19 14:00:00 2024-02-20 10:00:00
## 10  15.9 -19   2024-02      563106700 2024-01-31 03:00:00 2024-02-01 02:00:00
## # ℹ 79,908 more rows
## # ℹ 1 more variable: `Vessel Presence Hours` <dbl>

Finally, grouping by Vessel ID not only returns the Vessel IDs of the active vessels in the area, it also returns all the identity details about the vessels. Knowing this can help a lot in workflows that need detailed information about vessel identity, gears, and characteristics.

vp_senegal_vesselID <- gfw_ais_presence(spatial_resolution = "LOW",
                                        temporal_resolution = "MONTHLY",
                                        group_by = "VESSEL_ID",
                                        start_date = start_date,
                                        end_date = end_date,
                                        region_source = "EEZ",
                                        region = 8371)
vp_senegal_vesselID
## # A tibble: 79,936 × 16
##      Lat   Lon `Time Range` `Vessel ID`  Flag  `Vessel Name` `Entry Timestamp`  
##    <dbl> <dbl> <chr>        <chr>        <chr> <chr>         <dttm>             
##  1  15.2 -17.8 2024-01      25bfd0a97-7… DNK   PAWIKAN       2024-01-15 16:00:00
##  2  11.1 -19.1 2024-01      696b92039-9… MHL   PERLY         2024-01-30 16:00:00
##  3  12.4 -18.5 2024-03      ad009e811-1… BHS   AUTUMN STREAM 2024-03-02 19:00:00
##  4  12.7 -17.4 2024-01      89da242f9-9… PAN   MSC SUEZ      2024-01-12 08:00:00
##  5  12.8 -18.8 2024-01      bab37e213-3… MHL   NORDIC ORION  2024-01-01 00:00:00
##  6  14.6 -17.8 2024-01      882c3d526-6… THA   KANCHANA NAR… 2024-01-01 23:00:00
##  7  14.2 -18.7 2024-02      a19c4747d-d… PRT   BRITTA OLDEN… 2024-02-25 16:00:00
##  8  11.3 -19   2024-02      63913b575-5… MLT   DORADO LNG    2024-02-27 03:00:00
##  9  15.5 -17.6 2024-03      cd9d9eb63-3… LBR   ASL NEPTUNE   2024-03-21 17:00:00
## 10  15.3 -17.8 2024-03      93e609520-0… LBR   KRITI CAPTAIN 2024-03-12 21:00:00
## # ℹ 79,926 more rows
## # ℹ 9 more variables: `Exit Timestamp` <dttm>, `Gear Type` <chr>,
## #   `Vessel Type` <chr>, MMSI <dbl>, IMO <dbl>, CallSign <chr>,
## #   `First Transmission Date` <dttm>, `Last Transmission Date` <dttm>,
## #   `Vessel Presence Hours` <dbl>

The columns include Lat, Lon, Time Range, Vessel ID, Flag, Vessel Name, Entry Timestamp, Exit Timestamp, Gear Type, Vessel Type, MMSI, IMO, CallSign, First Transmission Date, Last Transmission Date, Vessel Presence Hours.

vp_senegal_vesselID |> count(`Gear Type`)
## # A tibble: 17 × 2
##    `Gear Type`             n
##    <chr>               <int>
##  1 BUNKER                274
##  2 CARGO               40261
##  3 CARRIER              2020
##  4 DRIFTING_LONGLINES    343
##  5 FISHING               438
##  6 GEAR                   26
##  7 INCONCLUSIVE          394
##  8 OTHER               27947
##  9 OTHER_PURSE_SEINES     60
## 10 PASSENGER            1352
## 11 POLE_AND_LINE          16
## 12 PURSE_SEINES           54
## 13 PURSE_SEINE_SUPPORT    24
## 14 SEISMIC_VESSEL        414
## 15 SET_LONGLINES          43
## 16 TRAWLERS             5891
## 17 TUNA_PURSE_SEINES     379
vp_senegal_vesselID |> count(`Vessel Type`)
## # A tibble: 9 × 2
##   `Vessel Type`      n
##   <chr>          <int>
## 1 BUNKER           274
## 2 CARGO          40329
## 3 CARRIER         2020
## 4 FISHING         7631
## 5 GEAR              30
## 6 OTHER          27862
## 7 PASSENGER       1352
## 8 SEISMIC_VESSEL   414
## 9 SUPPORT           24

Mapping vessel presence with ggplot2

Before mapping let’s define a theme using ggplot2

# Map theme with dark background
map_theme <- ggplot2::theme_minimal() + 
  ggplot2::theme(
    panel.border = element_blank(), 
    legend.position = "bottom", legend.box = "vertical", 
    legend.key.height = unit(3, "mm"), 
    legend.key.width = unit(15, "mm"),
    legend.text = element_text(color = "#848b9b", size = 8), 
    legend.title = element_text(face = "bold", color = "#363c4c", size = 8, hjust = 0.5), 
    plot.title = element_text(face = "bold", color = "#363c4c", size = 10), 
    plot.subtitle = element_text(color = "#363c4c", size = 10), 
    axis.title = element_blank(), 
    axis.text = element_text(color = "#848b9b", size = 8)
    )

map_speed_light <- viridis::turbo(3, begin = 0.5)

And let’s map the original vessel presence dataset for January-March 2024:

vp_senegal
## # A tibble: 1,445 × 5
##      Lat   Lon `Time Range` `Vessel IDs` `Vessel Presence Hours`
##    <dbl> <dbl>        <dbl>        <dbl>                   <dbl>
##  1  15.3 -17.5         2024           20                      31
##  2  11.8 -18.7         2024           31                      34
##  3  11.9 -17.9         2024          105                     129
##  4  14.1 -16.4         2024            4                      81
##  5  15.1 -19.8         2024           12                      13
##  6  11.2 -19.5         2024           19                      22
##  7  15.9 -18.6         2024           67                      80
##  8  15.7 -16.7         2024            1                       2
##  9  13.7 -18.4         2024           71                      84
## 10  16   -18.9         2024           27                      36
## # ℹ 1,435 more rows

We can use ggplot2 and geom_tile to plot the data.

vp_senegal |> 
  ggplot() +
  geom_tile(aes(x = Lon,
                y = Lat,
                fill = `Vessel Presence Hours`)) +
  geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
  coord_sf(xlim = c(min(vp_senegal$Lon), max(vp_senegal$Lon)),
           ylim = c(min(vp_senegal$Lat), max(vp_senegal$Lat))) +
  scale_fill_gradientn(
    transform = 'log10',
    colors = map_speed_light, 
    na.value = NA,
    labels = scales::comma) +
  labs(title = "Vessel Presence hours in the Senegalese EEZ",
       subtitle = glue("{start_date} to {end_date}"),
       fill = "Vessel presence hours (log)") +
  map_theme

A map of Vessel Presence in the Senegalese EEZ.  The scale is logarithmic

Using the speed filter

gfw_ais_vessel_presence() supports filtering the vessels by speed range (in knots) in the following categories :

  • <2 – Less than 2 knots
  • 2-4 – 2 to 4 knots
  • 4-6 – 4 to 6 knots
  • 6-10 – 6 to 10 knots
  • 10-15 – 10 to 15 knots
  • 15-25 – 15 to 25 knots
  • >25 – Greater than 25 knots

The filter syntax is adding the category: filter_by = "speed = '<2'"

Using the filter will subset the activity raster to the activity that happened in the speed range:

eez_vessel_presence_speed <- gfw_ais_presence(
  spatial_resolution = "LOW",
  temporal_resolution = "MONTHLY",
  group_by = "FLAG",
  filter_by = "speed = '6-10'",
  start_date = start_date,
  end_date = end_date,
  region = 8371,
  region_source = "EEZ"
  )
eez_vessel_presence_speed
## # A tibble: 9,499 × 6
##      Lat   Lon `Time Range` flag  `Vessel IDs` `Vessel Presence Hours`
##    <dbl> <dbl> <chr>        <chr>        <dbl>                   <dbl>
##  1  13.9 -17.2 2024-02      SEN              3                       4
##  2  15   -17.7 2024-03      BHS              1                       2
##  3  13.8 -17.5 2024-02      NLD              1                       2
##  4  14.6 -17.8 2024-03      NOR              1                       1
##  5  12.7 -17.5 2024-02      PAN              1                       3
##  6  14.1 -18.5 2024-03      BLZ              1                       1
##  7  12.9 -17.7 2024-02      LUX              1                       1
##  8  13   -18   2024-01      PAN              1                       1
##  9  15.9 -17.3 2024-01      SGP              1                       1
## 10  12.3 -17.4 2024-02      ESP              2                       2
## # ℹ 9,489 more rows

Note: The output won’t have an indication of the speed filter that was used, so a recommendation is to add a column speed to the output before merging with other speed bins. This is also a reason why using logical clauses like filter_by = "speed = '6-10' AND speed = '10-15'" is not very useful if you want to be able to keep the bins apart.

The speed filter will be a subset of the overall vessel presence

Above, a map Vessel Presence in the Senegalese EEZ. Below, the same map but only for vessels present and transiting at 6-10 knots average speed. The scale is logarithmicAbove, a map Vessel Presence in the Senegalese EEZ. Below, the same map but only for vessels present and transiting at 6-10 knots average speed. The scale is logarithmic

Retrieving multiple speed classes

To retrieve data binned in multiple (or all) speed classes, we can loop across the desired speed categories. Let’s make another example in the Mexico EEZ.

id <- gfw_region_id("Mexico") |> 
  filter(POL_TYPE == "200NM") |> 
  dplyr::pull(id)
speed_categories <- c("<2",
                      "2-4",
                      "4-6",
                      "6-10",
                      "10-15",
                      "15-25",
                      ">25")

# we create the filters for each
sp <- paste0("speed = '" , speed_categories, "'")

vp_all_speeds <- purrr::map(sp,
                           ~gfw_ais_presence(
                             spatial_resolution = "LOW",
                             temporal_resolution = "MONTHLY",
                             group_by = "VESSEL_ID",
                             filter_by = .x,
                             start_date = start_date,
                             end_date = end_date,
                             region = id, 
                             region_source = "EEZ"))
# adding a speed column to help aggregate the data
vp_all_speeds <- purrr::map2(vp_all_speeds,
                             speed_categories, 
                             ~mutate(.x, speed = .y)) |>
  bind_rows()

# reorganizing factor levels 
vp_all_speeds <- vp_all_speeds |>
  mutate(speed = as.factor(speed)) |> 
  mutate(speed = forcats::fct_relevel(speed, c("<2",
                                               "2-4",
                                               "4-6",
                                               "6-10",
                                               "10-15",
                                               "15-25",
                                               ">25")))
  
vp_all_speeds |>   
ggplot() +
  geom_tile(aes(x = Lon,
                y = Lat,
                fill = `Vessel Presence Hours`)) +
  geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
  coord_sf(xlim = c(min(vp_all_speeds$Lon),
                    max(vp_all_speeds$Lon)),
           ylim = c(min(vp_all_speeds$Lat),
                    max(vp_all_speeds$Lat))) +
  scale_fill_gradientn(
    transform = 'log10',
    colors = map_speed_light, 
    na.value = NA,
    labels = scales::comma) +
  facet_wrap(~speed) +
  labs(title = "Vessel Presence hours by speed categories",
       subtitle = glue("{start_date} to {end_date}"),
       fill = "Presence hours (log10)") +
  map_theme

Since the request retrieved the data grouped by vessel ID, we have gear types, vessel types and other identity markers that can help filter and refine the visualization.

vp_all_speeds |>   
  filter(`Vessel Type` %in% c("FISHING", "CARGO"),
         speed %in% c("6-10", "10-15")) |> 
  ggplot() +
  geom_tile(aes(x = Lon,
                y = Lat,
                fill = `Vessel Presence Hours`)) +
  geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
  coord_sf(xlim = c(min(vp_all_speeds$Lon),
                    max(vp_all_speeds$Lon)),
           ylim = c(min(vp_all_speeds$Lat),
                    max(vp_all_speeds$Lat))) +
  scale_fill_gradientn(
    transform = 'log10',
    colors = map_speed_light, 
    na.value = NA,
    labels = scales::comma) +
  facet_grid(speed~`Vessel Type`) +
  labs(title = "Vessel Presence hours in Mexico's EEZ",
       subtitle = paste("For 6-10 knots and 10-15 knots and cargo vs. fishing vessel types"),
       fill = "Presence hours (log10)")  + 
  map_theme

Bivariate plots of speed and vessel presence hours

Vessel presence and speed may be better visualized using a bivariate plot, to distinguish areas with different levels of vessel density from areas where vessels transit at high or low speeds.

vp_all_speeds <- vp_all_speeds |> 
  mutate(speed_nb = case_when(
    speed == "<2" ~ 2, 
    speed == "2-4" ~ 4, 
    speed == "4-6" ~ 6, 
    speed == "6-10" ~ 10, 
    speed == "10-15" ~ 15, 
    speed == "15-25" ~ 25, 
    speed == ">25" ~ 50))

biscale_speeds <- vp_all_speeds |> 
  group_by(Lat, Lon, speed_nb) |> 
  summarize(vessel_presence_hours = sum(`Vessel Presence Hours`)) |>
  biscale::bi_class(x = vessel_presence_hours, 
                    y = speed_nb, style = "quantile")

speed_map <- biscale_speeds |> 
ggplot() +
  geom_tile(aes(x = Lon,
                y = Lat,
                fill = bi_class)) +
  geom_sf(data = ne_countries(returnclass = "sf", scale = "medium")) +
  coord_sf(xlim = c(min(biscale_speeds$Lon), 
                    max(biscale_speeds$Lon)),
           ylim = c(min(biscale_speeds$Lat),
                    max(biscale_speeds$Lat))) +
  bi_scale_fill(pal = "DkBlue2") + 
  theme_minimal() +
  theme(legend.position = "none")
  p_legend <- bi_legend(pal = "DkBlue2",
                        xlab = "Presence hours",
                        ylab = "Speed",
                        dim = 3,
                        size = 12)
p_combo <- (speed_map + p_legend) 
p_combo