get_vessel_info(): the basics of vessel identity in gfwr
This vignette explains the use of get_vessel_info() as a key function to understand vessel identity and to use all the other Global Fishing Watch API endpoints.
We discuss basic identity markers, why Vessel ID was created, the structure of the get_vessel_info() response and how to get vesselId for use in the functions get_event() and get_event_stats().
The Automatic Identification System (AIS) is an automatic tracking system originally developed to help prevent collisions between vessels at sea. Vessels broadcast AIS to alert other vessels of their presence, but terrestrial and satellite receivers can also receive these messages and monitor vessel movements. AIS is at the core of Global Fishing Watch analysis pipelines, including the AIS-based fishing effort calculation displayed on our map and available through function get_raster() in gfwr.
AIS messages include identity information about the vessel, such as ship name, call sign, and International Maritime Organization (IMO) number, as well as an identifier known as the Maritime Mobile Service Identity (MMSI).
MMSIs are nine-digit numbers broadcast in AIS messages. They are supposed to be unique for each vessel, but (1) a vessel can change MMSI during its lifecycle, for example when it is reflagged, because the first three digits refer to the flag country; and (2) several vessels can broadcast the same MMSI at the same time. This happens for many reasons, including the fact that data entry in AIS messages is manual.
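The link between MMSI and flag can be illustrated with a minimal sketch. It assumes MMSIs are stored as character strings; the helper mmsi_prefix() is hypothetical and not part of gfwr. It only compares the first three digits of successive MMSIs, it does not look up the country they refer to.

```r
# Hypothetical helper (not part of gfwr): first three digits of an MMSI,
# which refer to the flag country
mmsi_prefix <- function(mmsi) {
  substr(as.character(mmsi), 1, 3)
}

# Two MMSIs used successively by the same vessel
mmsis <- c("224224000", "306118000")
prefixes <- mmsi_prefix(mmsis)
prefixes

# A change in prefix from one MMSI to the next suggests a reflagging
prefix_changed <- c(FALSE, prefixes[-1] != prefixes[-length(prefixes)])
prefix_changed
```

A changed prefix is only a hint: it should be confirmed with the other identity markers.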
Shipname and callsign can also be transmitted in AIS messages, but they are optional: not every AIS-broadcasting vessel transmits them, and their transmission can be inconsistent. Shipnames can also vary a lot in their spelling, requiring some fuzzy matching to detect if they refer to the same vessel.
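As a rough illustration of what fuzzy matching on shipnames can look like, here is a minimal base R sketch. The normalisation rule (upper-case, keep only letters and digits) and the spellings are assumptions made for this example, not the procedure used by Global Fishing Watch.

```r
# Hypothetical normalisation: upper-case and keep only letters and digits.
# This is an assumption for illustration, not the rule used by Global Fishing Watch.
normalize_shipname <- function(x) {
  gsub("[^A-Z0-9]", "", toupper(x))
}

# Made-up spellings of the same name
spellings <- c("Hao Yang 77", "HAO YANG77", "HAOYANG 7")
normalized <- normalize_shipname(spellings)
normalized

# Edit distance between each normalised spelling and a reference name;
# utils::adist() computes the generalized Levenshtein distance
distances <- drop(utils::adist(normalized, "HAOYANG77"))
distances

# Treat spellings within one edit of the reference as candidate matches
candidates <- spellings[distances <= 1]
candidates
```

The distance threshold is arbitrary here; a stricter or looser value changes which spellings are treated as the same name.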
IMO numbers are permanent unique identifiers that follow a vessel from construction to scrapping. Assigned by the International Maritime Organization, IMO numbers are required for only a subset of industrial fishing vessels. IMO numbers can be transmitted along with MMSI in AIS messages but they are frequently missing.
These identity markers are often the starting point of any enquiry around vessel identity. However, due to their characteristics, none of these identifiers should be interpreted as the sole identity source for a vessel. Global Fishing Watch does extensive work to analyze and gather all the information available for a given vessel into cohesive sets.
Note: MMSI is referred to as ssvid in our tables. ssvid stands for “source-specific vessel identity”. In this case, the source is AIS, and ssvid = MMSI.
Function get_vessel_info() allows a user to run a basic query using MMSI, callsign, shipname or IMO number, but it also allows for complex searches, using a combination of these to retrieve the vessel of interest more accurately:
Do a simple search using "query" and search_type = "search" (the default, so it can be omitted from the function call):
get_vessel_info(query = 224224000, search_type = "search")
Do a complex search or fuzzy matching using "where" and search_type = "search":
get_vessel_info(where = "imo = '8300949'")
get_vessel_info(where = "imo = '8300949' AND ssvid = '214182732'")
get_vessel_info(where = "shipname LIKE '%GABU REEFE%' OR imo = '8300949'")
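When several identity markers are combined, the string passed to "where" can be assembled programmatically. The sketch below uses a hypothetical helper, build_where(), which is not part of gfwr; it only handles equality clauses joined by a single operator and does no escaping of the values.

```r
# Hypothetical helper (not part of gfwr): build a `where` string from
# named identity markers, joined with AND or OR
build_where <- function(markers, operator = c("AND", "OR")) {
  operator <- match.arg(operator)
  # Each marker becomes: field = 'value'
  clauses <- sprintf("%s = '%s'", names(markers), unlist(markers))
  paste(clauses, collapse = paste0(" ", operator, " "))
}

where_clause <- build_where(list(imo = "8300949", ssvid = "214182732"))
where_clause
# [1] "imo = '8300949' AND ssvid = '214182732'"

# The string could then be passed on, e.g. get_vessel_info(where = where_clause)
```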
Importantly, the response will return all the information it has for the vessel that matches the combination of identity markers requested, not only the ones requested.
The function doesn’t “filter” the results literally because vessels can have different MMSIs, IMO numbers, shipnames (or spellings of them) and callsigns over time, and these are all part of the identity history of that same vessel. Internally, Global Fishing Watch reconstructs the vessel identity history by matching a combination of these fields.
One of the key elements of this response is the vesselId, used by functions get_event() and get_event_stats().
vesselId
To solve the complexity of having several vessel identifiers that can be duplicated or missing for each vessel and that can change over time, Global Fishing Watch developed vesselId, a unified vessel identity variable that combines vessel information and applies to a specific time interval.
A vesselId is formed by a combination of the MMSI and the IMO number when available, or by the MMSI, callsign and shipname transmitted in AIS messages. Each vesselId is thus associated with a single MMSI over a specific period of time, and refers to a single vessel.
A single vessel can have several vesselId values over time, and this is why simple calls to get_vessel_info() can return tables that have many rows and different identity markers over time.
Let’s go back to the simple search. To get information on a vessel using its MMSI, IMO number, callsign or name, the search can be done directly using the number or the string. For example, to look for a vessel with MMSI = 224224000, the number is enough:
mmsi_search <- get_vessel_info(query = 224224000,
                               search_type = "search",
                               key = gfw_auth())
The response from the API is a list with seven elements:
names(mmsi_search)
# [1] "dataset" "registryInfoTotalRecords"
# [3] "registryInfo" "registryOwners"
# [5] "registryPublicAuthorizations" "combinedSourcesInfo"
# [7] "selfReportedInfo"
The content of the original AIS message transmitted by the vessel appears in $selfReportedInfo:
mmsi_search$selfReportedInfo
# # A tibble: 2 × 13
# vesselId ssvid shipname nShipname flag callsign imo messagesCounter
# <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
# 1 6632c9eb8-8009-… 3061… AGURTZA… AGURTZAB… BES PJBL 8733… 21772378
# 2 3c99c326d-dd2e-… 2242… AGURTZA… AGURTZAB… ESP EBSJ 8733… 1887249
# # ℹ 5 more variables: positionsCounter <int>, sourceCode <list>,
# # matchFields <chr>, transmissionDateFrom <chr>, transmissionDateTo <chr>
As you can see, this vessel returns a data frame with two rows. One corresponds to our original search, where ssvid (MMSI) equals 224224000.
The other row has a different ssvid, but the same name and IMO number. The two rows correspond to the same vessel, and as you can see from the fields transmissionDateFrom, transmissionDateTo, flag, and ssvid, the vessel operated with a Spanish flag (ESP) and one ssvid between 2015 and 2019, then it was reflagged and operated with a BES flag (Bonaire, Sint Eustatius and Saba) between 2019 and 2023. The change in ssvid reflects the reflagging operation because the first three digits of MMSI correspond to the flag country.
Variable matchFields reports that the matching was done using "SEVERAL_FIELDS".
library(dplyr)
library(tidyr)
mmsi_search$selfReportedInfo %>%
  select(vesselId, ssvid, flag, contains("Date"))
# # A tibble: 2 × 5
# vesselId ssvid flag transmissionDateFrom transmissionDateTo
# <chr> <chr> <chr> <chr> <chr>
# 1 6632c9eb8-8009-abdb-baf9-… 3061… BES 2019-10-15T12:16:54Z 2023-11-30T18:22:…
# 2 3c99c326d-dd2e-175d-626f-… 2242… ESP 2015-10-13T15:47:16Z 2019-10-15T12:10:…
This is a simple case, in which the successive vesselId values do not overlap in time and most identifiers match, in spite of some changes.
For some vessels, the intervals defined by transmissionDateFrom and transmissionDateTo can overlap and other fields can differ.
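One way to check whether the intervals of successive vesselId values overlap is sketched below in base R. The table is a toy example with made-up identifiers and dates that only mimics the relevant columns of $selfReportedInfo; the column overlapsEarlier is a name chosen for this example.

```r
# Toy table with made-up identifiers and dates, mimicking the columns of
# $selfReportedInfo that matter here
identities <- data.frame(
  vesselId = c("id-c", "id-a", "id-b"),
  transmissionDateFrom = c("2021-01-01T00:00:00Z", "2015-01-01T00:00:00Z",
                           "2018-06-01T00:00:00Z"),
  transmissionDateTo = c("2023-01-01T00:00:00Z", "2018-12-31T23:59:59Z",
                         "2020-12-31T23:59:59Z"),
  stringsAsFactors = FALSE
)

# Parse the timestamps as UTC date-times
parse_utc <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

# Sort by start date
identities <- identities[order(parse_utc(identities$transmissionDateFrom)), ]
from <- as.numeric(parse_utc(identities$transmissionDateFrom))
to <- as.numeric(parse_utc(identities$transmissionDateTo))

# A row overlaps an earlier one when it starts before the latest end date
# seen so far
latest_end_so_far <- cummax(to)[-length(to)]
identities$overlapsEarlier <- c(FALSE, from[-1] < latest_end_so_far)

identities[, c("vesselId", "overlapsEarlier")]
```

In this toy table only "id-b" is flagged, because it starts before "id-a" ends.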
vesselId in other functions
vesselId can be extracted from $selfReportedInfo$vesselId, but it is highly recommended to take a look at the response and confirm which of the returned vesselId values should be selected.
Before picking a vesselId to use in other functions, it is useful to examine:
- which vesselId corresponds to the time interval of interest
- whether the different vesselId values refer to the vessel of interest
- messagesCounter. Sometimes very few positions are transmitted for a short time interval and that vesselId can be treated as an exception
You can use the selected vesselId to get any events related to the vessel of interest in other functions.
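The checks listed above can be sketched in base R on a toy table. All values below are made up, and both the period of interest and the minimum number of messages are arbitrary choices for illustration, not recommended thresholds.

```r
# Toy table with made-up values, mimicking columns of $selfReportedInfo
identities <- data.frame(
  vesselId = c("id-a", "id-b", "id-c"),
  messagesCounter = c(1500000L, 19L, 20000000L),
  transmissionDateFrom = c("2015-10-13T00:00:00Z", "2019-10-01T00:00:00Z",
                           "2019-10-15T00:00:00Z"),
  transmissionDateTo = c("2019-10-15T00:00:00Z", "2019-10-02T00:00:00Z",
                         "2023-11-30T00:00:00Z"),
  stringsAsFactors = FALSE
)

# Parse the timestamps as UTC date-times
parse_utc <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

# Period of interest and minimum number of messages (both arbitrary)
period_start <- parse_utc("2019-01-01T00:00:00Z")
period_end <- parse_utc("2019-12-31T23:59:59Z")
min_messages <- 1000L

from <- parse_utc(identities$transmissionDateFrom)
to <- parse_utc(identities$transmissionDateTo)

# Keep identities whose transmission interval intersects the period of interest
in_period <- from <= period_end & to >= period_start
# Set aside identities backed by very few messages
enough_messages <- identities$messagesCounter >= min_messages

selected_ids <- identities$vesselId[in_period & enough_messages]
selected_ids
```

Here "id-b" falls within the period but is set aside because it has very few messages. Whether such a row is a genuine exception still needs to be judged by looking at the response.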
In our example, Global Fishing Watch analyses report the vessel had two encounters:
id <- mmsi_search$selfReportedInfo$vesselId
get_event(event_type = "ENCOUNTER", vessels = id)
# [1] "Downloading 2 events from GFW"
# # A tibble: 2 × 11
# start end id type lat lon regions
# <dttm> <dttm> <chr> <chr> <dbl> <dbl> <list>
# 1 2020-09-14 08:30:00 2020-09-14 11:50:00 514d33… enco… 8.01 -20.8 <named list>
# 2 2021-04-19 09:40:00 2021-04-19 12:50:00 6ffcbd… enco… 3.08 -12.2 <named list>
# # ℹ 4 more variables: boundingBox <list>, distances <list>, vessel <list>,
# # event_info <list>
Vessel registries carry important vessel identity information, like vessel characteristics, history of registration, licenses to fish in certain areas, and vessel ownership data. Global Fishing Watch compiles vessel information from over 40 public vessel registries and matches this information with the AIS-transmitted identity fields to provide a better overview of a vessel’s identity.
This information is requested by parameter "includes" and returned in the following elements:
$registryInfoTotalRecords: the number of records in registries
mmsi_search$registryInfoTotalRecords
# # A tibble: 1 × 1
# registryInfoTotalRecords
# <int>
# 1 1
$registryInfo: the actual information in the registry, including identity, vessel characteristics, dates of transmission and whether the record corresponds to the latest vessel information available
mmsi_search$registryInfo
# # A tibble: 1 × 15
# id sourceCode ssvid flag shipname nShipname callsign imo
# <chr> <list> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 e0c9823749264a129d6b… <chr [6]> 2242… ESP AGURTZA… AGURTZAB… EBSJ 8733…
# # ℹ 7 more variables: latestVesselInfo <lgl>, transmissionDateFrom <chr>,
# # transmissionDateTo <chr>, geartypes <list>, lengthM <dbl>, tonnageGt <int>,
# # vesselInfoReference <chr>
$registryOwners: the owners, with their name, country of origin, ssvid of the vessel and dates of ownership. Sometimes the vessel can change identities and flags but its owners remain the same, and sometimes changes in identity correspond to changes in ownership.
mmsi_search$registryOwners
# # A tibble: 2 × 6
# name flag ssvid sourceCode dateFrom dateTo
# <chr> <chr> <chr> <list> <chr> <chr>
# 1 JEALSA RIANXEIRA ESP 306118000 <chr [1]> 2019-10-15T12:47:53Z 2023-09-15T1…
# 2 JEALSA RIANXEIRA ESP 224224000 <chr [1]> 2015-10-13T16:06:33Z 2019-10-15T0…
$registryPublicAuthorizations: the public authorizations of the vessel and the respective organizations
mmsi_search$registryPublicAuthorizations %>%
  tidyr::unnest(sourceCode)
# # A tibble: 4 × 4
# dateFrom dateTo ssvid sourceCode
# <chr> <chr> <chr> <chr>
# 1 2019-10-15T00:00:00Z 2023-02-01T00:00:00Z 306118000 ICCAT
# 2 2018-01-09T00:00:00Z 2019-10-24T00:00:00Z 224224000 ICCAT
# 3 2012-01-01T00:00:00Z 2019-01-01T00:00:00Z 224224000 IOTC
# 4 2014-03-11T00:00:00Z 2016-07-28T00:00:00Z 224224000 ICCAT
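A table like this one can be used to see which authorizations were in force on a given date. The sketch below rebuilds the four rows printed above as a plain data frame so that it runs without an API call; the date of interest is an arbitrary choice for illustration.

```r
# Values copied from the printed output above
authorizations <- data.frame(
  dateFrom = c("2019-10-15T00:00:00Z", "2018-01-09T00:00:00Z",
               "2012-01-01T00:00:00Z", "2014-03-11T00:00:00Z"),
  dateTo = c("2023-02-01T00:00:00Z", "2019-10-24T00:00:00Z",
             "2019-01-01T00:00:00Z", "2016-07-28T00:00:00Z"),
  ssvid = c("306118000", "224224000", "224224000", "224224000"),
  sourceCode = c("ICCAT", "ICCAT", "IOTC", "ICCAT"),
  stringsAsFactors = FALSE
)

# Only the date part of each timestamp is needed here
from <- as.Date(substr(authorizations$dateFrom, 1, 10))
to <- as.Date(substr(authorizations$dateTo, 1, 10))

# Arbitrary date of interest
date_of_interest <- as.Date("2018-06-01")

# Keep the authorizations whose interval contains that date
active <- authorizations[from <= date_of_interest & date_of_interest <= to, ]
active[, c("ssvid", "sourceCode")]
```

For this date, the two rows kept are the ICCAT and IOTC records associated with ssvid 224224000.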
In the best of cases, AIS messages match registry information and the whole identity of the vessel can be reconstructed. Here are two examples with one registry match and AIS data not overlapping in time.
This vessel has a single identity throughout its entire history:
one_AIS <- get_vessel_info(where= "ssvid='701024000'")
#see AIS-based identities
one_AIS$selfReportedInfo
# # A tibble: 1 × 13
# vesselId ssvid shipname nShipname flag callsign imo messagesCounter
# <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
# 1 8e930bac5-594b-… 7010… ATLANTI… ATLANTIC… ARG LW3233 8615… 985759069
# # ℹ 5 more variables: positionsCounter <int>, sourceCode <list>,
# # matchFields <chr>, transmissionDateFrom <chr>, transmissionDateTo <chr>
#check registry info:
one_AIS$registryInfo %>% relocate(transmissionDateFrom, transmissionDateTo) #changing column order for visualization
# # A tibble: 1 × 15
# transmissionDateFrom transmissionDateTo id sourceCode ssvid flag shipname
# <chr> <chr> <chr> <list> <chr> <chr> <chr>
# 1 2014-01-01T03:28:33Z 2024-05-31T23:56:3… 4550… <chr [1]> 7010… ARG ATLANTI…
# # ℹ 8 more variables: nShipname <chr>, callsign <chr>, imo <chr>,
# # latestVesselInfo <lgl>, geartypes <list>, lengthM <dbl>, tonnageGt <int>,
# # vesselInfoReference <chr>
This other vessel has had three identities based on AIS, but these are easy to reconstruct:
three_AIS <- get_vessel_info(where= "ssvid='412217989'") #one registry, three AIS
# see AIS-based identities:
three_AIS$selfReportedInfo %>% relocate(transmissionDateFrom, transmissionDateTo)
# # A tibble: 3 × 13
# transmissionDateFrom transmissionDateTo vesselId ssvid shipname nShipname
# <chr> <chr> <chr> <chr> <chr> <chr>
# 1 2021-11-29T06:20:07Z 2024-07-20T23:59:49Z b373b6306-… 4122… HAO YAN… HAOYANG77
# 2 2014-03-29T00:32:05Z 2021-11-27T20:13:55Z 305097c65-… 4122… JIN LIA… JINLIAOY…
# 3 2012-08-01T01:47:23Z 2014-03-29T00:14:05Z 95a173191-… 4316… HAKKO M… HAKKOMAR…
# # ℹ 7 more variables: flag <chr>, callsign <chr>, imo <chr>,
# # messagesCounter <int>, positionsCounter <int>, sourceCode <list>,
# # matchFields <chr>
#check registry info:
three_AIS$registryInfo
# # A tibble: 1 × 15
# id sourceCode ssvid flag shipname nShipname callsign imo
# <chr> <list> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 bdd48f4144f4fdd034ce… <chr [5]> 4122… CHN HAOYANG… HAOYANG77 BAWB 9038…
# # ℹ 7 more variables: latestVesselInfo <lgl>, transmissionDateFrom <chr>,
# # transmissionDateTo <chr>, geartypes <list>, lengthM <dbl>, tonnageGt <dbl>,
# # vesselInfoReference <chr>
However, sometimes a vessel found in AIS has no registry information and the registry fields come back empty.
get_vessel_info(where = "ssvid='71000036'")
# $dataset
# # A tibble: 1 × 1
# dataset
# <chr>
# 1 public-global-vessel-identity:v20231026
#
# $registryInfoTotalRecords
# # A tibble: 1 × 1
# registryInfoTotalRecords
# <int>
# 1 0
#
# $registryInfo
# # A tibble: 0 × 1
# # ℹ 1 variable: <list> <list>
#
# $registryOwners
# # A tibble: 0 × 1
# # ℹ 1 variable: <list> <list>
#
# $registryPublicAuthorizations
# # A tibble: 0 × 1
# # ℹ 1 variable: <list> <list>
#
# $combinedSourcesInfo
# # A tibble: 1 × 9
# vesselId geartypes_geartype_n…¹ geartypes_geartype_s…² geartypes_geartype_y…³
# <chr> <chr> <chr> <int>
# 1 c9cc6776… DRIFTING_LONGLINES COMBINATION_OF_REGIST… 2022
# # ℹ abbreviated names: ¹geartypes_geartype_name, ²geartypes_geartype_source,
# # ³geartypes_geartype_yearFrom
# # ℹ 5 more variables: geartypes_geartype_yearTo <int>,
# # shiptypes_shiptype_name <chr>, shiptypes_shiptype_source <chr>,
# # shiptypes_shiptype_yearFrom <int>, shiptypes_shiptype_yearTo <int>
#
# $selfReportedInfo
# # A tibble: 1 × 13
# vesselId ssvid shipname nShipname flag callsign imo messagesCounter
# <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <int>
# 1 c9cc67761-1a95-… 7100… BOIA 2 BOIA2 BRA BO12345 NA 91004
# # ℹ 5 more variables: positionsCounter <int>, sourceCode <list>,
# # matchFields <chr>, transmissionDateFrom <chr>, transmissionDateTo <chr>
It is also possible that a search returns a vessel with no matching AIS information and no registry information:
noAIS_noReg <- get_vessel_info(where = "imo='44201155'") #no AIS, no registry information
noAIS_noReg$registryInfoTotalRecords
# # A tibble: 1 × 1
# registryInfoTotalRecords
# <int>
# 1 0
noAIS_noReg$selfReportedInfo
# # A tibble: 1 × 13
# vesselId ssvid shipname nShipname flag callsign imo messagesCounter
# <chr> <chr> <chr> <chr> <lgl> <chr> <chr> <int>
# 1 ec9a49563-3add-… 5242… . ALK@A… ALKALK NA LNOTE`? 4420… 19
# # ℹ 5 more variables: positionsCounter <int>, sourceCode <list>,
# # matchFields <chr>, transmissionDateFrom <chr>, transmissionDateTo <chr>
Digging into the $selfReportedInfo information, we find the vessel MMSI is malformed (it has only six characters) and points to very few positions in old AIS messages (2013). Variable matchFields reports NO_MATCH.
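Checks like these can be gathered in a small helper that flags rows deserving a closer look. The sketch below is hypothetical: flag_suspicious() is not part of gfwr, the message threshold is arbitrary, and the rows are made-up values that only mimic the columns of $selfReportedInfo.

```r
# Hypothetical helper (not part of gfwr): flag self-reported identities that
# deserve a closer look. The default threshold is arbitrary.
flag_suspicious <- function(info, min_messages = 100) {
  data.frame(
    vesselId = info$vesselId,
    # ssvid that is not made of exactly nine digits
    malformedSsvid = !grepl("^[0-9]{9}$", info$ssvid),
    # very few AIS messages behind this identity
    fewMessages = info$messagesCounter < min_messages,
    # no identity fields could be matched
    noMatch = info$matchFields == "NO_MATCH",
    stringsAsFactors = FALSE
  )
}

# Toy rows with made-up values
toy_info <- data.frame(
  vesselId = c("id-a", "id-b"),
  ssvid = c("224224000", "524215"),
  messagesCounter = c(1500000L, 19L),
  matchFields = c("SEVERAL_FIELDS", "NO_MATCH"),
  stringsAsFactors = FALSE
)

flags <- flag_suspicious(toy_info)
flags
```

A flagged row is not necessarily wrong; the flags only point to identities that should be examined before their vesselId is used elsewhere.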