8  Vessel info tables

Building on the vessel database, the vessel info tables provide summary information by MMSI that combines AIS activity (positions, hours, fishing hours, etc.), registry info (when available), and neural net outputs (vessel class, length, tonnage, etc.).

A key purpose of the vessel info tables is to evaluate these multiple and potentially conflicting sources of information and determine the “best” values for various fields (e.g. vessel class, flag, dimensions) to be used by GFW. The vessel info tables are also used to identify active fishing vessels, to flag vessels that are spoofing or offsetting their position, and to quickly summarize fleet activity.

8.1 Key Tables

  • gfw_research.vi_ssvid_vYYYYMMDD - This table includes the best activity and identity information available for the vessel based on its full AIS timeseries.
  • gfw_research.vi_ssvid_byyear_vYYYYMMDD - This table includes annual best activity and identity information for the vessel based only on data in each year. As a result, the information for a given year in gfw_research.vi_ssvid_byyear_vYYYYMMDD may not match that in gfw_research.vi_ssvid_vYYYYMMDD (e.g. if the vessel recently re-flagged).
  • gfw_research.fishing_vessels_ssvid_vYYYYMMDD - This table includes the GFW yearly list of active fishing vessels that are not spoofing or offsetting. It is GFW’s most restrictive list of fishing vessels and the default list to use in research/analysis.

8.2 Data Description

8.2.1 vi_ssvid and vi_ssvid_byyear tables

The information in the main vessel info tables (vi_ssvid_vYYYYMMDD; vi_ssvid_byyear_vYYYYMMDD) is organized into multiple STRUCTs that summarize the MMSI’s activity (activity), AIS identity (ais_identity), inferred characteristics (inferred), registry information (registry_info), and the best information to be used by GFW (best).

  • activity: Fields summarizing the amount and location of the MMSI’s AIS activity - positions, hours, fishing_hours, spoofing, etc.
    • eez: Summary of the MMSI’s activity (hours, fishing_hours) by EEZ. Useful for quick, non-spatial summaries of activity by EEZ.

    • offsetting: Whether the MMSI is offsetting its position.

    • overlap_hours_multinames: Spoofing flag. Indicates how many hours the MMSI has overlapping segments with multiple identities. MMSI with overlap_hours_multinames > 24 in a given year are considered to be spoofed.

  • ais_identity: Fields summarizing the identities (ship name, callsign, IMO, flag, etc.) broadcast by that MMSI in AIS messages. Each identity type includes a mostcommon field providing the most-common value for that identity type.
    • Fields prefixed with n_ are normalized versions of the original values.

    • likely_gear: Whether the MMSI’s most common shipname (n_shipname_mostcommon) suggests the MMSI is attached to fishing gear (e.g. BUOY16).

  • inferred: Fields summarizing vessel characteristics (length, tonnage, geartype) inferred by the neural net for the MMSI.
    • GFW uses a nested vessel class hierarchy, and the neural net scores only “leaf” classes, which are recorded in inferred.inferred_vessel_class. However, the final inferred vessel class assigned to an MMSI is recorded in inferred.inferred_vessel_class_ag and corresponds to the lowest-level vessel class with a cumulative neural net score > 0.5.

  • registry_info: The identity information associated with the MMSI sourced from vessel registries
    • Values correspond to the feature STRUCT in the vessel database.

  • best: GFW’s assigned “best” vessel characteristics after considering all info from AIS, the neural net, and vessel registries
    • registry_net_disagreement: Whether the neural net (inferred.inferred_vessel_class_ag) and the vessel registries (registry_info.best_known_vessel_class) disagree about the vessel class of the MMSI.
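
To illustrate how inferred.inferred_vessel_class_ag could follow from the leaf scores, here is a minimal Python sketch. The hierarchy, the class names in it, and the function names are hypothetical and chosen only for illustration; this is not GFW's actual class tree or pipeline code. The sketch sums leaf scores up the tree and walks down from the root for as long as some child has a cumulative score above 0.5.

```python
# Toy class hierarchy (hypothetical; not GFW's actual vessel class tree).
HIERARCHY = {
    "vessel": ["fishing", "non_fishing"],
    "fishing": ["trawlers", "fixed_gear"],
    "fixed_gear": ["set_longlines", "set_gillnets"],
    "non_fishing": ["cargo", "tug"],
}


def cumulative_score(node, leaf_scores):
    """Sum the neural net scores of all leaf classes at or below `node`."""
    children = HIERARCHY.get(node)
    if not children:
        return leaf_scores.get(node, 0.0)
    return sum(cumulative_score(child, leaf_scores) for child in children)


def aggregate_class(leaf_scores, root="vessel", threshold=0.5):
    """Return the lowest-level class whose cumulative score exceeds `threshold`."""
    node = root
    while True:
        passing = [
            child
            for child in HIERARCHY.get(node, [])
            if cumulative_score(child, leaf_scores) > threshold
        ]
        if not passing:
            return node
        # Leaf scores are assumed to sum to at most 1, so at most one child passes.
        node = passing[0]


scores = {
    "trawlers": 0.10,
    "set_longlines": 0.30,
    "set_gillnets": 0.35,
    "cargo": 0.20,
    "tug": 0.05,
}
print(aggregate_class(scores))
```

In this example no single leaf class exceeds 0.5, so the walk stops at the intermediate class fixed_gear, whose two leaves together do.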

8.2.1.1 on_fishing_list_ fields

GFW assigns MMSI to four types of fishing lists in the vessel info tables. Three lists indicate whether the MMSI is potentially a fishing vessel according to different sources. The fourth list, on_fishing_list_best, combines the results of the previous three lists to make a final determination of whether each MMSI is likely a fishing vessel.

  • on_fishing_list_known - Listed as a fishing vessel on 1+ registries

  • on_fishing_list_sr - Consistently self-reports as fishing in AIS messages (>98% of messages, with a minimum of 50 positions)

  • on_fishing_list_nn - The neural net believes the vessel is more likely a fishing vessel than a non-fishing vessel (i.e. the combined neural net score for fishing classes, inferred.fishing_class_score, is > 0.5)

  • on_fishing_list_best - GFW’s best list of likely fishing MMSI. True in the following cases:

    • The vessel is on_fishing_list_known
    • The vessel is on_fishing_list_sr and on_fishing_list_nn
    • The vessel’s highest inferred.inferred_vessel_class_score is for a fishing vessel class and exceeds 0.85
    • The vessel is on_fishing_list_nn and on_fishing_list_known is not False
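
The four cases above can be written as a single boolean expression. The following Python sketch is only an illustration of that logic, not GFW's implementation. It assumes that on_fishing_list_known may be missing (None) when no registry information exists, which is one reading of “is not False”; the function and parameter names are hypothetical.

```python
from typing import Optional


def on_fishing_list_best(
    known: Optional[bool],       # on_fishing_list_known; None = no registry info (assumption)
    sr: bool,                    # on_fishing_list_sr
    nn: bool,                    # on_fishing_list_nn
    top_class_is_fishing: bool,  # highest-scoring inferred class is a fishing class
    top_class_score: float,      # inferred.inferred_vessel_class_score
) -> bool:
    """Combine the three source lists into the 'best' fishing list."""
    return bool(
        known is True
        or (sr and nn)
        or (top_class_is_fishing and top_class_score > 0.85)
        or (nn and known is not False)
    )


# Registry says non-fishing and the neural net is only mildly confident: excluded.
print(on_fishing_list_best(False, False, True, True, 0.6))
# No registry information, neural net says fishing: included.
print(on_fishing_list_best(None, False, True, True, 0.6))
```

Note that under these rules a registry listing as non-fishing can still be overridden, either by agreement between self-reporting and the neural net or by a neural net score above 0.85.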

8.2.2 fishing_vessels_ssvid table

The on_fishing_list_best field in the vessel info tables indicates whether GFW believes an MMSI is likely a fishing vessel. However, many of these MMSI are spoofing, offsetting, likely gear, or largely inactive, and including them in analyses can introduce substantial noise. For this reason, GFW produces the fishing_vessels_ssvid table, which takes the MMSI listed as on_fishing_list_best and applies a set of filters to fields in the activity struct of the vi_ssvid_byyear table to create a yearly list of likely fishing vessels.
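
As a rough illustration of this kind of filtering, the Python sketch below applies the noise sources named above to one yearly record. The exact filters and thresholds used to build fishing_vessels_ssvid are not given here, so the flat record layout, the function name, and the min_fishing_hours value are all hypothetical placeholders; only the 24-hour spoofing cut-off comes from the description of overlap_hours_multinames.

```python
def keep_for_year(record, min_fishing_hours=24.0):
    """Decide whether one MMSI-year record stays on the yearly fishing list.

    `record` is a flat dict standing in for a row of vi_ssvid_byyear
    (hypothetical layout). `min_fishing_hours` is a placeholder activity
    threshold, not a documented GFW value.
    """
    return bool(
        record["on_fishing_list_best"]
        and record["overlap_hours_multinames"] <= 24  # > 24 hours is treated as spoofing
        and not record["offsetting"]
        and not record["likely_gear"]
        and record["fishing_hours"] >= min_fishing_hours  # hypothetical inactivity filter
    )


rows = [
    {"ssvid": "A", "on_fishing_list_best": True, "overlap_hours_multinames": 0,
     "offsetting": False, "likely_gear": False, "fishing_hours": 500.0},
    {"ssvid": "B", "on_fishing_list_best": True, "overlap_hours_multinames": 90,
     "offsetting": False, "likely_gear": False, "fishing_hours": 800.0},
    {"ssvid": "C", "on_fishing_list_best": True, "overlap_hours_multinames": 0,
     "offsetting": False, "likely_gear": True, "fishing_hours": 300.0},
]
kept = [row["ssvid"] for row in rows if keep_for_year(row)]
print(kept)
```

Here vessel B is dropped as spoofing and vessel C as likely gear, leaving only vessel A.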

8.3 Caveats & Known Issues

8.3.1 Fishing hours incorrect for squid jiggers

Currently, the vessel info table summarizes fishing_hours from the pipe_v20201001_segs_daily table. Fishing hours in this table are calculated solely from nnet_score and are therefore incorrect for squid_jiggers, whose fishing hours should be calculated using night_loitering.
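
One way to work around this when recomputing fishing hours from position-level data is to choose the score field by vessel class. The Python sketch below illustrates the idea only: the per-position record layout, the 0.5 threshold, and the class label "squid_jigger" are assumptions made for the example and should be checked against the actual tables.

```python
def fishing_hours(positions, vessel_class, threshold=0.5):
    """Sum hours for positions flagged as fishing.

    Uses night_loitering for squid jiggers and nnet_score for every other
    class. The field names mirror those mentioned in the text; the threshold
    and the class label are assumptions.
    """
    score_field = "night_loitering" if vessel_class == "squid_jigger" else "nnet_score"
    return sum(
        p["hours"]
        for p in positions
        if (p.get(score_field) or 0.0) > threshold  # treat missing scores as 0
    )


positions = [
    {"hours": 1.0, "nnet_score": 0.9, "night_loitering": 0.0},
    {"hours": 2.0, "nnet_score": 0.1, "night_loitering": 1.0},
]
print(fishing_hours(positions, "trawlers"))
print(fishing_hours(positions, "squid_jigger"))
```

The same positions yield different totals depending on the class, which is why a summary based only on nnet_score misstates squid jigger effort.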

8.4 Example Queries

8.6 Updates