Advances in protected species assessment increasingly rely on diverse data sources and machine learning to improve species distribution predictions and capture complex spatiotemporal dynamics. The dynamic oceanography of the Gulf of America (hereafter, Gulf) supports multiple cetacean species, including data-deficient and endangered species. However, designing long-term monitoring programs and reducing species' vulnerability to anthropogenic stressors in this heavily industrialized ecosystem requires a better understanding of how ocean dynamics shape animal density and distribution.
We derived daily species density estimates from a 21-station moored passive acoustic recorder array (2020–2024) that combined long-term monitoring sites with annually rotating stations deployed across diverse oceanic Gulf regions deeper than 250 m. We examined how densities of deep-diving cetaceans relate to surface and deep oceanographic features, including temperature, salinity, chlorophyll-a, upwelling, eddy dynamics, and the influence of freshwater and the tropical Loop Current. We applied Boosted Regression Trees, a machine learning method, to model and predict Gulf-wide distributions of these deep divers, testing multiple cross-validation strategies to reduce overfitting and improve generalization.
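One cross-validation strategy often used with moored-array data is to hold out whole stations, so that a model is always evaluated at sites it has not seen. The sketch below illustrates that idea only. The abstract does not state which strategies were tested, so the station-blocked design, the function name `station_blocked_folds`, the station labels, and the record counts are all assumptions made for illustration.

```python
import random


def station_blocked_folds(station_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which every station's
    records fall entirely in one fold (hypothetical helper)."""
    stations = sorted(set(station_ids))
    rng = random.Random(seed)
    rng.shuffle(stations)
    # Assign shuffled stations to folds round-robin.
    fold_of = {s: i % n_folds for i, s in enumerate(stations)}
    for k in range(n_folds):
        test = [i for i, s in enumerate(station_ids) if fold_of[s] == k]
        train = [i for i, s in enumerate(station_ids) if fold_of[s] != k]
        yield train, test


# Assumed toy layout: 21 stations with 10 daily records each.
station_ids = [f"S{n:02d}" for n in range(21) for _ in range(10)]
folds = list(station_blocked_folds(station_ids, n_folds=5))

for train, test in folds:
    held_out = {station_ids[i] for i in test}
    trained_on = {station_ids[i] for i in train}
    # No station contributes records to both sides of a split.
    assert held_out.isdisjoint(trained_on)
```

Each pair of index lists could then be passed to any boosted-tree fitting routine. Compared with a random split of daily records, blocking by station tends to give a less optimistic error estimate when records from the same site are strongly autocorrelated.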
Predictions show that goose-beaked whales (Ziphius cavirostris) are associated with deep eddies near steep slopes, Gervais’ beaked whales (Mesoplodon europaeus) follow surface and midwater eddies, and sperm whales (Physeter macrocephalus) frequent freshwater-influenced regions and avoid Loop Current waters. These findings highlight how surface and subsurface processes interact with topography to influence cetacean occurrence across the Gulf. Next steps are to improve predictions by incorporating visual survey data and to scale outputs to relative abundance estimates that can enhance stock assessments.