Robust & Explainable Seafloor AI

Funding Source: DoD DEPSCoR

Budget: $577,825

Time: 08/2023 - 07/2026

PI: Dr. Xi Peng (Machine Learning)

Co-PI: Dr. Arthur Trembanis (Marine Science)

SeafloorAI is the first large AI-ready dataset for seafloor mapping using sonar imagery.

Abstract: Characterizing seafloor morphodynamics is essential for naval applications that rely on extensive geoacoustic and environmental data over broad spatiotemporal scales. Understanding the dynamics and uncertainty among numerous variables presents a significant out-of-distribution challenge: AI/ML models often struggle with unseen distributions, leading to fragile predictions and unreliable explanations. To overcome these challenges, this project will develop new robust and explainable optimization methods using newly curated SeafloorAI datasets. The research outcomes, including an AI-ready multi-site seabed morphodynamic database, innovative trustworthy AI/ML optimization methods, and scalable implementations, will be linked to the Ocean Biogeographic Information System and Seabed 2030 to benefit a broad range of communities and stakeholders.

Publications:

Open-Sourced Data:

  • SeafloorAI Dataset: Includes 696,000 sonar images, 827,000 annotated segmentation masks, 696,000 detailed language descriptions, and approximately 7M question-answer pairs. The dataset is publicly available at [https://sites.google.com/udel.edu/seafloorai/homes]
  • Prediction Rationale Dataset for ImageNet: A new rationale dataset that covers all 1,000 ImageNet categories. For each category, we generate an ontology tree with a maximum height of two. Combining attributes and sub-attributes, the dataset contains over 4,000 unique rationales. [https://github.com/deep-real/DCP/tree/main/Rationale%20Dataset]

Open-Sourced Software:

  • Ensemble Pruning for OoD generalization: A toolkit for ensemble-based robust optimization against distribution shifts. GitHub repo: [https://github.com/deep-real/TEP]
  • Ordinal Ranking of Concept Activation (ORCA): A lightweight, interpretable failure detection toolkit based on concept activation rankings. GitHub repo: [https://github.com/Nyquixt/ORCA]
  • Distributionally Robust Explanations (DRE): A framework to enhance machine learning (ML) model robustness against out-of-distribution data. Source code and pretrained models: [https://github.com/deep-real/DRE]
  • DisEntAngle and Localized (DEAL): An interpretation toolkit that disentangles and localizes fine-grained, concept-level explanations for visual models. GitHub repo: [https://github.com/deep-real/DEAL]