Dataset asset
Tags: Open Source Community, Machine Learning, Model Analysis

BAM dataset

The BAM dataset consists of several subsets (e.g., `obj`, `scene`, `scene_only`, etc.) designed for model comparison and input‑dependency analysis. Each subset provides detailed training and validation splits along with specific usage descriptions.

Source
GitHub
Created
Apr 3, 2019
Updated
Aug 11, 2023
Overview

Dataset description and usage context

BAM - Benchmarking Attribution Methods Dataset Overview

Dataset

Dataset Structure

  • `obj`: Contains 90,000 training images and 10,000 validation images for model comparison; images have object labels.
  • `scene`: Contains 90,000 training images and 10,000 validation images for model comparison and input dependence analysis; images have scene labels.
  • `scene_only`: Contains 90,000 training images and 10,000 validation images for input dependence analysis; includes only scene images with scene labels.
  • `dog_bedroom`: Contains 200 validation images for relative model comparison; images are labeled as bedroom.
  • `bamboo_forest`: Contains 100 validation images for input independence analysis; includes only bamboo-forest scene images.
  • `bamboo_forest_patch`: Contains 100 validation images for input independence analysis; the bamboo-forest images contain a functionally insignificant dog patch.
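For quick sanity checks when downloading or preprocessing, the split sizes above can be captured in a small lookup table. This helper is purely illustrative and not part of the BAM codebase:

```python
# Expected (train, val) image counts per BAM subset, taken from the
# list above. Validation-only subsets have zero training images.
EXPECTED_SPLITS = {
    "obj": (90_000, 10_000),
    "scene": (90_000, 10_000),
    "scene_only": (90_000, 10_000),
    "dog_bedroom": (0, 200),          # validation-only subset
    "bamboo_forest": (0, 100),        # validation-only subset
    "bamboo_forest_patch": (0, 100),  # validation-only subset
}

def expected_counts(subset: str) -> tuple[int, int]:
    """Return the expected (num_train, num_val) pair for a subset name."""
    return EXPECTED_SPLITS[subset]
```

Comparing these counts against the files actually on disk is a cheap way to catch a truncated download before training.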

Dataset Description

  • `val_loc.txt` records the top-left and bottom-right coordinates of objects in the validation set.
  • `val_mask` contains binary masks of objects in the validation set.
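Assuming each `val_loc.txt` entry stores a box as top-left and bottom-right (row, col) pixel coordinates, a binary mask equivalent to a `val_mask` entry can be rasterized as below. The exact on-disk format and coordinate order are assumptions, so treat this as a sketch:

```python
import numpy as np

def mask_from_bbox(height, width, top_left, bottom_right):
    """Rasterize one bounding box into a binary object mask.

    top_left / bottom_right are (row, col) pixel coordinates, with the
    bottom-right corner treated as exclusive (an assumption about the
    val_loc.txt convention).
    """
    y0, x0 = top_left
    y1, x1 = bottom_right
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[y0:y1, x0:x1] = 1
    return mask
```

Such a mask is what attribution metrics need in order to decide which pixels belong to the object region.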

Models

Model Architecture

  • `models/obj`: Model trained on `data/obj`.
  • `models/scene`: Model trained on `data/scene`.
  • `models/scene_only`: Model trained on `data/scene_only`.
  • `models/scene{i}`: Model trained on scenes containing dogs, where `i` denotes different scene classes.

Metrics

Model Contrast Scores (MCS)

Measures attribution differences between object‑label and scene‑label models on images containing both objects and scenes.

Input Dependence Rate (IDR)

Measures the percentage of inputs where adding an object reduces attribution importance in a scene‑label model.

Input Independence Rate (IIR)

Measures the percentage of inputs where a functionally insignificant patch (e.g., a dog) in a scene‑only model does not significantly affect explanations.
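All three metrics reduce to aggregates over per-image comparisons. The sketch below illustrates that structure under simplifying assumptions (each image's attribution importance has already been reduced to a scalar, and the IIR threshold is user-chosen); it is not the repository's implementation, which lives in `bam/metrics.py`:

```python
import numpy as np

def model_contrast_score(obj_attr, scene_attr):
    """MCS sketch: average gap between the object-region attribution
    under the object-label model and under the scene-label model.
    Inputs are per-image scalar attributions (an assumption)."""
    return float(np.mean(np.asarray(obj_attr) - np.asarray(scene_attr)))

def input_dependence_rate(with_obj, without_obj):
    """IDR sketch: percentage of inputs where adding the object lowers
    the attributed importance under the scene-label model."""
    flags = np.asarray(with_obj) < np.asarray(without_obj)
    return 100.0 * float(np.mean(flags))

def input_independence_rate(attr_deltas, threshold):
    """IIR sketch: percentage of inputs where the functionally
    insignificant patch changes the explanation by less than a
    chosen threshold."""
    flags = np.asarray(attr_deltas) < threshold
    return 100.0 * float(np.mean(flags))
```

A higher MCS, IDR, and IIR each indicate an attribution method that better tracks what the model actually relies on.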

Evaluation Methods

Significance Evaluation Methods

  • Model Contrast Score (MCS): Compute by running `python bam/metrics.py --metrics=MCS --num_imgs=10`.
  • Input Dependence Rate (IDR): Compute by running `python bam/metrics.py --metrics=IDR --num_imgs=10`.
  • Input Independence Rate (IIR): First build a set of functionally insignificant patches, then compute via `python bam/metrics.py --metrics=IIR --num_imgs=10`.

TCAV Evaluation

  • Compute the TCAV score for the dog concept of the object model by running `python bam/run_tcav.py --model=obj`.