The BIOSCAN_1M insect dataset pairs genetic and image data for insect specimens. Each record has four primary attributes: a DNA barcode sequence, a Barcode Index Number (BIN), a taxonomic rank annotation, and an RGB image. The DNA barcode sequence gives the specimen's nucleotide arrangement. The BIN serves as an alternative to Linnaean names and provides a gene-centered taxonomy. The taxonomic rank annotation classifies each organism hierarchically according to evolutionary relationships. The RGB images are raw images drawn from the 16 most densely sampled insect orders. The class distribution is imbalanced, which is an inherent characteristic of insect communities.
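As a rough illustration of the four-attribute record and of how class imbalance might be measured, here is a minimal Python sketch. The `InsectRecord` class, its field names, the helper functions, and the toy records are all assumptions made for this example; they are not the dataset's actual schema, loader, or contents.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class InsectRecord:
    """Hypothetical record layout; field names are illustrative only."""
    dna_barcode: str   # nucleotide sequence of the DNA barcode
    bin_id: str        # Barcode Index Number (placeholder values below)
    taxonomy: dict     # taxonomic rank -> label, e.g. {"order": "Diptera"}
    image_path: str    # location of the RGB image


def class_distribution(records, rank="order"):
    """Count how many records carry each label at the given taxonomic rank."""
    return Counter(r.taxonomy.get(rank, "unknown") for r in records)


def imbalance_ratio(counts):
    """Ratio of the most frequent class to the least frequent class."""
    return max(counts.values()) / min(counts.values())


# Toy records invented for this sketch, not real dataset entries.
records = [
    InsectRecord("ACGT", "BIN-0001", {"order": "Diptera"}, "img/0001.jpg"),
    InsectRecord("ACGA", "BIN-0001", {"order": "Diptera"}, "img/0002.jpg"),
    InsectRecord("TCGA", "BIN-0002", {"order": "Diptera"}, "img/0003.jpg"),
    InsectRecord("GGCA", "BIN-0003", {"order": "Coleoptera"}, "img/0004.jpg"),
]

counts = class_distribution(records)
print(counts.most_common())
print(imbalance_ratio(counts))
```

Counting at a different rank only requires passing another key, for example `class_distribution(records, rank="family")`, provided the records carry that rank; records without it are grouped under `"unknown"`.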