sagecontinuum/solarirradiancedataset
Solar Irradiance
Image Regression
This dataset is used to estimate solar irradiance from ground‑level images to support solar power generation, weather forecasting, climate‑change research, and smart‑home management. The data consist of images captured by the top camera of a Sage Waggle Node and corresponding solar irradiance values recorded by a tower at Argonne National Laboratory. During preprocessing, a CSV file was created to match images with their irradiance values, and nighttime photos were excluded; only summer photos were retained. During training, images were resized to 224 × 224 and augmented with random flips and rotations to improve model generalization. ResNet‑50 achieved the best performance with a mean absolute error of 82. Future work includes adding more training data to reduce error, improving estimation under thin clouds and during sunrise/sunset, and implementing pattern‑based irradiance prediction.
## Dataset Information
- **Features**:
  - `image`: image data, type `image`
  - `irradiance`: irradiance data, type `float32`
- **Splits**:
  - `full`: complete dataset with 1,000 samples, total size 13,466,250 bytes
- **Download Size**: 14,234,112 bytes
- **Dataset Size**: 13,466,250 bytes
- **Labels**:
  - `climate`
- **License**: MIT
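
The fields above can be read with the Hugging Face `datasets` library. The following is a minimal sketch, assuming the identifier at the top of this card (`sagecontinuum/solarirradiancedataset`) is the Hub repository id. The helper names `load_full_split` and `mean_bytes_per_sample` are illustrative and not part of the dataset.

```python
# Assumption: the card's header line is the Hugging Face Hub repository id.
REPO_ID = "sagecontinuum/solarirradiancedataset"
SPLIT = "full"  # the only split listed on this card


def load_full_split():
    """Load the `full` split (needs `pip install datasets` and network access)."""
    # Imported lazily so the size helper below works without the dependency.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=SPLIT)


def mean_bytes_per_sample(dataset_bytes: int, num_samples: int) -> float:
    """Average stored size of one sample, from the figures listed on the card."""
    return dataset_bytes / num_samples
```

Each row returned by `load_full_split()` should expose an `image` entry and a float `irradiance` entry, matching the feature list. With the listed figures, `mean_bytes_per_sample(13_466_250, 1_000)` gives roughly 13.5 kB per sample.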
## Pre‑processing
- A CSV file was generated to pair each image with its corresponding solar irradiance value.
- Images were taken from the top camera of a Sage Waggle Node, while irradiance values came from Argonne National Laboratory tower measurements.
- Nighttime photos were filtered out, and only summer photos were kept.
- The original 2,000 × 2,000‑pixel images were downscaled to 500 × 500 pixels.
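
The pairing and filtering steps can be sketched as follows. This card does not say how images were matched to tower readings or how night and summer were defined, so everything specific here is an assumption made for illustration: nearest-timestamp matching within `max_gap`, June to August as summer, 06:00 to 20:00 local time as daytime, and the CSV column names.

```python
import csv
import io
from bisect import bisect_left
from datetime import datetime, timedelta

SUMMER_MONTHS = {6, 7, 8}    # assumption: meteorological summer
DAY_START, DAY_END = 6, 20   # assumption: local hours treated as daytime


def is_kept(ts: datetime) -> bool:
    """Keep only summer, daytime timestamps."""
    return ts.month in SUMMER_MONTHS and DAY_START <= ts.hour < DAY_END


def nearest_reading(ts, times, values, max_gap):
    """Return the reading closest in time to ts, or None if none is within max_gap."""
    i = bisect_left(times, ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
    if not candidates:
        return None
    j = min(candidates, key=lambda k: abs(times[k] - ts))
    return values[j] if abs(times[j] - ts) <= max_gap else None


def build_pairs(images, readings, max_gap=timedelta(minutes=1)):
    """images: [(filename, datetime)]; readings: [(datetime, irradiance)]."""
    readings = sorted(readings)
    times = [t for t, _ in readings]
    values = [v for _, v in readings]
    rows = []
    for name, ts in images:
        if not is_kept(ts):
            continue  # drop nighttime and non-summer photos
        value = nearest_reading(ts, times, values, max_gap)
        if value is not None:
            rows.append((name, value))
    return rows


def to_csv(rows) -> str:
    """Serialize (filename, irradiance) pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["image", "irradiance"])
    writer.writerows(rows)
    return buf.getvalue()
```

Photos without a tower reading inside `max_gap` are dropped rather than paired with a stale value. That is a design choice of this sketch, not a documented property of the dataset.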
## Training and Model
- Images were resized to 224 × 224 before conversion to tensors, with random flips and rotations applied for augmentation.
- Pre‑trained ResNet and VGG‑16 models were compared; in each, the final fully‑connected layer was replaced with one that outputs a single continuous value.
- ResNet‑50 yielded the best results with a mean absolute error of 82.
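
The training setup described above can be approximated with `torchvision`. This is a sketch under stated assumptions, not the authors' training code: the flip directions, the 15° rotation range, and the choice of ImageNet weights are not given on this card. The metric helper is plain Python and shows how a mean absolute error is computed.

```python
def build_transforms(train: bool = True):
    """Resize to 224 x 224, optionally augment, then convert to a tensor."""
    from torchvision import transforms  # lazy import: needs torchvision installed

    steps = [transforms.Resize((224, 224))]
    if train:
        steps += [
            transforms.RandomHorizontalFlip(),      # assumption: flip directions
            transforms.RandomVerticalFlip(),
            transforms.RandomRotation(degrees=15),  # assumption: rotation range
        ]
    steps.append(transforms.ToTensor())
    return transforms.Compose(steps)


def build_resnet50_regressor():
    """Pre-trained ResNet-50 whose final layer outputs one continuous value."""
    from torch import nn
    from torchvision import models

    # Assumption: ImageNet weights; the card only says "pre-trained".
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, 1)  # regression head
    return model


def mean_absolute_error(predictions, targets) -> float:
    """Average of |prediction - target| over all samples."""
    if len(predictions) != len(targets) or len(predictions) == 0:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(predictions)
```

For tensors, `torch.nn.L1Loss` computes the same quantity as `mean_absolute_error`. The card does not state which loss was used during training.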
## Future Directions
- Increase training data to lower MAE.
- Address errors caused by thin cloud layers.
- Improve irradiance prediction during sunrise and sunset.
- Develop pattern‑based irradiance level prediction.