---
license: other
language:
- en
metrics:
- precision
- f1
pipeline_tag: tabular-classification
tags:
- agriculture
- remote sensing
- satellite imagery
- crop classification
library_name: XGBoost
library_version: 2.0.3
---

# Model Card for AI-Enhanced Crop Field Data Curation

## Summary

This repository primarily contains models for classifying crop types in the Kharif and Rabi seasons. Wadhwani AI trained and fine-tuned these models on open-source Sentinel-1 and Sentinel-2 imagery, using ground truth data supplied by the [Mahalanobis National Crop Forecast Center](https://www.ncfc.gov.in/about-us.html).

## Model Details

The scalers were built with `StandardScaler` from the [scikit-learn](https://scikit-learn.org/stable/) library, and the ML classifiers were trained with [XGBoost](https://xgboost.readthedocs.io/en/latest/).

## Training Data

- Sowing Year:
  - `rabi`: 2022
  - `kharif`: 2023
- Location: The location-wise crop distribution is available [here](https://drive.google.com/file/d/1HEC9r3cu17eeXOssxjDfHg-x8cidMH8j/view?usp=sharing)
- Predictors:
  - Source: Sentinel-1 and Sentinel-2 image data from Google Earth Engine
  - Data type:
    - `rabi`: NDVI (Normalized Difference Vegetation Index) values recorded at fortnightly intervals for individual crop lands throughout the entire Rabi season (October to April).
    - `kharif`: NDVI and VH (Vertical-Horizontal Polarization) values recorded at fortnightly intervals for individual crop lands throughout the entire Kharif season (May to November).
- GT Source: Ground truth data curated by [Mahalanobis National Crop Forecast Center](https://www.ncfc.gov.in/about-us.html)
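
As a rough illustration of how predictors like these could be derived, the sketch below computes NDVI from near-infrared and red reflectance and maps an acquisition date to a fortnight label such as `oct_1f`. The function names, the day-15 split between fortnights, and the `(NDVI + 1) * 100` rescaling to a 0-200 range are assumptions made for this example; the card does not document the exact preprocessing.

```python
from datetime import date

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def ndvi(nir: float, red: float) -> float:
    """Standard NDVI definition: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def scale_ndvi(value: float) -> float:
    """ASSUMPTION: map NDVI from [-1, 1] to [0, 200] as (NDVI + 1) * 100."""
    return (value + 1.0) * 100.0


def fortnight_label(day: date) -> str:
    """ASSUMPTION: days 1-15 form the first fortnight, the rest the second."""
    half = "1f" if day.day <= 15 else "2f"
    return f"{MONTHS[day.month - 1]}_{half}"


# Example: one observation taken on 3 October 2022
print(fortnight_label(date(2022, 10, 3)))
print(round(scale_ndvi(ndvi(nir=0.45, red=0.15)), 1))
```

If the actual pipeline uses a different fortnight boundary or rescaling, only `fortnight_label` and `scale_ndvi` would need to change.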

## Kharif Season Models

### Crop Type Classifiers

The scaler standardizes the input features and the classifiers predict crop types for the Kharif season. For predictions, you can use either the entire season's NDVI and VH data (from the 1st fortnight of May to the 2nd fortnight of November) or a subset of it for early crop type identification. Make sure the model you use was trained on the same fortnight range as the data you supply for prediction. Target labels are: `{0: Paddy, 1: Sugarcane, 2: Cotton}`.

- `kharif_ctc_scaler.pkl`: Standard scaler for Kharif crop type classification.
- `may_1f-jul_2f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 2nd fortnight of July.
- `may_1f-aug_1f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 1st fortnight of August.
- `may_1f-aug_2f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 2nd fortnight of August.
- `may_1f-sep_1f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 1st fortnight of September.
- `may_1f-sep_2f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 2nd fortnight of September.
- `may_1f-oct_1f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 1st fortnight of October.
- `may_1f-oct_2f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 2nd fortnight of October.
- `may_1f-nov_1f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 1st fortnight of November.
- `may_1f-nov_2f_kharif_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of May to 2nd fortnight of November.
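
The file names above follow a regular pattern, so the model file and its matching feature columns can be derived from the last fortnight for which data is available. The sketch below shows one way to do that; the helper names are hypothetical, and the `<fortnight>_vh`, `<fortnight>_ndvi` column order is taken from the Kharif inference sample in this card. How the single `kharif_ctc_scaler.pkl` should be applied to partial-season data is not documented here, so that step is left out.

```python
KHARIF_FORTNIGHTS = [f"{month}_{half}"
                     for month in ("may", "jun", "jul", "aug", "sep", "oct", "nov")
                     for half in ("1f", "2f")]

# Released classifiers end at jul_2f, aug_1f, ..., nov_2f (nine cutoffs).
KHARIF_CUTOFFS = KHARIF_FORTNIGHTS[KHARIF_FORTNIGHTS.index("jul_2f"):]


def kharif_model_file(cutoff: str) -> str:
    """Return the classifier file name for a window ending at `cutoff`."""
    if cutoff not in KHARIF_CUTOFFS:
        raise ValueError(f"no Kharif classifier ends at {cutoff!r}")
    return f"may_1f-{cutoff}_kharif_ctc.pkl"


def kharif_feature_columns(cutoff: str) -> list:
    """Return VH/NDVI column names from may_1f up to and including `cutoff`."""
    last = KHARIF_FORTNIGHTS.index(cutoff)
    columns = []
    for fortnight in KHARIF_FORTNIGHTS[:last + 1]:
        columns += [f"{fortnight}_vh", f"{fortnight}_ndvi"]
    return columns


print(kharif_model_file("aug_1f"))
print(len(kharif_feature_columns("aug_1f")))
```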

## Rabi Season Models

### Crop Type Classifiers

The scaler standardizes the input features and the classifiers predict crop types for the Rabi season. For predictions, you can use either the entire season's NDVI data (from the 1st fortnight of October to the 2nd fortnight of April) or a subset of it for early crop type identification. Make sure the model you use was trained on the same fortnight range as the data you supply for prediction. Target labels are: `{0: Mustard, 1: Wheat, 2: Potato, 3: Bengal Gram}`.

- `rabi_ctc_scaler.pkl`: Standard scaler for Rabi crop type classification.
- `oct_1f-dec_2f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 2nd fortnight of December.
- `oct_1f-jan_1f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 1st fortnight of January.
- `oct_1f-jan_2f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 2nd fortnight of January.
- `oct_1f-feb_1f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 1st fortnight of February.
- `oct_1f-feb_2f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 2nd fortnight of February.
- `oct_1f-mar_1f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 1st fortnight of March.
- `oct_1f-mar_2f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 2nd fortnight of March.
- `oct_1f-apr_1f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 1st fortnight of April.
- `oct_1f-apr_2f_rabi_ctc.pkl`: Crop type classifier trained on data from 1st fortnight of October to 2nd fortnight of April.
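
For early identification, the practical question is which of these classifiers the data collected so far can support. The sketch below picks the latest cutoff whose whole window, starting at `oct_1f`, is present without gaps. The function name is hypothetical, and the column names are assumed to match the fortnight labels used in the file names.

```python
RABI_FORTNIGHTS = [f"{month}_{half}"
                   for month in ("oct", "nov", "dec", "jan", "feb", "mar", "apr")
                   for half in ("1f", "2f")]

# Released classifiers end at dec_2f, jan_1f, ..., apr_2f (nine cutoffs).
RABI_CUTOFFS = set(RABI_FORTNIGHTS[RABI_FORTNIGHTS.index("dec_2f"):])


def latest_rabi_cutoff(available_columns):
    """Return the latest usable cutoff, or None if even dec_2f is not covered."""
    have = set(available_columns)
    best = None
    for fortnight in RABI_FORTNIGHTS:
        if fortnight not in have:
            break  # a gap ends the contiguous window
        if fortnight in RABI_CUTOFFS:
            best = fortnight
    return best


# Example: data available from oct_1f through jan_2f
cutoff = latest_rabi_cutoff(RABI_FORTNIGHTS[:8])
print(cutoff, f"oct_1f-{cutoff}_rabi_ctc.pkl")
```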

### Other Crop Rejection Classifiers

The scaler standardizes the input features and the classifiers flag crops other than the desired Rabi crops so that they can be rejected. For predictions, you can use either the entire season's NDVI data (from the 1st fortnight of October to the 2nd fortnight of April) or a subset of it. Make sure the model you use was trained on the same fortnight range as the data you supply for prediction. Target labels are: `{0: Desired, 1: Others}`.

- `rabi_ocr_scaler.pkl`: Standard scaler for Rabi other crop rejection.
- `oct_1f-dec_2f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 2nd fortnight of December.
- `oct_1f-jan_1f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 1st fortnight of January.
- `oct_1f-jan_2f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 2nd fortnight of January.
- `oct_1f-feb_1f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 1st fortnight of February.
- `oct_1f-feb_2f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 2nd fortnight of February.
- `oct_1f-mar_1f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 1st fortnight of March.
- `oct_1f-mar_2f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 2nd fortnight of March.
- `oct_1f-apr_1f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 1st fortnight of April.
- `oct_1f-apr_2f_rabi_ocr.pkl`: Other crop rejection classifier trained on data from 1st fortnight of October to 2nd fortnight of April.
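
The card does not say how the rejection and crop type classifiers are meant to be combined. One plausible arrangement, sketched below purely as an assumption, is to run both on the same fields and report `Others` wherever the rejection classifier predicts label 1. The function name is hypothetical, and the inputs stand in for the label arrays returned by each model's `predict` call.

```python
RABI_CROPS = {0: "Mustard", 1: "Wheat", 2: "Potato", 3: "Bengal Gram"}
OTHERS_LABEL = 1  # from the rejection classifier's target labels


def combine_predictions(ocr_labels, ctc_labels):
    """Merge per-field rejection labels with crop type labels into crop names."""
    if len(ocr_labels) != len(ctc_labels):
        raise ValueError("both classifiers must score the same fields")
    return [
        "Others" if int(ocr) == OTHERS_LABEL else RABI_CROPS[int(ctc)]
        for ocr, ctc in zip(ocr_labels, ctc_labels)
    ]


# Example: the second field is rejected, the other two keep their crop type
print(combine_predictions([0, 1, 0], [1, 2, 3]))
```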

## Usage

### Installation

To use the models, you need the XGBoost, scikit-learn, and pandas libraries installed. You can install them via pip:

```bash
pip install xgboost~=2.0.3
pip install scikit-learn~=1.5.0
pip install pandas~=2.2.2
```
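
Pickled scalers and classifiers can be sensitive to the library versions used to load them, so it may be worth checking the environment before unpickling anything. The sketch below compares installed versions against the pins above using the standard-library `importlib.metadata` module; the helper names and the simplified `~=` check are assumptions of this example, not part of the released code.

```python
import re
from importlib import metadata

PINNED = {"xgboost": "2.0.3", "scikit-learn": "1.5.0", "pandas": "2.2.2"}
_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def is_compatible(installed: str, pinned: str) -> bool:
    """Approximate pip's `~=X.Y.Z`: same X.Y and a patch number >= Z."""
    got, want = _VERSION.match(installed), _VERSION.match(pinned)
    if not got or not want:
        return False
    got_v = tuple(int(part) for part in got.groups())
    want_v = tuple(int(part) for part in want.groups())
    return got_v[:2] == want_v[:2] and got_v[2] >= want_v[2]


def check_environment(pinned: dict = PINNED) -> dict:
    """Return {package: message} for packages that are missing or mismatched."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not is_compatible(found, wanted):
            problems[name] = f"found {found}, expected ~={wanted}"
    return problems


print(check_environment() or "environment matches the pinned versions")
```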

### Inference

The following sample shows how to classify Rabi season crop types:

```python
import pickle

import pandas as pd
import sklearn  # scikit-learn must be installed to unpickle the scaler
import xgboost  # XGBoost must be installed to unpickle the classifier

# Load your desired scaler and model
with open('path/to/rabi_ctc_scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)

with open('path/to/oct_1f-apr_2f_rabi_ctc.pkl', 'rb') as f:
    model = pickle.load(f)

# Prepare your data
# Assuming data is a pandas dataframe with features corresponding to NDVI (normalized to 0-200)
# from 1st fortnight of October to 2nd fortnight of April.
# Illustrative values only: one value per fortnight, 14 per row.
data = pd.DataFrame(data=[[115, 122, 138, 145, 152, 159, 165, 172, 140, 130, 110, 100, 95, 90],
                          [116, 123, 139, 146, 153, 160, 166, 173, 141, 131, 111, 101, 96, 91]],
                    columns=['oct_1f', 'oct_2f', 'nov_1f', 'nov_2f', 'dec_1f', 'dec_2f', 'jan_1f',
                             'jan_2f', 'feb_1f', 'feb_2f', 'mar_1f', 'mar_2f', 'apr_1f', 'apr_2f'])

# Scale your data
scaled_data = scaler.transform(data)

# Make predictions
predictions = model.predict(scaled_data)

# Interpret predictions
# Rabi Season Crops
class_mapping = {0: 'Mustard', 1: 'Wheat', 2: 'Potato', 3: 'Bengal Gram'}
classified_crops = [class_mapping[int(label)] for label in predictions]

print(classified_crops)
```
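
Before calling `scaler.transform`, it can help to confirm that each row has one value per expected column and that the NDVI values fall in the 0-200 range the sample assumes. The helper below is a minimal sketch with a hypothetical name (`check_ndvi_rows`); it is not part of the released code.

```python
import math

RABI_COLUMNS = [f"{month}_{half}"
                for month in ("oct", "nov", "dec", "jan", "feb", "mar", "apr")
                for half in ("1f", "2f")]


def check_ndvi_rows(rows, columns=RABI_COLUMNS, low=0.0, high=200.0):
    """Return a list of problems found; an empty list means the rows look usable."""
    problems = []
    for i, row in enumerate(rows):
        if len(row) != len(columns):
            problems.append(f"row {i}: expected {len(columns)} values, got {len(row)}")
            continue
        for name, value in zip(columns, row):
            if value is None or math.isnan(value):
                problems.append(f"row {i}: {name} is missing")
            elif not low <= value <= high:
                problems.append(f"row {i}: {name}={value} is outside [{low}, {high}]")
    return problems


# Example: a row with only 11 values is reported instead of failing later
short_row = [115, 122, 138, 145, 152, 159, 165, 172, 140, 130, 110]
print(check_ndvi_rows([short_row]))
```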

For the Kharif season:

```python
import pickle

import pandas as pd
import sklearn  # scikit-learn must be installed to unpickle the scaler
import xgboost  # XGBoost must be installed to unpickle the classifier

# Load your desired scaler and model
with open('path/to/kharif_ctc_scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)

with open('path/to/may_1f-nov_2f_kharif_ctc.pkl', 'rb') as f:
    model = pickle.load(f)

# Prepare your data
# Assuming data is a pandas dataframe with features alternating between VH (in dB) and
# NDVI (normalized to 0-200) from 1st fortnight of May to 2nd fortnight of November.
# Illustrative values only: one VH/NDVI pair per fortnight, 28 values per row.
data = pd.DataFrame(data=[[-10.0000, 110.0000, -11.6667, 118.3333, -13.3333, 126.6667, -15.0000,
                           135.0000, -16.6667, 143.3333, -18.3333, 151.6667, -20.0000, 160.0000,
                           -21.6667, 168.3333, -23.3333, 176.6667, -25.0000, 185.0000, -26.6667,
                           186.6667, -28.3333, 188.3333, -30.0000, 190.0000, -31.6667, 191.6667]],
                    columns=['may_1f_vh', 'may_1f_ndvi', 'may_2f_vh', 'may_2f_ndvi', 'jun_1f_vh',
                             'jun_1f_ndvi', 'jun_2f_vh', 'jun_2f_ndvi', 'jul_1f_vh', 'jul_1f_ndvi',
                             'jul_2f_vh', 'jul_2f_ndvi', 'aug_1f_vh', 'aug_1f_ndvi', 'aug_2f_vh',
                             'aug_2f_ndvi', 'sep_1f_vh', 'sep_1f_ndvi', 'sep_2f_vh', 'sep_2f_ndvi',
                             'oct_1f_vh', 'oct_1f_ndvi', 'oct_2f_vh', 'oct_2f_ndvi', 'nov_1f_vh',
                             'nov_1f_ndvi', 'nov_2f_vh', 'nov_2f_ndvi'])

# Scale your data
scaled_data = scaler.transform(data)

# Make predictions
predictions = model.predict(scaled_data)

# Interpret predictions
# Kharif Season Crops
class_mapping = {0: 'Paddy', 1: 'Sugarcane', 2: 'Cotton'}
classified_crops = [class_mapping[int(label)] for label in predictions]

print(classified_crops)
```

## Out-of-Scope

The models are not designed to handle crops other than those listed under the target labels. The `ocr` models for the Rabi season may misclassify fields because the "Others" crop data is unrefined. For the Kharif season, the lack of ground truth data limits the models to classifying only sugarcane, paddy, and cotton, with no out-of-distribution rejection mechanism. Although the models are effective in the tested regions, further validation is needed to confirm their performance across diverse geographical areas.

## Abbreviations

- `NDVI`: Normalized Difference Vegetation Index
- `VH`: Vertical-Horizontal Polarization
- `Kharif`: Season of heavy rainfall (May to November)
- `Rabi`: Season of scanty rainfall (October to April)
- `Crop lands`: Areas of land that are planted with crops
- `Ground truth data`: Data that is considered to be true and accurate
- `1f`: First fortnight of the month
- `2f`: Second fortnight of the month

## Contact

For any queries, please feel free to reach out to us at this email: [[email protected]]([email protected])