Hrishi-2003 committed on
Commit b956a44 · verified · 1 Parent(s): cd9aac0

Upload 10 files

Files changed (10)
  1. LICENSE +21 -0
  2. Model Architecture +15 -0
  3. README.md +43 -3
  4. __init__.py +1 -0
  5. data_loader.py +27 -0
  6. evaluate.py +24 -0
  7. model_builder.py +33 -0
  8. predict.py +23 -0
  9. setup.py +18 -0
  10. train.py +45 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Hrishikesh Shahane
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
Model Architecture ADDED
@@ -0,0 +1,15 @@
+ ConvLSTM Layer
+ The model leverages ConvLSTM2D, a combination of convolutional layers and LSTM units, to predict cloud cover:
+
+ ConvLSTM2D Layer: Captures spatiotemporal features by applying convolutional operations on the input sequence of images.
+ Batch Normalization: Normalizes the output of each convolutional layer to improve training speed and stability.
+ Residual Connections: Introduced in the model to allow for deeper layers without vanishing gradients, ensuring that important features are passed forward.
+ TimeDistributed Layer: Allows the model to apply 2D convolution to each frame in a sequence independently.
+ Sigmoid Activation: The final output layer uses the sigmoid activation function to output predicted cloud cover values in the range [0, 1].
+
+ Model Evaluation
+ SSIM (Structural Similarity Index)
+ SSIM is a metric used to measure the similarity between two images. It evaluates luminance, contrast, and structure. The closer the SSIM score is to 1, the more similar the two images are.
+ MSE (Mean Squared Error)
+ MSE calculates the average squared differences between the predicted and actual values, providing a quantitative measure of the prediction error. Lower MSE indicates better predictions.
+
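The two metrics described above can be sanity-checked in a few lines; the sketch below is illustrative rather than repository code, using scikit-image for SSIM and NumPy for MSE on random stand-in frames scaled to [0, 1].

```python
# Minimal sketch of the SSIM and MSE metrics described above, on stand-in data.
# The array names and random frames are illustrative only.
import numpy as np
from skimage.metrics import structural_similarity as ssim

true_frame = np.random.rand(200, 200).astype("float32")       # stand-in for an actual cloud-cover frame
predicted_frame = np.random.rand(200, 200).astype("float32")  # stand-in for a model prediction

mse = np.mean((true_frame - predicted_frame) ** 2)              # lower is better
ssim_score = ssim(true_frame, predicted_frame, data_range=1.0)  # closer to 1 is better

print(f"MSE: {mse:.4f}  SSIM: {ssim_score:.4f}")
```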
README.md CHANGED
@@ -1,3 +1,43 @@
- ---
- license: mit
- ---
+ # Cloud Cover Nowcasting
+ This project predicts cloud cover with a sequence-to-sequence ConvLSTM (Convolutional Long Short-Term Memory) model: it forecasts future cloud cover from past satellite images. The model uses satellite `.tif` images taken at regular intervals to forecast cloud patterns, aiding in weather prediction and climate monitoring.
+
+ ## Features
+ - Satellite Image Processing: The model processes `.tif` images, which are commonly used in satellite data and provide high-resolution, geospatial information.
+ - ConvLSTM for Sequential Prediction: The ConvLSTM model learns and predicts the temporal dynamics of cloud cover. ConvLSTM combines convolutional layers (for spatial feature extraction) with LSTM (Long Short-Term Memory) layers (for sequence learning), making it ideal for spatiotemporal data like satellite imagery.
+ - Model Evaluation: The model's performance is evaluated with the Structural Similarity Index (SSIM) and Mean Squared Error (MSE). SSIM measures the similarity between the predicted and actual cloud cover images, while MSE gives a quantitative measure of prediction error.
+
+ ## Project Overview
+ ### What is ConvLSTM?
+ ConvLSTM is a deep learning architecture particularly suited to spatiotemporal data. Unlike regular LSTMs, which operate on sequences of scalar values, ConvLSTMs apply convolution operations within the LSTM structure, allowing the model to capture both spatial and temporal dependencies in the data.
+
+ - Convolutional Layers: Capture spatial patterns in images, such as cloud structures, edges, and textures.
+ - LSTM Layers: Learn temporal dependencies, i.e., how cloud patterns evolve over time.
+ - Sequence-to-Sequence Learning: The model is trained to predict the next frames in a sequence based on the previous ones. This is particularly useful for tasks like weather forecasting, where past data influences future predictions.
+
+ ### Time Series Forecasting with ConvLSTM
+ The dataset consists of a series of satellite images, each representing cloud cover at a different time. Time series forecasting with ConvLSTM lets the model predict the next frames in the sequence, making it well suited to nowcasting cloud cover. Each sequence of images provides context for predicting future cloud cover, taking both spatial and temporal factors into account.
+
+ ## Dataset
+ The dataset consists of `.tif` images (grayscale cloud cover images) that are loaded into memory and preprocessed for model training. The images are organized into sequences of frames so the model can learn the temporal evolution of cloud cover.
+
+ ## Features in short
+ - Processes satellite `.tif` images.
+ - Uses ConvLSTM for sequential prediction.
+ - Evaluates model accuracy using SSIM and MSE.
+
+ ## Setup
+ 1. Clone the repository.
+ 2. Install dependencies: `pip install -r requirements.txt`.
+ 3. Add `.tif` images to the `data/` directory.
+
+ ## Run Training
+ The model uses ConvLSTM layers to process sequences of images. It learns spatial features from each frame and temporal patterns from the sequence, enabling it to predict future cloud cover images. The model is trained with mean absolute error (MAE) as the loss function and optimized with the Adam optimizer.
+ `python train.py`
+
+ ## Run Evaluation
+ Once the model is trained, you can evaluate its performance on a validation set using the following command:
+ `python evaluate.py`
+
+ ## Run Prediction
+ To make predictions on a new set of satellite images, use the following command:
+ `python predict.py`
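To make the sequence-to-sequence idea in the README concrete, here is a minimal sketch (assuming TensorFlow/Keras and dummy data; not part of the commit) showing that a `ConvLSTM2D` layer with `return_sequences=True` returns one feature map per input frame, which is the property the nowcasting model relies on to emit one predicted frame per input frame.

```python
import numpy as np
from tensorflow.keras.layers import ConvLSTM2D

# One sample of four 200x200 grayscale frames: (batch, time, height, width, channels).
x = np.random.rand(1, 4, 200, 200, 1).astype("float32")

# return_sequences=True keeps one output per input frame, as needed for
# sequence-to-sequence nowcasting; filters=8 keeps this toy example small.
layer = ConvLSTM2D(filters=8, kernel_size=(3, 3), padding="same", return_sequences=True)
y = layer(x)
print(y.shape)  # (1, 4, 200, 200, 8)
```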
__init__.py ADDED
@@ -0,0 +1 @@
+ # This file marks the `src` directory as a Python package.
data_loader.py ADDED
@@ -0,0 +1,27 @@
+ import os
+ import cv2
+ import numpy as np
+
+ def load_images_from_folder(folder_path, img_size=(200, 200)):
+     images = []
+     for filename in sorted(os.listdir(folder_path)):
+         if filename.endswith('.tif'):
+             img_path = os.path.join(folder_path, filename)
+             img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
+             if img is not None:
+                 img = cv2.resize(img, img_size)
+                 img = img.astype('float32') / np.max(img)  # Normalize to [0, 1] by the image's own maximum
+                 images.append(img)
+             else:
+                 print(f"Warning: {img_path} could not be read.")
+     print(f"Total images loaded from {folder_path}: {len(images)}")
+     return np.array(images)
+
+ def create_sequences(data, sequence_length):
+     # Sliding window of `sequence_length` consecutive frames along the time axis
+     num_sequences = data.shape[0] - sequence_length + 1
+     return np.array([data[i:i + sequence_length] for i in range(num_sequences)])
+
+ def create_shifted_frames(data):
+     x = data[:, :-1, :, :, :]  # Frames 0 to n-1 (model input)
+     y = data[:, 1:, :, :, :]   # Frames 1 to n (prediction target)
+     return x, y
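A short usage sketch for the helpers above, with random arrays standing in for real `.tif` frames; the shapes assume 10 frames of 200×200 and `sequence_length=5`, mirroring the defaults used elsewhere in this commit.

```python
import numpy as np
from data_loader import create_sequences, create_shifted_frames

frames = np.random.rand(10, 200, 200).astype("float32")  # stand-in for loaded .tif frames
frames = np.expand_dims(frames, axis=-1)                 # add channel dim -> (10, 200, 200, 1)

sequences = create_sequences(frames, sequence_length=5)  # sliding windows -> (6, 5, 200, 200, 1)
x, y = create_shifted_frames(sequences)                  # x: frames 0-3, y: frames 1-4 of each window
print(sequences.shape, x.shape, y.shape)                 # (6, 5, 200, 200, 1) (6, 4, 200, 200, 1) (6, 4, 200, 200, 1)
```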
evaluate.py ADDED
@@ -0,0 +1,24 @@
+ import matplotlib.pyplot as plt
+ from sklearn.metrics import mean_squared_error
+ from skimage.metrics import structural_similarity as ssim
+
+ def evaluate_model(model, x_test, y_test):
+     predictions = model.predict(x_test)
+     mse = mean_squared_error(y_test.flatten(), predictions.flatten())
+     ssim_score = ssim(y_test[0, -1, :, :, 0], predictions[0, -1, :, :, 0], data_range=1.0)
+     print(f"Mean Squared Error: {mse}")
+     print(f"Structural Similarity Index: {ssim_score}")
+
+     return predictions
+
+ def visualize_predictions(x_test, y_test, predictions, idx=0):
+     fig, axes = plt.subplots(1, 3, figsize=(15, 5))
+     axes[0].imshow(x_test[idx, -1, :, :, 0], cmap='gray')
+     axes[0].set_title("Input Frame")
+
+     axes[1].imshow(y_test[idx, -1, :, :, 0], cmap='gray')
+     axes[1].set_title("True Frame")
+
+     axes[2].imshow(predictions[idx, -1, :, :, 0], cmap='gray')
+     axes[2].set_title("Predicted Frame")
+
+     plt.show()
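A hypothetical end-to-end call of these helpers on dummy data and an untrained model, only to show how the pieces fit together and what shapes they expect; the 64×64 frames and random arrays are illustrative, not repository settings.

```python
import numpy as np
from evaluate import evaluate_model, visualize_predictions
from model_builder import build_residual_convlstm_model_seq2seq

# Dummy test set: 2 samples of 4 frames each, 64x64 grayscale (small to keep this cheap).
x_test = np.random.rand(2, 4, 64, 64, 1).astype("float32")
y_test = np.random.rand(2, 4, 64, 64, 1).astype("float32")

model = build_residual_convlstm_model_seq2seq(input_shape=(4, 64, 64, 1))
predictions = evaluate_model(model, x_test, y_test)        # prints MSE and SSIM
visualize_predictions(x_test, y_test, predictions, idx=0)  # input / true / predicted frames side by side
```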
model_builder.py ADDED
@@ -0,0 +1,33 @@
+ import tensorflow as tf
+ from tensorflow.keras.layers import (
+     ConvLSTM2D, Input, Conv2D, BatchNormalization, Add, ReLU, TimeDistributed
+ )
+
+ def build_residual_convlstm_model_seq2seq(input_shape):
+     input_layer = Input(shape=input_shape)
+
+     # First ConvLSTM layer; its normalized output is saved as the residual
+     x = ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', return_sequences=True)(input_layer)
+     x = BatchNormalization()(x)
+     res = x  # Save the residual
+
+     # Second ConvLSTM layer
+     x = ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', return_sequences=True)(x)
+     x = BatchNormalization()(x)
+
+     # Residual connection: add the first layer's output back in
+     x = Add()([x, res])
+
+     # Third ConvLSTM layer, returning the entire sequence
+     x = ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', return_sequences=True)(x)
+     x = BatchNormalization()(x)
+
+     # Apply Conv2D and ReLU to each frame in the sequence using TimeDistributed
+     x = TimeDistributed(Conv2D(128, (3, 3), padding='same'))(x)
+     x = TimeDistributed(ReLU())(x)
+
+     # Final Conv2D layer to predict the sequence of frames
+     output_layer = TimeDistributed(Conv2D(1, (3, 3), activation='sigmoid', padding='same'))(x)
+
+     model = tf.keras.Model(inputs=input_layer, outputs=output_layer)
+     return model
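A minimal usage sketch for the builder above, instantiated with the shape train.py uses (`sequence_length - 1 = 4` input frames of 200×200 grayscale); the compile settings mirror train.py, but the snippet itself is illustrative.

```python
from tensorflow.keras.optimizers import Adam
from model_builder import build_residual_convlstm_model_seq2seq

# 4 input frames per sample, each 200x200 with a single channel.
model = build_residual_convlstm_model_seq2seq(input_shape=(4, 200, 200, 1))
model.compile(optimizer=Adam(learning_rate=0.001), loss="mean_absolute_error")
model.summary()  # final layer emits one 200x200x1 frame per input frame
```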
predict.py ADDED
@@ -0,0 +1,23 @@
+ import numpy as np
+ from tensorflow.keras.models import load_model
+ from data_loader import load_images_from_folder, create_sequences
+
+ def predict_next_frame(model_path, input_sequence):
+     model = load_model(model_path)
+     predictions = model.predict(input_sequence)
+     return predictions
+
+ if __name__ == "__main__":
+     folder_path = "/path/to/new/data"
+     img_size = (200, 200)
+     sequence_length = 5
+
+     # Load and preprocess data. The trained model expects inputs of
+     # (sequence_length - 1) frames (see train.py), so build windows of that length.
+     dataset = load_images_from_folder(folder_path, img_size=img_size)
+     dataset = np.expand_dims(dataset, axis=-1)
+     sequences = create_sequences(dataset, sequence_length - 1)
+
+     # Load trained model and predict
+     model_path = "best_model.keras"
+     predictions = predict_next_frame(model_path, sequences)
+     print("Predictions generated for the input sequence.")
setup.py ADDED
@@ -0,0 +1,18 @@
+ from setuptools import setup, find_packages
+
+ setup(
+     name="cloud_cover_nowcasting",
+     version="0.1.0",
+     author="Hrishikesh Shahane",
+     description="A package for cloud cover nowcasting using ConvLSTM.",
+     packages=find_packages(where="src"),
+     package_dir={"": "src"},
+     install_requires=[
+         "tensorflow>=2.9.0",
+         "numpy",
+         "opencv-python",
+         "matplotlib",
+         "scikit-image",
+         "scikit-learn",
+     ],
+ )
train.py ADDED
@@ -0,0 +1,45 @@
+ import numpy as np
+ from sklearn.model_selection import train_test_split
+ from tensorflow.keras.optimizers import Adam
+ from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
+ from data_loader import load_images_from_folder, create_sequences, create_shifted_frames
+ from model_builder import build_residual_convlstm_model_seq2seq
+
+ # Parameters
+ folder_path = "/home/hrishikesh2003/Data/DataNewApproch"
+ image_size = (200, 200)
+ sequence_length = 5
+ batch_size = 4
+ epochs = 20
+
+ # Load and preprocess data
+ dataset = load_images_from_folder(folder_path, img_size=image_size)
+ dataset = np.expand_dims(dataset, axis=-1)
+ sequences = create_sequences(dataset, sequence_length)
+
+ # Split data (no shuffle: preserve the temporal order between train and validation)
+ train_sequences, val_sequences = train_test_split(sequences, test_size=0.1, shuffle=False)
+ x_train, y_train = create_shifted_frames(train_sequences)
+ x_val, y_val = create_shifted_frames(val_sequences)
+
+ # Build model
+ model = build_residual_convlstm_model_seq2seq(input_shape=(sequence_length - 1, *image_size, 1))
+ model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_absolute_error')
+
+ # Callbacks
+ checkpoint = ModelCheckpoint(
+     'best_model.keras', monitor='val_loss', save_best_only=True, mode='min', verbose=1
+ )
+ early_stopping = EarlyStopping(monitor='val_loss', patience=20, mode='min', verbose=1)
+
+ # Train model
+ if x_train.shape[0] > 0 and y_train.shape[0] > 0:
+     model.fit(
+         x_train, y_train,
+         batch_size=batch_size,
+         epochs=epochs,
+         validation_data=(x_val, y_val),
+         callbacks=[checkpoint, early_stopping]
+     )
+ else:
+     print("Not enough training data to train the model.")