Artificial Intelligence RB14-16: Predicting Series and Sequences with an LSTM, Part 1


Complex things start simple… LSTM.

1, 2, 3, 4, …?

To predict the next element of a sequence like this, an LSTM is used. I'll explain every object, function, and parameter.

How the program works

1) Imports

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Input
  • numpy as np: numerical arrays and reshaping.

  • Sequential: a Keras model type where layers are stacked in order.

  • Dense: a fully connected (feed-forward) layer.

  • LSTM: Long Short-Term Memory layer (a recurrent layer for sequences).

  • Input: explicitly declares the input tensor shape.

2) Build a simple dataset (integers 1..100)

data = np.array(range(1, 101))
  • Creates data = [1, 2, 3, ..., 100] as a NumPy array.

3) Create sliding windows (supervised sequences)

X, y = [], []
for i in range(len(data)-3):
    X.append(data[i:i+3])  # length-3 window: [t, t+1, t+2]
    y.append(data[i+3])    # next value (t+3) as label
  • For each position i, take a 3-number window as input and the next number as the target.

  • Example: when i=0, X[0]=[1,2,3], y[0]=4; when i=1, X[1]=[2,3,4], y[1]=5, etc.

  • Because the window length is 3, the last usable start is at index len(data)-4, so there are len(data)-3 = 97 samples.

Convert to arrays:

X = np.array(X) # shape (97, 3)
y = np.array(y) # shape (97,)
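As a quick sanity check, the windowing logic above can be reproduced on its own with pure NumPy (no model needed) and the shapes and first/last samples verified:

```python
import numpy as np

data = np.array(range(1, 101))  # 1..100, as in the tutorial

X, y = [], []
for i in range(len(data) - 3):
    X.append(data[i:i+3])   # 3-value input window
    y.append(data[i+3])     # the next value is the label

X = np.array(X)
y = np.array(y)

print(X.shape)       # (97, 3)
print(y.shape)       # (97,)
print(X[0], y[0])    # [1 2 3] 4
print(X[-1], y[-1])  # [97 98 99] 100
```

The last window starts at index 96 (value 97), confirming the count of 97 samples.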

4) Reshape X for LSTM: [samples, timesteps, features]

X = X.reshape((X.shape[0], X.shape[1], 1))
  • LSTM expects 3D input: (batch, timesteps, features_per_timestep).

  • Here: samples=97, timesteps=3 (the window length), features=1 (each timestep is a single scalar).

  • Final X.shape == (97, 3, 1).

5) Define the model

model = Sequential()
model.add(Input(shape=(3,1)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))

Layer by layer:

  • Input(shape=(3,1))

    • Declares that each training example is a sequence of length 3, with 1 feature per time step.

  • LSTM(50, activation='relu')

    • LSTM is a gated recurrent unit that processes the 3 timesteps in order, maintaining an internal memory (cell state) to model temporal patterns.

    • units=50: the size of the LSTM hidden state (number of LSTM cells).

    • activation='relu': the activation applied to the candidate/output (default is tanh; here it’s changed to ReLU).

    • Important default parameters (not shown but used):

      • recurrent_activation='sigmoid' for the input/forget/output gates.

      • return_sequences=False (outputs only the last time step’s output vector of length 50).

      • dropout=0.0, recurrent_dropout=0.0 by default.

    Parameter count (informative):
    For LSTM, params = 4 * units * (units + input_dim + 1)
    Here: units=50, input_dim=1 → 4 * 50 * (50 + 1 + 1) = 4 * 50 * 52 = 10,400.

  • Dense(1)

    • Fully connected layer mapping the 50-dim output from LSTM to a single scalar prediction (the next number).

    • Params: 50*1 + 1 = 51.

  • Total trainable parameters: 10,400 + 51 = 10,451.
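The parameter counts can be verified with plain arithmetic, no Keras required. This is a small sketch (the helper names `lstm_params` and `dense_params` are mine); the totals mirror what `model.summary()` would report:

```python
def lstm_params(units, input_dim):
    # 4 gate blocks (input, forget, cell candidate, output), each with a
    # kernel row per input feature, a recurrent kernel row per unit, and a bias
    return 4 * units * (units + input_dim + 1)

def dense_params(in_dim, out_dim):
    # weight matrix plus one bias per output
    return in_dim * out_dim + out_dim

lstm_p = lstm_params(50, 1)    # 4 * 50 * 52 = 10400
dense_p = dense_params(50, 1)  # 50 + 1 = 51
total = lstm_p + dense_p       # 10451
print(lstm_p, dense_p, total)
```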

6) Compile the model (training configuration)

model.compile(optimizer='adam', loss='mse')
  • optimizer='adam': adaptive gradient method (good default).

  • loss='mse': mean squared error for regression (predicting a real number).

7) Train

model.fit(X, y, epochs=200, verbose=1)
  • X: (97,3,1) inputs; y: (97,) targets.

  • epochs=200: the whole dataset is iterated 200 times.

  • verbose=1: prints a progress bar and per-epoch loss.

  • Default batch_size=32: with 97 samples, each epoch runs 4 batches of sizes 32, 32, 32, and 1 (the remainder).
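The batch arithmetic can be checked directly (a standalone sketch; `batch_sizes` is my helper, not a Keras function):

```python
def batch_sizes(n_samples, batch_size=32):
    # full batches first; the last batch holds whatever remains
    n_full, rem = divmod(n_samples, batch_size)
    sizes = [batch_size] * n_full
    if rem:
        sizes.append(rem)
    return sizes

print(batch_sizes(97))       # [32, 32, 32, 1]
print(len(batch_sizes(97)))  # 4 batches per epoch
```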

What the LSTM learns:
Given sequences like [n, n+1, n+2] → target (n+3), it learns the rule “next value ≈ previous value + 1” (i.e., a simple linear progression) from examples 1..100.

8) Test inference

test_input = np.array([1,2,3]).reshape((1,3,1))
print("Next value:", model.predict(test_input, verbose=0))
  • test_input shape (1,3,1) matches the model’s expected input.

  • The LSTM processes the three timesteps [1,2,3] and outputs a scalar close to 4.0.

  • verbose=0 silences prediction logs.

What “using an LSTM” means here

  • Across the 3 timesteps, the LSTM updates its hidden state and cell state using input/forget/output gates, allowing it to model temporal dependencies.

  • Even though the sequence is very short (length 3), the LSTM treats it as a time series, not just a static 3-vector. It “reads” values in order and uses its memory mechanism to summarize them before the Dense layer maps that summary to the next value.

Notes and typical improvements (optional)

  • For time-series regression, activation='tanh' in LSTM is common; relu can work but may be less stable for some data.

  • Normalizing the data (e.g., scaling 1..100 to 0..1) often improves training.

  • Use a proper train/validation split to check generalization.

  • Longer windows (more timesteps) help when the next value depends on more history.
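As a sketch of the normalization idea (min-max scaling of 1..100 into [0, 1]; the helper names are mine, not part of the original program):

```python
import numpy as np

data = np.array(range(1, 101), dtype=float)

def minmax_scale(a):
    # map min(a)..max(a) linearly onto 0..1
    lo, hi = a.min(), a.max()
    return (a - lo) / (hi - lo), lo, hi

def minmax_unscale(s, lo, hi):
    # invert the scaling to recover the original units
    return s * (hi - lo) + lo

scaled, lo, hi = minmax_scale(data)
print(scaled[0], scaled[-1])               # 0.0 1.0
print(minmax_unscale(scaled, lo, hi)[:3])  # back to [1. 2. 3.]
```

Remember to apply the inverse transform to predictions so they come back in the original scale.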

That’s it: the code frames next-step prediction as sequence modeling, prepares 3-step windows, trains an LSTM to map each window to its next value, and demonstrates prediction on [1,2,3] → ~4.

 


 

 

Part 2:

Let's try a far more complex sequence to predict.

 

Block 1: Imports and Setup

We start by importing the required libraries:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Input

numpy is used for arrays and numbers, matplotlib.pyplot (optional) is for visualization, and tensorflow.keras provides the tools to build and train the LSTM model.


Block 2: Generate Dataset Function

Next, we define the function that builds our dataset:

def generate_function(x, noise_level=0.0):
    base = np.sin(x) + 0.5*np.cos(3*x) + 0.3*np.sin(5*x)
    noise = noise_level * np.random.randn(len(x))
    return base + noise

This function creates a signal that combines sine and cosine waves, then adds Gaussian noise. The noise_level parameter controls how much randomness is introduced.
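With noise_level=0.0 the signal is deterministic, which makes it easy to sanity-check a few values by hand; for example, at x = 0 the base is sin 0 + 0.5·cos 0 + 0.3·sin 0 = 0.5:

```python
import numpy as np

def generate_function(x, noise_level=0.0):
    base = np.sin(x) + 0.5*np.cos(3*x) + 0.3*np.sin(5*x)
    noise = noise_level * np.random.randn(len(x))
    return base + noise

x = np.array([0.0, np.pi/2])
clean = generate_function(x)  # noise_level=0.0 → pure signal
print(clean[0])               # 0.5
print(clean[1])               # ≈ 1.3 (sin + 0.5·cos(3π/2) + 0.3·sin(5π/2))
```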


Block 3: Generate Data

We then generate 2000 data points:

x = np.linspace(0, 100, 2000)
data = generate_function(x, noise_level=0.2)

Here, x is evenly spaced between 0 and 100. The function is applied to these points, with a small noise level of 0.2.


Block 4: Prepare Training Sequences

To prepare the dataset for training, we use sliding windows:

window_size = 20
predict_ahead = 5
X, y = [], []
for i in range(len(data) - window_size - predict_ahead):
    X.append(data[i:i+window_size])
    y.append(data[i+window_size:i+window_size+predict_ahead])

X = np.array(X)
y = np.array(y)
X = X.reshape((X.shape[0], X.shape[1], 1))

Each training input (X) contains the previous 20 values, and the corresponding output (y) contains the next 5 values. Finally, X is reshaped into the 3D format required by LSTMs: samples, timesteps, features.
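The shapes can be checked without training anything: with 2000 points, a 20-step window, and a 5-step horizon, there are 2000 − 20 − 5 = 1975 samples. A standalone sketch mirroring the windowing code (using a simple ramp as a stand-in for the generated signal):

```python
import numpy as np

data = np.arange(2000, dtype=float)  # stand-in for the noisy signal
window_size, predict_ahead = 20, 5

X, y = [], []
for i in range(len(data) - window_size - predict_ahead):
    X.append(data[i:i+window_size])
    y.append(data[i+window_size:i+window_size+predict_ahead])

X = np.array(X).reshape((-1, window_size, 1))
y = np.array(y)

print(X.shape)  # (1975, 20, 1)
print(y.shape)  # (1975, 5)
```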


Block 5: Build the LSTM Model

We now construct the model:

model = Sequential()
model.add(Input(shape=(window_size, 1)))
model.add(LSTM(64, activation='tanh'))
model.add(Dense(predict_ahead))
model.compile(optimizer='adam', loss='mse')

The model takes 20 values as input, passes them through an LSTM layer with 64 units, and then outputs 5 predicted values. It uses the Adam optimizer and mean squared error as the loss function.


Block 6: Train the Model

Training the model is done with:

history = model.fit(X, y, epochs=30, verbose=1, validation_split=0.2)

The model is trained for 30 epochs. Twenty percent of the dataset is reserved for validation.
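Keras takes the validation samples from the end of the data, before any shuffling. A quick check of the split arithmetic for the 1975 windows produced earlier:

```python
n_samples = 1975  # number of windows from Block 4
split = 0.2

# Keras trains on the first (1 - split) fraction of the samples
n_train = int(n_samples * (1 - split))
# ...and validates on the remaining tail
n_val = n_samples - n_train

print(n_train, n_val)  # 1580 395
```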


Block 7: Prediction Function

We define a function to make predictions starting at any given index:

def predict_from_index(model, data, x, start_index, window_size, predict_ahead):
    start_seq = data[start_index:start_index + window_size]
    test_input = start_seq.reshape((1, window_size, 1))
    prediction = model.predict(test_input, verbose=0).flatten()
    true_future = data[start_index+window_size:start_index+window_size+predict_ahead]
    return prediction, true_future

This function reshapes the last 20 values into the correct format, predicts the next 5 values, and also retrieves the true 5 values for comparison.


Block 8: Example Usage

Finally, we test the model by predicting future values:

start_index = 50
prediction, true_future = predict_from_index(model, data, x, start_index, window_size, predict_ahead)
print(f"\nPrediction from index {start_index+window_size} "
      f"to {start_index+window_size+predict_ahead-1}:")
print("Predicted:", prediction)
print("True :", true_future)

Here, we choose start_index = 50. The model predicts the next 5 values after this point, and we print both the predicted values and the actual true values.
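The index arithmetic is worth spelling out: with start_index = 50 and a 20-step window, the model reads data[50:70] and predicts indices 70 through 74. A quick check, no model required:

```python
start_index, window_size, predict_ahead = 50, 20, 5

input_indices = list(range(start_index, start_index + window_size))
target_indices = list(range(start_index + window_size,
                            start_index + window_size + predict_ahead))

print(input_indices[0], input_indices[-1])  # 50 69
print(target_indices)                       # [70, 71, 72, 73, 74]
```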

