Artificial Intelligence RB14-05: Digit Recognition, Practical Exercise

The AI learns to recognize handwritten digits (0–9) from images.

Step by step:

  1. Input:

    • The AI receives a picture of a digit (28×28 pixels).

    • Each pixel is just a number showing how dark/bright it is.

  2. Learning:

    • The AI goes through many examples (60,000 images) where it knows the correct answer (the label 0–9).

    • It adjusts its weights (internal numbers) so its guesses get closer to the correct labels.

    • This happens during training: backpropagation computes how much each weight contributed to the error, and the Adam optimizer uses that to update the weights.

  3. Output:

    • When you give it a new image, it outputs 10 probabilities (one for each digit).

    • Example: [0.01, 0.02, 0.05, 0.90, ...] → the highest value is at index 3 (counting from 0), so it predicts digit 3.

  4. Goal:

    • Minimize mistakes.

    • After training, it achieves about 98% accuracy on new images it has never seen before.
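
The input and output steps above can be sketched in plain Python. This is a minimal illustration, not the exercise's actual code: the helper names (`flatten_and_scale`, `softmax`, `predict_digit`, `cross_entropy`), the sample image, and the raw scores are all made up for the example, and scaling pixels from 0–255 down to [0, 1] is an assumed (though common) preprocessing choice.

```python
import math

def flatten_and_scale(image):
    """Turn a 28x28 grid of 0-255 pixel values into a flat list of 784 floats in [0, 1]."""
    return [pixel / 255.0 for row in image for pixel in row]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_digit(probabilities):
    """Return the index (0-9) of the largest probability."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])

def cross_entropy(probabilities, label):
    """Loss for one example: small when the correct digit gets a high probability."""
    return -math.log(probabilities[label])

# Hypothetical blank image with one bright pixel, just to show the shapes.
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255
vector = flatten_and_scale(image)

# Made-up raw scores for the 10 digits; index 3 has the largest score.
scores = [0.1, 0.5, 1.2, 4.0, 0.3, 0.2, 0.0, -0.5, 0.4, 0.1]
probabilities = softmax(scores)
predicted = predict_digit(probabilities)
```

Here `predicted` is 3 because index 3 holds the largest score. The loss for label 3 is lower than it would be for any other label, which is the direction training pushes the weights.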

The code you wrote uses a feed-forward Artificial Neural Network (ANN), also called a Multilayer Perceptron (MLP).

Details:

  • Input: Flattened MNIST images → each 28×28 image is reshaped into a 784-dimensional vector.

  • Hidden layers:

    • Dense(256, ReLU)

    • Dense(128, ReLU)

  • Output layer: Dense(10, Softmax) → gives probabilities for digits 0–9.

So the model is a fully connected MLP (not a CNN).
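
To make the layer sizes concrete, here is a dependency-free sketch of a forward pass through the same 784 → 256 → 128 → 10 stack. It assumes nothing about the original code beyond the layers listed above: the weights are random and untrained, and `init_layer`, `dense`, `forward`, and the initialization range are hypothetical choices for illustration. A real implementation would use a library such as Keras rather than Python lists.

```python
import math
import random

LAYER_SIZES = [784, 256, 128, 10]  # input, two hidden layers, output

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer (untrained)."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, weights, biases):
    """Fully connected layer: every output is a weighted sum of every input."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Replace negative values with zero."""
    return [max(0.0, v) for v in values]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(pixels, layers):
    """Dense + ReLU for the hidden layers, Dense + Softmax for the output."""
    activations = pixels
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))

rng = random.Random(0)
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

# Count trainable numbers: weights plus biases in each layer.
param_count = sum(n_in * n_out + n_out
                  for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]))

pixels = [rng.random() for _ in range(784)]  # stand-in for one flattened image
probabilities = forward(pixels, layers)
```

The count works out to 235,146 trainable numbers (200,960 + 32,896 + 1,290). These are the weights and biases that training adjusts.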

 

 

Spatial data means data where the position of each value matters and has relationships with its neighbors.

Examples:

  • Images → each pixel has meaning only in relation to nearby pixels (edges, corners, shapes).

  • Maps / GIS data → a location’s value (temperature, population, elevation) depends on surrounding areas.

  • Medical scans (X-ray, MRI, CT) → pixel/voxel arrangement encodes anatomy.

Non-spatial data (opposite):

  • Tabular data (Excel sheets: age, income, blood pressure). The order of columns or rows does not define relationships.

  • Feature vectors (already extracted numbers like embeddings).

In short:

  • Spatial data = has a geometry (2D, 3D, grid, sequence) where location matters.

  • Non-spatial data = a set of features with no geometric arrangement; their order doesn’t matter.
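
One way to see what "location matters" means is to keep an image's values but scatter their positions. The sketch below is a toy illustration under stated assumptions: the 8×8 image, the `neighbor_contrast` measure, and the `shuffle_positions` helper are invented for this example and are not part of the exercise.

```python
import random

def neighbor_contrast(grid):
    """Average absolute difference between horizontally adjacent values."""
    diffs = [abs(row[c] - row[c + 1])
             for row in grid for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def shuffle_positions(grid, rng):
    """Keep the same values but move them to random positions."""
    flat = [v for row in grid for v in row]
    rng.shuffle(flat)
    width = len(grid[0])
    return [flat[i:i + width] for i in range(0, len(flat), width)]

# An 8x8 toy image: dark left half, bright right half (one vertical edge).
image = [[0] * 4 + [1] * 4 for _ in range(8)]
shuffled = shuffle_positions(image, random.Random(0))

# Same multiset of values before and after shuffling.
same_values = (sorted(v for row in image for v in row)
               == sorted(v for row in shuffled for v in row))

contrast_before = neighbor_contrast(image)    # one edge per row of 7 pairs: 1/7
contrast_after = neighbor_contrast(shuffled)  # noisy: many neighbors now differ
```

Both grids hold exactly the same numbers, yet only the original contains a recognizable edge. For tabular data, reordering the columns loses nothing comparable, which is why it counts as non-spatial.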

 

A plain ANN (such as an MLP) can sometimes be a better choice than a CNN, but only under specific conditions:

  1. Non-spatial data

    • If your inputs are tabular data (numbers, features, categories) with no spatial or temporal structure, an ANN is usually the better choice.

    • Example: predicting house prices, credit scoring, sensor values.

  2. Very small datasets

    • CNNs need many samples to learn filters. With very little data, a small ANN may generalize better (or at least overfit less).

  3. Low-dimensional inputs

    • If your input has only a few features (e.g., 20–50 values), a CNN has no advantage.

    • An ANN is simpler and faster.

  4. When spatial structure is irrelevant

    • If the order of pixels/features doesn’t matter (e.g., shuffled or abstract features), a CNN loses its main advantage.

  5. As a classifier after feature extraction

    • Sometimes features are already extracted (e.g., embeddings from another model). In that case, a simple ANN on top of those features is usually a better fit than a CNN.
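
The point about feature order can be checked directly: a fully connected unit gives the same output if the features and their weights are reordered together, so it has no built-in notion of "neighboring" inputs. The following is a small sketch with made-up values; `dense_neuron` and the 20 random features are hypothetical stand-ins for a tabular input.

```python
import random

def dense_neuron(inputs, weights, bias):
    """One fully connected unit: a weighted sum over all inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

rng = random.Random(0)
features = [rng.random() for _ in range(20)]       # e.g. 20 tabular features
weights = [rng.uniform(-1, 1) for _ in range(20)]  # one weight per feature
bias = 0.1

# Reorder the feature columns, and reorder the weights the same way.
order = list(range(20))
rng.shuffle(order)
features_reordered = [features[i] for i in order]
weights_reordered = [weights[i] for i in order]

original = dense_neuron(features, weights, bias)
reordered = dense_neuron(features_reordered, weights_reordered, bias)
```

The two results agree up to floating-point rounding. A convolutional filter, by contrast, combines only adjacent inputs, so it depends on the column order carrying meaning.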

Rule of thumb:

  • Use a CNN when data has clear spatial/local structure (images, spectrograms).

  • Use an ANN when data is flat/tabular or when relationships are global, not local.
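
To illustrate what "local structure" means for a CNN, here is a minimal sketch of the sliding-filter operation CNN layers use (technically a cross-correlation). The toy image, the hand-picked edge kernel, and the `convolve_valid` helper are assumptions for illustration; a trained CNN would learn its own kernel values.

```python
def convolve_valid(image, kernel):
    """Slide a small kernel over the image; each output uses only a local patch."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(k) for j in range(k))
             for c in range(cols - k + 1)]
            for r in range(rows - k + 1)]

# Toy 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-picked 3x3 kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

response = convolve_valid(image, kernel)  # 4x4 map, nonzero only near the edge

weights_per_conv_filter = 3 * 3    # the same 9 weights are reused at every position
weights_per_dense_neuron = 6 * 6   # a dense unit needs one weight per pixel
```

Each row of `response` is `[0, 3, 3, 0]`: the filter fires only where the edge is, using 9 shared weights. That reuse pays off when nearby values are related, and it offers nothing when they are not.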
