Forward pass · Part 4 · 45 min · beginner

The first prediction

Turn the neuron into a perceptron, generate a V/H line dataset, and make first predictions.

Neural Network Fundamentals

Part 4: The Perceptron - First Prediction

The Brain's Decision Committee - Chapter 4

Previously: In Parts 1-3, our committee member learned:

How to read images as numbers (matrices)
How to weigh evidence and apply personal thresholds (weights & bias)
How to cast meaningful votes (activation functions)

Today's Mission: Our committee member is now fully equipped. It's time for their first real attempt at classifying lines! We'll build a complete Perceptron - the original neural network from 1958 - and watch it make predictions.

Spoiler: It won't go well at first. And that's exactly the point.

What You'll Learn in Part 4

By the end of this notebook, you will:

Understand the Perceptron - The first working neural network (Rosenblatt, 1958)
Generate a dataset - Create V/H line examples on-the-fly
Implement the forward pass - Input → Weighted Sum → Activation → Output
Build a Perceptron class - Clean, reusable code
Make predictions - Watch the untrained network guess
Understand why it fails - Random weights = random guesses

Prerequisites

Make sure you've completed:

Part 0 & 1: Welcome & Matrices (neural_network_fundamentals.ipynb)
Part 2: The First Committee Member (part_2_single_neuron.ipynb)
Part 3: Activation Functions (part_3_activation_functions.ipynb)

Setup: Import Dependencies

Let's import our tools and recreate the building blocks from previous notebooks.

cell 003

# =============================================================================
# PART 4: THE PERCEPTRON - SETUP
# =============================================================================

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, clear_output

# Try to import ipywidgets for interactive features
try:
    import ipywidgets as widgets
    WIDGETS_AVAILABLE = True
except ImportError:
    WIDGETS_AVAILABLE = False
    print("Note: ipywidgets not installed. Interactive features will be limited.")

# Set up matplotlib style
style_options = ['seaborn-v0_8-whitegrid', 'seaborn-whitegrid', 'ggplot', 'default']
for style in style_options:
    try:
        plt.style.use(style)
        break
    except OSError:
        continue

plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['font.size'] = 12
np.random.seed(42)  # For reproducible random numbers

# =============================================================================
# RECREATE OUR CANONICAL LINE IMAGES (from Parts 1-3)
# =============================================================================

# Vertical line: bright pixels in the middle column
vertical_line = np.array([
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0]
])

# Horizontal line: bright pixels in the middle row
horizontal_line = np.array([
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0]
])

# Flattened versions (9 pixels as a 1D array)
vertical_flat = vertical_line.flatten()
horizontal_flat = horizontal_line.flatten()

print("Setup complete!")
print("="*60)
print("\nOur canonical images (as 3x3 matrices):")
print(f"\nVertical Line:            Horizontal Line:")
print(f"  {vertical_line[0]}                 {horizontal_line[0]}")
print(f"  {vertical_line[1]}                 {horizontal_line[1]}")
print(f"  {vertical_line[2]}                 {horizontal_line[2]}")
print(f"\nAs flattened vectors (9 pixels):")
print(f"  Vertical:   {vertical_flat}")
print(f"  Horizontal: {horizontal_flat}")

4.1 What is a Perceptron?

The Perceptron, introduced by Frank Rosenblatt in 1958, was one of the first neural networks that could learn from examples. It's also about the simplest neural network possible - just a single neuron!

Why Start with the Perceptron?

Before diving in, let's understand why the Perceptron matters:

Question	Answer
What problem does it solve?	Binary classification (yes/no, cat/dog, vertical/horizontal)
Why is it fundamental?	ALL neural networks are built from Perceptron-like units
Why learn it first?	Simple enough to understand completely, complex enough to be useful

The Key Insight: Once you understand ONE neuron, you understand the building block of ALL deep learning. Everything else is just more neurons connected together!

Historical Significance

The Perceptron was revolutionary. Here was a machine that could learn to classify patterns from examples instead of being explicitly programmed with rules. A 1958 New York Times report on the project famously said the Navy expected it would eventually "be able to walk, talk, see, write, reproduce itself and be conscious of its existence."

(Spoiler: We're still working on most of those.)

Why This Architecture?

The Perceptron's design is inspired by biological neurons:

Biological Neuron	Perceptron Equivalent	Purpose
Dendrites (receive signals)	Inputs (x)	Receive information
Synapses (connection strength)	Weights (w)	Determine importance
Cell body (integrates)	Weighted sum (Σ)	Combine all inputs
Axon hillock (threshold)	Bias (b)	Decision threshold
Axon (fires/doesn't fire)	Activation (f)	Output a decision

This isn't just an analogy - it's the actual inspiration! Rosenblatt was trying to model how real neurons make decisions.

The Architecture

A Perceptron is exactly what we built in Parts 2-3:

    INPUTS (x)           WEIGHTS (w)              SUM               ACTIVATION          OUTPUT
    ┌─────┐              ┌─────┐                                    
    │ x₁  │──────────────│ w₁  │─────┐                              
    └─────┘              └─────┘     │                              
    ┌─────┐              ┌─────┐     │         ┌─────┐              ┌─────┐
    │ x₂  │──────────────│ w₂  │─────┼────────▶│  Σ  │──────────────│ f() │────────▶  ŷ
    └─────┘              └─────┘     │         │+bias│              └─────┘
    ┌─────┐              ┌─────┐     │         └─────┘              
    │ x₃  │──────────────│ w₃  │─────┘                              
    └─────┘              └─────┘

The Complete Formula (Everything Together!)

$\hat{y} = f\left(\sum_{i=1}^{n} w_{i} \cdot x_{i} + b\right) = f(w \cdot x + b)$

Where:

x = input vector (our flattened 9-pixel image)
w = weight vector (9 weights, one per pixel)
b = bias (the personal threshold)
Σ = weighted sum (dot product + bias)
f() = activation function (sigmoid for us; Rosenblatt's original Perceptron used a hard step function)
ŷ = predicted output (probability it's a vertical line)
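
To make the formula concrete, here is a minimal sketch that evaluates it once by hand. The weights and bias are hypothetical values picked for illustration (they favor the middle column); they are not the trained values the notebook arrives at later.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Flattened 3x3 images: middle column bright, and middle row bright.
x_vertical = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0], dtype=float)
x_horizontal = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=float)

# Hypothetical hand-picked parameters: +1 on the middle column, 0 elsewhere.
w = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0], dtype=float)
b = -1.5

z_v = np.dot(w, x_vertical) + b      # 3.0 - 1.5 = 1.5
z_h = np.dot(w, x_horizontal) + b    # 1.0 - 1.5 = -0.5

y_hat_v = sigmoid(z_v)               # about 0.82
y_hat_h = sigmoid(z_h)               # about 0.38

print(f"vertical:   z = {z_v:+.2f}, y_hat = {y_hat_v:.3f}")
print(f"horizontal: z = {z_h:+.2f}, y_hat = {y_hat_h:.3f}")
```

With these particular weights the vertical image scores above 0.5 and the horizontal one below it. Random weights give no such guarantee, which is the point of the rest of this part.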

Committee Analogy: The First Working Committee Member

"Our committee member is now fully trained in procedure. They can:

Read the evidence (input)
Weigh each piece by importance (weights)
Apply their personal threshold (bias)
Cast a meaningful vote (activation)

Now it's time for their first REAL case!"

4.2 Generating Our Dataset

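To test our Perceptron, we need examples to classify. Instead of loading a dataset from a file, we'll generate one on-the-fly. Generating the data ourselves gives us full control over how many examples we get, how the classes are balanced, and how much noise they contain.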

First, What IS a Dataset?

A dataset is a collection of examples used to train or test a machine learning model. Each example has:

Features (X): The input data (for us, 9 pixel values)
Label (y): The correct answer (for us, 0 or 1)

This is called supervised learning because we "supervise" the model by giving it the right answers to learn from.

Term	Meaning	Our Example
Sample	One example (input + label)	One 3x3 image + whether it's vertical
Feature	One piece of input data	One pixel value
Label	The correct answer	0 (horizontal) or 1 (vertical)
Dataset	Collection of samples	100 images with their labels
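
The terms in the table map directly onto array shapes. The tiny two-sample dataset below is a hand-built illustration, not the output of the generator defined later in this section.

```python
import numpy as np

# Two samples, each with 9 features (the flattened 3x3 pixels).
X = np.array([
    [0, 1, 0, 0, 1, 0, 0, 1, 0],   # vertical line, middle column
    [0, 0, 0, 1, 1, 1, 0, 0, 0],   # horizontal line, middle row
])
y = np.array([1, 0])               # labels: 1 = vertical, 0 = horizontal

n_samples, n_features = X.shape
print(f"Dataset: {n_samples} samples, {n_features} features each")
print(f"Sample 0 -> features {X[0]}, label {y[0]}")
print(f"One feature (center pixel of sample 0): {X[0, 4]}")
```

Rows of X are samples, columns of X are features, and y holds one label per row.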

Why Do We Need Datasets?

Machine learning models learn by example, not by rules:

Traditional Programming	Machine Learning
Human writes rules	Human provides examples
"If middle column is bright, it's vertical"	Model sees 50 vertical + 50 horizontal lines
Rules are explicit	Model discovers patterns itself
Hard to handle edge cases	Learns from variety in data

The magic: Instead of us figuring out the rules, the model discovers them from data!
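
For contrast, here is roughly what the "traditional programming" column looks like in code. The function name and the 0.5 brightness cutoff are assumptions made for this sketch; the rule itself is written by hand rather than learned.

```python
import numpy as np

def rule_based_is_vertical(image):
    """Hand-written rule: call it vertical if any column is entirely bright."""
    image = np.asarray(image).reshape(3, 3)
    bright = image > 0.5                          # assumed brightness cutoff
    return int(np.any(np.all(bright, axis=0)))    # axis=0 checks each column

vertical = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])
horizontal = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]])

print(rule_based_is_vertical(vertical))     # 1
print(rule_based_is_vertical(horizontal))   # 0
```

The rule works on clean images, but someone had to think of it, and a single dimmed pixel in the line would break it. A learned model is meant to find a workable rule from the examples instead.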

Our Classification Task

Image Type	Label (y)	Meaning
Vertical Line	1	"This is a vertical line"
Horizontal Line	0	"This is a horizontal line"

Dataset Requirements

For a proper machine learning experiment, we need:

Multiple examples - Not just 2 images, but many variations
Balanced classes - Equal numbers of vertical and horizontal
Some variety - Lines in different positions
Optional noise - To make the problem harder (later)

The Dataset Generator Function

We'll create a function that generates any number of V/H line examples:

cell 006

# =============================================================================
# DATASET GENERATOR: Create V/H Line Examples On-The-Fly
# =============================================================================

def generate_line_dataset(n_samples=100, noise_level=0.0, seed=None):
    """
    Generate a dataset of vertical and horizontal line images.
    
    Parameters:
    -----------
    n_samples : int
        Total number of samples (will be split evenly between V and H)
    noise_level : float (0.0 to 0.5)
        Amount of random noise to add (0.0 = clean, 0.3 = noisy)
    seed : int or None
        Random seed for reproducibility
    
    Returns:
    --------
    X : numpy array of shape (n_samples, 9)
        Flattened 3x3 images
    y : numpy array of shape (n_samples,)
        Labels: 1 for vertical, 0 for horizontal
    """
    
    if seed is not None:
        np.random.seed(seed)
    
    X = []  # Will hold all images (as flattened arrays)
    y = []  # Will hold all labels
    
    # Generate n_samples/2 vertical lines and n_samples/2 horizontal lines
    for i in range(n_samples):
        
        if i < n_samples // 2:
            # ----- VERTICAL LINE (label = 1) -----
            # Pick a random column (0, 1, or 2) for variety
            col = np.random.randint(0, 3)
            
            # Create blank 3x3 image
            image = np.zeros((3, 3))
            
            # Fill the chosen column with 1s
            image[:, col] = 1
            
            # Add noise if requested
            if noise_level > 0:
                image = image + np.random.randn(3, 3) * noise_level
                image = np.clip(image, 0, 1)  # Keep values in [0, 1]
            
            X.append(image.flatten())
            y.append(1)  # Label: Vertical
            
        else:
            # ----- HORIZONTAL LINE (label = 0) -----
            # Pick a random row (0, 1, or 2) for variety
            row = np.random.randint(0, 3)
            
            # Create blank 3x3 image
            image = np.zeros((3, 3))
            
            # Fill the chosen row with 1s
            image[row, :] = 1
            
            # Add noise if requested
            if noise_level > 0:
                image = image + np.random.randn(3, 3) * noise_level
                image = np.clip(image, 0, 1)
            
            X.append(image.flatten())
            y.append(0)  # Label: Horizontal
    
    # Convert to numpy arrays
    X = np.array(X)
    y = np.array(y)
    
    # Shuffle the dataset (so V and H are mixed, not grouped)
    shuffle_idx = np.random.permutation(n_samples)
    X = X[shuffle_idx]
    y = y[shuffle_idx]
    
    return X, y

print("Dataset generator function created!")
print("="*60)

cell 007

# =============================================================================
# GENERATE AND VISUALIZE OUR DATASET
# =============================================================================

# Generate 20 clean examples for visualization
X_small, y_small = generate_line_dataset(n_samples=20, noise_level=0.0, seed=42)

print("DATASET GENERATED!")
print("="*60)
print(f"\nDataset shape: X = {X_small.shape}, y = {y_small.shape}")
print(f"  - {X_small.shape[0]} total samples")
print(f"  - Each sample has {X_small.shape[1]} features (3x3 = 9 pixels)")
print(f"\nLabel distribution:")
print(f"  - Vertical lines (y=1): {np.sum(y_small == 1)} samples")
print(f"  - Horizontal lines (y=0): {np.sum(y_small == 0)} samples")

# Show first few samples
print("\n" + "="*60)
print("FIRST 6 SAMPLES:")
print("="*60)

for i in range(6):
    image = X_small[i].reshape(3, 3)
    label = y_small[i]
    label_name = "VERTICAL" if label == 1 else "HORIZONTAL"
    print(f"\nSample {i}: Label = {label} ({label_name})")
    print(f"  {image[0]}")
    print(f"  {image[1]}")
    print(f"  {image[2]}")

cell 008

# =============================================================================
# VISUALIZE SAMPLE IMAGES FROM OUR DATASET
# =============================================================================

# Show a grid of 10 sample images
fig, axes = plt.subplots(2, 5, figsize=(12, 5))

for i, ax in enumerate(axes.flat):
    image = X_small[i].reshape(3, 3)
    label = y_small[i]
    label_name = "VERTICAL" if label == 1 else "HORIZONTAL"
    
    ax.imshow(image, cmap='Blues', vmin=0, vmax=1)
    ax.set_title(f"{label_name}\n(y={label})", fontsize=10)
    ax.axis('off')
    
    # Add grid lines
    for j in range(4):
        ax.axhline(j - 0.5, color='gray', linewidth=0.5)
        ax.axvline(j - 0.5, color='gray', linewidth=0.5)

plt.suptitle('Sample Images from Our Generated Dataset', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("\nNotice: The lines can appear in different positions (left/center/right columns,")
print("top/center/bottom rows). This variety makes our dataset more realistic!")
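
The cells above only use noise_level=0.0. As a rough sketch of what the noise step does to a single image, the snippet below applies the same recipe as the generator (add scaled Gaussian noise, then clip to [0, 1]). It uses a local np.random.default_rng generator, which is an assumption of this sketch, so that it does not disturb the global seed used elsewhere in the notebook.

```python
import numpy as np

# Local generator: leaves np.random's global state (and the notebook's seed) alone.
rng = np.random.default_rng(0)

clean = np.zeros((3, 3))
clean[:, 1] = 1                       # vertical line in the middle column

noise_level = 0.3
noise = rng.standard_normal((3, 3)) * noise_level
noisy = np.clip(clean + noise, 0, 1)  # keep pixel values in [0, 1]

print("Clean image:")
print(clean)
print("Noisy image (rounded):")
print(np.round(noisy, 2))
```

Because of the clipping, background pixels can only get brighter and line pixels can only get dimmer, so higher noise levels make the two classes look more alike.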

4.3 The Forward Pass: Step-by-Step

The forward pass is how a neural network makes a prediction. Information flows forward from input to output.

What is the Forward Pass?

The term "forward pass" comes from the direction information flows:

INPUT → WEIGHTS × INPUT → ADD BIAS → ACTIVATION → OUTPUT
  x    →    w · x       →   + b    →    f(z)    →   ŷ

Term	Meaning
Forward	Information flows left-to-right, input-to-output
Pass	One complete journey through the network
Inference	Another name for making predictions (vs. training)

Why "Forward"? Later in Part 5, we'll see the backward pass where error flows in the opposite direction. Together, they form the complete learning process!

Forward Pass vs Training

It's important to understand when each happens:

Forward Pass (Inference)	Training
Make a prediction	Learn from mistakes
Uses current weights	Updates the weights
Fast (one direction)	Slower (forward + backward)
Used on its own after training	Used to create the model (each step includes a forward pass)
"What do I think this is?"	"How can I do better?"

Right now, we're just doing the forward pass - making predictions. Training comes in Part 5!

The Four Steps of a Forward Pass

Step	Operation	Formula	Purpose
1	Receive Input	x	Get the flattened image (9 values)
2	Weighted Sum	z = w · x	Compute dot product with weights
3	Add Bias	z = z + b	Add the personal threshold
4	Apply Activation	ŷ = f(z)	Convert score to meaningful output

Let's trace through this step-by-step with actual numbers.

Committee Analogy

"The forward pass is the committee member reading a case file:

They receive the evidence (input)
They multiply each piece by their priority (weights)
They add their personal standard (bias)
They cast their vote (activation)"

Let's see this in code with EVERY step shown:

cell 010

# =============================================================================
# THE FORWARD PASS: Step-by-Step Walkthrough
# =============================================================================

# Define the sigmoid activation function (from Part 3)
def sigmoid(z):
    """Sigmoid activation: squashes any value to range (0, 1)."""
    return 1 / (1 + np.exp(-z))

# Let's use our canonical vertical line as the input
x = vertical_flat.copy()

# Create some random weights (as if the Perceptron is untrained)
np.random.seed(123)  # For reproducibility
w = np.random.randn(9) * 0.5  # 9 random weights
b = np.random.randn() * 0.1    # 1 random bias

print("="*70)
print("FORWARD PASS: Step-by-Step with Real Numbers")
print("="*70)

# ----- STEP 1: Receive Input -----
print("\n┌─────────────────────────────────────────────────────────────────────┐")
print("│ STEP 1: Receive Input                                               │")
print("└─────────────────────────────────────────────────────────────────────┘")
print(f"\nInput image (as 3x3 grid):")
print(f"  {x.reshape(3,3)[0]}")
print(f"  {x.reshape(3,3)[1]}")
print(f"  {x.reshape(3,3)[2]}")
print(f"\nFlattened input vector x:")
print(f"  x = {x}")

# ----- STEP 2: Weighted Sum (Dot Product) -----
print("\n┌─────────────────────────────────────────────────────────────────────┐")
print("│ STEP 2: Weighted Sum (Dot Product)                                  │")
print("└─────────────────────────────────────────────────────────────────────┘")
print(f"\nWeights vector w:")
print(f"  w = [{', '.join([f'{wi:.3f}' for wi in w])}]")

# Show element-wise multiplication
print(f"\nElement-wise products (x[i] × w[i]):")
products = x * w
print(f"  = [{', '.join([f'{p:.3f}' for p in products])}]")

# Sum the products
dot_product = np.sum(products)
print(f"\nSum of products (the dot product):")
print(f"  w · x = {dot_product:.4f}")

# ----- STEP 3: Add Bias -----
print("\n┌─────────────────────────────────────────────────────────────────────┐")
print("│ STEP 3: Add Bias                                                    │")
print("└─────────────────────────────────────────────────────────────────────┘")
print(f"\nBias value:")
print(f"  b = {b:.4f}")
print(f"\nPre-activation value z:")
z = dot_product + b
print(f"  z = (w · x) + b")
print(f"  z = {dot_product:.4f} + {b:.4f}")
print(f"  z = {z:.4f}")

# ----- STEP 4: Apply Activation -----
print("\n┌─────────────────────────────────────────────────────────────────────┐")
print("│ STEP 4: Apply Activation (Sigmoid)                                  │")
print("└─────────────────────────────────────────────────────────────────────┘")
print(f"\nApplying sigmoid to z = {z:.4f}:")
y_hat = sigmoid(z)
print(f"  ŷ = sigmoid(z) = 1 / (1 + e^(-z))")
print(f"  ŷ = 1 / (1 + e^(-{z:.4f}))")
print(f"  ŷ = {y_hat:.4f}")

# ----- FINAL RESULT -----
print("\n" + "="*70)
print("FORWARD PASS COMPLETE!")
print("="*70)
print(f"\nFinal output: ŷ = {y_hat:.4f}")
print(f"\nInterpretation: The Perceptron is {y_hat*100:.1f}% confident this is a VERTICAL line.")
print(f"\nPrediction: {'VERTICAL (y=1)' if y_hat >= 0.5 else 'HORIZONTAL (y=0)'}")
print(f"Actual label: VERTICAL (y=1)")
print(f"{'✓ Correct!' if y_hat >= 0.5 else '✗ Wrong!'}")
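
The walkthrough handles one image at a time. The same four steps can also run on many images at once by stacking them as rows of a matrix. This is a minimal sketch with three hand-built images and its own random weights (drawn from a local generator, so the values differ from the ones printed above).

```python
import numpy as np

def sigmoid(z):
    """Squash any real number (or array of numbers) into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Three flattened 3x3 images stacked as rows: shape (3, 9).
X = np.array([
    [0, 1, 0, 0, 1, 0, 0, 1, 0],   # vertical, middle column
    [0, 0, 0, 1, 1, 1, 0, 0, 0],   # horizontal, middle row
    [1, 0, 0, 1, 0, 0, 1, 0, 0],   # vertical, left column
], dtype=float)

rng = np.random.default_rng(123)
w = rng.standard_normal(9) * 0.5   # untrained random weights
b = 0.0

z = X @ w + b                      # one weighted sum per image, shape (3,)
y_hat = sigmoid(z)                 # one probability per image
predictions = (y_hat >= 0.5).astype(int)

# Sanity check: identical to looping over the images one by one.
looped = np.array([sigmoid(np.dot(w, x) + b) for x in X])
assert np.allclose(y_hat, looped)

print("Probabilities:", np.round(y_hat, 3))
print("Predictions:  ", predictions)
```

The matrix product X @ w computes every dot product in one call, which is how the forward pass is usually written once a dataset has more than a handful of samples.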

4.4 Building the Perceptron Class

Now let's package everything into a clean, reusable Perceptron class. This is how real neural networks are implemented - as modular, reusable code.

Why Use a Class?

In programming, a class is a blueprint for creating objects. For neural networks, classes help us:

Benefit	Explanation
Organization	Keep weights, bias, and methods together
Reusability	Create multiple Perceptrons easily
State	Remember weights between method calls
Readability	`perceptron.predict(x)` is clearer than raw math

What Our Perceptron Needs

Component	What It Does
`__init__()`	Initialize weights and bias (randomly)
`forward()`	Compute the forward pass (returns probability)
`predict()`	Make a binary decision (0 or 1)

Why Random Initialization?

Before training, we need some starting values for weights. Why random?

Alternative	Problem
All zeros	Nothing tells units apart: in a multi-neuron layer, every neuron would output the same thing!
All ones	Large weighted sums would push the sigmoid toward saturation
Same value everywhere	Neurons in a layer would all receive identical updates
Random small values	✓ Breaks symmetry, allows diverse learning

Key Insight: The SPECIFIC random values don't matter much - training will adjust them. But they must be:

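Small (here, drawn from a normal distribution scaled by 0.1, so most fall between -0.2 and 0.2) to avoid saturating the sigmoid
Different from each other to allow diverse learning

The `* 0.1` scale factor keeps initial outputs near 0.5 (the middle of the sigmoid), where the sigmoid's slope is steepest and learning is fastest.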
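
One way to see the effect of the scale is to try many random initializations at a few different scales and measure how far the untrained output lands from 0.5. This is an illustrative experiment, not part of the original notebook; the scales 1.0 and 10.0 are arbitrary comparison points.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)     # local generator, independent of the notebook's seed
x = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0], dtype=float)   # vertical line

spreads = {}
for scale in (0.1, 1.0, 10.0):
    # 1000 independent random weight vectors at this scale (bias fixed at 0).
    W = rng.standard_normal((1000, 9)) * scale
    outputs = sigmoid(W @ x)
    # Average distance of the untrained output from 0.5.
    spreads[scale] = float(np.mean(np.abs(outputs - 0.5)))
    print(f"scale={scale:>4}: mean |output - 0.5| = {spreads[scale]:.3f}")
```

At scale 0.1 the outputs stay close to 0.5. At larger scales they drift toward 0 or 1, where the sigmoid is nearly flat and gives training very little signal to work with.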

The Core Math (Keep It Simple!)

All the math fits in just three lines:

Forward pass: z = np.dot(weights, x) + bias

Activation: output = 1 / (1 + np.exp(-z))

Prediction: 1 if output >= 0.5 else 0

Understanding the Threshold (0.5)

The sigmoid outputs a probability between 0 and 1. To make a decision, we need a threshold:

Output	Decision Rule	Prediction
Below 0.5	"Probably NOT vertical"	0 (Horizontal)
0.5 and above	"Probably IS vertical"	1 (Vertical)

Why 0.5? It's the natural midpoint - if the model is at least 50% confident it's vertical, we call it vertical.

Note: In some applications, you might use a different threshold (e.g., 0.7 for "high confidence only"). But 0.5 is the standard starting point.
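
Here is a small sketch of how the threshold changes the decision. The probabilities are made-up values and the decide helper is a hypothetical name used only for this illustration.

```python
import numpy as np

# Hypothetical sigmoid outputs for five images.
probabilities = np.array([0.12, 0.49, 0.50, 0.65, 0.91])

def decide(probs, threshold=0.5):
    """Return 1 (vertical) where the probability meets the threshold, else 0."""
    return (probs >= threshold).astype(int)

print(decide(probabilities))         # [0 0 1 1 1]
print(decide(probabilities, 0.7))    # [0 0 0 0 1]
```

Note that an output of exactly 0.5 counts as vertical under the >= rule, and that raising the threshold to 0.7 turns the 0.65 case into a "horizontal" call.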

cell 012

# =============================================================================
# THE PERCEPTRON CLASS: Clean, Reusable Implementation
# =============================================================================

class Perceptron:
    """
    A single-layer Perceptron for binary classification.
    
    This is the simplest possible neural network - just one neuron!
    
    Attributes:
        n_inputs (int): Number of input features (9 for our 3x3 images)
        weights (array): One weight per input feature
        bias (float): The threshold/offset term
    """
    
    def __init__(self, n_inputs):
        """
        Initialize the Perceptron with random weights and bias.
        
        Parameters:
            n_inputs: Number of input features (pixels in our image)
        """
        # Random weights, small values centered around 0
        self.weights = np.random.randn(n_inputs) * 0.1
        
        # Bias starts at 0
        self.bias = 0.0
        
        # Store for reference
        self.n_inputs = n_inputs
        
        # Storage for debugging/visualization
        self.last_z = None    # Pre-activation value
        self.last_output = None  # Final output
    
    def forward(self, x):
        """
        Compute the forward pass - make a prediction.
        
        Parameters:
            x: Input array (can be 2D image or 1D flattened)
        
        Returns:
            float: Probability between 0 and 1
        """
        # Ensure x is a 1D array
        x = np.array(x).flatten()
        
        # STEP 1 & 2: Weighted sum + bias
        # Formula: z = w · x + b
        self.last_z = np.dot(self.weights, x) + self.bias
        
        # STEP 3: Apply sigmoid activation
        # Formula: output = 1 / (1 + e^(-z))
        self.last_output = 1 / (1 + np.exp(-self.last_z))
        
        return self.last_output
    
    def predict(self, x):
        """
        Make a binary prediction (0 or 1).
        
        Parameters:
            x: Input array
        
        Returns:
            int: 0 (horizontal) or 1 (vertical)
        """
        probability = self.forward(x)
        return 1 if probability >= 0.5 else 0
    
    def __repr__(self):
        return f"Perceptron(inputs={self.n_inputs})"

# Create our Perceptron!
print("="*60)
print("PERCEPTRON CLASS CREATED!")
print("="*60)

# Instantiate a Perceptron for 9 inputs (3x3 = 9 pixels)
perceptron = Perceptron(n_inputs=9)

print(f"\nOur Perceptron: {perceptron}")
print(f"\nInitial weights (random, untrained):")
print(f"  Shape: {perceptron.weights.shape}")
print(f"  Values: [{', '.join([f'{w:.3f}' for w in perceptron.weights])}]")
print(f"\nInitial bias: {perceptron.bias}")
print("\nThe Perceptron is ready, but completely UNTRAINED!")
print("Its weights are random - it doesn't know what a vertical line looks like.")
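The `forward` method above handles one image at a time. As a rough sketch of the same formula applied to many images at once, the function below stacks flattened images into a matrix and computes every weighted sum with a single matrix product. The name `forward_batch` and the use of `np.random.default_rng` are assumptions made for this example, not part of the class above.

```python
import numpy as np

def forward_batch(weights, bias, X):
    """Forward pass for many flattened images.

    X has shape (n_samples, n_inputs); the result has shape (n_samples,).
    """
    X = np.asarray(X, dtype=float)
    z = X @ weights + bias           # one weighted sum per row of X
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid applied element-wise

# Small random weights, in the same spirit as the Perceptron's initialization
rng = np.random.default_rng(0)
weights = rng.standard_normal(9) * 0.1
bias = 0.0

vertical = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]).flatten()
horizontal = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]).flatten()
X = np.stack([vertical, horizontal])  # shape (2, 9)

probs = forward_batch(weights, bias, X)
labels = (probs >= 0.5).astype(int)   # threshold at 0.5, like predict()

print(probs.shape)  # (2,)
print(labels)
```

Each row of `X` gets exactly the same treatment as a single call to `forward`, so the two approaches should agree; the batch version just avoids a Python loop.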

4.5 Initial Predictions: The Confused Perceptron

Now the moment of truth! Let's see how our untrained Perceptron performs.

What is Accuracy?

Accuracy is the simplest way to measure how well a model performs:

$\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \times 100\%$

For example:

80 correct out of 100 = 80% accuracy
50 correct out of 100 = 50% accuracy
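The accuracy formula translates almost word for word into NumPy. This is a small sketch; the function name `accuracy_percent` and the tiny label lists are made up for illustration.

```python
import numpy as np

def accuracy_percent(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred) * 100)

y_true = [1, 0, 1, 0]  # hypothetical true labels
y_pred = [1, 0, 0, 0]  # hypothetical predictions (third one is wrong)

print(accuracy_percent(y_true, y_pred))  # 75.0
```

`y_true == y_pred` produces an array of True/False values, and taking the mean counts True as 1 and False as 0, which is exactly "correct divided by total".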

The Baseline: What's "Random Guessing"?

For any classification task, there's a baseline accuracy - what you'd get by guessing randomly:

Task Type	Classes	Random Baseline
Binary (yes/no)	2	50%
3-way choice	3	33%
10-way choice	10	10%

Our task is binary (vertical vs horizontal), so random guessing gives 50%.

Why this matters: If your model scores about 50% on a balanced binary task (equal numbers of each class), it has learned nothing useful. It's no better than flipping a coin!
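To see where the 50% baseline comes from, here is a quick simulation sketch. It assumes a perfectly balanced set of labels; the variable names and the sample count are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000

# Balanced labels: half 0 (horizontal), half 1 (vertical)
y_true = np.array([0, 1] * (n_samples // 2))

# Strategy 1: flip a fair coin for every sample
coin_flips = rng.integers(0, 2, size=n_samples)
coin_accuracy = np.mean(coin_flips == y_true) * 100

# Strategy 2: always answer "vertical", no matter the input
always_vertical = np.ones(n_samples, dtype=int)
constant_accuracy = np.mean(always_vertical == y_true) * 100

print(f"Coin flipping:     {coin_accuracy:.1f}%")
print(f"Always 'vertical': {constant_accuracy:.1f}%")  # 50.0%
```

Both strategies ignore the image completely and still land at roughly 50%, so a model has to beat this number clearly before we can say it has learned anything.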

What We Expect

Since the weights are random, the Perceptron has no idea what it's doing. It's like asking someone who's never seen a line before to classify them.

Expected accuracy: Around 50% (random guessing for binary classification)

Committee Analogy

"Our committee member has been trained in procedure, but has never seen an actual case. They're about to make judgments based on completely arbitrary priorities. The results won't be pretty..."

cell 014

# =============================================================================
# TESTING THE UNTRAINED PERCEPTRON ON OUR CANONICAL EXAMPLES
# =============================================================================

print("="*70)
print("TESTING UNTRAINED PERCEPTRON")
print("="*70)

# Test on our canonical vertical line
print("\n┌─────────────────────────────────────────────────────────────────────┐")
print("│ Test 1: VERTICAL LINE                                               │")
print("└─────────────────────────────────────────────────────────────────────┘")
print(f"\nImage (3x3):")
print(f"  {vertical_line[0]}")
print(f"  {vertical_line[1]}")
print(f"  {vertical_line[2]}")

prob_vertical = perceptron.forward(vertical_flat)
pred_vertical = perceptron.predict(vertical_flat)
actual_vertical = 1

print(f"\nForward pass calculation:")
print(f"  z = w · x + b = {perceptron.last_z:.4f}")
print(f"  output = sigmoid(z) = {prob_vertical:.4f}")
print(f"\nPrediction: {pred_vertical} ({'VERTICAL' if pred_vertical == 1 else 'HORIZONTAL'})")
print(f"Actual:     {actual_vertical} (VERTICAL)")
print(f"Result:     {'CORRECT!' if pred_vertical == actual_vertical else 'WRONG!'}")

# Test on our canonical horizontal line
print("\n┌─────────────────────────────────────────────────────────────────────┐")
print("│ Test 2: HORIZONTAL LINE                                             │")
print("└─────────────────────────────────────────────────────────────────────┘")
print(f"\nImage (3x3):")
print(f"  {horizontal_line[0]}")
print(f"  {horizontal_line[1]}")
print(f"  {horizontal_line[2]}")

prob_horizontal = perceptron.forward(horizontal_flat)
pred_horizontal = perceptron.predict(horizontal_flat)
actual_horizontal = 0

print(f"\nForward pass calculation:")
print(f"  z = w · x + b = {perceptron.last_z:.4f}")
print(f"  output = sigmoid(z) = {prob_horizontal:.4f}")
print(f"\nPrediction: {pred_horizontal} ({'VERTICAL' if pred_horizontal == 1 else 'HORIZONTAL'})")
print(f"Actual:     {actual_horizontal} (HORIZONTAL)")
print(f"Result:     {'CORRECT!' if pred_horizontal == actual_horizontal else 'WRONG!'}")

cell 015

# =============================================================================
# TESTING ON THE FULL DATASET: Calculate Accuracy
# =============================================================================

# Generate a larger dataset for proper testing
X_test, y_test = generate_line_dataset(n_samples=100, noise_level=0.0, seed=99)

print("="*70)
print("FULL DATASET EVALUATION")
print("="*70)
print(f"\nDataset: {len(y_test)} samples ({sum(y_test)} vertical, {len(y_test) - sum(y_test)} horizontal)")

# Make predictions on all samples
predictions = []
correct = 0

for i in range(len(X_test)):
    pred = perceptron.predict(X_test[i])
    predictions.append(pred)
    if pred == y_test[i]:
        correct += 1

accuracy = correct / len(y_test) * 100

# Display results table (first 10 samples)
print("\n" + "-"*70)
print("FIRST 10 PREDICTIONS:")
print("-"*70)
print(f"{'Sample':<8} {'Actual':<12} {'Predicted':<12} {'Result':<10}")
print("-"*70)

for i in range(10):
    actual_name = "VERTICAL" if y_test[i] == 1 else "HORIZONTAL"
    pred_name = "VERTICAL" if predictions[i] == 1 else "HORIZONTAL"
    result = "Correct" if predictions[i] == y_test[i] else "WRONG"
    symbol = "+" if predictions[i] == y_test[i] else "X"
    print(f"  {i:<6} {actual_name:<12} {pred_name:<12} {symbol} {result}")

# Summary
print("\n" + "="*70)
print("ACCURACY SUMMARY")
print("="*70)
print(f"\n  Total samples:  {len(y_test)}")
print(f"  Correct:        {correct}")
print(f"  Wrong:          {len(y_test) - correct}")
print(f"\n  ACCURACY: {accuracy:.1f}%")
print(f"\n  Expected (random guessing): ~50%")
print(f"  Difference from random: {abs(accuracy - 50):.1f}%")

if accuracy > 55:
    print("\n  Hmm, slightly better than random - got lucky with the random weights!")
elif accuracy < 45:
    print("\n  Worse than random! The weights are actually hurting performance.")
else:
    print("\n  As expected: basically random guessing. The Perceptron is CONFUSED!")

4.6 Why It's Wrong: Understanding the Problem

Our Perceptron most likely scored somewhere near 50% accuracy (the exact number depends on the random weights it happened to draw), which is basically coin-flipping. Why?

Understanding What Weights Actually DO

The weights are the Perceptron's knowledge. Each weight answers the question:

"How important is this input for making the decision?"

Weight Value	Meaning
Large positive (+1.0)	"This input STRONGLY suggests class 1"
Small positive (+0.1)	"This input slightly suggests class 1"
Near zero (0.0)	"This input doesn't matter"
Small negative (-0.1)	"This input slightly suggests class 0"
Large negative (-1.0)	"This input STRONGLY suggests class 0"
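The table above can be made concrete by looking at what each weight contributes to the weighted sum for one image. The weight values below are hypothetical, picked only to cover the rows of the table.

```python
import numpy as np

# Hypothetical weights, one per pixel of a 3x3 image (illustrative values only)
weights = np.array([
    [0.1, 1.0, -0.1],
    [0.0, 1.0, -1.0],
    [0.1, 1.0,  0.0],
])

# Canonical vertical line: only the middle column is lit
image = np.array([
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
])

# Each pixel's contribution is weight * pixel value
contributions = weights * image
z = contributions.sum()  # the weighted sum (before adding any bias)

print(contributions)
print(z)  # 3.0
```

Notice that a weight only matters when its pixel is lit: the large negative weight at row 1, column 2 contributes nothing here because that pixel is 0.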

What We WANT the Perceptron to Learn

For detecting vertical lines, the ideal weights would encode this knowledge:

    "Lit pixels stacked in the middle column = strong evidence FOR a vertical line"
    "Lit pixels anywhere else = weak or negative evidence for a vertical line"

In weight terms:

Middle column pixels → HIGH positive weights (vertical lines have these lit up)
Other pixels → LOW or NEGATIVE weights (don't indicate verticality)

The Problem: Random Weights = No Knowledge

Our current weights are random - they encode NO knowledge about vertical lines:

Some weights are positive when they should be negative
Some weights are large when they should be small
There's no pattern that matches "vertical line detection"

Feature Detection: What the Perceptron is Trying to Become

A feature detector is a model that responds strongly to specific patterns. Our goal:

Input Pattern	Ideal Perceptron Response
Vertical line (any column)	High output (close to 1.0)
Horizontal line (any row)	Low output (close to 0.0)

Right now: The Perceptron is NOT a feature detector - it's just random noise.

After training: It WILL become a vertical line feature detector!
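As a sketch of what "responds strongly to a specific pattern" means in numbers, the snippet below uses the same hand-picked weights and bias that this notebook tries out further down, and applies them to the two canonical middle-line images only. The names `detector_weights`, `detector_bias`, and `sigmoid` are chosen for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked (not learned) weights: reward the middle column, penalize the rest
detector_weights = np.array([
    [-0.5, 1.0, -0.5],
    [-0.5, 1.0, -0.5],
    [-0.5, 1.0, -0.5],
]).flatten()
detector_bias = -1.5

vertical = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]).flatten()
horizontal = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]).flatten()

z_vertical = np.dot(detector_weights, vertical) + detector_bias      # 3.0 - 1.5 = 1.5
z_horizontal = np.dot(detector_weights, horizontal) + detector_bias  # 0.0 - 1.5 = -1.5

print(f"Vertical line:   {sigmoid(z_vertical):.4f}")    # 0.8176
print(f"Horizontal line: {sigmoid(z_horizontal):.4f}")  # 0.1824
```

The vertical line pushes the output well above 0.5 and the horizontal line pushes it well below, which is the behavior of a feature detector. Lines in other positions are not covered by this sketch.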

The Problem: Random Weights = Random Decisions

Let's visualize what our random weights actually look like:

cell 017

# =============================================================================
# VISUALIZING THE PROBLEM: Random Weights vs Ideal Weights
# =============================================================================

# What ideal weights for a vertical detector should look like
ideal_weights = np.array([
    [-1,  2, -1],   # Top row: look for middle
    [-1,  2, -1],   # Middle row: look for middle
    [-1,  2, -1]    # Bottom row: look for middle
]).flatten() * 0.5

# Our actual (random) weights
actual_weights = perceptron.weights

# Visualize
fig, axes = plt.subplots(1, 3, figsize=(14, 4))

# Plot 1: Random weights (what we have)
ax1 = axes[0]
weights_grid = actual_weights.reshape(3, 3)
im1 = ax1.imshow(weights_grid, cmap='RdBu', vmin=-0.5, vmax=0.5)
ax1.set_title('Our Random Weights\n(Untrained)', fontsize=12, fontweight='bold')
for i in range(3):
    for j in range(3):
        ax1.text(j, i, f'{weights_grid[i,j]:.2f}', ha='center', va='center', fontsize=10)
plt.colorbar(im1, ax=ax1, label='Weight value')

# Plot 2: Ideal weights (what we need)
ax2 = axes[1]
ideal_grid = ideal_weights.reshape(3, 3)
# Ideal weights range from -0.5 to 1.0, so use a wider symmetric color scale to avoid clipping
im2 = ax2.imshow(ideal_grid, cmap='RdBu', vmin=-1.0, vmax=1.0)
ax2.set_title('Ideal Weights\n(What we need)', fontsize=12, fontweight='bold')
for i in range(3):
    for j in range(3):
        ax2.text(j, i, f'{ideal_grid[i,j]:.2f}', ha='center', va='center', fontsize=10)
plt.colorbar(im2, ax=ax2, label='Weight value')

# Plot 3: A vertical line (what we're trying to detect)
ax3 = axes[2]
im3 = ax3.imshow(vertical_line, cmap='Blues', vmin=0, vmax=1)
ax3.set_title('Vertical Line\n(What we detect)', fontsize=12, fontweight='bold')
for i in range(3):
    for j in range(3):
        ax3.text(j, i, f'{vertical_line[i,j]}', ha='center', va='center', fontsize=10)
plt.colorbar(im3, ax=ax3, label='Pixel value')

plt.tight_layout()
plt.show()

# Show the key insight
print("\nKEY INSIGHT: Why Random Weights Fail")
print("="*60)
print("""
IDEAL weights for vertical detection should have:
  - HIGH values in the middle column (where vertical lines are)
  - LOW or NEGATIVE values elsewhere

Our RANDOM weights have no pattern - they're just noise!

The Perceptron doesn't KNOW what vertical lines look like yet.
It needs to LEARN the right weights through TRAINING.
""")

cell 018

# =============================================================================
# WHAT IF WE HAD IDEAL WEIGHTS? (Sneak Preview)
# =============================================================================

print("="*70)
print("WHAT IF WE HAD THE RIGHT WEIGHTS? (A Preview)")
print("="*70)

# Create a new Perceptron and give it ideal weights
ideal_perceptron = Perceptron(n_inputs=9)
ideal_perceptron.weights = ideal_weights.copy()
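# Hand-picked bias: the middle vertical line gives z = 3.0 - 1.5 = +1.5,
# while any horizontal line gives z = 0.0 - 1.5 = -1.5
ideal_perceptron.bias = -1.5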

print("\nIdeal weights (as 3x3 grid):")
print(f"  {ideal_perceptron.weights.reshape(3,3)[0]}")
print(f"  {ideal_perceptron.weights.reshape(3,3)[1]}")
print(f"  {ideal_perceptron.weights.reshape(3,3)[2]}")
print(f"\nBias: {ideal_perceptron.bias}")

# Test on the same dataset
correct_ideal = 0
for i in range(len(X_test)):
    if ideal_perceptron.predict(X_test[i]) == y_test[i]:
        correct_ideal += 1

accuracy_ideal = correct_ideal / len(y_test) * 100

print("\n" + "-"*70)
print("COMPARISON:")
print("-"*70)
print(f"\n  Random weights accuracy:  {accuracy:.1f}%")
print(f"  Ideal weights accuracy:   {accuracy_ideal:.1f}%")
print(f"\n  Difference: {accuracy_ideal - accuracy:+.1f} percentage points")

print("\n" + "="*70)
print("THE BIG QUESTION:")
print("="*70)
print("""
How do we get from RANDOM weights to IDEAL weights?

We don't want to hand-design them (that defeats the purpose!).
We want the Perceptron to LEARN them automatically.

This is what TRAINING does - and it's the topic of Part 5!
""")

Part 4 Summary: What We've Learned

Key Concepts Mastered

Concept	What It Is	Why It Matters
Perceptron	Single-neuron neural network	Simplest possible NN, building block for larger networks
Dataset Generation	Creating training examples	We can test our models without external data
Forward Pass	Input → Output computation	This is how predictions are made
Random Initialization	Starting with random weights	The beginning state before learning

The Complete Perceptron Formula

$\hat{y} = \sigma(w \cdot x + b) = \frac{1}{1 + e^{-(w \cdot x + b)}}$

Or in code:

z = np.dot(weights, x) + bias    # Weighted sum
output = 1 / (1 + np.exp(-z))    # Sigmoid activation
prediction = 1 if output >= 0.5 else 0

Committee Analogy Progress

Part	What Happened
Part 1	Committee learned to read evidence (matrices)
Part 2	First member learned to weigh evidence (weights/bias)
Part 3	Member learned to cast meaningful votes (activation)
Part 4	Member attempted their first case - and FAILED!
Part 5	(Next) Member learns from their mistakes

Key Insight

Random weights = Random guessing

An untrained Perceptron has no knowledge. Its weights are just noise. To become useful, it must learn the right weights by seeing examples and adjusting based on its mistakes.

Knowledge Check

cell 020

# =============================================================================
# KNOWLEDGE CHECK - Part 4
# =============================================================================

print("KNOWLEDGE CHECK - Part 4: The Perceptron")
print("="*60)
print("\nAnswer these questions to test your understanding:\n")

questions = [
    {
        "q": "1. What are the steps of a forward pass (in order)?",
        "options": [
            "A) Activation -> Weighted Sum -> Output",
            "B) Weighted Sum -> Add Bias -> Activation -> Output",
            "C) Input -> Output -> Activation",
            "D) Bias -> Weights -> Sigmoid"
        ],
        "answer": "B",
        "explanation": "The forward pass is: (1) compute weighted sum of inputs, (2) add bias, (3) apply activation function, (4) get output."
    },
    {
        "q": "2. Why does an untrained Perceptron get ~50% accuracy?",
        "options": [
            "A) Because sigmoid always outputs 0.5",
            "B) Because the dataset is unbalanced",
            "C) Because random weights give random predictions",
            "D) Because the bias is always 0"
        ],
        "answer": "C",
        "explanation": "Random weights have no meaningful pattern, so the Perceptron essentially guesses randomly. For binary classification on a balanced dataset, random guessing gives ~50% accuracy."
    },
    {
        "q": "3. What does the forward pass output for binary classification?",
        "options": [
            "A) Always 0 or 1 exactly",
            "B) A probability between 0 and 1",
            "C) Any real number",
            "D) The raw weighted sum"
        ],
        "answer": "B",
        "explanation": "The sigmoid activation squashes the output to a probability between 0 and 1. We then threshold at 0.5 to get a binary prediction."
    },
    {
        "q": "4. For a vertical line detector, where should the weights be highest?",
        "options": [
            "A) In the corners",
            "B) In the middle column",
            "C) In the middle row",
            "D) Equally everywhere"
        ],
        "answer": "B",
        "explanation": "Vertical lines appear in columns. High weights in the middle column will give high scores when vertical pixels align with them."
    },
    {
        "q": "5. Who invented the Perceptron?",
        "options": [
            "A) Geoffrey Hinton",
            "B) Frank Rosenblatt",
            "C) Yann LeCun",
            "D) Alan Turing"
        ],
        "answer": "B",
        "explanation": "Frank Rosenblatt introduced the Perceptron in 1958 at the Cornell Aeronautical Laboratory. It was one of the first neural network models that could learn from examples!"
    }
]

for q in questions:
    print(q["q"])
    for opt in q["options"]:
        print(f"   {opt}")
    print()

print("\n" + "="*60)
print("Scroll down for answers...")
print("="*60)

cell 021

# =============================================================================
# ANSWERS - Knowledge Check Part 4
# =============================================================================

print("ANSWERS - Part 4 Knowledge Check")
print("="*60)

for i, q in enumerate(questions, 1):
    print(f"\n{i}. Answer: {q['answer']}")
    print(f"   Explanation: {q['explanation']}")

print("\n" + "="*60)
print("How did you do?")
print("  5/5: Perceptron Expert!")
print("  4/5: Great understanding!")
print("  3/5: Review the sections you missed")
print("  <3:  Re-read Part 4 before continuing")
print("="*60)

What's Next?

You've completed Part 4! Our Perceptron is built but confused - it makes random guesses because its weights are random.

Coming Up in Part 5: Training - Learning from Mistakes

In Part 5, we'll cover:

Loss Functions - Measuring "how wrong" a prediction is
Gradient Descent - Finding better weights
Backpropagation - How errors flow backward
The Training Loop - Iteratively improving weights
Watch It Learn - See accuracy improve from 50% to 90%+!

Continue to Part 5: part_5_training.ipynb

"The Perceptron is ready. The data is ready. Now it's time to LEARN."

The Brain's Decision Committee - Learning to See, One Step at a Time