What are Neural Networks?
Neural Networks are computing systems inspired by the biological neural networks in animal brains. They consist of interconnected nodes (neurons) that work together to process information and learn patterns.
Biological Inspiration
Biological Neuron → Artificial Neuron
- Dendrites receive signals → Inputs receive data
- Cell body processes signals → Weighted sum + activation
- Axon sends output → Output sends result
- Synapses connect neurons → Weights connect neurons
Structure of a Neural Network
Layers
- Input Layer: Receives raw data
- Hidden Layers: Process and transform data (can be multiple)
- Output Layer: Produces final prediction
Input Layer       Hidden Layer       Output Layer

    ●  ------>  ●
                  \
    ●  ------>  ●  ------>  ●  (Output)
                  /
    ●  ------>  ●
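To make the sketch concrete, here is a minimal forward pass through the 3-3-1 network drawn above, written in NumPy. The input values, weights, and biases are arbitrary illustrative numbers, and the sigmoid activation used here is explained in the sections below:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])           # 3 raw input values

# Hidden layer: 3 neurons -> a 3x3 weight matrix and 3 biases
W1 = np.array([[ 0.1, -0.3,  0.2],
               [ 0.2,  0.4, -0.1],
               [-0.5,  0.6,  0.3]])
b1 = np.array([0.1, -0.2, 0.05])

# Output layer: 1 neuron -> a 3x1 weight matrix and 1 bias
W2 = np.array([[ 0.7],
               [-0.8],
               [ 0.5]])
b2 = np.array([0.05])

hidden = sigmoid(x @ W1 + b1)            # shape (3,): one value per hidden neuron
output = sigmoid(hidden @ W2 + b2)       # shape (1,): the network's prediction
print(output)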
How Neurons Work
Each neuron performs a simple calculation:
- Receive inputs: get values from the previous layer
- Multiply by weights: multiply each input by its weight
- Sum: add all the weighted inputs plus a bias
- Activate: apply an activation function to the sum
- Output: pass the result to the next layer
Formula:
output = activation(w₁x₁ + w₂x₂ + w₃x₃ + ... + bias)
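At the level of a single neuron, the formula is just a dot product plus a bias, passed through an activation. A minimal sketch (the input, weight, and bias values are arbitrary illustrative numbers):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

inputs  = np.array([0.5, 0.3, 0.2])    # x1, x2, x3 from the previous layer
weights = np.array([0.4, 0.7, -0.2])   # w1, w2, w3
bias = 0.1

# output = activation(w1*x1 + w2*x2 + w3*x3 + bias)
output = sigmoid(np.dot(weights, inputs) + bias)
print(output)  # ≈ 0.62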
Activation Functions
Activation functions introduce non-linearity, allowing networks to learn complex patterns.
Sigmoid
Squashes values between 0 and 1
σ(x) = 1/(1+e⁻ˣ)
ReLU
The most widely used activation: returns x when x is positive, and 0 otherwise
f(x) = max(0, x)
Tanh
Squashes values between -1 and 1
tanh(x) = (eˣ - e⁻ˣ)/(eˣ + e⁻ˣ)
Softmax
Converts a vector of scores into probabilities that sum to 1
Used in the output layer for multi-class classification
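All four functions take only a few lines of NumPy. A minimal sketch (the input vector z is just an illustrative example):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))     # squashes values into (0, 1)

def relu(z):
    return np.maximum(0, z)         # 0 for negatives, identity for positives

def tanh(z):
    return np.tanh(z)               # squashes values into (-1, 1)

def softmax(z):
    e = np.exp(z - np.max(z))       # subtract the max for numerical stability
    return e / e.sum()              # non-negative values that sum to 1

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z))   # [0.119 0.5   0.953]
print(relu(z))      # [0. 0. 3.]
print(tanh(z))      # [-0.964  0.     0.995]
print(softmax(z))   # [0.006 0.047 0.946]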
Training Process
- Forward Pass: Input flows through network to produce prediction
- Calculate Loss: Compare prediction to actual answer
- Backward Pass (Backpropagation): Calculate how to adjust weights
- Update Weights: Adjust to reduce error
- Repeat: Continue for many epochs until the model performs well (the AND gate example below walks through each of these steps)
Simple Example: AND Gate
Let's create a neural network that learns the AND logic gate:
# Neural Network for AND gate
import numpy as np
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Expects x to already be a sigmoid output:
    # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) = x * (1 - x)
    return x * (1 - x)
# Training data for AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [0], [0], [1]])
# Initialize weights randomly
np.random.seed(1)
weights = 2 * np.random.random((2, 1)) - 1
bias = 0
# Training
learning_rate = 0.5
for epoch in range(10000):
    # Forward pass
    input_layer = X
    output = sigmoid(np.dot(input_layer, weights) + bias)

    # Calculate error
    error = y - output

    # Backward pass
    adjustments = error * sigmoid_derivative(output)

    # Update weights and bias
    weights += learning_rate * np.dot(input_layer.T, adjustments)
    bias += learning_rate * np.sum(adjustments)
# Test the trained network
print("Trained weights:", weights.flatten())
print("Trained bias:", bias)
print("\nPredictions:")
for i in range(len(X)):
    prediction = sigmoid(np.dot(X[i], weights) + bias)[0]
    print(f"Input: {X[i]} → Output: {prediction:.4f} → Rounded: {round(prediction)}")
# Output:
# Input: [0 0] → Output: 0.0146 → Rounded: 0
# Input: [0 1] → Output: 0.0661 → Rounded: 0
# Input: [1 0] → Output: 0.0661 → Rounded: 0
# Input: [1 1] → Output: 0.9338 → Rounded: 1
# The network learned AND logic!
Real-World Example: Digit Recognition
scikit-learn's MLPClassifier packages the same ideas into a ready-made class; here it learns to classify the 8x8 handwritten digit images that ship with the library:
# Neural Network for handwritten digit recognition
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
# Load digit dataset (8x8 pixel images)
digits = load_digits()
X, y = digits.data, digits.target
# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Create neural network
# 64 inputs (8x8 pixels) → 50 hidden neurons → 10 outputs (digits 0-9)
nn = MLPClassifier(
    hidden_layer_sizes=(50,),
    activation='relu',
    max_iter=500,
    random_state=42
)
# Train
print("Training neural network...")
nn.fit(X_train, y_train)
# Test
predictions = nn.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy * 100:.2f}%")
# Output: Accuracy: 97.22%
# Test on a single image
sample_idx = 0
sample_image = X_test[sample_idx].reshape(8, 8)
sample_prediction = nn.predict(X_test[sample_idx].reshape(1, -1))[0]  # predict expects a 2-D array
sample_actual = y_test[sample_idx]
print(f"\nSample prediction: {sample_prediction}")
print(f"Actual digit: {sample_actual}")
Key Concepts Summary
- Weights: Connection strengths between neurons (learned during training)
- Bias: Allows shifting the activation function
- Loss Function: Measures how wrong predictions are
- Backpropagation: Method to calculate weight updates
- Learning Rate: How much to adjust weights in each step
- Epoch: One complete pass through all training data
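To tie several of these concepts together: one step of gradient descent moves a weight against its gradient, scaled by the learning rate. A minimal one-dimensional sketch with made-up numbers:

# Gradient descent on a single weight: w_new = w - learning_rate * gradient
w = 0.5              # current weight
gradient = 0.8       # slope of the loss with respect to w (from backpropagation)
learning_rate = 0.1  # step size

w = w - learning_rate * gradient
print(w)  # ≈ 0.42 -- the weight moves downhill on the loss surface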
Why Use Neural Networks?
- Can learn very complex patterns
- Work well with large amounts of data
- Flexible - can solve many different problems
- Can automatically learn features from raw data
- State-of-the-art performance in many domains
Common Applications
- Image Recognition: Identifying objects, faces, scenes
- Natural Language Processing: Translation, chatbots, sentiment analysis
- Speech Recognition: Voice assistants, transcription
- Game Playing: Chess, Go, video games
- Autonomous Vehicles: Self-driving cars
- Healthcare: Disease diagnosis, drug discovery