What are Neural Networks?
Neural Networks are computing systems inspired by the biological neural networks in animal brains. They consist of interconnected nodes (neurons) that work together to process information and learn patterns.
Biological Inspiration
Biological Neuron → Artificial Neuron
- Dendrites receive signals → Inputs receive data
- Cell body processes signals → Weighted sum + activation
- Axon sends output → Output sends result
- Synapses connect neurons → Weights connect neurons
Structure of a Neural Network
Layers
- Input Layer: Receives raw data
- Hidden Layers: Process and transform data (can be multiple)
- Output Layer: Produces final prediction
Input Layer       Hidden Layer       Output Layer

    ●  ------>  ●
                  \
    ●  ------>  ●  ------>  ●  (Output)
                  /
    ●  ------>  ●
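To make the sketch concrete, here is a minimal forward pass through the 3-3-1 network drawn above, written in NumPy. The input values, weights, and biases are arbitrary illustrative numbers, and the sigmoid activation used here is explained in the sections below:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])           # 3 raw input values

# Hidden layer: 3 neurons -> a 3x3 weight matrix and 3 biases
W1 = np.array([[ 0.1, -0.3,  0.2],
               [ 0.2,  0.4, -0.1],
               [-0.5,  0.6,  0.3]])
b1 = np.array([0.1, -0.2, 0.05])

# Output layer: 1 neuron -> a 3x1 weight matrix and 1 bias
W2 = np.array([[ 0.7],
               [-0.8],
               [ 0.5]])
b2 = np.array([0.05])

hidden = sigmoid(x @ W1 + b1)            # shape (3,): one value per hidden neuron
output = sigmoid(hidden @ W2 + b2)       # shape (1,): the network's prediction
print(output)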
How Neurons Work
Each neuron performs a simple calculation:
- Receive inputs: get values from the previous layer
- Multiply by weights: multiply each input by its weight
- Sum: add all the weighted inputs plus a bias
- Activate: apply an activation function to the sum
- Output: pass the result to the next layer
Formula:
output = activation(w₁x₁ + w₂x₂ + w₃x₃ + ... + bias)
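At the level of a single neuron, the formula is just a dot product plus a bias, passed through an activation. A minimal sketch (the input, weight, and bias values are arbitrary illustrative numbers):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

inputs  = np.array([0.5, 0.3, 0.2])    # x1, x2, x3 from the previous layer
weights = np.array([0.4, 0.7, -0.2])   # w1, w2, w3
bias = 0.1

# output = activation(w1*x1 + w2*x2 + w3*x3 + bias)
output = sigmoid(np.dot(weights, inputs) + bias)
print(output)  # ≈ 0.62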
Activation Functions
Activation functions introduce non-linearity, allowing networks to learn complex patterns.
Sigmoid
Squashes values between 0 and 1
σ(x) = 1/(1+e⁻ˣ)
ReLU
The most widely used activation: returns x when x is positive, and 0 otherwise
f(x) = max(0, x)
Tanh
Squashes values between -1 and 1
tanh(x) = (eˣ - e⁻ˣ)/(eˣ + e⁻ˣ)
Softmax
Converts a vector of scores into probabilities that sum to 1
Used in the output layer for multi-class classification
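All four functions take only a few lines of NumPy. A minimal sketch (the input vector z is just an illustrative example):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))     # squashes values into (0, 1)

def relu(z):
    return np.maximum(0, z)         # 0 for negatives, identity for positives

def tanh(z):
    return np.tanh(z)               # squashes values into (-1, 1)

def softmax(z):
    e = np.exp(z - np.max(z))       # subtract the max for numerical stability
    return e / e.sum()              # non-negative values that sum to 1

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z))   # [0.119 0.5   0.953]
print(relu(z))      # [0. 0. 3.]
print(tanh(z))      # [-0.964  0.     0.995]
print(softmax(z))   # [0.006 0.047 0.946]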
Training Process
- Forward Pass: Input flows through network to produce prediction
- Calculate Loss: Compare prediction to actual answer
- Backward Pass (Backpropagation): Calculate how to adjust weights
- Update Weights: Adjust to reduce error
- Repeat: Continue for many epochs until the model performs well (the AND gate example below walks through each of these steps)
Simple Example: AND Gate
Let's create a neural network that learns the AND logic gate:
# Neural Network for AND gate
import numpy as np
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Expects x to already be a sigmoid output:
    # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) = x * (1 - x)
    return x * (1 - x)
# Training data for AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [0], [0], [1]])
# Initialize weights randomly
np.random.seed(1)
weights = 2 * np.random.random((2, 1)) - 1
bias = 0
# Training
learning_rate = 0.5
for epoch in range(10000):
    # Forward pass
    input_layer = X
    output = sigmoid(np.dot(input_layer, weights) + bias)

    # Calculate error
    error = y - output

    # Backward pass
    adjustments = error * sigmoid_derivative(output)

    # Update weights and bias
    weights += learning_rate * np.dot(input_layer.T, adjustments)
    bias += learning_rate * np.sum(adjustments)
# Test the trained network
print("Trained weights:", weights.flatten())
print("Trained bias:", bias)
print("\nPredictions:")
for i in range(len(X)):
    prediction = sigmoid(np.dot(X[i], weights) + bias)[0]
    print(f"Input: {X[i]} → Output: {prediction:.4f} → Rounded: {round(prediction)}")
# Output:
# Input: [0 0] → Output: 0.0146 → Rounded: 0
# Input: [0 1] → Output: 0.0661 → Rounded: 0
# Input: [1 0] → Output: 0.0661 → Rounded: 0
# Input: [1 1] → Output: 0.9338 → Rounded: 1
# The network learned AND logic!
Real-World Example: Digit Recognition
scikit-learn's MLPClassifier packages the same ideas into a ready-made class; here it learns to classify the 8x8 handwritten digit images that ship with the library:
# Neural Network for handwritten digit recognition
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
# Load digit dataset (8x8 pixel images)
digits = load_digits()
X, y = digits.data, digits.target
# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Create neural network
# 64 inputs (8x8 pixels) → 50 hidden neurons → 10 outputs (digits 0-9)
nn = MLPClassifier(
    hidden_layer_sizes=(50,),
    activation='relu',
    max_iter=500,
    random_state=42
)
# Train
print("Training neural network...")
nn.fit(X_train, y_train)
# Test
predictions = nn.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy * 100:.2f}%")
# Output: Accuracy: 97.22%
# Test on a single image
sample_idx = 0
sample_image = X_test[sample_idx].reshape(8, 8)
sample_prediction = nn.predict(X_test[sample_idx].reshape(1, -1))[0]  # predict expects a 2-D array
sample_actual = y_test[sample_idx]
print(f"\nSample prediction: {sample_prediction}")
print(f"Actual digit: {sample_actual}")
Key Concepts Summary
- Weights: Connection strengths between neurons (learned during training)
- Bias: Allows shifting the activation function
- Loss Function: Measures how wrong predictions are
- Backpropagation: Method to calculate weight updates
- Learning Rate: How much to adjust weights in each step
- Epoch: One complete pass through all training data
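To tie several of these concepts together: one step of gradient descent moves a weight against its gradient, scaled by the learning rate. A minimal one-dimensional sketch with made-up numbers:

# Gradient descent on a single weight: w_new = w - learning_rate * gradient
w = 0.5              # current weight
gradient = 0.8       # slope of the loss with respect to w (from backpropagation)
learning_rate = 0.1  # step size

w = w - learning_rate * gradient
print(w)  # ≈ 0.42 -- the weight moves downhill on the loss surface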
Why Use Neural Networks?
- Can learn very complex patterns
- Work well with large amounts of data
- Flexible - can solve many different problems
- Can automatically learn features from raw data
- State-of-the-art performance in many domains
Common Applications
- Image Recognition: Identifying objects, faces, scenes
- Natural Language Processing: Translation, chatbots, sentiment analysis
- Speech Recognition: Voice assistants, transcription
- Game Playing: Chess, Go, video games
- Autonomous Vehicles: Self-driving cars
- Healthcare: Disease diagnosis, drug discovery