Deep Learning: Activation Functions

Introduction

Activation functions are crucial in neural networks. They introduce non-linearity, enabling the network to learn complex patterns. Without activation functions, a network of any depth collapses to a single linear transformation (equivalent to a linear regression model), because a composition of linear maps is itself linear.
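
A minimal NumPy sketch of that collapse (the layer sizes and random weights are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two layers with no activation function in between.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)
y = W2 @ (W1 @ x + b1) + b2

# The same mapping expressed as one linear layer: W = W2 W1, b = W2 b1 + b2.
W, b = W2 @ W1, W2 @ b1 + b2
assert np.allclose(y, W @ x + b)
```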


Types of Activation Functions

1. Linear Activation Function

Equation:

\[f(x) = x\]

The output equals the input, so the range is $(-\infty, \infty)$; it is mainly used in output layers for regression tasks.

2. Sigmoid Function

Equation:

\[f(x) = \frac{1}{1 + e^{-x}}\]

The output lies in $(0, 1)$, which makes it suitable for binary classification outputs, but it saturates for large $|x|$, leading to vanishing gradients.

3. Hyperbolic Tangent (Tanh)

Equation:

\[f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}\]

The output lies in $(-1, 1)$ and is zero-centered; like the sigmoid, it can suffer from vanishing gradients.

4. Rectified Linear Unit (ReLU)

Equation:

\[f(x) = \max(0, x)\]

The output lies in $[0, \infty)$ and is cheap to compute, but units can get stuck at zero (the dying ReLU problem).

5. Leaky ReLU

Equation:

\[f(x) = \begin{cases} x, & \text{if } x > 0 \\ \alpha x, & \text{if } x \leq 0 \end{cases}\]

where $\alpha$ is a small constant (e.g., $0.01$). The small negative slope keeps gradients non-zero for $x \leq 0$, which mitigates the dying ReLU problem.
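
A minimal NumPy sketch of Leaky ReLU (the function name and the default $\alpha = 0.01$ are illustrative choices, not tied to any particular library):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Pass positive inputs through unchanged; scale non-positive inputs by alpha.
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(leaky_relu(x))  # approximately [-0.03 -0.005 0. 2.]
```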


6. Softmax Function

Equation:

\[f(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}\]

Each output lies in $(0, 1)$ and the outputs sum to $1$, so the result can be read as a probability distribution over classes; it is typically used in output layers for classification.

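A minimal NumPy sketch of softmax follows; subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result (the function name is illustrative):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the output is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))        # approximately [0.659 0.242 0.099]
print(softmax(logits).sum())  # 1.0
```
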
Comparison of Activation Functions

| Function | Range | Computational Cost | Gradient Issue | Applications |
| --- | --- | --- | --- | --- |
| Linear | $(-\infty, \infty)$ | Low | No gradient saturation | Regression tasks |
| Sigmoid | $(0, 1)$ | High | Vanishing gradient | Binary classification |
| Tanh | $(-1, 1)$ | High | Vanishing gradient | Hidden layers |
| ReLU | $[0, \infty)$ | Low | Dying ReLU problem | Most neural network architectures |
| Leaky ReLU | $(-\infty, \infty)$ | Low | Minimal | Deep networks |
| Softmax | $(0, 1)$ | High | None | Output layers for classification |

Examples

Example 1: Sigmoid

Given $x = 2$, compute $f(x)$:

\[f(2) = \frac{1}{1 + e^{-2}} \approx 0.88\]

Example 2: Tanh

For $x = 1$:

\[f(1) = \tanh(1) = \frac{e^{1} - e^{-1}}{e^{1} + e^{-1}} \approx 0.76\]

Example 3: ReLU

For $x = -3$ and $x = 2$:

\[f(-3) = \max(0, -3) = 0, \qquad f(2) = \max(0, 2) = 2\]
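
The worked values above can be checked with a few lines of NumPy (a quick verification sketch; the helper names sigmoid and relu are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

print(round(sigmoid(2.0), 2))  # 0.88  (Example 1)
print(round(np.tanh(1.0), 2))  # 0.76  (Example 2)
print(relu(-3.0), relu(2.0))   # 0.0 2.0  (Example 3)
```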


Choosing the Right Activation Function

As a rule of thumb, following the applications listed in the comparison table above:

- Hidden layers: ReLU is the most common default; Leaky ReLU helps when dying units are a concern, and Tanh when zero-centered outputs are preferred.
- Output layer, binary classification: Sigmoid.
- Output layer, multi-class classification: Softmax.
- Output layer, regression: Linear.

Summary

Activation functions play a pivotal role in deep learning by introducing non-linear properties, enabling the network to model complex relationships in data. Selection of the appropriate activation function depends on the problem domain and architecture.
