Backpropagation is an algorithm used to train artificial neural networks. It computes how much each weight contributed to the error between the network’s predictions and the desired outputs, so that the weights of the connections between neurons can be adjusted iteratively to minimize that error. Here’s a breakdown of the process:

- **Forward Pass:** An input is fed into the network and propagates through the layers, with each neuron computing a weighted sum of its inputs and applying an activation function.
- **Error Calculation:** The output of the network is compared to the desired output, and the error (loss) is calculated using a function such as the mean squared error.
- **Backward Pass:** The error is propagated backward through the network, layer by layer. At each layer, the chain rule is used to calculate how much each neuron’s activation, and therefore each weight and bias, contributed to the overall error.
- **Weight Update:** Using the calculated error contributions, the weights of the connections are adjusted in the direction that reduces the overall error. This is typically done with an optimization algorithm such as gradient descent.
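
The four steps can be sketched for the smallest possible case, a single sigmoid neuron. This is a minimal illustration and not a full multi-layer implementation: the function name `train_neuron`, the learning rate, the epoch count, and the toy OR dataset are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, lr=0.5, epochs=2000):
    """Train one sigmoid neuron with per-example gradient descent (hypothetical helper)."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            # Forward pass: weighted sum followed by the activation function.
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            a = sigmoid(z)
            # Error calculation: squared error for this example.
            total += (a - y) ** 2
            # Backward pass: chain rule, dE/dz = dE/da * da/dz.
            delta = 2.0 * (a - y) * a * (1.0 - a)
            # Weight update: step against the gradient.
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
        losses.append(total / len(samples))
    return w, b, losses

# Toy dataset (assumed for illustration): logical OR.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w, b, losses = train_neuron(samples)
predictions = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
               for x, _ in samples]
```

In a network with hidden layers, the same `delta` term would be passed back to the previous layer instead of stopping at the inputs.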

**Equations**

**Weighted Sum:** The weighted sum of inputs at a neuron is calculated as:

```
z = Σ (w_i * x_i) + b
```

Where:

- z = weighted sum
- w_i = weight of the i-th connection
- x_i = i-th input
- b = bias of the neuron (constant value)
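
As a small sketch, the weighted sum maps directly onto a few lines of Python. The function name `weighted_sum` and the sample numbers are assumptions made for illustration.

```python
def weighted_sum(weights, inputs, bias):
    """Compute z = sum(w_i * x_i) + b for one neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 + 0.5 = 5.0
z = weighted_sum([0.5, -1.0, 2.0], [1.0, 2.0, 3.0], 0.5)
```
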

**Sigmoid Activation Function:** The sigmoid function, a common activation function, is applied to the weighted sum to introduce non-linearity:

```
a = σ(z) = 1 / (1 + e^(-z))
```

Where:

- a = activation of the neuron
- σ = sigmoid function
- e = base of the natural logarithm (approximately 2.718)
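
A possible Python version of the sigmoid is sketched below. Branching on the sign of `z` is an implementation choice assumed here to avoid floating-point overflow for very negative inputs. The helper `sigmoid_derivative` is also an addition: it uses the standard identity σ'(z) = σ(z)(1 − σ(z)), which the backward pass relies on.

```python
import math

def sigmoid(z):
    """Sigmoid activation, written so math.exp never gets a large positive argument."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def sigmoid_derivative(z):
    """Derivative of the sigmoid: a * (1 - a), where a = sigmoid(z)."""
    a = sigmoid(z)
    return a * (1.0 - a)

a = sigmoid(0.0)              # 0.5
da = sigmoid_derivative(0.0)  # 0.25
```
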

**Error Calculation (Mean Squared Error):** The mean squared error between the network’s output (a_out) and the desired output (y) is:

```
E = 1/n * Σ (a_out - y)^2
```

Where:

- E = mean squared error
- n = number of training examples
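
The error formula can be sketched in Python as follows. The function names and sample values are assumptions. `mse_gradient` is an added helper that returns the derivative of E with respect to each output, 2(a_out − y)/n, which is the quantity the backward pass starts from.

```python
def mean_squared_error(outputs, targets):
    """E = (1/n) * sum((a_out - y)^2) over n examples."""
    if not outputs or len(outputs) != len(targets):
        raise ValueError("outputs and targets must be non-empty and equal in length")
    n = len(outputs)
    return sum((a - y) ** 2 for a, y in zip(outputs, targets)) / n

def mse_gradient(outputs, targets):
    """dE/da_out for each example: 2 * (a_out - y) / n."""
    n = len(outputs)
    return [2.0 * (a - y) / n for a, y in zip(outputs, targets)]

outputs = [0.5, 1.0, 0.0, 0.5]
targets = [1.0, 1.0, 1.0, 0.0]
E = mean_squared_error(outputs, targets)   # (0.25 + 0 + 1 + 0.25) / 4 = 0.375
grad = mse_gradient(outputs, targets)      # [-0.25, 0.0, -0.5, 0.25]
```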

Neuron:

Input → Weighted Sum → Activation Function → Output

Neural Network (Multi-Layer Perceptron):

Input 1 → Neuron 1.1, Neuron 1.2, ... → Hidden Layer 1 → Neuron 2.1, Neuron 2.2, ... → Output
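
To make the layered structure concrete, here is a rough sketch of a forward pass through a small multi-layer perceptron. The 2-2-1 layout, the weight values, and the names `layer_forward` and `mlp_forward` are assumptions for illustration. Each layer simply repeats the single-neuron computation for every neuron in it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(weights, biases, inputs):
    """One layer: each row of `weights` holds the incoming weights of one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(layers, inputs):
    """Feed the activations of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(weights, biases, activations)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.1, -0.2], [0.4, 0.3]], [0.0, -0.1]),  # hidden layer
    ([[0.7, -0.5]], [0.2]),                    # output layer
]
output = mlp_forward(layers, [1.0, 2.0])
```

Because the output neuron is a sigmoid, `output` is a one-element list with a value strictly between 0 and 1.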