What is the role of the sigmoid function in a multilayer perceptron (MLP)?
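As a hint toward the answer, a minimal numpy sketch (names and shapes are hypothetical) showing that without a nonlinearity such as the sigmoid, stacked layers collapse into a single linear map:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))   # hypothetical first-layer weights
B = rng.normal(size=(3, 3))   # hypothetical second-layer weights
x = rng.normal(size=3)

# Without an activation, two stacked linear layers equal one linear layer.
linear_stack = B @ (A @ x)
single_layer = (B @ A) @ x    # identical result: no added expressive power

# With sigmoid between them, the composition is no longer linear.
nonlinear_stack = B @ sigmoid(A @ x)
```

The sigmoid thus supplies the nonlinearity that lets depth increase representational power.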
In an MLP, what is the purpose of the softmax function in the output layer?
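A minimal numpy sketch of the softmax (the max-subtraction is a standard numerical-stability trick, not part of the definition), showing that it turns raw output scores into a probability distribution:

```python
import numpy as np

def softmax(z):
    """Map a score vector to probabilities: nonnegative, summing to 1."""
    e = np.exp(z - np.max(z))  # shift by max for numerical stability
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
```

Larger scores receive larger probabilities, and the ordering of the inputs is preserved.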
What is the derivative of the sigmoid function \( \sigma(t) = \frac{1}{1 + e^{-t}} \)?
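The identity in question, \( \sigma'(t) = \sigma(t)(1 - \sigma(t)) \), can be verified numerically against a central finite difference:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def sigmoid_prime(t):
    """Closed-form derivative: sigma(t) * (1 - sigma(t))."""
    s = sigmoid(t)
    return s * (1.0 - s)

# Central finite-difference approximation at an arbitrary point.
t, h = 0.7, 1e-6
numeric = (sigmoid(t + h) - sigmoid(t - h)) / (2 * h)
```

At \( t = 0 \) the derivative attains its maximum value of \( 0.25 \).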
What is the Jacobian matrix of the sigmoid function applied elementwise to a vector \( t = (t_1, \dots, t_n) \), i.e. \( \sigma(t) = (\sigma(t_1), \dots, \sigma(t_n)) \)?
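Because output component \( i \) depends only on input component \( i \), the Jacobian is diagonal, with \( \sigma'(t_i) \) on the diagonal. A small numpy sketch:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def sigmoid_jacobian(t):
    """Jacobian of the elementwise sigmoid: diag(sigma'(t_i))."""
    s = sigmoid(t)
    return np.diag(s * (1.0 - s))

J = sigmoid_jacobian(np.array([0.0, 1.0, -2.0]))
# Off-diagonal entries are zero: output i depends only on input i.
```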
In the forward phase of computing the gradient of the loss function in an MLP, what is the output of the \( i \)-th hidden layer?
In the backward phase of computing the gradient of the loss function in an MLP, what is the output of the backpropagation loop?
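The two phases in the questions above can be sketched for a tiny two-layer network. This is a minimal sketch under stated assumptions (sigmoid activations, squared-error loss, hypothetical shapes 3 → 4 → 2), not a general implementation; the forward phase stores each layer's activation, and the backward loop propagates the error signal layer by layer to produce the weight gradients:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(0)
# Hypothetical 2-layer MLP: 3 inputs -> 4 hidden -> 2 outputs.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
x = rng.normal(size=3)
y = np.array([1.0, 0.0])

# Forward phase: each hidden layer outputs sigma(W_i h_{i-1} + b_i).
h1 = sigmoid(W1 @ x + b1)
h2 = sigmoid(W2 @ h1 + b2)
loss = 0.5 * np.sum((h2 - y) ** 2)   # squared-error loss (assumed)

# Backward loop: propagate delta = dL/d(pre-activation) from the output
# layer down; its outputs are the gradients w.r.t. weights and biases.
delta2 = (h2 - y) * h2 * (1 - h2)         # output layer error signal
dW2, db2 = np.outer(delta2, h1), delta2
delta1 = (W2.T @ delta2) * h1 * (1 - h1)  # hidden layer error signal
dW1, db1 = np.outer(delta1, x), delta1
```

The gradients can be checked against finite differences on any single weight.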
What is the purpose of the convolutional layers in a CNN?
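A minimal sketch of the operation behind a convolutional layer (a plain "valid" 2-D cross-correlation with a single shared kernel; real layers add channels, bias, and an activation), illustrating how one small shared filter detects a local feature wherever it occurs:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid 2-D cross-correlation: slide one shared kernel over the image."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A horizontal-difference kernel responds only where intensity changes:
img = np.zeros((5, 5))
img[:, 2:] = 1.0                     # left half dark, right half bright
edge = conv2d_valid(img, np.array([[1.0, -1.0]]))
```

Weight sharing means the same two kernel values scan every position, which is what gives convolutional layers translation-equivariant feature maps with few parameters.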