Questions tagged [activation-function]

An activation function is a non-linear transformation, usually applied in neural networks to the output of a linear or convolutional layer. Common activation functions include sigmoid, tanh, and ReLU.
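As a rough illustration, a minimal NumPy sketch (the shapes and random weights are arbitrary assumptions, not tied to any particular question below) of applying these activations element-wise to the pre-activation output of a linear layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # batch of 4 inputs with 3 features
W = rng.normal(size=(3, 5))   # weights of a linear layer with 5 units
b = np.zeros(5)

z = x @ W + b                 # linear (pre-activation) output
print(sigmoid(z).shape, np.tanh(z).shape, relu(z).shape)  # each (4, 5)
```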

172 questions
4 votes, 6 answers

Why is activation needed at all in a neural network?

I watched the Risto Siilasmaa video on Machine Learning. It's very well explained, but it left me wondering at what stage we should use the activation function and why we need it at all. I know that by definition the activation function…
Jane Mänd
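One way to see why the activation is needed at all, as a minimal NumPy sketch (the layer sizes and random weights are arbitrary assumptions): without a non-linearity, stacked linear layers collapse into a single linear layer, so depth alone adds no expressive power.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 4))
W1, W2 = rng.normal(size=(4, 6)), rng.normal(size=(6, 2))

# Two linear layers with no activation...
no_act = (x @ W1) @ W2
# ...are exactly the same map as one linear layer with weights W1 @ W2.
collapsed = x @ (W1 @ W2)
print(np.allclose(no_act, collapsed))    # True: no extra expressive power

# Inserting a non-linearity (ReLU here) breaks the collapse.
with_act = np.maximum(0.0, x @ W1) @ W2
print(np.allclose(with_act, collapsed))  # False in general
```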
1 vote, 1 answer

Does the number of hidden layers affect the activation function?

Suppose there's a network with N hidden layers. There are 2 cases: the network is deep, or the network is shallow. I've been wondering how N affects choosing the activation function. Will it affect, for example, Sigmoid more than Leaky ReLU?
Rony
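One reason depth can interact with the choice of activation is gradient attenuation: the sigmoid's derivative never exceeds 0.25, so a chain of N such factors shrinks rapidly, whereas Leaky ReLU's derivative is 1 on positive inputs. A rough sketch of that effect (ignoring the weight terms in the chain rule; the depths and pre-activations are arbitrary assumptions):

```python
import numpy as np

def sigmoid_grad(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)                  # at most 0.25

def leaky_relu_grad(z, slope=0.01):
    return np.where(z > 0, 1.0, slope)

rng = np.random.default_rng(2)
for depth in (5, 20, 50):
    z = rng.uniform(0.1, 2.0, size=depth)   # positive pre-activations, one per layer
    print(depth,
          np.prod(sigmoid_grad(z)),          # <= 0.25 per layer, vanishes with depth
          np.prod(leaky_relu_grad(z)))       # 1.0 per layer here, stays at 1
```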
1 vote, 1 answer

Activation function vs If else statement

The question is very naive and most of us may know the answer. I have googled it but was not able to find a satisfactory answer, so I am posting it here. Can someone please put the right words to this question? Activation functions like ReLU, Sigmoid, etc…
Sandeep Bhutani
1 vote, 1 answer

Square-law based RBF kernel

What is the Square-law based RBF kernel (SQ-RBF)? The definition in the table at the Wikipedia article Activation Function looks wrong, since it says y = 1 - x^2/2 for |x| <= 1; 2 - (2-x^2)/2 for 1 < |x| <= 2; 0 for |x|…
Fortranner
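For comparison, a hedged sketch of the piecewise form that keeps SQ-RBF continuous at |x| = 1 and |x| = 2, assuming the middle branch is (2 - |x|)^2 / 2 rather than the expression quoted above:

```python
import numpy as np

def sq_rbf(x):
    """SQ-RBF, assuming the middle branch (2 - |x|)^2 / 2."""
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    return np.where(a <= 1.0,
                    1.0 - x**2 / 2.0,
                    np.where(a < 2.0, (2.0 - a)**2 / 2.0, 0.0))

# Continuity check at the breakpoints |x| = 1 and |x| = 2.
print(sq_rbf([0.999, 1.0, 1.001]))  # all near 0.5
print(sq_rbf([1.999, 2.0, 2.001]))  # all near 0.0
```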
0 votes, 0 answers

tanh function values are either 1 or -1, how to interpret that distribution

I have a question regarding the tanh function. I trained an NN (with tanh activation functions in the hidden layers) on a multiclass dataset and visualised the tanh activations for all samples of the dataset after passing them through the NN. The question is…
malocho
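Outputs clustered at exactly 1 or -1 usually indicate saturated pre-activations. A minimal sketch (the scales are arbitrary assumptions) showing how the magnitude of the pre-activations drives tanh into saturation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Moderate pre-activations: tanh outputs spread across (-1, 1).
z_small = rng.normal(scale=1.0, size=10_000)
# Large-magnitude pre-activations (e.g. large weights or unscaled inputs): tanh saturates.
z_large = rng.normal(scale=10.0, size=10_000)

for name, z in [("scale 1", z_small), ("scale 10", z_large)]:
    t = np.tanh(z)
    frac_saturated = np.mean(np.abs(t) > 0.99)
    print(name, f"fraction with |tanh| > 0.99: {frac_saturated:.2f}")
```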
0 votes, 2 answers

Is there a limit on the number of layers for a neural network?

I heard that neural networks have a vanishing-gradient problem even when the ReLU activation function is used. In ResNet (which has skip connections for reducing the problem), there is said to be a limit of roughly 120~190 layers (I heard). For…
INNO TECH
0 votes, 1 answer

How does ReLU bring non-linearity, and why is it not an alternative to dropout?

The derivative of the ReLU function is 1 when the input is greater than 0, and 0 when the input is less than or equal to 0. In the backpropagation process it doesn't change the value of d(error)/d(weight) at all: either the gradient is multiplied by 1, or…
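A minimal sketch of that gradient behaviour (a hypothetical stand-alone example, not the poster's network): in the backward pass the upstream gradient is either passed through unchanged (input > 0) or zeroed out (input <= 0):

```python
import numpy as np

def relu_forward(z):
    return np.maximum(0.0, z)

def relu_backward(z, upstream_grad):
    # d(relu)/dz is 1 where z > 0 and 0 where z <= 0,
    # so the upstream gradient is either passed through or blocked.
    return upstream_grad * (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
g = np.ones_like(z)         # pretend d(error)/d(activation) is 1 everywhere
print(relu_forward(z))      # [0.  0.  0.  0.5 2. ]
print(relu_backward(z, g))  # [0.  0.  0.  1.  1. ]
```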
0 votes, 2 answers

Why use tanh (or any other activation function)?

In machine learning, it is common to use activation functions like tanh, sigmoid, or ReLU to introduce non-linearity into a neural network. These non-linearities help the network learn complex relationships between input features and output…
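A minimal sketch of what that non-linearity buys, using assumed, hand-picked hidden weights for illustration: a purely linear model cannot fit XOR, but a fixed tanh hidden layer makes the same targets linearly separable.

```python
import numpy as np

# XOR: not linearly separable, so a purely linear model cannot fit it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0., 1., 1., 0.])

# Best linear fit (least squares with a bias term) predicts ~0.5 for every input.
Xb = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
print(Xb @ w)                  # roughly [0.5, 0.5, 0.5, 0.5]

# A fixed tanh hidden layer (hand-picked weights) separates the classes.
W1 = np.array([[ 20., -20.],
               [ 20., -20.]])
b1 = np.array([-10., 30.])
H = np.tanh(X @ W1 + b1)       # hidden features
Hb = np.hstack([H, np.ones((4, 1))])
w2, *_ = np.linalg.lstsq(Hb, y, rcond=None)
print(np.round(Hb @ w2, 3))    # close to [0, 1, 1, 0]
```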