9

I have built a convolutional neural network that needs to classify the test data as either 0 or 1. I am training the CNN with labels that are either 0 or 1, but when I run the code below I get the following result.

predictions = classifier.predict(x_test)

print(predictions)

[[0.0128037 ]
 [0.01182843]
 [0.01042355]
 [0.00906552]
 [0.00820154]
 [0.00726516]......

logloss_score = log_loss(y_test, predictions)
print(logloss_score)
0.047878393431377855

How do I get the results between 0 and 1? What do I need to modify in the above code?

Ethan
  • 1,633
  • 9
  • 24
  • 39
LIsa
  • 93
  • 1
  • 1
  • 3

4 Answers

14

What you have are predicted class probabilities. Since you are doing binary classification, each output is the probability of the first class for that test example.

To convert these to class labels you can apply a threshold:

import numpy as np

probas = np.array([[0.4],[0.7],[0.2]])
labels = (probas < 0.5).astype(int)  # np.int is deprecated; use the builtin int
print(labels)
[[1]
 [0]
 [1]]

For multiclass classification where you want to assign one class from multiple possibilities you can use argmax:

probas = np.array([[0.4, 0.1, 0.5],[0.7, 0.2, 0.1],[0.3, 0.4, 0.3]])
labels = np.argmax(probas, axis=-1)
print(labels)
[2 0 1]

And to get these as one-hot encoded arrays you can use LabelBinarizer:

from sklearn import preprocessing

lb = preprocessing.LabelBinarizer()
lb.fit_transform(labels)
array([[0, 0, 1],
       [1, 0, 0],
       [0, 1, 0]])

And for multilabel classification where you can have multiple output classes per example you can use thresholding again:

probas = np.array([[0.6, 0.1, 0.7],[0.7, 0.2, 0.1],[0.8, 0.9, 0.6]])
labels = (probas > 0.5).astype(int)  # np.int is deprecated; use the builtin int
print(labels)
[[1 0 1]
 [1 0 0]
 [1 1 1]]

Some packages provide separate methods for getting probabilities and labels, so there is no need to do this manually, but it looks like you are using Keras, which only gives you probabilities.
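For instance, scikit-learn classifiers expose both `predict_proba` and `predict`. A minimal sketch (the toy data here is made up purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy 1-D binary classification data (illustrative only)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)

print(clf.predict_proba(X))  # class probabilities, one column per class
print(clf.predict(X))        # hard class labels, thresholded internally
```

With Keras you only get the probabilities, so the thresholding above has to be done by hand.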

As a sidenote, this is not called "normalization" for neural networks. Normalization typically describes scaling your input data to fit in a nice range like [-1,1].

Imran
  • 2,381
  • 12
  • 22
  • you say 'each output is the probability of the first class for that test example'. Is the first class '0' in OP's case? In that case, in your example the second entry in 'probas' i.e. 0.7 means that it has high probability of belonging to first class i.e. '0' but final output shows [1]. What am I missing? – deadcode Jan 31 '19 at 11:30
  • You are exactly right. I had my first example flipped. I have fixed my answer. – Imran Jan 31 '19 at 14:32
  • I wonder if this is mentioned anywhere in the docs, couldn't find it. So this answer has been of huge help! – deadcode Feb 01 '19 at 06:43
  • For binary classification, why is the threshold 0.5? Why not any other value $\in [0,1]$? – M.M Mar 18 '19 at 16:02
  • Because we must always choose exactly one of the two classes, so we pick the more likely one. Imagine the estimated probabilities were 0.45 and 0.55 respectively, and we used a threshold of 0.6: Then we would pick neither class. Similarly imagine we used a threshold of 0.4: Then we would pick both classes! – Imran Mar 18 '19 at 16:06
  • It's definitely possible to pick a threshold other than 0.5. Neural net binary classification is basically an elaborate logistic regression that predicts log-odds that you can convert to a probability of membership in the "1" group. If your probability of being "1" is 0.55 and your threshold is 0.6, you don't classify that observation as "1" but as "0". Neural networks (and logistic regression) are classifiers in conjunction with a decision rule once you get the probability of belonging to one of the classes. – Dave Jun 12 '19 at 12:59
  • The question does not specify which library we're using, but I think classifier.predict() will output class probabilities (or predicted class labels), not log odds in the major Python machine learning libraries, therefore the only threshold that makes sense is 0.5. With that said, I think I misinterpreted MM's question to be about multiclass classification, not binary classification, so this should clarify my above comment. – Imran Jun 12 '19 at 13:34
  • The probability of being in class "1" is just a function of the log-odds. It absolutely makes sense to pick other thresholds. You may elect to use 0.5, but you don't have to. Without varying the threshold, we don't get ROC curves. – Dave Jun 12 '19 at 14:01
  • OK, I see what you are saying. Yes you are right. I'll try and put a note about this on my answer soon. – Imran Jun 12 '19 at 14:03
2

predictions = classifier.predict(x_test)

You have not provided the shape of your x_test, but based on the documentation of the predict function, which expects an array-like item, you are passing a valid input. Each output already shows the probability for the corresponding input. Because the predicted values are all smaller than 0.5, the predicted labels for your test data are all zero.
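A minimal sketch of that thresholding, using the first few predicted values shown in the question:

```python
import numpy as np

# Predicted probabilities copied from the question's output
predictions = np.array([[0.0128037], [0.01182843], [0.01042355]])

# Probabilities below 0.5 map to class 0
labels = (predictions > 0.5).astype(int)
print(labels.ravel())  # all zeros, since every probability is far below 0.5
```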

Green Falcon
  • 14,058
  • 9
  • 57
  • 98
1

The predictions you are getting are logits, meaning the sum across all categories is 1. So the largest number should correspond to the category you are looking for. To get the category, you can use argmax to find the index of the maximum number. The cross-entropy loss measures the discrepancy between the predicted values and the labels; in this case your log loss is low, which means the predictions are close to the true labels.

Ricky Han
  • 111
  • 1
  • I'm not sure that's quite right. She says it is a binary classification, so I think you are looking at the probability of the first class only for each test example. – Imran Feb 13 '18 at 02:48
  • If it's binary then it should output 2 logits! – Ricky Han Feb 13 '18 at 03:56
  • No, many packages will just give you one because this is all the information you need. It looks like she is using Keras, and Keras only outputs the probability of the first class for binary classification. – Imran Feb 13 '18 at 04:03
0

To convert your class probabilities to class labels, just pass them through argmax, which returns the index of the highest probability for each example:

prob_ = np.array([[0.12, 0.18, 0.2, 0.6], [0.7, 0.08, 0.12, 0.1],
                  [0.15, 0.4, 0.3, 0.15]])
labels = np.argmax(prob_, axis=-1)
print(labels)
[3 0 1]
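If one-hot vectors are needed instead of class indices, a short sketch of expanding the argmax result with `np.eye` (an alternative to the `LabelBinarizer` approach shown in the accepted answer):

```python
import numpy as np

labels = np.array([3, 0, 1])          # class indices from argmax
one_hot = np.eye(4, dtype=int)[labels]  # row i of the identity is the one-hot vector for class i
print(one_hot)
```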