Imbalanced dataset in MLP classifier in Python

Question

I am working with an imbalanced dataset and trying to build a predictive model with an MLP classifier. Unfortunately, the model assigns every observation in the test set to class "1", so the F1 score and recall for the other class in the classification report are 0. Does anyone know how to deal with this?

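from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, roc_curve, roc_auc_score, classification_report

model = MLPClassifier(solver='lbfgs', activation='tanh')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the second class in model.classes_
score = accuracy_score(y_test, y_pred)
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
roc = roc_auc_score(y_test, y_prob)
cr = classification_report(y_test, y_pred)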

score 3 · Accepted Answer · answered Aug 20 '17 at 16:51

You can try data resampling techniques. They fall into four categories: undersampling the majority class, oversampling the minority class, combining over- and undersampling, and creating an ensemble of balanced datasets.

These methods and more are implemented in imbalanced-learn, a Python library that is compatible with scikit-learn. See the IPython notebook for an example.
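To make the second category concrete, here is a minimal sketch of random oversampling of the minority class using only the standard library. The helper name `random_oversample` and the toy data are made up for illustration and are not part of imbalanced-learn or scikit-learn; it simply duplicates randomly chosen minority rows until all classes have the same count.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    # Start from a copy of the original data, then append duplicates
    X_res, y_res = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_res.append(rng.choice(rows))
            y_res.append(label)
    return X_res, y_res

# Toy data (assumed): 8 rows of class 1 and 2 rows of class 0
X_toy = [[float(i), float(i % 3)] for i in range(10)]
y_toy = [1] * 8 + [0] * 2

X_bal, y_bal = random_oversample(X_toy, y_toy)
print(Counter(y_toy), Counter(y_bal))
```

In practice you would probably apply this only to the training split (for example `X_train`, `y_train` from the question) and leave the test set untouched, so the reported metrics still reflect the real class distribution. imbalanced-learn offers the same idea as `RandomOverSampler` with a `fit_resample` method, alongside more elaborate samplers.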
