
I am working on a problem where the dependent variable consists of ordered classes, such as bad, good, very good.

How could I declare this problem in XGBoost, instead of treating it as ordinary classification or regression?

Thanks

mommomonthewind

2 Answers


You can train two XGBoost binary classifiers:

  • classifier 1 predicts whether a sample is at least good (i.e. good or very good)
  • classifier 2 predicts whether a sample is very good

Then, on unseen data:

  • if both are true, classify as very good
  • if only the first is true and the second is false, classify as good
  • if both are false, classify as bad
alexprice
  • What to do if first false but second true? – Ben Reiniger Oct 29 '19 at 15:50
  • If both classifiers are trained well, that should happen only rarely, and such samples should be classified as bad. If more tuning is needed, you can output probabilities and compare probabilities instead of labels. – alexprice Oct 30 '19 at 12:45
  • Indeed, this is probably a better situation than the regression setup in the other answer in the case of conflicting uncertainty. You could just output "I don't know," or if a decision is required, make sure the classifiers are probabilistic and well-calibrated. – Ben Reiniger Oct 30 '19 at 15:05
  • You can also use the prediction/probabilities of earlier labels as features for the higher labels. For example, the classifier 2 can be given the probability that classifier 1 already indicated it was at least 'good' as a feature – DrewH Feb 14 '20 at 21:42

I think you can use a regression setup, e.g. bad = 0, good = 0.5, very good = 1 as labels, and then postprocess the XGBoost output: pred_value < 0.25 ⇒ bad, 0.25 ≤ pred_value < 0.75 ⇒ good, and so on.