I'd like an easy way to tell whether a series of five points is getting better, getting worse, or staying flat. The points are the performance of a machine learning classifier when it is given fractions of the training data. For example, the fractions might be $0$-$20\%$, $20$-$40\%$, $40$-$60\%$, $60$-$80\%$, $80$-$100\%$. That is, given $0$-$20\%$ of the training data, how well does the classifier perform, on the assumption that more data should mean better performance? The performance metric is root mean squared error (RMSE), so lower is better. For a given set of classifiers, the performance data might look like this:

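| index | Model | $0$ | $1$ | $2$ | $3$ | $4$ |
|---|---|---|---|---|---|---|
| $0$ | ANN | $0.66$ | $0.64$ | $0.60$ | $0.58$ | $0.56$ |
| $1$ | RF | $0.93$ | $0.95$ | $1$ | $1.01$ | $1$ |
| $2$ | Linear Regression | $1.02$ | $1.07$ | $1.07$ | $1.1$ | $1.09$ |
| $3$ | DT | $1.09$ | $1.15$ | $1.18$ | $1.23$ | $1.21$ |
| $4$ | LSTM | $0.86$ | $0.83$ | $0.81$ | $0.81$ | $0.84$ |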

At first glance it is easy enough to eyeball which classifiers are getting better, staying flat, and so on. I can compute the slope of each row, but what is a good choice for the $X$ values (taking the performance values as $Y$)? For example, $X=[0,1,2,3,4]$ and $X=[0,0.1,0.2,0.3,0.4]$ produce different slopes. Am I overthinking this?
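
To make the two choices of $X$ concrete, here is a minimal sketch, assuming NumPy and an ordinary least-squares line fit via `np.polyfit`. The names `rmse`, `slope`, `x_index`, and `x_fraction` are my own, used only for illustration. It fits a line to each row of the table under both choices of $X$.

```python
import numpy as np

# RMSE per model, one value per training-data fraction (from the table above).
rmse = {
    "ANN": [0.66, 0.64, 0.60, 0.58, 0.56],
    "RF": [0.93, 0.95, 1.00, 1.01, 1.00],
    "Linear Regression": [1.02, 1.07, 1.07, 1.10, 1.09],
    "DT": [1.09, 1.15, 1.18, 1.23, 1.21],
    "LSTM": [0.86, 0.83, 0.81, 0.81, 0.84],
}

# The two candidate X vectors from the question.
x_index = np.array([0, 1, 2, 3, 4], dtype=float)
x_fraction = x_index / 10.0  # [0, 0.1, 0.2, 0.3, 0.4]


def slope(x, y):
    """Least-squares slope of y against x (degree-1 polynomial fit)."""
    return np.polyfit(x, np.asarray(y, dtype=float), 1)[0]


slopes_index = {name: slope(x_index, y) for name, y in rmse.items()}
slopes_fraction = {name: slope(x_fraction, y) for name, y in rmse.items()}

for name in rmse:
    print(f"{name:18s} index: {slopes_index[name]:+.4f}  "
          f"fraction: {slopes_fraction[name]:+.4f}")
```

Dividing $X$ by $10$ multiplies every slope by $10$, so the sign of each slope and the ranking of the models by slope are the same under both choices. Only the units change: RMSE per step in one case, RMSE per unit of the rescaled axis in the other.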
