
Imagine I set up a Kaggle competition with normalized stock data (e.g. price, volume, etc.) transformed by a random rotation matrix (so that it's less obvious what the features are). Is it possible for contestants to figure out that the data is stock-related? If yes, how?

EDIT:

The contest will be presented as a general machine learning contest with no specific background info. The data will not be time-series - just a set of samples, each containing a list of unnamed features plus a binary label.

When I asked the question, I was mostly wondering whether contestants might figure this out by examining the data with statistical analysis (e.g. PCA) or unsupervised machine learning methods (e.g. clustering).
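To make the setup concrete, here is a minimal sketch of what I mean, not the actual contest pipeline. It assumes synthetic correlated Gaussian columns as a stand-in for the real normalized features, and the helper names `random_rotation` and `covariance_spectrum` are made up for illustration. The one thing it shows for certain is that an orthogonal transform leaves the covariance eigenvalues and pairwise distances unchanged, so PCA variance spectra and distance-based clustering see the same structure as the unrotated data. Whether that structure is enough to recognize the data as stock-related is exactly what I am asking.

```python
import numpy as np

def random_rotation(dim, rng):
    """Draw a random orthogonal matrix via QR of a Gaussian matrix.

    Note: the result may include a reflection (determinant -1).
    """
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Fix column signs so the draw does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))

def covariance_spectrum(x):
    """Eigenvalues of the sample covariance, sorted descending (what PCA reports)."""
    return np.sort(np.linalg.eigvalsh(np.cov(x, rowvar=False)))[::-1]

rng = np.random.default_rng(0)
n_samples, n_features = 500, 5

# Hypothetical stand-in for normalized features: correlated Gaussian columns.
mixing = rng.standard_normal((n_features, n_features))
features = rng.standard_normal((n_samples, n_features)) @ mixing
features = (features - features.mean(axis=0)) / features.std(axis=0)

# Obfuscate the features with a random orthogonal transform.
rotation = random_rotation(n_features, rng)
rotated = features @ rotation

spectrum_before = covariance_spectrum(features)
spectrum_after = covariance_spectrum(rotated)

# Distance between the first two samples, before and after the transform.
dist_before = np.linalg.norm(features[0] - features[1])
dist_after = np.linalg.norm(rotated[0] - rotated[1])

print("same PCA spectrum:", np.allclose(spectrum_before, spectrum_after))
print("same pairwise distance:", np.isclose(dist_before, dist_after))
```

The individual columns of `rotated` are mixtures of the original features, so per-feature summaries change even though the overall geometry does not.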

Roy
  • How will the competition be presented? "Here is some undescribed artificial data, please predict the label"? Will the data be presented as a time-series? – Neil Slater Jul 03 '17 at 12:35
  • Yeah, some vague description like what you said. No, the data will not be presented as a time-series. – Roy Jul 04 '17 at 20:49

0 Answers