Most Popular

1500 questions
9
votes
1 answer

How does the YOLO algorithm detect objects if the grid cells are much smaller than the object in the test image?

In the YOLO algorithm, how do the grid cells output a prediction if some cells only see a small black portion of the car, given that the model was trained on datasets with full images?
Rishi Swethan
  • 101
  • 1
  • 2
9
votes
1 answer

How to get a confidence score for predictions?

In a regression problem, is it possible to calculate a confidence/reliability score for a given prediction from models like XGBoost or neural networks?
Henrique Nader
  • 511
  • 2
  • 5
  • 15
9
votes
1 answer

What is the BLEU score used in Google Brain's "Attention Is All You Need" paper?

Google Brain's Attention Is All You Need paper on sequence-to-sequence translation reports: Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2…
Imran
  • 2,381
  • 12
  • 22
9
votes
2 answers

Adding feature leads to worse results

I have a dataset with 20 variables and ~50K observations, and I created several new features using those 20 variables. I compared the results of a GBM model (using Python with XGBoost and LightGBM) and found that it doesn't matter what are the…
Yaron
  • 191
  • 1
  • 1
  • 4
9
votes
3 answers

How to train an XGBoost model on data that is too big for memory?

What are the best practices for training XGBoost (eXtreme gradient boosting) models on data that is too big to hold in memory at once? Splitting the data and training multiple models? Are there more elegant solutions?
Soerendip
  • 724
  • 1
  • 9
  • 16
9
votes
3 answers

Clustering of documents using the topics derived from Latent Dirichlet Allocation

I want to use Latent Dirichlet Allocation for a project and I am using Python with the gensim library. After finding the topics I would like to cluster the documents using an algorithm such as k-means (ideally I would like to use a good one for…
Swan87
  • 211
  • 1
  • 2
  • 3
9
votes
1 answer

Understanding batch normalization

In the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (here), before explaining the process of batch normalization, the paper tries to explain the issues related with (I am not getting what the…
figs_and_nuts
  • 833
  • 1
  • 5
  • 14
9
votes
2 answers

What is the difference between a concept class and a hypothesis?

The formal definition I have seen of a concept class is the class of all true functions, mathematically $f: X \rightarrow \{0,1\}$, and that of a hypothesis is $h: X \rightarrow \{0,1\}$. But most of the time they are used together. For example in…
user40687
  • 99
  • 1
  • 2
9
votes
2 answers

Ways to reconstruct shuffled pixels of a video file?

Suppose that you have a video file whose pixel order has been shuffled once. That is, a random order has been defined once and applied to all frames. Is there a known approach for retrieving the initial order of pixels? I have some ideas…
9
votes
2 answers

Why does decreasing the SGD learning rate cause a massive increase in accuracy?

In papers such as this I often see training curves with this kind of shape: In this case SGD was used with a factor of 0.9 and learning rate decreasing by a factor of 10 every 30 epochs. Why is there such a large decrease in error when the…
geometrikal
  • 533
  • 1
  • 5
  • 14
9
votes
1 answer

Convolutional network for classification, extremely sensitive to lighting

I trained a convolutional network to classify images of a mechanical component as good or defective. Though the test accuracy was high, I realized that the model performed poorly on images which had slightly different lighting. The features that…
9
votes
1 answer

Is it valuable to normalize/rescale labels in neural network regression?

Have there been any papers, or does anyone have any specific experience to know whether normalizing labels in a regression problem is likely to improve the performance of a neural network? I have labels that are in the range (0,1000) applying square…
davidparks21
  • 423
  • 4
  • 17
9
votes
2 answers

"Deep Noether's Theorem": Building in Symmetry Constraints

If I have a learning problem that should have an inherent symmetry, is there a way to subject my learning problem to a symmetry constraint to enhance learning? For example, if I am doing image recognition, I might want 2D rotational symmetry.…
user32280
9
votes
0 answers

AdaBoost implementation and tuning for high dimensional feature space in R

I am trying to apply the AdaBoost.M1 algorithm (trees as base learners) to a data set with a large feature space (~20,000 features) and ~100 samples in R. Several packages exist for this purpose: AdaBag, Ada and gbm.…
AfBM
  • 91
  • 2
9
votes
2 answers

Tensorflow regression model giving same prediction every time

import tensorflow as tf
x = tf.placeholder(tf.float32, [None,4]) # input vector
w1 = tf.Variable(tf.random_normal([4,2])) # weights between first and second layers
b1 = tf.Variable(tf.zeros([2])) # biases added to hidden…
Tarun
  • 93
  • 1
  • 1
  • 5