Most Popular
1500 questions
9 votes, 1 answer
How does the YOLO algorithm detect objects if the grid size is way smaller than the object in the test image?
In the YOLO algorithm, how do the grid cells output a prediction if some cells see only a small black portion of the car, given that the model was trained on datasets with full images?
Rishi Swethan
9 votes, 1 answer
How to get a confidence score for predictions?
In a regression problem, is it possible to calculate a confidence/reliability score for a certain prediction given models like XGBoost or Neural Networks?
Henrique Nader
9 votes, 1 answer
What is the BLEU score used in Google Brain's "Attention Is All You Need" paper?
Google Brain's Attention Is All You Need paper on sequence-to-sequence translation reports:
Our model achieves 28.4 BLEU on the WMT 2014 English-to-German
translation task, improving over the existing best results, including
ensembles, by over 2…
Imran
9 votes, 2 answers
Adding features leads to worse results
I have a dataset with 20 variables and ~50K observations, and I created several new features using those 20 variables.
I compared the results of GBM models (using Python's xgboost and LightGBM) and found that it doesn't matter what are the…
Yaron
9 votes, 3 answers
How to train an XGBoost model on data that is too big for memory?
What are the best practices for training XGBoost (eXtreme Gradient Boosting) models on data that is too big to hold in memory at once? Splitting the data and training multiple models? Are there more elegant solutions?
Soerendip
9 votes, 3 answers
Clustering of documents using the topics derived from Latent Dirichlet Allocation
I want to use Latent Dirichlet Allocation for a project and I am using Python with the gensim library. After finding the topics, I would like to cluster the documents using an algorithm such as k-means (ideally I would like to use a good one for…
Swan87
9 votes, 1 answer
Understanding batch normalization
In the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, before explaining the process of batch normalization, the paper tries to explain the issues related to (I am not getting what the…
figs_and_nuts
9 votes, 2 answers
What is the difference between a concept class and a hypothesis?
The formal definition of a concept class that I have seen is the class of all true functions, mathematically:
$f: X \rightarrow \{0,1\}$
and that of a hypothesis is:
$h: X \rightarrow \{0,1\}$
But most of the time they are used together. For example in…
user40687
9 votes, 2 answers
Ways to reconstruct shuffled pixels of a video file?
Suppose that you have a video file whose pixel order has been shuffled once. That is, a random order has been defined once and applied to all frames.
Is there a known approach for recovering the initial order of the pixels?
I have some ideas…
Denis Dollfus
9 votes, 2 answers
Why does decreasing the SGD learning rate cause a massive increase in accuracy?
In papers such as this I often see training curves with this kind of shape:
In this case SGD was used with a factor of 0.9 and learning rate decreasing by a factor of 10 every 30 epochs.
Why is there such a large decrease in error when the…
geometrikal
9 votes, 1 answer
Convolutional network for classification, extremely sensitive to lighting
I trained a convolutional network to classify images of a mechanical component as good or defective. Though the test accuracy was high, I realized that the model performed poorly on images which had slightly different lighting.
The features that…
Effective_cellist
9 votes, 1 answer
Is it valuable to normalize/rescale labels in neural network regression?
Have there been any papers, or does anyone have any specific experience to know whether normalizing labels in a regression problem is likely to improve the performance of a neural network? I have labels that are in the range (0,1000) applying square…
davidparks21
9 votes, 2 answers
"Deep Noether's Theorem": Building in Symmetry Constraints
If I have a learning problem that should have an inherent symmetry, is there a way to subject my learning problem to a symmetry constraint to enhance learning?
For example, if I am doing image recognition, I might want 2D rotational symmetry.…
user32280
9 votes, 0 answers
AdaBoost implementation and tuning for high dimensional feature space in R
I am trying to apply the AdaBoost.M1 algorithm (trees as base learners) to a data set with a large feature space (~20,000 features) and ~100 samples in R. There are a variety of packages for this purpose: adabag, ada and gbm.…
AfBM
9 votes, 2 answers
TensorFlow regression model giving same prediction every time
import tensorflow as tf
x = tf.placeholder(tf.float32, [None,4]) # input vector
w1 = tf.Variable(tf.random_normal([4,2])) # weights between first and second layers
b1 = tf.Variable(tf.zeros([2])) # biases added to hidden…
Tarun