Most Popular

1500 questions
9 votes, 5 answers

Why is 100% accuracy on test data not good?

I was asked this question in an interview and wasn't able to give a satisfactory answer, one that met neither the interviewer's expectations nor my own. The question was exactly as above; he later gave an example, asking why, if my model predicted the…
Rishabh Sharma
9 votes, 1 answer

What is meant by Distributed for a gradient boosting library?

I am reading the XGBoost documentation, which states that XGBoost is an optimized distributed gradient boosting library. What is meant by distributed? Have a nice day.
9 votes, 3 answers

Can we remove features that have zero-correlation with the target/label?

So I drew a pairplot/heatmap from the feature correlations of a dataset and see a set of features that have zero correlation both with every other feature and with the target/label. Reference code snippet in Python is below: corr =…
karthiks
9 votes, 0 answers

Why is my Keras model not learning image segmentation?

Edit: as it turns out, not even the model's initial creator could successfully fine-tune it. This is most likely a problem of implementation, or possibly related to the non-intuitive way in which the Keras batch normalization layer works. I'm trying…
Matt
9 votes, 1 answer

Validation loss is lower than the training loss

I am using an autoencoder for anomaly detection in warranty data. Architecture 1: The plot shows the training vs. validation loss based on Architecture 1. As we see in the plot, the validation loss is lower than the training loss, which is totally weird.…
Ashwini
9 votes, 1 answer

How to make two parallel convolutional neural networks in Keras?

I created two convolutional neural networks (CNNs), and I want to make these networks work in parallel. Each network takes a different type of image, and they join at the last fully connected layer. How can I do this?
N.IT
9 votes, 1 answer

Clipping the reward for the Adam optimizer in Keras

I would like to clip the reward in Keras. I saw it is possible to clip the norm and clip the value in SGD as follows: sgd = optimizers.SGD(lr=0.01, clipnorm=1.) sgd = optimizers.SGD(lr=0.01, clipvalue=0.5) What are clipping the norm and clipping…
user10296606
9 votes, 2 answers

Python - Converting 3D numpy array to 2D

I have a 3D matrix like this: array([[[ 0, 1], [ 2, 3]], [[ 4, 5], [ 6, 7]], [[ 8, 9], [10, 11]], [[12, 13], [14, 15]]]) and would like to stack them in a grid format, ending up with: array([[ 0, …
Tarlan Ahad
9 votes, 1 answer

keras' ModelCheckpoint not working

I'm trying to train a model in Keras and I'm using ModelCheckpoint to save the best model according to a monitored validation metric (in my case the Jaccard index). While I can see the model improving in TensorBoard, when I try to load the weights…
ILM91
9 votes, 1 answer

When does a decision tree perform better than a neural network?

I was experimenting with different modelling methods including KNN, Decision Trees, Neural Networks and SVM, trying to fit my data to see which works best. To my surprise, the decision tree works best, with a training accuracy of 1.0 and…
Suhail Gupta
9 votes, 2 answers

Display Images (url) Inside Pandas Dataframe

I would like to display images (mostly jpg and png formats) directly from their url link inside a pandas dataframe. Imagine I already have the following dataframe: id image_url 1 …
TwinPenguins
9 votes, 2 answers

Is there any consensus on choosing an appropriate ML approach?

I am studying data science at the moment and we are taught a dizzying variety of basic regression/classification techniques (linear, logistic, trees, splines, ANN, SVM, MARS, and so on), along with a variety of extra tools (bootstrapping,…
Brendan Hill
9 votes, 3 answers

R random forest on Amazon EC2 Error: cannot allocate vector of size 5.4 Gb

I am training random forest models in R using randomForest() with 1000 trees and data frames with about 20 predictors and 600K rows. On my laptop everything works fine, but when I move to Amazon EC2 to run the same thing, I get the error: Error:…
SOUser
9 votes, 2 answers

Dealing with feature vectors of variable length

How does one deal with a feature vector that can vary in size? Let's say per object, I calculate 4 features. In order to solve a certain regression problem, I may have 1, 2, or more of these objects (no more than 10). Thus, the feature vector is…
Otto Nahmee
9 votes, 3 answers

Interactive Graphing while logging data

I'm looking to graph and interactively explore live/continuously measured data. There are quite a few options out there, with plot.ly being the most user-friendly. Plot.ly has a fantastic and easy-to-use UI (easily scalable, pannable, easily…