Most Popular
1500 questions
9 votes, 3 answers
Text classification with thousands of output classes in Keras
Task:
I have a dataset of job titles and descriptions. The task is to predict the tags for each job from its title and description.
There are several tags for each job posting. Therefore, the number of labels for the model will be measured in tens of…
lemon
- 205
- 2
- 6
9 votes, 3 answers
How to use Cross Entropy loss in pytorch for binary prediction?
The PyTorch docs say this about cross entropy loss:
input has to be a Tensor of size (minibatch, C)
Does this mean that for binary (0,1) prediction, the input must be converted into an (N,2) tensor where the second dimension is equal to (1-p)?
So…
AAC
- 509
- 2
- 5
- 13
9 votes, 2 answers
Difference between using RMSE and nDCG to evaluate Recommender Systems
What kind of error measures do RMSE and nDCG give while evaluating a recommender system, and how do I know when to use one over the other? If you could give an example of when to use each, that would be great as well!
covfefe
- 293
- 4
- 7
9 votes, 2 answers
How does a FC layer work in a typical CNN
I am new to CNNs and NNs. I am reading this blog: CNN and I am confused about this part: What confuses me is the operation that will be performed on an input vector/matrix. Will we be using a typical ANN equation, "O = W.T * input"? And then a…
user57521
- 91
- 1
- 1
- 2
9 votes, 1 answer
Using class weights in Keras with multiple binary outputs which are not simply one-hot-encoded
My labels are binary vectors of length 5, e.g., [0, 0, 1, 1, 1].
My label set is very biased, 1-to-50, where the case [0, 0, 0, 0, 0] is very common while all other combinations are not. I'd like to weight the uncommon versions using the…
André Christoffer Andersen
- 336
- 3
- 9
9 votes, 1 answer
XGBoost: Quantifying Feature Importances
I need to quantify the importance of the features in my model. However, when I use XGBoost to do this, I get completely different results depending on whether I use the variable importance plot or the feature importances.
For example, if I use…
NLR
- 191
- 1
- 1
- 2
9 votes, 1 answer
Implementing simple linear regression using a neural network
I have been trying to implement simple linear regression with a neural network in Keras, in the hope of understanding how to work with the Keras library. Unfortunately, I am ending up with a very bad model.
Here is the implementation:
from pylab import…
mathisbetter
- 207
- 2
- 5
9 votes, 3 answers
Where can I find freely available multi-label datasets online?
I'm trying to find multi-label classification datasets that are available for free online.
By "multi-label" I mean that each instance can be labeled with anywhere from a single to $k$ labels, where $k$ is the total number of different labels in…
Bobson Dugnutt
- 195
- 1
- 8
9 votes, 3 answers
How to estimate the variance of regressors in scikit-learn?
Most classifiers in scikit-learn have a method predict_proba(x) that predicts class probabilities for x. How can I do the same thing for regressors?
The only regressor for which I know how to estimate the variance of the predictions is Gaussian process…
Vladislav Gladkikh
- 1,136
- 10
- 19
9 votes, 1 answer
Binary classification of every time series step based on past and future values
I'm currently facing a Machine Learning problem and I've reached a point where I need some help to proceed.
I have various time series of positional (x, y, z) data tracked by sensors. I've developed some more features. For example, I rasterized the…
Chris
- 245
- 2
- 9
9 votes, 1 answer
Implementation of Stochastic Gradient Descent in Python
I am attempting to implement a basic Stochastic Gradient Descent algorithm for a 2-d linear regression in Python. I was given some boilerplate code for vanilla GD, and I have attempted to convert it to work for SGD.
Specifically -- I am a little…
foobarbaz
- 203
- 1
- 2
- 4
9 votes, 3 answers
Binary (Unary) Recommendation System with Biased Views
I would like to create a content recommendation system based on binary click data that also takes views into account.
The content a user has been exposed to, and therefore has had the chance to click on, is currently biased by a rule-based system that…
elz
- 43
- 8
9 votes, 1 answer
How can I change the transparency of a histogram plot in Seaborn using PairGrid?
I'm using the Kaggle Titanic dataset. One feature is "Embarked", the city the passenger embarked from. The survival rate appears to correlate with it, but I'm worried it may just be correlated with the ticket Fare (which the survival rate definitely…
GrundleMoof
- 311
- 2
- 4
- 7
9 votes, 3 answers
Difference between indicator column and categorical identity column in tensorflow
I am learning TensorFlow and came across the different feature columns it uses. Of these types, two are categorical_identity_column and indicator_column. Both have been defined in the same way. As far as I understand, both convert…
Ankit Seth
- 1,821
- 14
- 27
9 votes, 1 answer
Which type auto encoder gives best results for text
I did a couple of examples of auto encoders for images and they worked fine. Now I want to build an auto encoder for text that takes a sentence as input and returns the same sentence. But when I try to use the same auto encoders as the ones I used for…
sspp
- 109
- 2
- 6