Highest Voted Questions - Data Science Stack Exchange

9

votes

3 answers

Which, if any, machine learning algorithms are accepted as being a good tradeoff between explainability and prediction?

Machine learning texts describing algorithms such as gradient boosting machines or neural networks often comment that these models are good at prediction, but this comes at the price of a loss of explainability or interpretability. Conversely,…

asked May 22 '16 at 23:56

Robert de Graaf

899
5
17

9

votes

3 answers

What recommendation engine for a situation where users can only see a fraction of all items?

I want to add a recommendation feature to a document management system. It is a server on which most company documents are stored. Employees browse the web interface and click to download (or read online) the documents they want. Each employee only…

asked May 17 '16 at 10:16

Nicolas Raoul

335
2
11

9

votes

3 answers

Split a list of values into columns of a dataframe?

I am new to Python and stuck on a particular problem involving dataframes. The image has a sample column; however, the data is not consistent. There are also some floats and NaN. I need these to be split across columns. That is each unique value…

asked May 17 '16 at 01:37

Drj

427
1
7
19

9

votes

1 answer

Is there any domain where Spiking Neural Networks outperform other algorithms (non-spiking)?

I'm reading about reservoir computing techniques like Echo State Networks and Liquid State Machines. Both of the methods involve feeding inputs to a population of randomly (or not) connected spiking neurons, and a relatively simple readout algorithm…

asked Apr 29 '16 at 15:43

Justas

191
3

9

votes

4 answers

k-means in R, usage of nstart parameter?

I am trying to use k-means clustering (using SQL Server + R), and it seems that my model is not stable: each time I run the k-means algorithm, it finds different clusters. But if I set nstart (in R k-means function) high enough (10 or more) it becomes…

asked Apr 28 '16 at 15:44

irimias

277
1
3
7
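A minimal sketch of the restart idea behind nstart, for readers of the k-means question above: run k-means from several random initialisations and keep the run with the lowest within-cluster sum of squares. This is plain Lloyd's algorithm on 1-D data in Python, not R's kmeans implementation; the function names, the toy data and the seed are assumptions made for illustration.

```python
import random


def kmeans_once(points, k, rng, n_iter=50):
    """One run of Lloyd's algorithm on 1-D points from a random start."""
    centers = rng.sample(points, k)  # pick k data points as initial centers
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Within-cluster sum of squares: the quantity k-means tries to minimise.
    wss = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, wss


def kmeans(points, k, nstart=1, seed=0):
    """Run kmeans_once nstart times and keep the lowest-WSS result."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, k, rng) for _ in range(nstart))
    return min(runs, key=lambda run: run[1])


points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
_, wss_single = kmeans(points, k=3, nstart=1)
centers, wss_multi = kmeans(points, k=3, nstart=20)
```

With the same seed, the first restart is identical in both calls, so the multi-start result can never have a higher within-cluster sum of squares than the single-start one. That is one way to see why a larger nstart tends to make the reported clusters more stable.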

9

votes

3 answers

What is the difference between residual sum of squares and ordinary least squares?

They look like the same thing to me but I'm not sure. Update: in retrospect, this was not a very good question. OLS refers to fitting a line to data and RSS is the cost function that OLS uses. It finds the parameters that give the least residual…

linear-regression

asked Apr 28 '16 at 00:07

sebastianspiegel

891
4
11
16
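To make the distinction in the question above concrete, here is a small sketch in pure Python: rss is the cost function, and ols_fit is the procedure that picks the slope and intercept minimising it, using the standard closed form for a single predictor. The function names and the toy data are made up for illustration.

```python
def rss(xs, ys, slope, intercept):
    """Residual sum of squares for the line y = slope * x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def ols_fit(xs, ys):
    """Closed-form simple linear regression: the (slope, intercept) minimising RSS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (assumed for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

slope, intercept = ols_fit(xs, ys)
best_rss = rss(xs, ys, slope, intercept)
```

Evaluating rss with any other slope or intercept on the same data gives a value at least as large as best_rss, which is what "least squares" means here.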

9

votes

2 answers

Why did Tufte call this a "superbly produced duck"?

I think I understand Tufte's concept of a "Duck" -- A graphic that is taken over by decorative forms. But I couldn't understand why he called this a duck (a "superbly produced" one at that). It seemed to me more functional than decorative.…

asked Apr 22 '16 at 17:03

thanks_in_advance

325
2
10

9

votes

1 answer

Difference between tf-idf and tf with Random Forests

I am working on a text classification problem using Random Forests as classifiers, and a bag-of-words approach. I am using the basic implementation of Random Forests (the one present in scikit), that creates a binary condition on a single variable…

asked Sep 16 '14 at 08:14

papafe

595
1
5
9

9

votes

2 answers

MLOps for beginner

I have one year of experience in ML and have so far used Jupyter notebooks to build static models, do some analysis and present my results to the bosses, as it was all POC. Now, we would like to scale the solution to become automatic and be able to…

asked Jul 03 '22 at 14:05

The Great

2,565
2
20
43

9

votes

5 answers

How does deep learning help in detecting multiple objects in a single image?

Let's say there are two cars in an image. How can it detect these cars, given that it can detect a single car in an image?

asked Apr 07 '16 at 19:15

Amanuel Negash

451
4
8

9

votes

2 answers

How to build a textual search engine?

I have an HTML string and want to find out if a word I supply is relevant in that string. Relevancy could be measured based on frequency in the text. An example to illustrate my problem: this is an awesome bike store bikes can be purchased…

asked Sep 12 '14 at 11:48

Hendrik

191
2
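The frequency-based relevancy mentioned in the search engine question above could be sketched roughly as follows, using only the Python standard library. The helper names are hypothetical, the sample string reuses only the visible part of the asker's example, and matching is exact, so "bike" does not match "bikes" unless stemming is added.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML string, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def term_frequency(html, word):
    """Share of tokens in the HTML's text that equal `word` (case-insensitive)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    tokens = re.findall(r"[a-z0-9]+", text)
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


sample = "<p>this is an awesome <b>bike</b> store bikes can be purchased</p>"
score_bike = term_frequency(sample, "bike")
score_car = term_frequency(sample, "car")
```

A raw frequency like this favours common words, so a real search feature would likely need stop-word handling or a weighting scheme on top of it. Note that this extractor also keeps any text inside script or style elements.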

9

votes

2 answers

Training Deep Nets on an Ordinary Laptop

Would it be possible for an amateur who is interested in getting some "hands-on" experience in designing and training deep neural networks to use an ordinary laptop for that purpose (no GPU), or is it hopeless to get good results in reasonable…

asked Feb 20 '16 at 07:24

Lior

223
1
2
6

9

votes

1 answer

Understanding Reinforcement Learning with Neural Net (Q-learning)

I am trying to understand reinforcement learning and Markov decision processes (MDP) in the case where a neural net is being used as the function approximator. I'm having difficulty with the relationship between the MDP where the environment is…

asked Feb 18 '16 at 10:11

CatsLoveJazz

247
1
10

9

votes

1 answer

Should I take random elements for mini-batch gradient descent?

When implementing mini-batch gradient descent for neural networks, is it important to take random elements in each mini-batch? Or is it enough to shuffle the elements at the beginning of the training once? (I'm also interested in sources which…

asked Feb 11 '16 at 16:35

Martin Thoma

18,880
35
95
169
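The two options in the mini-batch question above can be written down as a small index generator: shuffle once before training and reuse the same batches every epoch, or reshuffle at the start of each epoch. This is only a sketch of the two strategies, not a statement about which one trains better; the function name, parameters and seed are assumptions.

```python
import random


def minibatches(n_samples, batch_size, n_epochs, reshuffle_each_epoch=True, seed=0):
    """Yield (epoch, batch_indices) pairs covering every sample once per epoch."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)  # the initial shuffle happens in both strategies
    for epoch in range(n_epochs):
        if reshuffle_each_epoch and epoch > 0:
            rng.shuffle(indices)  # new batch composition for this epoch
        for start in range(0, n_samples, batch_size):
            yield epoch, indices[start:start + batch_size]


# Shuffle once: every epoch sees exactly the same batches in the same order.
shuffled_once = list(minibatches(10, 4, n_epochs=2, reshuffle_each_epoch=False))

# Reshuffle per epoch: batch composition can change from epoch to epoch.
reshuffled = list(minibatches(10, 4, n_epochs=3, reshuffle_each_epoch=True))
```

In both cases each epoch still visits every sample exactly once; the strategies differ only in whether the grouping into batches is fixed or redrawn.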

8

votes

1 answer

What is a "residual mapping"?

A recent paper by He et al. (Deep Residual Learning for Image Recognition, Microsoft Research, 2015) claims that they use up to 4096 layers (not neurons!). I am trying to understand the paper, but I stumble over the word "residual". Could somebody…

asked Jan 24 '16 at 16:49

Martin Thoma

18,880
35
95
169
