
I am having difficulty extracting some elements from a dataset. I imported the .csv file into Mathematica using the following line of code:

emails = Import[
  "/Users/Desktop/Spam/email.csv", { "Dataset", All}]

and it looks like

[screenshot of the imported dataset]

Then I tried to select only the texts with spam equal to 1 using this line:

spamEmail = emails[Select[spam = 1 &], "text"]

but it is not working. What I would like to do is to select from the dataset only the emails/text with spam=1 and count the words using WordCounts and see their frequency using WordCloud.

Could you please give me suggestions on how to select only those elements? Thanks

Math

1 Answer


You need to add "HeaderLines" -> 1 to Import to have "text" and "spam" as keys:

emails2 = Import["/Users/Desktop/Spam/email.csv", "Dataset", "HeaderLines" -> 1]

spamEmail = emails2[Select[#spam == 1 &], {"text"}]

or

spamEmail = emails2[Select[Slot["spam"] == 1 &], {"text"}]
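
Once the spam rows are selected, the word frequencies asked about in the question can be computed from the "text" column. The following is only a minimal sketch: the sample rows and the names sample, spamTexts and counts are made up for illustration and stand in for the imported file.

```mathematica
(* Hypothetical sample data standing in for the imported CSV *)
sample = Dataset[{
    <|"text" -> "win money now", "spam" -> 1|>,
    <|"text" -> "meeting at noon", "spam" -> 0|>,
    <|"text" -> "win a free prize now", "spam" -> 1|>}];

(* Keep the spam rows and extract the "text" column as a plain list of strings *)
spamTexts = Normal[sample[Select[#spam == 1 &], "text"]];

(* Join all spam texts into one string and count the words *)
counts = WordCounts[StringRiffle[spamTexts, " "]];

(* WordCloud accepts an association of word -> weight *)
WordCloud[counts]
```

Note that the sketch uses "text" rather than {"text"} as the second argument, so the query returns a flat list of strings instead of rows with a single key, which is more convenient to pass on to string functions.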

Alternatively, if you use Import as you did

emails = Import["/Users/Desktop/Spam/email.csv", { "Dataset", All}]

then you can use Part for filtering, i.e.,

emails[Select[#[[2]] == 1 &], {#[[1]] &}]
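
If the data was imported without "HeaderLines", each row is a plain list and the header row is itself a data row, so named keys such as "text" are not available. A minimal sketch of turning such data into rows with named columns follows; the sample rows and the names raw, keys, rows and emails3 are hypothetical.

```mathematica
(* Hypothetical rows as they might come from a CSV import without header handling *)
raw = {{"text", "spam"}, {"win money now", 1}, {"meeting at noon", 0}};

(* Use the first row as keys and build one association per remaining row *)
keys = First[raw];
rows = Map[AssociationThread[keys -> #] &, Rest[raw]];
emails3 = Dataset[rows];

(* Queries by column name now work *)
emails3[Select[#spam == 1 &], "text"]
```

With the imported data, raw would presumably be Normal[emails], assuming that the dataset really is a list of lists.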
kglr
  • I am getting the following error message: Failure. Message: Part text is not applicable to expressions of the form (String,). Tag: Dataset – Math Jul 15 '19 at 01:53
  • What is the output from emails // Take[#, 2] & // InputForm? – Rohit Namjoshi Jul 15 '19 at 02:05
  • @math.world Here's what I get from a randomly created Dataset that looks like it has the same structure as yours: https://i.stack.imgur.com/vJlHm.png -- Works like it says it should in the documentation for Dataset. – Michael E2 Jul 15 '19 at 03:20
  • @MichaelE2 In your screenshot the first row has a light gray background. In the OP's screenshot it is white. So I think the underlying data is a list of lists, not a list of associations. Which is why I asked for the InputForm. – Rohit Namjoshi Jul 15 '19 at 04:01
  • @RohitNamjoshi Thanks. I missed that. I think you're right. – Michael E2 Jul 15 '19 at 04:05
  • Thanks everyone for your help – Math Jul 16 '19 at 16:38