Association on a list

Question

I have a dataset of this form: sentence = {sentence1, sentence2, sentence3, ...}

and a list of labels: label = {1, 2, 3, 4, 5, 6, 7, ...}. Both lists have the same length.

I want to associate the first list with the second, so that sentence1 corresponds to label 1, and so on.

So I tried this code

ruleData = Table[Rule[sentence,label], {i, 1, Length[sentence]}]

so that evaluating ruleData[[7, 1]] gives me sentence7 (since its label is 7),

but my code is not working. Any hints?

Shouldn't you have sentence[[i]] etc in your table? p.s. check AssociationThread[label, sentence] — Kuba, Apr 08 '19 at 13:23
Thank you ! It works !
I want now to apply TFIDF to this variable ruleData and hence use this code : TFIDF = FeatureExtraction[ Join[First /@ Keys@ruleData[[All]], Last /@ Keys@ruleData[[All]]], "TFIDF"] but the output is telling me that there is nonatomic expression. As I wanted to apply the tf-idf for each sentence and for the total of sentences ... Any ideas of how to solve it ? — Tom Peterson, Apr 08 '19 at 13:37
You're pulling the Keys out of your association and then mapping First over it. Keys are usually atomic expressions (numbers, strings, etc.) and you cannot apply First and/or Last to atoms. It's difficult to understand what you're trying to do from a comment like this. — Sjoerd Smit, Apr 08 '19 at 13:45
I want to apply TF-IDF: relate the number of times a word appears in a sentence (term frequency) to the number of times that word appears in all the other sentences (document frequency). In other words, count how often a word appears in a given sentence, but reduce its importance if it also appears in many other sentences. — Tom Peterson, Apr 08 '19 at 13:59
To make an association, use ruleData = AssociationThread[{sentence1, sentence2, sentence3}, {1, 2, 3}]. You may want to flip those around so that ruleData[3] returns sentence3. To generate the TF-IDF, you can use GroupBy to group documents, and to group words once they are tokenized/normalized. Please ask that as a separate question with a minimal dataset. — alancalvitti, Apr 08 '19 at 14:06
https://mathematica.stackexchange.com/questions/194822/words-weighting-with-tf-idf — Tom Peterson, Apr 08 '19 at 15:16

score 1 · Answer 1 · answered Apr 08 '19 at 17:44

Your code can be fixed by indexing into the two lists inside Table, so that each iteration picks out the i-th sentence and the i-th label. Like so:

Contrived data

SeedRandom[1];
With[{n = 4},
  sentence = 
    StringJoin[#1, " ", #2] & @@@ Transpose@{RandomWord["Verb", n], RandomWord["Noun", n]};
  label = Range[n]];

Then

ruleData = Table[Rule[sentence[[i]], label[[i]]], {i, 1, Length[sentence]}]

gives

{"beget weevil" -> 1, "closet clanswoman" -> 2, 
 "panel seconder" -> 3,  "cauterize prominence" -> 4}

and

ruleData[[4, 1]]

gives

"cauterize prominence"

as expected.
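The same list of rules can also be built without an explicit index. A minimal sketch, assuming sentence and label are equal-length lists as in the question (the four strings are just the contrived data from above, typed in by hand):

```mathematica
(* Contrived data, written out literally so the block is self-contained *)
sentence = {"beget weevil", "closet clanswoman", "panel seconder", "cauterize prominence"};
label = Range[Length[sentence]];

(* Thread pairs the two lists element by element into a list of rules *)
ruleData = Thread[sentence -> label];

ruleData[[4, 1]]   (* "cauterize prominence" *)
ruleData[[4, 2]]   (* 4 *)
```

Thread requires the two lists to have the same length, which matches the situation described in the question.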

But there is a much better way. Mathematica has built-in associations, which behave like hash tables. They are very efficient and make working with key-value pairs very easy. Here is how you would set up and use an association for the contrived data.

 assocData = AssociationThread[label -> sentence]

You can retrieve the sentence that has 4 as its key by simply writing

assocData[4]

"cauterize prominence"
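If you also need to go from a sentence back to its label, a second association with the arguments swapped is one option. A sketch, assuming the same contrived data; the names byLabel and bySentence are made up for illustration:

```mathematica
(* Contrived data, written out literally so the block is self-contained *)
sentence = {"beget weevil", "closet clanswoman", "panel seconder", "cauterize prominence"};
label = Range[Length[sentence]];

byLabel = AssociationThread[label, sentence];     (* label -> sentence *)
bySentence = AssociationThread[sentence, label];  (* sentence -> label *)

byLabel[4]                     (* "cauterize prominence" *)
bySentence["panel seconder"]   (* 3 *)

(* Lookup accepts a default for keys that are not present *)
Lookup[byLabel, 9, "no such label"]   (* "no such label" *)
```

The reverse association only makes sense if the sentences are unique; with duplicate keys, AssociationThread keeps just the last value for each key.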

nufaie · Answer 2 · 2019-04-08T15:16:34.607

Call your list of sentences lst and your list of labels lbl.

Then

lst = {s1, s2, s3, s4, s5, s6, s7};
lbl = {1, 2, 3, 4, 5, 6, 7};

To combine (pair up) the two lists, use

data=Flatten /@ Transpose[{lst, lbl}]

output (* {{s1, 1}, {s2, 2}, {s3, 3}, {s4, 4}, {s5, 5}, {s6, 6}, {s7, 7}}*)

To pull out entries with Table, use

Table[data[[i]], {i, 1, Length[lst]}]
output (* {{s1, 1}, {s2, 2}, {s3, 3}, {s4, 4}, {s5, 5}, {s6, 6}, {s7, 7}}*)

To get entries 6 through 7 (here 7 == Length[lst] == Length[lbl]):

Table[data[[i]], {i, 6, Length[lst]}]

output(* {{s6, 6}, {s7, 7}} *)

To get a single entry, you can use the Table above or simply

data[[3]]

output (* {s3, 3} *)
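As a possible shortcut, Part also accepts a Span, so the Table calls are not strictly needed. A sketch, assuming the same symbolic lst and lbl; it also drops the Flatten, which should make no difference as long as the elements of lst are not themselves lists:

```mathematica
lst = {s1, s2, s3, s4, s5, s6, s7};
lbl = {1, 2, 3, 4, 5, 6, 7};

(* Transpose alone already produces {sentence, label} pairs *)
data = Transpose[{lst, lbl}];

data[[6 ;; 7]]    (* {{s6, 6}, {s7, 7}} *)
data[[6 ;;]]      (* entries from position 6 to the end: {{s6, 6}, {s7, 7}} *)
data[[3, 1]]      (* s3 *)
data[[All, 2]]    (* {1, 2, 3, 4, 5, 6, 7} *)
```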
