I suppose this is a trivial question to many of you, so please bear with me.

  1. I have a long list of words, list1. I would like to remove all words that contain a specific letter, say "o".

    list1 = {"world country", "capital", "population", "poppel", "poppy"}
    DeleteCases[list1, "o" ~~ __]  (*does not work*)
    

I have tried many things and I have looked through many examples to no avail. I have read through Patterns but have not found anything that I can use. Several suggestions here refer to the case where the pattern is a word in the list, like so:

list1 = {"world country", "capital", "population", "popp", "poppy"}
DeleteCases[list1, "popp"]

(* Out: {"world country", "capital", "population", "poppy"} (*works*) *)

But this is not usable, as I need a pattern that means "all words that contain the letter o".

  2. What I am really trying to solve is this: get rid of all words in list1 that contain any of the letters in list3, such as:

    list1 = {"world country", "capital", "population", "popp", "poppy"}
    list3 = {"o","q"}
    DeleteCases[list1, list3]  (*does not work*)
    
    (* My desired output would be {"capital"} *)
    
JSP
  • Take a look at StringFreeQ. – Kuba Feb 23 '16 at 14:55
  • Select[list1, StringContainsQ["o"]] – Jason B. Feb 23 '16 at 14:56
  • @Kuba, they forgot to include an operator form of StringFreeQ! – Jason B. Feb 23 '16 at 14:57
  • This is almost a duplicate of 72670, but that one was marked as a duplicate of a more general question. – Kuba Feb 23 '16 at 14:57
  • @JasonB I agree :) p.s. OP wants to remove them. – Kuba Feb 23 '16 at 14:58
  • @Kuba, yes of course, I meant to say Select[list1, Not@*StringContainsQ["o"]] – Jason B. Feb 23 '16 at 14:59
  • @Jason, Not @*... I dunno what to say. Why not ! StringContainsQ["o"], if you're going that way? :P – J. M.'s missing motivation Feb 23 '16 at 15:02
  • Welcome! I suggest the following:
    1. As you receive help, try to give it too, by answering questions in your area of expertise.
    2. Take the tour and check the faqs!
    3. When you see good questions and answers, vote them up by clicking the gray triangles, because the credibility of the system is based on the reputation gained by users sharing their knowledge. Remember to accept the answer, if any, that solves your problem, by clicking the checkmark sign!
    –  Feb 23 '16 at 15:06
  • @J.M., well probably since Select[list1, ! StringContainsQ["o"]] didn't return an answer. Using composition allowed it to keep its operator form – Jason B. Feb 23 '16 at 15:07
  • @Jason Uff, I didn't notice it was in operator form. You're right. (I still haven't gotten used to those things, actually.) – J. M.'s missing motivation Feb 23 '16 at 15:08
  • Thank you very much, awesome people here! The answer solves the problem very well. As @Louis says, I will vote it up. I will also try to learn and implement the suggestion about StringFreeQ which I understand from the comments is very useful too. – JSP Feb 24 '16 at 10:38
  • This is on hold because it is a duplicate or trivial. Maybe there is a site for infrequent users and newbies of Mathematica? I cannot see where the duplicate question is, and this one is certainly quite informative too. I think the answers here are really too good to be deleted. I upvoted the answer by MarcoB but could not do the same for the suggestion by @Kuba. Also, browsing a lot shows that many users have problems with patterns. I struggle a lot with patterns and have questions about that issue; where can one ask such questions? Thanks – JSP Feb 25 '16 at 16:12

1 Answer

list1 = {"world country", "capital", "population", "popp", "poppy"}
list3 = {"o", "q"}
Select[list1, Not@StringContainsQ[#, Alternatives @@ list3] &]  

(* Out: {"capital"} *)
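
For completeness, here is a minimal sketch of why the DeleteCases attempts in the question return the list unchanged, and one way to keep DeleteCases if you prefer it. DeleteCases matches list elements structurally as whole expressions, so a string pattern such as "o" ~~ __ is never tested against the characters of each string, and list3 is only compared with the elements as a whole list. Attaching an explicit string test with a Condition is one way around this. The pattern variable s is just an illustrative name.

```wolfram
list1 = {"world country", "capital", "population", "popp", "poppy"};
list3 = {"o", "q"};

(* Test the string contents explicitly: delete any string containing "o" *)
DeleteCases[list1, s_String /; StringMatchQ[s, ___ ~~ "o" ~~ ___]]
(* Out: {"capital"} *)

(* Several letters: delete every string that is not free of the entries in list3 *)
DeleteCases[list1, s_String /; ! StringFreeQ[s, list3]]
(* Out: {"capital"} *)
```

The Select form above is arguably easier to read; this variant is only meant to show what the pattern in DeleteCases has to look like.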

Just to flesh out this answer a bit, and because timing things is fun, let's explore Martin's excellent suggestion of using StringFreeQ rather than Not@StringContainsQ in the Select expression.

In order to appreciate the difference, however, we need a longer word list: enter the SOWPODS word list, used by English-language Scrabble players outside of North America:

sowpods = 
 Import["http://www.freescrabbledictionary.com/sowpods/download/sowpods.txt", "List"][[3;;]];

Length[sowpods]
(* Out: 267 751 *)

and time the difference, considering also that StringFreeQ can take a list of patterns directly, so Alternatives can also be removed:

rejectlist = {"o", "q", "a", "e"};

Select[sowpods, Not@StringContainsQ[#, Alternatives @@ rejectlist] &]; // RepeatedTiming
Select[sowpods, StringFreeQ[#, Alternatives @@ rejectlist] &]; // RepeatedTiming
Select[sowpods, StringFreeQ[#, rejectlist] &]; // RepeatedTiming

(* Out: 
{1.8, Null}
{0.61, Null}
{0.368, Null}
*)

The most direct StringFreeQ approach handily wins.
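
One more variant that may be worth timing, as a sketch only: StringFreeQ also accepts a list of strings as its first argument and returns a list of True/False values, which Pick can use as a mask. I have not timed this, so no claim is made about how it compares with the Select versions. The short words list is a made-up sample for illustration, not part of SOWPODS.

```wolfram
rejectlist = {"o", "q", "a", "e"};
words = {"world country", "capital", "rhythm", "sky", "queue"};  (* hypothetical sample list *)

(* One call over the whole list gives a True/False mask *)
mask = StringFreeQ[words, rejectlist]
(* Out: {False, False, True, True, False} *)

(* Keep only the words whose mask entry is True *)
Pick[words, mask]
(* Out: {"rhythm", "sky"} *)
```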

MarcoB
  • @MartinBüttner - your golfing instincts can't stand the extra 20 bytes! – Jason B. Feb 23 '16 at 15:23
  • @JasonB Of course not. :P (It also seems more intuitive and potentially more efficient though.) – Martin Ender Feb 23 '16 at 15:24
  • @MartinBüttner That's an excellent point. I added timings to my answer. – MarcoB Feb 23 '16 at 18:07
  • @MartinBüttner another good point indeed. I also added the Alternatives-free version to the timing: it is quite a bit faster! – MarcoB Feb 23 '16 at 19:16
  • @Marco From 10.4, StringContainsQ is in C (see its new PrintDefinitions) and it has an operator form. Here are the three timings I get (in the same order as yours, and using the operator form for the first input): {0.475, Null}, {0.668, Null}, {0.428, Null}. –  Mar 06 '16 at 16:08
  • @Xavier thank you for the update! I'll try to add the new timings when I get 10.4 – MarcoB Mar 06 '16 at 16:54