3

I'm trying to get better at pattern matching, but even after years of using MMA it's still a murky concept. The docs are actually pretty good in this regard, but I still struggle. In this situation I have a list of real estate items (all strings) that I'm trying to clean up. I've tried the two methods below and have not been able to delete the "HVAC" elements from this list.

DeleteCases[featureList, StringContainsQ [#,"HVAC"] & ]

and

DeleteCases[featureList, __~~"HVAC"~~__]

Both return the original list unchanged:

{ML #,Property Type,SubSpcSqFt,Land Sz SF,Building Type,List Date,Yr Blt,Address,Area,Class,Clear Ceiling Ht (Feet),Lot Frontage (ft),Lot Depth (ft),Zone,Price,Sale Type,Transaction Type,Zoning/Land Use,Amenities-HVAC System,Building Type-Freestanding,HVAC-See Realtor Remarks,HVAC-Baseboard,HVAC-Central A/C,HVAC-Common Water Heater,HVAC-Electric,HVAC-Forced Air,HVAC-Heat Pump,HVAC-Hot Water,HVAC-In-Floor,HVAC-Make-Up Air,HVAC-Mixed,HVAC-None,HVAC-Radiant,HVAC-Rooftop,HVAC-Separate Controls,HVAC-Separate HVAC Units,HVAC-Separate Water Heaters,HVAC-Space Heaters,HVAC-Steam,HVAC-Window A/C}

What is the correct way to filter out any string containing "HVAC"?

BBirdsell

2 Answers

10
featureList = {"ML #", "Property Type", "SubSpcSqFt", "Land Sz SF", 
   "Building Type", "List Date", "Yr Blt", "Address", "Area", "Class",
    "Clear Ceiling Ht (Feet)", "Lot Frontage (ft)", "Lot Depth (ft)", 
   "Zone", "Price", "Sale Type", "Transaction Type", 
   "Zoning/Land Use", "Amenities-HVAC System", 
   "Building Type-Freestanding", "HVAC-See Realtor Remarks", 
   "HVAC-Baseboard", "HVAC-Central A/C", "HVAC-Common Water Heater", 
   "HVAC-Electric", "HVAC-Forced Air", "HVAC-Heat Pump", 
   "HVAC-Hot Water", "HVAC-In-Floor", "HVAC-Make-Up Air", 
   "HVAC-Mixed", "HVAC-None", "HVAC-Radiant", "HVAC-Rooftop", 
   "HVAC-Separate Controls", "HVAC-Separate HVAC Units", 
   "HVAC-Separate Water Heaters", "HVAC-Space Heaters", "HVAC-Steam", 
   "HVAC-Window A/C"};

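The following are all equivalent. DeleteCases expects a pattern rather than a predicate function, so the string test is wrapped in PatternTest (_?test); Select takes the predicate directly: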

DeleteCases[featureList, _?(StringContainsQ[#, "HVAC"] &)]

DeleteCases[featureList, _?(StringContainsQ["HVAC"])]

DeleteCases[featureList, _?(StringMatchQ[#, ___ ~~ "HVAC" ~~ ___] &)]

DeleteCases[featureList, _?(StringMatchQ[___ ~~ "HVAC" ~~ ___])]

Select[featureList, StringFreeQ["HVAC"]]
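
To see why the two attempts in the question return the list unchanged while the PatternTest form works, here is a minimal sketch on a small sample. The four-element `sample` list and the variable names are made up for illustration; they are not part of the original data.

```mathematica
(* Hypothetical four-element sample, not the full featureList *)
sample = {"Zone", "Amenities-HVAC System", "HVAC-Steam", "Price"};

(* A pure function contains no pattern objects, so as a pattern it only
   matches an element that is literally that Function expression *)
viaFunction = DeleteCases[sample, StringContainsQ[#, "HVAC"] &];

(* A StringExpression is understood by string functions such as StringMatchQ,
   but used as a general pattern it does not match a plain string *)
viaStringPattern = DeleteCases[sample, __ ~~ "HVAC" ~~ __];

(* PatternTest (_?test) applies the predicate to every element *)
viaPatternTest = DeleteCases[sample, _?(StringContainsQ["HVAC"])];

{viaFunction === sample, viaStringPattern === sample, viaPatternTest}
(* {True, True, {"Zone", "Price"}} *)
```

The first two calls leave `sample` untouched, which matches what the question reports; only the third actually runs the string test on each element.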
Chris Degnen
  • It would be fine to add the alternative solution: Select[featureList, ! StringContainsQ[#, "HVAC"] &] – andre314 Mar 07 '20 at 15:03
  • Excellent answer. I found this doc helpful in understanding the behaviour of this answer: https://reference.wolfram.com/language/ref/PatternTest.html – BBirdsell Mar 07 '20 at 16:37
4

In addition:

DeleteCases[featureList, str_/;StringContainsQ[str, "HVAC"]]

Combining it with a regular expression is perhaps more powerful.

For example (delete cases where str begins with 'HVAC' only):

 DeleteCases[featureList, str_/;StringContainsQ[str, 
  RegularExpression["^HVAC"]]]

{ML #, Property Type, SubSpcSqFt, Land Sz SF, Building Type, List Date, Yr Blt, Address, Area, Class, Clear Ceiling Ht (Feet), Lot Frontage (ft), Lot Depth (ft), Zone, Price, Sale Type, Transaction Type, Zoning/Land Use, Amenities-HVAC System, Building Type-Freestanding}
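
For the simple "begins with" case, the built-in StringStartsQ should give the same result without a regular expression. A small sketch follows, assuming a made-up `sample` list rather than the real featureList; the variable names are illustrative only.

```mathematica
(* Hypothetical four-element sample, not the full featureList *)
sample = {"Zone", "Amenities-HVAC System", "HVAC-Steam", "Price"};

(* Operator form of StringStartsQ used as the PatternTest predicate *)
prefixOnly = DeleteCases[sample, _?(StringStartsQ["HVAC"])]
(* {"Zone", "Amenities-HVAC System", "Price"} *)

(* The anchored regular expression from above, for comparison *)
viaRegex = DeleteCases[sample,
   str_ /; StringContainsQ[str, RegularExpression["^HVAC"]]];

prefixOnly === viaRegex
(* True *)
```

The regular expression remains the more flexible choice once the condition is more involved than a fixed prefix.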

And, of course, there is also Pick:

Pick[featureList,StringContainsQ[featureList,"HVAC"],False]
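
The Pick version works because StringContainsQ, given a list of strings, returns a list of True/False values, and Pick keeps the elements whose entry equals the third argument. A minimal sketch on a made-up `sample` list (the names `sample` and `mask` are illustrative only):

```mathematica
(* Hypothetical four-element sample, not the full featureList *)
sample = {"Zone", "Amenities-HVAC System", "HVAC-Steam", "Price"};

(* One Boolean per string: True where "HVAC" occurs *)
mask = StringContainsQ[sample, "HVAC"]
(* {False, True, True, False} *)

(* Keep the elements whose mask entry is False *)
kept = Pick[sample, mask, False]
(* {"Zone", "Price"} *)
```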
user1066