Among the functions you can use in Proc SQL is PRXMATCH. At a single stroke, this increases the power of the SELECT statement quite dramatically. Now you can select records that match regular expressions.
This first example looks for product names that contain words beginning with “h” and ending with “r”. For matching records, PRXMATCH returns the position at which the matched substring starts; if there is no match, it returns 0.
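A query of this kind might look like the sketch below. The table `work.products`, the column `product_name`, the alias `match_pos` and the exact pattern are all assumptions made for illustration; the pattern was chosen to be consistent with the matches discussed further down, and the three rows are product names taken from those matches.

```sas
/* Hypothetical sample table: the name and column are assumptions. */
data work.products;
    length product_name $40;
    input product_name $40.;
    datalines;
Large Hover Mower
Easy Patio Heater
Hand Cultivator (Wood)
;
run;

proc sql;
    /* Assumed pattern: a word boundary, "h" or "H", any run of   */
    /* characters, then "r" or "R" followed by a word boundary.   */
    select product_name,
           prxmatch('!\b[Hh].*[rR]\b!', product_name) as match_pos
    from work.products
    where calculated match_pos > 0;
quit;
```

With this pattern, `match_pos` should come out as 7, 12 and 1 for the three rows, because PRXMATCH counts positions from 1.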
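A quick refresher on regular expressions:
The pattern sits between a pair of delimiters. “/” is the usual choice, but other punctuation characters such as “!” work too.
“\b” matches a word boundary, that is, the point between a word character and a non-word character.
“[Hh]” matches a single character that is either “H” or “h”.
“.” matches any single character, and “*” means “zero or more of the preceding item”.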
Among the matches found by the above query are:
Large Hover Mower
Easy Patio Heater
Hand Cultivator (Wood)
The last of these is one that was not wanted – there is a word boundary between the “h” and the “r”. We could get rid of this in a number of ways, for example by changing the regular expression to “!\b[Hh]\w*[rR]\b!”, where “\w” matches any “word character”. Word characters are defined as alphanumerics, plus the underscore character.
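As a rough check on the tightened expression, the sketch below runs it against the three product names listed above. The table `work.products`, the column `product_name` and the alias `match_pos` are assumed names, not taken from the original query. There is no WHERE clause here, so the non-matching row stays visible with a position of 0.

```sas
/* Hypothetical sample table holding the three names listed above. */
data work.products;
    length product_name $40;
    input product_name $40.;
    datalines;
Large Hover Mower
Easy Patio Heater
Hand Cultivator (Wood)
;
run;

proc sql;
    /* \w* only crosses word characters, so the "h" and the "r" */
    /* must now belong to the same word.                        */
    select product_name,
           prxmatch('!\b[Hh]\w*[rR]\b!', product_name) as match_pos
    from work.products;
quit;
```

“Hover” and “Heater” should still be found, at positions 7 and 12, while “Hand Cultivator (Wood)” should now return 0.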
Here is another example using the same dataset, in which we look for cats and dogs:
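A minimal sketch of such a query follows. The pattern “/cat|dog/i” is an assumption that fits the description given next, and `work.products` and `product_name` are assumed names. “Seed Catalogue” is a made-up row, added only to show what happens when a pattern has no word boundaries.

```sas
/* Hypothetical sample table; "Seed Catalogue" is an invented row. */
data work.products;
    length product_name $40;
    input product_name $40.;
    datalines;
Kennel (Large Dog)
Seed Catalogue
Easy Patio Heater
;
run;

proc sql;
    /* Assumed pattern: "cat" or "dog", ignoring case. */
    select product_name
    from work.products
    where prxmatch('/cat|dog/i', product_name) > 0;
quit;
```

In this sketch the first two rows should be selected: “Dog” matches directly, and “Catalogue” matches because nothing in the pattern requires “cat” to be a whole word.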
Here the “|” means “or”, and the “i” at the end of the regular expression makes it case-insensitive. Among the matches found are:
Kennel (Large Dog)
This time the regular expression said nothing about word boundaries.