A Different RegEx Solution

In my last post about regular expressions (regex), I offered up a solution to requirements to extract text a substring from a larger blob of text. Over the weekend, I was thinking about this solution and wanted to know if it was possible to meet the requirements using a solution based solely on a regular expression. It turns out it is.
The Requirement
For a refresher, here is what the requirement was:
Let’s assume we are working on a software project and need to return rows from a database where the word ‘Ipsum’ is followed by a space and then a five- or six-letter word. Instead of returning the entire text or a substring from the beginning of the text, we need to return the matching text and up to six characters before and after the match.
Here is the query we used to solve this requirement.
I was pretty happy with this solution, but the more I thought about it, the more it seemed like something Rube Goldberg would create. I realized we could solve this requirement by trimming out a lot of fluff.
The Full RegEx Way
Here is the query I came up with. It is more elegant and likely offers better performance.
Here, I am adding a range (.{0,6}
) at our original match’s beginning and end. This range translates into: “zero to six characters”. So, instead of using the starting position of our match along with the match itself andsubstr()
(and other functions), we can make the pattern match all the text we want returned.
The complete translation of this pattern is now: zero to six characters, followed by the word ipsum
, followed by a five to six-letter word, followed by zero to six characters. You may be wondering why I used 0,6
. I used this range because ipsum
could be the start of the string, and the trailing five or six-letter word might be at the very end of the string.
The Result
The results from this query resemble the output below:
The snippet in this query does not exactly match our previous example, but if you count the characters, you can see that this example better meets the requirements.
Wrap Up
Like many things in a developer’s toolkit, using regular expressions to solve problems is not an exact science. There can be more than one way to tackle a problem. In this post, I showed how we can meet our requirements using a solution purely based on regular expressions. For more information on regular expressions in MySQL, check out the documentation.