Regular Expressions in MySQL

Regular expressions (regex) are known to solve complex comparison problems. When it comes to Sql, numbers of developers don’t even know MySQL supports regular expressions. Let us directly jump to a couple of examples:
 

SELECT name FROM students WHERE name REGEXP  ‘^(A|E|I|O|U)’

Above SQL will retrieve students with name starting with character ‘A’, ‘E’, ‘I’, ‘O’ or ‘U’

 
SELECT name FROM students WHERE hobbies REGEXP  ‘[[:<:]]badminton[[:>:]]’

Above SQL will retrieve students having a word ‘badminton’ in `hobbies`

 

MYSQL METACHARACTERS:

  • . match any character 
  • ? match zero or one
  • * match zero or more
  • + match one or more
  • ^ beginning of line
  • $ end of line
  • {n} match n times
  • {m,n} match m through n times
  • {n,} match n or more times
  • [abc] match one of enclosed chars
  • [^xyz] match any char not enclosed
  • | separates alternatives
  • [:class:] match a character class
    • [:alpha:] for letters
    • [:space:] for whitespace
    • [:punct:] for punctuation
    • [:upper:] for upper case letters
  • [[:<:]] match beginning of words
  • [[:>:]] match ending of words

 

CAUTION:

  1. REGEXP operator works in bit-wise fashion. It means it may produce unexpected results while being used against multi-byte character set data.
  2. Queries with REGEXP can’t use indexes and may slow down the performance.
  3. MySQL supports POSIX regular expressions which lack negative lookaheads.

 

CONCLUSION:

While MySQL has extensive regex support and can be a handy and powerful tool, it has some downsides and can affect the SQL performance causing more problems than it can solve. So, if results can be achieved with LIKE or other string function, REGEXP must be avoided.

 
happy coding…