Print

Regular Expressions in SQL

What?
Practice makes perfect. Or in my case, any practice is a start. This article serves as a quick note on how to use regular expressions within SQL statements:

How?
For the following examples, I am pretending to select rows from a table called `STUDENTS`.

Oracle PL/SQL
Looking for abnormal data, note the circumflex to exclude the clean alphanumeric rows:
copyraw
SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9]')
-- returns all rows where the student surname contains non-alphanumeric characters
-- eg.  O'Brien will be returned

SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9\'']')
-- returns all rows where the student surname contains non-alphanumeric characters and does not contain apostrophes
-- eg.  O'Brien will NOT be returned

-- Other REGEXP Functions for Oracle:
-- -- REGEXP_REPLACE(country_name, '(.)', '\1 ')
-- -- REGEXP_INSTR(email, '\w+@\w+(\.\w+)+') > 0
-- -- REGEXP_SUBSTR('oracle', 'o r a c l e', 1, 1, 'x')
  1.  SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9]') 
  2.  -- returns all rows where the student surname contains non-alphanumeric characters 
  3.  -- eg.  O'Brien will be returned 
  4.   
  5.  SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9\'']') 
  6.  -- returns all rows where the student surname contains non-alphanumeric characters and does not contain apostrophes 
  7.  -- eg.  O'Brien will NOT be returned 
  8.   
  9.  -- Other REGEXP Functions for Oracle: 
  10.  -- -- REGEXP_REPLACE(country_name, '(.)', '\1 ') 
  11.  -- -- REGEXP_INSTR(email, '\w+@\w+(\.\w+)+') > 0 
  12.  -- -- REGEXP_SUBSTR('oracle', 'o r a c l e', 1, 1, 'x') 


Microsoft T-SQL
Note the circumflex/hat to negate the expression in the first example, and then the use of NOT LIKE in the second example:
copyraw
SELECT * FROM [STUDENTS] WHERE [S_SURNAME] LIKE '%[^a-zA-Z0-9]%'
-- returns all rows where the student surname contains non-alphanumeric characters
-- eg.  O'Brien will be returned

SELECT * FROM [STUDENTS] WHERE [S_SURNAME] NOT LIKE '%[a-zA-Z0-9]%'
-- returns all rows where the student surname does NOT contain a letter or number
-- eg.  O'Brien will NOT be returned as it contains letters
  1.  SELECT * FROM [STUDENTS] WHERE [S_SURNAME] LIKE '%[^a-zA-Z0-9]%' 
  2.  -- returns all rows where the student surname contains non-alphanumeric characters 
  3.  -- eg.  O'Brien will be returned 
  4.   
  5.  SELECT * FROM [STUDENTS] WHERE [S_SURNAME] NOT LIKE '%[a-zA-Z0-9]%' 
  6.  -- returns all rows where the student surname does NOT contain a letter or number 
  7.  -- eg.  O'Brien will NOT be returned as it contains letters 


MySQL
And my favorite. Note the circumflex as we are still excluding non-alphanumeric rows.
copyraw
SELECT * FROM Students WHERE studentSurname REGEXP '[^a-zA-Z0-9]';  

or 

SELECT * FROM Students WHERE studentSurname RLIKE '[^a-zA-Z0-9]';

-- returns all rows where the student surname contains non-alphanumeric characters
-- eg.  O'Brien will be returned
  1.  SELECT * FROM Students WHERE studentSurname REGEXP '[^a-zA-Z0-9]'
  2.   
  3.  or 
  4.   
  5.  SELECT * FROM Students WHERE studentSurname RLIKE '[^a-zA-Z0-9]'
  6.   
  7.  -- returns all rows where the student surname contains non-alphanumeric characters 
  8.  -- eg.  O'Brien will be returned 

Category: Databases :: Article: 545