Know Your MYSQL: How to use regular expression in MYSQL


MySQL offers the ability to use regular expressions to perform complex searches against your data. A regular expression is a tool that provides for a concise and flexible way to identify strings of text based on user-defined patterns.
This article will discuss the MySQL regular expression operators, review their use and syntax, and identify the constructs and special characters that can be used in a MySQL regular expression, as well as provide a few examples of their use.
MySQL Regular Expression Operators
The following operators are used in MySQL to perform regular expression operations. These are used in a WHERE clause similar to the well-known and often used LIKE operator.
  • REGEXP: The pattern matching operator for using regular expressions.
  • NOT REGEXP: The negation of the REGEXP operator.
  • RLIKE: A synonym for the REGEXP operator.
MySQL Regular Expression Syntax
The basic syntax used for MySQL regular expression operations is:
-- For the REGEXP Operator
SELECT {COLUMN_NAME} FROM {TABLE_NAME} WHERE {COLUMN_NAME} REGEXP '{REGEXP_PATTERN}';

-- For the NOT REGEXP Operator
SELECT {COLUMN_NAME} FROM {TABLE_NAME} WHERE {COLUMN_NAME} NOT REGEXP '{REGEXP_PATTERN}';

-- For the RLIKE Alias Operator
SELECT {COLUMN_NAME} FROM {TABLE_NAME} WHERE {COLUMN_NAME} RLIKE '{REGEXP_PATTERN}';
To provide more detailed, yet simple, example of a MySQL regular expression operation, take the following statement. It will retrieve all the columns of each record in the table PRICE where the PRICELIST_ID matches the pattern specified (starts with the numeric range 0-9 occurring one or more times, followed by an ‘_’ (underscore), and then the character sequence ‘USD’.
SELECT * FROM PRICE WHERE PRICELIST_ID REGEXP '^[0-9]+_USD';
Another example of a MySQL regular expression operation, can be shown in the following statement. It will retrieve all columns of each record from the PRICE table where the PRICE_ID matches the pattern specified (starts with an O, followed by and ‘_’ (underscore), then the numeric range 0-9 occurring one or more times, followed by and ‘_’ (underscore), then ending with either the character sequence USD, or BRA.
SELECT * FROM PRICE WHERE PRICE_ID REGEXP '^O_[0-9]+_[USD|BRA]';
MySQL REGEXP Constructs and Special Characters
A MySQL regular expression may use any of the following constructs and special characters to construct a pattern for use with the REGEXP operators. The construct or special character is shown, followed by a description of each and what operations in performs within the pattern for the regular expression.
  • ^ : Match the beginning of a string.
  • $ : Match the end of a string.
  • . : Match any character (including carriage return and newline characters).
  • a* : Match any sequence of zero or more a characters.
  • a+ : Match any sequence of one or more a characters.
  • a? : Match either zero or one a characters.
  • de|abc : Match either of the character sequences, de or abc.
  • (abc)* : Match zero or more instances of the character sequence abc.
  • {1},{2,3} : Provides a more general way of writing regular expressions that match many occurences of the previous atom (or “piece”) of the pattern. i.e. a? can be written as a{0,1}.
  • [a-dX],[^a-dX] : Matches any character that is (or is not, if ^ is used) either abcd, or X. A “-” character between two other characters forms a range that maches all characters from the first character to the second.
  • [.characters.] : Within a bracket expression (using “[” and “]”), matches the sequence of characters of that collating element. i.e. the pattern [[.period.]] would match the ‘.’ (period) character.
  • [=character_class=] : Within a bracket expression, represents an equivalence class. It matches all characters with the same collation value, including itself.
  • [:character_class:] : Within a bracket expression, represents a character class that matches all characters belonging to that class. i.e. the pattern [[:alpha:]] would match against a string that is all aphabetic characters.
  • [[:<:]],[[:>:]] : These markers stand for word boundaries, and as such they match the beginning and ending of words, respectively.
* NOTE: MySQL interprets the “\” (backslash) character as an escape character. If you choose to use the “\” character as part of your pattern in a regular expression it will need to escaped with another backslash “\\”.


Comments

Popular posts from this blog

How to download a file using command prompt (cmd) Windows?

The future of Artificial Intelligence: 6 ways it will impact everyday life

Angular 9 - User Registration and Login Example & Tutorial