Regular Expressions in Oracle Database 10g
Oracle Database 10g supports regular expression in search and replace. It provides a set of new
functions that provides regular expression capability. Oracle10g's implementation of regular expression
complies with Portable Operating System for Unix (POSIX) standards.
Regular expression is a pattern that contains special characters, which specify
what exactly we are looking for. Let us now understand some of the limitations we
have with LIKE operator and INSTR function.
Let us assume we have to search for a name that starts with either N, M or P. The
following LIKE operator, though lengthy, can do it.
select * from employees where first_name like 'N%' or first_name like 'M%' or first_name like 'P%';
But what if we want to get all names that start with any uppercase letter and contain
only 5 letters. It is possible but it is not an easy condition (not at least for beginners).
This is the purpose regular expressions can serve. They allow you to make your searching
more precise. The following example searches for employees whose first name is starting with uppercase letter and contain only
5 letters.
select first_name from employees where regexp_like ( first_name, '^[:aplha:]{4}$');
Special characters in regular expression
The above example used a few special characters to specify what exactly we search
for. Here is the list of special characters in regular expression.
Character |
Description |
* |
Zero or more
occurrences of the previous character. |
+ |
One or more occurrences of the previous character. |
? |
Zero or one occurrences of the previous character. |
. |
Any character matches this position. |
[] |
Character must be any character in the brackets. Even a range of characters can
be given. For Example, [A-Z] means any uppercase letter is a match. |
[^] |
Any character other than the characters given in the brackets is a match. Reverse
of []. |
^ |
Matches beginning of the string. For example, ^A means the string must start with A. |
$ |
Matches end of the string. For example, A$ means string must end with A. |
| |
Alternate operator. Either of the expressions must be a match. expr1|expr2 means either
expr1 or expr2 must be a match. |
{n} |
Previous expression must be repeated for exactly n times. For example, [0-9]{3}
means three digits. |
{n,m} |
Previous expression must be repeated from n to m times. For example, [0-9]{3,5}
means we may have 3 to 5 digits. |
{n,} |
Previous expression must be repeated for at least n times. |
() |
Groups the expression. The entire expression enclosed is treated as one expression. |
The character class [:alpha:] matches any alphabet. The other characters classes available are given below:
Character Class | Description |
---|
[:alnum:] | Alphanumeric characters |
[:alpha:] | Any alphabet either upper or lower case. |
[:cntrl:] | Any control character. A non-printable character is called as control character. |
[:digit:] | Any digit. |
[:lower:] | Any lower case letter. |
[:print:] | Any printable character. |
[:punct:] | Any punctuation character. |
[:space:] | All space characters. |
[:upper:] | Any upper case letter. |
New Functions
The following are the new functions and what they do. All these function extend
the functionality of some existing function/operator with the capability of supporting
regular expression.
Function
|
Description |
REG_LIKE |
Same as LIKE operator but supports regular expression. Allows search based on regular expression. |
REG_INSTR |
Same as INSTR function but supports regular expression. Allows you to locate a string in another
string more precisely using regular expression. |
REG_SUBSTR |
Same as SUBSTR function but supports regular expression. Allows you to extract a part of another string
using regular expression. |
REG_REPLACE |
Same as REPLACE function but supports regular expression. Allows you to search for source string using regular expression and
replaces it with target string |
Examples
Let us now explore these functions and see how they can be used. We start with REGEXP_LIKE
function, which is extended version of LIKE operator.
select job_title from jobs where regexp_like ( job_title, '^S.*er$');
The above example displays job titles that start with S and end
with er. It may
contain anything in between these two.
If you want to modify the way REGEXP_LIKE compares characters then third parameter,
which contains either 'c' for case sensitive or 'i'
for ignore case, can be given
as shown below.
select job_title from jobs where regexp_like ( job_title, '^S.*er$','i')
Now let us see how REGEXP_SUBSTR is used to extract a substring based on regular expression.
select regexp_substr('Oracle Database 10g is first grid aware database','[0-9]+') version from dual;
The above query displays 10 as it is consisting of one or more digits. The following
example looks for a number followed by a character - displays 10g.
select regexp_substr('Oracle Database 10g is first grid aware database','[0-9]+[a-z]') version from dual;
The following query displays the starting position of one or more digits.
select regexp_instr('Oracle Database 10g is first grid aware database','[0-9]+') position from dual;
The following query returns the positon of first non-alphabet in the given string.
select regexp_instr('Abc123 xyz123','[^[:alpha:]]') from dual;
The following query places a space between Oracle its version using REGEXP_REPLACE function. For example,
Oracle9i will become Oracle 9i, Oracle10g will become Oracle 10g. We are looking
for a series of alphabets and take them as group 1. Then we are looking for a group
of digits followed by any character and treat it as group 2. Then we replace the original with \1 (group 1) a space and \2 (group 2).
select regexp_replace('Oracle10g','([[:alpha:]]+)([[:digit:]]+.)','\1 \2') from dual;
Regular expressions in Oracle10g will make searching and replacing more precise.
Certain searches that were not possible with simple LIKE operator are now possible
with the introduction of regular expressions.
Keep Learning,
Srikanth.