Oracle REGEXP_SUBSTR function

3 August 2020

The Oracle/PLSQL REGEXP_SUBSTR function is an extension of the SUBSTR function. Introduced in Oracle 10g, it lets you extract a substring from a string using regular expression pattern matching.

Syntax of the Oracle/PLSQL function REGEXP_SUBSTR

REGEXP_SUBSTR( string_id, pattern_id [, start_position_id [, nth_appearance_id [, match_parameter_id [, subexpression_id ] ] ] ] )

Parameters and function arguments

  • string_id – The string to search. It can be CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB or NCLOB.
  • pattern_id – The regular expression to match. It can be a combination of the following values:
Value     Description
^         Matches the beginning of the string. With match_parameter ‘m’, matches the beginning of any line within the string.
$         Matches the end of the string. With match_parameter ‘m’, matches the end of any line within the string.
*         Matches zero or more occurrences.
+         Matches one or more occurrences.
?         Matches zero or one occurrence.
.         Matches any character except NULL (a newline is matched only with match_parameter ‘n’).
|         Used as “OR” to specify more than one alternative.
[ ]       Specifies a matching list: matches any one of the characters in the list.
[^ ]      Specifies a nonmatching list: matches any character except those in the list.
( )       Groups an expression as a subexpression.
{m}       Matches exactly m times.
{m,}      Matches at least m times.
{m,n}     Matches at least m times, but not more than n times.
\n        Backreference, where n is a digit from 1 to 9. Matches the text matched by the nth subexpression in ( ) preceding \n.
[..]      Matches a collating element, which can be more than one character.
[::]      Matches a character class.
[==]      Matches an equivalence class.
\d        Matches a digit character.
\D        Matches a non-digit character.
\w        Matches a word character (alphanumeric or underscore).
\W        Matches a non-word character.
\s        Matches a whitespace character.
\S        Matches a non-whitespace character.
\A        Matches the beginning of the string.
\Z        Matches the end of the string, or just before a newline that ends the string.
*?        Matches the preceding pattern zero or more times (non-greedy).
+?        Matches the preceding pattern one or more times (non-greedy).
??        Matches the preceding pattern zero or one time (non-greedy).
{n}?      Matches the preceding pattern exactly n times (non-greedy).
{n,}?     Matches the preceding pattern at least n times (non-greedy).
{n,m}?    Matches the preceding pattern at least n times, but not more than m times (non-greedy).

 

  • start_position_id – Optional. The position in the string at which the search starts. If this parameter is omitted, it defaults to 1, which is the first position in the string.
  • nth_appearance_id – Optional. The occurrence of the pattern in the string to return. It must be a positive integer. If this parameter is omitted, it defaults to 1, which is the first occurrence of the pattern in the string.
  • match_parameter_id – Optional. Changes the matching behavior of the REGEXP_SUBSTR function. It can be a combination of the following values:
Value     Description
‘c’       Performs case-sensitive matching.
‘i’       Performs case-insensitive matching.
‘n’       Allows the period character (.) to match the newline character. By default, the period does not match a newline.
‘m’       Treats the string as multiple lines, so that ^ matches the beginning and $ the end of any line within the string. By default, the string is treated as a single line.
‘x’       Ignores whitespace characters in the pattern. By default, whitespace characters are matched like any other character.



  • subexpression_id – Optional. Used when the pattern has subexpressions and you want to return only one of them. This is an integer from 0 to 9 that selects which subexpression to return; the default, 0, returns the whole match.
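
The optional arguments are easiest to compare side by side. The following is a minimal sketch rather than a reference: the sample string and column aliases are made up for illustration, and the last column assumes Oracle 11g or later, where the sixth argument is available.

```sql
-- Sample string and aliases are hypothetical, chosen only for illustration.
SELECT
  -- Defaults: search from position 1 and return the first match.
  REGEXP_SUBSTR('Order 2020-08-03 shipped', '\d+') AS first_number,
  -- Start at position 11 (the first hyphen), so the year is skipped.
  REGEXP_SUBSTR('Order 2020-08-03 shipped', '\d+', 11) AS from_pos_11,
  -- Sixth argument 3: return only the third parenthesized group.
  REGEXP_SUBSTR('Order 2020-08-03 shipped', '(\d{4})-(\d{2})-(\d{2})', 1, 1, 'c', 3) AS day_part
FROM dual;
-- first_number: '2020', from_pos_11: '08', day_part: '03'
```

Passing 'c' as the fifth argument simply restates the default case-sensitive behavior so that the sixth argument can be supplied.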

The REGEXP_SUBSTR function returns a string value.

If REGEXP_SUBSTR finds no occurrence of the pattern, it returns NULL.

If there are conflicting values for match_parameter, the REGEXP_SUBSTR function will use the last value.
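
The two rules above can be checked with a short query. This is an illustrative sketch: the aliases are made up, and NVL is used only to make a NULL result visible in the output.

```sql
SELECT
  -- The string contains no digit, so REGEXP_SUBSTR returns NULL.
  NVL(REGEXP_SUBSTR('AeroSmith', '\d'), '(null)') AS no_match,
  -- 'ci': the last flag is 'i', so the search is case-insensitive.
  REGEXP_SUBSTR('AeroSmith', 'a', 1, 1, 'ci') AS last_flag_i,
  -- 'ic': the last flag is 'c', so the search is case-sensitive
  -- and finds no lowercase 'a'.
  NVL(REGEXP_SUBSTR('AeroSmith', 'a', 1, 1, 'ic'), '(null)') AS last_flag_c
FROM dual;
-- no_match: '(null)', last_flag_i: 'A', last_flag_c: '(null)'
```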

REGEXP_SUBSTR function can be used in the following versions of Oracle / PLSQL

Oracle 12c, Oracle 11g, Oracle 10g

Example of a match in words

Let’s start by extracting the first word from the string.
For example:

SELECT REGEXP_SUBSTR ('Google is a great search engine.', '(\S*)(\s)')
FROM dual;
--Result: 'Google '

This example will return ‘Google ’ because it matches all non-space characters, as specified by (\S*), followed by the first whitespace character, as specified by (\s). The result includes both the first word and the space after it.

If you do not want to include the space in the result, change the example as follows:

SELECT REGEXP_SUBSTR ('Google is a great search engine.', '(\S*)')
FROM dual;
--Result: 'Google'

This example will return ‘Google’ without a space at the end.

If we need to find the second word in the string, we will change our function as follows:

SELECT REGEXP_SUBSTR ('Google is a great search engine.', '(\S*)(\s)', 1, 2)
FROM dual;
--Result: 'is '

This example will return ‘is ’ with a trailing space.

If we need to find the fourth word in the string, we will change our function as follows:

SELECT REGEXP_SUBSTR ('Google is a great search engine.', '(\S*)(\s)', 1, 4)
FROM dual;
--Result: 'great '

This example will return ‘great ’ with a trailing space.
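
To pick out the nth word without a trailing space, one option is to match only runs of non-space characters. The sketch below assumes that words are separated by whitespace; the alias is illustrative.

```sql
-- \S+ needs at least one non-space character, so each match is a whole word
-- and never includes the separator.
SELECT REGEXP_SUBSTR('Google is a great search engine.', '\S+', 1, 4) AS fourth_word
FROM dual;
-- fourth_word: 'great'
```

Using + instead of * also avoids empty matches, which can make occurrence counting harder to reason about.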

Example of a number match

Let’s see how to use the REGEXP_SUBSTR function to match digit characters.
For example:

SELECT REGEXP_SUBSTR ('2, 4, and 10 numbers for example', '\d')
FROM dual;
--Result: '2'

In this example, the first digit is extracted from the string, as specified by \d. In this case, it matches the number 2.

We could change our pattern to find a two-digit number.
For example:

SELECT REGEXP_SUBSTR ('2, 4, and 10 numbers for example', '(\d)(\d)')
FROM dual;
--Result: '10'

In this example, the first pair of adjacent digits is returned, as specified by (\d)(\d). In this case, it skips the single-digit values 2 and 4 and returns 10.

Let’s see how we will use the REGEXP_SUBSTR function with a table column and look for a two-digit number.
For example:

SELECT REGEXP_SUBSTR (address, '(\d)(\d)')
FROM contacts;

In this example, we extract the first pair of adjacent digits from the address field in the contacts table.
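
Note that (\d)(\d) matches any two adjacent digits, including the first two digits of a longer number. The sketch below contrasts a few alternatives; the address value is invented for illustration, and the last column assumes Oracle 11g or later for the sixth argument.

```sql
-- The address '12345 Main St, Apt 27' is a hypothetical sample value.
SELECT
  -- \d+ treats each run of digits as one match; the third run is '10'.
  REGEXP_SUBSTR('2, 4, and 10 numbers for example', '\d+', 1, 3) AS third_number,
  -- (\d)(\d) returns the first two adjacent digits, even inside a longer number.
  REGEXP_SUBSTR('12345 Main St, Apt 27', '(\d)(\d)') AS first_two_digits,
  -- Require a non-digit or a string boundary on both sides, then return group 2.
  REGEXP_SUBSTR('12345 Main St, Apt 27', '(^|\D)(\d{2})(\D|$)', 1, 1, 'c', 2) AS standalone_two_digits
FROM dual;
-- third_number: '10', first_two_digits: '12', standalone_two_digits: '27'
```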

Example of matching several alternatives

The following example uses the | operator, which acts as an “OR” to specify several alternatives.
For example:

SELECT REGEXP_SUBSTR ('AeroSmith', 'a|e|i|o|u')
FROM dual;
--Result: 'e'

This example will return ‘e’ because it looks for the first vowel (a, e, i, o or u) in the string. Since we didn’t specify a match_parameter value, the REGEXP_SUBSTR function performs a case-sensitive search, which means that the ‘A’ in ‘AeroSmith’ is not matched.

To perform a case-insensitive search, we will modify our query in the following way:

SELECT REGEXP_SUBSTR ('AeroSmith', 'a|e|i|o|u', 1, 1, 'i')
FROM dual;

--Result: 'A'

Now since we have provided match_parameter = ‘i’, the query will return ‘A’ as a result. This time ‘A’ in ‘AeroSmith’ will be matched.
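
For single-character alternatives like these, a bracket list is a more compact way to write the same pattern. A short sketch with illustrative aliases:

```sql
SELECT
  -- [aeiou] matches any one lowercase vowel, the same as a|e|i|o|u.
  REGEXP_SUBSTR('AeroSmith', '[aeiou]') AS case_sensitive,
  -- With 'i', the list also matches uppercase vowels.
  REGEXP_SUBSTR('AeroSmith', '[aeiou]', 1, 1, 'i') AS case_insensitive
FROM dual;
-- case_sensitive: 'e', case_insensitive: 'A'
```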

Now consider how to use this function with a column.
Suppose we have a contacts table with the following data:

contact_id    last_name
1000          AeroSmith
2000          Joy
3000          Scorpions



Now let’s run the following query:

SELECT contact_id, last_name, REGEXP_SUBSTR (last_name, 'a|e|i|o|u', 1, 1, 'i') AS "First Vowel"
FROM contacts;

The query returns the following results:

contact_id    last_name    First Vowel
1000          AeroSmith    A
2000          Joy          o
3000          Scorpions    o

 

Example of matches based on the nth_appearance_id parameter

The next example uses the nth_appearance_id parameter, which lets you choose which occurrence of the pattern to extract from the string.

First occurrence

Let’s see how to extract the first occurrence of the pattern in a string.
For example:

SELECT REGEXP_SUBSTR ('AeroSmith', 'a|e|i|o|u', 1, 1, 'i')
FROM dual;
--Result: 'A'

This example will return ‘A’ because it retrieves the first vowel occurrence (a, e, i, o or u) in the string.

Second occurrence

Next, we will extract the second occurrence of the pattern in the string.
For example:

SELECT REGEXP_SUBSTR ('AeroSmith', 'a|e|i|o|u', 1, 2, 'i')
FROM dual;
--Result: 'e'

This example will return ‘e’ because it retrieves the second occurrence of a vowel (a, e, i, o or u) in a string.

Third occurrence

For example:

SELECT REGEXP_SUBSTR ('AeroSmith', 'a|e|i|o|u', 1, 3, 'i')
FROM dual;
--Result: 'o'

This example will return an ‘o’ because it retrieves the third vowel occurrence (a, e, i, o or u) in a string.
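
Rather than writing one query per occurrence, all matches can be listed at once by generating row numbers and passing each one as the occurrence argument. This is a sketch of a commonly used Oracle idiom (CONNECT BY with LEVEL on dual), not the only way to do it; the aliases are illustrative.

```sql
-- LEVEL counts 1, 2, 3, ... and is used as the occurrence number.
-- Row generation stops once REGEXP_SUBSTR returns NULL (no further match).
SELECT LEVEL AS occurrence,
       REGEXP_SUBSTR('AeroSmith', '[aeiou]', 1, LEVEL, 'i') AS vowel
FROM dual
CONNECT BY REGEXP_SUBSTR('AeroSmith', '[aeiou]', 1, LEVEL, 'i') IS NOT NULL;
-- Rows: (1, 'A'), (2, 'e'), (3, 'o'), (4, 'i')
```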
