Regex Module
     __________________________________________________________

   Table of Contents

   1. Admin Guide

        1.1. Overview
        1.2. Dependencies

              1.2.1. OpenSIPS Modules
              1.2.2. External Libraries or Applications

        1.3. Exported Parameters

              1.3.1. file (string)
              1.3.2. max_groups (int)
              1.3.3. group_max_size (int)
              1.3.4. pcre_caseless (int)
              1.3.5. pcre_multiline (int)
              1.3.6. pcre_dotall (int)
              1.3.7. pcre_extended (int)

        1.4. Exported Functions

              1.4.1. pcre_match (string, pcre_regex)
              1.4.2. pcre_match_group (string [, group])

        1.5. Exported MI Functions

              1.5.1. regex_reload

        1.6. Installation and Running

              1.6.1. File format

   2. Contributors

        2.1. By Commit Statistics
        2.2. By Commit Activity

   3. Documentation

        3.1. Contributors

   List of Tables

   2.1. Top contributors by DevScore^(1), authored commits^(2) and
          lines added/removed^(3)

   2.2. Most recently active contributors^(1) to this module

   List of Examples

   1.1. Set file parameter
   1.2. Set max_groups parameter
   1.3. Set group_max_size parameter
   1.4. Set pcre_caseless parameter
   1.5. Set pcre_multiline parameter
   1.6. Set pcre_dotall parameter
   1.7. Set pcre_extended parameter
   1.8. pcre_match usage (forcing case insensitive)
   1.9. pcre_match usage (using "end of line" symbol)
   1.10. pcre_match_group usage
   1.11. regex file
   1.12. Using with pua_usrloc
   1.13. Incorrect groups file

Chapter 1. Admin Guide

1.1. Overview

   This module offers matching operations against regular
   expressions using the powerful PCRE library.

   A text file containing regular expressions categorized in
   groups is compiled when the module is loaded, storing the
   compiled PCRE objects in an array. A function to match a string
   or pseudo-variable against any of these groups is provided. The
   text file can be modified and reloaded at any time via a MI
   command. The module also offers a function to perform a PCRE
   matching operation against a regular expression provided as
   function parameter.

   For a detailed list of PCRE features read the man page of the
   library.

1.2. Dependencies

1.2.1. OpenSIPS Modules

   The following modules must be loaded before this module:
     * No dependencies on other OpenSIPS modules.

1.2.2. External Libraries or Applications

   The following libraries or applications must be installed
   before running OpenSIPS with this module loaded:
     * libpcre-dev - the development libraries of PCRE.

1.3. Exported Parameters

1.3.1. file (string)

   Text file containing the regular expression groups. It must be
   set in order to enable the group matching function.

   Default value is “NULL”.

   Example 1.1. Set file parameter
...
modparam("regex", "file", "/etc/opensips/regex_groups")
...

1.3.2. max_groups (int)

   Max number of regular expression groups in the text file.

   Default value is “20”.

   Example 1.2. Set max_groups parameter
...
modparam("regex", "max_groups", 40)
...

1.3.3. group_max_size (int)

   Max content size of a group in the text file.

   Default value is “8192”.

   Example 1.3. Set group_max_size parameter
...
modparam("regex", "group_max_size", 16384)
...

1.3.4. pcre_caseless (int)

   If this options is set, matching is done caseless. It is
   equivalent to Perl's /i option, and it can be changed within a
   pattern by a (?i) or (?-i) option setting.

   Default value is “0”.

   Example 1.4. Set pcre_caseless parameter
...
modparam("regex", "pcre_caseless", 1)
...

1.3.5. pcre_multiline (int)

   By default, PCRE treats the subject string as consisting of a
   single line of characters (even if it actually contains
   newlines). The "start of line" metacharacter (^) matches only
   at the start of the string, while the "end of line"
   metacharacter ($) matches only at the end of the string, or
   before a terminating newline.

   When this option is set, the "start of line" and "end of line"
   constructs match immediately following or immediately before
   internal newlines in the subject string, respectively, as well
   as at the very start and end. This is equivalent to Perl's /m
   option, and it can be changed within a pattern by a (?m) or
   (?-m) option setting. If there are no newlines in a subject
   string, or no occurrences of ^ or $ in a pattern, setting this
   option has no effect.

   Default value is “0”.

   Example 1.5. Set pcre_multiline parameter
...
modparam("regex", "pcre_multiline", 1)
...

1.3.6. pcre_dotall (int)

   If this option is set, a dot metacharater in the pattern
   matches all characters, including those that indicate newline.
   Without it, a dot does not match when the current position is
   at a newline. This option is equivalent to Perl's /s option,
   and it can be changed within a pattern by a (?s) or (?-s)
   option setting.

   Default value is “0”.

   Example 1.6. Set pcre_dotall parameter
...
modparam("regex", "pcre_dotall", 1)
...

1.3.7. pcre_extended (int)

   If this option is set, whitespace data characters in the
   pattern are totally ignored except when escaped or inside a
   character class. Whitespace does not include the VT character
   (code 11). In addition, characters between an unescaped #
   outside a character class and the next newline, inclusive, are
   also ignored. This is equivalent to Perl's /x option, and it
   can be changed within a pattern by a (?x) or (?-x) option
   setting.

   Default value is “0”.

   Example 1.7. Set pcre_extended parameter
...
modparam("regex", "pcre_extended", 1)
...

1.4. Exported Functions

1.4.1.  pcre_match (string, pcre_regex)

   Matches the given string parameter against the regular
   expression pcre_regex, which is compiled into a PCRE object.
   Returns TRUE if it matches, FALSE otherwise.

   Meaning of the parameters is as follows:
     * string - String to compare.
     * pcre_regex (string) - Regular expression to be compiled in
       a PCRE object.

   NOTE: To use the "end of line" symbol '$' in the pcre_regex
   parameter use '$$'.

   This function can be used from REQUEST_ROUTE, FAILURE_ROUTE,
   ONREPLY_ROUTE, BRANCH_ROUTE and LOCAL_ROUTE.

   Example 1.8.  pcre_match usage (forcing case insensitive)
...
if (pcre_match("$ua", "(?i)^twinkle")) {
    xlog("L_INFO", "User-Agent matches\n");
}
...

   Example 1.9.  pcre_match usage (using "end of line" symbol)
...
if (pcre_match($rU, "^user[1234]$$")) {  # Will be converted to "^user[1
234]$"
    xlog("L_INFO", "RURI username matches\n");
}
...

1.4.2.  pcre_match_group (string [, group])

   It uses the groups readed from the text file (see
   Section 1.6.1, “File format”) to match the given string
   parameter against the compiled regular expression in group
   number group. Returns TRUE if it matches, FALSE otherwise.

   Meaning of the parameters is as follows:
     * string - String to compare.
     * group (int) - group to use in the operation. If not
       specified then 0 (the first group) is used.

   This function can be used from REQUEST_ROUTE, FAILURE_ROUTE,
   ONREPLY_ROUTE, BRANCH_ROUTE and LOCAL_ROUTE.

   Example 1.10.  pcre_match_group usage
...
if (pcre_match_group($rU, 2)) {
    xlog("L_INFO", "RURI username matches group 2\n");
}
...

1.5. Exported MI Functions

1.5.1.  regex_reload

   Causes regex module to re-read the content of the text file and
   re-compile the regular expressions. The number of groups in the
   file can be modified safely.

   Name: regex_reload

   Parameters: none

   MI FIFO Command Format:
opensips-cli -x mi regex_reload

1.6. Installation and Running

1.6.1. File format

   The file contains regular expressions categorized in groups.
   Each group starts with "[number]" line. Lines starting by
   space, tab, CR, LF or # (comments) are ignored. Each regular
   expression must take up just one line, this means that a
   regular expression can't be splitted in various lines.

   An example of the file format would be the following:

   Example 1.11. regex file
### List of User-Agents publishing presence status
[0]

# Softphones
^Twinkle/1
^X-Lite
^eyeBeam
^Bria
^SIP Communicator
^Linphone

# Deskphones
^Snom

# Others
^SIPp
^PJSUA


### Blacklisted source IP's
[1]

^190\.232\.250\.226$
^122\.5\.27\.125$
^86\.92\.112\.


### Free PSTN destinations in Spain
[2]

^1\d{3}$
^((\+|00)34)?900\d{6}$

   The module compiles the text above to the following regular
   expressions:
group 0: ((^Twinkle/1)|(^X-Lite)|(^eyeBeam)|(^Bria)|(^SIP Communicator)|
          (^Linphone)|(^Snom)|(^SIPp)|(^PJSUA))
group 1: ((^190\.232\.250\.226$)|(^122\.5\.27\.125$)|(^86\.92\.112\.))
group 2: ((^1\d{3}$)|(^((\+|00)34)?900\d{6}$))

   The first group can be used to avoid auto-generated PUBLISH
   (pua_usrloc module) for UA's already supporting presence:

   Example 1.12. Using with pua_usrloc
route[REGISTER] {
    if (! pcre_match_group("$ua", 0)) {
        xlog("L_INFO", "Auto-generated PUBLISH for $fu ($ua)\n");
        pua_set_publish();
    }
    save("location");
    exit;
}

   NOTE: It's important to understand that the numbers in each
   group header ([number]) must start by 0. If not, the real group
   number will not match the number appearing in the file. For
   example, the following text file:

   Example 1.13. Incorrect groups file
[1]
^aaa
^bbb

[2]
^ccc
^ddd

   will generate the following regular expressions:
group 0: ((^aaa)|(^bbb))
group 1: ((^ccc)|(^ddd))

   Note that the real index doesn't match the group number in the
   file. This is, compiled group 0 always points to the first
   group in the file, regardless of its number in the file. In
   fact, the group number appearing in the file is used for
   nothing but for delimiting different groups.

   NOTE: A line containing a regular expression cannot start by
   '[' since it would be treated as a new group. The same for
   lines starting by space, tab, or '#' (they would be ignored by
   the parser). As a workaround, using brackets would work:
[0]
([0-9]{9})
( #abcde)
( qwerty)

Chapter 2. Contributors

2.1. By Commit Statistics

   Table 2.1. Top contributors by DevScore^(1), authored
   commits^(2) and lines added/removed^(3)
     Name DevScore Commits Lines ++ Lines --
   1. Iñaki Baz Castillo 15 3 1242 2
   2. Razvan Crainea (@razvancrainea) 13 11 41 24
   3. Liviu Chircu (@liviuchircu) 12 10 25 42
   4. Vlad Patrascu (@rvlad-patrascu) 9 6 63 88
   5. Bogdan-Andrei Iancu (@bogdan-iancu) 8 6 12 7
   6. Norman Brandinger (@NormB) 5 3 4 4
   7. Sergio Gutierrez 3 1 19 1
   8. Ovidiu Sas (@ovidiusas) 3 1 13 11
   9. Anca Vamanu 3 1 6 3
   10. Maksym Sobolyev (@sobomax) 3 1 3 3

   All remaining contributors: Marius Zbihlei.

   (1) DevScore = author_commits + author_lines_added /
   (project_lines_added / project_commits) + author_lines_deleted
   / (project_lines_deleted / project_commits)

   (2) including any documentation-related commits, excluding
   merge commits. Regarding imported patches/code, we do our best
   to count the work on behalf of the proper owner, as per the
   "fix_authors" and "mod_renames" arrays in
   opensips/doc/build-contrib.sh. If you identify any
   patches/commits which do not get properly attributed to you,
   please submit a pull request which extends "fix_authors" and/or
   "mod_renames".

   (3) ignoring whitespace edits, renamed files and auto-generated
   files

2.2. By Commit Activity

   Table 2.2. Most recently active contributors^(1) to this module
                      Name                   Commit Activity
   1.  Norman Brandinger (@NormB)          Apr 2023 - Apr 2023
   2.  Maksym Sobolyev (@sobomax)          Feb 2023 - Feb 2023
   3.  Razvan Crainea (@razvancrainea)     Sep 2011 - Sep 2019
   4.  Vlad Patrascu (@rvlad-patrascu)     May 2017 - Apr 2019
   5.  Bogdan-Andrei Iancu (@bogdan-iancu) Feb 2009 - Apr 2019
   6.  Liviu Chircu (@liviuchircu)         Mar 2014 - Nov 2018
   7.  Ovidiu Sas (@ovidiusas)             Jan 2013 - Jan 2013
   8.  Marius Zbihlei                      Sep 2010 - Sep 2010
   9.  Iñaki Baz Castillo                  Feb 2009 - Jul 2010
   10. Anca Vamanu                         Sep 2009 - Sep 2009

   All remaining contributors: Sergio Gutierrez.

   (1) including any documentation-related commits, excluding
   merge commits

Chapter 3. Documentation

3.1. Contributors

   Last edited by: Vlad Patrascu (@rvlad-patrascu), Razvan Crainea
   (@razvancrainea), Liviu Chircu (@liviuchircu), Bogdan-Andrei
   Iancu (@bogdan-iancu), Iñaki Baz Castillo.

   Documentation Copyrights:

   Copyright © 2009 Iñaki Baz Castillo
