Regular expression predicates, introduced in T/EC 3.7, make string pattern matching and manipulation a clean and intuitive procedure.
In a nutshell, there are seven predicates, where
- one is used to define a regular expression pattern
- two perform a substring substitution
- the remaining four are for pattern matching.
The “heart” of this set of predicates is re_create. It should generally be called only once, for example in a TEC_Start rule. This predicate defines the regular expression and assigns it to a global variable, and the other regex predicates use this assigned global variable. Note: some cases require a dynamically built pattern. In those cases, re_create is called in a rule that runs before the pattern is applied by one of the other “regex” predicates.
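The define-once, reference-by-name idea can be sketched outside of T/EC. The Python fragment below is only an analogy, not T/EC code: the `PATTERNS` dictionary, the `startup` and `build_host_pattern` functions, and the Python `re_create` helper are all hypothetical stand-ins for what the real predicate does inside the rule engine.

```python
import re

# Hypothetical registry standing in for the named patterns kept by T/EC.
PATTERNS = {}

def re_create(name, pattern):
    """Compile a pattern once and store it under a name (analogy only)."""
    PATTERNS[name] = re.compile(pattern)

def startup():
    """Plays the role of a TEC_Start rule: define static patterns once."""
    re_create("patnumber", r"\d+")

def build_host_pattern(hostname):
    """A dynamically built pattern, defined just before it is used."""
    re_create("pathost", re.escape(hostname) + r"-\d+")

startup()
build_host_pattern("web.prod")
```

Later lookups only need the name, for example `PATTERNS["patnumber"].search(text)`, which mirrors how the other regex predicates refer to a pattern defined earlier.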
As noted in the T/EC Reference guide, the regular expressions use Perl syntax. However, for those wanting to use advanced Perl expressions, it is helpful to know that some Perl 5 constructs, such as minimal (non-greedy) matching, do not appear to be honored.
The regex predicates are well documented in the T/EC 3.7 Reference guide, along with some decent examples. To give a good idea of the more useful functionality, this article puts added emphasis on indexes, which are used with the re_match predicate.
re_create : Defines a regular expression for use with other regular expression predicates.
Syntax : re_create(_pattern_name,<regexpattern>)
Examples:
% Some simple pattern definitions
re_create( _patnumber , '\d+' )
% a numeric string
re_create( _patnoblanks , '\S+' )
% a string of non-whitespaces
re_create( _patblanks , '\s+' )
% a blank space (tab, space, newline)
% Perform pattern matches
re_search_string( _patnumber , 'abc123xyz')
% Succeeds match for a string of numbers
re_search_string( _patnoblanks , 'abc123xyz')
% Succeeds match for a string of non-whitespaces
re_search_string( _patblanks , 'abc123xyz')
% Fails match for a whitespace
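For readers who want to try the simple patterns above without a T/EC server, here is a minimal sketch using Python's `re` module. It is an analogue only: `search_string` is a hypothetical helper, and Python's regex engine is not guaranteed to behave identically to the one inside T/EC.

```python
import re

# Python analogues of the three simple patterns (not T/EC code).
pat_number = re.compile(r"\d+")      # a run of digits
pat_noblanks = re.compile(r"\S+")    # a run of non-whitespace characters
pat_blanks = re.compile(r"\s+")      # a run of whitespace characters

def search_string(pattern, text):
    """Return True when the pattern matches anywhere in text."""
    return pattern.search(text) is not None

text = "abc123xyz"
results = {
    "number": search_string(pat_number, text),
    "noblanks": search_string(pat_noblanks, text),
    "blanks": search_string(pat_blanks, text),
}
print(results)
```

The digit and non-whitespace patterns succeed on 'abc123xyz', while the whitespace pattern fails because the string contains no blanks.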
% Still simple, but with more meat.
% String : $HASP050 JES2 RESOURCE SHORTAGE OF CMBS - 98% UTILIZATION REACHED
% The following pattern definitions will match the string above:
% non-indexed
re_create( _pat_nv390_util_1 , '.*- \S+% UTILIZATION.*' )
% indexed, notice the parentheses
re_create( _pat_nv390_util_2 , '(.*-) (\S+)% (UTILIZATION.*)' )
% indexed, notice the parentheses again, we'll get back to this in re_match.
re_create( _patipaddr , '(\d+\.\d+)\.(\d+)\.(\d+)' )
re_match : Compares predefined pattern to string and saves match
Syntax : re_match(_pattern_name, _string, _index, _result)
This predicate searches for a match in _string using a named regular expression defined with the re_create predicate. The predicate succeeds if a match is found.
The matched substring is returned in _result. The _index argument specifies which part of the match to return. A value of 0 returns the entire matching substring, a value of 1 returns the part matched by the first set of parentheses in the pattern, a value of 2 returns the part matched by the second set, and so forth. Take the following examples, using patterns from above:
re_create( _patipaddr , '(\d+\.\d+)\.(\d+)\.(\d+)' )
re_match( _patipaddr , '111.234.456.999' , 0 , _result ) % Succeeds, _result is '111.234.456.999'
re_match( _patipaddr, '111.234.456.999' , 1 , _result ) % Succeeds, _result is '111.234' . Grab the first indexed position, which is the first set of parentheses
re_match( _patipaddr, '111.234.456.999' , 3 , _result ) % Succeeds, _result is '999' . Grab the third indexed position, which is the third set of parentheses
re_create( _pat_nv390_util_2 , '(.*-) (\S+)% (UTILIZATION.*)' )
re_match( _pat_nv390_util_2 , 'RSC SHORTAGE - 95% UTILIZATION REACHED', 1, _lval ) % _lval is assigned 'RSC SHORTAGE -'
re_match( _pat_nv390_util_2 , 'RSC SHORTAGE - 95% UTILIZATION REACHED', 2, _mval ) % _mval is assigned '95'
re_match( _pat_nv390_util_2 , 'RSC SHORTAGE - 95% UTILIZATION REACHED', 3 , _rval ) % _rval is assigned 'UTILIZATION REACHED'
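The index behaviour corresponds to numbered capture groups in most regex engines. As a rough cross-check, the sketch below reproduces the examples with Python's `re` module. `match_index` is a hypothetical helper, not the T/EC predicate, and it assumes that the T/EC index maps directly onto a Python group number.

```python
import re

pat_ipaddr = re.compile(r"(\d+\.\d+)\.(\d+)\.(\d+)")
pat_util = re.compile(r"(.*-) (\S+)% (UTILIZATION.*)")

def match_index(pattern, text, index):
    """Return group `index` of the first match, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(index) if m else None

ip = "111.234.456.999"
msg = "RSC SHORTAGE - 95% UTILIZATION REACHED"

for i in range(4):
    print(i, match_index(pat_ipaddr, ip, i), "|", match_index(pat_util, msg, i))
```

Index 0 yields the whole match, and indexes 1 to 3 yield the text captured by the first, second and third set of parentheses.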
For those interested in more advanced Perl regex, I read on the TME10 list about somebody trying to make use of the Perl 5 concept of greedy and minimal matching. For example, in Perl, with a regex like '^.*-.*-dm-.*?-', a string like “host-reg1-dm-systype-servername-extra-stuff” matches with a result of “host-reg1-dm-systype-” under minimal matching (which is what was needed). With greedy matching the result becomes “host-reg1-dm-systype-servername-extra-”.
As I mentioned before, the Perl syntax these predicates are built upon does not seem to support minimal matching, so the pattern returns the “greedy” match of “host-reg1-dm-systype-servername-extra-”.
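The difference between the two matching modes can be seen in Python, which does support the minimal quantifier. The last pattern in this sketch is a possible workaround rather than something taken from the T/EC documentation: it replaces `.*?-` with the negated character class `[^-]*-`, which needs no minimal quantifier. Whether the T/EC engine accepts negated character classes is an assumption that should be tested on a real server.

```python
import re

host = "host-reg1-dm-systype-servername-extra-stuff"

# Minimal (non-greedy) quantifier, as in Perl 5.
minimal = re.match(r"^.*-.*-dm-.*?-", host).group(0)

# Greedy quantifier only, which is how T/EC appears to behave.
greedy = re.match(r"^.*-.*-dm-.*-", host).group(0)

# Workaround sketch: "anything but a hyphen" instead of a minimal quantifier.
workaround = re.match(r"^.*-.*-dm-[^-]*-", host).group(0)

print(minimal)
print(greedy)
print(workaround)
```

Here the workaround pattern returns the same substring as the minimal version, because `[^-]*` cannot run past the next hyphen.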
The remaining predicates are shown below just to have the complete set of regex predicates included in this article, but they are nearly identical to the T/EC 3.7 Rule writing guide.
re_after_match : Searches for a match in a string using a named regular expression and returns the substring located after the match as a result.
Syntax: re_after_match(_pattern_name, _string, _result)
re_create(test,'a.*i')
% Create regular expression test.
re_after_match(test,'chair',_result)
% Search 'chair' using regular expression test.
% Return the substring after the match in _result.
% Succeeds, 'r' returned in _result.
re_before_match : Searches for a match in a string using a named regular expression and returns the substring located before the match as a result.
Syntax: re_before_match(_pattern_name, _string, _result)
re_create(test,'a.*r') % Create regular expression test.
re_before_match(test,'chair',_result)
% Search 'chair' using regular expression test.
% Return the substring before the match in _result.
% Succeeds, 'ch' returned in _result.
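Both predicates amount to slicing the input string around the match. The sketch below shows the idea in Python. `after_match` and `before_match` are hypothetical helpers, not the T/EC implementation, and they return None where the T/EC predicates would simply fail.

```python
import re

def after_match(pattern, text):
    """Return the part of text that follows the first match, or None."""
    m = re.search(pattern, text)
    return text[m.end():] if m else None

def before_match(pattern, text):
    """Return the part of text that precedes the first match, or None."""
    m = re.search(pattern, text)
    return text[:m.start()] if m else None

print(after_match(r"a.*i", "chair"))
print(before_match(r"a.*r", "chair"))
```

With 'a.*i' the match in 'chair' is 'ai', leaving 'r' after it. With 'a.*r' the match is 'air', leaving 'ch' before it.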
Some examples in T/EC rule (Prolog) format:
reception_action: check_for_utilization_value: (
% Regex compare to determine if this is a utilization message
re_match(pat_nv390_util,_msg,1,_lvalue),
re_match(pat_nv390_util,_msg,2,_uvalue),
re_match(pat_nv390_util,_msg,3,_rvalue),
inttoatom(_int1,_uvalue),
% Change severity based on threshold ranges
((
_int1 < 90;
(_int1 >= 90, _int1 < 95, _new_sev = 'WARNING' );
(_int1 >= 95, _new_sev = 'CRITICAL');
true
)),
% Replace value of percentage in message with new severity
atomconcat([_lvalue,' ',_new_sev,' ',_rvalue],_new_msg),
bo_set_slotval(_event,msg,_new_msg),
% Put the value of percentage utilization into slot 'utilization'
atomconcat([_uvalue,'%'],_utilization),
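The logic of this rule fragment can be traced in a few lines of Python. This is a sketch only: `classify` is a hypothetical function, the pattern is assumed to be the indexed utilization pattern defined earlier, and leaving the message untouched below 90 percent is my reading of the first branch of the rule, not something the fragment states outright.

```python
import re

# Assumed to be the same indexed pattern as _pat_nv390_util_2 above.
PAT_UTIL = re.compile(r"(.*-) (\S+)% (UTILIZATION.*)")

def classify(msg):
    """Return (new_msg, utilization, severity), or None if msg does not match."""
    m = PAT_UTIL.search(msg)
    if m is None or not m.group(2).isdigit():
        return None
    lvalue, uvalue, rvalue = m.group(1), m.group(2), m.group(3)
    pct = int(uvalue)
    if pct >= 95:
        severity = "CRITICAL"
    elif pct >= 90:
        severity = "WARNING"
    else:
        severity = None  # assumption: below 90 the message is left unchanged
    # Replace the percentage in the message with the new severity.
    new_msg = msg if severity is None else f"{lvalue} {severity} {rvalue}"
    return new_msg, uvalue + "%", severity

print(classify("$HASP050 JES2 RESOURCE SHORTAGE OF CMBS - 98% UTILIZATION REACHED"))
```

The three returned values correspond to the new msg slot value, the value for the utilization slot, and the severity chosen by the threshold ranges.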
% Match : the TSM client
% String : ANR2716E Schedule prompter was not able to contact \
% client TMRGATEWAY1 using type 1 (10.16.240.107
% ---------------------------------------------------------------
% Match the rightmost part of string
re_create(pat_anr2716,'client (\S+) using'),
% ---------------------------------------------------------------
% Match : strip prefixed domain and suffixed 't'
% String : mydomain.sysat
% ---------------------------------------------------------------
re_create(patmvshost,'\S+\.(\S+)[tT]'),
reception_action: strip_mvshost_action:
(
re_match(patmvshost,_hostname,1,_chop_host),
bo_set_slotval(_event,hostname,_chop_host),
re_mark_as_modified(_event,_)
)
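The two extraction patterns above (the TSM client name and the stripped host name) can be checked quickly in Python before they go into a rule base. This is an analogue only: `first_group` is a hypothetical helper, and the sample message is copied as it appears in the comment above, including its truncated end.

```python
import re

def first_group(pattern, text):
    """Return the first parenthesized group of the first match, or None."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

tsm_msg = ("ANR2716E Schedule prompter was not able to contact "
           "client TMRGATEWAY1 using type 1 (10.16.240.107")

client = first_group(r"client (\S+) using", tsm_msg)
short_host = first_group(r"\S+\.(\S+)[tT]", "mydomain.sysat")

print(client)
print(short_host)
```

Note that the host pattern strips the last 't' or 'T' it can find after the dot, so it is only safe for host names that really carry that suffix.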
% ---------------------------------------------------------------
% Ruleset : iptoregion.rls
% Match : class B, third and fourth octet
% String : aaa.bbb.ccc.ddd
% ---------------------------------------------------------------
re_create(patipaddr,'(\d+\.\d+)\.(\d+)\.(\d+)')
Some other predicates to play with:
re_search_string : Searches for a match in a string using a named regular expression.
Syntax: re_search_string( _pattern_name, _string )
re_substitute : Searches for a match in a string using a named regular expression, replaces the match, and returns the new string as a result.
re_substitute_global : Searches for all matches in a string using a named regular expression, replaces them, and returns the new string as a result.
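The difference between the two substitution predicates is the familiar "first match" versus "all matches" distinction. The Python sketch below illustrates it. `substitute` and `substitute_global` are hypothetical helpers, and their argument order is not meant to reflect the T/EC predicate signatures, which are given in the T/EC reference documentation.

```python
import re

pat_blanks = re.compile(r"\s+")

def substitute(pattern, text, replacement):
    """Replace only the first match of pattern in text."""
    return pattern.sub(replacement, text, count=1)

def substitute_global(pattern, text, replacement):
    """Replace every match of pattern in text."""
    return pattern.sub(replacement, text)

print(substitute(pat_blanks, "a  b   c", "_"))
print(substitute_global(pat_blanks, "a  b   c", "_"))
```

The first call changes only the first run of blanks, while the second collapses every run of blanks into a single underscore.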

