    Thursday, June 6, 2013

    Regular Expressions from ANDTEK

    This is a super handy RegEx reference that just hit my inbox from ANDTEK.  They make UC Applications that work with the Cisco UC gear we install a lot of.  Good stuff.   Enjoy….

     

    From: ANDTEK [mailto:info@andtek.com]
    Sent: Thursday, June 06, 2013 10:00 AM
    To: Doug Renner
    Subject: Regular Expressions for Douglas Renner

     

    Regular Expressions for Douglas Renner

    Usually we give you some updates about products, new trainings and updated features. This time we want to focus on an interesting technical feature which is useful in the AND Phone Application Server but is used in a lot of different technological fields – regular expressions.

    Regular expressions are very powerful and useful, so why not look a bit deeper into that "evil language"?!

    The ANDTEK server uses regular expressions in several modules to modify or replace phone numbers and to prefix or append information to a phone number.
    Do you use number lookup, or dial numbers out of a directory?

    Believe us, Regular Expressions will become your best friend.

    Sooner or later you will struggle with phone numbers that are not E.164 compliant or that include spaces, brackets and the like.

    You could change every single phone number in the data source, which will probably take you a couple of hours or days. Or you can use regular expressions to transform those numbers into a format that the CUCM can handle. This works for incoming calls as well as for outgoing calls.

    First, a few words about what regular expressions are, before we come to the most common regular expressions used on our server.

    The Regex Alphabet and Words

    Regex is a written language that uses its own alphabet of special characters to form its verbs and nouns. These verbs and nouns are properly called metacharacters or operators and are written with the following characters:

    { } ? : ( ) . [ ] + ! < > * | ^ $ =

    Western numerals (0-9) and letters of the Latin alphabet (a-z) are also used in conjunction with those characters.
    A typical regex consists of a search pattern and a replacement pattern expressed with a combination of metacharacters and non metacharacters.

    Metacharacters tell a regex engine what to look for, where to look, when to turn a blind eye to what it is looking at and what to do once it's matched its search expression to something.

    Non metacharacters are the symbols, digits and words being searched for. They have no special meaning to the regex engine and represent exactly what they are – regular text strings. Non metacharacters are usually called literal characters or literals.
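    To make the difference between literals and metacharacters concrete, here is a minimal sketch in Python. Python's re module is only a stand-in here; the newsletter does not say which regex engine the ANDTEK server uses, and the sample text is made up.

```python
import re

text = "555-123 555.123"

# Literal characters match themselves: "123" finds every "123".
literal_hits = re.findall(r"123", text)

# The metacharacter "." matches any single character,
# so "5.1" finds both "5-1" and "5.1".
meta_hits = re.findall(r"5.1", text)

# A search pattern plus a replacement pattern: swap every "-" for a space.
replaced = re.sub(r"-", " ", text)

print(literal_hits, meta_hits, replaced)
```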

    We can tell a regex engine to treat a metacharacter as a literal character by placing a backslash (\) in front of it. This is known as "escaping" the character.
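    The backslash rule matters a lot for phone numbers, because + is itself a metacharacter. The following sketch uses Python's re module as an assumed stand-in for the server's engine; re.escape is a Python helper and not something the newsletter mentions.

```python
import re

number = "+49 (89) 555-123"

# Without a backslash, "+" is a quantifier with nothing to repeat,
# so Python rejects the pattern.
try:
    re.compile("+49")
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# With a backslash, "\+" matches a literal plus sign.
escaped_match = re.match(r"\+49", number) is not None

# re.escape() adds the backslashes automatically for text taken from data.
escaped_pattern = re.escape("(89)")
found = re.search(escaped_pattern, number).group(0)

print(unescaped_ok, escaped_match, found)
```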

    Most important Metacharacters

    Expression   Definition
    ^            Beginning of the string
    $            End of the string
    \            Use before any of the following characters to escape it, that is, to remove its special meaning: + [ ] ( ) ^ . * { } $
    .            Any character
    *            Zero or more occurrences of the preceding character
    ?            Zero or one occurrence of the preceding character
    +            One or more occurrences of the preceding character
    |            Alternation: this|that means match this or that
    [x-y]        Range: matches any one digit from x to y
    [wxz]        Set: matches any one of the digits w, x or z (but not y)
    [^x-y]       Negated class: matches any one character that is not in the range x-y
    (xxxx)       Group (back-reference point) that can be used in the "Replace with" definition by writing $n, where n is the number of the group. Group numbers start with 1.
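    The quantifiers, classes and groups listed above can be tried out in a few lines of Python. This is only a sketch: Python's re module stands in for the server's engine, the sample strings are invented, and Python writes group references as \1 and \2 where the "Replace with" field uses $ plus the group number.

```python
import re

def matches(pattern, text):
    """True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

# "?" allows zero or one extra "0" after the first one.
optional = [matches(r"00?89", s) for s in ("089", "0089", "00089")]

# "*" allows zero or more occurrences, "+" requires at least one.
star = [matches(r"0*89", s) for s in ("89", "00089")]
plus = [matches(r"0+89", s) for s in ("89", "089")]

# A negated class removes everything that is not a digit.
digits_only = re.sub(r"[^0-9]", "", "+49 (89)")

# Two groups, reused in the replacement to insert a space
# between the area code and the rest of the number.
regrouped = re.sub(r"^([0-9][0-9])([0-9]+)$", r"\1 \2", "89555123")

print(optional, star, plus, digits_only, regrouped)
```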



    Examples

    Now that we know what we can do with regular expressions, we will not leave you alone with them. Here are the most common regular expressions used to normalize numbers or make them searchable.

    Typically phone numbers in a directory are configured like +49 (89) 555-123 and need to be brought into an E.164 compliant format like +4989555123 or a format like 0089555123. This just takes a few regular expression statements for "Outgoing Calls":

    Regex                 Replace   Result        Explanation                                                                Comment
    [^0-9#+]              (empty)   +4989555123   Everything which is not a digit, a + or a #                                This will remove all special characters like brackets, spaces, dashes etc.
    ^\+49|\+ 49           00        0089555123    String begins with +49, or contains + 49                                   This will replace the +49 with 00
    ^\+[^49]|\+ [^49]     000                     String begins with + followed by one character that is neither 4 nor 9     This will replace the + for international numbers with 000. Note that [^49] matches a single character, and that character is replaced together with the +.
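    As a rough sketch, the outgoing rules can be replayed in Python. Several things here are assumptions rather than facts from the newsletter: that the rules run in order, each on the output of the previous one; that + is written as \+ in the patterns; and that Python's re module behaves like the server's engine. The apply_rules helper, the VARIANT_RULES list and the +1 sample number are hypothetical and only for illustration.

```python
import re

# Rules from the "Outgoing Calls" table, with "+" escaped as "\+".
OUTGOING_RULES = [
    (r"[^0-9#+]", ""),
    (r"^\+49|\+ 49", "00"),
    (r"^\+[^49]|\+ [^49]", "000"),
]

# Hypothetical variant, not from the newsletter: once +49 has been handled,
# any remaining leading "+" is international, so "^\+" alone is enough.
VARIANT_RULES = OUTGOING_RULES[:2] + [(r"^\+", "000")]

def apply_rules(number, rules):
    """Apply each (pattern, replacement) pair in order (assumed behaviour)."""
    for pattern, replacement in rules:
        number = re.sub(pattern, replacement, number)
    return number

german = apply_rules("+49 (89) 555-123", OUTGOING_RULES)

# "[^49]" is a class for ONE character, and that character is replaced too,
# so the table's third rule drops the digit after the "+".
us_table = apply_rules("+1 (212) 555-0100", OUTGOING_RULES)
us_variant = apply_rules("+1 (212) 555-0100", VARIANT_RULES)

print(german, us_table, us_variant)
```

    The German number comes out as 0089555123, matching the table. For the made-up +1 number, the third rule as written also consumes the 1, while the variant keeps it. It is worth testing a few international numbers before relying on that rule.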



    For reverse number lookup you need to define a set of regular expressions for "Incoming Calls" that transform the calling number into a format close to the one used in your database. Let's assume calls are presented in a format like 0089555123, or even without the area code like 0555123, while your database entry has the format +49 89 555123. The following regular expressions will help you get properly formatted phone numbers for the lookup:

    Regex   Replace   Result        Explanation             Comment
    ^00     +49       +4989555123   String begins with 00   This replaces the leading 00 with the country code +49
    ^0      +4989     +4989555123   String begins with 0    This will replace the 0 with the country code +49 and insert also the area code 89
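    The incoming rules can be sketched the same way. As before, this assumes the rules are applied in the order listed, each on the result of the previous one, and uses Python's re module as a stand-in for the server's engine; the apply_rules helper is hypothetical.

```python
import re

# Rules from the "Incoming Calls" table, in the order given there.
INCOMING_RULES = [
    (r"^00", "+49"),
    (r"^0", "+4989"),
]

def apply_rules(number, rules):
    """Apply each (pattern, replacement) pair in order (assumed behaviour)."""
    for pattern, replacement in rules:
        number = re.sub(pattern, replacement, number)
    return number

with_area_code = apply_rules("0089555123", INCOMING_RULES)
without_area_code = apply_rules("0555123", INCOMING_RULES)

# Order matters: if "^0" ran first, it would also fire on numbers starting with 00.
reversed_order = apply_rules("0089555123", list(reversed(INCOMING_RULES)))

print(with_area_code, without_area_code, reversed_order)
```

    Both sample numbers end up as +4989555123. With the rules reversed, the first sample becomes +4989089555123, so under this assumption the more specific ^00 rule has to come before ^0.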



    That's it! With just 5 regular expression statements you have covered about 95% of your phone numbers and can dial them or make them searchable.

    But always keep in mind: regular expressions are not evil, but overusing them can get complicated.

     

     

     

     
