Working with Kiki

Intro

Kiki is a free environment for regular expression testing (ferret). It allows you to write regexes and test them against your sample text, providing extensive output about the results. It is useful for several purposes:

exploring and understanding the structure of match objects generated by the re module, making Kiki a valuable tool for people new to regexes
testing regexes on sample text before deploying them in code

Kiki can function on its own or as a plugin for the Spe Python editor.

Working with Kiki

Enter the regex in the combo box and hit Evaluate to run it against the text in the Sample text tab. The results appear in the Matches tab. You can use the list in the Help tab to view the built-in documentation about regular expressions. This documentation comes from the Python help files.

Kiki automatically stores its settings and the last used sample text and regex between sessions. During a session, every regex that has been evaluated and has returned matches is also stored in the combo box where the regex is entered, so you can experiment with, for example, adapted versions without losing a more primitive regex that already more or less works.

Options

Methods: Use Find all to get all non-overlapping matches of the regex, or Find first to get only the first match (equivalent to the search method of regular expression objects).
Flags: Determines which flags the regex should be compiled with.
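
For readers who want to reproduce these options in their own code, the sketch below shows how they could map onto the re module. That Find all corresponds to finditer is an assumption made for illustration (the text above only states that Find first is equivalent to search), and the IGNORECASE flag with an upper-case pattern is just an example choice.

```python
import re

sample = "Kiki the ferret - as nifty as ferrets get"

# Flags are supplied when the regex is compiled; IGNORECASE is only an example.
pattern = re.compile(r"FER(RE)*T", re.IGNORECASE)

# "Find first": search() returns the first match object, or None if nothing matches.
first = pattern.search(sample)

# "Find all" (assumed mapping): finditer() yields every non-overlapping match.
all_matches = list(pattern.finditer(sample))

print(first.group(0))
print([m.group(0) for m in all_matches])
```

Without the flag, the upper-case pattern would not match the lower-case sample text at all, which is a quick way to see what a flag changes.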

Understanding the output

Kiki's output is quite extensive and provides color-coded information on the results of the match. Let's assume you're trying to match the regex fer(re)*t against the sample text Kiki the ferret - as nifty as ferrets get. By default, the results of a Find all evaluation look something like this:

Kiki the 0:(fer(re)1t)0 - as nifty as 1:(fer(re)1t)0s get

Each match is preceded by a small, underlined number: this is the index of the corresponding match object in the list of match objects found in the sample text. In the example above, there are two match objects, with indexes 0 and 1.
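
The same list of match objects can be built directly with the re module. This is a minimal sketch that assumes the list is equivalent to what finditer yields; the variable names are illustrative only.

```python
import re

sample = "Kiki the ferret - as nifty as ferrets get"
pattern = re.compile(r"fer(re)*t")

# Collect the match objects in the order they occur in the sample text.
matches = list(pattern.finditer(sample))

# The position in this list is the index shown in front of each match.
for index, match in enumerate(matches):
    print(index, match.group(0), match.span())
# 0 ferret (9, 15)
# 1 ferret (30, 36)
```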

Within each match, colored parentheses show where a match group starts and ends. Group ()0 represents the entire match. In this example we have also created an extra group because of the use of (re)*. This is made visible by the green parentheses bearing the index 1: ()1.
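
To make the notation concrete, here is a rough sketch that rebuilds this kind of annotated output (without the colors) from plain match objects. The functions mark_groups and annotate are hypothetical helpers written for this illustration, not part of Kiki or of the re module, and the way zero-length groups are rendered is an assumption.

```python
import re


def mark_groups(match):
    """Wrap every participating group of one match in (...)N markers."""
    text = match.string
    # Keep only groups that took part in the match and lie inside its span.
    groups = [
        g for g in range(match.re.groups + 1)
        if match.start(g) != -1
        and match.start() <= match.start(g)
        and match.end(g) <= match.end()
    ]
    pieces = []
    for pos in range(match.start(), match.end() + 1):
        # Close inner (higher-numbered) groups first.
        for g in reversed(groups):
            if match.end(g) == pos and match.start(g) < pos:
                pieces.append(f"){g}")
        # Zero-length groups open and close at the same position (assumed rendering).
        for g in groups:
            if match.start(g) == pos == match.end(g):
                pieces.append(f"(){g}")
        # Open outer (lower-numbered) groups first.
        for g in groups:
            if match.start(g) == pos and match.end(g) > pos:
                pieces.append("(")
        if pos < match.end():
            pieces.append(text[pos])
    return "".join(pieces)


def annotate(pattern, text):
    """Return text with each match prefixed by its index and its groups marked."""
    out = []
    pos = 0
    for index, match in enumerate(re.finditer(pattern, text)):
        out.append(text[pos:match.start()])
        out.append(f"{index}:")
        out.append(mark_groups(match))
        pos = match.end()
    out.append(text[pos:])
    return "".join(out)


print(annotate(r"fer(re)*t", "Kiki the ferret - as nifty as ferrets get"))
# Kiki the 0:(fer(re)1t)0 - as nifty as 1:(fer(re)1t)0s get
```

Groups that did not take part in a match report a start of -1 and are filtered out before any marker is written.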

Pay attention to an interesting property of these groups: some of them might not contribute to the match and are then skipped in the output (in the match object, these groups start and end at position -1). An example: let's find all sentences talking about expecting the Spanish Inquisition in the text below:


                Chapman: I didn't expect a kind of Spanish Inquisition.
                


                (JARRING CHORD - the cardinals burst in)
                


                Ximinez: NOBODY expects the Spanish Inquisition! Our chief weapon is surprise...surprise and fear...fear and surprise.... 
                         Our two weapons are fear and surprise...and ruthless efficiency.... 
                         Our *three* weapons are fear, surprise, and ruthless efficiency...and an almost fanatical devotion to the Pope.... 
                         Our *four*...no... *Amongst* our weapons.... Amongst our weaponry...are such elements as fear, surprise.... 
                         I'll come in again. (Exit and exeunt)

For this purpose, we can use e.g. the following regex:

([a-zA-Z']+\s)+?expect(.*?)(the )*Spanish Inquisition(!|.)

The result is:

Chapman: 0:(I (didn't )1expect( a kind of )2Spanish Inquisition(.)4)0

(JARRING CHORD - the cardinals burst in)

Ximinez: 1:((NOBODY )1expect(s )2(the )3Spanish Inquisition(!)4)0 Our chief weapon is surprise...surprise and fear...fear and surprise.... Our two weapons are fear and surprise...and ruthless efficiency.... Our *three* weapons are fear, surprise, and ruthless efficiency...and an almost fanatical devotion to the Pope.... Our *four*...no... *Amongst* our weapons.... Amongst our weaponry...are such elements as fear, surprise.... I'll come in again. (Exit and exeunt)

The interesting part is what's going on in the match with index 0: between the group with index 2 and the one with index 4, the group with index 3 has disappeared. This group matches an optional "the ", which is not present in this case. In other words, the group exists, but does not contribute to the match and is therefore not displayed.
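
The same behaviour can be checked on the match object itself. The sketch below runs the regex from above against Chapman's line only; the variable names are illustrative.

```python
import re

line = "Chapman: I didn't expect a kind of Spanish Inquisition."
pattern = re.compile(r"([a-zA-Z']+\s)+?expect(.*?)(the )*Spanish Inquisition(!|.)")

match = pattern.search(line)

# Group 3 exists in the regex but did not take part in this match.
print(match.group(3))   # None
print(match.span(3))    # (-1, -1)

# The neighbouring groups did contribute and report real text.
print(repr(match.group(2)))   # ' a kind of '
print(repr(match.group(4)))   # '.'
```

When looping over the groups in your own code, a start of -1 (or a group value of None) is the signal to skip a group, just as Kiki does in its output.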