Thursday, June 01, 2006

I love Regex

I'm no expert, and only a relatively recent convert, but I love regular expressions. They make so many awkward text manipulation tasks easy. Today I had to write a little function to turn a legacy application date into a .NET DateTime struct. The legacy application's date is in the format CYYMMDD, where C is the century (0 for 1900, 1 for 2000, etc.), YY is the two-digit year, MM is the month and DD is the day of the month. So 1060601 is 1 June 2006. It's really easy to parse this into a DateTime struct using the .NET Regex class:
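// Needs: using System; using System.Text.RegularExpressions;
public static DateTime ToDateTime(string geniusDate)
{
    // C, YY, MM, DD: one digit, then three pairs of digits.
    Regex regex = new Regex(@"^(\d)(\d{2})(\d{2})(\d{2})$");
    Match match = regex.Match(geniusDate);
    if (!match.Success)
    {
        throw new ArgumentException("not a valid geniusDate", "geniusDate");
    }

    // C is 0 for 1900, 1 for 2000, and so on.
    int century = (int.Parse(match.Groups[1].Value) * 100) + 1900;
    int year = int.Parse(match.Groups[2].Value) + century;
    int month = int.Parse(match.Groups[3].Value);
    int day = int.Parse(match.Groups[4].Value);

    // The regex only checks the shape, so an impossible month or day
    // (month 13, say) makes this constructor throw ArgumentOutOfRangeException.
    return new DateTime(year, month, day);
}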
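Here's a rough sketch of a variation on the same idea, assuming you'd rather get false back than an exception. It's not the code I used: the TryToDateTime name, the GeniusDateExample class and the named groups are made up for illustration. It also uses [0-9] instead of \d, because \d in .NET matches non-ASCII digits too, and int.Parse won't accept those.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class GeniusDateExample
{
    // CYYMMDD again, but with named groups instead of numbered ones.
    private static readonly Regex Pattern = new Regex(
        @"^(?<century>[0-9])(?<year>[0-9]{2})(?<month>[0-9]{2})(?<day>[0-9]{2})$");

    // Hypothetical helper: reports failure with false rather than an exception.
    public static bool TryToDateTime(string geniusDate, out DateTime result)
    {
        result = default(DateTime);
        if (geniusDate == null)
        {
            return false;
        }

        Match match = Pattern.Match(geniusDate);
        if (!match.Success)
        {
            return false;
        }

        int year = 1900
            + (int.Parse(match.Groups["century"].Value) * 100)
            + int.Parse(match.Groups["year"].Value);
        int month = int.Parse(match.Groups["month"].Value);
        int day = int.Parse(match.Groups["day"].Value);

        // The regex only checks the shape, so check the ranges by hand.
        if (month < 1 || month > 12 || day < 1 || day > DateTime.DaysInMonth(year, month))
        {
            return false;
        }

        result = new DateTime(year, month, day);
        return true;
    }

    public static void Main()
    {
        DateTime date;
        if (TryToDateTime("1060601", out date))
        {
            // prints 2006-06-01
            Console.WriteLine(date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }

        // Month 13 has the right shape but is not a real date: prints False
        Console.WriteLine(TryToDateTime("1061345", out date));
    }
}
```

The named groups cost a few more characters in the pattern, but the parsing code no longer depends on counting brackets.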
I also really like the regular expression search and replace built into Visual Studio. Here's one that changes a field declaration like:
 string _name;
into a property like:
 string name
 {
  get{ return _name; }
  set{ _name = value;}
 }
Just put this:
:b*<{.*}>:b<_{.*}>;
in the 'find what' field, and this:
\1 \2\n{\nget{ return _\2; }\nset{ _\2 = value;}\n}
in the 'replace with' field. Don't forget to check the 'Use' check box and select 'Regular expressions'. It's a shame that the regular expression syntax differs between Visual Studio and the Regex class: captures are () in Regex, but {} in Visual Studio. Of course, it's much easier to press Ctrl+R, E and let the refactoring tools do this job for you, but it's a good demonstration of how much you can do with regular expression search and replace.
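To make the syntax difference concrete, here's a rough translation of the same search and replace into the Regex class. Treat it as a sketch: the FieldToProperty class and Rewrite method are names I've invented, [ \t] stands in for :b, and \w+ is only an approximation of the word-bounded .* in the Visual Studio version, so it won't cope with types like List<int>.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldToProperty
{
    // Hypothetical helper: rewrites "type _name;" field declarations as properties.
    public static string Rewrite(string source)
    {
        // Regex class syntax: captures are (...) rather than {...}.
        const string pattern = @"^[ \t]*(\w+)[ \t]+_(\w+);";

        // Captures are referenced as $1 and $2 rather than \1 and \2.
        // The \n here is expanded by the C# string literal, not by Regex.
        const string replacement =
            "$1 $2\n{\nget{ return _$2; }\nset{ _$2 = value;}\n}";

        // Multiline lets ^ match at the start of every line of the input.
        return Regex.Replace(source, pattern, replacement, RegexOptions.Multiline);
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite(" string _name;"));
    }
}
```

Running it on " string _name;" produces the same property as the Visual Studio replace, without the indentation.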

1 comment:

Unknown said...

I completely agree! I've just used it for parsing math expressions that contain complex numbers, variables and functions.

I think that would have been an insane task without regex!