
Supporting Named capturing groups before Java 7

Java 7 introduced named capturing groups in regular expressions, which is cool. But I needed to support named capturing groups on earlier versions of Java too. The requirement came from my own mobile application, where I had to read a regular expression from the user and give the user a way to bind the matching parts of the data to specified variables. Named capturing groups were a perfect solution, as they serve both purposes, but I didn’t want to restrict my application to a specific version of Android.

So I added support for named capturing groups with a wrapper, and the wrapper turned out to be very straightforward. Here it goes:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Parser {
    // Explicit type arguments: the diamond operator (<>) is itself Java 7 syntax.
    private Map<String, String> parseResult = new LinkedHashMap<String, String>();
    private String pattern;

    public Parser(String pattern) {
        this.pattern = reformatRegex(pattern);
    }

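    public Map<String, String> parse(String message) {
        // Match against the pre-processed pattern stored by the constructor,
        // not a raw pattern that may still contain (?<name>...) groups.
        Pattern regex = Pattern.compile(pattern);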
        Matcher matcher = regex.matcher(message);
        if(matcher.matches()) {
            int groupIndex = 1;
            for(String key : parseResult.keySet()) {
                if (groupIndex <= matcher.groupCount()) {
                    parseResult.put(key, matcher.group(groupIndex++));
                }
            }
        }

        return parseResult;
    }

    private String reformatRegex(String pattern) {
        StringBuilder newRegex = new StringBuilder(pattern);
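        // Match only "(?<name>". The name must start with a letter, so the
        // lookbehind constructs "(?<=" and "(?<!" are left untouched.
        Pattern regex = Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

        Matcher matcher = regex.matcher(newRegex);
        while(matcher.find()) {
            // Remember the name; its position in the map is its group number.
            parseResult.put(matcher.group(1), null);
            // Drop "?<name>" so that only the opening "(" remains.
            newRegex.replace(matcher.start(1)-2, matcher.end(1)+1, "");
            matcher = regex.matcher(newRegex);
        }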

        return newRegex.toString();
    }
}

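The important part of the code is the method reformatRegex(). The Parser class pre-processes the expression by converting named capturing groups to numbered groups, and builds a map of group names that is filled with the results later in the parse() method. Names are mapped to group numbers in order of appearance, so this works as long as every capturing group in the expression is named.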
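
To make the pre-processing step concrete, here is a condensed sketch of the same idea. It is not the Parser class above: the names NamedGroupDemo, groupNames, stripNames and the sample e-mail pattern are made up for illustration, and it assumes every capturing group in the expression is named and that no literal "(?<" appears escaped in the pattern.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    // Hypothetical helper pattern: matches "(?<name>" but not the
    // lookbehind constructs "(?<=" and "(?<!".
    private static final Pattern NAMED_GROUP =
            Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

    /** Collects the group names in order of appearance. */
    static List<String> groupNames(String regex) {
        List<String> names = new ArrayList<String>();
        Matcher m = NAMED_GROUP.matcher(regex);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Rewrites every "(?<name>" to "(" so only numbered groups remain. */
    static String stripNames(String regex) {
        return NAMED_GROUP.matcher(regex).replaceAll("(");
    }

    /** Maps each name to its numbered group; returns an empty map if there is no match. */
    static Map<String, String> parse(String regex, String input) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        Matcher m = Pattern.compile(stripNames(regex)).matcher(input);
        if (m.matches()) {
            int index = 1; // assumption: the n-th name belongs to group n
            for (String name : groupNames(regex)) {
                result.put(name, m.group(index++));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String regex = "(?<user>\\w+)@(?<host>[\\w.]+)";
        System.out.println(stripNames(regex));                 // (\w+)@([\w.]+)
        System.out.println(parse(regex, "alice@example.com")); // {user=alice, host=example.com}
    }
}
```

If the expression mixes named and unnamed capturing groups, the positional mapping in this sketch would be off by the number of unnamed groups that come first, so a real implementation would have to count those as well.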