Match results

Getting match results

As you know, the find method of Matcher can check whether a substring of a string matches the pattern. Here is an example.

String javaText = "Java supports regular expressions. LET'S USE JAVA!!!";

Pattern javaPattern = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher matcher = javaPattern.matcher(javaText);

System.out.println(matcher.find()); // prints "true"

Once find has returned true, you can get information about the substring that matched the pattern:

System.out.println(matcher.start()); // 0, the index of the first character of the match
System.out.println(matcher.end());   // 4, the index just past the last character of the match
System.out.println(matcher.group()); // "Java", the substring that matches the pattern

The MatchResult object bundles all of these results together:

MatchResult result = matcher.toMatchResult(); // a snapshot of the current match results

System.out.println(result.start()); // 0
System.out.println(result.end());   // 4
System.out.println(result.group()); // "Java"
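
The following is a minimal sketch of why a MatchResult can be handy: it is a snapshot, so it keeps the values of the match it was taken from even after the matcher moves on. The class name MatchResultSnapshot and the helper firstMatchSnapshot are hypothetical names chosen for this illustration.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchResultSnapshot {

    // Hypothetical helper: returns a snapshot of the first case-insensitive match.
    static MatchResult firstMatchSnapshot(String text, String regex) {
        Matcher matcher = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No match for: " + regex);
        }
        MatchResult snapshot = matcher.toMatchResult();
        matcher.find(); // move the matcher to the next match; the snapshot is not affected
        return snapshot;
    }

    public static void main(String[] args) {
        String javaText = "Java supports regular expressions. LET'S USE JAVA!!!";
        MatchResult first = firstMatchSnapshot(javaText, "java");
        // Still describes the first match: Java 0 4
        System.out.println(first.group() + " " + first.start() + " " + first.end());
    }
}
```

Here the second call to find advances the matcher to "JAVA", yet the snapshot still reports the first match.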

Be careful: if you invoke start, end, or group before calling find, or after find has returned false, they throw IllegalStateException. To avoid the exception, always check the boolean result of find before invoking these methods.

if (matcher.find()) {
    System.out.println(matcher.start());
    System.out.println(matcher.end());
    System.out.println(matcher.group());
} else {
    System.out.println("No matches found");
}

This code prints "No matches found" if find returns false. It also guarantees that start, end, and group are invoked only after find has returned true.
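
To see the failure cases for yourself, here is a small sketch that deliberately calls group in both problematic situations and reports whether IllegalStateException was thrown. The class MatcherStateDemo and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherStateDemo {

    // Returns true if group() throws IllegalStateException before find() is called.
    static boolean throwsBeforeFind(String text, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        try {
            matcher.group(); // no match has been attempted yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns true if group() throws IllegalStateException after find() returned false.
    static boolean throwsAfterFailedFind(String text, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (matcher.find()) {
            return false; // there was a match, so this scenario does not apply
        }
        try {
            matcher.group(); // the previous match attempt failed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsBeforeFind("Java", "java"));        // true
        System.out.println(throwsAfterFailedFind("Kotlin", "java")); // true
    }
}
```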

Iterating over multiple matches

Sometimes more than one substring of a string matches the same pattern. In the previous example, there are two such substrings, "Java" and "JAVA", because the pattern is case-insensitive. Each successful call to find continues the search right after the previous match, so calling it in a loop lets us iterate over all substrings that match the pattern.

String javaText = "Java supports regular expressions. LET'S USE JAVA!!!";

Pattern javaPattern = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher matcher = javaPattern.matcher(javaText);

while (matcher.find()) {
    System.out.println("group: " + matcher.group() + ", start: " + matcher.start() + ", end: " + matcher.end());
}

This code outputs:

group: Java, start: 0, end: 4
group: JAVA, start: 45, end: 49

The loop condition guarantees that the methods called in the loop body are invoked only after find has returned true.
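
If the matches are needed after the loop has finished, one option is to store a MatchResult for each of them. The sketch below assumes a hypothetical helper findAll in a hypothetical class AllMatches. It is only one possible way to collect the results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllMatches {

    // Hypothetical helper: collects a snapshot of every case-insensitive match.
    static List<MatchResult> findAll(String text, String regex) {
        Matcher matcher = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        List<MatchResult> results = new ArrayList<>();
        while (matcher.find()) {
            // toMatchResult() is safe here because find() has just returned true
            results.add(matcher.toMatchResult());
        }
        return results;
    }

    public static void main(String[] args) {
        String javaText = "Java supports regular expressions. LET'S USE JAVA!!!";
        for (MatchResult result : findAll(javaText, "java")) {
            System.out.println("group: " + result.group()
                    + ", start: " + result.start()
                    + ", end: " + result.end());
        }
    }
}
```

An empty list is returned when nothing matches, so the caller does not need a separate check before iterating.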