(0[1-9]|[1-9]|[12][0-9]|3[01])-(0[1-9]|1[012]|[1-9])-(19|20)\\d{2}
However, RegEx can be used for parsing as well as validating, a use often overlooked within Java systems in favour of manually parsing data structures out of strings. What's even better is that a single RegEx can both validate and parse in one step, through the use of groups (also called sub-matches or, more loosely, back-references, depending on your terminology). Java's java.util.regex package uses the "group" name. Essentially, it all involves the use of brackets in your expressions. Note that brackets were used in the date pattern above, and it would have accidentally worked as a parser, were it not for the fact that the \\d{2} is not inside any brackets: the third group captures only the "19" or "20", so the full year cannot be pulled out.
Taking a simpler example: HTML colour strings of the form #FFCC00. Here, we're allowing them to be optionally prefixed by the hash character, and will allow both upper and lower case for the letters.
String COLOR_REGEX = "^#{0,1}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}$";
This expression states that at the start of the string being processed, we expect a hash character with either 0 or 1 occurrences (thereby making it optional). Following this, we expect two characters (the {2}), each being one of the digits 0-9 or the letters A-F or a-f. This two-character pattern appears three times, once for each of the colour components being expressed. Finally, we add the $ to state we don't want any more content after the last hex byte.
This pattern is sufficient to validate that a String is an HTML colour:
package regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexValidator {
    public static void main(String[] args) {
        String regex = "^#{0,1}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}$";
        Pattern validator = Pattern.compile(regex);
        Matcher m;
        m = validator.matcher("#FFcc8A");
        System.out.println("#FFcc8A - " + (m.matches() ? "Validated" : "NOT A COLOUR!"));
        m = validator.matcher("#steve");
        System.out.println("#steve - " + (m.matches() ? "Validated" : "NOT A COLOUR!"));
    }
}
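One detail worth seeing in action: Matcher#matches() already requires the whole input to match, so the ^ and $ in the pattern only make a difference if you switch to Matcher#find(). The sketch below is illustrative only; the class name AnchorDemo and the sample input are my own, not part of the original example.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    // The colour pattern from above, with and without the ^ and $ anchors.
    static final Pattern ANCHORED =
            Pattern.compile("^#{0,1}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}$");
    static final Pattern UNANCHORED =
            Pattern.compile("#{0,1}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}[0-9A-Fa-f]{2}");

    public static void main(String[] args) {
        // Hypothetical input: a colour embedded in a longer string.
        String input = "colour: #FFcc8A;";

        // matches() must consume the entire input, anchors or not.
        System.out.println(UNANCHORED.matcher(input).matches()); // false

        // find() searches for a match anywhere in the input,
        // so only the anchored pattern rejects the surrounding text.
        System.out.println(UNANCHORED.matcher(input).find());    // true
        System.out.println(ANCHORED.matcher(input).find());      // false
    }
}
```

Keeping the anchors in the pattern is harmless with matches() and means the expression stays strict if it is later reused with find().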
But we can do better. By adding brackets around each of the hex byte expressions, the Matcher will now not only match, but allow us to pull out the content of each group as well:
package regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexValidator {
    public static void main(String[] args) {
        String regex = "^#{0,1}([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})$";
        Pattern parser = Pattern.compile(regex);
        Matcher m;
        m = parser.matcher("#FFcc8A");
        // Run the match once and reuse the result; groups are only readable after a successful match
        boolean valid = m.matches();
        System.out.println("#FFcc8A - " + (valid ? "Validated" : "NOT A COLOUR!"));
        if (valid) {
            // Group 0 is the whole match; groups 1 to 3 are the bracketed hex bytes
            System.out.println("Components - R: " + m.group(1) + " G: " + m.group(2) + " B: " + m.group(3));
        }
        m = parser.matcher("#steve");
        System.out.println("#steve - " + (m.matches() ? "Validated" : "NOT A COLOUR!"));
    }
}
In this way, we can save quite a bit of messing around with String#indexOf(), String#substring(), etc.
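The same trick rescues the date pattern from the start of this post. As written, its third group captures only the "19" or "20", so the sketch below wraps the whole year in brackets instead. Treat it as one possible fix rather than the only one: the class name DateParser and the parse() helper are made up for illustration, and the (?:...) non-capturing group is my addition so that the century alternation doesn't use up a group number.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateParser {
    // The earlier date pattern, with the full four-digit year inside one capturing group.
    // (?:19|20) groups the alternation without capturing it.
    static final Pattern DATE = Pattern.compile(
            "(0[1-9]|[1-9]|[12][0-9]|3[01])-(0[1-9]|1[012]|[1-9])-((?:19|20)\\d{2})");

    // Hypothetical helper: returns {day, month, year}, or null if the input does not match.
    static int[] parse(String input) {
        Matcher m = DATE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),  // day
            Integer.parseInt(m.group(2)),  // month
            Integer.parseInt(m.group(3))   // year
        };
    }

    public static void main(String[] args) {
        int[] parts = parse("25-12-2009");
        System.out.println("Day: " + parts[0] + " Month: " + parts[1] + " Year: " + parts[2]);
        System.out.println("25-13-2009 - " + (parse("25-13-2009") == null ? "NOT A DATE!" : "Validated"));
    }
}
```

Note that this only checks the shape of the string: something like 31-02-2009 still passes, so calendar validation would have to happen after parsing.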
We can also use something similar for outputting data without having to build up output messages using StringBuilders all the time: java.util.Formatter. A Formatter instance allows us to define how some content should be output, and apply the actual content separately. Taking the example of HTML colours again, having parsed out the hex components as strings, we will likely have converted them to integers for processing. If we wanted to output the values again (I'm holding them in a List here, but a simple array would be more convenient), we could use something like:
package regex;
import java.util.Arrays;
import java.util.Formatter;
import java.util.List;

public class OutputFormatting {
    public static void main(String[] args) {
        List<Integer> intComps = Arrays.asList(21, 243, 0);
        // %02X writes each value as two upper-case hex digits, zero-padded
        String output = new Formatter().format("#%02X%02X%02X", intComps.toArray()).toString();
        System.out.println(output); // prints #15F300
    }
}
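Putting the two halves together, here is a rough sketch of the full round trip: parse a colour with groups, convert each hex byte to an integer, and format it back out. The class name ColourRoundTrip and the normalise() helper are hypothetical, and I've used String.format here, which uses a java.util.Formatter behind the scenes, simply to keep the sketch short.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColourRoundTrip {
    static final Pattern COLOUR = Pattern.compile(
            "^#{0,1}([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})$");

    // Hypothetical helper: parses a colour and writes it back as #RRGGBB in upper case.
    static String normalise(String input) {
        Matcher m = COLOUR.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a colour: " + input);
        }
        // Each group holds two hex digits; radix 16 turns them into a value from 0 to 255.
        int r = Integer.parseInt(m.group(1), 16);
        int g = Integer.parseInt(m.group(2), 16);
        int b = Integer.parseInt(m.group(3), 16);
        // %02X is zero-padded, two-digit, upper-case hex.
        return String.format("#%02X%02X%02X", r, g, b);
    }

    public static void main(String[] args) {
        System.out.println(normalise("ffcc8a"));  // #FFCC8A
        System.out.println(normalise("#15f300")); // #15F300
    }
}
```

Between the parse and the format you have plain ints to work with, so any processing (blending, brightening and so on) can happen there without touching strings.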