[Solved] Split a string with letters, numbers, and punctuation


A new approach, since Java’s String.split() (like Pattern.split()) doesn’t keep the delimiters in the result, even if they are enclosed in a capturing group. Instead of splitting, match the tokens directly:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is the input to tokenize
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile(
    "[+-]?           # Match a number, starting with an optional sign,\n" +
    "\\d+            # a mandatory integer part,\n" +
    "(?:\\.\\d+)?    # optionally followed by a decimal part\n" +
    "(?:e[+-]?\\d+)? # and/or an exponential part.\n" +
    "|               # OR\n" +
    "(?:             # Match...\n" +
    " (?![+-]?\\d)   # (unless it's the beginning of a number)\n" +
    " .              # any character\n" +
    ")+              # one or more times; with * an empty match is added at the end.",
    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
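
For reference, here is the same loop wrapped in a small runnable class, as a minimal sketch. The class name TokenizeDemo, the method tokenize, and the sample inputs are made up for illustration. The pattern is the one above written compactly, using + for the non-number alternative so that no empty token is returned at the end of the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizeDemo {
    // A number (optional sign, integer part, optional fraction, optional exponent)
    // OR a run of characters that does not start a number.
    private static final Pattern TOKEN = Pattern.compile(
        "[+-]?\\d+(?:\\.\\d+)?(?:e[+-]?\\d+)?|(?:(?![+-]?\\d).)+",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Hypothetical helper: collects every match of TOKEN in order.
    static List<String> tokenize(String subject) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(subject);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // split() drops the delimiter even though it is in a capturing group.
        System.out.println(Arrays.toString("a1b".split("(\\d)"))); // [a, b]
        // Matching keeps both the numbers and the text between them.
        System.out.println(tokenize("abc-12.5e3,def")); // [abc, -12.5e3, ,def]
    }
}
```

Note that . does not match line terminators by default, so newlines in the input are skipped; adding Pattern.DOTALL would keep them inside the text tokens.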

Note that this regex doesn’t match “abbreviated” decimal numbers like 1. or .1 correctly – it assumes that a decimal number always has an integer part and a decimal part. If those cases need to be included, the regex will need to be augmented.
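
One possible augmentation, as a sketch rather than the only way to do it: allow the digits on either side of the decimal point to be missing (but not both), and teach the lookahead that a dot followed by a digit also starts a number. The class name AbbreviatedNumbers and the sample inputs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AbbreviatedNumbers {
    // Number: optional sign, then "1", "1.", "1.5" or ".5", then an optional exponent.
    // Otherwise: a run of characters that does not start such a number.
    private static final Pattern TOKEN = Pattern.compile(
        "[+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:e[+-]?\\d+)?|(?:(?![+-]?\\.?\\d).)+",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Hypothetical helper: collects every match of TOKEN in order.
    static List<String> tokenize(String subject) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(subject);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a .5 b 1. c")); // [a , .5,  b , 1.,  c]
    }
}
```

One side effect of this assumption: a number that ends a sentence, as in "there were 3.", now swallows the full stop, so whether this variant is appropriate depends on the input.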
