ReUtil

Origin

Regular expressions can handle almost any text-processing task, but using them from Java is often cumbersome, so I wrapped some commonly used operations in utility methods. For example, to match part of a piece of text, we would normally write:

String content = "ZZZaaabbbccc中文1234";
String regex = "\\d+"; // example pattern for illustration; any regular expression works here
Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
Matcher matcher = pattern.matcher(content);
if (matcher.find()) {
    String result = matcher.group();
}

This process involves several objects, and it is hard to remember them all when you need them. Well, since this operation is so common, I wrapped it like this:

/**
 * Get the matched string.
 *
 * @param pattern    the compiled regular expression
 * @param content    the content to search
 * @param groupIndex the index of the capturing group to return (0 is the whole match)
 * @return the matched string, or null if nothing matches
 */
public static String get(Pattern pattern, String content, int groupIndex) {
    Matcher matcher = pattern.matcher(content);
    if (matcher.find()) {
        return matcher.group(groupIndex);
    }
    return null;
}

/**
 * Get the matched string.
 *
 * @param regex      the regular expression to compile and match
 * @param content    the content to search
 * @param groupIndex the index of the capturing group to return (0 is the whole match)
 * @return the matched string, or null if nothing matches
 */
public static String get(String regex, String content, int groupIndex) {
    Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
    return get(pattern, content, groupIndex);
}
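
To make the group index concrete, here is a minimal standalone sketch that copies the two overloads above into a hypothetical class named GetSketch (the class name and the sample patterns are assumptions for illustration, not part of the library) and calls them using only the JDK:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone class mirroring the two get overloads shown above.
class GetSketch {
    static String get(Pattern pattern, String content, int groupIndex) {
        Matcher matcher = pattern.matcher(content);
        // Only the first match is considered.
        return matcher.find() ? matcher.group(groupIndex) : null;
    }

    static String get(String regex, String content, int groupIndex) {
        return get(Pattern.compile(regex, Pattern.DOTALL), content, groupIndex);
    }

    public static void main(String[] args) {
        String content = "ZZZaaabbbccc中文1234";
        // Group 0 is the whole match; groups 1 and 2 are the capturing groups.
        System.out.println(get("(\\w)aa(\\w)", content, 0)); // Zaaa
        System.out.println(get("(\\w)aa(\\w)", content, 2)); // a
        System.out.println(get("x+", content, 0));           // null
    }
}
```

Returning null instead of throwing keeps the call site to a single line, at the cost of requiring a null check when a match is not guaranteed.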

Usage

ReUtil.extractMulti

Extract multiple groups from the first match and join them according to a template, where $1, $2, and so on refer to the capturing groups.

String content = "ZZZaaabbbccc中文1234";
String resultExtractMulti = ReUtil.extractMulti("(\\w)aa(\\w)", content, "$1-$2");
Assert.assertEquals("Z-a", resultExtractMulti);
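
The template substitution can be illustrated with a JDK-only sketch. This is not the library's implementation; the class ExtractMultiSketch and its method are hypothetical and only show one way such a template could be filled in:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: fill a "$1-$2" style template from the first match.
class ExtractMultiSketch {
    static String extractMulti(String regex, String content, String template) {
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(content);
        if (!matcher.find()) {
            return null; // nothing matched
        }
        String result = template;
        // Walk the groups from highest to lowest so "$10" is replaced before "$1".
        for (int i = matcher.groupCount(); i >= 1; i--) {
            String group = matcher.group(i);
            // String.replace is a literal (non-regex) replacement.
            result = result.replace("$" + i, group == null ? "" : group);
        }
        return result;
    }

    public static void main(String[] args) {
        String content = "ZZZaaabbbccc中文1234";
        System.out.println(extractMulti("(\\w)aa(\\w)", content, "$1-$2")); // Z-a
    }
}
```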

ReUtil.delFirst

Delete the first match from the content.

String content = "ZZZaaabbbccc中文1234";
String resultDelFirst = ReUtil.delFirst("(\\w)aa(\\w)", content);
Assert.assertEquals("ZZbbbccc中文1234", resultDelFirst);
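
For comparison, roughly the same effect can be obtained with the JDK alone through Matcher.replaceFirst. The class DelFirstSketch below is a hypothetical illustration, not the library's code:

```java
import java.util.regex.Pattern;

// Hypothetical sketch: remove the first match by replacing it with an empty string.
class DelFirstSketch {
    static String delFirst(String regex, String content) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(content).replaceFirst("");
    }

    public static void main(String[] args) {
        String content = "ZZZaaabbbccc中文1234";
        // The first match of (\w)aa(\w) is "Zaaa", so only that part is removed.
        System.out.println(delFirst("(\\w)aa(\\w)", content)); // ZZbbbccc中文1234
    }
}
```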

ReUtil.findAll

Find all matches in the text. Matches do not overlap: each search resumes where the previous match ended.

String content = "ZZZaaabbbccc中文1234";
List<String> resultFindAll = ReUtil.findAll("\\w{2}", content, 0, new ArrayList<String>());
// Result: ["ZZ", "Za", "aa", "bb", "bc", "cc", "12", "34"]
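
The result above follows from how repeated Matcher.find() calls consume the input. A minimal JDK-only sketch, using a hypothetical class FindAllSketch rather than the library itself, reproduces it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: collect the given group of every non-overlapping match.
class FindAllSketch {
    static List<String> findAll(String regex, String content, int group) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(content);
        // Each find() continues after the end of the previous match.
        while (matcher.find()) {
            results.add(matcher.group(group));
        }
        return results;
    }

    public static void main(String[] args) {
        String content = "ZZZaaabbbccc中文1234";
        // By default \w covers only [a-zA-Z_0-9], so the Chinese characters are skipped.
        System.out.println(findAll("\\w{2}", content, 0));
        // [ZZ, Za, aa, bb, bc, cc, 12, 34]
    }
}
```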

ReUtil.getFirstNumber

Find the first number in the text.

String content = "ZZZaaabbbccc中文1234";
Integer resultGetFirstNumber = ReUtil.getFirstNumber(content);
// Result: 1234

ReUtil.isMatch

Check whether the entire string matches the given regular expression.

String content = "ZZZaaabbbccc中文1234";
boolean isMatch = ReUtil.isMatch("\\w+[\u4E00-\u9FFF]+\\d+", content);
Assert.assertTrue(isMatch);

ReUtil.replaceAll

Find a string using a regular expression and replace it with a replacement template, where $1 refers to the first group.

String content = "ZZZaaabbbccc中文1234";
String replaceAll = ReUtil.replaceAll(content, "(\\d+)", "->$1<-");
Assert.assertEquals("ZZZaaabbbccc中文->1234<-", replaceAll);

ReUtil.escape

Escape the special characters in a string so that it can be used literally inside a regular expression.

String escape = ReUtil.escape("我有个$符号{}");
// Result: 我有个\$符号\{\}
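
To show what escaping means in practice, here is a JDK-only sketch. The class EscapeSketch and its set of special characters are assumptions made for this illustration and may differ from what the library actually escapes:

```java
import java.util.regex.Pattern;

// Hypothetical sketch: prefix each regex metacharacter with a backslash.
class EscapeSketch {
    // Characters treated as special here; this set is an assumption of the sketch.
    private static final String SPECIAL = "$()*+.[]?\\^{}|";

    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "我有个$符号{}";
        String escaped = escape(raw);
        System.out.println(escaped);                       // 我有个\$符号\{\}
        // The escaped text, used as a pattern, matches the raw text literally.
        System.out.println(Pattern.matches(escaped, raw)); // true
    }
}
```

The JDK also provides Pattern.quote, which wraps the whole string in \Q...\E instead of escaping individual characters; either approach yields a pattern that matches the input literally.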