
Why replacing a newline (\n) with the literal text \n via Java's String.replaceAll takes four backslashes ("\\\\n")

Created: 2017-07-13  Views: 6246

I recently needed to parse a string holding a JSONArray:

[{"key":"姓名","value":"XX"},{"key":"资质","value":"从事贵金属投资行业10年
国家期货二级分析师
上金所荣誉长老"},{"key":"其他","value":""}]

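The value for the key 资质 contains three pieces of information on separate lines, and that is the trap: when the JSON parser hits a raw newline (\n) inside the string, it throws an exception. What to do?
Fortunately there is a workaround: use Java's built-in String.replaceAll to turn each newline character (\n) into the two visible characters \ and n before parsing.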

System.out.println(array.replaceAll("\n", "\\n"));

But something looks wrong. The output is:

[{"key":"姓名","value":"XX"},{"key":"资质","value":"从事贵金属投资行业10年n国家期货二级分析师n上金所荣誉长老"},{"key":"其他","value":""}]

Whoa! Why is only an n left?
To find out what is going on, do we reach for Baidu or Google? No, we read the source.
First, the source of String.replaceAll:

public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}

The source shows that this method uses regular-expression matching. So how does the replacement logic actually work? Let's look at Matcher.replaceAll next:

public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
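
To make the loop in the listing above concrete, it can be reproduced by hand with the public Matcher API. This is only a minimal sketch: the class name ManualReplaceAll and the helper replaceAllByHand are hypothetical names chosen for illustration, and the JDK's own implementation may differ in detail between versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ManualReplaceAll {
    // Hypothetical helper: the same find / appendReplacement / appendTail
    // sequence that Matcher.replaceAll runs internally.
    public static String replaceAllByHand(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the processed replacement.
            m.appendReplacement(sb, replacement);
        }
        // Copies whatever follows the last match (or the whole input if none).
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "line1\nline2";
        System.out.println(replaceAllByHand(input, "\n", "-"));
        // Compare with the built-in method on the same input.
        System.out.println(replaceAllByHand(input, "\n", "-").equals(input.replaceAll("\n", "-")));
    }
}
```

With a replacement that contains no backslash or dollar sign, the hand-written loop and String.replaceAll should agree.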

This method keeps looping until find() returns false. Each time find() matches a newline (the regex we passed to String.replaceAll is "\n"), it calls appendReplacement. So what does that method actually do?

public Matcher appendReplacement(StringBuffer sb, String replacement) {

        // If no match, return error
        if (first < 0)
            throw new IllegalStateException("No match available");

        // Process substitution string to replace group references with groups
        int cursor = 0;
        StringBuilder result = new StringBuilder();

        while (cursor < replacement.length()) {
            char nextChar = replacement.charAt(cursor);
            if (nextChar == '\\') {
                cursor++;
                nextChar = replacement.charAt(cursor);
                result.append(nextChar);
                cursor++;
            } else if (nextChar == '$') {
                // Skip past $
                cursor++;
                // A StringIndexOutOfBoundsException is thrown if
                // this "$" is the last character in replacement
                // string in current implementation, a IAE might be
                // more appropriate.
                nextChar = replacement.charAt(cursor);
                int refNum = -1;
                if (nextChar == '{') {
                    cursor++;
                    StringBuilder gsb = new StringBuilder();
                    while (cursor < replacement.length()) {
                        nextChar = replacement.charAt(cursor);
                        if (ASCII.isLower(nextChar) ||
                            ASCII.isUpper(nextChar) ||
                            ASCII.isDigit(nextChar)) {
                            gsb.append(nextChar);
                            cursor++;
                        } else {
                            break;
                        }
                    }
                    if (gsb.length() == 0)
                        throw new IllegalArgumentException(
                            "named capturing group has 0 length name");
                    if (nextChar != '}')
                        throw new IllegalArgumentException(
                            "named capturing group is missing trailing '}'");
                    String gname = gsb.toString();
                    if (ASCII.isDigit(gname.charAt(0)))
                        throw new IllegalArgumentException(
                            "capturing group name {" + gname +
                            "} starts with digit character");
                    if (!parentPattern.namedGroups().containsKey(gname))
                        throw new IllegalArgumentException(
                            "No group with name {" + gname + "}");
                    refNum = parentPattern.namedGroups().get(gname);
                    cursor++;
                } else {
                    // The first number is always a group
                    refNum = (int)nextChar - '0';
                    if ((refNum < 0)||(refNum > 9))
                        throw new IllegalArgumentException(
                            "Illegal group reference");
                    cursor++;
                    // Capture the largest legal group string
                    boolean done = false;
                    while (!done) {
                        if (cursor >= replacement.length()) {
                            break;
                        }
                        int nextDigit = replacement.charAt(cursor) - '0';
                        if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
                            break;
                        }
                        int newRefNum = (refNum * 10) + nextDigit;
                        if (groupCount() < newRefNum) {
                            done = true;
                        } else {
                            refNum = newRefNum;
                            cursor++;
                        }
                    }
                }
                // Append group
                if (start(refNum) != -1 && end(refNum) != -1)
                    result.append(text, start(refNum), end(refNum));
            } else {
                result.append(nextChar);
                cursor++;
            }
        }
        // Append the intervening text
        sb.append(text, lastAppendPosition, first);
        // Append the match substitution
        sb.append(result);

        lastAppendPosition = last;
        return this;
    }

Reading through the implementation, the first line of the while loop executes

char nextChar = replacement.charAt(cursor);

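which fetches the next character of the replacement string. Our replacement was written "\\n" in Java source, which at run time is the two characters \ and n, so the first character is '\'. Now look at the first if branch: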

if (nextChar == '\\') {
    cursor++;
    nextChar = replacement.charAt(cursor);
    result.append(nextChar);
    cursor++;
} 

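When the character is '\', cursor is incremented by 1, the following character is fetched, and only that following character is appended to result. This is the key point: the replacement string treats a backslash as an escape character and drops it. A run-time replacement of \n therefore yields just n, and two consecutive backslashes (\\) collapse into one (\). To get a single literal backslash in the output, the replacement must contain two backslashes at run time, which means four in Java source. That explains the problem.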
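
The escape handling described above can be checked with a short experiment. This is a minimal sketch on a shortened input instead of the JSON string; the class and method names (ReplacementEscapeDemo, twoBackslashes, fourBackslashes) are hypothetical and exist only for this illustration.

```java
public class ReplacementEscapeDemo {
    // Replacement is the two run-time characters \ and n:
    // the backslash is consumed as an escape, leaving only n.
    public static String twoBackslashes(String s) {
        return s.replaceAll("\n", "\\n");
    }

    // Replacement is the three run-time characters \, \ and n:
    // the escaped backslash survives as one literal backslash, followed by n.
    public static String fourBackslashes(String s) {
        return s.replaceAll("\n", "\\\\n");
    }

    public static void main(String[] args) {
        String input = "a\nb";
        System.out.println(twoBackslashes(input));   // prints anb
        System.out.println(fourBackslashes(input));  // prints a\nb
    }
}
```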

So the final version should be

System.out.println(array.replaceAll("\n", "\\\\n"));

Output:

[{"key":"姓名","value":"XX"},{"key":"资质","value":"从事贵金属投资行业10年\n国家期货二级分析师\n上金所荣誉长老"},{"key":"其他","value":""}]

Perfect!
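
As a side note that the original analysis does not cover, the standard library also offers ways to avoid counting backslashes. The sketch below assumes the same goal (turning a newline into the visible characters \ and n) and uses Matcher.quoteReplacement and the non-regex String.replace; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;

public class LiteralReplacementDemo {
    // quoteReplacement escapes backslashes and dollar signs so the
    // replacement is taken literally by replaceAll.
    public static String withQuoteReplacement(String s) {
        return s.replaceAll("\n", Matcher.quoteReplacement("\\n"));
    }

    // String.replace(CharSequence, CharSequence) treats both arguments
    // as literal text, so no regex or replacement escaping applies.
    public static String withPlainReplace(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String input = "a\nb";
        System.out.println(withQuoteReplacement(input)); // prints a\nb
        System.out.println(withPlainReplace(input));     // prints a\nb
    }
}
```

Both variants should give the same result as the four-backslash form, while keeping the replacement readable as ordinary Java string text.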
