In a recent project on GitHub, I noticed that the login function calls trim() on the password string sent from the front end. The trim() method is intended to remove spaces from the beginning and end of the user's input. Part of the code is as follows:

            // user name
            String username = loginData.get(getUsernameParameter());
            // password
            String password = loginData.get(getPasswordParameter());
            // prevent null pointer exceptions
            if (username == null) {
                username = "";
            }
            if (password == null) {
                password = "";
            }
            // remove leading and trailing spaces
            username = username.trim();

trim() is a String method. Look a little closer, though, and it turns out that trim() does more than remove the “leading and trailing spaces” from a string.

P.S. References: "String.trim(): what exactly is removed?" and "Avoid getfield opcode".

Straight to the source code (this is the char[]-backed implementation found in JDK 8):

public String trim() {
        // len represents the length of the instance string
        int len = value.length;
        // st represents a counter (cursor)
        int st = 0;
        // When a method reads an instance field many times, copying it into a local
        // variable first avoids repeated getfield operations and can improve performance
        char[] val = value;
        
        // the first while
        while ((st < len) && (val[st] <= ' ')) {
            st++;
        }
        // the second while
        while ((st < len) && (val[len - 1] <= ' ')) {
            len--;
        }
        return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
    }
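
To make the two loops easier to follow, here is a minimal sketch that replays them outside of String and reports where the cursors end up. The class name TrimTrace and the method trimBounds are hypothetical names chosen for illustration, and toCharArray() stands in for direct access to the private value field, so this is an approximation of the logic above rather than the JDK code itself.

```java
// Illustrative replay of the two while loops; TrimTrace and trimBounds are hypothetical names.
class TrimTrace {
    // Returns {st, len} such that s.substring(st, len) has the same content as s.trim().
    static int[] trimBounds(String s) {
        char[] val = s.toCharArray(); // stands in for `char[] val = value`
        int len = val.length;
        int st = 0;
        // first while: move the start cursor past leading chars <= ' '
        while (st < len && val[st] <= ' ') {
            st++;
        }
        // second while: move the end cursor back past trailing chars <= ' '
        while (st < len && val[len - 1] <= ' ') {
            len--;
        }
        return new int[] { st, len };
    }

    public static void main(String[] args) {
        String input = "\t  hi there \n";
        int[] bounds = trimBounds(input);
        System.out.println(bounds[0] + ".." + bounds[1]);               // 3..11
        System.out.println(input.substring(bounds[0], bounds[1]));      // hi there
    }
}
```

Note that the space inside "hi there" survives: both cursors stop at the first character greater than ' ', and nothing between them is inspected.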

Observations:

  1. Both the first and the second while loop compare values of type char: val[st] <= ' ' and val[len - 1] <= ' '
  2. The trim() method ends with a call to the substring() method, so trim() ultimately performs a substring (truncation) operation

Analysis:

  1. A char has a numeric value in encoding tables such as ASCII, so a comparison between two char values is really a comparison between two integers. Looking up an ASCII table (for example ascii.911cha.com/) shows that the space character ' ' has the code 32 (\u0020), so the condition <= ' ' matches the space itself and every character below it. trim() therefore removes, from both ends of the string, all characters whose code is less than or equal to 32: the space plus the ASCII control characters 0 to 31 (which makes sense, since most of these characters are almost impossible to type on a keyboard). Trimming stops at the first character with a larger code, such as 'a' or 'B', and characters in the middle of the string are never touched.
  2. For the substring() method:
public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

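We find that unless the requested range covers the whole string, substring() returns new String(...): a new object created on the heap, not a string taken from the constant pool. This gives the result: when trim() actually removes something, it returns a new object; when there is nothing to remove, it returns this, the original instance.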
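
A small sketch can confirm this behaviour using reference comparison (==). The class TrimIdentityDemo and its helper returnsSameObject are hypothetical names for illustration. The "same object when nothing is trimmed" case is what the trim() Javadoc describes; the comparison against the literal reflects how the source shown above behaves and is an observation, not a documented guarantee.

```java
// Hypothetical demo class; only String.trim() and String.intern() come from the JDK.
class TrimIdentityDemo {
    // True if trim() handed back the very same object it was called on.
    static boolean returnsSameObject(String s) {
        return s.trim() == s;
    }

    public static void main(String[] args) {
        String clean = "admin";     // literal, lives in the string pool
        String padded = " admin ";

        System.out.println(returnsSameObject(clean));   // true: nothing to trim, `this` is returned
        System.out.println(returnsSameObject(padded));  // false: a different object comes back

        String trimmed = padded.trim();
        System.out.println(trimmed.equals(clean));      // true: same characters
        // With the implementation shown above this prints false, because trim()
        // builds a fresh String instead of looking up the pooled literal.
        System.out.println(trimmed == clean);
        System.out.println(trimmed.intern() == clean);  // true: intern() returns the pooled instance
    }
}
```

The practical takeaway is the usual one: compare the result of trim() with equals(), never with ==.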


Conclusion:

  1. The trim() method removes not just spaces, but all characters at both ends of the string whose Unicode code is less than or equal to 32 (\u0020)
  2. When trim() actually removes characters, a new object is returned; when there is nothing to remove, the original string itself is returned
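
The first conclusion can be checked character by character. The sketch below wraps a probe character around "x" and tests whether trim() strips it; the class TrimCharsDemo and the helper isRemovedByTrim are hypothetical names for illustration. The probe characters above \u0020 (DEL, the no-break space, the ideographic space) were picked as examples and are not mentioned in the original code.

```java
// Hypothetical demo class; only String.trim() itself comes from the JDK.
class TrimCharsDemo {
    // True if trim() strips character c from both ends of a string.
    static boolean isRemovedByTrim(char c) {
        String wrapped = c + "x" + c;
        return wrapped.trim().equals("x");
    }

    public static void main(String[] args) {
        System.out.println(isRemovedByTrim(' '));       // true  (U+0020, the space itself)
        System.out.println(isRemovedByTrim('\t'));      // true  (U+0009, tab)
        System.out.println(isRemovedByTrim('\u0000'));  // true  (NUL control character)
        System.out.println(isRemovedByTrim('\u001F'));  // true  (last code below the space)
        System.out.println(isRemovedByTrim('\u007F'));  // false (DEL is a control character, but above U+0020)
        System.out.println(isRemovedByTrim('\u00A0'));  // false (no-break space)
        System.out.println(isRemovedByTrim('\u3000'));  // false (ideographic, full-width space)
    }
}
```

So the rule really is the numeric comparison and nothing more: a control character such as DEL is kept, and so are space-like characters above \u0020, which matters for input typed with a full-width input method.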