The code examined here converts a string, such as a variable name, to snake case.

The snippet comes from 30-seconds-of-python.

snake

from re import sub

def snake(s):
  return '_'.join(
    sub('([A-Z][a-z]+)', r' \1',
    sub('([A-Z]+)', r' \1',
    s.replace('-', ' '))).split()).lower()

# EXAMPLES
snake('camelCase') # 'camel_case'
snake('some text') # 'some_text'
snake('some-mixed_string With spaces_underscores-and-hyphens') # 'some_mixed_string_with_spaces_underscores_and_hyphens'
snake('AllThe-small Things') # "all_the_small_things"

The snake function uses regular expressions to transform the string, splits it into words, and joins the words with an _ delimiter. It relies mainly on sub from the re module, together with str.replace, str.split, str.join, and str.lower. Before diving into the logic of snake, let's look at the functions it uses.

str.replace(old, new[, count])

Returns a copy of the string in which all occurrences of the substring old are replaced with new. If the optional argument count is given, only the first count occurrences are replaced.
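As a small illustration of the count argument, the sketch below replaces hyphens in a made-up string (the variable names are illustrative only, not part of the original snippet):

```python
# str.replace returns a new string; the original is left unchanged.
text = 'a-b-c-d'

replaced_all = text.replace('-', ' ')      # every '-' is replaced
replaced_two = text.replace('-', ' ', 2)   # only the first two '-' are replaced

print(replaced_all)  # a b c d
print(replaced_two)  # a b c-d
print(text)          # a-b-c-d
```

snake calls str.replace without count, so every hyphen in the input becomes a space.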

str.split(sep=None, maxsplit=-1)

Returns a list of words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits will be done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, there is no limit on the number of splits (all possible splits are made).

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are treated as a single separator, and the result contains no empty strings at the start or end even if the string has leading or trailing whitespace. Consequently, splitting an empty string, or a string containing only whitespace, with a None separator returns [].

>>> '1 2 3'.split()
['1', '2', '3']
>>> '1 2 3'.split(maxsplit=1)
['1', '2 3']
>>> '   1   2   3   '.split()
['1', '2', '3']

str.join(iterable)

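Returns a string that is the concatenation of the strings in iterable. The string on which the method is called is used as the separator between elements.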
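A brief sketch of str.join, using a made-up word list, shows that the separator is the string the method is called on:

```python
words = ['some', 'mixed', 'string']

# The separator is the string the method is called on.
joined = '_'.join(words)
print(joined)  # some_mixed_string

# join accepts any iterable of strings, for example a generator expression.
upper_joined = '-'.join(w.upper() for w in words)
print(upper_joined)  # SOME-MIXED-STRING
```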

str.lower()

Returns a copy of the string with all cased characters converted to lowercase.

re.sub(pattern, repl, string, count=0, flags=0)

Returns the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string with repl. If the pattern is not found, string is returned unchanged. repl can be a string or a function. Backreferences such as \6 are replaced with the substring matched by group 6 in the pattern. In the example below, the first group matches myfunc, so \1 is replaced with myfunc during substitution, and py_ is followed by myfunc in the result.

Strings prefixed with 'r' are raw strings, in which backslashes are not treated in any special way. So r"\n" is a string containing the two characters '\' and 'n', while "\n" is a string containing a single newline character.

>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'
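To make the role of the r prefix concrete, here is a minimal sketch comparing a raw and a non-raw replacement string. The input 'fooBAR' is an arbitrary sample, not taken from the original snippet.

```python
import re

print(len('\n'))   # 1: a single newline character
print(len(r'\n'))  # 2: a backslash followed by 'n'

# In a raw replacement string, \1 reaches re.sub intact and is treated
# as a backreference to group 1.
spaced = re.sub('([A-Z]+)', r' \1', 'fooBAR')
print(repr(spaced))  # 'foo BAR'

# Without the r prefix, Python turns '\1' into the single character chr(1)
# before re.sub ever sees it, so the match is replaced by ' \x01'.
not_raw = re.sub('([A-Z]+)', ' \1', 'fooBAR')
print(repr(not_raw))  # 'foo \x01'
```

This is why snake writes its replacement as r' \1' rather than ' \1'.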

How snake works

First, let's analyze the innermost sub call in the snake function, starting with its arguments.

string is s.replace('-', ' '), which replaces every '-' in the input string with ' '.

pattern is '([A-Z]+)'. The parentheses (...) define a group: they match whatever regular expression is inside them, and once the match is complete the contents of the group can be retrieved, or referenced later with a \number escape sequence such as the \1 in the previous example. The group '([A-Z]+)' matches a run of one or more uppercase letters, and because + is greedy it takes the longest run available.
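The greedy behavior of the group can be checked with a short sketch. The sample strings are made up for illustration.

```python
import re

match = re.search('([A-Z]+)', 'abcDEFgh')
print(match.group(1))  # DEF: the whole uppercase run, not just 'D'
print(match.span(1))   # (3, 6)

# findall returns the text of group 1 for each non-overlapping match.
runs = re.findall('([A-Z]+)', 'abcDEF-Gh')
print(runs)  # ['DEF', 'G']
```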

repl is r' \1', which replaces the matched text with a space followed by that same text. In other words, it inserts a space before each match. For example, 'abcDEF' becomes 'abc DEF': sub('([A-Z]+)', r' \1', 'abcDEF') # 'abc DEF'

Therefore, the innermost sub call in snake takes the original string with every '-' replaced by ' ', finds each run of one or more consecutive uppercase letters, and inserts a space before it. For example, the original string 'abc-abcDEF-ABc' is converted by the first sub call to 'abc abc DEF  ABc' (note that 'ABc' is preceded by two spaces).

Next, let's analyze the second-layer sub call, again starting with its arguments.

string is the output of the previous sub call, which in the running example is 'abc abc DEF  ABc' (note that 'ABc' is preceded by two spaces).

pattern is '([A-Z][a-z]+)'. It is also a group. It matches one uppercase letter followed by one or more lowercase letters, again taking the longest possible substring.

repl is again r' \1', which inserts a space before the text matched by the group.

Thus, the second sub call finds each uppercase letter that is followed by one or more lowercase letters and inserts a space before it. Continuing the example, the input to this layer is 'abc abc DEF  ABc' (note that 'ABc' is preceded by two spaces) and the output is 'abc abc DEF  A Bc' (note that 'A' is preceded by two spaces). 'DEF' is left alone because none of its letters is followed by a lowercase letter.
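The two passes can be reproduced one step at a time. The sketch below assumes the running example input is 'abc-abcDEF-ABc'; the intermediate variable names are illustrative and do not appear in the original snippet.

```python
from re import sub

s = 'abc-abcDEF-ABc'  # assumed running example input

hyphens_removed = s.replace('-', ' ')
first_pass = sub('([A-Z]+)', r' \1', hyphens_removed)        # space before each uppercase run
second_pass = sub('([A-Z][a-z]+)', r' \1', first_pass)       # space before each capitalized word

print(repr(hyphens_removed))  # 'abc abcDEF ABc'
print(repr(first_pass))       # 'abc abc DEF  ABc'
print(repr(second_pass))      # 'abc abc DEF  A Bc'
```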

The snake function then splits the output of the second-layer sub call into a list of words with str.split. Because str.split is called without arguments, runs of several spaces produce no empty strings. The resulting words are joined with a '_' delimiter, and finally str.lower converts the joined string to lowercase. Continuing with the example above, the final output string is 'abc_abc_def_a_bc'.
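Putting the steps together, the same logic can be written one operation per line. This is only an unrolled sketch of the snippet for reading purposes; snake_verbose is a hypothetical name, and 'HTTPResponse' is an extra sample input not taken from the original examples.

```python
from re import sub

def snake_verbose(s):
    """Same steps as snake, one per line (hypothetical helper name)."""
    spaced = s.replace('-', ' ')                    # hyphens become spaces
    spaced = sub('([A-Z]+)', r' \1', spaced)        # space before each uppercase run
    spaced = sub('([A-Z][a-z]+)', r' \1', spaced)   # space before each capitalized word
    words = spaced.split()                          # split on whitespace, no empty strings
    return '_'.join(words).lower()                  # join with '_' and lowercase

print(snake_verbose('abc-abcDEF-ABc'))  # abc_abc_def_a_bc
print(snake_verbose('camelCase'))       # camel_case
print(snake_verbose('HTTPResponse'))    # http_response
```

The last call suggests why two passes are used: the first pass sees 'HTTPR' as a single uppercase run, and the second pass is what separates 'Response' from the acronym in front of it.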