Notes on Efficient Python Programming Tips (2): String Handling Problems and Solutions

1. How to split a string containing multiple delimiters

Actual cases:

We want to split a string into fields at its delimiters. The string contains several different delimiters, for example: s = 'ab;cd|efg|hi,jkl|mn\topq;rst,uvw\txyz', where ';', ',', '|' and '\t' are all delimiters. How do we handle this?

Solution:

Method 1: Use the str.split() method continuously, processing one split symbol at a time

Method 2: Use the re.split() method of regular expressions, recommended.

s = 'ab;cd|efg|hi,jkl|mn\topq;rst,uvw\txyz'
s.split('|,;')    # does not work: '|,;' is treated as one separator
s.split(';')
[ss.split('|') for ss in s.split(';')]    # nested list, one sublist per field
list(map(lambda ss: ss.split('|'), s.split(';')))
t = []
t.extend([1, 2, 3])    # extend() appends the items, keeping the list flat
t.extend([4, 5, 6])
sum([ss.split('|') for ss in s.split(';')], [])    # flatten the nested list

# Method 1: split on one delimiter at a time, flattening after each pass
def my_split(s, seps):
    res = [s]
    for sep in seps:
        t = []
        list(map(lambda ss: t.extend(ss.split(sep)), res))
        res = t
    return res
my_split(s, ',;|\t')

# The same idea as a one-liner
from functools import reduce
reduce(lambda l, sep: sum(map(lambda ss: ss.split(sep), l), []), ',;|\t', [s])
my_split2 = lambda s, seps: reduce(lambda l, sep: sum(map(lambda ss: ss.split(sep), l), []), seps, [s])
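Method 2 is recommended above but not shown. A minimal sketch of it, assuming the same sample string; the helper name split_fields and its default separator set are illustrative and not part of the original notes:

```python
import re

s = 'ab;cd|efg|hi,jkl|mn\topq;rst,uvw\txyz'

def split_fields(text, seps=',;|\t'):
    # Build a character class such as [,;|\t]. re.escape() guards any
    # separator that has a special meaning in a regular expression.
    pattern = '[' + re.escape(seps) + ']+'
    return re.split(pattern, text)

fields = split_fields(s)
print(fields)
```

The trailing + treats a run of consecutive delimiters as one separator. Without it, re.split() would return empty strings between adjacent delimiters, which is also what my_split() does.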


2. How to check whether string A begins or ends with string B

Discussion questions:

How do I determine whether string A begins or ends with string B?

Actual cases:

A file directory contains a series of files:

 quicksort.c
 graph.p
 heap.java
 install.sh
 stack.cpp

Write a program that adds the user-executable permission to all of the .sh and .py files.

Solution:

Use the str.startswith() and str.endswith() methods (note: to match several alternatives, pass them as a tuple).

fn = 'aaa.py'
fn.endswith('.py')
fn.endswith(('.sh', '.py'))    # several suffixes must be passed as a tuple
import os
os.listdir('.')
s = os.stat('b.py')
s.st_mode | 0o100              # 0o100 is the owner-execute bit
oct(s.st_mode | 0o100)
os.chmod('b.py', s.st_mode | 0o100)

import stat
stat.S_IXUSR                   # named constant for 0o100
for fn in os.listdir():
    if fn.endswith(('.py', '.sh')):
        fs = os.stat(fn)
        os.chmod(fn, fs.st_mode | stat.S_IXUSR)
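The loop above acts on the current directory. As a hedged sketch, the same steps can be wrapped in a function and tried in a temporary directory so that no real files are touched; the function name add_user_exec and the demo file names are assumptions made for illustration, and the permission bits behave as described on POSIX systems:

```python
import os
import stat
import tempfile

def add_user_exec(directory, suffixes=('.sh', '.py')):
    """Set the owner-execute bit on files whose names end with one of `suffixes`."""
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # endswith() accepts a tuple and returns True if any suffix matches.
        if name.endswith(suffixes) and os.path.isfile(path):
            mode = os.stat(path).st_mode
            os.chmod(path, mode | stat.S_IXUSR)
            changed.append(name)
    return changed

# Demo in a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('quicksort.c', 'install.sh', 'b.py'):
        with open(os.path.join(demo_dir, name), 'w') as f:
            f.write('')
    changed = add_user_exec(demo_dir)
    exec_bit_set = all(
        os.stat(os.path.join(demo_dir, name)).st_mode & stat.S_IXUSR
        for name in changed
    )
print(changed)
```

Using stat.S_IXUSR with a bitwise OR keeps every permission bit the file already has and only adds the one for owner execution.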

3. How to adjust the format of text in a string

Discussion questions:

How do I adjust the format of text in a string?

Actual cases:

The dates in a log file use the format 'yyyy-mm-dd'. We want to change them to the US format 'mm/dd/yyyy', for example '2019-07-23' => '07/23/2019'.

Solution:

Use the regular expression re.sub() method for the substitution. Capture each part of the date with a capture group in the pattern, then reorder the groups in the replacement string.

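# In the shell, pick a log file to work with:
#   ls /var/log
#   cat /var/log/dpkg.log.1
f = open('/var/log/dpkg.log.1')
log = f.read()
f.close()

import re
# Signature: re.sub(pattern, repl, string)
# Numbered groups: \1 = year, \2 = month, \3 = day
print(re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', log))
# The same substitution with named groups, which reads more clearly
print(re.sub(r'(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})', r'\g<m>/\g<d>/\g<y>', log))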
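The snippet above depends on a log file that may not exist on every machine. A self-contained sketch of the same substitution follows; the sample line is made up for illustration, and the group names y, m and d are a naming choice rather than anything the notes require:

```python
import re

# Named groups make the reordering in the replacement string easy to read.
DATE = re.compile(r'(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})')

def to_us_dates(text):
    # \g<name> refers back to the text captured by the named group.
    return DATE.sub(r'\g<m>/\g<d>/\g<y>', text)

sample = '2019-07-23 10:15:02 install python3:amd64'  # hypothetical log line
converted = to_us_dates(sample)
print(converted)
```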


4. How to concatenate multiple small strings into one large string

Discussion questions:

How do I concatenate several small strings into one large string?

Actual cases:

When designing a network program, we define a custom UDP-based protocol that passes a series of parameters to the server in a fixed order:

hwdetect:       "<0112>"
gxDepthBits:    "<32>"
gxResolution:   "<1024*768>"
gcRefresh:      "<60>"
fullAlpha:      "<1>"
lodDist:        "<100.00>"
DistCall:       "<500.00>"

In the program we collect the parameters into the list in order:

["<0112>", "<32>", "<1024*768>", "<60>", "<1>", "<100.00>", "<500.00>"]

Finally, we need to join all the parameters into a single packet to send:

"<0112><32><1024*768><60><1><100.00><500.00>"

Solution:

Method 1: Iterate over the list, concatenating each string in sequence using the “+” operation.

Method 2: Use str.join() to concatenate all strings in the list more quickly.

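l = ["<0112>", "<32>", "<1024*768>", "<60>", "<1>", "<100.00>", "<500.00>"]

# Method 1: concatenate with + in a loop
s = ''
for x in l:
    s += x

# Method 2: str.join(iterable) -> str
''.join(l)

# Compare the two in IPython (timeit is an IPython magic command)
from functools import reduce
timeit ''.join(l)
timeit reduce(str.__add__, l)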
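The timeit lines above only run inside IPython. A rough equivalent for plain Python, using the standard timeit module, might look like this; the function names are illustrative, the list is repeated 1000 times just to make the difference measurable, and the timings will vary by machine and interpreter:

```python
import timeit
from functools import reduce

parts = ["<0112>", "<32>", "<1024*768>", "<60>", "<1>", "<100.00>", "<500.00>"]

def concat_plus(items):
    # Method 1: each += may build a new, longer string.
    s = ''
    for x in items:
        s += x
    return s

def concat_join(items):
    # Method 2: join() builds the final string in one step.
    return ''.join(items)

def concat_reduce(items):
    # The same as method 1, written with reduce().
    return reduce(str.__add__, items)

packet = concat_join(parts)
print(packet)

many = parts * 1000  # a longer list so the timings are measurable
for fn in (concat_plus, concat_join, concat_reduce):
    elapsed = timeit.timeit(lambda: fn(many), number=100)
    print(f'{fn.__name__}: {elapsed:.4f} s')
```

All three functions return the same string. On long lists join() is usually the fastest, because it does not create an intermediate string for every element.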

5. How to align strings left, right, and center

Actual cases:

A dictionary stores a set of attribute values:

{
    "lodDist": 100.00,
    "SmallCull": 0.04,
    "DistCull": 500.00,
    "trilinear": 40,
    "farclip": 477,
}

In the program, we want to print its contents in a neatly aligned format. How do we do that?

    lodDist   : 100.00
    SmallCull : 0.04
    DistCull  : 500.00
    trilinear : 40
    farclip   : 477

Solution:

Method 1: Use the string methods str.ljust(), str.rjust() and str.center() to align left, right, or center.

Method 2: Use format(), passing format specs such as '<20', '>20' and '^20' to accomplish the same task.

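s = 'abc'
s.ljust(10)
'abc       '
s.rjust(10)
'       abc'
format(s, '<10')         # left-align in a field of width 10
format(s, '>10')         # right-align
format(s, '^10')         # center
format(123, '+')         # always show the sign
format(-123, '+')
format(-123, '>+10')
format(-123, '=+10')     # '=' puts the padding between the sign and the digits
format(-123, '0=+10')    # pad with zeros instead of spaces
format(+546, '0=+10')
d = {'lodDist': 100.00, 'SmallCull': 0.04, 'DistCull': 500.00, 'trilinear': 40, 'farclip': 477}
w = max(map(len, d.keys()))    # width of the longest key
for k, v in d.items():
    print(k.ljust(w), ':', v)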
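The same alignment can be written with the format-spec syntax from method 2. A small sketch, assuming an f-string with a nested width field; the helper name aligned_lines is illustrative:

```python
d = {'lodDist': 100.00, 'SmallCull': 0.04, 'DistCull': 500.00,
     'trilinear': 40, 'farclip': 477}

def aligned_lines(mapping):
    # Pad every key to the width of the longest one.
    width = max(map(len, mapping))
    # '<' left-aligns; the inner {width} supplies the field width.
    return [f'{key:<{width}} : {value}' for key, value in mapping.items()]

lines = aligned_lines(d)
print('\n'.join(lines))
```

Note that Python prints the float 100.00 as 100.0. Keeping two decimal places would need an explicit spec such as '.2f' on the value.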
    

6. How to remove unneeded characters from a string

Discussion questions:

How to remove unnecessary characters from a string?

Actual cases:

1. Filter out redundant white space characters in user input.

' [email protected]'

2. Filter out the '\r' characters in text edited on Windows:

'hello world\r\n'

3. Remove Unicode combining characters (such as tone marks) from text:

'nǐ hǎo, chī fàn'

Solution:

Method 1: Use the strip(), lstrip() and rstrip() methods to remove whitespace from the ends of a string.

s = '   hello world   '
s.strip()     # removes whitespace from both ends
s.lstrip()    # left end only
s.rstrip()    # right end only


Method 2: Delete a character at a fixed position by slicing and concatenating.

s2='abc:1234'
s2[:3]+s2[4:]
'abc1234'

Method 3: Use the string replace() method or the regular expression re.sub() to remove characters anywhere in the string.

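s3 = '   abc   xyz   '
s3.replace(' ', '')            # removes every space, not only those at the ends
s3 = ' \t abc \t xyz'
import re
re.sub('[ \t\n]+', '', s3)     # removes spaces, tabs and newlines in one pass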

Method 4: Use the string translate() method, which can delete several different characters at once.

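s = 'abc1234xyz'
s.translate({ord('a'): 'X'})                    # the table maps code points to replacements
s.maketrans('abcxyz', 'XYZABC')                 # build a table from two equal-length strings
s.translate(s.maketrans('abcxyz', 'XYZABC'))
s.translate({ord('a'): None})                   # mapping to None deletes the character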
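Case 2, removing '\r', is not shown in code above. One way to do it with translate() is the three-argument form of str.maketrans(), whose third argument lists characters to delete. A brief sketch, where the second sample string is made up for illustration:

```python
text = 'hello world\r\n'

# The third argument of str.maketrans() lists characters to delete.
drop_cr = str.maketrans('', '', '\r')
cleaned = text.translate(drop_cr)

# Several characters can be deleted in one pass.
drop_cr_tab = str.maketrans('', '', '\r\t')
cleaned_many = '\tabc\r\n\txyz\r\n'.translate(drop_cr_tab)

print(repr(cleaned))
print(repr(cleaned_many))
```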

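Case 3 (removing combining characters):

import unicodedata
# Decomposed form (NFD): each accented letter becomes a base letter plus a combining mark
s4 = unicodedata.normalize('NFD', 'nǐ hǎo, chī fàn')
c = 'a\u0300'                  # 'a' followed by a combining grave accent
len(c)
unicodedata.combining(c[1])    # non-zero for a combining character
[ord(c) for c in s4 if unicodedata.combining(c)]
dict.fromkeys([ord(c) for c in s4 if unicodedata.combining(c)], None)
s4.translate(dict.fromkeys([ord(c) for c in s4 if unicodedata.combining(c)], None))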
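The steps above can be collected into one function. This is a sketch under one assumption worth stating: the text is first normalized to NFD, since precomposed letters such as 'ǐ' contain no separate combining character until they are decomposed. The function name strip_combining is illustrative:

```python
import unicodedata

def strip_combining(text):
    # Decompose accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize('NFD', text)
    # Map every combining character found in the text to None (delete).
    table = dict.fromkeys(
        (ord(ch) for ch in decomposed if unicodedata.combining(ch)), None)
    return decomposed.translate(table)

# 'nǐ hǎo, chī fàn' written with escapes to avoid encoding problems
plain = strip_combining('n\u01d0 h\u01ceo, ch\u012b f\u00e0n')
print(plain)
```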
