Here is one approach that uses zero-width lookarounds to isolate each name:

import re

string = "555-1239Moe Szyslak(636) 555-0113Burns, C. Montgomery555 -6542Rev. Timothy Lovejoy555 8904Ned Flanders636-555-3226Simpson, Homer5553642Dr. Julius Hibbert"
result = re.findall(r'(?:(?<=^)|(?<=[^A-Za-z.,]))[A-Za-z.,]+(?: [A-Za-z.,]+)*(?:(?=[^A-Za-z.,])|(?=$))', string)
print(result)
['Moe Szyslak', 'Burns, C. Montgomery', 'Rev. Timothy Lovejoy', 'Ned Flanders',
'Simpson, Homer', 'Dr. Julius Hibbert']
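The actual pattern being matched is:

[A-Za-z.,]+(?: [A-Za-z.,]+)*

This says to match one or more uppercase or lowercase letters, dots, or commas, followed by a space and one or more of the same characters, with that space-plus-word group repeated zero or more times.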
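As a quick sanity check of the core pattern on its own, here is a small sketch using `re.fullmatch`. The variable names `core`, `samples`, and `matches` are only for illustration; the sample strings are pieces of the input above.

```python
import re

# Core pattern: one "word" made of letters, dots, or commas,
# followed by zero or more further words, each preceded by a single space.
core = r'[A-Za-z.,]+(?: [A-Za-z.,]+)*'

samples = ['Burns, C. Montgomery', 'Dr. Julius Hibbert', '555-1239', '(636) 555-0113']

# fullmatch succeeds only if the whole string fits the pattern
matches = {s: bool(re.fullmatch(core, s)) for s in samples}
print(matches)
```

The two names fit the pattern in full, while the phone-number fragments do not, because digits, hyphens, and parentheses are outside the character class.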
Additionally, we use the following lookarounds on the left and right sides of this pattern:

(?:(?<=^)|(?<=[^A-Za-z.,]))
Lookbehind and assert either the start of the string or a non-matching character.

(?:(?=[^A-Za-z.,])|(?=$))
Lookahead and assert either the end of the string or a non-matching character.
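For what it's worth, the same two boundary checks can probably be spelled more compactly with negative lookarounds: "start of string or a non-matching character before" becomes "no matching character before", and likewise on the right. The sketch below is an alternative spelling, not the pattern used above. The names `NAME_CHAR`, `core`, `original`, and `negative` are made up for the example, and the equivalence is only checked against this one sample string.

```python
import re

string = ("555-1239Moe Szyslak(636) 555-0113Burns, C. Montgomery555 -6542"
          "Rev. Timothy Lovejoy555 8904Ned Flanders636-555-3226"
          "Simpson, Homer5553642Dr. Julius Hibbert")

NAME_CHAR = r'[A-Za-z.,]'                    # characters allowed inside a name
core = rf'{NAME_CHAR}+(?: {NAME_CHAR}+)*'    # space-separated words of such characters

# Pattern from the answer: positive lookarounds with explicit start/end alternatives
original = rf'(?:(?<=^)|(?<=[^A-Za-z.,])){core}(?:(?=[^A-Za-z.,])|(?=$))'

# Alternative (assumption): negative lookarounds, which also pass at string boundaries
negative = rf'(?<!{NAME_CHAR}){core}(?!{NAME_CHAR})'

names_original = re.findall(original, string)
names_negative = re.findall(negative, string)
print(names_original)
print(names_original == names_negative)
```

On this input both patterns return the same six names. The negative form needs no special case for `^` or `$`, because a negative lookaround succeeds when there is no character to inspect.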