Regular Expressions

JavaScript Regular Expressions: In-Depth Guide and Practice

  • Symbols
Symbol      Description
. (period)  Matches any single character except line breaks.
*           Matches the preceding expression 0 or more times.
+           Matches the preceding expression 1 or more times.
?           Makes the preceding expression optional (matches 0 or 1 times).
^           Matches the beginning of the string.
$           Matches the end of the string.
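
As a quick check of the symbols above, the following sketch runs one small pattern per symbol through RegExp.prototype.test. The sample strings and the names samples and results are made up for illustration.

```javascript
// Each pair is [pattern, input]; the patterns use only the symbols listed above.
const samples = [
  [/c.t/, 'cat'],       // . matches exactly one character
  [/c.t/, 'ct'],        // . cannot match zero characters
  [/ab*c/, 'ac'],       // * allows zero occurrences of b
  [/ab+c/, 'ac'],       // + requires at least one b
  [/ab+c/, 'abbbc'],    // + also accepts several b's
  [/colou?r/, 'color'], // ? makes the u optional
  [/^cat/, 'concat'],   // ^ anchors the match to the start of the string
  [/cat$/, 'concat'],   // $ anchors the match to the end of the string
];

const results = samples.map(([regex, input]) => regex.test(input));

samples.forEach(([regex, input], i) => {
  console.log(`${regex} on "${input}": ${results[i]}`);
});
// results is [true, false, true, false, true, true, false, true]
```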
  • Character groups
Symbol   Description
\d       Matches any single digit character.
\w       Matches any word character (alphanumeric and underscore).
[XYZ]    Character set: matches any single character listed within the brackets. A range such as [A-Z] also works.
[XYZ]+   Matches one or more of any of the characters in the set.
[^a-z]   Inside a character set, a leading ^ negates the set. In this example, it matches any character that is NOT a lowercase letter.
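
The character groups above can be tried out with String.prototype.match. This is only a small sketch; the input strings and variable names are invented for illustration, and [0] picks the matched text out of the result array.

```javascript
// \d matches one digit; \d+ extends it to a run of digits.
const oneDigit = 'Order 66 shipped'.match(/\d/)[0];   // '6'
const digitRun = 'Order 66 shipped'.match(/\d+/)[0];  // '66'

// \w covers letters, digits and underscore, but not punctuation.
const word = 'hello_world!'.match(/\w+/)[0];          // 'hello_world'

// A character set matches one character from the set; + repeats it.
const oneOfSet = 'abcXYYZdef'.match(/[XYZ]/)[0];      // 'X'
const runOfSet = 'abcXYYZdef'.match(/[XYZ]+/)[0];     // 'XYYZ'

// A negated set matches anything outside it, including uppercase letters and digits.
const notLower = 'abcDEF123xyz'.match(/[^a-z]+/)[0];  // 'DEF123'

console.log(oneDigit, digitRun, word, oneOfSet, runOfSet, notLower);
```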
  • Flags: JavaScript regular expressions accept several optional flags. They can be used separately or together and are placed after the closing slash. Example: /[A-Z]/g. I'll only be introducing two here.
Symbol   Description
g        Global search: finds every match instead of stopping after the first one.
i        Case-insensitive search.
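
To see what the two flags change, the sketch below runs the same pattern with and without them. The sample text and variable names are made up for illustration.

```javascript
const text = 'cat Cat cat CAT';

// No flags: only the first case-sensitive match is returned.
const firstExact = text.match(/cat/)[0];  // 'cat'

// g: every case-sensitive match.
const allExact = text.match(/cat/g);      // ['cat', 'cat']

// i: the first match, ignoring case.
const firstAnyCase = 'Cat cat'.match(/cat/i)[0]; // 'Cat'

// g and i combined: every match, ignoring case.
const allAnyCase = text.match(/cat/gi);   // ['cat', 'Cat', 'cat', 'CAT']

console.log(firstExact, allExact, firstAnyCase, allAnyCase);
```

Note that with the g flag, match returns a plain array of matched strings without capture groups or index information.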
  • Advanced
Symbol   Description
(x)      Capturing parentheses: matches x and remembers it so it can be used later.
(?:x)    Non-capturing parentheses: matches x but does not remember it.
x(?=y)   Lookahead: matches x only if it is followed by y.
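
The three constructs above might be exercised as follows. This is a minimal sketch; the date and price strings are invented sample data.

```javascript
const date = '2024-05-17';

// Capturing groups are remembered: they appear in the match array
// and can be referenced as $1, $2, $3 in a replacement string.
const parts = date.match(/(\d+)-(\d+)-(\d+)/);
const year = parts[1];                                            // '2024'
const reordered = date.replace(/(\d+)-(\d+)-(\d+)/, '$3/$2/$1');  // '17/05/2024'

// A non-capturing group still groups (so + applies to "ab" as a unit)
// but adds no entry to the match array.
const grouped = 'abab-cd'.match(/(?:ab)+-(cd)/);
// grouped[0] is 'abab-cd', grouped[1] is 'cd', grouped.length is 2

// Lookahead: the digits must be followed by " EUR",
// but " EUR" itself is not part of the match.
const euros = '100 USD, 200 EUR'.match(/\d+(?= EUR)/)[0];         // '200'

console.log(year, reordered, grouped.length, euros);
```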

Matching modes

Global mode

Sticky mode

Sticky mode is commonly used for tasks such as tokenizing a statement, where the position of each match must be specified strictly:

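function tokenize(TOKEN_REGEX, str) {
  const result = [];
  let match;
  // TOKEN_REGEX must carry the g or y flag: exec() then resumes from
  // TOKEN_REGEX.lastIndex on each call and returns null when nothing matches.
  while ((match = TOKEN_REGEX.exec(str))) {
    result.push(match[1]); // group 1 is the token without surrounding whitespace
  }
  return result;
}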

const TOKEN_GY = /\s*(\+|[0-9]+)\s*/gy;
const TOKEN_G = /\s*(\+|[0-9]+)\s*/g;
> tokenize(TOKEN_GY, '3 + 4')
[ '3', '+', '4' ]
> tokenize(TOKEN_G, '3 + 4')
[ '3', '+', '4' ]

> tokenize(TOKEN_GY, '3x + 4')
[ '3' ]
> tokenize(TOKEN_G, '3x + 4')
[ '3', '+', '4' ]
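
Because a sticky match must start exactly at lastIndex, a failed match pinpoints the first character that is not part of any token. The sketch below builds on that idea; tokenizeStrict is a hypothetical helper, not part of the original example, and it reuses the same token pattern with only the y flag. It tracks the position itself because a failed exec() resets lastIndex to 0.

```javascript
// Hypothetical strict tokenizer: throws instead of silently stopping
// (as with /gy) or skipping over unknown characters (as with /g).
function tokenizeStrict(str) {
  const TOKEN = /\s*(\+|[0-9]+)\s*/y;
  const tokens = [];
  let pos = 0;
  while (pos < str.length) {
    TOKEN.lastIndex = pos;          // the match must begin exactly here
    const match = TOKEN.exec(str);
    if (match === null) {
      throw new SyntaxError(`Unexpected character "${str[pos]}" at index ${pos}`);
    }
    tokens.push(match[1]);
    pos = TOKEN.lastIndex;          // continue right after the matched token
  }
  return tokens;
}

console.log(tokenizeStrict('3 + 4')); // [ '3', '+', '4' ]

try {
  tokenizeStrict('3x + 4');
} catch (err) {
  console.log(err.message);         // Unexpected character "x" at index 1
}
```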