JavaScript Regular Expressions: A Detailed Guide with Practical Examples
| Symbol | Description |
| --- | --- |
| `.` | (period) Matches any single character except line breaks. |
| `*` | Matches the preceding expression 0 or more times. |
| `+` | Matches the preceding expression 1 or more times. |
| `?` | Makes the preceding expression optional (matches 0 or 1 times). |
| `^` | Matches the beginning of the string. |
| `$` | Matches the end of the string. |
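
The symbols in the table above can be tried out with `RegExp.prototype.test`. The following is a minimal sketch; the sample strings and variable names are made up for illustration.

```javascript
// `.` matches any single character except a line break.
const dotMatchesLetter = /^h.t$/.test("hat");    // true
const dotMatchesNewline = /^h.t$/.test("h\nt");  // false

// `*` allows zero or more repetitions of the preceding expression.
const starZeroTimes = /^ab*c$/.test("ac");       // true (zero b's)
const starManyTimes = /^ab*c$/.test("abbbc");    // true

// `+` requires at least one repetition.
const plusZeroTimes = /^ab+c$/.test("ac");       // false
const plusOneTime = /^ab+c$/.test("abc");        // true

// `?` makes the preceding expression optional.
const optionalAbsent = /^colou?r$/.test("color");    // true
const optionalPresent = /^colou?r$/.test("colour");  // true

// `^` and `$` anchor the match to the start and end of the string.
const unanchored = /cat/.test("concatenate");    // true
const anchored = /^cat$/.test("concatenate");    // false

console.log({
  dotMatchesLetter, dotMatchesNewline,
  starZeroTimes, starManyTimes,
  plusZeroTimes, plusOneTime,
  optionalAbsent, optionalPresent,
  unanchored, anchored,
});
```
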
| Symbol | Description |
| --- | --- |
| `\d` | Matches any single digit character. |
| `\w` | Matches any word character (alphanumeric and underscore). |
| `[XYZ]` | Character set: matches any single character from the set within the brackets. You can also use a range such as `[A-Z]`. |
| `[XYZ]+` | Matches one or more of any of the characters in the set. |
| `[^a-z]` | Inside a character set, `^` is used for negation. In this example, it matches any character that is NOT a lowercase letter. |
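
To see how these character classes behave on real input, here is a small sketch using `String.prototype.match` with the `g` flag so that every match is returned. The input strings are invented for this example.

```javascript
// \d+ : runs of digits
const digits = "Order 66 shipped in 2024".match(/\d+/g);
// \w+ : runs of word characters (letters, digits, underscore); "-" splits a run
const words = "snake_case, kebab-case".match(/\w+/g);
// [aeiou] : any single character from the set
const vowels = "regex".match(/[aeiou]/g);
// [A-Z]+ : one or more characters from the range A to Z
const upperRuns = "NASA and ESA".match(/[A-Z]+/g);
// [^a-z] : any single character that is NOT a lowercase letter
const nonLower = "abC-1".match(/[^a-z]/g);

console.log(digits);    // [ '66', '2024' ]
console.log(words);     // [ 'snake_case', 'kebab', 'case' ]
console.log(vowels);    // [ 'e', 'e' ]
console.log(upperRuns); // [ 'NASA', 'ESA' ]
console.log(nonLower);  // [ 'C', '-', '1' ]
```
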
- Flags: ES2015 defines five optional flags (g, i, m, u, y), and later editions of the language add more. They can be used separately or together and are placed after the closing slash, for example /[A-Z]/g. I'll introduce only two here.
| Symbol | Description |
| --- | --- |
| `g` | Global search |
| `i` | Case-insensitive search |
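
The effect of the two flags is easiest to see side by side. The sketch below uses an invented sample string and shows how `g` and `i` change the results of `match` and `replace`.

```javascript
const text = "Cat, cat, CAT";

// No flags: only the first exact-case match is found.
const first = text.match(/cat/);
// g: every exact-case match is returned.
const allExact = text.match(/cat/g);
// g and i together: every match, ignoring case.
const allAnyCase = text.match(/cat/gi);

// Without g, replace() changes only the first match.
const replacedOnce = text.replace(/cat/i, "dog");
// With g, replace() changes every match.
const replacedAll = text.replace(/cat/gi, "dog");

console.log(first[0], first.index); // cat 5
console.log(allExact);              // [ 'cat' ]
console.log(allAnyCase);            // [ 'Cat', 'cat', 'CAT' ]
console.log(replacedOnce);          // dog, cat, CAT
console.log(replacedAll);           // dog, dog, dog
```
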
| Symbol | Description |
| --- | --- |
| `(x)` | Capturing parentheses: matches x and remembers it so we can use it later. |
| `(?:x)` | Non-capturing parentheses: matches x and does not remember it. |
| `x(?=y)` | Lookahead: matches x only if it is followed by y. |
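
The difference between the three group types shows up in what ends up in the match result. This is an illustrative sketch with made-up inputs, not an exhaustive treatment.

```javascript
// Capturing groups: each (…) is remembered and exposed by index.
const dateMatch = /(\d+)-(\d+)/.exec("2024-05");
const year = dateMatch[1];   // "2024"
const month = dateMatch[2];  // "05"

// Non-capturing group: (?:ab)+ groups "ab" for repetition but stores nothing,
// so the digit is capture group 1.
const repeated = /(?:ab)+(\d)/.exec("ababab7");

// Lookahead: match digits only when "px" follows; "px" itself is not
// part of the match.
const pixelValues = "10px 20em 30px".match(/\d+(?=px)/g);

console.log(year, month);                    // 2024 05
console.log(repeated[0], repeated[1]);       // ababab7 7
console.log(repeated.length);                // 2 (full match + one capture)
console.log(pixelValues);                    // [ '10', '30' ]
```
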
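Match modes
Global mode
Sticky mode
Sticky mode is often used for tasks such as tokenizing statements, where the position of each match must be strictly specified: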
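    // Collect the first capture group of every match until exec() returns null.
    function tokenize(TOKEN_REGEX, str) {
      let result = [];
      let match;
      while ((match = TOKEN_REGEX.exec(str))) {
        result.push(match[1]);
      }
      return result;
    }

    // Same pattern twice: with the sticky flag (gy) and without it (g).
    const TOKEN_GY = /\s*(\+|[0-9]+)\s*/gy;
    const TOKEN_G = /\s*(\+|[0-9]+)\s*/g;

    > tokenize(TOKEN_GY, '3 + 4')
    [ '3', '+', '4' ]
    > tokenize(TOKEN_G, '3 + 4')
    [ '3', '+', '4' ]

    // With y, matching stops at the illegal 'x'; with g alone, it is skipped.
    > tokenize(TOKEN_GY, '3x + 4')
    [ '3' ]
    > tokenize(TOKEN_G, '3x + 4')
    [ '3', '+', '4' ]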
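
Because a sticky regex only matches exactly at `lastIndex`, a failed match pinpoints where the input stops being valid. The sketch below builds on that idea. `tokenizeStrict` is a hypothetical helper, not part of the original example, and it assumes the same token pattern as above (with only the `y` flag).

```javascript
// Hypothetical helper: throws instead of silently stopping when the
// sticky regex cannot match at the current position.
function tokenizeStrict(str) {
  // y (sticky): a match must start exactly at lastIndex.
  const tokenRegex = /\s*(\+|[0-9]+)\s*/y;
  const tokens = [];
  let position = 0;
  while (position < str.length) {
    tokenRegex.lastIndex = position;
    const match = tokenRegex.exec(str);
    if (match === null) {
      throw new SyntaxError(
        `Unexpected character at index ${position}: ${str[position]}`
      );
    }
    tokens.push(match[1]);
    // Every match consumes at least one character, so the loop advances.
    position = tokenRegex.lastIndex;
  }
  return tokens;
}

const validTokens = tokenizeStrict("3 + 4");

let errorMessage = null;
try {
  tokenizeStrict("3x + 4");
} catch (err) {
  errorMessage = err.message;
}

console.log(validTokens);  // [ '3', '+', '4' ]
console.log(errorMessage); // Unexpected character at index 1: x
```

With a plain `g` regex the same loop would skip over the `x` and keep going, so the sticky flag is what makes the error position reliable here.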