Regular Expression Extensions
ECMAScript 6 (ES2015) and the editions that followed significantly enhanced regular expressions. ES6 itself added the u and y modifiers, the flags property, and the well-known symbols behind the string matching methods. ES2018 added Unicode property escapes, named capture groups, lookbehind assertions, and the s modifier, and ES2022 added match indices. Together these make regular expressions more powerful and flexible when handling complex text.
Unicode Property Escapes
ES2018 introduced the \p{...} and \P{...} syntax, which matches characters by Unicode property (\P{...} is the negated form). It requires the u modifier:
// Match a character from the Greek script
const greekRegex = /\p{Script=Greek}/u;
console.log(greekRegex.test('π')); // true
// Match any Unicode punctuation character (ASCII punctuation included)
const punctuationRegex = /\p{P}/u;
console.log(punctuationRegex.test('!')); // true
Property categories include:
Script: Classified by writing system (e.g., Han, Latin)
General_Category: Classified by character type (e.g., Letter, Number)
White_Space: Whitespace characters
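As a rough illustration of the three categories, the sketch below exercises one property from each, plus the negated \P{...} form. The variable names and sample strings are made up for this example; non-ASCII characters are written as escapes so the snippet does not depend on file encoding.

```javascript
// Script: matches any character from the Han script.
const hanRegex = /\p{Script=Han}/u;
console.log(hanRegex.test('\u6F22')); // true (U+6F22 is a Han ideograph)
console.log(hanRegex.test('a')); // false

// General_Category: \p{L} is shorthand for \p{General_Category=Letter}.
const lettersOnly = /^\p{L}+$/u;
console.log(lettersOnly.test('na\u00EFve')); // true
console.log(lettersOnly.test('abc123')); // false

// White_Space is a binary property, so it takes no value.
const whitespaceRegex = /\p{White_Space}/u;
console.log(whitespaceRegex.test('\u00A0')); // true (no-break space)

// \P{...} negates the property: remove everything that is not a letter.
const stripped = 'a1-b2'.replace(/\P{L}/gu, '');
console.log(stripped); // "ab"
```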
Named Capture Groups
Traditional capture groups are accessed via numeric indices; ES2018 allows capture groups to be named:
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = dateRegex.exec('2023-05-21');
console.log(match.groups.year); // "2023"
console.log(match.groups.month); // "05"
console.log(match.groups.day); // "21"
In replacement strings, named groups can be referenced via $<name>:
'2023-05-21'.replace(dateRegex, '$<day>/$<month>/$<year>');
// Returns "21/05/2023"
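Named groups can also be used in two other places: inside the pattern itself through a \k<name> backreference, and in a replacer function, which receives the groups object as its last argument. A minimal sketch, with hypothetical variable names and sample strings:

```javascript
// \k<word> must match the same text that the group "word" captured.
const repeatedWord = /\b(?<word>\w+) \k<word>\b/;
const repeated = repeatedWord.exec('it is is fine');
console.log(repeated.groups.word); // "is"

// A replacer function receives the groups object as its final argument.
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const usDate = '2023-05-21'.replace(isoDate, (...args) => {
  const { year, month, day } = args[args.length - 1];
  return `${month}/${day}/${year}`;
});
console.log(usDate); // "05/21/2023"
```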
Lookbehind Assertions
ES2018 added the (?<=...) positive lookbehind and (?<!...) negative lookbehind assertions:
// Match amounts preceded by a dollar sign
const amountRegex = /(?<=\$)\d+(\.\d{2})?/;
console.log(amountRegex.exec('Price: $42')[0]); // "42"
// Match numbers not preceded by a dollar sign
const noDollarRegex = /(?<!\$)\d+/;
console.log(noDollarRegex.exec('€100')[0]); // "100"
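One thing to watch for with the negative form: the assertion only inspects the text immediately before the candidate match, so the engine is free to start the match one character later. The sketch below shows the effect and one possible workaround; the name strictNoDollar is illustrative only.

```javascript
// The match cannot start right after "$", but it can start after the "1".
console.log(/(?<!\$)\d+/.exec('$100')[0]); // "00"

// Also forbidding a preceding digit prevents starting mid-number.
const strictNoDollar = /(?<![$\d])\d+/;
console.log(strictNoDollar.exec('$100')); // null
console.log(strictNoDollar.exec('\u20AC100')[0]); // "100" (U+20AC is the euro sign)
```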
dotAll Mode
The s modifier (ES2018) lets the dot . match any character, including line terminators such as \n:
const multilineRegex = /foo.bar/s;
console.log(multilineRegex.test('foo\nbar')); // true
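For comparison, a short sketch of the same pattern with and without the modifier, along with the dotAll property that reflects it and the [\s\S] idiom commonly used before the modifier existed. Variable names are illustrative.

```javascript
const withoutS = /foo.bar/;
const withS = /foo.bar/s;

console.log(withoutS.test('foo\nbar')); // false: "." skips line terminators by default
console.log(withS.test('foo\nbar')); // true
console.log(withS.dotAll); // true
console.log(withoutS.dotAll); // false

// Older workaround: a character class that matches everything.
console.log(/foo[\s\S]bar/.test('foo\nbar')); // true
```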
Sticky Matching
The y modifier enables sticky matching: a match must begin exactly at the position given by lastIndex in the target string:
const stickyRegex = /a+/y;
let str = 'aaa_aaa';
stickyRegex.lastIndex = 0;
console.log(stickyRegex.exec(str)[0]); // "aaa"
stickyRegex.lastIndex = 4;
console.log(stickyRegex.exec(str)[0]); // "aaa"
stickyRegex.lastIndex = 3; // Index 3 is "_", so no match can start here
console.log(stickyRegex.exec(str)); // null
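Sticky matching is often used for tokenizing, because every token has to begin exactly where the previous one ended. The following is a minimal sketch under assumed rules (integers and the four arithmetic operators, with optional leading whitespace); tokenize and its token shape are hypothetical, not part of any standard API.

```javascript
function tokenize(input) {
  // Optional whitespace, then either a number or an operator, anchored at lastIndex.
  const tokenRegex = /\s*(?:(?<number>\d+)|(?<operator>[+\-*\/]))/y;
  const tokens = [];
  let position = 0;
  while (position < input.length) {
    tokenRegex.lastIndex = position;
    const match = tokenRegex.exec(input);
    if (match === null) {
      // A failed sticky match resets lastIndex to 0, so report the saved position.
      throw new SyntaxError(`Unexpected input at index ${position}`);
    }
    const { number, operator } = match.groups;
    tokens.push(number !== undefined
      ? { type: 'number', value: number }
      : { type: 'operator', value: operator });
    position = tokenRegex.lastIndex;
  }
  return tokens;
}

console.log(tokenize('12 + 3*4').map(token => token.value)); // ["12", "+", "3", "*", "4"]
```

Without the y modifier, the regex would silently skip over unexpected characters instead of reporting them. This sketch does not accept trailing whitespace.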
flags Property
ES6 added the flags property, which returns the modifiers of a regular expression as a string (in a fixed canonical order, not the order written in the source):
const re = /foo/ig;
console.log(re.flags); // "gi"
RegExp Constructor Extensions
Since ES6, the RegExp constructor accepts an existing regex object together with a second argument that replaces its modifiers:
const re1 = /foo/i;
const re2 = new RegExp(re1, 'g');
console.log(re2.toString()); // "/foo/g"
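The flags property and the extended constructor combine naturally when a modifier should be added to a copy without losing the existing ones. A small sketch; withGlobalFlag is a hypothetical helper name.

```javascript
// Returns a copy of the regex with "g" added, keeping all existing modifiers.
function withGlobalFlag(regex) {
  const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
  return new RegExp(regex, flags);
}

const original = /foo/i;
const globalCopy = withGlobalFlag(original);

console.log(globalCopy.flags); // "gi"
console.log(original.flags); // "i" (the original is left unchanged)
console.log('Foo foo'.replace(globalCopy, 'bar')); // "bar bar"
```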
String Matching Method Adjustments
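String methods like match, replace, search, and split now delegate to the method stored under the corresponding well-known symbol (Symbol.match, Symbol.replace, Symbol.search, Symbol.split) on their argument, so any object that implements these methods can act as a matcher: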
class CustomMatcher {
  [Symbol.match](string) {
    return string.includes('foo') ? ['foo'] : null;
  }
}
console.log('barfoo'.match(new CustomMatcher())); // ["foo"]
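The other symbols work the same way. A brief sketch with two hypothetical classes, one implementing Symbol.split and one implementing Symbol.search:

```javascript
class CommaSplitter {
  // Called by String.prototype.split with the string being split.
  [Symbol.split](string) {
    return string.split(',').map(part => part.trim());
  }
}

class DigitFinder {
  // Called by String.prototype.search; returns the index of the first digit, or -1.
  [Symbol.search](string) {
    return string.search(/\d/);
  }
}

console.log('a, b ,c'.split(new CommaSplitter())); // ["a", "b", "c"]
console.log('abc7'.search(new DigitFinder())); // 3
```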
Unicode Case Folding
In u mode, case-insensitive matching uses Unicode case folding rather than simple uppercasing:
console.log(/[a-z]/i.test('\u212A')); // false
console.log(/[a-z]/iu.test('\u212A')); // true (U+212A is the Kelvin sign, which case-folds to "k")
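The same folding applies to other characters. The sketch below uses the long s (U+017F), picked here purely for illustration, and shows a practical consequence for validation patterns; asciiLetters is a hypothetical name.

```javascript
// U+017F (LATIN SMALL LETTER LONG S) case-folds to "s" under Unicode rules.
const longS = '\u017F';
console.log(/s/i.test(longS)); // false
console.log(/s/iu.test(longS)); // true

// Consequence: an "ASCII letters only" check written with both i and u
// also accepts these non-ASCII characters.
const asciiLetters = /^[a-z]+$/iu;
console.log(asciiLetters.test('\u212Aelvin')); // true
console.log(/^[a-zA-Z]+$/u.test('\u212Aelvin')); // false
```

When strict ASCII matching is the goal, listing both cases explicitly and dropping the i modifier avoids the surprise.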
Regular Expression Subclassing
ES6 classes can extend RegExp to create custom regex classes:
class MyRegExp extends RegExp {
  exec(str) {
    const result = super.exec(str);
    if (result) result.push('extra');
    return result;
  }
}
const myRe = new MyRegExp('\\d+');
console.log(myRe.exec('123')); // ["123", "extra"]
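An overridden exec is also picked up by the methods that are specified in terms of it, such as test and String.prototype.match. A minimal sketch, using a hypothetical CountingRegExp class that records how often exec runs:

```javascript
class CountingRegExp extends RegExp {
  constructor(pattern, flags) {
    super(pattern, flags);
    this.execCount = 0; // illustrative bookkeeping field
  }

  exec(str) {
    this.execCount += 1;
    return super.exec(str);
  }
}

const counter = new CountingRegExp('\\d+');
counter.test('abc 123'); // test() delegates to exec()
'abc 123'.match(counter); // a non-global match() delegates to exec() once

console.log(counter.execCount); // 2
console.log(counter instanceof RegExp); // true
```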
Match Indices (ES2022)
The d modifier adds an indices array to the match result, recording the start and end position (end exclusive) of the full match and of each capture group:
const re = /(a+)(b+)/d;
const match = re.exec('aaabb');
console.log(match.indices[0]); // [0, 5]
console.log(match.indices[1]); // [0, 3]
console.log(match.indices[2]); // [3, 5]
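With named groups, the same positions are available by name under indices.groups. A short sketch, assuming a runtime with ES2022 support; the pattern and variable names are illustrative.

```javascript
const keyValue = /(?<key>\w+)=(?<value>\w+)/d;
const source = 'mode=fast';
const result = keyValue.exec(source);

// indices.groups maps each group name to its [start, end) pair.
const [start, end] = result.indices.groups.value;
console.log(start, end); // 5 9
console.log(source.slice(start, end)); // "fast"
console.log(keyValue.hasIndices); // true
```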
Practical Examples
Processing complex log formats:
const logRegex = /^(?<time>\d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<message>.+?)(?: \((?<file>.+?):(?<line>\d+)\))?$/;
const logLine = '14:35:22 [ERROR] Failed to load module (app.js:42)';
const { groups } = logRegex.exec(logLine);
console.log(groups);
// {
// time: "14:35:22",
// level: "ERROR",
// message: "Failed to load module",
// file: "app.js",
// line: "42"
// }
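The destructuring above assumes the line matches; exec returns null otherwise, and destructuring null throws a TypeError. One possible way to guard against that is sketched below. parseLogLine and its return shape are hypothetical, and the pattern is copied from the example above.

```javascript
const logPattern = /^(?<time>\d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<message>.+?)(?: \((?<file>.+?):(?<line>\d+)\))?$/;

function parseLogLine(text) {
  const match = logPattern.exec(text);
  if (match === null) return null; // not a recognizable log line

  const { time, level, message, file, line } = match.groups;
  return {
    time,
    level,
    message,
    // Groups inside the optional location part are undefined when it is absent.
    file: file === undefined ? null : file,
    line: line === undefined ? null : Number(line),
  };
}

console.log(parseLogLine('14:35:22 [WARN] Cache miss'));
// { time: "14:35:22", level: "WARN", message: "Cache miss", file: null, line: null }
console.log(parseLogLine('not a log line')); // null
```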
Extracting Markdown links:
function extractLinks(markdown) {
  const linkRegex = /\[(?<text>[^\]]+)\]\((?<url>[^)]+)\)/g;
  return [...markdown.matchAll(linkRegex)].map(m => m.groups);
}
const links = extractLinks('See [Google](https://google.com) or [GitHub](https://github.com)');
console.log(links);
// [
// {text: "Google", url: "https://google.com"},
// {text: "GitHub", url: "https://github.com"}
// ]