阿里云主机折上折
  • 微信号
Current Site:Index > New regular expression features

New regular expression features

Author:Chuan Chen 阅读数:63751人阅读 分类: JavaScript

ECMAScript 15 introduced several enhancements to regular expressions, including destructuring assignment for named capture groups, the d flag for match indices, the v flag for pattern extensions, and more. These features significantly improve the readability and functionality of regular expressions.

Destructuring Assignment for Named Capture Groups

Named capture groups now support direct destructuring assignment, simplifying the process of extracting data from match results. ES2018 introduced the named capture group syntax (?<name>pattern), and ES15 allows directly referencing group names during destructuring:

const text = "2023-04-01";
const { year, month, day } = text.match(
  /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
).groups;

console.log(year);  // "2023"
console.log(month); // "04"
console.log(day);   // "01"

When a regular expression might return null, a fallback with an empty object can be used:

const { groups: { year = "1970" } = {} } = 
  "Invalid".match(/(?<year>\d{4})/) || {};

The d Flag and Match Indices

The new d flag generates an indices array for each capture group, recording the start and end positions of matches:

const re = /(a+)(b+)/d;
const match = re.exec("aaabb");

console.log(match.indices[0]); // [0, 5] - Entire match
console.log(match.indices[1]); // [0, 3] - First capture group
console.log(match.indices[2]); // [3, 5] - Second capture group

For named capture groups, indices also appear in indices.groups:

const re = /(?<as>a+)(?<bs>b+)/d;
const match = re.exec("aaabb");

console.log(match.indices.groups.as); // [0, 3]
console.log(match.indices.groups.bs); // [3, 5]

Enhanced Unicode Property Escapes

\p{} and \P{} now support more Unicode properties, such as matching by script (Script) and extended properties (Extended_Property):

// Match all Greek letters
const greekLetters = /\p{Script=Greek}+/u.test("αβγδε");
console.log(greekLetters); // true

// Match all emoji
const emoji = /\p{Emoji}/u.test("😊");
console.log(emoji); // true

The new X flag allows more flexible Unicode matching rules:

// Match any Unicode whitespace character (including full-width spaces)
const unicodeWhitespace = /\s+/X.test(" "); // Full-width space

The v Mode Flag

The v flag introduces set operations and string patterns, supporting the following operations:

  1. Set Difference: Use -- to exclude characters

    // Match all non-ASCII digit letters
    const re = /[\p{Letter}--\p{ASCII}]/v;
    
  2. Set Intersection: Use && to match common characters

    // Match characters that are both Greek letters and uppercase
    const re = /[\p{Script=Greek}&&\p{Uppercase}]/v;
    
  3. String Literals: Use \q{} to include exact strings

    // Match "foo.bar" exactly, with the dot not treated as a wildcard
    const re = /\q{foo.bar}/v;
    

Regular Expression Subfeature Testing

The new hasIndices property detects whether the d flag is enabled:

const re1 = /a/d;
console.log(re1.hasIndices); // true

const re2 = /a/;
console.log(re2.hasIndices); // false

The unicodeSets property detects whether the v mode is enabled:

const re = /a/v;
console.log(re.unicodeSets); // true

Practical Application Examples

Scenario 1: Log Timestamp Parsing

const log = "[2023-04-01T14:30:00Z] ERROR: System crash";
const { groups: { date, time } } = log.match(
  /\[(?<date>\d{4}-\d{2}-\d{2})T(?<time>\d{2}:\d{2}:\d{2})/
) || {};

console.log(`Failure occurred on ${date} ${time}`);

Scenario 2: Multilingual Text Processing

// Match text mixing Chinese, Japanese, and Latin characters
const mixedText = "中文日本語English混合";
const re = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Latin}]+/v;
console.log(mixedText.match(re)); // ["中文日本語English混合"]

Scenario 3: Advanced Input Validation

// Validate passwords must contain lowercase, uppercase, and special characters
const passwordRe = /^(?=.*\p{Ll})(?=.*\p{Lu})(?=.*\p{P}).{8,}$/v;
console.log(passwordRe.test("Passw0rd!")); // true
console.log(passwordRe.test("weak"));      // false

Performance Considerations

  1. Using the d flag slightly increases matching time; disable it if unnecessary.
  2. The v mode has lower performance than simple patterns when handling complex set operations.
  3. Precompile frequently used regular expressions:
// Precompile regular expressions
const reCache = new Map();
function getRegExp(pattern) {
  if (!reCache.has(pattern)) {
    reCache.set(pattern, new RegExp(pattern, "v"));
  }
  return reCache.get(pattern);
}

Backward Compatibility Strategies

  1. Two ways to detect feature support:
// Method 1: Direct detection
const supportsDFlag = (() => {
  try {
    new RegExp("", "d");
    return true;
  } catch {
    return false;
  }
})();

// Method 2: Use a feature detection library
import features from 'regex-features';
console.log(features.namedGroups);
  1. For unsupported environments, use the Babel plugin @babel/plugin-transform-named-capturing-groups for fallback.

本站部分内容来自互联网,一切版权均归源网站或源作者所有。

如果侵犯了你的权益请来信告知我们删除。邮箱:cc@cccx.cn

Front End Chuan

Front End Chuan, Chen Chuan's Code Teahouse 🍵, specializing in exorcising all kinds of stubborn bugs 💻. Daily serving baldness-warning-level development insights 🛠️, with a bonus of one-liners that'll make you laugh for ten years 🐟. Occasionally drops pixel-perfect romance brewed in a coffee cup ☕.