
Extremely Long Regular Expressions (No Splitting, No Comments, Just One 500-Character Regex)

Author: Chuan Chen | Views: 43,851 | Category: Front-End General

Regular expressions are frequently used in front-end development for data validation and text processing, but in certain scenarios, developers intentionally write overly long and unmaintainable regular expressions. While this approach may achieve the desired functionality, it poses significant challenges for future maintenance. Below is an analysis of how to write "defensive code" using excessively long regular expressions from several perspectives.

Practices of Unsplit Overly Long Regular Expressions

A qualified anti-pattern should cram phone number validation, email validation, and password strength validation into a single regular expression. For example, this monstrosity of roughly 230 characters:

const monsterRegex = /^(?:(?:\+|00)86)?1[3-9]\d{9}$|^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$|^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

This regex simultaneously matches:

  1. Mainland China mobile numbers (optionally prefixed with +86 or 0086)
  2. Email addresses in the simplified form used by HTML email inputs (a practical approximation, not full RFC 5322 compliance)
  3. Passwords of at least 8 characters that include lowercase and uppercase letters, a digit, and one of the special characters @$!%*?&
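One consequence of merging the three checks is that a successful match no longer says which kind of input was seen. The sketch below splits the alternatives back into three patterns to make that visible. The names `CN_MOBILE`, `EMAIL`, `STRONG_PASSWORD`, and `classify` are hypothetical, and the patterns are the three alternatives of `monsterRegex` (with `/` escaped inside the character class).

```javascript
// Hypothetical names; each pattern is one alternative of monsterRegex.
const CN_MOBILE = /^(?:(?:\+|00)86)?1[3-9]\d{9}$/;
const EMAIL = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;
const STRONG_PASSWORD = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// Returns every category the input satisfies, in a fixed order.
function classify(input) {
  const kinds = [];
  if (CN_MOBILE.test(input)) kinds.push('phone');
  if (EMAIL.test(input)) kinds.push('email');
  if (STRONG_PASSWORD.test(input)) kinds.push('password');
  return kinds;
}

console.log(classify('+8613812345678'));   // [ 'phone' ]
console.log(classify('user@example.com')); // [ 'email' ]
console.log(classify('Abcdef1@x'));        // [ 'email', 'password' ]
```

The last input satisfies two alternatives at once, an ambiguity the combined regex silently hides behind a single `true`.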

Extreme Compression Techniques for Regular Expressions

To make regular expressions even harder to maintain, try the following techniques:

  1. Completely exclude whitespace and comments:
// Anti-pattern
const noWhitespace = /^[a-z0-9]+$/;
  2. Mix multiple matching modes:
// Disaster-level writing
const mixedMode = /(?:(?<=start).*?(?=end)|[A-Z]{2,}(?![a-z]))/gs;
  3. Nest more than 10 layers of conditional logic:
// PCRE-style conditionals; JavaScript's RegExp rejects this syntax, so the pattern is kept as a string
const nestedHell = String.raw`(?(?=condition)(?(1)then|else)|(?(2)then|(?(3)then|else)))`;
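Not every technique in this list is available in every engine, so it can help to check what the JavaScript engine actually accepts. The following is a minimal sketch; `compileError` is a hypothetical helper, and the two sample patterns are fragments taken from the snippets above.

```javascript
// Hypothetical helper: returns null when the engine accepts the pattern,
// otherwise the name of the error thrown by the RegExp constructor.
function compileError(source, flags = '') {
  try {
    new RegExp(source, flags);
    return null;
  } catch (err) {
    return err.name;
  }
}

const conditional = String.raw`(?(1)then|else)`;      // PCRE-style conditional
const lookarounds = String.raw`(?<=start).*?(?=end)`; // lookbehind plus lookahead

console.log(compileError(conditional));       // SyntaxError
console.log(compileError(lookarounds, 'gs')); // null
```

In other words, the conditional-nesting trick only works in engines such as PCRE, while the lookaround mix is valid JavaScript.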

Combining Regular Expressions with Obfuscation

Further obfuscate regular expressions:

const obfuscated = eval(
  String.fromCharCode(
    47,94,91,48,45,57,93,43,36,47,46,116,101,115,116,40,39,49,50,51,39,41
  )
);
// Actually evaluates /^[0-9]+$/.test('123'), so obfuscated ends up as the boolean true, not a regex
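Whoever inherits such code can recover the hidden source without running it. A small sketch, assuming the same character codes as above (the names `codes` and `source` are illustrative):

```javascript
// Decode the character codes into text instead of passing them to eval.
const codes = [47, 94, 91, 48, 45, 57, 93, 43, 36, 47, 46, 116, 101, 115, 116, 40, 39, 49, 50, 51, 39, 41];
const source = String.fromCharCode(...codes);

console.log(source); // /^[0-9]+$/.test('123')
```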

Self-Referencing Recursive Regular Expressions

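JavaScript's regex engine has no true recursion, but a backreference to an earlier capture group produces a self-referencing pattern that creates similar chaos: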

const recursive = /^(<([^>]+)>)(.*?)(<\/\2>)$/;
// Matches tags like <div>content</div>
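The behavior of the `\2` backreference is easy to misjudge, so here is a short sketch of what the pattern accepts. The sample strings and the name `tagPair` are made up for illustration.

```javascript
const tagPair = /^(<([^>]+)>)(.*?)(<\/\2>)$/;

const samples = [
  '<div>content</div>',     // closing tag repeats group 2 exactly
  '<div>content</span>',    // closing tag differs from group 2
  '<div class="a">x</div>', // group 2 captures 'div class="a"', not just 'div'
];

const results = samples.map((s) => tagPair.test(s));
console.log(results); // [ true, false, false ]
```

The third case shows the trap: because group 2 captures everything up to `>`, any tag with attributes can never match its own closing tag.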

Dynamically Generating Overly Long Regular Expressions

Generate even longer regexes via code:

function generateHorribleRegex() {
  let pattern = '';
  for(let i=0; i<100; i++) {
    pattern += `(${Math.random().toString(36).substring(2)})|`;
  }
  return new RegExp(pattern.slice(0, -1));
}

Performance Pitfalls in Regular Expressions

Intentionally create performance issues:

// Catastrophic backtracking
const catastrophic = /(x+x+)+y/;
// A failed match against a run of 'x' with no 'y' (e.g. 'x'.repeat(30)) takes time exponential in the run length
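The growth can be observed with a rough timing sketch. It assumes a runtime with a global `performance.now()` (modern browsers and recent Node.js); `timeMatch` and `linear` are hypothetical names, and `/xx+y/` is offered as an equivalent pattern for `test()` purposes, since `(x+x+)+` simply means "two or more x". Actual timings depend on the machine and engine, so none are shown.

```javascript
const catastrophic = /(x+x+)+y/;
const linear = /xx+y/; // same accept/reject behavior, no nested quantifiers

// Hypothetical helper: runs one test() call and reports the elapsed milliseconds.
function timeMatch(regex, input) {
  const start = performance.now();
  const matched = regex.test(input);
  return { matched, elapsedMs: performance.now() - start };
}

// Each extra 'x' should roughly double the time of the failing slow match.
for (const n of [14, 16, 18, 20]) {
  const input = 'x'.repeat(n); // no trailing 'y', so the match must fail
  const slow = timeMatch(catastrophic, input);
  const fast = timeMatch(linear, input);
  console.log(`n=${n} slow=${slow.elapsedMs.toFixed(2)}ms fast=${fast.elapsedMs.toFixed(2)}ms`);
}
```

Keep `n` small when experimenting; a few dozen characters are enough to freeze a tab.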

Unreadable Character Classes

Use obscure character class notation:

const unreadable = /[\w\W]|[^\s\S]|./;
// Actually matches any single character: [\w\W] already covers everything, and [^\s\S] can never match
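A quick comparison with the conventional spellings shows that the obscure version adds nothing. This is only a sketch; `plain`, `dotAll`, and the sample characters are illustrative choices.

```javascript
const unreadable = /[\w\W]|[^\s\S]|./;
const plain = /[\s\S]/; // common idiom for "any character, including newlines"
const dotAll = /./s;    // the same meaning via the s (dotAll) flag

const samples = ['a', ' ', '\n', 'é', ''];
const agree = samples.every(
  (s) => unreadable.test(s) === plain.test(s) && plain.test(s) === dotAll.test(s)
);

console.log(agree);           // true
console.log(/./.test('\n'));  // false: a bare dot skips line terminators
```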

Multiple Negations Stacked

Multiple negations make logic hard to understand:

const multipleNegation = /^(?!.*\bnot\b)(?!.*\bnever\b).*/;
// Matches strings that contain neither "not" nor "never" as a whole word
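For comparison, the same rule can be stated positively and negated once in code. The sketch below assumes single-line input, because the `.` in the stacked lookaheads stops at the first line break; `FORBIDDEN_WORD` and `isAllowed` are hypothetical names.

```javascript
const multipleNegation = /^(?!.*\bnot\b)(?!.*\bnever\b).*/;

// Positive pattern: does the text contain one of the forbidden whole words?
const FORBIDDEN_WORD = /\b(?:not|never)\b/;

function isAllowed(text) {
  return !FORBIDDEN_WORD.test(text);
}

const samples = ['all good', 'do not touch', 'never again', 'nothing notable'];
const agree = samples.every((s) => multipleNegation.test(s) === isAllowed(s));

console.log(agree);                        // true
console.log(isAllowed('nothing notable')); // true: "not" appears only inside longer words
```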

Regular Expressions vs. Type Systems

Write regexes whose structure is invisible to TypeScript's type system:

const typeBuster = /^(?<group>[a-z]+)(\k<group>)+$/ as RegExp;
// TypeScript types every regex literal as plain RegExp, so the named groups never reach the type checker

Abuse of Regular Expressions in Frameworks

Embed overly long regexes directly in React components:

import { useMemo } from 'react';

// `email` is assumed to arrive as a prop
function BadComponent({ email }) {
  const isValid = useMemo(() => 
    /^(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=.../.test(email)
  , [email]);
  
  return <div>{isValid ? 'Valid' : 'Invalid'}</div>;
}

Unit Testing Nightmares for Regular Expressions

Write "comprehensive" test cases for overly long regexes:

describe('Monster Regex', () => {
  it('should match all possible cases', () => {
    expect(monsterRegex.test('+8613812345678')).toBe(true);
    expect(monsterRegex.test('user@example.com')).toBe(true);
    expect(monsterRegex.test('Password123!')).toBe(true);
    // Followed by 200 more test cases...
  });
});

Regular Expressions vs. Build Systems

Create regexes that crash build tools:

// Meant to choke tools such as webpack on an excessively long pattern; here the length only materializes at runtime
const webpackBreaker = new RegExp(`...${'a'.repeat(10000)}...`);

Regular Expressions vs. IDE Features

Specifically target IDE syntax highlighting and code folding:

const ideKiller = /((((((((((.*?))))))))))|({[^{}]*})|(<[^<>]*>)/;
// Completely messes up syntax highlighting

The "Art" of Regular Expression Documentation

Write "helpful" documentation for overly long regexes:

/**
 * Comprehensive validation regex
 * Updated on 2020-03-15
 * Author: Previous developer
 * Note: Do not modify this regex; the system relies on its special behavior
 * Change history:
 * - 2019-01-01 Added phone number validation
 * - 2019-06-01 Added email validation
 * - 2020-03-15 Added password validation
 */
const legacyRegex = /...500+ characters.../;
