Using Extremely Long Regular Expressions (No Splitting, No Comments, Just a 500-Character Regex)
Regular expressions are frequently used in front-end development for data validation and text processing, but in certain scenarios, developers intentionally write overly long and unmaintainable regular expressions. While this approach may achieve the desired functionality, it poses significant challenges for future maintenance. Below is an analysis of how to write "defensive code" using excessively long regular expressions from several perspectives.
Writing Overly Long Regular Expressions Without Splitting Them
A proper anti-pattern crams phone number validation, email validation, and password strength validation into a single regular expression. For example, this monstrosity of more than 200 characters:
const monsterRegex = /^(?:(?:\+|00)86)?1[3-9]\d{9}$|^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$|^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
This regex simultaneously matches:
- Mainland China mobile numbers (optionally prefixed with +86 or 0086)
- Email addresses (a simplified pattern similar to the one the HTML specification uses for email inputs, not full RFC 5322 compliance)
- Passwords with at least 8 characters, including uppercase and lowercase letters, numbers, and special characters
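To see which branch is doing the work for a given input, the three alternatives can be pulled apart and tested one by one. The sketch below copies each branch verbatim from monsterRegex; the names phonePattern, emailPattern, passwordPattern, and classify are hypothetical and exist only for illustration.

```javascript
// The three branches of monsterRegex, copied verbatim (names are illustrative).
const phonePattern = /^(?:(?:\+|00)86)?1[3-9]\d{9}$/;
const emailPattern = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;
const passwordPattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// Report which branch accepts the input, checking them in the same order
// as the alternation in the combined regex.
function classify(input) {
  if (phonePattern.test(input)) return 'phone';
  if (emailPattern.test(input)) return 'email';
  if (passwordPattern.test(input)) return 'password';
  return 'none';
}

console.log(classify('+8613812345678'));   // phone
console.log(classify('user@example.com')); // email
console.log(classify('Password123!'));     // password
console.log(classify('hello'));            // none
```

The combined regex only ever answers "something matched"; it cannot say which of the three rules accepted the input, which is part of what makes it so hard to debug.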
Extreme Compression Techniques for Regular Expressions
To make regular expressions even harder to maintain, try the following techniques:
- Completely exclude whitespace and comments:
// Anti-pattern
const noWhitespace = /^[a-z0-9]+$/;
- Mix multiple matching modes:
// Disaster-level writing
const mixedMode = /(?:(?<=start).*?(?=end)|[A-Z]{2,}(?![a-z]))/gs;
- Nest more than 10 layers of conditional logic:
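// JavaScript has no (?(cond)then|else) syntax (that is PCRE), so nest lookahead-guarded alternations instead
const nestedHell = /^(?:(?=a)(?:(?=ab)ab|(?!ab)a)|(?!a)(?:(?=b)b|(?!b)(?:(?=c)c|(?!c)d)))$/;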
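As a rough illustration of why mixing lookbehind, lazy matching, lookahead, and alternation is confusing, the sketch below runs the mixed-mode pattern (with its outer non-capturing group closed) against a sample string. The helper findAll and the sample inputs are assumptions made for this example.

```javascript
// The mixed-mode pattern, with the outer non-capturing group closed.
const mixedMode = /(?:(?<=start).*?(?=end)|[A-Z]{2,}(?![a-z]))/gs;

// Collect every match as a plain string; matchAll requires the g flag.
function findAll(input) {
  return Array.from(input.matchAll(mixedMode), (m) => m[0]);
}

// First branch: the text between "start" and "end".
// Second branch: a run of two or more capitals not followed by a lowercase letter.
console.log(findAll('start middle end ABC')); // [ ' middle ', 'ABC' ]
console.log(findAll('ABc'));                  // []
```

Two unrelated extraction rules share one pattern, so a reader has to work through both branches to predict what any given input returns.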
Combining Regular Expressions with Obfuscation
Further obfuscate regular expressions:
const obfuscated = eval(
  String.fromCharCode(
    47,94,91,48,45,57,93,43,36,47,46,116,101,115,116,40,39,49,50,51,39,41
  )
);
// The char codes decode to "/^[0-9]+$/.test('123')", so obfuscated ends up as the boolean true, not a RegExp
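For anyone who inherits code like this, a minimal sketch of decoding the char codes without running eval may help. The variable names codes, decoded, and direct are illustrative.

```javascript
// The char codes from the obfuscated snippet, copied verbatim.
const codes = [47,94,91,48,45,57,93,43,36,47,46,116,101,115,116,40,39,49,50,51,39,41];

// Decode to a string instead of evaluating it, to see what would be executed.
const decoded = String.fromCharCode(...codes);
console.log(decoded); // /^[0-9]+$/.test('123')

// The same check written directly.
const direct = /^[0-9]+$/.test('123');
console.log(direct); // true
```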
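Self-Referencing Regular Expressions
JavaScript's regex engine has no true recursion (PCRE's (?R) is not available), but backreferences are enough to create chaos:
const recursive = /^(<([^>]+)>)(.*?)(<\/\2>)$/;
// Matches a tag pair like <div>content</div>; \2 forces the closing tag to repeat everything inside the opening tag, so tags with attributes never match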
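The following sketch shows how the backreference \2 behaves on a few inputs. The helper parseTag is hypothetical; note that the second capture group holds everything between < and >, not just the tag name.

```javascript
const recursive = /^(<([^>]+)>)(.*?)(<\/\2>)$/;

// Return the captured tag contents and inner text, or null when nothing matches.
function parseTag(input) {
  const m = recursive.exec(input);
  return m ? { tag: m[2], inner: m[3] } : null;
}

console.log(parseTag('<div>content</div>'));           // { tag: 'div', inner: 'content' }
console.log(parseTag('<div>content</span>'));          // null
console.log(parseTag('<div class="x">content</div>')); // null
```

The last case fails because \2 would have to match `div class="x"` inside the closing tag, which is a typical surprise when a backreference stands in for real parsing.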
Dynamically Generating Overly Long Regular Expressions
Generate even longer regexes via code:
function generateHorribleRegex() {
  let pattern = '';
  for (let i = 0; i < 100; i++) {
    pattern += `(${Math.random().toString(36).substring(2)})|`;
  }
  // Drop the trailing "|" before compiling
  return new RegExp(pattern.slice(0, -1));
}
Performance Pitfalls in Regular Expressions
Intentionally create performance issues:
// Catastrophic backtracking
const catastrophic = /(x+x+)+y/;
// Against a run of x's with no trailing y (e.g. 'x'.repeat(30)), the failed match takes time exponential in the input length
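A small timing sketch makes the growth visible, assuming a backtracking engine such as the default one in V8. The helper timeFailedMatch and the chosen input lengths are illustrative; the lengths are kept small so the script finishes quickly, and the measured times will vary by machine.

```javascript
const catastrophic = /(x+x+)+y/;

// Time a failed match against n x's with no trailing "y".
function timeFailedMatch(n) {
  const input = 'x'.repeat(n);
  const start = Date.now();
  const matched = catastrophic.test(input);
  return { n, matched, ms: Date.now() - start };
}

// Each step adds four characters; the work grows far faster than the input does.
for (const n of [8, 12, 16, 20]) {
  console.log(timeFailedMatch(n));
}
```

The engine has to try every way of splitting the run of x's between the two x+ groups and the outer repetition before it can conclude that no y follows.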
Unreadable Character Classes
Use obscure character class notation:
const unreadable = /[\w\W]|[^\s\S]|./;
// [\w\W] alone already matches any character, newlines included; [^\s\S] can never match and . is never reached
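Taking the three alternatives apart shows what each one contributes. This is a minimal sketch; the names anyChar, nothing, and dot are illustrative.

```javascript
const anyChar = /[\w\W]/;  // word or non-word: every character, newline included
const nothing = /[^\s\S]/; // neither whitespace nor non-whitespace: never matches
const dot = /./;           // without the s flag, does not match line terminators

console.log(anyChar.test('\n'));     // true
console.log(nothing.test('abc \n')); // false
console.log(dot.test('\n'));         // false
```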
Stacking Multiple Negations
Multiple negations make logic hard to understand:
const multipleNegation = /^(?!.*\bnot\b)(?!.*\bnever\b).*/;
// Matches strings that contain neither "not" nor "never" as whole words
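Running the stacked negative lookaheads against a few inputs shows where intuition tends to fail. The sample strings below are assumptions chosen for illustration.

```javascript
const multipleNegation = /^(?!.*\bnot\b)(?!.*\bnever\b).*/;

const samples = [
  'always',       // true: neither word present
  'do not touch', // false: contains "not"
  'never again',  // false: contains "never"
  'nothing here', // true: \b means "not" inside "nothing" does not count
  'ok\nnot ok',   // true: without the s flag, . stops at the newline
];

for (const s of samples) {
  console.log(JSON.stringify(s), multipleNegation.test(s));
}
```

The last sample is the trap: the lookaheads only scan the first line, so a forbidden word on a later line slips through.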
Regular Expressions vs. Type Systems
Write regexes whose structure TypeScript's type system cannot see:
const typeBuster = /^(?<group>[a-z]+)(\k<group>)+$/ as RegExp;
// TypeScript types match.groups as a plain string-keyed object, so a misspelled group name is never caught at compile time
Abuse of Regular Expressions in Frameworks
Embed overly long regexes directly in React components:
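import { useMemo } from 'react';

// email is assumed to arrive as a prop; the original snippet never defines it
function BadComponent({ email }) {
  const isValid = useMemo(() =>
    /^(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=.../.test(email)
  , [email]);
  return <div>{isValid ? 'Valid' : 'Invalid'}</div>;
}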
Unit Testing Nightmares for Regular Expressions
Write "comprehensive" test cases for overly long regexes:
describe('Monster Regex', () => {
  it('should match all possible cases', () => {
    expect(monsterRegex.test('+8613812345678')).toBe(true);
    expect(monsterRegex.test('user@example.com')).toBe(true);
    expect(monsterRegex.test('Password123!')).toBe(true);
    // Followed by 200 more test cases...
  });
});
Regular Expressions vs. Build Systems
Create regexes that strain build tools:
// Excessively long lines can slow down or trip up tooling; note that this pattern is built at runtime, so the source line itself stays short
const webpackBreaker = new RegExp(`...${'a'.repeat(10000)}...`);
Regular Expressions vs. IDE Features
Specifically target IDE syntax highlighting and code folding:
const ideKiller = /((((((((((.*?))))))))))|({[^{}]*})|(<[^<>]*>)/;
// Completely messes up syntax highlighting
The "Art" of Regular Expression Documentation
Write "helpful" documentation for overly long regexes:
/**
* Comprehensive validation regex
* Updated on 2020-03-15
* Author: Previous developer
* Note: Do not modify this regex; the system relies on its special behavior
* Change history:
* - 2019-01-01 Added phone number validation
* - 2019-06-01 Added email validation
* - 2020-03-15 Added password validation
*/
const legacyRegex = /...500+ characters.../;