Line separator and paragraph separator
ECMAScript 2018 (ES9) and ECMAScript 2019 (ES10) together cleaned up how JavaScript handles the Line Separator and Paragraph Separator. ES2018 added the s (dotAll) regular expression flag, which lets . match line terminators, and ES2019 allowed both characters to appear unescaped in string literals (the "JSON superset" change). The characters were already defined in Unicode and already counted as line terminators in JavaScript; the changes removed a long-standing mismatch between JSON and JavaScript source text.
Unicode Definitions of Line Separator and Paragraph Separator
The Unicode standard defines two special line control characters:
- Line Separator (U+2028): Marks the end of a line without ending the paragraph
- Paragraph Separator (U+2029): Marks the end of a paragraph; renderers may add extra spacing after it
Traditional line breaks (e.g., \n and \r\n) behave inconsistently across operating systems, while these two characters provide more precise control over text structure. For example:
const lineSep = '\u2028';
const paraSep = '\u2029';
console.log(`First line${lineSep}Second line`);
console.log(`First paragraph${paraSep}Second paragraph`);
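Because both characters are invisible in most editors and consoles, it can help to locate them explicitly. The following is a minimal sketch; findSeparators is a hypothetical helper name, not a built-in, and it assumes an engine with String.prototype.matchAll:

```javascript
// Hypothetical helper: report every U+2028 / U+2029 in a string.
function findSeparators(text) {
  const names = {
    '\u2028': 'LINE SEPARATOR',
    '\u2029': 'PARAGRAPH SEPARATOR',
  };
  const found = [];
  // matchAll requires the g flag and yields one match object per occurrence
  for (const match of text.matchAll(/[\u2028\u2029]/g)) {
    found.push({ index: match.index, name: names[match[0]] });
  }
  return found;
}

const sample = 'First line\u2028Second line\u2029Next paragraph';
console.log(findSeparators(sample));
```

Each entry carries the zero-based index of the separator and its Unicode name, which is usually enough to track down where a stray character entered the text.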
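Pre-ES2019 Handling Issues
Before ES2019, a raw (unescaped) U+2028 or U+2029 inside a single- or double-quoted string literal was a syntax error, because JavaScript treats both characters as line terminators. Template literals were not affected, since they may span lines:
// eval keeps this snippet itself valid on older engines
const source = '"Contains\u2028invisible separator"'; // the evaluated text holds a raw U+2028
eval(source); // SyntaxError before ES2019; returns the string afterwards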
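JSON posed a related risk. JSON has always allowed these characters unescaped inside strings, so JSON.parse accepted them, but the same text could fail when evaluated as JavaScript source (eval, JSONP, or JSON inlined into a script):
const json = '{"text":"Hidden separator\u2028"}'; // raw U+2028 inside a JSON string
JSON.parse(json); // always worked: JSON permits U+2028 in strings
eval('(' + json + ')'); // SyntaxError before ES2019: not valid JavaScript source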
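When JSON text has to be embedded in JavaScript source that may run on older engines, one common precaution is to escape the two separators after serializing. A minimal sketch, where toSourceSafeJson is a hypothetical helper name:

```javascript
// Hypothetical helper: serialize a value so the result contains no raw
// U+2028 / U+2029 and can be pasted into JavaScript source on older engines.
function toSourceSafeJson(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const payload = { text: 'Hidden separator\u2028and paragraph\u2029end' };
const safe = toSourceSafeJson(payload);

console.log(safe.includes('\u2028')); // false
console.log(JSON.parse(safe).text === payload.text); // true
```

The escaped output is still valid JSON and parses back to the same value. This sketch only handles the two separators; embedding in HTML script elements needs further escaping that is out of scope here.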
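ES2018 and ES2019 Improvements
ES2019 treats these characters as valid content in string literals, and ES2018 extended regular expressions:
- String Literals: U+2028 and U+2029 can appear unescaped in single- and double-quoted strings (template literals already allowed them)
- Regular Expressions: \s has always matched both characters; the s flag (ES2018) additionally lets . match them
- JSON Specification: JSON always allowed them unescaped in strings, so such JSON strings are now also valid JavaScript string literals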
Example demonstrating improved handling:
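// The escaped form is valid in every version; the raw character is also valid since ES2019
const safeString = 'Explicit \u2028 separator';
const regex = /\s+/u; // \s matches U+2028 and U+2029, with or without the u flag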
// JSON handling
const jsonStr = JSON.stringify({ text: 'Data with \u2029' });
console.log(JSON.parse(jsonStr)); // Parses correctly
Practical Use Cases
Multi-line Text Processing
Precisely distinguishing lines and paragraphs in rich text:
function parseText(input) {
  return input.split(/\u2029/g).map(paragraph =>
    paragraph.split(/\u2028/g)
  );
}
const content = 'Paragraph1\u2029Paragraph2Line1\u2028Line2';
console.log(parseText(content));
// Output: [ ['Paragraph1'], ['Paragraph2Line1', 'Line2'] ]
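Real-world input often mixes these separators with conventional line breaks. As a rough sketch, a hypothetical splitLines helper could treat every JavaScript line terminator the same way, with \r\n counted as a single break:

```javascript
// Hypothetical helper: split on any line terminator JavaScript recognizes.
// \r\n is listed first so a Windows line ending counts as one break, not two.
function splitLines(text) {
  return text.split(/\r\n|[\n\r\u2028\u2029]/);
}

console.log(splitLines('a\r\nb\nc\u2028d\u2029e'));
// [ 'a', 'b', 'c', 'd', 'e' ]
```

Unlike parseText above, this flattens the line/paragraph distinction, which suits tasks such as line counting where only the breaks matter.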
Source Code Validation Tools
Developing ESLint plugins to detect accidental separator usage:
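// Example ESLint rule: flag raw (unescaped) separators in string literals
module.exports = {
  meta: { type: 'suggestion', schema: [] },
  create(context) {
    return {
      Literal(node) {
        // node.raw is the literal's source text, so it contains U+2028/U+2029
        // only when the character was typed directly rather than escaped
        if (typeof node.value === 'string' && /[\u2028\u2029]/.test(node.raw)) {
          context.report({
            node,
            message: 'Use the escape sequences \\u2028 / \\u2029 instead of raw line/paragraph separators'
          });
        }
      }
    };
  }
};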
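A lint check that wants to tell raw separators from escaped ones has to inspect the literal's source text rather than its evaluated value, since both spellings evaluate to the same string. A small illustration, using a hypothetical containsRawSeparator helper and hand-written source strings in place of a real AST:

```javascript
// Hypothetical check that a lint rule could run on a literal's source text.
function containsRawSeparator(sourceText) {
  return /[\u2028\u2029]/.test(sourceText);
}

// String.raw keeps the backslash, so this text holds the six-character escape sequence
const escapedLiteral = String.raw`'a\u2028b'`;
// Here the escape is resolved, so the text holds the actual U+2028 character
const rawLiteral = "'a\u2028b'";

console.log(containsRawSeparator(escapedLiteral)); // false
console.log(containsRawSeparator(rawLiteral)); // true
```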
Interaction with Regular Expressions
ES9 (ES2018) enhanced regex handling of line terminators:
- The dot (.) can be made to match line terminators by setting the s (dotAll) flag
- The \s character class matches all Unicode whitespace, including U+2028 and U+2029 (this was already the case before ES2018)
// Old vs. new behavior
const text = 'Line1\u2028Line2\u2029Paragraph2';
// Traditional matching
console.log(text.match(/.*/)[0]); // Only matches the first line
// ES9 's' flag
console.log(text.match(/.*/s)[0]); // Matches all content
// Whitespace detection
console.log(/\s/u.test('\u2028')); // true
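The separators also interact with the m flag and with character classes. The sketch below uses a sample string of the same shape (the variable name mixed is arbitrary) to show that ^ treats both characters as line boundaries, and that [^] is a fallback for engines without the s flag:

```javascript
const mixed = 'Line1\u2028Line2\u2029Paragraph2';

// With the m flag, ^ matches after any line terminator, including U+2028 and U+2029
console.log(mixed.match(/^\w+/gm)); // [ 'Line1', 'Line2', 'Paragraph2' ]

// Without the s flag, . does not cross a separator
console.log(/Line1.Line2/.test(mixed)); // false
console.log(/Line1.Line2/s.test(mixed)); // true

// [^] matches any character at all and also works on engines without the s flag
console.log(/Line1[^]Line2/.test(mixed)); // true
```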
Browser and Engine Compatibility
Unescaped separators in string literals are supported from the following engine versions (per MDN compatibility data):
- V8 (Chrome 66+)
- SpiderMonkey (Firefox 62+)
- JavaScriptCore (Safari 12+)
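Feature detection method:
const supportsRawSeparators = () => {
  try {
    // The evaluated source must contain the raw characters between the quotes;
    // escaping the backslash ("'\\u2028'") would only test the escape sequence
    eval("'\u2028'");
    eval("'\u2029'");
    return true;
  } catch (e) {
    return false;
  }
};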
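The s flag can be detected separately. A minimal sketch, relying on the fact that the RegExp constructor throws a SyntaxError for flags it does not recognize (supportsDotAllFlag is a hypothetical name):

```javascript
// Hypothetical detector for the ES2018 s (dotAll) flag.
const supportsDotAllFlag = () => {
  try {
    // Built from strings so engines without the flag fail at run time,
    // not when the file is parsed
    new RegExp('.', 's');
    return true;
  } catch (e) {
    return false;
  }
};

console.log(supportsDotAllFlag());
```

Building the expression with the constructor rather than a /./s literal matters: a literal with an unknown flag would be a syntax error for the whole file before any detection code could run.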
Performance Considerations
Whether these separators affect string operation performance depends on the engine, so it is better to measure than to assume. A rough comparison:
// Performance comparison
const testLargeString = (sep) => {
  const bigText = Array(1e5).fill(`text${sep}`).join('');
  console.time('split');
  bigText.split(sep);
  console.timeEnd('split');
};
testLargeString('\n'); // Traditional line break
testLargeString('\u2028'); // Line separator
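A single timed run is noisy, so repeating the measurement and taking the median gives a steadier picture. The following is only a sketch: medianSplitTime and its default run count and input size are assumptions chosen for illustration, and it assumes a runtime with a global performance object (browsers and recent Node.js versions):

```javascript
// Hypothetical helper: median time in milliseconds for splitting on `sep`.
function medianSplitTime(sep, runs = 5, pieces = 1e4) {
  const bigText = Array(pieces).fill(`text${sep}`).join('');
  const timings = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    bigText.split(sep);
    timings.push(performance.now() - start);
  }
  // Sort numerically and take the middle element
  timings.sort((a, b) => a - b);
  return timings[Math.floor(timings.length / 2)];
}

console.log('\\n    :', medianSplitTime('\n'), 'ms');
console.log('U+2028:', medianSplitTime('\u2028'), 'ms');
```

The absolute numbers vary by engine and machine; only the relative difference between the two separators on the same setup is meaningful.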
Integration with TypeScript
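TypeScript needs no special handling for the escaped forms \u2028 and \u2029. A configuration like the following targets ES2018 (the edition that introduced the s flag) and loads the ES2019 library typings:
{
  "compilerOptions": {
    "target": "ES2018",
    "lib": ["ES2019"]
  }
}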
The type system does not distinguish these characters; a string that contains them is an ordinary string:
declare function processText(text: string): void;
// Both calls type-check like any other string argument
processText('Valid \u2028 separator');
processText(`Template \u2029 string`);