The encoding setting of an HTML document

Author: Chuan Chen | Views: 50459 | Category: HTML

The character encoding of an HTML document determines how the browser turns the file's bytes into text. If the declared encoding does not match the encoding the file was actually saved in, the page shows garbled text, so the encoding should be declared consistently in the document itself (via the meta tag) and in the server's HTTP headers.

Document Type Declaration and Encoding

An HTML5 document starts with <!DOCTYPE html>. The doctype selects standards mode and does not set the encoding by itself; the encoding is declared by the <meta charset> tag inside <head>. A complete document structure looks like this:

<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <meta charset="UTF-8">
  <title>Example Page</title>
</head>
<body>
  <p>Page content</p>
</body>
</html>

Common encoding formats include:

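  • UTF-8 (recommended): Covers all of Unicode; the HTML Standard requires it for new documents
  • GB2312/GBK: Legacy encodings for Simplified Chinese (GB18030 is the current superset)
  • ISO-8859-1: Legacy encoding for Western European languages; browsers treat this label as windows-1252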
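To see why the declared encoding has to match the bytes, the following minimal sketch (Node.js, using only the built-in Buffer; the sample string is an arbitrary choice) encodes a string as UTF-8 and then decodes the same bytes both correctly and as ISO-8859-1:

```javascript
// Encode a short string as UTF-8 bytes.
const original = 'café';
const utf8Bytes = Buffer.from(original, 'utf8'); // 5 bytes: "é" takes two

// Decoding with the matching encoding restores the text.
const decodedAsUtf8 = utf8Bytes.toString('utf8');

// Decoding the same bytes as ISO-8859-1 maps every byte to its own
// character, which produces the typical garbled output.
const decodedAsLatin1 = utf8Bytes.toString('latin1');

console.log(decodedAsUtf8);   // café
console.log(decodedAsLatin1); // cafÃ©
```

The bytes never change; only the interpretation does, which is why a wrong declaration garbles an otherwise intact file.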

Detailed Configuration of Meta Tags

The <meta charset> tag must lie entirely within the first 1024 bytes of the document, so place it as the first element in <head>, before the title, scripts, and stylesheets:

<head>
  <!-- Correct position -->
  <meta charset="UTF-8">
  <!-- Other head content -->
  <link rel="stylesheet" href="style.css">
</head>
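The 1024-byte rule can be checked roughly with a script. The sketch below is a simplified approximation, not the browser's actual prescan algorithm; findEarlyCharset is a hypothetical helper name, and the regular expression only covers common ways of writing the tag:

```javascript
// Hypothetical helper: returns the charset label declared in a <meta> tag
// within the first 1024 bytes of the document, or null if none is found.
function findEarlyCharset(htmlBytes) {
  // latin1 maps every byte to exactly one character, so byte offsets are preserved.
  const head = htmlBytes.subarray(0, 1024).toString('latin1');
  const match = /<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_:.-]+)/i.exec(head);
  return match ? match[1].toLowerCase() : null;
}

const early = Buffer.from('<!DOCTYPE html><html><head><meta charset="UTF-8"><title>x</title>');
// A long comment pushes the declaration past the 1024-byte window.
const late = Buffer.from('<!DOCTYPE html><html><head><!--' + ' '.repeat(2000) + '--><meta charset="UTF-8">');

console.log(findEarlyCharset(early)); // utf-8
console.log(findEarlyCharset(late));  // null
```

A check like this could run in a build step to catch templates where large inline scripts or comments precede the declaration.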

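The older pragma form is equivalent and is still found in legacy pages. A document should contain only one of the two forms, and the declared encoding must match the encoding the file is actually saved in: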

<meta http-equiv="Content-Type" content="text/html; charset=GB2312">

Priority of HTTP Headers and Encoding

When the sources of encoding information disagree, browsers resolve the encoding in this order:

  1. Byte order mark (BOM) at the start of the file
  2. charset parameter of Content-Type in the HTTP header
  3. <meta charset> tag
  4. Browser auto-detection or locale default
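The precedence can be sketched as a small function. This is an illustration under assumptions, not a browser implementation: the function names are hypothetical, the meta value is passed in already extracted, and the windows-1252 fallback stands in for whatever the browser's locale default would be. It also checks for a byte order mark first, which the HTML Standard ranks above the HTTP header:

```javascript
// Returns the encoding implied by a byte order mark, or null if there is none.
function detectBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8';
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be';
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le';
  return null;
}

// Extracts the charset parameter from a Content-Type header value, if present.
function charsetFromContentType(contentType) {
  const match = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical resolver: BOM first, then HTTP header, then <meta>, then fallback.
function resolveEncoding(bytes, contentType, metaCharset, fallback = 'windows-1252') {
  return (
    detectBom(bytes) ||
    charsetFromContentType(contentType) ||
    (metaCharset ? metaCharset.toLowerCase() : null) ||
    fallback
  );
}

// The header wins over the meta tag when both are present.
console.log(resolveEncoding(Buffer.from('<p>'), 'text/html; charset=GBK', 'utf-8')); // gbk
```

The practical consequence is that a correct meta tag cannot repair a wrong charset sent by the server.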

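Example of setting the response header in a Node.js server. Node.js encodes strings passed to res.end() as UTF-8 by default, so the header must declare UTF-8 as well:

const http = require('http');
http.createServer((req, res) => {
  // The declared charset must match the bytes actually sent (UTF-8 here).
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<h1>Chinese Content Test</h1>');
}).listen(3000);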
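One related pitfall when sending non-ASCII content by hand: the Content-Length header counts bytes, not characters. A minimal sketch, where buildHtmlHeaders is a hypothetical helper name and the sample body is an assumption:

```javascript
// Hypothetical helper: builds response headers for an HTML body sent as UTF-8.
function buildHtmlHeaders(body) {
  return {
    'Content-Type': 'text/html; charset=utf-8',
    // byteLength counts encoded bytes; body.length would count UTF-16 code units.
    'Content-Length': Buffer.byteLength(body, 'utf8'),
  };
}

const body = '<h1>中文内容测试</h1>';
const headers = buildHtmlHeaders(body);

console.log(body.length);               // 15
console.log(headers['Content-Length']); // 27
```

In a request handler this could be used as res.writeHead(200, buildHtmlHeaders(body)) followed by res.end(body).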

Handling Special Characters

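In a UTF-8 document most characters, including © and €, can be written directly. HTML entities are required only for characters that have syntactic meaning in HTML and are optional for the rest: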

<p>Copyright symbol: &copy; Currency symbol: &euro;</p>
<!-- Output: Copyright symbol: © Currency symbol: € -->

Reference table for characters that must be escaped in HTML text:

Character   Entity Number   Entity Name
<           &#60;           &lt;
>           &#62;           &gt;
&           &#38;           &amp;
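When text is inserted into markup from a script, the same escaping can be applied programmatically. The sketch below uses a hypothetical escapeHtml helper; besides the three characters in the table it also escapes both quote characters, which is an addition made here so the result is safe inside attribute values:

```javascript
// Characters with syntactic meaning in HTML and their entity replacements.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: replaces each special character in a single pass,
// so an "&" produced by one replacement is never escaped a second time.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```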

Encoding Practices for Multilingual Documents

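Mixed CJK (Chinese, Japanese, Korean) content should be served as UTF-8, and each run of text should carry a lang attribute. Because Unicode assigns many Chinese and Japanese characters the same code points, the browser relies on lang to choose the right glyphs and fonts:

<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <meta charset="UTF-8">
  <style>
    /* Special font settings for Japanese */
    .jp { font-family: "MS Gothic", monospace; }
  </style>
</head>
<body>
  <p>Chinese content</p>
  <p class="jp" lang="ja">日本語のコンテンツ</p>
  <p lang="ko">한국어 컨텐츠</p>
</body>
</html>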

Methods for Diagnosing Encoding Issues

Steps to check encoding in Chrome Developer Tools:

  1. Open the Network panel
  2. Click the target document
  3. View the Content-Type in Response Headers
  4. Inspect the <meta> tag in the Elements panel
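The same information can also be read from the Console panel. A minimal sketch, where reportEncoding is a hypothetical helper name; document.characterSet is the encoding the browser actually used, which may differ from what the markup declares:

```javascript
// Hypothetical helper: compares the encoding the browser used with the one
// declared in the markup. Takes the document as a parameter so it can be tested.
function reportEncoding(doc) {
  const meta = doc.querySelector('meta[charset]');
  return {
    effective: doc.characterSet,                         // what the browser decided on
    declared: meta ? meta.getAttribute('charset') : null // what the markup asks for
  };
}

// In a browser console, pass the live document.
if (typeof document !== 'undefined') {
  console.log(reportEncoding(document));
}
```

Note that the two values may differ in spelling even when they agree, because the browser reports a canonical name (for example GBK for a page that declares gb2312).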

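Common solutions for garbled text. The encoding cannot be changed from a script after the page has been parsed (document.charset is a read-only alias of document.characterSet), so the fix belongs in the markup or on the server:

<!-- Solution 1: Save the file as UTF-8 and declare it at the top of <head> -->
<meta charset="UTF-8">

<!-- Solution 2: Send a matching HTTP header from the backend, before any output -->
<?php header('Content-Type: text/html; charset=UTF-8'); ?>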
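Before changing any declaration, it helps to confirm what encoding the file is really in. The sketch below (Node.js) checks whether a byte sequence is valid UTF-8 using the standard TextDecoder in fatal mode; isValidUtf8 is a hypothetical helper name, and the sample bytes are assumptions chosen for illustration:

```javascript
// Returns true if the bytes form valid UTF-8, false otherwise.
function isValidUtf8(bytes) {
  try {
    // With fatal: true the decoder throws instead of inserting U+FFFD.
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidUtf8(Buffer.from('中文', 'utf8')));   // true
console.log(isValidUtf8(Uint8Array.from([0xd6, 0xd0]))); // false: "中" encoded as GBK
```

If a file fails this check, declaring UTF-8 will not help; the file has to be converted or declared with its real encoding.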

Compatibility Handling for Legacy Encoding Formats

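For legacy systems that must support old versions of Internet Explorer, note that X-UA-Compatible only selects the document mode in IE8 and later. It does not affect character encoding, which still comes from the HTTP header or the meta declaration: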

<!--[if IE]>
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7">
<![endif]-->

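Encoding declaration for legacy Traditional Chinese (Big5) pages. The two forms are equivalent, but a document may contain only one character encoding declaration, so pick one:

<!-- Legacy pragma form, understood by very old browsers -->
<meta http-equiv="Content-Type" content="text/html; charset=big5">
<!-- HTML5 short form (use instead of the line above, not together with it) -->
<!-- <meta charset="big5"> -->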

Encoding Configuration in Modern Frontend Tools

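Configuring HTML encoding in webpack. In html-webpack-plugin, a string value in the meta option produces a name/content pair, so the charset has to be given as an attribute object to generate <meta charset="utf-8">. If the template already declares the encoding, this option is unnecessary:

// webpack.config.js
const HtmlWebpackPlugin = require('html-webpack-plugin');

module.exports = {
  plugins: [
    new HtmlWebpackPlugin({
      // Object values are emitted as attributes: <meta charset="utf-8">
      meta: { charset: { charset: 'utf-8' } },
      template: './src/index.html'
    })
  ]
}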

Global configuration in Vue CLI projects:

// vue.config.js
module.exports = {
  chainWebpack: config => {
    config.plugin('html').tap(args => {
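      // An attribute object is emitted as <meta charset="utf-8">;
      // a plain string would become <meta name="charset" content="utf-8">.
      args[0].meta = { charset: { charset: 'utf-8' } }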
      return args
    })
  }
}

Relationship Between Encoding and DOM Operations

Encoding considerations for JavaScript string operations:

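// URLSearchParams.get() already percent-decodes the value as UTF-8,
// so no extra decodeURIComponent() call is needed (a second decode
// would throw on values that contain a literal "%").
function getParam(name) {
  return new URLSearchParams(window.location.search).get(name);
}

// Declaring the encoding in a Blob's MIME type
// (strings placed in a Blob are always stored as UTF-8)
const content = '<p>Hello</p>';
const blob = new Blob([content], { type: 'text/html;charset=utf-8' });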
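URLSearchParams handles percent-encoding in both directions and always uses UTF-8. A short sketch of the round trip, with sample parameter names chosen for illustration:

```javascript
// Building a query string: non-ASCII characters are percent-encoded as UTF-8 bytes.
const params = new URLSearchParams({ q: '中文', discount: '100%' });
const query = params.toString();
console.log(query); // q=%E4%B8%AD%E6%96%87&discount=100%25

// Reading it back: get() returns the decoded value directly.
const parsed = new URLSearchParams(query);
console.log(parsed.get('q'));        // 中文
console.log(parsed.get('discount')); // 100%

// Decoding a second time fails on the literal "%".
let secondDecodeFailed = false;
try {
  decodeURIComponent(parsed.get('discount'));
} catch (err) {
  secondDecodeFailed = err instanceof URIError;
}
console.log(secondDecodeFailed); // true
```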

Handling Special Scenarios on Mobile

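Encoding issues in WeChat's in-app browser cannot be fixed by editing the meta tag from a script, because the encoding is decided before any script runs. The reliable fix is to serve UTF-8 with a matching HTTP header and <meta charset>; a script can only detect the problem:

<!-- Detect a wrong encoding in WeChat's in-app browser -->
<script>
  if (/MicroMessenger/i.test(navigator.userAgent) && document.characterSet !== 'UTF-8') {
    // Fix the HTTP header or <meta charset>; the encoding cannot be changed here.
    console.warn('Page decoded as ' + document.characterSet + ', expected UTF-8');
  }
</script>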
