Data Transformation and Preprocessing
Raw data can rarely be passed straight to a chart. Before ECharts can render it, the data usually has to be cleaned, converted into a format the library accepts, and often aggregated. ECharts helps with part of this through its dataset component, but much of the work happens in plain JavaScript before the data reaches the chart, and each step affects the final presentation.
Data Format Standardization
ECharts accepts several data formats, including plain value arrays, arrays of [x, y] pairs, and arrays of key-value objects passed through dataset. Raw CSV or JSON data usually needs to be converted first:
// Raw data
const rawData = [
  { year: '2020', sales: 1250 },
  { year: '2021', sales: 1870 },
  { year: '2022', sales: 2100 }
];
// Convert to an ECharts-compatible option fragment
const chartData = {
  xAxis: { type: 'category', data: rawData.map(item => item.year) },
  yAxis: { type: 'value' },
  series: [{
    type: 'bar',
    data: rawData.map(item => item.sales)
  }]
};
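As an alternative to splitting the records into separate axis and series arrays, the records can be kept together and handed to ECharts through dataset, whose source accepts a two-dimensional array with the dimension names in the first row. The following is a minimal sketch under that assumption; toDatasetSource and salesRecords are hypothetical names used for illustration, not part of ECharts.

```javascript
// Hypothetical helper: turn an array of records into a 2D array
// whose first row holds the dimension names.
function toDatasetSource(records, dimensions) {
  const rows = records.map(record => dimensions.map(dim => record[dim]));
  return [dimensions, ...rows];
}

const salesRecords = [
  { year: '2020', sales: 1250 },
  { year: '2021', sales: 1870 },
  { year: '2022', sales: 2100 }
];

// Option fragment: the series reads its values from the dataset
const datasetOption = {
  dataset: { source: toDatasetSource(salesRecords, ['year', 'sales']) },
  xAxis: { type: 'category' },
  yAxis: {},
  series: [{ type: 'bar' }]
};
```

Keeping the data in one table makes it easier to add a second series later without rebuilding parallel arrays.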
Time data needs a consistent format. Native Date objects or a date library such as moment.js can parse the strings:
const timeData = [
  { date: '2023-01', value: 42 },
  { date: '2023-02', value: 78 }
];
// Convert to timestamp format (date-only ISO strings are parsed as UTC)
const processedData = timeData.map(item => ({
  ...item,
  date: new Date(item.date + '-01').getTime()
}));
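String concatenation silently produces an invalid date when the input does not match the expected pattern. A slightly more defensive variant could validate the string first and build the timestamp with Date.UTC. This is only a sketch; monthToTimestamp and monthlyPoints are hypothetical names, and the 'YYYY-MM' input format is an assumption carried over from the example above.

```javascript
// Hypothetical helper: parse a 'YYYY-MM' string into a UTC timestamp (ms).
function monthToTimestamp(monthString) {
  const match = /^(\d{4})-(\d{2})$/.exec(monthString);
  if (!match) {
    throw new Error(`Unexpected month format: ${monthString}`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  if (month < 1 || month > 12) {
    throw new Error(`Month out of range: ${monthString}`);
  }
  // Date.UTC takes a zero-based month index
  return Date.UTC(year, month - 1, 1);
}

// [timestamp, value] pairs, suitable for a series on a time axis
const monthlyPoints = [
  { date: '2023-01', value: 42 },
  { date: '2023-02', value: 78 }
].map(item => [monthToTimestamp(item.date), item.value]);
```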
Data Cleaning and Filtering
Outlier handling is a critical aspect of data preprocessing. Filter unreasonable data by setting thresholds:
const dirtyData = [12, 45, 999, 32, -5, 28];
// Filter values outside the 0-100 range
const cleanData = dirtyData.filter(
  value => value >= 0 && value <= 100
);
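Fixed thresholds work when the valid range is known in advance. When it is not, a statistical rule is one possible alternative. The sketch below uses the common interquartile range (IQR) rule with a multiplier of 1.5; this is a different technique from the threshold filter above, and quantile, removeOutliersIqr, and readings are hypothetical names.

```javascript
// Quantile with linear interpolation between the two nearest ranks.
function quantile(sortedValues, q) {
  const position = (sortedValues.length - 1) * q;
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  const weight = position - lower;
  return sortedValues[lower] * (1 - weight) + sortedValues[upper] * weight;
}

// Hypothetical helper: drop values outside [Q1 - k*IQR, Q3 + k*IQR].
function removeOutliersIqr(values, k = 1.5) {
  if (values.length === 0) return [];
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const low = q1 - k * iqr;
  const high = q3 + k * iqr;
  return values.filter(v => v >= low && v <= high);
}

const readings = [12, 45, 999, 32, -5, 28];
const withoutOutliers = removeOutliersIqr(readings);
```

For this sample the rule removes 999 but keeps -5, because -5 is not statistically unusual. If negative values are impossible in the domain, a threshold check is still needed alongside the statistical one.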
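Missing values can be filled by interpolation. The example below replaces an isolated missing y with the average of its two neighbors, which equals linear interpolation when the x values are evenly spaced:
const incompleteData = [
  { x: 1, y: 10 },
  { x: 2, y: null },
  { x: 3, y: 30 }
];
// Fill isolated missing values with the average of both neighbors
for (let i = 1; i < incompleteData.length - 1; i++) {
  const prev = incompleteData[i - 1].y;
  const next = incompleteData[i + 1].y;
  // Skip the point if either neighbor is missing as well
  if (incompleteData[i].y === null && prev !== null && next !== null) {
    incompleteData[i].y = (prev + next) / 2;
  }
}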
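Averaging the two neighbors does not cover runs of several missing values or unevenly spaced x values. A more general sketch interpolates between the nearest known points on each side, weighted by x. It assumes the points are sorted by strictly increasing x and leaves leading or trailing gaps untouched; interpolateMissing and gappy are hypothetical names.

```javascript
// Hypothetical helper: fill null/undefined y values by linear interpolation
// between the nearest known points. Returns a new array.
function interpolateMissing(points) {
  const result = points.map(p => ({ ...p }));
  // Indices of points that already have a y value
  const known = [];
  result.forEach((p, i) => {
    if (p.y !== null && p.y !== undefined) known.push(i);
  });
  for (let k = 0; k < known.length - 1; k++) {
    const a = result[known[k]];
    const b = result[known[k + 1]];
    for (let i = known[k] + 1; i < known[k + 1]; i++) {
      // Position of x between a.x and b.x, from 0 to 1
      const t = (result[i].x - a.x) / (b.x - a.x);
      result[i].y = a.y + t * (b.y - a.y);
    }
  }
  return result;
}

const gappy = [
  { x: 0, y: null },
  { x: 1, y: 10 },
  { x: 2, y: null },
  { x: 4, y: null },
  { x: 5, y: 50 }
];
const filledPoints = interpolateMissing(gappy);
```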
Data Aggregation and Grouping
Large datasets often need to be aggregated before plotting. A typical example is rolling daily records up into monthly totals:
const dailyData = [
  { date: '2023-01-01', category: 'A', value: 10 },
  { date: '2023-01-01', category: 'B', value: 20 },
  // ...more data
];
// Aggregate by month
const monthlyData = dailyData.reduce((acc, curr) => {
  const month = curr.date.substring(0, 7); // 'YYYY-MM'
  if (!acc[month]) acc[month] = 0;
  acc[month] += curr.value;
  return acc;
}, {});
// Convert to [month, total] pairs for ECharts
const seriesData = Object.entries(monthlyData);
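The monthly totals above merge all categories into one number. If each category should become its own series, the aggregation needs a second grouping key. The sketch below is one way to do that; toMonthlySeries and dailyRecords are hypothetical names, and the line series type is an arbitrary choice for illustration.

```javascript
// Hypothetical helper: sum values per month and category, then emit
// one series per category with [month, total] pairs sorted by month.
function toMonthlySeries(records) {
  const totals = new Map(); // category -> Map(month -> total)
  for (const { date, category, value } of records) {
    const month = date.slice(0, 7); // 'YYYY-MM'
    if (!totals.has(category)) totals.set(category, new Map());
    const byMonth = totals.get(category);
    byMonth.set(month, (byMonth.get(month) || 0) + value);
  }
  return [...totals].map(([category, byMonth]) => ({
    name: category,
    type: 'line',
    data: [...byMonth].sort(([a], [b]) => a.localeCompare(b))
  }));
}

const dailyRecords = [
  { date: '2023-02-01', category: 'A', value: 7 },
  { date: '2023-01-01', category: 'A', value: 10 },
  { date: '2023-01-02', category: 'A', value: 5 },
  { date: '2023-01-01', category: 'B', value: 20 }
];
const monthlySeries = toMonthlySeries(dailyRecords);
```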
Categorical data can be grouped and summarized in the same way:
const products = [
  { category: 'Electronics', price: 999 },
  { category: 'Clothing', price: 199 },
  // ...more products
];
// Group by category and accumulate the sum and count of prices
const categoryStats = products.reduce((acc, product) => {
  if (!acc[product.category]) {
    acc[product.category] = { sum: 0, count: 0 };
  }
  acc[product.category].sum += product.price;
  acc[product.category].count++;
  return acc;
}, {});
// Calculate averages
const result = Object.entries(categoryStats).map(
  ([category, stats]) => ({
    category,
    avgPrice: stats.sum / stats.count
  })
);
Data Mapping and Transformation
Mapping raw values to visual properties is a common requirement. Color mapping example:
const temperatureData = [12, 18, 25, 30, 15];
// Temperature-to-color mapping function
function tempToColor(temp) {
  if (temp < 15) return '#3498db'; // Cold
  if (temp < 25) return '#2ecc71'; // Comfortable
  return '#e74c3c'; // Hot
}
const coloredData = temperatureData.map(temp => ({
  value: temp,
  itemStyle: { color: tempToColor(temp) }
}));
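The same thresholds can probably also be expressed declaratively with a piecewise visualMap, which leaves the series data as plain numbers and lets ECharts assign the colors. The following option builder is a sketch under that assumption; buildTemperatureOption and the 'Day N' category labels are hypothetical, and the exact visualMap behavior should be checked against the ECharts documentation for the version in use.

```javascript
// Hypothetical helper: build an option in which a piecewise visualMap,
// rather than per-item itemStyle, assigns the colors.
function buildTemperatureOption(temperatures) {
  return {
    xAxis: {
      type: 'category',
      data: temperatures.map((_, i) => `Day ${i + 1}`)
    },
    yAxis: { type: 'value' },
    visualMap: {
      type: 'piecewise',
      seriesIndex: 0,
      pieces: [
        { lt: 15, color: '#3498db', label: 'Cold' },
        { gte: 15, lt: 25, color: '#2ecc71', label: 'Comfortable' },
        { gte: 25, color: '#e74c3c', label: 'Hot' }
      ]
    },
    series: [{ type: 'bar', data: temperatures }]
  };
}

const temperatureOption = buildTemperatureOption([12, 18, 25, 30, 15]);
```

The mapping function approach keeps full control in application code, while the visualMap approach also gives the chart a legend for the color bands.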
Value range normalization:
const rawValues = [50, 120, 80, 200];
// Min-max normalize to the 0-1 range
const min = Math.min(...rawValues);
const max = Math.max(...rawValues);
const span = max - min || 1; // avoid division by zero when all values are equal
const normalized = rawValues.map(v => (v - min) / span);
// Map to a specific range (e.g., 50-200 pixels)
const range = [50, 200];
const finalValues = normalized.map(
  v => range[0] + v * (range[1] - range[0])
);
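When the same mapping is needed in several places, for example to derive symbol sizes from values, the two steps can be folded into one reusable function. The sketch below is one way to write such a linear scale; createLinearScale, toSymbolSize, and the 10-40 output range are hypothetical choices for illustration.

```javascript
// Hypothetical helper: build a function that maps the domain [d0, d1]
// linearly onto the range [r0, r1], optionally clamping the input.
function createLinearScale([d0, d1], [r0, r1], clamp = true) {
  const span = d1 - d0;
  return value => {
    if (span === 0) return (r0 + r1) / 2; // degenerate domain: use the midpoint
    let t = (value - d0) / span;
    if (clamp) t = Math.min(1, Math.max(0, t));
    return r0 + t * (r1 - r0);
  };
}

const values = [50, 120, 80, 200];
const toSymbolSize = createLinearScale(
  [Math.min(...values), Math.max(...values)],
  [10, 40]
);
const symbolSizes = values.map(toSymbolSize);
```

With clamping enabled, values outside the domain map to the nearest end of the range instead of producing negative or oversized results.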
Time Series Processing
Time-based data often needs additional reshaping. The example below turns weekday records into [index, value, label] tuples that can be plotted against an indexed axis:
const weekData = [
  { day: 'Mon', value: 12 },
  { day: 'Tue', value: 19 },
  // ...complete week data
];
// Convert each record to an [index, value, label] tuple
const calendarData = weekData.map((item, index) => [
  index,      // x-axis coordinate
  item.value, // y-axis value
  item.day    // Display label
]);
Handling discontinuous time series:
const sparseTimeData = [
  { time: '2023-01', value: 10 },
  { time: '2023-03', value: 20 },
  { time: '2023-06', value: 15 }
];
// Fill missing months with 0 (use null instead if the gap should stay visible)
const allMonths = ['01','02','03','04','05','06'].map(m => `2023-${m}`);
const completeData = allMonths.map(month => {
  const existing = sparseTimeData.find(d => d.time === month);
  return existing || { time: month, value: 0 };
});
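A hard-coded month list only works for a known, fixed period. The sketch below generates the month range from a start and end month, including ranges that cross a year boundary, and fills gaps with a configurable placeholder. monthRange, fillMonths, and the sample data are hypothetical, and months are assumed to use the 'YYYY-MM' format.

```javascript
// Hypothetical helper: list every 'YYYY-MM' from start to end, inclusive.
function monthRange(start, end) {
  const [startYear, startMonth] = start.split('-').map(Number);
  const [endYear, endMonth] = end.split('-').map(Number);
  const months = [];
  let year = startYear;
  let month = startMonth;
  while (year < endYear || (year === endYear && month <= endMonth)) {
    months.push(`${year}-${String(month).padStart(2, '0')}`);
    month += 1;
    if (month > 12) {
      month = 1;
      year += 1;
    }
  }
  return months;
}

// Fill gaps with a caller-supplied placeholder (null by default).
function fillMonths(points, start, end, fillValue = null) {
  const byMonth = new Map(points.map(p => [p.time, p.value]));
  return monthRange(start, end).map(time => ({
    time,
    value: byMonth.has(time) ? byMonth.get(time) : fillValue
  }));
}

const sparse = [
  { time: '2022-11', value: 10 },
  { time: '2023-02', value: 20 }
];
const filled = fillMonths(sparse, '2022-11', '2023-02');
```

Using a Map for the lookup also avoids rescanning the whole array for every month, which matters once the series grows.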
Multidimensional Data Pivoting
Multidimensional data has to be mapped onto the few visual channels a chart offers. With dataset, the records are passed as they are, dimensions names the fields, and encode maps those fields to the axes and item names:
const multiDimData = [
  { product: 'Phone', region: 'East', sales: 1200 },
  { product: 'Computer', region: 'North', sales: 800 },
  // ...more data
];
const option = {
  dataset: {
    source: multiDimData,
    dimensions: ['product', 'region', 'sales']
  },
  series: {
    type: 'bar',
    encode: {
      x: 'product',
      y: 'sales',
      itemName: 'region'
    }
  }
};
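When one dimension should become separate series, for example one bar per region for each product, the long records can be pivoted into a wide table first. The sketch below shows one possible pivot; pivotToTable and the sample records are hypothetical, duplicate product and region pairs are summed, and missing combinations become null.

```javascript
// Hypothetical helper: pivot long records into a wide 2D table.
// Rows are distinct rowKey values, columns are distinct colKey values.
function pivotToTable(records, rowKey, colKey, valueKey) {
  const rowNames = [...new Set(records.map(r => r[rowKey]))];
  const colNames = [...new Set(records.map(r => r[colKey]))];
  const cells = new Map(); // "row\u0000col" -> summed value
  for (const r of records) {
    const key = `${r[rowKey]}\u0000${r[colKey]}`;
    cells.set(key, (cells.get(key) || 0) + r[valueKey]);
  }
  const header = [rowKey, ...colNames];
  const rows = rowNames.map(row => [
    row,
    ...colNames.map(col => cells.get(`${row}\u0000${col}`) ?? null)
  ]);
  return [header, ...rows];
}

const salesByRegion = [
  { product: 'Phone', region: 'East', sales: 1200 },
  { product: 'Phone', region: 'North', sales: 900 },
  { product: 'Computer', region: 'North', sales: 800 }
];
const table = pivotToTable(salesByRegion, 'product', 'region', 'sales');

const pivotOption = {
  dataset: { source: table },
  xAxis: { type: 'category' },
  yAxis: {},
  // One bar series per region column
  series: table[0].slice(1).map(() => ({ type: 'bar' }))
};
```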
Performance Optimization
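Rendering very large datasets is slow, and most of the points cannot be distinguished on screen anyway, so it pays to downsample before plotting. For line series, ECharts also offers a built-in sampling option (for example sampling: 'lttb'), which is worth trying before writing custom code. A simple equidistant sampling algorithm:
const largeData = [...Array(10000)].map((_, i) => ({
  x: i,
  y: Math.sin(i / 100)
}));
// Equidistant sampling to retain at most 100 points
const sampleSize = 100;
// ceil keeps the result within sampleSize; max(1, ...) prevents a zero step
const step = Math.max(1, Math.ceil(largeData.length / sampleSize));
const sampledData = [];
for (let i = 0; i < largeData.length; i += step) {
  sampledData.push(largeData[i]);
}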
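Equidistant sampling can skip over short spikes that fall between the sampled indices. One alternative, sketched below, splits the data into buckets and keeps the lowest and highest point of each bucket. This is a different technique from the equidistant approach above; downsampleMinMax and the artificial spike in the test data are hypothetical.

```javascript
// Hypothetical helper: keep the minimum and maximum point of each bucket
// so that spikes survive downsampling. Returns at most 2 * bucketCount points.
function downsampleMinMax(points, bucketCount) {
  if (bucketCount <= 0 || points.length <= bucketCount * 2) {
    return points.slice();
  }
  const bucketSize = points.length / bucketCount;
  const sampled = [];
  for (let b = 0; b < bucketCount; b++) {
    const start = Math.floor(b * bucketSize);
    // The last bucket always runs to the end of the data
    const end = b === bucketCount - 1
      ? points.length
      : Math.floor((b + 1) * bucketSize);
    let minIndex = start;
    let maxIndex = start;
    for (let i = start + 1; i < end; i++) {
      if (points[i].y < points[minIndex].y) minIndex = i;
      if (points[i].y > points[maxIndex].y) maxIndex = i;
    }
    // Emit in original order; a flat bucket contributes a single point
    const first = Math.min(minIndex, maxIndex);
    const last = Math.max(minIndex, maxIndex);
    sampled.push(points[first]);
    if (last !== first) sampled.push(points[last]);
  }
  return sampled;
}

const wave = Array.from({ length: 10000 }, (_, i) => ({
  x: i,
  y: i === 5050 ? 10 : Math.sin(i / 100) // one artificial spike
}));
const reduced = downsampleMinMax(wave, 100);
```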
Incremental data update strategy:
const MAX_POINTS = 1000;
let allData = [...initialData]; // initialData: the points loaded at startup
// When new data arrives
function handleNewData(newPoints) {
  // Append, then keep only the most recent 1000 points
  allData = allData.concat(newPoints).slice(-MAX_POINTS);
  // Update chart
  myChart.setOption({
    series: [{ data: allData }]
  });
}
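The trimming logic can be separated from the chart update so that it is easy to test on its own. The sketch below wraps it in a small buffer object; createRollingBuffer and the sample numbers are hypothetical, and the chart call is left out so the block runs without ECharts.

```javascript
// Hypothetical helper: a buffer that never holds more than `limit` points.
function createRollingBuffer(limit, initialPoints = []) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error('limit must be a positive integer');
  }
  let points = initialPoints.slice(-limit);
  return {
    // Append new points, drop the oldest ones, and return the current window
    push(newPoints) {
      points = points.concat(newPoints).slice(-limit);
      return points;
    },
    get data() {
      return points;
    }
  };
}

const buffer = createRollingBuffer(5, [1, 2, 3, 4]);
buffer.push([5, 6, 7]); // now holds 3, 4, 5, 6, 7
```

In a data handler, the value returned by push can be passed directly as the series data in a setOption call.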
Interactive Data Processing
User input can drive filtering at runtime. The example below re-filters the data whenever a slider moves:
const fullData = [
  { name: 'Beijing', value: 123 },
  { name: 'Shanghai', value: 156 },
  // ...more city data
];
function filterData(minValue) {
  return fullData.filter(item => item.value >= minValue);
}
// Slider interaction
document.getElementById('rangeSlider').addEventListener('input', (e) => {
  // The slider value arrives as a string, so convert it to a number first
  const filtered = filterData(Number(e.target.value));
  myChart.setOption({
    series: [{ data: filtered }]
  });
});
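In ECharts 5 and later, the filtering itself can likely be delegated to the library through a dataset transform of type 'filter', so that only the threshold changes between updates. The option builder below is a sketch under that assumption; buildFilteredOption and the sample cities are hypothetical, and the transform configuration should be verified against the ECharts documentation for the version in use.

```javascript
// Hypothetical helper: build an option whose second dataset is a filtered
// view of the first, using the built-in 'filter' transform.
function buildFilteredOption(cities, minValue) {
  return {
    dataset: [
      { source: cities },
      {
        // Without an explicit source index, the transform reads dataset 0
        transform: {
          type: 'filter',
          config: { dimension: 'value', '>=': minValue }
        }
      }
    ],
    xAxis: { type: 'category' },
    yAxis: {},
    series: [{
      type: 'bar',
      datasetIndex: 1, // read from the filtered dataset
      encode: { x: 'name', y: 'value' }
    }]
  };
}

const cityOption = buildFilteredOption(
  [
    { name: 'Beijing', value: 123 },
    { name: 'Shanghai', value: 156 }
  ],
  150
);
```

The slider handler would then call setOption with the result of buildFilteredOption for the new threshold.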
Geographic Data Transformation
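GeoJSON data often needs preprocessing before it is drawn, for example computing the overall bounding box of a set of features. This example uses bbox from Turf.js:
// Extract coordinate boundaries from GeoJSON (requires Turf.js, loaded as `turf`)
function getBounds(features) {
  return features.reduce((bounds, feature) => {
    const [minX, minY, maxX, maxY] = turf.bbox(feature);
    bounds.minX = Math.min(bounds.minX, minX);
    bounds.minY = Math.min(bounds.minY, minY);
    bounds.maxX = Math.max(bounds.maxX, maxX);
    bounds.maxY = Math.max(bounds.maxY, maxY);
    return bounds;
  // Start from infinite bounds so that a legitimate 0 coordinate is not discarded
  }, { minX: Infinity, minY: Infinity, maxX: -Infinity, maxY: -Infinity });
}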
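If pulling in Turf.js only for the bounding box is not desirable, the bounds can also be computed by walking the coordinate arrays directly. The sketch below handles any geometry that has a coordinates member (Point, LineString, Polygon, and their Multi variants) but not GeometryCollection; extendBounds, getBoundsPlain, and the sample features are hypothetical.

```javascript
// Hypothetical helper: walk arbitrarily nested GeoJSON coordinate arrays.
function extendBounds(bounds, coords) {
  if (typeof coords[0] === 'number') {
    // A single position: [x, y] (any further elements are ignored)
    const [x, y] = coords;
    bounds.minX = Math.min(bounds.minX, x);
    bounds.minY = Math.min(bounds.minY, y);
    bounds.maxX = Math.max(bounds.maxX, x);
    bounds.maxY = Math.max(bounds.maxY, y);
    return bounds;
  }
  for (const child of coords) extendBounds(bounds, child);
  return bounds;
}

function getBoundsPlain(features) {
  const bounds = { minX: Infinity, minY: Infinity, maxX: -Infinity, maxY: -Infinity };
  for (const feature of features) {
    if (feature.geometry && feature.geometry.coordinates) {
      extendBounds(bounds, feature.geometry.coordinates);
    }
  }
  return bounds;
}

const sampleFeatures = [
  { type: 'Feature', geometry: { type: 'Point', coordinates: [0, 0] } },
  {
    type: 'Feature',
    geometry: {
      type: 'Polygon',
      coordinates: [[[10, 20], [30, 20], [30, 40], [10, 40], [10, 20]]]
    }
  }
];
const plainBounds = getBoundsPlain(sampleFeatures);
```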
Coordinate transformation example:
// WGS84 to Web Mercator
// Valid for latitudes within about ±85.05 degrees; the projection diverges at the poles
function wgs84ToMercator(lng, lat) {
  const x = lng * 20037508.34 / 180;
  const y = Math.log(Math.tan((90 + lat) * Math.PI / 360)) / (Math.PI / 180);
  return [x, y * 20037508.34 / 180];
}
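Interactions such as mapping a click position back to a longitude and latitude need the inverse transform. The sketch below pairs a clamped version of the forward function with its inverse; mercatorToWgs84, the constant names, and the sample coordinates are hypothetical, and the latitude limit is the conventional Web Mercator cutoff.

```javascript
const HALF_CIRCUMFERENCE = 20037508.34; // same constant as the forward transform
const MAX_LATITUDE = 85.05112878;       // conventional Web Mercator latitude limit

// Forward transform, with latitude clamped to the valid range
function wgs84ToMercator(lng, lat) {
  const clampedLat = Math.max(-MAX_LATITUDE, Math.min(MAX_LATITUDE, lat));
  const x = lng * HALF_CIRCUMFERENCE / 180;
  const y = Math.log(Math.tan((90 + clampedLat) * Math.PI / 360)) / (Math.PI / 180);
  return [x, y * HALF_CIRCUMFERENCE / 180];
}

// Inverse transform: Web Mercator meters back to degrees
function mercatorToWgs84(x, y) {
  const lng = x / HALF_CIRCUMFERENCE * 180;
  const yDegrees = y / HALF_CIRCUMFERENCE * 180;
  const lat = 180 / Math.PI *
    (2 * Math.atan(Math.exp(yDegrees * Math.PI / 180)) - Math.PI / 2);
  return [lng, lat];
}

const [mx, my] = wgs84ToMercator(116.4, 39.9);
const [lngBack, latBack] = mercatorToWgs84(mx, my);
```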