"王伟,你的报表数据怎么对不上啊?" 😅 上周我就遇到了这样的尴尬——公司客户表里有23个"张伟",18个"李娜",还有一堆同名不同人的记录,更糟的是,系统把这些同名的客户订单全混在一起了!这就是典型的数据库重名问题,今天我们就来彻底解决它!
重复数据通常分为两种:
-- 示例:用户表中可能的重名记录 SELECT * FROM users WHERE name = '张伟';
-- 基本用法:获取不重复的姓名列表 SELECT DISTINCT name FROM customers; -- 多字段去重 SELECT DISTINCT name, phone FROM customers;
注意:DISTINCT会导致全表扫描,大表慎用!
-- 获取每种姓名的第一条记录 SELECT * FROM customers GROUP BY name; -- 带聚合函数的实用案例 SELECT name, COUNT(*) as dup_count FROM customers GROUP BY name HAVING dup_count > 1; -- 找出所有重名
-- 步骤1:创建临时表存储去重结果 CREATE TABLE temp_customers LIKE customers; -- 步骤2:插入去重数据 INSERT INTO temp_customers SELECT * FROM customers GROUP BY name, phone; -- 按业务规则确定去重字段 -- 步骤3:替换原表 RENAME TABLE customers TO customers_backup, temp_customers TO customers;
-- 使用ROW_NUMBER()标记重复项 SELECT *, ROW_NUMBER() OVER(PARTITION BY name, phone ORDER BY id) AS row_num FROM customers; -- 实际删除操作(CTE语法) WITH duplicates AS ( SELECT id, ROW_NUMBER() OVER(PARTITION BY name, phone ORDER BY id) AS rn FROM customers ) DELETE FROM customers WHERE id IN (SELECT id FROM duplicates WHERE rn > 1);
-- 基本唯一索引 ALTER TABLE customers ADD UNIQUE INDEX idx_unique_name_phone (name, phone); -- 遇到重复时的处理方式 INSERT IGNORE INTO customers(name, phone) VALUES ('张伟', '13800138000'); -- 静默跳过 -- 或者使用ON DUPLICATE KEY UPDATE INSERT INTO customers(name, phone) VALUES ('张伟', '13800138000') ON DUPLICATE KEY UPDATE last_update = NOW();
-- 使用SOUNDEX函数找发音相似的姓名 SELECT a.name, b.name FROM customers a JOIN customers b ON SOUNDEX(a.name) = SOUNDEX(b.name) WHERE a.id != b.id; -- 结合LIKE的模糊匹配 SELECT * FROM customers WHERE name LIKE '%张伟%' OR name LIKE '%张卫%';
-- 每次处理1000条 DELETE FROM customers WHERE id IN ( SELECT id FROM ( SELECT id, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id) AS rn FROM customers WHERE name IN ('张伟','李娜') ) t WHERE rn > 1 LIMIT 1000 );
-- 验证去重是否彻底 SELECT name, COUNT(*) FROM customers GROUP BY name HAVING COUNT(*) > 1; -- 检查数据完整性 SELECT COUNT(*) AS original_count FROM customers_backup; SELECT COUNT(*) AS deduplicated_count FROM customers;
DELIMITER // CREATE TRIGGER prevent_duplicate_customer BEFORE INSERT ON customers FOR EACH ROW BEGIN IF EXISTS (SELECT 1 FROM customers WHERE name = NEW.name AND phone = NEW.phone) THEN SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = '客户已存在!'; END IF; END // DELIMITER ;
通过以上方法,我们公司成功将"张伟"们梳理得清清楚楚,再也不会出现订单混淆的情况了!好的数据库就像整洁的房间——定期"大扫除",日常"随手整理",才能保持最佳状态,现在就去检查你的数据库吧,说不定也有隐藏的"重名危机"等着你去解决呢!
(本文方法基于MySQL 8.0版本验证,更新于2025年7月)
本文由 计从蓉 于2025-07-30发表在【云服务器提供商】,文中图片由(计从蓉)上传,本平台仅提供信息存储服务;作者观点、意见不代表本站立场,如有侵权,请联系我们删除;若有图片侵权,请您准备原始证明材料和公证书后联系我方删除!
本文链接:https://vps.7tqx.com/wenda/485137.html
发表评论