当前位置：首页 > news >正文

【删除重复记录】SQL删除重复记录

news 来源：原创 2024/5/17 2:16:09

1.给name字段修改成唯一索引：

drop index idx_name on test; alter table test add unique index (name);

这样当向表中添加相同记录的时候，会返回1062的添加失败信息。
但是有一种情况是表中已经有n个重复的记录，这时候我们才想起来要添加唯一索引，再执行上面的操作时，数据库会告诉你已经有重复的记录了，建立索引失败，这时候，我们可以用下面的操作：

alter ignore table test add unique idx_name (name);

它会删除重复的记录(别怕，会保留一条)(但是这个命令在MySQL5.1.37之前是可以的，在5.1.48以后就行不通了)，然后建立唯一索引，高效而且人性化。

另外需要注意的是alter ignore table 在percona版本的MySQL行不通，因为它创建索引的方式是：fast index creation

3.删除重复记录法：

创建一个表用来存放，要删除的记录的id信息:

CREATE TABLE `tmp_ids` (
`id` int ( 11 ),
`name` char( 20)
) ENGINE = MyISAM;

如果要删除的记录不多的话，可以把这个表创建成内存表形式：

CREATE TABLE `tmp_ids` (
`id` int ( 11 ),
`name` char( 20)
) ENGINE = HEAP;

然后在test表中删除重复记录：

insert into tmp_ids select min (id),name from test group by name having count ( * ) > 1 order by null; delete a.* from test a,tmp_ids b where b.name=a.name and a.id>b.id ; truncate table tmp_ids;

摘抄自：https://www.cnblogs.com/sunss/archive/2011/01/29/1947469.html

select * FROM mx_maint_order_list WHERE lIndex NOT IN ( SELECT temp.mid FROM ( SELECT min(lIndex) as mid FROM mx_maint_order_list GROUP BY lMaintOrderID,lWindFieldID) AS temp);

-- 检查重复code1
select count(identity) num, identity from event_log 
where code='code1' 
group by identity having count(identity) > 1
order by num desc

删除重复记录

DELETE FROM event_log WHERE `code`='code1' AND identity IN (
    SELECT identity from (
        SELECT identity FROM event_log WHERE code='code1' GROUP BY identity HAVING count(identity) > 1
    ) a
) AND id NOT IN (
    SELECT keepId FROM (
        SELECT min(id) keepId FROM event_log WHERE code='code1' GROUP BY identity HAVING count(identity) > 1
    ) b
)

其中 a 和 b 两个中间表的作用是, 避免执行时出现 You can't specify target table 'xxxxx' for update in FROM clause 错误

分组按时间正序取第一条记录, 巧妙地使用了not exists

select d.* from t_charge d where not exists (select 1 from t_charge where user_id = d.user_id and created_at <</span> d.created_at)

按时间倒序则是

select f.* from t_charge f where not exists (select 1 from t_charge where user_id = f.user_id and created_at > f.created_at)

转自：https://www.cnblogs.com/milton/p/6354229.html

在生产环境中，我们有的列是不允许出现重复值的，亦或是某两列不允许同时重复，但由于前端未做限制，或者没限制住，出现了单列重复值，或者两列本应组成唯一组合却也出现重复，这两种情况都是不允许的。现在由于前端应用限制不住，要做删除操作后，添加唯一索引，从数据库层面进行限制，以下是处理过程：

mysql> select * from aixuan1;
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 1 | aa   | 11    |
| 2 | bb   | 22    |
| 3 | cc   | 33    |
| 4 | cc   | 44    |
| 5 | bb   | 22    |
| 6 | aa   | 11    |
| 7 | dd   | 55    |
+----+------+-------+
7 rows in set (0.00 sec)

text字段全部重复的有：

mysql> select * from aixuan1 where text in (select text from aixuan1 GROUP BY text having count(*) > 1);
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 1 | aa   | 11    |
| 2 | bb   | 22    |
| 3 | cc   | 33    |
| 4 | cc   | 44    |
| 5 | bb   | 22    |
| 6 | aa   | 11    |
+----+------+-------+
6 rows in set (0.00 sec)

筛选出text单列重复值

select * from aixuan1 where `text` in (select `text` from aixuan1 GROUP BY `text` having count(*) > 1) and id not in (select min(id) from aixuan1 group by text having count(*)>1)
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 4 | cc   | 44    |
| 5 | bb   | 22    |
| 6 | aa   | 11    |
+----+------+-------+
3 rows in set (0.00 sec)

还可以这么查

mysql> select * FROM aixuan1 WHERE id NOT IN ( SELECT temp.mid FROM ( SELECT min(id) as mid FROM aixuan1 em GROUP BY em.text) AS temp);
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 4 | cc   | 44    |
| 5 | bb   | 22    |
| 6 | aa   | 11    |
+----+------+-------+
3 rows in set (0.00 sec)

筛选出text和text1同时重复的字段：

mysql> select * FROM aixuan1 WHERE id NOT IN ( SELECT temp.mid FROM ( SELECT min(id) as mid FROM aixuan1 em GROUP BY em.text,em.text1) AS temp);
+----+------+-------+
| id | text | text1 |
+----+------+-------+
| 5 | bb | 22 |
| 6 | aa | 11 |
+----+------+-------+
2 rows in set (0.00 sec)

查出来了，删就好办了，把select换成delete就Ok了，具体说保留大的id还是保留小的id那条，只要子查询的id函数用min(id)或者max(id)即可

本文出自 “岁伏” 博客，请务必保留此出处http://suifu.blog.51cto.com/9167728/1796055