在sql中我們經常會碰到有重復的一些數據,下面我來介紹在mysql中刪除重復記錄的多種方法,有需要的朋友可參考參考.
刪除重復記錄方法一:
1.新建一個臨時表,代碼如下:
create table tmp as select * from youtable group by name(name為不希望有重復的列)
2.刪除原來的表,代碼如下:drop table youtable
3.重命名表,代碼如下:
alter table tmp rename youtable
但是這個方法有個問題,由臨時表轉變過來的最終表,其表結構會和原來的不一致,需要手工更改。這個問題,待解決。
刪除重復記錄方法二:
1.新建一個臨時表,代碼如下:
CREATE TABLE tmp AS SELECT * FROM youtable GROUP BY name(name為不希望有重復的列)
2.清空原來的表,代碼如下:
TRUNCATE TABLE youtable
3.把臨時表插入到youtable,代碼如下:
INSERT INTO tablename SELECT * FROM temp
4.刪除臨時表,代碼如下:
DROP TABLE temp
刪除重復記錄方法三:代碼如下:
delete table where ID not in(select min(ID) from table group by name(name:重復的字段))
刪除重復記錄方法四:
具體實現如下:
- Table Create Table
- ------------ --------------------------------------------------------
- users_groups CREATE TABLE `users_groups` (
- `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
- `uid` int(11) NOT NULL,
- `gid` int(11) NOT NULL,
- PRIMARY KEY (`id`)
- ) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8
users_groups.txt內容:
- 1,11,502
- 2,107,502
- 3,100,503
- 4,110,501
- 5,112,501
- 6,104,502
- 7,100,502
- 8,100,501
- 9,102,501
- 10,104,502
- 11,100,502
- 12,100,501
- 13,102,501
- 14,110,501
- mysql> load data infile 'c:/users_groups.txt' into table users_groups fields
- terminated by ',' lines terminated by 'n';
- Query OK, 14 rows affected (0.05 sec)
- Records: 14 Deleted: 0 Skipped: 0 Warnings: 0
- mysql> select * from users_groups;
- query result(14 records)
- id uid gid
- 1 11 502
- 2 107 502
- 3 100 503
- 4 110 501
- 5 112 501
- 6 104 502
- 7 100 502
- 8 100 501
- 9 102 501
- 10 104 502
- 11 100 502
- 12 100 501
- 13 102 501
- 14 110 501
- 14 rows in set (0.00 sec)
根據一位兄弟的建議修改,代碼如下:
- mysql> create temporary table tmp_wrap select * from users_groups group by uid having count(1) >= 1;
- Query OK, 7 rows affected (0.11 sec)
- Records: 7 Duplicates: 0 Warnings: 0
- mysql> truncate table users_groups;
- Query OK, 14 rows affected (0.03 sec)
- mysql> insert into users_groups select * from tmp_wrap;
- Query OK, 7 rows affected (0.03 sec)
- Records: 7 Duplicates: 0 Warnings: 0
- mysql> select * from users_groups;
- --Vevb.com
- query result(7 records)
- id uid gid
- 1 11 502
- 2 107 502
- 3 100 503
- 4 110 501
- 5 112 501
- 6 104 502
- 9 102 501
- mysql> drop table tmp_wrap;
- Query OK, 0 rows affected (0.05 sec)
2、還有一個很精簡的辦法.
查找重復的,并且除掉最小的那個,代碼如下:
- delete users_groups as a from users_groups as a,
- (
- select *,min(id) from users_groups group by uid having count(1) > 1
- ) as b
- where a.uid = b.uid and a.id > b.id;
- (7 row(s)affected)
- (0 ms taken)
- query result(7 records)
- id uid gid
- 1 11 502
- 2 107 502
- 3 100 503
- 4 110 501
- 5 112 501
- 6 104 502
- 9 102 501
3、現在來看一下這兩個辦法的效率,運行一下以下SQL 語句,代碼如下:
- create index f_uid on users_groups(uid);
- explain select * from users_groups group by uid having count(1) > 1 union all
- select * from users_groups group by uid having count(1) = 1;
- explain select * from users_groups as a,
- (
- select *,min(id) from users_groups group by uid having count(1) > 1
- ) as b
- where a.uid = b.uid and a.id > b.id;
- query result(3 records)
- id select_type table type possible_keys key key_len ref rows Extra
- 1 PRIMARY users_groups index (NULL) f_uid 4 (NULL) 14
- 2 UNION users_groups index (NULL) f_uid 4 (NULL) 14
- (NULL) UNION RESULT <union1,2> ALL (NULL) (NULL) (NULL) (NULL) (NULL)
- query result(3 records)
- id select_type table type possible_keys key key_len ref rows Extra
- 1 PRIMARY <derived2> ALL (NULL) (NULL) (NULL) (NULL) 4
- 1 PRIMARY a ref PRIMARY,f_uid f_uid 4 b.uid 1 Using where
- 2 DERIVED users_groups index (NULL) f_uid 4 (NULL) 14
很明顯的第二個比第一個掃描的函數要少,當沒有創建表或創建索引權限的時候,創建一個新表,然后將原表中不重復的數據插入新表,代碼如下:
- mysql> create table demo_new as select * from demo group by site;
- Query OK, 3 rows affected (0.19 sec)
- Records: 3 Duplicates: 0 Warnings: 0
- mysql> show tables;
- +----------------+
- | Tables_in_test |
- +----------------+
- | demo |
- | demo_new |
- +----------------+
- 2 rows in set (0.00 sec)
- mysql> select * from demo order by id;
- +----+------------------------+
- | id | site |
- +----+------------------------+
- | 1 | http://m.survivalescaperooms.com |
- | 2 | http://Vevb.com |
- | 3 | http://m.survivalescaperooms.com |
- | 4 | http://m.survivalescaperooms.com |
- | 5 | http://m.survivalescaperooms.com |
- +----+------------------------+
- 5 rows in set (0.00 sec)
- mysql> select * from demo_new order by id;
- +----+------------------------+
- | id | site |
- +----+------------------------+
- | 1 | http://m.survivalescaperooms.com |
- | 2 | http://Vevb.com |
- | 3 | http://m.survivalescaperooms.com |
- +----+------------------------+
- 3 rows in set (0.00 sec)
然后將原表備份,將新表重命名為當前表,代碼如下:
- mysql> rename table demo to demo_old, demo_new to demo;
- Query OK, 0 rows affected (0.04 sec)
- mysql> show tables;
- +----------------+
- | Tables_in_test |
- +----------------+
- | demo |
- | demo_old |
- +----------------+
- 2 rows in set (0.00 sec)
- mysql> select * from demo order by id;
- +----+------------------------+
- | id | site |
- +----+------------------------+
- | 1 | http://m.survivalescaperooms.com |
- | 2 | http://Vevb.com |
- | 3 | http://m.survivalescaperooms.com |
- +----+------------------------+
- 3 rows in set (0.00 sec)
新聞熱點
疑難解答