检索排名前三和删除重复
准备工作
创建表
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) DEFAULT NULL,
`score` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
插入数据
INSERT INTO test.test (name,score) VALUES
('zhangsan',100),
('zhangsan',100),
('lisi',99),
('whangwu',99),
('lisi',98),
('zhaoliu',97);
题目1: 查出来分数前三的同学
步骤一: 检索分数最高的前三个, 同时考虑到可能出线分数相同(名次并列)的情况
-- 方法1, 分数去重
select distinct score from test limit 3
-- 方法2, 分数分组
select score from test group by score order by score desc limit 3
步骤二: 检索分数等于最高的前三个的同学即是答案
直接执行sql: select * from test where score in (select distinct score from test limit 3)
会报错并提示: This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
即不支持在子查询中使用limit, 解决办法就是把子查询变成一张派生表: select * from (select distinct score from test limit 3) as t
然后再使用子查询(即多嵌套一层):
select * from test where score in
(select * from
(select distinct score from test limit 3) as t
)
题目2: 删除名称重复的数据
步骤一: 检索出名称重复的数据
直接执行sql: select name, count(name) as counts from test where count(name) > 1 group by name
会报错并提示: Invalid use of group function
原因是where后面的条件中不能使用聚合函数, 要使用having语法替代:
select name, count(name) as counts from test group by name having counts > 1
步骤二: 删除重复的数据
通过max()或min()函数选择保留ID最大或最小的数据: select min(id) as min_id from test group by name
剩余的即是需要删除的
直接执行sql: delete from test where id not in (select min(id) as min_id from test group by name)
会报错并提示: You can't specify target table 'test' for update in FROM clause
原因是, 不能先查出同一个表中的数据以后再进行更新(包括删除), 解决办法与题目1中的步骤2一样, 生成派生表即可, 然后通过派生表删除:
delete from test where id not in
(select min_id from
(select min(id) as min_id from test group by name) as t
)
其他解法:
-- 方法1
select a.id
from (
select ROW_NUMBER() over (partition by name order by id) as d
,name
,id
from test) as a
where a.d>1
-- 方法2
select t.id from test t
inner join
(select count(1),name,min(id) as id from test group by name having count(1)>1) as a
on t.id <> a.id and t.name =a.name
-- 上面两个都能直接查出来需要删除的数据, 把select改成delete即可, 且效率比嵌套子查询高
题目3: 删除名称和分数重复的数据
题目3解答思路同题目2, 但是所有字段重复才算重复, 所以select和group by要列出所有字段:
delete from test where id in
(select max_id from
(select max(id) as max_id, name, score, count(*) as counts from test group by name, score having counts > 1) as t
)
PS: count(name) count(score) count(*) 效果一样