基于SQL Server中如何比較兩個表的各組數據圖解說明

2024-08-31 00:44:05

字體：大中小

來源：轉載

供稿：網友

開始

前一陣子，在項目中碰到這樣一個SQL查詢需求，有兩個相同結構的表(table_left & table_right)，如下：

基于SQL Server中如何比較兩個表的各組數據圖解說明

圖1.

檢查表table_left的各組(groupId)，是否在表table_right中存在有一組(groupId)數據(data)與它的數據(data)完全相等.

如圖1. 可以看出表table_left和table_right存在兩組數據完整相等：

基于SQL Server中如何比較兩個表的各組數據圖解說明

圖2.

分析

從上面的兩個表，可以知道它們存放的是一組一組的數據；那么，接下來我借助數學集合的列舉法和運算進行分析。

先通過集合的列舉法描述兩個表的各組數據：

基于SQL Server中如何比較兩個表的各組數據圖解說明

圖3.

這里只有兩種情況，相等和不相等。對于不相等，可再分為部分相等、包含、和完全不相等。使用集合描述，可使用交集，子集，并集。如下面圖4.，我列舉出這幾種常見的情況：

基于SQL Server中如何比較兩個表的各組數據圖解說明

圖4.

實現

在數據庫中，要找出表table_left和表table_right存在相同數據的組，方法很多，這里我列出兩種常用的方法。

(下面的SQL腳本，是以圖4.的數據為基礎參考)

方法1：

通過"Select … From …Order by … xml for path('') "把各組的data列數據連串起來（如，圖4.把table_left的組#11的列data連串起來成"data1-data2-data3"），其他分組（包含表table_right）以此方法實現data列數據連串起來；然后通過比較兩表的連串后字段是否存在相等，若是相等就說明這比較多兩組數據相等，由此可以判斷出表table_left的哪組數據在表table_right存在與它數據完全相等的組。

針對方法1，需要對原表增加一個字段dataPath,用于存儲data列數據連串的結果，如：

復制代碼代碼如下:

alter table table_left add dataPath nvarchar(200)
alter table table_right add dataPath nvarchar(200)

分組連串data列數據并update至剛新增的列dataPath，如：

復制代碼代碼如下:

update a
set dataPath=b.dataPath
from table_left a
cross apply(select (select '-'+x.data from table_left x where x.groupId=a.groupId order by x.data for xml path('')) as dataPath)b

update a
set dataPath=b.dataPath
from table_right a
cross apply(select (select '-'+x.data from table_right x where x.groupId=a.groupId order by x.data for xml path('')) as dataPath)b

接下來就是查詢了，如：

復制代碼代碼如下:

select distinct a.groupId
from table_left a
where exists(select 1 from table_right x where x.dataPath=a.dataPath)

完整代碼：

復制代碼代碼如下:

View Code
use tempdb
go
if object_id('table_left') is not null drop table table_left
if object_id('table_right') is not null drop table table_right
go
create table table_left(groupId nvarchar(5),data nvarchar(10))
create table table_right(groupId nvarchar(5),data nvarchar(10))
go
alter table table_left add dataPath nvarchar(200)
alter table table_right add dataPath nvarchar(200)
go
create nonclustered index ix_left on table_left(dataPath)
create nonclustered index ix_right on table_right(dataPath)
go
set nocount on
go
insert into table_right(groupId,data)
select '#1','data1' union all
select '#1','data2' union all
select '#1','data3' union all
select '#2','data55' union all
select '#2','data55' union all
select '#3','data91' union all
select '#3','data92' union all
select '#4','data65' union all
select '#4','data66' union all
select '#4','data67' union all
select '#4','data68' union all
select '#4','data69' union all
select '#5','data77' union all
select '#5','data79'
insert into table_left(groupId,data)
select '#11','data1' union all
select '#11','data2' union all
select '#11','data3' union all
select '#22','data55' union all
select '#22','data57' union all
select '#33','data99' union all
select '#33','data99' union all
select '#44','data66' union all
select '#44','data68' union all
select '#55','data77' union all
select '#55','data78' union all
select '#55','data79'
go
update a
set dataPath=b.dataPath
from table_left a
cross apply(select (select '-'+x.data from table_left x where x.groupId=a.groupId order by x.data for xml path('')) as dataPath)b
update a
set dataPath=b.dataPath
from table_right a
cross apply(select (select '-'+x.data from table_right x where x.groupId=a.groupId order by x.data for xml path('')) as dataPath)b
--
select distinct a.groupId
from table_left a
where exists(select 1 from table_right x where x.dataPath=a.dataPath)

方法2：

通過SQL Sever提供的集運算符"Except"，判斷兩組非重復的數據。如果兩組針對對方都不存在非重復的數據，就說明這兩組數據完全相等。如，表table_left中的組#11和表 table_right中的組#1，對列data進行"Except"集運算，無任是(#11 à #1)進行Except集運算，還是(#1 à #11 )進行Except集合運算，都返回空結果，這就說明組#1 和#11的data數據完全相等，如：

復制代碼代碼如下:

上一篇：order by newid() 各種數據庫隨機查詢的方法

下一篇：通過SQL繪制楊輝三角的實現方法介紹