Summary: in this tutorial, you will learn how to compare two tables to find the unmatched records.
In data migration, it is common to compare two tables to identify a record in one table that has no corresponding entries in another table.
For example, consider a scenario where a new database has a schema that is different from the legacy database. The objective is to migrate all data from the legacy database to the new one while ensuring the accuracy of the migration.
To validate the data, we have to compare two tables, one in the new database and one in the legacy database, and identify the records that do not have a match.
Let’s take two tables t1
and t2
as an example.
The following steps outline the process of comparing two tables and identifying unmatched records:
First, use the UNION statement to combine rows in both tables; include only the columns that you want to compare.
SELECT
t1.pk,
t1.c1
FROM
t1
UNION ALL
SELECT
t2.pk,
t2.c1
FROM
t2
Code language: SQL (Structured Query Language) (sql)
In this statement, t1.pk and t2.pk are the primary key columns of the tables, and t1.c1 and t2.c2 are the columns that you want to compare.
Second, group the rows based on the primary key and columns that you want to compare.
If the values in the columns that you want to compare are identical, the HAVING COUNT(*)
returns 2 otherwise it returns 1:
SELECT pk, c1
FROM
(
SELECT t1.pk, t1.c1
FROM t1
UNION ALL
SELECT t2.pk, t2.c1
FROM t2
) t
GROUP BY pk, c1
HAVING COUNT(*) = 1
ORDER BY pk
Code language: SQL (Structured Query Language) (sql)
If values in the columns involved in the comparison are identical, the query returns no row.
Comparing two tables example
First, create two new tables called t1
and t2
:
CREATE TABLE t1(
id int auto_increment primary key,
title varchar(255)
);
CREATE TABLE t2(
id int auto_increment primary key,
title varchar(255),
note varchar(255)
);
Code language: SQL (Structured Query Language) (sql)
Second, insert data into t1
and t2
tables:
INSERT INTO t1(title)
VALUES
('row 1'),
('row 2'),
('row 3');
INSERT INTO t2(title, note)
SELECT
title,
'data migration'
FROM
t1;
Code language: SQL (Structured Query Language) (sql)
Third, compare the values of the id
and title
column of both tables:
SELECT
id,
title
FROM
(
SELECT
id,
title
FROM
t1
UNION ALL
SELECT
id,
title
FROM
t2
) tbl
GROUP BY
id,
title
HAVING
count(*) = 1
ORDER BY
id;
Code language: SQL (Structured Query Language) (sql)
Output:
Empty set (0.00 sec)
Code language: JavaScript (javascript)
The query returns no row because there are no unmatched records.
Fourth, insert a new row into the t2 table:
INSERT INTO t2(title, note)
VALUES
('new row 4', 'new');
Code language: SQL (Structured Query Language) (sql)
Fifth, execute the query to compare the values of the title
column in both tables again:
SELECT
id,
title
FROM
(
SELECT
id,
title
FROM
t1
UNION ALL
SELECT
id,
title
FROM
t2
) tbl
GROUP BY
id,
title
HAVING
count(*) = 1
ORDER BY
id;
The new row, which is the unmatched row, should return:
+----+-----------+
| id | title |
+----+-----------+
| 4 | new row 4 |
+----+-----------+
1 row in set (0.00 sec)
Code language: JavaScript (javascript)
Summary
- Use the
GROUP BY
andHAVING COUNT
clauses to compare the contents of two tables to find the unmatched records.