I have two DataFrames, shown below:
+--------------------+--------+-----------+-------------+
|UniqueFundamentalSet|Taxonomy|FFAction|!||DataPartition|
+--------------------+--------+-----------+-------------+
|192730241374 |1 |I|!| |Japan |
|192730241374 |2 |I|!| |Japan |
|192730241373 |1 |I|!| |Japan |
|192730241373 |2 |I|!| |Japan |
+--------------------+--------+-----------+-------------+
+--------------------+--------+-----------+-------------+
|UniqueFundamentalSet|Taxonomy|FFAction|!||DataPartition|
+--------------------+--------+-----------+-------------+
|192730241374 |1 |I|!| |Japan |
|192730241374 |2 |I|!| |Japan |
|192730391384 |1 |I|!| |Japan |
|192730391384 |2 |I|!| |Japan |
|192730241373 |1 |I|!| |Japan |
|192730241373 |2 |I|!| |Japan |
+--------------------+--------+-----------+-------------+
When I perform a union of the two DataFrames above, I get duplicate rows.
This is my output:
+--------------------+--------+-----------+-------------+
|UniqueFundamentalSet|Taxonomy|FFAction|!||DataPartition|
+--------------------+--------+-----------+-------------+
|192730241374 |1 |I|!| |Japan |
|192730241374 |2 |I|!| |Japan |
|192730241373 |1 |I|!| |Japan |
|192730241373 |2 |I|!| |Japan |
|192730241374 |1 |I|!| |Japan |
|192730241374 |2 |I|!| |Japan |
|192730391384 |1 |I|!| |Japan |
|192730391384 |2 |I|!| |Japan |
|192730241373 |1 |I|!| |Japan |
|192730241373 |2 |I|!| |Japan |
+--------------------+--------+-----------+-------------+
val dfToSave = dfMainOutput.union(insertdf)
I was under the impression that union removes duplicate rows and unionAll keeps them.
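
For reference, in Spark's Dataset/DataFrame API `union` follows SQL `UNION ALL` semantics and keeps duplicates; the documented way to get a deduplicating set union is to call `distinct()` on the result. Below is a minimal sketch of that, not a definitive fix. The local `SparkSession`, the object name `UnionDedupSketch`, and the helper `unionDistinct` are assumptions for illustration, and the sample rows are simplified to three columns (the `FFAction|!|` column is omitted).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UnionDedupSketch {
  // Hypothetical helper: union keeps every row from both inputs,
  // distinct() then removes rows that are identical in all columns.
  def unionDistinct(a: DataFrame, b: DataFrame): DataFrame =
    a.union(b).distinct()

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the sketch out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("union-dedup-sketch")
      .getOrCreate()
    import spark.implicits._

    val cols = Seq("UniqueFundamentalSet", "Taxonomy", "DataPartition")

    // Simplified stand-ins for the two DataFrames in the question.
    val dfMainOutput = Seq(
      ("192730241374", 1, "Japan"),
      ("192730241374", 2, "Japan"),
      ("192730241373", 1, "Japan"),
      ("192730241373", 2, "Japan")
    ).toDF(cols: _*)

    val insertdf = Seq(
      ("192730241374", 1, "Japan"),
      ("192730241374", 2, "Japan"),
      ("192730391384", 1, "Japan"),
      ("192730391384", 2, "Japan"),
      ("192730241373", 1, "Japan"),
      ("192730241373", 2, "Japan")
    ).toDF(cols: _*)

    // Plain union: all 4 + 6 rows are kept, duplicates included.
    val withDuplicates = dfMainOutput.union(insertdf)
    println(withDuplicates.count())

    // Union followed by distinct: rows identical in every column collapse to one.
    val dfToSave = unionDistinct(dfMainOutput, insertdf)
    dfToSave.show(false)

    spark.stop()
  }
}
```

If rows should count as duplicates based on only some columns, `dropDuplicates(Seq("UniqueFundamentalSet", "Taxonomy"))` on the union result is an alternative worth considering; which columns form the key here is an assumption.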