牛骨文教育服务平台(让学习变的简单)
博文笔记

hive多表关联

创建时间:2016-08-01 投稿人: 浏览次数:11562
多表join使用说明


select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on a.userId=b.userId join (select userId from table_c where dt=20160731) c on a.userId=c.userId
等价于
select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on a.userId=b.userId join (select userId from table_c where dt=20160731) c on b.userId=c.userId
等价于
select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on a.userId=b.userId join (select userId from table_c where dt=20160731) c on a.userId=c.userId and b.userId=c.userId




总结:
可以把a与b表join关联看成一个新的表table_j,table_j表有两列a.userId,b.userId与c表进行关联






如果是left outer join 效果相同,只不过是将对最后一个on链接条件来说,是与table_j的第一列相连还是与第二列相连而已


等价于table_j的第二列userId(a.userId)与c进行关联,不去管b表是否有能关联上

select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on a.userId=b.userId 
left outer join (select userId from table_c where dt=20160731) c on a.userId=c.userId




等价于table_j的第二列userId(b.userId)与c进行关联,不去管a表是否有能关联上
select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on a.userId=b.userId 
left outer join (select userId from table_c where dt=20160731) c on b.userId=c.userId




等价于table_j的第一列与第二列userId(b.userId)与c进行关联,要求同时能关联上
select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on a.userId=b.userId  
left outer join (select userId from table_c where dt=20160731) c on a.userId=c.userId and b.userId=c.userId

声明:该文观点仅代表作者本人,牛骨文系教育信息发布平台,牛骨文仅提供信息存储空间服务。