
Hive (14) -- Static and Dynamic Partitions


With static partitioning, you name the target partition explicitly when loading data; Hive does not create partitions for you automatically.

With dynamic partitioning, Hive determines the partition at run time from the value of a column in the inserted data.
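
As a quick contrast (a minimal sketch; the table t, source src, and column month_col are placeholders, and both styles are worked through in detail below):

-- static: the partition is named as a constant in the statement
insert into table t partition (event_month="2014-05") select orderNumber, event_time from src;

-- dynamic: the partition value is taken from the last column of the SELECT
insert into table t partition (event_month) select orderNumber, event_time, month_col from src;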


Static partitioned tables:

Single-level partitioned table:

CREATE TABLE order_created_partition (
    orderNumber STRING
  , event_time  STRING
)
PARTITIONED BY (event_month string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t";

Loading data, method 1: LOAD DATA from a local (or HDFS) path

load data local inpath "/home/spark/software/data/order_created.txt" overwrite into table order_created_partition PARTITION(event_month="2014-05");
select * from order_created_partition where event_month="2014-05";
+-----------------+-----------------------------+--------------+
|   ordernumber   |         event_time          | event_month  |
+-----------------+-----------------------------+--------------+
| 10703007267488  | 2014-05-01 06:01:12.334+01  | 2014-05      |
| 10101043505096  | 2014-05-01 07:28:12.342+01  | 2014-05      |
| 10103043509747  | 2014-05-01 07:50:12.33+01   | 2014-05      |
| 10103043501575  | 2014-05-01 09:27:12.33+01   | 2014-05      |
| 10104043514061  | 2014-05-01 09:03:12.324+01  | 2014-05      |
+-----------------+-----------------------------+--------------+
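
The heading above also mentions HDFS: dropping the LOCAL keyword loads from an HDFS path instead, and note that a LOAD from HDFS moves the source file into the partition's directory rather than copying it. A minimal sketch, assuming the file was first uploaded to the hypothetical HDFS path /tmp/order_created.txt:

load data inpath "/tmp/order_created.txt" overwrite into table order_created_partition partition(event_month="2014-05");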

Loading data, method 2: manually upload the file to HDFS, then register it as a partition of the table:

1) Create the HDFS directory: under the table's warehouse directory /user/hive/warehouse/order_created_partition, create event_month=2014-06:

hadoop fs -mkdir /user/hive/warehouse/order_created_partition/event_month=2014-06

2) Copy the data file into the newly created directory:

hadoop fs -put /home/spark/software/data/order_created.txt /user/hive/warehouse/order_created_partition/event_month=2014-06

select * from order_created_partition where event_month="2014-06";   -- returns no rows: the metastore does not yet know about the new partition

3) Add the new partition to the metastore:

msck repair table order_created_partition;

Log output:

Partitions not in metastore: order_created_partition:event_month=2014-06
Repair: Added partition to metastore order_created_partition:event_month=2014-06

 

Alternatively: alter table order_created_partition add partition(event_month="2014-06");
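
ALTER TABLE ... ADD PARTITION also accepts an explicit LOCATION, which is useful when the data lives outside the table's default directory (the path below is hypothetical):

alter table order_created_partition add partition(event_month="2014-06") location "/data/orders/2014-06";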

select * from order_created_partition where event_month="2014-06"; 
+-----------------+-----------------------------+--------------+
|   ordernumber   |         event_time          | event_month  |
+-----------------+-----------------------------+--------------+
| 10703007267488  | 2014-05-01 06:01:12.334+01  | 2014-06      |
| 10101043505096  | 2014-05-01 07:28:12.342+01  | 2014-06      |
| 10103043509747  | 2014-05-01 07:50:12.33+01   | 2014-06      |
| 10103043501575  | 2014-05-01 09:27:12.33+01   | 2014-06      |
| 10104043514061  | 2014-05-01 09:03:12.324+01  | 2014-06      |
+-----------------+-----------------------------+--------------+

Loading data, method 3: INSERT/INSERT OVERWRITE with a SELECT:

CREATE TABLE order_created_4_partition (
    orderNumber STRING
  , event_time  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t";
load data local inpath "/home/spark/software/data/order_created.txt" overwrite into table order_created_4_partition;

insert into table order_created_partition partition(event_month="2014-07") select * from order_created_4_partition;
insert overwrite table order_created_partition partition(event_month="2014-07") select * from order_created_4_partition;
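
Note the difference between the two statements above: INSERT INTO appends to the partition (running it twice leaves duplicate rows), while INSERT OVERWRITE replaces the partition's existing contents. A quick sanity check after each run:

select count(*) from order_created_partition where event_month="2014-07";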

Compare:

insert overwrite table order_created_partition partition(event_month="2014-07") select ordernumber,event_time from order_created_4_partition;
insert overwrite table order_created_partition partition(event_month="2014-07") select event_time,ordernumber from order_created_4_partition;

The second statement writes values into the wrong columns: Hive maps the SELECT output to table columns by position, not by name. Make sure the column order of the SELECT matches the table's column order; the column names themselves do not have to match.

List all partitions of the table:

show partitions order_created_partition;

Show a specific partition:

SHOW PARTITIONS order_created_partition PARTITION(event_month="2014-06");

Inspect the table's columns and metadata:

desc order_created_partition;
desc extended order_created_partition;
desc formatted order_created_partition;
desc formatted order_created_partition partition(event_month="2014-05");


Two-level partitioned table:

CREATE TABLE order_created_partition2 (
    orderNumber STRING
  , event_time  STRING
)
PARTITIONED BY (event_month string, step string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t";
show partitions order_created_partition2;

The result is empty; the table has no partitions yet.

load data local inpath "/home/spark/software/data/order_created.txt" into table order_created_partition2 partition(event_month="2014-09",step="1");  
show partitions order_created_partition2;
+-----------------------------+
|           result            |
+-----------------------------+
| event_month=2014-09/step=1  |
+-----------------------------+
insert overwrite table order_created_partition2 partition(event_month="2014-09",step="2") select * from order_created_4_partition;
show partitions order_created_partition2;
+-----------------------------+
|           result            |
+-----------------------------+
| event_month=2014-09/step=1  |
| event_month=2014-09/step=2  |
+-----------------------------+
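
On HDFS, each partition level maps to a nested directory under the table's location. Listing the table directory recursively should show a layout along these lines (a sketch of the expected structure, not verbatim output; LOAD keeps the original file name, while INSERT writes files such as 000000_0):

hadoop fs -ls -R /user/hive/warehouse/order_created_partition2

/user/hive/warehouse/order_created_partition2/event_month=2014-09/step=1/order_created.txt
/user/hive/warehouse/order_created_partition2/event_month=2014-09/step=2/000000_0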

 

Dynamic partitioned table

CREATE TABLE order_created_dynamic_partition (
    orderNumber STRING
  , event_time  STRING
)
PARTITIONED BY (event_month string)
;

 

Hive takes dynamic partition values from the trailing columns of the SELECT, matched by position, so event_month must come last:

insert into table order_created_dynamic_partition PARTITION (event_month)
select orderNumber, event_time, substr(event_time, 1, 7) as event_month from order_created;

In strict mode this fails with the following error:

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. 
To turn this off set hive.exec.dynamic.partition.mode=nonstrict

Fix: switch dynamic partitioning to nonstrict mode:

set hive.exec.dynamic.partition.mode=nonstrict;
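
A few related settings govern dynamic partitioning; the values below are illustrative defaults, so check your Hive version:

set hive.exec.dynamic.partition=true;              -- master switch for dynamic partitions
set hive.exec.max.dynamic.partitions=1000;         -- max partitions one statement may create in total
set hive.exec.max.dynamic.partitions.pernode=100;  -- max partitions per mapper/reducer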

Re-run the insert:

insert into table order_created_dynamic_partition PARTITION (event_month)
select orderNumber, event_time, substr(event_time, 1, 7) as event_month from order_created;
select * from order_created_dynamic_partition;
+-----------------+-----------------------------+--------------+
|   ordernumber   |         event_time          | event_month  |
+-----------------+-----------------------------+--------------+
| 10703007267488  | 2014-05-01 06:01:12.334+01  | 2014-05      |
| 10101043505096  | 2014-05-01 07:28:12.342+01  | 2014-05      |
| 10103043509747  | 2014-05-01 07:50:12.33+01   | 2014-05      |
| 10103043501575  | 2014-05-01 09:27:12.33+01   | 2014-05      |
| 10104043514061  | 2014-05-01 09:03:12.324+01  | 2014-05      |
+-----------------+-----------------------------+--------------+
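
Static and dynamic partition columns can also be mixed in one statement, as long as the static columns come first in the PARTITION clause. A sketch against the two-level table above, fixing event_month and deriving step dynamically (the constant "3" is just an illustrative step value):

insert overwrite table order_created_partition2 partition (event_month="2014-09", step)
select orderNumber, event_time, "3" as step from order_created_4_partition;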