Hive Shell Commands, Part 1 (Database and Table Operations)

Created: 2015-12-22

//Database operations
1. Create a database if it does not already exist (the default database is named default):
create database if not exists test;

2. List the databases in Hive:
show databases;

3. If there are many databases, use a regular expression to pick out the database names you need:
show databases like "t.*";

4. Create a database and specify where it is stored (by default it is stored under the directory given by hive.metastore.warehouse.dir):
create database test01 location "/data1";

5. Add a description when creating a database:
create database test02 comment "this is a database named test02";

6. View a database's description:
describe database test02;
Result: test02   this is a database named test02   hdfs://master.infobird.com:8020/user/hive/warehouse/test02.db   hdfs   USER

7. Attach key-value properties to a database:
create database test05 with dbproperties("name" = "abc", "data"="2015-12-22");
Note: the key-value pairs are free-form and can be defined as needed.

8. Show the key-value properties defined for a database:
describe database extended test05;

9. Switch to a database:
use test01;

10. Set a property so that the prompt shows the current database:
set hive.cli.print.current.db=true;

11. Drop a database if it exists:
drop database if exists test;

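12. By default, Hive does not allow dropping a database that still contains tables.
    Either drop the tables first and then the database, or add the cascade keyword to the command (the default is restrict):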
drop database if exists test01 cascade;

When a database is dropped, its directory is deleted as well.
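
The difference between the default restrict behaviour and cascade can be sketched as follows. The names demo_db and t1 are hypothetical and used only for illustration.

```sql
-- demo_db and t1 are hypothetical names used only for this sketch.
create database if not exists demo_db;
create table if not exists demo_db.t1 (id int);

-- restrict is the default: this statement is expected to fail
-- because demo_db still contains the table t1.
drop database if exists demo_db restrict;

-- cascade drops the contained tables first, then the database.
drop database if exists demo_db cascade;
```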

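13. Only the key-value properties set on a database can be changed; other metadata, such as the database name and the directory where it lives, cannot be altered (this describes the Hive release in use at the time of writing; later releases also accept alter database ... set location):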
alter database test05 set dbproperties("name"="abcd","size"="10");
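
To confirm the change, the properties can be read back with the extended describe from item 8. This is only a usage sketch; the exact output layout depends on the Hive version.

```sql
-- Read back the key-value properties set on test05.
describe database extended test05;
```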

//Table operations
1. Create a table:
create table if not exists test05.employees(name string comment "Employee name",
      salary float comment "employee salary",
      subordinates array<String> comment "Names of subordinates",
      deductions map<string, float> comment "keys are deductions names, values are percentages",
      address struct<street:string, city:string, state:string, zip:int> comment "Home address")
      comment "Description of the table"
      tblproperties ("creator"="me","created_date"="2015-12-22")
      location "/user/hive/warehouse/test05.db/employees";
If the current database is not the target database, prefix the table name with the database name, as with test05 here, the database created earlier.

2. Copy the schema of an existing table without copying its data:
create table if not exists test05.employees2 like test05.employees;

3. List all tables in the current database:
show tables;

List all tables in a specific database:
show tables in default;

Filter table names with a regular expression:
show tables like "empl.*";

4. View a table's structure:
describe test05.employees;

View detailed structure information for a table:
describe extended test05.employees;

View the information for a single column:
describe test05.employees salary;

5. Create an external table:
create external table if not exists stocks(name STRING, age INT, address STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ","
      LOCATION "/data1/stocks";
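This table can read all comma-delimited data located under the /data1/stocks directory.
Because the table is external, Hive does not consider itself the full owner of the data, so dropping the table does not delete the data.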
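
One way to check this, sketched under the assumption that the stocks table above exists: describe formatted reports the table type, and dropping the table removes only its metadata.

```sql
-- The "Table Type" field should read EXTERNAL_TABLE for an external table.
describe formatted stocks;

-- Dropping the table removes its metadata from the metastore;
-- the files under /data1/stocks are left in place.
drop table if exists stocks;
```

Re-running the create external table statement with the same location should make the data queryable again.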

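6. Create a table by copying another table's schema:
create external table if not exists test05.employees3 like test05.employees;

If the external keyword is omitted and the source table is external, the new table is also external.
If the external keyword is omitted and the source table is managed, the new table is also managed.
However, if the external keyword is present and the source table is managed, the new table is external.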

7. Create a partitioned table:
create table employees(name string, age int, phone string) partitioned by (address string, city string);

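If a table holds a lot of data across many partitions, a query that scans every partition can trigger an enormous MapReduce job, so it is advisable to set: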
set hive.mapred.mode=strict;
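
With strict mode enabled, a query against a partitioned table is expected to be rejected unless its where clause filters on a partition column. A minimal sketch against the employees table above; the partition values are made up for illustration.

```sql
set hive.mapred.mode=strict;

-- Expected to be rejected in strict mode: no partition filter.
-- select name, age from employees;

-- Accepted: the where clause restricts the scan to one partition.
select name, age
from employees
where address = 'Beijing' and city = 'Beijing'
limit 10;
```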

8. List all partitions:

show partitions employees;

9. List the partitions matching a given partition spec:
show partitions employees partition(address="Beijing");

10. Create a partitioned external table:
create external table if not exists log_messages (hms int) partitioned by (year int, month int, day int) row format delimited fields terminated by " ";

Then alter the table to add a partition:

alter table log_messages add partition(year = 2015, month = 12, day = 23) location "hdfs://data1/log_messages/2015/12/23";

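11. Customize a table's storage format:
Hive's default storage format is plain text, which can also be stated explicitly with the optional clause STORED AS TEXTFILE.
A variety of delimiters can be specified when creating the table: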
create table employees(name string, salary float,
     subordinates array<string>,
     deductions map<string, float>,
     address struct<street:string, city:string, state:string, zip:int>)
     row format delimited
     fields terminated by "\001"
     collection items terminated by "\002"
     map keys terminated by "\003"
     lines terminated by "\n"
     stored as textfile;
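
For data that is easier to inspect by eye, the same clauses accept printable delimiters. The sketch below uses a hypothetical table name, employees_csv, and delimiters chosen only for illustration.

```sql
-- employees_csv and the delimiters below are illustrative assumptions.
create table if not exists employees_csv(name string, salary float,
     subordinates array<string>,
     deductions map<string, float>)
     row format delimited
     fields terminated by ","
     collection items terminated by "|"
     map keys terminated by ":"
     stored as textfile;

-- A matching line of input would look like:
-- John,5000.0,Mary|Todd,tax:0.2|insurance:0.05
```

Note that the entries of a map are separated by the collection items delimiter, while the map keys delimiter separates each key from its value.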

12. Drop a table:
drop table if exists employees;

13. Rename a table:
alter table log_messages rename to logmessages;

14. Add, modify, and drop partitions:
alter table logmessages add if not exists partition(year = 2015, month = 12, day =23) location "/logs/2015/12/23" partition (year = 2015, month = 12, day = 22) location "/logs/2015/12/22";

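Change a partition's path by repointing its location, which is cheap (this neither moves the data away from the old path nor deletes the old data):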
alter table logmessages partition(year = 2015, month = 12, day = 22) set location "s3n://ourbucket/logs/2015/12/22";
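
To check where a partition currently points, the partition can be described directly. A usage sketch, assuming the logmessages table and the partition from the statements above:

```sql
-- The output includes a Location field with the partition's current path.
describe formatted logmessages partition (year = 2015, month = 12, day = 22);
```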

Drop a partition:
alter table logmessages drop if exists partition (year = 2015, month = 12, day = 22);

15. Add columns:
alter table logmessages add columns (app_name string comment "application name");

16. Change a column: rename a field and change its position, type, or comment:
alter table logmessages change column hms hours_minutes int comment "the hours and minutes" after app_name;

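This renames column hms to hours_minutes, adds a comment, and places the column after app_name. To move it to the first position instead, use the keyword first in place of after app_name.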

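17. Drop or replace columns:
alter table logmessages replace columns (hms int);
This removes all existing columns from the table and adds the new column hms. Because it is an alter statement, only the table's metadata changes; the original partitions are still there.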

18. Change table properties:
alter table logmessages set tblproperties("notes" = "the process");

19. Change a table's storage properties:
alter table logmessages partition(year = 2015, month = 12, day = 22) set fileformat sequencefile;
