HiveQL
Index

Indexes are a standard database technique, and Hive supports them as well: you can build an index on one or more columns of a table to speed up queries that filter on those columns. The walkthrough below builds a compact index on the id column of a test table.
1. Create the table:

hive> create table user(id int, name string)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE;

2. Load the data:

hive> load data local inpath '/export1/tmp/wyp/row.txt'
    > overwrite into table user;

3. Run a test query before the index exists:
hive> select * from user where id = 500000;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Cannot run job locally: Input Size (= 356888890) is larger than hive.exec.mode.local.auto.inputbytes.max (= 134217728)
Starting Job = job_1384246387966_0247, Tracking URL =
http://l-datalogm1.data.cn1:9981/proxy/application_1384246387966_0247/
Kill Command = /home/q/hadoop/bin/hadoop job -kill job_1384246387966_0247
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2013-11-13 15:09:53,336 Stage-1 map = 0%, reduce = 0%
2013-11-13 15:09:59,500 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 2.0 sec
2013-11-13 15:10:00,531 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.63 sec
2013-11-13 15:10:01,560 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.63 sec
MapReduce Total cumulative CPU time: 5 seconds 630 msec
Ended Job = job_1384246387966_0247
MapReduce Jobs Launched:
Job 0: Map: 2   Cumulative CPU: 5.63 sec   HDFS Read: 361084006 HDFS Write: 357 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 630 msec
OK
500000 wyp.
Time taken: 14.107 seconds, Fetched: 1 row(s)
The full-table scan above took 14.107 seconds in total.

4. Create an index on the id column, rebuild it, and look at the index table:
hive> create index user_index on table user(id)
    > as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
    > with deferred rebuild
    > IN TABLE user_index_table;
hive> alter index user_index on user rebuild;
hive> select * from user_index_table limit 5;
0 hdfs://mycluster/user/hive/warehouse/table02/000000_0 [0]
1 hdfs://mycluster/user/hive/warehouse/table02/000000_0 [352]
2 hdfs://mycluster/user/hive/warehouse/table02/000000_0 [704]
3 hdfs://mycluster/user/hive/warehouse/table02/000000_0 [1056]
4 hdfs://mycluster/user/hive/warehouse/table02/000000_0 [1408]
Time taken: 0.244 seconds, Fetched: 5 row(s)
This builds an index on the id column of the user table. The index data lives in the user_index_table table: each row holds an id value, the HDFS file that contains it, and the byte offsets of the matching rows within that file.
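To get Hive to actually consult the index when it evaluates the WHERE clause, the index-aware optimizer has to be switched on. A minimal sketch, assuming a Hive release that still ships index support and that the hive.optimize.index.filter settings are available (check your version's documentation):

hive> SET hive.optimize.index.filter=true;
hive> SET hive.optimize.index.filter.compact.minsize=0;
hive> select * from user where id = 500000;
hive> drop index user_index on user;

The second property lowers the input-size threshold so the compact index is used even on small inputs, and the final statement shows how the index is dropped once it is no longer wanted.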
If you try to create an index on a column that does not exist in the table being indexed, Hive refuses with an error like this:
Check the index columns, they should appear in the table being indexed.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
In other words, every column listed in the CREATE INDEX statement must be a column of the table being indexed.
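For example, the statement below would trigger that error, because the user table created earlier has only the columns id and name (the age column here is made up purely for illustration):

hive> create index age_index on table user(age)
    > as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
    > with deferred rebuild;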
Insert
Hive's INSERT ... SELECT comes in two flavours:

1. INSERT OVERWRITE TABLE, which replaces whatever data the target table (or partition) already holds;
2. INSERT INTO TABLE, which appends the selected rows to the data already there.

In both cases the rows to insert come from a SELECT over another table. The standard syntax is as follows:
Usage 1:

INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]]
select_statement1 FROM from_statement;

Usage 2:

INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
select_statement1 FROM from_statement;

(Each form is a single statement; it is wrapped across lines here only so that it fits on the page.)
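As a concrete illustration of the PARTITION clause (the logs and raw_logs tables and the dt partition column are invented for this sketch, not part of the examples that follow):

hive> insert overwrite table logs partition (dt='2013-11-13')
    > select ip, url from raw_logs where dt='2013-11-13';

With a static partition like this, the SELECT list supplies every column of the target table except the partition column, whose value is fixed by the PARTITION clause.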
For example, to append all the rows of the tt table to the cite table:
hive> insert into table cite
> select * from tt;
This appends every row of tt to cite. The source query must produce the same number and types of columns as the target table, otherwise the insert fails:
hive> insert into table cite
    > select * from cite_standby;
FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into
target table because column number/types are different 'cite':
Table insclause-0 has 2 columns, but query has 1 columns.
The error message shows that the queried table, cite_standby, returns only one column while the target table cite has two, so the statement is rejected. You can even insert a table into itself:
hive> insert into table cite
> select * from cite;
The effect is to append a copy of cite's own data to itself, doubling the row count. To replace a table's contents instead of appending to them, use INSERT OVERWRITE:
hive> insert overwrite table cite
> select * from tt;
The statement above overwrites cite: whatever the table previously held is discarded and replaced by the rows selected from tt.
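A quick way to confirm the append-versus-overwrite behaviour is to compare row counts around each statement (just a sketch; the actual counts depend on your data, so none are shown):

hive> select count(1) from cite;
hive> insert into table cite select * from tt;
hive> select count(1) from cite;
hive> insert overwrite table cite select * from tt;
hive> select count(1) from cite;

After the INSERT INTO, the count grows by the number of rows in tt; after the INSERT OVERWRITE, it equals exactly the number of rows in tt.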