Environment:
Hadoop version: hadoop-2.2.0 (downloaded from the official site and compiled as a 64-bit build)
Hive version: hive-0.12.0 (downloaded from the official site and unpacked)
The cluster is healthy; plain Hive queries and MapReduce jobs both run successfully.
To test the new ORC storage format in Hive, I followed these steps:
create external table text_test (id string,text string) row format delimited fields terminated by '\t' STORED AS textfile LOCATION '/user/hive/warehouse/text_test';
create external table orc_test (id string,text string) row format delimited fields terminated by '\t' STORED AS orc LOCATION '/user/hive/warehouse/orc_test';
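The post does not show how the sample rows got into text_test. A minimal sketch, assuming a tab-separated file matching the table DDL (the file name and local path are my own examples):

```shell
# Hypothetical sample data for text_test: three tab-separated rows.
printf '1\tmyew\n2\tccsd\n3\t33\n' > /tmp/text_test.txt
cat /tmp/text_test.txt

# To make the rows visible to the external table, upload the file to the
# table's LOCATION (needs a running cluster, so commented out here):
# hadoop fs -put /tmp/text_test.txt /user/hive/warehouse/text_test/
```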
hive> desc text_test;
OK
id string None
text string None
hive> desc orc_test;
OK
id string from deserializer
text string from deserializer
hive> select * from text_test;
OK
1 myew
2 ccsd
3 33
hive> insert overwrite table orc_test select * from text_test;
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1394433490694_0016, Tracking URL = http://zw-34-69:8088/proxy/application_1394433490694_0016/
Kill Command = /opt/hadoop/hadoop/bin/hadoop job -kill job_1394433490694_0016
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2014-03-13 17:00:49,899 Stage-1 map = 0%, reduce = 0%
2014-03-13 17:01:10,097 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1394433490694_0016 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1394433490694_0016_m_000000 (and more) from job job_1394433490694_0016
Task with the most failures(4):
-----
Task ID:
task_1394433490694_0016_m_000000
URL:
http://zw-34-69:8088/taskdetails.jsp?jobid=job_1394433490694_0016&tipid=task_1394433490694_0016_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
... 8 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
Then came a long trawl through Google, Baidu, and Bing, which finally turned up a fix: http://web.archiveorange.com/archive/v/S2z2uV6yqpmtC3rgpsrs
Thanks to the two earlier investigators for their diligent research.
Root cause, in summary:
hadoop-2.2.0 was compiled against protobuf-2.5.0, while hive-0.12.0 was compiled against protobuf-2.4.1, and the two versions conflict.
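A quick way to make the mismatch concrete is to compare the protobuf versions the two components were built against (e.g. from the protobuf-java-*.jar names under each install's lib directory). The helper below is a hypothetical illustration, not part of Hadoop or Hive:

```shell
# Hypothetical helper: flag the protobuf mismatch described above. The two
# version strings would come from inspecting the bundled protobuf-java jars.
check_protobuf_versions() {
  hadoop_pb="$1"; hive_pb="$2"
  if [ "$hadoop_pb" = "$hive_pb" ]; then
    echo "OK: both built against protobuf $hadoop_pb"
  else
    echo "MISMATCH: hadoop=$hadoop_pb hive=$hive_pb (rebuild Hive against $hadoop_pb)"
  fi
}

# The combination from this post:
check_protobuf_versions 2.5.0 2.4.1
```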
Solution:
Recompile hive-0.12.0 against protobuf-2.5.0.
1. Install protobuf. Download: https://code.google.com/p/protobuf/downloads/detail?name=protobuf-2.5.0.tar.gz
Unpack: tar -xzvf protobuf-2.5.0.tar.gz
Enter the directory: cd protobuf-2.5.0
Build and install:
- ./configure
- make
- make check
- make install (requires root)
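A sanity check after installation: protoc should now report the same version Hadoop was built with. Guarded so the snippet also runs on a machine where protoc is absent:

```shell
# Verify the installed protobuf compiler version.
check_protoc() {
  if command -v protoc >/dev/null 2>&1; then
    protoc --version   # "libprotoc 2.5.0" if the install above succeeded
  else
    echo "protoc not found on PATH"
  fi
}
check_protoc
```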
2. Check out the Hive source: svn checkout http://svn.apache.org/repos/asf/hive/tags/release-0.12.0/
3. Install ant. Download from http://ant.apache.org/bindownload.cgi; I used version 1.9.2 (apache-ant-1.9.2-bin.tar.gz).
(Version 1.9.3 breaks the build: http://www.mailinglistarchive.com/html/dev@ant.apache.org/2014-01/msg00009.html)
Unpack: tar -xzvf apache-ant-1.9.2-bin.tar.gz
Add it to PATH: vi ~/.bash_profile
export ANT_HOME=/opt/hadoop/apache-ant-1.9.2
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ANT_HOME/bin:$PATH
export PATH
Save and exit, then run: . ~/.bash_profile to make the change take effect.
4. Change the protobuf version used by the ant build:
Edit the release-0.12.0/ivy/libraries.properties file, changing protobuf.version=2.4.1 to protobuf.version=2.5.0.
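The same edit can be scripted. Demonstrated below on a throwaway copy so it is safe to run anywhere; against the real tree you would run the sed command from the release-0.12.0 directory on ivy/libraries.properties (GNU sed assumed for -i):

```shell
# Throwaway stand-in for ivy/libraries.properties with the relevant line:
printf 'protobuf.version=2.4.1\n' > /tmp/libraries.properties

# Bump the protobuf version in place:
sed -i 's/^protobuf\.version=2\.4\.1$/protobuf.version=2.5.0/' /tmp/libraries.properties

cat /tmp/libraries.properties   # -> protobuf.version=2.5.0
```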
5. Regenerate the protobuf classes in the Hive directory: cd release-0.12.0
ant protobuf
6. Build Hive:
Run in the release-0.12.0 directory: ant clean package
Then a long wait... (network access required)
7. The built distribution is in release-0.12.0/build/dist/
Going back and running insert overwrite table orc_test select * from text_test; now succeeds.
The resulting ORC file can be inspected with: hive --orcfiledump <hdfs-location-of-orc-file>
hive> select * from orc_test;
OK
1 myew
2 ccsd
3 33
hive --orcfiledump /user/hive/warehouse/orc_test/000000_0
Rows: 3
Compression: ZLIB
Compression size: 262144
Type: struct<_col0:string,_col1:string>
Statistics:
Column 0: count: 3
Column 1: count: 3 min: 1 max: 3
Column 2: count: 3 min: 33 max: myew
Stripes:
Stripe: offset: 3 data: 31 rows: 3 tail: 50 index: 59
Stream: column 0 section ROW_INDEX start: 3 length 9
Stream: column 1 section ROW_INDEX start: 12 length 23
Stream: column 2 section ROW_INDEX start: 35 length 27
Stream: column 1 section DATA start: 62 length 6
Stream: column 1 section LENGTH start: 68 length 5
Stream: column 2 section DATA start: 73 length 13
Stream: column 2 section LENGTH start: 86 length 7
Encoding column 0: DIRECT
Encoding column 1: DIRECT_V2
Encoding column 2: DIRECT_V2