简述了sqoop的的功能,作用,以及版本演进,那么本篇我们就来实战下,看下如下安装使用sqoop(注:散仙在这里部署的是sqoop1的环境搭建)。
首先,sqoop是基于Hadoop工作的,所以在这之前,确保你的Linux环境下,已经有可以正常工作的hadoop集群,当然伪分布式和完全分布式都可以。
其次,我们得下载一个sqoop的安全包,散仙在这里使用的是sqoop1,版本是sqoop1.4.4的版本。
最后,我们需要配置一些坏境变量,然后就可以以使用sqoop进行数据迁移了。
我们先启动hadoop集群,散仙的是伪分布式的截图如下:
接下来,我们需要配置sqoop的环境变量,拷贝sqoop-env-template.sh改名为sqoop-env.sh,需要注意的是在这个配置文件里面,hadoop的环境变量是必须要配置的,否则将会导致连接失败,并出现警告,其他的如Hbase,和Hive的环境变量,则不是必要的,虽然会出现警告信息,但是只要不向它们上面导入数据,就没问题。当然如果我们的应用,需要向Hbase,或Hive里导入数据,那么则需要配置此环境变量,另外一个关于zookeeper的zoo.cfg的配置目录,这个可以不用配置,使用sqoop内置的zookeeper即可,当然如果我们使用的是外置的zookeeper,则可能需要配置一下,散仙的配置文件如下:
- # Licensed to the Apache Software Foundation (ASF) under one or more
- # contributor license agreements. See the NOTICE file distributed with
- # this work for additional information regarding copyright ownership.
- # The ASF licenses this file to You under the Apache License, Version 2.0
- # (the "License"); you may not use this file except in compliance with
- # the License. You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
- # included in all the hadoop scripts with source command
- # should not be executable directly
- # also should not be passed any arguments, since we need original $*
- # Set Hadoop-specific environment variables here.
- #Set path to where bin/hadoop is available
- #hadoop的环境信息必须
- export HADOOP_COMMON_HOME=/root/hadoop
- #Set path to where hadoop-*-core.jar is available
- #hadoop的mr存放目录的配置信息必须
- export HADOOP_MAPRED_HOME=/root/hadoop/tmp/mapred
- #set the path to where bin/hbase is available
- #hbase的配置信息非必须
- export HBASE_HOME=/root/hbase
- #Set the path to where bin/hive is available
- #hive的配置信息非必须
- export HIVE_HOME=/root/hive
- #Set the path for where zookeper config dir is
- #export ZOOCFGDIR=
# Licensed to the Apache Software Foundation (ASF) under one or more # contributor license agreements. See the NOTICE file distributed with # this work for additional information regarding copyright ownership. # The ASF licenses this file to You under the Apache License, Version 2.0 # (the "License"); you may not use this file except in compliance with # the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # included in all the hadoop scripts with source command # should not be executable directly # also should not be passed any arguments, since we need original $* # Set Hadoop-specific environment variables here. #Set path to where bin/hadoop is available #hadoop的环境信息必须 export HADOOP_COMMON_HOME=/root/hadoop #Set path to where hadoop-*-core.jar is available #hadoop的mr存放目录的配置信息必须 export HADOOP_MAPRED_HOME=/root/hadoop/tmp/mapred #set the path to where bin/hbase is available #hbase的配置信息非必须 export HBASE_HOME=/root/hbase #Set the path to where bin/hive is available #hive的配置信息非必须 export HIVE_HOME=/root/hive #Set the path for where zookeper config dir is #export ZOOCFGDIR=
此外,还有一点必须要做的是,拷贝的一份hadoop的核心包到sqoop的lib里面,和以及相对应的数据库的连接包,如果你是oracle,就拷贝一份oracle的JDBC连接包到sqoop的lib,同样如果你是sql server的则一样,散仙在这里使用的是mysql,所以需要拷贝mysql的JDBC包到sqoop的lib里面,截图如下:
然后,我们就可以输入命令,测试数据库连接了:
- bin/sqoop list-databases --connect jdbc:mysql://192.168.
- 20.129 --username root --password root
bin/sqoop list-databases --connect jdbc:mysql://192.168. 120.129 --username root --password root
打印的信息如下:
- Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
- Please set $HCAT_HOME to the root of your HCatalog installation.
- Warning: $HADOOP_HOME is deprecated.
- 13/12/30 06:58:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
- 13/12/30 06:58:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
- information_schema
- hive
- mysql
- test
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: $HADOOP_HOME is deprecated. 13/12/30 06:58:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 13/12/30 06:58:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. information_schema hive mysql test
1,将msyql中的数据导入HDFS命令如下
- bin/sqoop import --connect jdbc:mysql://192.168.120.129/test --table student
- //将HDFS上的数据导入到mysql中
- bin/sqoop export --connect jdbc:mysql://192.168.120.129/test --username sqoop --password sqoop --table students --export-dir hdfs://masternode:9000/user/grid/students/part-m-00000
bin/sqoop import --connect jdbc:mysql://192.168.120.129/test --table student //将HDFS上的数据导入到mysql中 bin/sqoop export --connect jdbc:mysql://192.168.120.129/test --username sqoop --password sqoop --table students --export-dir hdfs://masternode:9000/user/grid/students/part-m-00000
2,将msyql中的数据导入Hbase命令如下
- bin/sqoop import --connect jdbc:mysql://192.168.120.129/test --table student --hbase-table qindongliang --hbase-create-table --hbase-row-key id --column-family dong
bin/sqoop import --connect jdbc:mysql://192.168.120.129/test --table student --hbase-table qindongliang --hbase-create-table --hbase-row-key id --column-family dong
3,将msyql中的数据导入Hive命令如下
- sqoop import --connect jdbc:mysql://192.168.120.129/test --table ST_Statistics --hive-import --create-hive-table
- //将Hive中的数据导出到mysql中
- bin/sqoop export --connect jdbc:mysql://192.168.120.129/test --username root --password admin --table uv_info --export-dir /user/hive/warehouse/uv/dt=mytable
sqoop import --connect jdbc:mysql://192.168.120.129/test --table ST_Statistics --hive-import --create-hive-table //将Hive中的数据导出到mysql中 bin/sqoop export --connect jdbc:mysql://192.168.120.129/test --username root --password admin --table uv_info --export-dir /user/hive/warehouse/uv/dt=mytable
关于将Hbase的数据导入到mysql里,sqoop并不是直接支持的,一般采用如下3种方法,将Hbase数据,扁平化成HDFS文件,然后再由sqoop导入,第二种,将Hbase数据导入Hive表中,然后再导入mysql,第三种直接使用Hbase的Java API读取表数据,直接向mysql导入,不需要使用sqoop。
相关推荐
2、配置sqoop的环境配置文件: mv /usr/local/sqoop-1.4.6-cdh5.13.2/conf/sqoop-env.template.sh /usr/local/sqoop-1.4.6-cdh5.13.2/conf/sqoop-env.sh vi /usr/local/sqoop-1.4.6-cdh5.13.2/conf/sqoop-env.sh ...
Hadoop hbase hive sqoop集群环境安装配置及使用文档
大数据环境搭建———>Sqoop安装与配置
1.sqoop基本概念 2.配置sqoop环境 3.sqoop的典型应用 4.sqoop的原理
1.sqoop基本概念 2.配置sqoop环境 3.sqoop的典型应用 4.sqoop的原理 5、sqoop环境的配置步骤 6、理解sqoop的原理 7、熟练使用sqoop导入导出数据
数据迁移工具sqoop和Hadoop系统集成步骤说明
hadoop中的sqoop安装与使用
大数据技术基础实验报告-sqoop的安装配置与应用
第9章 Sqoop组件安装配置.pdf
Sqoop是一个用来将Hadoop和关系型数据库中的数据相互转移的工具,可以将一个关系型数据库(例如 : MySQL ,Oracle ,Postgres等)中的数据导进到Hadoop的HDFS中,也可以将HDFS的数据导进到关系型数据库中。...
Hadoop2.6伪分布安装sqoop1.4.6及hdfs与mysql导入导出数据 sqoop安装及简单测试 启动sqoop 启动mysql 从mysql导入数据到hdfs中 从HDFS导出数据到mysql
sqoop连接sqlserver的驱动工具,没有这个sqoop是连不上滴
第9章 Sqoop组件安装配置.docx
电商数仓项目(九) Sqoop安装与配置
Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.Sqoop集群搭建.
Sqoop学习文档(1){Sqoop基本概念、Sqoop的安装配置}。记录我的学习之旅,每份文档倾心倾力,带我成我大牛,回头观望满脸笑意,望大家多多给予意见,有问题或错误,请联系 我将及时改正;借鉴文章标明出处,谢谢
Sqoop(发音:skup)是一款开源的工具,主要用于在Hadoop(Hive)与传统的数据库(mysql、postgresql...)间进行数据的传递,可以将一个关系型数据库(例如 : MySQL ,Oracle ,Postgres等)中的数据导进到Hadoop的HDFS中,...
sqoop安装详解以及sqoop内容介绍使用介绍 集群介绍 sqoop:是一个工具,主要用于导入导出,实现MySQL到Hadoop之间数据的转换 2、导入数据:从结构化数据(Mysql,oracle,db2,)导入到半结构化或非结构化hadoop中...
运行Sqoop报错:找不到或无法加载主类 org.apache.sqoop.sqoop 将sqoop-1.4.7.jar包放到Sqoop的lib目录下,问题解决。
NULL 博文链接:https://mazhilin.iteye.com/blog/2421559