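I recently went back through the big data course project I did two years ago, including its defense.
I ran into many problems during the project, and I summarize them here:
1. MySQL startup error
First, MySQL sometimes would not start. When I entered this command, it printed a lot of messages: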
mysql -u root -p
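Sometimes I ran this command many times and it still would not open; it kept showing the same messages.
At other times it worked fine. This felt strange, and much of the time it seemed to come down to luck.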
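One way to make this less a matter of luck is to check whether a server is listening before launching the client, since `mysql -u root -p` starts the MySQL client rather than the server. The sketch below is only an illustration: `port_open` is a hypothetical helper, it relies on bash's `/dev/tcp` feature, and it assumes the server listens on TCP port 3306 (the port used in the JDBC URLs later in this post).

```shell
#!/usr/bin/env bash
# Sketch: check whether anything accepts TCP connections on a host/port.
# Assumes bash (uses its /dev/tcp pseudo-device) and MySQL's default port 3306.

port_open() {
  local host="$1" port="$2"
  # The subshell tries to open a TCP connection on fd 3;
  # it exits non-zero if nothing is listening.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

if port_open 127.0.0.1 3306; then
  echo "Port 3306 is accepting connections"
else
  echo "Nothing is listening on 3306; the MySQL server may not be running"
fi
```

If the port is closed, starting the server (for example with `sudo service mysql start`, assuming it was installed as a system service) would be the next thing to try. A local client may also connect through a Unix socket instead of TCP, so this check is a hint rather than a verdict.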
2. Sqoop fails to connect to MySQL
To test whether Sqoop could connect to MySQL, I ran this command:
sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P
Part of the error output:
hadoop@dblab-VirtualBox:~$ sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P
Warning: /usr/local/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
21/12/08 10:11:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password:
21/12/08 10:12:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Wed Dec 08 10:12:48 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
21/12/08 10:12:49 ERROR manager.CatalogQueryManager: Failed to list databases
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
Caused by: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
at sun.security.ssl.HandshakeContext.<init>(HandshakeContext.java:171)
at sun.security.ssl.ClientHandshakeContext.<init>(ClientHandshakeContext.java:98)
at sun.security.ssl.TransportContext.kickstart(TransportContext.java:220)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:428)
at com.mysql.jdbc.ExportControlled.transformSocketToSSLSocket(ExportControlled.java:149)
... 27 more
The error output above includes this hint:
For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
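When I connected Hive to MySQL earlier, I also got a lot of errors. It took me a long time and many online tutorials before I learned that Hive's MySQL connection URL needs this appended: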
useSSL=false
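On top of that, Hive's MySQL connection has to be set in an XML configuration file, which is tedious. My classmates could connect to MySQL without this setting; I was the only one who needed it, and that caused a whole pile of related problems later in the project.
Seeing this error message, I realized that the connection needed useSSL=false, but I did not know whether it also had to go into a configuration file. So I tried adding it directly to this part of the command: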
jdbc:mysql://127.0.0.1:3306/
by appending:
?useSSL=false
The complete command is:
sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/?useSSL=false --username root -P
With that change, Sqoop connected to MySQL:
hadoop@dblab-VirtualBox:~$ sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/?useSSL=false --username root -P
Warning: /usr/local/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
21/12/08 10:15:08 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password:
21/12/08 10:15:11 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
dblab
hive
mysql
performance_schema
spark
sys
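The fix above boils down to adding one query parameter to the JDBC URL. As a small sketch of that step, the hypothetical helper `jdbc_with_param` below appends a parameter with `?` if the URL has no query string yet and with `&` otherwise. The helper name is my own and is not part of Sqoop or MySQL.

```shell
# Append a key=value parameter to a JDBC URL, using '?' for the first
# parameter and '&' for later ones. jdbc_with_param is a hypothetical helper.
jdbc_with_param() {
  local url="$1" param="$2"
  case "$url" in
    *\?*) printf '%s&%s\n' "$url" "$param" ;;
    *)    printf '%s?%s\n' "$url" "$param" ;;
  esac
}

url="$(jdbc_with_param 'jdbc:mysql://127.0.0.1:3306/' 'useSSL=false')"
echo "$url"   # prints jdbc:mysql://127.0.0.1:3306/?useSSL=false

# Quote the URL so the shell does not treat '?' or '&' as special characters:
# sqoop list-databases --connect "$url" --username root -P
```

Quoting matters because `?` is a glob character and `&` is a control operator in the shell. The unquoted command above worked, but a URL with a second parameter joined by `&` would be split into two commands unless it is quoted.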
3. Error when exporting data from Hive to MySQL with Sqoop
After finally getting Sqoop connected to the database, I hit another big problem when using Sqoop to load the data into MySQL. The command was:
./bin/sqoop export --connect jdbc:mysql://localhost:3306/dblab --username root --password hadoop --table user_action --export-dir '/user/hive/warehouse/dblab.db/user_action' --fields-terminated-by '\t'; # export command
Part of the error output:
hadoop@dblab-VirtualBox:/usr/local/sqoop$ ./bin/sqoop export --connect jdbc:mysql://localhost:3306/dblab --username root --password hadoop --table user_action --export-dir '/user/hive/warehouse/dblab.db/user_action' --fields-terminated-by '\t';
Warning: /usr/local/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
21/12/03 17:36:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
21/12/03 17:36:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
......
java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
21/12/03 17:36:30 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
21/12/03 17:36:31 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21/12/03 17:36:31 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
21/12/03 17:36:31 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
21/12/03 17:36:32 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21/12/03 17:36:32 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
21/12/03 17:36:33 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21/12/03 17:36:33 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
21/12/03 17:36:33 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
21/12/03 17:36:34 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21/12/03 17:36:34 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
21/12/03 17:36:35 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21/12/03 17:36:35 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
21/12/03 17:36:37 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21/12/03 17:36:37 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
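I ran this command many times, but the export kept failing. I asked many classmates. The study representative of Class 1 said she had hit the same problem and ended up abandoning this topic. I also felt I could not get it to work, but I did not want to give up, since I had already spent so much time on the MySQL connection. So I called my instructor, Mr. Tu, to discuss a solution. We talked for a long time, and he suggested that I either build a new virtual machine or switch topics. I thought about it for a long time and had even started looking for datasets in preparation for giving up, but then I opened the virtual machine and ran the command once more:
./bin/sqoop export --connect jdbc:mysql://localhost:3306/dblab --username root --password hadoop --table user_action --export-dir '/user/hive/warehouse/dblab.db/user_action' --fields-terminated-by '\t'; # export command
To my surprise, the export worked this time.
Checking the database in MySQL confirmed that the data had been imported correctly.
Thinking it over afterwards, and after discussing it with other teachers, I believe the most likely cause was that some services had not been started. The log is consistent with this: it shows repeated refused connections to ZooKeeper at localhost:2181. The earlier MySQL connection problem may also have played a part. I am not sure of the exact cause.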
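If missing services were indeed the cause, a pre-flight check before running the export might have caught it. The sketch below is one possible approach, not what I did at the time: `missing_daemons` is a hypothetical helper that scans `jps`-style output (one `<pid> <MainClass>` pair per line) and prints every required daemon that is absent. The daemon names `NameNode` and `DataNode` are assumptions; which ones matter depends on the setup.

```shell
# Print each required daemon name that does not appear in jps-style output.
# The daemon names passed in are assumptions; use the ones your setup needs.
missing_daemons() {
  local jps_output="$1"; shift
  local name
  for name in "$@"; do
    # jps prints "<pid> <MainClass>" per line; match the class name exactly.
    if ! printf '%s\n' "$jps_output" |
         awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}

# Usage: feed in real jps output if the JDK's jps tool is on the PATH.
if command -v jps >/dev/null 2>&1; then
  missing_daemons "$(jps)" NameNode DataNode
fi
```

An empty result means every listed daemon was found. Any name that is printed points to a service to start before retrying the export.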
4. Problems installing R
4.1 Mirror problem when installing R
The configuration file is edited with this command:
sudo vim /etc/apt/sources.list
According to the textbook, this Xiamen University mirror should be added as the last line of /etc/apt/sources.list:
deb http://mirrors.xmu.edu.cn/CRAN/bin/linux/ubuntu/ trusty/
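However, the mirror page http://mirrors.xmu.edu.cn/CRAN/bin/linux/ubuntu/ could not be opened at all.
So there was no way to download R from it.
The fix was to use the Tsinghua University mirror instead, by adding this line at the end of /etc/apt/sources.list: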
deb http://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/linux/ubuntu/ trusty/
With this mirror, R downloaded normally.
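Editing sources.list by hand makes it easy to add the same line twice when retrying. As a hedged sketch, the hypothetical helper `add_line_once` below appends a line only if that exact line is not already in the file. The demonstration writes to a scratch file instead of the real /etc/apt/sources.list.

```shell
# Append a line to a file only if that exact line is not already there.
# add_line_once is a hypothetical helper; run it with sufficient privileges
# when the target is /etc/apt/sources.list.
add_line_once() {
  local file="$1" line="$2"
  # -F: fixed string, -x: match the whole line, -q: quiet
  grep -Fxq -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration against a scratch file rather than the real sources.list.
demo="$(mktemp)"
mirror='deb http://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/linux/ubuntu/ trusty/'
add_line_once "$demo" "$mirror"
add_line_once "$demo" "$mirror"   # second call is a no-op
cat "$demo"
rm -f "$demo"
```

After changing the real file, `sudo apt-get update` has to be run so that apt picks up the new source.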
4.2 Problem installing ggplot2
According to the textbook, ggplot2 is installed from inside the R console with this command:
install.packages('ggplot2')
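However, I found that ggplot2 could not be downloaded this way. It was probably a mirror problem, though it may have been something else.
Following a classmate's method, I specified the repository explicitly. The command is: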
install.packages("ggplot2",repos='https://mran.microsoft.com/snapshot/2019-02-01/')
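With this, ggplot2 downloaded without trouble.
4.3 Problem installing recharts
Having installed ggplot2 with that classmate's method, I followed her approach for recharts as well. She said she had not installed devtools; instead she downloaded recharts directly from GitHub.
So I did the same: I unpacked the recharts archive downloaded from GitHub into the "下载" (Downloads) folder and ran this command: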
R CMD INSTALL ~/下载/recharts-master --library=/home/hadoop/R/x86_64-pc-linux-gnu-library/3.4
But it failed:
hadoop@dblab-VirtualBox:~$ R CMD INSTALL ~/下载/recharts-master --library=/home/hadoop/R/x86_64-pc-linux-gnu-library/3.4
ERROR: dependencies ‘htmltools’, ‘htmlwidgets’, ‘jsonlite’, ‘webshot’ are not available for package ‘recharts’
* removing ‘/home/hadoop/R/x86_64-pc-linux-gnu-library/3.4/recharts’
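So this method does not work as is. The error shows that recharts depends on htmltools, htmlwidgets, jsonlite, and webshot, none of which were available, so with R 3.4 the GitHub copy of recharts could not be installed directly.
My understanding is that it worked for my classmate because her R version was 3.2.
Next, following the tutorial, I installed devtools first and then used it to install recharts.
The command to install devtools is: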
install.packages('devtools')
But later, when I tried to install recharts from inside R with
devtools::install_github('taiyun/recharts')
I got this error:
devtools::install_github('taiyun/recharts')
Error in loadNamespace(name) : 不存在叫‘devtools’这个名字的程辑包
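The message says that there is no package called 'devtools', which means devtools had not actually been installed.
According to the tutorial, these packages need to be installed from the command line first: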
sudo apt-get install libssl-dev
sudo apt-get install libssh2-1-dev
sudo apt-get install libcurl4-openssl-dev
However, even after installing these packages, installing devtools still failed, so I looked for a solution online.
This blog post solved it for me:
https://blog.csdn.net/Wing_kin666/article/details/106020600
The fix is to install one more package:
sudo apt-get install libxml2-dev
Then install devtools again:
install.packages('devtools')
This time devtools installed successfully.
Then install recharts:
devtools::install_github('taiyun/recharts')
With that, recharts installed successfully.
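The order that finally worked was: system libraries first, then devtools, then recharts. The sketch below collects those steps in one place. It is an illustration under assumptions, not a tested installer: `run` and `install_recharts` are hypothetical helpers, the `repos` URL https://cloud.r-project.org is my own choice of CRAN mirror (non-interactive R needs one to be set), and by default the script only prints what it would do.

```shell
# Run a command, or only print it when DRY_RUN=1 (the default here).
# The printed form is for reading; it drops the original quoting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

install_recharts() {
  # System libraries first; R packages that compile against them come after.
  run sudo apt-get install -y libssl-dev libssh2-1-dev libcurl4-openssl-dev libxml2-dev
  # The repos URL is an assumption; any reachable CRAN mirror should do.
  run Rscript -e "install.packages('devtools', repos='https://cloud.r-project.org')"
  run Rscript -e "devtools::install_github('taiyun/recharts')"
}

install_recharts
```

Setting `DRY_RUN=0` before calling `install_recharts` would execute the steps instead of printing them. Keeping the dry run as the default makes it possible to review the plan before anything is installed.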