Pitfalls: Installing DataHub

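Updated: 2024-05-07

Both the theory and the practice are hard. This thing throws errors everywhere, there are pitfalls all over, and I still don't really know how to use it. Completely baffled.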

https://github.com/linkedin/datahub

Install Docker

yum -y install docker

# Docker was not started, so pulling an image failed:
[root@localhost ~]# docker pull java:8
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

# Start Docker; this resolved the problem
[root@localhost ~]# systemctl daemon-reload
[root@localhost ~]# systemctl restart docker.service

Install Python 3.8

Download: wget https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tgz

# Step 1 is essential; without it you get
# ModuleNotFoundError: No module named '_ctypes'
yum install libffi-devel -y

cd /app
wget https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tgz
tar -zxvf Python-3.8.2.tgz
cd /app/Python-3.8.2

# Compile and install
./configure --prefix=/usr/local/python3
make && make install

# Create symlinks
ln -s /usr/local/python3/bin/python3 /usr/local/bin/python3
ln -s /usr/local/python3/bin/pip3 /usr/local/bin/pip3

# Verify the installation
[root@xxx Python-3.8.2]# python3 -V
Python 3.8.2
[root@xxx Python-3.8.2]# pip3 -V
pip 19.2.3 from /usr/local/python3/lib/python3.8/site-packages/pip (python 3.8)

Install pip3

# Upgrade pip3
[root@artemis python3]# pip3 install --upgrade pip -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
Looking in indexes: http://pypi.douban.com/simple
Collecting pip
  Downloading http://pypi.doubanio.com/packages/54/eb/4a3642e971f404d69d4f6fa3885559d67562801b99d7592487f1ecc4e017/pip-20.3.3-py2.py3-none-any.whl (1.5MB)
     |████████████████████████████████| 1.5MB 6.3MB/s
Installing collected packages: pip
  Found existing installation: pip 19.2.3
    Uninstalling pip-19.2.3:
      Successfully uninstalled pip-19.2.3
Successfully installed pip-20.3.3

Switching freely between python2 and python3 (this step is not actually needed)

This method uses the update-alternatives tool.

Step 1: check whether alternatives for python already exist
update-alternatives --display python
# Prints nothing if none exist

Step 2: register python2 and python3 as alternatives
sudo update-alternatives --install /usr/bin/python python /usr/bin/python2.7 1
sudo update-alternatives --install /usr/bin/python python /usr/local/bin/python3 2
# The link /usr/bin/python is the same in both commands. Set the target path (for example /usr/local/bin/python3.4) to match your own install directory. The trailing 1 and 2 are the priorities.

Step 3: check the version; at this point it is Python 2
python --version
Python 2.7.5

Step 4: switch versions
sudo update-alternatives --config python

There are 2 programs which provide 'python'.

  Selection    Command
-----------------------------------------------
   1           /usr/bin/python2.7
*+ 2           /usr/local/bin/python3.6

Enter to keep the current selection[+], or type selection number: 2
(1 is python2.7, 2 is python3.6)

Step 5: option 2 was selected above, so the version should now be Python 3
python --version
Python 3.6.5

Extra: removing alternatives
sudo update-alternatives --remove python /usr/bin/python2.7 (removes 2.7)
sudo update-alternatives --remove python /usr/local/bin/python3.6 (removes 3.6)

Install sklearn (published on PyPI as scikit-learn)

pip3 install scikit-learn -i http://pypi.douban.com/simple --trusted-host pypi.douban.com

Install docker-compose, method 1 (leads to Error 2; not recommended)

pip3 install docker-compose -i http://pypi.douban.com/simple --trusted-host pypi.douban.com

Install docker-compose, method 2 (avoids Error 2)

curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
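To keep the version number in one place, the download URL from the command above could be built by a small helper. This is only a sketch: compose_url is a hypothetical name, and the OS and architecture arguments simply default to whatever uname reports.

```shell
# Hypothetical helper: build the docker-compose release URL used above.
# $1 = version, $2 = OS name (default: uname -s), $3 = arch (default: uname -m)
compose_url() {
    version=$1
    os=${2:-$(uname -s)}
    arch=${3:-$(uname -m)}
    echo "https://github.com/docker/compose/releases/download/${version}/docker-compose-${os}-${arch}"
}

# Print the URL for the version used in this post
compose_url 1.27.4
```

It would then be used as curl -L "$(compose_url 1.27.4)" -o /usr/local/bin/docker-compose, followed by chmod +x on the downloaded file.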

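After installation, restart the Docker service

Using systemctl:

# Reload the systemd daemon
sudo systemctl daemon-reload
# Restart the Docker service
sudo systemctl restart docker
# Stop Docker
sudo systemctl stop docker

Error 1: probably caused by the network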

[root@ares datahub]# ./docker/quickstart.sh
Pulling mysql ... done
Pulling zookeeper ... done
Pulling elasticsearch ... done
Pulling elasticsearch-setup ... done
Pulling kibana ... done
Pulling broker ... done
Pulling neo4j ... done
Pulling schema-registry ... done
Pulling schema-registry-ui ... done
Pulling kafka-setup ... done
Pulling datahub-mae-consumer ... done
Pulling datahub-gms ... done
Pulling datahub-mce-consumer ... done
Pulling datahub-frontend ... done
Pulling kafka-rest-proxy ... done
Pulling kafka-topics-ui ... done
Building elasticsearch-setup
Step 1/6 : FROM jwilder/dockerize:0.6.1
 ---> 849596ab86ff
Step 2/6 : RUN apk add --no-cache curl jq
 ---> Running in 8ca0149674f7
fetch http://dl-cdn.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz: network error (check Internet connection and firewall)
fetch http://dl-cdn.alpinelinux.org/alpine/v3.6/community/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.6/community/x86_64/APKINDEX.tar.gz: network error (check Internet connection and firewall)
ERROR: unsatisfiable constraints:
  curl (missing):
    required by: world[curl]
  jq (missing):
    required by: world[jq]
ERROR: Service 'elasticsearch-setup' failed to build : The command '/bin/sh -c apk add --no-cache curl jq' returned a non-zero code: 2

Error 2: docker-compose

[root@artemis datahub]# ./docker/quickstart.sh
/usr/lib/python2.7/site-packages/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
  from cryptography.hazmat.backends import default_backend
Traceback (most recent call last):
  File "/usr/bin/docker-compose", line 5, in <module>
    from compose.cli.main import main
  File "/usr/lib/python2.7/site-packages/compose/cli/main.py", line 24, in <module>
    from ..config import ConfigurationError
  File "/usr/lib/python2.7/site-packages/compose/config/__init__.py", line 6, in <module>
    from .config import ConfigurationError
  File "/usr/lib/python2.7/site-packages/compose/config/config.py", line 51, in <module>
    from .validation import match_named_volumes
  File "/usr/lib/python2.7/site-packages/compose/config/validation.py", line 12, in <module>
    from jsonschema import Draft4Validator
  File "/usr/lib/python2.7/site-packages/jsonschema/__init__.py", line 21, in <module>
    from jsonschema._types import TypeChecker
  File "/usr/lib/python2.7/site-packages/jsonschema/_types.py", line 3, in <module>
    from pyrsistent import pmap
  File "/usr/lib64/python2.7/site-packages/pyrsistent/__init__.py", line 3, in <module>
    from pyrsistent._pmap import pmap, m, PMap
  File "/usr/lib64/python2.7/site-packages/pyrsistent/_pmap.py", line 98
    ) from e
         ^
SyntaxError: invalid syntax

Error 3: switch to a domestic mirror (Error 5 has another fix; use whichever applies)

WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz: network error (check Internet connection and firewall)

The correct fix is to overwrite /etc/apk/repositories completely with a domestic mirror. Add the second line below to the Dockerfile:

FROM alpine:3.7
RUN echo -e http://mirrors.ustc.edu.cn/alpine/v3.7/main/ > /etc/apk/repositories

Files that may need the change:
/app/datahub/docker/datahub-mae-consumer/Dockerfile
/app/datahub/docker/datahub-gms/Dockerfile
/app/datahub/docker/datahub-frontend/Dockerfile
/app/datahub/docker/datahub-mce-consumer/Dockerfile

I am not sure which file or files actually need it, so to be safe add the line after every FROM; a single file may contain several. This approach seems to have problems, so I am setting it aside for now.
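The "add it after every FROM" idea can be sketched with awk rather than editing each file by hand. Everything here is an assumption layered on the note above: add_apk_mirror is a hypothetical name, the function only prints the patched Dockerfile to stdout instead of changing the file, and the Alpine version has to match the base image (the Error 1 log shows v3.6 for jwilder/dockerize:0.6.1, while the example above uses v3.7).

```shell
# Hypothetical helper: print a copy of a Dockerfile with the USTC Alpine
# mirror line inserted after every FROM instruction.
# $1 = path to the Dockerfile, $2 = Alpine version directory (default: v3.7)
add_apk_mirror() {
    dockerfile=$1
    alpine_version=${2:-v3.7}
    awk -v ver="$alpine_version" '
        # Copy every line through unchanged
        { print }
        # After each FROM line, add the mirror override
        toupper($1) == "FROM" {
            print "RUN echo -e http://mirrors.ustc.edu.cn/alpine/" ver "/main/ > /etc/apk/repositories"
        }
    ' "$dockerfile"
}
```

For example, add_apk_mirror docker/datahub-gms/Dockerfile v3.6 > Dockerfile.patched lets you review the result before replacing the original. Stages built on a non-Alpine image do not use apk, so the extra line has no useful effect there.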

https://www.jianshu.com/p/eb34e7088c77

Official guide

https://github.com/linkedin/datahub/blob/master/docs/debugging.md#how-can-i-confirm-if-all-docker-containers-are-running-as-expected-after-a-quickstart
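The question that guide answers (are all containers running after a quickstart?) can also be checked from a script. The following is a rough sketch, not something taken from the guide: missing_containers is a hypothetical helper that prints every expected container name that docker ps does not list as running.

```shell
# Hypothetical helper: print each expected container that is not running.
# Arguments: the container names you expect to see.
missing_containers() {
    # One running container name per line
    running=$(docker ps --format '{{.Names}}')
    for name in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            echo "$name"
        fi
    done
}
```

A call such as missing_containers datahub-frontend datahub-gms mysql elasticsearch neo4j broker prints nothing when all of them are up.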

Check startup

[root@ares ~]# docker container ls
CONTAINER ID  IMAGE                                  COMMAND                 CREATED       STATUS       PORTS                                                     NAMES
14ef8eedaf60  linkedin/datahub-frontend:latest       "datahub-frontend/bi…"  15 hours ago  Up 15 hours  0.0.0.0:9001->9001/tcp                                    datahub-frontend
8377aad77608  landoop/kafka-topics-ui:0.9.4          "/run.sh"               15 hours ago  Up 15 hours  0.0.0.0:18000->8000/tcp                                   kafka-topics-ui
a5ef70bde9f1  linkedin/datahub-mae-consumer:latest   "/bin/sh -c /datahub…"  15 hours ago  Up 15 hours  9090/tcp, 0.0.0.0:9091->9091/tcp                          datahub-mae-consumer
f5ab5ae53011  linkedin/datahub-gms:latest            "/bin/sh -c /datahub…"  15 hours ago  Up 15 hours  0.0.0.0:8080->8080/tcp                                    datahub-gms
f5cda035d5e5  confluentinc/cp-kafka-rest:5.4.0       "/etc/confluent/dock…"  15 hours ago  Up 15 hours  0.0.0.0:8082->8082/tcp                                    kafka-rest-proxy
2328387122a3  landoop/schema-registry-ui:latest      "/run.sh"               15 hours ago  Up 15 hours  0.0.0.0:8000->8000/tcp                                    schema-registry-ui
95acca24d698  confluentinc/cp-schema-registry:5.4.0  "/etc/confluent/dock…"  15 hours ago  Up 15 hours  0.0.0.0:8081->8081/tcp                                    schema-registry
58e7a0d307d2  confluentinc/cp-kafka:5.4.0            "/etc/confluent/dock…"  15 hours ago  Up 15 hours  0.0.0.0:9092->9092/tcp, 0.0.0.0:29092->29092/tcp          broker
1b4b6dec57e9  kibana:5.6.8                           "/docker-entrypoint.…"  15 hours ago  Up 15 hours  0.0.0.0:5601->5601/tcp                                    kibana
61a90895e756  neo4j:4.0.6                            "/sbin/tini -g -- /d…"  15 hours ago  Up 15 hours  0.0.0.0:7474->7474/tcp, 7473/tcp, 0.0.0.0:7687->7687/tcp  neo4j
5b7ab8c768c1  confluentinc/cp-zookeeper:5.4.0        "/etc/confluent/dock…"  15 hours ago  Up 15 hours  2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp                zookeeper
da9188c1035f  elasticsearch:5.6.8                    "/docker-entrypoint.…"  15 hours ago  Up 15 hours  0.0.0.0:9200->9200/tcp, 9300/tcp                          elasticsearch
49252b1240b8  mysql:5.7                              "docker-entrypoint.s…"  15 hours ago  Up 15 hours  0.0.0.0:3306->3306/tcp, 33060/tcp                         mysql

Switching to a domestic mirror and installing vim inside a container (Error 5 below has another fix, applied when Docker loads the image)

# Switch to a domestic mirror (USTC)
echo -e "deb http://mirrors.ustc.edu.cn/debian buster main contrib non-free \n\
deb http://mirrors.ustc.edu.cn/debian buster-backports main contrib non-free \n\
deb http://mirrors.ustc.edu.cn/debian buster-proposed-updates main contrib non-free \n\
deb http://mirrors.ustc.edu.cn/debian-security buster/updates main contrib non-free \n" \
> /etc/apt/sources.list

root@mysql:/# apt-get clean
root@mysql:/# apt-get update
Get:1 http://mirrors.ustc.edu.cn/debian buster InRelease [121 kB]
Get:2 http://mirrors.ustc.edu.cn/debian buster-backports InRelease [46.7 kB]
Get:3 http://mirrors.ustc.edu.cn/debian buster-proposed-updates InRelease [54.5 kB]
Get:4 http://mirrors.ustc.edu.cn/debian-security buster/updates InRelease [65.4 kB]
Get:5 http://mirrors.ustc.edu.cn/debian buster/non-free amd64 Packages [87.7 kB]
Get:6 http://mirrors.ustc.edu.cn/debian buster/contrib amd64 Packages [50.2 kB]
Hit:7 http://repo.mysql.com/apt/debian buster InRelease
Get:8 http://mirrors.ustc.edu.cn/debian buster/main amd64 Packages [7907 kB]
Get:9 http://mirrors.ustc.edu.cn/debian buster-backports/non-free amd64 Packages [29.0 kB]
Get:10 http://mirrors.ustc.edu.cn/debian buster-backports/contrib amd64 Packages [7816 B]
Get:11 http://mirrors.ustc.edu.cn/debian buster-backports/main amd64 Packages [410 kB]
Get:12 http://mirrors.ustc.edu.cn/debian buster-proposed-updates/main amd64 Packages [50.1 kB]
Get:13 http://mirrors.ustc.edu.cn/debian-security buster/updates/main amd64 Packages [260 kB]
Get:14 http://mirrors.ustc.edu.cn/debian-security buster/updates/non-free amd64 Packages [556 B]
Fetched 9090 kB in 4s (2509 kB/s)
Reading package lists... Done

# Install vim
apt-get update syncs the package indexes for the sources listed in /etc/apt/sources.list and /etc/apt/sources.list.d, so that the latest packages can be found. Once the update finishes, run apt-get install vim.

Logging in to MySQL

docker exec -it mysql /usr/bin/mysql datahub --user=datahub --password=datahub

List all images
docker images

# In the commands below, tail -n +2 skips the header row of the listing

1. Start all containers
docker start $(docker ps -a | awk '{ print $1}' | tail -n +2)

2. Stop all containers
docker stop $(docker ps -a | awk '{ print $1}' | tail -n +2)

3. Remove all containers
docker rm $(docker ps -a | awk '{ print $1}' | tail -n +2)

4. Remove all images (use with care)
docker rmi $(docker images | awk '{print $3}' | tail -n +2)

systemctl status docker
docker container ls

Check each Docker container's logs with docker logs <container_name>.

For datahub-gms, you should see a log line like the following at the end of initialization:

docker logs datahub-gms
2022-01-05 09:20:54.870:INFO:oejs.Server:main: Started @18807ms
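Rather than rerunning docker logs by hand until that line appears, a polling loop can do the waiting. This is a minimal sketch under assumptions: wait_for_log is a hypothetical helper, and the defaults of 30 attempts and a 2 second pause are arbitrary.

```shell
# Hypothetical helper: poll a container's logs until a pattern shows up.
# $1 = container name, $2 = text to look for,
# $3 = max attempts (default 30), $4 = seconds between attempts (default 2)
wait_for_log() {
    container=$1
    pattern=$2
    attempts=${3:-30}
    interval=${4:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # docker logs can write to stderr as well, so merge both streams
        if docker logs "$container" 2>&1 | grep -qF -- "$pattern"; then
            echo "$container is ready"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$container did not log '$pattern' in time" >&2
    return 1
}
```

For datahub-gms this would be called as wait_for_log datahub-gms "Started @". Other containers can be checked the same way with the text that marks the end of their own initialization.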

For datahub-frontend, you should see a log line like the following at the end of initialization:

docker logs datahub-frontend
09:20:22 [main] INFO play.core.server.AkkaHttpServer - Listening for HTTP on /0.0.0.0:9001

Error 4: running ./docker/ingestion/ingestion.sh

[root@ares datahub]# ./docker/ingestion/ingestion.sh
WARNING: Native build is an experimental feature and could change at any time
WARNING: Found orphan containers (datahub-mae-consumer, schema-registry-ui, broker, kafka-setup, kafka-topics-ui, datahub-frontend, mysql, datahub-mce-consumer, kafka-rest-proxy, neo4j, kibana, elasticsearch-setup, datahub-gms, elasticsearch, schema-registry, zookeeper) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Building ingestion
[+] Building 86.4s (9/11)
 => [internal] load build definition from Dockerfile 0.0s
 => => transferring dockerfile: 32B 0.0s
 => [internal] load .dockerignore 0.0s
 => => transferring context: 35B 0.0s
 => [internal] load metadata for docker.io/library/openjdk:8 3.7s
 => [internal] load metadata for docker.io/library/openjdk:8-jre-alpine 4.7s
 => [prod-build 1/3] FROM docker.io/library/openjdk:8@sha256:c1dcc499d35d74a93c6cbfb1819a88bd588e06741d23f9a1962f636799d77822 0.0s
 => [internal] load build context 0.5s
 => => transferring context: 385.15kB 0.4s
 => CACHED [base 1/1] FROM docker.io/library/openjdk:8-jre-alpine@sha256:f362b165b870ef129cbe730f29065ff37399c0aa8bcab3e44b51c302938c9193 0.0s
 => CACHED [prod-build 2/3] COPY . datahub-src 0.0s
 => ERROR [prod-build 3/3] RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build 81.2s
------
 > [prod-build 3/3] RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build:
#9 0.579 Downloading https://services.gradle.org/distributions/gradle-5.6.4-bin.zip
#9 4.484 .........................................................................................
#9 32.15
#9 32.15 Welcome to Gradle 5.6.4!
#9 32.15
#9 32.15 Here are the highlights of this release:
#9 32.15 - Incremental Groovy compilation
#9 32.15 - Groovy compile avoidance
#9 32.15 - Test fixtures for Java projects
#9 32.15 - Manage plugin versions via settings script
#9 32.15
#9 32.15 For more details see https://docs.gradle.org/5.6.4/release-notes.html
#9 32.15
#9 32.34 To honour the JVM settings for this build a new JVM will be forked. Please consider using the daemon: https://docs.gradle.org/5.6.4/userguide/gradle_daemon.html.
#9 33.84 Daemon will be stopped at the end of the build stopping after processing
#9 36.74 Configuration on demand is an incubating feature.
#9 80.64
#9 80.64 FAILURE: Build failed with an exception.
#9 80.64
#9 80.64 * What went wrong:
#9 80.64 A problem occurred configuring root project 'datahub-src'.
#9 80.64 > Could not resolve all artifacts for configuration ':classpath'.
#9 80.64 > Could not resolve com.linkedin.pegasus:gradle-plugins:28.3.7.
#9 80.64 Required by:
#9 80.64 project :
#9 80.64 > Could not resolve com.linkedin.pegasus:gradle-plugins:28.3.7.
#9 80.64 > Could not get resource 'https://linkedin.bintray.com/maven/com/linkedin/pegasus/gradle-plugins/28.3.7/gradle-plugins-28.3.7.pom'.
#9 80.64 > Could not GET 'https://linkedin.bintray.com/maven/com/linkedin/pegasus/gradle-plugins/28.3.7/gradle-plugins-28.3.7.pom'.
#9 80.64 > linkedin.bintray.com: Temporary failure in name resolution
#9 80.64
#9 80.64 * Try:
#9 80.64 Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
#9 80.64
#9 80.64 * Get more help at https://help.gradle.org
#9 80.64
#9 80.64 BUILD FAILED in 1m 20s
------
executor failed running [/bin/sh -c cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build]: exit code: 1
ERROR: Service 'ingestion' failed to build

Solution 1:

1. Make sure you are in the DataHub root directory, then create a new file there:

vim sources.list

deb http://mirrors.ustc.edu.cn/debian buster main contrib non-free
deb http://mirrors.ustc.edu.cn/debian buster-backports main contrib non-free
deb http://mirrors.ustc.edu.cn/debian buster-proposed-updates main contrib non-free
deb http://mirrors.ustc.edu.cn/debian-security buster/updates main contrib non-free

2. Edit ./docker/ingestion/Dockerfile so that it looks like this:

cat ./docker/ingestion/Dockerfile

# Defining environment
ARG APP_ENV=prod

FROM openjdk:8-jre-alpine as base

FROM openjdk:8 as prod-build
COPY . datahub-src
# Replace the apt sources with the domestic mirror list from step 1
COPY sources.list /etc/apt/sources.list
# Refresh the package index
RUN apt-get update
RUN ls /etc/apt/sources.list
RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build

FROM base as prod-install
COPY --from=prod-build datahub-src/metadata-ingestion-examples/mce-cli/build/libs/mce-cli.jar /datahub/ingestion/bin/mce-cli.jar
COPY --from=prod-build datahub-src/metadata-ingestion-examples/mce-cli/example-bootstrap.json /datahub/ingestion/example-bootstrap.json

FROM base as dev-install
# Dummy stage for development. Assumes code is built on your machine and mounted to this image.
# See this excellent thread https://github.com/docker/cli/issues/1134

FROM ${APP_ENV}-install as final
CMD java -jar /datahub/ingestion/bin/mce-cli.jar -m produce /datahub/ingestion/example-bootstrap.json

3. Run ./docker/ingestion/ingestion.sh several times until it succeeds.
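Step 3 amounts to retrying by hand. A small retry loop could do the same thing; this is just a sketch, and retry is a hypothetical helper name, not part of DataHub or Docker.

```shell
# Hypothetical helper: run a command up to N times, stopping at the first success.
# $1 = maximum number of attempts, remaining arguments = the command to run
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            echo "succeeded on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

From the DataHub root directory it would be used as retry 5 ./docker/ingestion/ingestion.sh. The limit of 5 is arbitrary.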

Error 5

Step 1/7 : FROM jwilder/dockerize:0.6.1
 ---> 849596ab86ff
Step 2/7 : RUN apk add --no-cache curl jq
 ---> [Warning] IPv4 forwarding is disabled. Networking will not work.

Solution

Step 1: on the host, run
echo "net.ipv4.ip_forward=1" >>/usr/lib/sysctl.d/00-system.conf

Step 2: restart the network and docker services
[root@localhost /]# systemctl restart network && systemctl restart docker
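To confirm that the change took effect, the kernel flag can be read back from /proc. This is a minimal sketch assuming a Linux host; ip_forward_enabled is a hypothetical helper, and its optional argument exists only so that a different file can be checked.

```shell
# Hypothetical helper: succeed when IPv4 forwarding is switched on.
# $1 = file holding the flag (default: the kernel's /proc entry)
ip_forward_enabled() {
    flag_file=${1:-/proc/sys/net/ipv4/ip_forward}
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

if ip_forward_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is disabled"
fi
```

If it still reports disabled after the restart, sysctl -w net.ipv4.ip_forward=1 turns the flag on for the running kernel.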
