Contents
- Environment
- Deployment
- Accessing the Kubeflow UI
- Troubleshooting
Environment
Kubernetes version: v1.23.17. A default StorageClass (sc) must be configured.
Reference: https://github.com/kubeflow/manifests/tree/v1.7.0
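Whether a default StorageClass exists can be checked before installing. The following is a minimal sketch, assuming kubectl is already pointed at the target cluster; the helper name has_default_sc is made up for illustration, and <sc-name> is a placeholder for a class that exists in your cluster.

```shell
# Hypothetical helper: succeeds if some StorageClass is marked as default.
# `kubectl get storageclass` appends "(default)" to the default class name.
has_default_sc() {
  kubectl get storageclass 2>/dev/null | grep -q '(default)'
}

# Usage:
#   has_default_sc && echo "default StorageClass found" \
#                  || echo "no default StorageClass"
#
# To mark an existing class as default (replace <sc-name> with a real class):
#   kubectl patch storageclass <sc-name> -p \
#     '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```

Without a default class, the PersistentVolumeClaims created by components such as MySQL and MinIO would likely stay Pending.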
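Deployment
# Download the release archive (saved under the name that tar expects below)
wget -O manifests-1.7.0.tar.gz https://github.com/kubeflow/manifests/archive/refs/tags/v1.7.0.tar.gz
# Extract it
tar -zxvf manifests-1.7.0.tar.gz
# List the images the deployment needs
cd manifests-1.7.0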
# Match both "image: x" and "- image: x" lines and print the image reference
kustomize build example | grep -E '^[[:space:]]*(- )?image: ' | awk '{print $NF}' | sort -u
Adjust the image registry (replace gcr.io with the gcr.dockerproxy.com mirror):
find /kubeflow/manifests-1.7.0/ -type f -name "*.yaml" | xargs sed -i 's#gcr.io#gcr.dockerproxy.com#g'
find /kubeflow/manifests-1.7.0/ -type f -name "*.py" | xargs sed -i 's#gcr.io#gcr.dockerproxy.com#g'
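Before rewriting files in place, it can help to see which ones would change. This is a rough sketch only; the function name list_gcr_files is invented here, and the directory argument is whatever path the manifests were extracted to. It assumes GNU find, xargs, and grep.

```shell
# Hypothetical helper: list YAML and Python files under a directory that
# reference gcr.io. The dot is escaped so it matches a literal ".".
list_gcr_files() {
  find "$1" -type f \( -name '*.yaml' -o -name '*.py' \) -print0 \
    | xargs -0 -r grep -l 'gcr\.io' \
    | sort
}

# Usage:
#   list_gcr_files /kubeflow/manifests-1.7.0/
```

Using -print0 with xargs -0 keeps file names that contain spaces intact, and -r skips grep entirely when find returns nothing.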
Deploy:
while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
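The loop above retries forever, which is fine interactively but can hang a script if something is permanently broken. One possible variant with an upper bound is sketched here; the helper names retry and apply_manifests, and the values 30 and 10 in the usage line, are illustrative assumptions rather than part of the Kubeflow instructions.

```shell
# Hypothetical helper: run a command until it succeeds, at most $1 times,
# sleeping $2 seconds between attempts. Returns non-zero if every attempt fails.
retry() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Retrying to apply resources ($attempt/$max)"
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# The same pipeline as in the loop above, wrapped so retry can call it.
apply_manifests() {
  kustomize build example | awk '!/well-defined/' | kubectl apply -f -
}

# Usage:
#   retry 30 10 apply_manifests
```

Retries are expected on a first install, since some resources depend on CRDs and webhooks that need a moment to become available.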
Check that the Pods are running:
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-b4b465456-cqpmd 1/1 Running 0 12h
cert-manager-cainjector-64d74f9c8f-h8sbd 1/1 Running 0 12h
cert-manager-webhook-66fff58cdf-lh7tc 1/1 Running 0 12h
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
authservice-0 1/1 Running 0 12h
cluster-local-gateway-7f55dcfff7-lnht5 1/1 Running 0 12h
istio-ingressgateway-869ccf7495-bd547 1/1 Running 0 12h
istiod-69d59d9787-gzqxh 1/1 Running 0 12h
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n auth
NAME READY STATUS RESTARTS AGE
dex-86c6ff6df8-fkk4c 1/1 Running 0 12h
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n knative-eventing
NAME READY STATUS RESTARTS AGE
eventing-controller-7889878c4f-zpp5w 1/1 Running 0 12h
eventing-webhook-67f458d8dc-wzsw5 1/1 Running 0 12h
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n knative-serving
NAME READY STATUS RESTARTS AGE
activator-5b8f844df6-bcbz7 2/2 Running 7 (12h ago) 12h
autoscaler-db588db95-lwx9v 2/2 Running 0 12h
controller-67cf9bbc8-nx29h 2/2 Running 0 12h
domain-mapping-5cdc99c95c-mstvm 2/2 Running 0 12h
domainmapping-webhook-7b6c4fccbd-69xpg 2/2 Running 0 12h
net-istio-controller-8468c9f8d5-dn92x 2/2 Running 0 12h
net-istio-webhook-6d55c8b86c-xnjlb 2/2 Running 0 12h
webhook-85c77fccfc-7ncdp 2/2 Running 0 12h
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
admission-webhook-deployment-657697f86-nd82z 1/1 Running 0 12h
cache-server-666dbc749-twpsz 2/2 Running 0 12h
centraldashboard-554fbb8f9d-lqwb4 2/2 Running 0 12h
jupyter-web-app-deployment-787c9ccf46-zjkf4 2/2 Running 0 12h
katib-controller-6df466949b-d9tfh 1/1 Running 0 12h
katib-db-manager-6c7cdd865d-cs42k 1/1 Running 0 12h
katib-mysql-6975d6c6c4-rxrq8 1/1 Running 0 12h
katib-ui-cd5f5fbd6-hbk7h 2/2 Running 1 (25m ago) 24m
kserve-controller-manager-5fc9cbcdf8-vccnk 2/2 Running 0 12h
kserve-models-web-app-7d99fdb-jz6jr 2/2 Running 0 12h
kubeflow-pipelines-profile-controller-558b7678d8-trqpt 1/1 Running 0 12h
metacontroller-0 1/1 Running 0 12h
metadata-envoy-deployment-5788595668-87z64 1/1 Running 0 12h
metadata-grpc-deployment-75fb876c4b-pmmpk 2/2 Running 1 (25m ago) 24m
metadata-writer-56b4c57949-7vzqk 2/2 Running 0 12h
minio-88f9db94d-nzwcl 2/2 Running 0 12h
ml-pipeline-5f974c9879-6pgkk 2/2 Running 7 (12h ago) 12h
ml-pipeline-persistenceagent-548958c9-nrk95 2/2 Running 0 12h
ml-pipeline-scheduledworkflow-8699d58b74-xwz9x 2/2 Running 0 12h
ml-pipeline-ui-84f68c8899-hmkl6 2/2 Running 0 12h
ml-pipeline-viewer-crd-67f995fd8c-c94bn 2/2 Running 1 (12h ago) 12h
ml-pipeline-visualizationserver-564586897b-dgwqc 2/2 Running 0 12h
mysql-77ff498954-bb74m 2/2 Running 0 12h
notebook-controller-deployment-7d6df9f67c-fv9q5 2/2 Running 1 (22m ago) 21m
profiles-deployment-c46c4fb9f-gk8pf 3/3 Running 1 (12h ago) 12h
tensorboard-controller-deployment-649d96556f-hgmdn 3/3 Running 2 (12h ago) 12h
tensorboards-web-app-deployment-64b8b6b9cc-2rg9s 2/2 Running 0 12h
training-operator-64c4cfc8bb-hsqpx 1/1 Running 0 12h
volumes-web-app-deployment-8b6b8f49d-2chwz 2/2 Running 0 12h
workflow-controller-6b6495dd65-whnss 2/2 Running 2 (12h ago) 12h
root@ser-compute-07:/srv/k8s_yaml/kubeflow/manifests-1.7.0# kubectl get pods -n kubeflow-user-example-com
NAME READY STATUS RESTARTS AGE
ml-pipeline-ui-artifact-755fbf99d-4phbh 2/2 Running 0 18m
ml-pipeline-visualizationserver-75c845688d-5hxqn 2/2 Running 0 18m
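Scanning listings like the ones above by eye gets tedious. As a small sketch, the filter below reads the output of kubectl get pods for a single namespace and prints only the pods that need attention; the name not_ready_pods is hypothetical, and it assumes the default column layout (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the names of pods that are not Running or whose READY count is short.
not_ready_pods() {
  awk 'NR > 1 {
         split($2, r, "/")
         if ($3 != "Running" || r[1] != r[2]) print $1
       }'
}

# Usage (one namespace at a time; -A adds a NAMESPACE column and would
# shift the fields this filter relies on):
#   kubectl get pods -n kubeflow | not_ready_pods
```

Empty output means every pod in that namespace is Running with all containers ready. Pods in the Completed state would also be listed, since their status is not Running.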
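Accessing the Kubeflow UI
Change the istio-ingressgateway Service to type NodePort so the UI can be reached from outside the cluster:
kubectl patch service istio-ingressgateway -n istio-system -p '{"spec":{"type":"NodePort"}}'
Then log in with the default user's credentials. The default email address is user@example.com and the default password is 12341234.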
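The NodePort patch leaves the actual port number to Kubernetes. A minimal sketch for looking it up follows, assuming the gateway Service exposes plain HTTP on service port 80 as in the stock manifests; the helper name ingress_http_nodeport and the <node-ip> placeholder are illustrative.

```shell
# Hypothetical helper: print the node port mapped to service port 80
# (plain HTTP) of the istio-ingressgateway Service.
ingress_http_nodeport() {
  kubectl get service istio-ingressgateway -n istio-system \
    -o jsonpath='{.spec.ports[?(@.port==80)].nodePort}'
}

# Usage (replace <node-ip> with the address of any cluster node):
#   echo "http://<node-ip>:$(ingress_http_nodeport)"
```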
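Troubleshooting
1. Creating a Notebook fails
This typically happens when the UI is reached over plain HTTP (for example through the NodePort above), because the Jupyter web app marks its cookies as Secure by default. Edit the deployment and set APP_SECURE_COOKIES to "false":
kubectl edit deployments.apps -n kubeflow jupyter-web-app-deployment
......
- name: APP_SECURE_COOKIES
  value: "false"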
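The same change can be made without opening an editor. The sketch below uses kubectl set env; the wrapper name disable_secure_cookies is hypothetical, while the deployment name, namespace, and variable are the ones from the step above.

```shell
# Hypothetical helper: set APP_SECURE_COOKIES=false on the jupyter-web-app
# deployment non-interactively. Changing the env triggers a new rollout.
disable_secure_cookies() {
  kubectl set env deployment/jupyter-web-app-deployment \
    -n kubeflow APP_SECURE_COOKIES=false
}

# Usage:
#   disable_secure_cookies
#   kubectl rollout status deployment/jupyter-web-app-deployment -n kubeflow
```

Re-applying the original manifests would likely revert this setting, so it may need to be applied again after an upgrade.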