Installing production-grade distributed storage (Rook + Ceph) on K8s with CentOS 7.9

1. Introduction

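On a cloud-native K8s platform, storage is the other core component besides networking, because it determines how data is persisted and how disaster recovery is handled. A production-grade deployment needs multi-node distribution, fast recovery after a failure, and smooth data migration. Rook + Ceph is a K8s storage solution we commonly use in production. The steps below install it on K8s.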

2. Install K8s

Install K8s by following the article <<Centos7.9 yum形式安装kubernetes1.19.9(K8s)系统>> (installing Kubernetes 1.19.9 on CentOS 7.9 with yum).

3. Upgrade the kernel

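CephFS needs kernel 4.17 or later, so the operating system kernel on every K8s node has to be upgraded. A freshly installed CentOS 7 ships kernel 3.10, and that is what the K8s nodes run after a default installation. Each node also has a second disk that will serve as rook-ceph storage.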
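
Before upgrading, it can help to confirm which nodes are actually below the 4.17 threshold. The following is a minimal sketch, not part of the original procedure: the helper name kernel_at_least is hypothetical, and it assumes a sort that supports -V (GNU coreutils does).

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# sort -V orders version strings; if the required version sorts first
# (or both are equal), the current version is new enough.
kernel_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

required="4.17"
current="$(uname -r)"
if kernel_at_least "$current" "$required"; then
  echo "kernel $current is new enough (>= $required)"
else
  echo "kernel $current is older than $required; upgrade needed"
fi
```

The same check can be rerun on each node after the reboot to confirm that the new kernel is the one that booted.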

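Download the kernel packages for the upgrade. If the download is too slow, they are also available from the cloud drive (extraction code: 6bqe).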

# Download the kernel packages
wget http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/kernel-ml-5.19.9-1.el7.elrepo.x86_64.rpm
wget http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/kernel-ml-devel-5.19.9-1.el7.elrepo.x86_64.rpm
wget http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/kernel-ml-headers-5.19.9-1.el7.elrepo.x86_64.rpm

yum -y install perl.x86_64

# Install the kernel packages
rpm -ivh kernel-ml-5.19.9-1.el7.elrepo.x86_64.rpm 
rpm -ivh kernel-ml-devel-5.19.9-1.el7.elrepo.x86_64.rpm 
rpm -ivh kernel-ml-headers-5.19.9-1.el7.elrepo.x86_64.rpm 

# Select the boot entry by index; 0 is the newly installed 5.19 kernel
grub2-set-default 0
# Regenerate the GRUB configuration
grub2-mkconfig -o /boot/grub2/grub.cfg
# Reboot the server
reboot

After the upgrade, every node in the cluster runs kernel 5.19.

4. Enable RBD and load the Ceph images

# Install LVM
yum -y install lvm2

# Load the rbd kernel module
modprobe rbd
# Create a startup script that runs every executable *.modules file
cat > /etc/rc.sysinit << EOF
#!/bin/bash
for file in /etc/sysconfig/modules/*.modules
do
  [ -x \$file ] && \$file
done
EOF

# Module file that loads rbd at boot
cat > /etc/sysconfig/modules/rbd.modules << EOF
modprobe rbd
EOF

chmod 755 /etc/sysconfig/modules/rbd.modules
lsmod |grep rbd
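
On systemd-based systems such as CentOS 7, modules listed in /etc/modules-load.d/*.conf are loaded at boot by systemd-modules-load. Some readers may prefer that mechanism over the rc.sysinit script above. This is a sketch of that alternative, not the method this article uses: the helper name write_rbd_autoload is hypothetical, and the directory is a parameter only so the function can be tried against a scratch directory first.

```shell
# Hypothetical helper: write a modules-load.d entry that loads rbd at boot.
# $1 is the target directory; it defaults to the real systemd location.
write_rbd_autoload() {
  dir="${1:-/etc/modules-load.d}"
  mkdir -p "$dir"
  # One module name per line is the format systemd-modules-load expects.
  printf 'rbd\n' > "$dir/rbd.conf"
}
```

Run as root, calling write_rbd_autoload with no argument writes /etc/modules-load.d/rbd.conf; after the next reboot, lsmod | grep rbd should list the module.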

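Load the rook-ceph images on every node. If they cannot be pulled from the official registries, download them from the cloud drive (extraction code: 6bqe) and load them: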


docker load < ./cephcephv1428.tar
docker load < ./cephcsiv122.tar
docker load < ./csi-attacherv120.tar
docker load < ./csi-node-driver-registrarv120.tar
docker load < ./csi-provisionerv140.tar
docker load < ./csi-snapshotterv122.tar
docker load < ./rookcephV1.2.6.tar
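
Because the same archives have to be loaded on every node, a small loop can save typing. This is a sketch under the assumption that all image archives sit in one directory and end in .tar; the function name load_images is hypothetical.

```shell
# Hypothetical helper: load every *.tar image archive found in directory $1
# (default: the current directory) into the local Docker daemon.
load_images() {
  dir="${1:-.}"
  for archive in "$dir"/*.tar; do
    # Skip the unexpanded pattern when the directory holds no archives.
    [ -e "$archive" ] || continue
    echo "loading $archive"
    docker load < "$archive" || return 1
  done
}
```

Calling load_images with the directory that holds the archives loads them one by one and stops at the first failure; docker images then shows what was imported.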

With the images loaded, download the Rook 1.2 package, either from the official site or from the same cloud drive.

5. Install the Ceph cluster

# Extract the source archive; this creates a rook directory in the current directory
tar -xzvf rook1.2.tar.gz
# Create the RBAC rules and other base resources
kubectl create -f rook/cluster/examples/kubernetes/ceph/common.yaml

Edit operator.yaml and change the operator image to rook/ceph:v1.2.6, the version already loaded on the nodes, then create the operator:

kubectl create -f rook/cluster/examples/kubernetes/ceph/operator.yaml 
kubectl -n rook-ceph get pod -o wide

The modified operator.yaml is as follows:

#################################################################################################################
# The deployment for the rook operator
# Contains the common settings for most Kubernetes deployments.
# For example, to create the rook-ceph cluster:
#   kubectl create -f common.yaml
#   kubectl create -f operator.yaml
#   kubectl create -f cluster.yaml
#
# Also see other operator sample files for variations of operator.yaml:
# - operator-openshift.yaml: Common settings for running in OpenShift
#################################################################################################################
# Rook Ceph Operator Config
# Use this ConfigMap to override operator configurations
# Precedence will be given to this config in case
# Env Var also exists for the same
#
kind: ConfigMap
apiVersion: v1
metadata:
  name: rook-ceph-operator-config
  # should be in the namespace of the operator
  namespace: rook-ceph
data:
  # # (Optional) Ceph Provisioner NodeAffinity.
  # CSI_PROVISIONER_NODE_AFFINITY: "role=storage-node; storage=rook, ceph"
  # # (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format.
  # # CSI provisioner would be best to start on the same nodes as other ceph daemons.
  # CSI_PROVISIONER_TOLERATIONS: |
  #   - effect: NoSchedule
  #     key: node-role.kubernetes.io/controlplane
  #     operator: Exists
  #   - effect: NoExecute
  #     key: node-role.kubernetes.io/etcd
  #     operator: Exists
  # # (Optional) Ceph CSI plugin NodeAffinity.
  # CSI_PLUGIN_NODE_AFFINITY: "role=storage-node; storage=rook, ceph"
  # # (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format.
  # # CSI plugins need to be started on all the nodes where the clients need to mount the storage.
  # CSI_PLUGIN_TOLERATIONS: |
  #   - effect: NoSchedule
  #     key: node-role.kubernetes.io/controlplane
  #     operator: Exists
  #   - effect: NoExecute
  #     key: node-role.kubernetes.io/etcd
  #     operator: Exists
---
# OLM: BEGIN OPERATOR DEPLOYMENT
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph
  labels:
    operator: rook
    storage-backend: ceph
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  replicas: 1
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      serviceAccountName: rook-ceph-system
      containers:
      - name: rook-ceph-operator
        image: rook/ceph:v1.2.6
        args: ["ceph", "operator"]
        volumeMounts:
        - mountPath: /var/lib/rook
          name: rook-config
        - mountPath: /etc/ceph
          name: default-config-dir
        env:
        # If the operator should only watch for cluster CRDs in the same namespace, set this to "true".
        # If this is not set to true, the operator will watch for cluster CRDs in all namespaces.
        - name: ROOK_CURRENT_NAMESPACE_ONLY
          value: "false"
        # To disable RBAC, uncomment the following:
        # - name: RBAC_ENABLED
        #   value: "false"
        # Rook Agent toleration. Will tolerate all taints with all keys.
        # Choose between NoSchedule, PreferNoSchedule and NoExecute:
        # - name: AGENT_TOLERATION
        #   value: "NoSchedule"
        # (Optional) Rook Agent toleration key. Set this to the key of the taint you want to tolerate
        # - name: AGENT_TOLERATION_KEY
        #   value: "<KeyOfTheTaintToTolerate>"
        # (Optional) Rook Agent tolerations list. Put here list of taints you want to tolerate in YAML format.
        # - name: AGENT_TOLERATIONS
        #   value: |
        #     - effect: NoSchedule
        #       key: node-role.kubernetes.io/controlplane
        #       operator: Exists
        #     - effect: NoExecute
        #       key: node-role.kubernetes.io/etcd
        #       operator: Exists
        # (Optional) Rook Agent priority class name to set on the pod(s)
        # - name: AGENT_PRIORITY_CLASS_NAME
        #   value: "<PriorityClassName>"
        # (Optional) Rook Agent NodeAffinity.
        # - name: AGENT_NODE_AFFINITY
        #   value: "role=storage-node; storage=rook,ceph"
        # (Optional) Rook Agent mount security mode. Can be `Any` or `Restricted`.
        # `Any` uses Ceph admin credentials by default/fallback.
        # For using `Restricted` you must have a Ceph secret in each namespace storage should be consumed from and
        # set `mountUser` to the Ceph user, `mountSecret` to the Kubernetes secret name.
        # to the namespace in which the `mountSecret` Kubernetes secret namespace.
        # - name: AGENT_MOUNT_SECURITY_MODE
        #   value: "Any"
        # Set the path where the Rook agent can find the flex volumes
        # - name: FLEXVOLUME_DIR_PATH
        #   value: "<PathToFlexVolumes>"
        # Set the path where kernel modules can be found
        # - name: LIB_MODULES_DIR_PATH
        #   value: "<PathToLibModules>"
        # Mount any extra directories into the agent container
        # - name: AGENT_MOUNTS
        #   value: "somemount=/host/path:/container/path,someothermount=/host/path2:/container/path2"
        # Rook Discover toleration. Will tolerate all taints with all keys.
        # Choose between NoSchedule, PreferNoSchedule and NoExecute:
        # - name: DISCOVER_TOLERATION
        #   value: "NoSchedule"
        # (Optional) Rook Discover toleration key. Set this to the key of the taint you want to tolerate
        # - name: DISCOVER_TOLERATION_KEY
        #   value: "<KeyOfTheTaintToTolerate>"
        # (Optional) Rook Discover tolerations list. Put here list of taints you want to tolerate in YAML format.
        # - name: DISCOVER_TOLERATIONS
        #   value: |
        #     - effect: NoSchedule
        #       key: node-role.kubernetes.io/controlplane
        #       operator: Exists
        #     - effect: NoExecute
        #       key: node-role.kubernetes.io/etcd
        #       operator: Exists
        # (Optional) Rook Discover priority class name to set on the pod(s)
        # - name: DISCOVER_PRIORITY_CLASS_NAME
        #   value: "<PriorityClassName>"
        # (Optional) Discover Agent NodeAffinity.
        # - name: DISCOVER_AGENT_NODE_AFFINITY
        #   value: "role=storage-node; storage=rook, ceph"
        # Allow rook to create multiple file systems. Note: This is considered
        # an experimental feature in Ceph as described at
        # http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster
        # which might cause mons to crash as seen in https://github.com/rook/rook/issues/1027
        - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
          value: "false"

        # The logging level for the operator: INFO | DEBUG
        - name: ROOK_LOG_LEVEL
          value: "INFO"

        # The interval to check the health of the ceph cluster and update the status in the custom resource.
        - name: ROOK_CEPH_STATUS_CHECK_INTERVAL
          value: "60s"

        # The interval to check if every mon is in the quorum.
        - name: ROOK_MON_HEALTHCHECK_INTERVAL
          value: "45s"

        # The duration to wait before trying to failover or remove/replace the
        # current mon with a new mon (useful for compensating flapping network).
        - name: ROOK_MON_OUT_TIMEOUT
          value: "600s"

        # The duration between discovering devices in the rook-discover daemonset.
        - name: ROOK_DISCOVER_DEVICES_INTERVAL
          value: "60m"

        # Whether to start pods as privileged that mount a host path, which includes the Ceph mon and osd pods.
        # This is necessary to workaround the anyuid issues when running on OpenShift.
        # For more details see https://github.com/rook/rook/issues/1314#issuecomment-355799641
        - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
          value: "false"

        # In some situations SELinux relabelling breaks (times out) on large filesystems, and doesn't work with cephfs ReadWriteMany volumes (last relabel wins).
        # Disable it here if you have similar issues.
        # For more details see https://github.com/rook/rook/issues/2417
        - name: ROOK_ENABLE_SELINUX_RELABELING
          value: "true"

        # In large volumes it will take some time to chown all the files. Disable it here if you have performance issues.
        # For more details see https://github.com/rook/rook/issues/2254
        - name: ROOK_ENABLE_FSGROUP
          value: "true"

        # Disable automatic orchestration when new devices are discovered
        - name: ROOK_DISABLE_DEVICE_HOTPLUG
          value: "false"

        # Provide customised regex as the values using comma. For eg. regex for rbd based volume, value will be like "(?i)rbd[0-9]+".
        # In case of more than one regex, use comma to separate between them.
        # Default regex will be "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
        # Add regex expression after putting a comma to blacklist a disk
        # If value is empty, the default regex will be used.
        - name: DISCOVER_DAEMON_UDEV_BLACKLIST
          value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"

        # Whether to enable the flex driver. By default it is enabled and is fully supported, but will be deprecated in some future release
        # in favor of the CSI driver.
        - name: ROOK_ENABLE_FLEX_DRIVER
          value: "false"

        # Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster.
        # This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs.
        - name: ROOK_ENABLE_DISCOVERY_DAEMON
          value: "true"

        # Enable the default version of the CSI CephFS driver. To start another version of the CSI driver, see image properties below.
        - name: ROOK_CSI_ENABLE_CEPHFS
          value: "true"

        # Enable the default version of the CSI RBD driver. To start another version of the CSI driver, see image properties below.
        - name: ROOK_CSI_ENABLE_RBD
          value: "true"
        - name: ROOK_CSI_ENABLE_GRPC_METRICS
          value: "true"
        # Enable deployment of snapshotter container in ceph-csi provisioner.
        - name: CSI_ENABLE_SNAPSHOTTER
          value: "true"
        # Enable Ceph Kernel clients on kernel < 4.17 which support quotas for Cephfs
        # If you disable the kernel client, your application may be disrupted during upgrade.
        # See the upgrade guide: https://rook.io/docs/rook/v1.2/ceph-upgrade.html
        - name: CSI_FORCE_CEPHFS_KERNEL_CLIENT
          value: "true"
        # CSI CephFS plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
        # Default value is RollingUpdate.
        #- name: CSI_CEPHFS_PLUGIN_UPDATE_STRATEGY
        #  value: "OnDelete"
        # CSI Rbd plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
        # Default value is RollingUpdate.
        #- name: CSI_RBD_PLUGIN_UPDATE_STRATEGY
        #  value: "OnDelete"
        # The default version of CSI supported by Rook will be started. To change the version
        # of the CSI driver to something other than what is officially supported, change
        # these images to the desired release of the CSI driver.
        #- name: ROOK_CSI_CEPH_IMAGE
        #  value: "quay.io/cephcsi/cephcsi:v2.0.0"
        #- name: ROOK_CSI_REGISTRAR_IMAGE
        #  value: "quay.io/k8scsi/csi-node-driver-registrar:v1.2.0"
        #- name: ROOK_CSI_RESIZER_IMAGE
        #  value: "quay.io/k8scsi/csi-resizer:v0.4.0"
        #- name: ROOK_CSI_PROVISIONER_IMAGE
        #  value: "quay.io/k8scsi/csi-provisioner:v1.4.0"
        #- name: ROOK_CSI_SNAPSHOTTER_IMAGE
        #  value: "quay.io/k8scsi/csi-snapshotter:v1.2.2"
        #- name: ROOK_CSI_ATTACHER_IMAGE
        #  value: "quay.io/k8scsi/csi-attacher:v2.1.0"
        # kubelet directory path, if kubelet configured to use other than /var/lib/kubelet path.
        #- name: ROOK_CSI_KUBELET_DIR_PATH
        #  value: "/var/lib/kubelet"
        # (Optional) Ceph Provisioner NodeAffinity.
        # - name: CSI_PROVISIONER_NODE_AFFINITY
        #   value: "role=storage-node; storage=rook, ceph"
        # (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format.
        #  CSI provisioner would be best to start on the same nodes as other ceph daemons.
        # - name: CSI_PROVISIONER_TOLERATIONS
        #   value: |
        #     - effect: NoSchedule
        #       key: node-role.kubernetes.io/controlplane
        #       operator: Exists
        #     - effect: NoExecute
        #       key: node-role.kubernetes.io/etcd
        #       operator: Exists
        # (Optional) Ceph CSI plugin NodeAffinity.
        # - name: CSI_PLUGIN_NODE_AFFINITY
        #   value: "role=storage-node; storage=rook, ceph"
        # (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format.
        # CSI plugins need to be started on all the nodes where the clients need to mount the storage.
        # - name: CSI_PLUGIN_TOLERATIONS
        #   value: |
        #     - effect: NoSchedule
        #       key: node-role.kubernetes.io/controlplane
        #       operator: Exists
        #     - effect: NoExecute
        #       key: node-role.kubernetes.io/etcd
        #       operator: Exists
        # Configure CSI cephfs grpc and liveness metrics port
        #- name: CSI_CEPHFS_GRPC_METRICS_PORT
        #  value: "9091"
        #- name: CSI_CEPHFS_LIVENESS_METRICS_PORT
        #  value: "9081"
        # Configure CSI rbd grpc and liveness metrics port
        #- name: CSI_RBD_GRPC_METRICS_PORT
        #  value: "9090"
        #- name: CSI_RBD_LIVENESS_METRICS_PORT
        #  value: "9080"

        # Time to wait until the node controller will move Rook pods to other
        # nodes after detecting an unreachable node.
        # Pods affected by this setting are:
        # mgr, rbd, mds, rgw, nfs, PVC based mons and osds, and ceph toolbox
        # The value used in this variable replaces the default value of 300 secs
        # added automatically by k8s as Toleration for
        # <node.kubernetes.io/unreachable>
        # The total amount of time to reschedule Rook pods in healthy nodes
        # before detecting a <not ready node> condition will be the sum of:
        #  --> node-monitor-grace-period: 40 seconds (k8s kube-controller-manager flag)
        #  --> ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS: 5 seconds
        - name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS
          value: "5"

        # The name of the node to pass with the downward API
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        # The pod name to pass with the downward API
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        # The pod namespace to pass with the downward API
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      # Uncomment it to run rook operator on the host network
      #hostNetwork: true
      volumes:
      - name: rook-config
        emptyDir: {}
      - name: default-config-dir
        emptyDir: {}
# OLM: END OPERATOR DEPLOYMENT

Once the operator is installed, configure the rook-ceph cluster in cluster.yaml. A few settings need to change:

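# Point the cluster config at the Ceph image that was loaded earlier
sed -i 's|ceph/ceph:v14.2.9|ceph/ceph:v14.2.8|g' rook/cluster/examples/kubernetes/ceph/cluster.yaml
# Turn off automatic selection of all nodes and all devices; nodes and devices are listed by hand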
sed -i 's|useAllNodes: true|useAllNodes: false|g' rook/cluster/examples/kubernetes/ceph/cluster.yaml
sed -i 's|useAllDevices: true|useAllDevices: false|g' rook/cluster/examples/kubernetes/ceph/cluster.yaml

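In the storage section, add the following under config: and list the nodes explicitly, so that the second disk (sdb) on each node is used for Ceph storage: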

      metadataDevice:
      databaseSizeMB: "1024"
      journalSizeMB:  "1024"
    nodes:
    - name: "node1"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
    - name: "node2"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
    - name: "node3"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore

kubectl apply -f rook/cluster/examples/kubernetes/ceph/cluster.yaml
kubectl -n rook-ceph get pod -o wide

The complete cluster.yaml is as follows:

#################################################################################################################
# Define the settings for the rook-ceph cluster with common settings for a production cluster.
# All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required
# in this example. See the documentation for more details on storage settings available.

# For example, to create the cluster:
#   kubectl create -f common.yaml
#   kubectl create -f operator.yaml
#   kubectl create -f cluster.yaml
#################################################################################################################

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v13 is mimic, v14 is nautilus, and v15 is octopus.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v14 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    # If you want to be more precise, you can always use a timestamp tag such ceph/ceph:v14.2.5-20190917
    # This tag might not contain a new Ceph version, just security fixes from the underlying operating system, which will reduce vulnerabilities
    image: ceph/ceph:v14.2.8
    # Whether to allow unsupported versions of Ceph. Currently mimic and nautilus are supported, with the recommendation to upgrade to nautilus.
    # Octopus is the version allowed when this is set to true.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. Must be specified.
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # Whether or not upgrade should continue even if a check fails
  # This means Ceph's status could be degraded and we don't recommend upgrading but you might decide otherwise
  # Use at your OWN risk
  # To understand Rook's upgrade process of Ceph, read https://rook.io/docs/rook/master/ceph-upgrade.html#ceph-version-upgrades
  skipUpgradeChecks: false
  # Whether or not continue if PGs are not clean during an upgrade
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  # set the amount of mons to be started
  mon:
    count: 3
    allowMultiplePerNode: false
  # mgr:
    # modules:
    # Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
    # are already enabled by other settings in the cluster CR and the "rook" module is always enabled.
    # - name: pg_autoscaler
    #   enabled: true
  # enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
    # serve the dashboard at the given port.
    # port: 8443
    # serve the dashboard using SSL
    ssl: true
  # enable prometheus alerting for cluster
  monitoring:
    # requires Prometheus to be pre-installed
    enabled: false
    # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
    # Recommended:
    # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
    # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
    # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
    rulesNamespace: rook-ceph
  network:
    # toggle to use hostNetwork
    hostNetwork: false
  rbdMirroring:
    # The number of daemons that will perform the rbd mirroring.
    # rbd mirroring must be configured with "rbd mirror" from the rook toolbox.
    workers: 0
  # enable the crash collector for ceph daemon crash collection
  crashCollector:
    disable: false
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
  # tolerate taints with a key of 'storage-node'.
#  placement:
#    all:
#      nodeAffinity:
#        requiredDuringSchedulingIgnoredDuringExecution:
#          nodeSelectorTerms:
#          - matchExpressions:
#            - key: role
#              operator: In
#              values:
#              - storage-node
#      podAffinity:
#      podAntiAffinity:
#      tolerations:
#      - key: storage-node
#        operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
#    mon:
# Monitor deployments may contain an anti-affinity rule for avoiding monitor
# collocation on the same node. This is a required rule when host network is used
# or when AllowMultiplePerNode is false. Otherwise this anti-affinity rule is a
# preferred rule with weight: 50.
#    osd:
#    mgr:
  annotations:
#    all:
#    mon:
#    osd:
# If no mgr annotations are set, prometheus scrape annotations will be set by default.
#   mgr:
  resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
#    mgr:
#      limits:
#        cpu: "500m"
#        memory: "1024Mi"
#      requests:
#        cpu: "500m"
#        memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
#    mon:
#    osd:
#    prepareosd:
#    crashcollector:
  # The option to automatically remove OSDs that are out and are safe to destroy.
  removeOSDsIfOutAndSafeToRemove: false
#  priorityClassNames:
#    all: rook-ceph-default-priority-class
#    mon: rook-ceph-mon-priority-class
#    osd: rook-ceph-osd-priority-class
#    mgr: rook-ceph-mgr-priority-class
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    #deviceFilter:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
      # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
      # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
      # osdsPerDevice: "1" # this value can be overridden at the node or device level
      # encryptedDevice: "true" # the default value for this option is "false"
      metadataDevice:
      databaseSizeMB: "1024"
      journalSizeMB:  "1024"
    nodes:
    - name: "node1"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
    - name: "node2"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
    - name: "node3"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore

# Cluster level list of directories to use for filestore-based OSD storage. If uncomment, this example would create an OSD under the dataDirHostPath.
    #directories:
    #- path: /var/lib/rook
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources.  Each node's 'name' field should match their 'kubernetes.io/hostname' label.
#    nodes:
#    - name: "172.17.4.101"
#      directories: # specific directories to use for storage can be specified for each node
#      - path: "/rook/storage-dir"
#      resources:
#        limits:
#          cpu: "500m"
#          memory: "1024Mi"
#        requests:
#          cpu: "500m"
#          memory: "1024Mi"
#    - name: "172.17.4.201"
#      devices: # specific devices to use for storage can be specified for each node
#      - name: "sdb"
#      - name: "nvme01" # multiple osds can be created on high performance devices
#        config:
#          osdsPerDevice: "5"
#      config: # configuration can be specified at the node level which overrides the cluster level config
#        storeType: filestore
#    - name: "172.17.4.301"
#      deviceFilter: "^sd."
  # The section for configuring management of daemon disruptions during upgrade or fencing.
  disruptionManagement:
    # If true, the operator will create and manage PodDisruptionBudgets for OSD, Mon, RGW, and MDS daemons. OSD PDBs are managed dynamically
    # via the strategy outlined in the [design](https://github.com/rook/rook/blob/master/design/ceph/ceph-managed-disruptionbudgets.md). The operator will
    # block eviction of OSDs by default and unblock them safely when drains are detected.
    managePodBudgets: false
    # A duration in minutes that determines how long an entire failureDomain like `region/zone/host` will be held in `noout` (in addition to the
    # default DOWN/OUT interval) when it is draining. This is only relevant when  `managePodBudgets` is `true`. The default value is `30` minutes.
    osdMaintenanceTimeout: 30
    # If true, the operator will create and manage MachineDisruptionBudgets to ensure OSDs are only fenced when the cluster is healthy.
    # Only available on OpenShift.
    manageMachineDisruptionBudgets: false
    # Namespace in which to watch for the MachineDisruptionBudgets.
    machineDisruptionBudgetNamespace: openshift-machine-api

After some time spent initializing, the rook-ceph cluster is created.

Each node now shows the LVM volumes that Ceph created.
Next, install the Ceph toolbox to inspect the state of the whole cluster.

sed -i 's|rook/ceph:v1.2.7|rook/ceph:v1.2.6|g' rook/cluster/examples/kubernetes/ceph/toolbox.yaml

kubectl apply -f rook/cluster/examples/kubernetes/ceph/toolbox.yaml 
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"

# Check the cluster status
NAME=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
kubectl -n rook-ceph exec -it ${NAME} -- sh
# The commands below run inside the toolbox shell
ceph status
ceph osd status
ceph osd df
ceph osd utilization
ceph osd pool stats
ceph osd tree
ceph pg stat
ceph df
rados df
exit
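
For scripted checks it can be convenient to run a single ceph command in the toolbox without opening an interactive shell. The wrapper below is a sketch; the function name ceph_in_toolbox is hypothetical, while the namespace and pod label are the ones used above.

```shell
# Hypothetical helper: run one ceph command inside the toolbox pod.
# All arguments are passed through to the ceph CLI.
ceph_in_toolbox() {
  # Look up the toolbox pod by the same label used above.
  pod="$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')"
  # Without -it, exec runs non-interactively and returns the command's output.
  kubectl -n rook-ceph exec "$pod" -- ceph "$@"
}
```

For example, ceph_in_toolbox status prints the same summary as running ceph status inside the pod, and ceph_in_toolbox health detail lists the reasons behind a warning state.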

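To check the cluster state in the Ceph Dashboard, change the rook-ceph-mgr-dashboard Service to type NodePort so that it can be reached from outside the cluster. The updated NodePort_update.yaml is as follows: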

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2024-09-23T08:34:18Z"
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  name: rook-ceph-mgr-dashboard
  namespace: rook-ceph
  ownerReferences:
  - apiVersion: ceph.rook.io/v1
    blockOwnerDeletion: true
    kind: CephCluster
    name: rook-ceph
    uid: 12207a89-a6e7-48d0-8499-03a4e96604f5
  resourceVersion: "53342"
  selfLink: /api/v1/namespaces/rook-ceph/services/rook-ceph-mgr-dashboard
  uid: f8154ff1-87f3-49e0-8561-e3d928560e68
spec:
  clusterIP: 10.1.98.80
  ports:
  - name: https-dashboard
    port: 8443
    protocol: TCP
    targetPort: 8443
    nodePort: 31112
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

kubectl apply -f NodePort_update.yaml
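
The manifest above was captured from a live cluster and carries fields that are specific to it (uid, resourceVersion, clusterIP, creationTimestamp), so it may not apply cleanly elsewhere. One possible alternative, sketched here as an assumption rather than the method this article used, is to patch only the fields that change. The function name expose_dashboard_nodeport is hypothetical; the Service name, port 8443 and node port 31112 come from the manifest above.

```shell
# Hypothetical helper: switch the dashboard Service to NodePort in place.
# $1 is the node port to expose (default 31112, as in the manifest above).
expose_dashboard_nodeport() {
  node_port="${1:-31112}"
  # Build a patch that touches only the Service type and the nodePort
  # of the existing 8443 port entry.
  patch="$(printf '{"spec":{"type":"NodePort","ports":[{"port":8443,"nodePort":%s}]}}' "$node_port")"
  kubectl -n rook-ceph patch service rook-ceph-mgr-dashboard -p "$patch"
}
```

Calling expose_dashboard_nodeport with no argument requests port 31112; kubectl -n rook-ceph get service rook-ceph-mgr-dashboard then shows the resulting type and port mapping.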

Port 31112 is now listening on the nodes.
Open https://xxx.xxx.xxx.xxx:31112 in a browser.

The user name is admin. The password is obtained as follows:

Ciphertext=$(kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}")
Pass=$(echo ${Ciphertext}|base64 --decode)
echo ${Pass}

Because the clocks on my servers were not synchronized, the cluster reported HEALTH_WARN. Synchronize the time on all servers:

yum -y install chrony
systemctl enable chronyd && systemctl start chronyd
timedatectl status
timedatectl set-local-rtc 0
systemctl restart rsyslog && systemctl restart crond


Ceph can provide block devices to pods. Next, create the block storage class:


sed -i 's/failureDomain: host/failureDomain: osd/g' rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml

kubectl apply -f rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml

Once it is created, rook-ceph-block appears in the K8s dashboard.
Now install the sample MySQL application to check that a PV and PVC are created automatically.

kubectl apply -f rook/cluster/examples/kubernetes/mysql.yaml

The PV and PVC are created automatically.
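
A lighter check than the MySQL example is a standalone claim against the rook-ceph-block storage class. The sketch below only prints a manifest; the function name render_test_pvc, the default claim name rbd-test-pvc and the default size 1Gi are illustrative assumptions.

```shell
# Hypothetical helper: print a minimal PVC manifest that requests a volume
# from the rook-ceph-block StorageClass.
# $1 is the claim name, $2 the requested size; both defaults are illustrative.
render_test_pvc() {
  name="${1:-rbd-test-pvc}"
  size="${2:-1Gi}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: ${size}
EOF
}
```

Piping the output into kubectl apply -f - creates the claim, and kubectl get pvc should then show it reaching the Bound state if dynamic provisioning works.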