title: “Check apiservices”
date: 2021-03-03T13:15:50
slug: check-apiservices
oc get apiservices
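To quickly spot apiservices that are not healthy, filter the AVAILABLE column or inspect a single apiservice (a rough sketch; v1beta1.metrics.k8s.io is only an example and assumes the metrics adapter is installed):
oc get apiservices | grep -v True
oc get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'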
title: “Call Api by CLI / OC Command”
date: 2021-03-03T13:14:04
slug: call-api-by-cli-oc-command
oc api-versions
oc get --raw /apis/metrics.k8s.io/v1beta1
oc get --raw /apis/oauth.openshift.io/v1/oauthaccesstokens/e_qNXwWgq4fWAyibMHELBJMRm5lsGgGS9DLLolPmvng
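Raw API calls can also be combined with standard tooling to pretty-print the response; a sketch, assuming the metrics API is served and python is available locally:
oc get --raw /apis/metrics.k8s.io/v1beta1/nodes | python -m json.tool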
title: “Delete Custom Prometheus Alerting Rule”
date: 2021-02-24T08:33:03
slug: delete-custom-prometheus-alerting-rule
[root@master02vp ~]# oc get prometheusrule
NAME AGE
deployment-monitoring-rules 96d
prometheus-k8s-rules 1y
prometheus-noris-rules 1y
-------------------------------------------------------------
[root@master02vp ~]# oc delete prometheusrule deployment-monitoring-rules
prometheusrule.monitoring.coreos.com "deployment-monitoring-rules" deleted
[root@master02vp ~]# oc get prometheusrule
NAME AGE
prometheus-k8s-rules 1y
prometheus-noris-rules 1y
-----------------------------------------------------------------------
[root@master02vp ~]# oc delete pod cluster-monitoring-operator-8578656f6f-nrd2w alertmanager-main-2 alertmanager-main-1 alertmanager-main-0 prometheus-k8s-0 prometheus-k8s-1 prometheus-operator-6644b8cd54-n7mjg
pod "cluster-monitoring-operator-8578656f6f-nrd2w" deleted
pod "alertmanager-main-2" deleted
pod "alertmanager-main-1" deleted
pod "alertmanager-main-0" deleted
pod "prometheus-k8s-0" deleted
pod "prometheus-k8s-1" deleted
pod "prometheus-operator-6644b8cd54-n7mjg" deleted
title: “Explanation of replica, count and worker scaling for Red Hat OpenShift Container Storage 4.x”
date: 2021-01-20T17:23:57
slug: explanation-of-replica-count-and-worker-scaling-for-red-hat-openshift-container-storage-4-x
Red Hat OpenShift Container Storage 4.x
Explanation of replica, count and worker scaling for Red Hat OpenShift Container Storage 4.x
When creating a StorageCluster object named ocs-storagecluster to deploy OpenShift Container Storage (OCS), administrators can set spec.storageDeviceSets[0].count and spec.storageDeviceSets[0].replica. What should these fields be set to?
For example:
cat <<'EOF' | oc apply -f -
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  resources:
    mds:
      limits:
        cpu: 3
        memory: 8Gi
      requests:
        cpu: 1
        memory: 8Gi
    noobaa-core:
      limits:
        cpu: 2
        memory: 8Gi
      requests:
        cpu: 1
        memory: 8Gi
    noobaa-db:
      limits:
        cpu: 2
        memory: 8Gi
      requests:
        cpu: 1
        memory: 8Gi
  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
  - count: 2
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2328Gi
        storageClassName: localblock
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: false
    replica: 3
    resources: {}
EOF
The only supported value for replica: is 3.
OCS worker nodes must be scaled out in multiples of 3, meaning that 3, 6, 9, … OCS worker nodes are fine, but 4 or 5, for example, are not.
Each OCS worker node must have the same number of equally sized PVs that can be used for OCS ([number of OCS PVs per worker]).
count: must be set to [number of OCS worker nodes] * [number of OCS PVs per worker] / 3 (the final 3 comes from [replica size = 3]).
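Worked example (an assumed deployment shape matching the manifest above): with 3 OCS worker nodes and 2 PVs per worker, count = 3 * 2 / 3 = 2; with 6 workers and 1 PV per worker, count = 6 * 1 / 3 = 2. Both match count: 2 and replica: 3, i.e. 2 * 3 = 6 OSDs in total.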
The reason for this lies in the following: the pools’ replicated size is hard coded to 3 with OCS. If the rack count is not divisible by 3, then the count: parameter must be a multiple of 3, because the total number of OSDs needs to be divisible by 3.
One could in theory change the rack count by changing the replica: parameter, but this negatively influences scale-outs, forcing administrators to add nodes in multiples of the rack count (= replica: parameter) at every single scale-out.
One can only add node capacity in multiples of rack count. Rack count is set by the replica: parameter. The ideal rack count is 3, as this is the smallest supported unit one can have with Ceph due to the pools’ replicated size of 3.
When looking at ocs-deviceset-x-y, replica: controls the maximum x and count: controls the maximum y.
Each storageDeviceSet will be tied to a specific rack according to replica::
$ oc get jobs -o name | while read job; do echo === $job === ; oc get $job -o yaml | egrep 'rack[0-9]+' ; done
=== job.batch/rook-ceph-osd-prepare-ocs-deviceset-0-0-nd5jp ===
- rack0
=== job.batch/rook-ceph-osd-prepare-ocs-deviceset-0-1-ns9jx ===
- rack0
=== job.batch/rook-ceph-osd-prepare-ocs-deviceset-1-0-qbf59 ===
- rack1
=== job.batch/rook-ceph-osd-prepare-ocs-deviceset-1-1-j7bs8 ===
- rack1
=== job.batch/rook-ceph-osd-prepare-ocs-deviceset-2-0-jwshs ===
- rack2
=== job.batch/rook-ceph-osd-prepare-ocs-deviceset-2-1-49ldb ===
- rack2
Nodes are distributed evenly across the racks:
$ oc get nodes -l topology.rook.io/rack=rack0
NAME STATUS ROLES AGE VERSION
ip-10-0-198-152.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
$ oc get nodes -l topology.rook.io/rack=rack1
NAME STATUS ROLES AGE VERSION
ip-10-0-197-77.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
$ oc get nodes -l topology.rook.io/rack=rack2
NAME STATUS ROLES AGE VERSION
ip-10-0-202-249.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
When labeling nodes for OCS, they are added to the racks. For example, when adding:
oc label node/ip-10-0-210-157.eu-west-1.compute.internal cluster.ocs.openshift.io/openshift-storage=''
The node will be added to rack0:
$ oc get nodes -l topology.rook.io/rack=rack0
NAME STATUS ROLES AGE VERSION
ip-10-0-198-152.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
ip-10-0-210-157.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
$ oc get nodes -l topology.rook.io/rack=rack1
NAME STATUS ROLES AGE VERSION
ip-10-0-197-77.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
$ oc get nodes -l topology.rook.io/rack=rack2
NAME STATUS ROLES AGE VERSION
ip-10-0-202-249.eu-west-1.compute.internal Ready worker 8d v1.17.1+b83bc57
In the above example, 2 more nodes must be added for a supported configuration, so that each rack contains the same number of nodes.
When increasing count: by 1, one new OSD will be created in each rack, adding a total of 3 OSDs.
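A scale-out could therefore look like the following sketch (the target value 3 is only an illustration, increasing the count from the manifest above by 1):
oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch '[{"op": "replace", "path": "/spec/storageDeviceSets/0/count", "value": 3}]'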
If the OCS worker node count cannot be divided by 3, then OCS cannot create all OSDs for the scale-out. The prepare OSD job’s pod will remain in Pending with a message similar to the following:
$ oc get pods | grep rook-ceph-osd-prepare-ocs-deviceset-2-2-vp5m4-2gmkv
rook-ceph-osd-prepare-ocs-deviceset-2-2-vp5m4-2gmkv 0/1 Pending 0 32m
$ oc describe pod rook-ceph-osd-prepare-ocs-deviceset-2-2-vp5m4-2gmkv | tail -1
Warning FailedScheduling 22s (x29 over 32m) default-scheduler 0/10 nodes are available: 1 node(s) didn't find available persistent volumes to bind, 9 node(s) didn't match node selector.
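A quick way to check whether enough unbound PVs exist for the new OSDs (a sketch; the storage class name localblock is taken from the manifest above):
oc get pv | grep localblock | grep Available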
title: “Ceph on Openshift”
date: 2021-01-14T11:59:41
slug: ceph-on-openshift
Replication 3/2 means: there are 3 replicas, and the client receives an ACK after the second replica has been written. The third replica is written after/while the ACK is sent to the client.
Replication factor 2/2 means: the client receives an ACK only after all replicas have been written.
Show Status
ceph status
List Pools
ceph osd pool ls
or with id:
ceph osd lspools
List Pool Size (Replicas)
ceph osd pool get ocs-storagecluster-cephfilesystem-data0 size
List min Pool Size (when Client receives ACK)
ceph osd pool get ocs-storagecluster-cephfilesystem-data0 min_size
List Numbers of Placement Groups
ceph osd pool get ocs-storagecluster-cephfilesystem-data0 pg_num
Other Possible Parameters:
size|min_size|pg_num|pgp_num|pgp_num_actual|crush_rule|hashpspool|nodelete|nopgchange|nosizechange|write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count|hit_set_fpp|use_gmt_hitset|target_max_bytes|target_max_objects|cache_target_dirty_ratio|cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age|min_read_recency_for_promote|min_write_recency_for_promote|fast_read|hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval|deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode|compression_algorithm|compression_required_ratio|compression_max_blob_size|compression_min_blob_size|csum_type|csum_min_block|csum_max_block|allow_ec_overwrites|fingerprint_algorithm|pg_autoscale_mode|pg_autoscale_bias|pg_num_min|target_size_bytes|target_size_ratio {--yes-i-really-mean-it}
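The counterpart for changing a parameter is ceph osd pool set; a sketch only, since manual changes on an OCS-managed cluster may be reverted by the operator:
ceph osd pool set ocs-storagecluster-cephfilesystem-data0 pg_num 64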
List OSDs
ceph osd ls
Get OSD Tree and Status (UP/DOWN)
ceph osd tree
Show PlacementGroups and their OSDs
ceph pg dump pgs_brief
PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY
3.d active+clean [2,0,1] 2 [2,0,1] 2
1.f active+clean [2,1,0] 2 [2,1,0] 2
PG_STAT: unique identifier of the placement group
STATE: current state of the placement group
UP: the set of OSDs mapped to the PG by CRUSH
UP_PRIMARY: the primary OSD of the up set
ACTING: the set of OSDs currently serving the PG
ACTING_PRIMARY: the primary OSD of the acting set
More Dump Objects
all|summary|sum|delta|pools|osds|pgs|pgs_brief [all|summary|sum|delta|pools
Exec into an OSD
oc rsh rook-ceph-osd-0-65547d878c-dfd8g
Put a File (test.txt) into a pool
Exec into the toolbox pod and:
rados -p ocs-storagecluster-cephobjectstore.rgw.buckets.data put test.txt ./test.txt
List all Files in a Pool:
rados -p ocs-storagecluster-cephobjectstore.rgw.buckets.data ls
test.txt
Get a File (test.txt) from Ceph Storage and write it into new_file
rados -p ocs-storagecluster-cephobjectstore.rgw.buckets.data get test.txt new_file
Monitormap:
ceph mon getmap > /tmp/map
monmaptool --print /tmp/map
monmaptool: monmap file /tmp/map
epoch 3
fsid 709412c8-caf8-4958-885d-66d8a918ba0e
last_changed 2021-01-13 10:10:19.834120
created 2021-01-13 10:09:39.685560
min_mon_release 14 (nautilus)
0: [v2:172.31.132.253:3300/0,v1:172.31.132.253:6789/0] mon.a
1: [v2:172.31.186.223:3300/0,v1:172.31.186.223:6789/0] mon.b
2: [v2:172.31.147.23:3300/0,v1:172.31.147.23:6789/0] mon.c
OSDMap:
ceph osd getmap > /tmp/osdmap
osdmaptool --print /tmp/osdmap
osdmaptool: osdmap file '/tmp/osdmap'
epoch 40
fsid 709412c8-caf8-4958-885d-66d8a918ba0e
created 2021-01-13 10:09:40.731069
modified 2021-01-13 10:11:41.718711
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 14
full_ratio 0.85
backfillfull_ratio 0.8
nearfull_ratio 0.75
require_min_compat_client luminous
min_compat_client jewel
require_osd_release nautilus
pool 1 'ocs-storagecluster-cephblockpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 17 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.49 application rbd
removed_snaps [1~3]
pool 2 'ocs-storagecluster-cephobjectstore.rgw.control' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 13 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 3 'ocs-storagecluster-cephfilesystem-metadata' replicated size 3 min_size 2 crush_rule 3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 15 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 4 'ocs-storagecluster-cephfilesystem-data0' replicated size 3 min_size 2 crush_rule 4 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 target_size_ratio 0.49 application cephfs
pool 5 'ocs-storagecluster-cephobjectstore.rgw.meta' replicated size 3 min_size 2 crush_rule 5 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 6 'ocs-storagecluster-cephobjectstore.rgw.log' replicated size 3 min_size 2 crush_rule 6 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 21 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 7 'ocs-storagecluster-cephobjectstore.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 7 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 26 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 8 'ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 8 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 31 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 9 '.rgw.root' replicated size 3 min_size 2 crush_rule 9 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 36 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 10 'ocs-storagecluster-cephobjectstore.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 10 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 40 flags hashpspool stripe_width 0 application rook-ceph-rgw
max_osd 3
osd.0 up in weight 1 up_from 13 up_thru 38 down_at 0 last_clean_interval [0,0) [v2:10.128.8.21:6800/79773,v1:10.128.8.21:6801/79773] [v2:10.128.8.21:6802/79773,v1:10.128.8.21:6803/79773] exists,up cfba7128-87ee-4af3-a5eb-8539153df4e6
osd.1 up in weight 1 up_from 11 up_thru 38 down_at 0 last_clean_interval [0,0) [v2:10.129.8.12:6800/77176,v1:10.129.8.12:6801/77176] [v2:10.129.8.12:6802/77176,v1:10.129.8.12:6803/77176] exists,up 682f2bbf-d4e0-4bdc-9bff-c78e5642fd17
osd.2 up in weight 1 up_from 14 up_thru 38 down_at 0 last_clean_interval [0,0) [v2:10.130.8.21:6800/80313,v1:10.130.8.21:6801/80313] [v2:10.130.8.21:6802/80313,v1:10.130.8.21:6803/80313] exists,up 61f186ca-ecd8-48f7-9c96-addcfee773b9
Calculate PlacementGroup (pg) by Poolname and Filename
ceph osd map ocs-storagecluster-cephfilesystem-data0 file2.txt
osdmap e40 pool 'ocs-storagecluster-cephfilesystem-data0' (4) object 'file2.txt' -> pg 4.2a096a1d (4.1d) -> up ([1,0,2], p1) acting ([1,0,2], p1)
Ceph Bluestore Tool
ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/block
{
  "/var/lib/ceph/osd/ceph-0/block": {
    "osd_uuid": "cfba7128-87ee-4af3-a5eb-8539153df4e6",
    "size": 2199023255552,
    "btime": "2021-01-13 10:10:50.648298",
    "description": "main",
    "bluefs": "1",
    "ceph_fsid": "709412c8-caf8-4958-885d-66d8a918ba0e",
    "kv_backend": "rocksdb",
    "magic": "ceph osd volume v026",
    "mkfs_done": "yes",
    "osd_key": "AQApx/5fOpgaJRAAOSndBV7CVzBhCz/XT1Fs0A==",
    "ready": "ready",
    "require_osd_release": "14",
    "whoami": "0"
  }
}
Mounting the Bluestore Storage with ceph-objectstore-tool
1st Step: The OSD must be stopped, otherwise you will receive an error that the block device cannot be locked.
Tell the Ceph monitors not to mark any OSDs “out” of the CRUSH map and not to start recovery and re-balance activities, so the replica count is maintained.
ceph osd set noout
ceph -s
cluster:
  id:     709412c8-caf8-4958-885d-66d8a918ba0e
  health: HEALTH_WARN
          noout flag(s) set
(unset with "ceph osd unset noout")
Stop (scale down) the Rook-Ceph Operator
oc scale --replicas=0 deployment/rook-ceph-operator
Now restart the desired OSD pod with a simple sleep command, as the OSD process must not be running while mounting the BlueStore:
oc edit deployment rook-ceph-osd-2
containers:
- args:
  - -c
  - 'while true; do sleep 3600; done'
  command:
  - /bin/sh
AND REMOVE LIVENESS PROBE!!!!
Exec into the OSD Pod
oc rsh rook-ceph-osd-2-6f5d645d4f-765wq
Execute Commands
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --op list
Mount the BlueStore to /mnt (it runs in the foreground by default, so append an &)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --op fuse --mountpoint /mnt/ &
List the Content:
sh-4.4# ls -l /mnt
total 0
drwx------. 0 root root 0 Jan 1 1970 1.0_head
drwx------. 0 root root 0 Jan 1 1970 1.10_head
drwx------. 0 root root 0 Jan 1 1970 1.11_head
drwx------. 0 root root 0 Jan 1 1970 1.12_head
drwx------. 0 root root 0 Jan 1 1970 1.13_head
drwx------. 0 root root 0 Jan 1 1970 1.14_head
drwx------. 0 root root 0 Jan 1 1970 1.15_head
drwx------. 0 root root 0 Jan 1 1970 1.16_head
drwx------. 0 root root 0 Jan 1 1970 1.17_head
drwx------. 0 root root 0 Jan 1 1970 1.18_head
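Afterwards, revert the setup. A sketch of the cleanup steps (fusermount availability in the OSD container is an assumption; alternatively stop the backgrounded ceph-objectstore-tool process, and the operator will also reconcile the edited OSD deployment once scaled back up):
fusermount -u /mnt
oc rollout undo deployment/rook-ceph-osd-2
oc scale --replicas=1 deployment/rook-ceph-operator
ceph osd unset noout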
List CephFS Subvolumes
ceph fs subvolumegroup ls ocs-storagecluster-cephfilesystem (This will display csi)
ceph fs subvolume ls ocs-storagecluster-cephfilesystem csi
ceph fs subvolume info ocs-storagecluster-cephfilesystem csi-vol-96671632-5680-11eb-9d1a-0a580a80041a csi
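To resolve the backing path of a subvolume on the CephFS (a sketch reusing the subvolume name from above):
ceph fs subvolume getpath ocs-storagecluster-cephfilesystem csi-vol-96671632-5680-11eb-9d1a-0a580a80041a csi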
title: “Install Ceph Toolbox on Openshift”
date: 2021-01-14T11:56:15
slug: install-ceph-toolbox-on-openshift
oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
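Once the toolbox is enabled, a shell can be opened in it (a sketch; the deployment name rook-ceph-tools is an assumption about the toolbox created by OCS):
oc -n openshift-storage rsh deploy/rook-ceph-tools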
title: “RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data -n openshift-authentication: certificate could not validate route hostname example.com: x509: certificate signed by unknown authority in OCP4”
date: 2021-01-13T10:34:42
slug: outercertsdegraded-secret-v4-0-config-system-router-certs-spec-data-n-openshift-authentication-certificate-could-not-validate-route-hostname-example-com-x509-certificate-signed-by-unknown-authori
Red Hat OpenShift Container Platform
4.2
The authentication operator becomes degraded. Just follow the steps from Replacing the default ingress certificate,
but in the step Create a secret that contains the wildcard certificate and key, the certificate provided should contain the wildcard certificate and the root CA (with the entire chain if there are intermediates).
Create a bundle file with the custom certificate and the CA chain in the following order (see the example after the list):
wildcard certificate
intermediate CA (if available)
root CA
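A minimal sketch of building such a bundle (the file names are placeholders):
cat wildcard.crt intermediate-ca.crt root-ca.crt > bundle-cert.crt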
Create the secret using this bundle file and proceed with the next steps from the documentation:
$ oc create secret tls <certificate> --cert=</path/to/bundle-cert.crt> --key=</path/to/cert.key> -n openshift-ingress
The root CA from the new custom certificate provided is not recognized by the system.
The cluster operator is in a degraded state:
$ oc get co authentication
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.2.0 True False True 11d
From an oauth-openshift pod there are logs like:
...
RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.example.com] -n openshift-authentication: certificate could not validate route hostname oauth-openshift.apps.example.com: x509: certificate signed by unknown authority
...
Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-authentication-operator", Name:"authentication-operator", UID:"<UID>",
APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded changed from False to True ("RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.example.com] -n openshift-authentication: certificate could not validate route hostname oauth-openshift.apps.example.com: x509: certificate signed by unknown authority")
...
title: “Check Operator Status”
date: 2021-01-13T10:24:05
slug: check-operator-status
oc get co
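To see why a specific operator is degraded, describe it (authentication is only used as an example here):
oc describe co authentication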
title: “Openshift – Container Storage OCS”
date: 2021-01-08T12:26:04
slug: openshift-container-stroage-ocs
Warning alert:
A minimal cluster deployment will be performed. Tech Preview
The selected nodes do not match the OCS storage cluster requirement of an aggregated 30 CPUs and 72 GiB of RAM. If the selection cannot be modified, a minimal cluster will be deployed.
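A rough sketch to check whether the selected nodes add up to the required aggregate (the label matches the one used for OCS nodes elsewhere in these notes):
oc get nodes -l cluster.ocs.openshift.io/openshift-storage= -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory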
title: “jsonpath with dots”
date: 2021-01-05T10:33:33
slug: jsonpath-with-dots
-o jsonpath="{.data['ca\.crt']}"
-o jsonpath='{.data.ca\.crt}' | base64
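A complete, hedged example (secret name, namespace and key are placeholders only):
oc get secret my-secret -n my-namespace -o jsonpath="{.data['ca\.crt']}" | base64 -d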