Troubleshooting
This section contains the following troubleshooting topics:
Inspecting Deployment
Deployment of CTE for Kubernetes typically uses a single controller pod and at least one worker pod per Kubernetes node. Inspect the deployment using kubectl get commands for basic status output.
Example
To get the names of the CTE for Kubernetes pods deployed on the Kubernetes cluster, type:
NAMESPACE="kube-system"
kubectl get pods -n ${NAMESPACE} -o wide | grep cte-csi
Response
cte-csi-controller-0 5/5 Running 0 25h 192.168.77.182 node1.example.com <none> <none>
cte-csi-node-28srz 4/4 Running 0 25h 192.168.77.181 node1.example.com <none> <none>
cte-csi-node-cjzzn 4/4 Running 0 25h 192.168.76.68 control.example.com <none> <none>
Getting the Agent version
The version of CTE for Kubernetes that you are running displays in the log files for each pod:
NAMESPACE="kube-system"
for podName in `kubectl get pods -n ${NAMESPACE} | grep -E "cte-csi-(node|controller)" | cut -d " " -f1`; do
echo -n "${podName}: "
kubectl logs -n ${NAMESPACE} ${podName} cte-csi | grep "Version:"
done
Response
cte-csi-controller-7d7d8bd46b-wbffd: I0308 17:25:13.608929 1 cte.go:139] Version: 1.6.0.xx
cte-csi-node-8qx5q: I0308 17:25:11.643673 2218968 cte.go:139] Version: 1.6.0.xx
cte-csi-node-h8q8q: I0308 17:25:13.465795 2205645 cte.go:139] Version: 1.6.0.xx
Inspecting Events
CTE for Kubernetes relies on the Kubernetes event infrastructure to diagnose problems with CTE for Kubernetes volumes. Any error in attaching a volume to a pod displays in the event logs. To list all of the events in a Kubernetes namespace, type:
NAMESPACE="kube-system"
kubectl get event -n ${NAMESPACE}
To view the events for a specific pod, type:
PODNAME="my_application_pod"
NAMESPACE="kube-system"
kubectl get event -n ${NAMESPACE} --field-selector involvedObject.name=${PODNAME}
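Alternatively, the standard kubectl describe command prints a pod's recent events together with its status, which can be quicker when you are investigating a single pod:
PODNAME="my_application_pod"
NAMESPACE="kube-system"
kubectl describe pod ${PODNAME} -n ${NAMESPACE}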
Inspecting Logs
Examining the CTE for Kubernetes logs can offer additional insight that is not conveyed through Kubernetes events.
Deployment of CTE for Kubernetes is split between two different types of pods:
- Controller server: Manages volume manipulation activities (Dynamic Volume Provisioning, Volume Cloning, Volume Expansion, Volume Snapshot & Restore). Denoted by a cte-csi-controller-X pod name.
- Node server: Manages attaching CTE for Kubernetes volumes to pods. Denoted by a cte-csi-node-XXXXX pod name.
Inspecting the Controller Server
The logs for the CTE for Kubernetes controller server pod are distributed across two containers:
- The cte-csi container contains logs relevant to dynamic provisioning of CTE for Kubernetes persistent volumes and to volume management (Volume Cloning, Volume Resizing, Volume Snapshot & Restore, and Volume affinity).
- The cte-csi-signer container contains logs relevant to the signer client, which is used in the pod attestation feature, and to node activities (node addition and deletion) for the automatic cleanup of CTE clients.
To view the logs, type the two relevant kubectl logs commands:
NAMESPACE="kube-system"
kubectl logs -n ${NAMESPACE} cte-csi-controller-0 cte-csi
kubectl logs -n ${NAMESPACE} cte-csi-controller-0 cte-csi-signer
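For long-running pods, the standard kubectl logs options can narrow the output; for example, -f streams new entries and --tail limits how much history is printed:
kubectl logs -n ${NAMESPACE} -f --tail=100 cte-csi-controller-0 cte-csi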
Inspecting the Node Server
The logs for the CTE for Kubernetes node server pods are distributed across two containers:
- The cte-csi container contains logs relevant to CTE for Kubernetes activity, such as volume mounting and registration.
- The cte-agent-logs container contains logs for the CTE encryption engine agent, VMD, CipherTrust Manager server communication, and file access.
Inspecting the Node Server requires first identifying which Kubernetes node the application pod is scheduled on. Once that node has been identified, find the CTE for Kubernetes node server running on that node.
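A minimal sketch of that lookup, assuming my_application_pod and its namespace are placeholders for your own application pod:
APP_POD="my_application_pod"
APP_NAMESPACE="default"
NODE=$(kubectl get pod ${APP_POD} -n ${APP_NAMESPACE} -o jsonpath='{.spec.nodeName}')
kubectl get pods -n kube-system -o wide --field-selector spec.nodeName=${NODE} | grep cte-csi-node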
After you obtain the pod name, type the two relevant kubectl logs commands to view the logs:
NAMESPACE="kube-system"
kubectl logs -n ${NAMESPACE} cte-csi-node-XXXXX cte-csi
kubectl logs -n ${NAMESPACE} cte-csi-node-XXXXX cte-agent-logs
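If a container in the node server pod has restarted, the --previous flag of kubectl logs retrieves the log from the prior container instance, which often captures the original failure:
kubectl logs -n ${NAMESPACE} --previous cte-csi-node-XXXXX cte-csi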
Updating the CTE-K8s Log Level on a Live CTE Deployment
CTE-K8s driver logging is useful for debugging the Kubernetes resources for CTE and activity related to volumes, registration, mounting, and so on.
The log level ranges from 1 to 5. The default log level is 1, which provides minimal logging; set the log level to 5 for maximum debug information.
In CTE for Kubernetes v1.6.0 and subsequent versions, you can change this log level on a live CTE for Kubernetes deployment, without disrupting any running deployment. When you install CTE for Kubernetes, it creates a default ConfigMap, cte-k8s-config, in the same namespace where the CTE-K8s driver/agent is installed.
By default, CTE for Kubernetes is installed in the kube-system namespace. The namespace could be different, depending on the CTE for Kubernetes installation command.
To verify the CTE for Kubernetes installation namespace, type:
kubectl get pods --all-namespaces -o wide | grep cte-csi-controller
In the cte-k8s-config ConfigMap, the log_level parameter reports the current CTE-K8s log level. Update log_level to change the CTE-K8s log level.
Updating the CTE-K8s Log Level
You can update the log level parameter with the kubectl patch or kubectl edit command.
To update with the kubectl patch command, type:
kubectl patch cm cte-k8s-config -n <CTE-K8s-Namespace> --type merge --patch '{ "data": { "log_level": "<LEVEL>" }}'
Where:
- <CTE-K8s-Namespace>: Namespace in which CTE for Kubernetes is installed.
- <LEVEL>: Log level from 1 to 5, where 1 provides minimal logs and 5 provides maximum logs.
Example:
kubectl patch cm cte-k8s-config -n kube-system --type merge --patch '{ "data": { "log_level": "5" }}'
To update with the kubectl edit command, type:
kubectl edit cm cte-k8s-config -n <CTE-K8s-Namespace>
Where <CTE-K8s-Namespace> is the namespace in which CTE for Kubernetes is installed.
Note
Update the log_level value accordingly.
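Whichever method you use, you can confirm the current value with a JSONPath query against the ConfigMap (a sketch using standard kubectl options):
kubectl get cm cte-k8s-config -n <CTE-K8s-Namespace> -o jsonpath='{.data.log_level}'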
Problems with Registration
CTE for Kubernetes automatically registers with CipherTrust Manager based on demand for volumes on a node. Failure to register is typically due to the registration token being invalid or having no remaining client capacity. These errors are reported back through Kubernetes, so analyzing the log files of the troubled pod reveals the registration failure message seen by the agent.
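For example, grepping the cte-csi container logs of the node server for registration-related messages can surface the failure quickly (the pod name below is a placeholder):
NAMESPACE="kube-system"
kubectl logs -n ${NAMESPACE} cte-csi-node-XXXXX cte-csi | grep -i regist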
Troubleshooting Trusted Pod failures
The following are two examples of trusted pod failures:
Example 1
In the cte-csi-node logs, the following error indicates that a running pod digest was not found in any signature set attached to a security policy.
E0310 06:41:16.614410 1687839 server.go:106] GRPC error: rpc error: code = Internal desc = Pod did not pass attestation checks: rpc error: code = Internal desc = Found no signature set for container ubuntu with digest 2adf22367284330af9f832ffefb717c78239f6251d9d0f58de50b86229ed1427
Example 2
In the cte-csi-node logs, the following error indicates that the running pod digests cannot all be matched to the same signature set. In this example, two containers (ubuntu and ubuntu2) had digests included in different signature sets. Partial matches are displayed to help with troubleshooting.
E0407 09:07:33.393609 1492044 server.go:106] GRPC error: rpc error: code = Internal desc = Pod did not pass attestation checks: rpc error: code = Internal desc = Pod attestation failed. Unable to match all pod digests to same signature set. Partial matches:
Signature set policy-sigset2: [ubuntu:8ae9bafbb64f63a50caab98fd3a5e37b3eb837a3e0780b78e5218e63193961f9]
Signature set policy-sigset3: [ubuntu2:69665d02cb32192e52e07644d76bc6f25abeb5410edc1c7a81a10ba3f0efb90a]
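To compare against your signature sets, you can read the actual digests of the running containers from the pod status; the imageID field includes the resolved digest (pod name and namespace below are placeholders):
kubectl get pod my_application_pod -n default -o jsonpath='{range .status.containerStatuses[*]}{.name}{" "}{.imageID}{"\n"}{end}'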
Backing up Databases after Encryption
After encrypting a database, CipherTrust Transparent Encryption may be unable to make a backup of the database; both scheduled and manual backups fail. The cause is typically the policy: a policy used in this scenario must follow a few rules.
With a CBC_CS1 key, a guarded file is modified to have a 4096-byte header holding key information. When an Apply Key effect is specified, the CipherTrust Transparent Encryption code adjusts the file length and offsets for this header. Without an Apply Key effect, file sizes and offset accesses include the CBC_CS1 header.
Thales recommends that you modify the first rule of your policy. Remove the action entry for f_rd_att from the first rule and add a new rule before it:
**action**: f_rd_att
**effect**: Permit, Apply Key
Policy processing starts with the first rule and continues until a matching rule is found. The effect for the matching rule is then applied.
For the f_rd_att action, this results in the Secfs code accounting for the CBC_CS1 key header and adjusting the file size value. Without the Apply Key effect, the file size includes the CBC_CS1 header size and the file appears 4096 bytes larger than its real size.
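As a hypothetical illustration of the size accounting (the path and sizes below are made up for the example):
# /guarded/data.db really contains 1048576 bytes of data.
# With "Permit, Apply Key" on f_rd_att, the header is hidden from size queries:
stat -c %s /guarded/data.db    # reports 1048576
# Without Apply Key, the reported size includes the 4096-byte CBC_CS1 header:
stat -c %s /guarded/data.db    # reports 1052672 (1048576 + 4096)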
FSGroupID is not working with NFS shared storage volume
The fsgroup ID Security Context option allows an administrator to change the permissions of volumes before a pod starts. Some users have found that adding the fsgroup ID to the securityContext section of the pod YAML file does not work as expected in an NFS storage environment.
The reason is that the fsgroup ID Security Context option is not supported with NFS volumes; it is only supported with local storage. This is a Kubernetes limitation, not an issue with the CTE-U FUSE driver.
See Configure a Security Context for a Pod or Container in the Kubernetes documentation for more information.
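For reference, here is a minimal sketch of where fsGroup is set in a pod spec; the names and values are illustrative only. On local storage (such as the emptyDir volume below), the volume's group ownership changes to 2000 before the pod starts; on an NFS-backed volume, the same setting has no effect:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo
spec:
  securityContext:
    fsGroup: 2000
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "ls -ln /data && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    emptyDir: {}
EOF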