Export pod ephemeral PVCs metrics #2490

Open
TPXP opened this issue Aug 30, 2024 · 1 comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments


TPXP commented Aug 30, 2024

What would you like to be added: kube-state-metrics exposes PVC usage by pods through metrics like kube_pod_spec_volumes_persistentvolumeclaims_info and kube_pod_spec_volumes_persistentvolumeclaims_readonly. I'd like similar metrics to be available for ephemeral volume mounts, since those are also backed by PVCs.

Why is this needed: We use Prometheus metrics to detect PVCs that are not mounted, which reminds us to drop a PVC if it was left behind for some reason. Our alerting rule lists the PVCs in a namespace with kube_persistentvolumeclaim_info and excludes the mounted ones with kube_pod_spec_volumes_persistentvolumeclaims_info. Ephemeral volumes generate a PVC which appears in kube_persistentvolumeclaim_info but not in kube_pod_spec_volumes_persistentvolumeclaims_info, since the volume does not have PersistentVolumeClaim.ClaimName defined. Adding a metric exposing ephemeral PVCs would give us a way to avoid false alarms when a pod is using an ephemeral PVC.
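To make the false alarm concrete, here is a minimal sketch of the rule's logic as plain set operations. The claim names and the (namespace, claim) tuples are made-up sample data, not real query output, and the third set stands in for the metric proposed in this issue, which does not exist yet.

```python
# Hypothetical sample data shaped like (namespace, PVC name) pairs.
# Claims reported by kube_persistentvolumeclaim_info:
pvc_info = {
    ("default", "db-backup"),
    ("default", "tmp-workload-workdir"),  # created for an ephemeral volume
    ("default", "forgotten-claim"),
}
# Claims reported by kube_pod_spec_volumes_persistentvolumeclaims_info:
mounted = {
    ("default", "db-backup"),
}
# Claims the proposed ephemeral metric would report (assumption):
ephemeral_mounted = {
    ("default", "tmp-workload-workdir"),
}


def unmounted(all_claims, mounted_claims):
    """Return claims that exist but are not referenced by any pod volume."""
    return sorted(all_claims - mounted_claims)


# Today: the ephemeral claim is flagged although its pod is still running.
alerts_today = unmounted(pvc_info, mounted)

# With the proposed metric: only the claim that was really left behind remains.
alerts_fixed = unmounted(pvc_info, mounted | ephemeral_mounted)

print(alerts_today)
print(alerts_fixed)
```

In the real rule the same exclusion would be done with PromQL set operators rather than Python sets; the sketch only illustrates which series are missing.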

Describe the solution you'd like: Exposing another metric, kube_pod_spec_volumes_ephemeral_persistentvolumeclaims_info, seems acceptable. Updating kube_pod_spec_volumes_persistentvolumeclaims_info to add an ephemeral label would work as well.

Implementation note: while the PodSpec does not have a field explicitly giving the PVC name, the docs clarify how it's derived from the pod and volume name:

Naming of the automatically created PVCs is deterministic: the name is a combination of the Pod name and volume name, with a hyphen (-) in the middle.
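The naming rule quoted above can be sketched as follows. This is an illustration in Python, not kube-state-metrics code (which is written in Go); the function name ephemeral_claim_names and the trimmed-down pod dictionary are hypothetical.

```python
def ephemeral_claim_names(pod):
    """Yield (volume name, derived PVC name) for each ephemeral volume.

    The PVC name is the pod name and the volume name joined by a hyphen,
    as described in the quoted documentation.
    """
    pod_name = pod["metadata"]["name"]
    for volume in pod["spec"].get("volumes", []):
        if "ephemeral" in volume:
            yield volume["name"], f"{pod_name}-{volume['name']}"


# Hypothetical pod, reduced to the fields the function reads.
pod = {
    "metadata": {"name": "tmp-workload"},
    "spec": {
        "volumes": [
            {"name": "workdir", "ephemeral": {"volumeClaimTemplate": {}}},
            {"name": "config", "configMap": {"name": "settings"}},
        ]
    },
}

claims = list(ephemeral_claim_names(pod))
print(claims)
```

Only the volume with an ephemeral source produces a claim name; the configMap volume is skipped.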

Alternatively, exposing PVC ownership data (ownerReferences metadata) would also address my use case, although I think it would be hard to integrate into my alerting rule.
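For the ownership-based alternative, a rough sketch of the check could look like this. It assumes the PVC created for an ephemeral volume carries an ownerReferences entry of kind Pod; the helper name owning_pod and the sample objects are hypothetical.

```python
def owning_pod(pvc):
    """Return the name of the Pod that owns this PVC, or None if there is none."""
    for ref in pvc["metadata"].get("ownerReferences", []):
        if ref.get("kind") == "Pod":
            return ref.get("name")
    return None


# Hypothetical PVC objects, reduced to the metadata the helper reads.
ephemeral_pvc = {
    "metadata": {
        "name": "tmp-workload-workdir",
        "ownerReferences": [
            {"apiVersion": "v1", "kind": "Pod", "name": "tmp-workload"},
        ],
    }
}
standalone_pvc = {"metadata": {"name": "forgotten-claim"}}

print(owning_pod(ephemeral_pvc))
print(owning_pod(standalone_pvc))
```

A claim with a Pod owner could then be excluded from the alert, while a claim without one would still be reported.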

Additional context
We sometimes run temporary workloads that need to store large amounts of data. Since we don't need the data to persist across pod executions, we use Ephemeral Volumes to ensure the PVC is removed when we drop the pod.

Here's an example pod manifest (we use these pods to perform operations on our databases by exec-ing into them, which avoids tunneling and guards against connection drops):

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: tmp-workload
  name: tmp-workload
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - args:
    - bash
    - -c
    - sleep infinity
    image: postgres
    name: tmp-workload
    volumeMounts:
    - name: workdir
      mountPath: /workdir
    resources:
      limits:
        memory: 1Gi
        cpu: "1"
  volumes:
  - name: workdir
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Ti
@TPXP TPXP added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 30, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 30, 2024
dashpole (Contributor) commented Sep 5, 2024

/assign @dgrisonnet
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 5, 2024