
Emit Prometheus metrics for runtime observability #501

Open
Bslabe123 wants to merge 2 commits into kubernetes-sigs:main from Bslabe123:prom-server-seed

Conversation

Contributor

@Bslabe123 Bslabe123 commented May 21, 2026

Partially addresses: #489

Summary

Adds a Prometheus HTTP exposition surface under a new inference_perf/observability/ package and emits one metric:

  • inference_perf_run_elapsed_seconds, wall-clock seconds since the metrics server started.

This PR does not wire the server into the run lifecycle and adds no CLI flag, no Pushgateway support, and no additional metrics. Those are deliberate follow-ups so that this PR stays trivially reviewable and the metric-naming decisions in #489 don't block landing the plumbing.

Intentionally deferred

Adds a minimal HTTP /metrics endpoint exposing a single gauge,
inference_perf_run_elapsed_seconds, as the first step toward kubernetes-sigs#489.
Default port 9464 (the OpenTelemetry Prometheus exporter default), fresh
CollectorRegistry per instance, no wiring into the run lifecycle yet
(intentional, to keep this reviewable).

The metric name and naming conventions are placeholders pending the design
sync; this PR is structural only.
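For readers without the diff open, the shape described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class name `MetricsServerSketch`, its `start`/`stop`/`port` members, the `scrape` helper, and the choice to serve through `wsgiref` are all assumptions made for the example. Only the metric name, the default port 9464, and the per-instance `CollectorRegistry` come from the description. It also uses `time.monotonic()` for the elapsed value, which is an assumption about how "seconds since the server started" is measured.

```python
import threading
import time
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server

from prometheus_client import CollectorRegistry, Gauge, make_wsgi_app


class _QuietHandler(WSGIRequestHandler):
    """Request handler that suppresses per-request logging to stderr."""

    def log_message(self, format: str, *args: object) -> None:
        pass


class MetricsServerSketch:
    """Hypothetical stand-in for the PR's server class (name and API assumed)."""

    def __init__(self, host: str = "127.0.0.1", port: int = 9464) -> None:
        # A fresh registry per instance keeps instances from clashing on
        # metric names in the process-wide default registry.
        self.registry = CollectorRegistry()
        self._started_at = time.monotonic()
        elapsed = Gauge(
            "inference_perf_run_elapsed_seconds",
            "Seconds elapsed since the metrics server started.",
            registry=self.registry,
        )
        # The value is computed at scrape time rather than updated by a timer.
        elapsed.set_function(lambda: time.monotonic() - self._started_at)
        self._httpd = make_server(
            host, port, make_wsgi_app(self.registry), handler_class=_QuietHandler
        )
        self._thread = threading.Thread(target=self._httpd.serve_forever, daemon=True)

    @property
    def port(self) -> int:
        # Reports the bound port, which differs from the request when port=0.
        return self._httpd.server_port

    def start(self) -> None:
        self._started_at = time.monotonic()
        self._thread.start()

    def stop(self) -> None:
        # Must only be called after start(); shutdown() waits for serve_forever.
        self._httpd.shutdown()
        self._httpd.server_close()
        self._thread.join()


def scrape(port: int) -> str:
    """Fetch the text exposition from a locally running sketch server."""
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/metrics") as resp:
        body: bytes = resp.read()
    return body.decode("utf-8")
```

Constructing the sketch with `port=0` lets the OS pick a free port, which is convenient in tests; calling `start()`, then `scrape(server.port)`, then `stop()` should return the exposition text containing the single gauge.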
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) May 21, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Bslabe123
Once this PR has been reviewed and has the lgtm label, please assign jeffwan for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and size/L (denotes a PR that changes 100-499 lines, ignoring generated files) labels May 21, 2026
@Bslabe123 Bslabe123 changed the title from "WIP: seed Prometheus exposition surface for runtime observability (#489)" to "Emit Prometheus metrics runtime observability" May 21, 2026
@Bslabe123 Bslabe123 marked this pull request as ready for review May 21, 2026 17:30
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress label May 21, 2026
@k8s-ci-robot k8s-ci-robot requested a review from Jeffwan May 21, 2026 17:30
- Add Iterator[PrometheusMetricsServer] return type to the fixture
- Annotate _scrape's read() locally to satisfy --strict
- Apply ruff format to tests/required/apis/test_chat.py (pre-existing
  drift from kubernetes-sigs#496 that was blocking CI on this PR)
@Bslabe123 Bslabe123 changed the title from "Emit Prometheus metrics runtime observability" to "Emit Prometheus metrics for runtime observability" May 21, 2026
