
RSPEED-2857: Fix rlsapi Splunk telemetry reporting total_llm_tokens as zero#1502

Open
major wants to merge 1 commit into lightspeed-core:main from major:rspeed-2857/fix-rlsapi-token-telemetry

Conversation


@major major commented Apr 14, 2026

Description

total_llm_tokens in the rlsapi Splunk telemetry event was hardcoded to 0. Token counting is already implemented via extract_token_usage() and the data is captured in the endpoint handler, but it was never passed through to the Splunk event builder.

This adds input_tokens and output_tokens fields to InferenceEventData and plumbs the actual token counts from infer_endpoint() through _queue_splunk_event() into build_inference_event(), which now computes total_llm_tokens as the sum of input and output tokens.
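The shape of the change can be sketched as follows. This is a minimal illustration based on the description above, not the actual implementation: the real InferenceEventData carries more fields, and build_inference_event() builds the full Splunk event payload.

```python
from dataclasses import dataclass


@dataclass
class InferenceEventData:
    """Illustrative subset of the rlsapi Splunk inference event data."""

    model: str = ""
    # New fields added by this PR; defaulting to 0 keeps existing callers working.
    input_tokens: int = 0
    output_tokens: int = 0


def build_inference_event(data: InferenceEventData) -> dict:
    """Build the Splunk event payload, summing tokens instead of hardcoding 0."""
    return {
        "model": data.model,
        # Previously: "total_llm_tokens": 0
        "total_llm_tokens": data.input_tokens + data.output_tokens,
    }
```

With defaults the total stays 0, so events from paths without usage data are unchanged; when the endpoint passes real counts, the total reflects actual usage.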

Type of change

  • Bug fix

Tools used to create PR

  • Assisted-by: Claude (Sisyphus via OpenCode)
  • Generated by: N/A

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  1. Run uv run pytest tests/unit/observability/formats/test_rlsapi.py -v to verify:
    • Default token values (0+0) produce total_llm_tokens: 0
    • Non-zero token values (150+75) produce total_llm_tokens: 225
  2. Run uv run make verify to confirm all linters pass.
  3. Functionally, deploy and trigger an rlsapi /infer request against a model that returns usage data. The resulting Splunk event should now contain non-zero total_llm_tokens.
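The two unit cases in step 1 reduce to simple arithmetic on the event payload. A self-contained sketch (build_inference_event here is a reduced stand-in for the real builder, trimmed to the behavior under test):

```python
def build_inference_event(input_tokens: int = 0, output_tokens: int = 0) -> dict:
    """Reduced stand-in for the real event builder, showing only the token math."""
    return {"total_llm_tokens": input_tokens + output_tokens}


# The two cases exercised in step 1; in the repository these live as pytest
# cases in tests/unit/observability/formats/test_rlsapi.py.
assert build_inference_event()["total_llm_tokens"] == 0
assert build_inference_event(150, 75)["total_llm_tokens"] == 225
```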

Summary by CodeRabbit

  • New Features

    • Telemetry for inference now records input token count, output token count, and calculates total tokens for each event to improve LLM usage visibility.
  • Tests

    • Added unit test verifying total token calculation in telemetry events.
    • Updated integration and endpoint tests to include token usage fields in mocked LLM responses.


coderabbitai bot commented Apr 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d1c0327e-634d-45b4-8e72-9c0137b7f139

📥 Commits

Reviewing files that changed from the base of the PR and between d454817 and 46669a3.

📒 Files selected for processing (5)
  • src/app/endpoints/rlsapi_v1.py
  • src/observability/formats/rlsapi.py
  • tests/integration/endpoints/test_rlsapi_v1_integration.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • tests/unit/observability/formats/test_rlsapi.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: mypy
  • GitHub Check: bandit
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: build-pr
  • GitHub Check: Pylinter
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async unit tests

Files:

  • tests/unit/observability/formats/test_rlsapi.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use pytest-mock for AsyncMock objects in unit tests

Files:

  • tests/unit/observability/formats/test_rlsapi.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules (e.g., from authentication import get_auth_dependency)
Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack imports
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use complete type annotations for all class attributes; avoid using Any
Follow Google Python docstring conventions for all modules, classes, and functions
Include Parameters:, Returns:, Raises: sections in function docstrings as needed

Files:

  • src/observability/formats/rlsapi.py
  • src/app/endpoints/rlsapi_v1.py
src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: Use from fastapi import APIRouter, HTTPException, Request, status, Depends for FastAPI dependencies
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
🧠 Learnings (1)
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
🔇 Additional comments (6)
src/observability/formats/rlsapi.py (1)

29-30: Token totals are now correctly derived from tracked usage fields.

This change cleanly fixes the telemetry correctness issue while preserving zero-safe defaults.

Also applies to: 52-52

src/app/endpoints/rlsapi_v1.py (2)

356-357: Telemetry helper now correctly accepts and forwards per-call token counts.

The parameter/docs/payload updates are consistent and correctly wired into InferenceEventData.

Also applies to: 359-371, 387-388


773-774: Success-path Splunk event now receives actual token usage values.

This completes the endpoint-side fix for total_llm_tokens reporting.

tests/unit/app/endpoints/test_rlsapi_v1.py (1)

119-122: Updated unit fixtures now model response usage correctly for token-aware code paths.

Good alignment with the endpoint’s extract_token_usage behavior.

Also applies to: 131-134

tests/integration/endpoints/test_rlsapi_v1_integration.py (1)

118-121: Integration mocks now include usage data consistently across scenarios.

This is a solid update to keep integration coverage aligned with token telemetry extraction.

Also applies to: 310-313, 353-356, 416-419, 461-464

tests/unit/observability/formats/test_rlsapi.py (1)

53-77: Great targeted regression test for token aggregation.

The new assertion on total_llm_tokens provides direct coverage for the bug fix.


Walkthrough

Adds token-usage fields to telemetry: InferenceEventData gains input_tokens and output_tokens, build_inference_event() computes total_llm_tokens from them, and the /infer endpoint now extracts and forwards token counts when queuing the "infer_with_llm" Splunk event. Tests updated to include usage in mocked responses and to assert the new total.

Changes

  • Observability Format (src/observability/formats/rlsapi.py):
    Added input_tokens: int = 0 and output_tokens: int = 0 to InferenceEventData. build_inference_event() now sets total_llm_tokens = data.input_tokens + data.output_tokens instead of a hardcoded 0.
  • Endpoint Integration (src/app/endpoints/rlsapi_v1.py):
    Extended the _queue_splunk_event(...) signature with optional input_tokens and output_tokens (default 0). Updated /infer to extract token_usage.input_tokens and token_usage.output_tokens from LLM responses and pass them when queuing the "infer_with_llm" event.
  • Unit Tests, formats (tests/unit/observability/formats/test_rlsapi.py):
    Added test_builds_event_with_token_counts verifying build_inference_event() computes total_llm_tokens as the sum of input and output tokens.
  • Unit & Integration Tests, endpoints (tests/unit/app/endpoints/test_rlsapi_v1.py, tests/integration/endpoints/test_rlsapi_v1_integration.py):
    Updated mocked LLM responses to include a usage object with input_tokens and output_tokens in fixtures and several test-specific mocks, reflecting the new response shape consumed by the endpoint logic.
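A mocked response carrying usage data, as the updated fixtures are described, might look like the following sketch. The helper name and exact response shape are illustrative assumptions; the real tests build mocks with pytest-mock.

```python
from types import SimpleNamespace


def make_mock_llm_response(text: str = "ok") -> SimpleNamespace:
    """Hypothetical helper building a mocked LLM response with usage data."""
    return SimpleNamespace(
        completion_message=SimpleNamespace(content=text),
        # The fields the endpoint's token extraction is described as reading:
        usage=SimpleNamespace(input_tokens=150, output_tokens=75),
    )
```

Giving every mocked response a usage object keeps the fixtures aligned with the token-aware code path, so the endpoint no longer sees missing usage data in tests.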

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: Passed. The title directly and clearly identifies the bug being fixed (rlsapi Splunk telemetry reporting total_llm_tokens as zero), which is the primary objective of this changeset.
  • Docstring Coverage: Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 356-359: The docstring for _queue_splunk_event is missing
documentation for the new input_tokens and output_tokens parameters; update the
function docstring to include a Parameters: section that lists input_tokens
(int, default 0) and output_tokens (int, default 0) with a brief description
that they represent the count of input and output tokens for the telemetry
event, and keep/verify existing entries for other params; also ensure the
Returns: section states None and update Raises: if any exceptions can be thrown
by _queue_splunk_event.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f6204267-fb3b-40d2-b8bd-1e3e1cd7d138

📥 Commits

Reviewing files that changed from the base of the PR and between d0350c0 and d454817.

📒 Files selected for processing (3)
  • src/app/endpoints/rlsapi_v1.py
  • src/observability/formats/rlsapi.py
  • tests/unit/observability/formats/test_rlsapi.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Pylinter
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci / group 1
🧰 Additional context used: same path-based instructions (4) and learnings (1) as listed in the later review above.
🔇 Additional comments (3)
src/observability/formats/rlsapi.py (1)

29-30: Token fields and total-token computation are implemented correctly.

The new defaults and Line 52 aggregation fix the zero-token telemetry bug while keeping existing event construction backward compatible.

Also applies to: 52-52

tests/unit/observability/formats/test_rlsapi.py (1)

53-77: Good targeted regression test for token aggregation.

This test cleanly validates the non-zero total_llm_tokens path and complements the existing default-zero assertion.

src/app/endpoints/rlsapi_v1.py (1)

375-376: Token usage is correctly threaded from inference to Splunk event payload.

Lines 375-376 and Line 761-Line 762 correctly propagate extracted token counts, enabling accurate total_llm_tokens.

Also applies to: 761-762

total_llm_tokens was hardcoded to 0 in the rlsapi Splunk event builder
despite token counting being implemented via extract_token_usage().
Add input_tokens and output_tokens to InferenceEventData and pass
actual counts from the endpoint handler.

Ref: RSPEED-2857
Signed-off-by: Major Hayden <major@redhat.com>
@major major force-pushed the rspeed-2857/fix-rlsapi-token-telemetry branch from d454817 to 46669a3 on April 14, 2026 at 16:57
