RSPEED-2857: Fix rlsapi Splunk telemetry reporting total_llm_tokens as zero #1502

major wants to merge 1 commit into lightspeed-core:main
Conversation
Walkthrough: Adds token-usage fields to telemetry.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around lines 356-359: The docstring for `_queue_splunk_event` is missing documentation for the new `input_tokens` and `output_tokens` parameters. Update the function docstring to include a `Parameters:` section that lists `input_tokens` (int, default 0) and `output_tokens` (int, default 0), with a brief description that they represent the count of input and output tokens for the telemetry event, and keep/verify the existing entries for the other parameters. Also ensure the `Returns:` section states `None`, and update `Raises:` if `_queue_splunk_event` can throw any exceptions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f6204267-fb3b-40d2-b8bd-1e3e1cd7d138
📒 Files selected for processing (3)
- src/app/endpoints/rlsapi_v1.py
- src/observability/formats/rlsapi.py
- tests/unit/observability/formats/test_rlsapi.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms (5 checks). You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms).
- GitHub Check: Pylinter
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci / group 1
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use `pytest.mark.asyncio` marker for async unit tests
Files:
tests/unit/observability/formats/test_rlsapi.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use `pytest-mock` for AsyncMock objects in unit tests
Files:
tests/unit/observability/formats/test_rlsapi.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules (e.g., `from authentication import get_auth_dependency`)
Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
Check `constants.py` for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use `logger = get_logger(__name__)` from `log.py` for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: `str | int` instead of `Union[str, int]`
Use `Optional[Type]` for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use `async def` for I/O operations and external API calls
Handle `APIConnectionError` from Llama Stack in error handling
Use `logger.debug()` for detailed diagnostic information
Use `logger.info()` for general information about program execution
Use `logger.warning()` for unexpected events or potential problems
Use `logger.error()` for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: `Configuration`, `Error`/`Exception`, `Resolver`, `Interface`
Use complete type annotations for all class attributes; avoid using `Any`
Follow Google Python docstring conventions for all modules, classes, and functions
Include `Parameters:`, `Returns:`, `Raises:` sections in function docstrings as needed
Files:
src/observability/formats/rlsapi.py
src/app/endpoints/rlsapi_v1.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.py
🧠 Learnings (1)
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/rlsapi_v1.py
🔇 Additional comments (3)
src/observability/formats/rlsapi.py (1)
29-30: Token fields and total-token computation are implemented correctly. The new defaults and the Line 52 aggregation fix the zero-token telemetry bug while keeping existing event construction backward compatible.
Also applies to: 52-52
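The shape of that change can be sketched roughly as follows. The real `InferenceEventData` and `build_inference_event` in `src/observability/formats/rlsapi.py` carry many more fields, so this is only an illustration of the new defaults and the aggregation.

```python
from dataclasses import dataclass


# Illustrative subset of InferenceEventData: the two new fields default to 0,
# keeping existing event construction backward compatible.
@dataclass
class InferenceEventData:
    input_tokens: int = 0
    output_tokens: int = 0


def build_inference_event(data: InferenceEventData) -> dict:
    """Build a minimal Splunk event payload (illustrative only)."""
    return {
        # Previously hardcoded to 0; now computed from the new fields.
        "total_llm_tokens": data.input_tokens + data.output_tokens,
    }
```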
tests/unit/observability/formats/test_rlsapi.py (1)
53-77: Good targeted regression test for token aggregation. This test cleanly validates the non-zero `total_llm_tokens` path and complements the existing default-zero assertion.
src/app/endpoints/rlsapi_v1.py (1)
375-376: Token usage is correctly threaded from inference to Splunk event payload. Lines 375-376 and Lines 761-762 correctly propagate extracted token counts, enabling accurate `total_llm_tokens`.
Also applies to: 761-762
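The threading the reviewer describes might look roughly like this. `extract_token_usage()` exists in the codebase, but its actual signature, the response shape used here, and the simplified `queue_splunk_event` helper are all assumptions for illustration.

```python
def extract_token_usage(response: dict) -> tuple[int, int]:
    """Pull (input_tokens, output_tokens) from an inference response, defaulting to 0."""
    usage = response.get("usage") or {}
    return usage.get("input_tokens", 0), usage.get("output_tokens", 0)


def queue_splunk_event(input_tokens: int = 0, output_tokens: int = 0) -> dict:
    """Stand-in for _queue_splunk_event: forward token counts to the event builder."""
    return {"total_llm_tokens": input_tokens + output_tokens}


# The endpoint handler extracts usage once and passes it through unchanged.
response = {"usage": {"input_tokens": 150, "output_tokens": 75}}
in_tok, out_tok = extract_token_usage(response)
event = queue_splunk_event(input_tokens=in_tok, output_tokens=out_tok)
```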
total_llm_tokens was hardcoded to 0 in the rlsapi Splunk event builder despite token counting being implemented via extract_token_usage(). Add input_tokens and output_tokens to InferenceEventData and pass actual counts from the endpoint handler.

Ref: RSPEED-2857

Signed-off-by: Major Hayden <major@redhat.com>
Force-pushed from d454817 to 46669a3.
Description
`total_llm_tokens` in the rlsapi Splunk telemetry event was hardcoded to `0`. Token counting is already implemented via `extract_token_usage()` and the data is captured in the endpoint handler, but it was never passed through to the Splunk event builder.

This adds `input_tokens` and `output_tokens` fields to `InferenceEventData` and plumbs the actual token counts from `infer_endpoint()` through `_queue_splunk_event()` into `build_inference_event()`, which now computes `total_llm_tokens` as the sum of input and output tokens.

Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
- Ran `uv run pytest tests/unit/observability/formats/test_rlsapi.py -v` to verify that the default case still yields `total_llm_tokens: 0` and that the new regression test yields `total_llm_tokens: 225`.
- Ran `uv run make verify` to confirm all linters pass.
- Manual check: send an `/infer` request against a model that returns usage data. The resulting Splunk event should now contain non-zero `total_llm_tokens`.
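A regression test along the lines described (default zero plus the non-zero 225 case) could be sketched as below. The real tests in `test_rlsapi.py` exercise the actual event builder; this stand-in only mirrors the aggregation logic.

```python
def build_inference_event(input_tokens: int = 0, output_tokens: int = 0) -> dict:
    """Stand-in for the real Splunk event builder (illustrative only)."""
    return {"total_llm_tokens": input_tokens + output_tokens}


def test_total_llm_tokens_defaults_to_zero():
    # Existing default-zero assertion: no usage data means 0 total tokens.
    assert build_inference_event()["total_llm_tokens"] == 0


def test_total_llm_tokens_aggregates_input_and_output():
    # 150 input + 75 output = 225, matching the non-zero case described above.
    event = build_inference_event(input_tokens=150, output_tokens=75)
    assert event["total_llm_tokens"] == 225
```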
New Features
Tests