improves monitor tserver view #6329

Open
keith-turner wants to merge 1 commit into apache:main from keith-turner:tserver-monitor-view
@keith-turner
Contributor

Made a few major changes in this PR, all in support of providing an improved tserver page on the monitor.

  • Gave the RPC thread a consistent name across all server types. This was done to make the thread name findable in a metrics tag using a constant.
  • Set up a custom monitor metrics registry. This was done because it may not be safe to read from a registry in another thread (see micrometer-metrics/micrometer#7417, "StepFunctionCounter.count() has a race condition") and, more importantly, to get step functionality, where metrics like function counters show the delta for the last 30 seconds.
  • Refactored the ServersView code to be more flexible. It used to compute data directly from a single metric. Now it's easier to do arbitrary reductions on a collection of metrics for the data in a column.
  • Started collecting executor metrics on thread pools and used those to create some of the tserver columns in the monitor. Using the metrics requires looking for specific thread pool names in the tags.
  • Added a new metric to track scan errors.
  • Fixed some incorrect metric types.
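
The column reductions and the thread-pool tag lookup described in the list above can be pictured with a small, self-contained sketch. This is not the PR's code and does not use Micrometer classes: `ColumnReductions`, `MetricSample`, `inPool`, `reduce`, the `name` tag key, the `executor.*` metric names, and the pool names are all hypothetical stand-ins chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;
import java.util.stream.DoubleStream;

final class ColumnReductions {

  /** Hypothetical stand-in for one reported metric: a name, its tags, and a value. */
  record MetricSample(String name, Map<String, String> tags, double value) {}

  // Assumed tag key that carries the thread pool name; the real key may differ.
  static final String POOL_TAG = "name";

  /** Matches samples of one metric that belong to one named thread pool. */
  static Predicate<MetricSample> inPool(String metricName, String poolName) {
    return s -> s.name().equals(metricName) && poolName.equals(s.tags().get(POOL_TAG));
  }

  /** Computes one column value by reducing every matching sample, not just a single metric. */
  static double reduce(List<MetricSample> samples, Predicate<MetricSample> filter,
      ToDoubleFunction<DoubleStream> reducer) {
    return reducer.applyAsDouble(
        samples.stream().filter(filter).mapToDouble(MetricSample::value));
  }

  public static void main(String[] args) {
    List<MetricSample> samples = List.of(
        new MetricSample("executor.active", Map.of(POOL_TAG, "rpc-pool"), 3),
        new MetricSample("executor.active", Map.of(POOL_TAG, "scan-pool"), 5),
        new MetricSample("executor.queued", Map.of(POOL_TAG, "rpc-pool"), 7));

    // Sum of active threads in the (hypothetical) RPC pool only.
    double rpcActive = reduce(samples, inPool("executor.active", "rpc-pool"), DoubleStream::sum);
    // Maximum active threads across all pools.
    double maxActive =
        reduce(samples, s -> s.name().equals("executor.active"), v -> v.max().orElse(0));

    System.out.println(rpcActive + " " + maxActive); // prints "3.0 5.0"
  }
}
```

Passing the reducer as a function is what allows one column to be a sum, another a maximum, and so on, without a dedicated code path per metric.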

@keith-turner keith-turner added this to the 4.0.0 milestone Apr 22, 2026
@keith-turner
Contributor Author

keith-turner commented Apr 22, 2026

This is a screenshot of the updated tserver page. Notice the new rates; these make sorting on the data much easier. Previously the page showed ever-increasing counts, which made sorting less useful when some tservers have been running for months and others have just restarted. The rates are based on the last 30 seconds of count increments for each tserver, and they come from the new registry.
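
To illustrate why a rate sorts better than a raw count, here is a minimal plain-Java sketch that turns a cumulative count into a per-second rate over a fixed step. It is not the registry's step implementation; `StepRate`, its restart handling, and the sample numbers are assumptions made for the example.

```java
/** Turns an ever-increasing count into a per-second rate over fixed steps. */
final class StepRate {
  private final long stepSeconds;
  private long lastCount;

  StepRate(long stepSeconds, long initialCount) {
    this.stepSeconds = stepSeconds;
    this.lastCount = initialCount;
  }

  /** Call once per step with the current cumulative count; returns the rate for that step. */
  double update(long currentCount) {
    long delta = currentCount - lastCount;
    if (delta < 0) {
      // Assumption: a lower count means the server restarted, so count from zero.
      delta = currentCount;
    }
    lastCount = currentCount;
    return (double) delta / stepSeconds;
  }

  public static void main(String[] args) {
    // A tserver that has run for months and one that just restarted.
    StepRate longRunning = new StepRate(30, 1_000_000L);
    StepRate justRestarted = new StepRate(30, 0L);

    // After one 30-second step the raw counts differ hugely, but the rates
    // show that the restarted server is actually the busier one.
    System.out.println(longRunning.update(1_000_300L)); // prints 10.0
    System.out.println(justRestarted.update(600L)); // prints 20.0
  }
}
```

Sorting by the raw counts would rank the long-running server first; sorting by the 30-second rate ranks the servers by current activity.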

[Image: Screenshot at 2026-04-22 18-52-14]

  opts.parseArgs(applicationName, args);
  var siteConfig = opts.getSiteConfiguration();
- final String newBindParameter = siteConfig.get(Property.RPC_PROCESS_BIND_ADDRESS);
+ final String newBindParameter = siteConfig.get(siteConfig

This is a bug I found while testing this PR. It needs to be pulled into a separate PR, and some script changes are also needed.
