Skip to content

Centralize Monitor Server table definitons on backend#6321

Merged
DomGarguilo merged 4 commits intoapache:mainfrom
DomGarguilo:backendDefinedMonitorTables
Apr 20, 2026
Merged

Centralize Monitor Server table definitons on backend#6321
DomGarguilo merged 4 commits intoapache:mainfrom
DomGarguilo:backendDefinedMonitorTables

Conversation

@DomGarguilo
Copy link
Copy Markdown
Member

This PR centralizes all of the Monitor table definitions into one place on the backend. We can now more or less define a monitor table via a List<Metric> where that list gets mapped 1:1 to the view on the front end. All order and filtering of the columns takes place next to this List on the backend.

  • reworked server process monitor table views to return a backend-defined table model with columns, data (rows), status (used for banners and status LEDs), and timestamp
  • Added a new record Column so the backend can define and send each columns
    • key
    • label
    • description
    • uiClass
  • Added enum ServerTable to serve as the IDs for each table and to be used as the API param in /rest-v2/servers/view;table=<table>
  • Added a new enum ServersView. There should be on ServersView per UI table
  • added a method for each ServersView where each tables shape is defined. These methods return a List<Metric> which maps directly to the frontend table columns. This is where we define which and in what order the columns will be for each table.
  • removed the generated columns.js and its supporting code. Now the frontend builds the table headers and DataTables column config directly from the returned backend column definitions

Now to add a new table to the UI:

  • add a new ServerTable enum value and create the corresponding metric selection method in ServersView for that table
  • cache the table in SystemInformation with the corresponding server set
  • add a fetch helper method in the javascript that calls /rest-v2/servers/view;table=<NEW_TABLE>
  • point the page JS at that helper and storage key and use the existing generic table render methods

@DomGarguilo DomGarguilo self-assigned this Apr 17, 2026
@keith-turner
Copy link
Copy Markdown
Contributor

Tried running this locally and in the web browser debug console I noticed for every page I visited on the monitor it always fetched /rest-v2/servers/view;table=SCAN_SERVERS and /rest-v2/servers/view;table=MANAGERS. I suspect this is preexisting behavior and also suspect its caused by navbar.js, but not sure. Wondering if this means every page visited is pulling down a list of all scan servers and there metrics? If so may need to open an issue about this.

I first noticed it on the tserver page, it would fetch TABLET_SERVERS, SCAN_SERVERS, and MANAGERS. Then tried visiting the main page and it fetched SCAN_SERVERS and MANAGERS.

Copy link
Copy Markdown
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly looked at the java code and those changes look good.

}

private static List<Metric> tabletServerMetrics() {
return metricList(MetricDocSection.GENERAL_SERVER, MetricDocSection.TABLET_SERVER,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an experiment I changed this method to return the following and ran the monitor. It displayed those columns (plus the commons ones prepended) in that order. Not suggesting this change, just sharing the experiment it was neat to see this working and really easy to change.

return List.of(Metric.TSERVER_MINC_RUNNING, Metric.TSERVER_MINC_QUEUED, Metric.TSERVER_HOLD, Metric.SCAN_ZOMBIE_THREADS, Metric.BLOCKCACHE_DATA_REQUESTCOUNT, Metric.BLOCKCACHE_DATA_HITCOUNT, Metric.BLOCKCACHE_DATA_EVICTIONCOUNT);

Not something for this PR, but w/ the experiment above realized we need scan queued, running, and failed metrics. Some of those metrics may be there in the form of thread pool metrics, but those are from micrometer and not currently collected by the monitor. May also be good to have failed MinC metric. Then for scan and minc we could have running/queued/failed and those would be really good indications of tserver health. Probably want to eventually put the smallest set of metrics in the table for all servers that indicates if a server is healthy, then maybe we could have a per server page that one could click on and see all that the monitor knows about a specific server.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was neat to see this working and really easy to change.

Great! This was the goal so I'm glad things were easy.

we need scan queued, running, and failed metrics

Yea this seems like a good follow on. Should we create a ticket to track this?

Probably want to eventually put the smallest set of metrics in the table for all servers that indicates if a server is healthy, then maybe we could have a per server page that one could click on and see all that the monitor knows about a specific server.

This seems like a good idea to me. The goal with these changes is to make it easy to figure out what info to put in each table however I am not sure where the best place to discuss or document that info per table is.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal with these changes is to make it easy to figure out what info to put in each table

For tablet servers I would like to try to focus on a few things w.r.t. metrics, are scans,writes, conditional updates failing or backed up? Hope to try experimenting w/ that in a follow on to this. If those high level indicators show a problem then would need to dig deeper to figure out why.


// Keep common columns and filter out metric columns that have no values
this.columns = requestedColumns.stream()
.filter(col -> isCommonColumn(col.key()) || presentMetricNames.contains(col.key()))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this will only show columns that have a metric present on at least one server? Would it be possible to always show all the requested columns even if no server currently has metrics? Seems like this could cause columns to blink in and out of existence when auto refresh is enabled.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea this is definitely an issue. Its a tradeoff. The filter was added to avoid a very wide table full of empty columns but filtering out empty columns is not a good longterm solution. The better longterm solution is to to make the per-table metric Lists smaller and more intentional then always show that list even if that means empty columns.

Maybe for now ill remove that filter so it makes it super obvious there is work to be done to refine the list for each table.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in 7e2775f

@DomGarguilo
Copy link
Copy Markdown
Member Author

for every page I visited on the monitor it always fetched /rest-v2/servers/view;table=SCAN_SERVERS and /rest-v2/servers/view;table=MANAGERS. I suspect this is preexisting behavior and also suspect its caused by navbar.js, but not sure.

I'm pretty sure you're correct. With this PR the calls the navbar is making need to pull these full table model JSONs to get the data they need. I think it will be a worthwhile follow on to make a dedicated endpoint to return this info. That will be a lot more lightweight and not send a ton of unnecessary data.

@DomGarguilo DomGarguilo merged commit 828827a into apache:main Apr 20, 2026
9 checks passed
@DomGarguilo DomGarguilo deleted the backendDefinedMonitorTables branch April 20, 2026 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants