Distributes compaction coordination across managers #6324
keith-turner wants to merge 3 commits into apache:main
For an overview of how this works see the javadoc in CompactionCoordinator.
| SecurityErrorCode.PERMISSION_DENIED).asThriftException(); | ||
| } | ||
| ResourceGroupId groupId = ResourceGroupId.of(job.group); | ||
| LOG.trace("reserveCompactionJob called for group {} by compactor {}", groupId, |
Should we include the ECID here?
| public void addJobs(TInfo tinfo, TCredentials credentials, List<TResolvedCompactionJob> tjobs) | ||
| throws TException { | ||
| if (!security.canPerformSystemActions(credentials)) { | ||
| LOG.warn("Thrift call attempted to add job and did not have proper access. {}", |
Why not throw an exception here?
That is a oneway Thrift method, so an exception thrown here would never reach the caller. Added a comment in ed7ef38 and changed the log level from warn to error.
| List<HostAndPort> sortedUniqueHost) { | ||
| } | ||
| public synchronized CoordinatorLocations getLocations(boolean useCache) { |
From what I could tell, 'true' is only used as an argument
| zooCache.clear(Constants.ZMANAGER_COORDINATOR); | ||
| } | ||
| byte[] serializedMap = zooCache.get(Constants.ZMANAGER_COORDINATOR); | ||
| var type = new TypeToken<Map<String,String>>() {}.getType(); |
This could probably be a private static final field.
| byte[] serializedMap = zooCache.get(Constants.ZMANAGER_COORDINATOR); | ||
| var type = new TypeToken<Map<String,String>>() {}.getType(); | ||
| Map<String,String> stringMap = GSON.get().fromJson(new String(serializedMap, UTF_8), type); | ||
| Map<ResourceGroupId,HostAndPort> locations = new HashMap<>(); |
| Map<ResourceGroupId,HostAndPort> locations = new HashMap<>(); | |
| Map<ResourceGroupId,HostAndPort> locations = new HashMap<>(stringMap.size()); |
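A side note on the suggestion above: with the default load factor of 0.75, a `HashMap` constructed with an initial capacity equal to the expected entry count can still resize once while it is being filled. The sketch below shows one way to compute a capacity that avoids that. `PresizedMaps` and `capacityFor` are hypothetical names used only for illustration and are not part of the PR; on Java 19 and later, `HashMap.newHashMap(int)` does the same arithmetic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of the PR. */
final class PresizedMaps {
  // Default HashMap load factor.
  private static final double LOAD_FACTOR = 0.75;

  private PresizedMaps() {}

  /** Smallest initial capacity that holds expectedEntries without exceeding the load factor. */
  static int capacityFor(int expectedEntries) {
    return (int) Math.ceil(expectedEntries / LOAD_FACTOR);
  }

  /** Creates an empty map sized so that expectedEntries insertions should not trigger a resize. */
  static <K,V> Map<K,V> newMapFor(int expectedEntries) {
    return new HashMap<>(capacityFor(expectedEntries));
  }
}
```

For the small maps involved here (one entry per resource group) the difference is unlikely to be measurable, so either form seems reasonable.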
| sleepTime = maxSleepTime; | ||
| } | ||
| UtilWaitThread.sleep(sleepTime); |
We might need a property here like the TABLE_SUSPEND_DURATION, but for Coordinators. There is a case where we don't want to be too responsive to a Manager going away and coming back. Bouncing the Manager sometimes clears up some state issues. I would think that we only want to recompute the group assignments if the number of compaction groups has changed or the number of managers has changed, and then only after some suspend duration.
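The idea in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the PR: `ReassignmentGate`, `shouldRecompute`, and the suspend duration passed to the constructor are hypothetical names, and the caller is assumed to supply the current manager count, group count, and a monotonic timestamp (for example from `System.nanoTime()`). The gate reports that a recompute is due only when the counts differ from the last applied ones and have stayed at the new values for the whole suspend duration, so a manager that bounces and returns quickly causes no reassignment.

```java
import java.time.Duration;

/** Hypothetical helper, not part of the PR: delays recomputing group assignments. */
final class ReassignmentGate {
  private final long suspendNanos;
  // Counts that the current assignments were computed from.
  private int appliedManagers;
  private int appliedGroups;
  // Most recently observed counts that differ from the applied ones.
  private int pendingManagers;
  private int pendingGroups;
  private long pendingSinceNanos;
  private boolean pending;

  ReassignmentGate(Duration suspend, int managers, int groups) {
    this.suspendNanos = suspend.toNanos();
    this.appliedManagers = managers;
    this.appliedGroups = groups;
  }

  /**
   * Returns true when the observed counts differ from the applied counts and have been
   * stable for at least the suspend duration. On true, the observed counts become applied.
   */
  synchronized boolean shouldRecompute(int managers, int groups, long nowNanos) {
    if (managers == appliedManagers && groups == appliedGroups) {
      // Nothing changed, or a manager went away and came back: drop any pending change.
      pending = false;
      return false;
    }
    if (!pending || managers != pendingManagers || groups != pendingGroups) {
      // First sighting of this change: start (or restart) the suspend timer.
      pending = true;
      pendingManagers = managers;
      pendingGroups = groups;
      pendingSinceNanos = nowNanos;
      return false;
    }
    if (nowNanos - pendingSinceNanos < suspendNanos) {
      return false; // still inside the suspend window
    }
    appliedManagers = managers;
    appliedGroups = groups;
    pending = false;
    return true;
  }
}
```

One limitation of comparing counts alone is that a manager replaced by a different host within the window looks like no change, so a real implementation might compare the sets of manager addresses and group ids instead.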
| private void setupAssistantMetrics(MetricsProducer... producers) { | ||
| MetricsInfo metricsInfo = getContext().getMetricsInfo(); | ||
| metricsInfo.addMetricsProducers(producers); | ||
| // TODO should tests compaction metrics from multiple managers |