Fix silent task.completed / task.blocked hooks (#2449) #536

Merged
bborn merged 1 commit into main from task/2449-investigate-broken-taskyou-notification
Apr 21, 2026

@bborn commented Apr 20, 2026

Summary

The notifications.jsonl file on the agent server had been silent for ~40 days despite many task completions. Three separate bugs in the hook pipeline compounded to suppress task.completed completely and let task.blocked fire only by accident.

  • db.UpdateTaskStatus only emitted the generic task.updated event, so every caller (CLI, TUI, MCP, Claude hook subprocesses) silently skipped lifecycle hooks
  • The executor's success path transitions to backlog, not done, so no code path fired task.completed for agent-finished tasks
  • events.Emitter.Emit used a fire-and-forget goroutine, so short-lived CLI commands like ty close exited before the hook subprocess even forked

Fix

  • db layer: extend EventEmitter with EmitTaskBlocked / EmitTaskCompleted, fire them from UpdateTaskStatus on the matching transitions — every caller benefits as long as an emitter is registered
  • executor: emit task.completed when the agent succeeds (since the status goes to backlog) and task.failed when it dies
  • CLI: openTaskDB helper registers a process-wide emitter on every DB open; PersistentPostRun on the root Cobra command flushes pending hook goroutines before exit
  • events: Emitter.Wait() via sync.WaitGroup so CLI commands can flush without blocking the daemon

Verification

Built the fix, deployed to the agent server, and verified three real scenarios each produced the expected notifications.jsonl line:

{"event":"blocked","task_id":"55","title":"e2e final","project":"personal","timestamp":"2026-04-20T17:24:40Z"}
{"event":"completed","task_id":"55","title":"e2e final","project":"personal","timestamp":"2026-04-20T17:24:47Z"}
  • Agent hitting Claude's Stop hook → task.blocked fires via the Claude hook subprocess emitting through the db layer
  • ty close <id> → task.completed fires, and the PersistentPostRun wait prevents the process from exiting before the append completes
  • ty status <id> blocked → task.blocked fires the same way

Extras

  • examples/hooks/notifications-health-check.sh: cron-friendly script that alerts when notifications.jsonl stays silent longer than a threshold (default 24 hours) so the next outage is caught early
  • README / examples docs updated to reflect when each event actually fires

Test plan

  • go test ./... green
  • Built for linux, deployed to howdyrunner-agents.exe.xyz, restarted ty daemon
  • Exercised agent execution → verified task.blocked line appears
  • Exercised ty close → verified task.completed line appears
  • Exercised ty status ... blocked → verified task.blocked line appears
  • Follow up: install notifications-health-check.sh as a cron on the agent server

🤖 Generated with Claude Code

Investigation summary
---------------------

`/home/exedev/notifications.jsonl` on the agent server had not received a
write since 2026-03-11 despite dozens of task completions since. The file
is populated by `~/.config/task/hooks/task.completed` and
`~/.config/task/hooks/task.blocked` scripts.

The `task.blocked` script had fired only twice (both on the day hooks were
installed) before firing again on 2026-04-20. The `task.completed` script had
never fired. The root cause turned out to be three separate gaps:

1. **db.UpdateTaskStatus only emitted `task.updated`**, never the lifecycle
   events `task.blocked` / `task.completed`. Every caller (CLI, TUI, MCP
   server, the Claude `Notification` / `Stop` hook subprocesses) routes
   through this function, so none of them fired the specific hooks.

2. **The executor's success path transitions to StatusBacklog**, not
   StatusDone, because "only humans should mark tasks as done." That
   leaves no code path that fires `task.completed` on agent success.

3. **Event hook goroutines were racing CLI exits**. `events.Emitter.Emit`
   spawned `go e.runHook(event)` but never awaited it. Short-lived
   commands like `ty close` returned before the hook subprocess even
   forked, so the append to notifications.jsonl silently dropped.
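
The third gap can be reproduced in isolation. The sketch below is not the project's code: `fireAndForget` is a hypothetical stand-in for the old `Emit`, and the `gate` channel stands in for the time a real hook subprocess needs to fork and append its line. The gate makes the outcome deterministic so the example does not depend on scheduler timing.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// fireAndForget mimics the old Emit: it starts the hook in a goroutine
// and returns immediately, giving the caller nothing to wait on.
func fireAndForget(hook func()) {
	go hook()
}

// hookRanBeforeReturn reports whether the hook had finished by the time
// a short-lived command would be ready to exit.
func hookRanBeforeReturn() bool {
	var ran atomic.Bool
	gate := make(chan struct{})
	fireAndForget(func() {
		<-gate // the hook is still "forking" until the gate opens
		ran.Store(true)
	})
	finished := ran.Load() // a command like `ty close` would exit here
	close(gate)            // in a real CLI the process is already gone
	return finished
}

func main() {
	fmt.Println("hook finished before exit:", hookRanBeforeReturn())
	// prints: hook finished before exit: false
}
```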

Fix
---

- `internal/db/events.go` + `internal/db/tasks.go`: extend `EventEmitter`
  interface with `EmitTaskBlocked` and `EmitTaskCompleted`, and fire them
  from `UpdateTaskStatus` on the relevant status transitions. Every
  caller now emits lifecycle events as long as an emitter is registered.

- `internal/executor/executor.go`: fire `EmitTaskCompleted` on agent
  success (which transitions to backlog, not done, so the db emit
  doesn't cover it) and `EmitTaskFailed` on agent failure so watchers
  can distinguish "needs input" from "agent died."

- `cmd/task/main.go`: add `openTaskDB` helper that registers a
  process-wide events emitter on every db open — claude-hook, MCP
  server, and every CLI/TUI command share the same emitter — plus a
  `PersistentPostRun` on the root Cobra command that waits for
  pending hook goroutines to finish before the process exits.

- `internal/events/events.go`: track in-flight emits with a WaitGroup
  and expose `Emitter.Wait()` so CLI commands can flush before exit
  without blocking the daemon's hot path.

Verification
------------

Upgraded the agent server to this build and exercised three real scenarios
there; each produced the expected line in `notifications.jsonl`:

- agent execution hitting Claude's `Stop` hook → `task.blocked` fires
- `ty close <id>` on a blocked task → `task.completed` fires
- `ty status <id> blocked` on a done task → `task.blocked` fires

Extras
------

- `examples/hooks/notifications-health-check.sh`: cron-friendly script
  that alerts if notifications.jsonl goes silent longer than a threshold
  (default 24 hours) so this class of outage is caught earlier next time.

- README/examples docs updated to reflect when each event actually fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bborn bborn merged commit 069ac5f into main Apr 21, 2026
3 of 4 checks passed