Friday Afternoon Amnesia

It's Friday afternoon and someone asks what I accomplished this week. I know I was busy. I was in meetings, I reviewed code, I unblocked people, I made decisions in Slack threads, I debugged a deployment issue, I wrote some code. But when I try to reconstruct the actual sequence of events, my memory is surprisingly unhelpful. The week was fragmented across code, tickets, chats, calendar, email, and a dozen browser tabs, and none of those systems individually tell the story of what happened. The result is a familiar feeling: I was clearly busy all week and I can't quite say what I did.

This bothered me enough to build something about it. Over the past several months I've assembled a set of tools that gather the scattered traces of my workday, turn them into reviewable summaries, and let me confirm or correct before anything gets committed.

Meeting Summaries

The calendar says a meeting was scheduled. A meeting summary says what changed because of it.

Meetings are the easiest work to remember vaguely and document poorly. You walk out knowing the vibe — "that went well" or "we need to follow up on the migration thing" — but the specific decisions, action items, and who-said-what details start decaying immediately.

flowchart LR
    AH["Audio Hijack"] --> Hazel["Hazel"]
    Hazel --> Parakeet["parakeet-mlx<br/>(local transcription)"]
    Hazel --> Cal["Google Calendar<br/>(event matching)"]
    Parakeet --> OpenAI["OpenAI<br/>(transcript-grounded summary)"]
    AH --> Gemini["Gemini<br/>(audio-aware summary)"]
    Cal --> Merge["OpenAI<br/>(conservative merge)"]
    OpenAI --> Merge
    Gemini --> Merge
    Merge --> Obsidian["Obsidian vault"]

I set up a pipeline that records meeting audio, transcribes it locally, and then matches the recording to the right calendar event. That last part is its own problem — the system fetches the day's calendar events, ranks them by time overlap with the recording, and uses a model to pick the best match or mark the recording as impromptu. Declined events, focus time blocks, and out-of-office entries get filtered out. The matched calendar event owns the title and scheduled start time; an impromptu meeting gets its title inferred from the content.
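
As a rough illustration of the overlap ranking described above (not the actual pipeline code), a minimal Python sketch might look like this. The `Event` fields, the `SKIPPED_KINDS` values, and the function names are assumptions made for the example; a real calendar API will name these differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical labels for entries that should never match a recording.
SKIPPED_KINDS = {"focusTime", "outOfOffice"}


@dataclass
class Event:
    title: str
    start: datetime
    end: datetime
    kind: str = "default"    # assumed field
    declined: bool = False   # assumed field


def overlap(a_start, a_end, b_start, b_end) -> timedelta:
    """Length of the intersection of two intervals (zero if disjoint)."""
    latest_start = max(a_start, b_start)
    earliest_end = min(a_end, b_end)
    return max(earliest_end - latest_start, timedelta(0))


def rank_candidates(rec_start, rec_end, events):
    """Return (event, overlap) pairs, largest overlap first, after filtering."""
    candidates = []
    for ev in events:
        if ev.declined or ev.kind in SKIPPED_KINDS:
            continue
        shared = overlap(rec_start, rec_end, ev.start, ev.end)
        if shared > timedelta(0):
            candidates.append((ev, shared))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)
```

In this sketch an empty result is the signal that a recording is probably impromptu; the ranked list is only a shortlist, and the final pick would still be left to the model.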

Once the recording is matched, two separate models summarize it independently. The first receives the raw audio file, so it can pick up on things a transcript misses: tone, emphasis, hesitation, laughter, group reactions. When someone says "sure, we can do that" in a flat voice after a long pause, that means something different than an enthusiastic agreement, and the audio-aware model can capture that distinction. The second model works only from the text transcript and is instructed to stay strictly grounded in what was said — exact names, dates, numbers, URLs, and technical terms.

A third pass merges the two drafts. It prefers the transcript-grounded version for specifics and keeps audio observations when they're relevant and don't conflict. Where the two disagree, it drops uncertain attribution rather than guessing. Everything lands in my Obsidian vault alongside debug artifacts — the raw transcript, the calendar match decision, each model's draft, and the final merge — so I can audit any summary back to its sources. Occasionally I do catch things worth correcting.

This turned out to matter more than I expected. Meeting summaries make daily notes better because they turn spoken decisions into a written record. They make time tracking better because they explain what a calendar block was actually about — which client, which project, what came out of it. Without them, meetings are just opaque blocks on a timeline.

Dayflow

A lot of the day is reviews, chats, docs, planning, and small follow-ups that don't leave obvious artifacts. One of the hardest lessons I've had to learn as an engineering lead is that the absence of a commit does not mean the absence of work. Dayflow is the tool that makes that in-between work visible.

flowchart LR
    Screen["Screen activity"] --> Dayflow["Dayflow<br/>(capture)"]
    Dayflow --> Bedrock["Kimi K2.5<br/>on Bedrock<br/>(ZDR inference)"]
    Bedrock --> SQLite["SQLite DB"]
    Dayflow --> SQLite
    SQLite --> MCP["Dayflow MCP<br/>server"]

Dayflow runs in the background and records what's actually on screen — application windows, sites, time boundaries — and uses an LLM to describe what I appeared to be doing. All of this lands in a local SQLite database: the raw timeline data, the app/window/site boundaries, and the LLM-generated activity descriptions. It's the closest thing in the system to a flight recorder.

Downstream tools don't read the SQLite database directly. Instead, I wrote a custom MCP server that exposes it — so when Claude Code is generating a daily note or reconciling time entries, it can query Dayflow's timeline through the same tool interface it uses for everything else.
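
To make the idea concrete, here is a sketch of the kind of read-only query such a server could wrap as a tool. The `activities` table and its columns are a hypothetical schema invented for the example, not Dayflow's real one, and the MCP wiring itself is left out.

```python
import sqlite3


def open_readonly(path: str) -> sqlite3.Connection:
    """Open the database read-only so a tool call can never modify it."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def query_timeline(conn: sqlite3.Connection, start_iso: str, end_iso: str):
    """Return activities overlapping [start_iso, end_iso) as plain dicts.

    Assumes timestamps are stored as ISO-8601 strings in one consistent
    format, so that string comparison matches chronological order.
    """
    rows = conn.execute(
        "SELECT start_ts, end_ts, app, description FROM activities "
        "WHERE start_ts < ? AND end_ts > ? ORDER BY start_ts",
        (end_iso, start_iso),
    ).fetchall()
    keys = ("start", "end", "app", "description")
    return [dict(zip(keys, row)) for row in rows]
```

Returning plain dicts keeps the result easy to serialize as JSON, which is what an agent on the other side of a tool interface would consume.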

A continuous screen recording is about as high-risk as data gets from a security standpoint, so the choice of LLM backend matters a lot. I initially tried running it fully locally with Qwen3.6 in LM Studio. This technically worked, but the activity descriptions weren't accurate enough and the compute cost was impractical for an always-on tool. On my MacBook Pro, battery life dropped from nearly a full day to about 90 minutes and the machine became uncomfortably hot. I moved to a cloud model: Kimi K2.5 running on Amazon Bedrock (the only inference vendor whose ZDR I trust to actually mean zero data retention). The cost so far has been under a dollar a month.

This isn't an ideological choice. Fully local is appealing for privacy, and the experiment proved it was technically possible. But possible is not the same thing as usable. The Bedrock path hit the practical middle ground: accurate enough, private enough, and the laptop doesn't burn my legs.

Daily Notes

flowchart LR
    MS["Meeting<br/>summaries"] --> Claude["Claude Code<br/>(synthesis)"]
    DF["Dayflow<br/>MCP server"] --> Claude
    Git["Git commits"] --> Claude
    GL["GitLab<br/>MCP server"] --> Claude
    RM["Redmine<br/>MCP server"] --> Claude
    AI["Claude Code &amp;<br/>Codex history"] --> Claude
    Cal["Google Calendar<br/>MCP server"] --> Claude
    Slack["Slack<br/>MCP server"] --> Claude
    Gmail["Gmail<br/>MCP server"] --> Claude
    Vault["Existing<br/>vault notes"] --> Claude
    Claude --> Daily["Obsidian<br/>daily note"]

Every morning I run a script to generate a daily note in Obsidian for the previous day. The script launches Claude Code with a pile of MCP servers connected — Dayflow, Google Calendar, Slack, Gmail, Redmine, GitLab — plus local data it gathers up front like git commits across my projects and Claude Code and Codex session history. Claude Code pulls from all of those sources and puts together a daily log grouped by time of day (morning, afternoon, or evening) and project.

These notes answer the original question: what did I do yesterday? But they go a bit further. Each Monday the generation script adds a Last Week in Review section with a compressed overview of the previous week; on the 1st of each month, it adds a Last Month in Review section drawn from the previous month's notes. It also writes an Observations section, like a personal retro of the day. This calls out things worth working on, like "You got distracted shopping for gardening supplies on Amazon instead of paying attention to that meeting."
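
The scheduling rules for those sections are simple enough to sketch. The following is an illustration only: the function names are made up, and the noon and 17:00 boundaries for the time-of-day buckets are assumptions, since the real script's cutoffs aren't stated here.

```python
from datetime import date, time


def extra_sections(run_date: date) -> list[str]:
    """Sections to add to the note generated on run_date."""
    sections = []
    if run_date.weekday() == 0:   # Monday
        sections.append("Last Week in Review")
    if run_date.day == 1:         # first day of the month
        sections.append("Last Month in Review")
    sections.append("Observations")  # written every day
    return sections


def time_of_day(t: time) -> str:
    """Bucket a clock time for grouping; the boundaries are assumed."""
    if t < time(12):
        return "morning"
    if t < time(17):
        return "afternoon"
    return "evening"
```

A Monday that falls on the 1st gets both review sections, which is the one case where the two rules interact.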

Time Tracking

I unfortunately need to track time in three different systems, each with a different audience and level of granularity.

  1. Toggl is my personal timeline: what I actually spent time on, across all categories, personal and business.[1]
  2. Redmine is the ticket-level work log for my company.
  3. Productive is the company timesheet: whole hours per client per day, so that finance can bill clients.

The hard part is always attribution. A meeting might be 30 minutes on the calendar but the prep and follow-up add another 20. A code review might span two clients. An interrupt might be too short to bill but real enough to account for. Very often I just forget to start or stop a timer. And although the three systems have incompatible granularity and rounding rules, they should all stay roughly consistent with each other.
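
To show what the granularity mismatch looks like in practice, here is a small sketch that rolls minute-level entries up into whole hours per client per day. The rounding rule (nearest hour, halves up) is an assumption chosen for the example; the actual rules aren't specified here.

```python
from collections import defaultdict


def whole_hours_per_client(entries):
    """Roll (day, client, minutes) tuples up to {(day, client): whole_hours}."""
    minutes = defaultdict(int)
    for day, client, mins in entries:
        minutes[(day, client)] += mins
    # Assumed rule: round to the nearest whole hour, with halves rounding up.
    # Integer arithmetic avoids the banker's rounding of Python's round().
    return {key: (total + 30) // 60 for key, total in minutes.items()}
```

Under this rule, 95 minutes for one client becomes 2 hours and a 20-minute interrupt for another becomes 0, which is exactly the "too short to bill but real enough to account for" case.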

flowchart LR
    DF["Dayflow<br/>MCP server"] --> Phase1["Phase 1:<br/>propose Toggl"]
    Cal["Google Calendar<br/>MCP server"] --> Phase1
    MS["Meeting<br/>summaries"] --> Phase1
    Phase1 -->|"confirm"| Toggl["Toggl<br/>MCP server"]
    Phase1 --> Phase2["Phase 2:<br/>propose Redmine"]
    Phase2 -->|"confirm"| Redmine["Redmine<br/>MCP server"]
    Phase2 --> Phase3["Phase 3:<br/>propose Productive"]
    Phase3 -->|"confirm"| Productive["Productive<br/>MCP server"]

At the end of each work day, I run a reconcile-time skill in either Claude Code or Codex.[2] This uses the same sources as the daily notes, primarily the Dayflow MCP server for actual time boundaries and the meeting summaries for attribution, and works through the three systems in order. Each system has its own MCP server: Toggl, Redmine, and Productive. The agent reads existing entries, diffs them against its proposal, and writes new or updated entries through the same MCP interface. It proposes Toggl entries first and I confirm; then it derives Redmine entries from verified Toggl and I confirm; then it derives Productive entries from verified Toggl and I confirm. Later systems always derive from the verified state of earlier systems, never from an unconfirmed proposal.
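
The two mechanics in that paragraph, diffing a proposal against existing entries and gating each phase on confirmation, can be sketched in a few lines. This is an illustration, not the skill itself: the entry shape, the stable entry keys, and the `propose` and `confirm` callbacks are all hypothetical.

```python
def diff_entries(existing: dict, proposed: dict):
    """Compare entries keyed by a stable id; return (to_create, to_update)."""
    to_create = {k: v for k, v in proposed.items() if k not in existing}
    to_update = {
        k: v for k, v in proposed.items()
        if k in existing and existing[k] != v
    }
    return to_create, to_update


def run_phases(phases, confirm):
    """Run phases in order, feeding each one only confirmed earlier results.

    phases:  list of (name, propose) where propose(verified) -> proposal
    confirm: confirm(name, proposal) -> accepted entries, or None to stop
    """
    verified = {}
    for name, propose in phases:
        proposal = propose(verified)
        accepted = confirm(name, proposal)
        if accepted is None:
            break  # later phases never see an unconfirmed proposal
        verified[name] = accepted
    return verified
```

Because `confirm` returns the accepted entries rather than a yes/no, any correction made during review is what the next phase derives from.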

A useful side effect of the Redmine phase: when the agent finds dev work that doesn't have a corresponding ticket, it suggests creating one. This has turned out to be a surprisingly effective way to make sure tickets always get made. Work that might otherwise stay untracked — a quick bug fix, an infrastructure tweak, a spike that turned into real work — gets a ticket proposed at the same time it gets a time entry. I still confirm whether to create it, but the prompt is usually enough.

Caveats

I'm not sharing actual code for this because the setup is operationally fragile in every imaginable way. It depends on local CLIs, MCP servers, Dayflow, Obsidian's vault state, external APIs, API keys I've already forgotten exactly how to rotate, and so on. If you're qualified to operate this, you should just build it yourself.

Privacy is a real concern. Dayflow captures what's on screen. Meeting recordings contain other people's voices. The daily notes reference Slack conversations and email subjects. I'm careful about what goes through which model and on what terms, and I don't save audio recordings any longer than needed, but the tradeoff is always present.

I started this because I couldn't answer "what did I do this week?" and it bothered me. I kept building because the answer turned out to be useful for more than just satisfying curiosity — it feeds status updates, performance reviews, billing, and planning. All of those go better when they start from reality instead of from a lossy reconstruction.

Footnotes

  1. I used to take all categories a bit too far and actually had projects for things like "Quality Time Spent w/ Wife". Despite justifying the timer as a way to optimize for more of that, she didn't like it very much, and now most personal categories are untracked.

  2. Whichever one I have subscription usage left in.