Claude Code Visual OS: The Dashboard My Crew Has Been Missing

I watched Jack's video on a Sunday afternoon with a half-empty coffee and the smug confidence of someone who already runs a working Claude Code stack.

Aria writes my posts. Hermes runs my personal knowledge work. Codex handles the second-opinion coding loop. Higgsfield 2's MCP gives me thirty image and video models from one terminal. I have skills, subagents, slash commands, an Obsidian vault wired through Pinecone, a Supabase logs table, and a CLAUDE.md for every project. By any reasonable measure, I am past the slot-machine phase.

Forty minutes into the video, that smugness was gone.

Jack was demoing what he calls a Claude Code visual OS — a single dashboard that connects every model, every memory store, every skill, every token spend, and every conversation across his AI stack into one self-improving control panel. Not a usage tracker. Not a billing dashboard. An actual operating layer with six pillars and a "dreaming" engine that runs every night to suggest new skills, prune dead ones, flag repetitive prompts, and quietly redesign his workflow while he sleeps.

I rewound twice. Not because the idea was unfamiliar — I have been hand-rolling pieces of this for a year — but because somebody had finally drawn the shape of the thing I kept almost building and never finishing.

This post is my reaction. What Jack built, what is real about it as of May 2026 (the dreaming feature shipped to Claude Managed Agents at Code with Claude on May 6 — more on that), what I would actually wire into my own crew, and the honest gap between the demo version of this idea and what it costs to run for real.

If you're juggling three or four AI subscriptions and a graveyard of half-tested skills, this is the post I wanted six months ago.

The Real Problem Nobody Is Naming

Before the dashboard makes sense, the pain has to be specific. So let me be specific about mine.

In May 2026 I am actively paying for Claude Pro Max, OpenAI Pro $200 (which moved to a 20x Plus allowance on an ongoing basis), Gemini Advanced, Higgsfield 2 credits, a Pinecone serverless index, a Supabase Pro project, and Obsidian Sync. I have Claude Code, Codex CLI, Gemini CLI, Hermes Agent, and Cursor installed on the same machine — I don't use Cursor anymore, I just haven't uninstalled it. Aria has 11 skills. My personal .claude/skills/ folder has 27. Half of them I have not touched in a month.

When something breaks or drifts, I have no answer to the most basic question a builder should be able to answer: which part of my stack is actually earning its keep right now?

I know roughly. Claude Opus 4.7 writes my long-form posts. It was released April 16 at the same $5/$25 per-million-token pricing as 4.6, but its new tokenizer produces up to 35% more tokens for the same input text, which is a real cost change hidden behind an "unchanged" sticker. Sonnet 4.6 handles structured tasks at $3/$15. Haiku 4.5 at $1/$5 does the cheap intermediate calls. The 1M-token context window is available at standard pricing on Opus 4.7, Opus 4.6, and Sonnet 4.6, which means a 900k-token request bills at the same per-token rate as a 9k-token one. I know these numbers. I do not know which of my skills uses which model, or whether any of them is silently calling Opus when Haiku would have done the job for one-fifth the cost.
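To make the one-fifth figure and the tokenizer effect concrete, here is a small worked calculation. It is only a sketch: the prices are the ones quoted above, while the model keys, the 20k-in/2k-out job size, and the assumption that the full 35% worst case applies to both input and output are illustrative choices, not measurements.

```python
# Per-million-token prices in USD as quoted in this post: (input, output).
# The dictionary keys are illustrative labels, not official model IDs.
PRICES = {
    "opus-4.7": (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
    "haiku-4.5": (1.00, 5.00),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call at flat per-token rates (no caching)."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# The same hypothetical job on two models: 20k tokens in, 2k tokens out.
opus = call_cost("opus-4.7", 20_000, 2_000)    # 0.15 USD
haiku = call_cost("haiku-4.5", 20_000, 2_000)  # 0.03 USD, one-fifth of Opus

# Worst case for the new tokenizer: the same text becomes 35% more tokens.
opus_inflated = call_cost("opus-4.7", round(20_000 * 1.35), round(2_000 * 1.35))

print(f"Opus {opus:.4f}  Haiku {haiku:.4f}  Opus with +35% tokens {opus_inflated:.4f}")
```

At these assumed sizes the "unchanged" Opus price moves from about 15 cents to about 20 cents per call, which is the kind of drift a per-skill cost view would need to show.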

That is the shape of the problem. The agents are running. The bills are arriving. The output is shipping. And the visibility layer that would let me say, with confidence, "this skill earned $400 of my time this month and this one is a vampire" does not exist in any of the official Anthropic or OpenAI dashboards. It does not exist in Pinecone's console. It does not exist anywhere.

Jack's pitch is that it should. And that it can be built.

That is what got me to rewind.

What Jack's Claude Code Visual OS Actually Is

Strip the framing away and Jack's system is one dashboard standing on six pillars. The pillars are not features — they are data sources the OS needs in order to reason about your AI life. The dashboard fuses them. The dreaming engine acts on them.

Here are the six, the way Jack laid them out, with my own first-person reading of why each one matters.

Pillar one: Models

Every model you use, every plan you pay for, with their costs, limits, and current pricing tier surfaced in one view. Not a screenshot of your Anthropic console — a live table that knows you also pay for Codex, Gemini Advanced, and Higgsfield, and that puts those side-by-side.

The reason this matters: model churn is brutal. Opus 4.7 shipped on April 16. Gemini 3.2 Flash is field-testing inside iOS without an announcement. GPT-5.4 sits inside Codex with a different cost profile than ChatGPT's web product. If your dashboard does not know which models you can call, it cannot help you pick the right one when you start a session.

Pillar two: Memory

Every place your stack stores context — local files, Obsidian vaults, Pinecone indexes, Supabase tables, raw SQL databases — connected and visible.

This is where most stacks break. I had four memory layers running in March and a fuzzy sense of what was in each one. A search query routed through the wrong layer is not just slower; it pulls the wrong context into a prompt and quietly poisons the output. The OS should know that my "brand voice" facts live in Obsidian, my project transcripts live in Pinecone, and my agent logs live in Supabase, and it should pick the right one for each query without me thinking about it.

Pillar three: Skills

Every individual skill you have written, tracked, evaluated, and (this is the part most setups miss) pruned for efficiency.

The OS lists every skill. It tells you the last time each one ran, how often it succeeded, how much money it earned or saved, and how much it costs in tokens per execution. Skills that have not been called in 90 days get flagged for archival. Skills that are silently expensive get flagged for refactor. Skills that succeed at high rates get promoted.

Most people treat their skills folder like a closet — stuff goes in, nothing comes out. The OS treats it like a team roster with quarterly reviews.
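The roster review described above can be sketched as a small triage function. The 90-day archival rule comes from the description of the pillar; the `SkillStats` record, the cost ceiling, and the promotion threshold are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class SkillStats:
    """Hypothetical per-skill summary pulled from whatever log store you use."""
    name: str
    last_run: date
    runs: int
    successes: int
    avg_cost_usd: float  # average token cost per execution


def triage(skill: SkillStats, today: date,
           stale_days: int = 90,            # from the post: unused for 90 days
           cost_ceiling_usd: float = 0.50,  # assumed threshold
           promote_rate: float = 0.90) -> str:  # assumed threshold
    """Return 'archive', 'refactor', 'promote', or 'keep' for one skill."""
    if today - skill.last_run > timedelta(days=stale_days):
        return "archive"
    if skill.avg_cost_usd > cost_ceiling_usd:
        return "refactor"
    if skill.runs > 0 and skill.successes / skill.runs >= promote_rate:
        return "promote"
    return "keep"
```

The checks run in order, so a skill that is both stale and expensive is archived rather than refactored. That ordering is a design choice, not something the video specifies.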

Pillar four: Knowledge systems

Connected repositories that provide reference material the agent can pull on demand. Documentation. Notes. Reference projects. Reading lists. The output of your dreaming sessions.

This is the layer that overlaps with memory but is conceptually different. Memory is what the agent produced. Knowledge is what it can consult. The OS tracks both and makes the difference legible — because conflating them is how you end up with an agent that confidently cites its own hallucination from three weeks ago as a source.

Pillar five: Usage and cost

Token consumption, subscription costs, and total financial impact, sliced by skill, by project, by brand, by model.

Not a billing dashboard. A financial truth layer. It answers: how much of this month's Anthropic bill came from Aria writing mejba.me posts versus ramlit.com case studies? Which skill burned the most Opus tokens this week? Is my cache hit rate on the long-form posts actually 60% like I think it is, or has it quietly dropped to 18% because I refactored the system prompt last Tuesday?

I covered the rough version of this problem in my Claude Code MCP install/cut post — every MCP server you load charges rent in context tokens whether you use it or not. The OS would make that rent legible across every server, every skill, every session.

Pillar six: The dreaming system

The automated analysis layer that watches what you did yesterday and suggests improvements for tomorrow. Repetitive prompts that should become skills. Skills that should be merged. Models you're overpaying for. Memory that has gone stale. New tools that fit your workflow.

This is the pillar that turns the dashboard from a report into an agent. And here is where Jack's pitch stopped being theoretical for me — because as of May 6, 2026, Anthropic actually shipped a dreaming feature into Claude Managed Agents, and it does almost exactly what Jack described.

Let me come back to that, because it changes the whole conversation.

The Dreaming Feature Is Real Now

Anthropic announced dreaming at the Code with Claude developer conference in San Francisco on May 6, 2026. It is currently in research preview for Claude Managed Agents — request access at the managed-agents form on claude.com. I have not been granted access yet. But the public documentation tells me a lot about how Anthropic thinks this layer should work.

A "dream" runs as a scheduled background process between agent sessions. It reads the existing memory store alongside past session transcripts and produces a new, reorganized memory store. Duplicates merged. Stale entries replaced with the latest value. Contradictions resolved. New insights surfaced.

The community write-ups call the user-facing variant Auto Dream or AutoDream for Claude Code. The mechanic that most caught my attention: dreaming converts relative dates to absolute dates. "Yesterday we decided to use Redis" becomes "On 2026-03-15 we decided to use Redis." This is the kind of detail that sounds boring until you have watched an agent confidently cite a "recent" decision from six months ago and burn an afternoon untangling the confusion.
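The date rewrite is easy to picture in code. This is a toy sketch of the idea only, not how Anthropic's dreaming or Auto Dream is implemented: it handles three relative words and assumes each note carries the date of the session it came from.

```python
import re
from datetime import date, timedelta

# Day offsets for a few relative words. A real consolidator would need far
# more ("last week", "two days ago", and so on).
RELATIVE_DAYS = {"today": 0, "yesterday": -1, "tomorrow": 1}
PATTERN = re.compile(r"\b(today|yesterday|tomorrow)\b", re.IGNORECASE)


def absolutize(note: str, session_date: date) -> str:
    """Replace relative day words with 'on YYYY-MM-DD', anchored to the session date."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        resolved = session_date + timedelta(days=RELATIVE_DAYS[word.lower()])
        prefix = "On" if word[0].isupper() else "on"  # keep sentence capitalization
        return f"{prefix} {resolved.isoformat()}"
    return PATTERN.sub(replace, note)


print(absolutize("Yesterday we decided to use Redis", date(2026, 3, 16)))
# On 2026-03-15 we decided to use Redis
```

The key input is the session date. Without it stored next to each memory entry, "yesterday" cannot be resolved after the fact.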

The harder-hitting line from Anthropic's blog: "Dreaming surfaces patterns that a single agent can't see on its own, including recurring mistakes, workflows that agents converge on, and preferences shared across a team."

Read that again with Jack's six pillars in your head. That is pillar six described in Anthropic's own words. The legal-tech firm Harvey reports their Managed Agent completion rates went up roughly 6x in dreaming-enabled tests on long-form drafting and document creation.

Six. X.

What that tells me as a builder is that the visual OS Jack sketched is not a fantasy that requires a custom backend nobody can build. The hardest pillar — the one I would have argued was years away in May — Anthropic just shipped as a managed service. The rest of the OS is plumbing.

Which means the only honest question left is: how would I actually wire this into my crew?

How I Would Build the Visual OS Into My Aria Stack

Let me describe this the way I would describe it to a friend who has been running Claude Code for six months and is finally ready to graduate from skills-in-a-folder to a real control layer.

There are five stages. None of them require an enterprise plan. All of them assume you already have Claude Code installed and at least one project with a .claude/ directory.

Stage one: Model detection

The OS scans your machine and your accounts on first run. It looks for installed CLIs (Claude Code, Codex, Gemini CLI, Hermes), local API key files, environment variables, and active subscriptions. It builds the models table.

In my case, that table would currently include Opus 4.7 with the 1M context window, Sonnet 4.6 (the daily driver — $3/$15 with the same 1M context), Haiku 4.5 ($1/$5, the cost-cutter), GPT-5.4 via Codex, Gemini 3 Pro via the Gemini CLI, and whatever Higgsfield model I last invoked through the MCP. The OS doesn't have to call each one. It just has to know they exist and what each costs.
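A first pass at the detection step could look like the sketch below. It only checks whether a CLI binary is on the PATH and whether an API key variable is set. The binary names `claude`, `codex`, and `gemini` are the usual ones; `hermes` is a guess for illustration, and subscription detection is left out entirely.

```python
import os
import shutil

# Binaries to probe. "hermes" is an assumed name, not a confirmed one.
CLI_BINARIES = {
    "Claude Code": "claude",
    "Codex CLI": "codex",
    "Gemini CLI": "gemini",
    "Hermes Agent": "hermes",
}
API_KEY_VARS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY"]


def detect_stack(env=None) -> dict:
    """Report which CLIs are on PATH and which API key variables are set."""
    env = os.environ if env is None else env
    search_path = env.get("PATH")  # None falls back to the process PATH
    return {
        "clis": {
            label: shutil.which(binary, path=search_path) is not None
            for label, binary in CLI_BINARIES.items()
        },
        "api_keys": {name: bool(env.get(name)) for name in API_KEY_VARS},
    }


if __name__ == "__main__":
    print(detect_stack())
```

Passing `env` explicitly keeps the function testable and makes it obvious that nothing here calls a model. It only records what exists.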

Why this matters: when Aria starts a new post for mejba.me, the OS should be able to glance at the task and route it to the cheapest model that can still do the job. Long-form 3,000-word post with brand voice constraints? Opus 4.7 with prompt caching enabled (Anthropic offers up to 90% cost savings on cached input tokens, which is enormous when the same system prompt rides every session). Short structured extraction? Haiku 4.5 every time.

I am not paying for intelligence. I am paying for appropriate intelligence. The OS makes the difference legible.
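"Appropriate intelligence" reduces to a simple rule: pick the cheapest model whose capability tier meets the task. The sketch below assumes a three-tier ranking and a hand-written task-to-tier map, both hypothetical. The input prices are the ones quoted earlier in the post.

```python
# (label, input price in USD per million tokens, assumed capability tier)
MODELS = [
    ("haiku-4.5", 1.00, 1),
    ("sonnet-4.6", 3.00, 2),
    ("opus-4.7", 5.00, 3),
]

# Hypothetical mapping from task kind to the minimum tier it needs.
REQUIRED_TIER = {"extraction": 1, "structured": 2, "long_form": 3}


def route(task_kind: str) -> str:
    """Return the cheapest model whose tier satisfies the task."""
    needed = REQUIRED_TIER.get(task_kind, 3)  # unknown tasks get the top tier
    for label, _price, tier in sorted(MODELS, key=lambda m: m[1]):
        if tier >= needed:
            return label
    raise ValueError(f"no model satisfies tier {needed}")


print(route("extraction"))  # haiku-4.5
print(route("long_form"))   # opus-4.7
```

Defaulting unknown task kinds to the top tier trades money for safety. A stricter setup might refuse to route until the task kind is classified.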

Stage two: Memory setup

A guided wizard connects every memory source you have. For me that means:

Obsidian vault at ~/vault/ — brand rules, voice constraints, agent definitions, post outlines
Pinecone serverless index — long-form post archives, vectorized for semantic recall, the source Aria pulls from when writing about a topic she has covered before
Supabase logs table — every agent session, every tool call, every input/output token count, every model selection
Local .claude/skills/ and .claude/agents/ — the actual skill definitions and subagent specs
Claude Memory API — the new persistent context layer Anthropic shipped earlier this year

The OS doesn't replace any of these. It connects them and gives each one a role. Obsidian for facts the agent can edit. Pinecone for semantic recall over volume. Supabase for the audit trail. The Memory API for the dreaming layer.

I covered the deeper architectural argument for this in my Pinecone Nexus post on agentic RAG — the upshot is that retrieval should happen at compile time, not query time, whenever possible. The OS makes that distinction operational.

Stage three: Define time value

This is the step most builders skip and the one that turns the dashboard from a vanity layer into a financial truth layer.

You tell the OS what an hour of your time is worth. Concretely. In dollars. Not "I'd like to make $200/hr eventually." Right now, on the work you actually charge for: my agency rate, my consulting rate, my realistic content rate.

The reason this matters: every skill's ROI is a function of time saved multiplied by your hourly value, minus token cost. Until the OS knows your hourly rate, "this skill saved you 14 minutes" is just a stat. Once the OS knows you bill at $150/hour, "this skill saved you 14 minutes" becomes "this skill earned you $35 today, at a token cost of $0.42, for a net of $34.58."
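The arithmetic in that example, written out as a function. The function and field names are illustrative; the formula is the one stated above, time saved times hourly value minus token cost.

```python
def skill_roi(minutes_saved: float, hourly_rate_usd: float, token_cost_usd: float) -> dict:
    """Net value of one skill run: time saved at your rate, minus token spend."""
    earned = minutes_saved / 60 * hourly_rate_usd
    return {
        "earned": round(earned, 2),
        "cost": round(token_cost_usd, 2),
        "net": round(earned - token_cost_usd, 2),
    }


print(skill_roi(minutes_saved=14, hourly_rate_usd=150, token_cost_usd=0.42))
# {'earned': 35.0, 'cost': 0.42, 'net': 34.58}
```

The weak input is `minutes_saved`, which has to be estimated per skill. The hourly rate and token cost are at least measurable.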

Multiply that across 27 skills and you finally have an honest answer to the question I started this post with: which part of my stack is earning its keep?

Stage four: The dream engine

Schedule a nightly dream run. The OS feeds yesterday's session transcripts and memory state into a Claude Opus 4.7 call with a structured prompt. The output is a markdown report:

Repetitive prompts that appeared 3+ times this week and should become a skill
Existing skills that fired but failed, with their failure mode
Memory entries that contradict newer entries
Cost anomalies — sessions that burned more tokens than the 30-day median for that task
Model misalignment — tasks that ran on Opus when Sonnet or Haiku would have sufficed
Suggested new skills based on patterns the dream loop noticed

For the part of this that matches the official Managed Agents dreaming layer, you delegate to Anthropic's research preview when you get access. For the rest, you write a dream skill that reads your Supabase logs table and your Obsidian vault and produces the same shape of report.
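Two of the report sections above, repeated prompts and cost anomalies, need no model call at all. The sketch below assumes log rows are plain dictionaries with hypothetical `id`, `task`, and `tokens` fields. The `factor` multiplier is an added assumption, since flagging everything above the median would flag roughly half of all sessions.

```python
from collections import Counter
from statistics import median


def repeated_prompts(prompts: list[str], min_count: int = 3) -> list[str]:
    """Prompts seen at least min_count times, ignoring case and extra whitespace."""
    normalized = [" ".join(p.lower().split()) for p in prompts]
    counts = Counter(normalized)
    return [prompt for prompt, n in counts.items() if n >= min_count]


def cost_anomalies(sessions: list[dict], factor: float = 2.0) -> list[str]:
    """IDs of sessions that used more than factor x the median tokens for their task."""
    tokens_by_task: dict[str, list[int]] = {}
    for s in sessions:
        tokens_by_task.setdefault(s["task"], []).append(s["tokens"])
    return [
        s["id"] for s in sessions
        if s["tokens"] > factor * median(tokens_by_task[s["task"]])
    ]
```

Exact-match counting after normalization will miss paraphrased prompts. Catching those is where an embedding lookup or the model call itself would come in.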

The pattern is the same either way. Yesterday's data becomes tomorrow's plan, without you having to read every transcript yourself.

That is the proactive-AI shift Jack kept circling in his video. The reactive model is "I open Claude Code, I think about what I want, I prompt." The proactive model is "I open Claude Code, the OS already noticed that on three of the last five Tuesdays I asked Aria to research the same competitor, and it suggests I save that as a competitor-scan skill with the canonical inputs locked in." One of those is gambling. The other is operating a system.

Stage five: The dashboard

The visible layer. Everything above gets rendered into one view.

Top of the screen: today's token budget across all models, with the cost-to-date for the month and a projection for end-of-month. A second row: active sessions, queued skills, last dream report timestamp. A third row: skill ROI leaderboard, top 5 by net earnings this week, with the option to drill into any of them.

A side panel for memory health: stale entries, contradiction count, last consolidation date. A side panel for cost intelligence: cache hit rate, average context size, percentage of calls running on the optimal model. A side panel for opportunity scanning: new tools the dream engine suggests you try, based on the work you actually did this month, not based on the X post you bookmarked.
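The end-of-month projection in the top row can start as a straight-line extrapolation. This sketch assumes spend is spread evenly across the month, which is rarely true but is enough for a first panel.

```python
import calendar
from datetime import date


def project_month_end(cost_to_date: float, today: date) -> float:
    """Linear projection of this month's total spend from spend so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return round(cost_to_date / today.day * days_in_month, 2)


print(project_month_end(120.0, date(2026, 5, 12)))  # 310.0
```

A trailing seven-day average would react faster to a sudden change in usage. The linear version is just the simplest thing that puts a number on the screen.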

This is the part of Jack's video that pulled me out of my chair. Because the dashboard is not the point. The dashboard is the artifact that proves the operating system underneath it actually works. You can build everything else and have it work invisibly. You build the dashboard so that when something stops working — or when a client asks you to explain what you're paying for — there is a page to point at.

For the founder side of my work, that page is the difference between "we use AI" and "we operate AI."

What This Means for the Way I Run My Crew

Here is where Jack's framing forced me to revisit assumptions I had stopped questioning.

I have been writing about my stack for a year. The three-layer agentic OS post covers architecture, memory, and observability, the foundations the visual OS sits on. My AI stack 2026 post covers the S/A/B/C tier sort I run on tools. The MCP install/cut post covers the protocol-level cost layer. All three were honest at the time. None of them answered the question Jack's video forced me to ask.

The question: am I optimizing inside the wrong frame entirely?

For a year I have been optimizing skills, agents, MCPs, prompts, and memory layers one piece at a time. Aria gets better. The skills folder grows. The Supabase logs accumulate. Each individual decision is sound. But there is no surface where the system can look at itself and say "the whole has drifted from its purpose." No surface where I can look at the whole and say "that pillar is dead weight" without spending an afternoon spelunking through logs.

The visual OS is that surface. Not because dashboards are magical, but because a system without a self-view cannot improve itself. Dreaming is the closure. The agent reads its own work, finds its own patterns, surfaces its own contradictions. The dashboard is just the window into that loop for the human running the org.

That reframes the next six months of my work in a specific way.

The skills folder is not a closet, it's a roster

Every Aria skill needs an attached ROI number by the end of Q2. I don't know what most of them are earning. That is unacceptable now that the tools to measure it exist. The first concrete action this post is generating for me: a skill-audit slash command that runs every Friday, pulls invocation counts and token spend per skill from Supabase, and writes a Markdown report into a ~/audits/skills/ folder for the dream loop to read.
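The core of that audit is a group-by over log rows. The sketch below assumes each row is a dictionary with hypothetical `skill`, `input_tokens`, and `output_tokens` fields. Fetching the rows from Supabase and writing the file are left out.

```python
from collections import defaultdict


def audit_report(rows: list[dict]) -> str:
    """Build a Markdown table of runs and total tokens per skill, heaviest first."""
    totals = defaultdict(lambda: {"runs": 0, "tokens": 0})
    for row in rows:
        entry = totals[row["skill"]]
        entry["runs"] += 1
        entry["tokens"] += row["input_tokens"] + row["output_tokens"]

    lines = ["| Skill | Runs | Tokens |", "| --- | --- | --- |"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["tokens"], reverse=True)
    for skill, t in ranked:
        lines.append(f"| {skill} | {t['runs']} | {t['tokens']} |")
    return "\n".join(lines)
```

Sorting by total tokens puts the most expensive skills at the top of the report, which is where the first refactoring candidates usually are.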

The cost intelligence module is the cheapest win

I genuinely do not know my cache hit rate. I know Anthropic shipped prompt caching with up to 90% savings on cached input tokens. I know I designed Aria's system prompt to be cache-friendly. I have not validated it. The visual OS forces that validation by putting the number on the screen.

Building this part is a weekend project. It saves money on day one. There is no reason not to.
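Validating the cache hit rate is mostly a sum. The sketch assumes each logged call keeps the `usage` counters the Anthropic Messages API returns (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`). Whether your own logs table stores them under those names is an assumption to check.

```python
def cache_hit_rate(usages: list[dict]) -> float:
    """Share of all input tokens that were served from the prompt cache."""
    read = sum(u.get("cache_read_input_tokens", 0) for u in usages)
    written = sum(u.get("cache_creation_input_tokens", 0) for u in usages)
    uncached = sum(u.get("input_tokens", 0) for u in usages)
    total = read + written + uncached
    return read / total if total else 0.0


calls = [
    # First call writes the system prompt to the cache.
    {"input_tokens": 200, "cache_creation_input_tokens": 800, "cache_read_input_tokens": 0},
    # Second call reads it back.
    {"input_tokens": 200, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 800},
]
print(cache_hit_rate(calls))  # 0.4
```

A rate that drops right after a system-prompt edit is the signal to look for, since any change to the cached prefix forces a fresh cache write.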

The dreaming layer is where I stop trying to be clever

This is the part I would have argued against in March. "I know my workflow. I don't need an automated suggester telling me what to do." I was wrong. The patterns the dream loop will surface are patterns I literally cannot see because I am inside them. The Harvey 6x completion rate is not a marketing number — it is a directional signal that pattern-surfacing by a second agent is a wildly more powerful tool than pattern-finding by the human who is generating the patterns.

I'll request Managed Agents access for the real version. While I wait, I'll write my own dream skill that runs nightly on the Supabase logs and outputs the same kind of report Anthropic's version produces. It will be worse. It will still beat nothing.

The opportunity scan is what makes this valuable for clients

Here is the angle I did not expect. The visual OS is not just useful for me. It is useful as a deliverable for the kind of agency work I do through ramlit.com. A small business cannot afford to figure out their AI stack the way I figured out mine — by paying for forty-three tools and quietly deleting thirty-four of them. A dashboard that walks them through model detection, memory setup, time-value definition, and a first dream report is the highest-leverage onboarding asset I could possibly hand a client. It compresses a year of my learning into a Tuesday afternoon.

That is the move I was not seeing until I watched Jack's video. The visual OS is a product surface, not just a personal tool.

The Honest Tradeoffs

Before I sell anyone on this, let me name the things I would push back on if a friend pitched me this video.

The dashboard is a maintenance liability. Every layer you visualize is a layer that breaks when an upstream API changes. Anthropic ships fast. Pinecone ships fast. Higgsfield ships fast. You will spend non-zero engineering time keeping the panels accurate. If you are a solo operator, that time has to come from somewhere. The benefit has to clear the cost. For me, with an annual cadence of 250+ posts across four brands, it does. For someone shipping six posts a year, it probably does not.

Dreaming is not infallible. The first month of dream reports will contain bad suggestions. Skills it proposes that you don't need. Memory consolidations that lose nuance. Cost flags that miss context. The OS is not a decision-maker — it is a candidate-generator. You still have to read the reports and choose. If you outsource judgment, you will get worse decisions faster.

The 1M context window is a temptation, not a free lunch. Just because Sonnet 4.6 and Opus 4.7 will accept a 900k-token input at standard pricing does not mean you should be sending one. The new Opus 4.7 tokenizer can produce up to 35% more tokens for the same input text — that is a real cost increase wearing the costume of a pricing freeze. The OS should warn you when a session is about to balloon, not encourage you to dump everything into context because the window can hold it.

Centralization has a failure mode. If the visual OS becomes the only surface you trust, the day it breaks is the day you cannot work. Keep your skills, your Obsidian vault, and your raw Supabase data accessible without the dashboard. The dashboard is a view, not a database. Lose that distinction and you have rebuilt the lock-in problem you were trying to escape.

You do not need this on day one. Most builders reading this post need to finish layer-one architecture first. Skills folder. CLAUDE.md. A working subagent or two. The visual OS is layer four or five of a stack that has to be standing on layers one through three to make any sense. Build the floor before you obsess about the ceiling.

That said. The ceiling is now visible. And Anthropic shipping dreaming on May 6 made the ceiling about a foot closer than it was the day before.

What I'm Doing This Week

Concrete actions, in order:

Request access to Claude Managed Agents dreaming research preview. Even if approval takes a month, the request goes in tonight.
Write the skill-audit slash command. Pulls invocation count, token spend, and last-run date per skill from my Supabase logs. Outputs a Markdown report into ~/audits/skills/ every Friday at 6 AM via a cron hook.
Wire a basic cost-intelligence panel into the claude-usage VS Code extension I already run. Cache hit rate, average context size per skill, and a flag when a skill calls Opus for a job Haiku could have handled.
Define my hourly rate inside a single config file the OS can read. Not a moving target. A number.
Write a v0 dream skill that runs nightly on the Supabase logs and outputs the same shape of report Anthropic's Managed Agents dreaming version produces. Worse than the real thing. Better than nothing. Ships this weekend.

Five things. None of them require a new subscription. All of them produce a measurable output by the time I post next week's roundup.

If even one of them works, the dashboard becomes inevitable. If three work, the visual OS is no longer a video Jack made — it is the operating layer my crew runs on.

That is what Sunday afternoon's coffee actually paid for. Not a new tool. A new frame for the tools I already have.

The chat box was the first AI interface. The agent loop was the second. The visual OS is the third. We are in the middle of that transition, and the people who notice it first get a year of compounding advantage before the rest of the market catches up. I plan on being one of the people who noticed.

The smug confidence I started this post with is gone. What replaced it is more useful. It is a to-do list.

Frequently Asked Questions

What is a Claude Code visual OS?

A Claude Code visual OS is a unified dashboard that connects every AI model, memory store, skill, and usage stream in your Claude Code stack into a single self-improving surface. It tracks token spend, calculates skill ROI, and runs a nightly dreaming engine that surfaces patterns, prunes dead skills, and suggests workflow improvements. The concept extends Anthropic's officially shipped dreaming feature (May 6, 2026) into a personal control layer.

Is the dreaming feature actually available in Claude Code right now?

The dreaming feature is currently in research preview for Claude Managed Agents — request access at the managed-agents form on claude.com. The community variant called Auto Dream for Claude Code runs locally and consolidates memory files between sessions, pruning stale notes and converting relative dates to absolute ones. Both shipped publicly in May 2026.

How much does it cost to run a Claude Code dashboard like this?

The core dashboard infrastructure adds almost nothing if you already pay for Claude Pro Max and a Supabase or Pinecone tier. The cost of the dreaming run is one Opus 4.7 call per night at roughly $5 per million input tokens, and prompt caching can cut the cached portion of that input by up to 90%. On a typical solo stack I would expect that to come to an extra $5–15/month in inference, for a system that can surface hundreds of dollars in skill ROI.

How do I measure skill ROI inside Claude Code?

Define your hourly rate in dollars. Log every skill invocation with its token count and approximate time saved. Multiply time saved by hourly rate, subtract token cost, and surface net ROI per skill in the dashboard. The hourly-rate input is the step most builders skip — without it, every other ROI number is meaningless.

Should I build this before or after I have a solid Claude Code stack?

Build it after. The visual OS is layer four or five of a stack that needs layers one through three (architecture, memory, observability) standing first. If you do not yet have a working CLAUDE.md, at least one custom subagent, and skills you actually invoke regularly, build those before you obsess about the dashboard. See my agentic OS three-layer build for the floor that comes first.

Let's Work Together

Looking to build AI systems, automate workflows, or scale your tech infrastructure? I'd love to help.

Fiverr (custom builds & integrations): fiverr.com/s/EgxYmWD
Portfolio: mejba.me
Ramlit Limited (enterprise solutions): ramlit.com
ColorPark (design & branding): colorpark.io
xCyberSecurity (security services): xcybersecurity.io

Engr Mejba Ahmed
