Agentic OS on Claude Code: The Three-Layer Build

Agentic OS Claude Code: The Three-Layer Build That Killed My Slot-Machine Habit

Most people use Claude Code like a slot machine.

Open the terminal. Type a prompt. Pull the lever. Sometimes you get a working refactor. Sometimes you get a markdown wall of caveats. Sometimes you get half of what you asked for and a polite explanation of why the other half is "not advisable at this time." So you reroll the prompt. Pull again. Maybe tweak a word. Pull again.

I did this for the first eight months I used Claude Code. I told myself it was prompt engineering. It wasn't. It was gambling.

The thing that finally broke the loop was a methodology I picked up from a video summary on building an "agentic OS" on top of Claude Code, then shaped against my own messy reality of running four brands and pushing roughly 250 long-form posts through the system this year. The methodology has three layers: architecture, memory, and observability. Boring words. Massive payoff.

This post is the version of that methodology I actually use, written from the field, with the specific Claude Code primitives I'm running and the things I tried that didn't survive contact with real work. If you're at the slot-machine stage and starting to suspect there's a better way, you're right. Here it is.

The Slot-Machine Problem Has a Cost

Before the methodology makes sense, the pain has to be specific. So let me be specific.

A slot-machine workflow looks productive. You're at the terminal. Things are happening. Code is being generated. But the work behind the work — the part where you re-explain your brand voice for the eleventh time, paste the same folder structure into a fresh session, debug an output that drifts because you forgot to mention the constraint that was obvious to you but invisible to the model — that part is invisible until you measure it.

I measured mine in March. Across one week of Claude Code sessions, I spent roughly 35% of my prompt tokens on context I'd already given the model in a previous session. Not the work itself. The setup for the work. Brand rules I'd written down in three places. Folder paths I'd shown it on Tuesday and reshown it on Friday. Voice constraints buried in a CLAUDE.md somewhere that I never bothered to point at.

Worse: the variance. Ask Claude Code to write a mejba.me post on Monday and you'd get one shape. Ask the same way on Friday and you'd get something that read like it was written by a related but unfamiliar person. That's not the model misbehaving. That's me running an unstructured prompt against a stateless engine and being surprised that the output isn't stable.

The agentic OS methodology fixes both problems by stopping the gambling and starting the architecture. Three layers, in order of impact: organize what you do (architecture), remember what you've done (memory), see what's happening (observability).

Even if you only adopt the first layer, you'll get most of the value. I want to be honest about that upfront because the temptation when you read about a three-layer system is to try to build all three on a Saturday. Don't. Build layer one. Live in it. Then add the next.

Let's get into it.

Layer One: Architecture — From Random Prompts to a Real Org Chart

The first layer is the one that broke the slot-machine habit for me. It's also the one I almost skipped because it sounds the least technical.

The idea is straightforward. Stop thinking of Claude Code as a place where you type prompts. Start thinking of it as an organization with departments, job descriptions, and standard operating procedures. Concretely, that means organizing your work into four nested concepts:

Domains — the broad areas you actually operate in (content creation, research, productivity, community, security)
Tasks — the recurring work each domain produces (a blog post, a competitor scan, a code review, a brand audit)
Skills — codified, repeatable instructions for how to do a specific task well
Automations — skills that run on a trigger instead of waiting for you to invoke them

This is not abstract. In Claude Code, each of those concepts maps onto a real primitive. Skills live as SKILL.md files with YAML frontmatter. Subagents live as markdown files in .claude/agents/. Slash commands live as files in .claude/commands/. Automations are skills wrapped in a hook (SessionStart, PostToolUse) or a scheduled trigger via the Claude Agent SDK. The platform is already shaped for this. You just have to use the shape.

What a domain actually looks like in my setup

I run four brands — mejba.me, ramlit.com, colorpark.io, xcybersecurity.io — plus an internal "ops" domain for everything that crosses brands. So I have five domains. Each domain has between three and seven tasks I actually do recurringly, and each task has at most one skill attached.

Here's the truncated structure on disk:

~/projects/agentic-os/
├── CLAUDE.md                          # root config (more on this below)
├── .claude/
│   ├── settings.json                  # permissions + hooks
│   ├── agents/                        # subagents (one per role)
│   │   ├── aria.md                    # content engineer
│   │   ├── auditor.md                 # SEO + voice auditor
│   │   └── researcher.md              # WebSearch + summary specialist
│   ├── commands/                      # slash commands
│   │   ├── morning-scan.md
│   │   ├── post-from-video.md
│   │   └── brand-audit.md
│   └── skills/                        # repeatable instructions
│       ├── write-blog-post/SKILL.md
│       ├── extract-video-summary/SKILL.md
│       └── seo-pass/SKILL.md
└── vault/                             # memory layer (Obsidian)
    ├── raw/
    ├── wiki/
    └── output/

That's the whole architecture in one tree. No clever tooling. No proprietary platform. Just folders that Claude Code already understands natively.

The thing that makes this an org chart and not just a folder is the relationships between the pieces. The aria.md subagent reads the write-blog-post skill. The morning-scan slash command invokes the researcher subagent, which reads the extract-video-summary skill. Each piece does one thing. None of them duplicate each other. When I want to change how mejba.me posts get written, I edit one file — the write-blog-post/SKILL.md — and every invocation across every domain inherits the change.

That's the boring superpower. One source of truth for each capability.

A real skill, not a toy one

Let me show you what a skill actually looks like. Here's a stripped-down version of the one I use to extract structured summaries from video transcripts before they become blog posts:

---
name: extract-video-summary
description: Extract a structured summary from a video transcript or YouTube URL. Use when the user provides a video, transcript, or asks to "summarize this video" before writing a post.
---

# Extract Video Summary

You are extracting a structured summary that another agent will use as
source material for a blog post. The output must be:

1. **TLDR** — three sentences. The single most important takeaway.
2. **Key claims** — bullet list. One claim per bullet. No editorializing.
3. **Specific examples** — bullet list. Names, numbers, dates, tools.
4. **Quotes worth pulling** — direct quotes that would land in a blog.
5. **What the video gets wrong or oversimplifies** — be honest.

Rules:
- Do not soften claims. If the speaker said it, write it as they said it.
- If a claim is unverified, mark it `[unverified]` and move on.
- Save the result to `vault/raw/video-summaries/[slug].md` using
  the video title as the slug.

That's it. Forty lines once you count formatting. Loaded into Claude Code via the Skill tool, it transforms an unreliable "summarize this for me" prompt into a deterministic process. Same shape every time. Same file location every time. Same downstream agents can read it without checking what format showed up today.

The key thing about a good skill is what it removes, not what it adds. A skill takes a fuzzy ask and removes the variance. If you find yourself writing a long, clever skill, you probably haven't decided what the task actually is.

Automations: skills with a trigger

An automation is a skill that runs without you. In Claude Code, the cheapest way to wire one up is via a hook in settings.json. A SessionStart hook fires when a Claude Code session begins. A PostToolUse hook fires after a tool finishes. Both are configured in settings.json and are documented in the official Claude Code hooks reference.

Here's the morning trend scan I run. It's wired as a slash command (/morning-scan) that I trigger manually most days, but on the days I want it automated, the same command runs from a cron job that just shells out to claude -p "/morning-scan":

---
name: morning-scan
description: Aggregate AI news, competitor moves, and trending topics into a single daily brief. Save to vault/raw/scans/YYYY-MM-DD.md.
---

# Morning Scan

Run this every weekday morning before I open the terminal.

1. Use WebSearch to pull the top 5 stories from each of:
   - Anthropic, OpenAI, Google AI launches in the last 24h
   - HackerNews top 10 (filter to AI/dev/agent topics)
   - r/ClaudeAI top posts of the day
2. For each story, write a 2-sentence summary. No fluff.
3. Flag anything that affects my multi-brand workflow:
   - Claude Code changelog → tag #claude-code-update
   - New AI tool launches → tag #stack-candidate
   - Security/CVE news → tag #xcyber-relevant
4. Save the brief to vault/raw/scans/YYYY-MM-DD.md.
5. If anything in the brief is post-worthy for mejba.me, add a
   line at the top: `POST CANDIDATE: [topic]`.

The skill is the what. The cron job is the trigger. Together they're an automation.

For the cleaner version, you'd use the Claude Agent SDK to schedule the run programmatically and post the result to a Slack channel or your own dashboard. I built that version eventually. The slash-command-plus-cron version got me 80% of the value in 30 minutes.

When automation becomes a trap

Honest section. I over-automated for two months in early 2026. Built fourteen automations across the four brands. Hooks firing on every Edit, scheduled scans every two hours, a hook that auto-committed any file Claude Code touched. It was beautiful on a whiteboard. It was a disaster in practice.

Three things broke. First, the hooks fought each other. A PostToolUse formatter kept reformatting files mid-edit and cascading into the next tool call. Second, the cost ballooned — every scheduled scan was a full Claude session with no cap, and the bill in March was almost double February's. Third, the noise. Fourteen automations meant fourteen Slack notifications a day, most of which I muted, which defeated the entire point.

I cut it back to four automations. Morning scan. End-of-day vault clean. Weekly brand-voice audit. Monthly stack review. Everything else became a slash command I run when I actually want it. The lesson: automation is for things you'd run anyway, not for things you wish someone would run.

If you're starting from scratch, build one skill, one slash command, and zero automations. Use them for a week. Add the next thing only when you've felt the absence.

I covered the broader skill-design philosophy in the deep dive on Claude Code skills businesses pay for in 2026, and the tactical patterns for keeping skills cheap in the caveman skill token-savings post — both worth reading before you start codifying your own.

That's layer one. Domains, tasks, skills, automations. If you stop reading here and just build this, you'll already be ahead of 95% of Claude Code users. The next two layers compound it.

Layer Two: Memory — The Obsidian Vault and the CLAUDE.md That Runs It

Layer one organizes what you do. Layer two gives Claude Code somewhere to remember what it did.

I want to be careful here, because "memory" is the most over-engineered concept in the agent ecosystem right now. Every other week another startup ships a "memory layer for Claude" that is, on inspection, a vector database with a marketing budget. For 90% of personal and small-team workflows, you don't need that. You need a folder of markdown files and a config file that tells Claude Code what's in it.

The pattern that finally clicked for me is the Karpathy LLM Wiki approach — Andrej Karpathy posted his version on April 3, 2026, and I spent a weekend rebuilding mine to match it. The shape is three folders inside an Obsidian vault: raw/, wiki/, output/. Each folder has a clear job. The LLM is the librarian and the author. There's no vector database, no embeddings, no chunking strategy.

I wrote about this approach in detail in the Karpathy Obsidian RAG post — that post is the deep dive on why it works. This section is about how it slots into the agentic OS as the memory layer.

The three folders, and what each one is for

The vault is dead simple:

vault/
├── raw/                              # ingestion, no organization required
│   ├── video-summaries/
│   ├── scans/                        # morning-scan output lands here
│   ├── transcripts/
│   ├── research-clippings/           # Obsidian Web Clipper drops here
│   └── inbox/                        # everything else, sorted later
│
├── wiki/                             # codified knowledge, LLM-maintained
│   ├── index.md                      # master index, LLM-written
│   ├── claude-code/
│   │   ├── index.md
│   │   ├── skills.md
│   │   ├── hooks.md
│   │   └── agents.md
│   ├── brands/
│   │   ├── mejba-me-voice.md
│   │   ├── ramlit-positioning.md
│   │   └── colorpark-design-rules.md
│   └── ops/
│       ├── seo-rules.md
│       └── publishing-checklist.md
│
└── output/                           # final deliverables
    ├── posts/
    ├── briefs/
    └── decks/

raw/ is the dumping ground. Anything I want Claude Code to eventually know about lands here, unsorted. Video transcripts. Web clippings (one click via the Obsidian Web Clipper). Daily scan results. Random voice notes I dictate while walking. The ingestion friction is intentionally near-zero, because the second I have to think about where to file something, I stop filing things.

wiki/ is where the LLM earns its keep. On a recurring schedule (or on demand via a slash command), Claude Code reads raw/, identifies new material that hasn't been integrated, and updates the wiki. It writes encyclopedia-style articles. It maintains the master index.md. It cross-links related concepts using [[wiki-style links]] that Obsidian renders natively. The wiki is the LLM's compiled understanding of everything in the vault, written in a format the next session can read efficiently.

output/ is the finish line. Final blog posts go here. Client briefs go here. Decks for upcoming workshops go here. Anything that's been delivered. The reason this gets its own folder is so that Claude Code can quickly answer "what have I shipped?" without crawling the rest of the vault.

That's the whole memory layer. Three folders. Markdown. Free. Portable.

The vault-level CLAUDE.md that ties it together

The single file that makes this layer functional is the CLAUDE.md at the root of the vault. It's the config that tells Claude Code what's in the folders and how to treat them. Without it, Claude has to guess every session. With it, you cut your context-setting tokens by something like 70%.

Here's the actual structure of mine, lightly redacted:

# CLAUDE.md — Agentic OS Vault Config

## Purpose

This vault is the persistent memory layer for the agentic OS. It is
maintained jointly by me and Claude Code. The vault has three top-level
folders, each with a specific role.

## Folder roles

### raw/
Unprocessed source material. Treat this as ingestion-only.
- DO NOT modify files in raw/ except to add new ones.
- DO NOT use raw/ as a primary source when answering questions —
  always check wiki/ first, then fall back to raw/ if the wiki
  has a gap.
- New material from Obsidian Web Clipper, video summaries, and
  the morning-scan automation lands here.

### wiki/
Codified knowledge, maintained by Claude Code. Treat this as the
primary source of truth for everything that's been processed.
- Always start here when answering a question.
- Always update wiki/index.md when you create a new wiki article.
- Use [[wiki-style links]] to cross-reference related concepts.
- If you find a gap in the wiki while answering a question, note
  it at the bottom of the relevant index.md as a pending note.

### output/
Final deliverables. Treat this as read-mostly.
- Only write here when explicitly asked to produce a deliverable.
- When producing a new post, check output/posts/ to make sure
  the slug isn't already taken.

## Workflow rules

1. When I ask for a blog post, read the relevant wiki section
   first, then raw/ for any new material since the wiki was
   last updated, then write the draft to output/posts/[slug].md.
2. When I add new material to raw/, do not auto-process it.
   Wait for me to run /update-wiki.
3. Brand voice rules live in wiki/brands/. Always load the
   relevant one before writing in that brand's voice.
4. SEO rules live in wiki/ops/seo-rules.md. Apply on every post.

## Active brands
- mejba.me (personal, first-person, passionate)
- ramlit.com (corporate, third-person, results-focused)
- colorpark.io (design, opinionated, visual)
- xcybersecurity.io (security, authoritative, urgent)

That file is maybe 60 lines. It does more for the consistency of my outputs than any single skill I've written. The reason is simple: it eliminates guessing. Claude Code doesn't have to figure out where things live, what each folder is for, or how to handle ambiguous requests. The rules are in the file, the file is loaded every session, and every subagent inherits the context.

If you're going to write one config file for your entire setup, write this one.

What I tried first and abandoned

Two memory experiments that didn't survive contact with real work.

First: I tried storing memory in a Supabase vector database with a custom MCP server. Supabase as the vector store, OpenAI embeddings, semantic retrieval over my notes. It worked. It was also wildly over-engineered for what I actually needed, which was "remember what we decided last Tuesday." The retrieval quality was actively worse than just letting Claude Code read the markdown directly, because the chunks would slice mid-sentence and similarity scores would surface near-duplicates instead of the most useful note. After two weekends of tuning, I deleted the whole thing.

Second: I tried having Claude Code auto-process every new raw file the moment it was added — a PostToolUse hook that would trigger a wiki update on every write to raw/. The cost was brutal. Every time I clipped a long article, it would spawn a session that read the article, decided where it fit in the wiki, sometimes wrote a new article, sometimes updated an existing one. Some of those sessions were 30,000+ tokens. Doing this dozens of times a day burned through credits without delivering proportional value, because most clippings don't need processing on the same day they're saved.

The fix was the explicit /update-wiki command in the CLAUDE.md above. I run it once a week, on Sundays. It batches all the unprocessed raw material into one session, and the cost-per-insight ratio drops by something like 10x.

The lesson: memory layers fail on either over-architecture (vector DBs) or over-eagerness (auto-processing every input). The Karpathy folder structure plus a clear CLAUDE.md plus a manual update command is the boring middle that actually works.

That's layer two. Three folders, one config file, one weekly habit. Now the system has continuity.

Layer Three: Observability — A Dashboard That Exposes the OS to Humans

Layer one organizes what you do. Layer two remembers what you did. Layer three is the part that matters when you stop being the only user.

I'll be honest: I built layer three last, and for a long time I didn't think I needed it. I was the only operator. I lived in the terminal. The terminal was fine. Then I tried to hand off the morning-scan automation to a teammate so I could go on holiday for a week, and the entire system fell over not because the technology failed but because she had to learn three CLI commands, a vault structure, and the difference between a slash command and a subagent before she could run one job.

The terminal is a moat. For me it's a feature. For everyone else, it's a wall.

A dashboard is the third layer of an agentic OS because it's the layer that lets the system serve people who don't want to type claude -p "..." for a living. That includes non-technical teammates, clients, future you on a Sunday morning when typing feels like work, and anyone who wants to see what the system is doing without grepping log files.

What the dashboard actually does

There isn't a single official Claude Code dashboard yet. As of May 2026, Anthropic ships the Claude Code monitoring stack via OpenTelemetry — eight metrics including session count, token usage, estimated cost, and active time — and a healthy ecosystem of community-built observability layers (the claude-code-otel project is the one I've used most). The thing nobody ships out of the box is the control surface — the part that exposes your skills and automations as buttons.

So you build that part yourself. The shape that's worked best for me is a small Next.js app — maybe 600 lines total — that does four things:

Exposes each skill and automation as a clickable button. Click "Morning Scan" and the dashboard shells out to claude -p "/morning-scan" (or hits the Claude Agent SDK programmatically). The output streams back into the UI.
Tracks usage. When was each skill last run? How long did it take? How many tokens did it cost? Which automations have been firing on schedule and which have failed silently?
Surfaces recent vault changes. What was added to raw/ in the last 24h? What did the last /update-wiki change in wiki/? What's been published to output/posts/ this week?
Links every output back into Obsidian. Every result the dashboard shows has a "View source" link that opens the relevant markdown file in Obsidian. Full traceability — every claim the system surfaces points at the file the claim came from.

That last one is the most important. Without it, the dashboard becomes another magic box where AI does things and you trust it. With it, every output is auditable in one click. You see the result, you click the source, you're reading the markdown the agent read. No hallucination can hide.

What the dashboard does NOT need to do

I want to flag this because I burned a month on it. The dashboard does not need to be:

A full project management tool. It is not Linear. It is not a replacement for your task tracker. It is a control surface for your agentic OS, full stop.
An analytics platform. Token spend tracking is useful. A custom analytics warehouse is not.
A multi-tenant SaaS. If you build one, you'll spend three months on auth and zero months on actual workflow improvements.
A real-time multi-user collaboration tool. You are the operator. The dashboard is for you and maybe one or two trusted collaborators.

The dashboard I built is a single-page Next.js app with one route, no auth (it's localhost-only), and a Postgres instance for usage logs. Total build time: about 14 hours, spread over two weekends. It does the four things above. Nothing else.

The metrics that actually matter

Of the eight metrics Claude Code exports natively, four show up on my dashboard's home screen:

Tokens spent per skill, per week. This is the metric that catches automations going rogue. The week I had the auto-wiki hook firing, this graph spiked 3x and I could see exactly which skill was responsible.
Run count per automation. Which automations actually run, and which ones I've quietly stopped using. If a weekly automation hasn't fired in three weeks, it's dead and I delete it.
Vault delta. How many files changed in raw/, wiki/, output/ this week. This is the closest thing to "is the system actually doing work" that I've found.
Last run timestamp per skill. When did I last invoke this? Skills I haven't run in 60 days get archived. The system should be a living organism, not a museum.

The other four metrics (PR count, lines of code, code edit decisions, active time) are useful for engineering teams but less useful for a content operation. Your mileage will vary depending on what your domains are.

What I'd build if I were starting today

If I were rebuilding the dashboard from scratch in May 2026, I'd start with one of the open-source Claude Code observability stacks — Cole Murray's claude-code-otel plus Grafana is a solid base — and bolt the control surface on top of it. The observability part is solved by community work. The control surface is the part you have to write yourself, because it's specific to your skills and automations.

Don't try to build the whole thing in week one. The dashboard should be the thing you reach for when running the OS in the terminal stops feeling fast. If you're still happy in the terminal, you don't need it yet.

That's layer three. A control surface that exposes the OS to humans, with traceability back into the vault. Build it last, build it small, build it for one user.

When This Much Structure Actually Pays Off

I've been writing as if everyone should build all three layers. They shouldn't. The agentic OS pays off in specific situations and is overkill in others. Honest assessment:

Build all three layers if you:

Run more than one brand or one major project (the structure compounds across each one)
Hand work off to teammates, clients, or contractors who don't live in a terminal
Produce recurring deliverables on a schedule (blog posts, briefs, audits, scans)
Have caught yourself re-explaining the same thing to Claude Code more than three times

Build only layer one if you:

Are a solo operator with one main project
Mostly use Claude Code for one-off coding tasks
Are in your first month of using Claude Code seriously (give it time before adding architecture)

Skip the whole thing if you:

Use Claude Code occasionally for personal projects with no recurring outputs
Are still figuring out what your domains and tasks even are
Want to learn the platform first — overstructuring before you understand the primitives is the second-fastest way to give up on the platform

The reason the OS pays off for me specifically is that I run four brands and ship roughly 250 long-form posts a year. Without the structure, the variance kills me. With the structure, each brand has a stable voice, each post starts from the same scaffolding, and the time per post drops from "a full afternoon" to "ninety minutes including research." That's the math that makes the architectural overhead worth it.

If your math is different, the answer is different. I want to make sure I'm not selling something nobody needs.

What I'd Skip on Day One

I keep getting asked "where do I start?" and I keep giving the same answer, so let me make it explicit. If you read this post and decide to build an agentic OS, here's what to do this week:

Day one, this week, only. Pick one domain. Just one. The thing you do most often in Claude Code right now. For me, that was content creation. For you, it might be code review, or research, or design work. Pick one.

Inside that domain, identify three tasks. The three things you actually do recurringly inside that domain. Not theoretical tasks. Things you've done at least four times in the last month.

Write one skill per task. Use the format above. Forty lines max. The goal is to remove variance, not to be clever. Save them in .claude/skills/.

Write a single CLAUDE.md. One paragraph per task explaining what you want and where the output should go. Not a book. A page.

Stop there.

Don't build the vault yet. Don't build the dashboard yet. Don't build automations. Use the architecture for two weeks. Pay attention to where it breaks and where it sings. Adjust the skills based on what you learn from real usage.

After two weeks, if the architecture has held, you'll start to feel the absence of memory. That's when you build the vault. After two more weeks, if you're working with anyone else, you'll start to feel the absence of the dashboard. That's when you build the dashboard.

The compound interest of this approach is that each layer is solving a problem you've already felt. You're not building speculative infrastructure. You're closing gaps you can name.

What This Changes About How You Use Claude Code

The deepest shift the agentic OS produces isn't tactical. It's psychological.

Before I had this structure, every Claude Code session felt like it could go any direction. I'd open the terminal with a vague intent, type something, and let the model do what it would. The variance felt like creativity. It wasn't. It was randomness with good PR.

After the structure, every session has a shape. I open Claude Code knowing which skill is going to run, which subagent is going to handle it, which folder the output is going to land in, and which downstream agent is going to pick it up next. The session feels less like prompting and more like dispatching. I'm not collaborating with a slot machine anymore. I'm running an org chart.

That shift, more than any individual tool or trick, is what made Claude Code go from "interesting tool" to "operating system" for me. The methodology is what got me there. The three layers are what hold it in place.

If you're at the slot-machine stage right now, here's the only thing I want you to take away from this post: the way out isn't a better prompt. It's a better architecture. Pick one domain this week. Write three skills. Write one CLAUDE.md. Stop pulling the lever and start running the operation.

I'll be in the terminal, but I'm not gambling anymore.

Agentic OS Claude Code: Questions Builders Ask

What is an agentic OS in Claude Code?

An agentic OS is a structured framework that turns Claude Code from ad-hoc prompting into a layered system with three layers — architecture (domains, tasks, skills, automations), memory (an Obsidian vault with raw, wiki, and output folders plus a vault-level CLAUDE.md), and observability (a dashboard that exposes skills and automations as clickable buttons with usage metrics). It uses Claude Code's native primitives like skills, subagents, hooks, and slash commands rather than custom tooling. For the full implementation walkthrough, see the three-layer breakdown above.

Do I need an Obsidian vault to use Claude Code effectively?

No — Obsidian is one good option for the memory layer, not a requirement. The vault layer is just a folder of markdown files with a vault-level CLAUDE.md telling Claude Code what each folder is for. You can implement the same structure in any plain folder; Obsidian adds free human-friendly viewing, wiki-style links, and the Web Clipper for ingestion.

How is a Claude Code skill different from a slash command?

A skill is a SKILL.md file with YAML frontmatter that describes a repeatable task and gets auto-loaded when relevant. A slash command is a markdown file in .claude/commands/ that you invoke explicitly with /command-name. Skills are about capability; slash commands are about invocation. Most well-built systems have skills that get triggered by slash commands.

What's a Claude Code automation, and how do I build one?

An automation is a skill that runs on a trigger instead of waiting for you to invoke it. The cheapest way to wire one up is via a hook in .claude/settings.json — a SessionStart or PostToolUse hook that fires the relevant slash command. For scheduled automations, a cron job that shells out to claude -p "/your-command" works fine. The Claude Agent SDK provides a programmatic version when you outgrow that.

How much does running an agentic OS on Claude Code cost?

Layer one (architecture) costs nothing extra — you're paying for Claude Code regardless. Layer two (vault) is free if you use Obsidian. Layer three (dashboard) is your hosting cost only — typically under $10/month for a single-operator setup. The variable cost comes from automations: a poorly-tuned automation can 2x your token spend, which is why I cap automations at four and review token-per-skill metrics weekly.

Can I hand off an agentic OS to a non-technical teammate?

This is exactly what layer three (the dashboard) is built for. The terminal is a moat for technical operators and a wall for everyone else. A dashboard that exposes skills and automations as buttons, with output streamed into the UI and source links back into Obsidian, lets a non-technical teammate run the system without ever touching the CLI. Without the dashboard, handoff is painful.

Where to Start If You Only Change One Thing

If the slot-machine feeling is what brought you here, don't try to stand up all three layers this weekend. Pick the layer that hurts most. If you keep re-explaining context, start with the Obsidian vault and a tight CLAUDE.md. If your prompts wander, write the org-chart architecture first. If a teammate can't see what the system is doing, the dashboard is your unlock.

I rebuilt this stack twice before it held, and the version above is the one that survived contact with real client work. Steal the parts that fit your operation and throw out the rest — the point is a system you can reason about, not a shrine to mine.

If you want a second set of eyes on your own build, I take on a small number of agentic-OS setups through Ramlit and one-off audits via Fiverr. Either way, the framework here is yours to run today.

Agentic OS Claude Code: The Three-Layer Build That Killed My Slot-Machine Habit

The Slot-Machine Problem Has a Cost

Layer One: Architecture — From Random Prompts to a Real Org Chart

What a domain actually looks like in my setup

A real skill, not a toy one

Automations: skills with a trigger

When automation becomes a trap

Layer Two: Memory — The Obsidian Vault and the CLAUDE.md That Runs It

The three folders, and what each one is for

The vault-level CLAUDE.md that ties it together

What I tried first and abandoned

Layer Three: Observability — A Dashboard That Exposes the OS to Humans

What the dashboard actually does

What the dashboard does NOT need to do

The metrics that actually matter

What I'd build if I were starting today

When This Much Structure Actually Pays Off

What I'd Skip on Day One

What This Changes About How You Use Claude Code

Agentic OS Claude Code: Questions Builders Ask

What is an agentic OS in Claude Code?

Do I need an Obsidian vault to use Claude Code effectively?

How is a Claude Code skill different from a slash command?

What's a Claude Code automation, and how do I build one?

How much does running an agentic OS on Claude Code cost?

Can I hand off an agentic OS to a non-technical teammate?

Where to Start If You Only Change One Thing

Enjoyed this article?

Related Topics

Engr Mejba Ahmed

Comments

Leave a Comment

Related Articles

17 Claude Code Plugins and Skills I Actually Use

Loop Engineering vs Prompt Engineering: The Truth

Launch Your Agent: I Tested Anthropic's Free Skill

Comments

Leave a Comment

Expand Your Knowledge

AI School

Certificates

Learning Flashcards

AI Agent Skills

Engr Mejba Ahmed

Hey there!