openclaw: engrain the learning loop at the identity level

User feedback: "this should work for any task, not just calendar.
this learning flow must be strongly engrained to ensure openclaw
gets better over time."

The v3 rules were buried at the bottom of TOOLS.md and stated only
in workflow language. Three changes make the rule unavoidable:

1. **SOUL.md** — new marker-delimited section "Learning is your
   identity" inserted before ## Boundaries. AGENTS.md tells the
   agent to read SOUL.md first every session, so this is now the
   FIRST thing the agent loads about itself. Frames learning as
   character, not procedure.

2. **TOOLS.md v4** — section moved from the END of the file to
   right after the `# TOOLS.md` title (first substantive content
   on file load). Title strengthened: "THE FLOW — run this on
   EVERY task. Not just hard ones." Concrete examples explicitly
   call out diverse domains (calendar, frigate restart, disk
   usage, inbox summary, deploys) so the universality is
   unmistakable.

3. **learn-from-tasks skill** — opens with "This is universal.
   EVERY task runs through this flow — not just hard ones, not
   just unfamiliar ones. The save at the end is mandatory."
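
The top-of-file placement from item 2 can be sketched roughly as below.
This is a minimal illustration, not the exact init-container code: the
helper name `insert_after_title` and the temporary files are assumptions
made for the example, and it assumes the title is the first line that
starts with `# `.

```shell
#!/bin/sh
# Sketch: print a file with a section injected right after its first "# " heading.
# insert_after_title is a hypothetical helper name, not part of the commit.
insert_after_title() {
  # $1 = file to read, $2 = file holding the section to inject
  awk -v section="$2" '
    !inserted && /^# / {
      print
      print ""
      while ((getline line < section) > 0) print line
      close(section)
      inserted = 1
      next
    }
    { print }
    END {
      # No top-level heading found: fall back to appending the section.
      if (!inserted) {
        while ((getline line < section) > 0) print line
        close(section)
      }
    }
  ' "$1"
}

# Throwaway fixtures (assumed content, for illustration only).
work=$(mktemp -d)
printf '%s\n' '# TOOLS.md' '' '## Existing notes' > "$work/TOOLS.md"
printf '%s\n' \
  '<!-- BEGIN openclaw-devvm-section v4 -->' \
  '## THE FLOW' \
  '<!-- END openclaw-devvm-section v4 -->' > "$work/section.md"

result=$(insert_after_title "$work/TOOLS.md" "$work/section.md")
printf '%s\n' "$result"
```

With these fixtures the marker block lands directly under the title and
the pre-existing notes follow it, so user content keeps its relative
order.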

The actual flow (know → ask devvm → save) is unchanged. What
changed is salience: the rule is now the first thing the agent
encounters in three independent surfaces, with stronger framing
that makes "skipping the save" feel like a violation of identity
rather than a missed optimisation.

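TOOLS.md marker bumped v3 → v4 (the new SOUL.md section carries its
own v1 marker). The TOOLS.md stripper matches v1-v9 with one pattern,
so re-running it is idempotent.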
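
The stripping behaviour can be sketched as follows. This is an
illustration under assumptions: `strip_section` is a hypothetical helper
name and the sample file content is invented for the example. Only the
marker format and the `v[1-9]` pattern come from the change itself.

```shell
#!/bin/sh
# Sketch: drop any marker-delimited devvm section (v1..v9), keep everything else.
# strip_section is a hypothetical helper name, not part of the commit.
strip_section() {
  awk '
    /^<!-- BEGIN openclaw-devvm-section v[1-9] -->$/ { skip = 1; next }
    /^<!-- END openclaw-devvm-section v[1-9] -->$/   { skip = 0; next }
    !skip { print }
  ' "$1"
}

# Throwaway fixture: an old v3 section surrounded by user-owned lines.
tmpdir=$(mktemp -d)
printf '%s\n' \
  '# TOOLS.md' \
  '<!-- BEGIN openclaw-devvm-section v3 -->' \
  'old flow text' \
  '<!-- END openclaw-devvm-section v3 -->' \
  'user note' > "$tmpdir/TOOLS.md"

strip_section "$tmpdir/TOOLS.md" > "$tmpdir/once.md"
strip_section "$tmpdir/once.md" > "$tmpdir/twice.md"

once=$(cat "$tmpdir/once.md")
twice=$(cat "$tmpdir/twice.md")
printf '%s\n' "$once"
```

Running the strip a second time on its own output changes nothing, which
is the property that lets the init container run on every pod restart
without accumulating or damaging content outside the markers.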

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Viktor Barzin 2026-05-22 13:18:52 +00:00
parent 854817e2e3
commit dbb3dc04d3


@ -650,23 +650,28 @@ resource "kubernetes_deployment" "openclaw" {
} }
} }
# Init 6: seed devvm-fallback knowledge into FOUR places so the # Init 6: seed the "know → ask devvm → save" learning loop into
# OpenClaw agent uses devvm as its default fallback AND learns # FIVE places so it's engrained at the identity level, not just
# from it (saving scripts/knowledge locally so it grows # buried as workflow advice:
# independent over time):
# #
# 1. /workspace/TOOLS.md read on every session per AGENTS.md. # 1. /workspace/SOUL.md read every session per AGENTS.md, this
# Emphatic "TRY DEVVM BEFORE GIVING UP" + learning protocol. # is the agent's identity. A marker-delimited section there
# 2. /workspace/memory/projects/openclaw-runtime/devvm-fallback.md # says "learning is who you are" (single short truth, no
# indexed by memory-core (memory_recall surfaces this). # procedural detail).
# 3. /workspace/learned/ durable cache of stuff the agent has # 2. /workspace/TOOLS.md full how-to, inserted at the TOP of
# figured out via devvm, so it doesn't re-ask next time. # the file so the agent sees it first when loading bootstrap.
# INDEX.md is the table of contents the agent checks FIRST. # 3. /workspace/memory/projects/openclaw-runtime/devvm-fallback.md
# 4. /home/node/.openclaw/cc-skills/devvm-fallback/SKILL.md # indexed by memory-core (memory_recall surfaces it).
# formal skill, so the agent has a named entry-point for # 4. /workspace/memory/projects/openclaw-learned/ durable cache
# "I'm stuck, what now". # of recipes the agent has saved. INDEX.md is the first thing
# All sections are marker-delimited (v2) so user edits outside # checked on any task (Step 1 of the flow).
# the markers survive future pod restarts. # 5. /home/node/.openclaw/cc-skills/learn-from-tasks/SKILL.md
# formal skill with the triage logic.
# All marker-delimited (v4) so user edits outside markers
# survive future pod restarts. The four artifacts that ARE
# marker-managed reinforce the same rule from different angles:
# identity (SOUL), how (TOOLS), memory-search (memory note),
# entry-point (skill).
init_container { init_container {
name = "seed-devvm-memory-note" name = "seed-devvm-memory-note"
image = "busybox:1.37" image = "busybox:1.37"
@ -677,63 +682,113 @@ resource "kubernetes_deployment" "openclaw" {
DIR=/workspace/memory/projects/openclaw-runtime DIR=/workspace/memory/projects/openclaw-runtime
mkdir -p "$DIR" mkdir -p "$DIR"
cat > "$DIR/devvm-fallback.md" <<'NOTE_EOF' cat > "$DIR/devvm-fallback.md" <<'NOTE_EOF'
# Devvm is the default path when uncertain # The learning loop applies to EVERY task
**The flow is: know ask devvm (rarely) try yourself.** Don't **The flow: know ask devvm (rarely) try yourself. Then
brute-force. If you're not sure HOW to do the task, default to save what you learned, always.** This is not optional and
asking claude on devvm before trying random things in-pod. not specific to any domain it's how you compound. Every
task that ends with a saved recipe makes future-you
stronger. Every task that doesn't is a wasted lesson.
Devvm runs as wizard with full home-lab access (Vault token, ## The flow
kubectl cluster-admin, git repos, Cloudflare CLI) and Claude
Code v2+ at /usr/local/bin/claude.
## One-shot 1. **Know?** `memory_recall "<topic>"` and check
`/workspace/memory/projects/openclaw-learned/INDEX.md`.
If yes, use it.
2. **Don't know?** Default to `ssh devvm` and ask claude.
Don't brute-force. Most tasks are solvable there.
3. **Devvm can't either?** (rare) Try in-pod, be honest
about the risk, save anything you discover.
ssh devvm 'vault kv get -field=brave_api_key secret/openclaw' ## The save (mandatory, every non-trivial task)
ssh devvm 'claude -p "fix frigate's gpu crashloop"'
## Long-running async (tmux on devvm survives pod restart) After ANY task calendar, container debugging, vault
lookup, posting to Slack, parsing a log, whatever pause
and save the recipe to
`/workspace/memory/projects/openclaw-learned/`:
ssh devvm openclaw-task new my-task "claude -p --dangerously-skip-permissions 'do X'" - Script scripts/<task>.md
ssh devvm openclaw-task capture my-task - Knowledge knowledge/<topic>.md
ssh devvm openclaw-task list / status / log / kill / purge - Credential credentials/<name>.md **Vault pointer
only, never the value**
Save everything you learn into /workspace/memory/projects/ Update INDEX.md so the next session sees it.
openclaw-learned/ so it's indexed by memory-core and you can
`memory_recall` it next time. For credentials, save a POINTER ## SSH cheatsheet
to the Vault path, never the value. Full protocol in the
`learn-from-tasks` skill. ssh devvm 'vault kv get -field=KEY secret/PATH'
ssh devvm 'claude -p "your prompt"'
# async (>2 min) survives pod restart
ssh devvm openclaw-task new <id> "<command>"
ssh devvm openclaw-task capture / log / status / kill
NOTE_EOF NOTE_EOF
chown -R 1000:1000 "$DIR" chown -R 1000:1000 "$DIR"
# ---- 2. TOOLS.md devvm + learning section ------------------- # ---- 2. SOUL.md identity-level rule -------------------------
SOUL=/workspace/SOUL.md
if [ -f "$SOUL" ]; then
awk '
$0 == "<!-- BEGIN openclaw-learning-soul v1 -->" { skip=1; next }
$0 == "<!-- END openclaw-learning-soul v1 -->" { skip=0; next }
!skip { print }
' "$SOUL" > "$SOUL.tmp"
# Insert before "## Boundaries" if present, else append.
if grep -q "^## Boundaries" "$SOUL.tmp"; then
awk '
/^## Boundaries$/ && !done {
print "<!-- BEGIN openclaw-learning-soul v1 -->"
print "**Learning is your identity.** Every task — anything anyone asks you to do — is also a learning opportunity. The job is not just to answer; it is to come back to the next person (or future-you) smarter than you were. When you figure out how to do something, save the recipe under /workspace/memory/projects/openclaw-learned/. When you needed devvm to figure it out, save what devvm taught you. The point is compounding: every saved learning makes the next task faster, and over time you become independent. See TOOLS.md and the `learn-from-tasks` skill for the protocol."
print "<!-- END openclaw-learning-soul v1 -->"
print ""
done=1
}
{ print }
' "$SOUL.tmp" > "$SOUL.new"
else
cat "$SOUL.tmp" > "$SOUL.new"
cat >> "$SOUL.new" <<'SOUL_EOF'
<!-- BEGIN openclaw-learning-soul v1 -->
**Learning is your identity.** Every task is a learning opportunity. Save what you figure out under /workspace/memory/projects/openclaw-learned/ so future-you is stronger. See TOOLS.md and the `learn-from-tasks` skill.
<!-- END openclaw-learning-soul v1 -->
SOUL_EOF
fi
mv "$SOUL.new" "$SOUL"
rm -f "$SOUL.tmp"
chown 1000:1000 "$SOUL"
fi
# ---- 3. TOOLS.md flow section, INSERTED AT TOP --------------
TOOLS=/workspace/TOOLS.md TOOLS=/workspace/TOOLS.md
touch "$TOOLS" touch "$TOOLS"
# Strip prior versions (v1, v2, v3) before re-inserting v3. # Strip prior versions (v1, v2, v3, v4) before re-inserting v4.
# User edits outside the markers always survive. # User edits outside the markers always survive.
awk ' awk '
$0 == "<!-- BEGIN openclaw-devvm-section v1 -->" { skip=1; next } /^<!-- BEGIN openclaw-devvm-section v[1-9] -->$/ { skip=1; next }
$0 == "<!-- END openclaw-devvm-section v1 -->" { skip=0; next } /^<!-- END openclaw-devvm-section v[1-9] -->$/ { skip=0; next }
$0 == "<!-- BEGIN openclaw-devvm-section v2 -->" { skip=1; next }
$0 == "<!-- END openclaw-devvm-section v2 -->" { skip=0; next }
$0 == "<!-- BEGIN openclaw-devvm-section v3 -->" { skip=1; next }
$0 == "<!-- END openclaw-devvm-section v3 -->" { skip=0; next }
!skip { print } !skip { print }
' "$TOOLS" > "$TOOLS.tmp" ' "$TOOLS" > "$TOOLS.stripped"
cat >> "$TOOLS.tmp" <<'TOOLS_EOF' # Build the v4 section in /tmp, then insert near the top.
cat > /tmp/devvm-section.md <<'TOOLS_EOF'
<!-- BEGIN openclaw-devvm-section v4 -->
## THE FLOW run this on EVERY task. Not just hard ones.
<!-- BEGIN openclaw-devvm-section v3 --> This is universal. Same triage whether the user asks "what's
## THE FLOW when you get a task, run this triage every time on my calendar today?", "restart frigate", "find disk usage
on node3", "summarise the inbox", or "deploy a new service".
**Don't brute-force. Most tasks can be solved by claude on **Don't brute-force.** Don't say "I can't" until you've done
devvm. Default to asking it before trying random things.** steps 1-2. Most tasks are solvable.
``` ```
new task 1. Do I KNOW how? new task 1. Do I KNOW how?
(memory_recall + memory_recall +
check INDEX.md) /workspace/memory/
projects/openclaw-
learned/INDEX.md
yes no yes no
@ -743,108 +798,108 @@ resource "kubernetes_deployment" "openclaw" {
2. Ask claude on devvm: 2. Ask claude on devvm:
- can you do this? - can you do this?
- if yes, do it AND - if yes, do it AND
share the steps and share the recipe +
any credentials credentials needed
needed
yes no yes no
Devvm-claude does it. 3. (RARE) Let me try Devvm-claude does it. 3. (RARE) Let me try
Save what it told me in-pod. Most likely SAVE the recipe to in-pod. Most likely
into openclaw memory I will fail; that's openclaw-learned I will fail; that's
(or a Vault pointer for OK that's the (vault pointer for OK that's the
credentials). point. credentials). point.
``` ```
Step 1: **Know?** Try `memory_recall "<topic>"` and look at **The save is not optional.** Every non-trivial task
`/workspace/memory/projects/openclaw-learned/INDEX.md`. If success OR partial fail ends with a row in
you (or a past session) already figured this out, use it. `openclaw-learned/INDEX.md` and a markdown file with the
recipe. The 2 minutes you spend saving now turn future-you
into the agent who never has to ask again. This compounds.
Step 2: **Ask devvm.** Default path when uncertain. Use this ## Step 1 KNOW?
exact prompt shape explicitly ask for the recipe:
Two checks, both fast:
node /app/openclaw.mjs memory recall "<topic keywords>"
cat /workspace/memory/projects/openclaw-learned/INDEX.md
If a row matches, open the file and use it. Don't re-ask.
## Step 2 ASK DEVVM (the default when you don't know)
Use this prompt shape explicitly ask for the recipe, not
just the answer:
ssh devvm 'claude -p --dangerously-skip-permissions \ ssh devvm 'claude -p --dangerously-skip-permissions \
"I need to do X (give me context). \ "I am the openclaw agent. I need to do X (give context). \
Can you do this? If yes: do it now, then tell me \ Can you do this? If yes: do it now, then tell me 1) the \
the steps you took, the commands I would run on my \ exact commands I would run on my own next time, and 2) \
own next time, and what credentials/paths I need. \ what credentials/paths I need (give Vault paths, NOT \
I will save your answer so next time I don't ask."' values). I will save your answer."'
For multi-step work that takes >2 min, dispatch async: For work that takes more than ~2 minutes, dispatch async so
the session survives this pod restarting:
ssh devvm openclaw-task new <id> "<command>" ssh devvm openclaw-task new <id> "<command>"
ssh devvm openclaw-task capture <id> ssh devvm openclaw-task capture <id>
**Then SAVE everything** to ## Step 3 TRY IN-POD (rare)
`/workspace/memory/projects/openclaw-learned/<topic>.md` so
memory_recall surfaces it next time (see Step 1).
Step 3: **Try in-pod.** Only when devvm-claude says it can't Only when devvm-claude says it can't. Be honest with the
do the task either. This is rare. If you reach this step, user about the uncertainty. If you DO find a way, save it
be honest with the user and if you DO find a solution, just like Step 2.
save it under openclaw-learned/ so future you doesn't redo
the discovery.
## ALSO: after every task, introspect save a faster way ## The save (do this on every non-trivial task)
Especially early on, almost everything is a teaching
moment. After a non-trivial task whether via devvm or
self pause and ask:
- Likely to recur?
- Took >1 round-trip / detour / dig?
- Learned a non-obvious command, path, credential source,
or workflow?
If yes save it now (see Saving below). The 2 minutes you
spend now save 10 minutes every recurrence. **Bias
aggressively toward saving** while you're still learning
the environment.
## Saving where it goes
All learnings live under All learnings live under
`/workspace/memory/projects/openclaw-learned/` because that `/workspace/memory/projects/openclaw-learned/` (memory-core
path is indexed by memory-core (the same daily CronJob indexes this path; `memory_recall` surfaces it).
that pulls from claude-memory). After dropping a file
there, `memory_recall "<keywords>"` will surface it.
File-naming conventions: - **Script / recipe** `scripts/<task>.md`
Inline a fenced code block. Header: WHAT, WHEN learned,
HOW (verbatim devvm prompt or "self"), SOURCE (Vault path
if a credential is involved).
- **Knowledge** (decisions, paths, gotchas, conventions)
`knowledge/<topic>.md`
- **Credential POINTER** `credentials/<name>.md`
**NEVER stores the value.** Documents the Vault path +
field + fetch command + consumer + rotation expectations.
- **Script / recipe** Then add a row to `openclaw-learned/INDEX.md`.
`openclaw-learned/scripts/<task>.md`
Inline a fenced code block with the command(s). Header
must include WHAT, WHEN learned, HOW (the devvm prompt
you used, or "self"), and SOURCE (if a credential was
involved, point to the Vault path).
- **Knowledge** (decisions, paths, gotchas)
`openclaw-learned/knowledge/<topic>.md`
- **Credential POINTER**
`openclaw-learned/credentials/<name>.md`
**NEVER stores the value.** Documents:
- Vault path + field
- The exact command to fetch (e.g.,
`ssh devvm 'vault kv get -field=foo secret/bar'`)
- What service/task uses it
- Rotation expectations
That way the secret stays in Vault, you stay safe, but
you skip the "how do I get this credential" rediscovery
dance.
Finally, add a row to
`/workspace/memory/projects/openclaw-learned/INDEX.md` so
the table-of-contents reflects the new entry.
## devvm wizard@10.0.10.10 (pre-wired, zero-config) ## devvm wizard@10.0.10.10 (pre-wired, zero-config)
SSH key at ~/.ssh/id_rsa, host pre-trusted, `ssh devvm` SSH key at ~/.ssh/id_rsa, host pre-trusted, `ssh devvm`
Just Works. No password prompts, no host-trust prompts. Just Works. No password prompts, no host-trust prompts.
Devvm has: Vault token, kubectl cluster-admin, git repos
under /home/wizard/code, git-crypt, claude 2.1.126 at
/usr/local/bin/claude.
<!-- END openclaw-devvm-section v3 --> <!-- END openclaw-devvm-section v4 -->
TOOLS_EOF TOOLS_EOF
mv "$TOOLS.tmp" "$TOOLS"
# Insert at top: after first non-blank/non-heading lines.
# If the file starts with "# TOOLS.md", inject right after.
awk '
!inserted && /^# / {
print
print ""
while ((getline line < "/tmp/devvm-section.md") > 0) print line
close("/tmp/devvm-section.md")
inserted=1
next
}
{ print }
END {
if (!inserted) {
while ((getline line < "/tmp/devvm-section.md") > 0) print line
close("/tmp/devvm-section.md")
}
}
' "$TOOLS.stripped" > "$TOOLS"
rm -f "$TOOLS.stripped" /tmp/devvm-section.md
chown 1000:1000 "$TOOLS" chown 1000:1000 "$TOOLS"
# ---- 3. Memory-indexed learned/ scaffold -------------------- # ---- 3. Memory-indexed learned/ scaffold --------------------
@ -904,36 +959,40 @@ resource "kubernetes_deployment" "openclaw" {
2>/dev/null || true 2>/dev/null || true
fi fi
# ---- 4. Skill at cc-skills/learn-from-tasks/ ---------------- # ---- 5. Skill at cc-skills/learn-from-tasks/ ----------------
SKILL_DIR=/home/node/.openclaw/cc-skills/learn-from-tasks SKILL_DIR=/home/node/.openclaw/cc-skills/learn-from-tasks
mkdir -p "$SKILL_DIR" mkdir -p "$SKILL_DIR"
cat > "$SKILL_DIR/SKILL.md" <<'SKILL_EOF' cat > "$SKILL_DIR/SKILL.md" <<'SKILL_EOF'
# learn-from-tasks # learn-from-tasks
**The flow is: know ask devvm (rarely) try yourself.** **This is universal. EVERY task runs through this flow
not just hard ones, not just unfamiliar ones. The save at
the end is mandatory.**
Don't brute-force. If you don't immediately know HOW to do > know ask devvm (rarely) try yourself SAVE the recipe
the task, default to asking claude on devvm. Most tasks
can be solved there. Over time, as your learnings The goal isn't to answer the current task; the goal is
accumulate under `/workspace/memory/projects/openclaw- to ALSO leave a trail so future-you doesn't need to
learned/`, you'll grow independent. That's the goal redo the discovery. Every saved recipe compounds. Skip
devvm is the teacher, you're the student. the save and you stay at zero.
## Triage every new task ## Triage every new task
1. **Do I know how?** 1. **Do I know how?**
`memory_recall "<topic>"` and/or scan `memory_recall "<topic>"` and scan
`/workspace/memory/projects/openclaw-learned/INDEX.md`. `/workspace/memory/projects/openclaw-learned/INDEX.md`.
If yes just do it. If yes just do it.
2. **If not, ASK DEVVM.** This is the default path. 2. **If not, ASK DEVVM.** This is the default path most
tasks land here.
ssh devvm 'claude -p --dangerously-skip-permissions \ ssh devvm 'claude -p --dangerously-skip-permissions \
"I need to do X. Can you do this? If yes: do it \ "I am the openclaw agent. I need to do X. Can \
now, then tell me the steps you took, the \ you do this? If yes: do it now, then tell me 1) \
commands I should run on my own next time, and \ the exact commands I would run on my own next \
what credentials/paths I need. I will save your \ time, and 2) what credentials/paths I need (give \
answer so next time I don't ask."' Vault paths, NOT values). I will save your \
answer."'
For multi-step work (>2 min), use async: For multi-step work (>2 min), use async:
@ -942,57 +1001,64 @@ resource "kubernetes_deployment" "openclaw" {
ssh devvm openclaw-task capture / log / status ssh devvm openclaw-task capture / log / status
When devvm-claude responds, **save what it told you** When devvm-claude responds, **save what it told you**
(see Save protocol). (see "The save" below).
3. **Only if devvm-claude can't either**, try in-pod. This 3. **Only if devvm-claude can't either**, try in-pod. This
is rare. Be honest with the user about the uncertainty. is rare. Be honest with the user about the uncertainty.
If you find a way, save it just like step 2. If you find a way, save it just like step 2.
## After-task introspect (every non-trivial task) ## The save MANDATORY on every non-trivial task
Whether you used devvm or self-figured, pause at the end Don't end the task without it. Even if you self-figured,
and ask: even if it was easy this time recurrence makes it
- Likely to recur? valuable. The exceptions list at the end is tight; bias
- Took >1 round-trip / dig / detour? aggressively toward saving.
- Non-obvious command, path, credential, workflow?
If yes save now. **Bias aggressively toward saving All learnings live under
while you're still learning the environment.**
## Save protocol
Everything goes under
`/workspace/memory/projects/openclaw-learned/` because `/workspace/memory/projects/openclaw-learned/` because
memory-core indexes that tree (it's how `memory_recall` memory-core indexes that tree.
surfaces results).
1. Pick the slot: 1. Pick the slot:
- Script / recipe `scripts/<task>.md` (fenced code - **Script / recipe** `scripts/<task>.md`
block; agent reads + runs from here) Fenced code block(s); agent reads + runs from here.
- Knowledge `knowledge/<topic>.md` - **Knowledge** `knowledge/<topic>.md`
- Credential `credentials/<name>.md` Decisions, paths, conventions, gotchas, anti-patterns.
**POINTER ONLY, never the value**. Document the - **Credential** `credentials/<name>.md`
Vault path + the fetch command (e.g., **POINTER ONLY, never the value.** Vault path + field
`ssh devvm 'vault kv get -field=foo secret/bar'`), + fetch command + consumer + rotation expectations.
the consuming service, and rotation expectations.
2. Header in the file: WHAT, WHEN learned, HOW (verbatim 2. Header in the file: WHAT, WHEN learned, HOW (verbatim
devvm prompt, or "self"), SOURCE (Vault path if cred). devvm prompt, or "self"), SOURCE (Vault path if cred).
3. Add a row to INDEX.md. 3. Add a row to `openclaw-learned/INDEX.md`.
4. Test the saved recipe end-to-end before considering 4. Test the saved recipe end-to-end. If it doesn't work
the task done. as-saved, the artifact is a lie fix it before you
consider the task done.
## Never save ## After every task, ask yourself
- Trivial one-liners that don't actually save time. - Could a future-me (or another agent) do this faster by
- Values of credentials (use the pointer pattern). reading what I just figured out?
- Was there any non-obvious URL / Vault path / quirk?
- Did the first attempt fail and need a tweak?
If yes to any save it. The bar is low.
## When NOT to save
Very narrow exceptions:
- Trivial one-liners (`date`, `whoami`) that take zero
time to redo.
- Things that change every run (ephemeral pod names, - Things that change every run (ephemeral pod names,
random tokens, timestamps). random tokens, timestamps).
- Values of credentials (use the pointer pattern instead).
That's it. Everything else, save.
SKILL_EOF SKILL_EOF
chown -R 1000:1000 "$SKILL_DIR" chown -R 1000:1000 "$SKILL_DIR"
echo "devvm-fallback + learning loop v3 seeded:" echo "learning-loop v4 seeded:"
echo " - memory note: $DIR/devvm-fallback.md" echo " - memory note: $DIR/devvm-fallback.md"
echo " - TOOLS.md v3 (explicit 3-step flow, memory-indexed saves)" echo " - SOUL.md: learning-as-identity marker section"
echo " - TOOLS.md v4: flow section INSERTED AT TOP"
echo " - openclaw-learned/ at $LEARNED" echo " - openclaw-learned/ at $LEARNED"
echo " - skill: $SKILL_DIR/SKILL.md" echo " - skill: $SKILL_DIR/SKILL.md"
EOT EOT