SillyTavern extension · long-form memory · v1.7.4

Smart Memory

The memory layer your AI was missing. Every fact, every arc, every shift in mood — captured, deduplicated, consolidated, and quietly handed back to the model on every single turn.

13 injection slots · named · positional
11 macro tokens · {{smartmemory-*}}
4 extraction tiers · sequential pipeline
2 storage layers · persistent · per-chat
The pipeline · top to bottom

How memory survives the conversation.

A six-step journey from message in to context restored — running quietly behind every reply.

  1. Input

    Every turn arrives.

    Solo chat, group scene, swiped reply — every rendered message flows into the pipeline. Smart Memory runs last in line, so it sees the final version of the turn.

  2. Extract

    Four tiers extract.

    Each tier listens for a different kind of signal — from the rolling summary to the slowly-forming canon — and writes its findings into its own bucket.

    Tier 1 · Compaction
    The running recap
    Keeps a progressive summary of everything so far. Extends — never rewrites.
    Tier 2 · Scenes
    Every story beat
    Detects scene breaks and writes a two-sentence beat for each, plus what each character now knows.
    Tier 3 · Batch
    Facts, arcs, world state
    Surfaces facts, relationships, preferences, open plot threads, and the live state of every entity in the scene.
    Tier 4 · Derived
    Profiles & canon
    Builds character and world profiles. Compiles the resolved story into a stable narrative canon.
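The sequential fan-out above can be sketched in a few lines. This is an illustrative model, not the extension's code: the bucket names are assumptions, and the stand-in `extract` functions replace what are really dedicated model prompts per tier.

```javascript
// Hypothetical sketch: tiers run in order, each writing only into its own bucket.
function runTiers(message, buckets, tiers) {
  for (const tier of tiers) {
    const findings = tier.extract(message);   // tier-specific extraction
    buckets[tier.bucket].push(...findings);   // each tier owns one bucket
  }
  return buckets;
}

// Illustrative stand-ins for two of the four tiers (names and logic assumed).
const exampleTiers = [
  { bucket: 'summary', extract: (m) => [m.slice(0, 40)] },                           // Tier 1: compaction
  { bucket: 'scenes',  extract: (m) => (m.includes('***') ? ['scene break'] : []) }, // Tier 2: scenes
];
```

Because each tier appends to its own bucket, a failure or empty result in one tier never clobbers another's findings.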
  3. Classify

    Dedup classifies.

    Semantic embeddings — with a Jaccard fallback when none are available — sort every candidate into one of four buckets: new, supersedes, uncertain, duplicate. Old memories aren't deleted — they're retired with valid-from / valid-to indices and kept for the record.

    Passed
    new fact — keep
    Superseded
    state changed — replace
    Uncertain
    ask the model — confirm
    Rejected
    duplicate — drop
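When no embeddings are available, a token-set Jaccard score can stand in for semantic similarity. The sketch below uses illustrative cutoffs; the real thresholds are configurable, and a similarity band only approximates the supersedes-versus-uncertain call, which the extension resolves by asking the model.

```javascript
// Jaccard similarity over word sets -- the hypothetical fallback metric.
function jaccard(a, b) {
  const A = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const inter = [...A].filter((x) => B.has(x)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 0 : inter / union;
}

// Illustrative thresholds (assumed values, not the extension's defaults).
function classify(candidate, existing) {
  let best = 0;
  for (const memory of existing) best = Math.max(best, jaccard(candidate, memory));
  if (best >= 0.9) return 'duplicate';   // rejected: drop
  if (best >= 0.6) return 'supersedes';  // state changed: replace (retire, don't delete)
  if (best >= 0.4) return 'uncertain';   // ask the model: confirm
  return 'new';                          // passed: keep
}
```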
  4. Merge

    Consolidation enriches.

    When enough memories of one kind pile up, the model fuses near-identical entries into fewer, richer ones. Runs independently for long-term and session memory, with separate thresholds per type.

    Long-term
    fact · relationship · preference · event
    Session
    scene · revelation · development · detail
    Threshold
    per type · independent counter
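The per-type counters can be sketched as below. The threshold values are made up for illustration, and `mergeFn` stands in for the model-driven fusion pass; only the shape (one independent counter per type, reset on fire) follows the description above.

```javascript
// Illustrative per-type thresholds (assumed values).
const THRESHOLDS = { fact: 10, relationship: 8, scene: 6 };

// Returns a callback that counts new memories per type and triggers
// consolidation for that type alone when its threshold is reached.
function makeConsolidator(mergeFn) {
  const counts = {};
  return function onNewMemory(type) {
    counts[type] = (counts[type] || 0) + 1;
    if (counts[type] >= (THRESHOLDS[type] ?? Infinity)) {
      counts[type] = 0;   // reset only this type's counter
      mergeFn(type);      // fuse near-identical entries of this type
      return true;
    }
    return false;
  };
}
```

Keeping the counters independent means a flood of scene beats never forces premature consolidation of facts, and vice versa.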
  5. Store

    Two layers hold it.

    Identity and continuity live in different homes. One survives every restart. The other resets when a chat does.

    Persistent
    Survives every session
    Long-term memories, relationships, persistent arcs, canon, the entity registry, and what each character knows about the others.
    • extension_settings
    • per-character
    • cross-chat
    Per-chat
    Lives with the conversation
    The rolling summary, session details, scene history, chat-scoped arcs, the live state ledger, and the current snapshot of profiles.
    • chatMetadata
    • per-conversation
    • resets on Fresh Start
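The split can be modeled as a simple routing table. The memory-type names here are assumptions drawn from the layer descriptions above; the store names mirror SillyTavern's persistent `extension_settings` and per-conversation `chatMetadata`.

```javascript
// Hypothetical routing: identity data persists, continuity data lives with the chat.
const PERSISTENT_TYPES = new Set(['fact', 'relationship', 'canon', 'profile']);     // survives restarts
const PER_CHAT_TYPES   = new Set(['summary', 'scene', 'stateLedger', 'arc']);       // resets on Fresh Start

function route(memoryType) {
  if (PERSISTENT_TYPES.has(memoryType)) return 'extension_settings';
  if (PER_CHAT_TYPES.has(memoryType))   return 'chatMetadata';
  throw new Error(`unknown memory type: ${memoryType}`);
}
```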
  6. Inject

    Context, restored.

    On every turn, the right memory lands in the right place. Thirteen named slots — anchored at the character card, depth-relative to the current message, or merged into a single unified block — feed it back to the model exactly where it needs to be.

    IN_PROMPT
    4 slots · anchored at card depth
    IN_CHAT
    8 slots · depth-relative to the turn
    Unified
    1 slot · everything in one block
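Depth-relative placement can be illustrated with a small index calculation. This is a conceptual model, not the extension's injection code: depth 0 sits just before the newest message, and larger depths climb further up the transcript, while IN_PROMPT slots anchor at the top with the character card.

```javascript
// Where in the message list a slot's content would be spliced (illustrative).
function injectionIndex(messages, slot) {
  if (slot.position === 'IN_PROMPT') return 0;            // anchored at the card
  // IN_CHAT: depth counts messages up from the current turn, clamped to the top.
  return Math.max(0, messages.length - slot.depth);
}
```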

Your AI never forgets.

It remembers the small grudges, the inside jokes, the way the weather changed in the third scene of the second chapter — and brings them back when they matter.