Engineering · May 22, 2026 · 10 min read

Recall, Recency, and Relevance: How Memanto Finds Only the Exact Memory Your Agent Needs

Instead of dumping everything into an agent’s context, Memanto retrieves the single most useful memory for the current task using typed recall, temporal queries, and differential retrieval.

Hetkumar Patel, Software Developer

More memory is not always better. Agents that store everything end up noisy, slow, and unreliable. Memanto returns the one piece of memory that answers the question right now by combining typed recall, temporal queries, and differential retrieval.

Why "More" Memory Hurts

Unfiltered memories produce low-signal matches, stale facts, and ambiguity about intent. An agent searching "What database do we use?" against a full memory store might retrieve: an old PostgreSQL note, a meeting transcript mentioning MongoDB exploration, a preference about SQL syntax, and a deployment log from six months ago. All match the query semantically, but only one is current truth. Memanto avoids this by making memory typed, queryable, and time-aware so agents retrieve focused, actionable context.

Recall — the right memory, not every match

memanto recall "<query>" runs a two-stage retrieval. First, Moorcheh semantic matching scores all memories against the meaning of the query. Then type filtering narrows the results to the categories relevant to the question (fact, decision, context, event, etc.). For "what database do we use now?", the system targets fact and recent context types and ignores preference and old event memories, even if they match semantically. The goal is not to list everything that vaguely matches, but to return the exact memory that answers the agent’s intent.
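
The two stages can be sketched in a few lines of Python. This is only an illustration of the idea, not Memanto's implementation: the Memory class, the recall function, and the min_score threshold are hypothetical, and toy_score is a crude word-overlap stand-in for Moorcheh's semantic matching.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    type: str  # e.g. "fact", "decision", "context", "preference", "event"


def toy_score(query: str, text: str) -> float:
    """Stand-in for semantic scoring: fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def recall(query, memories, allowed_types, min_score=0.2):
    # Stage 1: score every memory against the query and drop low scorers.
    scored = [(toy_score(query, m.text), m) for m in memories]
    scored = [(s, m) for s, m in scored if s >= min_score]
    # Stage 2: keep only the memory types relevant to the question.
    typed = [(s, m) for s, m in scored if m.type in allowed_types]
    # Best match first.
    return [m for s, m in sorted(typed, key=lambda pair: pair[0], reverse=True)]


# Made-up sample data for illustration.
memories = [
    Memory("we use postgres 16 as the production database", "fact"),
    Memory("prefer uppercase keywords in database sql", "preference"),
    Memory("database migration ran during the june deploy", "event"),
]
results = recall("what database do we use", memories, allowed_types={"fact", "context"})
```

All three sample memories mention "database" and pass the first stage, but only the fact survives the type filter.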

Temporal queries — "as of" a time

Point-in-time truth is essential for time-bound questions. Use recall/as-of to ask what the system believed at a specific timestamp, preventing later edits from contaminating historical answers.

  • POST /api/v2/agents/{agent_id}/recall/as-of — query memory as it existed at a timestamp
  • POST /api/v2/agents/{agent_id}/recall/recent — return the last N stored memories without filtering
  • POST /api/v2/agents/{agent_id}/recall/changed-since — show only what changed since a date
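
To make the as-of semantics concrete, here is a minimal Python sketch. It assumes each memory records when it was created and, if a later edit replaced it, when it was superseded. The superseded_at field and the as_of helper are hypothetical names chosen for this example; how Memanto actually stores versions is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Memory:
    text: str
    created_at: datetime
    # Hypothetical field: when a newer memory replaced this one (None = still current).
    superseded_at: Optional[datetime] = None


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


def as_of(memories, ts):
    """Return memories that existed, and had not yet been replaced, at time ts."""
    return [
        m for m in memories
        if m.created_at <= ts and (m.superseded_at is None or m.superseded_at > ts)
    ]


# Made-up sample data: the database note was replaced on 2026-05-10.
memories = [
    Memory("Production database is MySQL 8", utc(2026, 1, 10), superseded_at=utc(2026, 5, 10)),
    Memory("Production database is Postgres 16", utc(2026, 5, 10)),
]
snapshot = as_of(memories, utc(2026, 5, 1))
```

Asking as of 2026-05-01 returns only the older note, because the newer one did not exist yet and so cannot leak into the historical answer.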

Differential recall — show what changed

changed-since returns the delta between two points in time instead of the full archive. This surfaces what actually changed since the last checkpoint — ideal for release notes, deployment diffs, and post-incident reviews.
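
As a rough sketch of the delta idea in Python, assume every memory carries an updated_at timestamp that equals created_at until the memory is edited. The field names and the changed_since helper are assumptions made for this example, not Memanto's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Memory:
    text: str
    created_at: datetime
    updated_at: datetime  # assumed equal to created_at until the memory is edited


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


def changed_since(memories, since):
    """Return only memories created or updated after `since`, oldest change first."""
    changed = [m for m in memories if m.updated_at > since]
    return sorted(changed, key=lambda m: m.updated_at)


# Made-up sample data for illustration.
memories = [
    Memory("Enabled connection pooling", utc(2026, 4, 2), utc(2026, 4, 2)),
    Memory("API rate limit is 200 requests per minute", utc(2026, 3, 1), utc(2026, 5, 12)),
    Memory("Rolled back release 4.2", utc(2026, 5, 18), utc(2026, 5, 18)),
]
delta = changed_since(memories, utc(2026, 5, 1))
```

With a checkpoint of 2026-05-01, the delta contains the edited rate-limit note and the rollback, and skips the untouched April entry.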

Recent — last N memories

recall/recent returns the last N memories stored in the agent’s knowledge base, in reverse chronological order. No query, no filtering — just the freshest state as a time-ordered snapshot. Useful for agents that need to see "what happened most recently?" without specifying what they’re looking for.
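
In code, this behavior amounts to a sort by storage time with a cutoff. The sketch below assumes a stored_at timestamp on each memory; the field and the recent helper are hypothetical.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Memory:
    text: str
    stored_at: datetime


def recent(memories, n):
    """Return the n most recently stored memories, newest first."""
    return heapq.nlargest(n, memories, key=lambda m: m.stored_at)


# Made-up sample data, deliberately out of order.
memories = [
    Memory("Deploy finished", datetime(2026, 5, 21, 14, 30, tzinfo=timezone.utc)),
    Memory("Deploy started", datetime(2026, 5, 21, 14, 0, tzinfo=timezone.utc)),
    Memory("Alert cleared", datetime(2026, 5, 21, 15, 5, tzinfo=timezone.utc)),
]
latest = recent(memories, 2)
```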

Relevant-only results — the retrieval pipeline

Query-based recall (e.g., recall/as-of, recall/changed-since) runs through three sequential filters:

| Filter | What it does |
| --- | --- |
| Semantic score (Moorcheh) | All memories are ranked by meaning-distance from the query. Low-scoring noise is discarded. |
| Type filter | Only memory categories relevant to the question are returned (e.g., fact for "what is X", decision for "what did we choose"). |
| Temporal filter | For recall/as-of, only memories that existed at that timestamp are returned. For recall/changed-since, only memories created or updated since the given date are returned. (recall/recent skips this pipeline and simply returns the newest memories in order.) |

Example: an agent asks "has anything changed about our cloud provider?" Using recall/changed-since with a 7-day window, the system:

  • Scores all memories mentioning "cloud provider" or "AWS/GCP/Azure" semantically
  • Filters to fact, decision, and event types (ignoring preference or context)
  • Applies temporal diff: only returns memories created or updated in the last 7 days
  • Returns a tight set: e.g., "Migrated to GCP on 2026-05-20" and "Decommissioned AWS on 2026-05-21", ignoring the old 2026-02-15 "evaluated AWS" decision
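
The walkthrough above can be expressed as three list filters applied in order. This Python sketch reuses the memories from the example and assumes "now" is 2026-05-22. The keyword match is a deliberately simple stand-in for Moorcheh's semantic scoring, the type labels assigned to each memory are guesses, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Memory:
    text: str
    type: str
    updated_at: datetime


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


def changed_since_recall(memories, keywords, allowed_types, since):
    # 1. Semantic stage (stand-in): keep memories that mention any keyword.
    semantic = [m for m in memories if keywords & set(m.text.lower().split())]
    # 2. Type filter: keep only the categories relevant to the question.
    typed = [m for m in semantic if m.type in allowed_types]
    # 3. Temporal diff: keep only memories created or updated since the cutoff.
    return [m for m in typed if m.updated_at >= since]


memories = [
    Memory("Evaluated AWS", "decision", utc(2026, 2, 15)),
    Memory("Migrated to GCP on 2026-05-20", "event", utc(2026, 5, 20)),
    Memory("Decommissioned AWS on 2026-05-21", "event", utc(2026, 5, 21)),
    Memory("Prefer the GCP web console over the CLI", "preference", utc(2026, 5, 19)),
    Memory("Standup moved to 10am", "event", utc(2026, 5, 20)),
]

now = utc(2026, 5, 22)  # assumed current date for the example
results = changed_since_recall(
    memories,
    keywords={"cloud", "aws", "gcp", "azure"},
    allowed_types={"fact", "decision", "event"},
    since=now - timedelta(days=7),
)
```

Each stage removes a different kind of noise: the standup note fails the semantic stage, the console preference fails the type filter, and the February evaluation fails the 7-day window.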

The result is high-signal, actionable context with fewer false positives.

Practical CLI examples

  • memanto remember "This project uses Postgres 16 in production" --type fact
  • memanto recall "what database are we using now" --type fact
  • curl -X POST "http://127.0.0.1:8000/api/v2/agents/my-agent/recall/as-of" -H "X-Session-Token: $TOKEN" -d '{"query":"what database are we using","as_of":"2026-05-01T00:00:00Z"}'
  • curl -X POST "http://127.0.0.1:8000/api/v2/agents/my-agent/recall/changed-since" -H "X-Session-Token: $TOKEN" -d '{"query":"deployment changes","since":"2026-05-01T00:00:00Z"}'
  • curl -X POST "http://127.0.0.1:8000/api/v2/agents/my-agent/recall/recent" -H "X-Session-Token: $TOKEN" -d '{"query":"what is the current status"}'
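
For agents that call the API from code rather than the shell, the curl commands above translate roughly as follows. This sketch only builds the request using Python's standard library. The URL, the X-Session-Token header, and the body fields are copied from the curl examples; the build_recall_request helper is hypothetical, and the Content-Type header is an assumption about what the server expects.

```python
import json
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000/api/v2"


def build_recall_request(agent_id, endpoint, payload, token):
    """Build (but do not send) a POST request for a recall endpoint."""
    return Request(
        url=f"{BASE_URL}/agents/{agent_id}/recall/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Session-Token": token,
            # Assumption: the API expects a JSON body to be declared explicitly.
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_recall_request(
    "my-agent",
    "as-of",
    {"query": "what database are we using", "as_of": "2026-05-01T00:00:00Z"},
    token="example-token",  # placeholder, not a real token
)
```

Passing req to urllib.request.urlopen would send it to a running server.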

Memanto’s design aims to reduce noise: it does not try to remember everything; it tries to remember the right thing at the right time.