PHANTOM
🇮🇳 IN
Skip to content

fix: skip invalid episode_ids in semantic ingestion instead of crashing#1128

Open
haosenwang1018 wants to merge 2 commits intoMemMachine:mainfrom
haosenwang1018:fix/graceful-invalid-episode-ids
Open

fix: skip invalid episode_ids in semantic ingestion instead of crashing#1128
haosenwang1018 wants to merge 2 commits intoMemMachine:mainfrom
haosenwang1018:fix/graceful-invalid-episode-ids

Conversation

@haosenwang1018
Copy link

Closes #1102

When episode_ids referenced in semantic storage have been deleted or are otherwise invalid, the ingestion process crashes with a ValueError. This change gracefully skips invalid episode_ids with a warning log instead of failing entirely, allowing valid messages to still be processed.

As suggested in the issue discussion.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses issue #1102 by changing the semantic ingestion process to gracefully skip invalid episode_ids instead of crashing with a ValueError. The change replaces raising an exception with logging a warning when episode_ids referenced in semantic storage have been deleted or are otherwise invalid.

Changes:

  • Replace ValueError exception with warning log when invalid episode_ids are encountered
  • Filter out None messages before processing to allow valid messages to continue
Comments suppressed due to low confidence (1)

src/memmachine/semantic_memory/semantic_ingestion.py:117

  • When all messages are invalid (all episode_ids return None), the filtered messages list will be empty, and processing will skip marking the invalid episode_ids as ingested. This creates an infinite retry loop. Consider adding an early return after marking invalid episode_ids as ingested: if len(raw_messages) == 0: return
        raw_messages = [m for m in raw_messages if m is not None]
        messages = TypeAdapter(list[Episode]).validate_python(raw_messages)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

set_id,
none_h_ids,
)

Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid episode_ids are not being marked as ingested, which will cause them to be retried indefinitely. According to the stored memory (src/memmachine/semantic_memory/semantic_ingestion.py lines 109-120), invalid episode_ids should be marked as ingested to prevent repeated failures. After logging the warning, you should mark these invalid episode_ids as ingested by calling: await self._semantic_storage.mark_messages_ingested(set_id=set_id, history_ids=none_h_ids)

Suggested change
await self._semantic_storage.mark_messages_ingested(
set_id=set_id,
history_ids=none_h_ids,
)

Copilot uses AI. Check for mistakes.
set_id,
none_h_ids,
)

Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new behavior (skipping invalid episode_ids) but lacks test coverage. Consider adding a test case that verifies: 1) invalid episode_ids are logged with a warning, 2) invalid episode_ids are marked as ingested to prevent reprocessing, 3) valid messages are still processed correctly when some episode_ids are invalid, and 4) the function handles gracefully when all episode_ids are invalid.

Suggested change
await self._semantic_storage.mark_messages_ingested(
set_id=set_id,
history_ids=none_h_ids,
)

Copilot uses AI. Check for mistakes.
…inite retry

Signed-off-by: haosenwang1018 <haosenwang1018@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Background semantic ingestion crashes/restarts when semantic storage references missing episode_ids

2 participants