fix: skip invalid episode_ids in semantic ingestion instead of crashing by haosenwang1018 · Pull Request #1128 · MemMachine/MemMachine

haosenwang1018 · 2026-02-20T20:06:45Z

When episode_ids referenced in semantic storage have been deleted or are otherwise invalid, the ingestion process crashes with a ValueError. This change gracefully skips invalid episode_ids with a warning log instead of failing entirely, allowing valid messages to still be processed.

As suggested in the issue discussion.

Copilot

Pull request overview

This PR addresses issue #1102 by changing the semantic ingestion process to gracefully skip invalid episode_ids instead of crashing with a ValueError. The change replaces raising an exception with logging a warning when episode_ids referenced in semantic storage have been deleted or are otherwise invalid.

Changes:

Replace ValueError exception with warning log when invalid episode_ids are encountered
Filter out None messages before processing to allow valid messages to continue
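The two changes above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `split_valid_messages` and the pairing of `history_ids` with `raw_messages` by position are assumptions; only `set_id`, `none_h_ids`, and the `None`-filtering step appear in the PR.

```python
import logging

logger = logging.getLogger(__name__)


def split_valid_messages(history_ids, raw_messages, set_id):
    """Hypothetical helper: separate resolvable messages from missing ones.

    Assumes raw_messages[i] is the lookup result for history_ids[i], with
    None meaning the episode was deleted or is otherwise invalid.
    """
    # Collect the ids whose lookup returned nothing.
    none_h_ids = [h for h, m in zip(history_ids, raw_messages) if m is None]

    if none_h_ids:
        # Previously this condition raised ValueError; now it only warns.
        logger.warning(
            "Skipping invalid episode_ids for set_id %s: %s",
            set_id,
            none_h_ids,
        )

    # Keep only the messages that were actually found.
    valid = [m for m in raw_messages if m is not None]
    return valid, none_h_ids
```

Returning the invalid ids alongside the valid messages leaves it to the caller to decide what to do with them, for example marking them as ingested.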

Comments suppressed due to low confidence (1)

src/memmachine/semantic_memory/semantic_ingestion.py:117

When all messages are invalid (all episode_ids return None), the filtered messages list will be empty, and processing will skip marking the invalid episode_ids as ingested. This creates an infinite retry loop. Consider adding an early return after marking invalid episode_ids as ingested: if len(raw_messages) == 0: return

        raw_messages = [m for m in raw_messages if m is not None]
        messages = TypeAdapter(list[Episode]).validate_python(raw_messages)
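A minimal sketch of the early return suggested in this comment, under stated assumptions: `InMemoryStorage` and `process_batch` are hypothetical stand-ins, and only the `mark_messages_ingested(set_id=..., history_ids=...)` call shape and the `len(raw_messages) == 0` check come from the review.

```python
import asyncio


class InMemoryStorage:
    """Hypothetical stand-in for the semantic storage; records ingested ids."""

    def __init__(self):
        self.ingested = []

    async def mark_messages_ingested(self, *, set_id, history_ids):
        self.ingested.extend(history_ids)


async def process_batch(storage, set_id, history_ids, raw_messages):
    """Hypothetical ingestion step showing where the early return would sit."""
    none_h_ids = [h for h, m in zip(history_ids, raw_messages) if m is None]
    if none_h_ids:
        # Mark the invalid ids first so they are not picked up again.
        await storage.mark_messages_ingested(set_id=set_id, history_ids=none_h_ids)

    raw_messages = [m for m in raw_messages if m is not None]
    if len(raw_messages) == 0:
        # Nothing valid is left, so stop before validation and processing.
        return []

    # Validation and processing of the valid messages would follow here.
    return raw_messages
```

With a batch where every lookup returned `None`, `asyncio.run(process_batch(storage, "set-1", ["e1", "e2"], [None, None]))` returns an empty list and leaves both ids recorded in `storage.ingested`, so the same batch is not retried.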

Copilot · 2026-02-21T00:11:24Z

src/memmachine/semantic_memory/semantic_ingestion.py

+                set_id,
+                none_h_ids,
            )



Invalid episode_ids are not being marked as ingested, which will cause them to be retried indefinitely. According to the stored memory (src/memmachine/semantic_memory/semantic_ingestion.py lines 109-120), invalid episode_ids should be marked as ingested to prevent repeated failures. After logging the warning, you should mark these invalid episode_ids as ingested by calling: await self._semantic_storage.mark_messages_ingested(set_id=set_id, history_ids=none_h_ids)

Suggested change

        await self._semantic_storage.mark_messages_ingested(
            set_id=set_id,
            history_ids=none_h_ids,
        )

Copilot · 2026-02-21T00:11:25Z

src/memmachine/semantic_memory/semantic_ingestion.py

+                set_id,
+                none_h_ids,
            )



This change introduces new behavior (skipping invalid episode_ids) but lacks test coverage. Consider adding a test case that verifies that: 1) invalid episode_ids are logged with a warning, 2) invalid episode_ids are marked as ingested to prevent reprocessing, 3) valid messages are still processed correctly when some episode_ids are invalid, and 4) the function handles the case where all episode_ids are invalid gracefully.

Suggested change

        await self._semantic_storage.mark_messages_ingested(
            set_id=set_id,
            history_ids=none_h_ids,
        )
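The four cases listed in this comment could be covered along these lines. This is only a sketch using the standard library's `unittest`: `FakeStorage` and `ingest` are hypothetical stand-ins for the real storage and ingestion method, whose exact signatures are not shown in this PR, and a real test would exercise the project's own ingestion code instead.

```python
import logging
import unittest

logger = logging.getLogger("semantic_ingestion_sketch")


class FakeStorage:
    """Hypothetical storage double that records which ids were marked."""

    def __init__(self):
        self.ingested = []

    async def mark_messages_ingested(self, *, set_id, history_ids):
        self.ingested.extend(history_ids)


async def ingest(storage, set_id, episodes_by_id, history_ids):
    """Hypothetical stand-in for the ingestion step under test."""
    raw = [episodes_by_id.get(h) for h in history_ids]
    none_h_ids = [h for h, m in zip(history_ids, raw) if m is None]
    if none_h_ids:
        logger.warning("Skipping invalid episode_ids %s in set %s", none_h_ids, set_id)
        await storage.mark_messages_ingested(set_id=set_id, history_ids=none_h_ids)
    return [m for m in raw if m is not None]


class SkipInvalidEpisodeIdsTest(unittest.IsolatedAsyncioTestCase):
    async def test_mixed_batch(self):
        # Cases 1-3: warning logged, invalid id marked, valid message kept.
        storage = FakeStorage()
        with self.assertLogs(logger, level="WARNING") as captured:
            processed = await ingest(storage, "set-1", {"e1": "hello"}, ["e1", "gone"])
        self.assertEqual(processed, ["hello"])
        self.assertEqual(storage.ingested, ["gone"])
        self.assertIn("gone", captured.output[0])

    async def test_all_invalid(self):
        # Case 4: nothing valid remains, but every id is still marked.
        storage = FakeStorage()
        with self.assertLogs(logger, level="WARNING"):
            processed = await ingest(storage, "set-1", {}, ["x", "y"])
        self.assertEqual(processed, [])
        self.assertEqual(storage.ingested, ["x", "y"])
```

`assertLogs` fails the test if no warning is emitted, which covers the logging requirement without inspecting handler configuration.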

fix: skip invalid episode_ids in semantic ingestion instead of crashing

b91954f

sscargal requested review from Copilot, malatewang and o-love February 21, 2026 00:00

Copilot started reviewing on behalf of sscargal February 21, 2026 00:00

Copilot AI reviewed Feb 21, 2026

fix: add early return when all episode_ids are invalid to prevent inf…

809a4cf

…inite retry Signed-off-by: haosenwang1018 <haosenwang1018@users.noreply.github.com>

haosenwang1018 wants to merge 2 commits into MemMachine:main from haosenwang1018:fix/graceful-invalid-episode-ids

+            await self._semantic_storage.mark_messages_ingested(
+                set_id=set_id,
+                history_ids=none_h_ids,
+            )
