- 2025/06/05: 🎉 We release our paper and codebase.
- 2025/09/26: 🎉 We update our paper and code.
- 2025/11/26: 🎉 Our paper was accepted by ICLR 2026.
Knowledgeable-R1 is a reinforcement-learning framework that explicitly trains large language models to use parametric knowledge (PK) to resist contextual interference while still exploiting external context when it is reliably helpful. It introduces a joint sampling scheme that generates paired responses with and without retrieval, and learns both local advantages (within each decoding regime) and global advantages under the same input to quantify when to ignore misleading context and when to adopt it. An asymmetric advantage transformation amplifies exploratory behaviors toward parametric knowledge. Experiments show that Knowledgeable-R1 significantly improves robustness and reasoning accuracy in both knowledge-conflict and general RAG scenarios, outperforming SOTA baselines by 23% in counterfactual scenarios without degradation when the retrieved context is fully accurate.
🎯 Key Benefits:
- No additional cost — only the rollout strategy and the RL objective are modified
- Easy to adopt — no additional components or complex multi-prompt pipelines are required in application
- Superior generalization — Knowledgeable-R1 significantly enhances robustness and reasoning accuracy on both parametric–contextual knowledge-conflict tasks and general RAG tasks
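The joint sampling and advantage scheme described above can be sketched in a few lines. This is an illustrative simplification, not the repository's actual code: the function names, the way local and global advantages are combined, and the `beta` amplification factor are all assumptions made for exposition.

```python
import numpy as np

def group_advantages(rewards):
    # GRPO-style local advantage: normalize rewards within one rollout group
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-6)

def knowledgeable_advantages(rewards_with_ctx, rewards_no_ctx, beta=1.5):
    """Illustrative sketch of paired-rollout advantage estimation.

    rewards_with_ctx: rewards of rollouts conditioned on retrieved context
    rewards_no_ctx:   rewards of rollouts from parametric knowledge only
    beta:             assumed amplification factor (>= 1) for PK exploration
    """
    # Local advantages, computed within each decoding regime separately
    adv_ctx = group_advantages(rewards_with_ctx)
    adv_pk = group_advantages(rewards_no_ctx)

    # Global advantages, normalized over the union of both regimes so the
    # two regimes can be compared under the same input
    all_r = np.concatenate([rewards_with_ctx, rewards_no_ctx]).astype(float)
    global_adv = (all_r - all_r.mean()) / (all_r.std() + 1e-6)
    g_ctx = global_adv[: len(rewards_with_ctx)]
    g_pk = global_adv[len(rewards_with_ctx):]

    # Asymmetric transformation: amplify only the positive advantages of the
    # parametric-knowledge rollouts, encouraging exploration toward PK
    g_pk = np.where(g_pk > 0, beta * g_pk, g_pk)

    return adv_ctx + g_ctx, adv_pk + g_pk
```

Intuitively, when a context-free rollout answers correctly while the retrieved context misleads, its amplified positive advantage pushes the policy to trust parametric knowledge; when the context helps, the context-conditioned rollouts earn the larger advantages instead.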
The runtime environment is specified in requirements.txt:

```bash
conda create -n knowledgeable-r1 python=3.11 -y && conda activate knowledgeable-r1
pip install -r requirements.txt
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.3/flash_attn-2.7.3+cu12torch2.5cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
```

Download all datasets through this link and unzip them under the knowledgeable-R1 folder.
Run the following commands:

```bash
# Training
bash training_scripts/qwen2_5_7b_knowledge_confiqa_mc_grpo.sh
bash training_scripts/qwen2_5_7b_knowledge_confiqa_mc_knowledgeable_r1.sh

# Evaluation
bash eval_query_only.sh      # for query only
bash eval_query_with_rag.sh  # for RAG
bash eval_ours.sh
python get_metric.py
```

Our evaluation results can be found in this link.
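The metric step presumably aggregates per-question correctness over the model outputs. A minimal sketch of a normalized exact-match accuracy, a common choice for QA benchmarks like ConFiQA and HotpotQA, is below; the normalization rules and data layout are assumptions, not the repository's actual `get_metric.py`.

```python
import re
import string

def normalize(text):
    # lowercase, drop punctuation and English articles, collapse whitespace
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold_answers):
    # 1.0 if the normalized prediction equals any normalized gold answer
    return float(normalize(prediction) in {normalize(g) for g in gold_answers})

# hypothetical predictions and gold answers keyed by question id
preds = {"q1": "The Eiffel Tower", "q2": "Paris"}
gold = {"q1": ["Eiffel Tower"], "q2": ["London"]}
accuracy = sum(exact_match(preds[q], gold[q]) for q in preds) / len(preds)
```

This style of normalization ("the Eiffel Tower" matching "Eiffel Tower") avoids penalizing surface-form differences while still scoring knowledge-conflict answers strictly.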
If you find our work useful for your research, please consider citing:
```bibtex
@misc{lin2025resistingcontextualinterferencerag,
    title={Resisting Contextual Interference in RAG via Parametric-Knowledge Reinforcement},
    author={Chenyu Lin and Yilin Wen and Du Su and Hexiang Tan and Fei Sun and Muhan Chen and Chenfu Bao and Zhonghou Lyu},
    year={2025},
    eprint={2506.05154},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2506.05154},
}
```

- The training code is built on EasyR1, and the evaluation suite employs vLLM for acceleration.
- The base models are Qwen2.5-7B-Instruct, Llama-3.1-8B-Instruct, Qwen2.5-3B-Instruct, and Qwen2.5-14B-Instruct.
- The original training datasets are from ConFiQA and HotpotQA.
