🌐 Website: vectorinstitute.github.io/Factual-Preference-Alignment | 📄 Paper: arxiv.org/abs/2601.03027 | 📊 Dataset: Hugging Face
Factuality-aware Direct Preference Optimization is a research and engineering framework for studying and improving factual alignment in preference-optimized Large Language Models (LLMs).
The project introduces F-DPO, a factuality-aware extension of Direct Preference Optimization (DPO) that incorporates:
- Explicit factuality supervision
- Synthetic hallucination inversion
- Margin-based factual penalties
The repository provides end-to-end infrastructure for:
- Dataset construction
- Multi-model preference fine-tuning
- Automated factuality evaluation
All components are config-driven, reproducible, and aligned with the Vector Institute AI Engineering Template.
- 📌 Binary factuality supervision integrated into preference learning
- 🧪 Synthetic hallucination inversion pairs
- 📉 Δ-margin factual penalties for controllable hallucination suppression
- ⚙️ Fully config-driven data, training, and evaluation pipelines
- 📊 Multi-model × multi-Δ benchmarking at scale
aixpert/
│
├── src/aixpert/
│   ├── config/              # Central config.yaml
│   ├── data_construction/   # 8-stage factual dataset pipeline
│   ├── training/            # Original-DPO & F-DPO training
│   ├── evaluation/          # GPT-4o-mini judge evaluation
│   └── utils/               # Shared helpers
│
├── README.md
└── pyproject.toml
Standard DPO aligns models to human preferences, but does not explicitly discourage hallucinated yet preferred responses.
F-DPO introduces a factuality-aware margin:
- Each preference tuple includes factuality indicators (h_w, h_l)
- A penalty λ is applied when the preferred response is less factual
- Optimization pressure shifts toward factually correct preferences

➡️ Result: Lower hallucination rates without sacrificing preference alignment
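The margin idea above can be sketched in a few lines. This is an illustrative reading of F-DPO, not the paper's exact loss: the function name, the `beta`/`delta`/`lam` values, and the way the penalty enters the sigmoid are all assumptions for exposition.

```python
import math

def logsigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def fdpo_loss(pairs, beta=0.1, delta=10.0, lam=1.0):
    """Sketch of a factuality-aware DPO loss over a batch of pairs.

    Each pair is (logratio_w, logratio_l, h_w, h_l), where logratio_* are
    policy-vs-reference log-prob ratios (as in standard DPO) and h_* are
    binary factuality indicators (1 = factual, 0 = hallucinated).
    """
    total = 0.0
    for lr_w, lr_l, h_w, h_l in pairs:
        margin = beta * (lr_w - lr_l)            # standard DPO preference margin
        # Extra Δ penalty when the preferred answer is less factual (h_w < h_l):
        # the pair must clear a larger margin, shifting pressure toward factual outputs.
        penalty = lam * delta * max(h_l - h_w, 0)
        total += -logsigmoid(margin - penalty)
    return total / len(pairs)

# A pair whose chosen response is hallucinated incurs a much larger loss
# than the same pair with a factual chosen response.
clean = fdpo_loss([(2.0, 0.5, 1, 1)])
flagged = fdpo_loss([(2.0, 0.5, 0, 1)])
```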
This repository contains a complete eight-stage pipeline for converting the Skywork Reward-Preference-80K dataset into balanced, factuality-aware DPO datasets.
| Stage | Description |
|---|---|
| 1 | Skywork extraction & de-duplication |
| 2 | Preference pair conversion |
| 3 | Binary factuality scoring (GPT-4o-mini) |
| 4 | Canonical DPO transformation |
| 5 | Synthetic hallucination generation |
| 6 | Dataset merging |
| 7 | Balanced bucket construction |
| 8 | Optional preference flipping |
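The later stages (canonical DPO transformation and optional preference flipping) can be sketched as a simple record transform. The field names here are illustrative assumptions, not the pipeline's actual schema:

```python
def to_factual_dpo_record(pair, flip=False):
    """Hypothetical sketch of stages 4 and 8: canonical DPO transformation
    plus optional preference flipping. Field names are illustrative.

    `pair` carries a prompt, two responses, and the binary factuality labels
    produced by the GPT-4o-mini scoring stage.
    """
    record = {
        "prompt": pair["prompt"],
        "chosen": pair["chosen"],
        "rejected": pair["rejected"],
        "h_w": pair["chosen_factual"],    # 1 = factual, 0 = hallucinated
        "h_l": pair["rejected_factual"],
    }
    # Optional flipping: when the rejected answer is the factual one,
    # swap the pair so optimization favors the factual response.
    if flip and record["h_l"] > record["h_w"]:
        record["chosen"], record["rejected"] = record["rejected"], record["chosen"]
        record["h_w"], record["h_l"] = record["h_l"], record["h_w"]
    return record
```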
All paths and parameters are defined in:
src/aixpert/config/config.yaml
Every component (datasets, models, hyperparameters, outputs, and evaluation) is controlled via:
src/aixpert/config/config.yaml
Loaded using:
from utils.config_loader import load_config
cfg = load_config()

This enables:
- Full reproducibility
- Multi-model automation
- Zero hard-coded paths
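For a sense of what the config might look like, here is a minimal illustrative fragment. These keys are assumptions for exposition, not the repository's real schema; consult `src/aixpert/config/config.yaml` for the actual layout.

```yaml
# Illustrative config.yaml fragment (keys are assumptions, not the real schema)
models:
  gemma2-9b: google/gemma-2-9b-it
training:
  beta: 0.1
  deltas: [1, 5, 10]
  output_dir: outputs/
evaluation:
  judge_model: gpt-4o-mini
  hallucination_threshold: 0.5
```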
python -m aixpert.training.run_dpo_training \
  --model "google/gemma-2-9b-it"

Trains standard DPO using Skywork preferences.
python -m aixpert.training.run_factual_training \
--model_id "google/gemma-2-9b-it" \
--short "gemma2-9b" \
  --delta 10

Each Δ value produces a separate fine-tuned model.
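A multi-model × multi-Δ sweep then amounts to one such run per (model, Δ) combination. A minimal sketch of building those command lines, with illustrative model and Δ choices:

```python
import shlex

def build_fdpo_commands(models, deltas):
    """Sketch of a multi-model × multi-Δ sweep: one F-DPO training run per
    (model_id, Δ) combination. The model list and Δ values are illustrative."""
    cmds = []
    for model_id, short in models:
        for delta in deltas:
            cmds.append(
                "python -m aixpert.training.run_factual_training "
                f"--model_id {shlex.quote(model_id)} --short {short} --delta {delta}"
            )
    return cmds

# One model swept over three margins -> three separate fine-tuned checkpoints.
commands = build_fdpo_commands(
    [("google/gemma-2-9b-it", "gemma2-9b")], deltas=[1, 5, 10]
)
```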
Evaluation is performed using GPT-4o-mini as an LLM-as-a-Judge.
| Metric | Meaning |
|---|---|
| factuality | Mean factual score |
| halluc_rate | % outputs below threshold |
| win_rate | Δ-model vs. baseline |
| count | Prompts evaluated |
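The metrics above can be derived from per-prompt judge scores in a few lines. This sketch assumes scores in [0, 1] and a hallucination threshold of 0.5; both conventions are assumptions, not the repository's documented behavior:

```python
def summarize_judge_scores(scores, threshold=0.5, baseline_wins=None):
    """Sketch of computing the judge metrics from per-prompt factuality
    scores in [0, 1]. The threshold and score range are assumptions."""
    n = len(scores)
    factuality = sum(scores) / n                           # mean factual score
    halluc_rate = sum(s < threshold for s in scores) / n   # share below threshold
    summary = {"factuality": factuality, "halluc_rate": halluc_rate, "count": n}
    if baseline_wins is not None:                          # Δ-model vs. baseline
        summary["win_rate"] = sum(baseline_wins) / len(baseline_wins)
    return summary

# Three judged prompts, one scoring below the threshold.
result = summarize_judge_scores([0.9, 0.2, 0.8], baseline_wins=[1, 0, 1, 1])
```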
Run evaluation:
python -m aixpert.evaluation.evaluations.run_all_evaluations

Outputs:
eval_results.json
- Gemma-2 (2B, 9B)
- Qwen-2.5 / Qwen-3
- LLaMA-3.x
- Any TRL-compatible causal LLM
Models are registered centrally in config.yaml.
- Hugging Face TRL: DPO reference implementation
- Unsloth: QLoRA optimization
- BitsAndBytes: 4-bit quantization
- Flash-Attention-2: efficient attention kernels
- Weights & Biases: experiment tracking
- Accelerate: multi-GPU orchestration
This project builds upon and extends the Skywork Reward-Preference-80K dataset.
We do not claim ownership of the Skywork dataset. All credit belongs to the original authors.
If you use this repository, please cite Skywork:
@article{liu2024skywork,
  title={Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs},
  author={Liu, Chris Yuhao and Zeng, Liang and Liu, Jiacai and Yan, Rui and He, Jujie and Wang, Chaojie and Yan, Shuicheng and Liu, Yang and Zhou, Yahui},
  journal={arXiv preprint arXiv:2410.18451},
  year={2024}
}

For dataset-related concerns, please contact the Skywork authors via their paper or Hugging Face repository.
If you find this code or dataset useful for your research, please consider citing:
@article{FactualAlignment2026,
  title={Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning},
  author={Chaduvula, Sindhuja and Radwan, Ahmed and Farooq, Azib and Ioannou, Yani and Raza, Shaina},
  journal={arXiv preprint arXiv:2601.03027},
  year={2026}
}

For questions, collaborations, or issues:
- Open a GitHub Issue
- Or contact the maintainers via the Vector Institute
⚡ Factuality-aware Direct Preference Optimization reduces hallucinations and increases factuality.
Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute. This research was funded by the European Unionโs Horizon Europe research and innovation programme under the AIXPERT project (Grant Agreement No. 101214389), which aims to develop an agentic, multi-layered, GenAI-powered framework for creating explainable, accountable, and transparent AI systems.
We invite researchers and practitioners to build upon this framework.
