Tag: alignment
6 verified claims carrying this tag. Each has 2+ primary sources and an HMAC-SHA256 signature.
Reinforcement Learning from Human Feedback (RLHF) introduced in paper: Deep Reinforcement Learning from Human Preferences (Christiano et al., 2017).
67866330cd60e54d · 3 sources · 100% confidence
Direct Preference Optimization (DPO) introduced in paper: Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafailov et al., 2023).
a3e691683a4577af · 2 sources · 100% confidence
Constitutional AI (CAI) introduced in paper: Constitutional AI: Harmlessness from AI Feedback (Bai et al., 2022).
ba1eb83c14795107 · 2 sources · 100% confidence
InstructGPT methodology introduced in paper: Training language models to follow instructions with human feedback (Ouyang et al., 2022).
5da8f8dffc038b8e · 2 sources · 100% confidence
InstructGPT introduced in Ouyang et al. 2022: an RLHF-tuned GPT-3 and the direct ancestor of ChatGPT.
590b9de765b8126e · 2 sources · 100% confidence
Anthropic helpful and harmless assistant training introduced in paper: Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (Bai et al., 2022).
6fa575eb9df5ac32 · 2 sources · 100% confidence
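The 16-hex-character IDs above are consistent with an HMAC-SHA256 digest truncated to 64 bits. A minimal sketch of how such a signature could be computed, assuming the claim text is the signed message, the key shown is hypothetical (the real signing key is not part of this listing), and the truncation length is inferred from the ID width:

```python
import hmac
import hashlib

def sign_claim(claim_text: str, key: bytes) -> str:
    """Return a truncated HMAC-SHA256 signature over a claim's text.

    Assumption: the record IDs are the first 16 hex chars (64 bits)
    of the full 256-bit digest.
    """
    digest = hmac.new(key, claim_text.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]

# Hypothetical key for illustration only; this will not reproduce
# the signatures listed above.
key = b"example-signing-key"
sig = sign_claim(
    "Reinforcement Learning from Human Feedback (RLHF) introduced in paper: "
    "Deep Reinforcement Learning from Human Preferences (Christiano et al., 2017).",
    key,
)
print(sig)  # 16 lowercase hex characters, deterministic for a fixed key
```

Because HMAC is deterministic for a fixed key, re-signing an unchanged claim yields the same ID, so any edit to a claim's text is detectable by recomputing and comparing signatures.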