Direct Language Model Alignment from Online AI Feedback

OAIF

This post briefly summarizes the paper Direct Language Model Alignment from Online AI Feedback (Guo et al., arXiv 2024), which I read for study and out of curiosity.
Tags: LLM, Feedback, Reward

Self-Rewarding Language Models

Self-Rewarding

This post briefly summarizes the paper Self-Rewarding Language Models (Yuan et al., arXiv 2024), which I read for study and out of curiosity.
Tags: LLM, Feedback, Reward