Scaling Laws for Neural Language Models: An Empirical Analysis of Model Performance Across Compute Budgets and Dataset Sizes
We study how performance scales empirically with model size, dataset size, and compute budget. Our findings suggest that performance scales as a power law in model parameter count (a fitting sketch follows this entry).
Oct 24, 2023
98.2 Heat
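The power-law claim is easy to make concrete. The sketch below fits the form L(N) = (N_c / N)^alpha by linear regression in log-log space; the (parameter count, loss) pairs and the resulting coefficients are hypothetical placeholders, not figures from the paper.

import numpy as np

# Hypothetical (parameter count, validation loss) pairs; the paper fits
# curves of the form L(N) = (N_c / N)^alpha across many orders of magnitude.
params = np.array([1e6, 1e7, 1e8, 1e9, 1e10])  # model sizes N
loss = np.array([5.1, 4.2, 3.5, 2.9, 2.4])     # cross-entropy loss L(N)

# A power law is linear in log space: log L = alpha*log(N_c) - alpha*log(N),
# so the slope of a log-log fit recovers -alpha.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
alpha = -slope
N_c = np.exp(intercept / alpha)

print(f"alpha ~ {alpha:.3f}, N_c ~ {N_c:.3e}")
print(f"extrapolated L at 1e11 params: {(N_c / 1e11) ** alpha:.2f}")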
ICCV 2023 · Alexander Kirillov · Meta AI Research
Segment Anything Model (SAM): Foundations for General Computer Vision and Zero-shot Transfer
We introduce a new task, model, and dataset for image segmentation. Using our efficient model in a data-collection loop, we built the largest segmentation dataset to date (a usage sketch follows this entry).
Oct 23, 2023
85.4 Heat
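For readers who want to try the model, here is a minimal prompt-based usage sketch built on Meta's released segment-anything package; the checkpoint filename, image path, and point prompt are placeholders, and the calls follow the project's published predictor interface.

import numpy as np
import cv2
from segment_anything import SamPredictor, sam_model_registry

# Load a pretrained SAM backbone (checkpoint path is a placeholder).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Embed the image once; point/box prompts can then be issued cheaply.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground point prompt at pixel (x=500, y=375); label 1 = foreground.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several candidate masks for ambiguous prompts
)
print(masks.shape, scores)  # boolean masks plus predicted mask quality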
NeurIPS 2023 · Rafael Rafailov · Stanford University
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
We present Direct Preference Optimization (DPO), a stable, performant, and computationally lightweight algorithm for fine-tuning large language models.
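Because DPO reduces preference fine-tuning to a classification-style loss over preference pairs, the objective fits in a few lines. The sketch below writes the standard DPO loss in PyTorch; the function name, the toy log-probabilities, and beta=0.1 are illustrative assumptions, not the authors' reference code.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # DPO objective: -log sigmoid(beta * (policy margin - reference margin)),
    # where each input is the summed log-prob of a full response, shape [batch].
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy batch of summed response log-probabilities (illustrative values).
loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.3, -10.1, -14.2, -9.8]),
    policy_rejected_logps=torch.tensor([-13.0, -11.5, -13.9, -12.0]),
    ref_chosen_logps=torch.tensor([-12.5, -10.4, -14.0, -10.2]),
    ref_rejected_logps=torch.tensor([-12.8, -11.0, -14.1, -11.5]),
)
print(loss)

Here beta acts as the temperature on the implicit reward: larger values penalize deviation from the reference model's preference margin more sharply.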