-
Value Augmented Sampling for Language Model Alignment and Personalization
Value Augmented Sampling
-
Deriving Muon
Muon
-
Curiosity-driven Red-teaming for Large Language Models
Curiosity Driven Red Teaming
-
Guided Speculative Inference for Efficient Test-Time Alignment of LLMs
Harvard Guided Speculative Decoding
-
Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search
Stratego