-
Language Models use Lookbacks to Track Beliefs
lookback mechanisms
-
Fast KV Compaction via Attention Matching
Zweiger KV compaction
-
BRIDGE: Predicting Human Task Completion Time From Model Performance
predicting human task completion time from IRT difficulty
-
HunyuanVideo: A Systematic Framework For Large Video Generative Models
HunyuanVideo
-
Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
activation patching and attention maps in MM-DiTs