that they’re bad at reasoning
Abstraction & Reasoning Challenge
By [ian]@ianozsvald[.com] Ian Ozsvald
- $1M prize if an LLM (or other approach) can solve these challenges
- Abstract shapes, “initial → target”, in JSON
- Open-weights models only (runs in an offline environment)
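To make the “initial → target” JSON concrete, here is a minimal sketch of reading one task. It assumes the public ARC layout: "train" and "test" lists of pairs whose "input" and "output" grids are lists of rows of small integers. The miniature task and the helper names (`load_task`, `grid_shape`) are invented for illustration and are not taken from the talk.

```python
import json

# Hypothetical miniature task, written in the assumed ARC layout.
# Each grid is a list of rows; each cell is an integer colour code.
TASK_JSON = """
{
  "train": [
    {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
    {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]}
  ],
  "test": [
    {"input": [[3, 0], [0, 3]]}
  ]
}
"""


def load_task(text):
    """Parse a task and return (train_pairs, test_items)."""
    task = json.loads(text)
    return task["train"], task["test"]


def grid_shape(grid):
    """Return (rows, columns) for a rectangular grid."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    return rows, cols


train_pairs, test_items = load_task(TASK_JSON)
for pair in train_pairs:
    # Each training pair shows an initial grid and its target grid.
    print(grid_shape(pair["input"]), "->", grid_shape(pair["output"]))
```

The test item here carries only an "input" grid; producing the matching target grid is the job of the solver.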
First solution
- Ask for 200 solutions
- Try grid, list, grid+list representations
- Grid only: poor. List: better. Grid+list: slightly better
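The three prompt representations could look something like the sketch below. The slide does not say exactly how the grid or list text was laid out, so everything here is an assumption: the grid form is rows of space-separated digits, the list form is one line per non-background cell, and the function names are hypothetical.

```python
def as_grid_text(grid):
    """Render a grid as rows of space-separated cell values."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)


def as_cell_list(grid, background=0):
    """Render a grid as one '(row,col)=value' line per non-background cell."""
    cells = [
        (r, c, value)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value != background
    ]
    return "\n".join(f"({r},{c})={value}" for r, c, value in cells)


def as_grid_plus_list(grid):
    """Combine both renderings into one prompt fragment."""
    return f"Grid:\n{as_grid_text(grid)}\nCells:\n{as_cell_list(grid)}"


# A 3x3 example grid containing a plus shape.
example = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]

print(as_grid_plus_list(example))
```

The combined form repeats the same information twice, once spatially and once as explicit coordinates, which is one plausible reading of why grid+list did slightly better than either alone.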
Summary
- 3x3 “train” problem
- Very fast, runs on 3090 (24GB VRAM)
- Do you use Llama 3? Alpaca? RoPE?
- Do you have text correctness metrics?