Prompt: Tell the story of a CNRS researcher presenting generative AI to a large audience in a funny way.

Dr. Martin Lefèvre, the kind of CNRS researcher who refers to debugging as “therapy,” was scheduled to present generative AI to a huge audience: hundreds of people, two drones for some reason, and a guy in the front row eating chips directly out of his backpack. Martin strolled on stage, slightly sweaty, holding a USB stick like it was the One Ring. “Ladies and gentlemen,” he began, “generative AI is revolutionizing research, art, and occasionally my grocery list.” …

Pre-training: next token prediction. Generation: auto-regressive.

Prompt: Depict a CNRS mathematics researcher presenting generative AI in front of a large audience. (DALL·E 2)

Pre-training: denoising. Generation: dynamic transport.
Prompt: […] of the arithmetic sequence 6, 10, 14, 18, ...?
Answer: 412

Chain of thoughts:
Pattern: each term + 4
Rule: $a_n = 6 + (n-1)\cdot 4$
Answer: $n = 100$, $6 + 99 \cdot 4 = 402$

Prompt: Tell the story of a CNRS researcher […] in a funny way.

Training: small texts, next token prediction, reinforcement learning.
Inference: very long prompts, chain of thoughts, math reasoning.
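The rule used in the chain-of-thought answer can be checked mechanically. The snippet below is a minimal sketch; the function name `arithmetic_term` is a hypothetical helper chosen for illustration, not something from the slides.

```python
def arithmetic_term(first, step, n):
    """Return the n-th term (1-indexed) of an arithmetic sequence: first + (n - 1) * step."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return first + (n - 1) * step


# The sequence from the slide: first term 6, common difference 4.
first_terms = [arithmetic_term(6, 4, n) for n in range(1, 5)]
hundredth = arithmetic_term(6, 4, 100)

print(first_terms)  # [6, 10, 14, 18]
print(hundredth)    # 402, matching the chain-of-thought answer (not 412)
```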
Prompt: Tell the story of a CNRS researcher presenting generative AI to a large audience in a funny way.

Tokenize → token encoding + positional encoding → […] cloud $\{x_i\}_{i=1}^n$ (tokens $x_1, \dots, x_n$).

(Unmasked) Attention layer:

$\tilde{x}_i := \sum_j \frac{e^{\langle Q x_i, K x_j \rangle}}{\sum_\ell e^{\langle Q x_i, K x_\ell \rangle}} \, V x_j$

Architecture: $T \times$ (Attention, Norm, MLP) … Classif … next token probabilities.

Arbitrary number of tokens: $n \to +\infty$. Arbitrary number of layers: $T \to +\infty$.
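The (unmasked) attention formula above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: a single head, no normalization or MLP, and tiny hand-picked matrices. The helper names (`matvec`, `dot`, `attention`) and the example values are hypothetical.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def attention(X, Q, K, V):
    """Unmasked single-head attention: each token x_i becomes a
    softmax-weighted average of the values V x_j over all tokens j."""
    queries = [matvec(Q, x) for x in X]
    keys = [matvec(K, x) for x in X]
    values = [matvec(V, x) for x in X]
    out = []
    for q in queries:
        scores = [dot(q, k) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[c] for w, v in zip(weights, values)) / total
            for c in range(len(values[0]))
        ])
    return out


# Toy example (assumed values): three 2-dimensional tokens, identity matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, identity, identity, identity)
print(mixed)
```

Because the sum runs over all tokens $j$ with no mask, permuting the input tokens simply permutes the outputs, which is what makes it natural to view the tokens as an unordered cloud.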
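Optimal Transport (Wasserstein) Distance

$W_2(\mu, \nu)^2 := \min_T \sum_{i=1}^n \| x_i - y_{T(i)} \|^2$

Here $T$ assigns each point $x_i$ to a point $y_{T(i)}$, and moving $x_i$ to $y_j$ costs $\| x_i - y_j \|^2$. (Monge 1784)

General measures: approximation by discrete measures, or Kantorovitch relaxation (Kantorovitch 1942).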
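For two point clouds with the same number of points, the minimum over assignments $T$ can be computed by brute force over all permutations. The sketch below does exactly that; it is only meant to make the definition concrete (the cost grows like $n!$, so it is not a practical solver), and the names `sq_dist` and `monge_cost` as well as the example points are assumptions made for illustration.

```python
from itertools import permutations


def sq_dist(x, y):
    """Squared Euclidean distance ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def monge_cost(xs, ys):
    """Return (min_T sum_i ||x_i - y_T(i)||^2, optimal T) by enumerating
    every permutation T of {0, ..., n-1}. Only feasible for small n."""
    n = len(xs)
    if len(ys) != n:
        raise ValueError("both clouds must contain the same number of points")
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        cost = sum(sq_dist(xs[i], ys[perm[i]]) for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Toy clouds (assumed values): ys is xs shifted up by 1 and shuffled.
xs = [(0, 0), (1, 0), (2, 0)]
ys = [(2, 1), (0, 1), (1, 1)]
cost, assignment = monge_cost(xs, ys)
print(cost, assignment)  # 3 (1, 2, 0): each x_i goes to the point directly above it
```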
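Theorem [Furuya, de Hoop, Peyré]: Let $\Gamma^\star : \mathcal{P}(\Omega) \times \Omega \to \mathbb{R}^d$ be $\mathrm{Wass}_2 \times \ell^2$-continuous on a compact $\Omega \subset \mathbb{R}^d$. For any $\varepsilon > 0$ there exist $N$ and $(\theta_1, \dots, \theta_N)$ such that

$\forall (\mu, x) \in \mathcal{P}(\Omega) \times \Omega, \quad | \Gamma^\star[\mu](x) - \Gamma_{\theta_N} \diamond \cdots \diamond \Gamma_{\theta_1}[\mu](x) | \le \varepsilon$

with token dimensions $\le 4d$ and $H \le d$, where each layer is either

$\Gamma_\theta[\mu](x) := \mathrm{MLP}_\theta(x)$

or

$\Gamma_\theta[\mu](x) := \int \frac{e^{\langle Q^h x, K^h y \rangle}}{\int e^{\langle Q^h x, K^h y' \rangle} \, d\mu(y')} \, V^h y \, d\mu(y).$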
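When $\mu$ is a discrete measure $\sum_j a_j \delta_{y_j}$, the integrals in the attention map reduce to weighted sums. The sketch below evaluates that single-head integral for such a measure. It is an illustration under stated assumptions, not the construction used in the theorem: one head only, no MLP, and hypothetical names (`measure_attention`, `matvec`, `dot`) and example values.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def measure_attention(x, points, weights, Q, K, V):
    """Evaluate the attention integral at x for the discrete measure
    mu = sum_j weights[j] * delta_{points[j]} (single head)."""
    q = matvec(Q, x)
    scores = [dot(q, matvec(K, y)) for y in points]
    top = max(scores)  # subtract the max for numerical stability
    kernel = [a * math.exp(s - top) for a, s in zip(weights, scores)]
    norm = sum(kernel)  # discrete version of the inner integral over y'
    values = [matvec(V, y) for y in points]
    return [
        sum(k * v[c] for k, v in zip(kernel, values)) / norm
        for c in range(len(values[0]))
    ]


# Toy check (assumed values): a repeated token is the same as a heavier atom.
identity = [[1.0, 0.0], [0.0, 1.0]]
a, b = [1.0, 0.0], [0.0, 1.0]
x = [0.5, 0.5]
repeated = measure_attention(x, [a, a, b], [1 / 3, 1 / 3, 1 / 3], identity, identity, identity)
merged = measure_attention(x, [a, b], [2 / 3, 1 / 3], identity, identity, identity)
print(repeated, merged)
```

The output depends only on the measure $\mu$ and the query point $x$, not on how many atoms are used to represent $\mu$, which is why the map is defined for an arbitrary number of tokens.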