• [Gu+, NeurIPS20]: Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Ré. "HiPPO: Recurrent Memory with Optimal Polynomial Projections". NeurIPS, 2020.
• [Gu+, NeurIPS21]: Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, and Christopher Ré. "Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers". NeurIPS, 2021.
• [Gu+, ICLR22]: Albert Gu, Karan Goel, and Christopher Ré. "Efficiently Modeling Long Sequences with Structured State Spaces". ICLR, 2022.
• [Gu+, NeurIPS22]: Albert Gu, Ankit Gupta, Karan Goel, and Christopher Ré. "On the Parameterization and Initialization of Diagonal State Space Models". NeurIPS, 2022.
• [Ma+, ICLR23]: Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, and Luke Zettlemoyer. "Mega: Moving Average Equipped Gated Attention". ICLR, 2023.
• [Smith+, ICLR23]: Jimmy T.H. Smith, Andrew Warrington, and Scott Linderman. "Simplified State Space Layers for Sequence Modeling". ICLR, 2023.
• [Mehta+, ICLR23]: Harsh Mehta, Ankit Gupta, Ashok Cutkosky, and Behnam Neyshabur. "Long Range Language Modeling via Gated State Spaces". ICLR, 2023.
• [Tay+, ICLR20]: Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, and Donald Metzler. "Long Range Arena: A Benchmark for Efficient Transformers". ICLR, 2021.