We invented liquid neural networks, a class of brain-inspired systems that stay adaptable and robust to change even after training [R. Hasani, PhD Thesis] [Lechner et al. Nature MI, 2020] [pdf] (2016-2020). We then showed, analytically and experimentally, that they are universal approximators [Hasani et al. AAAI, 2021], expressive continuous-time machine learning systems for sequential data [Hasani et al. AAAI, 2021] [Hasani et al. Nature MI, 2022], parameter-efficient in learning new skills [Lechner et al. Nature MI, 2020] [pdf], causal and interpretable [Vorbach et al. NeurIPS, 2021] [Chahine et al. Science Robotics 2023] [pdf], and, when linearized, able to efficiently model very long-term dependencies in sequential data [Hasani et al. ICLR 2023].
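For readers who want the core idea in one equation: a liquid time-constant network evolves its hidden state through an ODE whose effective time constant is modulated by the input; schematically (roughly following the notation of Hasani et al., AAAI 2021):

```latex
\frac{d\mathbf{x}(t)}{dt} =
  -\left[\frac{1}{\tau} + f\big(\mathbf{x}(t), \mathbf{I}(t), t, \theta\big)\right] \odot \mathbf{x}(t)
  + f\big(\mathbf{x}(t), \mathbf{I}(t), t, \theta\big) \odot \mathbf{A}
```

Here f is a small learnable network, τ a base time constant, A a bias vector, and ⊙ elementwise multiplication. Because f depends on the input I(t), the effective time constant of the system varies with the data, which is the "liquid" behavior referred to above.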
In addition, we developed classes of nonlinear neural differential equation sequence models [Massaroli et al. NeurIPS 2021] and generalized them to graphs [Poli et al. DLGMA 2020]. We scaled and optimized continuous-time models using hybrid numerical methods [Poli et al. NeurIPS 2020] and parallel-in-time schemes [Massaroli et al. NeurIPS 2020], and achieved state-of-the-art results in control and forecasting tasks [Massaroli et al. SIAM Journal] [Poli et al. NeurIPS 2021] [Massaroli et al. IEEE Control Systems Letters]. The team also released one of the most comprehensive open-source libraries for neural differential equations [Poli et al. 2021 TorchDyn], used today in applications ranging from diffusion-based generative modeling to prediction.
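As a minimal sketch of what a neural differential equation model is, the snippet below parameterizes the right-hand side of an ODE with a small network and integrates it with a fixed-step Euler loop. Both `VectorField` and `odeint_euler` are illustrative names of ours; libraries such as TorchDyn replace the toy integrator with adaptive solvers and adjoint-based sensitivities.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Learnable right-hand side f(x) of dx/dt = f(x)."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, x):
        return self.net(x)

def odeint_euler(f, x0, t0=0.0, t1=1.0, steps=20):
    """Integrate dx/dt = f(x) from t0 to t1 with explicit fixed-step Euler."""
    x, dt = x0, (t1 - t0) / steps
    for _ in range(steps):
        x = x + dt * f(x)
    return x

# Usage: map a batch of inputs through the learned flow; the result is
# differentiable end to end, so f can be trained like any other layer.
f = VectorField(dim=8)
x0 = torch.randn(32, 8)
x1 = odeint_euler(f, x0)
```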
We proposed the first efficient parallel-scan-based linear state-space architecture [Smith et al. ICLR 2023] and state-of-the-art time series state-space models based on rational functions [Parnichkun et al. ICML 2024]. We also introduced the first generative state-space architectures for time series [Zhou et al. ICML 2023] and state-space architectures for videos [Smith et al. NeurIPS 2024].
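To make "parallel-scan-based linear state space" concrete: the diagonal recurrence x_t = a_t ⊙ x_{t-1} + b_t can be expressed with an associative operator, so all T states can be computed with a parallel prefix scan rather than a sequential loop. The sketch below (plain PyTorch; `ssm_combine` and `linear_ssm_scan` are illustrative names of ours) shows the operator and an equivalent left fold; a scan-based SSM evaluates the same operator with a Blelloch-style scan in O(log T) depth.

```python
import torch

def ssm_combine(elem1, elem2):
    """Associative combine for the recurrence x_t = a_t * x_{t-1} + b_t.

    Each element is a pair (a, b) with a diagonal (elementwise) transition a.
    combine((a1, b1), (a2, b2)) = (a2 * a1, a2 * b1 + b2),
    i.e. 'apply elem1 first, then elem2'.
    """
    a1, b1 = elem1
    a2, b2 = elem2
    return a2 * a1, a2 * b1 + b2

def linear_ssm_scan(a, b):
    """Compute every state x_t of x_t = a_t * x_{t-1} + b_t with x_0 = 0.

    Written as a left fold for clarity; because ssm_combine is associative,
    the same prefix results can be computed in parallel, which is what
    scan-based state-space models exploit.  a, b: tensors of shape (T, dim).
    """
    states = []
    acc = (torch.ones_like(a[0]), torch.zeros_like(b[0]))  # identity element
    for t in range(a.shape[0]):
        acc = ssm_combine(acc, (a[t], b[t]))
        states.append(acc[1])
    return torch.stack(states)
```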
We proposed a new framework for neural operators [Poli et al. NeurIPS 2022] that outperforms approaches such as Fourier Neural Operators in solving differential equations and in prediction tasks.
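For context on the baseline named above: a Fourier Neural Operator layer mixes a fixed number of low-frequency Fourier modes of its input with learned complex weights. The sketch below illustrates that baseline in 1D, not the framework proposed in [Poli et al. NeurIPS 2022]; shapes and names are ours for illustration.

```python
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Minimal 1D spectral convolution in the Fourier-neural-operator style:
    FFT the input, mix the lowest `modes` frequencies with learned complex
    weights, zero the rest, and inverse-FFT back to the spatial grid."""
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat)
        )

    def forward(self, x):                     # x: (batch, channels, grid)
        x_ft = torch.fft.rfft(x)              # (batch, channels, grid//2 + 1)
        out_ft = torch.zeros_like(x_ft)
        out_ft[..., :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[..., :self.modes], self.weight
        )
        return torch.fft.irfft(out_ft, n=x.size(-1))
```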
Our team has co-invented deep signal processing architectures such as Hyena [Poli et al. ICML 2023] [Massaroli et al. NeurIPS 2023], HyenaDNA [Nguyen et al. NeurIPS 2023], and StripedHyena, which scale efficiently to long contexts. Evo [Nguyen et al. 2024], based on StripedHyena, is a DNA foundation model that generalizes across DNA, RNA, and proteins and is capable of generative design of new CRISPR systems.
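A rough sketch of the mechanism behind these long-context architectures: replace attention with input-dependent gating wrapped around an FFT-based long convolution, which costs O(T log T) in sequence length T. The block below is a deliberate simplification in PyTorch (explicit learned kernel, a single gate pair); Hyena parameterizes its kernels implicitly and stacks more projections and gates, so treat the names and structure here as illustrative.

```python
import torch
import torch.nn as nn

def fft_long_conv(u, k):
    """Causal long convolution of u (batch, channels, T) with kernel k
    (channels, T), computed in the frequency domain in O(T log T)."""
    T = u.size(-1)
    u_f = torch.fft.rfft(u, n=2 * T)
    k_f = torch.fft.rfft(k, n=2 * T)
    return torch.fft.irfft(u_f * k_f, n=2 * T)[..., :T]

class GatedLongConv(nn.Module):
    """Minimal Hyena-flavoured block: elementwise gates around an FFT long
    convolution with a learned (here explicit) kernel."""
    def __init__(self, channels: int, seq_len: int):
        super().__init__()
        self.proj = nn.Conv1d(channels, 3 * channels, kernel_size=1)
        self.kernel = nn.Parameter(torch.randn(channels, seq_len) * 0.02)

    def forward(self, u):                            # u: (batch, channels, T)
        q, k_gate, v = self.proj(u).chunk(3, dim=1)
        y = fft_long_conv(k_gate * v, self.kernel)   # gate, then long filter
        return q * y                                 # output gate
```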
We were the first to scale language models based on both deep signal processing and state-space layers [link], and we have performed the most extensive scaling-law analysis of beyond-transformer architectures to date [Poli et al. ICML 2024], with new model variants that outperform existing open-source alternatives.
The team is behind many of the best open-source LLM fine-tunes and merges [Maxime Labonne, link].
Last but not least, our team’s research has contributed pioneering work in graph neural networks and geometric deep learning-based models [Lim et al. ICLR 2024], new measures of interpretability in neural networks [Wang et al. CoRL 2023], and state-of-the-art dataset distillation algorithms [Loo et al. ICML 2023].