The Google Research Brain Team Unveils a New Architecture for Computer Vision: MLP-Mixer

Attention-based networks such as the Vision Transformer have recently gained popularity in computer vision. In a post, researchers on the Google Research Brain team introduced MLP-Mixer, an architecture based solely on multilayer perceptrons (MLPs). MLP-Mixer contains two types of layers: one applies MLPs independently to each image patch (i.e. “mixing” the per-location features), and the other applies MLPs across patches (i.e. “mixing” spatial information). When trained on large datasets, MLP-Mixer achieved results competitive with state-of-the-art models. The researchers hope these results will inspire further research beyond well-established CNNs and transformers.
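To make the two layer types concrete, here is a minimal NumPy sketch of a single Mixer-style block. It is an illustration under stated assumptions, not the team's implementation: the residual connections, the layer normalization without learned parameters, the GELU approximation, the weight initialization, and all names and sizes (`mixer_block`, `init_mlp`, 16 patches, 32 channels) are choices made for this example and are not taken from the post.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation (an assumed choice of nonlinearity).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def layer_norm(x, eps=1e-6):
    # Normalize over the last axis; learned scale and offset are omitted for brevity.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron applied along the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2


def init_mlp(rng, dim, hidden):
    # Hypothetical initialization: small random weights, zero biases.
    return (
        rng.normal(0.0, 0.02, (dim, hidden)),
        np.zeros(hidden),
        rng.normal(0.0, 0.02, (hidden, dim)),
        np.zeros(dim),
    )


def mixer_block(x, token_params, channel_params):
    # x has shape (num_patches, channels): one feature vector per image patch.
    # Across-patch MLP ("mixing" spatial information): transpose so the MLP
    # acts along the patch axis, with the same weights for every channel.
    y = layer_norm(x).T  # (channels, num_patches)
    x = x + mlp(y, *token_params).T
    # Per-patch MLP ("mixing" per-location features): the MLP acts along the
    # channel axis, independently for every patch.
    x = x + mlp(layer_norm(x), *channel_params)
    return x


rng = np.random.default_rng(0)
num_patches, channels = 16, 32
x = rng.normal(size=(num_patches, channels))
token_params = init_mlp(rng, num_patches, 64)
channel_params = init_mlp(rng, channels, 128)

out = mixer_block(x, token_params, channel_params)
print(out.shape)
```

The block preserves the input shape, so several such blocks could be stacked. The only operation that lets patches exchange information is the across-patch MLP; the per-patch MLP treats each patch separately, which is why the two layer types are interleaved.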