In the main part of the talk, we will consider the mathematical models used in transformers and review the architectures of transformer networks. Next, we will consider the data sets and types of tasks in which transformer networks are actively used. We will then compare transformer architectures with other deep neural network architectures on the same data sets.
In conclusion, we will put forward a hypothesis about the applicability of transformer networks to certain classes of problems and data sets.
Speaker: Andrey Gavrilov, master's student, MOEVM department, ETU LETI

Registration via link