Transformer models and transfer learning methods continue to propel the field of Natural Language Processing forward at a tremendous pace. However, state-of-the-art performance too often comes at the cost of large amounts of complex code.
Simple Transformers avoids all of that complexity and lets you get down to what matters – training models and experimenting with Transformer architectures. It helps you bypass the complicated setup, boilerplate code, and other general unpleasantness by initializing a model in one line, training it in the next, and evaluating it in the third.
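To make that three-line workflow concrete, here is a minimal sketch of a binary text classifier, assuming `simpletransformers` and `pandas` are installed. The tiny in-memory dataset and the labels are placeholders for illustration only:

```python
import pandas as pd

# Placeholder dataset: [text, label] pairs (0 = negative, 1 = positive)
train_df = pd.DataFrame(
    [["great movie", 1], ["terrible plot", 0]], columns=["text", "labels"]
)
eval_df = pd.DataFrame(
    [["loved it", 1], ["boring film", 0]], columns=["text", "labels"]
)

def run_three_line_workflow(train_df, eval_df):
    # Imported here so the sketch only needs pandas at module level
    from simpletransformers.classification import ClassificationModel

    # 1. Initialize a DistilBERT classifier in one line
    model = ClassificationModel(
        "distilbert", "distilbert-base-uncased", use_cuda=False
    )
    # 2. Train it in the next
    model.train_model(train_df)
    # 3. Evaluate it in the third
    result, model_outputs, wrong_predictions = model.eval_model(eval_df)
    return result
```

Calling `run_three_line_workflow(train_df, eval_df)` downloads the pretrained DistilBERT weights on first use, so in practice you would point it at a real dataset rather than this toy one.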
In this report, I build on the simpletransformers repo and explore some of the most common applications of deep NLP – including tasks from the GLUE benchmark – along with recipes for training SOTA transformer models to perform them. I use the DistilBERT model for all the tasks, as it is computationally less expensive. I also extensively explore optimizing DistilBERT hyperparameters with W&B Sweeps.
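A hyperparameter sweep starts from a configuration dict describing the search. The sketch below shows the general shape of such a config; the metric name, parameter ranges, and search method are illustrative assumptions, not values taken from this report:

```python
# Hypothetical W&B sweep configuration for tuning DistilBERT fine-tuning
# hyperparameters; the ranges below are illustrative, not from the report.
sweep_config = {
    "method": "bayes",  # Bayesian optimization over the search space
    "metric": {"name": "eval_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-4},
        "num_train_epochs": {"values": [1, 2, 3]},
        "train_batch_size": {"values": [16, 32]},
    },
}
```

The sweep would then be registered with `wandb.sweep(sweep_config, project=...)` and run with `wandb.agent(...)`, with each agent call training one candidate configuration.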
Simple Transformers comes with native support for model performance tracking using Weights & Biases.
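Enabling that tracking only requires naming a W&B project in the model's args dict; the project and run names below are placeholders:

```python
# Setting wandb_project in the args dict makes Simple Transformers log
# training and evaluation metrics to Weights & Biases automatically.
model_args = {
    "wandb_project": "simpletransformers-demo",   # placeholder project name
    "wandb_kwargs": {"name": "distilbert-run"},   # optional wandb.init settings
}
```

This dict would be passed at construction time, e.g. `ClassificationModel("distilbert", "distilbert-base-uncased", args=model_args)`, and metrics then appear in the named W&B project as training proceeds.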