Did you know you can run GPT-J 6B on Graphcore IPUs in the cloud? Following the much-discussed leaked Google memo, there has been a real storm in the AI world recently around smaller, open-source language models like GPT-J, which are cheaper and faster to fine-tune and run, and which perform just as well as larger models on many language tasks.
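For context, the GPT-J 6B checkpoint is freely available on the Hugging Face Hub. Below is a minimal sketch of loading and prompting it with the standard transformers API; running it on Graphcore IPUs additionally relies on Graphcore's own tooling, which isn't shown here, and the prompt and generation settings are purely illustrative.

```python
# Minimal sketch: load and prompt GPT-J 6B with the standard transformers API.
# The Graphcore IPU path requires Graphcore-specific tooling; this sketch
# targets an ordinary GPU setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,  # half precision to reduce memory; a GPU is needed in practice
)

inputs = tokenizer("Open source language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```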
Research Highlights: SparseGPT: Prune LLMs Accurately in One-Shot
A new research paper shows that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining and with minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, designed specifically to work efficiently and accurately on massive GPT-family models.
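To give a concrete sense of what "50% sparsity in one shot" means, the sketch below zeroes out half of the weights of a single layer by magnitude, with no retraining. This is plain magnitude pruning used only as an illustration of the one-shot setting; it is not the SparseGPT algorithm itself, which instead solves a layer-wise reconstruction problem so the pruned layer still matches its original outputs.

```python
import torch

def one_shot_magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude weights so that `sparsity` fraction are zero.

    Plain magnitude pruning for illustration only -- SparseGPT uses an approximate
    second-order reconstruction step to preserve accuracy at the scale of LLMs.
    """
    k = int(weight.numel() * sparsity)                    # number of weights to drop
    threshold = weight.abs().flatten().kthvalue(k).values # magnitude cutoff
    mask = weight.abs() > threshold                       # keep only the larger-magnitude half
    return weight * mask

w = torch.randn(4096, 4096)                               # toy layer size
w_pruned = one_shot_magnitude_prune(w, sparsity=0.5)
print(f"sparsity: {(w_pruned == 0).float().mean().item():.2%}")
```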
Video Highlights: Attention Is All You Need – Paper Explained
In this video, Mohammad Namvarpour presents a comprehensive study of Ashish Vaswani and his coauthors’ renowned paper, “Attention Is All You Need.” The paper marked a major turning point in deep learning research: the transformer architecture it introduced is now used in a variety of state-of-the-art models in natural language processing and beyond, and transformers are the basis of the large language models (LLMs) we’re seeing today.
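For readers who want the paper's core operation in code, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, written in plain PyTorch; the batch size, sequence length, and head dimension below are toy values chosen for illustration.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al. (2017)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (..., seq_q, seq_k) similarity scores
    weights = F.softmax(scores, dim=-1)                # attention weights over the keys
    return weights @ v                                 # weighted sum of the values

# Toy example: batch of 1, sequence of 5 tokens, head dimension 64.
q = k = v = torch.randn(1, 5, 64)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 5, 64])
```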
The Move Toward Green Machine Learning
A new study suggests tactics for machine learning engineers to cut their carbon emissions. Led by David Patterson, researchers at Google and UC Berkeley found that AI developers can shrink a model’s carbon footprint a thousand-fold by streamlining architecture, upgrading hardware, and using efficient data centers.
Research Highlights: Transformer Feed-Forward Layers Are Key-Value Memories
In this regular column, we take a look at highlights of important research topics of the day in big data, data science, machine learning, AI and deep learning. It’s important to stay connected with the research arm of the field in order to see where we’re headed. In this edition, if you (like me) have wondered what the feed-forward layers in transformer models are actually doing, this is a pretty interesting paper on that topic. Enjoy!
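For a flavor of the paper's framing: a transformer feed-forward block computes f(xW1)W2, and the paper reads the rows of the first weight matrix as "keys" that detect input patterns and the rows of the second as "values" that are mixed according to how strongly each key fires. The sketch below (toy dimensions, standard PyTorch, not the authors' code) simply writes the block in that key-value form.

```python
import torch
import torch.nn.functional as F

def ffn_as_key_value_memory(x, keys, values):
    """Transformer feed-forward block written in key-value memory form.

    keys:   (num_memories, d_model) -- each row acts like a pattern detector ("key")
    values: (num_memories, d_model) -- each row is the output vector ("value") for that memory
    FFN(x) = f(x K^T) V, i.e. activation coefficients weight a sum of value vectors.
    """
    coeffs = F.relu(x @ keys.T)   # how strongly each memory (hidden unit) fires on x
    return coeffs @ values        # weighted sum of the corresponding value vectors

d_model, num_memories = 512, 2048  # toy sizes; real models use much larger inner dimensions
x = torch.randn(1, d_model)
keys = torch.randn(num_memories, d_model)
values = torch.randn(num_memories, d_model)
print(ffn_as_key_value_memory(x, keys, values).shape)  # torch.Size([1, 512])
```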
Habana Labs and Hugging Face Partner to Accelerate Transformer Model Training
Habana® Labs, a pioneer in high-efficiency, purpose-built deep learning processors, and Hugging Face, the home of transformer models, announced that they’re joining forces to make it easier and quicker to train high-quality transformer models. Thanks to the integration of Habana’s SynapseAI software suite with the Hugging Face Optimum open-source library, data scientists and machine learning engineers can now accelerate their transformer training jobs on Habana processors with just a few lines of code and enjoy greater productivity as well as lower training costs.
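To give a sense of what "a few lines of code" looks like in practice, here is a rough sketch following the drop-in pattern described in the announcement: the standard Trainer and TrainingArguments are swapped for their optimum-habana counterparts. The model, dataset, and Gaudi configuration names below are illustrative placeholders, and actually running the job requires Habana Gaudi hardware.

```python
# Sketch of the drop-in swap described in the announcement: transformers'
# Trainer/TrainingArguments are replaced by their optimum-habana counterparts.
# Model, dataset and configuration names are placeholders for illustration.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Small slice of a public dataset, tokenized for the model.
dataset = load_dataset("imdb", split="train[:1%]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length"),
    batched=True,
)

training_args = GaudiTrainingArguments(
    output_dir="./results",
    use_habana=True,                               # run on Habana Gaudi processors
    use_lazy_mode=True,                            # SynapseAI lazy execution mode
    gaudi_config_name="Habana/bert-base-uncased",  # assumed pretrained Gaudi configuration
)

trainer = GaudiTrainer(model=model, args=training_args, train_dataset=dataset)
trainer.train()
```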