Did you know you can run GPT-J 6B on Graphcore IPUs in the cloud? Following the much-discussed leaked Google memo, there has been a real storm in the AI world recently around smaller, open-source language models like GPT-J, which are cheaper and faster to fine-tune and run, and which perform just as well as larger models on many language tasks.
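For context, the GPT-J 6B checkpoint is freely available on the Hugging Face Hub. Below is a minimal sketch of loading and prompting it with the standard transformers API; running it on Graphcore IPUs additionally relies on Graphcore's own tooling, which isn't shown here, and the prompt and generation settings are purely illustrative.

```python
# Minimal sketch: load and prompt GPT-J 6B with the standard transformers API.
# The Graphcore IPU path requires Graphcore-specific tooling; this sketch
# targets an ordinary GPU setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,  # half precision to reduce memory; a GPU is needed in practice
)

inputs = tokenizer("Open source language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```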
Research Highlights: SparseGPT: Prune LLMs Accurately in One-Shot
A new research paper shows that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining and with minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, designed specifically to work efficiently and accurately on massive GPT-family models.
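To give a concrete sense of what "50% sparsity in one shot" means, the sketch below zeroes out half of the weights of a single layer by magnitude, with no retraining. This is plain magnitude pruning used only as an illustration of the one-shot setting; it is not the SparseGPT algorithm itself, which instead solves a layer-wise reconstruction problem so the pruned layer still matches its original outputs.

```python
import torch

def one_shot_magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude weights so that `sparsity` fraction are zero.

    Plain magnitude pruning for illustration only -- SparseGPT uses an approximate
    second-order reconstruction step to preserve accuracy at the scale of LLMs.
    """
    k = int(weight.numel() * sparsity)                    # number of weights to drop
    threshold = weight.abs().flatten().kthvalue(k).values # magnitude cutoff
    mask = weight.abs() > threshold                       # keep only the larger-magnitude half
    return weight * mask

w = torch.randn(4096, 4096)                               # toy layer size
w_pruned = one_shot_magnitude_prune(w, sparsity=0.5)
print(f"sparsity: {(w_pruned == 0).float().mean().item():.2%}")
```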
Video Highlights: Attention Is All You Need – Paper Explained
In this video, Mohammad Namvarpour presents a comprehensive study of Ashish Vaswani and his coauthors’ renowned paper, “Attention Is All You Need.” The paper marked a major turning point in deep learning research: the transformer architecture it introduced is now used in a variety of state-of-the-art models in natural language processing and beyond, and transformers are the basis of the large language models (LLMs) we’re seeing today.
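For readers who want the paper's core operation in code, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, written in plain PyTorch; the batch size, sequence length, and head dimension below are toy values chosen for illustration.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al. (2017)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (..., seq_q, seq_k) similarity scores
    weights = F.softmax(scores, dim=-1)                # attention weights over the keys
    return weights @ v                                 # weighted sum of the values

# Toy example: batch of 1, sequence of 5 tokens, head dimension 64.
q = k = v = torch.randn(1, 5, 64)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 5, 64])
```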
The Move Toward Green Machine Learning
A new study suggests tactics for machine learning engineers to cut their carbon emissions. Led by David Patterson, researchers at Google and UC Berkeley found that AI developers can shrink a model’s carbon footprint a thousand-fold by streamlining architecture, upgrading hardware, and using efficient data centers.
Research Highlights: Transformer Feed-Forward Layers Are Key-Value Memories
In this regular column, we take a look at highlights of important research topics of the day in big data, data science, machine learning, AI and deep learning. It’s important to stay connected with the research arm of the field in order to see where we’re headed. In this edition, if you (like me) have wondered what the feed-forward layers in transformer models are actually doing, this is a pretty interesting paper on that topic. Enjoy!
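For a flavor of the paper's framing: a transformer feed-forward block computes f(xW1)W2, and the paper reads the rows of the first weight matrix as "keys" that detect input patterns and the rows of the second as "values" that are mixed according to how strongly each key fires. The sketch below (toy dimensions, standard PyTorch, not the authors' code) simply writes the block in that key-value form.

```python
import torch
import torch.nn.functional as F

def ffn_as_key_value_memory(x, keys, values):
    """Transformer feed-forward block written in key-value memory form.

    keys:   (num_memories, d_model) -- each row acts like a pattern detector ("key")
    values: (num_memories, d_model) -- each row is the output vector ("value") for that memory
    FFN(x) = f(x K^T) V, i.e. activation coefficients weight a sum of value vectors.
    """
    coeffs = F.relu(x @ keys.T)   # how strongly each memory (hidden unit) fires on x
    return coeffs @ values        # weighted sum of the corresponding value vectors

d_model, num_memories = 512, 2048  # toy sizes; real models use much larger inner dimensions
x = torch.randn(1, d_model)
keys = torch.randn(num_memories, d_model)
values = torch.randn(num_memories, d_model)
print(ffn_as_key_value_memory(x, keys, values).shape)  # torch.Size([1, 512])
```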
Habana Labs and Hugging Face Partner to Accelerate Transformer Model Training
Habana® Labs, a pioneer in high-efficiency, purpose-built deep learning processors, and Hugging Face, the home of transformer models, announced that they’re joining forces to make it easier and quicker to train high-quality transformer models. Thanks to the integration of Habana’s SynapseAI software suite with the Hugging Face Optimum open-source library, data scientists and machine learning engineers can now accelerate their transformer training jobs on Habana processors with just a few lines of code and enjoy greater productivity as well as lower training costs.
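To give a sense of what "a few lines of code" looks like in practice, here is a rough sketch following the drop-in pattern described in the announcement: the standard Trainer and TrainingArguments are swapped for their optimum-habana counterparts. The model, dataset, and Gaudi configuration names below are illustrative placeholders, and actually running the job requires Habana Gaudi hardware.

```python
# Sketch of the drop-in swap described in the announcement: transformers'
# Trainer/TrainingArguments are replaced by their optimum-habana counterparts.
# Model, dataset and configuration names are placeholders for illustration.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Small slice of a public dataset, tokenized for the model.
dataset = load_dataset("imdb", split="train[:1%]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length"),
    batched=True,
)

training_args = GaudiTrainingArguments(
    output_dir="./results",
    use_habana=True,                               # run on Habana Gaudi processors
    use_lazy_mode=True,                            # SynapseAI lazy execution mode
    gaudi_config_name="Habana/bert-base-uncased",  # assumed pretrained Gaudi configuration
)

trainer = GaudiTrainer(model=model, args=training_args, train_dataset=dataset)
trainer.train()
```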