Google's BigBird Model Improves Natural Language and Genomics Processing

Researchers at Google have developed a new deep-learning model called BigBird that allows Transformer neural networks to process sequences up to 8x longer than previously possible. Networks based on this model achieved new state-of-the-art performance levels on natural-language processing and genomics tasks.

BigBird is a new self-attention model that reduces the neural-network complexity of Transformers, allowing for training and inference using longer input sequences. By increasing sequence length up to 8x, the team was able to achieve new state-of-the-art performance on several NLP tasks, including question-answering and document summarization. The team also used BigBird to develop a new application for Transformer models in genomic sequence representations, improving accuracy over previous models by 5 percentage points.

The Transformer has become the neural-network architecture of choice for sequence learning, especially in the NLP domain.

BigBird is a new self-attention scheme that has complexity of O(n), that is, linear in the sequence length n, which allows for sequence lengths of up to 4,096 items. Instead of each item attending to every other item, BigBird combines three smaller attention mechanisms. First is random attention, which links each item with a small constant number of other items, chosen randomly.
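To make the idea concrete, here is a minimal sketch of a sparse attention pattern of this kind, written in plain Python. It is an illustration under stated assumptions, not Google's implementation. The random component follows the description above; the other two components (a sliding window over neighboring items and a few global items linked to everything) are taken from the BigBird paper, not from this article. The function names and the default values for `num_random`, `window`, and `num_global` are hypothetical.

```python
import random


def sparse_attention_pattern(n, num_random=3, window=3, num_global=2, seed=0):
    """Return a list where entry i is the set of positions item i attends to.

    Illustrative sketch only; parameter names and defaults are assumptions.
    """
    rng = random.Random(seed)
    half = window // 2
    global_ids = set(range(min(num_global, n)))
    pattern = []
    for i in range(n):
        # Random attention: a small constant number of randomly chosen items.
        attended = set(rng.sample(range(n), min(num_random, n)))
        # Window attention (from the BigBird paper): nearby items, including i.
        attended.update(range(max(0, i - half), min(n, i + half + 1)))
        # Global attention (from the BigBird paper): every item sees the
        # designated global items.
        attended.update(global_ids)
        pattern.append(attended)
    # The global items themselves attend to every position.
    for g in global_ids:
        pattern[g] = set(range(n))
    return pattern


def count_links(pattern):
    """Total number of (query, key) pairs in the pattern."""
    return sum(len(attended) for attended in pattern)


if __name__ == "__main__":
    n = 4096
    pattern = sparse_attention_pattern(n)
    print("sparse links:", count_links(pattern))
    print("full attention links:", n * n)
```

Because each non-global item attends to at most `num_random + window + num_global` positions (with an odd `window`), the total number of links grows linearly with `n`, whereas full attention grows with `n * n`.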

One of the BigBird authors noted that although the experiments in the paper used a sequence length of 4,096, the model could handle much larger sequences of up to 16k.
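A rough back-of-the-envelope calculation shows why longer sequences are more feasible with linear attention than with full attention. The sketch below only counts query-key pairs; it is not a measurement of BigBird itself. The baseline of 512 is inferred from the article's figures (4,096 divided by 8), and `links_per_item` is a hypothetical constant chosen for illustration.

```python
def full_attention_pairs(n):
    """Query-key pairs when every item attends to every other item."""
    return n * n


def sparse_attention_pairs(n, links_per_item):
    """Query-key pairs when every item attends to a fixed number of items."""
    return n * links_per_item


if __name__ == "__main__":
    links_per_item = 256  # hypothetical constant, for illustration only
    for n in (512, 4096, 16384):
        print(n, full_attention_pairs(n), sparse_attention_pairs(n, links_per_item))
```

Going from 4,096 to 16,384 items multiplies the full-attention count by 16 but the fixed-links count by only 4.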

Original article
Author: InfoQ
