Superglue: A stickier benchmark for general-purpose language understanding systems

A Wang, Y Pruksachatkun, N Nangia… - Advances in neural …, 2019 - proceedings.neurips.cc
Advances in neural information processing systems, 2019 - proceedings.neurips.cc
Abstract
In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at https://super.gluebenchmark.com.