(Replying to PARENT post)
All something like GPT-2 is, is a text compressor: specifically, a probability model for an arithmetic encoder. It predicts the probability of the next token conditional on a history. You can directly measure the 'bits per character'/'perplexity' as a measure of (compression) performance, and that is typically how these language models are evaluated, e.g. http://nlpprogress.com/english/language_modeling.html https://paperswithcode.com/task/language-modelling
The Hutter dataset is included, incidentally, and the top is a Transformer-XL (277M parameters) at 0.94 BPC. Does all of that seem 'very much not AGI'?
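To make the compressor framing concrete, here is a minimal sketch (plain Python, using a toy character-level bigram model rather than GPT-2) of how a next-symbol probability model maps to a bits-per-character number: the average -log2 P(next char | history) over a text is the size an arithmetic coder driven by that model would approach, ignoring small coder overhead.

```python
import math
from collections import Counter, defaultdict

# Toy illustration (not GPT-2): a character-level bigram model.
# Any model that assigns P(next symbol | history) can drive an
# arithmetic coder; its average -log2 P per character is the BPC
# it would achieve, up to a negligible coding overhead.

train = "the cat sat on the mat. the dog sat on the log."
counts = defaultdict(Counter)
for prev, nxt in zip(train, train[1:]):
    counts[prev][nxt] += 1

vocab = set(train)

def prob(nxt, prev, alpha=1.0):
    """Laplace-smoothed estimate of P(nxt | prev)."""
    c = counts[prev]
    return (c[nxt] + alpha) / (sum(c.values()) + alpha * len(vocab))

# Cross-entropy of the model on held-out text, in bits per character.
test = "the cat sat on the log."
bits = -sum(math.log2(prob(nxt, prev)) for prev, nxt in zip(test, test[1:]))
print(f"bits per character: {bits / (len(test) - 1):.3f}")
```

A better model concentrates probability on the characters that actually occur, so the same formula yields a lower BPC; that is all the Hutter-style leaderboard numbers are measuring.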
👤 gwern · 🕑 6y · 🔼 0 · 💬 0
(Replying to PARENT post)
Imagine a highly efficient compressor watching Toy Story in 4K. If it were really intelligent, it could abstract the 3D models, motion, physics, lighting, and textures, and store the movie (or something imperceptibly different from it) in a highly compressed representation. Or even write down just the script and direct a movie so similar that it couldn't be distinguished from the original.
👤 throwawaywego · 🕑 6y · 🔼 0 · 💬 0
(Replying to PARENT post)
Being able to discern large-scale themes and patterns in data, and then use those findings to distill the original input into minimally redundant messages, is similar to the goal of an AGI that can comprehend any input data.
👤 asynchrony · 🕑 6y · 🔼 0 · 💬 0
(Replying to PARENT post)
From an AGI perspective, it should be more relevant to come up with the ideas/concepts described in those Wikipedia articles, not the exact words humans used to describe them.
👤 jpalomaki · 🕑 6y · 🔼 0 · 💬 0
(Replying to PARENT post)
A compressor needs to detect patterns in data, ideally in general data. General intelligence is excellent at these tasks.
👤 herendin2 · 🕑 6y · 🔼 0 · 💬 0
(Replying to PARENT post)
AGI would probably be sufficient, but not necessary.
👤 dooglius · 🕑 6y · 🔼 0 · 💬 0
(Replying to PARENT post)