Saturday, August 21, 2021

Israeli AI21 Launches the Biggest NLP Model So Far

AI21 is a research lab, the Israeli equivalent of OpenAI, founded by several machine learning luminaries including Prof. Amnon Shashua (Mobileye, OrCam, Digital Bank), who is a professor at the Hebrew University. (Amnon was my lecturer for the ML course, which was an amazing course, and he is an amazing person as well.)

This week AI21 announced the release of Jurassic-1, the largest NLP model so far. It is comparable to GPT-3. There is no objective evaluation of the two models, but AI21 mentions that the vocabulary of its model has around 250K tokens (compared to around 50K for GPT-3), which gives more flexibility in answering questions regarding common phrases, named entities, etc. A great tutorial for GPT-3 is given on Yannic's YouTube channel.
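To get some intuition for why a larger token vocabulary can help with common phrases and named entities, here is a toy sketch. It is not the tokenizer of either model: the greedy longest-match rule, the function name `greedy_tokenize`, and both tiny vocabularies are made up for illustration. It only shows that when a multi-word phrase is a single vocabulary entry, the same text needs fewer tokens.

```python
def greedy_tokenize(text, vocab):
    """Toy greedy longest-match tokenizer (illustrative only).

    At each position, take the longest substring found in `vocab`;
    if nothing matches, fall back to a single character.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: emit one character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical vocabularies: the "large" one also stores a whole named entity.
small_vocab = {"New", " York", " City", " is", " big"}
large_vocab = small_vocab | {"New York City"}

text = "New York City is big"
small_tokens = greedy_tokenize(text, small_vocab)
large_tokens = greedy_tokenize(text, large_vocab)

print(len(small_tokens), small_tokens)
print(len(large_tokens), large_tokens)
```

With the small vocabulary the sentence is split into five pieces, while the larger vocabulary covers the named entity with a single token and needs only three.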

Building such a large NLP model is challenging, since the model has around 170B parameters and you need weeks of training with hundreds of GPUs, a cost that typically only the biggest companies can afford. Another interesting company I recently met is LightOn, which builds photonics-based hardware for training language models. They recently announced the largest French-language model.
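A quick back-of-the-envelope calculation shows why a single GPU is not enough. The sketch below only counts the memory needed to store the weights, using the rough 170B figure above. The 40 GB per GPU is my assumption, and the helper name `weights_memory_gb` is made up. Actual training needs considerably more memory for gradients, optimizer state, and activations, so treat the result as a lower bound.

```python
import math


def weights_memory_gb(num_params, bytes_per_param):
    """Memory needed just to hold the model weights, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


num_params = 170e9      # rough parameter count quoted above
gpu_memory_gb = 40      # assumption: memory of one accelerator

fp16_gb = weights_memory_gb(num_params, 2)  # 2 bytes per parameter (half precision)
fp32_gb = weights_memory_gb(num_params, 4)  # 4 bytes per parameter (single precision)

# Minimum number of GPUs needed only to store the half-precision weights.
min_gpus_fp16 = math.ceil(fp16_gb / gpu_memory_gb)

print(f"fp16 weights: {fp16_gb:.0f} GB, fp32 weights: {fp32_gb:.0f} GB")
print(f"GPUs needed just to hold fp16 weights: {min_gpus_fp16}")
```

Even under these generous assumptions the weights alone span several GPUs, before any of the training overhead is counted.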

It will be interesting to see when AI21 and similar companies move to training on non-English corpora, which is where such companies can shine.

An interesting conference coming up soon is the NLP Summit (an online event, Oct 5-7).

