Capable open-source language models just hit a remarkable milestone: the team behind Falcon 180B has released the model weights, training data, methodology, and comprehensive technical details behind the model's development.
The Falcon series features models with 7B, 40B, and 180B parameters. What sets Falcon apart is that it is trained predominantly on web data (3.5 trillion tokens). This makes Falcon-180B not only one of the largest openly documented pre-trained models but also a top contender alongside giants like GPT-4 and Google's PaLM-2-Large.
Falcon-180B achieves near PaLM-2-Large performance at a fraction of the cost, debunking the myth that web data alone isn't enough for high-performance models. Through meticulous deduplication and filtering, the team's web-only data outperforms traditional curated corpora on natural language tasks.
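To make the deduplication idea concrete, here is a minimal, illustrative sketch of fuzzy deduplication with MinHash LSH using the `datasketch` library. This is not the Falcon team's actual pipeline; the shingle size, number of permutations, and similarity threshold below are assumptions chosen for illustration.

```python
# Illustrative MinHash-based fuzzy deduplication (NOT the Falcon/RefinedWeb pipeline).
# Assumptions: word 5-gram shingles, 128 permutations, 0.8 Jaccard threshold.
from datasketch import MinHash, MinHashLSH

def minhash_signature(text: str, num_perm: int = 128) -> MinHash:
    """Build a MinHash signature from word 5-gram shingles of the document."""
    sig = MinHash(num_perm=num_perm)
    words = text.lower().split()
    for i in range(max(len(words) - 4, 1)):
        shingle = " ".join(words[i:i + 5])
        sig.update(shingle.encode("utf-8"))
    return sig

def deduplicate(docs: list[str], threshold: float = 0.8) -> list[str]:
    """Keep only documents with no near-duplicate among those already kept."""
    lsh = MinHashLSH(threshold=threshold, num_perm=128)
    kept = []
    for idx, doc in enumerate(docs):
        sig = minhash_signature(doc)
        if not lsh.query(sig):  # no previously kept document is this similar
            lsh.insert(f"doc-{idx}", sig)
            kept.append(doc)
    return kept
```

In the actual pipeline described by the Falcon team, fuzzy deduplication of this kind is combined with exact substring deduplication and extensive quality filtering of the raw web crawl.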
The released models are open for research and commercial use.
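For reference, a minimal sketch of loading a released checkpoint through Hugging Face `transformers` is shown below. The `tiiuae/falcon-180B` repository name and the hardware notes are assumptions on my part; the 180B model needs several hundred GB of accelerator memory in bfloat16, so the smaller 7B variant is the practical choice on a single GPU.

```python
# Minimal sketch: loading a released Falcon checkpoint with Hugging Face transformers.
# Assumes the "tiiuae/falcon-180B" hub repo (access may require accepting the license);
# swap in "tiiuae/falcon-7b" to run the same code on a single GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-180B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory use relative to fp32
    device_map="auto",           # shard layers across available GPUs
)

inputs = tokenizer("The Falcon series of language models", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```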
Publication https://lnkd.in/gpmSQ8hv
Join thousands of world-class researchers and engineers from Google, Stanford, OpenAI, and Meta to stay at the cutting edge of AI http://aitidbits.ai