Indian company Sarvam AI has open-sourced two large language models – with 30B and 105B parameters – focused on supporting the languages of India.
AI: Events
OLMo Hybrid: Transformers and Recurrent Networks Join Forces
Technical context • Research
Allen AI has introduced OLMo Hybrid, an open language model that combines transformer and recurrent architectures for more efficient processing of long texts.
Two key libraries for running AI models on everyday devices have joined forces with Hugging Face – and it could change the future of local AI.
AI: Events
Tencent Releases the Most Compact Language Model: 1.8 Billion Parameters in 600 MB
Development
The Chinese company has open-sourced the HY-1.8B-2Bit model with 2-bit quantization – it weighs less than many mobile apps.
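The size claim is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, assuming the 1.8B parameter count from the model name and an illustrative half-bit per weight of quantization overhead (scales and zero points); this is not based on the actual HY-1.8B-2Bit file format:

```python
# Back-of-envelope check: on-disk size of a 2-bit quantized model.
# The 1.8B parameter count comes from the model name (HY-1.8B-2Bit);
# the 0.5-bit-per-weight overhead for scales/zero points is an
# illustrative assumption, not the model's actual format.

def quantized_size_mb(n_params: float, bits_per_weight: float,
                      overhead_bits: float = 0.5) -> float:
    """Approximate file size in MB: packed weights plus quantization metadata."""
    total_bits = n_params * (bits_per_weight + overhead_bits)
    return total_bits / 8 / 1024**2

size = quantized_size_mb(1.8e9, 2)
print(f"{size:.0f} MB")  # ~536 MB under these assumptions
```

With these assumptions the estimate lands in the same ballpark as the reported 600 MB; the remainder would be tokenizer files and container overhead.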
AI: Events
Olmix: Allen AI's Approach to Data Mixing Across All Stages of Language Model Training
Development
Allen AI has introduced Olmix, an open-source framework for data mixing in the language model training process, including pre-training, instruction tuning, and alignment.
Chinese company MiniMax has released M2.5, a family of open-weight models whose performance is approaching that of Claude 3.5 Sonnet.