H2O.ai Debuts Danube3 Series, Surpassing Apple and Rivalling Microsoft with New Small Language Models
H2O.ai, the open-source leader in Generative AI and machine learning, is excited to announce the global release of the H2O-Danube3 series, the latest addition to its suite of small language models. This series, now available on Hugging Face, includes the H2O-Danube3-4B and the compact H2O-Danube3-500M, both designed to push the boundaries of natural language processing (NLP) and make advanced capabilities accessible to a wider audience.
“We are incredibly excited about the H2O-Danube3 series – a leap forward in making small language models more powerful and accessible. The H2O-Danube3-4B and H2O-Danube3-500M models are designed to push the envelope in terms of performance, outpacing competitors like Apple and rivaling even Microsoft’s offerings. These models are not just high-performing but also economically efficient and easily deployable on edge devices, making them perfect for enterprise and offline applications,” said Sri Ambati, CEO and Founder of H2O.ai.
“With H2O-Danube3, we continue to democratize advanced NLP capabilities, ensuring they are within reach for a wider audience while maintaining sustainability. The versatility of these models spans from enhancing chat applications to supporting research and on-device solutions, truly embodying our mission to bring AI to everyone,” added Sri Ambati.
H2O-Danube3-4B: A New Benchmark in NLP
The H2O-Danube3-4B model, trained on an impressive 6 trillion tokens, has achieved a score of over 80% on the 10-shot HellaSwag benchmark. This performance not only surpasses Apple’s OpenELM-3B but also rivals Microsoft’s Phi-3 4B, setting a new standard in the field.
H2O-Danube3-500M: Compact Yet Powerful
The H2O-Danube3-500M model, trained on 4 trillion tokens, demonstrates remarkable efficiency and versatility. It has achieved the highest scores in 8 out of 12 academic benchmarks when compared to similarly sized models, such as Alibaba’s Qwen2. Despite its compact size, the H2O-Danube3-500M is designed to handle a wide range of applications, from chatbots and research to on-device solutions.
Complementing H2O-Danube2 with Advanced Capabilities
The H2O-Danube3 series builds on the foundation laid by the H2O-Danube2 models. The new models are trained primarily on English text drawn from high-quality web data, Wikipedia, academic texts, synthetic texts, and other curated sources. They have undergone final supervised tuning specifically for chat applications, ensuring they meet diverse user needs.
Key Features:
- High Efficiency: Designed for efficient inference on consumer hardware and edge devices, H2O-Danube3 models can even run fully offline on modern smartphones with H2O AI Personal GPT.
- Open Access: All models are openly available under the Apache 2.0 license on Hugging Face.
- Competitive Performance: Extensive evaluations show that H2O-Danube3 models achieve highly competitive results across various academic, chat, and fine-tuning benchmarks.
- Use Cases: The models are suitable for a range of applications, including chatbot integration, fine-tuning for specific tasks such as sequence classification, question answering, or token classification, and offline use cases (see the loading sketch after this list).
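To illustrate how the openly released checkpoints can be used, here is a minimal inference sketch with the Hugging Face transformers library. The repository ID h2oai/h2o-danube3-500m-chat is assumed from the naming in this release; consult the H2O.ai model cards on Hugging Face for the exact names and recommended generation settings.

```python
# Minimal chat-inference sketch for an H2O-Danube3 checkpoint.
# NOTE: the model ID below is assumed from the release naming; verify it
# against the H2O.ai organization page on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "h2oai/h2o-danube3-500m-chat"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the memory footprint small
    device_map="auto",           # uses a GPU if available, otherwise CPU
)

# The chat-tuned variants ship with a chat template, so a message list
# can be rendered into the prompt format the model was tuned on.
messages = [{"role": "user", "content": "What is a small language model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the checkpoints load as standard transformers models, the same code is also a natural starting point for the fine-tuning use cases listed above.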
Technical Specs:
- H2O-Danube3-4B: 3.96 billion trainable parameters, trained with a context length of up to 8,192 tokens.
- H2O-Danube3-500M: 514 million trainable parameters, trained with a context length of up to 8,192 tokens.
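Readers who want to verify these figures locally can read them straight from the checkpoints, as in the sketch below (same caveat as above about assumed repository IDs); max_position_embeddings is the standard transformers config field for the maximum context length.

```python
# Sanity-check the published specs: trainable parameters and context length.
# Model IDs are assumed from the release naming; verify them on Hugging Face.
from transformers import AutoConfig, AutoModelForCausalLM

for model_id in ("h2oai/h2o-danube3-4b-chat", "h2oai/h2o-danube3-500m-chat"):
    config = AutoConfig.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    n_params = sum(p.numel() for p in model.parameters())  # total parameter count
    print(f"{model_id}: {n_params / 1e9:.2f}B parameters, "
          f"context length {config.max_position_embeddings}")
```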