Madhuram: A Revolutionary Compact Language Model
We at Maruth Labs are excited to announce the development of our latest breakthrough in artificial intelligence: Madhuram, a language model that redefines efficiency in the realm of AI technology. Over the past six months, we have worked tirelessly to create a model that is significantly smaller than current state-of-the-art (SoTA) models and even leads the way in the Mobile Large Language Models (MobileLLMs) category.
What Makes Madhuram Stand Out?
Madhuram is designed with only 67.1 million parameters and has been pretrained on just 10 billion tokens, a fraction of the size and data used by most large-scale language models today. Despite its comparatively small size, Madhuram has achieved results competitive with models that are much larger in both scale and resource usage. This milestone is particularly exciting, as it opens the door to endless possibilities for industries that require efficient AI solutions without sacrificing performance.
Performance with Minimal Resources
What sets Madhuram apart is its ability to perform competitively while using roughly 100 times less pretraining data than its larger counterparts; many current SoTA models are pretrained on the order of a trillion tokens, versus Madhuram's 10 billion. We designed this model to push the limits of AI research by showing that high-performance results do not necessarily require massive amounts of resources. Madhuram's efficiency makes it an ideal candidate for deployment in resource-constrained environments, including mobile devices and wearable technologies.
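To make "resource-constrained" concrete, here is a rough back-of-the-envelope estimate of the weight memory for a 67.1M-parameter model. The parameter count comes from this post; the precision levels are common deployment assumptions, not confirmed details of how Madhuram is shipped.

```python
# Rough weight-memory estimate for a 67.1M-parameter model.
# Precisions are illustrative assumptions, not confirmed Madhuram details.
PARAMS = 67.1e6

for precision, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    mib = PARAMS * bytes_per_param / 2**20
    print(f"{precision:>5}: ~{mib:.0f} MiB of weights")
```

Even at full fp32 precision the weights occupy roughly 256 MiB, and a little over 30 MiB at int4, which is why a model of this size is plausible on phones and wearables.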
Expanding Horizons: Wearable Technology & Robotics
Our vision for Madhuram goes beyond traditional AI applications. We aim to deploy Madhuram in wearable technology, leveraging its compact size and powerful capabilities to enhance human experiences and push the boundaries of robotics. From advanced automation systems to AI-powered devices, we believe Madhuram has the potential to transform industries by delivering state-of-the-art performance in cost-effective and hardware-friendly formats.
Results and Competitive Benchmarking
Our benchmarking on four datasets shows that Madhuram holds its own against both mobile and non-mobile language models. The results underscore our commitment to delivering high-quality AI with a focus on efficiency and accessibility. The detailed results and comparisons (accuracy, %) are shown in the table below.

| Model | Size | BoolQ | PIQA | WinoGrande | HellaSwag |
|---|---|---|---|---|---|
| Madhuram | 67M | 55.66 | 52.61 | 51.85 | 29.0 |
| MobileLLM | 125M | 45.8 | 28.7 | 42.9 | 65.7 |
| MobileLLM | 1B | 63 | 39 | 45 | 74.4 |
| MobileLLM | 1.5B | 67.5 | 40.9 | 46.4 | 74.8 |
| BLOOM | 1.7B | 50.9 | 31.2 | 43.2 | 70.0 |
| Qwen-1.5 | 1.8B | 61.1 | 36.5 | 47.2 | 74.1 |
| GPT-Neo | 2.7B | 55.8 | 34.3 | 43.6 | 72.9 |
| OPT | 2.7B | 56.6 | 34.6 | 45.6 | 74.5 |
| Pythia | 2.8B | 59.4 | 33.6 | 43.2 | 73.8 |
| BLOOM | 3B | 55.1 | 40.9 | 46.4 | 70.5 |
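For readers who want to reproduce this style of evaluation, below is a minimal sketch of the standard zero-shot log-likelihood scoring used for multiple-choice benchmarks such as PIQA, built on Hugging Face transformers and datasets. The model identifier maruth-labs/madhuram is a placeholder rather than a confirmed repository, and this sketch is not necessarily the harness behind the numbers above.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "maruth-labs/madhuram"  # hypothetical placeholder, not a confirmed repo

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID).eval()

@torch.no_grad()
def continuation_logprob(context: str, continuation: str) -> float:
    """Sum of token log-probs the model assigns to `continuation` given `context`."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    logprobs = model(ids).logits.log_softmax(dim=-1)
    # The logit at position t predicts token t + 1, so shift scores by one.
    cont_ids = ids[0, ctx_len:]
    return logprobs[0, ctx_len - 1 : -1].gather(1, cont_ids.unsqueeze(1)).sum().item()

correct = 0
dataset = load_dataset("piqa", split="validation")
for ex in dataset:
    # Score each candidate solution given the goal; predict the higher-scoring one.
    scores = [continuation_logprob(ex["goal"] + " ", sol) for sol in (ex["sol1"], ex["sol2"])]
    correct += int(scores.index(max(scores)) == ex["label"])
print(f"PIQA zero-shot accuracy: {correct / len(dataset):.2%}")
```

BoolQ, WinoGrande, and HellaSwag follow the same pattern with task-specific prompt and answer fields; off-the-shelf tools such as EleutherAI's lm-evaluation-harness implement all four tasks with standard prompt formats.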
Aman Goyal, Sanya Pandey, Mridul Bhatt
October 08, 2024