Most models smashed with Pruna's technology become 2-10x more efficient. The exact gains depend on your ML model, your target hardware, and any custom requirements you have.
We smash and publish the most popular AI models for free on Hugging Face. If you need to smash other models, or models you have trained or fine-tuned on your own data, you will need a paid API key. Pricing depends on various factors but always aligns with how much you get out of it. Request access to learn more.
Our current product makes your AI models more efficient at inference. Use it after training your models and before deploying them on your target hardware. Our next product iteration will make your model training more efficient too, and we're eager for people to try it :)
Our approach integrates a suite of cutting-edge AI model compression techniques. These methods are the culmination of our years of research and numerous presentations at ML conferences including NeurIPS, ICML, and ICLR.
Our product only needs your AI model and the specifications of your target inference hardware. A smashed model may be less flexible if you have a very specific use case, but that can be worked out with a little support.
We aim to maintain the predictive performance of all smashed AI models, ensuring they're as accurate as their original versions. However, while practical results have consistently met this goal, we cannot theoretically guarantee that a smashed model's predictions will exactly match the original's. We recommend testing smashed models on your own internal benchmarks.