how can i run this on my Galaxy S23 Ultra?
It would be interesting to see models like this "Any-to-Any" running on Edge devices.
More likely never, or wait 5-10 years.
You can try the ChatterUI app on GitHub. It uses CPU inference, so your speed may not be great; use q4_0 quants on ARM devices.
There's also another app on the Google Play Store called Layla that has experimental MLC inference, but it's a little hit or miss.
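If you go the ChatterUI route, q4_0 GGUF quants are usually published on Hugging Face. A minimal sketch of grabbing one with the `huggingface-cli` tool (the repo name below is just a placeholder, not a real quant repo — search Hugging Face for an actual q4_0 GGUF of the model you want):

```shell
# Sketch: pull a q4_0 GGUF quant from Hugging Face.
# "someuser/some-model-GGUF" is a placeholder repo name.
pip install -U "huggingface_hub[cli]"
huggingface-cli download someuser/some-model-GGUF \
  --include "*q4_0*" \
  --local-dir ./models
# Then copy the downloaded .gguf file to your phone and load it in ChatterUI.
```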
Where do you see quants?
You can just use your phone as a gateway to a model running physically on your computer. The main reason you can't run it on the phone itself is that these models need a lot of compute and RAM; larger multimodal models can exceed the VRAM of even NVIDIA xx90-class cards. For now, you have little choice but to rely on your own machine or cloud infrastructure if you want to use your phone as your primary interface to AI models.
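One common way to do that gateway setup is llama.cpp's bundled server: run it on the PC, then open the phone's browser (or any OpenAI-compatible client) at the PC's LAN address. A minimal sketch, assuming you already have a GGUF file and both devices are on the same network (the model path and IP are placeholders):

```shell
# On the PC: serve a local GGUF model over the LAN.
# Model path is a placeholder; --host 0.0.0.0 exposes the server to the network.
./llama-server -m ./models/model-q4_0.gguf --host 0.0.0.0 --port 8080
# On the phone: browse to http://<PC-LAN-IP>:8080 for the built-in web UI,
# or point an OpenAI-compatible chat app at http://<PC-LAN-IP>:8080/v1
```

Note this only works while the phone can reach the PC, so on the go you'd need a VPN like Tailscale or a cloud host instead.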