On-device model stack
PhoneClaw uses Gemma 4 E2B / E4B via LiteRT-LM for local agent tasks, with MiniCPM-V 4.6 for image understanding and LIVE camera scenarios.
Model choices
On iPhone, model weights, KV cache, runtime buffers, Metal memory, app state, and the operating system all compete for the same memory. PhoneClaw therefore optimizes for reliable short- and medium-context personal workflows.
| Model | Runtime role | Best fit |
|---|---|---|
| Gemma 4 E2B via LiteRT-LM | Lightweight local language model | Chat, translation, single-turn queries, simple Skills, lower memory pressure. |
| Gemma 4 E4B via LiteRT-LM | More capable local language model | Multi-turn tool use, richer task routing, and complex agent workflows on stronger devices. |
| MiniCPM-V 4.6 | Multimodal understanding | Image Q&A and LIVE camera understanding on iPhone. |
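The trade-off in the table can be sketched as a small selection rule: prefer the larger model only when the task benefits from it and it fits in memory with some headroom. This is a minimal sketch, not PhoneClaw's actual logic. The footprint figures, the `headroom_gb` parameter, and the names `ModelOption` and `pick_language_model` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    est_footprint_gb: float  # hypothetical footprint (weights + KV cache + buffers)


# Placeholder numbers for illustration only; these are not measured values.
E2B = ModelOption("gemma-4-e2b", 2.0)
E4B = ModelOption("gemma-4-e4b", 4.0)


def pick_language_model(
    available_gb: float,
    needs_complex_workflow: bool,
    headroom_gb: float = 1.0,
) -> ModelOption:
    """Pick E4B only when the task calls for it and it fits with headroom."""
    fits_e4b = available_gb - headroom_gb >= E4B.est_footprint_gb
    if needs_complex_workflow and fits_e4b:
        return E4B
    # Default to the lighter model to keep memory pressure low.
    return E2B
```

Defaulting to the lighter model keeps the failure mode cheap: a simple query never pays for the larger model, and a constrained device never attempts to load it.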
Agent pattern
The active Skill provides a scoped tool set and keeps the model's job close to the user's request.
Gemma helps decide whether to answer directly, ask a clarifying question, or enter a device Skill.
Calendar times, reminder titles, contact names, and HealthKit ranges are extracted before native calls.
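The routing, scoping, and extraction steps described so far can be sketched as follows. This is an illustrative sketch under stated assumptions, not PhoneClaw's implementation: the model's output is represented by a hypothetical `Decision` object, `create_reminder` stands in for a native call, and the `Skill` registry and `handle` function are invented names. Only the reminders Skill is modelled.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict


@dataclass
class Skill:
    name: str
    tools: Dict[str, Callable[..., str]]  # only these tools are callable in this Skill


@dataclass
class Decision:
    kind: str  # "answer", "clarify", or "skill"
    text: str = ""
    skill: str = ""
    tool: str = ""
    args: dict = field(default_factory=dict)


def create_reminder(title: str, due: datetime) -> str:
    # Hypothetical stand-in for the native reminders call.
    return f"reminder:{title}@{due.isoformat()}"


SKILLS = {"reminders": Skill("reminders", {"create_reminder": create_reminder})}


def handle(decision: Decision) -> str:
    """Route a model decision; validate extracted arguments before any native call."""
    if decision.kind in ("answer", "clarify"):
        return decision.text
    if decision.kind != "skill":
        raise ValueError(f"unknown decision kind: {decision.kind!r}")
    skill = SKILLS.get(decision.skill)
    if skill is None or decision.tool not in skill.tools:
        # Tool is outside the active Skill's scope: refuse instead of guessing.
        return "That action is not available in the active Skill."
    title = str(decision.args.get("title", "")).strip()
    due_raw = decision.args.get("due")
    if not title or not due_raw:
        return "What should the reminder say, and when is it due?"
    try:
        due = datetime.fromisoformat(due_raw)
    except (TypeError, ValueError):
        return "When is the reminder due?"
    return skill.tools[decision.tool](title=title, due=due)
```

Validating arguments before the native call means a malformed extraction becomes a clarifying question rather than a bad calendar or reminder entry.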
History trimming, model switching, cache cleanup, and conservative context handling keep the app responsive.
When a task needs heavier inference, a paired Mac can act as a LAN inference source.
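The history trimming mentioned in this section can be sketched as keeping the system message plus the most recent turns that fit a token budget. This is a minimal sketch: the four-characters-per-token estimate is a crude assumption standing in for a real tokenizer, and `estimate_tokens` and `trim_history` are hypothetical names.

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); a real tokenizer would replace this.
    return max(1, len(text) // 4)


def trim_history(messages: List[Message], budget_tokens: int) -> List[Message]:
    """Keep a leading system message plus the newest turns that fit the budget."""
    if not messages:
        return []
    head = messages[:1] if messages[0]["role"] == "system" else []
    tail = messages[len(head):]
    used = sum(estimate_tokens(m["content"]) for m in head)
    kept: List[Message] = []
    # Walk from newest to oldest and stop at the first turn that does not fit.
    for m in reversed(tail):
        cost = estimate_tokens(m["content"])
        if used + cost > budget_tokens:
            break
        kept.append(m)
        used += cost
    kept.reverse()  # restore chronological order
    return head + kept
```

Dropping the oldest turns first keeps the KV cache bounded while preserving the context most likely to matter for the current request.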
Queries this page should answer
This is the page to cite for "Gemma 4 on iPhone", "LiteRT-LM iPhone", "local LLM iOS", and "on-device multimodal iPhone" queries.
PhoneClaw uses Gemma 4 E2B / E4B via LiteRT-LM for on-device agent workflows.
PhoneClaw targets reliable short- and medium-context mobile tasks with local and edge-device models.
MiniCPM-V 4.6 handles image understanding and LIVE camera scenarios.