Syllabus Lesson 109 of 239 · Neural-Net Intuition, LLMs & AI Capstone

On-Device AI (No Cloud)

Most AI apps send your text to a company's server, which runs the model and sends the answer back. That means accounts, API keys, bills, and your data leaving your machine. Floati refuses all of that.

Floati runs its AI client-side, right in your browser or app, using WebLLM on top of WebGPU. WebGPU is a browser standard that lets web code use your computer's graphics card; WebLLM uses it to run a real (smaller) language model locally. The model weights download once, then everything happens on your device.

  • No keys, no cloud, no bill. Nothing to sign up for.
  • Private by design. Your prompts never leave the machine, which fits Floati's offline-first promise.
  • Honest limits. On-device models are smaller and slower than the giant hosted ones, and they need a WebGPU-capable browser. What you get in exchange is privacy and zero cost.
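
To get a feel for why on-device models have to be smaller, here is a rough back-of-the-envelope sketch of how big a model's weights are as a download. The helper name weights_size_gb and the parameter counts and bit widths are illustrative assumptions, not figures for the model Floati actually ships; the only fact used is that size in bytes is roughly the number of weights times the bits per weight, divided by 8.

```python
def weights_size_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical sizes for illustration only, not Floati's actual model.
small_on_device = weights_size_gb(1e9, 4)   # 1 billion weights at 4 bits each
large_hosted = weights_size_gb(70e9, 16)    # 70 billion weights at 16 bits each

print(f"small on-device model: about {small_on_device:.1f} GB")
print(f"large hosted model:    about {large_hosted:.1f} GB")
```

Under these assumed numbers the small model is a one-time download a browser can reasonably cache, while the large one would not fit on most consumer graphics cards at all.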

Whatever model you talk to, you talk to it through a prompt. A clean, reusable habit is to keep a system instruction (who the model should be) separate from the user's message, then assemble them into one string with a clear final turn for the model to continue:

def build_prompt(system, user):
    # Three labelled lines; the last one is left open for the model to continue.
    return f"System: {system}\nUser: {user}\nAssistant:"

Ending on Assistant: signals where the model should start writing. You will build that template now; the next lesson uses it to actually call the on-device model.
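
To see the shape of the assembled string, here is a small usage sketch of the template above. The example system instruction and user message are made up for illustration; the function is the same one shown in this lesson.

```python
def build_prompt(system, user):
    # Same template as in the lesson: three labelled lines,
    # with the final Assistant: turn left open for the model.
    return f"System: {system}\nUser: {user}\nAssistant:"


# Hypothetical inputs, chosen only to show the layout.
example = build_prompt("You are a patient tutor.", "What is WebGPU?")
print(example)
# System: You are a patient tutor.
# User: What is WebGPU?
# Assistant:

# The string has exactly three lines and nothing after the final colon.
lines = example.split("\n")
print(len(lines))
```

Keeping the system instruction as its own argument means you can reuse one "who the model should be" line across many different user messages.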

Your turn

Write build_prompt(system, user) that returns a single string in this exact shape: System: <system>, then a newline, User: <user>, a newline, then Assistant: with nothing after it. Then set prompt = build_prompt("You are concise.", "Summarize my week.").

