
# Run an LLM on Your Laptop for Free With Ollama
## The Rundown

In this guide, you will install Ollama and chat with a real AI model that runs entirely on your laptop. You will be able to use it for everyday work like drafting, summarizing, and brainstorming, for free, with no account, no subscription, and nothing leaving your machine.

## Who This Is Useful For

- Consultants and agency owners who handle client information they do not want sent to a third-party server
- Marketers, writers, and operators who want to iterate on prompts without watching a usage meter
- Anyone curious about local AI who does not want to sign up for another subscription just to try it

## What You Will Build

A working local AI chat setup using the Ollama desktop app. The model lives on your hard drive, runs on your CPU or GPU, and works offline once it has downloaded.

## What You Need to Get Started

- A reasonably modern laptop running Mac, Windows, or Linux
- At least 8 GB of RAM is ideal (4 GB works with a smaller model)
- At least 3 to 8 GB of free disk space, depending on the model you pick

## Step 1: Install Ollama

Go to ollama.com/download and grab the installer for your operating system. No account, no sign-in, nothing to set up first.

- On Mac: open the downloaded file and drag the Ollama icon into your Applications folder.
- On Windows: run the installer and click through the wizard.
- On Linux: open a terminal and run `curl -fsSL https://ollama.com/install.sh | sh`.

Once it is installed, open the app from your Applications folder or Start menu.

Pro tip: Mac users who prefer Homebrew can run `brew install ollama` instead of using the .dmg installer.

## Step 2: Pick a Model That Fits Your Machine

Click New Chat in the Ollama app, then click the model dropdown at the bottom of the chat window. The right model depends on how much RAM your machine has. We tested with `gemma3:4b` on a 16 GB MacBook and it ran comfortably.

| Your RAM | Model picks | Good for |
| --- | --- | --- |
| 4 GB | `gemma3:1b`, `qwen3:0.6b`, `tinyllama` | Tiny models. Fast, but reasoning is limited. Short rewrites and simple Q&A. |
| 8 GB | `gemma3:4b`, `llama3.2:3b`, `phi4-mini` | Small models. The everyday tier for chat, drafting, and summarizing. |
| 16 GB | `gemma3:12b`, `llama3.1:8b`, `qwen3:8b` | Sweet spot for most laptops. Noticeably smarter and good for real writing work. |
| 32 GB+ or discrete GPU | `gemma3:27b`, `gpt-oss:20b`, `qwen3:32b` | Heavyweight. Closest you will get to cloud quality on a single machine. |

Apple Silicon Macs share memory between CPU and GPU, so a 16 GB M-series Mac generally runs models that a 16 GB PC struggles with.

Pick a model and Ollama downloads it the first time you select it (cached after that).

Pro tip: Browse ollama.com/search to see every model Ollama supports, including specialized ones for coding, vision, and tool use.

## Step 3: Start Chatting Right in the App

Once the model finishes downloading, type a prompt and hit enter. You are now chatting with a real AI running entirely on your laptop. No internet call, no API key, no per-token cost.

The first response is usually slower because the model has to load into memory. After that, replies stream in fast.

Pro tip: If the desktop app is too clean for your taste and you prefer the terminal, run `ollama run gemma3:4b` instead. Same model, same chat, no GUI.

## Step 4: Test That It Runs Locally

Turn on airplane mode or unplug your ethernet cable, then send another prompt. It still works. Nothing is hitting a server, nothing is being stored anywhere, and you did not pay anyone for that response.

This is the real point of running local. If you handle client NDAs, internal financials, or anything you would not paste into a hosted AI tool, this is the difference that matters.

## Going Further

Once the basic chat is working, there are three obvious places to extend the setup:

- **Wire up a coding agent.** Ollama can launch Claude Code, Codex, or OpenCode against the model you just downloaded. Your coding agent runs on free, private inference instead of paying per token. The single command is `ollama launch claude` (or `codex`, or `opencode`).
- **Give your model tools and web search.** Ollama exposes a local API at `http://localhost:11434` and supports tool calling. That means you can connect it to web search, file readers, and other utilities so it is not stuck with only what it learned at training time.
- **Move it to a dedicated machine.** If you find yourself running a model all day, an old Mac mini, a recent used M-series Mac mini, or a repurposed PC makes a great always-on local AI box. You can hit it from any device on your home network and stop fighting your daily-driver laptop for resources.

The bigger picture is that cloud AI is not going anywhere for hard reasoning, but a growing share of everyday AI work is going to run locally. Getting Ollama installed tonight is how you stay a year ahead of that shift.
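To give you a concrete taste of that local API, here is a minimal sketch using curl. It assumes the Ollama app is running and that you downloaded `gemma3:4b` in Step 2; the prompt is just an example, swap in anything you like.

```shell
# Ask the local model a question over Ollama's REST API.
# "stream": false returns one complete JSON object instead of a token stream.
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:4b",
  "prompt": "Give me three subject lines for a product launch email.",
  "stream": false
}'
```

The JSON that comes back includes a `response` field with the model's text, so the same endpoint works from a script, a shortcut, or any tool that can make an HTTP POST.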





