
Local AI: What It Is, Why It Exists, and Why Your Practice Should Know About It

2026-01-28

When most people think about AI tools, they think about the cloud. You type something into a website, it goes to a server somewhere, a powerful model processes it, and the answer comes back. That's how ChatGPT works. That's how Gemini works. That's how most AI tools work.

What most people don't realize is that there's a completely different category of AI: models that run locally, on your own hardware, without ever connecting to the internet or sending data to a third party.

For healthcare practices, this distinction matters enormously.

What "Local AI" Actually Means

Modern AI language models — the kind that can draft letters, summarize documents, answer questions, and assist with administrative tasks — can be packaged and run on a standard computer. No cloud required.

Tools like Ollama make this practical for non-technical users. You install Ollama on a Windows or Mac computer (a reasonably modern one), download an open-source model, and you have a private AI assistant running entirely on your local machine.
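
For the technically curious, talking to that local model once it's running takes only a few lines of code. Here's a minimal sketch in Python, assuming Ollama is installed with a model already downloaded and its server listening on the default local port (11434); the model name, prompt, and the requests library are illustrative choices, not requirements.

```python
# Minimal sketch: ask a locally running Ollama model a question.
# Assumes Ollama is installed, a model (e.g. "llama3") has been pulled,
# and the Ollama server is listening on its default port, 11434.
import requests  # third-party HTTP library; install with: pip install requests

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]

if __name__ == "__main__":
    # The request goes to localhost, so nothing leaves this machine.
    print(ask_local_model("Draft a polite appointment reminder for a patient."))
```

The detail worth noticing is the address: localhost. The request never touches the internet, which is the whole point of the architecture.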

The models available for local use include some genuinely capable options:

  • Llama 3 (Meta's open-source model — performs comparably to early GPT-4)
  • Mistral (French AI lab — fast, efficient, strong on instruction-following)
  • Phi-3 (Microsoft's small but surprisingly capable model)
  • Gemma (Google's open-source model)

These aren't toy models. For the kinds of tasks a small healthcare practice needs — drafting communications, summarizing documents, answering staff questions, generating templates — they perform well.

Why This Matters for HIPAA

The central HIPAA concern with AI tools is data leaving your control. When PHI goes to OpenAI's servers or Google's servers, it's being processed by a vendor who may or may not have signed a Business Associate Agreement with you, and who may or may not use that data for model training or other purposes.

With a locally-deployed model, none of that applies. The data never leaves your network. There's no third-party vendor receiving PHI. There's no BAA needed because there's no business associate in the picture — you're running your own infrastructure.

This doesn't mean local AI is automatically HIPAA-compliant in all uses. HIPAA compliance involves more than just data transmission — it includes access controls, audit logs, encryption at rest, and workforce training. But on the specific question of "is patient data leaving my control?" — local AI gives you a clear answer: no.

For practices that want to use AI assistance for tasks that involve PHI — drafting notes, summarizing patient histories, generating prior auth letters with actual patient details — local AI is the architecture that makes it possible without the cloud compliance exposure.

The Trade-offs Are Real

Local AI isn't free from trade-offs. Here's an honest assessment:

Hardware requirements. Running a capable AI model locally requires a reasonably modern computer. A machine with a dedicated GPU will run models significantly faster than one without. For most small practices, a mid-range workstation (roughly $800–$1,500) can run models like Llama 3 8B comfortably. Larger models require more powerful hardware.

Setup complexity. Installing and configuring a local AI model is not a plug-and-play experience for non-technical users. Tools like Ollama have simplified this significantly, but it still requires someone who's comfortable with software configuration. This is the primary reason most small practices don't do it on their own.

Model capability gaps. Local models are genuinely capable, but the largest cloud models (GPT-4o, Claude 3 Opus, Gemini Ultra) still outperform them on complex reasoning tasks. For administrative work in a small practice, this gap usually doesn't matter. For complex clinical decision support, it might.

No internet? No real-time information. Local models don't have access to current information unless you feed it to them. If you need current CPT code updates or the latest payer policy, you'll need to provide that content manually — the model won't look it up.

Maintenance. Models get updated. Ollama gets updated. Someone in your practice needs to own keeping these things current — or you contract with someone who does.

What It Looks Like in Practice

Here's a realistic example of how a small OT practice might use local AI:

A therapist finishes a session and needs to write a progress note. Instead of dictating into an expensive medical transcription service or typing it out manually, they pull up a simple interface (we can build this as a custom tool for your practice) and describe the session in plain language. The local AI model drafts a structured note based on your practice's preferred format. The therapist reviews, edits, and approves. The note takes 3 minutes instead of 12.

The patient's name, diagnosis, and clinical details were involved in that process — but they never left the office. They were processed on a computer that sits in the same building as the patient's chart.

That's local AI applied to a real clinical workflow.
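
For a sense of how small that tool can be, here's a hedged sketch of the drafting step itself, again assuming a local Ollama server on its default port. The section headings and prompt wording below are illustrative placeholders, not a clinical documentation standard; a real tool would use your practice's own template.

```python
# Sketch of the note-drafting step from the workflow above.
# Assumes a local Ollama server (default port 11434) with a model pulled.
# The section headings are illustrative; substitute your practice's template.
import requests

NOTE_PROMPT = """You are drafting an occupational therapy progress note.
Use these sections: Subjective, Objective, Assessment, Plan.
Base the draft only on the therapist's description below.

Therapist's description:
{description}
"""

def draft_progress_note(description: str, model: str = "llama3") -> str:
    reply = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": NOTE_PROMPT.format(description=description),
            "stream": False,
        },
        timeout=180,
    )
    reply.raise_for_status()
    # The returned text is a draft only; the therapist reviews and edits it.
    return reply.json()["response"]
```

The model produces a draft; the therapist's review-and-approve step stays in the loop, exactly as described above.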

Another example: your front desk coordinator needs to draft a prior authorization letter for a patient's insurance company. They describe the situation to the local AI, paste in the relevant clinical codes, and the AI drafts the letter. They clean it up and send it. Time saved: 20 minutes. Data that left the building: none.

The Cloud Alternative (Done Right)

Local AI isn't the only compliant path. Major cloud AI providers have started offering HIPAA-eligible configurations with proper BAA options:

  • Microsoft Azure OpenAI Service — GPT-4 and other models through Azure, with HIPAA-eligible configuration and BAA available
  • AWS Bedrock — Access to multiple frontier models through Amazon's infrastructure, with BAA available for covered entities

These options give you cloud-level model performance with the legal framework that makes HIPAA compliance possible. They're more expensive than consumer AI tools and require proper configuration to be actually compliant — but they're a legitimate path for practices that want the latest models without running local hardware.
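
For comparison, the compliant-cloud path looks almost identical in code. Here's a minimal sketch using the openai Python package's Azure client; the endpoint, deployment name, and API version are placeholders for your own Azure configuration, and the BAA, access controls, and logging all live in the Azure setup rather than in this snippet.

```python
# Sketch of the compliant-cloud path: calling a model through Azure OpenAI.
# Requires the "openai" Python package (v1 or later). The endpoint, key,
# API version, and deployment name are placeholders for your own setup;
# the BAA and access controls are configured on the Azure side.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://your-resource.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

completion = client.chat.completions.create(
    model="your-gpt4-deployment",  # the deployment name created in Azure
    messages=[
        {"role": "user", "content": "Draft a template prior authorization letter."},
    ],
)
print(completion.choices[0].message.content)
```

Functionally it mirrors the local call; the difference is where the request goes and what contracts and controls stand behind it.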

The right choice between local and compliant cloud depends on your practice's size, technical capacity, volume of AI usage, and privacy preferences. It's not a one-size-fits-all answer.

Where to Start

You don't need to solve all of this today. But here's a reasonable sequence:

First: Get clarity on what your staff is actually using right now. Before you invest in any AI infrastructure, understand your current exposure.

Second: Identify the 2–3 administrative tasks where AI assistance would save the most time in your practice. Draft letters? Progress notes? Scheduling communications? Staff training materials?

Third: Evaluate whether those tasks involve PHI. If they don't, you can start with consumer AI tools under a basic usage policy. If they do, that's when the local vs. compliant cloud conversation becomes relevant.

Fourth: Get a proper setup. Whether that's Ollama on a local workstation, an Azure OpenAI configuration with a BAA, or a custom workflow built around one of these — do it right. The efficiency gains are real, but they need to be built on a foundation that doesn't put your practice at risk.

The practices that will benefit most from AI in the next three to five years aren't the ones who wait for a perfect solution. They're the ones who start now, start safely, and build from there.


If you want to know what AI implementation would actually look like in your specific practice — including what hardware you'd need, what tasks it makes sense for, and what compliance steps are required — that's a conversation we have as part of every free practice audit.

Ready to Find Out What's Costing Your Practice?

In 15 minutes, we'll identify your top 3 revenue and time leaks — at no cost and no obligation.

Get Your Free Practice Audit