Go's Constraints and Idioms Make AI Coding Better

From Vibe Coding to Vibe Engineering

Posted by Craig Johnston on Wednesday, February 4, 2026

Andrej Karpathy coined “vibe coding” in early 2025: “forget that the code even exists… I ‘Accept All’ always, I don’t read the diffs anymore.” Code becomes disposable, malleable, generated on demand. Karpathy vibe-coded a BPE tokenizer in Rust without deeply knowing Rust. For throwaway projects and weekend experiments, this works.

Simon Willison later proposed “vibe engineering” as the counterpart for production work: automated testing, planning, documentation, code review. Most teams struggle to bridge these two modes.

Language choice matters here. Go’s strong typing, enforced idioms, and tooling constrain AI output in useful ways.

The Language Determines the Outcome

Not all languages are equal for AI-assisted development. The constraints a language imposes determine whether you spend your time reviewing AI output or fixing it.

Consider the spectrum:

Dynamically typed languages (Python, JavaScript): AI can “get away with” more. Mistakes surface at runtime because there is no compile-time type check to catch them earlier. The AI produces plausible-looking code that fails in production. A function that returns None when you expected a list? That’s a 3 AM page, not a compile error.

Statically typed languages (Go, Rust, TypeScript): The compiler is a first-pass reviewer. Type errors provide immediate, specific feedback. The AI must satisfy the type system before code even runs.

This distinction matters more for AI than for human developers. LLMs are pattern matchers trained on code. They produce statistically likely code, not necessarily correct code. Dynamic languages accept statistically likely code. Typed languages reject it unless it’s also type-correct.
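
A minimal sketch (hypothetical function and names) of what that rejection looks like in Go: the failure path is part of the signature, so a caller that ignores it fails to compile rather than paging someone at 3 AM.

package main

import (
    "errors"
    "fmt"
)

// listOrders makes the failure path part of the type signature. A caller
// cannot silently receive "None" where it expected a list.
func listOrders(customerID string) ([]string, error) {
    if customerID == "" {
        return nil, errors.New("empty customer ID")
    }
    return []string{"order-1", "order-2"}, nil
}

func main() {
    // orders := listOrders("c-42") // compile error: assignment mismatch:
    //                              // 1 variable but listOrders returns 2 values
    orders, err := listOrders("c-42")
    if err != nil {
        fmt.Println("lookup failed:", err)
        return
    }
    fmt.Println(orders)
}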

The compiler becomes an automated verification loop. Anthropic’s Claude Code best practices call this “the single highest-leverage thing you can do.”

Go compiles fast, often sub-second for incremental builds. AI writes code, the compiler rejects it, the AI reads the error and fixes it. The loop is tight enough to run many times a minute.

Go’s Idiomatic Advantage

LLMs learn from code, and the public Go ecosystem gives them an unusually uniform body of code to learn from:

  • gofmt enforces a single formatting standard. No tabs-vs-spaces debates. No brace placement arguments. Just one way.
  • Effective Go and Go Code Review Comments are widely adopted. They’re community norms, not suggestions.
  • The standard library sets patterns that the ecosystem actually follows. HTTP handlers, error handling, context propagation all work the same way across most Go codebases.

When an LLM generates Go code, it draws from a more consistent training distribution. Python has PEP 8, but enforcement is optional. JavaScript has a dozen competing style guides. Go has one way.

This uniformity reduces AI variance. The if err != nil { return err } pattern appears millions of times in training data, so the LLM knows it cold. Naming uses MixedCaps, not underscores, with exported vs unexported determined by capitalization. Packages are flat, with internal directories for private code. There’s just less to get wrong.
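
A short sketch of those idioms together (hypothetical package): early returns on errors, wrapped errors, and visibility controlled by capitalization.

package store

import (
    "errors"
    "fmt"
    "strings"
)

// Conn is exported: MixedCaps, capitalized first letter.
type Conn struct {
    dsn string // unexported field
}

// Open follows the pattern the model has seen millions of times:
// check the error, wrap it with context, return early.
func Open(dsn string) (*Conn, error) {
    if err := validateDSN(dsn); err != nil {
        return nil, fmt.Errorf("open %q: %w", dsn, err)
    }
    return &Conn{dsn: dsn}, nil
}

// validateDSN is unexported: lowercase first letter, package-private.
func validateDSN(dsn string) error {
    if !strings.HasPrefix(dsn, "postgres://") {
        return errors.New("unsupported scheme")
    }
    return nil
}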

The practical result: less time correcting AI output. The AI’s first guess is more likely to match your team’s standards because those standards match the community’s standards.

The Tooling Ecosystem as Verification Infrastructure

Anthropic’s best practices state: “Claude performs dramatically better when it can verify its own work.”

Go’s verification stack:

# Unit tests with race detection
go test -race ./...

# Coverage verification
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out

# Built-in static analysis
go vet ./...

# Aggregated linters
golangci-lint run ./...

# Additional static analysis
staticcheck ./...

# Security scanning
gosec ./...

# Cyclomatic complexity
gocyclo -over 10 .

# Ineffectual assignments (dead stores)
ineffassign ./...

# Spelling in comments and strings
misspell ./...

# Dependency vulnerabilities
govulncheck ./...

# Build/release validation
goreleaser release --snapshot --clean --skip=publish,sign,sbom

Why each tool matters for AI:

The -race flag catches data races that look correct on paper. AI writes concurrent code that compiles fine but races at runtime. A human might miss this in review. The race detector won’t.
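
A hypothetical example of the kind of concurrent code AI produces that compiles cleanly but fails under go test -race:

package counter

import "sync"

// Visits compiles and often returns the right answer, but multiple goroutines
// write total without synchronization. The race detector reports the exact
// conflicting read and write.
func Visits(n int) int {
    total := 0
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            total++ // data race: unsynchronized access from many goroutines
        }()
    }
    wg.Wait()
    return total
}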

Coverage thresholds are enforceable. Require 80%+ and the AI can’t skip tests. It can check compliance immediately.

go vet is built into Go. It catches suspicious constructs: Printf format string mismatches, unreachable code, bad struct tags. Fast and zero-config.
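
A hypothetical snippet vet’s printf check flags even though it compiles:

package main

import "fmt"

func main() {
    user := "alice"
    count := 3
    // Compiles fine; go vet reports that %d is given a string and %s an int.
    fmt.Printf("user %d has %s items\n", user, count)
}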

golangci-lint aggregates dozens of linters. Runs them in parallel with shared caching. The AI gets specific feedback: “unused parameter ‘ctx’ in function Foo” beats “this code doesn’t look right.”

staticcheck finds bugs and performance issues golangci-lint might miss. Deprecated API usage, inefficient string concatenation, incorrect time.Time comparisons.
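
As a sketch of the time.Time case: comparing Time values with == also compares the monotonic clock reading and the Location, so Equal is almost always the comparison you want.

package sched

import "time"

// sameDeadline uses Equal rather than ==. Two values representing the same
// instant can still compare unequal with == if their monotonic readings or
// Locations differ.
func sameDeadline(a, b time.Time) bool {
    return a.Equal(b)
}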

gosec catches security issues. SQL injection, hardcoded credentials, weak crypto, path traversal. AI-generated code introduces these easily.
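
A sketch of the most common finding (hypothetical queries): SQL built by concatenating user input versus the parameterized form.

package store

import (
    "context"
    "database/sql"
)

// Risky: the query is built by concatenating user input, the pattern gosec
// flags as potential SQL injection.
func userIDRisky(ctx context.Context, db *sql.DB, name string) *sql.Row {
    return db.QueryRowContext(ctx, "SELECT id FROM users WHERE name = '"+name+"'")
}

// Safe: a parameterized query; the driver handles quoting and escaping.
func userID(ctx context.Context, db *sql.DB, name string) *sql.Row {
    return db.QueryRowContext(ctx, "SELECT id FROM users WHERE name = $1", name)
}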

gocyclo measures cyclomatic complexity. AI tends to generate deeply nested conditionals. Keeping complexity under 10 per function forces refactoring into smaller, testable units.
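
A hypothetical sketch of the shape the threshold pushes toward: guard clauses and early returns rather than nested conditionals, so each path stays small and testable.

package order

import "errors"

// Validate stays flat: each rule is a guard clause with its own early return,
// instead of one deeply nested if/else tree.
func Validate(qty int, sku string, inStock bool) error {
    if qty <= 0 {
        return errors.New("quantity must be positive")
    }
    if sku == "" {
        return errors.New("missing SKU")
    }
    if !inStock {
        return errors.New("item out of stock")
    }
    return nil
}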

ineffassign finds assignments to variables that are never used afterward. AI generates dead code. This catches it.
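
A hypothetical example: the first assignment below compiles, but it is overwritten before anything reads it, which ineffassign reports and the compiler does not.

package parse

import "strconv"

// atoiOrZero compiles cleanly, but the first assignment to n is dead:
// strconv.Atoi overwrites it before any read.
func atoiOrZero(s string) int {
    n := 0
    n, err := strconv.Atoi(s)
    if err != nil {
        return 0
    }
    return n
}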

misspell finds spelling errors in comments and strings. AI hallucinates words occasionally. This catches them before they ship.

govulncheck scans dependencies for known vulnerabilities. AI suggests packages from training data. Some have CVEs now.

GoReleaser validates the whole pipeline. Docker images, binaries, checksums. If it builds in CI, it builds for real.

Go Report Card runs several of these checks against public repositories. 100% on goreportcard.com is a concrete, verifiable target.

Go’s tooling was designed for automation. Machine-readable output. CI integration. The verification loop AI needs to self-correct.

Testcontainers for End-to-End Verification

Unit tests verify logic. Integration tests verify systems. Testcontainers-go spins up real Docker containers during tests: databases, message queues, caches, entire service stacks.

import (
    "context"
    "testing"

    "github.com/stretchr/testify/require"
    "github.com/testcontainers/testcontainers-go"
    "github.com/testcontainers/testcontainers-go/wait"
)

func TestUserRepository(t *testing.T) {
    ctx := context.Background()

    postgres, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
        ContainerRequest: testcontainers.ContainerRequest{
            Image:        "postgres:16-alpine",
            ExposedPorts: []string{"5432/tcp"},
            Env: map[string]string{
                "POSTGRES_PASSWORD": "test",
                "POSTGRES_DB":       "testdb",
            },
            WaitingFor: wait.ForLog("database system is ready to accept connections"),
        },
        Started: true,
    })
    require.NoError(t, err)
    defer postgres.Terminate(ctx)

    // Get connection string, run actual database tests
    host, _ := postgres.Host(ctx)
    port, _ := postgres.MappedPort(ctx, "5432")
    // ...
}

Why this matters for AI-assisted development:

AI generates database queries, cache interactions, queue consumers. Unit tests with mocks only prove the code matches the mock’s expectations. Testcontainers proves it actually works against real Postgres, real Redis, real Kafka.

The AI can run these tests itself. Write a repository method, spin up Postgres, run the query, verify results. No mock behavior to get wrong. No “works on my machine” surprises.

Testcontainers handles cleanup automatically. Containers start fresh for each test. No state leaks between runs. The AI can iterate rapidly without accumulating test pollution.

For microservices, compose multiple containers. Test your service against its actual dependencies. The verification loop extends from “does it compile” to “does it work in a realistic environment.”
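
A sketch of that composition, reusing the imports from the test above (hypothetical test; Redis as the second dependency):

func TestServiceWithDependencies(t *testing.T) {
    ctx := context.Background()

    redis, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
        ContainerRequest: testcontainers.ContainerRequest{
            Image:        "redis:7-alpine",
            ExposedPorts: []string{"6379/tcp"},
            WaitingFor:   wait.ForLog("Ready to accept connections"),
        },
        Started: true,
    })
    require.NoError(t, err)
    defer redis.Terminate(ctx)

    // Start Postgres exactly as in TestUserRepository, then wire both
    // endpoints into the service's configuration and exercise real requests.
}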

Encoding Discipline: The CLAUDE.md Philosophy

CLAUDE.md is persistent context that loads at the start of every AI session. Unlike chat history, it survives session restarts. Team standards go here.

Keep it concise. Every line competes for context window space. Only include what the AI can’t infer from the code itself.

What belongs: verification commands (the AI can’t guess your exact test invocation), coverage thresholds, project-specific conventions that differ from Go defaults.

What doesn’t: standard Go idioms (the AI already knows these), verbose explanations, things a linter enforces anyway.

Here’s a production CLAUDE.md excerpt for Go projects:

## Code Standards

1. **Idiomatic Go**: Follow gofmt, Effective Go, and Go Code Review Comments.

2. **Test Coverage**: Maintain >80% coverage. Run:
   - `go test -coverprofile=coverage.out ./pkg/...`
   - `go tool cover -func=coverage.out | grep <function_name>` for specific functions
   - Framework callbacks may be excluded if core logic is tested separately

3. **Full Test Suite**: "Testing" means passing all checks:
   - `go test -race ./...`
   - `go test -coverprofile=coverage.out ./...`
   - `go vet ./...`
   - `golangci-lint run ./...`
   - `staticcheck ./...`
   - `gosec ./...`
   - `gocyclo -over 10 .`
   - `ineffassign ./...`
   - `misspell ./...`
   - `govulncheck ./...`
   - `goreleaser release --snapshot --clean --skip=publish,sign,sbom`

4. **Human Review**: All code requires human review before commit.

5. **Go Report Card**: Maintain 100% on [goreportcard.com](https://goreportcard.com).

Concrete commands. Objective thresholds. Explicit human review requirement.

The Workflow: Vibe Engineering in Practice

The four-phase workflow from Claude Code best practices maps well to Go development:

1. Explore: Use Plan Mode. Read files, understand the codebase structure. Identify existing patterns. No changes yet.

2. Plan: Design the implementation. Which packages need modification? What interfaces change? How will you test edge cases? Write this down before writing code.

3. Implement: Write code with verification at each step. Run go test after every significant change. Let the compiler and linter guide corrections.

4. Verify: Full test suite, linting, security scan, human review. Don’t skip steps because the AI seems confident. AI confidence does not correlate with correctness.

The discipline matters most in phase 4. The AI might generate code that compiles, passes tests, and looks reasonable. The security scanner might still find a vulnerability. The coverage report might reveal an untested error path. The human reviewer might notice a semantic bug the tools missed.

This is what separates vibe coding from vibe engineering.

Constraints Enable Creativity

Vibe coding removes constraints. Production development requires them.

Go’s constraints happen to align well with what AI needs. The type system catches errors before runtime. The idioms are uniform enough that AI output is predictable. The tooling automates verification.

The ingredients for AI-assisted development: a statically typed language, fast compilation, uniform idioms, a verification stack the AI can run itself, and standards encoded in CLAUDE.md.

Go isn’t the only option. Rust has stronger guarantees but slower compilation. TypeScript works for frontend. Go compiles fast, has uniform idioms, and the tooling integrates well.

I trust AI more when the compiler and tests can call it out. Go’s tooling does that well.



Note: This blog is a collection of personal notes. Making them public encourages me to think beyond the limited scope of the current problem I'm trying to solve or concept I'm implementing, and hopefully provides something useful to my team and others.

This blog post, titled "Go's Constraints and Idioms Make AI Coding Better: From Vibe Coding to Vibe Engineering" by Craig Johnston, is licensed under a Creative Commons Attribution 4.0 International License.