dhy.tr: personal notes & technical writing

Memory Allocation Optimization with sync.Pool in Go: Practical Ways to Reduce GC Pressure

Problem

Go's garbage collector (GC) is a great engineering achievement: a low-latency, concurrent mark-sweep collector that is non-generational but gets the job done. However, in high-throughput systems (API gateways, proxies, message brokers), continuously allocating objects and leaving them to the GC creates serious performance problems.

Consider this: you wrote an API server handling 100,000 HTTP requests per second. For each request, you're allocating a bytes.Buffer, a json.Encoder, and maybe a few []byte slices. This means hundreds of thousands of heap allocations per second. The GC has to work continuously to collect these objects, eating into your CPU time.

A real-world example: in a log aggregation pipeline, I was processing 500,000 log lines per second. When I profiled, I saw the GC was consuming 18% of the CPU. The solution? sync.Pool.

What is sync.Pool?

sync.Pool is an object pool that comes with Go's standard library. It lets you reuse temporary objects, dramatically reducing the GC's workload.

var bufferPool = sync.Pool{
    New: func() interface{} {
        return new(bytes.Buffer)
    },
}

This simple structure says: "If there's an idle Buffer in the pool, return it; otherwise, create a new one."
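
As a minimal sketch of the full Get/Put cycle around that pool (the greet function and its output format are made up for illustration):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

var bufferPool = sync.Pool{
	New: func() interface{} {
		return new(bytes.Buffer)
	},
}

// greet builds a string in a pooled buffer and returns a copy of it.
func greet(name string) string {
	buf := bufferPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset()         // leave the buffer clean for the next user
		bufferPool.Put(buf) // hand it back to the pool
	}()

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result is safe to keep
}

func main() {
	fmt.Println(greet("gopher"))
	fmt.Println(greet("pool")) // very likely reuses the buffer from the first call
}
```

Whether the second call actually reuses the first buffer is up to the runtime; the code is correct either way.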

How It Works

The internal mechanism of sync.Pool works as follows:

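  1. Per-P Local Cache: Each P (logical processor; there are GOMAXPROCS of them) has its own local pool with a private slot that only that P touches, so the fast path needs no locking.
  2. Shared Pool: Each P also has a shared queue. When a P's private slot and its own shared queue are empty, Get tries to steal from the shared queues of other Ps (similar to work-stealing).
  3. GC Cleanup: The pool is cleared as part of garbage collection, so a pooled object can disappear at any time. This is why sync.Pool is only for temporary objects; you can't store persistent state in it.
  4. Victim Cache: Since Go 1.13, clearing happens in two steps. At each GC the current contents move to a victim cache, and the previous victim cache is dropped. An object therefore survives one GC cycle and can still be reused during sudden allocation spikes; it is freed only if it goes unused across two consecutive cycles.

// Internal structure of sync.Pool (simplified)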
type Pool struct {
    noCopy noCopy

    local     unsafe.Pointer // [P]poolLocal array
    localSize uintptr

    victim     unsafe.Pointer // previous cycle's locals
    victimSize uintptr

    New func() interface{}
}

type poolLocal struct {
    poolLocalInternal
    pad [128 - unsafe.Sizeof(poolLocalInternal{})%128]byte
}

type poolLocalInternal struct {
    private interface{}   // Only used by the owning P
    shared  poolChain     // Can be stolen by other Ps
}
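
The clearing behavior can be observed from the outside. The sketch below counts calls to New around two forced collections. The name newCallsAfterTwoGCs is a hypothetical helper, and the comments about the victim cache assume Go 1.13 or later.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// newCalls counts how often the pool had to construct a fresh object.
// It is only touched from a single goroutine in this sketch.
var newCalls int

var pool = sync.Pool{
	New: func() interface{} {
		newCalls++
		return new([1024]byte)
	},
}

// newCallsAfterTwoGCs runs Get, Put, two GC cycles, Get, and reports
// how many times New had to run along the way.
func newCallsAfterTwoGCs() int {
	newCalls = 0

	obj := pool.Get() // nothing pooled yet, so New runs
	pool.Put(obj)

	runtime.GC() // pooled object moves to the victim cache
	runtime.GC() // victim cache is dropped

	_ = pool.Get() // nothing left to reuse, so New runs again
	return newCalls
}

func main() {
	fmt.Println("New calls:", newCallsAfterTwoGCs())
}
```

After two full collections the pooled object is gone, so New runs twice. What happens after only one collection is deliberately not asserted here, because reuse from the victim cache depends on which P the goroutine happens to run on.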

Usage: Right and Wrong

Correct Usage

package main

import (
    "bytes"
    "encoding/json"
    "sync"
)

var jsonBufferPool = sync.Pool{
    New: func() interface{} {
        return new(bytes.Buffer)
    },
}

func handleRequest(data interface{}) ([]byte, error) {
    // Get buffer from pool
    buf := jsonBufferPool.Get().(*bytes.Buffer)
    defer func() {
        buf.Reset() // IMPORTANT: clear the buffer
        jsonBufferPool.Put(buf) // Return to pool
    }()

    // JSON encode
    encoder := json.NewEncoder(buf)
    if err := encoder.Encode(data); err != nil {
        return nil, err
    }

    // Copy buffer contents (before returning to pool!)
    result := make([]byte, buf.Len())
    copy(result, buf.Bytes())
    return result, nil
}

Wrong Usage (DO NOT DO THIS)

// WRONG: Returning a reference to an object taken from the pool
func handleRequestBad(data interface{}) ([]byte, error) {
    buf := jsonBufferPool.Get().(*bytes.Buffer)
    defer jsonBufferPool.Put(buf) // buf.Reset() IS MISSING!
    
    encoder := json.NewEncoder(buf)
    encoder.Encode(data)
    
    // WRONG: buf.Bytes() returns the internal slice
    return buf.Bytes(), nil // DATA CORRUPTED WHEN BUFFER RETURNED TO POOL
}

Critical Rules

  1. Always Put after Get: Using defer is the safest approach.
  2. Reset before Put: If you don't reset the object, the next user will see old data.
  3. Don't leak references from the pool: Copy the data, not the pointer.
  4. sync.Pool is NOT persistent storage: GC can delete everything.
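
Rule 3 is the easiest one to get wrong, so here is a small sketch of what goes wrong. To keep it deterministic it reuses a single bytes.Buffer directly instead of going through a pool; aliasDemo is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
)

// aliasDemo reuses one buffer the way a pool would and returns what an
// earlier, uncopied result looks like afterwards, next to a proper copy.
func aliasDemo() (leaked, copied string) {
	buf := new(bytes.Buffer)

	buf.WriteString("first")
	leakedBytes := buf.Bytes()                         // aliases the buffer's internal array
	copiedBytes := append([]byte(nil), buf.Bytes()...) // independent copy

	// Simulate the buffer going back to the pool and being picked up
	// by the next request.
	buf.Reset()
	buf.WriteString("OTHER")

	return string(leakedBytes), string(copiedBytes)
}

func main() {
	leaked, copied := aliasDemo()
	fmt.Printf("leaked=%q copied=%q\n", leaked, copied)
}
```

The uncopied slice now reads "OTHER" because it shares memory with the reused buffer, while the copy still reads "first". With a real pool the overwrite would come from another goroutine at an unpredictable time, which makes the bug much harder to spot.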

Benchmark: With Pool vs Without Pool

Let's run a real benchmark:

package main

import (
    "bytes"
    "encoding/json"
    "sync"
    "testing"
)

type TestData struct {
    ID      int      `json:"id"`
    Name    string   `json:"name"`
    Tags    []string `json:"tags"`
    Active  bool     `json:"active"`
}

var testData = TestData{
    ID:     42,
    Name:   "sync-pool-test",
    Tags:   []string{"golang", "performance", "optimization"},
    Active: true,
}

// Pool-less version
func BenchmarkJSONEncodeNoPool(b *testing.B) {
    for i := 0; i < b.N; i++ {
        buf := new(bytes.Buffer)
        encoder := json.NewEncoder(buf)
        encoder.Encode(testData)
        _ = buf.Bytes()
    }
}

// Pooled version
var pool = sync.Pool{
    New: func() interface{} {
        return new(bytes.Buffer)
    },
}

func BenchmarkJSONEncodeWithPool(b *testing.B) {
    for i := 0; i < b.N; i++ {
        buf := pool.Get().(*bytes.Buffer)
        encoder := json.NewEncoder(buf)
        encoder.Encode(testData)
        buf.Reset()
        pool.Put(buf)
    }
}

Typical results (Go 1.22, AMD64); exact figures vary with hardware and Go version:

BenchmarkJSONEncodeNoPool-16      2,500,000    450 ns/op    256 B/op    3 allocs/op
BenchmarkJSONEncodeWithPool-16    5,800,000    205 ns/op      0 B/op    0 allocs/op

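Difference: about 2.2x faster, no per-operation allocations reported, and lower GC pressure. Keep in mind that the pooled benchmark never copies the encoded bytes out of the buffer, which a real handler returning the result would still have to do.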
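
For a quick sanity check outside the benchmark harness, testing.AllocsPerRun reports the average number of allocations per call of a function. The sketch below is not the JSON benchmark above: it uses a hypothetical 1 KiB payload and a plain buffer write, so only the relative difference is meaningful.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
	"testing"
)

// payload is large enough that the buffer's storage must live on the heap.
var payload = bytes.Repeat([]byte("x"), 1024)

var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

var sink int // keeps the result observable so the work is not optimized away

func withoutPool() {
	buf := new(bytes.Buffer)
	buf.Write(payload)
	sink = buf.Len()
}

func withPool() {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Write(payload)
	sink = buf.Len()
	buf.Reset()
	bufPool.Put(buf)
}

// measure returns the average allocations per call for both variants.
func measure() (noPool, pooled float64) {
	noPool = testing.AllocsPerRun(1000, withoutPool)
	pooled = testing.AllocsPerRun(1000, withPool)
	return noPool, pooled
}

func main() {
	noPool, pooled := measure()
	fmt.Printf("allocs/op without pool: %.0f, with pool: %.0f\n", noPool, pooled)
}
```

AllocsPerRun performs a warm-up call before measuring, so the pooled variant starts with a buffer that already has enough capacity.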

Real-World Scenarios

1. Request Context Pooling in HTTP Servers

type RequestContext struct {
    UserID    string
    SessionID string
    Body      []byte
    Headers   map[string]string
    TraceID   string
}

var ctxPool = sync.Pool{
    New: func() interface{} {
        return &RequestContext{
            Headers: make(map[string]string, 16),
            Body:    make([]byte, 0, 4096),
        }
    },
}

func handleHTTP(w http.ResponseWriter, r *http.Request) {
    ctx := ctxPool.Get().(*RequestContext)
    defer func() {
        // Cleanup — reset slices to zero length but keep capacity
        ctx.Body = ctx.Body[:0]
        ctx.UserID = ""
        ctx.SessionID = ""
        ctx.TraceID = ""
        for k := range ctx.Headers {
            delete(ctx.Headers, k)
        }
        ctxPool.Put(ctx)
    }()

    // Parse request...
    ctx.UserID = r.Header.Get("X-User-ID")
    ctx.TraceID = r.Header.Get("X-Trace-ID")

    // Business logic...
    processRequest(ctx)

    // Write response...
    w.WriteHeader(http.StatusOK)
}

This pattern can reduce allocations by up to 90% in high-QPS services.
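
One way to keep the cleanup from drifting out of sync with the struct is to move it into a Reset method and wrap the pool in two small helpers. This is a sketch; acquireCtx and releaseCtx are hypothetical names, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
)

type RequestContext struct {
	UserID    string
	SessionID string
	Body      []byte
	Headers   map[string]string
	TraceID   string
}

// Reset clears per-request state but keeps the map and the slice capacity.
func (c *RequestContext) Reset() {
	c.UserID, c.SessionID, c.TraceID = "", "", ""
	c.Body = c.Body[:0]
	for k := range c.Headers {
		delete(c.Headers, k)
	}
}

var ctxPool = sync.Pool{
	New: func() interface{} {
		return &RequestContext{
			Headers: make(map[string]string, 16),
			Body:    make([]byte, 0, 4096),
		}
	},
}

// acquireCtx fetches a context from the pool.
func acquireCtx() *RequestContext { return ctxPool.Get().(*RequestContext) }

// releaseCtx resets the context and returns it to the pool.
// The caller must not touch c afterwards.
func releaseCtx(c *RequestContext) {
	c.Reset()
	ctxPool.Put(c)
}

func main() {
	ctx := acquireCtx()
	ctx.UserID = "u-1"
	ctx.Headers["X-Trace-ID"] = "t-1"
	ctx.Body = append(ctx.Body, "payload"...)
	fmt.Println(ctx.UserID, len(ctx.Headers), len(ctx.Body))
	releaseCtx(ctx)
}
```

When a field is added to RequestContext, Reset is the single place that has to learn about it.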

2. []byte Pools — Wire-Size Buffers

The most common allocation in network programming is creating []byte buffers for reading incoming data:

var bufPool = sync.Pool{
    New: func() interface{} {
        // 64KB buffer — larger than a typical TCP segment
        b := make([]byte, 64*1024)
        return &b
    },
}

func readConnection(conn net.Conn) error {
    bufPtr := bufPool.Get().(*[]byte)
    buf := *bufPtr
    defer func() {
        bufPool.Put(bufPtr)
    }()

    n, err := conn.Read(buf)
    if err != nil {
        return err
    }

    // Process incoming data
    process(buf[:n])
    return nil
}

3. Protocol Buffers

If you continuously decode (or encode) protobuf messages, you can pool the message structs themselves:

var protoPool = sync.Pool{
    New: func() interface{} {
        return &mypb.LargeMessage{}
    },
}

func handleProtobuf(conn net.Conn) error {
    msg := protoPool.Get().(*mypb.LargeMessage)
    defer func() {
        msg.Reset() // protobuf's own Reset method
        protoPool.Put(msg)
    }()

    data, err := io.ReadAll(conn)
    if err != nil {
        return err
    }

    if err := proto.Unmarshal(data, msg); err != nil {
        return err
    }

    return processMessage(msg)
}

Advanced Techniques

Size-Class Pooling

Not all buffers are the same size. You can use separate pools for different size classes:

type ByteBufferPool struct {
    small  sync.Pool // 4KB
    medium sync.Pool // 64KB
    large  sync.Pool // 1MB
}

func NewByteBufferPool() *ByteBufferPool {
    return &ByteBufferPool{
        small: sync.Pool{
            New: func() interface{} {
                return make([]byte, 4096)
            },
        },
        medium: sync.Pool{
            New: func() interface{} {
                return make([]byte, 65536)
            },
        },
        large: sync.Pool{
            New: func() interface{} {
                return make([]byte, 1048576)
            },
        },
    }
}

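func (p *ByteBufferPool) Get(size int) []byte {
    switch {
    case size <= 4096:
        return p.small.Get().([]byte)[:size]
    case size <= 65536:
        return p.medium.Get().([]byte)[:size]
    case size <= 1048576:
        return p.large.Get().([]byte)[:size]
    default:
        // Larger than the biggest class: allocate directly. Slicing a
        // 1MB pooled buffer to a larger size would panic.
        return make([]byte, size)
    }
}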
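
A size-class pool also needs a way to give buffers back. Here is one possible sketch with a matching Put, storing *[]byte so the pool itself does not box slice headers. SizedPool and classSizes are hypothetical names, and the class sizes simply mirror the ones used above.

```go
package main

import (
	"fmt"
	"sync"
)

// classSizes are the buffer capacities of each size class, smallest first.
var classSizes = [...]int{4 << 10, 64 << 10, 1 << 20}

type SizedPool struct {
	classes [len(classSizes)]sync.Pool
}

func NewSizedPool() *SizedPool {
	p := &SizedPool{}
	for i := range p.classes {
		size := classSizes[i]
		p.classes[i].New = func() interface{} {
			b := make([]byte, size)
			return &b
		}
	}
	return p
}

// Get returns a slice of length size, taken from the smallest class that
// fits. Requests larger than every class are allocated directly.
func (p *SizedPool) Get(size int) []byte {
	for i, c := range classSizes {
		if size <= c {
			return (*p.classes[i].Get().(*[]byte))[:size]
		}
	}
	return make([]byte, size)
}

// Put returns buf to the class matching its capacity. Buffers whose
// capacity matches no class (for example oversized ones) are dropped.
func (p *SizedPool) Put(buf []byte) {
	for i, c := range classSizes {
		if cap(buf) == c {
			buf = buf[:c] // restore full length for the next Get
			p.classes[i].Put(&buf)
			return
		}
	}
}

func main() {
	p := NewSizedPool()
	buf := p.Get(1500)
	fmt.Println(len(buf), cap(buf))
	p.Put(buf)
}
```

Put takes the address of its parameter, which costs one small allocation for the slice header. Handing the *[]byte itself to callers would avoid that at the price of a clumsier API; which trade-off is better depends on how hot the path is.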

Generic sync.Pool (Go 1.18+)

With generics introduced in Go 1.18, you can write type-safe pools:

type Pool[T any] struct {
    pool sync.Pool
}

func NewPool[T any](fn func() T) *Pool[T] {
    return &Pool[T]{
        pool: sync.Pool{
            New: func() interface{} {
                return fn()
            },
        },
    }
}

func (p *Pool[T]) Get() T {
    return p.pool.Get().(T)
}

func (p *Pool[T]) Put(v T) {
    p.pool.Put(v)
}

// Usage
var bufPool = NewPool(func() *bytes.Buffer {
    return new(bytes.Buffer)
})

func useBuffer() {
    buf := bufPool.Get() // Returns *bytes.Buffer, no type assertion needed!
    defer bufPool.Put(buf)
    buf.Reset()
}
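
A generic wrapper is also a convenient place to enforce the Reset step. The sketch below is a hypothetical variant (ResettingPool is not a standard library type) that takes a reset function and calls it inside Put, so callers cannot forget it.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// ResettingPool wraps sync.Pool and resets every value on Put.
type ResettingPool[T any] struct {
	pool  sync.Pool
	reset func(T)
}

func NewResettingPool[T any](newFn func() T, reset func(T)) *ResettingPool[T] {
	return &ResettingPool[T]{
		pool:  sync.Pool{New: func() interface{} { return newFn() }},
		reset: reset,
	}
}

func (p *ResettingPool[T]) Get() T { return p.pool.Get().(T) }

// Put resets v before storing it in the pool.
func (p *ResettingPool[T]) Put(v T) {
	p.reset(v)
	p.pool.Put(v)
}

var buffers = NewResettingPool(
	func() *bytes.Buffer { return new(bytes.Buffer) },
	(*bytes.Buffer).Reset, // method expression with type func(*bytes.Buffer)
)

// render wraps s in angle brackets using a pooled buffer.
func render(s string) string {
	buf := buffers.Get()
	defer buffers.Put(buf)

	buf.WriteString("<")
	buf.WriteString(s)
	buf.WriteString(">")
	return buf.String() // copy taken before the deferred Put runs
}

func main() {
	fmt.Println(render("a"), render("b"))
}
```

As with the plain wrapper, T should be a pointer type so that Put does not allocate when the value is converted to an interface.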

When NOT to Use sync.Pool

Pooling everything isn't the answer either. Here are situations where you should avoid it:

1. Short-Lived, Small Objects

Go's escape analysis keeps objects that don't escape their function on the stack, and small heap allocations are cheap anyway. Pooling such objects does more harm than good:

// DON'T DO THIS — pooling for int is unnecessary
var intPool = sync.Pool{
    New: func() interface{} { return new(int) },
}

2. Objects with State

If an object carries state that must be preserved between uses, or state that is easy to forget to clear before reuse, don't pool it:

// WRONG — state leak risk
type User struct {
    ID        int
    LastLogin time.Time
    cache     map[string]string // You might forget to clear this
}

3. Very Large Objects (Long-Term GC Delays)

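Very large objects inflate memory while they sit idle in the pool, and they stay resident until the GC clears the pool. A buffer that grew to handle one unusually large payload keeps that capacity for every later, smaller use.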

Profiling and Monitoring

To measure the impact of sync.Pool usage:

import (
    "fmt"
    "runtime"
)

func printMemStats() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    fmt.Printf("Alloc = %v MiB\n", m.Alloc/1024/1024)
    fmt.Printf("TotalAlloc = %v MiB\n", m.TotalAlloc/1024/1024)
    fmt.Printf("Sys = %v MiB\n", m.Sys/1024/1024)
    fmt.Printf("NumGC = %v\n", m.NumGC)
    fmt.Printf("PauseTotalNs = %v\n", m.PauseTotalNs)
}

Better yet, compare with benchmarks:

# Run benchmark
go test -bench=. -benchmem -benchtime=10s

# Generate memory allocation profile
go test -bench=. -benchmem -memprofile=mem.out
go tool pprof -http=:8080 mem.out

# CPU profile
go test -bench=. -cpuprofile=cpu.out
go tool pprof -http=:8081 cpu.out

Real-World Case Study: fasthttp

fasthttp, probably the best-known high-performance HTTP library for Go, uses sync.Pool extensively. Request and response objects, header parsing structures, and even connection buffers are pooled.

Here's a simplified illustration of the pattern, modeled on fasthttp's server code:

// fasthttp/server.go — simplified
var ctxPool = sync.Pool{
    New: func() interface{} {
        return &RequestCtx{}
    },
}

func (s *Server) serveConn(c net.Conn) error {
    ctx := ctxPool.Get().(*RequestCtx)
    ctx.s = s
    ctx.c = c

    // ... request parsing, routing, response ...

    ctx.s = nil
    ctx.c = nil
    ctxPool.Put(ctx)
    return nil
}

This approach is one of the main reasons fasthttp can advertise itself as up to 10x faster than net/http. (Of course, it's not just pooling: zero-copy parsing, skipping header normalization, and other optimizations also play a role.)

Conclusion

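sync.Pool is one of the most powerful but most misunderstood tools for memory allocation optimization in Go. When used correctly in allocation-heavy workloads, it can reduce GC pressure by 30-70% and increase throughput by 2-5x, though the gain depends entirely on how much of your CPU time allocation and GC actually consume.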

Summary recommendations:

  1. Profile first — make sure you have an allocation problem (go test -benchmem, pprof).
  2. Pool large, frequently-allocated objects: bytes.Buffer, []byte slices, protobuf messages.
  3. Never bend the Get → use → Reset → Put rule; secure it with defer.
  4. Don't leak pointers taken from the pool; copy the data to keep it intact.
  5. Don't pool small, stack-allocated objects; it only adds overhead.
  6. Offer a type-safe API with a generic wrapper (Go 1.18+).

And remember: the best allocation is the one never made. But the second best is making it once and using it many times.

Tags: go, golang, sync-pool, memory-optimization, garbage-collection, allocation, performance, fasthttp, benchmarking, profiling

Date: 2026-05-27
