Comprehensive Guide: Implementing Error Handling in Golang Microservices with Best Practices


1 Mar 2015
Error handling in microservices presents challenges that a monolithic application does not. In a distributed system, an error can cross several service boundaries before anyone sees it, so we need to consider how errors propagate and how they affect overall system behavior.

Let's explore how to implement effective error handling in Golang microservices. The key is a consistent strategy that preserves context and provides meaningful information for debugging and monitoring.

Network failures, timeouts, and partial system failures are routine in this setting, and every service must handle them gracefully.

Creating a custom error type allows us to include additional context and metadata:

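// imports: fmt, time

type CustomError struct {
	Code       string    // machine-readable error code
	Message    string    // human-readable description
	Timestamp  time.Time // when the error occurred
	TraceID    string    // request trace identifier
	ServiceID  string    // service that produced the error
	Retryable  bool      // whether the caller may retry
	StatusCode int       // HTTP status to return to the client
}

// Error implements the error interface.
func (e *CustomError) Error() string {
	return fmt.Sprintf("[%s] %s", e.Code, e.Message)
}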

When handling errors across service boundaries, we need to consider error serialization and deserialization:

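// imports: encoding/json, errors, net/http

type ErrorResponse struct {
	Error struct {
		Code    string         `json:"code"`
		Message string         `json:"message"`
		Details map[string]any `json:"details,omitempty"`
	} `json:"error"`
}

// WriteError serializes err as a JSON error response.
func WriteError(w http.ResponseWriter, err error) {
	response := ErrorResponse{}
	status := http.StatusInternalServerError

	var customErr *CustomError
	if errors.As(err, &customErr) {
		response.Error.Code = customErr.Code
		response.Error.Message = customErr.Message
		if customErr.StatusCode != 0 { // WriteHeader panics on a zero status
			status = customErr.StatusCode
		}
	} else {
		// Unknown errors get a generic message so internals are not leaked.
		response.Error.Code = "INTERNAL_ERROR" // placeholder code; use your own convention
		response.Error.Message = "An unexpected error occurred"
	}

	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(response)
}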

Context awareness is crucial for proper error handling. We can create middleware to inject relevant context:

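// imports: context, net/http, time, github.com/google/uuid (assumed UUID package)

func ErrorContextMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()
		// String keys keep the example short; an unexported key type
		// avoids collisions with other packages.
		ctx = context.WithValue(ctx, "trace_id", uuid.New().String())
		ctx = context.WithValue(ctx, "start_time", time.Now())

		next.ServeHTTP(w, r.WithContext(ctx))
	})
}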

Error categorization helps in making decisions about retry strategies and client responses:

// imports: errors, strings

type ErrorCategory int

const (
	TransientError ErrorCategory = iota
	PermanentError
	BusinessError
	SecurityError
)

// categorizeError maps an error to a category; unknown errors are permanent.
func categorizeError(err error) ErrorCategory {
	var customErr *CustomError
	if errors.As(err, &customErr) {
		switch {
		case strings.HasPrefix(customErr.Code, "SEC_"):
			return SecurityError
		case customErr.Retryable:
			return TransientError
		case customErr.StatusCode >= 400 && customErr.StatusCode < 500:
			return BusinessError
		default:
			return PermanentError
		}
	}
	return PermanentError
}
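
Once an error has a category, the caller still has to decide what to do with it. The sketch below shows one possible mapping from category to reaction. The `Action` type, its fields, and the policy itself are assumptions made for illustration; the order of the constants is also assumed.

```go
package main

import "fmt"

type ErrorCategory int

const (
	TransientError ErrorCategory = iota
	PermanentError
	BusinessError
	SecurityError
)

// Action is a hypothetical description of how a caller reacts to an error.
type Action struct {
	Retry         bool // try the operation again
	Alert         bool // notify on-call or security tooling
	ExposeMessage bool // safe to show the message to the client
}

// actionFor encodes one possible policy; adjust it to your own needs.
func actionFor(c ErrorCategory) Action {
	switch c {
	case TransientError:
		return Action{Retry: true}
	case BusinessError:
		// The client did something invalid, so tell them what.
		return Action{ExposeMessage: true}
	case SecurityError:
		// Never leak details; raise an alert instead.
		return Action{Alert: true}
	default:
		// Permanent and unknown categories: no retry, but someone should look.
		return Action{Alert: true}
	}
}

func main() {
	fmt.Printf("%+v\n", actionFor(TransientError)) // {Retry:true Alert:false ExposeMessage:false}
	fmt.Printf("%+v\n", actionFor(SecurityError))  // {Retry:false Alert:true ExposeMessage:false}
}
```

Keeping the policy in one function makes it easy to review and to test independently of the categorization logic.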

A circuit breaker protects external service calls: once a dependency has failed repeatedly, it rejects further calls until a reset timeout has passed:

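// imports: net/http, sync, time

type CircuitBreaker struct {
	failures     int
	threshold    int
	resetTimeout time.Duration
	lastFailure  time.Time
	mu           sync.Mutex // guards failures and lastFailure
}

func (cb *CircuitBreaker) Execute(operation func() error) error {
	cb.mu.Lock()
	open := cb.failures >= cb.threshold &&
		time.Since(cb.lastFailure) < cb.resetTimeout
	cb.mu.Unlock()

	if open {
		return &CustomError{
			Code:       "CIRCUIT_OPEN", // placeholder code
			Message:    "Circuit breaker is open",
			StatusCode: http.StatusServiceUnavailable, // assumed status
		}
	}

	if err := operation(); err != nil {
		cb.mu.Lock()
		cb.failures++ // count the failure so the breaker can trip
		cb.lastFailure = time.Now()
		cb.mu.Unlock()
		return err
	}

	cb.mu.Lock()
	cb.failures = 0
	cb.mu.Unlock()
	return nil
}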

Error logging and monitoring are essential for maintaining system health:

// imports: context, errors, time, and log "github.com/sirupsen/logrus" (assumed logger)

func logError(ctx context.Context, err error) {
	fields := make(map[string]interface{})

	if traceID, ok := ctx.Value("trace_id").(string); ok {
		fields["trace_id"] = traceID
	}

	if startTime, ok := ctx.Value("start_time").(time.Time); ok {
		fields["duration"] = time.Since(startTime)
	}

	var customErr *CustomError
	if errors.As(err, &customErr) {
		fields["error_code"] = customErr.Code
		fields["status_code"] = customErr.StatusCode
		fields["service_id"] = customErr.ServiceID
	}

	fields["error"] = err.Error()

	// Log to your preferred logging system
	log.WithFields(fields).Error("Service error occurred")
}

Transient failures can be retried with exponential backoff, doubling the wait after each failed attempt:

// imports: errors, math, time

func retryWithBackoff(operation func() error, maxRetries int) error {
	var err error
	for i := 0; i < maxRetries; i++ {
		err = operation()
		if err == nil {
			return nil
		}

		if !isRetryable(err) {
			return err
		}

		// Wait 1s, 2s, 4s, ... before the next attempt.
		backoffDuration := time.Duration(math.Pow(2, float64(i))) * time.Second
		time.Sleep(backoffDuration)
	}
	return err
}

// isRetryable reports whether err is a CustomError marked as retryable.
func isRetryable(err error) bool {
	var customErr *CustomError
	if errors.As(err, &customErr) {
		return customErr.Retryable
	}
	return false
}

Graceful degradation lets a service return a cached or fallback result when its primary dependency fails:

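// imports: time, and a cache with go-cache style Set/Get,
// e.g. github.com/patrickmn/go-cache (assumed)

type ServiceDependency struct {
	Primary  func() (interface{}, error)
	Fallback func() (interface{}, error)
	Cache    *cache.Cache
	CacheTTL time.Duration
}

func (sd *ServiceDependency) Execute() (interface{}, error) {
	result, err := sd.Primary()
	if err == nil {
		sd.Cache.Set("latest_result", result, sd.CacheTTL)
		return result, nil
	}

	// Primary failed: serve the last good result if it is still cached.
	if cached, found := sd.Cache.Get("latest_result"); found {
		return cached, nil
	}

	// No cached value: try the fallback, if one is configured.
	if sd.Fallback != nil {
		return sd.Fallback()
	}

	return nil, err
}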

Batch operations can aggregate errors so that one failed item does not hide the others:

// imports: strings; Item and processItem are defined elsewhere

type BatchError struct {
	Errors []error
}

// Error joins the messages of all collected errors.
func (be *BatchError) Error() string {
	var messages []string
	for _, err := range be.Errors {
		messages = append(messages, err.Error())
	}
	return strings.Join(messages, "; ")
}

func processBatch(items []Item) error {
	var batchErr BatchError

	for _, item := range items {
		if err := processItem(item); err != nil {
			batchErr.Errors = append(batchErr.Errors, err)
		}
	}

	if len(batchErr.Errors) > 0 {
		return &batchErr
	}

	return nil
}

A recovery middleware turns a panic in a handler into a logged error and a generic 500 response:

// imports: fmt, net/http, runtime/debug

func RecoveryMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if err := recover(); err != nil {
				stack := debug.Stack()
				logError(r.Context(), fmt.Errorf("panic: %v\n%s", err, stack))

				WriteError(w, &CustomError{
					Code:       "INTERNAL_ERROR", // placeholder code
					Message:    "An unexpected error occurred",
					StatusCode: http.StatusInternalServerError,
				})
			}
		}()

		next.ServeHTTP(w, r)
	})
}

These patterns and implementations provide a robust foundation for handling errors in microservices. The key is to maintain consistency across services while providing enough context for effective debugging and monitoring.

Remember to adapt these patterns based on your specific requirements and infrastructure. Regular testing and monitoring of error handling mechanisms ensure they continue to meet your system's needs as it evolves.

