Distributed databases utilize materialized views to pre-compute expensive aggregations and join queries. However, maintaining view consistency under high-frequency writes incurs substantial synchronization latency. In this paper, we present an asynchronous, transactionally consistent refresh engine powered by WAL-based log parsing and lock-free event pipelines.

Background & Problem

Materialized views must be refreshed when their base tables change. There are two primary refresh strategies:

  1. Immediate Refresh: Refreshes the view synchronously within the user’s transaction. This degrades write latency and creates database locks.
  2. Deferred Refresh: Updates the view periodically or on-demand. This introduces read staleness and consistency issues.

In high-throughput systems, neither approach scales well under thousands of writes per second. We require a streaming asynchronous system that updates base table changes within milliseconds while remaining transactionally correct.


Log-Structured Refresh Engine

To solve this, we design a log-structured view refresh scheduler. The pipeline is divided into three asynchronous layers:

1. Write-Ahead Log (WAL) Parser

We intercept raw transactions directly from the database’s Write-Ahead Log. By parsing log records off-thread, we completely bypass query engine parsing overhead and base table locks.

2. Lock-Free Delta Aggregation

Parsed changes are emitted as delta streams into ring-buffer structures. We aggregate delta updates (e.g. COUNT, SUM, AVG) in-memory using lock-free hash tables, grouping updates by the view’s primary keys.

package refresh

import (
	"sync/atomic"
)

type ViewDelta struct {
	Key   string
	Count int64
	Sum   int64
}

// AccumulateDelta updates a view record atomically without global mutex locks.
func AccumulateDelta(delta *ViewDelta, countVal, sumVal int64) {
	atomic.AddInt64(&delta.Count, countVal)
	atomic.AddInt64(&delta.Sum, sumVal)
}

3. Batched Flush Engine

Accumulated deltas are flushed to the materialized views in batch transactions, minimizing disk commit cycles and base table conflicts.


Evaluation

We benchmarked our engine under a simulated base table write workload of 50,000 operations per second.

  • Traditional Deferred Refresh: Led to disk I/O bottlenecks and transaction queues.
  • Log-Structured Async Refresh: Maintained materialized view freshness with a median delay of 8.2ms while introducing 0% query lock overhead to the base table transactions.