High-Performance Batch Processing in .NET Core: From 10 to 350 Records per Second

When building enterprise applications that process thousands or millions of records, the difference between naive one-by-one processing and optimized batch operations can mean the difference between minutes and hours of processing time. This article explores how we achieved a 35x performance improvement (from ~10 records/second to 350 records/second) using strategic design patterns and .NET Core optimizations.

The Challenge: N+1 Query Problem at Scale

Most applications start with simple CRUD operations that work perfectly for small datasets. However, when processing large batches of records, the traditional approach quickly becomes a bottleneck:

// Naive approach - processes one record at a time
foreach (var record in records)
{
    var existingEntity = await repository.GetByKeyAsync(record.Key);
    var relatedData = await repository.GetRelatedDataAsync(record.RelatedId);
    // ... more individual database calls
    await repository.CreateAsync(newEntity);
}

For 1,000 records, this approach generates 6,000+ database round trips – a classic N+1 query problem that kills performance.

The Solution: Repository Pattern with Batch Operations

We implemented a comprehensive batch processing solution using several key design patterns:

1. Enhanced Repository Pattern with Bulk Operations

Our BaseRepository implements both traditional CRUD operations and high-performance batch methods:

public abstract class BaseRepository<TEntity, TDto, TPatchDto>
    : IBaseRepository<TEntity, TDto, TPatchDto>
    where TEntity : class
{
    protected readonly DbContext _context;
    protected readonly DbSet<TEntity> _dbSet;
    protected readonly IMapper _mapper;

    /// <summary>
    /// Bulk create multiple entities in batches for high-performance scenarios
    /// </summary>
    public virtual async Task<List<TDto>> BulkCreateAsync(
        List<TDto> dtos, 
        int batchSize = 1000, 
        CancellationToken cancellationToken = default)
    {
        if (!dtos.Any()) return new List<TDto>();

        var results = new List<TDto>();
        
        // Process in batches to avoid memory issues
        for (int i = 0; i < dtos.Count; i += batchSize)
        {
            var batch = dtos.Skip(i).Take(batchSize).ToList();
            var entities = batch.Select(dto => {
                var entity = _mapper.Map<TEntity>(dto);
                SetTimestamps(entity, isNewEntity: true);
                return entity;
            }).ToList();
            
            // Single database round trip per batch
            await _dbSet.AddRangeAsync(entities, cancellationToken);
            await _context.SaveChangesAsync(cancellationToken);
            
            results.AddRange(_mapper.Map<List<TDto>>(entities));
        }
        
        return results;
    }

    /// <summary>
    /// Bulk lookup entities by property value - eliminates N+1 queries
    /// </summary>
    public virtual async Task<Dictionary<TKey, TDto>> GetByPropertyBatchAsync<TKey>(
        Expression<Func<TEntity, TKey>> keySelector,
        List<TKey> keys,
        CancellationToken cancellationToken = default)
        where TKey : notnull
    {
        if (!keys.Any()) return new Dictionary<TKey, TDto>();

        var entities = await GetQueryable()
            .Where(BuildContainsExpression(keySelector, keys))
            .AsNoTracking()  // Critical for read-only operations
            .ToListAsync(cancellationToken);
            
        var dtos = _mapper.Map<List<TDto>>(entities);
        var keyProperty = keySelector.Compile();
        
        return entities.Zip(dtos, (entity, dto) => new { Key = keyProperty(entity), Dto = dto })
                      .ToDictionary(x => x.Key, x => x.Dto);
    }
}

2. Template Method Pattern for Specialized Repositories

Each entity repository extends the base repository and implements domain-specific batch lookups:

public class EntityRepository : BaseRepository<Entity, EntityDto, EntityPatchDto>
{
    public async Task<Dictionary<string, EntityDto>> GetByIdentifierBatchAsync(
        List<string> identifiers, 
        CancellationToken cancellationToken = default)
    {
        if (identifiers == null || identifiers.Count == 0)
            return new Dictionary<string, EntityDto>();

        var normalized = identifiers.Where(x => !string.IsNullOrWhiteSpace(x))
                                   .Distinct()
                                   .ToList();

        var entities = await GetQueryable()
            .Where(e => normalized.Contains(e.Identifier))
            .AsNoTracking()  // No change tracking for batch reads
            .ToListAsync(cancellationToken);

        var dtos = _mapper.Map<List<EntityDto>>(entities);
        return dtos.ToDictionary(dto => dto.Identifier, dto => dto);
    }
}

3. Facade Pattern for Batch Processing Service

The service layer orchestrates the entire batch processing workflow:

public class BatchProcessingService
{
    public async Task<List<ProcessingResultDto>> ProcessRecordsAsync(
        Guid batchId, 
        List<RecordDto> records, 
        CancellationToken cancellationToken = default)
    {
        // Step 1: Deduplicate keys for batch lookups
        var uniqueKeys = records.Select(r => r.KeyProperty)
                               .Where(k => !string.IsNullOrWhiteSpace(k))
                               .Distinct()
                               .ToList();

        var relatedIds = records.Select(r => r.RelatedId)
                               .Distinct()
                               .ToList();

        // Step 2: Batch lookups - eliminate N+1 queries
        var existingEntitiesMap = await _entityRepository
            .GetByIdentifierBatchAsync(uniqueKeys, cancellationToken);
        
        var relatedDataMap = await _relatedRepository
            .GetByIdsBatchAsync(relatedIds, cancellationToken);

        // Step 3: In-memory processing and validation
        var validationResults = records.ToDictionary(
            r => r.Id, 
            r => ValidateRecord(r, existingEntitiesMap, relatedDataMap)
        );

        // Step 4: Prepare bulk operations
        var entitiesToCreate = new List<EntityDto>();
        var processingJobs = new List<ProcessingJobDto>();

        foreach (var record in records)
        {
            var validation = validationResults[record.Id];
            if (validation.IsValid)
            {
                entitiesToCreate.Add(CreateEntityFromRecord(record, validation));
                processingJobs.Add(CreateProcessingJob(record, validation));
            }
        }

        // Step 5: Execute bulk operations
        if (entitiesToCreate.Any())
        {
            await _entityRepository.BulkCreateAsync(entitiesToCreate, 500, cancellationToken);
        }

        if (processingJobs.Any())
        {
            await _processingJobRepository.BulkCreateAsync(processingJobs, 500, cancellationToken);
        }

        return BuildProcessingResults(records, validationResults);
    }
}

Key Performance Optimizations

1. Database Round-Trip Reduction

Before: 1,000 records × 6 queries = 6,000 database calls
After: 6 batch queries total (regardless of record count)

// Single batch query replaces hundreds of individual lookups
var existingEntities = await repository.GetByPropertyBatchAsync(
    entity => entity.Identifier, 
    allIdentifiers, 
    cancellationToken
);

2. Memory and Change Tracking Optimization

Use AsNoTracking() for read-only batch operations:

var entities = await GetQueryable()
    .Where(e => identifiers.Contains(e.Identifier))
    .AsNoTracking()  // Crucial for large datasets
    .ToListAsync(cancellationToken);

3. Smart Batching Strategy

Implement adaptive batch sizing based on volume:

var batchSize = records.Count switch
{
    <= 50 => records.Count,     // Small batches - process all at once
    <= 200 => 100,             // Medium batches - optimal size
    <= 1000 => 200,            // Large batches - maximize throughput
    _ => 500                   // Very large batches - aggressive batching
};
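Wrapped in a small helper, the sizing rule can be unit-tested at its boundaries. The helper name `ChooseBatchSize` is illustrative, not from the codebase:

```csharp
using System;

public static class BatchSizing
{
    // Mirrors the switch above: small inputs run as a single batch,
    // larger inputs cap the batch size to bound memory per SaveChanges call.
    public static int ChooseBatchSize(int recordCount) => recordCount switch
    {
        <= 50 => recordCount,   // Small batches - process all at once
        <= 200 => 100,          // Medium batches - optimal size
        <= 1000 => 200,         // Large batches - maximize throughput
        _ => 500                // Very large batches - aggressive batching
    };
}
```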

4. Expression Tree Optimization

Build efficient Contains expressions for batch lookups:

private Expression<Func<TEntity, bool>> BuildContainsExpression<TKey>(
    Expression<Func<TEntity, TKey>> keySelector, 
    List<TKey> keys)
{
    var parameter = keySelector.Parameters[0];
    var body = keySelector.Body;
    var containsMethod = typeof(List<TKey>).GetMethod("Contains")!;
    var containsCall = Expression.Call(Expression.Constant(keys), containsMethod, body);
    
    return Expression.Lambda<Func<TEntity, bool>>(containsCall, parameter);
}
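Because the result is an ordinary expression tree, it can be sanity-checked without a database by compiling it and filtering an in-memory list. The `Build` helper below repeats the same construction as a standalone static method, and the `Item` record is a stand-in type for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public record Item(string Identifier);

public static class ContainsExpressionDemo
{
    // Same construction as BuildContainsExpression above, made static and generic
    // over the element type so it can run against plain in-memory objects.
    public static Expression<Func<T, bool>> Build<T, TKey>(
        Expression<Func<T, TKey>> keySelector, List<TKey> keys)
    {
        var parameter = keySelector.Parameters[0];
        var containsMethod = typeof(List<TKey>).GetMethod("Contains")!;
        var containsCall = Expression.Call(Expression.Constant(keys), containsMethod, keySelector.Body);
        return Expression.Lambda<Func<T, bool>>(containsCall, parameter);
    }

    // Compiles the predicate and applies it to an in-memory list,
    // equivalent to what the database would do server-side with a WHERE ... IN.
    public static List<Item> Filter(List<Item> items, List<string> keys)
    {
        var predicate = Build<Item, string>(i => i.Identifier, keys).Compile();
        return items.Where(predicate).ToList();
    }
}
```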

Design Patterns Used

1. Repository Pattern

Abstracts data access logic and provides a consistent interface for both single and batch operations.

2. Template Method Pattern

Base repository defines the structure of batch operations while allowing specialized repositories to implement domain-specific logic.

3. Facade Pattern

Service layer provides a simplified interface that orchestrates complex batch processing workflows.

4. Strategy Pattern

Adaptive batch sizing and different processing strategies based on data volume.

5. Factory Pattern

Dynamic creation of Entity Framework expressions and AutoMapper configurations.

Performance Results

Metric           | Before                      | After                    | Improvement
Throughput       | ~10 records/sec             | 350 records/sec          | 35x
Database Calls   | 6,000+ for 1K records       | 6 total                  | 1000x reduction
Processing Time  | 1.6+ minutes for 1K records | 3 seconds                | 32x faster
Memory Usage     | High (full tracking)        | Optimized (AsNoTracking) | Significantly reduced

Scalability Considerations

Horizontal Scaling

The batch processing approach is stateless and ready for horizontal scaling:

// Stateless design allows easy distribution across multiple nodes
public async Task<BatchResultDto> ProcessBatchAsync(
    ProcessBatchRequest request,
    CancellationToken cancellationToken = default)
{
    // No shared state - each batch is independent
    return await ProcessRecordsAsync(request.BatchId, request.Records, cancellationToken);
}
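Because each batch carries no shared state, a driver can partition a large workload into independent slices and dispatch them to separate workers or nodes. A minimal sketch, assuming .NET 6+ for `Enumerable.Chunk` (the `BatchPartitioner` helper is illustrative, not from the article's codebase):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPartitioner
{
    // Splits a record list into independent slices; each slice can be sent
    // to a different node since batches share no state.
    public static List<List<T>> Partition<T>(List<T> records, int sliceSize)
    {
        if (sliceSize <= 0) throw new ArgumentOutOfRangeException(nameof(sliceSize));
        return records.Chunk(sliceSize).Select(chunk => chunk.ToList()).ToList();
    }
}
```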

Large Dataset Handling

Built-in support for massive datasets using the filtering and sorting infrastructure:

public async Task<(List<TDto> items, int totalCount)> GetAllAsync(
    string? filter = null,
    string? sortBy = null,
    string? order = null,
    int pageNumber = 1,
    int pageSize = 10,
    CancellationToken cancellationToken = default)
{
    var entities = GetQueryable();
    
    // Efficient filtering for millions of records
    if (!string.IsNullOrWhiteSpace(filter))
    {
        entities = FilterHelper.ApplyFilter(entities, filter);
    }
    
    // Database-level sorting and pagination
    entities = FilterHelper.ApplySorting(entities, sortBy, order);
    
    // Count before paging so totalCount reflects the full filtered set
    var totalCount = await entities.CountAsync(cancellationToken);
    
    var pagedEntities = await entities
        .Skip((pageNumber - 1) * pageSize)
        .Take(pageSize)
        .ToListAsync(cancellationToken);
    
    var dtos = _mapper.Map<List<TDto>>(pagedEntities);
    return (dtos, totalCount);
}

Achieving high-performance batch processing in .NET requires a combination of sound architectural patterns and strategic optimizations. By implementing the Repository pattern with batch operations, utilizing Entity Framework's bulk capabilities, and carefully managing database round trips, we transformed a system that could barely handle real-time processing into one capable of enterprise-scale batch operations.

The key insight is that business logic is rarely the bottleneck – it's almost always the data access pattern. By shifting from individual database calls to batch operations while preserving all business rules, you can achieve dramatic performance improvements without sacrificing functionality or reliability.

This approach scales horizontally, handles datasets with millions of records, and maintains backward compatibility – making it suitable for both small applications and enterprise systems processing massive data volumes.