Overview

Console application demonstrating Azure Event Hubs consumer pattern using EventProcessorHost. Receives and processes IoT telemetry messages from distributed devices with automatic partition management, checkpointing, and fault tolerance.

Built as reference implementation for Azure IoT telemetry pipeline.

Architecture

EventProcessorHost Pattern

  • Abstracts partition management and load balancing
  • Automatically distributes partitions across consumer instances
  • Handles partition ownership via Azure Storage leases
  • Provides scalable, fault-tolerant message consumption

SimpleEventProcessor Implementation

  • OpenAsync: Initialize processor for assigned partition
  • ProcessEventsAsync: Handle batch of messages from partition
  • CloseAsync: Cleanup when partition revoked or shutdown
  • CheckpointAsync: Persist current partition offset

Checkpoint Strategy

  • Checkpoint every 5 minutes (configurable)
  • Stores offset in Azure Storage Blob
  • Enables resume from last checkpoint on restart/failover
  • Trade-off between performance and duplicate processing risk

Message Processing

  • Batch processing: ProcessEventsAsync receives IEnumerable
  • UTF-8 decoding for JSON payloads
  • Partition-aware logging with partition ID
  • Synchronous processing (no message loss)

Technical Implementation

Connection Management: EventProcessorHost maintains persistent connections to Event Hub partitions. Uses AMQP 1.0 protocol for efficient bi-directional communication. Connection pool managed internally by Azure SDK.

Partition Assignment: Multiple EventProcessorHost instances automatically coordinate via Azure Storage leases. Each partition assigned to exactly one consumer. Rebalancing occurs when consumers added/removed. Typical rebalance time: 30-60 seconds.

Checkpointing Mechanism: CheckpointAsync() writes partition offset, sequence number, and timestamp to Azure Storage Blob. Blob name format: {EventHub}/{ConsumerGroup}/{Partition}/checkpoint. Uses optimistic concurrency (ETag) to prevent conflicts.

Error Handling: IEventProcessor.CloseAsync() called on unhandled exceptions. Partition automatically reassigned to another consumer. Checkpoint preserved, so processing resumes from last saved offset. Poison messages require dead-letter queue pattern (not implemented in this demo).

Technical Challenges

Duplicate Message Processing: With 5-minute checkpoint interval, consumer restart can reprocess up to 5 minutes of messages. Implemented idempotent message handling in downstream systems by tracking message IDs.

Partition Rebalancing: When deploying multiple consumer instances, partitions periodically rebalance. Messages from rebalanced partitions briefly paused during ownership transfer. Acceptable for telemetry scenarios; critical systems need more sophisticated handling.

Storage Latency: Checkpoint calls to Azure Storage add latency (~50-100ms). Checkpointing every message would reduce throughput significantly. 5-minute interval balances performance vs duplicate risk.

Results

Successfully consumed messages from 16-partition Event Hub at 10,000 messages/second. EventProcessorHost scaled horizontally by deploying 4 consumer instances (4 partitions each). Checkpoint overhead < 0.5% of total processing time with 5-minute interval.

Served as reference code for Azure IoT training workshops and customer implementations.

Tech Stack

  • Language: C#
  • Framework: .NET Framework 4.5+
  • Azure SDK: Microsoft.Azure.ServiceBus.EventProcessorHost (v2.x)
  • Azure Services: Event Hubs, Storage (Blob for checkpoints)
  • Protocol: AMQP 1.0

Source Code

Code will be available on GitHub at: https://github.com/tanchunsiong/iot-downloader

Important: Remove hardcoded connection strings before committing. Use app.config, environment variables, or Azure Key Vault.

Project Created: 2015-2016


Connect: