Increasingly, more and more companies are transitioning their software architectures to a new event-driven-architecture (EDA) paradigm to handle the increased load of generated data across their enterprises. According to Wikipedia, an “Event-driven architecture is a software architecture paradigm promoting the production, detection, consumption of, and reaction to events”. In the context of IoT, example events are new device registrations, the last telemetry message or the completion of a firmware update. The event producers (i.e., the devices) and the consumers of these events (the rest of the system) don’t need to know nothing about each other, they only need to react when events occur. This leads to a system architecture where a) software components and services are loosely coupled and developed independently, and b) a horizontally scalable system where new consumers (or producers) can be added dynamically to handle the increased processing load.
Apache Kafka has emerged as the de-facto standard for the implementation of next-generation event-driven architectures, and for a good reason. Rooted in LinkedIn where it was developed to handle the large number of events occurring within the platform, Apache Kafka has soon found its way as an independent open-source project (with commercial backing from Confluent and others) and more and more enterprises have started to adopt it in their quest to handle the increasing flow of data within their systems.
Core to Kafka is the concept of a) a “commit log” as an efficient and highly available long-term storage of incoming events, b) a scalable Consumer/Producer API that plugs into this commit log and c) ability for consumers to rewind in time at will i.e., to apply a new analytical processing model to old data, tune their machine learning ML models etc. This “rewind” feature is what distinguishes Kafka from other previously Message-Oriented Middleware (MOM) systems, where the “occurred” event was lost when it was consumed by the system. In addition to the core Kafka project, several frameworks have been created on top of Kafka with the most prominent:
- Kafka Connect: A pluggable connector framework to support the propagation of events from Kafka to any number of external systems such as database stores, cloud stores etc. to take advantage of the specific characteristics of each one. Already, there is support for a large collection of different systems and the list keeps growing.
- Streaming Analytics: An API to apply real-time streaming analytics on incoming events using either code or an SQL like language to support actionable insights at the point of event generation. A credit card fraud text message you receive or ad recommendations during your web browsing most likely were generated by a streaming analytic job running inside Kafka.
To help our customers who either already use Kafka in their architecture or plan to use it to build their next-gen IoT event-driven systems, we’ve developed a Pelion Kafka Connect Connector that eases the integration of our Pelion IoT platform with Kafka and makes it possible to enjoy the benefits this integration brings.
Storing telemetry data & sending device requests through Kafka
Both a Kafka Connect Source connector (receiving data from devices and storing them to Kafka) and a Sink Connector (sending device requests through Kafka) are provided. Internally, the connectors utilize the mechanism that Pelion provides for any user-defined application to receive telemetry data and send device requests: Notification Channels , Device Pre-subscriptions and our extensive Service API catalogue.
Here is a snippet of the Kafka source connector configuration:
"pelion.access.key.list": "<access_key1>, <access_key2>",
"subscriptions": "presub1, presub2, presub3",
"subscriptions.presub1.resource-path": "/3200/0/5501, /3303/*",
"resource.type.mapping": "1:i, 5501:i, 21:i, 5853:s"
If you are a long-time Pelion developer, you can’t escape noticing that the configuration parameters such as ‘endpoint-name’, ‘resource-path’ and ‘endpoint-type’, match one-to-one to the rules you define when you set up pre-subscriptions, giving you enough flexibility to cover most of your use cases. For the Sink connector, the configuration is much simpler since all that is required is your ‘Access Key’ to be able to make API calls.
- For more information about the Source and Sink configuration, have a look at our documentation
Once you have configured your environment parameters in the connectors and deployed them to a running Kafka Connect cluster, incoming telemetry messages coming from your devices now stored in Kafka topics. Further, you get the ability to send device requests through a Kafka topic too.
Integration with external systems
Just as in the case of our Pelion connector, a great list of available connectors is available, that can connect Kafka to any external system that an organisation may use, whether this system is running locally or in the cloud. This allows a “mix & match” connector paradigm by combining multiple connectors (Pelion and others) to allow forwarding of your IoT data to any external system that fits better for the needs at hand. Examples can be forwarding your time-series telemetry data to a local on-premises Elasticsearch installation to take advantage of its strong searching capabilities and simultaneously sending the data to an Amazon S3 instance in the cloud for long-term “cold” storage.
Analyzing and acting
Another useful feature provided by this Kafka integration is the ability to perform streaming analytics on top of incoming data. Using a familiar SQL-like language (or in code), you can build applications that transform, filter join and aggregate events happening across your systems in-real-time and then be able to respond immediately. Combining it with the Connector framework we discussed previously to forward the results to an external system, you can build a real-time notification pipeline leveraging just your Kafka infrastructure along.
The video below showcases an analytical SQL query running that counts the notifications received from a device sensor over a window of time (10s) and if this count exceeds three, then a Slack notification message is sent (using Kafka Connect):
If you are already using Kafka in your architecture, we hope that you’ll find the developed connector useful to allow a native integration with the Pelion IoT platform. If not, we hope we sparked your interest to explore more. We must admit though that Kafka is not for the faint-hearted, at least in the beginning, but once you get grips with it, we believe you will be impressed with what it can make possible. In an upcoming blog posts, we will look at how Pelion, Kafka and Machine Learning can be used together for real-time predictive maintenance and more. In the meantime, you can visit our quick-start guide to replicate exactly what we have described here. All that is required is a “free-tier” account in Pelion IoT platform, our Virtual-Demo app (in case you don’t own a Pelion Ready hardware) and coffee! Have fun!