This is not a step-by-step guide with commands. It is an orientation for moving from generic Apache Kafka to Confluent Inc.'s Kafka distribution, its connectors, and one ready-to-use GitHub repository. At first it may seem that, since installing Apache Kafka and an MQTT server was easy, connecting Apache Kafka with MQTT will be just as easy. In practice there is a good deal to learn and think through first.

MQTT is a protocol for publish/subscribe messaging. Many client libraries and brokers implement it, and they are compatible with one another. MQTT specifies only how messages are transported; how the application interprets the data, and whether or how it is stored, is left open. Apache Kafka is a message broker focused on storing massive amounts of data and making it available for consumption in real time or later. Kafka uses its own network protocol. When we need to store a massive number of messages and process them in batches, we need Kafka. On the other hand, Kafka has no built-in message priority, its security features (such as TLS and SASL) are not enabled by default, and its heavy protocol makes it harder to use than MQTT, particularly on small devices.
MQTT, for its part, does not seem well suited to very high throughput from sensors, so again we have to turn to Kafka, which can handle it. With a board like the Arduino Yun, one of whose processors runs an OpenWrt-based Linux distribution, we could in principle port a Kafka client to that OS and push data directly to the Apache Kafka server. However, no generalized direct method has been worked out or written up yet.
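Whichever route is taken, one small detail any MQTT-to-Kafka bridge has to settle is naming. MQTT topics are hierarchical and use `/` as a level separator, while Kafka topic names may contain only ASCII letters, digits, `.`, `_` and `-`, and are limited to 249 characters. The following is a minimal sketch of one possible mapping; the function name `mqtt_to_kafka_topic` and the `mqtt` prefix are illustrative assumptions, not part of any library.

```python
import re

# Characters that are NOT legal in a Kafka topic name.
_ILLEGAL = re.compile(r"[^a-zA-Z0-9._-]")
MAX_KAFKA_TOPIC_LENGTH = 249


def mqtt_to_kafka_topic(mqtt_topic: str, prefix: str = "mqtt") -> str:
    """Map an MQTT topic to a legal Kafka topic name (hypothetical helper).

    MQTT levels are joined with '.', and any character Kafka does not
    allow inside a level is replaced with '_'.
    """
    # Split on the MQTT level separator and drop empty levels
    # (a leading or trailing '/' produces them).
    levels = [level for level in mqtt_topic.split("/") if level]
    if not levels:
        raise ValueError("MQTT topic has no non-empty levels")

    cleaned = [_ILLEGAL.sub("_", level) for level in levels]
    name = ".".join([prefix] + cleaned)

    if len(name) > MAX_KAFKA_TOPIC_LENGTH:
        raise ValueError("resulting Kafka topic name is too long")
    return name


if __name__ == "__main__":
    print(mqtt_to_kafka_topic("home/kitchen/temp"))     # mqtt.home.kitchen.temp
    print(mqtt_to_kafka_topic("/sensors/room 1/temp"))  # mqtt.sensors.room_1.temp
```

A mapping like this creates one Kafka topic per MQTT topic. Sending everything to a single Kafka topic and keeping the MQTT topic as the message key is an equally reasonable design; which one fits depends on how the data will be consumed.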
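Apache Kafka is written in Scala and Java and runs on the JVM, and a classic Kafka cluster is coordinated by Apache ZooKeeper (newer releases can also run without it, in KRaft mode). Any OS capable of running a JVM can be used to deploy a Kafka cluster. We have previously written about how to install generic Kafka.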
In this context, we need to understand how to implement Kafka consumers and producers using Spring, and learn how to use Kafka connectors. Kafka Connect is a framework for connecting Kafka with external systems. Connectors are its ready-to-use components, and they come in two types: source connectors import data from external systems into Kafka topics, and sink connectors export data from Kafka topics into external systems. Some connectors are maintained by the community, while others are supported by Confluent or other vendors. When we need a connector, instead of the plain Kafka distribution we use Confluent Platform, the Kafka distribution provided by Confluent, which adds some extra tools and pre-built connectors. Confluent's Kafka can be downloaded from Confluent.
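To make the connector idea more concrete: a connector is normally created by sending a JSON document to the Kafka Connect REST API (`POST /connectors`, port 8083 by default). The sketch below builds such a document for Confluent's MQTT source connector. The property names follow that connector's documentation as we understand it and should be checked against the installed version; the host names, topic names and helper function names are assumptions made up for the example.

```python
import json
import urllib.request


def mqtt_source_connector(name, mqtt_uri, mqtt_topics, kafka_topic, bootstrap_servers):
    """Build the request body for creating an MQTT source connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
            "tasks.max": "1",
            # Where the MQTT broker lives and which MQTT topics to read.
            "mqtt.server.uri": mqtt_uri,
            "mqtt.topics": ",".join(mqtt_topics),
            # The Kafka topic that receives the MQTT messages.
            "kafka.topic": kafka_topic,
            # MQTT payloads are opaque bytes, so pass them through unchanged.
            "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
            # Settings the Confluent connector uses for its own internal topic.
            "confluent.topic.bootstrap.servers": bootstrap_servers,
            "confluent.topic.replication.factor": "1",
        },
    }


def register(connect_url, payload):
    """POST the payload to a running Kafka Connect worker (needs network access)."""
    request = urllib.request.Request(
        connect_url.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = mqtt_source_connector(
        name="mqtt-source",                  # hypothetical connector name
        mqtt_uri="tcp://localhost:1883",     # assumed local MQTT broker
        mqtt_topics=["sensors/temperature"],
        kafka_topic="mqtt.sensors",
        bootstrap_servers="localhost:9092",  # assumed local Kafka broker
    )
    print(json.dumps(payload, indent=2))
    # With a Connect worker running, the connector could then be created with:
    # register("http://localhost:8083", payload)
```

The same JSON could just as well be sent with `curl`; building it in code is mainly convenient when several similar connectors have to be created.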
One popular way to connect the two is described in this repository and guide:
A variation on the above approach is to run everything in Docker. That way we can build a platform with Kafka Connect, MQTT and MongoDB.
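On the MongoDB end of such a platform, a sink connector copies messages from a Kafka topic into a collection. The sketch below builds a configuration for MongoDB's Kafka sink connector, assuming that connector is installed in the Connect worker and that the messages in Kafka are JSON documents. The container host name `mongo`, the database `iot` and the collection `readings` are hypothetical names chosen for the example, and the property names should be verified against the connector version in use.

```python
import json


def mongo_sink_connector(name, topics, mongo_uri, database, collection):
    """Build the Kafka Connect request body for a MongoDB sink connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "tasks.max": "1",
            # Kafka topics to read from.
            "topics": ",".join(topics),
            # Where to write the documents.
            "connection.uri": mongo_uri,
            "database": database,
            "collection": collection,
            # Assumption: message values are plain JSON without an embedded schema.
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "false",
        },
    }


if __name__ == "__main__":
    payload = mongo_sink_connector(
        name="mongo-sink",                # hypothetical connector name
        topics=["mqtt.sensors"],
        mongo_uri="mongodb://mongo:27017",  # 'mongo' is an assumed container host name
        database="iot",
        collection="readings",
    )
    # This JSON would be sent to the Connect REST API (POST /connectors).
    print(json.dumps(payload, indent=2))
```

Inside a Docker network, the host names in these URIs are the service names of the containers rather than `localhost`, which is the main thing that changes when moving the setup into Docker.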