Kerberos – Installation guide, Integration with Apache Kafka and CDH (part 2/3)

Ivan Zeko

DATA ENGINEER

We will show you how to integrate Kerberos with probably the most popular data streaming platform on the market, Apache Kafka.

 

Introduction

In part one of this Kerberos series, we explained the process of installing the Kerberos KDC server and the Kerberos client. We also provided a test scenario that used SSH authentication with the Kerberos ticketing system, where we were able to authenticate without providing any passwords. In this second part of our Kerberos series, we will show you how to integrate Kerberos with probably the most popular data streaming platform on the market, Apache Kafka.

In case you are unfamiliar with Apache Kafka, it is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Now, let's walk through the requirements that you will need in order to successfully finish this setup.

Prerequisites

Before we jump into integration setup, make sure that you have:

  • Kerberos KDC Server successfully installed from the first part

  • Single node Apache Kafka up and running (not on the same machine as KDC server)

Kafka initial setup can be simple and does not require a lot of configuring for this walkthrough.

In the following examples, we will use:

Installing Kerberos Client libraries

Like on every machine that uses Kerberos, we need to install the required packages. So, if your Kafka server doesn’t have the following packages:

  • krb5-workstation

  • pam_krb5

then run the following command:

[root@ ~] # yum -y install krb5-workstation pam_krb5

Installing the NTP service

In the previous part, we explained the importance of the NTP service and how a time difference can cause the Kerberos authentication process to fail. That’s why we need to make sure that the system time on our client machine is in sync with the system time of our KDC server.

On your Kafka machine, run the following commands to install the NTP service:

[root~] # yum -y install ntp
[root~] # ntpdate 0.rhel.pool.ntp.org
[root~] # systemctl start ntpd.service
[root~] # systemctl enable ntpd.service
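As a quick sanity check after starting ntpd, the measured clock offset can be compared against the Kerberos clock-skew tolerance. The sketch here assumes the MIT Kerberos default of 300 seconds (the clockskew setting in krb5.conf) and uses a hypothetical helper named within_skew. The offset itself would come from a tool such as ntpdate -q, whose output is not parsed here.

```shell
# within_skew OFFSET_SECONDS [LIMIT_SECONDS]
# Succeeds (exit status 0) when |OFFSET| <= LIMIT.
# LIMIT defaults to 300 s, the MIT Kerberos default clock-skew tolerance
# (an assumption; your krb5.conf may override it).
within_skew() {
  local offset="$1"
  local limit="${2:-300}"
  # awk handles negative and fractional offsets, which shell arithmetic cannot.
  awk -v o="$offset" -v l="$limit" 'BEGIN {
    o = o + 0
    l = l + 0
    if (o < 0) o = -o
    exit !(o <= l)
  }'
}

# Example: an offset of -0.42 s reported by the NTP client is acceptable.
if within_skew -0.42; then
  echo "clock offset is within the Kerberos tolerance"
fi
```

If the check fails, authentication errors that mention clock skew are the likely symptom, and re-running ntpdate against the same source as the KDC is the first thing to try.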

Configuring host settings

In this step, we will provide our Kafka machine with information about our KDC server IP address and FQDN. We will also make sure that our Kafka host name matches its FQDN.

CONFIGURING /ETC/HOSTS FILE

Using one of the text editors that your machine provides, add the following entry to your /etc/hosts file:

[root~] # vi /etc/hosts
<kdc_ip_address> kdc.example.com

HOSTNAME CONFIGURATION

As we have already mentioned, it is important that your hostname matches the FQDN, because Kerberos uses that information when performing the authentication.

To check that, simply run the following command:

[root~] # hostname
kafka.company.com

If your hostname does not match the expected value, make the change in the /etc/hostname file using one of the text editors:

[root~] # vi /etc/hostname
kafka.company.com

and run the following command:

[root~] # hostnamectl set-hostname kafka.company.com

Run the same hostname command again and make sure that the value is now correct.
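The comparison can also be scripted, which is handy when the same check has to be repeated on several machines. This is a minimal sketch with a hypothetical helper named check_fqdn; it assumes hostname -f reports the FQDN on your distribution.

```shell
# check_fqdn EXPECTED [ACTUAL]
# Compares the expected FQDN with the host's own view of its name.
# ACTUAL defaults to the output of `hostname -f`; passing it explicitly
# makes the helper easy to try out without changing the machine.
check_fqdn() {
  local expected="$1"
  local actual="${2:-$(hostname -f)}"
  if [ "$actual" = "$expected" ]; then
    echo "OK: hostname is $actual"
  else
    echo "MISMATCH: hostname is '$actual', expected '$expected'" >&2
    return 1
  fi
}

# Usage on the Kafka machine (the name is the one used in this guide):
#   check_fqdn kafka.company.com
```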

Creating Kerberos principals

The next step in the Kerberos-Kafka integration is creating the Kerberos principals for:

  • Kafka machine (host)

  • Kafka

  • Zookeeper

On your KDC server, log in as a sudo user and run the kadmin.local command (you can do the same thing with the kadmin command by logging in as the KDC admin user from the Kafka machine). Create all the required principals and add them to your keytab:

[root~] # kadmin.local
kadmin.local: addprinc -randkey host/kafka.company.com
kadmin.local: addprinc -randkey kafka/kafka.company.com
kadmin.local: addprinc -randkey zookeeper/kafka.company.com
kadmin.local: ktadd host/kafka.company.com
kadmin.local: ktadd kafka/kafka.company.com
kadmin.local: ktadd zookeeper/kafka.company.com

The kafka service principal will be used for authenticating brokers, clients, etc., while the zookeeper service principal will be used for authenticating to the ZooKeeper service.

COPYING KRB5.CONF AND KRB5.KEYTAB FILES

In order to get the Kerberos client to work, you need to copy the krb5.conf and krb5.keytab files from your KDC server to the Kafka machine. The default locations of those files are:

  • /etc/krb5.conf

  • /etc/krb5.keytab

We will copy those files to the same location on our Kafka machine.
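Once the keytab is on the Kafka machine, it is worth confirming that it really contains the three principals before touching any Kafka configuration. This is a minimal sketch, assuming the MIT Kerberos klist -kt listing format, in which the principal is the last field of each entry line; keytab_has_principal is a hypothetical helper name.

```shell
# keytab_has_principal PRINCIPAL
# Reads a `klist -kt` listing on stdin and succeeds if PRINCIPAL appears
# as the last field of any line.
keytab_has_principal() {
  awk -v p="$1" '$NF == p { found = 1 } END { exit !found }'
}

# Usage on the Kafka machine, with the principals created earlier:
#   for p in host kafka zookeeper; do
#     klist -kt /etc/krb5.keytab |
#       keytab_has_principal "$p/kafka.company.com@EXAMPLE.COM" ||
#       echo "missing principal: $p" >&2
#   done
#
# A keytab login for one of them can then be tried with:
#   kinit -kt /etc/krb5.keytab kafka/kafka.company.com@EXAMPLE.COM && klist
```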

Note! You can now configure SSH authentication using Kerberos, as we described in the first part of this Kerberos series, to test the connectivity and to check that your Kerberos client can successfully perform authentication. (OPTIONAL)

Kafka configuration

Going through the following sections, you will notice that each component requires JAAS file creation and corresponding property file configuration. JAAS configuration files are used when authentication is performed in a pluggable fashion. This way Java applications can remain independent from underlying authentication technologies. To keep things organized, we will create a new directory called kafka within the /etc directory and store JAAS files in there.
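Every JAAS file in this guide follows the same shape: a named section wrapping one Krb5LoginModule entry that points at a keytab and a principal. As a rough illustration, that shape can be generated with a small shell function. render_jaas_section is a hypothetical helper, and the actual files in the following sections add or reorder a few options (for example useTicketCache=false for ZooKeeper).

```shell
# render_jaas_section SECTION KEYTAB PRINCIPAL
# Prints one JAAS section that logs in from a keytab.
render_jaas_section() {
  local section="$1"
  local keytab="$2"
  local principal="$3"
  cat <<EOF
$section {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="$keytab"
    principal="$principal";
};
EOF
}

# Usage (paths and principal are the ones used in this guide):
#   mkdir -p /etc/kafka
#   render_jaas_section Server /etc/krb5.keytab \
#     zookeeper/kafka.company.com@EXAMPLE.COM > /etc/kafka/zookeeper_jaas.conf
```

Note that the principal line ends with a semicolon inside the section and the section itself ends with "};". Both are required by the JAAS file syntax.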

LOGGING

To help you with problems you may encounter, we will show you how to configure Kerberos debugging when starting the Kafka services, so you can track them down much more easily. To enable SASL/GSSAPI debug output, set the sun.security.krb5.debug system property to true. For example:

[user~] $ export KAFKA_OPTS=-Dsun.security.krb5.debug=true
[user~] $ bin/kafka-server-start.sh config/server.properties

ZOOKEEPER CONFIGURATION

JAAS file

As we already mentioned, the first step of the ZooKeeper configuration is creating a JAAS configuration file called zookeeper_jaas.conf within the /etc/kafka directory. Using one of the text editors on your Kafka server, create a new file and edit it so it matches the following:

[user~] $ vi /etc/kafka/zookeeper_jaas.conf
Server {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="/etc/krb5.keytab"
    storeKey=true
    useTicketCache=false
    principal="zookeeper/kafka.company.com@EXAMPLE.COM";
};

zookeeper.properties

In your zookeeper property file, using one of the text editors, add the following lines:

[user~] $ vi /opt/kafka/config/zookeeper.properties
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true

Starting zookeeper server

Earlier, in the Logging section, we showed you how to enable Kerberos debugging. You can use that property when starting the ZooKeeper server. Besides that, using the same environment variable (KAFKA_OPTS), you need to provide the location of your ZooKeeper JAAS file before starting ZooKeeper. To do that, run the following commands:

[user~] $ export KAFKA_OPTS="-Dsun.security.krb5.debug=true -Djava.security.auth.login.config=/etc/kafka/zookeeper_jaas.conf"
[user~] $ bin/zookeeper-server-start.sh config/zookeeper.properties
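Because each service in this guide is started with the same two JVM flags and only the JAAS file changes, the export can be wrapped in a tiny helper. This is just a convenience sketch; kafka_opts_for is a hypothetical name, not something shipped with Kafka.

```shell
# kafka_opts_for JAAS_FILE [DEBUG]
# Prints the JVM options used throughout this guide. DEBUG defaults to true
# to match the walkthrough; pass false once everything works.
kafka_opts_for() {
  local jaas_file="$1"
  local debug="${2:-true}"
  printf '%s\n' "-Dsun.security.krb5.debug=$debug -Djava.security.auth.login.config=$jaas_file"
}

# Example: options for the ZooKeeper server.
export KAFKA_OPTS="$(kafka_opts_for /etc/kafka/zookeeper_jaas.conf)"
echo "$KAFKA_OPTS"
```

Since KAFKA_OPTS is a per-shell environment variable, each terminal session can hold a different JAAS file without affecting the others.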

KAFKA BROKER CONFIGURATION

JAAS file

Just like we did for ZooKeeper, we are going to create a corresponding JAAS file for the Kafka broker called kafka_server_jaas.conf. Using one of the text editors on your Kafka server, create a new file within the /etc/kafka directory and edit it so it matches the following:

[user~] $ vi /etc/kafka/kafka_server_jaas.conf
KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/krb5.keytab"
    principal="kafka/kafka.company.com@EXAMPLE.COM";
};
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/krb5.keytab"
    principal="kafka/kafka.company.com@EXAMPLE.COM";
};

server.properties

Open your Kafka server property file and enable the GSSAPI mechanism by adding the following lines:

# List of enabled mechanisms, can be more than one
sasl.enabled.mechanisms=GSSAPI
# Specify one of the SASL mechanisms
sasl.mechanism.inter.broker.protocol=GSSAPI

Now, to enable SASL for inter-broker communication, add the following to the broker properties file (it defaults to PLAINTEXT). Set the protocol to:

  • SASL_SSL: if SSL encryption is enabled (SSL encryption should always be used if SASL mechanism is PLAIN)

  • SASL_PLAINTEXT: if SSL encryption is not enabled

In our example we didn’t use SSL encryption. Change the value of listeners and advertised listeners to match the following:

listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://kafka.company.com:9092

In case you don’t want SASL authentication for inter-broker communication, or in case some clients connecting to the Kafka broker do not use SASL, configure both SASL_PLAINTEXT and PLAINTEXT on different ports. In the following example, traffic through port 9092 will not use SASL, while traffic through port 9093 will.

listeners=PLAINTEXT://0.0.0.0:9092,SASL_PLAINTEXT://0.0.0.0:9093
advertised.listeners=PLAINTEXT://kafka.company.com:9092,SASL_PLAINTEXT://kafka.company.com:9093
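With more than one listener it is easy to point a Kerberized client at the wrong port. As a small illustration, the listeners value can be split to show which ports are SASL-protected. sasl_ports is a hypothetical helper, and the sketch assumes the listener names are the standard protocol names (PLAINTEXT, SASL_PLAINTEXT, SASL_SSL) rather than custom names.

```shell
# sasl_ports LISTENERS
# Prints the port of every listener whose protocol starts with SASL_,
# one per line. LISTENERS is the comma-separated value of the
# "listeners" property.
sasl_ports() {
  local listeners="$1"
  local entry
  local IFS=','
  for entry in $listeners; do
    case "$entry" in
      # Keep only SASL listeners; the port is whatever follows the last colon.
      SASL_*://*) printf '%s\n' "${entry##*:}" ;;
    esac
  done
}

# Example with the two-listener configuration from this section:
sasl_ports 'PLAINTEXT://0.0.0.0:9092,SASL_PLAINTEXT://0.0.0.0:9093'
```

In that two-listener layout, clients configured with security.protocol=SASL_PLAINTEXT have to connect to the SASL port, not to 9092.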

When using GSSAPI, you need to configure a service name that matches the service name of the Kafka principal configured in the server JAAS file. In earlier JAAS file examples, with principal="kafka/kafka.company.com@EXAMPLE.COM";, the primary is "kafka":

sasl.kerberos.service.name=kafka
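A Kerberos principal has the form primary/instance@REALM, and the service name is simply the primary. To make that concrete, here is a short sketch that extracts it with shell parameter expansion; kerberos_primary is a hypothetical helper name.

```shell
# kerberos_primary PRINCIPAL
# Prints the primary part of a principal written as primary/instance@REALM
# (the instance and realm parts are optional).
kerberos_primary() {
  local principal="$1"
  principal="${principal%%@*}"        # drop "@REALM"
  printf '%s\n' "${principal%%/*}"    # drop "/instance"
}

# Example with the broker principal from the JAAS file:
kerberos_primary 'kafka/kafka.company.com@EXAMPLE.COM'
```

Whatever this prints for the broker principal is the value that sasl.kerberos.service.name must carry on both the broker and the clients.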

It is also possible to explicitly specify the inter-broker protocol using the following parameter:

security.inter.broker.protocol=SASL_PLAINTEXT

The metadata stored in ZooKeeper is such that only brokers will be able to modify the corresponding znodes, but znodes are world readable. While the data stored in ZooKeeper is not sensitive, inappropriate manipulation of znodes can cause cluster disruption.

In the server.properties file, enable ZooKeeper ACLs:

zookeeper.set.acl=true

It is recommended to limit access to ZooKeeper via network segmentation (only brokers and some admin tools need access to ZooKeeper if the new consumer and new producer are used).

Starting Kafka broker

Using a second terminal (session) on the same Kafka server, export KAFKA_OPTS with Kerberos debug set to true, but this time pointing to the kafka_server_jaas.conf file:

[user~] $ export KAFKA_OPTS="-Dsun.security.krb5.debug=true -Djava.security.auth.login.config=/etc/kafka/kafka_server_jaas.conf"
[user~] $ bin/kafka-server-start.sh config/server.properties

KAFKA CLIENTS

JAAS file

Within the /etc/kafka directory, create a new file called kafka_client_jaas.conf and add the following lines:

KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/krb5.keytab"
    principal="kafka/kafka.company.com@EXAMPLE.COM";
};
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/krb5.keytab"
    principal="kafka/kafka.company.com@EXAMPLE.COM";
};

producer.properties & consumer.properties

This configuration is the same for both the consumer and the producer config files. Configure the following properties in a client properties file:

sasl.mechanism=GSSAPI
# Configure SASL_SSL if SSL encryption is enabled
security.protocol=SASL_PLAINTEXT

Configure a service name that matches the primary name of the Kafka server configured in the broker JAAS file.

sasl.kerberos.service.name=kafka
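Since the producer and consumer files need identical settings, the three properties can be emitted once and appended to both. This is a minimal sketch with a hypothetical helper named client_sasl_properties; the config/ paths in the usage comment follow the layout used elsewhere in this guide.

```shell
# client_sasl_properties [SERVICE_NAME]
# Prints the client-side SASL/GSSAPI settings described in this section.
# SERVICE_NAME defaults to "kafka", the primary of the broker principal.
client_sasl_properties() {
  local service="${1:-kafka}"
  cat <<EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=$service
EOF
}

# Usage from the Kafka installation directory:
#   client_sasl_properties >> config/producer.properties
#   client_sasl_properties >> config/consumer.properties
```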

Starting Kafka console producer

Using a third terminal (session) on the same Kafka server, export KAFKA_OPTS with Kerberos debug set to true, but this time pointing to the kafka_client_jaas.conf file:

[user~] $ export KAFKA_OPTS="-Dsun.security.krb5.debug=true -Djava.security.auth.login.config=/etc/kafka/kafka_client_jaas.conf"
[user~] $ bin/kafka-console-producer.sh --broker-list kafka.company.com:9092 --topic test-topic --producer.config config/producer.properties

Starting Kafka console consumer

Using another terminal (session) on the same Kafka server, export KAFKA_OPTS with Kerberos debug set to true, pointing to the same kafka_client_jaas.conf file:

[user~] $ export KAFKA_OPTS="-Dsun.security.krb5.debug=true -Djava.security.auth.login.config=/etc/kafka/kafka_client_jaas.conf"
[user~] $ bin/kafka-console-consumer.sh --bootstrap-server kafka.company.com:9092 --topic test-topic --consumer.config config/consumer.properties
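To confirm the whole chain in one go, the two console tools can be combined into a round-trip check: produce one unique line, then read the topic back and look for it. This is only a sketch under a few assumptions: kafka_smoke_test is a hypothetical helper, KAFKA_HOME defaults to /opt/kafka, KAFKA_OPTS already points at kafka_client_jaas.conf, and the topic is a small test topic (the consumer reads it from the beginning).

```shell
# kafka_smoke_test BROKER TOPIC
# Sends one unique line through the console producer and checks that the
# console consumer reads it back. Succeeds only if the line is found.
kafka_smoke_test() {
  local broker="$1"
  local topic="$2"
  local kafka_home="${KAFKA_HOME:-/opt/kafka}"   # assumed install location
  local message="kerberos-smoke-test-$$"
  local received

  printf '%s\n' "$message" |
    "$kafka_home/bin/kafka-console-producer.sh" \
      --broker-list "$broker" --topic "$topic" \
      --producer.config "$kafka_home/config/producer.properties" || return 1

  # --timeout-ms makes the consumer exit once no more messages arrive;
  # its exit status is ignored because only the output matters here.
  received="$("$kafka_home/bin/kafka-console-consumer.sh" \
      --bootstrap-server "$broker" --topic "$topic" \
      --consumer.config "$kafka_home/config/consumer.properties" \
      --from-beginning --timeout-ms 10000 2>/dev/null || true)"

  printf '%s\n' "$received" | grep -qxF "$message"
}

# Usage:
#   kafka_smoke_test kafka.company.com:9092 test-topic && echo "round trip OK"
```

If the check fails, the Kerberos debug output enabled through KAFKA_OPTS is the first place to look.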

Summary

If everything went right, you’ve managed to create JAAS files, and to configure and run all the main Kafka components (ZooKeeper server, Kafka broker and Kafka clients) with integrated Kerberos security. The next step would be enabling SSL encryption for all Kafka traffic, which would make your cluster much safer, but that’s a topic for another time. We hope that you find this blog helpful.

Stay tuned for the third part of this Kerberos series where we will integrate Kerberos with Cloudera CDH, one of the most popular open source platform distributions that includes Apache Hadoop.

Please visit the other two parts of the blog series:

PART 1/3

PART 3/3
