@riferrei

What is Pulsar I/O?

Apache Pulsar

It’s the plumbing part
[Diagram: Source, Pulsar I/O, Sink]

“Everything is fine in the backend”

Ricardo Ferreira, Developer Advocate
• Elastic Community Team
• HashiCorp Ambassador
• Learned about data I/O at: Confluent, Oracle, Red Hat
• Distributed systems, o11y, streaming systems, databases
• riferrei@elastic.co
• riferrei@riferrei.com

Agenda
• Understanding the Pulsar I/O architecture
• Installing and managing Pulsar connectors
• Troubleshooting and debugging techniques

Understanding the Pulsar I/O Architecture

The architecture is like a lasagna 😋
Layers: Pulsar Connectors, Pulsar Functions, Functions worker

Pulsar Functions programming model
[Diagram: input topics (Topic 1, Topic 2, Topic 3) send input messages to a Pulsar Function, which writes an output message to an output topic and log output to a log topic (Topic 4, Topic 5)]

Anatomy of a source connector
[Diagram: the same programming model, with a Source Connector on the input side]

Anatomy of a source connector

Anatomy of a sink connector
[Diagram: the same programming model, with a Sink Connector on the output side]

Anatomy of a sink connector

Records are your unit of work
[Diagram: Source Connector, Sink Connector]
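Pulsar's connector API is Java, so the Python below is only a language-neutral sketch of the idea on this slide: a source hands out one record at a time, and a sink acknowledges or fails each record individually. The `Record` class and the `drain` helper are hypothetical stand-ins, not Pulsar classes.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Record:
    """Hypothetical stand-in for a connector record: a value plus ack/fail hooks."""
    value: str
    events: List[str] = field(default_factory=list)

    def ack(self) -> None:
        # Signal that this record was handled successfully.
        self.events.append("ack")

    def fail(self) -> None:
        # Signal that this record could not be handled.
        self.events.append("fail")


def drain(records: Iterable[Record], write: Callable[[str], None]) -> None:
    """Sink-style loop: write each record, then ack or fail it on its own."""
    for record in records:
        try:
            write(record.value)
            record.ack()
        except Exception:
            record.fail()
```

One failing record does not block the others here; each record carries its own outcome.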

Functions worker is how you deploy
• Running along with brokers
• Running in their own cluster

Functions worker is how you deploy
• Running along with brokers
  ➢ Fewer clusters to manage. Better operational simplicity.
  ➢ No resource isolation. CPU, memory, and network are shared.
• Running in their own cluster
  ➢ Right-sized deployment, as resources are exclusive.
  ➢ More clusters to manage. Hard to operate at scale.

Running along with brokers
1. conf/broker.conf
2. conf/functions_worker.yml
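The slide only names the two files. As far as I understand the Pulsar documentation, the settings that matter here are `functionsWorkerEnabled=true` in `conf/broker.conf` and a `pulsarFunctionsCluster` value in `conf/functions_worker.yml` that matches the broker's `clusterName`. A minimal pre-flight check under that assumption; the helper names are made up, and the parser only handles flat `key=value` and `key: value` lines, not full YAML:

```python
def parse_flat(text: str, sep: str) -> dict:
    """Parse flat `key<sep>value` lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or sep not in line:
            continue
        key, value = line.split(sep, 1)
        settings[key.strip()] = value.strip()
    return settings


def worker_on_broker_problems(broker_conf: str, worker_yml: str) -> list:
    """Return human-readable problems; an empty list means the check passed."""
    broker = parse_flat(broker_conf, "=")
    worker = parse_flat(worker_yml, ":")
    problems = []
    if broker.get("functionsWorkerEnabled") != "true":
        problems.append("broker.conf: functionsWorkerEnabled is not true")
    if worker.get("pulsarFunctionsCluster") != broker.get("clusterName"):
        problems.append("functions_worker.yml: pulsarFunctionsCluster != clusterName")
    return problems
```

Feed it the contents of both files, for example `worker_on_broker_problems(open("conf/broker.conf").read(), open("conf/functions_worker.yml").read())`.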

Running along with brokers
Checking if the worker on the broker is correct:
Result should be:
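The command and its output are not part of the slide text. One way to run this check, assuming the `functions-worker get-cluster` subcommand of `pulsar-admin` and that it prints a JSON array of workers, is sketched below. The sample worker ID in the usage note is illustrative, and `list_workers` needs a reachable Pulsar cluster.

```python
import json
import subprocess


def parse_workers(output: str) -> list:
    """Turn the JSON array printed by `functions-worker get-cluster` into worker IDs."""
    return [worker["workerId"] for worker in json.loads(output)]


def list_workers(admin: str = "bin/pulsar-admin") -> list:
    """Run the CLI against a live cluster and return the worker IDs."""
    result = subprocess.run(
        [admin, "functions-worker", "get-cluster"],
        capture_output=True, text=True, check=True,
    )
    return parse_workers(result.stdout)
```

An empty list means no worker has registered; on a standalone setup you would expect one entry, something like `c-standalone-fw-localhost-6750`.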

Running in their own cluster
1. conf/broker.conf
2. conf/functions_worker.yml

Running in their own cluster
Checking if the functions worker is correct:
Result should be:

Fixing the admin REST requests
conf/proxy.conf
https://pulsar.apache.org/docs/en/administration-proxy
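When workers run in their own cluster, the Pulsar proxy has to know where to forward function and connector admin calls. To my knowledge the `proxy.conf` keys for this are `functionWorkerWebServiceURL` and `functionWorkerWebServiceURLTLS`. The helper below is a hypothetical convenience that renders the line; the host name in the usage note is a placeholder, and 6750 is assumed as the worker's HTTP port.

```python
def proxy_worker_setting(worker_host: str, worker_port: int = 6750, tls: bool = False) -> str:
    """Render the proxy.conf line that forwards function/connector admin calls."""
    if tls:
        return f"functionWorkerWebServiceURLTLS=https://{worker_host}:{worker_port}"
    return f"functionWorkerWebServiceURL=http://{worker_host}:{worker_port}"
```

For example, `proxy_worker_setting("fw.example.internal")` yields `functionWorkerWebServiceURL=http://fw.example.internal:6750`.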

Functions runtime configuration
• Process runtime (default): Process 1, Process 2, Process 3 on the host machine
• Thread runtime: Thread 1, Thread 2 inside a JVM
• Kubernetes runtime: StatefulSet 1, StatefulSet 2 in a K8s cluster

Functions runtime configuration
Resource   Specified as      Runtime
CPU        Number of cores   Docker, K8s
RAM        Number of bytes   Docker, K8s
Disk       Number of bytes   Docker, K8s
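Since RAM and disk are given in bytes, it is easy to be off by a factor of 1024. A small sketch that converts friendlier units into the `--cpu`, `--ram`, and `--disk` flags accepted by the `pulsar-admin` create commands; the helper name and its MiB/GiB parameters are my own choice:

```python
def resource_flags(cpu_cores: float, ram_mib: int, disk_gib: int) -> list:
    """Build resource flags: CPU in cores, RAM and disk converted to bytes."""
    return [
        "--cpu", str(cpu_cores),
        "--ram", str(ram_mib * 1024 ** 2),
        "--disk", str(disk_gib * 1024 ** 3),
    ]
```

`resource_flags(0.5, 512, 10)` gives `--cpu 0.5 --ram 536870912 --disk 10737418240`.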

Installing and Managing Pulsar Connectors

The bag of gold for Pulsar connectors
[Diagram: Pulsar Connectors, reached through StreamNative Hub, Pulsar CLI, GitHub]

Two types of connectors
1. Built-in connectors
2. Custom connectors: Custom Source 1, Custom Source 2, Custom Sink 1

StreamNative Hub: home of connectors
https://hub.streamnative.io

StreamNative Hub: code and examples

Pulsar website: code and examples

Getting started with Pulsar I/O
• For development
  ➢ Use the “pulsar-all” Docker image. It includes all connectors.
  ➢ Run as a thread with the “localrun” option from the admin CLI.
• For production
  ➢ Install the connectors on all brokers/function workers.
  ➢ Connectors are .nar files in the ./connectors directory.

Verifying your Pulsar I/O setup
Checking which source connectors are available
Result should be:

Verifying your Pulsar I/O setup
Checking which sink connectors are available
Result should be:
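The commands themselves are not in the slide text. The `pulsar-admin` subcommands I would reach for are `sources available-sources` and `sinks available-sinks`. The sketch below only assembles the argument list; the function name and the `bin/pulsar-admin` default path are assumptions.

```python
def available_connectors_cmd(kind: str, admin: str = "bin/pulsar-admin") -> list:
    """Return the argv that lists the built-in connectors of the given kind."""
    subcommands = {
        "source": ["sources", "available-sources"],
        "sink": ["sinks", "available-sinks"],
    }
    if kind not in subcommands:
        raise ValueError(f"kind must be 'source' or 'sink', got {kind!r}")
    return [admin, *subcommands[kind]]
```

Pass the result to `subprocess.run` on a machine where the connectors are installed.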

Testing connectors with localrun:
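A rough sketch of what a sink `localrun` invocation looks like, built as an argument list so it can be handed to `subprocess.run`. The flags are the ones I know from `pulsar-admin sinks localrun`; the function name, defaults, and the file names in the usage note are placeholders.

```python
def sink_localrun_cmd(archive, name, inputs, config_file, parallelism=1,
                      tenant="public", namespace="default",
                      admin="bin/pulsar-admin"):
    """Assemble the argv for running a sink connector locally."""
    return [
        admin, "sinks", "localrun",
        "--archive", archive,              # path to the connector .nar file
        "--tenant", tenant,
        "--namespace", namespace,
        "--name", name,
        "--inputs", ",".join(inputs),      # topics the sink reads from
        "--sink-config-file", config_file,
        "--parallelism", str(parallelism), # number of connector instances
    ]
```

For example: `sink_localrun_cmd("connectors/my-sink.nar", "my-sink", ["my-topic"], "my-sink.yml")`. Raising `parallelism` is a cheap way to see how the connector behaves with more than one instance.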

Troubleshooting and Debugging Techniques

How to investigate Pulsar I/O issues
• Metrics
• Logs
• Traces
• Source code
• Proxy
• Localrun
• Breakpoints
• Stats

Problem: my connector is not running
1. Check the connector configuration

Problem: my connector is not running
2. Check the current connector status
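The status comes back as JSON, for example from `pulsar-admin sinks status --tenant ... --namespace ... --name ...`. A small sketch that picks out the instances that are not running, assuming the output has an `instances` array whose entries carry `instanceId` and a `status` object with `running` and `error`; the function name is made up.

```python
import json


def not_running_instances(status_json: str) -> list:
    """Return (instanceId, error) pairs for instances that are not running."""
    status = json.loads(status_json)
    return [
        (inst["instanceId"], inst["status"].get("error", ""))
        for inst in status.get("instances", [])
        if not inst["status"].get("running", False)
    ]
```

An empty result means every instance reports `running: true`; anything else tells you which instance to look at first and what error it reported.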

Problem: my connector is not running
3. Check the status from the topic
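For a sink, the topic side of the story is the subscription backlog. Assuming the JSON printed by `pulsar-admin topics stats <topic>` has a `subscriptions` object whose entries include `msgBacklog`, this sketch summarizes the backlog per subscription; the function name is hypothetical.

```python
import json


def backlog_by_subscription(stats_json: str) -> dict:
    """Map each subscription on the topic to its message backlog."""
    stats = json.loads(stats_json)
    return {
        name: sub.get("msgBacklog", 0)
        for name, sub in stats.get("subscriptions", {}).items()
    }
```

A backlog that keeps growing while the connector claims to be running suggests the sink is subscribed but not consuming. No subscription at all suggests it never attached to the topic.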

Problem: my connector is not running
4. Check the connector logs
[Log path annotated with: Tenant, Namespace, Connector Name]
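The three labels on the slide are the path segments of the log file. A sketch of how the path is put together, assuming the `logs/functions/<tenant>/<namespace>/<name>/<name>-<instance>.log` layout that I have seen with the process runtime; the helper name and the default `logs_dir` are assumptions.

```python
from pathlib import PurePosixPath


def connector_log_path(tenant: str, namespace: str, name: str,
                       instance_id: int = 0,
                       logs_dir: str = "logs/functions") -> PurePosixPath:
    """Build the expected log file path for one connector instance."""
    return PurePosixPath(logs_dir) / tenant / namespace / name / f"{name}-{instance_id}.log"
```

`connector_log_path("public", "default", "my-sink")` points at `logs/functions/public/default/my-sink/my-sink-0.log`.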

Problem: my connector is not running
5. Debug with localrun
Play with the number of connector threads

Problem: Sink is not receiving any data
Wiretap 🔎 🧐 with mitmproxy
-Dhttp.proxyHost=mitmproxyhost -Dhttp.proxyPort=8080
https://mitmproxy.org

Problem: Multiple clusters and logs
Meet the Beats 😎
[Diagram: Beats, Logstash (parse and enhance), Elasticsearch, Kibana (visualize)]

Problem: I don’t know what to do
What about… when you have no idea what is going on?

Debug the connector’s code
1. Enable JDWP on Pulsar
2. Configure the function runtime to thread
3. Attach your IDE to the JVM
4. Set breakpoints and debug
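For step 1, the JVM flag is the standard JDWP agent option. How it reaches the Pulsar JVM is an assumption on my side: I would export it through the `PULSAR_EXTRA_OPTS` environment variable before starting the broker or worker. For step 2, my understanding is that `functions_worker.yml` takes `functionRuntimeFactoryClassName: org.apache.pulsar.functions.runtime.thread.ThreadRuntimeFactory`, so the connector runs inside the JVM you attach to. The helper below is hypothetical and only builds the flag.

```python
def jdwp_option(port: int = 5005, suspend: bool = False) -> str:
    """Build the JVM flag that opens a JDWP socket for a remote debugger."""
    flag = "y" if suspend else "n"
    # `*:<port>` listens on all interfaces (JDK 9+); on JDK 8 use just the port.
    return f"-agentlib:jdwp=transport=dt_socket,server=y,suspend={flag},address=*:{port}"
```

With `suspend=True` the JVM waits for the debugger before starting, which helps when the failure happens while the connector is opening.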

THANK YOU 🙂