SlideShare a Scribd company logo
1 of 29
™
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
Avro
Kafka & Avro:
Confluent Schema
Registry
Managing Record Schema in
Kafka
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Confluent Schema Registry
❖ Confluent Schema Registry stores Avro Schemas for Kafka
clients
❖ Provides REST interface for putting and getting Avro schemas
❖ Stores a history of schemas
❖ versioned
❖ allows you to configure compatibility setting
❖ supports evolution of schemas
❖ Provides serializers used by Kafka clients which handles schema
storage and serialization of records using Avro
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Why Schema Registry?
❖ Producer creates a record/message, which is an Avro record
❖ Record contains the schema and data
❖ Schema Registry Avro Serializer serializes the data and schema id (just id)
❖ Keeps a cache of registered schemas from Schema Registry to ids
❖ Consumer receives payload and deserializes it with Schema Registry Avro Deserializers
❖ Deserializer looks up the full schema from cache or Schema Registry based on id
❖ Consumer has its schema, one it is expecting record/message to conform to
❖ Compatibility check is performed or two schemas
❖ if no match, but are compatible, then payload transformation happens aka Schema Evolution
❖ if not failure
❖ Kafka records have Key and Value and schema can be done on both
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Schema Compatibility
❖ Backward Compatibility (default)
❖ New, backward compatible schema, will not break consumers
❖ Producers could be using older schema that is backwards compatible with Consumer
❖ Forward compatibility
❖ Records sent with new forward compatible schema can be deserialized with older schemas
❖ Consumers can use an older schema and never be updated (maybe never needs new fields)
❖ Full compatibility
❖ New version of a schema is backward and forward compatible
❖ None
❖ Schema will not be validated for compatibility at all
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Schema Registry Config
❖ Compatibility can be configured globally or per schema
❖ Options are:
❖ NONE - don’t check for schema compatibility
❖ FORWARD - check to make sure last schema version is forward
compatible with new schemas
❖ BACKWARDS (default) - make sure new schema is backwards
compatible with latest
❖ FULL - make sure new schema is forwards and backwards
compatible from latest to new and from new to latest
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Schema Registry Actions
❖ Register schemas for key and values of Kafka records
❖ List schemas (subjects)
❖ List all versions of a subject (schema)
❖ Retrieve a schema by version or id
❖ get latest version of schema
❖ Check to see if schema is compatible with a certain version
❖ Get the compatibility level setting of the Schema Registry
❖ BACKWARDS, NONE
❖ Add compatibility settings to a subject/schema
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Schema Evolution
❖ Avro schema is changed after data has been written to store using an older version of
that schema, then Avro might do a Schema Evolution
❖ Schema evolution is automatic transformation of Avro schema
❖ transformation is between version of consumer schema and what the producer put
into the Kafka log
❖ When Consumer schema is not identical to the Producer schema used to serialize
the Kafka Record then a data transformation is performed on the Kafka record (key or
value)
❖ If the schemas match then no need to do a transformation
❖ Schema evolution is happens only during deserialization at the Consumer
❖ If Consumer’s schema is different from Producer’s schema, then value or key is
automatically modified during deserialization to conform to consumers reader schema
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Allowed Schema Modifications
❖ Add a field with a default
❖ Remove a field that had a default value
❖ Change a fields order attribute
❖ Change a fields default value
❖ Remove or add a field alias
❖ Remove or add a type alias
❖ Change a type to a union that contains original type
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Rules of the Road for modifying
Schema
❖ Provide a default value for fields in your schema
❖ Allows you to delete the field later later
❖ Don’t change a field's data type
❖ When adding a new field to your schema, you have to
provide a default value for the field
❖ Don’t rename an existing field
❖ You can add an alias
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Remember our example
Employee
Avro covered in Avro/Kafka Tutorial
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Let’s say
❖ Employee did not have an age in version 1 of the
schema
❖ Later we decided to add an age field with a default value
of -1
❖ Now let’s say we have a Producer using version 2, and
a Consumer using version 1
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Scenario adding a new field age with
default value
❖ Producer uses version 2 of the Employee schema and creates a
com.cloudurable.Employee record, and sets age field to 42, then sends it to Kafka topic
new-employees
❖ Consumer consumes records from new-employees using version 1 of the Employee
Schema
❖ Since Consumer is using version 1 of schema, age field is removed during
deserialization
❖ Same consumer modifies name field and then writes the record back to a NoSQL store
❖ When it does this, the age field is missing from value that it writes to the store
❖ Another client using version 2 reads the record from the NoSQL store
❖ Age field is missing from the record (because the Consumer wrote it with version 1),
age is set to default value of -1
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Schema Registry Actions
❖ Register schemas for key and values of Kafka records
❖ List schemas (subjects)
❖ List all versions of a subject (schema)
❖ Retrieve a schema by version or id
❖ get latest version of schema
❖ Check to see if schema is compatible with a certain version
❖ Get the compatibility level setting of the Schema Registry
❖ BACKWARDS, FORWARD, FULL, NONE
❖ Add compatibility settings to a subject/schema
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Register a Schema
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Register a Schema
{"id":2}
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json"
--data '{"schema": "{"type": …}’ 
http://localhost:8081/subjects/Employee/versions
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
List All Schema
["Employee","Employee2","FooBar"]
curl -X GET http://localhost:8081/subjects
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Working with versions
[1,2,3,4,5]
{“subject”:"Employee","version":2,"id":4,"schema":"
{"type":"record","name":"Employee",
”namespace”:"com.cloudurable.phonebook", …
{“subject”:"Employee","version":1,"id":3,"schema":"
{"type":"record","name":"Employee",
”namespace”:"com.cloudurable.phonebook", …
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Working with Schemas
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Changing Compatibility
Checks
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Incompatible Change
{“error_code":409,"
message":"Schema being registered is incompatible with an e
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Incompatible Change
{"is_compatible":false}
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Use Schema Registry
❖ Start up Schema Registry server pointing to Zookeeper
cluster
❖ Import Kafka Avro Serializer and Avro Jars
❖ Configure Producer to use Schema Registry
❖ Use KafkaAvroSerializer from Producer
❖ Configure Consumer to use Schema Registry
❖ Use KafkaAvroDeserializer from Consumer
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Start up Schema Registry
Server
cat ~/tools/confluent-3.2.1/etc/schema-registry/schema-registry.properties
listeners=http://0.0.0.0:8081
kafkastore.connection.url=localhost:2181
kafkastore.topic=_schemas
debug=false
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Import Kafka Avro Serializer &
Avro Jars
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Configure Producer to use Schema
Registry
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Use KafkaAvroSerializer from
Producer
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Configure Consumer to use Schema
Registry
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Use KafkaAvroDeserializer from
Consumer
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
™
Schema Registry
❖ Confluent provides Schema Registry to manage Avro
Schemas for Kafka Consumers and Producers
❖ Avro provides Schema Migration
❖ Confluent uses Schema compatibility checks to see if
Producer schema and Consumer schemas are
compatible and to do Schema evolution if needed
❖ Use KafkaAvroSerializer from Producer
❖ Use KafkaAvroDeserializer from Consumer

More Related Content

What's hot

Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafkaconfluent
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector? confluent
 
Apache Kafka® and API Management
Apache Kafka® and API ManagementApache Kafka® and API Management
Apache Kafka® and API Managementconfluent
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producerconfluent
 
Securing Kafka
Securing Kafka Securing Kafka
Securing Kafka confluent
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaJiangjie Qin
 
Kafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platformKafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platformJean-Paul Azar
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database Systemconfluent
 

What's hot (20)

Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector?
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Apache Kafka® and API Management
Apache Kafka® and API ManagementApache Kafka® and API Management
Apache Kafka® and API Management
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 
Apache Kafka - Overview
Apache Kafka - OverviewApache Kafka - Overview
Apache Kafka - Overview
 
Securing Kafka
Securing Kafka Securing Kafka
Securing Kafka
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
Kafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platformKafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platform
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 

Similar to Kafka Schema Registry for Avro Schema Management

Kafka Tutorial - Introduction to Apache Kafka (Part 2)
Kafka Tutorial - Introduction to Apache Kafka (Part 2)Kafka Tutorial - Introduction to Apache Kafka (Part 2)
Kafka Tutorial - Introduction to Apache Kafka (Part 2)Jean-Paul Azar
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersJean-Paul Azar
 
Brief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformBrief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformJean-Paul Azar
 
Kafka Tutorial - DevOps, Admin and Ops
Kafka Tutorial - DevOps, Admin and OpsKafka Tutorial - DevOps, Admin and Ops
Kafka Tutorial - DevOps, Admin and OpsJean-Paul Azar
 
Avro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and HadoopAvro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and HadoopJean-Paul Azar
 
Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...
Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...
Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...Jean-Paul Azar
 
Kafka Tutorial, Kafka ecosystem with clustering examples
Kafka Tutorial, Kafka ecosystem with clustering examplesKafka Tutorial, Kafka ecosystem with clustering examples
Kafka Tutorial, Kafka ecosystem with clustering examplesJean-Paul Azar
 
Kafka Tutorial: Streaming Data Architecture
Kafka Tutorial: Streaming Data ArchitectureKafka Tutorial: Streaming Data Architecture
Kafka Tutorial: Streaming Data ArchitectureJean-Paul Azar
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersJean-Paul Azar
 
Kafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platformKafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platformJean-Paul Azar
 
Kafka Tutorial Advanced Kafka Consumers
Kafka Tutorial Advanced Kafka ConsumersKafka Tutorial Advanced Kafka Consumers
Kafka Tutorial Advanced Kafka ConsumersJean-Paul Azar
 
Infrastructure as code deployed using Stacker
Infrastructure as code deployed using StackerInfrastructure as code deployed using Stacker
Infrastructure as code deployed using StackerMessageMedia
 
Kafka meetup - kafka connect
Kafka meetup -  kafka connectKafka meetup -  kafka connect
Kafka meetup - kafka connectYi Zhang
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kai Wähner
 
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...HostedbyConfluent
 
Integration for real-time Kafka SQL
Integration for real-time Kafka SQLIntegration for real-time Kafka SQL
Integration for real-time Kafka SQLAmit Nijhawan
 
Connecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBConnecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBJitendra Bafna
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streamsYoni Farin
 

Similar to Kafka Schema Registry for Avro Schema Management (20)

Kafka Tutorial - Introduction to Apache Kafka (Part 2)
Kafka Tutorial - Introduction to Apache Kafka (Part 2)Kafka Tutorial - Introduction to Apache Kafka (Part 2)
Kafka Tutorial - Introduction to Apache Kafka (Part 2)
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer Consumers
 
Brief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformBrief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming Platform
 
Kafka Tutorial - DevOps, Admin and Ops
Kafka Tutorial - DevOps, Admin and OpsKafka Tutorial - DevOps, Admin and Ops
Kafka Tutorial - DevOps, Admin and Ops
 
Avro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and HadoopAvro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and Hadoop
 
Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...
Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...
Kafka MirrorMaker: Disaster Recovery, Scaling Reads, Isolate Mission Critical...
 
Kafka Tutorial, Kafka ecosystem with clustering examples
Kafka Tutorial, Kafka ecosystem with clustering examplesKafka Tutorial, Kafka ecosystem with clustering examples
Kafka Tutorial, Kafka ecosystem with clustering examples
 
Kafka Tutorial: Streaming Data Architecture
Kafka Tutorial: Streaming Data ArchitectureKafka Tutorial: Streaming Data Architecture
Kafka Tutorial: Streaming Data Architecture
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced Producers
 
Kafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platformKafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platform
 
Kafka Tutorial Advanced Kafka Consumers
Kafka Tutorial Advanced Kafka ConsumersKafka Tutorial Advanced Kafka Consumers
Kafka Tutorial Advanced Kafka Consumers
 
Infrastructure as code deployed using Stacker
Infrastructure as code deployed using StackerInfrastructure as code deployed using Stacker
Infrastructure as code deployed using Stacker
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
 
Kafka meetup - kafka connect
Kafka meetup -  kafka connectKafka meetup -  kafka connect
Kafka meetup - kafka connect
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)
 
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
 
Integration for real-time Kafka SQL
Integration for real-time Kafka SQLIntegration for real-time Kafka SQL
Integration for real-time Kafka SQL
 
Connecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBConnecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESB
 
spark-kafka_mod
spark-kafka_modspark-kafka_mod
spark-kafka_mod
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
 

Recently uploaded

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Recently uploaded (20)

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

Kafka Schema Registry for Avro Schema Management

  • 1. ™ Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting Avro Kafka & Avro: Confluent Schema Registry Managing Record Schema in Kafka
  • 2. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Confluent Schema Registry ❖ Confluent Schema Registry stores Avro Schemas for Kafka clients ❖ Provides REST interface for putting and getting Avro schemas ❖ Stores a history of schemas ❖ versioned ❖ allows you to configure compatibility setting ❖ supports evolution of schemas ❖ Provides serializers used by Kafka clients which handles schema storage and serialization of records using Avro
  • 3. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Why Schema Registry? ❖ Producer creates a record/message, which is an Avro record ❖ Record contains the schema and data ❖ Schema Registry Avro Serializer serializes the data and schema id (just id) ❖ Keeps a cache of registered schemas from Schema Registry to ids ❖ Consumer receives payload and deserializes it with Schema Registry Avro Deserializers ❖ Deserializer looks up the full schema from cache or Schema Registry based on id ❖ Consumer has its schema, one it is expecting record/message to conform to ❖ Compatibility check is performed or two schemas ❖ if no match, but are compatible, then payload transformation happens aka Schema Evolution ❖ if not failure ❖ Kafka records have Key and Value and schema can be done on both
  • 4. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Schema Compatibility ❖ Backward Compatibility (default) ❖ New, backward compatible schema, will not break consumers ❖ Producers could be using older schema that is backwards compatible with Consumer ❖ Forward compatibility ❖ Records sent with new forward compatible schema can be deserialized with older schemas ❖ Consumers can use an older schema and never be updated (maybe never needs new fields) ❖ Full compatibility ❖ New version of a schema is backward and forward compatible ❖ None ❖ Schema will not be validated for compatibility at all
  • 5. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Schema Registry Config ❖ Compatibility can be configured globally or per schema ❖ Options are: ❖ NONE - don’t check for schema compatibility ❖ FORWARD - check to make sure last schema version is forward compatible with new schemas ❖ BACKWARDS (default) - make sure new schema is backwards compatible with latest ❖ FULL - make sure new schema is forwards and backwards compatible from latest to new and from new to latest
  • 6. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Schema Registry Actions ❖ Register schemas for key and values of Kafka records ❖ List schemas (subjects) ❖ List all versions of a subject (schema) ❖ Retrieve a schema by version or id ❖ get latest version of schema ❖ Check to see if schema is compatible with a certain version ❖ Get the compatibility level setting of the Schema Registry ❖ BACKWARDS, NONE ❖ Add compatibility settings to a subject/schema
  • 7. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Schema Evolution ❖ Avro schema is changed after data has been written to store using an older version of that schema, then Avro might do a Schema Evolution ❖ Schema evolution is automatic transformation of Avro schema ❖ transformation is between version of consumer schema and what the producer put into the Kafka log ❖ When Consumer schema is not identical to the Producer schema used to serialize the Kafka Record then a data transformation is performed on the Kafka record (key or value) ❖ If the schemas match then no need to do a transformation ❖ Schema evolution is happens only during deserialization at the Consumer ❖ If Consumer’s schema is different from Producer’s schema, then value or key is automatically modified during deserialization to conform to consumers reader schema
  • 8. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Allowed Schema Modifications ❖ Add a field with a default ❖ Remove a field that had a default value ❖ Change a fields order attribute ❖ Change a fields default value ❖ Remove or add a field alias ❖ Remove or add a type alias ❖ Change a type to a union that contains original type
  • 9. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Rules of the Road for modifying Schema ❖ Provide a default value for fields in your schema ❖ Allows you to delete the field later later ❖ Don’t change a field's data type ❖ When adding a new field to your schema, you have to provide a default value for the field ❖ Don’t rename an existing field ❖ You can add an alias
  • 10. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Remember our example Employee Avro covered in Avro/Kafka Tutorial
  • 11. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Let’s say ❖ Employee did not have an age in version 1 of the schema ❖ Later we decided to add an age field with a default value of -1 ❖ Now let’s say we have a Producer using version 2, and a Consumer using version 1
  • 12. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Scenario adding a new field age with default value ❖ Producer uses version 2 of the Employee schema and creates a com.cloudurable.Employee record, and sets age field to 42, then sends it to Kafka topic new-employees ❖ Consumer consumes records from new-employees using version 1 of the Employee Schema ❖ Since Consumer is using version 1 of schema, age field is removed during deserialization ❖ Same consumer modifies name field and then writes the record back to a NoSQL store ❖ When it does this, the age field is missing from value that it writes to the store ❖ Another client using version 2 reads the record from the NoSQL store ❖ Age field is missing from the record (because the Consumer wrote it with version 1), age is set to default value of -1
  • 13. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Schema Registry Actions ❖ Register schemas for key and values of Kafka records ❖ List schemas (subjects) ❖ List all versions of a subject (schema) ❖ Retrieve a schema by version or id ❖ get latest version of schema ❖ Check to see if schema is compatible with a certain version ❖ Get the compatibility level setting of the Schema Registry ❖ BACKWARDS, FORWARD, FULL, NONE ❖ Add compatibility settings to a subject/schema
  • 14. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Register a Schema
  • 15. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Register a Schema {"id":2} curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{"type": …}’ http://localhost:8081/subjects/Employee/versions
  • 16. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ List All Schema ["Employee","Employee2","FooBar"] curl -X GET http://localhost:8081/subjects
  • 17. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Working with versions [1,2,3,4,5] {“subject”:"Employee","version":2,"id":4,"schema":" {"type":"record","name":"Employee", ”namespace”:"com.cloudurable.phonebook", … {“subject”:"Employee","version":1,"id":3,"schema":" {"type":"record","name":"Employee", ”namespace”:"com.cloudurable.phonebook", …
  • 18. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Working with Schemas
  • 19. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Changing Compatibility Checks
  • 20. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Incompatible Change {“error_code":409," message":"Schema being registered is incompatible with an e
  • 21. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Incompatible Change {"is_compatible":false}
  • 22. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Use Schema Registry ❖ Start up Schema Registry server pointing to Zookeeper cluster ❖ Import Kafka Avro Serializer and Avro Jars ❖ Configure Producer to use Schema Registry ❖ Use KafkaAvroSerializer from Producer ❖ Configure Consumer to use Schema Registry ❖ Use KafkaAvroDeserializer from Consumer
  • 23. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Start up Schema Registry Server cat ~/tools/confluent-3.2.1/etc/schema-registry/schema-registry.properties listeners=http://0.0.0.0:8081 kafkastore.connection.url=localhost:2181 kafkastore.topic=_schemas debug=false
  • 24. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Import Kafka Avro Serializer & Avro Jars
  • 25. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Configure Producer to use Schema Registry
  • 26. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Use KafkaAvroSerializer from Producer
  • 27. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Configure Consumer to use Schema Registry
  • 28. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Use KafkaAvroDeserializer from Consumer
  • 29. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Schema Registry ❖ Confluent provides Schema Registry to manage Avro Schemas for Kafka Consumers and Producers ❖ Avro provides Schema Migration ❖ Confluent uses Schema compatibility checks to see if Producer schema and Consumer schemas are compatible and to do Schema evolution if needed ❖ Use KafkaAvroSerializer from Producer ❖ Use KafkaAvroDeserializer from Consumer