Kafka Deep Dive w/ a Ex-Meta Staff Engineer

Hello Interview - SWE Interview Preparation

10 months ago

189,698 views

Comments:

@mikehoran9484 - 19.03.2025 05:10

I think your broker/partition diagram is a little confusing. A 3-broker cluster is the usual minimum, because a replication factor of 3 needs 3 brokers, each holding one replica of the partition (a leader plus 2 followers). You talk about a "partition going down", but it's really the broker that could go down. So if broker 1, which holds the partition 1 leader, goes down, then broker 2 (follower 1) or broker 3 (follower 2) will take over as the leader.
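
A minimal sketch of the failover described in this comment, assuming one partition with replication factor 3 spread over brokers 1, 2, and 3. The `Partition` class and `handle_broker_failure` method are hypothetical names for illustration, not Kafka APIs, and promoting the first surviving in-sync replica is a simplification of what the controller actually does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Partition:
    """One partition: the broker id of its leader plus its in-sync replicas."""
    leader: int
    in_sync_replicas: List[int] = field(default_factory=list)

    def handle_broker_failure(self, failed_broker: int) -> None:
        # The failed broker can no longer be an in-sync replica.
        self.in_sync_replicas = [
            b for b in self.in_sync_replicas if b != failed_broker
        ]
        if self.leader == failed_broker:
            if not self.in_sync_replicas:
                raise RuntimeError("no in-sync replica left to promote")
            # Simplification: promote the first surviving in-sync replica.
            self.leader = self.in_sync_replicas[0]


partition = Partition(leader=1, in_sync_replicas=[1, 2, 3])
partition.handle_broker_failure(1)  # broker 1 (the leader) goes down
print(partition.leader, partition.in_sync_replicas)
```

The partition itself never "goes down" here; only a broker does, and leadership moves to a follower on a surviving broker.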

@张萝卜叶 - 17.03.2025 22:06

This is an amazing overview holy...

@Rkkhandelwal14 - 16.03.2025 10:44

How will a consumer know the first offset value it needs to pass when reading from Kafka?

@ethanyang5265 - 16.03.2025 04:35

Awesome video

@4plucas - 13.03.2025 18:20

Thank you for sharing ❤

@okdotpy - 12.03.2025 00:52

Need a video on comparing streams/MQ services.

Like what are the driving decisions between using Kinesis, Kafka, SNS/SQS, Redis pubsub, RabbitMQ, etc

@mario-a77 - 07.03.2025 12:35

Got a question about compound keys. Some docs say the maximum is 4,000 partitions per broker and 200,000 per cluster. That doesn't seem to be enough if you set the key to AdId:userID. The number of keys will easily exceed this limit.
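
On the concern in this comment: keys are hashed onto a fixed set of partitions, so the number of distinct keys is not bounded by the partition count. A minimal sketch, assuming a hypothetical topic with 12 partitions; `partition_for` is an illustrative helper, and CRC32 is only a stand-in for the murmur2 hash that Kafka's default partitioner uses.

```python
import zlib

NUM_PARTITIONS = 12  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# 100 ads x 1,000 users = 100,000 distinct compound keys.
keys = [f"ad{ad}:user{user}" for ad in range(100) for user in range(1000)]
used_partitions = {partition_for(k) for k in keys}
print(len(keys), "distinct keys map onto", len(used_partitions), "partitions")
```

Many keys share each partition; the same key always lands on the same partition, which is what preserves per-key ordering.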

@GabrielAugustoDeVito - 06.03.2025 17:51

Great lesson!

I have a CRM system that receives events from the client (example: a card was moved, or a card was updated)

And whenever the event goes to the queue, I would like to:

- send to a webhook
- send to a socket service
- send to an automatic actions service

Every event should be processed in order (order matters), and all three destinations must be guaranteed to process each event.

in this case, should I use consumer groups?

Also, is there a way I have multiple consumers for each consumer group?

ty

@AakashIyer21 - 28.02.2025 00:40

Amazing content. I had a few questions

1. Where does the broker keep the append-only log file? Is it kept on disk or in memory?

2. When the consumer tries to read the next message, does the broker do a sequential disk read (assuming it is on disk)?

Thanks!

@270MinutesLater - 27.02.2025 21:06

very well explained!

@furrygoldennuts - 23.02.2025 02:30

With idempotence enabled, acks=all is automatically applied, requiring acknowledgments from all in-sync replicas.

@gaurav1064 - 22.02.2025 18:03

Is this bluescape where you are designing?

@shaunakkakade1325 - 21.02.2025 02:16

Is Zookeeper part of the Interview when discussing Kafka? I know Kafka can run without ZK these days but just wondering how much Zookeeper knowledge is required?

@YeGaogaosanyong - 18.02.2025 00:08

Thanks. So what are the other 4 technologies in your top 5?

@qeetcode - 18.02.2025 00:06

This is super helpful. Thanks a million.
In the meantime, can you also shed light on the delivery mode of Kafka? In particular, how is the "exactly once" delivery achieved?

@rak590 - 17.02.2025 19:13

1. Do consumers commit an offset for every message they process, or can they keep processing messages and commit the offset once in a while?
2. Would love to know about Kafka Streams and Kafka channels with motivating examples :)
thanks @hellointerview & @evan!
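
On question 1: Kafka clients support both styles, and committing only periodically is common. The trade-off can be sketched with a toy model, assuming an in-memory list as the partition log; `consume`, `commit_every`, and `crash_after` are hypothetical names for illustration, not client API calls.

```python
from typing import List, Tuple


def consume(log: List[str], committed: int, commit_every: int,
            crash_after: int = -1) -> Tuple[List[str], int]:
    """Process messages starting at `committed`, committing every N offsets.

    Returns (messages processed in this run, last committed offset).
    `crash_after` simulates a crash once that many messages were processed.
    """
    processed: List[str] = []
    for offset in range(committed, len(log)):
        if crash_after >= 0 and len(processed) == crash_after:
            return processed, committed  # progress since last commit is lost
        processed.append(log[offset])
        if (offset + 1) % commit_every == 0:
            committed = offset + 1  # committed offset = next offset to read
    return processed, len(log)  # clean shutdown: commit the final position


log = [f"m{i}" for i in range(10)]
first_run, committed = consume(log, committed=0, commit_every=4, crash_after=6)
second_run, committed = consume(log, committed=committed, commit_every=4)
reprocessed = sorted(set(first_run) & set(second_run))
print(reprocessed)
```

Messages handled after the last commit but before the crash are delivered again on restart, which is why periodic commits give at-least-once processing and consumers usually need to be idempotent.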

@megawooloos - 14.02.2025 03:59

Dumb question, but does Broker 2 need to have a leader for Topic A? Or does Apache ZooKeeper (or some broker manager) manage partition follower replication between brokers?

@mcdaddy1334 - 13.02.2025 21:45

This is by far the best channel for discussing system design topics. My goodness, everything is structured so well. Well done!

@Anonymous2334-p8j - 10.02.2025 23:31

I feel like half way you missed the topic a bit since Kafka is not really a message queue.

@Yusuf07HD - 10.02.2025 05:28

absolute best content. keep up with great work :)

@eversmart2672 - 07.02.2025 22:22

!!!AWESOME!!! & THANK YOU

@souravmondal6478 - 06.02.2025 10:20

Question: How does Kafka ensure or maintain consistency between partitions (leader and followers)?

@laghavmohan7210 - 04.02.2025 13:54

A Dead Letter Queue in Kafka is one or more Kafka topics that receive and store messages that could not be processed

@mangeshshikrodkar6192 - 31.01.2025 10:11

Aim for 1 MB per message. And a broker can take 1 TB of data and 10k messages per second. That would leave us with a 100 MB message size, which may not be good practice. Instead, why not have 1M messages per second (given each message is 1 MB)? Can't we scale better this way? In fact, if the message size is small, we can bump up the rate, e.g., 1 kB messages (for small text messages like feeds or WhatsApp) and 10^9 messages per second.

@rsKayiira - 28.01.2025 07:57

Please do one for Cassandra as well

@rsKayiira - 28.01.2025 07:42

This is excellent

@CrazyHunk14 - 27.01.2025 17:34

This is an amazing explanation for a beginner like me in system design! I really appreciate the videos—please keep them coming. I’m hoping to crack into FAANG companies in the next few months.

@adamobrien8343 - 27.01.2025 04:02

Best video I have seen on the topic. Thank you!

@deem3365 - 25.01.2025 13:32

Thank you for this, it really enlightened me after using Kafka at work for so long

@asterixcode - 19.01.2025 20:15

Have you seen use cases with the retention period set to infinite, and what would be some examples of that? (retention.ms and retention.bytes both set to -1)

@isaifahmad - 15.01.2025 23:11

Thanks for this insightful session; it is incredibly informative, well structured and highly useful for a novice or an expert of Kafka. Your method of setting real life examples and ability to emulate consumer (not Kafka consumer 😊) mindset and going behind the reasoning is exemplary; great job 👍🏼

@lokesh2608 - 13.01.2025 01:37

This is such an awesome video. It was a good introduction and prompted me to follow up on various details, like how I can increase the partition count, what the antipatterns are, dynamically adding topics, adding brokers, and other operational management considerations. This is just awesome, great job.

@lokesh2608 - 13.01.2025 01:31

For consumer retries, why use a retry topic? Why not put the message back into the main topic? Is the suggestion that when putting a message on the retry topic, we also attach a "number of retries" metadata field that we keep incrementing (and that the original producer doesn't need)? Or is there another reason? Say, because we might want a separate consumer group to process retries for whatever reason?
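
The idea this comment guesses at can be sketched as follows, assuming in-memory deques stand in for the main and retry topics and a plain list stands in for a dead-letter topic. `MAX_ATTEMPTS`, `run`, and the `attempts` field are hypothetical names for illustration; a real setup would carry the counter in message headers or payload.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

MAX_ATTEMPTS = 3  # hypothetical retry budget


def run(main: Deque[Dict], handler: Callable[[str], None]) -> List[Dict]:
    """Drain `main`; failures go to a retry queue, then to a dead-letter list."""
    retry: Deque[Dict] = deque()
    dead_letters: List[Dict] = []
    # The main queue is drained first, so failing messages never block it.
    for queue in (main, retry):
        while queue:
            msg = queue.popleft()
            try:
                handler(msg["value"])
            except Exception:
                # The attempt counter lives only on retry messages;
                # the original producer never has to set it.
                attempts = msg.get("attempts", 0) + 1
                failed = {"value": msg["value"], "attempts": attempts}
                if attempts < MAX_ATTEMPTS:
                    retry.append(failed)
                else:
                    dead_letters.append(failed)
    return dead_letters


handled: List[str] = []


def handler(value: str) -> None:
    if value == "bad":
        raise ValueError("cannot process")
    handled.append(value)


dead = run(deque([{"value": "ok"}, {"value": "bad"}]), handler)
print(handled, dead)
```

Keeping retries out of the main topic means healthy messages are not delayed behind a failing one, and the retry consumers can be scaled or throttled separately.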

@andrehil - 12.01.2025 21:28

Quick question: why use composite keys? Isn't it better to then just use option 1 (no keys)?
After all, you lose ordering anyway.

@andrehil - 12.01.2025 21:26

Awesome video!
Quick feedback: please zoom a bit more when there's text, it's a little hard to read 😅
The font itself is also a bit hard to read.

@venil82 - 12.01.2025 20:09

wish i had discovered this video earlier

@ShreyanshNayak-j1c - 11.01.2025 17:21

One of the best videos and explanation on the entire internet. Thanks a lot...

@BhavikSankesara - 09.01.2025 15:42

Exceptionally insightful video. Nice dive-deep

@nikkinic112 - 08.01.2025 09:30

Great video on Kafka.

Question: It seems Kafka can do it all now. When would we still want to use Amazon SQS or RabbitMQ (which use cases)?

@Conqwer - 06.01.2025 11:14

Wrt. hot partitions, wouldn't it help to scale the consumer group? Or did I miss a constraint that wouldn't make this feasible?
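
One constraint relevant to this question: within a consumer group, each partition is read by at most one consumer, so adding consumers does not split the load of a single hot partition. A minimal sketch, assuming a simple round-robin assignment; `assign` is a hypothetical helper, not one of Kafka's actual assignor implementations.

```python
from typing import Dict, List


def assign(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Round-robin partitions over consumers; each partition gets one owner."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = [0, 1, 2]  # suppose partition 0 is the hot one
small_group = assign(partitions, ["c1", "c2"])
large_group = assign(partitions, ["c1", "c2", "c3", "c4", "c5"])
print(small_group)
print(large_group)
```

In the larger group, partition 0 still has a single owner and the extra consumers sit idle, so relieving a hot partition generally means changing how keys map to partitions rather than adding consumers.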

@Conqwer - 06.01.2025 10:28

A great video, thank you!

One thing that I would like to highlight is that the partitioning of the queue and the consumer groups are motivated by scaling, but topics are motivated by abstraction.

Evan did mention this, but I think it was worth highlighting.
