System Design Google Autocomplete | Typeahead Suggestion | HLD Auto Suggestion | TRIE Data Structure

The Tech Granth

3 years ago

15,586 views

Comments:

@arunsonu1688 - 06.08.2023 15:58

Why isn't Elasticsearch used?

@okeyD90232 - 15.06.2023 09:14

How is the trie O(L*N)? There are 26 possible characters at each position, and if the search query has 5 words of 6 characters each, that is 30 characters (not counting the spaces). The first level of the trie would then contain 26 nodes, the second level 26 * 26, and so on. This would lead to O(n^m), where n is the number of characters we support and m is the number of levels; in my example, 26^30 nodes would be huge to maintain.
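
The estimate above counts every possible string, but a trie only creates nodes for prefixes of phrases that were actually inserted, so the node count is bounded by the total number of characters across the stored phrases. That may be what the O(L*N) figure refers to, if N is the number of phrases and L their maximum length; the thread does not confirm this. A minimal sketch in Python, with class and method names that are illustrative and not taken from the video:

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode, created only on demand
        self.is_end = False  # True if a stored phrase ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()
        self.node_count = 0  # nodes created so far, excluding the root

    def insert(self, phrase):
        node = self.root
        for ch in phrase:
            if ch not in node.children:
                node.children[ch] = TrieNode()
                self.node_count += 1
            node = node.children[ch]
        node.is_end = True

    def suggest(self, prefix):
        # Walk down to the node for the prefix, then collect all phrases below it.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_end:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


phrases = ["yarn", "yard", "yahoo"]
trie = Trie()
for phrase in phrases:
    trie.insert(phrase)

total_chars = sum(len(p) for p in phrases)
print(trie.node_count, total_chars)
print(trie.suggest("ya"))
```

Shared prefixes are stored once, so the node count can only be smaller than the total character count, never larger.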

@benjaminosei6086 - 29.05.2023 16:53

This is really good.

@ameyjain3462 - 25.04.2021 04:07

Missing pieces:
Data storage estimates
Traffic estimates
The cache won't talk to ZooKeeper (that looks wrong to me). Also, NoSQL does this automatically: it splits the data across different nodes.
Spark Streaming and HDFS are terms that should be used only when needed. Just use a queue instead of technology buzzwords.

What are the bottlenecks in the system?
What if the same phrase is searched again and again? You will run into hot cache issues.
Can it be extended to support more ranking use cases?
How are you actually going to store the trie in a database? What is the actual schema?
Overall: 5/10

@abhishekpal2097 - 01.04.2021 21:51

High-level system design is NOT about telling what you know!! It is about problem-solving in a distributed architecture.
If you say "I will use HDFS", many people may not understand the motivation. So I would suggest starting with a simple solution first. Then describe the drawbacks of that solution and mention several ways to solve them. Then choose one solution (and give the reason for the choice). This way, the viewer's understanding evolves with the problem.

@KrishnaSharma-vl4re - 30.03.2021 00:11

Throw all these buzzwords into an interview and you are setting yourself up for failure unless you really know everything in detail.

@brvamshi - 17.02.2021 21:39

Thanks for simplifying the design for us. I want to know:
1. When the request to be served is not present in the cache, who is responsible for routing that request to ZK? From your diagram, it appears that a distributed cache like Redis/Memcached can do that routing automatically. Isn't it the typeahead service code that routes the request to ZK?
2. When the request is served by the trie, the result also gets persisted in the cache. Who is responsible for storing it in the cache? Is it the typeahead service that stores it in the cache?
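
The arrangement both questions point toward is commonly called cache-aside: the typeahead service checks the cache, falls back to the trie servers on a miss, and writes the result back itself. Whether the video's design works this way is not confirmed in this thread. A minimal sketch in Python, where `TypeaheadService`, `lookup_in_trie`, and the in-memory dict standing in for Redis/Memcached are all hypothetical:

```python
from typing import Callable, Dict, List


class TypeaheadService:
    """Cache-aside sketch: the service, not the cache, handles misses."""

    def __init__(self, lookup_in_trie: Callable[[str], List[str]]):
        self.cache: Dict[str, List[str]] = {}  # stand-in for a distributed cache
        self.lookup_in_trie = lookup_in_trie   # hypothetical call to a trie server
        self.trie_calls = 0                    # counts cache misses, for illustration

    def suggest(self, prefix: str) -> List[str]:
        cached = self.cache.get(prefix)
        if cached is not None:
            return cached                      # cache hit: the trie is not contacted
        self.trie_calls += 1
        suggestions = self.lookup_in_trie(prefix)
        self.cache[prefix] = suggestions       # the service populates the cache
        return suggestions


# Fake trie lookup used only to exercise the sketch.
service = TypeaheadService(lambda prefix: [prefix + "rn", prefix + "rd"])
first = service.suggest("ya")
second = service.suggest("ya")
print(first, second, service.trie_calls)
```

In this sketch the cache never talks to anything else; in a real deployment, locating the right trie server (for example through ZooKeeper) would sit behind `lookup_in_trie`.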

@rajeshbhagat1913 - 03.01.2021 20:40

How do you store a trie data structure in the database?

@rishikhurana17 - 01.01.2021 02:57

Thanks for the video. I have a question. In your example, let's suppose the word "yarn" has been searched only once and no other word has been searched or updated for "ya". In this case, will "yarn" still be stored in the cache, or do you keep a threshold for saving it in the cache (for example, storing an element in the cache only once it has been searched more than 1000 times)?
Also, where is the actual frequency of the words stored? Only that way can you find out whether a new word is becoming popular.
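
The threshold idea in this question can be sketched as a frequency counter that admits a phrase into the cache only once it has been searched often enough. This is one possible reading of the question, not the video's confirmed design; `FrequencyGate`, `record_search`, and the small threshold are hypothetical, and the sketch admits a phrase when its count reaches the threshold.

```python
from collections import Counter


class FrequencyGate:
    """Admit a phrase to the cache only after enough searches."""

    def __init__(self, threshold: int = 1000):
        self.threshold = threshold
        self.counts = Counter()  # phrase -> number of searches seen so far
        self.cached = set()      # phrases that have been admitted to the cache

    def record_search(self, phrase: str) -> bool:
        # Count the search; admit the phrase once it reaches the threshold.
        self.counts[phrase] += 1
        if self.counts[phrase] >= self.threshold:
            self.cached.add(phrase)
        return phrase in self.cached


gate = FrequencyGate(threshold=3)  # small threshold for demonstration only
results = [gate.record_search("yarn") for _ in range(3)]
print(results, gate.counts["yarn"])
```

Keeping the counts separate from the cache is what lets a newly popular phrase be noticed before it has been admitted.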

@Dhindsa99 - 29.12.2020 20:06

You suggested we can use either SQL or NoSQL for storing the trie. Is there any preference for either (in industry), and why?
