Real time End to End PySpark Project

learn by doing it

1 year ago

48,548 views

Comments:

Sehajpreet Singh - 30.09.2023 13:55

Telegram link not working

Vinitha Shanmuganathan - 28.09.2023 06:52

Hi, can you add the dataset that was used in this session?

Vishal Kamble - 26.09.2023 12:31

Please make a video on an end-to-end data engineering project using PySpark, ADF, SQL, Databricks, etc.

barri vikram - 22.09.2023 15:45

Could you please share the file used in these videos?

Nirmal Kumar - 15.09.2023 08:33

Can I mention this project in my resume?

Darklord - 14.09.2023 09:24

None of the Telegram links are working, please fix it ASAP! Thank you

arya sivaprasad - 11.09.2023 09:45

Please do it in PyCharm

Din San - 29.08.2023 21:04

Hi,
Could you please create a video on combining the 3 CSV data files below into one DataFrame dynamically?

File name: Class_01.csv
StudentID  Student Name  Gender  Subject B  Subject C  Subject D
1          Balbinder     Male    91         56         65
2          Sushma        Female  90         60         70
3          Simon         Male    75         67         89
4          Banita        Female  52         65         73
5          Anita         Female  78         92         57

File name: Class_02.csv
StudentID  Student Name  Gender  Subject A  Subject B  Subject C  Subject E
1          Richard       Male    50         55         64         66
2          Sam           Male    44         67         84         72
3          Rohan         Male    67         54         75         96
4          Reshma        Female  64         83         46         78
5          Kamal         Male    78         89         91         90

File name: Class_03.csv
StudentID  Student Name  Gender  Subject A  Subject D  Subject E
1          Mohan         Male    70         39         45
2          Sohan         Male    56         73         80
3          shyam         Male    60         50         55
4          Radha         Female  75         80         72
5          Kirthi        Female  60         50         55
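
One way to read this request is as a union by column name, where a subject missing from a file is left empty for that file's rows. The sketch below shows that idea with Python's standard `csv` module only. It assumes the files are comma-separated with a header row, reproduces just the first row of each file, and adds a hypothetical `Source File` column because the StudentID values repeat across classes. In PySpark itself, `DataFrame.unionByName` with `allowMissingColumns=True` (Spark 3.1 and later) is the usual tool for the same step.

```python
import csv
import io

# Assumption: comma-separated files with a header row.
# Only the first data row of each file from the comment is reproduced here.
FILES = {
    "Class_01.csv": "StudentID,Student Name,Gender,Subject B,Subject C,Subject D\n"
                    "1,Balbinder,Male,91,56,65\n",
    "Class_02.csv": "StudentID,Student Name,Gender,Subject A,Subject B,Subject C,Subject E\n"
                    "1,Richard,Male,50,55,64,66\n",
    "Class_03.csv": "StudentID,Student Name,Gender,Subject A,Subject D,Subject E\n"
                    "1,Mohan,Male,70,39,45\n",
}


def combine_by_column_name(files):
    """Union rows from all files; a column absent from a file is filled with None."""
    columns = ["Source File"]  # hypothetical extra column to tell the classes apart
    rows = []
    for name, text in files.items():
        for record in csv.DictReader(io.StringIO(text)):
            # Grow the combined header as new subjects appear.
            for column in record:
                if column not in columns:
                    columns.append(column)
            rows.append({"Source File": name, **record})
    # Re-project every row onto the full header so all rows share one shape.
    return columns, [{c: row.get(c) for c in columns} for row in rows]


columns, combined = combine_by_column_name(FILES)
for row in combined:
    print(row)
```

Because the header is built while reading, a fourth file with a new subject would need no code change, which is one reasonable meaning of "dynamically" here.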

Kailas Mehtre - 27.08.2023 12:23

Bro, I have one question: if I want to put a project in my resume, how do I do it, with project name, description, and responsibilities?

Could you please share one or two projects with documentation?

It's a humble request, bro

Huzaifah_Yoo - 18.08.2023 13:45

ok

Amar Nath - 27.07.2023 00:24

Thank you so much, sir.

Muskan Choudhary - 18.07.2023 09:03

What should be the name of this project?

sai srihari - 05.07.2023 09:39

Please provide an end-to-end GCP project, a migration or any other kind

Anonymous - 24.06.2023 16:27

Sir, please make one video on the whole flow of an ADE project. No need to explain it practically; I just want to learn the whole flow from data ingestion to Power BI. I am confused about how we connect to Databricks and then how we connect to Power BI. I didn't find any video like this. Every video is short and to the point. Please explain what the previous and next steps are in that video.

Pia Nikalje - 17.06.2023 09:05

CSV files are always read as string datatype.
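
A CSV file carries no type information, so a reader returns text unless it is told otherwise. The short sketch below shows this with Python's standard `csv` module on a made-up two-column sample; Spark's CSV reader behaves similarly and loads every column as a string unless `inferSchema` is enabled or an explicit schema is supplied.

```python
import csv
import io

# Hypothetical sample: one header row and one data row.
sample = "StudentID,Subject B\n1,91\n"

row = next(csv.DictReader(io.StringIO(sample)))
print(type(row["Subject B"]))  # the value arrives as text

# Numeric work needs an explicit cast first.
marks = int(row["Subject B"])
print(marks + 1)
```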

vishnu 1993 - 29.05.2023 11:17

Thanks for the clear explanation. Can you provide the Excel sheet that was used in this session?

Reddy - 27.05.2023 04:19

If possible, can you make a video on this use case?

Take any sample data and solve this using ADF, Databricks, and PySpark:

I own a multi-specialty hospital chain with locations all across the world. My hospital is famous for vaccinations. Patients who come to my hospital (across the globe) will be given a user card with which they can access any of my hospitals in any location.

Current status:
We maintain customer data in country-wise databases due to local policies. Now, with legal approvals to build a centralized data platform, we need our data engineering team to collate data from the individual databases into a single source of truth holding cleaned, standardized data. The business wants to generate a simple Power BI report for top executives summarizing vaccination metrics to date. This report will be published and generated daily for the next 18 months. The 3 metrics mentioned below are required for the phase 1 release.

Deliverables for assessment:
Python code that does the following:
- Data cleansing/exception handling
- Data merging into a single source of truth
- Data transformations and aggregations
- Code should have unit testing

Metrics needed:
- Total vaccination count by country and vaccination type
- % vaccination in each country (you can assume values for total population)
- % vaccination contribution by country (the percentages sum to 100)

Expected output format:
- Metric 1: CountryName, VaccinationType, No. of vaccinations
- Metric 2: CountryName, % Vaccinated
- Metric 3: CountryName, % Contribution

NOTE: The end goal is to create data that can be consumed directly by a Power BI report. The scope is 3 countries, and we will get files from each country. Initially you will receive a bulk load file for each country; after that you will receive daily incremental files for each country.
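
The three metrics can be sketched in plain Python before moving them to PySpark. Everything concrete below is an assumption made for illustration: the record layout, the country codes, the vaccination type names, the counts, and the population figures are all invented, and each vaccination is treated as one vaccinated person when computing Metric 2.

```python
from collections import defaultdict

# Hypothetical merged records: (country, vaccination_type, vaccinations).
records = [
    ("IND", "TypeA", 300),
    ("IND", "TypeB", 100),
    ("USA", "TypeA", 400),
    ("AUS", "TypeB", 200),
]

# Assumed total populations, as the brief allows.
population = {"IND": 1000, "USA": 2000, "AUS": 500}


def vaccination_metrics(records, population):
    """Return (metric 1, metric 2, metric 3) as dictionaries."""
    by_country_type = defaultdict(int)
    by_country = defaultdict(int)
    for country, vaccine, count in records:
        by_country_type[(country, vaccine)] += count
        by_country[country] += count
    grand_total = sum(by_country.values())
    # Metric 2: share of each country's population that is vaccinated.
    pct_vaccinated = {c: 100.0 * n / population[c] for c, n in by_country.items()}
    # Metric 3: each country's share of all vaccinations; the shares sum to 100.
    pct_contribution = {c: 100.0 * n / grand_total for c, n in by_country.items()}
    return dict(by_country_type), pct_vaccinated, pct_contribution


metric1, metric2, metric3 = vaccination_metrics(records, population)
print(metric1)
print(metric2)
print(metric3)
```

Keeping the calculation in one small function with no I/O also makes the unit-testing deliverable straightforward, since the function can be called with tiny hand-built inputs.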

Reddy - 27.05.2023 04:11

Great Video

ZAHRA NOOR - 26.05.2023 21:51

Astonishing

Ef - 26.05.2023 21:14

Thank you so much!!

talk with jyoti - 26.05.2023 20:29

You give great content
