An end-to-end guide to data extraction, transformation, and loading (ETL) from a local website using Python and the Beautiful Soup library.
The complete source code is available on GitHub:
https://github.com/JacksonCakes/dataengineering/blob/main/ETL_for_Malaysia's_14th_General_Election_for_the_14th_Selangor_State_Legislative_Assembly.ipynb
or on Google Colab:
https://colab.research.google.com/drive/1wrN3WV8KPLe00ufySLygaeNP3B3bTzRx?usp=sharing#scrollTo=zkawnYsZ9fRl
Extraction, transformation and loading (ETL) is one of the major workflows in the field of data engineering.
Typically, ETL integrates data from various sources into a single, centralized, and usable data warehouse that serves purposes such as insight analysis or business intelligence.
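To make the three stages concrete, here is a minimal sketch of an ETL pipeline in Python. It is not the notebook's code: to stay dependency-free it parses HTML with the standard library's `html.parser` instead of Beautiful Soup, and it loads into an in-memory SQLite database rather than a real warehouse. The sample HTML, the table layout, the column names, and the helper names (`TableRowParser`, `extract`, `transform`, `load`, `run_pipeline`) are all assumptions made up for illustration.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical placeholder page; a real pipeline would download this HTML.
SAMPLE_HTML = """
<table>
  <tr><th>Constituency</th><th>Candidate</th><th>Votes</th></tr>
  <tr><td>Seat A</td><td>Candidate One</td><td>12,345</td></tr>
  <tr><td>Seat A</td><td>Candidate Two</td><td>9,876</td></tr>
  <tr><td>Seat B</td><td>Candidate Three</td><td>20,001</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def extract(html):
    """Extract: pull raw cell text out of the HTML table."""
    parser = TableRowParser()
    parser.feed(html)
    return parser.rows


def transform(rows):
    """Transform: trim whitespace and turn '12,345' into the integer 12345."""
    return [
        (seat.strip(), name.strip(), int(votes.replace(",", "")))
        for seat, name, votes in rows
    ]


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(constituency TEXT, candidate TEXT, votes INTEGER)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(html, conn):
    load(transform(extract(html)), conn)


conn = sqlite3.connect(":memory:")
run_pipeline(SAMPLE_HTML, conn)
totals = conn.execute(
    "SELECT constituency, SUM(votes) FROM results "
    "GROUP BY constituency ORDER BY constituency"
).fetchall()
print(totals)  # [('Seat A', 22221), ('Seat B', 20001)]
```

Keeping each stage in its own function means the extraction step could later be swapped for a Beautiful Soup based parser, or the load step for a different database, without touching the other two.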
Data extraction, the first step in ETL, involves retrieving data from multiple sources for further processing, storage or…