ETL – Extract Transform and Load

There are several common use case scenarios where we need to EXTRACT data from a given source and TRANSFORM the data that we extracted so that it can be LOADED into a database or another system. In this article, I am going to talk about a CSV upload and a few of the considerations one needs to make to Extract, Transform and Load the CSV data into a database.

There are several common use case scenarios where we need to EXTRACT data from a given source and TRANSFORM the data that we extracted so that it can be LOADED into a database or another system. In this article, I am going to talk about a CSV upload and a few of the considerations one needs to make to Extract, Transform and Load the CSV data into a database.

Extract

To put it in simple words, a CSV file is a plain text file with data separated by commas, tabs, or semi-colons. When a spreadsheet application opens a CSV, it displays the data as a table. Independently of your choice of language (JavaPythonJavaScriptPHP, etc), you will find a CSV library code to deal with the data.

Below an example of a comma-separated CSV data:

"Date","Pupil","Grade"
"25 May","Bloggs, Fred","C"
"25 May","Doe, Jane","B"
"15 July","Bloggs, Fred","A"
"15 April","Muniz, Alvin ""Hank""","A"

Transform

Usually, data will have to be transformed, for example by combining two columns into a single one (first name + middle name), or some data needs to be split into several individual pieces, i.e. when addresses come as a single cell record but displayed into several lines. The two tables below show an example of this case.

Load

To keep performance high, the CSV is read line by line and the data in every line is inserted into a database or printed onto the screen so that the user can select one or more roles to be imported. You want to process the single row straight away and discard it from memory as soon as you finish.

In many cases, the first row contains or displays column names, which can then be used to match pre-existent data in the system. This allows flexibility when parsing the data in the order it is presented.

A more complex case is described in this case study, where the data is requested from the source system, then transformed before finally loaded into the destination system.

Techifide

Explore the content from Techifide and delve into a world of tech insights, expert tips, and valuable resources. Stay informed and inspired!

Recent Techi Talks

“Talk a Little Bit About Yourself” in IT Job Interviews

"Can you talk a little bit about yourself?" On the surface, it might appear as if the interviewer is seeking…

7 months ago

Leveraging Brazilian IT Recruitment for Fast Scaling

Leverage Brazilian IT recruitment for rapid scaling! Compelling reasons why contracting software engineers from Brazil can be a game-changer for…

7 months ago

How to prepare for a technical IT interview

Preparing for a technical IT interview requires a combination of technical knowledge, problem-solving skills, and effective communication.

8 months ago

Tips for Candidates

Discover a multitude of invaluable tips to elevate your career to new heights. Uncover the art of crafting an exceptional…

8 months ago

Questions you should ask during a job interview

Engage actively in the conversation and use the opportunity to assess if the company and role align with your career…

8 months ago

What should not be included on a CV?

By omitting unnecessary or unrelated information, you can ensure that your CV remains focused and impactful.

8 months ago

This website uses cookies.