Elena Canorea
Communications Lead
Article co-written by Estefanía Guzmán and José Ángel Quevedo.
Unlike the ETL process, which requires data transformations to be performed before loading the target system, the ELT process follows a different strategy. Instead of moving the data to an intermediate environment for transformation, ELT loads the raw data directly into the storage system and performs the transformations afterward.
Using the ELT data flow process, data cleansing, enrichment, and transformation are performed directly in our storage system. The raw data is stored permanently in that system, allowing multiple transformations to be performed at any time.
Cloud data warehousing systems such as Snowflake, Amazon Redshift, Google BigQuery, Azure Data Lake, Azure Synapse, among others, have the necessary digital infrastructure, both in terms of storage and processing power, to handle large volumes of data.
Although the ELT data pipeline is not universally used, its popularity is increasing as companies migrate their infrastructures to cloud environments.
There are two main aspects that differentiate the ETL and ELT processes: the first is where the data transformation takes place, while the second is related to how the data is retained in the storage systems.
ETL transforms data on a separate processing server, while ELT transforms data within the data warehouse itself.
That is, ETL does not transfer data to the warehouse without first processing it, while ELT does send raw data directly to the warehouse.
In the case of ETL, the data ingestion process is slowed down by transforming the data on a separate server before the loading process.
In contrast, in the ELT process, faster data ingestion is offered because data is not sent to a secondary server for restructuring. In fact, with ELT, data can be loaded and transformed simultaneously.
ELT’s raw data retention creates an ideal historical archive for generating business intelligence. As objectives and strategies change, BI teams can revisit the raw data to develop new transformations using complete data sets. ETL, on the other hand, does not generate complete sets of raw data that can be queried again: it only stores data that has already been transformed.
These factors make ELT more flexible, efficient, and scalable, especially for ingesting large amounts of data, processing sets containing structured and unstructured data, and developing various business intelligence.
The way unstructured data is processed is crucial. At this time, ELT is the best option, as it provides superior processing of semi-structured and unstructured data compared to ETL, which is typically used for structured data.
Most of the data is unstructured (images, videos, PDF files, PowerPoint documents, etc.), so this type of data remains more difficult to access and process. In the future, the industry will focus on eliminating the difficulties and improving the interpretation of this unstructured data, and ELT will play an important role in this.
However, ETL is ideal for compute-intensive transformations, systems with legacy architectures, or data workflows that require manipulation before entering a target system, such as the removal of personally identifiable information (PII).
The ETL and ELT data pipeline includes cleansing and filtering, which are a key part of the data transformation process. And because the ETL method completes the transformation before uploading the data to the server, it is better for meeting privacy and security standards for the possibility of sensitive data transfer.
Category | ETL | ELT |
Definition | Data is extracted from a source system, transformed on a secondary processing server, and loaded onto a target system. | Data is extracted from a source system, loaded into a target system, and transformed within the target system. |
Extraction | The raw data is extracted using API connectors. | The raw data is extracted using API connectors. |
Transformation | The raw data is transformed into a processing server. | The raw data is transformed within the target system. |
Loading | The transformed data is loaded into a target system. | The raw data is loaded directly into the target system. |
Speed | ETL is a time-consuming process; data is transformed before it is loaded into a target system. | ELT is faster by comparison; data is loaded directly into a target system and transformed in parallel. |
Code-based transformations | They are performed on a secondary server. Best for computationally intensive transformations and pre-cleaning. | Transformations performed on the database; simultaneous loading and transformations; speed and efficiency. |
Privacy | Pre-loading transformation can eliminate PII (HIPPA support). | Direct data upload requires more privacy safeguards. |
Maintenance | The secondary processing server increases the maintenance load. | By using fewer systems, the maintenance load is reduced. |
Flow | The data are transformed before entering the target system; therefore, the raw data cannot be queried again. | The raw data is loaded directly into the target system and can be queried indefinitely. |
Data volume | Ideal for small data sets with complicated transformation requirements. | Ideal for large data sets requiring speed and efficiency.Conclusion |
Cloud data warehouses have given rise to a new frontier in data integration. Choosing between ETL and ELT processes depends on the needs of the team and the project.
However, for both cases, the goal is clear: to prepare data for analysis and use it for decision-making. The easiest way to resolve the ETL vs. ELT dilemma and understand their differences is to understand the T in both approaches. The critical factor that differentiates the two is when and where the execution of the transformation takes place.
Implementing an ELT process is more difficult compared to ETL. However, companies now prefer ELT to ETL because of faster performance, versatility, and scalability.
In contrast, ETL data pipelines create a more secure process for handling sensitive data and meeting compliance standards.
Whichever option is chosen, data teams across the spectrum are activating their integration strategies by leveraging a data integration platform. We can help you get one in place.
Elena Canorea
Communications Lead
Cookie | Duration | Description |
---|---|---|
__cfduid | 1 year | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
__cfduid | 29 days 23 hours 59 minutes | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
__cfduid | 1 year | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
__cfduid | 29 days 23 hours 59 minutes | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_gat_UA-326213-2 | 1 year | No description |
_gat_UA-326213-2 | 1 year | No description |
_gat_UA-326213-2 | 1 year | No description |
_gat_UA-326213-2 | 1 year | No description |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
attributionCookie | session | No description |
cookielawinfo-checkbox-analytics | 1 year | Set by the GDPR Cookie Consent plugin, this cookie is used to record the user consent for the cookies in the "Analytics" category . |
cookielawinfo-checkbox-necessary | 1 year | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-necessary | 1 year | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-non-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-non-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-non-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-non-necessary | 1 year | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-performance | 1 year | Set by the GDPR Cookie Consent plugin, this cookie is used to store the user consent for cookies in the category "Performance". |
cppro-ft | 1 year | No description |
cppro-ft | 7 years 1 months 12 days 23 hours 59 minutes | No description |
cppro-ft | 7 years 1 months 12 days 23 hours 59 minutes | No description |
cppro-ft | 1 year | No description |
cppro-ft-style | 1 year | No description |
cppro-ft-style | 1 year | No description |
cppro-ft-style | session | No description |
cppro-ft-style | session | No description |
cppro-ft-style-temp | 23 hours 59 minutes | No description |
cppro-ft-style-temp | 23 hours 59 minutes | No description |
cppro-ft-style-temp | 23 hours 59 minutes | No description |
cppro-ft-style-temp | 1 year | No description |
i18n | 10 years | No description available. |
IE-jwt | 62 years 6 months 9 days 9 hours | No description |
IE-LANG_CODE | 62 years 6 months 9 days 9 hours | No description |
IE-set_country | 62 years 6 months 9 days 9 hours | No description |
JSESSIONID | session | The JSESSIONID cookie is used by New Relic to store a session identifier so that New Relic can monitor session counts for an application. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
viewed_cookie_policy | 1 year | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
viewed_cookie_policy | 1 year | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
VISITOR_INFO1_LIVE | 5 months 27 days | A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface. |
wmc | 9 years 11 months 30 days 11 hours 59 minutes | No description |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | This cookie, set by Cloudflare, is used to support Cloudflare Bot Management. |
sp_landing | 1 day | The sp_landing is set by Spotify to implement audio content from Spotify on the website and also registers information on user interaction related to the audio content. |
sp_t | 1 year | The sp_t cookie is set by Spotify to implement audio content from Spotify on the website and also registers information on user interaction related to the audio content. |
Cookie | Duration | Description |
---|---|---|
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjFirstSeen | 29 minutes | No description |
_hjFirstSeen | 29 minutes | No description |
_hjFirstSeen | 29 minutes | No description |
_hjFirstSeen | 1 year | No description |
_hjid | 11 months 29 days 23 hours 59 minutes | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjid | 11 months 29 days 23 hours 59 minutes | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjid | 1 year | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjid | 1 year | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjIncludedInPageviewSample | 1 year | No description |
_hjIncludedInPageviewSample | 1 year | No description |
_hjIncludedInPageviewSample | 1 year | No description |
_hjIncludedInPageviewSample | 1 year | No description |
_hjSession_1776154 | session | No description |
_hjSessionUser_1776154 | session | No description |
_hjTLDTest | 1 year | No description |
_hjTLDTest | 1 year | No description |
_hjTLDTest | session | No description |
_hjTLDTest | session | No description |
_lfa_test_cookie_stored | past | No description |
Cookie | Duration | Description |
---|---|---|
loglevel | never | No description available. |
prism_90878714 | 1 month | No description |
redirectFacebook | 2 minutes | No description |
YSC | session | YSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the video preferences of the user using embedded YouTube video. |
yt-remote-device-id | never | YouTube sets this cookie to store the video preferences of the user using embedded YouTube video. |
yt.innertube::nextId | never | This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen. |