Setting up an ETL configuration
ETL examples, troubleshooting, and tips
One way to integrate with external systems is the Extract, Transform, Load (ETL) system. The ETL system creates databases that third-party tools and solutions can access directly. It also allows scheduled execution of transformation scripts on the IT Advisor (ITA) server. Together, these make it possible to extract data from, and enter data into, the ITA server.
Based on the ETL system, it is possible to develop custom solutions, integrating ITA with a broad range of data sources.
ETL can be used in two ways: importing data into IT Advisor and exporting data from IT Advisor.
- Importing data into IT Advisor is used whenever data must be provided to ITA.
The data is imported via the ETL Import Database.
There are two ways of importing data: one where the data is ready ("Data is Ready") and one where the data must be changed before it is used in ITA ("Data needs change").
- Data is Ready
When the data is ready, the only thing required is to set up the integration between the external system providing the data and ITA.
This is done at the database level. The integration is configured using the server configuration interface and can be found here
- Data needs change
When data must be modified before it is used inside ITA, this can be done using a transformation, or a set of transformations called a job. More information here
Once the transformation(s) are in place, the integration is configured using the server configuration interface and can be found here
- Exporting data from IT Advisor is used whenever data is used outside ITA.
The data is exported using the ETL Export Database.
The data exported from IT Advisor cannot be changed by transformations inside the ITA server. If a transformation is needed, it must be done outside the server in a separate setup.
- Setting up the export is done using the server configuration interface as described here
Upgrading ETL external system integration
No files (or databases) will be deleted when upgrading/restoring an ITA system. Upgrade of the ETL system is a part of the general ITA upgrade.
When upgrading an existing solution that uses ETL to import data into and/or export data from ITA, keep the following guidelines in mind:
- Both the import and export databases are viewed as API-like interfaces to the system.
- At any point in time, the data in the import and/or export database represents the best currently known state of the system.
- The ETL databases do not contain the sole instance of any piece of data.
This means that it is always safe to drop any ETL database and recreate it without the risk of losing data.
Since the ETL databases hold no data that exists only there, it is highly recommended to drop the databases and recreate them rather than moving them. This allows the databases to follow any new database schema(s). To keep using the old databases, they must be moved manually, which is not recommended.
When upgrading from previous versions, the old transformation files will be moved into the folder /data/pentaho_backup. The transformation files must then be moved manually from /data/pentaho_backup into the new folder structure, if it has changed (described below). Restoring an old backup will place transformation files in the same structure as they were in the backed-up system. Please note that this might not be the correct folder structure, since the structure has changed between versions.
Note: When moving/copying files into the new folder structure, make sure to preserve the ownership of the folders so that future scripts can be added and executed.
The transformation files must be placed in their respective folders on the ITA server.
The transformation files folder structure on the ITA server is /data/pentaho/import. You can find more information about the ETL Transformation here.
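The manual move of transformation files after an upgrade could be scripted along the following lines. This is only a minimal sketch, not an official ITA tool: the function name copy_transformations is hypothetical, and it assumes that all files go into a single destination folder, that existing files there must not be overwritten, and that "preserving ownership" means each copied file should get the same owner and group as the destination folder. Changing ownership normally requires running as root on the server.

```python
import os
import shutil
from pathlib import Path


def copy_transformations(src_dir, dst_dir):
    """Copy files from src_dir into dst_dir and return the copied file names.

    Assumption (not from the ITA documentation): each copied file is given
    the owner and group of the existing destination folder.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    folder_stat = dst.stat()  # ownership of the existing destination folder
    copied = []
    for item in sorted(src.iterdir()):
        if not item.is_file():
            continue  # sub-folders are left for manual review
        target = dst / item.name
        if target.exists():
            continue  # never overwrite a file that is already in place
        shutil.copy2(item, target)  # copies content and timestamps
        try:
            os.chown(target, folder_stat.st_uid, folder_stat.st_gid)
        except PermissionError:
            print(f"Could not change owner of {target}; try running as root")
        copied.append(target.name)
    return copied
```

On a server it might be called as copy_transformations("/data/pentaho_backup", "/data/pentaho/import"). Files are copied rather than moved so that the backup folder stays intact until the result has been checked.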