Revision: Feb. 17, 2014
DataSelf ETL v2013.2
This help information is a brief guide to the most commonly used features of the DataSelf ETL engine. Some topics are not included in this guide because they should only be used by persons with advanced training from DataSelf.
This guide assumes the reader is familiar with foundational concepts of data warehousing such as source systems; metadata, tables; ODBC connections; filters; grouping of records; formulas; indexing; manual, automatic, and batch refresh; and extract, transformation, and load (ETL).
This guide is indexed by How To Topics.
Type Ctrl-F and enter your search terms in the Find window.
How do I . . . ? |
Do this or Refer to |
Click on the table’s name in Browse Table window; right-click, click Modify Import/Mapping to open the Import Table Wizard. |
|
See DataSelf ETL Users. (This is rarely used.) |
|
Add or remove indexes in a table |
Action > Wizard > Index/Sort to open the Index/Sort Wizard. Follow the wizard’s instructions. |
Action > Wizard > Profile to open the Profile Wizard. Follow the wizard’s instructions. (This wizard is rarely used.) |
|
Action > Wizard > Relationship to open the Relationship Wizard. Follow the wizard’s instructions. |
|
Action > Wizard > Filter to open the Filter Wizard. Follow the wizard’s instructions. |
|
Action > Wizard > Formula to open the Formula Wizard. Follow the wizard’s instructions. |
|
Action > Wizard > Summary to open the Summary Wizard. Follow the wizard’s instructions. |
|
Browse a table’s data |
Double-click on table’s name in the Browse Table Window. See also: Meaning of the colored column headings in the Table window |
Browse a table’s filters |
Click on the table’s name in Browse Table window; right-click, click Filter. |
Browse a table’s metadata |
Metadata about an entire table or Metadata about the columns/fields in a table. See also: Report Metadata. |
Click on table’s name in the Browse Table Window; right-click, click Fields. See also: Fields Window. |
|
Change a column/field’s data type or size |
See instructions at Modify Structure Window |
Connect to source data with ODBC |
See ODBC Connections and Source Systems. |
Connect to Excel Files |
See Importing Data from Excel and Source Systems. (2013.2) |
Connect DataSelf ETL to data warehouse /
|
|
Delete a table |
Click on the table’s name on the Browse Table window; right-click, click . Delete. |
Import a new table |
|
Manual refresh |
See Refresh. Contrast with Batch & Automatic Refresh. |
Refresh a table |
See Refresh. |
Refresh all Tables |
See Refresh. |
Reindex a table |
See Reindex. Tables. |
Reindex all tables |
See Reindex. Tables. |
Rename a column/field |
See instructions at Modify Structure Window. |
Rename a table |
See instructions at Modify Structure Window. |
Click on table’s name in the Browse Table Window; right-click, click Print Metadata. |
______________________________________________________________________________________________________
The Browse Table window (see below) opens automatically when you log to the ETL. This window is the starting point to several tasks in the ETL such as browsing data in tables, checking ETL metadata, and refreshing tables. Default presentation of Browse Table window. |
Browse Table |
To Open: Browse Table button
|
Mouse Controls – Browse Table window
Click on row |
Select row. |
Click on selected row |
Open table window. See also double-click on row. |
Double-click on row |
Open table window. |
Right-click |
Show sub-menu. |
Ctrl-click on column |
Adjusts column width making it narrower. |
Ctrl-drag on column |
Adjusts column width. Ctrl-click on left or right margin and drag the mouse to the left or right. |
Click Maximize button |
Shows more columns in the window, exposing the columns of metadata. |
To Open: Click maximize on the Browse Table window. (The default settings for the Browse Table window do not show all the columns in the window.)
Columns in the Browse Table Window (Table metadata)
first column |
Name of table used in the DataSelf ETL.(The name of a SQL view in the data warehouse.) |
Physical Name |
Name of table (physical table) in the DataSelf BI data warehouse. |
Source Table Name |
Name of table in source system that supplies data to the physical table. (Data is extracted from the source table and loaded into the physical table during the refresh. |
Data Source Name (DSN) |
Name of DSN (ODBC connection) that is used by refresh to connect to the source table. |
Last Refresh |
Date the physical table was last refreshed. |
Table Created by |
Name of user who first created the physical table in the ETL. |
Table Created |
Date physical table was created. |
See also: Metadata about individual columns / fields in the table.
To Open: On the Browse Table window click twice on a table name.
The heading line of the Table Window shows the name of the table followed by the number of records in the table. The example below shows that the ICLOC_Locations table has 6 records.
Example
of the color coded columns in a Table window. |
Red column heading |
The table is currently sorted by the values in this column. |
Green column heading |
Derived field. The data in the column is derived from a formula. Contrast with black heading columns. Shift-click green column heading to display the formula. |
Black column heading |
Neither of the types above. The data from the column is extracted directly from a column in the source table. |
To Browse Formulas |
Shift-click green column heading Displays the formula used to derive the values in the column. That is, the formula used to create the field. |
|
To Open: Right-click anywhere on the Browse Table window to display the sub-menu. To Open for a Particular Table: First select (click on) the table name in the Browse Table window then right-click the selection to open the sub-menu. This section shows only some of the options on the sub-menu – not all options are described. |
Browse metadata about
individual columns / fields in the selected table. |
|
Displays the filter settings that are available for the selected table. |
|
Index/Sort |
Display the indexes available for the selected table. |
A report of all the metadata for the selected table. |
|
Summary |
Shows the summary fields available for the selected table. |
Renames tables or columns and/or changes the size or type of columns. |
|
Reindex |
Reindexes all of the indexed columns of the select selected table |
|
Reindexes all of the indexed columns of the select selected table |
Refresh |
Refreshes the data in the selected table. A refresh extracts data from the source system’s database, recalculates formulas, and reindexes as required. |
|
Performs a refresh on all the tables in the data warehouse. This procedure is not recommended. Use the manual batch refresh procedure instead. See also the caution in the Refresh section. |
Recalculate |
Recalculates the formulas for the selected table. |
Recalculate all Formulas |
Recalculates all the formulas for all the tables in the data warehouse. |
Figure 2 List of columns/fields in a table. This window is opened by the Fields option. Figure 3 Fields window. Expand window to show metadata for rows. |
To Open: Right-click anywhere on the Browse Table window then click Fields . Browse metadata about individual columns / fields in the
selected table.
See also: Browse Table window Sub-menu |
To Open: On the Browse Table window click a table name to select it. Then right-click and click Modify Structure.
Rename a table |
Click the table name in the window’s heading. Type in changes to the name and press Enter. |
Rename a column/field name. |
Click the column heading, type changes in the 3 lines available and press enter. |
Change the size or data type of a column/field. |
Click on the data in a column below the column heading. Double-click a field model in the Change Field Models window. Click Yes to save in database. |
Edit Table Name Mode
Clicking on the window header puts the window into Table Edit Name mode. No other actions can be performed until you press Enter or Escape to exit edit mode.
To Save Modifications: When all modifications have been made to a table then right-click and click . Changes are not saved until you click Confirm. |
This wizard allows the user to change the mapping properties of tables. For example, changes can be made to the mappings as provided by DataSelf in the out-of-the-box solutions, fields may be added or removed from a table, or filters added to a table’s data refresh process.
To Open: On the Browse Table window click a table name to select it. Then right-click and click Modify Import/Mapping
Follow the instructions in the Import Table Wizard.
Reindex defined: A reindex (or re-index) rebuilds the indexes in the indexed columns of tables in the DataSelf BI data warehouse. Reindexing is one of the steps in the refresh process.
Implications. Reindexing a table may take a few seconds to several minutes to finish depending on the number of records in the table. Re-indexing all tables may take a long time. The performance of DataSelf BI and the server that hosts DataSelf BI may also be impacted.
To Start a Reindex: Click on table’s name in the Browse Table Window; right-click and click Reindex or Reindex all Tables
Reindex |
Reindexes all of the indexed columns of the select selected table.. |
|
Reindexes all the indexed columns in all tables in the data warehouse. On some installations this may tie up resources on the server that hosts DataSelf BI for some time. |
To Open: Action > Wizard > Refresh or click the Refresh Wizard button.
Follow the instructions on the wizard.
|
Refresh Wizard This example shows refresh Batch Number 0 (zero). This batch has 5 steps.
The Step column specifies the order in which the files in this batch will be refreshed. Use the Move arrows to change the order of the steps. |
See also: Batches, Batch Refresh.
Refresh Defined: A refresh extracts data from the source system’s database, recalculates formulas, and reindexes tables as required. The primary function of the DataSelf ETL is first to setup, configure, and maintain the DataSelf ETL metadata and second to run the refresh processes. The operation performed by each refresh process is controlled by this metadata.
Source System defined: The source system is the accounting, ERP, order entry, etc software systems in your organization. The “E” in ETL stands for extract. During a refresh the DataSelf BI ETL engine extracts data from the source system’s database or databases. For this reason please take note of the following caution.
Implications
A refresh usually draws resources from your source system server, your network, and the server that hosts DataSelf BI. Please heed the caution below.
Caution: A refresh is a resource intensive process that calls on the database resources of other computer hardware and software systems within your company. On some installations a refresh, especially a full batch refresh, may have significant impacts on the performance of other business software systems within your organization. (On the other hand some well-tuned installations report little to no noticeable impact on performance. Check with the DataSelf system manager for more information.)
The data warehouse is normally refreshed on schedule using the batch refresh process but other options are available. A refresh may be either manual or scheduled, on a single table or by batch.
A manual refresh is refresh that is run by you. A scheduled or automatic refresh is a refresh that is run by a job scheduler.
Refresh Single Table |
|
Manual Batch Refresh |
See Batch Refresh |
Scheduled (automatic) Batch Refresh |
See Batch Refresh |
v Note: Use of the manual Refresh all Tables option is strongly discouraged. Use the manual batch refresh option instead.
To Start a Refresh (manual refresh): Click on table’s name in the Browse Table Window; right-click and click Refresh or Refresh all Tables.
Refresh |
Refreshes the data in the selected table.. |
|
Performs a refresh on all the tables in the data warehouse. This process has been replaced by the batch refresh process and should not be used. See also: Warning about the Refresh all Tables Option. |
v NOTE: Depending on the table’s content, after the data is refreshed from the source, you may have to confirm if you want to recalculate formulas.
A batch is a group of tables that are configured to refresh in a particular sequence. Batches and sequences are specified in the Refresh Wizard.
To Run: To run a batch refresh run the DSRefresh.bat file. Speak with the designated DataSelf BI system manager for more information.
The batch refresh process is usually scheduled to run overnight. When the refresh process is run by job scheduler the batch refresh is also known as an automatic refresh.
The Refresh Wizard specifies a particular sequence for the refresh process by establishing batches and processing steps within the batches.
Some installations may have several refresh batches. Each batch takes care of specific needs and may run on a separate schedule. An understanding of the components involved in a batch refresh should help you to better understand how to make changes.
Components of the Batch Refresh
dsrefresh.exe |
The batch refresh program. The program has one optional parameter: a batch number. The default batch number is zero. |
DSRefresh.bat |
The script which is usually
used to run dsrefresh.exe. |
A program which runs the .bat file according to a schedule. Any job scheduler program can be used. The most popular scheduler among DataSelf users is SQL Jobs, the scheduler built into Microsoft’s SQL Server. This option provides several logging features and it is better integrated with the DataSelf BI tool set. Windows Task Scheduler is another popular option. |
|
The metadata is a set of instructions that controls the refresh process. Specifically, the operation of dsrefresh.exe is controlled by the instructions in this metadata. The metadata is setup and maintained by the wizards inn DataSelf ETL. See also: Out-of-the-box Templates from DataSelf |
|
DataSelf ETL |
A major function of the wizards in the DataSelf ETL is to setup and maintain the metadata. |
The ETL automatic refresh runs via batches. Each batch may have several steps. A batch might for example refresh ERP tables in the first step, refresh CRM tables in the 2nd steps, refresh combined tables in a 3rd step. The configuration of batches and is done in the Refresh Wizard.
If a table in either a manual or batch/automatic refresh is not being refreshed as expected here are the first things that DataSelf’s experts look for.
Trouble Shooting Connection Issues
Verify that the ODBC connections to all data sources are working. Setting up the OCBC connections for the out-of-the-box solutions from DataSelf should have been done during the initial installation but a number of things can break the connection. For more see ODBC Connections.
Run a manual refresh to be sure there are no problems connecting to this table.
Trouble Shooting Mappings Issues with the Import Table Wizard
The import mappings to the table may be preventing the refresh from working as desired.
Open the Browse Table window. Click the table’s name, then right-click and click Modify Import/Mapping.)
Import Table Wizard - Select Tables window:
Click Back to see the names of the OCBC connections or to change the login and password.
Click Next.
Import Table Wizard - Select Fields window: Verify that the appropriate fields are imported in the Select Fields window. The fields imported are listed on the right side pane.
Click Next.
Import Table Wizard - Select Data Grouping (optional) window: For most tables, you should see no fields on Grouping Fields pane. If there are grouping fields then verify that the appropriate grouping is being used to summarize data.
Click Next.
Import Table Wizard - Select Import Filter (optional) window: This step may not show up for some tables. For most of the remaining tables, this step should have (none) selected. If appropriate, verify that the appropriate filter is being used to select data.
Click Next.
Import Table Wizard – Finish window: Verify the following refresh options are set appropriately:
· The Maximum number of rows to import box should usually be un-checked.
· If the Disable automatic refresh box is checked then the refresh will not extract data from the source system. Formulas, however, will still be calculated.
· Click Refresh Type to see what’s been selected.
· Click Table Union to see if this table is being merged with other table(s).
· Read the caution about refreshing tables. If it’s Ok to refresh a table click Finish & Import Now. The table refresh should run without errors.
On some systems the use of the manual Refresh all Tables option may corrupt the data warehouse. Use the manual batch refresh procedure instead.
Proper use of this option is beyond the scope of this document.
The Refresh all Tables command ignores the sequence of operations in a batch as configured in the Refresh Wizard. A refresh performed by the Refresh all Tables will probably be partly incorrect. In some cases a data warehouse refreshed out of proper sequence may be seriously corrupted.
Part of the DataSelf BI solution includes one or more out-of-the-box templates for each source database DataSelf supports. A main component of each template is metadata for the DataSelf ETL. The out-of-the-box metadata was created by DataSelf using only the DataSelf ETL.
See also: Components of the Batch Refresh
To Open the Import Table Wizard: Click Action > Wizard > Table.
Follow the instructions on the Import Table Wizard.
Add the table to the automatic refresh process with the Refresh Wizard..
Requirements: An ODBC connection to the source system that contains the new table is required.
Metadata in the context of the DataSelf ETL engine and the refresh process includes everything the refresh process needs to know. Metadata includes information about:
· Source systems and their databases
· The relationship between source system database tables and data warehouse tables
· Format of data warehouse tables and columns (or files and fields).
· Formulas
· Indexes
· Groupings and summarizations.
· Refresh batches
The out-of-the-box solutions from DataSelf consist largely
of metadata.
See also: Out-of-the-box Templates
from DataSelf
OCBC connections to the source systems for the out-of-the-box solutions from DataSelf should have been done during the initial installation. But a number of things can break an ODBC connection. Setting up and fixing ODBC connections often requires a computer system manager with the right passwords and technical knowledge.
See also: SQL Server Connection Tool
See also: Advanced Technical Assistance
Connects the ETL to the data warehouse. Controls the connection of the DataSelf ETL engine to the SQL Server instance where the data warehouse is hosted.
This window also has options which resets the metadata for the data warehouse and data cubes back to their initial out-of-the-box configurations.
Under normal conditions this tools is used once when first installing the ETL.
Caution: Some of the settings in the SQL Server Connection window can erase all changes made to the data warehouse and data cubes since installation.
To Open: Click Tools > SQL Server Connection.
v Important: In most cases, the boxes below the Database name should not be checked.
To add, change, or delete users open the User Wizard and follow the instructions.
To Open: Action > Wizard > User
v Important Note: These user settings apply only to ETL users. Other components in the DataSelf BI solution have their own user settings. Changing the settings here will not impact the user and security settings for the data warehouse and data cubes.
ETL is an acronym for Extract, Transformation, and Load. The purpose of an ETL engine is to extract data from source databases, transform (clean up, summarize, reformate, and calculate), and load the transformed information into the data warehouse.
Every DataSelf customer should have at least one person designated as a primary contact and person designated as system manager. Your organization’s policy (as well as and the support agreement with DataSelf) may ask you to contact of these people before contacting DataSelf directly.
Contacting DataSelf for Support
Email: support@dataself.com
Home page: www.dataself.com
Tech support line: 408-351-3569