DataStage Training in Hyderabad


DataStage offers a wide set of options for building solutions quickly and delivering data access and reports faster. It is an ideal tool for data integration projects such as data warehouses, data marts and system migrations.
DataStage has been developed over many years, has matured considerably, and is regarded as a very powerful tool. It can extract data from virtually any source, transform or cleanse it, and load it into any target. “Any” here includes databases, packaged applications (SAP, PeopleSoft, Siebel), web server logs, spreadsheets, XML files and so on.

Myths about DataStage capabilities:

DataStage is often mistakenly thought of as:
  • A data modeling tool for designing databases.
  • A data discovery/profiling tool.
  • A reporting tool, because it transfers data into a defined structure that can then be used as a reporting source. In reality it is not a tool for producing reports on data; DataStage can only produce reports on itself – job designs, logs and executions.

How Does DataStage Work?

Let us look at how DataStage works:
The design paradigm of DataStage is simplicity: to design an ETL task, you draw the “high-level picture” – a diagram reflecting the sources of the data, the processing the data will undergo, and the targets for the data.
DataStage can read even complicated files in depth. For example, a simple design might read data from a text file, change some data formats, summarize the data, and write the results into another text file. That is all the picture shows.
To complete the story, users then drill down and fill in the additional details, such as the pathname of the source file, which columns are to be transformed, grouped and aggregated, the pathname of the target file, and so on.
DataStage is metadata-driven, and that is one of its greatest strengths.
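To make the idea concrete, here is a minimal Python sketch (not DataStage code) of the simple job described above: it reads a text file, reformats a couple of columns, aggregates the data and writes the results to another text file. The file names, column names and date format are assumptions for illustration only; in DataStage the same details would be supplied as metadata on the stages rather than as hand-written code.

```python
import csv
from collections import defaultdict
from datetime import datetime

# Hypothetical source and target files, standing in for the job's source/target stages.
SOURCE_PATH = "sales_raw.txt"      # assumed comma-delimited text source
TARGET_PATH = "sales_summary.txt"  # assumed text target

def extract(path):
    """Read rows from the delimited source file (the 'source' stage)."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(row):
    """Change data formats (the 'transformer' stage): trim text, parse date, cast amount."""
    return {
        "region": row["region"].strip().upper(),
        "sale_date": datetime.strptime(row["sale_date"], "%d/%m/%Y").date(),
        "amount": float(row["amount"]),
    }

def aggregate(rows):
    """Summarize the data (the 'aggregator' stage): total amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return totals

def load(totals, path):
    """Write the results into another text file (the 'target' stage)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "total_amount"])
        for region, total in sorted(totals.items()):
            writer.writerow([region, f"{total:.2f}"])

if __name__ == "__main__":
    load(aggregate(transform(r) for r in extract(SOURCE_PATH)), TARGET_PATH)
```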

DataStage Variants:


DataStage is available in several variants:


  • Server Edition runs on a single server, so there is no parallelism. It generates code in a language called DataStage BASIC.
  • Enterprise Edition runs on SMP or MPP architectures, which permits automatic parallelism (a rough analogy is sketched after this list). It generates Orchestrate shell scripts.
  • Enterprise MVS Edition is used for mainframe execution.
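As a rough analogy for the Enterprise Edition's automatic partition parallelism, the plain-Python sketch below splits the data into partitions and runs the same transformation on each partition in parallel across worker processes. In DataStage the parallel engine handles the partitioning and process management itself; the data, partition count and transform here are invented purely for illustration.

```python
from multiprocessing import Pool

def transform(record):
    """The same transformation logic applied to every record."""
    return record.strip().upper()

def process_partition(partition):
    """Run the transform over one partition of the data."""
    return [transform(rec) for rec in partition]

if __name__ == "__main__":
    # Toy data set, partitioned round-robin into 4 partitions.
    records = [f"record {i}" for i in range(20)]
    partitions = [records[i::4] for i in range(4)]

    # Each partition is handled by a separate worker process, akin to the
    # parallel engine running the same stage logic on every data partition.
    with Pool(processes=4) as pool:
        results = pool.map(process_partition, partitions)

    merged = [rec for part in results for rec in part]
    print(len(merged), "records processed")
```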


DataStage and Informatica are both very powerful ETL tools that serve much the same purpose in much the same way. Performance, maintainability and learning curve are all broadly comparable.

DataStage offers drag-and-drop functionality: you drop stages onto a single canvas area to build a pipeline source-to-target job.


Mapplets and worklets are Informatica features that make it easy to reuse mappings and workflows, which greatly improves development efficiency.
DataStage provides similar reusability via containers (local and shared); a loose analogy is sketched below. To reuse a job or workflow, the user copies it, compiles it and runs it.
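As a loose analogy for a shared container (or an Informatica mapplet), the Python sketch below wraps a common cleansing step in a reusable function that two otherwise separate pipelines call. The function and pipelines are hypothetical; in DataStage the shared container is reused graphically, and each job that uses it must still be compiled and run.

```python
def standardize_customer(record):
    """Reusable cleansing logic, analogous to a shared container or mapplet."""
    return {
        "customer_id": record["customer_id"].strip(),
        "name": record["name"].strip().title(),
        "country": record.get("country", "UNKNOWN").upper(),
    }

def daily_load(records):
    """One 'job' that reuses the shared cleansing step."""
    return [standardize_customer(r) for r in records]

def migration_load(records):
    """A second, independent 'job' reusing the same step, with an extra filter."""
    cleaned = [standardize_customer(r) for r in records]
    return [r for r in cleaned if r["country"] != "UNKNOWN"]

if __name__ == "__main__":
    sample = [{"customer_id": " 42 ", "name": "jane doe", "country": "in"}]
    print(daily_load(sample))
    print(migration_load(sample))
```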



Informatica’s strength is auto-generated code: a mapping is created by dropping a source, transformations and a target onto the canvas, and the user does not need to compile it.

DataStage, on the other hand, requires a job to be compiled before it can be run successfully.

