A common issue in data integration is that the documentation and the SAS® data integration job source code diverge and eventually fall out of sync. At Capgemini, working for a specific client, we developed a solution to address this challenge. This presentation focuses on the metadata documentation generator; it:
1) looks at how to use programming and documentation standards in SAS data integration jobs to enable the generation of documentation from the metadata
2) shows how the documentation is generated from the metadata, and the challenges that were encountered while creating the code.
Presentation by Richard Hogenberg, Capgemini at SAS Global Forum 2017.
http://support.sas.com/resources/papers/proceedings17/1517-2017.pdf
SAS Data Integration: a Capgemini Solution to Accelerate and Keeping it All 'in Sync'?
2. Presenter
Richard Hogenberg, Lead global SAS® CoE, Capgemini
Richard Hogenberg is the lead of the global SAS® CoE at Capgemini,
where he has worked since 1996. Richard is a SAS® solution architect and
technical specialist working mainly on Fraud and Risk projects.
Richard holds 3 SAS® certifications:
- Certified Base Programmer for SAS® 9
- Certified Advanced Programmer for SAS® 9
- Certified Data Integration Developer for SAS® 9
3. SAS® Data Integration Documentation Generator:
a Capgemini Solution to Accelerate and Keeping it All "in Sync"
4. INTRODUCTION
• Everyone has probably encountered the situation where the documentation
and the corresponding source code are out of sync.
• As a result, people rely on the source code itself instead of the documentation.
• Our solution keeps the documentation within the SAS® Data Integration
Studio jobs. The complete documentation is therefore available within the
SAS® Metadata.
5. What is needed
• Clear standards for SAS® Data Integration Job development in order to
ensure that all developers work the same way and all the transformations
are clearly documented.
• A program to extract the information out of the SAS® Metadata in order to
generate a ‘paper’ version of the documentation.
6. Standards for SAS® Data Integration Job development
• Job level
  - Naming standards for the SAS® Data Integration Jobs
  - Sticky notes
• Transformation level
  - Naming standards for the transformations
  - Defining which property fields can contain information that can be extracted
    from the SAS® Metadata and therefore should be filled in
7. Job level
Usage of sticky notes
8. Transformation level
• Changing the default transformation name to a meaningful name greatly
improves the readability of a Data Integration job, as people can easily
see what is supposed to happen in each of the transformations.
• The description of a transformation can contain only a limited number of
characters. We use the notes tab of the transformation as the place to
put a full description of the technical / functional details of the
transformation.
• For user-written code, ensure that the code itself is properly commented and
use the notes section of the transformation for more details.
• Add comments to expressions that are part of a transformation.
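As an illustration of the last two points, the sketch below shows what a commented expression might look like when written out as PROC SQL. The table, column names, and business rule are hypothetical and are not taken from the client project; in SAS® Data Integration Studio the CASE expression would sit in the expression field of a mapping.

```sas
/* Hypothetical sample input, only here to make the sketch runnable */
data work.customers;
  format birth_date date9.;
  birth_date = '01JAN1980'd; output;
  birth_date = .;            output;
run;

proc sql;
  create table work.customers_doc as
  select birth_date,
         /* Business rule (hypothetical): classify customers by age.      */
         /* Missing birth dates are flagged instead of silently dropped.  */
         case
           when missing(birth_date) then 'UNKNOWN'
           /* 'c' = continuous method: counts completed years */
           when intck('year', birth_date, today(), 'c') < 18 then 'MINOR'
           else 'ADULT'
         end as age_band length=7
  from work.customers;
quit;
```

Because the comment travels with the expression, it ends up in the metadata together with the expression text.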
9. Challenges encountered while developing the program that extracts the
documentation from the SAS® Metadata
• Understanding the SAS® Metadata structure
• Extracting the information from the SAS® Metadata structure
10. Understanding the SAS® Metadata structure
1) SAS® Metadata Browser
11. Understanding the SAS® Metadata structure (continued)
12. Understanding the SAS® Metadata structure (continued)
2) SAS® Metadata Packages
Archive a SAS Data Integration job using ‘archive as a SAS package’, then
unzip the package. The unzipped package contains multiple XML files. One of
these XML files contains the complete structure of the Data Integration job.
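The XML files inside a package can also be listed from SAS itself. The sketch below is one possible way to do it, assuming SAS 9.4 (for the FILENAME ZIP access method) and assuming the package is an ordinary ZIP archive, as the unzip step above implies. The package path is hypothetical.

```sas
/* Hypothetical path to an exported package */
filename pkg zip "/tmp/DG_example_job.spk";

data work.pkg_members (keep=memname);
  length memname $256;
  /* Open the archive as if it were a directory */
  fid = dopen("pkg");
  if fid = 0 then do;
    put "ERROR: could not open the package as a ZIP archive.";
    stop;
  end;
  /* Keep only the members with an .xml extension */
  do i = 1 to dnum(fid);
    memname = dread(fid, i);
    if lowcase(scan(memname, -1, ".")) = "xml" then output;
  end;
  rc = dclose(fid);
run;

filename pkg clear;
```

The resulting table only lists the XML members; identifying which one holds the job structure still requires opening them.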
13. Extracting the information from the SAS® Metadata structure
SAS® functions used:
• metadata_getnobj
• metadata_getnasn
• metadata_getnatr
• metadata_getattr
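A minimal sketch of how three of these DATA step functions can be combined is shown below (metadata_getnasn appears in the later slides). The connection values are placeholders, and the choice to read the Name attribute is only for illustration; it is not necessarily what the generator itself does.

```sas
/* Placeholder connection settings: replace with your own metadata server */
options metaserver="meta.example.com" metaport=8561
        metauser="sasdemo" metapass="********"
        metarepository="Foundation";

data _null_;
  /* Output arguments must be declared as character variables up front */
  length uri $256 attr $64 value $512;
  call missing(uri, attr, value);

  /* metadata_getnobj: URI of the 1st matching object; rc = number of matches */
  rc = metadata_getnobj("omsobj:Job?@Name contains 'DG_'", 1, uri);
  if rc > 0 then do;
    /* metadata_getattr: read one attribute by name */
    rc_attr = metadata_getattr(uri, "Name", value);
    put "Job name: " value;

    /* metadata_getnatr: read the n-th attribute; negative when n is out of range */
    n = 1;
    rc_natr = metadata_getnatr(uri, n, attr, value);
    do while (rc_natr > 0);
      put attr= value=;
      n + 1;
      rc_natr = metadata_getnatr(uri, n, attr, value);
    end;
  end;
run;
```

Enumerating all attributes with metadata_getnatr is a convenient way to discover which fields of an object type are actually populated.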
14. Extracting the information from the SAS® Metadata structure (continued)
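• The example below shows how to obtain the URIs of the Data Integration jobs
whose names contain DG_:
  /* Loop over all Job objects whose Name contains 'DG_' */
  i=0;
  do until(rc1<0);
    i+1;
    /* rc1 = number of matching objects; negative once i is out of range */
    rc1=metadata_getnobj("omsobj:Job?@Name contains 'DG_'",i,uri);
    if rc1>0 then do;
      …
    end;
  end;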
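For context, the fragment above could be embedded in a complete DATA step along the following lines. This is a sketch under assumptions: the metadata connection options are already set, the output table and variable names are hypothetical, and storing Name and Desc is an illustrative choice for the elided loop body, not the original code.

```sas
data work.dg_jobs (keep=job_uri job_name job_desc);
  length uri job_uri $256 job_name $128 job_desc $256;
  call missing(uri, job_name, job_desc);

  i = 0;
  /* Defensive variant of the loop: stop on any non-positive return code */
  do until (rc1 <= 0);
    i + 1;
    rc1 = metadata_getnobj("omsobj:Job?@Name contains 'DG_'", i, uri);
    if rc1 > 0 then do;
      /* Illustrative loop body: keep the job's URI, name and description */
      rc2 = metadata_getattr(uri, "Name", job_name);
      rc3 = metadata_getattr(uri, "Desc", job_desc);
      job_uri = uri;
      output;
    end;
  end;
run;
```

Each row of work.dg_jobs then provides a job URI that can serve as the starting point for walking that job's transformations.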
15. Extracting the information from the SAS® Metadata structure (continued)
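• The code below shows how to walk the metadata tree across all the
transformations within a job:
  /* &JobID. holds the URI of the job being documented */
  i=0;
  do until(rc1<0);
    i+1;
    /* nuri receives the URI of the i-th object in the association */
    rc1=metadata_getnasn("&JobID.","TransformationSources",i,nuri);
    if rc1>0 then do;
      …
    end;
  end;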
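One way the elided loop body might collect documentation is sketched below. It rests on several assumptions that are not confirmed by the slides: that the metadata connection options are already set, that notes are reachable through the "Notes" association as TextStore objects whose text is in the StoredText attribute, and that the job name used in the macro variable exists. The output table and variable names are hypothetical.

```sas
/* Hypothetical job; any valid job URI can be used here */
%let JobID = omsobj:Job?@Name='DG_example_job';

data work.job_docs (keep=obj_name obj_desc note_text);
  length nuri note_uri $256 obj_name $128 obj_desc $256 note_text $32767;
  call missing(nuri, note_uri, obj_name, obj_desc, note_text);

  i = 0;
  do until (rc1 <= 0);
    i + 1;
    rc1 = metadata_getnasn("&JobID.", "TransformationSources", i, nuri);
    if rc1 > 0 then do;
      rc2 = metadata_getattr(nuri, "Name", obj_name);
      rc3 = metadata_getattr(nuri, "Desc", obj_desc);

      /* Assumption: notes are TextStore objects on the "Notes" association */
      note_text = "";
      j = 0;
      do until (rc4 <= 0);
        j + 1;
        rc4 = metadata_getnasn(nuri, "Notes", j, note_uri);
        if rc4 > 0 then do;
          rc5 = metadata_getattr(note_uri, "StoredText", note_text);
          output;   /* one row per note */
        end;
      end;
      /* j is still 1 when the first lookup found no notes at all */
      if j = 1 then output;
    end;
  end;
run;
```

Writing one row per object and note keeps the extraction step separate from the layout step, so the same table could feed any reporting procedure.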
16. The job itself
17. What does the result look like
18. Questions?