Tuesday, January 26, 2016

Informatica most common Functions Used during ETL Process


INSTR
Syntax
INSTR (string, search_value [,start [,occurrence [,comparison_type ]]] )
Argument
Required/Optional
Description
string
Required
The string must be a character string. Passes the value you want to evaluate. You can enter any valid transformation expression. The results of the expression must be a character string. If not, INSTR converts the value to a string before evaluating it.
search_value
Required
Any value. The search value is case sensitive. The set of characters you want to search for. The search_value must match a part of the string. For example, if you write INSTR(‘Alfred Pope’, ‘Alfred Smith’) the function returns 0.
You can enter any valid transformation expression. If you want to search for a character string, enclose the characters you want to search for in single quotation marks, for example ‘abc’.
start
Required
Must be an integer value. The position in the string where you want to start the search. You can enter any valid transformation expression.
The default is 1, meaning that INSTR starts the search at the first character in the string.If the start position is 0, INSTR searches from the first character in the string. If the start position is a positive number, INSTR locates the start position by counting from the beginning of the string. If the start position is a negative number, INSTR locates the start position by counting from the end of the string. If you omit this argument, the function uses the default value of 1.
occurrence
Required
A positive integer greater than 0. You can enter any valid transformation expression. If the search value appears more than once in the string, you can specify which occurrence you want to search for. For example, you would enter 2 to search for the second occurrence from the start position.
If you omit this argument, the function uses the default value of 1, meaning that INSTR searches for the first occurrence of the search value. If you pass a decimal, the PowerCenter Integration Service rounds it to the nearest integer value. If you pass a negative integer or 0, the session fails.
comparison_type
Optional
The string comparison type, either linguistic or binary, when the PowerCenter Integration Service runs in Unicode mode. When the PowerCenter Integration Service runs in ASCII mode, the comparison type is always binary.
Linguistic comparisons take language-specific collation rules into account, while binary comparisons perform bitwise matching. For example, the German sharp s character matches the string “ss” in a linguistic comparison, but not in a binary comparison. Binary
comparisons run faster than linguistic comparisons.
Must be an integer value, either 0 or 1:
– 0: INSTR performs a linguistic string comparison.
– 1: INSTR performs a binary string comparison.
Default is 0.

Return Value
Integer if the search is successful. Integer represents the position of the first character in the search_value, counting from left to right.
1. 0 if the search is unsuccessful.
2. NULL if a value passed to the function is NULL.
Return Value
The following expression returns the position of the first occurrence of the letter ‘a’, starting at the beginning of each company name. Because the search_value argument is case sensitive, it skips the ‘A’ in ‘Blue Fin Aqua Center’, and returns the position for the ‘a’ in ‘Aqua’:
INSTR( COMPANY, ‘a’ )
COMPANY
RETURN VALUE
Blue Fin Aqua Center
13
Maco Shark Shop
2
Scuba Gear
5
Frank’s Dive Shop
3
VIP Diving Club
0
The following expression returns the position of the second occurrence of the letter ‘a’, starting at the beginning of each company name. Because the search_value argument is case sensitive, it skips the ‘A’ in ‘Blue Fin Aqua Center’, and returns 0:
INSTR( COMPANY, ‘a’, 1, 2 )
COMPANY
RETURN VALUE
Blue Fin Aqua Center
0
Maco Shark Shop
8
Scuba Gear
9
Frank’s Dive Shop
0
VIP Diving Club
0
The following expression returns the position of the second occurrence of the letter ‘a’ in each company name, starting from the last character in the company name. Because the search_value argument is case sensitive, it skips the ‘A’ in ‘Blue Fin Aqua Center’, and returns 0:
INSTR( COMPANY, ‘a’, -1, 2 )
COMPANY
RETURN VALUE
Blue Fin Aqua Center
0
Maco Shark Shop
2
Scuba Gear
5
Frank’s Dive Shop
0
VIP Diving Club
0
The following expression returns the position of the first character in the string ‘Blue Fin Aqua Center’ (starting from the last character in the company name):
INSTR( COMPANY, ‘Blue Fin Aqua Center’, -1, 1 )
COMPANY
RETURN VALUE
Blue Fin Aqua Center
1
Maco Shark Shop
0
Scuba Gear
0
Frank’s Dive Shop
0
VIP Diving Club
0
Using Nested INSTR
You can nest the INSTR function within other functions to accomplish more complex tasks.
The following expression evaluates a string, starting from the end of the string. The expression finds the last (rightmost) space in the string and then returns all characters to the left of it:
SUBSTR( CUST_NAME,1,INSTR( CUST_NAME,’ ‘ ,-1,1 ))
CUST NAME
RETURN VALUE
PATRICIA JONES
PATRICIA
MARY ELLEN SHAH
MARY ELLEN
The following expression removes the character ‘#’ from a string:
SUBSTR( CUST_ID, 1, INSTR(CUST_ID, ‘#’)-1 ) || SUBSTR( CUST_ID, INSTR(CUST_ID, ‘#’)+1 )
CUST_ID
RETURN VALUE
ID#33
ID33
#A3577
A3577
SS #712403399
SS 712403399


IIF
Returns one of two values you specify, based on the results of a condition.
Syntax
IIF( condition, value1 [,value2] )
Argument
Required/Optional
Description
condition
Required
The condition you want to evaluate. You can enter any valid transformation expression   that evaluates to TRUE or FALSE.
value1
Required
Any datatype except Binary. The value you want to return if the condition is TRUE. The return value is always the datatype specified by this argument. You can enter any valid transformation expression, including another IIF expression.
value2
Optional
Any datatype except Binary. The value you want to return if the condition is FALSE. You can enter any valid transformation expression, including another IIF expression.
Unlike conditional functions in some systems, the FALSE (value2) condition in the IIF function is not required. If you omit value2, the function returns the following when the condition is FALSE: 1. 0 if value1 is a Numeric datatype. 2. Empty string if value1 is a String datatype. 3. NULL if value1 is a Date/Time datatype. For example, the following expression does not include a FALSE condition and value1 is a string datatype so the PowerCenter Integration Service returns an empty string for each row that evaluates to FALSE: IIF( SALES > 100, EMP_NAME )
SALES
EMP_NAME
RETURN VALUE
150
John Smith
John Smith
50
Pierre Bleu
” (empty string)
120
Sally Green
Sally Green
NULL
Greg Jones
” (empty string)
Return Value value1 if the condition is TRUE. value2 if the condition is FALSE. For example, the following expression includes the FALSE condition NULL so the PowerCenter Integration Service returns NULL for each row that evaluates to FALSE: IIF( SALES > 100, EMP_NAME, NULL )
SALES
EMP_NAME
RETURN VALUE
150
John Smith
John Smith
50
Pierre Bleu
” (empty string)
120
Sally Green
Sally Green
NULL
Greg Jones
” (empty string)
If the data contains multibyte characters and the condition argument compares string data, the return value depends on the code page and data movement mode of the PowerCenter Integration Service.
IIF and Datatypes
When you use IIF, the datatype of the return value is the same as the datatype of the result with the greatest precision.
For example, you have the following expression:
IIF( SALES < 100, 1, .3333 )
The TRUE result (1) is an integer and the FALSE result (.3333) is a decimal. The Decimal datatype has greater precision than Integer, so the datatype of the return value is always a Decimal.
When you run a session in high precision mode and at least one result is Double, the datatype of the return value is Double.
Special Uses of IIF
Use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is 0 or negative:
IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )
Use IIF in update strategies. For example:
IIF( ISNULL( ITEM_NAME ), DD_REJECT, DD_INSERT)
Alternative to IIF
Use “DECODE”  instead of IIF in many cases. DECODE may improve readability. The following shows how you use DECODE instead of IIF using the first example from the previous section:
DECODE( TRUE, SALES > 0 and SALES < 50, SALARY1, SALES > 49 AND SALES < 100, SALARY2, SALES > 99 AND SALES < 200, SALARY3,SALES > 199, BONUS)
You can often use a Filter transformation instead of IIF to maximize session performance.

Variable Functions
The transformation language includes a group of variable functions to update the current value of a mapping variable throughout the session. When you run a workflow, the PowerCenter Integration Service evaluates the start and current value of a variable at the beginning of the session based on the final value of the variable from the last session run. Use the following variable functions:
     1.    SetCountVariable
2.    SetMaxVariable
3.    SetMinVariable
4.    SetVariable
Use different variable functions with a variable based on the aggregation type of the variable.
When using mapping variables in sessions with multiple partitions, use variable functions to determine the final value of the variable for each partition. At the end of the session, the PowerCenter Integration Service performs the aggregate function across all partitions to determine one final value to save to the repository. Unless overridden, it uses the saved value as the start value of the variable for the next time you use this session.
For example, you use SetMinVariable to set a variable to the minimum evaluated value. The PowerCenter Integration Service calculates the minimum current value for the variable for each partition. Then at the end of the session, it finds the minimum current value across all partitions and saves that value into the repository.
Use SetVariable only once for each mapping variable in a pipeline. When you create multiple partitions in a pipeline, the PowerCenter Integration Service uses multiple threads to process that pipeline. If you use this function more than once for the same variable, the current value of a mapping variable may have indeterministic results.

SETCOUNTVARIABLE
Counts the rows evaluated by the function and increments the current value of a mapping variable based on the count. Increases the current value by one for each row marked for insertion. Decreases the current value by one for each row marked for deletion. Keeps the current value the same for each row marked for update or reject. Returns the new current value.
At the end of a successful session, the PowerCenter Integration Service saves the last current value to the repository. When used with a session that contains multiple partitions, the PowerCenter Integration Service generates different current values for each partition. At the end of the session, it determines the total count for all partitions and saves the total to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next time you use this session.
Use the SETCOUNTVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This may cause inconsistent results when you use the same variable function multiple times in a mapping.
Use SETCOUNTVARIABLE with mapping variables with a Count aggregation type. Use SETCOUNTVARIABLE in the following transformations:
·         Expression
·         Filter
·         Router
·         Update Strategy
The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following are true:
·         The session fails to complete.
·         The session is configured for a test load.
·         The session is a debug session.
·         The session runs in debug mode and is configured to discard session output.
Syntax
SETCOUNTVARIABLE( $$Variable )
Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use mapping variables with a count aggregation type.
Return Value
TRUE if the data matches the pattern.
Example
You have a mapping that updates a slowly changing dimension table containing distributor information. The following expression counts the number of current distributors with the mapping variable $$CurrentDistributors and returns the current value to the CUR_DIST port. It increases the count by one for each inserted row, decreases the count for each deleted row, and keeps the count the same for all updated or rejected rows. The initial value of $$CurrentDistributors from the previous session run is 23.
SETCOUNTVARIABLE ($$CurrentDistributors)
(row marked for)
DIST_ID
DISTRIBUTOR
CUR_DIST
(update)
000015
MSD Inc.
23
(insert)
000024
Darkroom Co.
24
(insert)
000025
Howard’s Supply
25
(update)
000003
JNR Ltd.
25
(delete)
000024
Darkroom Co.
24
(insert)
000026
Supply.com
25
At the end of the session, the PowerCenter Integration Service saves ‘25’ to the repository as the current value for $$CurrentDistributors. The next time the session runs, the Integration Service evaluates the initial value to $$CurrentDistributors to ‘25’.
The PowerCenter Integration Service saves the same value for $$CurrentDistributors to the repository for sessions with multiple partitions as for sessions with a single partition.

SETMAXVARIABLE
Sets the current value of a mapping variable to the higher of two values: the current value of the variable or the value you specify. Returns the new current value. The function executes only if a row is marked as insert. SETMAXVARIABLE ignores all other row types and the current value remains unchanged.
At the end of a successful session, the PowerCenter Integration Service saves the final current value to the repository. When used with a session that contains multiple partitions, the PowerCenter Integration Service generates different current values for each partition. At the end of the session, it saves the highest current value across all partitions to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.
When used with a string mapping variable, SETMAXVARIABLE returns the higher string based on the sort order selected for the session.
Use the SETMAXVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This can cause inconsistent results when you use the same variable function multiple times in a mapping.
Use SETMAXVARIABLE with mapping variables with a Max aggregation type. Use SETMAXVARIABLE in the following transformations:
·         Expression
·         Filter
·         Router
·         Update Strategy
The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following conditions are true:
·         The session fails to complete.
·         The session is configured for a test load.
·         The session is a debug session.
·         The session runs in debug mode and is configured to discard session output.
Syntax
SETMAXVARIABLE( $$Variable, value )
Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use mapping variables with Max aggregation
type.
value
Required
The value you want the PowerCenter Integration Service to compare against the current value of the variable. You can enter any valid transformation expression that evaluates to a datatype compatible with the datatype of the variable.
Return Value
The higher of two values: the current value of the variable or the value you specified. The return value is the new current value of the variable.
When value is NULL the PowerCenter Integration Service returns the current value of $$Variable.
Examples
The following expression compares the number of items purchased in each transaction with a mapping variable $$MaxItems. It sets $$MaxItems to the higher of two values and returns the historically highest number of items purchased in a single transaction to the MAX_ITEMS port. The initial value of $$MaxItems from the previous session run is 22.
SETMAXVARIABLE ($$MAXITEMS, ITEMS)
TRANSACTION
ITEMS
MAX_ITEMS
0100002
12
22
0100003
5
22
0100004
18
22
0100005
35
35
0100006
5
35
0100007
14
35
At the end of the session, the PowerCenter Integration Service saves ‘35’ to the repository as the maximum current value for $$MaxItems. The next time the session runs, the PowerCenter Integration Service evaluates the initial value to $$MaxItems to ‘35’.
If the same session contains three partitions, the PowerCenter Integration Service evaluates $$MaxItems for each partition. Then, it saves the largest value to the repository. For example, the last evaluated value for $$MaxItems in each partition is as follows:
Partition      Final Current Value for $$MaxItems
Partition 1      35
Partition 2      23
Partition 3      22

SETMINVARIABLE
Sets the current value of a mapping variable to the lower of two values: the current value of the variable or the value you specify. Returns the new current value. The SETMINVARIABLE function executes only if a row is marked as insert. SETMINVARIABLE ignores all other row types and the current value remains unchanged.
At the end of a successful session, the PowerCenter Integration Service saves the final current value to the repository. When used with a session that contains multiple partitions, the PowerCenter Integration Service generates different current values for each partition. At the end of the session, it saves the lowest current value across all partitions to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.
When used with a string mapping variable, SETMINVARIABLE returns the lower string based on the sort order selected for the session.
Use the SETMINVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This may cause inconsistent results when you use the same variable function multiple times in a mapping.
Use SETMINVARIABLE with mapping variables with a Min aggregation type. Use SETMINVARIABLE in the following transformations:
·         Expression
·         Filter
·         Router
·         Update Strategy
The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following conditions are true:


Syntax
SETMINVARIABLE( $$Variable, value )
Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use with mapping variables with Min aggregation type.
value
Required
The value you want the PowerCenter Integration Service to compare against the current value of the variable. You can enter any valid transformation expression that evaluates to a datatype compatible with the datatype of the variable.
Return Value
The lower of two values: the current value of the variable or the value you specified. The return value is the new current value of the variable.
When value is NULL, the PowerCenter Integration Service returns the current value of $$Variable.
Example
The following expression compares the price of an item with a mapping variable $$MinPrice. It sets $$MinPrice to the lower of two values and returns the historically lowest item price to the MIN_PRICE port. The initial value of $$MinPrice from the previous session run is 22.50.
SETMINVARIABLE ($$MinPrice, PRICE)
DATE
PRICE 
MIN_PRICE
05/01/2000 09:00:00
23.50
22.50
05/01/2000 10:00:00
27.00
22.50
05/01/2000 11:00:00
26.75
22.50
05/01/2000 12:00:00
25.25
22.50
05/01/2000 13:00:00
22.00
22.00
05/01/2000 14:00:00
22.75
22.00
05/01/2000 15:00:00
23.00
22.00
05/01/2000 16:00:00
24.25
22.00
05/01/2000 17:00:00
24.00
22.00
At the end of the session, the PowerCenter Integration Service saves 22.00 to the repository as the minimum current value for $$MinPrice. The next time the session runs, the PowerCenter Integration Service evaluates the initial value to $$MinPrice to 22.00.
If the same session contains three partitions, the PowerCenter Integration Service evaluates $$MinPrice for each partition. Then, it saves the smallest value to the repository. For example, the last evaluated value for $$MinPrice in each partition is as follows:
Partition      Final Current Value for $$MinPrice
Partition 1      22.00
Partition 2      22.00
Partition 3      22.00

SETVARIABLE
Sets the current value of a mapping variable to a value you specify. Returns the specified value. The SETVARIABLE function executes only if a row is marked as insert or update. SETVARIABLE ignores all other row types and the current value remains unchanged.
At the end of a successful session, the PowerCenter Integration Service compares the final current value of the variable to the start value of the variable. Based on the aggregate type of the variable, it saves a final current value to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.
Use the SETVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This may cause inconsistent results when you use the same variable function multiple times in a mapping.
Use SETVARIABLE in the following transformations:
·         Expression
·         Filter
·         Router
·         Update Strategy
The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following conditions are true:
·         The session fails to complete.
·         The session is configured for a test load.
·         The session is a debug session.
·         The session runs in debug mode and is configured to discard session output.
Syntax
SETVARIABLE( $$Variable, value )
Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use with mapping variables with Max/Min aggregation type.
value
Required
The value you want to set the current value of the variable to. You can enter any valid transformation expression that evaluates to a datatype compatible with the datatype of the variable.

Return Value
Current value of the variable.
When value is NULL, the PowerCenter Integration Service returns the current value of $$Variable.
Examples
The following expression sets a mapping variable $$Time to the system date at the time the PowerCenter Integration Service evaluates the row and returns the system date to the SET_$$TIME port:
SETVARIABLE ($$Time, SYSDATE)
TRANSACTION
ITEMS
SET_$$TIME
0100002
534.23
10/10/2000 01:34:33
0100003
699.01
10/10/2000 01:34:34
0100004
97.50
10/10/2000 01:34:35
0100005
116.43
10/10/2000 01:34:36
0100006
323.95
10/10/2000 01:34:37
At the end of the session, the PowerCenter Integration Service saves 10/10/2000 01:34:37 to the repository as the last evaluated current value for $$Timestamp.
The next time the session runs, the PowerCenter Integration Service evaluates all references to $$Timestamp to 10/10/2000 01:34:37.
The following expression sets the mapping variable $$Timestamp to the timestamp associated with the row and returns the timestamp to the SET_$$TIMESTAMP port:
SETVARIABLE ($$Time, TIMESTAMP)
TRANSACTION
TIMESTAMP
TOTAL
SET_$$TIMESTAMP
0100002
10/01/2000 12:01:01
534.23
10/01/2000 12:01:01
0100003
10/01/2000 12:10:22
699.01
10/01/2000 12:10:22
0100004
10/01/2000 12:16:45
97.50
10/01/2000 12:16:45
0100005
10/01/2000 12:23:10
116.43
10/01/2000 12:23:10
0100006
10/01/2000 12:40:31
323.95
10/01/2000 12:40:31
At the end of the session, the PowerCenter Integration Service saves 10/01/2000 12:40:31 to the repository as the last evaluated current value for $$Timestamp.
The next time the session runs, the PowerCenter Integration Service evaluates the initial value of $$Timestamp to
10/01/2000 12:40:31.
At the end of the session, the PowerCenter Integration Service saves 10/01/2000 12:40:31 to the repository as the last evaluated current value for $$Timestamp.

Data Cleansing Functions
The transformation language includes a group of functions to eliminate data errors. You can complete the following tasks with data cleansing functions:

  •          Test input values.
  •          Convert the datatype of an input value.
  •          Trim string values.
  •          Replace characters in a string.
  •          Encode strings.
  •          Match patterns in regular expressions.
The transformation language includes the following data cleansing functions:
     1.    GREATEST
2.    IN
3.    INSTR
4.    IS_DATE
5.    IS_NUMBER
6.    IS_SPACES
7.    ISNULL
8.    LEAST
9.    LTRIM
10.  METAPHONE
11.  REG_EXTRACT
12.  REG_MATCH
13.  REG_REPLACE
14.  REPLACECHR
15.  REPLACESTR
16.  RTRIM
17.  SOUNDEX
18.  SUBSTR
19.  TO_BIGINT
20.  TO_CHAR
21.  TO_DATE
22.  TO_DECIMAL
23.  TO_FLOAT
24.  TO_INTEGER

Character Functions
1. LENGTH:
The LENGTH function returns the number of characters in a string, including trailing blanks. It is available in the Designer and the Workflow Manager.
LENGTH (string)
Example: The following expression returns the length of each customer name:
LENGTH (CUSTOMER_NAME)
CUSTOMER_NAME
Leonardo
NULL
Edwin Britto
2. LPAD:
RETURN VALUE
8
NULL
12
The LPAD function adds a set of blanks or characters to the beginning of a string, to set a string to a specified length. It is available in the Designer and the Workflow Manager.
LPAD (first_string, length [, second_string])
Example: The following expression standardizes numbers to five digits by padding them with leading zeros.
LPAD (NUM, 5, ‘0’)
NUM
1
250
3. LTRIM:
RETURN VALUE
00001
00250
The LTRIM function removes blanks or characters from the beginning of a string. It is available in the Designer and the Workflow Manager.
LTRIM (string [, trim_set])
LTRIM (string) removes the leading spaces or blanks from the string. When LTRIM function is used with a trim set, which is optional, it removes the characters in the trim set from the string.
Example : The following expression removes the leading zeroes in the port
ITEM_CODE.
LTRIM (ITEM_CODE,’0′)
ITEM_CODE
006
0803
RETURN VALUE
6
803
* The LTRIM function can be nested when needed to remove multiple characters.
4. RPAD:
The RPAD function converts a string to a specified length by adding blanks or characters to the end of the string. It is available in the Designer and the Workflow Manager.
RPAD( first_string, length [, second_string ] )
Example: The following expression returns the string with a length of 5 characters, appending the string ‘:’ to the end of each word:
RPAD (WORD, 5, ‘:’)
WORD
Date
Time
5. RTRIM:
RETURN VALUE
Date:
Time:
The RTRIM function removes blanks or characters from the end of a string. It is available in the Designer and the Workflow Manager.
RTRIM (string [, trim_set])
The RTRIM function can be combined with the LENGTH function if the trailing blanks are to be ignored. It can also be nested when needed to remove multiple characters.
RTRIM (string) removes the trailing spaces or blanks from the string. When RTRIM function is used with a trimset, which is optional, it removes the characters in the trimset from the string.
For example,
RTRIM (ITEM_CODE,’10’)
The above expression removes the characters 10 in the port ITEM_CODE.
ITEM_CODE
0610
380
RETURN VALUE
06
38
In the second example the function removes the trailing zero since the RTRIM compares the first character in the trimset with the last character of the string, since it does not match it takes the second character in the trimset and compares with last character of the string. Since it matches it removes it.
6. SUBSTR:
The SUBSTR function returns a portion of a string. It is available in the Designer and the Workflow Manager.
SUBSTR( string, start [, length ] )
The SUBSTR may not give the desired result if the string on which it is used is not trimmed. Though it is always a good practice to trim the strings before using them in any expression, it becomes extremely important to trim them if they are used in a SUBSTR function.
For example, if there is a function
SUBSTR (NAME, 2,2)
It will not return the 2,3 characters of the NAME if the port has leading spaces. In this case LTRIM becomes essential.
SUBSTR(LTRIM(NAME),2,2)
The SUBSTR function can also be used to get the last few characters as described below.
SUBSTR(NAME,-3,3)
This function will return the last three characters of the string. But it may not return the required last three characters if the port has trailing blanks, hence RTRIM is essential.
SUBSTR(RTRIM(NAME),-3,3)
Hence it is always better to trim the strings before using them in a SUBSTR function.
SUBSTR(LTRIM(RTRIM(NAME)),3,2)
The above expression will get the 3,4 character of the port NAME irrespective of whether the port has leading or trailing blanks or not.

Conversion Functions
1. TO_CHAR:
The TO_CHAR function converts numeric values and dates to text strings. It is available in the Designer and the Workflow Manager.
TO_CHAR( numeric_value )
TO_CHAR (date [, format ] )
Example : The following expression converts the values in the SALES port to text:
TO_CHAR (SALES )
SALES
1800.03
-22.57891
RETURN VALUE
‘1800.03’
‘-22.57891′
The following expression converts the dates in the DATE_PROMISED port to text in the format MON DD YYYY:
TO_CHAR (DATE_PROMISED, ‘MON DD YYYY’ )
DATE_PROMISED
Apr 1 1998 12:00:10AM
RETURN VALUE
‘Apr 01 1998′
If we omit the format_string argument, TO_CHAR returns a string in the default date format ‘MM/DD/YYYY’.
We can use Conversion functions with DATE functions in order to do some calculations.
The following composite expression converts the string DATE_PROMISED to date, adds 1 to it and then converts the same to text string with the format YYYYMMDD.
TO_CHAR(ADD_TO_DATE(TO_DATE(DATE_PROMISED),’DD’,1),’YYYYMMDD’)
Test functions can also be used with Conversion functions.
The following expression uses IS_DATE along with TO_CHAR.
IS_DATE(TO_CHAR(DATE_PROMISED,’YYYYMMDD’))
* TO_CHAR returns NULL if invalid Date is passed to the function.
2. TO_DATE:
The TO_DATE function converts a character string to a date datatype in the same format as the character string. It is available in the Designer and the Workflow Manager.
TO_DATE( string [, format ] )
Example : The following expression returns date values for the strings in the DATE_PROMISED port. TO_DATE always returns a date and time. If we pass a string that does not have a time value, the date returned always includes the time 00:00:00. If we execute a session in the twentieth century, the century will be 19.
The current year on the machine running the Informatica Server is 1998:
TO_DATE( DATE_PROMISED, ‘MM/DD/YY’ )
DATE_PROMISED
’12/28/81′
NULL
RETURN VALUE
Dec 28 1981 00:00:00
NULL
The format of the string must exactly be the format given in the TO_DATE function.
* TO_DATE function fails if invalid date entries are given. To avoid this we must use IS_DATE function to check if the string has a valid date to be converted.
3. TO_DECIMAL:
The TO_DECIMAL function converts any value (except binary) to a decimal. It is available in the Designer.
TO_DECIMAL( value [, scale ] )
Example : This expression uses values from the port IN_TAX. The datatype is decimal with precision of 10 and scale of 3:
TO_DECIMAL( IN_TAX, 3 )
IN_TAX
‘15.6789’
NULL
‘A12.3Grove’
RETURN VALUE
15.678
NULL
0
We can also use two conversion functions together in a single expression.
The following expression uses the functions TO_DECIMAL and TO_CHAR.
TO_DECIMAL(TO_CHAR(DATE_PROMISED,’YYYYMMDD’))
4. TO_FLOAT:
The TO_FLOAT function converts any value (except binary) to a double-precision floating point number (the Double datatype). It is available in the Designer and the Workflow Manager.
TO_FLOAT( value )
Example : This expression uses values from the port IN_TAX:
TO_FLOAT( IN_TAX )
IN_TAX
‘15.6789’
NULL
5. TO_INTEGER:
RETURN VALUE
15.6789
NULL
The TO_INTEGER function converts any value (except binary) to an integer by rounding the decimal portion of a value. It is available in the Designer and the Workflow Manager.
TO_INTEGER( value )
Example : This expression uses values from the port IN_TAX:
TO_INTEGER( IN_TAX )
IN_TAX
‘15.6789’
‘60.2’
RETURN VALUE
16
60



Sunday, December 20, 2015

Software Development Life Cycle


Software Development Life Cycle
Ø      Requirement (gathering & analysis)
Ø      Design (HLD, LLD)
Ø      Develop (Code)
Ø      Test (Validate)
Ø      Deploy (Implement)
Ø      Support (Maintain) 



Requirements
Types of requirements

• Functional Requirements
• Data flow diagrams
• Entity-relationship diagrams
• State transition diagrams
• Data Dictionary
• System parameters
Ø Security, integrity, performance, reliability
• External Interfaces
• Acceptance Criteria
Ø Validation and Performance criteria 

Design

High Level:
Functional Design
• Activity Flows / Work Flows
• System Flows and Application Flows
• Logical Data Model
• Program Inventory
• Architectural Diagrams

Low Level:
• Program Specs - Database Design
• Architectural diagrams - Data Access Design
• Interface design - Class Specifications

Build (Code)

Ø Environment Setup
Ø Test Data Creation
Ø Coding

Testing

• Test plans
• Test specifications
Ø Test cases + Test data + Expected Results
• Testing
• Test Results

Implement

Ø Data Conversion
Ø Data Migration
Ø Database Creation
Ø Production Tests
Ø Contingency Planning
Ø User Training

Support

Ø Production Support
Ø Enhancements 

Myths of SDLC

»Once the program is written and working, job is done.
»Until I get the program running, I have no way of assessing its quality. 

SDLC Models
Waterfall Model
The waterfall model is a sequential software development model (a process for the creation of software) in which development is seen as flowing steadily downwards (like a waterfall) through the phases of requirements analysis, design, implementation, testing (validation), integration, and maintenance.
V Model
V model is the software development model which encapsulates the steps in Verification and Validation phases for each step in the SDLC. For each phase, the subsequent phase becomes the verification (QA) phase and the corresponding testing phase in the other arm of the V becomes the validating (Testing) phase

Iterative Development Models
Prescribes the construction of initially small but ever larger portions of a software project to help all those involved to uncover important issues early before problems or faulty assumptions can lead to disaster.
Example: RUP, RAD, Prototyping, Agile

Testing within each lifecycle
In any life cycle model, there are several characteristics of good testing:
  • For every development activity there is a corresponding testing activity.
  • Each test level has test objectives specific to that level.
  • The analysis and design of tests for a given test level should begin during the corresponding development activity.
  • Testers should be involved in reviewing documents as soon as drafts are available in the development life cycle.
Software Quality
What is Quality?
Quality is the totality of features and characteristics of a product or service that bear on its ability to satisfy stated and implied needs.
Software Quality means: Total conformance to user requirements and specifications. Absence of Defects Fitness of Use.
Other definitions
Errors: Mistakes we make.
Defects: Defects are the results of errors.
Failures: Software failures are the results of Defects - observed while running the software
Quality Assurance vs Quality Control
Quality Assurance
1.Ensures that the product is developed correctly
2.Prevention activity
3.Eg. Process definition, planning
4.Done at the beginning of project
Quality Control
1.Ensures that the product developed is correct
2.Detection and correction activity
3.Eg. Reviews and Testing
4.Carried out after initiation
Importance of QA/QC
“All QC/QA activities are important, because all lead to quality software, if practiced correctly. However, if you allow me to choose one only, my choice is Formal Technical Reviews. If conducted properly, they are the single-most effective ways to uncover and fix defects while they are still inexpensive to find and fix.”
Roger Pressman
What is V&V?
Verification. 
  • Determine whether the artifacts of each phase of SDLC fulfill the requirements of the previous phase. 
  • Are we building the product correctly? 
  • Also called as static technique or Review
Validation.
  • Determine whether the artifacts of each phase of the SDLC are in tune with the customer requirements. 
  • Are we building the correct product? 
  • Also called as dynamic technique or Testing
Static and Dynamic V&V

Static and Dynamic techniques.
Why V&V?
  • V&V is required because:
  • Software construction is a complex task.
  • Software construction is a Human activity. Humans make mistakes.
  • Discipline and Methodology for construction reduce mistakes but cannot eliminate mistakes.
  • V&V is about evaluating the products and processes used for producing the products.
  • V&V is NOT about evaluating the producers.
Types of V&V
Reviews: 
  • Project Plans, Schedules, and Costs 
  • Requirements, Functional Specifications, and Design 
  • Test Plans, Test Specifications, Test Scripts, 
  • Source Code
Testing of Software: 
  • Unit
  • Integration 
  • System Customer Acceptance


Friday, November 27, 2015

Unix Interview Question on ETL


# Write a script to create a menu using case?
$vi menu_case.sh
clear
tput cut 10 25
echo “main menu”
tput cut 11 25
echo “————“
tput cut 12 25
echo “1.Date “
tput cut 13 25
echo “2.All files”
tput cut 13 25
echo “3.All Users”
tput cut 14 25
echo “4.No items found”
read choice
case $choice in
1)echo “Today Date ::” `date`
;;
2)ls -f
;;
3)finger
;;
*)exit
;;
esac
:wq

# Write a script to check the given type of character?

$vi char_type.sh
echo “enter a character”
read char
case $char in
[a-z])echo “lower case”
;;
[A-Z])echo “upper case”
;;
[0-9])echo “Digit”
;;
?)echo “special character”
;;
*)echo “Entered more than one character”
;;
esac
:wq



# Write a script to check the given two file are equal or not?

$vi twofiles_same.sh
echo “enter first file “
read f1
echo “enter second file”
read f2
x =`cmp $f1 $f2`
if [-z $x]
then
echo “both are same”
else
echo “both are not same”
fi
:wq

# Write a script open the given file?

$vi open_file.sh
echo “enter file name:”
read fname
if [-e $fname]
then
if [-f $fname]
then
if [-r $fname]
then
cat $fname
else
chmod u+r $fname
cat $fname
else “it is not a regular file”
else
echo “It is not a regualr file”
else
echo “file doesnt exist”
fi
:wq

# Write a script to check the given type of file?
$vi type_file.sh
echo “enter file name:”
read fname
if [ -e $fname ]
then
if [ -f $fname ]
then
echo “regular file “
else
echo “directory file “
elif [ -l $fname ]
then
echo ” link file”
elif [ -b $fname -o -c $fname ]
then
echo “Device file”
fi
:wq

# Write a script to check the given user is logged in or not?

$vi delete.sh
echo “enter login name:”
read lname
if who | grep $lname > /dev/null
then
echo “user is online”
else
echo “user is offline “
fi
:wq

# Write a script to delete the given file?
$vi delete.sh
echo “enter filename:”
read fname
if rm $fname
then
echo “deleted the file”
else
echo “file is not existed “
fi
:wq

# Write a script to accept two strings and check the given strings are equal or not?

$vi two_strings.sh
echo “enter first string:”
read s1
echo “enter second string:”
read s2
if [ "$s1" = "$s2" ]
then
echo “equal”
else
echo “not equal”
fi
:wq
# Write a script to check the given number is even or odd number?

$vi even_odd.sh
echo “enter a number: “
read n
if [ `expr $n % 2` = 0 ]
then
echo “entered number : even number”
else
echo “entered number : odd number “
fi
:wq

#Write a script to check the given number is positive or negative?

$vi pos_neg.sh
echo “enter a number: “
read n
if [ $n gt 0 ]
then
echo “entered number : positive number”
else
echo “entered number : negative number”
fi
:wq