Talend ETL Tutorial : How To Read Multiple Files Using Open Studio Designer

Praveen Singh

So far, we have read a single Excel file and a single delimited file. In this Talend tutorial for beginners, we will read multiple files using the tFileList component.

Requirement :

The requirement is to read multiple text files present in a folder. In this demo, we will pass each file in the folder as input, read its contents and display them on the console.

Steps To Read Multiple Files At Once :

  • Let's have a look at our input files first. For this demo, I have created a folder and put four files in it. You can write anything in them; all of their contents will be displayed on the console once we run the job.
  • As far as the contents are concerned, I have put just one line in each input file.

  • The final job design is shown in the screenshot below. I have created a new job, HowToReadMultipleFiles, in the same way as in the previous tutorials. I have used only 3 components here: tFileList_1, tFileInputDelimited_1 and tLogRow_1.

  • First things first: to read multiple files, we have to configure the tFileList_1 component. Browse to the directory where your files are present. Next, select the FileList type; the drop-down list offers 3 options (Files, Directories and Both). Select the one that fits your requirement.
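
To make the three FileList type options more concrete, here is a minimal plain-Java sketch of what choosing Files, Directories or Both amounts to when listing a folder. It is not the code Talend generates; the class name FileListTypeSketch, the ListType enum and the temporary folder contents are all hypothetical names made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class FileListTypeSketch {

    // Hypothetical enum mirroring the three options in the drop-down list.
    enum ListType { FILES, DIRECTORIES, BOTH }

    // Returns the entries directly under 'dir' that match the chosen type, sorted by path.
    static List<Path> list(Path dir, ListType type) throws IOException {
        List<Path> result = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.sorted().forEach(entry -> {
                boolean isDirectory = Files.isDirectory(entry);
                if (type == ListType.BOTH
                        || (type == ListType.FILES && !isDirectory)
                        || (type == ListType.DIRECTORIES && isDirectory)) {
                    result.add(entry);
                }
            });
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway folder with two files and one sub-folder (illustrative data only).
        Path dir = Files.createTempDirectory("input_files");
        Files.write(dir.resolve("file1.txt"), "first line".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("file2.txt"), "second line".getBytes(StandardCharsets.UTF_8));
        Files.createDirectory(dir.resolve("archive"));

        for (ListType type : ListType.values()) {
            System.out.println(type + " -> " + list(dir, type));
        }
    }
}
```

For a job that only reads text files, as in this tutorial, the Files option is the natural choice, since a sub-folder cannot be opened as an input file.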

Global Parameter To Consider While Using tFileList_1 :

As we are reading the files dynamically, we have to pass dynamic values instead of a fixed file name.
Global parameters solve this problem. Based on your requirement, you can take any of the values below and use it to refer to the files dynamically.

  • This variable holds the current file name without the complete path.

  • This variable holds the current file name with the complete path.

  • This variable holds the current file's extension; if the file has no extension, it returns null.

  • This variable holds only the directory path, excluding the file name.

  • This variable holds the number of files listed by tFileList.
  • In my job design, I use ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")) because the file name must be accompanied by its directory path; the system must know where your files are located. This parameter is used in the tFileInputDelimited component, as this is where we specify the input file name and path. If you want to read Excel files, you can use tFileInputExcel in place of tFileInputDelimited.

  • You don't have to make any changes in tLogRow. Just link this component to the tFileInputDelimited component and the rest will be taken care of automatically.
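
The global parameters described above can be pictured as entries in a map that is refreshed for every file in the loop. The sketch below simulates that in plain Java; it is not Talend's generated code. Only the key tFileList_1_CURRENT_FILEPATH comes from this tutorial. The other four key names are my assumption based on how tFileList usually names its variables, so verify them in your own Studio. The null for a missing extension follows the description above and may differ from what the real component returns.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {

    // Fills a map for one file, following the descriptions in the tutorial.
    // Key names other than tFileList_1_CURRENT_FILEPATH are assumptions.
    static Map<String, Object> describe(File file, int fileCount) {
        Map<String, Object> globalMap = new HashMap<>();
        String name = file.getName();
        int dot = name.lastIndexOf('.');

        globalMap.put("tFileList_1_CURRENT_FILE", name);                        // name without path
        globalMap.put("tFileList_1_CURRENT_FILEPATH", file.getAbsolutePath());  // name with full path
        globalMap.put("tFileList_1_CURRENT_FILEEXTENSION",
                dot >= 0 ? name.substring(dot + 1) : null);                     // null when no extension
        globalMap.put("tFileList_1_CURRENT_FILEDIRECTORY",
                file.getAbsoluteFile().getParent());                            // directory only
        globalMap.put("tFileList_1_NB_FILE", fileCount);                        // number of files listed
        return globalMap;
    }

    public static void main(String[] args) {
        // Hypothetical file name and count, used only to show the lookups.
        Map<String, Object> globalMap = describe(new File("input_files", "file1.txt"), 4);

        // Same cast-and-get pattern as the expression used in tFileInputDelimited.
        String filePath = ((String) globalMap.get("tFileList_1_CURRENT_FILEPATH"));
        System.out.println("File to read: " + filePath);
        System.out.println("Extension: " + globalMap.get("tFileList_1_CURRENT_FILEEXTENSION"));
    }
}
```

The cast to String is needed because the map stores plain Object values, which is why the expression in the job is written with ((String) ...) around the lookup.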

Talend Components Used In the Design :


tFileList iterates on the files or folders of a set directory. It retrieves a set of files or folders based on a filemask pattern and iterates on each item.

Get more details from here : tFileList Component Detail
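
As a rough illustration of filemask-based iteration, the following plain-Java sketch keeps only the file names in a folder that match a glob pattern such as *.txt. It relies on the JDK's own glob matching, which may not behave exactly like tFileList's filemask; the class name FileMaskSketch and the sample file names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FileMaskSketch {

    // Returns the names of the entries in 'dir' whose file name matches the glob 'mask'.
    static List<String> matching(Path dir, String mask) throws IOException {
        List<String> names = new ArrayList<>();
        // newDirectoryStream(dir, glob) filters entries by file name using glob syntax.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, mask)) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names); // directory order is unspecified, so sort for stable output
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative folder: two text files and one file the mask should skip.
        Path dir = Files.createTempDirectory("input_files");
        Files.write(dir.resolve("file1.txt"), "first line".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("file2.txt"), "second line".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("notes.csv"), "skipped".getBytes(StandardCharsets.UTF_8));

        // Iterate on each matching item, one at a time.
        for (String name : matching(dir, "*.txt")) {
            System.out.println("Current file: " + name);
        }
    }
}
```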


tFileInputDelimited reads a given file row by row with simple separated fields. It opens a file, reads it row by row, splits each row into fields, and sends the fields as defined in the Schema to the next Job component via a Row link.

Get more details from here : tFileInputDelimited Component
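
The row-by-row behaviour can be sketched in plain Java as below: open a file, read one line at a time, and split each line into fields on a separator. This is only an illustration, not the component's implementation. The ";" separator, the class name DelimitedReadSketch and the sample rows are assumptions, since the tutorial does not show which separator the job uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DelimitedReadSketch {

    // Reads 'file' line by line and splits every line into fields on 'separator'.
    static List<String[]> readRows(Path file, String separator) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Pattern.quote treats the separator literally; -1 keeps trailing empty fields.
                rows.add(line.split(Pattern.quote(separator), -1));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative input file with two rows of two fields each.
        Path file = Files.createTempFile("file1", ".txt");
        Files.write(file, Arrays.asList("1;first line", "2;second line"), StandardCharsets.UTF_8);

        // Print each row to the console, field by field.
        for (String[] row : readRows(file, ";")) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```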


tLogRow displays data or results in the Run console. It is used to monitor the data being processed.

Get more details from here : tLogRow Component Details

The Final Execution Of Job :

Published by Praveen Singh




© 2015 Techie's House.