Data Factory > protocol-ftp-get β
TL;DR;
This task retrieves one or more files from an FTP, SFTP, or FTPS server.
Name: protocol-ftp-get
Max execution time: 120 mins
Target β
Recover one or more files from an FTP/SFTP server
Example of use in a job β
- Get a taxonomy file in XML format and update the structure of a table
  1. FTP Get: Retrieve the last file ending with -table.xml
  2. Import Table: Import the table1
- Retrieve a taxonomy file from a Zip archive and update the structure of a table
  1. FTP Get: Retrieve the last file ending with -table.zip
  2. Unzip: Uncompress the archive
  3. Import Table: Import the table1
Inputs and Outputs β
```json
{
  "name": "protocol-ftp-get",
  "taskReferenceName": "ftp_get",
  "description": "The business description of the task",
  "type": "SUB_WORKFLOW",
  "optional": false,
  "inputParameters": {
    "connection": "SFTP",
    "host": "",
    "username": "main_user",
    "password": "*****",
    "port": 21,
    "mode": "PARAMETERS",
    "remoteFolder": "/images/",
    "sort": "LAST_MODIFIED_DESC",
    "filter": "-table\\.xml$",
    "maxFiles": 1
  }
}
```
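Note that the `filter` value is a regex stored inside a JSON string, so the backslash is doubled in the task definition. The following minimal sketch uses Python's standard `json` and `re` modules to show the pattern that results once the JSON is decoded. The `matches` helper is hypothetical, and it assumes the regex is searched anywhere in the file name; the task's actual regex engine and matching rules may differ.

```python
import json
import re

# Excerpt of the task definition above, as it would appear in a JSON file.
task_json = r'''
{
  "name": "protocol-ftp-get",
  "inputParameters": {
    "filter": "-table\\.xml$",
    "maxFiles": 1
  }
}
'''

task = json.loads(task_json)
pattern = task["inputParameters"]["filter"]
print(pattern)  # the decoded regex has a single backslash: -table\.xml$


def matches(file_name: str) -> bool:
    """Hypothetical helper: assumes the filter is searched within the file name."""
    return re.search(pattern, file_name) is not None


print(matches("2020-taxonomy-table.xml"))      # file name ends with -table.xml
print(matches("2020-taxonomy-table.xml.bak"))  # the $ anchor rejects the .bak suffix
```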
Inputs β
TIP
If at least one task parameter (whether mandatory or not) is invalid, task execution is stopped and the returned status is FAILED. For example:
- The value provided for the connection property is invalid.
| Property | Description |
|---|---|
| connection | Mandatory - Enum - FTP, SFTP, FTPS |
| authenticationMethod | Enum - PASSWORD, PUBLIC_KEY (only available if connection = SFTP) - Default: PASSWORD. If connection != SFTP and authenticationMethod = PUBLIC_KEY, the task fails. |
| host | Mandatory - String |
| privateKey | Mandatory if authenticationMethod = PUBLIC_KEY - String. It is strongly recommended to use a secret variable to store the private key. |
| passphrase | String. Only taken into account if authenticationMethod = PUBLIC_KEY. |
| username | Mandatory - String |
| password | Mandatory if connection != SFTP - String. If connection = SFTP and password is set, an invalid input error is added. |
| port | Mandatory - Number |
| mode | Mandatory - Enum - PARAMETERS, REQUEST. PARAMETERS: retrieves one or more files using the parameters described below. REQUEST: retrieves the files listed in an XML document. |
| remoteFolder | Mandatory if mode = PARAMETERS - String - Default value: /. The folder from which the file(s) should be retrieved. |
| sort | Mandatory if mode = PARAMETERS - Enum - LAST_MODIFIED_ASC, LAST_MODIFIED_DESC, ALPHABETICAL_ASC, ALPHABETICAL_DESC, SIZE_ASC, SIZE_DESC. LAST_MODIFIED_ASC: date of last modification, ascending. LAST_MODIFIED_DESC: date of last modification, descending. ALPHABETICAL_ASC: alphabetical order, ascending. ALPHABETICAL_DESC: alphabetical order, descending. SIZE_ASC: file size, ascending. SIZE_DESC: file size, descending. |
| filter | Mandatory if mode = PARAMETERS - String. A regex (example: `-products\.xml$`) |
| maxFiles | Mandatory if mode = PARAMETERS - Number. The maximum number of files to retrieve. An integer greater than or equal to 1. |
| request | Mandatory if mode = REQUEST - File. See below. |
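To illustrate how `filter`, `sort`, and `maxFiles` combine in PARAMETERS mode, here is a minimal local simulation. It is a sketch and not the task's implementation: the `RemoteFile` type and the `select_files` function are hypothetical, and it assumes the regex is searched within the file name using Python's `re` dialect.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical stand-in for an entry of the remote folder."""
    name: str
    modified: datetime
    size: int


# Each sort value maps to (sort key, reverse flag).
SORT_KEYS = {
    "LAST_MODIFIED_ASC": (lambda f: f.modified, False),
    "LAST_MODIFIED_DESC": (lambda f: f.modified, True),
    "ALPHABETICAL_ASC": (lambda f: f.name, False),
    "ALPHABETICAL_DESC": (lambda f: f.name, True),
    "SIZE_ASC": (lambda f: f.size, False),
    "SIZE_DESC": (lambda f: f.size, True),
}


def select_files(files, filter_regex, sort, max_files):
    """Keep files matching the regex, order them, then keep the first max_files."""
    if sort not in SORT_KEYS:
        raise ValueError(f"invalid sort: {sort}")
    if max_files < 1:
        raise ValueError("maxFiles must be greater than or equal to 1")
    pattern = re.compile(filter_regex)
    matching = [f for f in files if pattern.search(f.name)]
    key, reverse = SORT_KEYS[sort]
    return sorted(matching, key=key, reverse=reverse)[:max_files]


folder = [
    RemoteFile("2020-01-table.xml", datetime(2020, 1, 5), 1200),
    RemoteFile("2020-02-table.xml", datetime(2020, 2, 5), 900),
    RemoteFile("2020-02-products.xml", datetime(2020, 2, 9), 4000),
]

latest = select_files(folder, r"-table\.xml$", "LAST_MODIFIED_DESC", 1)
print([f.name for f in latest])  # ['2020-02-table.xml']
```

With `LAST_MODIFIED_DESC` and `maxFiles` set to 1, only the most recently modified matching file is kept, which corresponds to the "last file ending with -table.xml" use case.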
TIP
The supported private key formats are identified by the following headers:
- `---- BEGIN SSH2 PUBLIC KEY ----`
- `-----BEGIN RSA PRIVATE KEY-----`
- `-----BEGIN DSA PRIVATE KEY-----`
- `-----BEGIN EC PRIVATE KEY-----`
- `PuTTY-User-Key-File-2:` (ssh-rsa)
- `PuTTY-User-Key-File-2:` (ssh-dss)
- `-----BEGIN OPENSSH PRIVATE KEY-----`
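Before storing a key in a secret variable, it can be useful to check locally that its first line is one of the headers listed above. The sketch below is a hypothetical pre-check, not part of the task: it only compares the first line and does not validate the key itself. It assumes that PuTTY keys start with `PuTTY-User-Key-File-2: ` followed by the algorithm name (`ssh-rsa` or `ssh-dss`).

```python
# First lines accepted by this pre-check, taken from the list above.
SUPPORTED_FIRST_LINES = (
    "---- BEGIN SSH2 PUBLIC KEY ----",
    "-----BEGIN RSA PRIVATE KEY-----",
    "-----BEGIN DSA PRIVATE KEY-----",
    "-----BEGIN EC PRIVATE KEY-----",
    "PuTTY-User-Key-File-2: ssh-rsa",  # assumption: algorithm follows the colon
    "PuTTY-User-Key-File-2: ssh-dss",  # assumption: algorithm follows the colon
    "-----BEGIN OPENSSH PRIVATE KEY-----",
)


def has_supported_header(key_text: str) -> bool:
    """Return True if the first non-empty line is a listed header."""
    lines = key_text.strip().splitlines()
    return bool(lines) and lines[0].strip() in SUPPORTED_FIRST_LINES


openssh_key = "-----BEGIN OPENSSH PRIVATE KEY-----\n...\n-----END OPENSSH PRIVATE KEY-----\n"
pkcs8_key = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"

print(has_supported_header(openssh_key))  # header is in the list
print(has_supported_header(pkcs8_key))    # header is not in the list
```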
Outputs β
TIP
The outputs of the task defined below are always available if the task completes its execution (status COMPLETED or FAILED). If the task is stopped before the end of its execution (for example, if it is canceled by the user, CANCELED status, or if it exceeds the authorized execution time, TIMED_OUT status), the outputs are not available.
| Property | Description |
|---|---|
| listing | File. XML file that lists the files retrieved from the FTP (present regardless of the mode used). |
| noFile | Enum - YES, NO. YES if no file matches what is to be retrieved, otherwise NO. |
| allFilesRecovered | Enum - YES, NO. YES: all the files to be recovered have been recovered. |
| files | Array of File. The list of files retrieved (present regardless of the mode used). The order is not guaranteed. |
| file | File. The first file retrieved. If mode = PARAMETERS: the first file (respecting the sort parameter). If mode = REQUEST: the first available file in the XML order. |
Details regarding the request document β
```xml
<Ftp-Get>
  <File>
    <Path>/products-1.csv</Path>
  </File>
  <File>
    <Path>/sub-folder/products-2.csv</Path>
  </File>
</Ftp-Get>
```
| XPath | Description | Occurrence |
|---|---|---|
| Ftp-Get | Root | 1 |
| File | A file to be recovered | 0..* |
| Path | The complete path with the extension. Subfolders are separated by the / character. Must start with / | 1 |
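A request document following this structure can be generated rather than written by hand. The sketch below uses Python's standard `xml.etree.ElementTree` module; the `build_request` function is a hypothetical helper that also enforces the rule that each path starts with `/`. How the resulting file is passed to the `request` input is not covered here.

```python
import xml.etree.ElementTree as ET


def build_request(paths):
    """Build an <Ftp-Get> request document listing the given remote paths."""
    root = ET.Element("Ftp-Get")
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path}")
        file_el = ET.SubElement(root, "File")
        ET.SubElement(file_el, "Path").text = path
    ET.indent(root)  # pretty-printing, available in Python 3.9+
    return ET.tostring(root, encoding="unicode")


print(build_request(["/products-1.csv", "/sub-folder/products-2.csv"]))
```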
Details regarding the listing document β
```xml
<Files>
  <File>
    <Url>https://app.product-live.com/files-data-factory/d05a74cf11788d8f3ae9bf0e0e028dde66f0c83005c5e0d1211b0069945c0c11</Url>
    <Path>/products-1.csv</Path>
    <Last-Modified>2020-04-10T13:40:23.83Z</Last-Modified>
    <Size>1954387</Size>
  </File>
  <File>
    <Url>https://app.product-live.com/files-data-factory/fb26911d77fe9a9dc44b111eef5b5db7ca2019c8038445662f29b20c54cb6f29</Url>
    <Path>/sub-folder/products-2.csv</Path>
    <Last-Modified>2020-04-10T13:40:23.83Z</Last-Modified>
    <Size>1954387</Size>
  </File>
</Files>
```
| XPath | Description | Occurrence |
|---|---|---|
| Files | Root | 1 |
| File | A file | 0..* |
| Url | Full, private URL of the file | 1 |
| Path | The full path of the file on the remote server. Subfolders are separated by the / character. Starts with / | 1 |
| Last-Modified | The date of the last modification of the file, expressed in UTC | 1 |
| Size | The size of the file in Kb | 1 |
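The listing document can be read with any XML parser. As a minimal sketch, the following Python code turns a listing into a list of dictionaries using the standard library. The `parse_listing` and `parse_timestamp` names are hypothetical, and the timestamp formats handled are inferred from the example above (with or without fractional seconds, ending in `Z`).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

LISTING = """<Files>
  <File>
    <Url>https://app.product-live.com/files-data-factory/fb26911d77fe9a9dc44b111eef5b5db7ca2019c8038445662f29b20c54cb6f29</Url>
    <Path>/sub-folder/products-2.csv</Path>
    <Last-Modified>2020-04-10T13:40:23.83Z</Last-Modified>
    <Size>1954387</Size>
  </File>
</Files>"""


def parse_timestamp(value):
    """Parse a UTC timestamp such as 2020-04-10T13:40:23.83Z."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {value}")


def parse_listing(xml_text):
    """Return one dictionary per <File> element of the listing."""
    root = ET.fromstring(xml_text)
    return [
        {
            "url": f.findtext("Url"),
            "path": f.findtext("Path"),
            "last_modified": parse_timestamp(f.findtext("Last-Modified")),
            "size": int(f.findtext("Size")),
        }
        for f in root.findall("File")
    ]


for entry in parse_listing(LISTING):
    print(entry["path"], entry["size"], entry["last_modified"].isoformat())
```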
Known limitations β
- The runtime limit of this task is 2 hours (120 minutes). Retrieving a large number of files may exceed this limit, in which case the task does not run to completion.
- For SFTP connections, only the following ciphers are allowed:
aes256-cbc,aes192-cbc,aes128-cbc,blowfish-cbc,3des-cbc,aes128-gcm,aes256-gcm,arcfour256,arcfour128,cast128-cbc,arcfour.
Appendices β
- Jobs example: data-factory-job-collection/By Tasks/protocol-ftp-get