Hi @Baker, Rick
Thanks for the question and using MS Q&A platform.
To accomplish your objective, you'll need to begin by copying the list of files as an array. Then, you can utilize a data flow to transform it into a format that lists each file name row by row.
First, use the Get Metadata activity
to retrieve the list of files in your blob container. Then, filter the list to only include image files. Next, use a For Each
loop to iterate through the filtered list and store the file names in an append variable.
Create a dummy file with only one value and use the Copy Activity
to append the append variable
value to it as an additional column.
Finally, use a data flow to transform the array of file names into a row-wise format. Use the Derived Column transformation to convert the array into a proper array format using the expression split(replace(replace(Images,'[',''),']',''),',')
. Then, use the Flatten
transformation to transform the array into rows.
Below is the pipeline JSON code:
{
"name": "pipeline1",
"properties": {
"activities": [
{
"name": "Get Metadata1",
"type": "GetMetadata",
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"dataset": {
"referenceName": "Binary1",
"type": "DatasetReference"
},
"fieldList": [
"childItems"
],
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "BinaryReadSettings"
}
}
},
{
"name": "Filter1",
"type": "Filter",
"dependsOn": [
{
"activity": "Get Metadata1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "@activity('Get Metadata1').output.childItems",
"type": "Expression"
},
"condition": {
"value": "@contains(item().name,'.jpg')",
"type": "Expression"
}
}
},
{
"name": "ForEach1",
"type": "ForEach",
"dependsOn": [
{
"activity": "Filter1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "@activity('Filter1').output.value",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Append variable1",
"type": "AppendVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "filenames",
"value": {
"value": "@item().name",
"type": "Expression"
}
}
}
]
}
},
{
"name": "Copy data1",
"type": "Copy",
"dependsOn": [
{
"activity": "ForEach1",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"additionalColumns": [
{
"name": "ImageNames",
"value": {
"value": "@string(variables('filenames'))",
"type": "Expression"
}
}
],
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"recursive": true,
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "DelimitedTextReadSettings"
}
},
"sink": {
"type": "DelimitedTextSink",
"storeSettings": {
"type": "AzureBlobStorageWriteSettings",
"copyBehavior": "MergeFiles"
},
"formatSettings": {
"type": "DelimitedTextWriteSettings",
"quoteAllText": true,
"fileExtension": ".txt"
}
},
"enableStaging": false,
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"name": "ImageNames",
"type": "String"
},
"sink": {
"name": "Images",
"physicalType": "String"
}
}
],
"typeConversion": true,
"typeConversionSettings": {
"allowDataTruncation": true,
"treatBooleanAsNumber": false
}
}
},
"inputs": [
{
"referenceName": "DelimitedText1",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "DelimitedText2",
"type": "DatasetReference"
}
]
},
{
"name": "Data flow1",
"type": "ExecuteDataFlow",
"dependsOn": [
{
"activity": "Copy data1",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"dataflow": {
"referenceName": "dataflow1",
"type": "DataFlowReference"
},
"compute": {
"coreCount": 8,
"computeType": "General"
},
"traceLevel": "Fine"
}
}
],
"variables": {
"filenames": {
"type": "Array"
}
},
"annotations": []
}
}
In the example provided above, I have used an image file in JPG format.
This approach should allow you to create a file containing the names of all the image files in your blob container in a row-wise format.
Hope this helps. Do let us know if you any further queries.
If this answers your query, do click Accept Answer
and Yes
for was this answer helpful. And, if you have any further query do let us know.