This flow shows how to convert a CSV entry to a JSON document using ExtractText and ReplaceText.

Template name: CsvToJSON
Process group: 0ef4dff7-1324-40b8-9ae8-db3eb4ccd2af

Connections (each with back pressure thresholds of 0 MB and 0 objects, and a FlowFile expiration of 0 sec):
  a06404c1-34a4-48e9-869d-ae67511e1d34
    Relationship: success
    Destination: processor af418612-6920-4348-a892-d5e8ebae7e1f
  00e6e639-ae9b-4c26-b6d1-bc132c28d2fe
    Relationship: success
    Destination: processor 5d4cd549-2624-49cc-a82d-f58a9575a471
  99f0c5c8-2073-48ae-97e9-1111392a752f
    Relationship: success
    Destination: processor f6929630-aabd-4047-8199-45d529ba00dc
  8ebcbe04-54a7-442f-bfaa-74572a86a558
    Relationship: matched
    Destination: processor f60357a4-9506-4b72-a477-73e45e44bfaf

ExtractText processor (af418612-6920-4348-a892-d5e8ebae7e1f)
Bulletin level: WARN
Concurrent tasks: 1

Supported properties (ExtractText):
  Character Set (default: UTF-8; required)
    The Character Set in which the file is encoded.
  Maximum Buffer Size (default: 1 MB; required)
    Specifies the maximum amount of data to buffer (per file) in order to apply the
    regular expressions. Files larger than the specified maximum will not be fully
    evaluated.
  Maximum Capture Group Length (default: 1024; optional)
    Specifies the maximum number of characters a given capture group value can
    have. Any characters beyond the maximum will be truncated.
  Enable Canonical Equivalence (true or false; default: false)
    Indicates that two characters match only when their full canonical
    decompositions match.
  Enable Case-insensitive Matching (true or false; default: false)
    Indicates that two characters match even if they are in a different case. Can
    also be specified via the embedded flag (?i).
  Permit Whitespace and Comments in Pattern (true or false; default: false)
    In this mode, whitespace is ignored, and embedded comments starting with # are
    ignored until the end of a line. Can also be specified via the embedded flag (?x).
  Enable DOTALL Mode (true or false; default: false)
    Indicates that the expression '.' should match any character, including a line
    terminator. Can also be specified via the embedded flag (?s).
  Enable Literal Parsing of the Pattern (true or false; default: false)
    Indicates that metacharacters and escape characters should be given no special
    meaning.
  Enable Multiline Mode (true or false; default: false)
    Indicates that '^' and '$' should match just after and just before a line
    terminator or end of sequence, instead of only the beginning or end of the
    entire input. Can also be specified via the embedded flag (?m).
  Enable Unicode-aware Case Folding (true or false; default: false)
    When used with 'Enable Case-insensitive Matching', matches in a manner
    consistent with the Unicode Standard. Can also be specified via the embedded
    flag (?u).
  Enable Unicode Predefined Character Classes (true or false; default: false)
    Specifies conformance with the Unicode Technical Standard #18: Unicode Regular
    Expression Annex C: Compatibility Properties. Can also be specified via the
    embedded flag (?U).
  Enable Unix Lines Mode (true or false; default: false)
    Indicates that only the '\n' line terminator is recognized in the behavior of
    '.', '^', and '$'. Can also be specified via the embedded flag (?d).
  Include Capture Group 0 (true or false; default: true)
    Indicates that Capture Group 0 should be included as an attribute. Capture
    Group 0 represents the entirety of the regular expression match, is typically
    not used, and could have considerable length.
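Several of the flags described above can also be written inline in the pattern. The following is a minimal sketch of what four of them do, using Python's `re` module as a stand-in for the Java regular expression engine that NiFi runs on. The inline forms (?i), (?s), (?m) and (?x) behave analogously in both engines; (?d), (?u) and (?U) as described above are Java-specific and are not shown. The sample text and the `flag_demo` helper are hypothetical.

```python
import re

SAMPLE = "Alpha,beta\ngamma,DELTA"  # hypothetical two-line input


def flag_demo(text=SAMPLE):
    """Compare each pattern with and without its inline flag."""
    return {
        # (?i): match regardless of letter case
        "plain_case": re.search(r"alpha", text) is not None,
        "inline_i": re.search(r"(?i)alpha", text) is not None,
        # (?s): let '.' match the line terminator as well
        "plain_dot": re.search(r"beta.gamma", text) is not None,
        "inline_s": re.search(r"(?s)beta.gamma", text) is not None,
        # (?m): let '^' match at the start of every line
        "plain_anchor": re.findall(r"^\w+", text),
        "inline_m": re.findall(r"(?m)^\w+", text),
        # (?x): ignore whitespace and '#' comments inside the pattern
        "inline_x": re.search(r"""(?x)
            (\w+)  # first field
            ,      # separator
            (\w+)  # second field
        """, text).groups(),
    }


if __name__ == "__main__":
    for name, value in flag_demo().items():
        print(name, value)
```

In this flow all of the flag properties are left at false, so the pattern is evaluated with the engine's default behavior.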
Configured values (ExtractText):
  Character Set: UTF-8
  Maximum Buffer Size: 1 MB
  Maximum Capture Group Length: 1024
  Enable Canonical Equivalence: false
  Enable Case-insensitive Matching: false
  Permit Whitespace and Comments in Pattern: false
  Enable DOTALL Mode: false
  Enable Literal Parsing of the Pattern: false
  Enable Multiline Mode: false
  Enable Unicode-aware Case Folding: false
  Enable Unicode Predefined Character Classes: false
  Enable Unix Lines Mode: false
  Include Capture Group 0: false
  csv (user-defined property): (.+),(.+),(.+),(.+)

Scheduling (ExtractText):
  Scheduling strategy: TIMER_DRIVEN
  Run schedule: 0 sec
  Run duration: 0
  Penalty duration: 30 sec
  Yield duration: 1 sec
  Loss tolerant: false

Relationships (ExtractText):
  matched (auto-terminated: false)
    FlowFiles are routed to this relationship when the Regular Expression is
    successfully evaluated and the FlowFile is modified as a result.
  unmatched (auto-terminated: true)
    FlowFiles are routed to this relationship when no provided Regular Expression
    matches the content of the FlowFile.
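To make the effect of the csv property concrete, here is a minimal sketch of the capture step in Python. It is not NiFi's implementation: `extract_text` is a hypothetical helper, and naming group 0 as `csv.0` when it is included is an assumption. The attribute names csv.1 to csv.4 are the ones the flow's JSON replacement value refers to, and the sample entry a,b,c,d is the one the flow itself generates.

```python
import re


def extract_text(content, prop_name, pattern, include_group_zero=False):
    """Return {"<prop_name>.<n>": group n} for the first match, or None.

    None stands in for routing the FlowFile to 'unmatched'.
    """
    match = re.search(pattern, content)
    if match is None:
        return None
    first = 0 if include_group_zero else 1
    return {
        f"{prop_name}.{index}": match.group(index)
        for index in range(first, match.re.groups + 1)
    }


if __name__ == "__main__":
    print(extract_text("a,b,c,d", "csv", r"(.+),(.+),(.+),(.+)"))
    print(extract_text("a,b,c", "csv", r"(.+),(.+),(.+),(.+)"))
    print(extract_text("a,b,c,d,e", "csv", r"(.+),(.+),(.+),(.+)"))
```

Because every group is a greedy (.+), an entry with more than four fields still matches: the first group absorbs the extra commas, so a,b,c,d,e yields a,b for the first group. An entry with fewer than four fields does not match at all.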
ExtractText (af418612-6920-4348-a892-d5e8ebae7e1f) summary: relationships matched and unmatched; state RUNNING; type org.apache.nifi.processors.standard.ExtractText.

ReplaceText processor (f60357a4-9506-4b72-a477-73e45e44bfaf)
Bulletin level: WARN
Concurrent tasks: 1

Supported properties (ReplaceText):
  Regular Expression (default: (?s:^.*$); required; supports Expression Language)
    The Regular Expression to search for in the FlowFile content.
  Replacement Value (default: $1; required; supports Expression Language)
    The value to replace the regular expression with. Back-references to Regular
    Expression capturing groups are supported, but back-references that reference
    capturing groups that do not exist in the regular expression will be treated as
    a literal value.
  Character Set (default: UTF-8; required)
    The Character Set in which the file is encoded.
  Maximum Buffer Size (default: 1 MB; required)
    Specifies the maximum amount of data to buffer (per file or per line, depending
    on the Evaluation Mode) in order to apply the regular expressions. If 'Entire
    Text' (in Evaluation Mode) is selected and the FlowFile is larger than this
    value, the FlowFile will be routed to 'failure'. In 'Line-by-Line' Mode, if a
    single line is larger than this value, the FlowFile will be routed to 'failure'.
    A default value of 1 MB is provided, primarily for 'Entire Text' mode. In
    'Line-by-Line' Mode, a value such as 8 KB or 16 KB is suggested. This value is
    ignored and the buffer is not used if 'Regular Expression' is set to '.*'.
  Evaluation Mode (Line-by-Line or Entire text; default: Entire text; required)
    Evaluate the 'Regular Expression' against each line (Line-by-Line) or buffer
    the entire file into memory (Entire Text) and then evaluate the 'Regular
    Expression'.
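Configured values (ReplaceText f60357a4-9506-4b72-a477-73e45e44bfaf):
  Regular Expression: (?s:^.*$)
  Replacement Value:
    { "field1" : "${csv.1}", "field2" : "${csv.2}",
    "field3" : "${csv.3}", "field4" : "${csv.4}" }
  Character Set: UTF-8
  Maximum Buffer Size: 1 MB
  Evaluation Mode: Entire text

Scheduling (ReplaceText f60357a4-9506-4b72-a477-73e45e44bfaf):
  Scheduling strategy: TIMER_DRIVEN
  Run schedule: 0 sec
  Run duration: 0
  Penalty duration: 30 sec
  Yield duration: 1 sec
  Loss tolerant: false

Relationships (ReplaceText):
  failure (auto-terminated: true)
    FlowFiles that could not be updated are routed to this relationship.
  success (auto-terminated: false)
    FlowFiles that have been successfully updated are routed to this relationship,
    as well as FlowFiles whose content does not match the given Regular Expression.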
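The replacement step can be sketched in Python as follows. This is an illustration under stated assumptions, not NiFi's Expression Language: the hypothetical `render` helper resolves only plain ${name} references, and treating a missing attribute as an empty string is an assumption. `replace_entire_text` mirrors the configured pattern (?s:^.*$), which matches the whole content, so the content is swapped for the rendered value in one step.

```python
import json
import re

# Replacement Value as configured in the flow (including its line break).
TEMPLATE = (
    '{ "field1" : "${csv.1}", "field2" : "${csv.2}",\n'
    '"field3" : "${csv.3}", "field4" : "${csv.4}" }'
)


def render(template, attributes):
    """Substitute each ${name} reference with the matching attribute value."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: attributes.get(m.group(1), ""),  # assumption: missing -> ""
        template,
    )


def replace_entire_text(content, replacement):
    """Replace the whole content, as (?s:^.*$) does in 'Entire text' mode."""
    # A function is used as the replacement so its text is inserted literally.
    return re.sub(r"(?s:^.*$)", lambda m: replacement, content, count=1)


if __name__ == "__main__":
    attributes = {"csv.1": "a", "csv.2": "b", "csv.3": "c", "csv.4": "d"}
    new_content = replace_entire_text("a,b,c,d", render(TEMPLATE, attributes))
    print(new_content)
    print(json.loads(new_content))
```

Note that this is plain string substitution: a field value containing a double quote or a backslash would be inserted unescaped and the result would no longer parse as JSON.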
ReplaceText (f60357a4-9506-4b72-a477-73e45e44bfaf) summary: relationships failure and success; state RUNNING; type org.apache.nifi.processors.standard.ReplaceText.

UpdateAttribute processor (5d4cd549-2624-49cc-a82d-f58a9575a471)
Bulletin level: WARN
Concurrent tasks: 1
Configured values: none

Scheduling (UpdateAttribute):
  Scheduling strategy: TIMER_DRIVEN
  Run schedule: 0 sec
  Run duration: 0
  Penalty duration: 30 sec
  Yield duration: 1 sec
  Loss tolerant: false

Relationships (UpdateAttribute):
  success (auto-terminated: true)
    All FlowFiles are routed to this relationship.

State: RUNNING
Type: org.apache.nifi.processors.attributes.UpdateAttribute

ReplaceText processor (f6929630-aabd-4047-8199-45d529ba00dc)
Bulletin level: WARN
Concurrent tasks: 1

Supported properties: Regular Expression, Replacement Value, Character Set, Maximum Buffer Size, and Evaluation Mode, with the same defaults and descriptions as for the ReplaceText processor f60357a4-9506-4b72-a477-73e45e44bfaf.
Configured values (ReplaceText f6929630-aabd-4047-8199-45d529ba00dc):
  Regular Expression: (?s:^.*$)
  Replacement Value: a,b,c,d
  Character Set: UTF-8
  Maximum Buffer Size: 1 MB
  Evaluation Mode: Entire text

Scheduling (ReplaceText f6929630-aabd-4047-8199-45d529ba00dc):
  Scheduling strategy: TIMER_DRIVEN
  Run schedule: 0 sec
  Run duration: 0
  Penalty duration: 30 sec
  Yield duration: 1 sec
  Loss tolerant: false

Relationships (ReplaceText f6929630-aabd-4047-8199-45d529ba00dc):
  failure (auto-terminated: true)
    FlowFiles that could not be updated are routed to this relationship.
  success (auto-terminated: false)
    FlowFiles that have been successfully updated are routed to this relationship,
    as well as FlowFiles whose content does not match the given Regular Expression.

State: RUNNING
Type: org.apache.nifi.processors.standard.ReplaceText

GenerateFlowFile processor (7063134a-d245-4cf2-8d07-5f252ab40c85)
Bulletin level: WARN
Concurrent tasks: 1

Supported properties (GenerateFlowFile):
  File Size (required)
    The size of the file that will be used.
  Batch Size (default: 1; required)
    The number of FlowFiles to be transferred in each invocation.
  Data Format (Binary or Text; default: Binary; required)
    Specifies whether the data should be Text or Binary.
  Unique FlowFiles (true or false; default: false; required)
    If true, each FlowFile that is generated will be unique. If false, a random
    value will be generated and all FlowFiles will get the same content, which
    offers much higher throughput.

Configured values (GenerateFlowFile):
  File Size: 1 b
  Batch Size: 1
  Data Format: Binary
  Unique FlowFiles: false

Scheduling (GenerateFlowFile):
  Scheduling strategy: TIMER_DRIVEN
  Run schedule: 10 sec
  Run duration: 0
  Penalty duration: 30 sec
  Yield duration: 1 sec
  Loss tolerant: false

Relationships (GenerateFlowFile):
  success (auto-terminated: false)

State: RUNNING
Type: org.apache.nifi.processors.standard.GenerateFlowFile

Template timestamp: 09/22/2015 09:09:59 EDT