author: doufenghu <[email protected]> 2024-06-01 18:31:17 +0800
committer: doufenghu <[email protected]> 2024-06-01 18:31:17 +0800
commit: 258e0fbdf263ed1edde1964a505387b018933a16
tree: 5980f63340ae935b7eb9287dd821e3bb33fb781d /docs/processor
parent: 0f0f3b17b3036cd2f09ea6ff2737e5b6c555b704
[Improve][docs] Update ClickHouse connector and UDF document.
Diffstat (limited to 'docs/processor')
-rw-r--r-- docs/processor/udf.md 86
1 file changed, 78 insertions, 8 deletions
diff --git a/docs/processor/udf.md b/docs/processor/udf.md
index 5e6bd6a..74fa2d0 100644
--- a/docs/processor/udf.md
+++ b/docs/processor/udf.md
@@ -9,6 +9,7 @@
- [Domain](#domain)
- [Drop](#drop)
- [Eval](#eval)
+- [Flatten](#flatten)
- [From Unix Timestamp](#from-unix-timestamp)
- [Generate String Array](#generate-string-array)
- [GeoIP Lookup](#geoip-lookup)
@@ -180,6 +181,47 @@ If the value of `direction` is `69`, the value of `internal_ip` will be `client_
parameters:
value_expression: 'direction=69 ? client_ip : server_ip'
```
+### Flatten
+Flattens the fields of a nested structure to the top level. Each new field is named by prefixing the field name with the names of the parent struct fields on the path to it, separated by dots by default.
+
+```FLATTEN(filter, lookup_fields, output_fields[, parameters])```
+- filter: optional
+- lookup_fields: optional
+- output_fields: not required
+- parameters: optional
+  - prefix: `<String>` optional. Prefix string for flattened field names. Default is empty.
+  - depth: `<Integer>` optional. Number of nested levels to flatten. Minimum is `1`. Default is `5`.
+  - delimiter: `<String>` optional. The string used to join nested keys. Default is `.`.
+  - json_string_keys: `<Array>` optional. Keys of fields whose values are JSON strings; these values are parsed and then flattened. Default is empty.
+
+Example 1:
+
+Flatten the nested `fields` and `tags` structures in Metrics. If `lookup_fields` is empty, all nested structures are flattened.
+
+```yaml
+  - function: FLATTEN
+    lookup_fields: [tags,fields]
+```
+
+Example 2:
+
+Flatten the nested structure of the session record field `encapsulation` (a JSON string), add the prefix `tunnels`, set the nesting depth to `3`, and use a dot (`.`) as the delimiter.
+```yaml
+  - function: FLATTEN
+    lookup_fields: [encapsulation]
+    parameters:
+      prefix: tunnels
+      depth: 3
+      delimiter: .
+      json_string_keys: [encapsulation]
+```
+Output:
+```json
+{
+  "tunnels.encapsulation.ipv4.client_ip": "192.168.11.12",
+  "tunnels.encapsulation.ipv4.server_ip": "8.8.8.8"
+}
+```
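
Reviewer note: the naming rule described for Flatten can be illustrated with a small Python sketch. This is not the project's implementation; the function name `flatten` and the exact meaning of `depth` (dictionaries are expanded for at most `depth` levels, and anything deeper is kept as-is) are assumptions made only for illustration.

```python
import json


def flatten(record, prefix="", depth=5, delimiter=".", json_string_keys=()):
    """Hypothetical sketch of the FLATTEN naming rule (not the real UDF)."""
    out = {}

    def walk(value, path, level):
        # Assumption: expand nested dicts until `depth` levels were expanded.
        if isinstance(value, dict) and level < depth:
            for key, child in value.items():
                walk(child, path + [key], level + 1)
        else:
            # Leaf (or depth limit reached): join the path with the delimiter.
            out[delimiter.join(path)] = value

    for key, value in record.items():
        if key in json_string_keys and isinstance(value, str):
            value = json.loads(value)  # parse JSON-string fields first
        walk(value, ([prefix] if prefix else []) + [key], 0)
    return out


# Mirrors Example 2: `encapsulation` holds a JSON string.
record = {
    "encapsulation": '{"ipv4": {"client_ip": "192.168.11.12", "server_ip": "8.8.8.8"}}'
}
flat = flatten(record, prefix="tunnels", depth=3, json_string_keys=["encapsulation"])
print(flat)
```

Under these assumptions the sketch yields the same two keys as the documented output of Example 2.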
### From Unix Timestamp
From unix timestamp function is used to convert the unix timestamp to date time string. The default time zone is UTC+0.
@@ -311,20 +353,48 @@ Example:
```
### Rename
-Rename function is used to rename the field name.
+The Rename function renames or reformats field names (for example, by replacing underscores with dots).
-```RENAME(filter, lookup_fields, output_fields)```
+```RENAME(filter, lookup_fields, output_fields, parameters)```
- filter: optional
-- lookup_fields: required
-- output_fields: required
-- parameters: not required
+- lookup_fields: not required
+- output_fields: not required
+- parameters: required
+  - parent_fields: `<Array>` optional. Fields whose child fields are also subject to the `rename_fields` and `rename_expression` operations.
+  - rename_fields: `Map<String, String>` required. Each key is an original field name, and its value is the new field name.
+    - current_field_name: `<String>` required. The original field name.
+    - new_field_name: `<String>` required. The new field name.
+  - rename_expression: `<String>` optional. AviatorScript expression whose return value is used as the new field name.
+
+```
+A single function can include both rename_fields (to rename specific fields) and rename_expression (to rename fields globally). When both are present, rename_fields is applied first.
+```
+Example 1:
+
+Remove the prefix "tags_" from the field names and rename the field "timestamp_ms" to "recv_time_ms".
-Example:
```yaml
   - function: RENAME
-    lookup_fields: [http_domain]
-    output_fields: [server_domain]
+    parameters:
+      rename_fields:
+        - timestamp_ms: recv_time_ms
+      rename_expression: key=string.replace_all(key,'tags_',''); return key;
+
+```
+
+Example 2:
+
+Rename the field `client_ip` to `source_ip`, including the fields under the `encapsulation.ipv4` tunnel.
+
+```yaml
+  - function: RENAME
+    parameters:
+      parent_fields: [encapsulation.ipv4]
+      rename_fields:
+        - client_ip: source_ip
+
```
+Output: `source_ip:192.168.4.1, encapsulation.ipv4.source_ip:192.168.12.12`
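
Reviewer note: the ordering rule (explicit `rename_fields` first, then `rename_expression`) and the effect of `parent_fields` can be sketched in Python. This is only an illustration under assumptions: the function `rename` is hypothetical, the record is assumed to use flat dotted keys, a Python callable stands in for the AviatorScript expression, and the expression is assumed to receive the already-mapped name.

```python
def rename(record, rename_fields=None, rename_expression=None, parent_fields=()):
    """Hypothetical sketch of the RENAME rules (not the real UDF)."""
    rename_fields = rename_fields or {}

    def new_name(key):
        key = rename_fields.get(key, key)  # explicit mapping runs first
        if rename_expression is not None:
            key = rename_expression(key)   # then the global expression
        return key

    out = {}
    for key, value in record.items():
        # Assumption: children of a parent field are keys of the form "<parent>.<child>".
        parent = next((p for p in parent_fields if key.startswith(p + ".")), None)
        if parent is None:
            out[new_name(key)] = value
        else:
            child = key[len(parent) + 1:]
            out[parent + "." + new_name(child)] = value
    return out


# Mirrors Example 1: strip the "tags_" prefix and rename timestamp_ms.
example1 = rename(
    {"tags_device": "fw1", "timestamp_ms": 1},
    rename_fields={"timestamp_ms": "recv_time_ms"},
    rename_expression=lambda key: key.replace("tags_", ""),
)

# Mirrors Example 2: rename client_ip at the top level and under encapsulation.ipv4.
example2 = rename(
    {"client_ip": "192.168.4.1", "encapsulation.ipv4.client_ip": "192.168.12.12"},
    rename_fields={"client_ip": "source_ip"},
    parent_fields=["encapsulation.ipv4"],
)
print(example1)
print(example2)
```

The sample values for `tags_device` and `timestamp_ms` are made up for the sketch; the values in the second call are taken from the documented output of Example 2.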
### Snowflake ID