| Field     | Value                                    | Date                      |
|-----------|------------------------------------------|---------------------------|
| author    | doufenghu <[email protected]>   | 2024-11-01 20:40:46 +0800 |
| committer | doufenghu <[email protected]>   | 2024-11-01 20:40:46 +0800 |
| commit    | 5818ed2ac9ca31a35a55f330160a9cf7f63bf6f3 |                           |
| tree      | 0d2f00c6d6c1791de8c5588572e0e7fb538803f2 |                           |
| parent    | e25eabde3ccb3f0d52346cb11cac757763c41be8 |                           |
[Improve][docs] Add a description of the new features for version 1.7.1-SNAPSHOT.
| Mode       | File                             | Lines changed |
|------------|----------------------------------|---------------|
| -rw-r--r-- | docs/connector/formats/csv.md    | 11            |
| -rw-r--r-- | docs/connector/sink/starrocks.md | 10            |
| -rw-r--r-- | docs/grootstream-design-cn.md    | 46            |
| -rw-r--r-- | docs/processor/udaf.md           | 38            |
| -rw-r--r-- | docs/processor/udf.md            | 52            |
| -rw-r--r-- | pom.xml                          | 2             |
6 files changed, 143 insertions, 16 deletions
diff --git a/docs/connector/formats/csv.md b/docs/connector/formats/csv.md
index ca8d10b..76769b2 100644
--- a/docs/connector/formats/csv.md
+++ b/docs/connector/formats/csv.md
@@ -4,8 +4,7 @@
 >
 > ## Description
 >
-> The CSV format allows to read and write CSV data based on an CSV schema. Currently, the CSV schema is derived from table schema.
-> **The CSV format must config schema for source/sink**.
+> The CSV format allows for reading and writing CSV data based on a schema. Currently, the CSV schema is derived from the table schema.
 
 | Name         | Supported Versions | Maven |
 |--------------|--------------------|-------|
@@ -16,12 +15,12 @@
 | Name                        | Type    | Required | Default | Description |
 |-----------------------------|---------|----------|---------|-------------|
 | format                      | String  | Yes      | (none)  | Specify what format to use, here should be 'csv'. |
-| csv.field.delimiter         | String  | No       | ,       | Field delimiter character (',' by default), must be single character. You can use backslash to specify special characters, e.g. '\t' represents the tab character. |
-| csv.disable.quote.character | Boolean | No       | false   | Disabled quote character for enclosing field values (false by default). If true, option 'csv.quote.character' can not be set. |
-| csv.quote.character         | String  | No       | "       | Quote character for enclosing field values (" by default). |
+| csv.field.delimiter         | String  | No       | ,       | Field delimiter character (`,` by default), must be single character. You can use backslash to specify special characters, e.g. '\t' represents the tab character. |
+| csv.disable.quote.character | Boolean | No       | false   | Disabled quote character for enclosing field values (`false` by default). If true, option `csv.quote.character` can not be set. |
+| csv.quote.character         | String  | No       | "       | Quote character for enclosing field values (`"` by default). |
 | csv.allow.comments          | Boolean | No       | false   | Ignore comment lines that start with '#' (disabled by default). If enabled, make sure to also ignore parse errors to allow empty rows. |
 | csv.ignore.parse.errors     | Boolean | No       | false   | Skip fields and rows with parse errors instead of failing. Fields are set to null in case of errors. |
-| csv.array.element.delimiter | String  | No       | ;       | Array element delimiter string for separating array and row element values (';' by default). |
+| csv.array.element.delimiter | String  | No       | ;       | Array element delimiter string for separating array and row element values (`;` by default). |
 | csv.escape.character        | String  | No       | (none)  | Escape character for escaping values (disabled by default). |
 | csv.null.literal            | String  | No       | (none)  | Null literal string that is interpreted as a null value (disabled by default). |
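For quick reference, a minimal sketch of a source block that enables the CSV format options documented above. The `sources` layout mirrors the StarRocks example below; the connector `type` and `path` keys are assumptions, and only `format` and the `csv.*` keys come from the table.

```yaml
sources: # [object] Define connector source
  csv_file_source:                # hypothetical source name
    type: file                    # hypothetical connector type
    path: /tmp/input.csv          # hypothetical path option
    format: csv                   # selects the CSV format documented above
    csv.field.delimiter: "\t"     # tab-separated fields, per the backslash-escape note
    csv.quote.character: "'"      # field values enclosed in single quotes
    csv.ignore.parse.errors: true # set malformed fields to null instead of failing
```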
diff --git a/docs/connector/sink/starrocks.md b/docs/connector/sink/starrocks.md
index f07e432..208fa39 100644
--- a/docs/connector/sink/starrocks.md
+++ b/docs/connector/sink/starrocks.md
@@ -1,25 +1,25 @@
 # Starrocks
 
-> Starrocks sink connector
+> StarRocks sink connector
 >
 > ## Description
 >
-> Sink connector for Starrocks, know more in https://docs.starrocks.io/zh/docs/loading/Flink-connector-starrocks/.
+> Sink connector for StarRocks, know more in https://docs.starrocks.io/zh/docs/loading/Flink-connector-starrocks/.
 
 ## Sink Options
 
-Starrocks sink custom properties. If properties belongs to Starrocks Flink Connector Config, you can use `connection.` prefix to set.
+StarRocks sink custom properties. If properties belongs to StarRocks Flink Connector Config, you can use `connection.` prefix to set.
 
 | Name                | Type    | Required | Default | Description |
 |---------------------|---------|----------|---------|-------------|
 | log.failures.only   | Boolean | No       | true    | Optional flag to whether the sink should fail on errors, or only log them; If this is set to true, then exceptions will be only logged, if set to false, exceptions will be eventually thrown, true by default. |
 | connection.jdbc-url | String  | Yes      | (none)  | The address that is used to connect to the MySQL server of the FE. You can specify multiple addresses, which must be separated by a comma (,). Format: jdbc:mysql://<fe_host1>:<fe_query_port1>,<fe_host2>:<fe_query_port2>,<fe_host3>:<fe_query_port3>.. |
 | connection.load-url | String  | Yes      | (none)  | The address that is used to connect to the HTTP server of the FE. You can specify multiple addresses, which must be separated by a semicolon (;). Format: <fe_host1>:<fe_http_port1>;<fe_host2>:<fe_http_port2>.. |
-| connection.config   | Map     | No       | (none)  | Starrocks Flink Connector Options, know more in https://docs.starrocks.io/docs/loading/Flink-connector-starrocks/#options. |
+| connection.config   | Map     | No       | (none)  | StarRocks Flink Connector Options, know more in https://docs.starrocks.io/docs/loading/Flink-connector-starrocks/#options. |
 
 ## Example
 
-This example read data of inline test source and write to Starrocks table `test`.
+This example read data of inline test source and write to StarRocks table `test`.
 
 ```yaml
 sources: # [object] Define connector source
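The example hunk above only shows the first lines of the YAML, so here is a minimal sketch of a StarRocks sink assembled from the documented options. Host names, ports, and the sink `type` key are assumptions; the `connection.config` entries are standard StarRocks Flink connector options.

```yaml
sinks: # [object] Define connector sink
  starrocks_sink:            # hypothetical sink name
    type: starrocks          # hypothetical connector type
    log.failures.only: false # throw write errors instead of only logging them
    connection.jdbc-url: jdbc:mysql://fe1:9030,fe2:9030 # FE MySQL addresses, comma-separated
    connection.load-url: fe1:8030;fe2:8030              # FE HTTP addresses, semicolon-separated
    connection.config:       # forwarded to the StarRocks Flink connector
      database-name: test_db # hypothetical target database
      table-name: test       # target table from the example above
```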
diff --git a/docs/grootstream-design-cn.md b/docs/grootstream-design-cn.md
index 41fcd0d..8579dc8 100644
--- a/docs/grootstream-design-cn.md
+++ b/docs/grootstream-design-cn.md
@@ -114,7 +114,8 @@ grootstream:
   vault:
     type: vault
     url: <vault-url>
-    token: <vault-token>
+    username: <vault-username>
+    password: <vault-password>
     default_key_path: <default-vault-key-path>
     plugin_key_path: <plugin-vault-key-path>
@@ -1295,6 +1296,23 @@ sinks:
     format: raw
 ```
 
+### CSV
+
+Reads/writes CSV data according to the configured schema.
+
+| Property                    | Required | Default | Type    | Description |
+|-----------------------------|----------|---------|---------|-------------|
+| csv.field.delimiter         | Y        | ,       | String  | Delimiter between field values; defaults to a comma. |
+| csv.quote.character         | N        | "       | String  | Quote character used to enclose field values; defaults to a double quote ("). This option cannot be used when csv.disable.quote.character is true. |
+| csv.disable.quote.character | N        | false   | Boolean | Whether to disable the quote character around field values; defaults to false. |
+| csv.allow.comments          | N        | false   | Boolean | Ignore comment lines starting with `#` (disabled by default). When enabled, make sure parse errors are also ignored so that empty rows are allowed; any line starting with `#` is treated as a comment and is neither parsed nor read. |
+| csv.ignore.parse.errors     | N        | false   | Boolean | Ignore parse errors; defaults to false. Malformed input produces an error log entry. |
+| csv.array.element.delimiter | N        | ;       | String  | Delimiter between array elements. |
+| csv.escape.character        | N        |         | String  | Character used to escape special characters, e.g. the delimiter, quotes, or line breaks. |
+| csv.null.literal            | N        |         | String  | String literal interpreted as a NULL value. |
+
+
+
 # Job Orchestration
 
 ```yaml
@@ -1480,7 +1498,7 @@ Parameters:
     identifier: aes-128-gcm96
 ```
 
-Note: Reads the job variable `projection.encrypt.schema.registry.uri` and returns encrypted fields; the data type is Array.
+Note: Reads the job variable `projection.encrypt.schema.registry.uri` and returns the encrypted fields; the data type is Array.
 
 #### Eval
@@ -1621,7 +1639,7 @@ Parameters:
 - secret_key = `<string>` The key used to generate the MAC.
 - algorithm = `<string>` The hash algorithm used to generate the MAC. Defaults to `sha256`.
-- output_format = `<string>` The output format of the MAC. Defaults to `'hex'`. Supported: `base64` | `hex`.
+- output_format = `<string>` The output format of the MAC. Defaults to `'base64'`. Supported: `base64` | `hex`.
 
 ```
 - function: HMAC
@@ -1850,6 +1868,28 @@ Parameters:
     output_fields: [ sessions ]
 ```
 
+
+
+#### Max
+
+Gets the maximum value within the time window.
+
+```yaml
+- function: MAX
+  lookup_fields: [ received_time ]
+  output_fields: [ received_time ]
+```
+
+#### Min
+
+Gets the minimum value within the time window.
+
+```yaml
+- function: MIN
+  lookup_fields: [ received_time ]
+  output_fields: [ received_time ]
+```
+
 #### Mean
 
 Averages the specified numeric field within the time window.
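The MAX/MIN examples in this hunk use the functions in isolation; below is a sketch of how both signatures compose with the optional filter argument. The filter expression and field names are assumptions, following the filter syntax shown in the udf.md examples further down.

```yaml
- function: MAX
  filter: event.status == 'success'     # optional filter argument from the signature
  lookup_fields: [ received_time ]
  output_fields: [ last_success_time ]  # rename the output instead of overwriting the input
- function: MIN
  filter: event.status == 'success'
  lookup_fields: [ received_time ]
  output_fields: [ first_success_time ]
```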
diff --git a/docs/processor/udaf.md b/docs/processor/udaf.md
index 66d6ad5..f305201 100644
--- a/docs/processor/udaf.md
+++ b/docs/processor/udaf.md
@@ -9,7 +9,9 @@
 - [First Value](#First-Value)
 - [Last Value](#Last-Value)
 - [Long Count](#Long-Count)
+- [Max](#Max)
 - [MEAN](#Mean)
+- [Min](#Min)
 - [Number SUM](#Number-SUM)
 - [HLLD](#HLLD)
 - [Approx Count Distinct HLLD](#Approx-Count-Distinct-HLLD)
@@ -116,6 +118,23 @@
     output_fields: [sessions]
 ```
 
+### Max
+
+MAX is used to get the maximum value of the field in the group of events.
+
+```MAX(filter, lookup_fields, output_fields)```
+- filter: optional
+- lookup_fields: required. Now only support one field.
+- output_fields: optional. If not set, the output field name is `lookup_field_name`.
+
+Example
+
+```yaml
+- function: MAX
+  lookup_fields: [receive_time]
+  output_fields: [receive_time]
+```
+
 ### Mean
 
 MEAN is used to calculate the mean value of the field in the group of events. The lookup field value must be a number.
@@ -135,6 +154,25 @@
     output_fields: [received_bytes_mean]
 ```
 
+
+### Min
+
+MIN is used to get the minimum value of the field in the group of events.
+
+```MIN(filter, lookup_fields, output_fields)```
+- filter: optional
+- lookup_fields: required. Now only support one field.
+- output_fields: optional. If not set, the output field name is `lookup_field_name`.
+
+Example
+
+```yaml
+- function: MIN
+  lookup_fields: [receive_time]
+  output_fields: [receive_time]
+```
+
+
 ### Number SUM
 
 NUMBER_SUM is used to sum the value of the field in the group of events. The lookup field value must be a number.
diff --git a/docs/processor/udf.md b/docs/processor/udf.md
index e480275..7f5c656 100644
--- a/docs/processor/udf.md
+++ b/docs/processor/udf.md
@@ -10,11 +10,13 @@
 - [Current Unix Timestamp](#current-unix-timestamp)
 - [Domain](#domain)
 - [Drop](#drop)
+- [Encrypt](#encrypt)
 - [Eval](#eval)
 - [Flatten](#flatten)
 - [From Unix Timestamp](#from-unix-timestamp)
 - [Generate String Array](#generate-string-array)
 - [GeoIP Lookup](#geoip-lookup)
+- [HMAC](#hmac)
 - [JSON Extract](#json-extract)
 - [Path Combine](#path-combine)
 - [Rename](#rename)
@@ -174,6 +176,30 @@
     filter: event.server_ip == '4.4.4.4'
 ```
 
+### Encrypt
+
+Encrypt function is used to encrypt the field value by the specified algorithm.
+
+Note: This feature allows you to use a third-party RESTful API to retrieve encrypted fields. By using these fields as criteria, you can determine whether the current field is encrypted. You must also set the projection.encrypt.schema.registry.uri as a job property.
+For example, setting `projection.encrypt.schema.registry.uri=127.0.0.1:9999/v1/schema/session_record?option=encrypt_fields` will return the encrypted fields in an array format.
+
+```ENCRYPT(filter, lookup_fields, output_fields[, parameters])```
+- filter: optional
+- lookup_fields: required
+- output_fields: required
+- parameters: required
+  - identifier: `<String>` required. The identifier of the encryption algorithm. Supports `aes-128-gcm96`, `aes-256-gcm96`, and `sm4-gcm96`.
+
+Example:
+Encrypt the phone number by the AES-128-GCM96 algorithm. Here phone_number will replace the original value with the encrypted value.
+```yaml
+- function: ENCRYPT
+  lookup_fields: [phone_number]
+  output_fields: [phone_number]
+  parameters:
+    identifier: aes-128-gcm96
+```
+
 ### Eval
 
 Eval function is used to adds or removes fields from events by evaluating an value expression.
@@ -383,6 +409,29 @@
       CITY: server_administrative_area
 ```
 
+### HMAC
+
+HMAC function is used to generate the hash-based message authentication code (HMAC) by the specified algorithm.
+
+```HMAC(filter, lookup_fields, output_fields[, parameters])```
+- filter: optional
+- lookup_fields: required
+- output_fields: required
+- parameters: required
+  - secret_key: `<String>` required. The secret key used to generate the HMAC.
+  - output_format: `<String>` required. Enum: `HEX`, `BASE64`. Default is `BASE64`.
+
+Example:
+
+```yaml
+- function: HMAC
+  lookup_fields: [phone_number]
+  output_fields: [phone_number_hmac]
+  parameters:
+    secret_key: abcdefg
+    output_format: BASE64
+```
+
 ### JSON Extract
 
 JSON extract function is used to extract the value from json string.
@@ -604,4 +653,5 @@
     output_fields: [log_uuid]
 ```
 
-Result: such as 2ed6657d-e927-568b-95e1-2665a8aea6a2.
\ No newline at end of file
+Result: such as 2ed6657d-e927-568b-95e1-2665a8aea6a2.
+
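The two new UDFs above pair naturally when a field must stay searchable after encryption. A minimal sketch under that assumption (the field names and the HMAC-before-ENCRYPT ordering are illustrative; the functions and parameters are as documented above):

```yaml
- function: HMAC                        # keyed digest first, while the plaintext is available
  lookup_fields: [ phone_number ]
  output_fields: [ phone_number_hmac ]  # searchable surrogate of the original value
  parameters:
    secret_key: abcdefg                 # sample key reused from the HMAC example above
    output_format: HEX
- function: ENCRYPT                     # then encrypt the original value in place
  lookup_fields: [ phone_number ]
  output_fields: [ phone_number ]
  parameters:
    identifier: aes-256-gcm96           # any of the documented identifiers works
```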
diff --git a/pom.xml b/pom.xml
--- a/pom.xml
+++ b/pom.xml
@@ -23,7 +23,7 @@
   </modules>
 
   <properties>
-    <revision>1.7.0-SNAPSHOT</revision>
+    <revision>1.7.1-SNAPSHOT</revision>
     <java.version>11</java.version>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
     <maven.compiler.source>${java.version}</maven.compiler.source>
