author: 窦凤虎 <[email protected]> 2024-09-19 10:23:32 +0000
committer: 窦凤虎 <[email protected]> 2024-09-19 10:23:32 +0000
commit: c0b9acfc3adc85abbd06207259b2515edc5c4eae
tree: 366ba5634e795bcd623831c5e7bda898c83777de /docs
parent: 62e969df69b28a9f435c925669cf6dfe018aa74f
parent: 3a95fef4c663c3f28c25daeb4cc19d0219fdfd48
Merge branch 'release/1.6.0' into 'master' (tag: v1.6.0)
[Improve][bootstrap] Improve job-level user-defined variables, move the path... See merge request galaxy/platform/groot-stream!111
Diffstat (limited to 'docs')
-rw-r--r-- docs/connector/config-encryption-decryption.md | 4
-rw-r--r-- docs/env-config.md | 23
-rw-r--r-- docs/grootstream-config.md | 5
-rw-r--r-- docs/user-guide.md | 4
4 files changed, 28 insertions, 8 deletions
diff --git a/docs/connector/config-encryption-decryption.md b/docs/connector/config-encryption-decryption.md
index 3146569..c2b05f6 100644
--- a/docs/connector/config-encryption-decryption.md
+++ b/docs/connector/config-encryption-decryption.md
@@ -6,14 +6,14 @@ In production environments, sensitive configuration items such as passwords are
## How to use
-Groot Stream default support base64 and AES encryption and decryption.
+Groot Stream support base64, AES and SM4 encryption and decryption.
Base64 encryption support encrypt the following parameters:
- username
- password
- auth
-AES encryption support encrypt the following parameters:
+AES/SM4 encryption support encrypt the following parameters:
- username
- password
- auth
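
As a minimal illustration of the Base64 option named in this hunk, the sketch below round-trips a value with Python's standard library. The helper names `encode_value` and `decode_value` are hypothetical, and the way Groot Stream expects an encoded value to be written into a job file is not shown in this diff. AES and SM4 are not covered here because they need a key and a cipher library.

```python
import base64


def encode_value(plain: str) -> str:
    """Return the Base64 text form of a UTF-8 string (hypothetical helper)."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_value(encoded: str) -> str:
    """Reverse encode_value and return the original string."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    token = encode_value("secret")
    print(token)                # Base64 text to place in a config value
    print(decode_value(token))  # round-trips back to the plain value
```

Base64 is a reversible encoding rather than a cipher, so it only obscures a value from casual reading.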
diff --git a/docs/env-config.md b/docs/env-config.md
index 7a31494..8e22a53 100644
--- a/docs/env-config.md
+++ b/docs/env-config.md
@@ -57,10 +57,10 @@ Specify a list of classpath URLs via `pipeline.classpaths`, The classpaths are s
You can directly use the flink parameter by prefixing `flink.`, such as `flink.execution.buffer-timeout`, `flink.object-reuse`, etc. More details can be found in the official [flink documentation](https://flink.apache.org/).
Of course, you can use groot stream parameter, here are some parameter names corresponding to the names in Flink.
-| Groot Stream | Flink |
+| Groot Stream | Flink |
|----------------------------------------|---------------------------------------------------------------|
-| execution.buffer-timeout | flink.execution.buffer-timeout |
-| pipeline.object-reuse | flink.object-reuse |
+| execution.buffer-timeout | flink.execution.buffer-timeout.interval |
+| pipeline.object-reuse | flink.pipeline.object-reuse |
| pipeline.max-parallelism | flink.pipeline.max-parallelism |
| execution.restart.strategy | flink.restart-strategy |
| execution.restart.attempts | flink.restart-strategy.fixed-delay.attempts |
@@ -70,3 +70,20 @@ Of course, you can use groot stream parameter, here are some parameter names cor
| execution.restart.delayInterval | flink.restart-strategy.failure-rate.delay |
| ... | ... |
+## Properties
+Job-level user-defined variables can be set in the `properties` section using key-value pairs, where the key represents a configuration property and the value specifies the desired setting.
+The properties can be used in the configuration file by using `props.${property_name}`. It will override the corresponding settings in the `grootstream.yaml` file for the duration of the job.
+```yaml
+application:
+ env:
+ name: example-inline-to-print
+ parallelism: 3
+ pipeline:
+ object-reuse: true
+ properties:
+ hos.bucket.name.rtp_file: job_level_traffic_rtp_file_bucket
+ hos.bucket.name.http_file: job_level_traffic_http_file_bucket
+ hos.bucket.name.eml_file: job_level_traffic_eml_file_bucket
+ hos.bucket.name.policy_capture_file: job_level_traffic_policy_capture_file_bucket
+```
+
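
The parameter-name table in the `docs/env-config.md` hunks above can be read as a simple lookup. The sketch below is an assumption about how such a translation could work, not the project's implementation: `GROOT_TO_FLINK` and `to_flink_key` are hypothetical names, and only rows visible in this diff are included (the rows elided by `...` are left out).

```python
# Rows copied from the table as it reads after this commit.
GROOT_TO_FLINK = {
    "execution.buffer-timeout": "flink.execution.buffer-timeout.interval",
    "pipeline.object-reuse": "flink.pipeline.object-reuse",
    "pipeline.max-parallelism": "flink.pipeline.max-parallelism",
    "execution.restart.strategy": "flink.restart-strategy",
    "execution.restart.attempts": "flink.restart-strategy.fixed-delay.attempts",
}


def to_flink_key(key: str) -> str:
    """Translate a Groot Stream parameter name to its Flink-prefixed name.

    Keys that already start with `flink.` are passed through unchanged,
    matching the doc's statement that Flink parameters can be used directly.
    """
    if key.startswith("flink."):
        return key
    try:
        return GROOT_TO_FLINK[key]
    except KeyError:
        raise KeyError(f"no known Flink equivalent for {key!r}") from None


if __name__ == "__main__":
    print(to_flink_key("pipeline.object-reuse"))
```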
diff --git a/docs/grootstream-config.md b/docs/grootstream-config.md
index fb902ae..9dd442f 100644
--- a/docs/grootstream-config.md
+++ b/docs/grootstream-config.md
@@ -20,7 +20,7 @@ grootstream:
```
-### Knowledge Base
+## Knowledge Base
The knowledge base is a collection of libraries that can be used in the groot-stream job's UDFs. File system type can be specified `local`, `http` or `hdfs`.
If the value is `http`, must be ` QGW Knowledge Base Repository` URL. The library will be dynamically updated according to the `scheduler.knowledge_base.update.interval.minutes` configuration.
@@ -77,3 +77,6 @@ grootstream:
- asn_builtin.mmdb
- asn_user_defined.mmdb
```
+## Properties
+Global user-defined variables can be set in the `properties` section using key-value pairs, where the key represents a configuration property and the value specifies the desired setting.
+The properties can be used in the configuration file by using `props.${property_name}`.
\ No newline at end of file
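
The two `## Properties` sections added in this commit describe a precedence rule: job-level values override the global values from `grootstream.yaml` for the duration of the job. A minimal sketch of that rule follows, assuming both levels are plain key-value maps. The function name `effective_properties` and the global bucket values are hypothetical; the job-level value is taken from the example in `docs/env-config.md`.

```python
def effective_properties(global_props: dict, job_props: dict) -> dict:
    """Merge two property maps; job-level entries win on key conflicts."""
    merged = dict(global_props)   # start from the global defaults
    merged.update(job_props)      # job-level values replace matching keys
    return merged


if __name__ == "__main__":
    # Hypothetical global values, standing in for grootstream.yaml.
    global_props = {
        "hos.bucket.name.rtp_file": "global_rtp_file_bucket",
        "hos.bucket.name.http_file": "global_http_file_bucket",
    }
    # Job-level value from the env-config.md example in this diff.
    job_props = {
        "hos.bucket.name.rtp_file": "job_level_traffic_rtp_file_bucket",
    }
    for key, value in effective_properties(global_props, job_props).items():
        print(f"{key} = {value}")
```

Keys that the job does not set keep their global value, and the global map itself is left unmodified.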
diff --git a/docs/user-guide.md b/docs/user-guide.md
index e35616f..d52cfed 100644
--- a/docs/user-guide.md
+++ b/docs/user-guide.md
@@ -153,7 +153,7 @@ Used to define job environment configuration information. For more details, you
# Command
## Run a job by CLI
-
+Note: When submitting a job via CLI, you can use `-D` parameter to specify flink configuration. For example, `-Dexecution.buffer-timeout.interval=1000` to set the buffer timeout to 1000ms. More details can be found in the official [flink documentation](https://flink.apache.org/).
```bash
Usage: start.sh [options]
Options:
@@ -164,7 +164,7 @@ Options:
-e, --deploy-mode <deploy mode> Deploy mode, only support [run] (default: run)
--target <target> Submitted target type, support [local, remote, yarn-session, yarn-per-job]
-n, --name <name> Job name (default: groot-stream-job)
- -i, --variable <variable> User-defined parameters, eg. -i key=value (default: [])
+ -i, --variable <variable> User-defined variables, eg. -i key=value (default: [])
-h, --help Show help message
-v, --version Show version message