author    doufenghu <[email protected]>    2024-01-23 18:16:02 +0800
committer doufenghu <[email protected]>    2024-01-23 18:16:02 +0800
commit    d7d7164b6f7b3b61273780803ff200cebabafcfc (patch)
tree      ed0e78c041d16a9737470715f2a716eb05f00aea /docs/connector
parent    4a1fbba350ab79275b6cf976501900f83dbfe9d7 (diff)
[Improve][docs] Add connector.md, Kafka.md and improve user-guide.md description.
Diffstat (limited to 'docs/connector')
-rw-r--r--  docs/connector/connector.md    | 44
-rw-r--r--  docs/connector/source/Kafka.md | 61
2 files changed, 105 insertions, 0 deletions
diff --git a/docs/connector/connector.md b/docs/connector/connector.md
new file mode 100644
index 0000000..6df1e23
--- /dev/null
+++ b/docs/connector/connector.md
@@ -0,0 +1,44 @@
+# Source Connector
+Source connectors share a set of common core features; each source connector supports them to varying degrees.
+
+## Common Source Options
+
+```yaml
+sources:
+ ${source_name}:
+ type: ${source_connector_type}
+ fields:
+ - name: ${field_name}
+ type: ${field_type}
+ properties:
+ ${prop_key}: ${prop_value}
+```
+
+| Name       | Type             | Required | Default | Description                                                                                                                   |
+|------------|------------------|----------|---------|-------------------------------------------------------------------------------------------------------------------------------|
+| type       | String           | Yes      | -       | The type of the source connector. The `SourceTableFactory` uses this value as the identifier to create the source connector. |
+| fields     | Array of `Field` | No       | -       | The structure of the data, including field names and field types.                                                            |
+| properties | Map of String    | Yes      | -       | Custom properties of the source connector; see the [Source](source) documentation for details.                               |
+
+## Schema Field Projection
+The source connector supports reading only specified fields from the data source. For example, `KafkaSource` reads the full records from the topic and then uses `fields` to filter out the unneeded columns, as in the sketch below.
+For the schema structure, refer to [Schema Structure](../user-guide.md#schema-structure).
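+
+A minimal sketch of field projection (topic and field names are illustrative): only `client_ip` is read out of each record; all other columns are dropped.
+
+```yaml
+sources:
+  projected_source:
+    type: kafka
+    fields: # only the listed fields are emitted downstream
+      - name: client_ip
+        type: string
+    properties:
+      topic: EXAMPLE-TOPIC
+      kafka.bootstrap.servers: localhost:9092
+```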
+
+# Sink Connector
+Sink connectors share a set of common core features; each sink connector supports them to varying degrees.
+
+## Common Sink Options
+
+```yaml
+sinks:
+ ${sink_name}:
+ type: ${sink_connector_type}
+ properties:
+ ${prop_key}: ${prop_value}
+```
+
+| Name       | Type          | Required | Default | Description                                                                                                             |
+|------------|---------------|----------|---------|---------------------------------------------------------------------------------------------------------------------------|
+| type       | String        | Yes      | -       | The type of the sink connector. The `SinkTableFactory` uses this value as the identifier to create the sink connector. |
+| properties | Map of String | Yes      | -       | Custom properties of the sink connector; see the [Sink](sink) documentation for details.                               |
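+
+As a concrete instance of the template above, a minimal sketch of a `print` sink (the `mode` and `format` values follow the print sink used in the Kafka example in this documentation):
+
+```yaml
+sinks:
+  print_sink:
+    type: print # `SinkTableFactory` resolves the sink connector by this identifier
+    properties:
+      mode: log_info
+      format: json
+```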
+
diff --git a/docs/connector/source/Kafka.md b/docs/connector/source/Kafka.md
index e69de29..49ea262 100644
--- a/docs/connector/source/Kafka.md
+++ b/docs/connector/source/Kafka.md
@@ -0,0 +1,61 @@
+# Kafka
+> Kafka source connector
+## Description
+Source connector for Apache Kafka
+## Source Options
+To use the Kafka connector, the following dependency is required. It can be downloaded from the Nexus Maven repository.
+
+| Datasource | Supported Versions | Maven |
+|-----------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------|
+| connector-kafka | Universal | [Download](http://192.168.40.153:8099/service/local/repositories/platform-release/content/com/geedgenetworks/connector-kafka/) |
+
+The Kafka source supports custom properties. Any property that belongs to the Kafka Consumer Config can be set with the `kafka.` prefix; see the sketch after the table below.
+
+| Name                      | Type   | Required | Default | Description                                                                                                                              |
+|---------------------------|--------|----------|---------|--------------------------------------------------------------------------------------------------------------------------------------------|
+| topic                     | String | Yes      | -       | Topic name(s). A comma-separated list of topics is also supported, e.g. `topic1,topic2`.                                                |
+| kafka.bootstrap.servers   | String | Yes      | -       | A list of host/port pairs used to establish the initial connection to the Kafka cluster, in the form `host1:port1,host2:port2,...`.     |
+| format                    | String | No       | json    | Data format. Valid values are `json` and `protobuf`.                                                                                    |
+| Format Properties         |        | No       | -       | Data format properties. Please refer to [Format Options](../formats) for details.                                                       |
+| Kafka Consumer Properties |        | No       | -       | Kafka consumer properties. Please refer to [Kafka Consumer Config](https://kafka.apache.org/documentation/#consumerconfigs) for details. |
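+
+A minimal sketch of the `kafka.` prefix convention (values are illustrative): each `kafka.*` entry is forwarded to the underlying Kafka consumer configuration, presumably with the prefix stripped.
+
+```yaml
+properties:
+  topic: EXAMPLE-TOPIC
+  kafka.bootstrap.servers: localhost:9092 # consumer config `bootstrap.servers`
+  kafka.group.id: example-group           # consumer config `group.id`
+  kafka.auto.offset.reset: latest         # consumer config `auto.offset.reset`
+```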
+
+## Example
+This example reads data from the Kafka topic `SESSION-RECORD` and prints it to the console.
+```yaml
+sources:
+ kafka_source:
+    type: kafka
+    fields: # [array of object] Schema field projection; supports reading only the specified fields.
+ - name: client_ip
+ type: string
+ - name: server_ip
+ type: string
+ properties: # [object] Kafka source properties
+ topic: SESSION-RECORD
+ kafka.bootstrap.servers: 192.168.44.11:9092
+ kafka.session.timeout.ms: 60000
+ kafka.max.poll.records: 3000
+ kafka.max.partition.fetch.bytes: 31457280
+ kafka.group.id: GROOT-STREAM-example-KAFKA-TO-PRINT
+ kafka.auto.offset.reset: latest
+ format: json
+
+sinks: # [object] Define connector sink
+ print_sink:
+ type: print
+ properties:
+ mode: log_info
+ format: json
+
+application: # [object] Define job configuration
+ env:
+ name: groot-stream-job-kafka-to-print
+ parallelism: 1
+ pipeline:
+ object-reuse: true
+ topology:
+ - name: kafka_source
+ downstream: [print_sink]
+ - name: print_sink
+ downstream: []
+```