author    doufenghu <[email protected]>    2024-01-23 18:16:02 +0800
committer doufenghu <[email protected]>    2024-01-23 18:16:02 +0800
commit    d7d7164b6f7b3b61273780803ff200cebabafcfc (patch)
tree      ed0e78c041d16a9737470715f2a716eb05f00aea /docs/connector
parent    4a1fbba350ab79275b6cf976501900f83dbfe9d7 (diff)
[Improve][docs] Add connector.md, Kafka.md and improve user-guide.md description.
Diffstat (limited to 'docs/connector')
-rw-r--r--  docs/connector/connector.md    | 44
-rw-r--r--  docs/connector/source/Kafka.md | 61
2 files changed, 105 insertions, 0 deletions
diff --git a/docs/connector/connector.md b/docs/connector/connector.md
new file mode 100644
index 0000000..6df1e23
--- /dev/null
+++ b/docs/connector/connector.md
@@ -0,0 +1,44 @@
+# Source Connector
+Source connectors share a set of common core features; each source connector supports them to varying degrees.
+
+## Common Source Options
+
+```yaml
+sources:
+ ${source_name}:
+ type: ${source_connector_type}
+ fields:
+ - name: ${field_name}
+ type: ${field_type}
+ properties:
+ ${prop_key}: ${prop_value}
+```
+
+| Name       | Type             | Required | Default | Description                                                                                                                   |
+|------------|------------------|----------|---------|-------------------------------------------------------------------------------------------------------------------------------|
+| type       | String           | Yes      | -       | The type of the source connector. The `SourceTableFactory` uses this value as the identifier to create the source connector. |
+| fields     | Array of `Field` | No       | -       | The structure of the data, including field names and field types.                                                            |
+| properties | Map of String    | Yes      | -       | Custom properties of the source connector; see the [Source](source) documentation for details.                               |
+
+## Schema Field Projection
+The source connector supports reading only specified fields from the data source. For example, `KafkaSource` reads the full records from the topic and then uses `fields` to filter out the unneeded columns, as in the sketch below.
+For the schema structure, refer to [Schema Structure](../user-guide.md#schema-structure).
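+
+A minimal sketch of field projection (topic and field names are illustrative): only `client_ip` is read out of each record; all other columns are dropped.
+
+```yaml
+sources:
+  projected_source:
+    type: kafka
+    fields: # only the listed fields are emitted downstream
+      - name: client_ip
+        type: string
+    properties:
+      topic: EXAMPLE-TOPIC
+      kafka.bootstrap.servers: localhost:9092
+```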
+
+# Sink Connector
+Sink connectors share a set of common core features; each sink connector supports them to varying degrees.
+
+## Common Sink Options
+
+```yaml
+sinks:
+ ${sink_name}:
+ type: ${sink_connector_type}
+ properties:
+ ${prop_key}: ${prop_value}
+```
+
+| Name       | Type          | Required | Default | Description                                                                                                             |
+|------------|---------------|----------|---------|---------------------------------------------------------------------------------------------------------------------------|
+| type       | String        | Yes      | -       | The type of the sink connector. The `SinkTableFactory` uses this value as the identifier to create the sink connector. |
+| properties | Map of String | Yes      | -       | Custom properties of the sink connector; see the [Sink](sink) documentation for details.                               |
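+
+As a concrete instance of the template above, a minimal sketch of a `print` sink (the `mode` and `format` values follow the print sink used in the Kafka example in this documentation):
+
+```yaml
+sinks:
+  print_sink:
+    type: print # `SinkTableFactory` resolves the sink connector by this identifier
+    properties:
+      mode: log_info
+      format: json
+```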
+
diff --git a/docs/connector/source/Kafka.md b/docs/connector/source/Kafka.md
index e69de29..49ea262 100644
--- a/docs/connector/source/Kafka.md
+++ b/docs/connector/source/Kafka.md
@@ -0,0 +1,61 @@
+# Kafka
+> Kafka source connector
+## Description
+Source connector for Apache Kafka
+## Source Options
+To use the Kafka connector, the following dependency is required. It can be downloaded from the Nexus Maven repository.
+
+| Datasource | Supported Versions | Maven |
+|-----------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------|
+| connector-kafka | Universal | [Download](http://192.168.40.153:8099/service/local/repositories/platform-release/content/com/geedgenetworks/connector-kafka/) |
+
+The Kafka source supports custom properties. Any property that belongs to the Kafka Consumer Config can be set with the `kafka.` prefix; see the sketch after the table below.
+
+| Name                      | Type   | Required | Default | Description                                                                                                                              |
+|---------------------------|--------|----------|---------|--------------------------------------------------------------------------------------------------------------------------------------------|
+| topic                     | String | Yes      | -       | Topic name(s). A comma-separated list of topics is also supported, e.g. `topic1,topic2`.                                                |
+| kafka.bootstrap.servers   | String | Yes      | -       | A list of host/port pairs used to establish the initial connection to the Kafka cluster, in the form `host1:port1,host2:port2,...`.     |
+| format                    | String | No       | json    | Data format. Valid values are `json` and `protobuf`.                                                                                    |
+| Format Properties         |        | No       | -       | Data format properties. Please refer to [Format Options](../formats) for details.                                                       |
+| Kafka Consumer Properties |        | No       | -       | Kafka consumer properties. Please refer to [Kafka Consumer Config](https://kafka.apache.org/documentation/#consumerconfigs) for details. |
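+
+A minimal sketch of the `kafka.` prefix convention (values are illustrative): each `kafka.*` entry is forwarded to the underlying Kafka consumer configuration, presumably with the prefix stripped.
+
+```yaml
+properties:
+  topic: EXAMPLE-TOPIC
+  kafka.bootstrap.servers: localhost:9092 # consumer config `bootstrap.servers`
+  kafka.group.id: example-group           # consumer config `group.id`
+  kafka.auto.offset.reset: latest         # consumer config `auto.offset.reset`
+```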
+
+## Example
+This example reads data from the Kafka topic `SESSION-RECORD` and prints it to the console.
+```yaml
+sources:
+ kafka_source:
+    type: kafka
+    fields: # [array of object] Schema field projection; supports reading only the specified fields.
+ - name: client_ip
+ type: string
+ - name: server_ip
+ type: string
+ properties: # [object] Kafka source properties
+ topic: SESSION-RECORD
+ kafka.bootstrap.servers: 192.168.44.11:9092
+ kafka.session.timeout.ms: 60000
+ kafka.max.poll.records: 3000
+ kafka.max.partition.fetch.bytes: 31457280
+ kafka.group.id: GROOT-STREAM-example-KAFKA-TO-PRINT
+ kafka.auto.offset.reset: latest
+ format: json
+
+sinks: # [object] Define connector sink
+ print_sink:
+ type: print
+ properties:
+ mode: log_info
+ format: json
+
+application: # [object] Define job configuration
+ env:
+ name: groot-stream-job-kafka-to-print
+ parallelism: 1
+ pipeline:
+ object-reuse: true
+ topology:
+ - name: kafka_source
+ downstream: [print_sink]
+ - name: print_sink
+ downstream: []
+```