author lifengchao <[email protected]> 2024-05-23 16:02:37 +0800
committer lifengchao <[email protected]> 2024-05-23 16:02:37 +0800
commit b92affae7feae10d2cfaac2d75ad1c15f5ea95cb
tree b28d9ea23a6e3f78584b34883c61e1e1a5a89272 /docs/connector
parent 729c78f4299f86820ec239106933d1ef4fafd92d
* [feature][connector-file] GAL-572 Support FileSource
Diffstat (limited to 'docs/connector')
-rw-r--r-- docs/connector/source/file.md 56
1 files changed, 56 insertions, 0 deletions
diff --git a/docs/connector/source/file.md b/docs/connector/source/file.md
new file mode 100644
index 0000000..f92ab84
--- /dev/null
+++ b/docs/connector/source/file.md
@@ -0,0 +1,56 @@
+# File
+
+> File source connector
+
+## Description
+
+The File source connector generates data from a text file (a local file or an HDFS file). It is useful for testing.
+
+## Source Options
+
+File source custom properties.
+
+| Name | Type | Required | Default | Description |
+|---------------------------|---------|----------|---------|---------------------------------------------------------------------------------------------------|
+| path | String | Yes | (none) | File path; supports a local path or an HDFS path. Examples: `./logs/logs.json`, `hdfs://ns1/test/logs.json`. |
+| format | String | Yes | (none) | Data format. Supported values are `json` and `csv`. |
+| [format].config | Map | No | (none) | Data format properties. See [Format Options](../formats) for details. |
+| rows.per.second | Integer | No | 1000 | Number of rows to emit per second; controls the emit rate. |
+| number.of.rows | Long | No | -1 | Total number of rows to emit. By default, the source is unbounded. |
+| millis.per.row | Long | No | 0 | Milliseconds per row; controls the emit rate. If greater than 0, `rows.per.second` has no effect. |
+| read.local.file.in.client | Boolean | No | true | Whether to read the local file in the client. |
+
+## Example
+
+This example reads data from a file source and prints it to the console.
+
+```yaml
+sources:
+  file_source:
+    type: file
+    properties:
+      # path: 'hdfs://ns1/test/logs.json'
+      path: './logs.json'
+      rows.per.second: 2
+      format: json
+
+sinks:
+  print_sink:
+    type: print
+    properties:
+      format: json
+
+application:
+  env:
+    name: example-file-to-print
+    parallelism: 2
+    pipeline:
+      object-reuse: true
+  topology:
+    - name: file_source
+      downstream: [ print_sink ]
+    - name: print_sink
+      downstream: [ ]
+```
+
+