| author | lifengchao <[email protected]> | 2024-05-23 16:02:37 +0800 |
|---|---|---|
| committer | lifengchao <[email protected]> | 2024-05-23 16:02:37 +0800 |
| commit | b92affae7feae10d2cfaac2d75ad1c15f5ea95cb (patch) | |
| tree | b28d9ea23a6e3f78584b34883c61e1e1a5a89272 /docs | |
| parent | 729c78f4299f86820ec239106933d1ef4fafd92d (diff) | |
* [feature][connector-file] GAL-572 Support FileSource
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/connector/source/file.md | 56 |
1 files changed, 56 insertions, 0 deletions
diff --git a/docs/connector/source/file.md b/docs/connector/source/file.md
new file mode 100644
index 0000000..f92ab84
--- /dev/null
+++ b/docs/connector/source/file.md
@@ -0,0 +1,56 @@
+# File
+
+> File source connector
+
+## Description
+
+File source connector is used to generate data from a text file(local file or hdfs file). It is useful for testing.
+
+## Source Options
+
+File source custom properties.
+
+| Name | Type | Required | Default | Description |
+|---------------------------|---------|----------|---------|---------------------------------------------------------------------------------------------------|
+| path | String | Yes | (none) | File path, support local path or hdfs path. Example: ./logs/logs.json, hdfs://ns1/test/logs.json. |
+| format | String | Yes | (none) | Data format. The Optional values are `json`, `csv`. |
+| [format].config | Map | No | (none) | Data format properties. Please refer to [Format Options](../formats) for details. |
+| rows.per.second | Integer | No | 1000 | Rows per second to control the emit rate. |
+| number.of.rows | Long | No | -1 | Total number of rows to emit. By default, the source is unbounded. |
+| millis.per.row | Long | No | 0 | Millis per row to control the emit rate. If greater than 0, rows.per.second is not effective. |
+| read.local.file.in.client | Boolean | No | true | Whether read local file in client. |
+
+## Example
+
+This example read data of file test source and print to console.
+
+```yaml
+sources:
+  file_source:
+    type: file
+    properties:
+      # path: 'hdfs://ns1/test/logs.json'
+      path: './logs.json'
+      rows.per.second: 2
+      format: json
+
+sinks:
+  print_sink:
+    type: print
+    properties:
+      format: json
+
+application:
+  env:
+    name: example-file-to-print
+    parallelism: 2
+    pipeline:
+      object-reuse: true
+  topology:
+    - name: file_source
+      downstream: [ print_sink ]
+    - name: print_sink
+      downstream: [ ]
+```
+
+
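The options table added by this patch describes `number.of.rows` and `millis.per.row`, but the patch's example only sets `rows.per.second`. The fragment below is a minimal sketch of a bounded, slow-paced source, assuming these options are placed under `properties` in the same way as `rows.per.second`; the path and the numeric values are illustrative and not taken from the commit.

```yaml
sources:
  file_source:
    type: file
    properties:
      path: './logs.json'     # illustrative local path, same as in the patch's example
      format: json
      number.of.rows: 100     # assumed value: emit 100 rows in total instead of the unbounded default (-1)
      millis.per.row: 500     # assumed value: per the table, a value > 0 makes rows.per.second ineffective
```

Under the documented semantics, `rows.per.second` is omitted here because a positive `millis.per.row` takes precedence over it.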
