Diffstat (limited to 'docs/develop-guide.md')
| -rw-r--r-- | docs/develop-guide.md | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/docs/develop-guide.md b/docs/develop-guide.md
index 2742cee..927d2d3 100644
--- a/docs/develop-guide.md
+++ b/docs/develop-guide.md
@@ -15,6 +15,28 @@
 | groot-docs | Docs module of groot-stream, which is responsible for providing documents. |
 | groot-release | Release module of groot-stream, which is responsible for providing release scripts. |
 
+## Event Model
+Groot Stream bases all stream processing on data records commonly known as events. An event is a collection of key-value pairs (fields), as follows:
+
+```json
+{
+  "__timestamp": "<Timestamp in UNIX epoch format (milliseconds)>",
+  "__headers": "Map<String, String> headers of the source that delivered the event",
+  "__window_start_timestamp" : "<Timestamp in UNIX epoch format (milliseconds)>",
+  "__window_end_timestamp" : "<Timestamp in UNIX epoch format (milliseconds)>",
+  "key1": "<value1>",
+  "key2": "<value2>",
+  "keyN": "<valueN>"
+}
+```
+Groot Stream adds internal fields during pipeline processing. A few notes about internal fields:
+- Internal fields start with a double underscore `__`.
+- Each source can add one or more internal fields to each event. For example, the Kafka source adds both a `__timestamp` and a `__input_id` field.
+- Treat internal fields as read-only. Modifying them can have unintended consequences for your data flows.
+- Internal fields exist only for the duration of the event processing pipeline. They are not documented under sources or sinks.
+- If you do not configure a timestamp for extraction, the pipeline process assigns the current time (in UNIX epoch format) to the `__timestamp` field.
+- If you have multiple sources, you can determine the origin of an event by examining the `__headers` field. For example, the Kafka source appends the topic name as the `__input_id` key in `__headers`.
+
 ## How to write a high quality Git commit message
 
 > [purpose] [module name] [sub-module name] Description (JIRA Issue ID)
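
The internal-field rules added in this change can be illustrated with a small sketch. It is not Groot Stream code: the helper names `split_fields`, `with_default_timestamp`, and `origin_of` are hypothetical, events are modelled as plain Python dicts, and the default timestamp is assumed to be in milliseconds, matching the unit shown in the JSON sample.

```python
import time

# Internal fields are identified by a double-underscore prefix.
INTERNAL_PREFIX = "__"


def split_fields(event):
    """Separate internal fields from user fields without mutating the event."""
    internal = {k: v for k, v in event.items() if k.startswith(INTERNAL_PREFIX)}
    payload = {k: v for k, v in event.items() if not k.startswith(INTERNAL_PREFIX)}
    return internal, payload


def with_default_timestamp(event, now_ms=None):
    """Return a copy of the event; set __timestamp only when it is absent.

    Assumption: the fallback is the current time in epoch milliseconds.
    """
    if "__timestamp" in event:
        return dict(event)
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return {**event, "__timestamp": now_ms}


def origin_of(event):
    """Read the source identifier from __headers, or None if it is missing."""
    headers = event.get("__headers") or {}
    return headers.get("__input_id")


# Hypothetical event, as it might look after a Kafka source has tagged it.
event = with_default_timestamp(
    {"__headers": {"__input_id": "topic-a"}, "key1": "value1"}
)
internal, payload = split_fields(event)
```

Each helper returns a new dict rather than editing the event in place, which is one way to respect the read-only guidance for internal fields.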
