| author    | 窦凤虎 <[email protected]>                  | 2024-11-13 01:45:19 +0000 |
|-----------|------------------------------------------|---------------------------|
| committer | 窦凤虎 <[email protected]>                  | 2024-11-13 01:45:19 +0000 |
| commit    | 80f78d2dfa3a79794d417469f50e18d762464052 |                           |
| tree      | fac46d862a3477b5f945b1fcf6c47c987c69faf5 |                           |
| parent    | 19b38f16d2be491341d58bf816f89340bc6104c2 |                           |
| parent    | b636c24d8349cd3ddd306e8a9561724fbd0d2b4c |                           |
Merge branch 'feature/spi' into 'develop'
Feature/spi
See merge request galaxy/platform/groot-stream!136
379 files changed, 8718 insertions, 9252 deletions
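
This merge introduces a new `groot-api` module that hosts the connector, schema, and UDF interfaces (including `SinkTableFactory extends ConnectorFactory`), and the docs below rename the processor types (`AviatorFilter` → `AviatorFilterProcessor`, `ProjectionProcessorImpl` → `ProjectionProcessor`, `SplitOperator` → `SplitProcessor`). As a minimal sketch of how an SPI-style factory such as the new `SinkTableFactory` is typically discovered at runtime via `java.util.ServiceLoader` — the `FactoryDiscovery` helper and its predicate are illustrative assumptions, not code from this merge:

```java
import java.util.ServiceLoader;
import java.util.function.Predicate;

// Hypothetical helper, not part of this diff: generic SPI lookup.
public final class FactoryDiscovery {
    // ServiceLoader scans META-INF/services/<factory-interface-FQN> entries on
    // the classpath and instantiates each registered provider; we return the
    // first one the caller's predicate accepts.
    public static <T> T discover(Class<T> factoryClass, Predicate<T> matches) {
        for (T factory : ServiceLoader.load(factoryClass)) {
            if (matches.test(factory)) {
                return factory;
            }
        }
        throw new IllegalStateException("no matching factory for " + factoryClass.getName());
    }
}
```

Under such a scheme, dropping a connector jar with the right `META-INF/services` entry on the classpath is enough to make its factory discoverable, which is presumably the point of the `feature/spi` refactor.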
diff --git a/config/template/grootstream_job_template.yaml b/config/template/grootstream_job_template.yaml
index b26fbb2..c4aa726 100644
--- a/config/template/grootstream_job_template.yaml
+++ b/config/template/grootstream_job_template.yaml
@@ -151,7 +151,7 @@ preprocessing_pipelines: # [object] Define Processors for preprocessing pipeline
 # The common processing for the event is accomplished by the user-defined functions.
 # processing_pipelines: # [object] Define Processors for processing pipelines.
-  z: # [object] Define projection processor name, must be unique.
+  projection_processor: # [object] Define projection processor name, must be unique.
     type: projection # [string] Processor Type
     remove_fields:
     output_fields:
diff --git a/docs/connector/connector.md b/docs/connector/connector.md
index 93d64b0..c254d53 100644
--- a/docs/connector/connector.md
+++ b/docs/connector/connector.md
@@ -70,7 +70,7 @@ schema:
 
 To retrieve the schema from a local file using its absolute path.
 
-> Ensures that the file path is accessible to all nodes in your Flink cluster.
+> Ensures that the file path is accessible to all jobTopologyNodes in your Flink cluster.
 
 ```yaml
 schema:
diff --git a/docs/connector/formats/protobuf.md b/docs/connector/formats/protobuf.md
index 2dfb65e..467177a 100644
--- a/docs/connector/formats/protobuf.md
+++ b/docs/connector/formats/protobuf.md
@@ -13,7 +13,7 @@
 
 ## Format Options
 
-> Ensures that the file path is accessible to all nodes in your Flink cluster.
+> Ensures that the file path is accessible to all jobTopologyNodes in your Flink cluster.
 
 | Name | Type | Required | Default | Description |
 |------|------|----------|---------|-------------|
diff --git a/docs/connector/source/file.md b/docs/connector/source/file.md
index bdbf74e..98c6aee 100644
--- a/docs/connector/source/file.md
+++ b/docs/connector/source/file.md
@@ -24,7 +24,7 @@
 File source custom properties.
 
 This example reads data from the file test source and prints it to the console.
 
-> Ensures that the file path is accessible to all nodes in your Flink cluster.
+> Ensures that the file path is accessible to all jobTopologyNodes in your Flink cluster.
 
 ```yaml
 sources:
diff --git a/docs/filter/aviator.md b/docs/filter/aviator.md
index e7f6c2b..acf98a5 100644
--- a/docs/filter/aviator.md
+++ b/docs/filter/aviator.md
@@ -11,7 +11,7 @@
 
 | Name       | Type   | Required | Default | Description |
 |------------|--------|----------|---------|-------------|
-| type       | String | Yes      | (none)  | The type of the filter operator. Now only supports `com.geedgenetworks.core.filter.AviatorFilter`. |
+| type       | String | Yes      | (none)  | The type of the filter operator. Now only supports `com.geedgenetworks.core.filter.AviatorFilterProcessor`. |
 | properties | Map    | Yes      | (none)  | Filter operator properties. |
 | expression | String | Yes      | (none)  | Based on the filter expression, the event is passed downstream if the expression is true; otherwise it is dropped. To read an event field in a filter expression, add the prefix `event.`. |
diff --git a/docs/grootstream-config.md b/docs/grootstream-config.md
index b7fd037..ad54603 100644
--- a/docs/grootstream-config.md
+++ b/docs/grootstream-config.md
@@ -24,7 +24,7 @@ grootstream:
 
 The knowledge base is a collection of libraries that can be used in the groot-stream job's UDFs. The file system type can be specified as `local`, `http` or `hdfs`. If the value is `http`, it must be a `QGW Knowledge Base Repository` URL. The library will be dynamically updated according to the `scheduler.knowledge_base.update.interval.minutes` configuration.
-If the value is `local`, the library will be loaded from the local file system. You need to manually upgrade all nodes in the Flink cluster when the library is updated.
+If the value is `local`, the library will be loaded from the local file system. You need to manually upgrade all jobTopologyNodes in the Flink cluster when the library is updated.
 If the value is `hdfs`, the library will be loaded from the HDFS file system. More details about HDFS operation can be found in the [HDFS](./faq.md#hadoop-hdfs-commands-for-beginners).
 
 | Name | Type | Required | Default | Description |
@@ -36,7 +36,7 @@
 
 ### Define the knowledge base file from a local file
 
-> Ensures that the file path is accessible to all nodes in your Flink cluster.
+> Ensures that the file path is accessible to all jobTopologyNodes in your Flink cluster.
 
 ```yaml
 grootstream:
@@ -65,7 +65,7 @@
 
 ### Define the knowledge base file from an HDFS file system
 
-> Ensure that the HDFS file system is accessible to all nodes in your Flink cluster.
+> Ensure that the HDFS file system is accessible to all jobTopologyNodes in your Flink cluster.
 
 ```yaml
 grootstream:
diff --git a/docs/images/Groot Stream Architecture.jpg b/docs/images/Groot Stream Architecture.jpg
new file mode 100644
index 0000000..7f15b10
--- /dev/null
+++ b/docs/images/Groot Stream Architecture.jpg
Binary files differ
diff --git a/docs/processor/projection-processor.md b/docs/processor/projection-processor.md
index 4319f36..62258de 100644
--- a/docs/processor/projection-processor.md
+++ b/docs/processor/projection-processor.md
@@ -32,13 +32,13 @@ sources:
 
 filters:
   filter_operator:
-    type: com.geedgenetworks.core.filter.AviatorFilter
+    type: com.geedgenetworks.core.filter.AviatorFilterProcessor
     properties:
       expression: event.server_ip != '12.12.12.12'
 
 processing_pipelines: # [object] Define Processors
   projection_processor: # [object] Define projection processor name
-    type: com.geedgenetworks.core.processor.projection.ProjectionProcessorImpl
+    type: com.geedgenetworks.core.processor.projection.ProjectionProcessor
     remove_fields: [http_request_line, http_response_line, http_response_content_type]
     functions: # [array of object] Define UDFs
       - function: DROP # [string] Define DROP function for filter event
diff --git a/docs/processor/split-processor.md b/docs/processor/split-processor.md
index e1a1163..05d3d92 100644
--- a/docs/processor/split-processor.md
+++ b/docs/processor/split-processor.md
@@ -10,7 +10,7 @@
 Using Flink side outputs, send data from a stream to multiple downstream consumers.
 
 | name              | type   | required | default value |
 |-------------------|--------|----------|---------------|
-| type              | String | Yes      | The type of the processor, now only supports `com.geedgenetworks.core.split.SplitOperator` |
+| type              | String | Yes      | The type of the processor, now only supports `com.geedgenetworks.core.split.SplitProcessor` |
 | rules             | Array  | Yes      | Array of Object. Defines rules for labeling the Side Output Tag |
 | [rule.]tag        | String | Yes      | The tag name of the side output |
 | [rule.]expression | String | Yes      | The expression to evaluate the event. |
diff --git a/docs/user-guide.md b/docs/user-guide.md
index d52cfed..764628a 100644
--- a/docs/user-guide.md
+++ b/docs/user-guide.md
@@ -32,13 +32,13 @@ sources:
 
 filters:
   filter:
-    type: com.geedgenetworks.core.filter.AviatorFilter
+    type: com.geedgenetworks.core.filter.AviatorFilterProcessor
     properties:
       expression: event.decoded_as == 'BASE'
 
 preprocessing_pipelines:
   preprocessor:
-    type: com.geedgenetworks.core.processor.projection.ProjectionProcessorImpl
+    type: com.geedgenetworks.core.processor.projection.ProjectionProcessor
     functions:
       - function: EVAL
         output_fields: [additional_field_subdomain]
@@ -47,7 +47,7 @@ preprocessing_pipelines:
 
 processing_pipelines:
   processor:
-    type: com.geedgenetworks.core.processor.projection.ProjectionProcessorImpl
+    type: com.geedgenetworks.core.processor.projection.ProjectionProcessor
     remove_fields: [log_id]
     output_fields: []
     functions:
@@ -58,7 +58,7 @@ processing_pipelines:
 
 postprocessing_pipelines:
   postprocessor:
-    type: com.geedgenetworks.core.processor.projection.ProjectionProcessorImpl
+    type: com.geedgenetworks.core.processor.projection.ProjectionProcessor
     remove_fields: [dup_traffic_flag]
 
 sinks:
diff --git a/groot-api/pom.xml b/groot-api/pom.xml
new file mode 100644
index 0000000..1588f11
--- /dev/null
+++ b/groot-api/pom.xml
@@ -0,0 +1,39 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+    <parent>
+        <groupId>com.geedgenetworks</groupId>
+        <artifactId>groot-stream</artifactId>
+        <version>${revision}</version>
+    </parent>
+
+    <artifactId>groot-api</artifactId>
+    <name>Groot : API</name>
+
+    <dependencies>
+
+        <dependency>
+            <groupId>com.geedgenetworks</groupId>
+            <artifactId>groot-common</artifactId>
+            <version>${revision}</version>
+            <scope>provided</scope>
+        </dependency>
+
+
+        <dependency>
+            <groupId>org.apache.flink</groupId>
+            <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId>
+            <scope>provided</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>org.apache.flink</groupId>
+            <artifactId>flink-table-planner-blink_${scala.version}</artifactId>
+            <scope>provided</scope>
+        </dependency>
+
+
+    </dependencies>
+
+</project>
\ No newline at end of file
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/udf/AggregateFunction.java b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/AggregateFunction.java
index 6f6e048..e846be1 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/udf/AggregateFunction.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/AggregateFunction.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.common.udf;
+package com.geedgenetworks.api.common.udf;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.api.connector.event.Event;
 
 import java.io.Serializable;
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/udf/ScalarFunction.java b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/ScalarFunction.java
index 2723652..17e299d 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/udf/ScalarFunction.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/ScalarFunction.java
@@ -1,10 +1,12 @@
-package com.geedgenetworks.common.udf;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.config.CheckUDFContextUtil;
-import com.geedgenetworks.common.config.UDFContextConfigOptions;
+package com.geedgenetworks.api.common.udf;
+
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
+import com.geedgenetworks.api.configuration.CheckUDFContextUtil;
+import com.geedgenetworks.api.configuration.UDFContextConfigOptions;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
+
 import java.io.Serializable;
 
 public interface ScalarFunction extends Serializable {
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/udf/TableFunction.java b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/TableFunction.java
index e602291..8b8a008 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/udf/TableFunction.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/TableFunction.java
@@ -1,8 +1,7 @@
-package com.geedgenetworks.common.udf;
+package com.geedgenetworks.api.common.udf;
 
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
-import org.apache.flink.util.Collector;
 
 import java.io.Serializable;
 import java.util.List;
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/udf/UDFContext.java b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/UDFContext.java
index ea98226..c595212 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/udf/UDFContext.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/UDFContext.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.common.udf;
+package com.geedgenetworks.api.common.udf;
 
 import com.fasterxml.jackson.annotation.JsonProperty;
 import lombok.Data;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/UdfEntity.java b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/UdfEntity.java
index ab6a6f5..d8434d6 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/UdfEntity.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/common/udf/UdfEntity.java
@@ -1,10 +1,4 @@
-package com.geedgenetworks.core.processor.projection;
-
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.ScalarFunction;
-
-import com.geedgenetworks.common.udf.TableFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+package com.geedgenetworks.api.common.udf;
 
 import com.googlecode.aviator.Expression;
 import lombok.Data;
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/CheckUDFContextUtil.java b/groot-api/src/main/java/com/geedgenetworks/api/configuration/CheckUDFContextUtil.java
index f1170be..3d6e53b 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/config/CheckUDFContextUtil.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/configuration/CheckUDFContextUtil.java
@@ -1,6 +1,8 @@
-package com.geedgenetworks.common.config;
+package com.geedgenetworks.api.configuration;
+
+import com.geedgenetworks.common.config.CheckResult;
+import com.geedgenetworks.api.common.udf.UDFContext;
 
-import com.geedgenetworks.common.udf.UDFContext;
 import java.util.Arrays;
 import java.util.List;
 import java.util.stream.Collectors;
@@ -16,7 +18,7 @@ public final class CheckUDFContextUtil {
                 .collect(Collectors.toList());
 
         if (!invalidParams.isEmpty()) {
-            String errorMsg = java.lang.String.format("Please specify [%s] as non-empty.", java.lang.String.join(",", invalidParams));
+            String errorMsg = String.format("Please specify [%s] as non-empty.", String.join(",", invalidParams));
             return CheckResult.error(errorMsg);
         }
         return CheckResult.success();
@@ -33,7 +35,7 @@ public final class CheckUDFContextUtil {
                 .collect(Collectors.toList());
 
         if (invalidParams.size() == params.length) {
-            String errorMsg = java.lang.String.format("Please specify at least one config of [%s] as non-empty.", java.lang.String.join(",", invalidParams));
+            String errorMsg = String.format("Please specify at least one config of [%s] as non-empty.", String.join(",", invalidParams));
             return CheckResult.error(errorMsg);
         }
         return CheckResult.success();
@@ -74,7 +76,7 @@ public final class CheckUDFContextUtil {
                 .collect(Collectors.toList());
 
         if (!missingKeys.isEmpty()) {
-            String errorMsg = java.lang.String.format("Please specify [%s] as non-empty.", java.lang.String.join(",", missingKeys));
+            String errorMsg = String.format("Please specify [%s] as non-empty.", String.join(",", missingKeys));
             return CheckResult.error(errorMsg);
         }
         return CheckResult.success();
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/UDFContextConfigOptions.java b/groot-api/src/main/java/com/geedgenetworks/api/configuration/UDFContextConfigOptions.java
index 87bbf36..021d198 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/config/UDFContextConfigOptions.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/configuration/UDFContextConfigOptions.java
@@ -1,8 +1,12 @@
-package com.geedgenetworks.common.config;
+package com.geedgenetworks.api.configuration;
+
+import com.alibaba.fastjson2.TypeReference;
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
 
 import java.util.List;
 import java.util.Map;
-import com.alibaba.fastjson2.TypeReference;
+
 public interface UDFContextConfigOptions {
     Option<String> NAME = Options.key("name")
             .stringType()
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/utils/LoadIntervalDataOptions.java b/groot-api/src/main/java/com/geedgenetworks/api/configuration/util/LoadIntervalDataOptions.java
index a81794d..688fcc0 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/utils/LoadIntervalDataOptions.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/configuration/util/LoadIntervalDataOptions.java
@@ -1,80 +1,80 @@
-package com.geedgenetworks.core.utils;
-
-import java.io.Serializable;
-
-public class LoadIntervalDataOptions implements Serializable {
- final String name;
-
- final long intervalMs;
- final boolean failOnException;
- final boolean updateDataOnStart;
-
- /**
- * @param name Name, used in log output and as the thread-name identifier
- * @param intervalMs How often to refresh the data, in milliseconds
- * @param failOnException Whether a refresh failure is fatal (default false); when true, the next call to data() after a failed refresh throws the exception
- * @param updateDataOnStart Whether to refresh the data immediately on start (default true); when false, the first refresh happens intervalMs after start
- */
- private LoadIntervalDataOptions(String name, long intervalMs, boolean failOnException, boolean updateDataOnStart) {
- this.name = name;
- this.intervalMs = intervalMs;
- this.failOnException = failOnException;
- this.updateDataOnStart = updateDataOnStart;
- }
-
- public String getName() {
- return name;
- }
-
- public long getIntervalMs() {
- return intervalMs;
- }
-
- public boolean isFailOnException() {
- return failOnException;
- }
-
- public boolean isUpdateDataOnStart() {
- return updateDataOnStart;
- }
-
- public static Builder builder() {
- return new Builder();
- }
-
- public static LoadIntervalDataOptions defaults(String name, long intervalMs) {
- return builder().withName(name).withIntervalMs(intervalMs).build();
- }
-
- public static final class Builder {
- private String name = "";
- private long intervalMs = 1000 * 60 * 10;
- private boolean failOnException = false;
- private boolean updateDataOnStart = true;
-
- public Builder withName(String name) {
- this.name = name;
- return this;
- }
-
- public Builder withIntervalMs(long intervalMs) {
- this.intervalMs = intervalMs;
- return this;
- }
-
- public Builder withFailOnException(boolean failOnException) {
- this.failOnException = failOnException;
- return this;
- }
-
- public Builder withUpdateDataOnStart(boolean updateDataOnStart) {
- this.updateDataOnStart = updateDataOnStart;
- return this;
- }
-
- public LoadIntervalDataOptions build() {
- return new LoadIntervalDataOptions(name, intervalMs, failOnException, updateDataOnStart);
- }
- }
-
-}
+package com.geedgenetworks.api.configuration.util;
+
+import java.io.Serializable;
+
+public class LoadIntervalDataOptions implements Serializable {
+    final String name;
+
+    final long intervalMs;
+    final boolean failOnException;
+    final boolean updateDataOnStart;
+
+    /**
+     * @param name Name, used in log output and as the thread-name identifier
+     * @param intervalMs How often to refresh the data, in milliseconds
+     * @param failOnException Whether a refresh failure is fatal (default false); when true, the next call to data() after a failed refresh throws the exception
+     * @param updateDataOnStart Whether to refresh the data immediately on start (default true); when false, the first refresh happens intervalMs after start
+     */
+    private LoadIntervalDataOptions(String name, long intervalMs, boolean failOnException, boolean updateDataOnStart) {
+        this.name = name;
+        this.intervalMs = intervalMs;
+        this.failOnException = failOnException;
+        this.updateDataOnStart = updateDataOnStart;
+    }
+
+    public String getName() {
+        return name;
+    }
+
+    public long getIntervalMs() {
+        return intervalMs;
+    }
+
+    public boolean isFailOnException() {
+        return failOnException;
+    }
+
+    public boolean isUpdateDataOnStart() {
+        return updateDataOnStart;
+    }
+
+    public static Builder builder() {
+        return new Builder();
+    }
+
+    public static LoadIntervalDataOptions defaults(String name, long intervalMs) {
+        return builder().withName(name).withIntervalMs(intervalMs).build();
+    }
+
+    public static final class Builder {
+        private String name = "";
+        private long intervalMs = 1000 * 60 * 10;
+        private boolean failOnException = false;
+        private boolean updateDataOnStart = true;
+
+        public Builder withName(String name) {
+            this.name = name;
+            return this;
+        }
+
+        public Builder withIntervalMs(long intervalMs) {
+            this.intervalMs = intervalMs;
+            return this;
+        }
+
+        public Builder withFailOnException(boolean failOnException) {
+            this.failOnException = failOnException;
+            return this;
+        }
+
+        public Builder withUpdateDataOnStart(boolean updateDataOnStart) {
+            this.updateDataOnStart = updateDataOnStart;
+            return this;
+        }
+
+        public LoadIntervalDataOptions build() {
+            return new LoadIntervalDataOptions(name, intervalMs, failOnException, updateDataOnStart);
+        }
+    }
+
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/utils/LoadIntervalDataUtil.java b/groot-api/src/main/java/com/geedgenetworks/api/configuration/util/LoadIntervalDataUtil.java
index 566d217..0c92a39 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/utils/LoadIntervalDataUtil.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/configuration/util/LoadIntervalDataUtil.java
@@ -1,86 +1,86 @@
-package com.geedgenetworks.core.utils;
-
-import org.apache.flink.shaded.guava18.com.google.common.util.concurrent.ThreadFactoryBuilder;
-import org.apache.flink.util.function.SupplierWithException;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.ScheduledThreadPoolExecutor;
-import java.util.concurrent.ThreadFactory;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.atomic.AtomicBoolean;
-
-public class LoadIntervalDataUtil<T> {
- static final Logger LOG = LoggerFactory.getLogger(LoadIntervalDataUtil.class);
-
- private final SupplierWithException<T, Exception> dataSupplier;
- private final LoadIntervalDataOptions options;
-
- private final AtomicBoolean started = new AtomicBoolean(false);
- private final AtomicBoolean stopped = new AtomicBoolean(false);
- private ScheduledExecutorService scheduler;
- private volatile Exception exception;
- private volatile T data;
-
- private LoadIntervalDataUtil(SupplierWithException<T, Exception> dataSupplier, LoadIntervalDataOptions options) {
- this.dataSupplier = dataSupplier;
- this.options = options;
- }
-
- public static <T> LoadIntervalDataUtil<T> newInstance(SupplierWithException<T, Exception> dataSupplier, LoadIntervalDataOptions options) {
- LoadIntervalDataUtil<T> loadIntervalDataUtil = new LoadIntervalDataUtil(dataSupplier, options);
- loadIntervalDataUtil.start();
- return loadIntervalDataUtil;
- }
-
- public T data() throws Exception {
- if (!options.failOnException || exception == null) {
- return data;
- } else {
- throw exception;
- }
- }
-
- private void updateData() {
- try {
- LOG.info("{} updateData start....", options.name);
- data = dataSupplier.get();
- LOG.info("{} updateData end....", options.name);
- } catch (Throwable t) {
- if (options.failOnException) {
- exception = new RuntimeException(t);
- }
- LOG.info("{} updateData error", options.name, t);
- }
- }
-
- private void start() {
- if (started.compareAndSet(false, true)) {
- if (options.updateDataOnStart) {
- updateData();
- }
- this.scheduler = newDaemonSingleThreadScheduledExecutor(String.format("LoadIntervalDataUtil[%s]", options.name));
- this.scheduler.scheduleWithFixedDelay(() -> updateData(), options.intervalMs, options.intervalMs, TimeUnit.MILLISECONDS);
- LOG.info("{} start....", options.name);
- }
- }
-
- public void stop() {
- if (stopped.compareAndSet(false, true)) {
- if (scheduler != null) {
- this.scheduler.shutdown();
- }
- LOG.info("{} stop....", options.name);
- }
- }
-
- private static ScheduledExecutorService newDaemonSingleThreadScheduledExecutor(String threadName) {
- ThreadFactory threadFactory = new ThreadFactoryBuilder().setDaemon(true).setNameFormat(threadName).build();
- ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1, threadFactory);
- // By default, a cancelled task is not automatically removed from the work queue until its delay
- // elapses. We have to enable it manually.
- executor.setRemoveOnCancelPolicy(true);
- return executor;
- }
-}
+package com.geedgenetworks.api.configuration.util;
+
+import org.apache.flink.shaded.guava18.com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.apache.flink.util.function.SupplierWithException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledThreadPoolExecutor;
+import java.util.concurrent.ThreadFactory;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class LoadIntervalDataUtil<T> {
+    static final Logger LOG = LoggerFactory.getLogger(LoadIntervalDataUtil.class);
+
+    private final SupplierWithException<T, Exception> dataSupplier;
+    private final LoadIntervalDataOptions options;
+
+    private final AtomicBoolean started = new AtomicBoolean(false);
+    private final AtomicBoolean stopped = new AtomicBoolean(false);
+    private ScheduledExecutorService scheduler;
+    private volatile Exception exception;
+    private volatile T data;
+
+    private LoadIntervalDataUtil(SupplierWithException<T, Exception> dataSupplier, LoadIntervalDataOptions options) {
+        this.dataSupplier = dataSupplier;
+        this.options = options;
+    }
+
+    public static <T> LoadIntervalDataUtil<T> newInstance(SupplierWithException<T, Exception> dataSupplier, LoadIntervalDataOptions options) {
+        LoadIntervalDataUtil<T> loadIntervalDataUtil = new LoadIntervalDataUtil(dataSupplier, options);
+        loadIntervalDataUtil.start();
+        return loadIntervalDataUtil;
+    }
+
+    public T data() throws Exception {
+        if (!options.failOnException || exception == null) {
+            return data;
+        } else {
+            throw exception;
+        }
+    }
+
+    private void updateData() {
+        try {
+            LOG.info("{} updateData start....", options.name);
+            data = dataSupplier.get();
+            LOG.info("{} updateData end....", options.name);
+        } catch (Throwable t) {
+            if (options.failOnException) {
+                exception = new RuntimeException(t);
+            }
+            LOG.info("{} updateData error", options.name, t);
+        }
+    }
+
+    private void start() {
+        if (started.compareAndSet(false, true)) {
+            if (options.updateDataOnStart) {
+                updateData();
+            }
+            this.scheduler = newDaemonSingleThreadScheduledExecutor(String.format("LoadIntervalDataUtil[%s]", options.name));
+            this.scheduler.scheduleWithFixedDelay(() -> updateData(), options.intervalMs, options.intervalMs, TimeUnit.MILLISECONDS);
+            LOG.info("{} start....", options.name);
+        }
+    }
+
+    public void stop() {
+        if (stopped.compareAndSet(false, true)) {
+            if (scheduler != null) {
+                this.scheduler.shutdown();
+            }
+            LOG.info("{} stop....", options.name);
+        }
+    }
+
+    private static ScheduledExecutorService newDaemonSingleThreadScheduledExecutor(String threadName) {
+        ThreadFactory threadFactory = new ThreadFactoryBuilder().setDaemon(true).setNameFormat(threadName).build();
+        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1, threadFactory);
+        // By default, a cancelled task is not automatically removed from the work queue until its delay
+        // elapses. We have to enable it manually.
+        executor.setRemoveOnCancelPolicy(true);
+        return executor;
+    }
+}
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/Event.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/event/Event.java
index 20ecca7..2b04140 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/Event.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/event/Event.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.common;
+package com.geedgenetworks.api.connector.event;
 
 import lombok.Data;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/DynamicSchema.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/DynamicSchema.java
index 1b4f755..182218b 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/DynamicSchema.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/DynamicSchema.java
@@ -1,86 +1,86 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.geedgenetworks.core.connector.schema.utils.DynamicSchemaManager;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.commons.codec.digest.DigestUtils;
-import org.apache.commons.lang3.StringUtils;
-
-import static org.apache.flink.util.Preconditions.checkArgument;
-
-public abstract class DynamicSchema implements Schema {
- protected SchemaParser.Parser parser;
- protected StructType dataType;
- private String contentMd5;
- protected final long intervalMs;
-
- public DynamicSchema(SchemaParser.Parser parser, long intervalMs) {
- this.parser = parser;
- this.intervalMs = intervalMs;
- }
-
- public abstract String getCacheKey();
- protected abstract String getDataTypeContent();
-
- @Override
- public StructType getDataType() {
- return dataType;
- }
-
- protected boolean parseDataType(String _content){
- checkArgument(StringUtils.isNotBlank(_content), "DataType is null");
- String _contentMd5 = computeMd5(_content);
- if(_contentMd5.equals(contentMd5)){
- return false;
- }
-
- StructType type;
- if(dataType == null){
- type = parser.parser(_content);
- contentMd5 = _contentMd5;
- dataType = type;
- return true;
- }
-
- type = parser.parser(_content);
- if(dataType.equals(type)){
- return false;
- }else{
- contentMd5 = _contentMd5;
- dataType = type;
- return true;
- }
- }
-
- // Refresh and return the updated dataType; return null if nothing changed
- public StructType updateDataType(){
- String content = getDataTypeContent();
- if(StringUtils.isBlank(content)){
- return null;
- }
- if(parseDataType(content)){
- return dataType;
- }
- return null;
- }
-
- final public void registerSchemaChangeAware(SchemaChangeAware aware){
- DynamicSchemaManager.registerSchemaChangeAware(this, aware);
- }
-
- final public void unregisterSchemaChangeAware(SchemaChangeAware aware){
- DynamicSchemaManager.unregisterSchemaChangeAware(this, aware);
- }
-
- String computeMd5(String text){
- return DigestUtils.md5Hex(text);
- }
-
- public long getIntervalMs() {
- return intervalMs;
- }
-
- @Override
- final public boolean isDynamic() {
- return true;
- }
-}
+package com.geedgenetworks.api.connector.schema;
+
+import com.geedgenetworks.api.connector.schema.utils.DynamicSchemaManager;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.commons.codec.digest.DigestUtils;
+import org.apache.commons.lang3.StringUtils;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+public abstract class DynamicSchema implements Schema {
+    protected SchemaParser.Parser parser;
+    protected StructType dataType;
+    private String contentMd5;
+    protected final long intervalMs;
+
+    public DynamicSchema(SchemaParser.Parser parser, long intervalMs) {
+        this.parser = parser;
+        this.intervalMs = intervalMs;
+    }
+
+    public abstract String getCacheKey();
+    protected abstract String getDataTypeContent();
+
+    @Override
+    public StructType getDataType() {
+        return dataType;
+    }
+
+    protected boolean parseDataType(String _content){
+        checkArgument(StringUtils.isNotBlank(_content), "DataType is null");
+        String _contentMd5 = computeMd5(_content);
+        if(_contentMd5.equals(contentMd5)){
+            return false;
+        }
+
+        StructType type;
+        if(dataType == null){
+            type = parser.parser(_content);
+            contentMd5 = _contentMd5;
+            dataType = type;
+            return true;
+        }
+
+        type = parser.parser(_content);
+        if(dataType.equals(type)){
+            return false;
+        }else{
+            contentMd5 = _contentMd5;
+            dataType = type;
+            return true;
+        }
+    }
+
+    // Refresh and return the updated dataType; return null if nothing changed
+    public StructType updateDataType(){
+        String content = getDataTypeContent();
+        if(StringUtils.isBlank(content)){
+            return null;
+        }
+        if(parseDataType(content)){
+            return dataType;
+        }
+        return null;
+    }
+
+    final public void registerSchemaChangeAware(SchemaChangeAware aware){
+        DynamicSchemaManager.registerSchemaChangeAware(this, aware);
+    }
+
+    final public void unregisterSchemaChangeAware(SchemaChangeAware aware){
+        DynamicSchemaManager.unregisterSchemaChangeAware(this, aware);
+    }
+
+    String computeMd5(String text){
+        return DigestUtils.md5Hex(text);
+    }
+
+    public long getIntervalMs() {
+        return intervalMs;
+    }
+
+    @Override
+    final public boolean isDynamic() {
+        return true;
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/HttpDynamicSchema.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/HttpDynamicSchema.java
index 5eb6b87..bb8069b 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/HttpDynamicSchema.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/HttpDynamicSchema.java
@@ -1,43 +1,43 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.geedgenetworks.core.utils.HttpClientPoolUtil;
-import com.geedgenetworks.shaded.org.apache.http.Header;
-import com.geedgenetworks.shaded.org.apache.http.message.BasicHeader;
-import org.apache.flink.util.TimeUtils;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.net.URI;
-import java.time.Duration;
-
-import static org.apache.flink.util.Preconditions.checkNotNull;
-
-public class HttpDynamicSchema extends DynamicSchema{
- static final Logger LOG = LoggerFactory.getLogger(HttpDynamicSchema.class);
- private static final Header header = new BasicHeader("Content-Type", "application/x-www-form-urlencoded");
- private final String url;
- private final String key;
- public HttpDynamicSchema(String url, SchemaParser.Parser parser, long intervalMs) {
- super(parser, intervalMs);
- checkNotNull(url);
- this.url = url;
- this.key = String.format("%s_%s", url, TimeUtils.formatWithHighestUnit(Duration.ofMillis(intervalMs)));
- parseDataType(getDataTypeContent());
- }
-
- @Override
- public String getCacheKey() {
- return key;
- }
-
- @Override
- protected String getDataTypeContent() {
- try {
- String response = HttpClientPoolUtil.getInstance().httpGet(URI.create(url), header);
- return response;
- } catch (Exception e) {
- LOG.error("request " + url + " error", e);
- return null;
- }
- }
-}
+package com.geedgenetworks.api.connector.schema;
+
+import com.geedgenetworks.common.utils.HttpClientPoolUtil;
+import com.geedgenetworks.shaded.org.apache.http.Header;
+import com.geedgenetworks.shaded.org.apache.http.message.BasicHeader;
+import org.apache.flink.util.TimeUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.net.URI;
+import java.time.Duration;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+public class HttpDynamicSchema extends DynamicSchema {
+    static final Logger LOG = LoggerFactory.getLogger(HttpDynamicSchema.class);
+    private static final Header header = new BasicHeader("Content-Type", "application/x-www-form-urlencoded");
+    private final String url;
+    private final String key;
+    public HttpDynamicSchema(String url, SchemaParser.Parser parser, long intervalMs) {
+        super(parser, intervalMs);
+        checkNotNull(url);
+        this.url = url;
+        this.key = String.format("%s_%s", url, TimeUtils.formatWithHighestUnit(Duration.ofMillis(intervalMs)));
+        parseDataType(getDataTypeContent());
+    }
+
+    @Override
+    public String getCacheKey() {
+        return key;
+    }
+
+    @Override
+    protected String getDataTypeContent() {
+        try {
+            String response = HttpClientPoolUtil.getInstance().httpGet(URI.create(url), header);
+            return response;
+        } catch (Exception e) {
+            LOG.error("request " + url + " error", e);
+            return null;
+        }
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/Schema.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/Schema.java
index 6bd6764..7be573f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/Schema.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/Schema.java
@@ -1,35 +1,35 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.geedgenetworks.core.connector.schema.utils.DynamicSchemaManager;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.util.Preconditions;
-
-import java.io.Serializable;
-
-import static org.apache.flink.util.Preconditions.checkArgument;
-
-public interface Schema extends Serializable {
- StructType getDataType();
-
- boolean isDynamic();
-
- public static Schema newSchema(StructType dataType){
- return new StaticSchema(dataType);
- }
-
- public static Schema newHttpDynamicSchema(String url){
- HttpDynamicSchema dynamicSchema = new HttpDynamicSchema(url, SchemaParser.PARSER_AVRO, 1000 * 60 * 30);
- checkArgument(dynamicSchema.getDataType() != null);
- return dynamicSchema;
- }
-
- public static void registerSchemaChangeAware(Schema schema, SchemaChangeAware aware){
- Preconditions.checkArgument(schema.isDynamic());
- DynamicSchemaManager.registerSchemaChangeAware((DynamicSchema)schema, aware);
- }
-
- public static void unregisterSchemaChangeAware(Schema schema, SchemaChangeAware aware){
- Preconditions.checkArgument(schema.isDynamic());
- DynamicSchemaManager.unregisterSchemaChangeAware((DynamicSchema)schema, aware);
- }
-}
+package com.geedgenetworks.api.connector.schema;
+
+import com.geedgenetworks.api.connector.schema.utils.DynamicSchemaManager;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.util.Preconditions;
+
+import java.io.Serializable;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+public interface Schema extends Serializable {
+    StructType getDataType();
+
+    boolean isDynamic();
+
+    public static Schema newSchema(StructType dataType){
+        return new StaticSchema(dataType);
+    }
+
+    public static Schema newHttpDynamicSchema(String url){
+        HttpDynamicSchema dynamicSchema = new HttpDynamicSchema(url, SchemaParser.PARSER_AVRO, 1000 * 60 * 30);
+        checkArgument(dynamicSchema.getDataType() != null);
+        return dynamicSchema;
+    }
+
+    public static void registerSchemaChangeAware(Schema schema, SchemaChangeAware aware){
+        Preconditions.checkArgument(schema.isDynamic());
+        DynamicSchemaManager.registerSchemaChangeAware((DynamicSchema)schema, aware);
+    }
+
+    public static void unregisterSchemaChangeAware(Schema schema, SchemaChangeAware aware){
+        Preconditions.checkArgument(schema.isDynamic());
+        DynamicSchemaManager.unregisterSchemaChangeAware((DynamicSchema)schema, aware);
+    }
+}
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/SchemaChangeAware.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/SchemaChangeAware.java
new file mode 100644
index 0000000..e9485c0
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/SchemaChangeAware.java
@@ -0,0 +1,7 @@
+package com.geedgenetworks.api.connector.schema;
+
+import com.geedgenetworks.api.connector.type.StructType;
+
+public interface SchemaChangeAware {
+    void schemaChange(StructType dataType);
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/SchemaParser.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/SchemaParser.java
index a2fcc21..ff43cd3 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/SchemaParser.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/SchemaParser.java
@@ -1,112 +1,110 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.alibaba.fastjson2.JSON;
-import com.alibaba.fastjson2.JSONArray;
-import com.alibaba.fastjson2.JSONObject;
-import com.geedgenetworks.core.types.ArrayType;
-import com.geedgenetworks.core.types.DataType;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.StructType.StructField;
-import com.geedgenetworks.core.types.Types;
-
-import java.io.Serializable;
-import java.util.*;
-import java.util.stream.Collectors;
-
-public class SchemaParser {
- public static final String TYPE_BUILTIN = "builtin";
- public static final String TYPE_AVRO = "avro";
-
- public static final Parser PARSER_BUILTIN = new BuiltinParser();
- public static final Parser PARSER_AVRO = new AvroParser();
-
-
- public static Parser getParser(String type){
- if(TYPE_BUILTIN.equals(type)){
- return PARSER_BUILTIN;
- }else if(TYPE_AVRO.equals(type)){
- return PARSER_AVRO;
- }
-
- throw new UnsupportedOperationException("not supported parser:" + type);
- }
-
- public static class BuiltinParser implements Parser{
- @Override
- public StructType parser(String content){
- if(JSON.isValidArray(content)){
- return Types.parseSchemaFromJson(content);
- }else{
- return Types.parseStructType(content);
- }
- // throw new IllegalArgumentException("can not parse schema for:" + content);
- }
- }
-
- public static class AvroParser implements Parser{
- @Override
- public StructType parser(String content) {
- org.apache.avro.Schema schema = new org.apache.avro.Schema.Parser().parse(content);
- Set<String> disabledFields = getDisabledFields(JSON.parseObject(content).getJSONArray("fields"));
- return convert2StructType(schema, disabledFields);
- }
-
- private StructType convert2StructType(org.apache.avro.Schema schema, Set<String> disabledFields){
- List<org.apache.avro.Schema.Field> fields = schema.getFields();
- List<StructField> _fields = new ArrayList<>(fields.size());
- for (int i = 0; i < fields.size(); i++) {
- String fieldName = fields.get(i).name();
- if(disabledFields.contains(fieldName)){
- continue;
- }
- org.apache.avro.Schema fieldSchema = fields.get(i).schema();
- _fields.add(new StructField(fieldName, convert(fieldSchema)));
- }
- return new StructType(_fields.toArray(new StructField[_fields.size()]));
- }
-
- private DataType convert(org.apache.avro.Schema schema){
- switch (schema.getType()){
- case INT:
- return Types.INT;
- case LONG:
- return Types.BIGINT;
- case FLOAT:
- return Types.FLOAT;
- case DOUBLE:
- return Types.DOUBLE;
- case BOOLEAN:
- return Types.BOOLEAN;
- case STRING:
- return Types.STRING;
- case BYTES:
- return Types.BINARY;
- case ARRAY:
- return new ArrayType(convert(schema.getElementType()));
- case RECORD:
- return convert2StructType(schema, Collections.EMPTY_SET);
- default:
- throw new UnsupportedOperationException(schema.toString());
- }
- }
-
- private Set<String> getDisabledFields(JSONArray fields){
- Set<String> disabledFields = new HashSet<>();
- JSONObject fieldObject;
- JSONObject doc;
- for (int i = 0; i < fields.size(); i++) {
- fieldObject = fields.getJSONObject(i);
- doc = fieldObject.getJSONObject("doc");
- // Skip disabled fields
- if(doc != null && "disabled".equals(doc.getString("visibility"))){
- disabledFields.add(fieldObject.getString("name"));
- }
- }
- return disabledFields;
- }
- }
-
- public interface Parser extends Serializable {
- StructType parser(String content);
- }
-}
+package com.geedgenetworks.api.connector.schema;
+
+import com.alibaba.fastjson2.JSON;
+import com.alibaba.fastjson2.JSONArray;
+import com.alibaba.fastjson2.JSONObject;
+import com.geedgenetworks.api.connector.type.ArrayType;
+import com.geedgenetworks.api.connector.type.DataType;
+import com.geedgenetworks.api.connector.type.Types;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.StructType.StructField;
+import java.io.Serializable;
+import java.util.*;
+
+public class SchemaParser {
+    public static final String TYPE_BUILTIN = "builtin";
+    public static final String TYPE_AVRO = "avro";
+
+    public static final Parser PARSER_BUILTIN = new BuiltinParser();
+    public static final Parser PARSER_AVRO = new AvroParser();
+
+
+    public static Parser getParser(String type){
+        if(TYPE_BUILTIN.equals(type)){
+            return PARSER_BUILTIN;
+        }else if(TYPE_AVRO.equals(type)){
+            return PARSER_AVRO;
+        }
+
+        throw new UnsupportedOperationException("not supported parser:" + type);
+    }
+
+    public static class BuiltinParser implements Parser{
+        @Override
+        public StructType parser(String content){
+            if(JSON.isValidArray(content)){
+                return Types.parseSchemaFromJson(content);
+            }else{
+                return Types.parseStructType(content);
+            }
+            // throw new IllegalArgumentException("can not parse schema for:" + content);
+        }
+    }
+
+    public static class AvroParser implements Parser{
+        @Override
+        public StructType parser(String content) {
+            org.apache.avro.Schema schema = new org.apache.avro.Schema.Parser().parse(content);
+            Set<String> disabledFields = getDisabledFields(JSON.parseObject(content).getJSONArray("fields"));
+            return convert2StructType(schema, disabledFields);
+        }
+
+        private StructType convert2StructType(org.apache.avro.Schema schema, Set<String> disabledFields){
+            List<org.apache.avro.Schema.Field> fields = schema.getFields();
+            List<StructField> _fields = new ArrayList<>(fields.size());
+            for (int i = 0; i < fields.size(); i++) {
+                String fieldName = fields.get(i).name();
+                if(disabledFields.contains(fieldName)){
+                    continue;
+                }
+                org.apache.avro.Schema fieldSchema = fields.get(i).schema();
+                _fields.add(new StructField(fieldName, convert(fieldSchema)));
+            }
+            return new StructType(_fields.toArray(new StructField[_fields.size()]));
+        }
+
+        private DataType convert(org.apache.avro.Schema schema){
+            switch (schema.getType()){
+                case INT:
+                    return Types.INT;
+                case LONG:
+                    return Types.BIGINT;
+                case FLOAT:
+                    return Types.FLOAT;
+                case DOUBLE:
+                    return Types.DOUBLE;
+                case BOOLEAN:
+                    return Types.BOOLEAN;
+                case STRING:
+                    return Types.STRING;
+                case BYTES:
+                    return Types.BINARY;
+                case ARRAY:
+                    return new ArrayType(convert(schema.getElementType()));
+                case RECORD:
+                    return convert2StructType(schema, Collections.EMPTY_SET);
+                default:
+                    throw new UnsupportedOperationException(schema.toString());
+            }
+        }
+
+        private Set<String> getDisabledFields(JSONArray fields){
+            Set<String> disabledFields = new HashSet<>();
+            JSONObject fieldObject;
+            JSONObject doc;
+            for (int i = 0; i < fields.size(); i++) {
+                fieldObject = fields.getJSONObject(i);
+                doc = fieldObject.getJSONObject("doc");
+                // Skip disabled fields
+                if(doc != null && "disabled".equals(doc.getString("visibility"))){
+                    disabledFields.add(fieldObject.getString("name"));
+                }
+            }
+            return disabledFields;
+        }
+    }
+
+    public interface Parser extends Serializable {
+        StructType parser(String content);
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/StaticSchema.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/StaticSchema.java
index ab6893d..b6c741d 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/StaticSchema.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/StaticSchema.java
@@ -1,21 +1,22 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.geedgenetworks.core.types.StructType;
-
-public class StaticSchema implements Schema{
- private final StructType dataType;
-
- public StaticSchema(StructType dataType) {
- this.dataType = dataType;
- }
-
- @Override
- public StructType getDataType() {
- return dataType;
- }
-
- @Override
- final public boolean isDynamic() {
- return false;
- }
-}
+package com.geedgenetworks.api.connector.schema;
+
+
+import com.geedgenetworks.api.connector.type.StructType;
+
+public class StaticSchema implements Schema{
+    private final StructType dataType;
+
+    public StaticSchema(StructType dataType) {
+        this.dataType = dataType;
+    }
+
+    @Override
+    public StructType getDataType() {
+        return dataType;
+    }
+
+    @Override
+    final public boolean isDynamic() {
+        return false;
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/utils/DynamicSchemaManager.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/utils/DynamicSchemaManager.java
index 0ee04d2..41dc445 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/utils/DynamicSchemaManager.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/schema/utils/DynamicSchemaManager.java
@@ -1,148 +1,149 @@
-package com.geedgenetworks.core.connector.schema.utils;
-
-import com.geedgenetworks.core.connector.schema.DynamicSchema;
-import com.geedgenetworks.core.connector.schema.SchemaChangeAware;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.runtime.util.ExecutorThreadFactory;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.util.*;
-import java.util.concurrent.Executors;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-
-import static org.apache.flink.util.Preconditions.checkNotNull;
-public class DynamicSchemaManager {
- private static final Logger LOG = LoggerFactory.getLogger(DynamicSchemaManager.class);
- private static final Map<String, DynamicSchemaWithAwares> registeredSchemaWithAwares = new LinkedHashMap<>();
- private static ScheduledExecutorService scheduler = null;
-
- // Register a change-aware listener for the given dynamicSchema
- public static synchronized void registerSchemaChangeAware(DynamicSchema dynamicSchema, SchemaChangeAware aware){
- checkNotNull(dynamicSchema);
- checkNotNull(aware);
-
- String key = dynamicSchema.getCacheKey();
- DynamicSchemaWithAwares schemaWithAwares = registeredSchemaWithAwares.get(key);
- if(schemaWithAwares == null){
- schemaWithAwares = new DynamicSchemaWithAwares(dynamicSchema);
- schedule(schemaWithAwares);
- registeredSchemaWithAwares.put(key, schemaWithAwares);
- LOG.info("start schedule for {}, current contained schedules:{}", schemaWithAwares.dynamicSchema.getCacheKey(), registeredSchemaWithAwares.keySet());
- }
-
- for (SchemaChangeAware registeredAware : schemaWithAwares.awares) {
- if(registeredAware == aware){
- LOG.error("aware({}) for {} has already registered", aware, key);
- return;
- }
- }
-
- schemaWithAwares.awares.add(aware);
- LOG.info("register aware({}) for {}", aware, key);
- }
-
- // Unregister a change-aware listener for the given dynamicSchema
- public static synchronized void unregisterSchemaChangeAware(DynamicSchema dynamicSchema, SchemaChangeAware aware){
- checkNotNull(dynamicSchema);
- checkNotNull(aware);
-
- String key = dynamicSchema.getCacheKey();
- DynamicSchemaWithAwares schemaWithAwares = registeredSchemaWithAwares.get(key);
- if(schemaWithAwares == null){
- LOG.error("not register aware for {}", key);
- return;
- }
-
- Iterator<SchemaChangeAware> iter = schemaWithAwares.awares.iterator();
- SchemaChangeAware registeredAware;
- boolean find = false;
- while (iter.hasNext()){
- registeredAware = iter.next();
- if(registeredAware == aware){
- iter.remove();
- find = true;
- break;
- }
- }
-
- if(find){
- LOG.info("unregister aware({}) for {}", aware, schemaWithAwares.dynamicSchema.getCacheKey());
- if(schemaWithAwares.awares.isEmpty()){
- registeredSchemaWithAwares.remove(key);
- LOG.info("stop schedule for {}, current contained schedules:{}", schemaWithAwares.dynamicSchema.getCacheKey(), registeredSchemaWithAwares.keySet());
- }
- if(registeredSchemaWithAwares.isEmpty()){
- destroySchedule();
- }
- }else{
- LOG.error("can not find register aware({}) for {}", aware, schemaWithAwares.dynamicSchema.getCacheKey());
- }
- }
-
- private static void schedule(DynamicSchemaWithAwares schemaWithAwares){
- if(scheduler == null){
- scheduler = Executors.newScheduledThreadPool(1, new ExecutorThreadFactory("DynamicSchemaUpdateScheduler"));
- LOG.info("create SchemaUpdateScheduler");
- }
- scheduler.schedule(schemaWithAwares, schemaWithAwares.dynamicSchema.getIntervalMs(), TimeUnit.MILLISECONDS);
- }
-
- private static void destroySchedule(){
- if(scheduler != null){
- try {
- scheduler.shutdownNow();
- LOG.info("destroy SchemaUpdateScheduler");
- } catch (Exception e) {
- LOG.error("shutdown error", e);
- }
- scheduler = null;
- }
- }
-
- private static class DynamicSchemaWithAwares implements Runnable{
- DynamicSchema dynamicSchema;
- private List<SchemaChangeAware> awares;
-
- public DynamicSchemaWithAwares(DynamicSchema dynamicSchema) {
- this.dynamicSchema = dynamicSchema;
- awares = new ArrayList<>();
- }
-
- @Override
- public void run() {
- if(awares.isEmpty()){
- return;
- }
-
- try {
- update();
- } catch (Throwable e) {
- LOG.error("schema update error", e);
- }
-
- if(!awares.isEmpty()){
- scheduler.schedule(this, dynamicSchema.getIntervalMs(), TimeUnit.MILLISECONDS);
- }
- }
-
- public void update() {
- StructType dataType = dynamicSchema.updateDataType();
- // No update since the last refresh
- if(dataType == null){
- return;
- }
-
- LOG.warn("schema for {} change to:{}", dynamicSchema.getCacheKey(), dataType.simpleString());
- for (SchemaChangeAware aware : awares) {
- try {
- aware.schemaChange(dataType);
- } catch (Exception e) {
- LOG.error("schema change aware error", e);
- }
- }
- }
-
- }
-}
+package com.geedgenetworks.api.connector.schema.utils; + + +import com.geedgenetworks.api.connector.schema.DynamicSchema; +import com.geedgenetworks.api.connector.schema.SchemaChangeAware; +import com.geedgenetworks.api.connector.type.StructType; +import org.apache.flink.runtime.util.ExecutorThreadFactory; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.*; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +import static org.apache.flink.util.Preconditions.checkNotNull; +public class DynamicSchemaManager { + private static final Logger LOG = LoggerFactory.getLogger(DynamicSchemaManager.class); + private static final Map<String, DynamicSchemaWithAwares> registeredSchemaWithAwares = new LinkedHashMap<>(); + private static ScheduledExecutorService scheduler = null; + + // 注册某个dynamicSchema的监听感知 + public static synchronized void registerSchemaChangeAware(DynamicSchema dynamicSchema, SchemaChangeAware aware){ + checkNotNull(dynamicSchema); + checkNotNull(aware); + + String key = dynamicSchema.getCacheKey(); + DynamicSchemaWithAwares schemaWithAwares = registeredSchemaWithAwares.get(key); + if(schemaWithAwares == null){ + schemaWithAwares = new DynamicSchemaWithAwares(dynamicSchema); + schedule(schemaWithAwares); + registeredSchemaWithAwares.put(key, schemaWithAwares); + LOG.info("start schedule for {}, current contained schedules:{}", schemaWithAwares.dynamicSchema.getCacheKey(), registeredSchemaWithAwares.keySet()); + } + + for (SchemaChangeAware registeredAware : schemaWithAwares.awares) { + if(registeredAware == aware){ + LOG.error("aware({}) for {} has already registered", aware, key); + return; + } + } + + schemaWithAwares.awares.add(aware); + LOG.info("register aware({}) for {}", aware, key); + } + + // 注销某个dynamicSchema的监听感知 + public static synchronized void unregisterSchemaChangeAware(DynamicSchema dynamicSchema, SchemaChangeAware aware){ + checkNotNull(dynamicSchema); + checkNotNull(aware); + + String key = dynamicSchema.getCacheKey(); + DynamicSchemaWithAwares schemaWithAwares = registeredSchemaWithAwares.get(key); + if(schemaWithAwares == null){ + LOG.error("not register aware for {}", key); + return; + } + + Iterator<SchemaChangeAware> iter = schemaWithAwares.awares.iterator(); + SchemaChangeAware registeredAware; + boolean find = false; + while (iter.hasNext()){ + registeredAware = iter.next(); + if(registeredAware == aware){ + iter.remove(); + find = true; + break; + } + } + + if(find){ + LOG.info("unregister aware({}) for {}", aware, schemaWithAwares.dynamicSchema.getCacheKey()); + if(schemaWithAwares.awares.isEmpty()){ + registeredSchemaWithAwares.remove(key); + LOG.info("stop schedule for {}, current contained schedules:{}", schemaWithAwares.dynamicSchema.getCacheKey(), registeredSchemaWithAwares.keySet()); + } + if(registeredSchemaWithAwares.isEmpty()){ + destroySchedule(); + } + }else{ + LOG.error("can not find register aware({}) for {}", aware, schemaWithAwares.dynamicSchema.getCacheKey()); + } + } + + private static void schedule(DynamicSchemaWithAwares schemaWithAwares){ + if(scheduler == null){ + scheduler = Executors.newScheduledThreadPool(1, new ExecutorThreadFactory("DynamicSchemaUpdateScheduler")); + LOG.info("create SchemaUpdateScheduler"); + } + scheduler.schedule(schemaWithAwares, schemaWithAwares.dynamicSchema.getIntervalMs(), TimeUnit.MILLISECONDS); + } + + private static void destroySchedule(){ + if(scheduler != null){ + try { + 
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/format/DecodingFormat.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/serialization/DecodingFormat.java
index 3a2b3f7..95514ef 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/format/DecodingFormat.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/serialization/DecodingFormat.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.connector.format;
+package com.geedgenetworks.api.connector.serialization;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.StructType;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
 import org.apache.flink.api.common.serialization.DeserializationSchema;
 
 public interface DecodingFormat {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/format/EncodingFormat.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/serialization/EncodingFormat.java
index 41afbbc..c8f9ce5 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/format/EncodingFormat.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/serialization/EncodingFormat.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.connector.format;
+package com.geedgenetworks.api.connector.serialization;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.StructType;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
 import org.apache.flink.api.common.serialization.SerializationSchema;
 
 public interface EncodingFormat {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/format/MapDeserialization.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/serialization/MapDeserialization.java
index 7887097..b3e3a13 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/format/MapDeserialization.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/serialization/MapDeserialization.java
@@ -1,8 +1,8 @@
-package com.geedgenetworks.core.connector.format;
-
-import java.io.IOException;
-import java.util.Map;
-
-public interface MapDeserialization {
- Map<String, Object> deserializeToMap(byte[] bytes) throws IOException;
-}
+package com.geedgenetworks.api.connector.serialization;
+
+import java.io.IOException;
+import java.util.Map;
+
+public interface MapDeserialization {
+    Map<String, Object> deserializeToMap(byte[] bytes) throws IOException;
+}
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkConfig.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkConfig.java
new file mode 100644
index 0000000..1c9c27d
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkConfig.java
@@ -0,0 +1,14 @@
+package com.geedgenetworks.api.connector.sink;
+
+import lombok.Data;
+
+import java.io.Serializable;
+import java.util.Map;
+
+@Data
+public class SinkConfig implements Serializable {
+    private String name;
+    private String type;
+    private Map<String, Object> schema;
+    private Map<String, String> properties;
+}
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/SinkConfigOptions.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkConfigOptions.java
index 1662bcb..12011aa 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/config/SinkConfigOptions.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkConfigOptions.java
@@ -1,4 +1,7 @@
-package com.geedgenetworks.common.config;
+package com.geedgenetworks.api.connector.sink;
+
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
 
 import java.util.Map;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/sink/SinkProvider.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkProvider.java
index f143f7f..19c8fe4 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/sink/SinkProvider.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkProvider.java
@@ -1,6 +1,6 @@
-package com.geedgenetworks.core.connector.sink;
+package com.geedgenetworks.api.connector.sink;
 
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.streaming.api.datastream.DataStream;
 import org.apache.flink.streaming.api.datastream.DataStreamSink;
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkTableFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkTableFactory.java
new file mode 100644
index 0000000..ae5b390
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/sink/SinkTableFactory.java
@@ -0,0 +1,8 @@
+package com.geedgenetworks.api.connector.sink;
+
+import com.geedgenetworks.api.factory.ConnectorFactory;
+
+public interface SinkTableFactory extends ConnectorFactory {
+    SinkProvider getSinkProvider(Context context);
+}
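To show how the relocated sink SPI fits together, here is a sketch of a connector implementing `SinkTableFactory`. The `print` identifier and the private helper are invented for illustration, and `SinkProvider`'s own method signature is not part of this diff, so its construction is deliberately elided:

```java
import com.geedgenetworks.api.connector.sink.SinkProvider;
import com.geedgenetworks.api.connector.sink.SinkTableFactory;
import org.apache.flink.configuration.ConfigOption;

import java.util.Collections;
import java.util.Map;
import java.util.Set;

public class PrintSinkTableFactory implements SinkTableFactory {

    @Override
    public String type() {
        return "print"; // hypothetical identifier; matches the `type:` key in the job YAML
    }

    @Override
    public Set<ConfigOption<?>> requiredOptions() {
        return Collections.emptySet();
    }

    @Override
    public Set<ConfigOption<?>> optionalOptions() {
        return Collections.emptySet();
    }

    @Override
    public SinkProvider getSinkProvider(Context context) {
        // Context.getOptions() is part of ConnectorFactory.Context (shown
        // further down in this merge).
        Map<String, String> options = context.getOptions();
        return createPrintProvider(options);
    }

    // Hypothetical helper: SinkProvider's contract is not shown in this hunk,
    // so a real implementation would build the provider here.
    private SinkProvider createPrintProvider(Map<String, String> options) {
        throw new UnsupportedOperationException("illustrative sketch only");
    }
}
```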
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceConfig.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceConfig.java
new file mode 100644
index 0000000..d7f7393
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceConfig.java
@@ -0,0 +1,16 @@
+package com.geedgenetworks.api.connector.source;
+
+import lombok.Data;
+
+import java.io.Serializable;
+import java.util.Map;
+
+@Data
+public class SourceConfig implements Serializable {
+    private String name;
+    private String type;
+    private Map<String, Object> schema;
+    private String watermark_timestamp;
+    private String watermark_timestamp_unit = "ms";
+    private Long watermark_lag;
+    private Map<String, String> properties;
+}
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/SourceConfigOptions.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceConfigOptions.java
index 4192fe9..cec53ce 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/config/SourceConfigOptions.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceConfigOptions.java
@@ -1,6 +1,9 @@
-package com.geedgenetworks.common.config;
+package com.geedgenetworks.api.connector.source;
 
 import com.alibaba.fastjson2.TypeReference;
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
+
 import java.util.List;
 import java.util.Map;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/source/SourceProvider.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceProvider.java
index 4fc08dd..37a2d49 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/source/SourceProvider.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceProvider.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.connector.source;
+package com.geedgenetworks.api.connector.source;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.StructType;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
 import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
 import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceTableFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceTableFactory.java
new file mode 100644
index 0000000..404fdd5
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/source/SourceTableFactory.java
@@ -0,0 +1,7 @@
+package com.geedgenetworks.api.connector.source;
+
+import com.geedgenetworks.api.factory.ConnectorFactory;
+
+public interface SourceTableFactory extends ConnectorFactory {
+    SourceProvider getSourceProvider(Context context);
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/ArrayType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/ArrayType.java
index a9ccd77..e8fae36 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/ArrayType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/ArrayType.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 public class ArrayType extends DataType {
     public DataType elementType;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/BinaryType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/BinaryType.java
index ca83d61..3d3b5f0 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/BinaryType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/BinaryType.java
@@ -1,8 +1,8 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 public class BinaryType extends DataType {
 
-    BinaryType() {
+    public BinaryType() {
     }
 
     @Override
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/connector/type/BooleanType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/BooleanType.java
new file mode 100644
index 0000000..99e29e3
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/BooleanType.java
@@ -0,0 +1,11 @@
+package com.geedgenetworks.api.connector.type;
+
+public class BooleanType extends DataType {
+    public BooleanType() {
+    }
+
+    @Override
+    public String simpleString() {
+        return "boolean";
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/DataType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/DataType.java
index 02d141e..f1f222f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/DataType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/DataType.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 import java.io.Serializable;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/DoubleType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/DoubleType.java
index 5728e1f..96af23b 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/DoubleType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/DoubleType.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 public class DoubleType extends DataType {
-    DoubleType() {
+    public DoubleType() {
     }
     @Override
     public String simpleString() {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/FloatType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/FloatType.java
index 8984c57..5decc23 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/FloatType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/FloatType.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 public class FloatType extends DataType {
-    FloatType() {
+    public FloatType() {
     }
     @Override
     public String simpleString() {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/IntegerType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/IntegerType.java
index 82c7752..6dd9864 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/IntegerType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/IntegerType.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 public class IntegerType extends DataType {
-    IntegerType() {
+    public IntegerType() {
    }
     @Override
     public String simpleString() {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/LongType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/LongType.java
index 52a35d9..fa4bf79 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/LongType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/LongType.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 public class LongType extends DataType {
-    LongType() {
+    public LongType() {
     }
     @Override
     public String simpleString() {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/StringType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/StringType.java
index 91f95e0..d411aa1 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/StringType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/StringType.java
@@ -1,7 +1,8 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
+
 public class StringType extends DataType {
-    StringType() {
+    public StringType() {
     }
     @Override
     public String simpleString() {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/StructType.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/StructType.java
index f816ad4..eeb5aef 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/StructType.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/StructType.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.core.types;
+package com.geedgenetworks.api.connector.type;
 
 import org.apache.commons.lang3.StringUtils;
@@ -23,7 +23,7 @@ public class StructType extends DataType {
             return false;
         }
         StructField[] otherFields = ((StructType) o).fields;
-        return java.util.Arrays.equals(fields, otherFields);
+        return Arrays.equals(fields, otherFields);
     }
 
     @Override
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/Types.java b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/Types.java
index 7cc3d3a..9a1fd45 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/types/Types.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/connector/type/Types.java
@@ -1,144 +1,144 @@
-package com.geedgenetworks.core.types;
-
-import com.alibaba.fastjson2.JSON;
-import com.alibaba.fastjson2.JSONArray;
-import com.alibaba.fastjson2.JSONObject;
-import com.geedgenetworks.core.types.StructType.StructField;
-
-import java.util.ArrayList;
-import java.util.List;
-import java.util.regex.Matcher;
-import java.util.regex.Pattern;
-
-public class Types {
- public static final IntegerType INT = new IntegerType();
- public static final LongType BIGINT = new LongType();
- public static final StringType STRING = new StringType();
- public static final FloatType FLOAT = new FloatType();
- public static final DoubleType DOUBLE = new DoubleType();
- public static final BooleanType BOOLEAN = new BooleanType();
- public static final BinaryType BINARY = new BinaryType();
-
- public static final Pattern ARRAY_RE = Pattern.compile("array\\s*<(.+)>", Pattern.CASE_INSENSITIVE);
- public static final Pattern STRUCT_RE = Pattern.compile("struct\\s*<(.+)>", Pattern.CASE_INSENSITIVE);
-
- public static StructType parseSchemaFromJson(String jsonFields) {
- JSONArray fieldArray = JSON.parseArray(jsonFields);
- StructField[] fields = new StructField[fieldArray.size()];
-
- for (int i = 0; i < fieldArray.size(); i++) {
- JSONObject fieldObject = fieldArray.getJSONObject(i);
- String name = fieldObject.getString("name").trim();
- String type = fieldObject.getString("type").trim();
- DataType dataType = parseDataType(type);
- fields[i] = new StructField(name, dataType);
- }
-
- return new StructType(fields);
- }
-
- // Parse the fields inside a struct<...>
- public static StructType parseStructType(String str){
- // Parses correctly whether or not the outer struct<> wrapper is present
- Matcher matcher = STRUCT_RE.matcher(str);
- if(matcher.matches()){
- str = matcher.group(1);
- }
-
- List<StructField> fields = new ArrayList<>();
- int startPos = 0, endPos = -1;
- int i = startPos + 1;
- int level = 0;
- while (i < str.length()){
- while (i < str.length()){
- if(str.charAt(i) == ':'){
- endPos = i;
- break;
- }
- i++;
- }
-
- if(endPos <= startPos){
- throw new UnsupportedOperationException("can not parse " + str);
- }
-
- String name = str.substring(startPos, endPos).trim();
- startPos = i + 1;
- endPos = -1;
- i = startPos + 1;
- while (i < str.length()){
- if(str.charAt(i) == ',' && level == 0){
- endPos = i;
- break;
- }
- if(str.charAt(i) == '<'){
- level++;
- }
- if(str.charAt(i) == '>'){
- level--;
- }
- i++;
- }
-
- if(i == str.length()){
- endPos = i;
- }
- if(endPos <= startPos){
- throw new UnsupportedOperationException("can not parse " + str);
- }
-
- String tp = str.substring(startPos, endPos).trim();
- fields.add(new StructField(name, parseDataType(tp)));
-
- i++;
- startPos = i;
- endPos = -1;
- }
-
- return new StructType(fields.toArray(new StructField[fields.size()]));
- }
-
- public static DataType parseDataType(String type){
- type = type.trim();
- if("int".equalsIgnoreCase(type)){
- return INT;
- } else if ("bigint".equalsIgnoreCase(type)){
- return BIGINT;
- } else if ("string".equalsIgnoreCase(type)){
- return STRING;
- } else if ("float".equalsIgnoreCase(type)){
- return FLOAT;
- } else if ("double".equalsIgnoreCase(type)){
- return DOUBLE;
- } else if ("boolean".equalsIgnoreCase(type)){
- return BOOLEAN;
- } else if ("binary".equalsIgnoreCase(type)){
- return BINARY;
- }
-
- // array type
- Matcher matcher = ARRAY_RE.matcher(type);
- if(matcher.matches()){
- String eleType = matcher.group(1);
- DataType elementType = parseDataType(eleType);
- return new ArrayType(elementType);
- }
-
- // struct type
- matcher = STRUCT_RE.matcher(type);
- if(matcher.matches()){
- String str = matcher.group(1);
- return parseStructType(str);
- }
-
- throw new UnsupportedOperationException("not support type:" + type);
- }
-
- static void buildFormattedString(DataType dataType, String prefix, StringBuilder sb, int maxDepth){
- if(dataType instanceof ArrayType){
- ((ArrayType)dataType).buildFormattedString(prefix, sb, maxDepth - 1);
- } else if (dataType instanceof StructType) {
- ((StructType)dataType).buildFormattedString(prefix, sb, maxDepth - 1);
- }
- }
-}
+package com.geedgenetworks.api.connector.type;
+
+import com.alibaba.fastjson2.JSON;
+import com.alibaba.fastjson2.JSONArray;
+import com.alibaba.fastjson2.JSONObject;
+import com.geedgenetworks.api.connector.type.StructType.StructField;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+public class Types {
+    public static final IntegerType INT = new IntegerType();
+    public static final LongType BIGINT = new LongType();
+    public static final StringType STRING = new StringType();
+    public static final FloatType FLOAT = new FloatType();
+    public static final DoubleType DOUBLE = new DoubleType();
+    public static final BooleanType BOOLEAN = new BooleanType();
+    public static final BinaryType BINARY = new BinaryType();
+
+    public static final Pattern ARRAY_RE = Pattern.compile("array\\s*<(.+)>", Pattern.CASE_INSENSITIVE);
+    public static final Pattern STRUCT_RE = Pattern.compile("struct\\s*<(.+)>", Pattern.CASE_INSENSITIVE);
+
+    public static StructType parseSchemaFromJson(String jsonFields) {
+        JSONArray fieldArray = JSON.parseArray(jsonFields);
+        StructField[] fields = new StructField[fieldArray.size()];
+
+        for (int i = 0; i < fieldArray.size(); i++) {
+            JSONObject fieldObject = fieldArray.getJSONObject(i);
+            String name = fieldObject.getString("name").trim();
+            String type = fieldObject.getString("type").trim();
+            DataType dataType = parseDataType(type);
+            fields[i] = new StructField(name, dataType);
+        }
+
+        return new StructType(fields);
+    }
+
+    // Parse the fields inside a struct<...>
+    public static StructType parseStructType(String str) {
+        // Parses correctly whether or not the outer struct<> wrapper is present
+        Matcher matcher = STRUCT_RE.matcher(str);
+        if (matcher.matches()) {
+            str = matcher.group(1);
+        }
+
+        List<StructField> fields = new ArrayList<>();
+        int startPos = 0, endPos = -1;
+        int i = startPos + 1;
+        int level = 0;
+        while (i < str.length()) {
+            while (i < str.length()) {
+                if (str.charAt(i) == ':') {
+                    endPos = i;
+                    break;
+                }
+                i++;
+            }
+
+            if (endPos <= startPos) {
+                throw new UnsupportedOperationException("can not parse " + str);
+            }
+
+            String name = str.substring(startPos, endPos).trim();
+            startPos = i + 1;
+            endPos = -1;
+            i = startPos + 1;
+            while (i < str.length()) {
+                if (str.charAt(i) == ',' && level == 0) {
+                    endPos = i;
+                    break;
+                }
+                if (str.charAt(i) == '<') {
+                    level++;
+                }
+                if (str.charAt(i) == '>') {
+                    level--;
+                }
+                i++;
+            }
+
+            if (i == str.length()) {
+                endPos = i;
+            }
+            if (endPos <= startPos) {
+                throw new UnsupportedOperationException("can not parse " + str);
+            }
+
+            String tp = str.substring(startPos, endPos).trim();
+            fields.add(new StructField(name, parseDataType(tp)));
+
+            i++;
+            startPos = i;
+            endPos = -1;
+        }
+
+        return new StructType(fields.toArray(new StructField[fields.size()]));
+    }
+
+    public static DataType parseDataType(String type) {
+        type = type.trim();
+        if ("int".equalsIgnoreCase(type)) {
+            return INT;
+        } else if ("bigint".equalsIgnoreCase(type)) {
+            return BIGINT;
+        } else if ("string".equalsIgnoreCase(type)) {
+            return STRING;
+        } else if ("float".equalsIgnoreCase(type)) {
+            return FLOAT;
+        } else if ("double".equalsIgnoreCase(type)) {
+            return DOUBLE;
+        } else if ("boolean".equalsIgnoreCase(type)) {
+            return BOOLEAN;
+        } else if ("binary".equalsIgnoreCase(type)) {
+            return BINARY;
+        }
+
+        // array type
+        Matcher matcher = ARRAY_RE.matcher(type);
+        if (matcher.matches()) {
+            String eleType = matcher.group(1);
+            DataType elementType = parseDataType(eleType);
+            return new ArrayType(elementType);
+        }
+
+        // struct type
+        matcher = STRUCT_RE.matcher(type);
+        if (matcher.matches()) {
+            String str = matcher.group(1);
+            return parseStructType(str);
+        }
+
+        throw new UnsupportedOperationException("not support type:" + type);
+    }
+
+    static void buildFormattedString(DataType dataType, String prefix, StringBuilder sb, int maxDepth) {
+        if (dataType instanceof ArrayType) {
+            ((ArrayType) dataType).buildFormattedString(prefix, sb, maxDepth - 1);
+        } else if (dataType instanceof StructType) {
+            ((StructType) dataType).buildFormattedString(prefix, sb, maxDepth - 1);
+        }
+    }
+}
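The relocated `Types` parser accepts Hive-style type strings, including nested `array<...>` and `struct<...>`; the bracket-depth counter (`level`) keeps commas inside nested generics from splitting fields. A small round-trip runnable against the class above (`TypesDemo` is just a scratch harness):

```java
import com.geedgenetworks.api.connector.type.DataType;
import com.geedgenetworks.api.connector.type.StructType;
import com.geedgenetworks.api.connector.type.Types;

public class TypesDemo {
    public static void main(String[] args) {
        // Scalar lookup returns the shared singleton, e.g. Types.BIGINT.
        DataType scalar = Types.parseDataType("bigint");

        // Nested parse: the outer struct<> wrapper is optional.
        StructType nested = Types.parseStructType("struct<id:int,tags:array<string>>");
        System.out.println(nested.simpleString());

        // JSON field lists are also supported:
        StructType fromJson = Types.parseSchemaFromJson(
                "[{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"payload\",\"type\":\"binary\"}]");
    }
}
```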
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/TableFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/ConnectorFactory.java
index affeead..1697a24 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/TableFactory.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/ConnectorFactory.java
@@ -1,56 +1,56 @@
-package com.geedgenetworks.core.factories;
-
-import com.geedgenetworks.core.connector.schema.Schema;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.configuration.Configuration;
-
-import java.util.Map;
-
-public interface TableFactory extends Factory {
-
- public static class Context {
- private final Schema schema;
- private final Map<String, String> options;
- private final Configuration configuration;
-
- public Context(Schema schema, Map<String, String> options, Configuration configuration) {
- this.schema = schema;
- this.options = options;
- this.configuration = configuration;
- }
-
- public Schema getSchema() {
- return schema;
- }
-
- public StructType getPhysicalDataType() {
- if(schema == null){
- return null;
- }else{
- if(schema.isDynamic()){
- throw new UnsupportedOperationException("DynamicSchema");
- }
- return schema.getDataType();
- }
- }
-
- public StructType getDataType() {
- if(schema == null){
- return null;
- }else{
- if(schema.isDynamic()){
- throw new UnsupportedOperationException("DynamicSchema");
- }
- return schema.getDataType();
- }
- }
-
- public Map<String, String> getOptions() {
- return options;
- }
-
- public Configuration getConfiguration() {
- return configuration;
- }
- }
-}
+package com.geedgenetworks.api.factory;
+
+import com.geedgenetworks.api.connector.schema.Schema;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.configuration.Configuration;
+
+import java.util.Map;
+
+public interface ConnectorFactory extends Factory {
+
+    public static class Context {
+        private final Schema schema;
+        private final Map<String, String> options;
+        private final Configuration configuration;
+
+        public Context(Schema schema, Map<String, String> options, Configuration configuration) {
+            this.schema = schema;
+            this.options = options;
+            this.configuration = configuration;
+        }
+
+        public Schema getSchema() {
+            return schema;
+        }
+
+        public StructType getPhysicalDataType() {
+            if (schema == null) {
+                return null;
+            } else {
+                if (schema.isDynamic()) {
+                    throw new UnsupportedOperationException("DynamicSchema");
+                }
+                return schema.getDataType();
+            }
+        }
+
+        public StructType getDataType() {
+            if (schema == null) {
+                return null;
+            } else {
+                if (schema.isDynamic()) {
+                    throw new UnsupportedOperationException("DynamicSchema");
+                }
+                return schema.getDataType();
+            }
+        }
+
+        public Map<String, String> getOptions() {
+            return options;
+        }
+
+        public Configuration getConfiguration() {
+            return configuration;
+        }
+    }
+}
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/factory/DecodingFormatFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/DecodingFormatFactory.java
new file mode 100644
index 0000000..9d06dc3
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/DecodingFormatFactory.java
@@ -0,0 +1,10 @@
+package com.geedgenetworks.api.factory;
+
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import org.apache.flink.configuration.ReadableConfig;
+
+public interface DecodingFormatFactory extends FormatFactory {
+    DecodingFormat createDecodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions);
+}
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/factory/EncodingFormatFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/EncodingFormatFactory.java
new file mode 100644
index 0000000..fca9273
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/EncodingFormatFactory.java
@@ -0,0 +1,8 @@
+package com.geedgenetworks.api.factory;
+
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import org.apache.flink.configuration.ReadableConfig;
+
+public interface EncodingFormatFactory extends FormatFactory {
+    EncodingFormat createEncodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions);
+}
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/factory/Factory.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/Factory.java
new file mode 100644
index 0000000..e8b1da2
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/Factory.java
@@ -0,0 +1,17 @@
+package com.geedgenetworks.api.factory;
+
+import org.apache.flink.configuration.ConfigOption;
+
+import java.util.Set;
+
+public interface Factory {
+    /**
+     * Returns the factory identifier.
+     * If multiple factories exist for different versions, a version should be appended using "-".
+     * (e.g. {@code kafka-1}).
+     */
+    String type();
+
+    Set<ConfigOption<?>> requiredOptions();
+    Set<ConfigOption<?>> optionalOptions();
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/FactoryUtil.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/FactoryUtil.java
index a120ca5..8c5a7eb 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/FactoryUtil.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/FactoryUtil.java
@@ -1,7 +1,7 @@
-package com.geedgenetworks.core.factories;
+package com.geedgenetworks.api.factory;
 
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
 import org.apache.flink.configuration.*;
 import org.apache.flink.table.api.TableException;
 import org.apache.flink.table.api.ValidationException;
@@ -74,12 +74,12 @@ public final class FactoryUtil {
     }
 
     public static TableFactoryHelper createTableFactoryHelper(
-            TableFactory factory, TableFactory.Context context) {
+            ConnectorFactory factory, ConnectorFactory.Context context) {
         return new TableFactoryHelper(factory, context);
     }
 
     public static <T extends Factory> T discoverFactory(
-            ClassLoader classLoader, Class<T> factoryClass, String factoryIdentifier) {
+            ClassLoader classLoader, Class<T> factoryClass, String type) {
         final List<Factory> factories = discoverFactories(classLoader);
 
         final List<Factory> foundFactories =
@@ -96,7 +96,7 @@ public final class FactoryUtil {
         final List<Factory> matchingFactories =
                 foundFactories.stream()
-                        .filter(f -> f.factoryIdentifier().equals(factoryIdentifier))
+                        .filter(f -> f.type().equals(type))
                         .collect(Collectors.toList());
 
         if (matchingFactories.isEmpty()) {
@@ -105,10 +105,10 @@ public final class FactoryUtil {
                             "Could not find any factory for identifier '%s' that implements '%s' in the classpath.\n\n"
                                     + "Available factory identifiers are:\n\n"
                                     + "%s",
-                            factoryIdentifier,
+                            type,
                             factoryClass.getName(),
                             foundFactories.stream()
-                                    .map(Factory::factoryIdentifier)
+                                    .map(Factory::type)
                                     .distinct()
                                     .sorted()
                                     .collect(Collectors.joining("\n"))));
@@ -119,7 +119,7 @@ public final class FactoryUtil {
                             "Multiple factories for identifier '%s' that implement '%s' found in the classpath.\n\n"
                                     + "Ambiguous factory classes are:\n\n"
                                     + "%s",
-                            factoryIdentifier,
+                            type,
                             factoryClass.getName(),
                             matchingFactories.stream()
                                     .map(f -> f.getClass().getName())
@@ -154,7 +154,12 @@ public final class FactoryUtil {
         return result;
     }
 
-    public static <T extends TableFactory> T discoverTableFactory(
+    public static <T extends ProcessorFactory> T discoverProcessorFactory(
+            Class<T> factoryClass, String type) {
+        return discoverFactory(Thread.currentThread().getContextClassLoader(), factoryClass, type);
+    }
+
+    public static <T extends ConnectorFactory> T discoverConnectorFactory(
             Class<T> factoryClass, String connector) {
         return discoverFactory(Thread.currentThread().getContextClassLoader(), factoryClass, connector);
     }
@@ -202,12 +207,12 @@ public final class FactoryUtil {
     }
 
     public static class TableFactoryHelper {
-        private final TableFactory factory;
-        private final TableFactory.Context context;
+        private final ConnectorFactory factory;
+        private final ConnectorFactory.Context context;
         private final Configuration allOptions;
         private final Set<String> consumedOptionKeys;
 
-        public TableFactoryHelper(TableFactory factory, TableFactory.Context context) {
+        public TableFactoryHelper(ConnectorFactory factory, ConnectorFactory.Context context) {
             this.factory = factory;
             this.context = context;
             this.allOptions = context.getConfiguration();
@@ -225,7 +230,7 @@ public final class FactoryUtil {
         public void validate() {
             validateFactoryOptions(factory, allOptions);
             validateUnconsumedKeys(
-                    factory.factoryIdentifier(),
+                    factory.type(),
                     allOptions.keySet(),
                     consumedOptionKeys);
         }
@@ -270,7 +275,7 @@ public final class FactoryUtil {
             throw new ValidationException(
                     String.format(
                             "Error creating sink format '%s' in option space '%s'.",
-                            formatFactory.factoryIdentifier(),
+                            formatFactory.type(),
                             formatPrefix),
                     t);
         }
@@ -308,7 +313,7 @@ public final class FactoryUtil {
             throw new ValidationException(
                     String.format(
                             "Error creating scan format '%s' in option space '%s'.",
-                            formatFactory.factoryIdentifier(),
+                            formatFactory.type(),
                             formatPrefix),
                     t);
         }
@@ -338,7 +343,7 @@ public final class FactoryUtil {
         return Optional.of(factory);
     }
 
     private String formatPrefix(Factory formatFactory, ConfigOption<String> formatOption) {
-        String identifier = formatFactory.factoryIdentifier();
+        String identifier = formatFactory.type();
         return getFormatPrefix(formatOption, identifier);
     }
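After this rename, connectors are looked up by the string returned from `type()` rather than `factoryIdentifier()`. A sketch of how discovery is expected to be wired from calling code; the `kafka` identifier is an assumption, and the exact service interface consulted by `discoverFactories` is not shown in this hunk:

```java
import com.geedgenetworks.api.connector.source.SourceTableFactory;
import com.geedgenetworks.api.factory.FactoryUtil;

public class DiscoveryDemo {
    public static void main(String[] args) {
        // Assumes a SourceTableFactory whose type() returns "kafka" is on the
        // classpath and registered through a META-INF/services entry
        // (ServiceLoader-based discovery via the thread context classloader).
        SourceTableFactory factory =
                FactoryUtil.discoverConnectorFactory(SourceTableFactory.class, "kafka");
        System.out.println("Found factory: " + factory.getClass().getName());
    }
}
```

If no factory (or more than one) matches the identifier, `discoverFactory` fails with the descriptive `ValidationException` messages shown above.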
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/FormatFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/FormatFactory.java
index 49889ca..9ca8572 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/FormatFactory.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/FormatFactory.java
@@ -1,4 +1,5 @@
-package com.geedgenetworks.core.factories;
+package com.geedgenetworks.api.factory;
 
 public interface FormatFactory extends Factory {
+
 }
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/factory/ProcessorFactory.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/ProcessorFactory.java
new file mode 100644
index 0000000..3928f02
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/ProcessorFactory.java
@@ -0,0 +1,7 @@
+package com.geedgenetworks.api.factory;
+
+import com.geedgenetworks.api.processor.Processor;
+
+public interface ProcessorFactory extends Factory {
+    Processor<?> createProcessor();
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/ServiceLoaderUtil.java b/groot-api/src/main/java/com/geedgenetworks/api/factory/ServiceLoaderUtil.java
index 7a97a57..222146e 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/ServiceLoaderUtil.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/factory/ServiceLoaderUtil.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.core.factories;
+package com.geedgenetworks.api.factory;
 
 import java.util.*;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/metrics/InternalMetrics.java b/groot-api/src/main/java/com/geedgenetworks/api/metrics/InternalMetrics.java
index 0bd3cc2..3192655 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/metrics/InternalMetrics.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/metrics/InternalMetrics.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.core.metrics;
+package com.geedgenetworks.api.metrics;
 
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.apache.flink.metrics.Counter;
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/processor/Processor.java b/groot-api/src/main/java/com/geedgenetworks/api/processor/Processor.java
new file mode 100644
index 0000000..fede994
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/processor/Processor.java
@@ -0,0 +1,15 @@
+package com.geedgenetworks.api.processor;
+
+import com.geedgenetworks.api.connector.event.Event;
+import com.typesafe.config.Config;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import java.io.Serializable;
+
+public interface Processor<T extends ProcessorConfig> extends Serializable {
+
+    DataStream<Event> process(StreamExecutionEnvironment env, DataStream<Event> input, T processorConfig);
+
+    T parseConfig(String name, Config config);
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/ProcessorConfig.java b/groot-api/src/main/java/com/geedgenetworks/api/processor/ProcessorConfig.java
index 18fb300..325deac 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/ProcessorConfig.java
+++ b/groot-api/src/main/java/com/geedgenetworks/api/processor/ProcessorConfig.java
@@ -1,16 +1,14 @@
-package com.geedgenetworks.core.pojo;
+package com.geedgenetworks.api.processor;
 
 import lombok.Data;
 
 import java.io.Serializable;
-import java.util.List;
 import java.util.Map;
+
 @Data
 public class ProcessorConfig implements Serializable {
+    private String name;
     private String type;
     private int parallelism;
     private Map<String, Object> properties;
-    private String name;
-    private List<String> output_fields;
-    private List<String> remove_fields;
 }
diff --git a/groot-api/src/main/java/com/geedgenetworks/api/processor/ProcessorConfigOptions.java b/groot-api/src/main/java/com/geedgenetworks/api/processor/ProcessorConfigOptions.java
new file mode 100644
index 0000000..f5511c7
--- /dev/null
+++ b/groot-api/src/main/java/com/geedgenetworks/api/processor/ProcessorConfigOptions.java
@@ -0,0 +1,34 @@
+package com.geedgenetworks.api.processor;
+
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
+import java.util.Map;
+
+public interface ProcessorConfigOptions {
+    Option<String> NAME = Options.key("name")
+            .stringType()
+            .noDefaultValue()
+            .withDescription("The name of the operator.");
+
+    Option<String> TYPE = Options.key("type")
+            .stringType()
+            .noDefaultValue()
+            .withDescription("The type of operator.");
+
+    Option<Integer> PARALLELISM = Options.key("parallelism")
+            .intType()
+            .defaultValue(1)
+            .withDescription("The parallelism of the operator.");
+
+    Option<Map<String, String>> PROPERTIES = Options.key("properties")
+            .mapType()
+            .noDefaultValue()
+            .withDescription("Custom properties for the processor.");
+}
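The new `Processor` SPI replaces the per-stage executors that are deleted later in this merge. A minimal pass-through implementation against the interfaces above; the class name is illustrative, and the `Event`-level mapping is deliberately trivial since `Event`'s accessor API is not part of this diff:

```java
import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.api.processor.Processor;
import com.geedgenetworks.api.processor.ProcessorConfig;
import com.geedgenetworks.api.processor.ProcessorConfigOptions;
import com.typesafe.config.Config;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PassThroughProcessor implements Processor<ProcessorConfig> {

    @Override
    public DataStream<Event> process(StreamExecutionEnvironment env,
                                     DataStream<Event> input,
                                     ProcessorConfig config) {
        // Illustrative: apply the configured parallelism and forward events
        // unchanged; a real processor would map/flatMap the stream here.
        return input.map(event -> event)
                    .setParallelism(config.getParallelism());
    }

    @Override
    public ProcessorConfig parseConfig(String name, Config config) {
        // Populate the Lombok-generated setters from the HOCON block, reusing
        // the option keys defined in ProcessorConfigOptions.
        ProcessorConfig pc = new ProcessorConfig();
        pc.setName(name);
        pc.setType(config.getString(ProcessorConfigOptions.TYPE.key()));
        if (config.hasPath(ProcessorConfigOptions.PARALLELISM.key())) {
            pc.setParallelism(config.getInt(ProcessorConfigOptions.PARALLELISM.key()));
        }
        return pc;
    }
}
```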
diff --git a/groot-bootstrap/pom.xml b/groot-bootstrap/pom.xml
index 24a202a..60e602a 100644
--- a/groot-bootstrap/pom.xml
+++ b/groot-bootstrap/pom.xml
@@ -18,14 +18,25 @@
     <dependencies>
 
         <dependency>
+            <groupId>com.beust</groupId>
+            <artifactId>jcommander</artifactId>
+        </dependency>
+
+        <dependency>
             <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-common</artifactId>
+            <artifactId>groot-core</artifactId>
             <version>${revision}</version>
         </dependency>
 
         <dependency>
             <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-core</artifactId>
+            <artifactId>groot-api</artifactId>
+            <version>${revision}</version>
+        </dependency>
+
+        <dependency>
+            <groupId>com.geedgenetworks</groupId>
+            <artifactId>groot-common</artifactId>
             <version>${revision}</version>
         </dependency>
 
@@ -99,6 +110,13 @@
             <scope>${scope}</scope>
         </dependency>
 
+        <dependency>
+            <groupId>com.geedgenetworks</groupId>
+            <artifactId>format-raw</artifactId>
+            <version>${revision}</version>
+            <scope>${scope}</scope>
+        </dependency>
+
         <!-- Idea debug dependencies -->
         <dependency>
             <groupId>org.apache.flink</groupId>
@@ -122,13 +140,6 @@
         </dependency>
 
         <dependency>
-            <groupId>com.geedgenetworks</groupId>
-            <artifactId>format-raw</artifactId>
-            <version>${revision}</version>
-            <scope>${scope}</scope>
-        </dependency>
-
-        <dependency>
             <groupId>org.apache.flink</groupId>
             <artifactId>flink-runtime-web_${scala.version}</artifactId>
             <version>${flink.version}</version>
@@ -165,24 +176,12 @@
         </dependency>
 
         <dependency>
-            <groupId>com.typesafe</groupId>
-            <artifactId>config</artifactId>
-        </dependency>
-
-
-        <dependency>
             <groupId>org.apache.flink</groupId>
             <artifactId>flink-test-utils_${scala.version}</artifactId>
             <version>${flink.version}</version>
             <scope>test</scope>
         </dependency>
 
-        <dependency>
-            <groupId>com.beust</groupId>
-            <artifactId>jcommander</artifactId>
-        </dependency>
-
-
     </dependencies>
 
     <build>
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES128GCM96Shade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES128GCM96Shade.java
index 03ed1af..c91c733 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES128GCM96Shade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES128GCM96Shade.java
@@ -1,7 +1,6 @@
 package com.geedgenetworks.bootstrap.command;
 
 import cn.hutool.core.util.RandomUtil;
-import com.geedgenetworks.common.crypto.CryptoShade;
 
 import javax.crypto.Cipher;
 import javax.crypto.spec.GCMParameterSpec;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES256GCM96Shade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES256GCM96Shade.java
index efee134..4eadc28 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES256GCM96Shade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AES256GCM96Shade.java
@@ -1,7 +1,6 @@
 package com.geedgenetworks.bootstrap.command;
 
 import cn.hutool.core.util.RandomUtil;
-import com.geedgenetworks.common.crypto.CryptoShade;
 
 import javax.crypto.Cipher;
 import javax.crypto.spec.GCMParameterSpec;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AESShade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AESShade.java
index 91e05d0..b593937 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AESShade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/AESShade.java
@@ -2,7 +2,6 @@ package com.geedgenetworks.bootstrap.command;
 
 import cn.hutool.crypto.SecureUtil;
 import cn.hutool.crypto.symmetric.SymmetricAlgorithm;
-import com.geedgenetworks.common.crypto.CryptoShade;
 
 import java.nio.charset.StandardCharsets;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/Base64Shade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/Base64Shade.java
index d07c372..6cdce0f 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/Base64Shade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/Base64Shade.java
@@ -1,6 +1,5 @@
 package com.geedgenetworks.bootstrap.command;
 
-import com.geedgenetworks.common.crypto.CryptoShade;
 
 import java.nio.charset.StandardCharsets;
 import java.util.Base64;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/CommandArgs.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/CommandArgs.java
index 6ee5151..bd7882a 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/CommandArgs.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/CommandArgs.java
@@ -2,7 +2,7 @@ package com.geedgenetworks.bootstrap.command;
 
 import com.beust.jcommander.Parameter;
 import com.geedgenetworks.bootstrap.enums.DeployMode;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import lombok.Data;
 import lombok.EqualsAndHashCode;
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/crypto/CryptoShade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/CryptoShade.java
index 985f4df..78515a7 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/crypto/CryptoShade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/CryptoShade.java
@@ -1,4 +1,4 @@
-package com.geedgenetworks.common.crypto;
+package com.geedgenetworks.bootstrap.command;
 
 public interface CryptoShade {
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/ExecuteCommand.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/ExecuteCommand.java
index c3538b0..01f3bdd 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/ExecuteCommand.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/ExecuteCommand.java
@@ -7,7 +7,7 @@ import com.geedgenetworks.bootstrap.execution.ExecutionConfigKeyName;
 import com.geedgenetworks.bootstrap.execution.JobExecution;
 import com.geedgenetworks.bootstrap.utils.ConfigBuilder;
 import com.geedgenetworks.bootstrap.utils.ConfigFileUtils;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.ConfigProvider;
 import com.geedgenetworks.common.config.GrootStreamConfig;
 import com.typesafe.config.Config;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4GCM96Shade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4GCM96Shade.java
index a6d27e4..7fa84b4 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4GCM96Shade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4GCM96Shade.java
@@ -1,7 +1,6 @@
 package com.geedgenetworks.bootstrap.command;
 
 import cn.hutool.core.util.RandomUtil;
-import com.geedgenetworks.common.crypto.CryptoShade;
 
 import javax.crypto.Cipher;
 import javax.crypto.spec.GCMParameterSpec;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4Shade.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4Shade.java
index e274716..4e04d9e 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4Shade.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/command/SM4Shade.java
@@ -3,7 +3,6 @@ package com.geedgenetworks.bootstrap.command;
 import cn.hutool.crypto.KeyUtil;
 import cn.hutool.crypto.SmUtil;
 import cn.hutool.crypto.symmetric.SM4;
-import com.geedgenetworks.common.crypto.CryptoShade;
 
 import java.nio.charset.StandardCharsets;
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/enums/OperatorType.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/enums/OperatorType.java
new file mode 100644
index 0000000..a32c844
--- /dev/null
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/enums/OperatorType.java
@@ -0,0 +1,32 @@
+package com.geedgenetworks.bootstrap.enums;
+
+public enum OperatorType {
+    SOURCE("source"),
+    FILTER("filter"),
+    SPLIT("split"),
+    PREPROCESSING("preprocessing"),
+    PROCESSING("processing"),
+    POSTPROCESSING("postprocessing"),
+    SINK("sink");
+
+    private final String type;
+
+    OperatorType(String type) {
+        this.type = type;
+    }
+
+    public String getType() {
+        return type;
+    }
+
+    public static OperatorType fromType(String type) {
+        for (OperatorType stage : values()) {
+            if (stage.type.equalsIgnoreCase(type)) {
+                return stage;
+            }
+        }
+        throw new IllegalArgumentException("Unknown type: " + type);
+    }
+
+    @Override
+    public String toString() {
+        return type;
+    }
+}
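Compared with the `ProcessorType` enum it replaces (deleted just below), `OperatorType` adds a case-insensitive reverse lookup and a `toString()` that round-trips the wire name. A quick illustration, with `OperatorTypeDemo` as a throwaway harness:

```java
import com.geedgenetworks.bootstrap.enums.OperatorType;

public class OperatorTypeDemo {
    public static void main(String[] args) {
        // Case-insensitive reverse lookup added by this commit.
        OperatorType t = OperatorType.fromType("Preprocessing");
        System.out.println(t);  // "preprocessing" via the overridden toString()

        // Unknown stage names fail fast:
        try {
            OperatorType.fromType("shuffle");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // "Unknown type: shuffle"
        }
    }
}
```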
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/enums/ProcessorType.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/enums/ProcessorType.java
deleted file mode 100644
index 6f33cae..0000000
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/enums/ProcessorType.java
+++ /dev/null
@@ -1,19 +0,0 @@
-package com.geedgenetworks.bootstrap.enums;
-
-public enum ProcessorType {
-    SOURCE("source"),
-    FILTER("filter"),
-    SPLIT("split"),
-    PREPROCESSING("preprocessing"),
-    PROCESSING("processing"),
-    POSTPROCESSING("postprocessing"),
-    SINK("sink");
-
-    private final String type;
-
-    ProcessorType(String type) {this.type = type;}
-
-    public String getType() {
-        return type;
-    }
-}
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractExecutor.java
index f5b1a5d..8ad33a2 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractExecutor.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractExecutor.java
@@ -1,39 +1,25 @@
 package com.geedgenetworks.bootstrap.execution;
 
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.utils.ReflectionUtils;
-import com.geedgenetworks.core.filter.Filter;
-import com.geedgenetworks.core.processor.Processor;
-import com.geedgenetworks.core.split.Split;
-import com.typesafe.config.Config;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
 
 import java.net.URL;
 import java.net.URLClassLoader;
-import java.util.*;
 import java.util.function.BiConsumer;
 
-public abstract class AbstractExecutor<K, V>
-        implements Executor<DataStream<Event>, JobRuntimeEnvironment> {
-    protected JobRuntimeEnvironment jobRuntimeEnvironment;
-    protected final Config operatorConfig;
-    protected final Map<K,V> operatorMap;
+public abstract class AbstractExecutor<E, C> implements Executor<DataStream<Event>> {
+    public E environment;
+    protected final C jobConfig;
 
-    protected AbstractExecutor(List<URL> jarPaths, Config operatorConfig) {
-        this.operatorConfig = operatorConfig;
-        this.operatorMap = initialize(jarPaths, operatorConfig);
+    protected AbstractExecutor(E environment, C jobConfig) {
+        this.environment = environment;
+        this.jobConfig = jobConfig;
+        initialize(jobConfig);
     }
 
+    protected abstract void initialize(C jobConfig);
 
-    @Override
-    public void setRuntimeEnvironment(JobRuntimeEnvironment jobRuntimeEnvironment) {
-        this.jobRuntimeEnvironment = jobRuntimeEnvironment;
-
-    }
-
-    protected abstract Map<K, V> initialize(List<URL> jarPaths, Config operatorConfig);
-
-    protected static final BiConsumer<ClassLoader, URL> ADD_URL_TO_CLASSLOADER =
+    protected static final BiConsumer<ClassLoader, URL> ADD_URL_TO_CLASSLOADER =
             (classLoader, url) -> {
                 if (classLoader.getClass().getName().endsWith("SafetyNetWrapperClassLoader")) {
                     URLClassLoader c =
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractProcessorExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractProcessorExecutor.java
deleted file mode 100644
index 42a3a11..0000000
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/AbstractProcessorExecutor.java
+++ /dev/null
@@ -1,88 +0,0 @@
-package com.geedgenetworks.bootstrap.execution;
-
-import com.geedgenetworks.bootstrap.exception.JobExecuteException;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.config.*;
-import com.geedgenetworks.common.exception.CommonErrorCode;
-import com.geedgenetworks.common.exception.ConfigValidationException;
-import com.geedgenetworks.core.pojo.ProcessorConfig;
-import com.geedgenetworks.core.processor.Processor;
-import com.typesafe.config.Config;
-import org.apache.flink.streaming.api.datastream.DataStream;
-
-import java.net.URL;
-import java.util.List;
-import java.util.Map;
-import java.util.ServiceLoader;
-
-public abstract class AbstractProcessorExecutor extends AbstractExecutor<String, ProcessorConfig> {
-
-    protected AbstractProcessorExecutor(List<URL> jarPaths, Config operatorConfig) {
-        super(jarPaths, operatorConfig);
-    }
-
-    @Override
-    public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException {
-
-        ProcessorConfig processorConfig = operatorMap.get(node.getName());
-        boolean found = false; // flag variable
-        ServiceLoader<Processor> processors = ServiceLoader.load(Processor.class);
-        for (Processor processor : processors) {
-            if (processor.type().equals(processorConfig.getType())) {
-                found = true;
-                if (node.getParallelism() > 0) {
-                    processorConfig.setParallelism(node.getParallelism());
-                }
-                try {
-                    dataStream = processor.processorFunction(
-                            dataStream, processorConfig, jobRuntimeEnvironment.getStreamExecutionEnvironment().getConfig());
-                } catch (Exception e) {
-                    throw new JobExecuteException("Create orderby pipeline instance failed!", e);
-                }
-                break;
-            }
-        }
-        if (!found) {
-            throw new JobExecuteException("No matching processor found for type: " + processorConfig.getType());
-        }
-        return dataStream;
-    }
-
-    protected ProcessorConfig checkConfig(String key, Map<String, Object> value, Config processorsConfig) {
-        ProcessorConfig ProcessorConfig = new ProcessorConfig();
-        boolean found = false; // flag variable
-        CheckResult result = CheckConfigUtil.checkAllExists(processorsConfig.getConfig(key),
-                ProjectionConfigOptions.TYPE.key());
-        if (!result.isSuccess()) {
-            throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format(
-                    "Postprocessor: %s, Message: %s",
-                    key, result.getMsg()));
-        }
-        ServiceLoader<Processor> processors = ServiceLoader.load(Processor.class);
-        for (Processor processor : processors) {
-            if (processor.type().equals(value.getOrDefault("type", "").toString())) {
-                found = true;
-                try {
-                    ProcessorConfig = processor.checkConfig(key, value, processorsConfig);
-                } catch (Exception e) {
-                    throw new JobExecuteException("Create orderby pipeline instance failed!", e);
-                }
-                break;
-            }
-        }
-        if (!found) {
-            throw new JobExecuteException("No matching processor found for type: " + value.getOrDefault("type", "").toString());
-        }
-        return ProcessorConfig;
-    }
-}
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/Executor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/Executor.java
index d57d6bf..e36971d 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/Executor.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/Executor.java
@@ -2,10 +2,7 @@ package com.geedgenetworks.bootstrap.execution;
 
 import com.geedgenetworks.bootstrap.exception.JobExecuteException;
 
-public interface Executor<T, ENV extends RuntimeEnvironment> {
-
-    T execute(T dataStream, Node edge) throws JobExecuteException;
-
-    void setRuntimeEnvironment(ENV runtimeEnvironment);
+public interface Executor<T> {
+    T execute(T dataStream, JobTopologyNode jobTopologyNode) throws JobExecuteException;
 }
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/FilterExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/FilterExecutor.java
deleted file mode 100644
index f3c81c2..0000000
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/FilterExecutor.java
+++ /dev/null
@@ -1,105 +0,0 @@
-package com.geedgenetworks.bootstrap.execution;
-
-import com.alibaba.fastjson.JSONObject;
-import com.geedgenetworks.bootstrap.enums.ProcessorType;
-import com.geedgenetworks.bootstrap.exception.JobExecuteException;
-import com.geedgenetworks.common.Constants;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.config.CheckConfigUtil;
-import com.geedgenetworks.common.config.CheckResult;
-import com.geedgenetworks.common.config.FilterConfigOptions;
-import com.geedgenetworks.common.config.ProjectionConfigOptions;
-import com.geedgenetworks.common.exception.CommonErrorCode;
-import com.geedgenetworks.common.exception.ConfigValidationException;
-import com.geedgenetworks.core.filter.Filter;
-import com.geedgenetworks.core.pojo.FilterConfig;
-
-import com.google.common.collect.Maps;
-import com.typesafe.config.Config;
-import lombok.extern.slf4j.Slf4j;
-import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-
-import java.net.URL;
-import java.util.List;
-import java.util.Map;
-import java.util.ServiceLoader;
-
-/**
- * Initialize config and execute filter operator
- */
-@Slf4j
-public class FilterExecutor extends AbstractExecutor<String, FilterConfig> {
-    private static final String PROCESSOR_TYPE = ProcessorType.FILTER.getType();
-
-    public FilterExecutor(List<URL> jarPaths, Config config) {
-        super(jarPaths, config);
-    }
-
-    @Override
-    protected Map<String, FilterConfig> initialize(List<URL> jarPaths, Config operatorConfig) {
-        Map<String, FilterConfig> filterConfigMap = Maps.newHashMap();
-        if (operatorConfig.hasPath(Constants.FILTERS)) {
-            Config filterConfig = operatorConfig.getConfig(Constants.FILTERS);
-            filterConfig.root().unwrapped().forEach((key, value) -> {
-                CheckResult result = CheckConfigUtil.checkAllExists(filterConfig.getConfig(key),
-                        FilterConfigOptions.TYPE.key(), FilterConfigOptions.PROPERTIES.key());
-                if (!result.isSuccess()) {
-                    throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format(
-                            "Filter: %s, Message: %s",
-                            key, result.getMsg()));
-                }
-                filterConfigMap.put(key, checkConfig(key, (Map<String, Object>) value, filterConfig));
-            });
-        }
-
-        return filterConfigMap;
-    }
-
-    @Override
-    public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException {
-        FilterConfig filterConfig = operatorMap.get(node.getName());
-        boolean found = false; // flag variable
-        ServiceLoader<Filter> filters = ServiceLoader.load(Filter.class);
-        for (Filter filter : filters) {
-            if (filter.type().equals(filterConfig.getType())) {
-                found = true;
-                if (node.getParallelism() > 0) {
-                    filterConfig.setParallelism(node.getParallelism());
-                }
-                try {
-                    dataStream =
-                            filter.filterFunction(
-                                    dataStream, filterConfig);
-                } catch (Exception e) {
-                    throw new JobExecuteException("Create filter instance failed!", e);
-                }
-                break;
-            }
-        }
-        if (!found) {
-            throw new JobExecuteException("No matching filter found for type: " + filterConfig.getType());
-        }
-        return dataStream;
-    }
-
-    protected FilterConfig checkConfig(String key, Map<String, Object> value, Config config) {
-        FilterConfig filterConfig = new FilterConfig();
-        boolean found = false; // flag variable
-        ServiceLoader<Filter> filters = ServiceLoader.load(Filter.class);
-        for (Filter filter : filters) {
-            if (filter.type().equals(value.getOrDefault("type", "").toString())) {
-                found = true;
-                try {
-                    filterConfig = filter.checkConfig(key, value, config);
-                } catch (Exception e) {
-                    throw new JobExecuteException("Create split pipeline instance failed!", e);
-                }
-            }
-        }
-        if (!found) {
-            throw new JobExecuteException("No matching filter found for type: " + value.getOrDefault("type", "").toString());
-        }
-        return filterConfig;
-    }
-}
diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobExecution.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobExecution.java
index 706fc18..ad31d88 100644
--- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobExecution.java
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobExecution.java
@@ -1,12 +1,12 @@
 package com.geedgenetworks.bootstrap.execution;
 
 import com.alibaba.fastjson2.JSONObject;
-import com.geedgenetworks.bootstrap.enums.ProcessorType;
+import com.geedgenetworks.bootstrap.enums.OperatorType;
 import com.geedgenetworks.bootstrap.exception.JobExecuteException;
 import com.geedgenetworks.bootstrap.main.GrootStreamRunner;
-import com.geedgenetworks.common.Constants;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.GrootStreamConfig;
+import com.geedgenetworks.api.connector.event.Event;
 import com.google.common.collect.Lists;
 import com.google.common.collect.Maps;
 import com.typesafe.config.Config;
@@ -29,14 +29,10 @@ public class JobExecution {
 
     private final JobRuntimeEnvironment jobRuntimeEnvironment;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> sourceExecutor;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> sinkExecutor;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> filterExecutor;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> splitExecutor;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> preprocessingExecutor;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> processingExecutor;
-    private final Executor<DataStream<Event>, JobRuntimeEnvironment> postprocessingExecutor;
-    private final List<Node> nodes;
+    private final Executor<DataStream<Event>> sourceExecutor;
+    private final Executor<DataStream<Event>> sinkExecutor;
+    private final Executor<DataStream<Event>> processorExecutor;
+    private final List<JobTopologyNode> jobTopologyNodes;
     private final List<URL> jarPaths;
     private final Map<String,String> nodeNameWithSplitTags = new HashMap<>();
sinkExecutor; + private final Executor<DataStream<Event>> processorExecutor; + private final List<JobTopologyNode> jobTopologyNodes; private final List<URL> jarPaths; private final Map<String,String> nodeNameWithSplitTags = new HashMap<>(); @@ -50,25 +46,13 @@ public class JobExecution { } registerPlugin(jobConfig.getConfig(Constants.APPLICATION)); - - this.sourceExecutor = new SourceExecutor(jarPaths, jobConfig); - this.sinkExecutor = new SinkExecutor(jarPaths, jobConfig); - this.filterExecutor = new FilterExecutor(jarPaths, jobConfig); - this.splitExecutor = new SplitExecutor(jarPaths, jobConfig); - this.preprocessingExecutor = new PreprocessingExecutor(jarPaths, jobConfig); - this.processingExecutor = new ProcessingExecutor(jarPaths, jobConfig); - this.postprocessingExecutor = new PostprocessingExecutor(jarPaths, jobConfig); this.jobRuntimeEnvironment = JobRuntimeEnvironment.getInstance(this.registerPlugin(jobConfig, jarPaths), grootStreamConfig); - this.sourceExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.sinkExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.filterExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.splitExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.preprocessingExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.processingExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.postprocessingExecutor.setRuntimeEnvironment(jobRuntimeEnvironment); - this.nodes = buildJobNode(jobConfig); + this.sourceExecutor = new SourceExecutor(jobRuntimeEnvironment, jobConfig); + this.sinkExecutor = new SinkExecutor(jobRuntimeEnvironment, jobConfig); + this.processorExecutor = new ProcessorExecutor(jobRuntimeEnvironment, jobConfig); + this.jobTopologyNodes = buildJobNode(jobConfig); } @@ -88,7 +72,7 @@ public class JobExecution { try { return uri.toURL(); } catch (MalformedURLException e) { - throw new RuntimeException("the uri of jar illegal: " + uri, e); + throw new RuntimeException("The jar URI is illegal: " + uri, e); } }).collect(Collectors.toList()); jarDependencies.forEach(url -> { @@ -153,7 +137,7 @@ public class JobExecution { return config; } - private List<Node> buildJobNode(Config config) { + private List<JobTopologyNode> buildJobNode(Config config) { Map<String, Object> sources = Maps.newHashMap(); Map<String, Object> sinks = Maps.newHashMap(); @@ -187,34 +171,34 @@ public class JobExecution { List<?
extends Config> topology = config.getConfig(Constants.APPLICATION).getConfigList(Constants.APPLICATION_TOPOLOGY); - List<Node> nodes = Lists.newArrayList(); + List<JobTopologyNode> jobTopologyNodes = Lists.newArrayList(); topology.forEach(item -> { - Node node = JSONObject.from(item.root().unwrapped()).toJavaObject(Node.class); - nodes.add(node); + JobTopologyNode jobTopologyNode = JSONObject.from(item.root().unwrapped()).toJavaObject(JobTopologyNode.class); + jobTopologyNodes.add(jobTopologyNode); }); - for (Node node : nodes) { - if (sources.containsKey(node.getName())) { - node.setType(ProcessorType.SOURCE); - } else if (sinks.containsKey(node.getName())) { - node.setType(ProcessorType.SINK); - } else if (filters.containsKey(node.getName())) { - node.setType(ProcessorType.FILTER); - } else if (splits.containsKey(node.getName())) { - node.setType(ProcessorType.SPLIT); - } else if (preprocessingPipelines.containsKey(node.getName())) { - node.setType(ProcessorType.PREPROCESSING); - } else if (processingPipelines.containsKey(node.getName())) { - node.setType(ProcessorType.PROCESSING); - } else if (postprocessingPipelines.containsKey(node.getName())) { - node.setType(ProcessorType.POSTPROCESSING); + for (JobTopologyNode jobTopologyNode : jobTopologyNodes) { + if (sources.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.SOURCE); + } else if (sinks.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.SINK); + } else if (filters.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.FILTER); + } else if (splits.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.SPLIT); + } else if (preprocessingPipelines.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.PREPROCESSING); + } else if (processingPipelines.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.PROCESSING); + } else if (postprocessingPipelines.containsKey(jobTopologyNode.getName())) { + jobTopologyNode.setType(OperatorType.POSTPROCESSING); } else { - throw new JobExecuteException("unsupported process type " + node.getName()); + throw new JobExecuteException("No matching operator found for topology node " + jobTopologyNode.getName()); } } - return nodes; + return jobTopologyNodes; } @@ -223,14 +207,14 @@ public class JobExecution { if (!jobRuntimeEnvironment.isLocalMode() && !jobRuntimeEnvironment.isTestMode()) { jobRuntimeEnvironment.registerPlugin(jarPaths); } - List<Node> sourceNodes = nodes - .stream().filter(v -> v.getType().name().equals(ProcessorType.SOURCE.name())).collect(Collectors.toList()); + List<JobTopologyNode> sourceJobTopologyNodes = jobTopologyNodes + .stream().filter(v -> v.getType().name().equals(OperatorType.SOURCE.name())).collect(Collectors.toList()); DataStream<Event> dataStream = null; - for (Node sourceNode : sourceNodes) { - dataStream = sourceExecutor.execute(dataStream, sourceNode); - for (String nodeName : sourceNode.getDownstream()) { + for (JobTopologyNode sourceJobTopologyNode : sourceJobTopologyNodes) { + dataStream = sourceExecutor.execute(dataStream, sourceJobTopologyNode); + for (String nodeName : sourceJobTopologyNode.getDownstream()) { buildJobGraph(dataStream, nodeName); } } @@ -251,68 +235,68 @@ } private void buildJobGraph(DataStream<Event> dataStream, String downstreamNodeName) { - Node node = getNode(downstreamNodeName).orElseGet(()
-> { throw new JobExecuteException("Can't find downstream node " + downstreamNodeName); }); - if (node.getType().name().equals(ProcessorType.FILTER.name())) { - if (nodeNameWithSplitTags.containsKey(node.getName())) { - dataStream = filterExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(node.getName())) { - }), node); + if (jobTopologyNode.getType().name().equals(OperatorType.FILTER.name())) { + if (nodeNameWithSplitTags.containsKey(jobTopologyNode.getName())) { + dataStream = processorExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream) + .getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(jobTopologyNode.getName())) {}), jobTopologyNode); } else { - dataStream = filterExecutor.execute(dataStream, node); + dataStream = processorExecutor.execute(dataStream, jobTopologyNode); } - } else if (node.getType().name().equals(ProcessorType.SPLIT.name())) { - if (node.getTags().size() == node.getDownstream().size()) { - for (int i = 0; i < node.getDownstream().size();i++) { - nodeNameWithSplitTags.put(node.getDownstream().get(i),node.getTags().get(i)); + } else if (jobTopologyNode.getType().name().equals(OperatorType.SPLIT.name())) { + if (jobTopologyNode.getTags().size() == jobTopologyNode.getDownstream().size()) { + for (int i = 0; i < jobTopologyNode.getDownstream().size(); i++) { + nodeNameWithSplitTags.put(jobTopologyNode.getDownstream().get(i), jobTopologyNode.getTags().get(i)); } } else { throw new JobExecuteException("split node downstream size not equal tags size"); } - dataStream = splitExecutor.execute(dataStream, node); - } else if (node.getType().name().equals(ProcessorType.PREPROCESSING.name())) { - if (nodeNameWithSplitTags.containsKey(node.getName())) { - dataStream = preprocessingExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(node.getName())){ - }), node); + dataStream = processorExecutor.execute(dataStream, jobTopologyNode); + } else if (jobTopologyNode.getType().name().equals(OperatorType.PREPROCESSING.name())) { + if (nodeNameWithSplitTags.containsKey(jobTopologyNode.getName())) { + dataStream = processorExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(jobTopologyNode.getName())){ + }), jobTopologyNode); } else { - dataStream = preprocessingExecutor.execute(dataStream, node); + dataStream = processorExecutor.execute(dataStream, jobTopologyNode); } - } else if (node.getType().name().equals(ProcessorType.PROCESSING.name())) { - if (nodeNameWithSplitTags.containsKey(node.getName())) { - dataStream = processingExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(node.getName())) { - }), node); + } else if (jobTopologyNode.getType().name().equals(OperatorType.PROCESSING.name())) { + if (nodeNameWithSplitTags.containsKey(jobTopologyNode.getName())) { + dataStream = processorExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(jobTopologyNode.getName())) { + }), jobTopologyNode); } else { - dataStream = processingExecutor.execute(dataStream, node); + dataStream = processorExecutor.execute(dataStream, jobTopologyNode); } - } else if (node.getType().name().equals(ProcessorType.POSTPROCESSING.name())) { - if (nodeNameWithSplitTags.containsKey(node.getName())) { - dataStream = 
postprocessingExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(node.getName())) { - }), node); + } else if (jobTopologyNode.getType().name().equals(OperatorType.POSTPROCESSING.name())) { + if (nodeNameWithSplitTags.containsKey(jobTopologyNode.getName())) { + dataStream = processorExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(jobTopologyNode.getName())) { + }), jobTopologyNode); } else { - dataStream = postprocessingExecutor.execute(dataStream, node); + dataStream = processorExecutor.execute(dataStream, jobTopologyNode); } - } else if (node.getType().name().equals(ProcessorType.SINK.name())) { - if (nodeNameWithSplitTags.containsKey(node.getName())) { - dataStream = sinkExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(node.getName())) { - }), node); + } else if (jobTopologyNode.getType().name().equals(OperatorType.SINK.name())) { + if (nodeNameWithSplitTags.containsKey(jobTopologyNode.getName())) { + dataStream = sinkExecutor.execute(((SingleOutputStreamOperator<Event>) dataStream).getSideOutput(new OutputTag<Event>(nodeNameWithSplitTags.get(jobTopologyNode.getName())) { + }), jobTopologyNode); } else { - dataStream = sinkExecutor.execute(dataStream, node); + dataStream = sinkExecutor.execute(dataStream, jobTopologyNode); } } else { - throw new JobExecuteException("unsupported process type " + node.getType().name()); + throw new JobExecuteException("unsupported process type " + jobTopologyNode.getType().name()); } - for (String nodeName : node.getDownstream()) { + for (String nodeName : jobTopologyNode.getDownstream()) { buildJobGraph(dataStream, nodeName); } } - private Optional<Node> getNode(String name) { - return nodes.stream().filter(v -> v.getName().equals(name)).findFirst(); + private Optional<JobTopologyNode> getNode(String name) { + return jobTopologyNodes.stream().filter(v -> v.getName().equals(name)).findFirst(); } diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobRuntimeEnvironment.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobRuntimeEnvironment.java index e23d446..7b4d66b 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobRuntimeEnvironment.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobRuntimeEnvironment.java @@ -3,7 +3,7 @@ package com.geedgenetworks.bootstrap.execution; import com.alibaba.fastjson2.JSON; import com.geedgenetworks.bootstrap.enums.TargetType; import com.geedgenetworks.bootstrap.utils.EnvironmentUtil; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.CheckResult; import com.geedgenetworks.common.config.GrootStreamConfig; import com.geedgenetworks.common.utils.ReflectionUtils; diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/Node.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobTopologyNode.java index 66303c2..dcc15e9 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/Node.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/JobTopologyNode.java @@ -1,6 +1,6 @@ package com.geedgenetworks.bootstrap.execution; -import com.geedgenetworks.bootstrap.enums.ProcessorType; +import 
com.geedgenetworks.bootstrap.enums.OperatorType; import lombok.AllArgsConstructor; import lombok.Data; import lombok.EqualsAndHashCode; @@ -10,15 +10,18 @@ import java.io.Serializable; import java.util.Collections; import java.util.List; +/** + * Represents an operator node in the execution graph. + */ @Data @NoArgsConstructor @AllArgsConstructor @EqualsAndHashCode -public class Node implements Serializable { +public class JobTopologyNode implements Serializable { private String name; - private ProcessorType type; - private List<String> downstream = Collections.emptyList(); + private OperatorType type; private int parallelism; + private List<String> downstream = Collections.emptyList(); private List<String> tags = Collections.emptyList(); } diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/PostprocessingExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/PostprocessingExecutor.java deleted file mode 100644 index 03e5bd5..0000000 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/PostprocessingExecutor.java +++ /dev/null @@ -1,44 +0,0 @@ -package com.geedgenetworks.bootstrap.execution; - -import com.geedgenetworks.bootstrap.enums.ProcessorType; -import com.geedgenetworks.bootstrap.exception.JobExecuteException; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.pojo.ProcessorConfig; -import com.google.common.collect.Maps; -import com.typesafe.config.Config; -import org.apache.flink.streaming.api.datastream.DataStream; -import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; - -import java.net.URL; -import java.util.List; -import java.util.Map; - -/** - * Initialize config and execute postprocessor - */ -public class PostprocessingExecutor extends AbstractProcessorExecutor { - private static final String PROCESSOR_TYPE = ProcessorType.POSTPROCESSING.getType(); - - public PostprocessingExecutor(List<URL> jarPaths, Config config) { - super(jarPaths, config); - } - - @Override - protected Map<String, ProcessorConfig> initialize(List<URL> jarPaths, Config operatorConfig) { - Map<String, ProcessorConfig> postprocessingConfigMap = Maps.newHashMap(); - if (operatorConfig.hasPath(Constants.POSTPROCESSING_PIPELINES)) { - Config postprocessors = operatorConfig.getConfig(Constants.POSTPROCESSING_PIPELINES); - postprocessors.root().unwrapped().forEach((key, value) -> { - postprocessingConfigMap.put(key, checkConfig(key, (Map<String, Object>) value, postprocessors)); - }); - } - return postprocessingConfigMap; - } - - - @Override - public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException { - return super.execute(dataStream, node); - } -} diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/PreprocessingExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/PreprocessingExecutor.java deleted file mode 100644 index da8dc62..0000000 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/PreprocessingExecutor.java +++ /dev/null @@ -1,47 +0,0 @@ -package com.geedgenetworks.bootstrap.execution; - -import com.geedgenetworks.bootstrap.enums.ProcessorType; -import com.geedgenetworks.bootstrap.exception.JobExecuteException; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.pojo.ProcessorConfig; -import com.google.common.collect.Maps; -import 
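`JobTopologyNode` above is the plain data holder that `buildJobNode` in JobExecution.java fills from the job config's `topology` list. A construction sketch using the Lombok `@AllArgsConstructor` in its new field order (name, type, parallelism, downstream, tags); the node and sink names are hypothetical:

```java
package com.geedgenetworks.bootstrap.execution;

import com.geedgenetworks.bootstrap.enums.OperatorType;

import java.util.Arrays;
import java.util.Collections;

public class JobTopologyNodeExample {

    // Java equivalent of one topology entry; buildJobNode(...) normally
    // assigns the OperatorType by looking the name up in the operator
    // sections (sources, sinks, filters, splits, pipelines).
    public static JobTopologyNode sampleNode() {
        return new JobTopologyNode(
                "enrichment_processor",        // name, must match an operator key
                OperatorType.PROCESSING,       // type, resolved from the config
                2,                             // parallelism, <= 0 means inherit
                Arrays.asList("console_sink"), // downstream node names
                Collections.emptyList());      // tags, used only by SPLIT nodes
    }
}
```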
com.typesafe.config.Config; -import lombok.extern.slf4j.Slf4j; -import org.apache.flink.streaming.api.datastream.DataStream; -import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; - -import java.net.URL; -import java.util.List; -import java.util.Map; - -/** - * Initialize config and execute preprocessor - */ -@Slf4j -public class PreprocessingExecutor extends AbstractProcessorExecutor { - private static final String PROCESSOR_TYPE = ProcessorType.PREPROCESSING.getType(); - - public PreprocessingExecutor(List<URL> jarPaths, Config config) { - super(jarPaths, config); - } - - @Override - protected Map<String, ProcessorConfig> initialize(List<URL> jarPaths, Config operatorConfig) { - Map<String, ProcessorConfig> preprocessingConfigMap = Maps.newHashMap(); - if (operatorConfig.hasPath(Constants.PREPROCESSING_PIPELINES)) { - Config preprocessors = operatorConfig.getConfig(Constants.PREPROCESSING_PIPELINES); - preprocessors.root().unwrapped().forEach((key, value) -> { - preprocessingConfigMap.put(key, checkConfig(key, (Map<String, Object>) value, preprocessors)); - }); - } - return preprocessingConfigMap; - } - - @Override - public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException { - - return super.execute(dataStream, node); - - } -} diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/ProcessingExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/ProcessingExecutor.java deleted file mode 100644 index cf6b496..0000000 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/ProcessingExecutor.java +++ /dev/null @@ -1,44 +0,0 @@ -package com.geedgenetworks.bootstrap.execution; - -import com.geedgenetworks.bootstrap.enums.ProcessorType; -import com.geedgenetworks.bootstrap.exception.JobExecuteException; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.pojo.ProcessorConfig; -import com.google.common.collect.Maps; -import com.typesafe.config.Config; -import org.apache.flink.streaming.api.datastream.DataStream; -import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; - -import java.net.URL; -import java.util.List; -import java.util.Map; - -/** - * Initialize config and execute processor - */ -public class ProcessingExecutor extends AbstractProcessorExecutor { - private static final String PROCESSOR_TYPE = ProcessorType.PROCESSING.getType(); - - public ProcessingExecutor(List<URL> jarPaths, Config config) { - super(jarPaths, config); - } - - @Override - protected Map<String, ProcessorConfig> initialize(List<URL> jarPaths, Config operatorConfig) { - Map<String, ProcessorConfig> processingConfigMap = Maps.newHashMap(); - if (operatorConfig.hasPath(Constants.PROCESSING_PIPELINES)) { - Config processors = operatorConfig.getConfig(Constants.PROCESSING_PIPELINES); - processors.root().unwrapped().forEach((key, value) -> { - processingConfigMap.put(key, checkConfig(key, (Map<String, Object>) value, processors)); - }); - } - return processingConfigMap; - } - - @Override - public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException { - - return super.execute(dataStream, node); - } -} diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/ProcessorExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/ProcessorExecutor.java new file mode 100644 index 0000000..204866f --- /dev/null 
+++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/ProcessorExecutor.java @@ -0,0 +1,102 @@ +package com.geedgenetworks.bootstrap.execution; + +import com.geedgenetworks.api.processor.ProcessorConfigOptions; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.factory.ProcessorFactory; +import com.geedgenetworks.bootstrap.exception.JobExecuteException; +import com.geedgenetworks.common.config.CheckConfigUtil; +import com.geedgenetworks.common.config.CheckResult; +import com.geedgenetworks.common.config.Constants; +import com.geedgenetworks.common.exception.CommonErrorCode; +import com.geedgenetworks.common.exception.ConfigValidationException; +import com.geedgenetworks.api.processor.Processor; +import com.geedgenetworks.api.processor.ProcessorConfig; +import com.geedgenetworks.api.connector.event.Event; +import com.google.common.collect.Maps; +import com.typesafe.config.Config; +import org.apache.flink.streaming.api.datastream.DataStream; +import java.util.Map; +/** + * Initialize config and execute processor + */ +public class ProcessorExecutor extends AbstractExecutor<JobRuntimeEnvironment, Config> { + private Map<String, ProcessorConfig> operators; + private Map<String, Processor<?>> processors; + + public ProcessorExecutor(JobRuntimeEnvironment environment, Config jobConfig) { + super(environment, jobConfig); + } + + @Override + protected void initialize(Config jobConfig) { + operators = Maps.newHashMap(); + processors = Maps.newHashMap(); + + if (jobConfig.hasPath(Constants.FILTERS)) { + discoverProcessors(jobConfig.getConfig(Constants.FILTERS)); + } + + if (jobConfig.hasPath(Constants.SPLITS)) { + discoverProcessors(jobConfig.getConfig(Constants.SPLITS)); + } + + if (jobConfig.hasPath(Constants.PREPROCESSING_PIPELINES)) { + discoverProcessors(jobConfig.getConfig(Constants.PREPROCESSING_PIPELINES)); + } + + if (jobConfig.hasPath(Constants.PROCESSING_PIPELINES)) { + discoverProcessors(jobConfig.getConfig(Constants.PROCESSING_PIPELINES)); + } + + if (jobConfig.hasPath(Constants.POSTPROCESSING_PIPELINES)) { + discoverProcessors(jobConfig.getConfig(Constants.POSTPROCESSING_PIPELINES)); + } + } + + private void discoverProcessors(Config config) { + + config.root().unwrapped().forEach((key, value) -> { + + CheckResult result = CheckConfigUtil.checkAllExists(config.getConfig(key), + ProcessorConfigOptions.TYPE.key()); + if (!result.isSuccess()) { + throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format( + "Processor: %s, Message: %s", key, result.getMsg())); + } + + Processor processor = FactoryUtil + .discoverProcessorFactory(ProcessorFactory.class, ((Map<?, ?>) value).get("type").toString()).createProcessor(); + processors.put(key, processor); + operators.put(key, processor.parseConfig(key, config.getConfig(key))); + }); + + } + + @Override + public DataStream<Event> execute(DataStream<Event> input, JobTopologyNode jobTopologyNode) throws JobExecuteException { + String name = jobTopologyNode.getName(); + ProcessorConfig operatorConfig = operators.get(name); + if (operatorConfig == null) { + throw new JobExecuteException("No matching operator configuration found for: " + name); + } + + Processor processor = processors.get(operatorConfig.getName()); + + if (processor == null) { + throw new JobExecuteException("No matching processor found for type: " + operatorConfig.getType()); + } + + // Set parallelism if needed + int parallelism = jobTopologyNode.getParallelism(); + if (parallelism > 0) { +
operatorConfig.setParallelism(parallelism); + } + + try { + return processor.process(environment.getStreamExecutionEnvironment(), input, operatorConfig); + } catch (Exception e) { + throw new JobExecuteException("Failed to execute processor due to unexpected error.", e); + } + + } +} diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/RuntimeEnvironment.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/RuntimeEnvironment.java index b177e40..023ba65 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/RuntimeEnvironment.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/RuntimeEnvironment.java @@ -1,5 +1,5 @@ package com.geedgenetworks.bootstrap.execution; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.CheckResult; import com.typesafe.config.Config; import com.typesafe.config.ConfigUtil; diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SinkExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SinkExecutor.java index 70934b8..501fa81 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SinkExecutor.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SinkExecutor.java @@ -1,50 +1,46 @@ package com.geedgenetworks.bootstrap.execution; import com.alibaba.fastjson.JSONObject; -import com.geedgenetworks.bootstrap.enums.ProcessorType; +import com.geedgenetworks.bootstrap.enums.OperatorType; import com.geedgenetworks.bootstrap.exception.JobExecuteException; import com.geedgenetworks.bootstrap.utils.SchemaConfigParse; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.CheckConfigUtil; import com.geedgenetworks.common.config.CheckResult; -import com.geedgenetworks.common.config.SinkConfigOptions; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.ConfigValidationException; -import com.geedgenetworks.core.connector.schema.Schema; -import com.geedgenetworks.core.connector.sink.SinkProvider; -import com.geedgenetworks.core.factories.FactoryUtil; -import com.geedgenetworks.core.factories.SinkTableFactory; -import com.geedgenetworks.core.factories.TableFactory; -import com.geedgenetworks.core.pojo.SinkConfig; +import com.geedgenetworks.api.connector.sink.SinkConfig; +import com.geedgenetworks.api.connector.sink.SinkConfigOptions; +import com.geedgenetworks.api.connector.sink.SinkProvider; +import com.geedgenetworks.api.connector.sink.SinkTableFactory; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.factory.ConnectorFactory; +import com.geedgenetworks.api.connector.schema.Schema; import com.google.common.collect.Maps; import com.typesafe.config.Config; import lombok.extern.slf4j.Slf4j; import org.apache.flink.configuration.Configuration; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.DataStreamSink; -import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; -import java.net.URL; -import java.util.List; import java.util.Map; /** * Initialize config and execute sink connector */ @Slf4j -public class SinkExecutor extends AbstractExecutor<String, 
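The new `ProcessorExecutor` above collapses the five dedicated executors into one that discovers processors through `FactoryUtil.discoverProcessorFactory(ProcessorFactory.class, type)`. The deleted FilterExecutor and SplitExecutor in this merge did the same matching inline with `ServiceLoader`; a sketch of that lookup under the new SPI, assuming `ProcessorFactory` exposes a `type()` identifier like the connector factories elsewhere in this merge request (the real `FactoryUtil` is not shown in this diff):

```java
package com.geedgenetworks.bootstrap.execution;

import com.geedgenetworks.api.factory.ProcessorFactory;

import java.util.ServiceLoader;

public class ProcessorFactoryLookupExample {

    // Plain-Java equivalent of the discovery FactoryUtil presumably performs:
    // scan the factories registered under META-INF/services and select the
    // one whose identifier matches the operator's declared `type`.
    public static ProcessorFactory findFactory(String type) {
        for (ProcessorFactory factory : ServiceLoader.load(ProcessorFactory.class)) {
            if (factory.type().equals(type)) {
                return factory;
            }
        }
        throw new IllegalArgumentException("No processor factory registered for type: " + type);
    }
}
```

With discovery centralized like this, adding an operator type becomes a matter of shipping a jar with a factory implementation plus a services registration file, with no change to the bootstrap module.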
SinkConfig> { - private static final String PROCESSOR_TYPE = ProcessorType.SINK.getType(); - - public SinkExecutor(List<URL> jarPaths, Config config) { - super(jarPaths, config); +public class SinkExecutor extends AbstractExecutor<JobRuntimeEnvironment, Config> { + private static final String PROCESSOR_TYPE = OperatorType.SINK.getType(); + private Map<String, SinkConfig> operators; + public SinkExecutor(JobRuntimeEnvironment environment, Config jobConfig) { + super(environment, jobConfig); } @Override - protected Map<String, SinkConfig> initialize(List<URL> jarPaths, Config operatorConfig) { - Map<String, SinkConfig> sinkConfigMap = Maps.newHashMap(); - - if (operatorConfig.hasPath(Constants.SINKS)) { - Config sinks = operatorConfig.getConfig(Constants.SINKS); + protected void initialize(Config jobConfig) { + operators = Maps.newHashMap(); + if (jobConfig.hasPath(Constants.SINKS)) { + Config sinks = jobConfig.getConfig(Constants.SINKS); sinks.root().unwrapped().forEach((key,value) -> { CheckResult result = CheckConfigUtil.checkAllExists(sinks.getConfig(key), SinkConfigOptions.TYPE.key(), SinkConfigOptions.PROPERTIES.key()); @@ -56,26 +52,25 @@ public class SinkExecutor extends AbstractExecutor<String, SinkConfig> { SinkConfig sinkConfig = new JSONObject((Map<String, Object>) value).toJavaObject(SinkConfig.class); sinkConfig.setName(key); - sinkConfigMap.put(key, sinkConfig); + operators.put(key, sinkConfig); }); } - return sinkConfigMap; } @Override - public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException { - SinkConfig sinkConfig = operatorMap.get(node.getName()); + public DataStream<Event> execute(DataStream<Event> input, JobTopologyNode jobTopologyNode) throws JobExecuteException { + SinkConfig sinkConfig = operators.get(jobTopologyNode.getName()); try { - SinkTableFactory sinkTableFactory = FactoryUtil.discoverTableFactory(SinkTableFactory.class, sinkConfig.getType()); + SinkTableFactory sinkTableFactory = FactoryUtil.discoverConnectorFactory(SinkTableFactory.class, sinkConfig.getType()); Map<String, String> options = sinkConfig.getProperties(); Configuration configuration = Configuration.fromMap(options); Schema schema = null; if(sinkConfig.getSchema() != null && !sinkConfig.getSchema().isEmpty()){ schema = SchemaConfigParse.parseSchemaConfig(sinkConfig.getSchema()); } - TableFactory.Context context = new TableFactory.Context(schema, options, configuration); + ConnectorFactory.Context context = new ConnectorFactory.Context(schema, options, configuration); SinkProvider sinkProvider = sinkTableFactory.getSinkProvider(context); if(!sinkProvider.supportDynamicSchema() && schema != null && schema.isDynamic()){ throw new UnsupportedOperationException(String.format("sink(%s) not support DynamicSchema", sinkConfig.getName())); @@ -85,9 +80,9 @@ public class SinkExecutor extends AbstractExecutor<String, SinkConfig> { System.out.println(String.format("sink(%s) schema:\n%s", sinkConfig.getName(), schema.getDataType().treeString())); } - DataStreamSink<?> dataStreamSink = sinkProvider.consumeDataStream(dataStream); - if (node.getParallelism() > 0) { - dataStreamSink.setParallelism(node.getParallelism()); + DataStreamSink<?> dataStreamSink = sinkProvider.consumeDataStream(input); + if (jobTopologyNode.getParallelism() > 0) { + dataStreamSink.setParallelism(jobTopologyNode.getParallelism()); } dataStreamSink.name(sinkConfig.getName()); } catch (Exception e) { diff --git 
a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SourceExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SourceExecutor.java index 9dff6b0..ca4fc1d 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SourceExecutor.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SourceExecutor.java @@ -1,23 +1,23 @@ package com.geedgenetworks.bootstrap.execution; import com.alibaba.fastjson2.JSONObject; -import com.geedgenetworks.bootstrap.enums.ProcessorType; +import com.geedgenetworks.bootstrap.enums.OperatorType; import com.geedgenetworks.bootstrap.exception.ConfigCheckException; import com.geedgenetworks.bootstrap.exception.JobExecuteException; import com.geedgenetworks.bootstrap.utils.SchemaConfigParse; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.CheckConfigUtil; import com.geedgenetworks.common.config.CheckResult; -import com.geedgenetworks.common.config.SourceConfigOptions; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.ConfigValidationException; -import com.geedgenetworks.core.connector.schema.Schema; -import com.geedgenetworks.core.connector.source.SourceProvider; -import com.geedgenetworks.core.factories.FactoryUtil; -import com.geedgenetworks.core.factories.SourceTableFactory; -import com.geedgenetworks.core.factories.TableFactory; -import com.geedgenetworks.core.pojo.SourceConfig; +import com.geedgenetworks.api.connector.source.SourceConfigOptions; +import com.geedgenetworks.api.connector.source.SourceConfig; +import com.geedgenetworks.api.connector.source.SourceProvider; +import com.geedgenetworks.api.connector.source.SourceTableFactory; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.factory.ConnectorFactory; +import com.geedgenetworks.api.connector.schema.Schema; import com.google.common.collect.Maps; import com.typesafe.config.*; import lombok.extern.slf4j.Slf4j; @@ -28,26 +28,24 @@ import org.apache.flink.configuration.Configuration; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; -import java.net.URL; import java.time.Duration; -import java.util.List; import java.util.Map; /** * Initialize config and execute source connector */ @Slf4j -public class SourceExecutor extends AbstractExecutor<String, SourceConfig> { - private static final String PROCESSOR_TYPE = ProcessorType.SOURCE.getType(); - - public SourceExecutor(List<URL> jarPaths, Config config) { - super(jarPaths, config); +public class SourceExecutor extends AbstractExecutor<JobRuntimeEnvironment, Config> { + private static final String PROCESSOR_TYPE = OperatorType.SOURCE.getType(); + private Map<String, SourceConfig> operators; + public SourceExecutor(JobRuntimeEnvironment environment, Config jobConfig) { + super(environment, jobConfig); } @Override - protected Map<String, SourceConfig> initialize(List<URL> jarPaths, Config operatorConfig) { - Map<String, SourceConfig> sourceConfigMap = Maps.newHashMap(); - if (operatorConfig.hasPath(Constants.SOURCES)) { - Config sources = operatorConfig.getConfig(Constants.SOURCES); + protected void initialize(Config jobConfig) { + operators = Maps.newHashMap(); + if (jobConfig.hasPath(Constants.SOURCES)) { + Config 
sources = jobConfig.getConfig(Constants.SOURCES); sources.root().unwrapped().forEach((key,value) -> { CheckResult result = CheckConfigUtil.checkAllExists(sources.getConfig(key), SourceConfigOptions.TYPE.key(), SourceConfigOptions.PROPERTIES.key()); @@ -59,27 +57,25 @@ public class SourceExecutor extends AbstractExecutor<String, SourceConfig> { SourceConfig sourceConfig = new JSONObject((Map<String, Object>) value).toJavaObject(SourceConfig.class); sourceConfig.setName(key); - sourceConfigMap.put(key, sourceConfig); + operators.put(key, sourceConfig); }); } - - return sourceConfigMap; } @Override - public DataStream<Event> execute(DataStream<Event> outputStreamOperator, Node node) throws JobExecuteException { - SourceConfig sourceConfig = operatorMap.get(node.getName()); + public DataStream<Event> execute(DataStream<Event> input, JobTopologyNode jobTopologyNode) throws JobExecuteException { + SourceConfig sourceConfig = operators.get(jobTopologyNode.getName()); SingleOutputStreamOperator sourceSingleOutputStreamOperator; try { - SourceTableFactory tableFactory = FactoryUtil.discoverTableFactory(SourceTableFactory.class, sourceConfig.getType()); + SourceTableFactory tableFactory = FactoryUtil.discoverConnectorFactory(SourceTableFactory.class, sourceConfig.getType()); Map<String, String> options = sourceConfig.getProperties(); Configuration configuration = Configuration.fromMap(options); Schema schema = null; if(sourceConfig.getSchema() != null && !sourceConfig.getSchema().isEmpty()){ schema = SchemaConfigParse.parseSchemaConfig(sourceConfig.getSchema()); } - TableFactory.Context context = new TableFactory.Context(schema, options, configuration); + ConnectorFactory.Context context = new ConnectorFactory.Context(schema, options, configuration); SourceProvider sourceProvider = tableFactory.getSourceProvider(context); if(!sourceProvider.supportDynamicSchema() && schema != null && schema.isDynamic()){ throw new UnsupportedOperationException(String.format("source(%s) not support DynamicSchema", sourceConfig.getName())); @@ -89,18 +85,18 @@ public class SourceExecutor extends AbstractExecutor<String, SourceConfig> { System.out.println(String.format("source(%s) schema:\n%s", sourceConfig.getName(), schema.getDataType().treeString())); } - sourceSingleOutputStreamOperator = sourceProvider.produceDataStream(jobRuntimeEnvironment.getStreamExecutionEnvironment()).name(sourceConfig.getName()); - if (node.getParallelism() > 0) { - sourceSingleOutputStreamOperator.setParallelism(node.getParallelism()); + sourceSingleOutputStreamOperator = sourceProvider.produceDataStream(environment.getStreamExecutionEnvironment()).name(sourceConfig.getName()); + if (jobTopologyNode.getParallelism() > 0) { + sourceSingleOutputStreamOperator.setParallelism(jobTopologyNode.getParallelism()); } - sourceSingleOutputStreamOperator = setWatermarkIfNecessary(sourceSingleOutputStreamOperator, sourceConfig, node); + sourceSingleOutputStreamOperator = setWatermarkIfNecessary(sourceSingleOutputStreamOperator, sourceConfig, jobTopologyNode); return sourceSingleOutputStreamOperator; } catch (Exception e) { throw new JobExecuteException("Create source instance failed!", e); } } - private SingleOutputStreamOperator<Event> setWatermarkIfNecessary(SingleOutputStreamOperator<Event> dataStream, SourceConfig sourceConfig, Node node){ + private SingleOutputStreamOperator<Event> setWatermarkIfNecessary(SingleOutputStreamOperator<Event> dataStream, SourceConfig sourceConfig, JobTopologyNode jobTopologyNode){ final String watermarkTimestamp = 
sourceConfig.getWatermark_timestamp(); if(StringUtils.isNotBlank(watermarkTimestamp)){ String timestampUnit = sourceConfig.getWatermark_timestamp_unit(); @@ -139,8 +135,8 @@ public class SourceExecutor extends AbstractExecutor<String, SourceConfig> { WatermarkStrategy.<Event>forBoundedOutOfOrderness(Duration.ofMillis(watermarkLag)) .withTimestampAssigner(timestampAssigner) ); - if (node.getParallelism() > 0) { - dataStream.setParallelism(node.getParallelism()); + if (jobTopologyNode.getParallelism() > 0) { + dataStream.setParallelism(jobTopologyNode.getParallelism()); } } return dataStream; diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SplitExecutor.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SplitExecutor.java deleted file mode 100644 index 7fe93b5..0000000 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/execution/SplitExecutor.java +++ /dev/null @@ -1,108 +0,0 @@ -package com.geedgenetworks.bootstrap.execution; - -import com.alibaba.fastjson.JSONObject; -import com.geedgenetworks.bootstrap.exception.JobExecuteException; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.config.CheckConfigUtil; -import com.geedgenetworks.common.config.CheckResult; -import com.geedgenetworks.common.config.SplitConfigOptions; -import com.geedgenetworks.common.exception.CommonErrorCode; -import com.geedgenetworks.common.exception.ConfigValidationException; -import com.geedgenetworks.common.udf.RuleContext; -import com.geedgenetworks.core.filter.Filter; -import com.geedgenetworks.core.pojo.FilterConfig; -import com.geedgenetworks.core.pojo.SplitConfig; -import com.geedgenetworks.core.split.Split; -import com.google.common.collect.Maps; -import com.typesafe.config.Config; -import lombok.extern.slf4j.Slf4j; -import org.apache.flink.streaming.api.datastream.DataStream; - -import java.net.URL; -import java.util.List; -import java.util.Map; -import java.util.ServiceLoader; - - -/** - * Initialize config and execute filter operator - */ -@Slf4j -public class SplitExecutor extends AbstractExecutor<String, SplitConfig> { - - - public SplitExecutor(List<URL> jarPaths, Config config) { - super(jarPaths, config); - } - - @Override - protected Map<String, SplitConfig> initialize(List<URL> jarPaths, Config operatorConfig) { - Map<String, SplitConfig> splitConfigMap = Maps.newHashMap(); - if (operatorConfig.hasPath(Constants.SPLITS)) { - Config splitsConfig = operatorConfig.getConfig(Constants.SPLITS); - splitsConfig.root().unwrapped().forEach((key, value) -> { - CheckResult result = CheckConfigUtil.checkAllExists(splitsConfig.getConfig(key), - SplitConfigOptions.TYPE.key()); - if (!result.isSuccess()) { - throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format( - "split: %s, Message: %s", - key, result.getMsg())); - } - SplitConfig splitConfig = new JSONObject((Map<String, Object>) value).toJavaObject(SplitConfig.class); - splitConfig.setName(key); - splitConfigMap.put(key, splitConfig); - }); - } - - return splitConfigMap; - } - - @Override - public DataStream<Event> execute(DataStream<Event> dataStream, Node node) throws JobExecuteException { - SplitConfig splitConfig = operatorMap.get(node.getName()); - boolean found = false; // flag variable - ServiceLoader<Split> splits = ServiceLoader.load(Split.class); - for (Split split : splits) { - found = true; // flag variable - if(split.type().equals(splitConfig.getType())){ - if (node.getParallelism() > 0)
{ - splitConfig.setParallelism(node.getParallelism()); - } - try { - dataStream = - split.splitFunction( - dataStream, splitConfig); - } catch (Exception e) { - throw new JobExecuteException("Create split instance failed!", e); - } - break; - } - } - if (!found) { - throw new JobExecuteException("No matching split found for type: " + splitConfig.getType()); - } - return dataStream; - } - - protected SplitConfig checkConfig(String key, Map<String, Object> value, Config config) { - SplitConfig splitConfig = new SplitConfig(); - boolean found = false; // flag variable - ServiceLoader<Split> splits = ServiceLoader.load(Split.class); - for (Split split : splits) { - if(split.type().equals(value.getOrDefault("type", "").toString())){ - found = true; - try { - splitConfig = split.checkConfig(key, value, config); - } catch (Exception e) { - throw new JobExecuteException("Create split pipeline instance failed!", e); - } - break; - } - } - if (!found) { - throw new JobExecuteException("No matching split found for type: " + value.getOrDefault("type", "").toString()); - } - return splitConfig; - } -} diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CommandLineUtils.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CommandLineUtils.java index 6c7d546..b87d05f 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CommandLineUtils.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CommandLineUtils.java @@ -4,7 +4,7 @@ import com.beust.jcommander.JCommander; import com.beust.jcommander.ParameterException; import com.geedgenetworks.bootstrap.command.CommandArgs; import com.geedgenetworks.bootstrap.command.UsageFormatter; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; public final class CommandLineUtils { private CommandLineUtils() { diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CryptoShadeUtils.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CryptoShadeUtils.java index 94dda4d..238bc07 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CryptoShadeUtils.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/CryptoShadeUtils.java @@ -2,8 +2,8 @@ package com.geedgenetworks.bootstrap.utils; import com.alibaba.fastjson2.JSON; import com.alibaba.fastjson2.JSONObject; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.crypto.CryptoShade; +import com.geedgenetworks.bootstrap.command.CryptoShade; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.TypesafeConfigUtils; import com.google.common.base.Preconditions; import com.typesafe.config.*; diff --git a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/EnvironmentUtil.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/EnvironmentUtil.java index 8028608..472aab9 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/EnvironmentUtil.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/EnvironmentUtil.java @@ -1,7 +1,7 @@ package com.geedgenetworks.bootstrap.utils; import com.geedgenetworks.bootstrap.execution.ExecutionConfigKeyName; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.CheckResult; import com.typesafe.config.Config; import com.typesafe.config.ConfigUtil; diff --git
a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/SchemaConfigParse.java b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/SchemaConfigParse.java index c3076b4..00fcd61 100644 --- a/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/SchemaConfigParse.java +++ b/groot-bootstrap/src/main/java/com/geedgenetworks/bootstrap/utils/SchemaConfigParse.java @@ -1,73 +1,73 @@ -package com.geedgenetworks.bootstrap.utils;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.exception.CommonErrorCode;
-import com.geedgenetworks.common.exception.ConfigValidationException;
-import com.geedgenetworks.core.connector.schema.Schema;
-import com.geedgenetworks.core.connector.schema.SchemaParser;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.apache.commons.io.FileUtils;
-
-import java.io.File;
-import java.io.IOException;
-import java.nio.charset.StandardCharsets;
-import java.util.List;
-import java.util.Map;
-
-public class SchemaConfigParse {
- static final String KEY_BUILTIN = "fields";
- static final String KEY_LOCAL_FILE = "local_file";
- static final String KEY_HTTP = "url";
-
- public static Schema parseSchemaConfig(Map<String, Object> schemaConfig){
- if(schemaConfig == null && schemaConfig.isEmpty()){
- return null;
- }
-
- int builtin = 0, localFile = 0, http = 0;
- if(schemaConfig.containsKey(KEY_BUILTIN)){
- builtin = 1;
- }
- if(schemaConfig.containsKey(KEY_LOCAL_FILE)){
- localFile = 1;
- }
- if(schemaConfig.containsKey(KEY_HTTP)){
- http = 1;
- }
- if(builtin + localFile + http > 1){
- throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, "only support one type schema:" + schemaConfig);
- }
-
- if(builtin == 1){
- Object fields = schemaConfig.get(KEY_BUILTIN);
- if(fields instanceof List){
- StructType dataType = Types.parseSchemaFromJson(JSON.toJSONString(fields));
- return Schema.newSchema(dataType);
- }else if(fields instanceof String){
- StructType dataType = Types.parseStructType((String) fields);
- return Schema.newSchema(dataType);
- }else{
- throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, "only support schema fields:" + fields);
- }
- }
-
- if(localFile == 1){
- String path = schemaConfig.get(KEY_LOCAL_FILE).toString();
- try {
- String content = FileUtils.readFileToString(new File(path), StandardCharsets.UTF_8);
- StructType dataType = SchemaParser.PARSER_AVRO.parser(content);
- return Schema.newSchema(dataType);
- } catch (IOException e) {
- throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, "schema path read error:" + path, e);
- }
- }
-
- if(http == 1){
- String url = schemaConfig.get(KEY_HTTP).toString();
- return Schema.newHttpDynamicSchema(url);
- }
-
- return null;
- }
-}
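`SchemaConfigParse` (old version above, relocated version below) resolves a schema from exactly one of three mutually exclusive keys: inline `fields`, an Avro definition via `local_file`, or a dynamic HTTP `url`. A minimal usage sketch against the relocated class; the Avro path is hypothetical:

```java
package com.geedgenetworks.bootstrap.utils;

import com.geedgenetworks.api.connector.schema.Schema;

import java.util.HashMap;
import java.util.Map;

public class SchemaConfigParseExample {

    // Build the map form of a connector's `schema` option and resolve it;
    // supplying two or more of the keys raises a ConfigValidationException,
    // while a null or empty map yields no schema at all.
    public static Schema loadFromAvroFile() {
        Map<String, Object> schemaConfig = new HashMap<>();
        schemaConfig.put("local_file", "/opt/groot/schemas/event.avsc"); // hypothetical path
        return SchemaConfigParse.parseSchemaConfig(schemaConfig);
    }
}
```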
+package com.geedgenetworks.bootstrap.utils;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.common.exception.CommonErrorCode;
+import com.geedgenetworks.common.exception.ConfigValidationException;
+import com.geedgenetworks.api.connector.schema.Schema;
+import com.geedgenetworks.api.connector.schema.SchemaParser;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
+import org.apache.commons.io.FileUtils;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+
+public class SchemaConfigParse {
+ static final String KEY_BUILTIN = "fields";
+ static final String KEY_LOCAL_FILE = "local_file";
+ static final String KEY_HTTP = "url";
+
+ public static Schema parseSchemaConfig(Map<String, Object> schemaConfig){
+ if(schemaConfig == null || schemaConfig.isEmpty()){
+ return null;
+ }
+
+ int builtin = 0, localFile = 0, http = 0;
+ if(schemaConfig.containsKey(KEY_BUILTIN)){
+ builtin = 1;
+ }
+ if(schemaConfig.containsKey(KEY_LOCAL_FILE)){
+ localFile = 1;
+ }
+ if(schemaConfig.containsKey(KEY_HTTP)){
+ http = 1;
+ }
+ if(builtin + localFile + http > 1){
+ throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, "only support one type schema:" + schemaConfig);
+ }
+
+ if(builtin == 1){
+ Object fields = schemaConfig.get(KEY_BUILTIN);
+ if(fields instanceof List){
+ StructType dataType = Types.parseSchemaFromJson(JSON.toJSONString(fields));
+ return Schema.newSchema(dataType);
+ }else if(fields instanceof String){
+ StructType dataType = Types.parseStructType((String) fields);
+ return Schema.newSchema(dataType);
+ }else{
+ throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, "only support schema fields:" + fields);
+ }
+ }
+
+ if(localFile == 1){
+ String path = schemaConfig.get(KEY_LOCAL_FILE).toString();
+ try {
+ String content = FileUtils.readFileToString(new File(path), StandardCharsets.UTF_8);
+ StructType dataType = SchemaParser.PARSER_AVRO.parser(content);
+ return Schema.newSchema(dataType);
+ } catch (IOException e) {
+ throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, "schema path read error:" + path, e);
+ }
+ }
+
+ if(http == 1){
+ String url = schemaConfig.get(KEY_HTTP).toString();
+ return Schema.newHttpDynamicSchema(url);
+ }
+
+ return null;
+ }
+}
diff --git a/groot-bootstrap/src/main/resources/META-INF/services/com.geedgenetworks.common.crypto.CryptoShade b/groot-bootstrap/src/main/resources/META-INF/services/com.geedgenetworks.bootstrap.command.CryptoShade index 273b40d..273b40d 100644
--- a/groot-bootstrap/src/main/resources/META-INF/services/com.geedgenetworks.common.crypto.CryptoShade
+++ b/groot-bootstrap/src/main/resources/META-INF/services/com.geedgenetworks.bootstrap.command.CryptoShade
diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/LogoTest.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/LogoTest.java index d101bb2..aeb9d4b 100644
--- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/LogoTest.java
+++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/LogoTest.java
@@ -1,6 +1,6 @@ package com.geedgenetworks.bootstrap.main;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
import org.junit.Assert;
import org.junit.Test;
diff --git
a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobAggTest.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobAggTest.java index e33998c..fa9c2dd 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobAggTest.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobAggTest.java @@ -8,7 +8,7 @@ import com.geedgenetworks.bootstrap.execution.JobExecution; import com.geedgenetworks.bootstrap.main.simple.collect.CollectSink; import com.geedgenetworks.bootstrap.utils.CommandLineUtils; import com.geedgenetworks.bootstrap.utils.ConfigFileUtils; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.ConfigProvider; import com.geedgenetworks.common.config.GrootStreamConfig; import com.typesafe.config.Config; diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobDosTest.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobDosTest.java index ea3793e..bd4f9d8 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobDosTest.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobDosTest.java @@ -8,7 +8,7 @@ import com.geedgenetworks.bootstrap.execution.JobExecution; import com.geedgenetworks.bootstrap.main.simple.collect.CollectSink; import com.geedgenetworks.bootstrap.utils.CommandLineUtils; import com.geedgenetworks.bootstrap.utils.ConfigFileUtils; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.ConfigProvider; import com.geedgenetworks.common.config.GrootStreamConfig; import com.typesafe.config.Config; diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobEtlTest.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobEtlTest.java index 80b7129..1fc62d0 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobEtlTest.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobEtlTest.java @@ -8,7 +8,7 @@ import com.geedgenetworks.bootstrap.execution.JobExecution; import com.geedgenetworks.bootstrap.main.simple.collect.CollectSink; import com.geedgenetworks.bootstrap.utils.CommandLineUtils; import com.geedgenetworks.bootstrap.utils.ConfigFileUtils; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.ConfigProvider; import com.geedgenetworks.common.config.GrootStreamConfig; import com.typesafe.config.Config; @@ -93,7 +93,6 @@ public class JobEtlTest { executeCommandArgs.buildCommand(); - GrootStreamConfig grootStreamConfig = ConfigProvider.locateAndGetGrootStreamConfig(); Path configFile = ConfigFileUtils.getConfigPath(executeCommandArgs); // check config file exist diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobSplitTest.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobSplitTest.java index 352bad2..577b293 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobSplitTest.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/JobSplitTest.java @@ -3,13 +3,12 @@ package com.geedgenetworks.bootstrap.main.simple; import cn.hutool.setting.yaml.YamlUtil; import 
com.geedgenetworks.bootstrap.command.ExecuteCommandArgs; import com.geedgenetworks.bootstrap.enums.EngineType; -import com.geedgenetworks.bootstrap.exception.JobExecuteException; import com.geedgenetworks.bootstrap.execution.ExecutionConfigKeyName; import com.geedgenetworks.bootstrap.execution.JobExecution; import com.geedgenetworks.bootstrap.main.simple.collect.CollectSink; import com.geedgenetworks.bootstrap.utils.CommandLineUtils; import com.geedgenetworks.bootstrap.utils.ConfigFileUtils; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.ConfigProvider; import com.geedgenetworks.common.config.GrootStreamConfig; import com.typesafe.config.Config; @@ -23,7 +22,6 @@ import org.junit.ClassRule; import org.junit.Test; import java.nio.file.Path; -import java.util.List; import java.util.Map; import static org.junit.Assert.assertTrue; diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectSink.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectSink.java index c5806ed..ccb01a4 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectSink.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectSink.java @@ -1,6 +1,6 @@ package com.geedgenetworks.bootstrap.main.simple.collect; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.api.connector.event.Event; import org.apache.flink.streaming.api.functions.sink.SinkFunction; import java.util.*; diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectTableFactory.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectTableFactory.java index 32a0acd..15d6328 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectTableFactory.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/main/simple/collect/CollectTableFactory.java @@ -1,8 +1,8 @@ package com.geedgenetworks.bootstrap.main.simple.collect; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.connector.sink.SinkProvider; -import com.geedgenetworks.core.factories.SinkTableFactory; +import com.geedgenetworks.api.connector.sink.SinkProvider; +import com.geedgenetworks.api.connector.sink.SinkTableFactory; +import com.geedgenetworks.api.connector.event.Event; import org.apache.flink.configuration.ConfigOption; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.DataStreamSink; @@ -16,7 +16,7 @@ import java.util.Set; public class CollectTableFactory implements SinkTableFactory { public static final String IDENTIFIER = "collect"; @Override - public String factoryIdentifier() { + public String type() { return IDENTIFIER; } diff --git a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/utils/CryptoShadeTest.java b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/utils/CryptoShadeTest.java index f77ba44..a3d2bd5 100644 --- a/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/utils/CryptoShadeTest.java +++ b/groot-bootstrap/src/test/java/com/geedgenetworks/bootstrap/utils/CryptoShadeTest.java @@ -3,7 +3,7 @@ package com.geedgenetworks.bootstrap.utils; import cn.hutool.setting.yaml.YamlUtil; import com.alibaba.fastjson2.JSON; import com.alibaba.fastjson2.JSONObject; -import 
com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.typesafe.config.Config; import com.typesafe.config.ConfigObject; import com.typesafe.config.ConfigRenderOptions; diff --git a/groot-common/pom.xml b/groot-common/pom.xml index 37a4d25..66096ae 100644 --- a/groot-common/pom.xml +++ b/groot-common/pom.xml @@ -11,16 +11,13 @@ <artifactId>groot-common</artifactId> <name>Groot : Common</name> - <properties> - - </properties> - <dependencies> <dependency> - <groupId>com.jayway.jsonpath</groupId> - <artifactId>json-path</artifactId> + <groupId>com.typesafe</groupId> + <artifactId>config</artifactId> </dependency> + <dependency> <groupId>com.alibaba</groupId> <artifactId>fastjson</artifactId> @@ -32,70 +29,66 @@ </dependency> <dependency> - <groupId>com.googlecode.aviator</groupId> - <artifactId>aviator</artifactId> - </dependency> - - <dependency> <groupId>cn.hutool</groupId> <artifactId>hutool-all</artifactId> </dependency> - <dependency> - <groupId>org.bouncycastle</groupId> - <artifactId>bcprov-jdk18on</artifactId> - </dependency> - - - <dependency> - <groupId>org.apache.avro</groupId> - <artifactId>avro</artifactId> + <groupId>com.geedgenetworks</groupId> + <artifactId>sketches</artifactId> </dependency> <dependency> - <groupId>com.geedgenetworks</groupId> - <artifactId>galaxy</artifactId> + <groupId>com.alibaba.nacos</groupId> + <artifactId>nacos-client</artifactId> <exclusions> <exclusion> - <groupId>org.apache.httpcomponents</groupId> - <artifactId>httpclient</artifactId> + <groupId>commons-codec</groupId> + <artifactId>commons-codec</artifactId> </exclusion> </exclusions> </dependency> + <dependency> - <groupId>org.yaml</groupId> - <artifactId>snakeyaml</artifactId> + <groupId>org.apache.avro</groupId> + <artifactId>avro</artifactId> </dependency> <dependency> - <groupId>com.github.rholder</groupId> - <artifactId>guava-retrying</artifactId> + <groupId>com.googlecode.aviator</groupId> + <artifactId>aviator</artifactId> </dependency> <dependency> <groupId>com.hazelcast</groupId> <artifactId>hazelcast</artifactId> </dependency> + <dependency> - <groupId>com.lmax</groupId> - <artifactId>disruptor</artifactId> + <groupId>com.geedgenetworks</groupId> + <artifactId>http-client-shaded</artifactId> + <version>${project.version}</version> + <exclusions> + <exclusion> + <groupId>commons-codec</groupId> + <artifactId>commons-codec</artifactId> + </exclusion> + </exclusions> + <classifier>optional</classifier> </dependency> - <!-- flink --> <dependency> - <groupId>org.apache.flink</groupId> - <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId> - <scope>provided</scope> + <groupId>com.geedgenetworks</groupId> + <artifactId>galaxy</artifactId> + <exclusions> + <exclusion> + <groupId>org.apache.httpcomponents</groupId> + <artifactId>httpclient</artifactId> + </exclusion> + </exclusions> </dependency> </dependencies> - <build> - <plugins> - - </plugins> - </build> - </project> diff --git a/groot-common/src/main/java/com/geedgenetworks/common/Accumulator.java b/groot-common/src/main/java/com/geedgenetworks/common/config/Accumulator.java index 403cecc..fdadea3 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/Accumulator.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/Accumulator.java @@ -1,8 +1,6 @@ -package com.geedgenetworks.common; +package com.geedgenetworks.common.config; import lombok.Data; -import org.apache.flink.metrics.Counter; - import java.io.Serializable; import java.util.Map; diff --git 
a/groot-common/src/main/java/com/geedgenetworks/common/config/CheckConfigUtil.java b/groot-common/src/main/java/com/geedgenetworks/common/config/CheckConfigUtil.java index 1d4e819..3084c00 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/CheckConfigUtil.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/CheckConfigUtil.java @@ -3,7 +3,6 @@ package com.geedgenetworks.common.config; import com.typesafe.config.Config; import java.util.Arrays; -import java.util.LinkedList; import java.util.List; import java.util.stream.Collectors; diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/CommonConfigLocator.java b/groot-common/src/main/java/com/geedgenetworks/common/config/CommonConfigLocator.java index 5302cc2..65e7437 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/CommonConfigLocator.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/CommonConfigLocator.java @@ -1,6 +1,5 @@ package com.geedgenetworks.common.config; -import com.geedgenetworks.common.Constants; import com.hazelcast.internal.config.AbstractConfigLocator; import static com.hazelcast.internal.config.DeclarativeConfigUtil.YAML_ACCEPTED_SUFFIXES; diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/ConfigProvider.java b/groot-common/src/main/java/com/geedgenetworks/common/config/ConfigProvider.java index a967ae5..5dfcc1c 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/ConfigProvider.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/ConfigProvider.java @@ -1,6 +1,5 @@ package com.geedgenetworks.common.config; -import com.geedgenetworks.common.Constants; import com.hazelcast.client.config.ClientConfig; import com.hazelcast.client.config.YamlClientConfigBuilder; import com.hazelcast.client.config.impl.YamlClientConfigLocator; diff --git a/groot-common/src/main/java/com/geedgenetworks/common/Constants.java b/groot-common/src/main/java/com/geedgenetworks/common/config/Constants.java index 27ce8fb..ac4d0bf 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/Constants.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/Constants.java @@ -1,4 +1,4 @@ -package com.geedgenetworks.common; +package com.geedgenetworks.common.config; public final class Constants { diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/FilterConfigOptions.java b/groot-common/src/main/java/com/geedgenetworks/common/config/FilterConfigOptions.java deleted file mode 100644 index a553608..0000000 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/FilterConfigOptions.java +++ /dev/null @@ -1,15 +0,0 @@ -package com.geedgenetworks.common.config; - -import java.util.Map; - -public interface FilterConfigOptions { - Option<String> TYPE = Options.key("type") - .stringType() - .noDefaultValue() - .withDescription("The type of filter ."); - - Option<Map<String, String>> PROPERTIES = Options.key("properties") - .mapType() - .noDefaultValue() - .withDescription("Custom properties for filter."); -} diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfig.java b/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfig.java index 189b05b..4dc6cbe 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfig.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfig.java @@ -1,6 +1,5 @@ package com.geedgenetworks.common.config; -import 
com.geedgenetworks.common.Constants; import com.hazelcast.config.Config; import lombok.extern.slf4j.Slf4j; diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfigBuilder.java b/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfigBuilder.java index 4b5a974..842d1bf 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfigBuilder.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/GrootStreamConfigBuilder.java @@ -13,6 +13,7 @@ import org.w3c.dom.Node; import java.io.InputStream; import java.util.Properties; + import static com.hazelcast.internal.config.yaml.W3cDomUtil.asW3cNode; public class GrootStreamConfigBuilder extends AbstractYamlConfigBuilder { diff --git a/groot-common/src/main/java/com/geedgenetworks/common/KeybyEntity.java b/groot-common/src/main/java/com/geedgenetworks/common/config/KeybyEntity.java index f1dc38f..3e6ded8 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/KeybyEntity.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/KeybyEntity.java @@ -1,4 +1,4 @@ -package com.geedgenetworks.common; +package com.geedgenetworks.common.config; import lombok.Data; diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/UDFPluginConfigLocator.java b/groot-common/src/main/java/com/geedgenetworks/common/config/UDFPluginConfigLocator.java index 49da576..4be72b6 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/UDFPluginConfigLocator.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/config/UDFPluginConfigLocator.java @@ -1,6 +1,5 @@ package com.geedgenetworks.common.config; -import com.geedgenetworks.common.Constants; import com.hazelcast.internal.config.AbstractConfigLocator; import lombok.extern.slf4j.Slf4j; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/utils/HttpClientPoolUtil.java b/groot-common/src/main/java/com/geedgenetworks/common/utils/HttpClientPoolUtil.java index 56e4540..cb98cc9 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/utils/HttpClientPoolUtil.java +++ b/groot-common/src/main/java/com/geedgenetworks/common/utils/HttpClientPoolUtil.java @@ -1,8 +1,7 @@ -package com.geedgenetworks.core.utils; +package com.geedgenetworks.common.utils; -import com.geedgenetworks.shaded.org.apache.http.Header; +import com.geedgenetworks.shaded.org.apache.http.*; import com.geedgenetworks.shaded.org.apache.http.HttpEntity; -import com.geedgenetworks.shaded.org.apache.http.HttpStatus; import com.geedgenetworks.shaded.org.apache.http.client.methods.CloseableHttpResponse; import com.geedgenetworks.shaded.org.apache.http.client.methods.HttpGet; import com.geedgenetworks.shaded.org.apache.http.config.Registry; diff --git a/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/ClickHouseTableFactory.java b/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/ClickHouseTableFactory.java index 441bc00..274061d 100644 --- a/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/ClickHouseTableFactory.java +++ b/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/ClickHouseTableFactory.java @@ -1,12 +1,12 @@ package com.geedgenetworks.connectors.clickhouse; import com.geedgenetworks.connectors.clickhouse.sink.EventBatchIntervalClickHouseSink; -import com.geedgenetworks.core.connector.schema.Schema; -import 
com.geedgenetworks.core.connector.sink.SinkProvider; -import com.geedgenetworks.core.factories.FactoryUtil; -import com.geedgenetworks.core.factories.FactoryUtil.TableFactoryHelper; -import com.geedgenetworks.core.factories.SinkTableFactory; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.api.connector.sink.SinkProvider; +import com.geedgenetworks.api.connector.sink.SinkTableFactory; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.factory.FactoryUtil.TableFactoryHelper; +import com.geedgenetworks.api.connector.schema.Schema; import org.apache.flink.configuration.ConfigOption; import org.apache.flink.configuration.MemorySize; import org.apache.flink.configuration.ReadableConfig; @@ -25,7 +25,7 @@ import static com.geedgenetworks.connectors.clickhouse.ClickHouseConnectorOption public class ClickHouseTableFactory implements SinkTableFactory { public static final String IDENTIFIER = "clickhouse"; @Override - public String factoryIdentifier() { + public String type() { return IDENTIFIER; } diff --git a/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/sink/EventBatchIntervalClickHouseSink.java b/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/sink/EventBatchIntervalClickHouseSink.java index f8600b8..4b64a84 100644 --- a/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/sink/EventBatchIntervalClickHouseSink.java +++ b/groot-connectors/connector-clickhouse/src/main/java/com/geedgenetworks/connectors/clickhouse/sink/EventBatchIntervalClickHouseSink.java @@ -1,12 +1,12 @@ package com.geedgenetworks.connectors.clickhouse.sink; import com.alibaba.fastjson2.JSON; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.connector.schema.Schema; -import com.geedgenetworks.core.connector.schema.SchemaChangeAware; -import com.geedgenetworks.core.metrics.InternalMetrics; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.metrics.InternalMetrics; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.connector.schema.Schema; +import com.geedgenetworks.api.connector.schema.SchemaChangeAware; +import com.geedgenetworks.api.connector.type.StructType; import com.github.housepower.data.Block; import org.apache.flink.configuration.Configuration; diff --git a/groot-connectors/connector-clickhouse/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-connectors/connector-clickhouse/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory index 9f8187a..9f8187a 100644 --- a/groot-connectors/connector-clickhouse/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ b/groot-connectors/connector-clickhouse/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory diff --git a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileSourceProvider.java b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileSourceProvider.java index 28cf68a..4a3fd77 100644 --- a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileSourceProvider.java +++ b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileSourceProvider.java @@ -1,94 +1,94 @@ -package com.geedgenetworks.connectors.file;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.commons.io.IOUtils;
-import org.apache.commons.io.LineIterator;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
-import org.apache.flink.streaming.api.functions.source.SourceFunction;
-
-import java.io.*;
-import java.nio.charset.StandardCharsets;
-
-public class FileSourceProvider implements SourceProvider {
- private final StructType physicalDataType;
- private final DeserializationSchema<Event> deserialization;
-
- private final String path;
- private final boolean readLocalFileInClient;
- private final int rowsPerSecond;
- private final long numberOfRows;
- private final long millisPerRow;
-
- public FileSourceProvider(StructType physicalDataType, DeserializationSchema<Event> deserialization, String path, boolean readLocalFileInClient, int rowsPerSecond, long numberOfRows, long millisPerRow) {
- this.physicalDataType = physicalDataType;
- this.deserialization = deserialization;
- this.path = path;
- this.readLocalFileInClient = readLocalFileInClient;
- this.rowsPerSecond = rowsPerSecond;
- this.numberOfRows = numberOfRows;
- this.millisPerRow = millisPerRow;
- }
-
- @Override
- public SingleOutputStreamOperator<Event> produceDataStream(StreamExecutionEnvironment env) {
- boolean isLocalPath = !path.startsWith("hdfs://");
-
- SourceFunction<Event> sourceFunction = null;
- if (isLocalPath) {
- if (readLocalFileInClient) {
- byte[] lineBytes = getLocalTextFileLineBytes(path);
- sourceFunction = new MemoryTextFileSource(deserialization, lineBytes, rowsPerSecond, numberOfRows, millisPerRow);
- } else {
- sourceFunction = new LocalTextFileSource(deserialization, path, rowsPerSecond, numberOfRows, millisPerRow);
- }
- } else {
- sourceFunction = new HdfsTextFileSource(deserialization, path, rowsPerSecond, numberOfRows, millisPerRow);
- }
-
- return env.addSource(sourceFunction);
- }
-
- @Override
- public StructType getPhysicalDataType() {
- return physicalDataType;
- }
-
- private byte[] getLocalTextFileLineBytes(String path) {
- try {
- File file = new File(path);
- long fileLength = file.length();
- if(fileLength > (1 << 20) * 128){
- throw new IllegalArgumentException(String.format("file:%s size is bigger than 128MB", path));
- }
-
- ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
- byte[] intBytes = new byte[4];
- byte[] bytes;
- try(InputStream inputStream = new FileInputStream(file)){
- LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
- while (lines.hasNext()) {
- String line = lines.next().trim();
- if(line.isEmpty()){
- continue;
- }
- bytes = line.getBytes(StandardCharsets.UTF_8);
- intBytes[0] = (byte) (bytes.length >> 24);
- intBytes[1] = (byte) (bytes.length >> 16);
- intBytes[2] = (byte) (bytes.length >> 8);
- intBytes[3] = (byte) bytes.length;
- outputStream.write(intBytes);
- outputStream.write(bytes);
- }
- }
-
- return outputStream.toByteArray();
- } catch (IOException e) {
- throw new RuntimeException(e);
- }
- }
-
-}
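A note on the file framing above (carried over unchanged into the relocated version that follows): `getLocalTextFileLineBytes` refuses files larger than 128 MB, then writes each non-empty line as a 4-byte big-endian length followed by the line's UTF-8 bytes; `MemoryTextFileSource` later reads that buffer back with `ByteBuffer.getInt()` and `get(byte[])`. The sketch below is an editorial illustration of the same framing, not project code; the `LineFramingDemo` name is invented, and `ByteBuffer`'s default big-endian order is what makes `putInt` equivalent to the manual shift-and-mask encoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LineFramingDemo {
    public static void main(String[] args) {
        // Encode one record: 4-byte big-endian length prefix, then the UTF-8 payload,
        // matching the intBytes[0..3] shifts in getLocalTextFileLineBytes.
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + payload.length);
        buffer.putInt(payload.length);
        buffer.put(payload);
        buffer.flip();

        // Decode it the way MemoryTextFileSource.run() consumes records.
        int lineSize = buffer.getInt();
        byte[] line = new byte[lineSize];
        buffer.get(line);
        System.out.println(new String(line, StandardCharsets.UTF_8)); // prints "hello"
    }
}
```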
+package com.geedgenetworks.connectors.file;
+
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.io.LineIterator;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.source.SourceFunction;
+
+import java.io.*;
+import java.nio.charset.StandardCharsets;
+
+public class FileSourceProvider implements SourceProvider {
+ private final StructType physicalDataType;
+ private final DeserializationSchema<Event> deserialization;
+
+ private final String path;
+ private final boolean readLocalFileInClient;
+ private final int rowsPerSecond;
+ private final long numberOfRows;
+ private final long millisPerRow;
+
+ public FileSourceProvider(StructType physicalDataType, DeserializationSchema<Event> deserialization, String path, boolean readLocalFileInClient, int rowsPerSecond, long numberOfRows, long millisPerRow) {
+ this.physicalDataType = physicalDataType;
+ this.deserialization = deserialization;
+ this.path = path;
+ this.readLocalFileInClient = readLocalFileInClient;
+ this.rowsPerSecond = rowsPerSecond;
+ this.numberOfRows = numberOfRows;
+ this.millisPerRow = millisPerRow;
+ }
+
+ @Override
+ public SingleOutputStreamOperator<Event> produceDataStream(StreamExecutionEnvironment env) {
+ boolean isLocalPath = !path.startsWith("hdfs://");
+
+ SourceFunction<Event> sourceFunction = null;
+ if (isLocalPath) {
+ if (readLocalFileInClient) {
+ byte[] lineBytes = getLocalTextFileLineBytes(path);
+ sourceFunction = new MemoryTextFileSource(deserialization, lineBytes, rowsPerSecond, numberOfRows, millisPerRow);
+ } else {
+ sourceFunction = new LocalTextFileSource(deserialization, path, rowsPerSecond, numberOfRows, millisPerRow);
+ }
+ } else {
+ sourceFunction = new HdfsTextFileSource(deserialization, path, rowsPerSecond, numberOfRows, millisPerRow);
+ }
+
+ return env.addSource(sourceFunction);
+ }
+
+ @Override
+ public StructType getPhysicalDataType() {
+ return physicalDataType;
+ }
+
+ private byte[] getLocalTextFileLineBytes(String path) {
+ try {
+ File file = new File(path);
+ long fileLength = file.length();
+ if(fileLength > (1 << 20) * 128){
+ throw new IllegalArgumentException(String.format("file:%s size is bigger than 128MB", path));
+ }
+
+ ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
+ byte[] intBytes = new byte[4];
+ byte[] bytes;
+ try(InputStream inputStream = new FileInputStream(file)){
+ LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
+ while (lines.hasNext()) {
+ String line = lines.next().trim();
+ if(line.isEmpty()){
+ continue;
+ }
+ bytes = line.getBytes(StandardCharsets.UTF_8);
+ intBytes[0] = (byte) (bytes.length >> 24);
+ intBytes[1] = (byte) (bytes.length >> 16);
+ intBytes[2] = (byte) (bytes.length >> 8);
+ intBytes[3] = (byte) bytes.length;
+ outputStream.write(intBytes);
+ outputStream.write(bytes);
+ }
+ }
+
+ return outputStream.toByteArray();
+ } catch (IOException e) {
+ throw new RuntimeException(e);
+ }
+ }
+
+}
diff --git a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileTableFactory.java
b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileTableFactory.java index 5e1bde5..02baa51 100644 --- a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileTableFactory.java +++ b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/FileTableFactory.java @@ -1,60 +1,60 @@ -package com.geedgenetworks.connectors.file;
-
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SourceTableFactory;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.configuration.ConfigOption;
-import org.apache.flink.configuration.ReadableConfig;
-
-import java.util.HashSet;
-import java.util.Set;
-
-import static com.geedgenetworks.connectors.file.FileConnectorOptions.*;
-
-public class FileTableFactory implements SourceTableFactory {
- public static final String IDENTIFIER = "file";
-
- @Override
- public String factoryIdentifier() {
- return IDENTIFIER;
- }
-
- @Override
- public SourceProvider getSourceProvider(Context context) {
- final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
- DecodingFormat decodingFormat = helper.discoverDecodingFormat(DecodingFormatFactory.class, FactoryUtil.FORMAT);
- helper.validate();
-
- StructType physicalDataType = context.getPhysicalDataType();
- ReadableConfig config = context.getConfiguration();
-
- String path = config.get(PATH).trim();
- boolean readLocalFileInClient = config.get(READ_LOCAL_FILE_IN_CLIENT);
- int rowsPerSecond = config.get(ROWS_PER_SECOND);
- long numberOfRows = config.get(NUMBER_OF_ROWS);
- long millisPerRow = config.get(MILLIS_PER_ROW);
-
- return new FileSourceProvider(physicalDataType, decodingFormat.createRuntimeDecoder(physicalDataType), path, readLocalFileInClient, rowsPerSecond, numberOfRows, millisPerRow);
- }
-
- @Override
- public Set<ConfigOption<?>> requiredOptions() {
- Set<ConfigOption<?>> options = new HashSet<>();
- options.add(PATH);
- options.add(FactoryUtil.FORMAT);
- return options;
- }
-
- @Override
- public Set<ConfigOption<?>> optionalOptions() {
- Set<ConfigOption<?>> options = new HashSet<>();
- options.add(ROWS_PER_SECOND);
- options.add(NUMBER_OF_ROWS);
- options.add(MILLIS_PER_ROW);
- options.add(READ_LOCAL_FILE_IN_CLIENT);
- return options;
- }
-}
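Throughout this merge, table factories trade `factoryIdentifier()` for `type()`, and their registration files move from `META-INF/services/com.geedgenetworks.core.factories.Factory` to `META-INF/services/com.geedgenetworks.api.factory.Factory` (the renames appear further down in this diff). Assuming those files feed the standard `java.util.ServiceLoader` mechanism, which is the usual purpose of `META-INF/services` entries even though the loading code is not shown here, lookup by identifier would work roughly as in this sketch; the `Factory` stub and `discover` helper are illustrative stand-ins, not the project's actual `FactoryUtil`:

```java
import java.util.ServiceLoader;

// Illustrative stand-in for com.geedgenetworks.api.factory.Factory.
interface Factory {
    String type(); // renamed from factoryIdentifier() in this merge
}

public final class FactoryDiscoveryDemo {
    // Hypothetical lookup: scan every registered Factory and match on type().
    static Factory discover(String type) {
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            if (factory.type().equals(type)) {
                return factory; // e.g. "file", "kafka", "clickhouse"
            }
        }
        throw new IllegalArgumentException("No factory registered for type: " + type);
    }
}
```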
+package com.geedgenetworks.connectors.file;
+
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.source.SourceTableFactory;
+import com.geedgenetworks.api.factory.DecodingFormatFactory;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ReadableConfig;
+
+import java.util.HashSet;
+import java.util.Set;
+
+import static com.geedgenetworks.connectors.file.FileConnectorOptions.*;
+
+public class FileTableFactory implements SourceTableFactory {
+ public static final String IDENTIFIER = "file";
+
+ @Override
+ public String type() {
+ return IDENTIFIER;
+ }
+
+ @Override
+ public SourceProvider getSourceProvider(Context context) {
+ final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
+ DecodingFormat decodingFormat = helper.discoverDecodingFormat(DecodingFormatFactory.class, FactoryUtil.FORMAT);
+ helper.validate();
+
+ StructType physicalDataType = context.getPhysicalDataType();
+ ReadableConfig config = context.getConfiguration();
+
+ String path = config.get(PATH).trim();
+ boolean readLocalFileInClient = config.get(READ_LOCAL_FILE_IN_CLIENT);
+ int rowsPerSecond = config.get(ROWS_PER_SECOND);
+ long numberOfRows = config.get(NUMBER_OF_ROWS);
+ long millisPerRow = config.get(MILLIS_PER_ROW);
+
+ return new FileSourceProvider(physicalDataType, decodingFormat.createRuntimeDecoder(physicalDataType), path, readLocalFileInClient, rowsPerSecond, numberOfRows, millisPerRow);
+ }
+
+ @Override
+ public Set<ConfigOption<?>> requiredOptions() {
+ Set<ConfigOption<?>> options = new HashSet<>();
+ options.add(PATH);
+ options.add(FactoryUtil.FORMAT);
+ return options;
+ }
+
+ @Override
+ public Set<ConfigOption<?>> optionalOptions() {
+ Set<ConfigOption<?>> options = new HashSet<>();
+ options.add(ROWS_PER_SECOND);
+ options.add(NUMBER_OF_ROWS);
+ options.add(MILLIS_PER_ROW);
+ options.add(READ_LOCAL_FILE_IN_CLIENT);
+ return options;
+ }
+}
diff --git a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/HdfsTextFileSource.java b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/HdfsTextFileSource.java
index 22fdcc0..09994f8 100644
--- a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/HdfsTextFileSource.java
+++ b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/HdfsTextFileSource.java
@@ -1,113 +1,113 @@
-package com.geedgenetworks.connectors.file;
-
-import com.geedgenetworks.common.Event;
-import org.apache.commons.io.IOUtils;
-import org.apache.commons.io.LineIterator;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
-import org.apache.flink.util.Preconditions;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.InputStream;
-import java.nio.charset.StandardCharsets;
-
-public class HdfsTextFileSource extends RichParallelSourceFunction<Event> {
- private static final Logger LOG = LoggerFactory.getLogger(HdfsTextFileSource.class);
- private final DeserializationSchema<Event> deserialization;
- private final String path;
- private final int rowsPerSecond;
- private final long numberOfRows;
- private final long millisPerRow;
- private transient FileSystem fs;
- private volatile boolean stop;
-
- protected HdfsTextFileSource(DeserializationSchema<Event> deserialization, String path, int rowsPerSecond, long numberOfRows, long millisPerRow) {
- this.deserialization = deserialization;
- this.path = path;
- this.rowsPerSecond = rowsPerSecond;
- this.numberOfRows = numberOfRows;
- this.millisPerRow = millisPerRow;
- }
-
- @Override
- public void open(Configuration parameters) throws Exception {
- fs = new Path(path).getFileSystem(new org.apache.hadoop.conf.Configuration());
- Preconditions.checkArgument(fs.isFile(new Path(path)), "%s is not file", path);
- }
-
- @Override
- public void run(SourceContext<Event> ctx) throws Exception {
- final long rowsForSubtask = getRowsForSubTask();
- final long rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
-
- Event event;
- long rows = 0;
- int batchRows = 0;
- byte[] bytes;
- long batchStartTs = System.currentTimeMillis();
- long batchWait;
-
- while (!stop && rows < rowsForSubtask) {
- try(InputStream inputStream = fs.open(new Path(path))){
- LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
- while (!stop && lines.hasNext() && rows < rowsForSubtask) {
- String line = lines.next().trim();
- if(line.isEmpty()){
- continue;
- }
- bytes = line.getBytes(StandardCharsets.UTF_8);
- try {
- event = deserialization.deserialize(bytes);
- event.getExtractedFields().put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
- ctx.collect(event);
- rows += 1;
- } catch (Exception e) {
- LOG.error("deserialize error for:" + line, e);
- continue;
- }
-
- if(millisPerRow > 0){
- Thread.sleep(millisPerRow);
- }else{
- batchRows += 1;
- if(batchRows >= rowsPerSecondForSubtask){
- batchRows = 0;
- batchWait = 1000L - (System.currentTimeMillis() - batchStartTs);
- if(batchWait > 0) {
- Thread.sleep(batchWait);
- }
- batchStartTs = System.currentTimeMillis();
- }
- }
- }
- }
- }
- }
-
- private long getRowsPerSecondForSubTask() {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- long baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
- return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
- }
-
- private long getRowsForSubTask() {
- if (numberOfRows < 0) {
- return Long.MAX_VALUE;
- } else {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
- return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
- }
- }
-
- @Override
- public void cancel() {
- stop = true;
- }
-}
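The two `getRows...ForSubTask` helpers above, carried over unchanged into the new version below, split a global budget across parallel subtasks: every subtask gets `numberOfRows / numSubtasks` rows, and the first `numberOfRows % numSubtasks` subtasks take one extra row each, so the shares always sum back to the total. For instance, 10 rows over 3 subtasks yields 4, 3 and 3. A self-contained restatement of that arithmetic (class and method names are ours):

```java
public class RowSplitDemo {
    // Same remainder distribution as HdfsTextFileSource.getRowsForSubTask().
    static long rowsFor(long numberOfRows, int numSubtasks, int subtaskIndex) {
        long base = numberOfRows / numSubtasks;
        return (numberOfRows % numSubtasks > subtaskIndex) ? base + 1 : base;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println("subtask " + i + ": " + rowsFor(10, 3, i)); // 4, 3, 3
        }
    }
}
```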
+package com.geedgenetworks.connectors.file;
+
+import com.geedgenetworks.api.connector.event.Event;
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.io.LineIterator;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.apache.flink.util.Preconditions;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+
+public class HdfsTextFileSource extends RichParallelSourceFunction<Event> {
+ private static final Logger LOG = LoggerFactory.getLogger(HdfsTextFileSource.class);
+ private final DeserializationSchema<Event> deserialization;
+ private final String path;
+ private final int rowsPerSecond;
+ private final long numberOfRows;
+ private final long millisPerRow;
+ private transient FileSystem fs;
+ private volatile boolean stop;
+
+ protected HdfsTextFileSource(DeserializationSchema<Event> deserialization, String path, int rowsPerSecond, long numberOfRows, long millisPerRow) {
+ this.deserialization = deserialization;
+ this.path = path;
+ this.rowsPerSecond = rowsPerSecond;
+ this.numberOfRows = numberOfRows;
+ this.millisPerRow = millisPerRow;
+ }
+
+ @Override
+ public void open(Configuration parameters) throws Exception {
+ fs = new Path(path).getFileSystem(new org.apache.hadoop.conf.Configuration());
+ Preconditions.checkArgument(fs.isFile(new Path(path)), "%s is not file", path);
+ }
+
+ @Override
+ public void run(SourceContext<Event> ctx) throws Exception {
+ final long rowsForSubtask = getRowsForSubTask();
+ final long rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
+
+ Event event;
+ long rows = 0;
+ int batchRows = 0;
+ byte[] bytes;
+ long batchStartTs = System.currentTimeMillis();
+ long batchWait;
+
+ while (!stop && rows < rowsForSubtask) {
+ try(InputStream inputStream = fs.open(new Path(path))){
+ LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
+ while (!stop && lines.hasNext() && rows < rowsForSubtask) {
+ String line = lines.next().trim();
+ if(line.isEmpty()){
+ continue;
+ }
+ bytes = line.getBytes(StandardCharsets.UTF_8);
+ try {
+ event = deserialization.deserialize(bytes);
+ event.getExtractedFields().put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
+ ctx.collect(event);
+ rows += 1;
+ } catch (Exception e) {
+ LOG.error("deserialize error for:" + line, e);
+ continue;
+ }
+
+ if(millisPerRow > 0){
+ Thread.sleep(millisPerRow);
+ }else{
+ batchRows += 1;
+ if(batchRows >= rowsPerSecondForSubtask){
+ batchRows = 0;
+ batchWait = 1000L - (System.currentTimeMillis() - batchStartTs);
+ if(batchWait > 0) {
+ Thread.sleep(batchWait);
+ }
+ batchStartTs = System.currentTimeMillis();
+ }
+ }
+ }
+ }
+ }
+ }
+
+ private long getRowsPerSecondForSubTask() {
+ int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+ int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
+ long baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
+ return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
+ }
+
+ private long getRowsForSubTask() {
+ if (numberOfRows < 0) {
+ return Long.MAX_VALUE;
+ } else {
+ int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+ int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
+ final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
+ return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
+ }
+ }
+
+ @Override
+ public void cancel() {
+ stop = true;
+ }
+}
diff --git a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/LocalTextFileSource.java b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/LocalTextFileSource.java
index 28634a2..aec3f55 100644
--- a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/LocalTextFileSource.java
+++ b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/LocalTextFileSource.java
@@ -1,103 +1,103 @@
-package com.geedgenetworks.connectors.file;
-
-import com.geedgenetworks.common.Event;
-import org.apache.commons.io.IOUtils;
-import org.apache.commons.io.LineIterator;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.FileInputStream;
-import java.io.InputStream;
-import java.nio.charset.StandardCharsets;
-
-public class LocalTextFileSource extends RichParallelSourceFunction<Event> {
- private static final Logger LOG = LoggerFactory.getLogger(LocalTextFileSource.class);
- private final DeserializationSchema<Event> deserialization;
- private final String path;
- private final int rowsPerSecond;
- private final long numberOfRows;
- private final long millisPerRow;
- private volatile boolean stop;
-
- protected LocalTextFileSource(DeserializationSchema<Event> deserialization, String path, int rowsPerSecond, long numberOfRows, long millisPerRow) {
- this.deserialization = deserialization;
- this.path = path;
- this.rowsPerSecond = rowsPerSecond;
- this.numberOfRows = numberOfRows;
- this.millisPerRow = millisPerRow;
- }
-
- @Override
- public void run(SourceContext<Event> ctx) throws Exception {
- final long rowsForSubtask = getRowsForSubTask();
- final long rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
-
- Event event;
- long rows = 0;
- int batchRows = 0;
- byte[] bytes;
- long nextReadTime = System.currentTimeMillis();
- long waitMs;
-
- while (!stop && rows < rowsForSubtask) {
- try(InputStream inputStream = new FileInputStream(path)){
- LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
- while (!stop && lines.hasNext() && rows < rowsForSubtask) {
- String line = lines.next().trim();
- if(line.isEmpty()){
- continue;
- }
- bytes = line.getBytes(StandardCharsets.UTF_8);
- try {
- event = deserialization.deserialize(bytes);
- event.getExtractedFields().put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
- ctx.collect(event);
- rows += 1;
- } catch (Exception e) {
- LOG.error("deserialize error for:" + line, e);
- continue;
- }
-
- if(millisPerRow > 0){
- Thread.sleep(millisPerRow);
- }else{
- batchRows += 1;
- if(batchRows >= rowsPerSecondForSubtask){
- batchRows = 0;
- nextReadTime += 1000;
- waitMs = Math.max(0, nextReadTime - System.currentTimeMillis());
- if(waitMs > 0) {
- Thread.sleep(waitMs);
- }
- }
- }
- }
- }
- }
- }
-
- private long getRowsPerSecondForSubTask() {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- long baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
- return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
- }
-
- private long getRowsForSubTask() {
- if (numberOfRows < 0) {
- return Long.MAX_VALUE;
- } else {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
- return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
- }
- }
-
- @Override
- public void cancel() {
- stop = true;
- }
-}
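Worth noting before the relocated version below: the three file sources throttle differently. `HdfsTextFileSource` re-anchors each batch at the current time (`batchStartTs`/`batchWait`), while `LocalTextFileSource` and `MemoryTextFileSource` advance a fixed deadline with `nextReadTime += 1000`, so a batch that overruns its second shortens the next wait instead of letting the effective rate drift downward. A stripped-down sketch of the fixed-schedule variant (the loop body and class name are illustrative, not project code):

```java
public class FixedRatePacingDemo {
    public static void main(String[] args) throws InterruptedException {
        long nextReadTime = System.currentTimeMillis();
        for (int batch = 0; batch < 3; batch++) {
            // ... emit up to one second's worth of records here ...
            nextReadTime += 1000; // fixed schedule, independent of how long the batch took
            long waitMs = Math.max(0, nextReadTime - System.currentTimeMillis());
            if (waitMs > 0) {
                Thread.sleep(waitMs); // sleep only the remainder of the second
            }
        }
    }
}
```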
+package com.geedgenetworks.connectors.file;
+
+import com.geedgenetworks.api.connector.event.Event;
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.io.LineIterator;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.FileInputStream;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+
+public class LocalTextFileSource extends RichParallelSourceFunction<Event> {
+ private static final Logger LOG = LoggerFactory.getLogger(LocalTextFileSource.class);
+ private final DeserializationSchema<Event> deserialization;
+ private final String path;
+ private final int rowsPerSecond;
+ private final long numberOfRows;
+ private final long millisPerRow;
+ private volatile boolean stop;
+
+ protected LocalTextFileSource(DeserializationSchema<Event> deserialization, String path, int rowsPerSecond, long numberOfRows, long millisPerRow) {
+ this.deserialization = deserialization;
+ this.path = path;
+ this.rowsPerSecond = rowsPerSecond;
+ this.numberOfRows = numberOfRows;
+ this.millisPerRow = millisPerRow;
+ }
+
+ @Override
+ public void run(SourceContext<Event> ctx) throws Exception {
+ final long rowsForSubtask = getRowsForSubTask();
+ final long rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
+
+ Event event;
+ long rows = 0;
+ int batchRows = 0;
+ byte[] bytes;
+ long nextReadTime = System.currentTimeMillis();
+ long waitMs;
+
+ while (!stop && rows < rowsForSubtask) {
+ try(InputStream inputStream = new FileInputStream(path)){
+ LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
+ while (!stop && lines.hasNext() && rows < rowsForSubtask) {
+ String line = lines.next().trim();
+ if(line.isEmpty()){
+ continue;
+ }
+ bytes = line.getBytes(StandardCharsets.UTF_8);
+ try {
+ event = deserialization.deserialize(bytes);
+ event.getExtractedFields().put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
+ ctx.collect(event);
+ rows += 1;
+ } catch (Exception e) {
+ LOG.error("deserialize error for:" + line, e);
+ continue;
+ }
+
+ if(millisPerRow > 0){
+ Thread.sleep(millisPerRow);
+ }else{
+ batchRows += 1;
+ if(batchRows >= rowsPerSecondForSubtask){
+ batchRows = 0;
+ nextReadTime += 1000;
+ waitMs = Math.max(0, nextReadTime - System.currentTimeMillis());
+ if(waitMs > 0) {
+ Thread.sleep(waitMs);
+ }
+ }
+ }
+ }
+ }
+ }
+ }
+
+ private long getRowsPerSecondForSubTask() {
+ int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+ int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
+ long baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
+ return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
+ }
+
+ private long getRowsForSubTask() {
+ if (numberOfRows < 0) {
+ return Long.MAX_VALUE;
+ } else {
+ int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+ int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
+ final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
+ return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
+ }
+ }
+
+ @Override
+ public void cancel() {
+ stop = true;
+ }
+}
diff --git a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/MemoryTextFileSource.java b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/MemoryTextFileSource.java
index 35b9f4e..56444bb 100644
--- a/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/MemoryTextFileSource.java
+++ b/groot-connectors/connector-file/src/main/java/com/geedgenetworks/connectors/file/MemoryTextFileSource.java
@@ -1,99 +1,99 @@
-package com.geedgenetworks.connectors.file;
-
-import com.geedgenetworks.common.Event;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.nio.ByteBuffer;
-import java.nio.charset.StandardCharsets;
-
-public class MemoryTextFileSource extends RichParallelSourceFunction<Event> {
- private static final Logger LOG = LoggerFactory.getLogger(MemoryTextFileSource.class);
- private final DeserializationSchema<Event> deserialization;
- private final byte[] lineBytes;
- private final int rowsPerSecond;
- private final long numberOfRows;
- private final long millisPerRow;
- private volatile boolean stop;
-
- protected MemoryTextFileSource(DeserializationSchema<Event> deserialization, byte[] lineBytes, int rowsPerSecond, long numberOfRows, long millisPerRow) {
- this.deserialization = deserialization;
- this.lineBytes = lineBytes;
- this.rowsPerSecond = rowsPerSecond;
- this.numberOfRows = numberOfRows;
- this.millisPerRow = millisPerRow;
- }
-
- @Override
- public void run(SourceContext<Event> ctx) throws Exception {
- final long rowsForSubtask = getRowsForSubTask();
- final long rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
-
- Event event;
- long rows = 0;
- int batchRows = 0;
- byte[] bytes;
- long nextReadTime = System.currentTimeMillis();
- long waitMs;
- ByteBuffer buffer = ByteBuffer.wrap(lineBytes);
- int lineSize;
-
- while (!stop && rows < rowsForSubtask) {
- while (!stop && buffer.hasRemaining() && rows < rowsForSubtask){
- lineSize = buffer.getInt();
- bytes = new byte[lineSize];
- buffer.get(bytes);
- try {
- event = deserialization.deserialize(bytes);
- event.getExtractedFields().put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
- ctx.collect(event);
- rows += 1;
- } catch (Exception e) {
- LOG.error("deserialize error for:" + new String(bytes, StandardCharsets.UTF_8), e);
- continue;
- }
-
- if(millisPerRow > 0){
- Thread.sleep(millisPerRow);
- }else{
- batchRows += 1;
- if(batchRows >= rowsPerSecondForSubtask){
- batchRows = 0;
- nextReadTime += 1000;
- waitMs = Math.max(0, nextReadTime - System.currentTimeMillis());
- if(waitMs > 0) {
- Thread.sleep(waitMs);
- }
- }
- }
- }
- buffer.clear();
- }
-
- }
-
- private long getRowsPerSecondForSubTask() {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- long baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
- return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
- }
-
- private long getRowsForSubTask() {
- if (numberOfRows < 0) {
- return Long.MAX_VALUE;
- } else {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
- return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
- }
- }
-
- @Override
- public void cancel() {
- stop = true;
- }
-}
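One subtlety in `MemoryTextFileSource` (old version above, relocated version below): the `buffer.clear()` at the end of the outer loop does not erase the wrapped bytes, it only resets the buffer's position and limit, which is what lets the source replay the preloaded file repeatedly until `rowsForSubtask` is satisfied. A minimal demonstration (class name is ours):

```java
import java.nio.ByteBuffer;

public class BufferReplayDemo {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {1, 2, 3});
        while (buffer.hasRemaining()) {
            buffer.get(); // first pass drains the buffer
        }
        buffer.clear(); // rewinds position/limit; the wrapped data is untouched
        System.out.println(buffer.get()); // prints 1 again
    }
}
```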
+package com.geedgenetworks.connectors.file;
+
+import com.geedgenetworks.api.connector.event.Event;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.nio.ByteBuffer;
+import java.nio.charset.StandardCharsets;
+
+public class MemoryTextFileSource extends RichParallelSourceFunction<Event> {
+ private static final Logger LOG = LoggerFactory.getLogger(MemoryTextFileSource.class);
+ private final DeserializationSchema<Event> deserialization;
+ private final byte[] lineBytes;
+ private final int rowsPerSecond;
+ private final long numberOfRows;
+ private final long millisPerRow;
+ private volatile boolean stop;
+
+ protected MemoryTextFileSource(DeserializationSchema<Event> deserialization, byte[] lineBytes, int rowsPerSecond, long numberOfRows, long millisPerRow) {
+ this.deserialization = deserialization;
+ this.lineBytes = lineBytes;
+ this.rowsPerSecond = rowsPerSecond;
+ this.numberOfRows = numberOfRows;
+ this.millisPerRow = millisPerRow;
+ }
+
+ @Override
+ public void run(SourceContext<Event> ctx) throws Exception {
+ final long rowsForSubtask = getRowsForSubTask();
+ final long rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
+
+ Event event;
+ long rows = 0;
+ int batchRows = 0;
+ byte[] bytes;
+ long nextReadTime = System.currentTimeMillis();
+ long waitMs;
+ ByteBuffer buffer = ByteBuffer.wrap(lineBytes);
+ int lineSize;
+
+ while (!stop && rows < rowsForSubtask) {
+ while (!stop && buffer.hasRemaining() && rows < rowsForSubtask){
+ lineSize = buffer.getInt();
+ bytes = new byte[lineSize];
+ buffer.get(bytes);
+ try {
+ event = deserialization.deserialize(bytes);
+ event.getExtractedFields().put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
+ ctx.collect(event);
+ rows += 1;
+ } catch (Exception e) {
+ LOG.error("deserialize error for:" + new String(bytes, StandardCharsets.UTF_8), e);
+ continue;
+ }
+
+ if(millisPerRow > 0){
+ Thread.sleep(millisPerRow);
+ }else{
+ batchRows += 1;
+ if(batchRows >= rowsPerSecondForSubtask){
+ batchRows = 0;
+ nextReadTime += 1000;
+ waitMs = Math.max(0, nextReadTime - System.currentTimeMillis());
+ if(waitMs > 0) {
+ Thread.sleep(waitMs);
+ }
+ }
+ }
+ }
+ buffer.clear();
+ }
+
+ }
+
+ private long getRowsPerSecondForSubTask() {
+ int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+ int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
+ long baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
+ return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
+ }
+
+ private long getRowsForSubTask() {
+ if (numberOfRows < 0) {
+ return Long.MAX_VALUE;
+ } else {
+ int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+ int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
+ final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
+ return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
+ }
+ }
+
+ @Override
+ public void cancel() {
+ stop = true;
+ }
+}
diff --git a/groot-connectors/connector-file/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-connectors/connector-file/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
index d1c44cc..d1c44cc 100644
--- a/groot-connectors/connector-file/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory
+++ b/groot-connectors/connector-file/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
diff --git a/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixSourceProvider.java b/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixSourceProvider.java
index 2272781..7426637 100644
--- a/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixSourceProvider.java
+++ b/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixSourceProvider.java
@@ -3,11 +3,12 @@
import cn.hutool.log.Log;
import cn.hutool.log.LogFactory;
import com.geedgenetworks.connectors.ipfix.collector.utils.IPFixUtil;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.metrics.InternalMetrics;
-import com.geedgenetworks.core.types.*;
-import com.geedgenetworks.core.types.DataType;
+import com.geedgenetworks.api.metrics.InternalMetrics;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.type.*;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.DataType;
+import com.geedgenetworks.api.connector.type.StructType;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.ReadableConfig;
diff --git a/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixTableFactory.java b/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixTableFactory.java
index 3ef6b58..b4ca1e7 100644
--- a/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixTableFactory.java
+++ b/groot-connectors/connector-ipfix-collector/src/main/java/com/geedgenetworks/connectors/ipfix/collector/IPFixTableFactory.java
@@ -1,9 +1,9 @@
package com.geedgenetworks.connectors.ipfix.collector;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SourceTableFactory;
-import com.geedgenetworks.core.types.StructType;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.source.SourceTableFactory;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.connector.type.StructType;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ReadableConfig;
@@ -27,7 +27,7 @@ public class IPFixTableFactory implements SourceTableFactory {
}
@Override
- public String factoryIdentifier() {
+ public String type() {
return IDENTIFIER;
}
diff --git 
a/groot-connectors/connector-ipfix-collector/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-connectors/connector-ipfix-collector/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory index bcf4133..bcf4133 100644 --- a/groot-connectors/connector-ipfix-collector/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ b/groot-connectors/connector-ipfix-collector/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory diff --git a/groot-connectors/connector-kafka/pom.xml b/groot-connectors/connector-kafka/pom.xml index 448383b..aa7b5b4 100644 --- a/groot-connectors/connector-kafka/pom.xml +++ b/groot-connectors/connector-kafka/pom.xml @@ -22,10 +22,10 @@ </exclusion> </exclusions> </dependency> + <dependency> <groupId>org.xerial.snappy</groupId> <artifactId>snappy-java</artifactId> - <version>1.1.8.3</version> </dependency> </dependencies> diff --git a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/EventKafkaDeserializationSchema.java b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/EventKafkaDeserializationSchema.java index bd95dd6..35fcde7 100644 --- a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/EventKafkaDeserializationSchema.java +++ b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/EventKafkaDeserializationSchema.java @@ -1,7 +1,7 @@ package com.geedgenetworks.connectors.kafka; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.metrics.InternalMetrics; +import com.geedgenetworks.api.metrics.InternalMetrics; +import com.geedgenetworks.api.connector.event.Event; import org.apache.flink.api.common.functions.RuntimeContext; import org.apache.flink.api.common.serialization.DeserializationSchema; import org.apache.flink.api.common.typeinfo.TypeInformation; diff --git a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSinkProvider.java b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSinkProvider.java index 496e6a3..57a7d70 100644 --- a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSinkProvider.java +++ b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSinkProvider.java @@ -1,10 +1,10 @@ package com.geedgenetworks.connectors.kafka; -import com.geedgenetworks.common.Event; import com.geedgenetworks.connectors.kafka.rate.RateLimitingStrategy; -import com.geedgenetworks.core.connector.format.EncodingFormat; -import com.geedgenetworks.core.connector.sink.SinkProvider; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.connector.serialization.EncodingFormat; +import com.geedgenetworks.api.connector.sink.SinkProvider; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.api.common.serialization.SerializationSchema; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.DataStreamSink; diff --git a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSourceProvider.java b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSourceProvider.java index ad34557..6b6de05 100644 --- 
a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSourceProvider.java +++ b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaSourceProvider.java @@ -1,9 +1,9 @@ package com.geedgenetworks.connectors.kafka; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.connector.format.DecodingFormat; -import com.geedgenetworks.core.connector.source.SourceProvider; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.connector.serialization.DecodingFormat; +import com.geedgenetworks.api.connector.source.SourceProvider; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.api.common.serialization.DeserializationSchema; import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; diff --git a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaTableFactory.java b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaTableFactory.java index 394e618..dca76ed 100644 --- a/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaTableFactory.java +++ b/groot-connectors/connector-kafka/src/main/java/com/geedgenetworks/connectors/kafka/KafkaTableFactory.java @@ -1,146 +1,149 @@ -package com.geedgenetworks.connectors.kafka;
-
-import com.geedgenetworks.connectors.kafka.rate.BlockDropRateLimitingStrategy;
-import com.geedgenetworks.connectors.kafka.rate.NoRateLimitingStrategy;
-import com.geedgenetworks.connectors.kafka.rate.RateLimitingStrategy;
-import com.geedgenetworks.connectors.kafka.rate.RateLimitingStrategyType;
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.connector.sink.SinkProvider;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.*;
-import com.geedgenetworks.core.factories.FactoryUtil.TableFactoryHelper;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.configuration.ConfigOption;
-import org.apache.flink.configuration.ReadableConfig;
-import org.apache.flink.util.Preconditions;
-
-import java.util.*;
-
-import static com.geedgenetworks.connectors.kafka.KafkaConnectorOptions.*;
-import static com.geedgenetworks.connectors.kafka.KafkaConnectorOptionsUtil.*;
-
-public class KafkaTableFactory implements SourceTableFactory, SinkTableFactory {
- public static final String IDENTIFIER = "kafka";
- @Override
- public String factoryIdentifier() {
- return IDENTIFIER;
- }
-
- @Override
- public SourceProvider getSourceProvider(Context context) {
- final TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
- // Obtain the value DecodingFormat
- DecodingFormat valueDecodingFormat = helper.discoverDecodingFormat(DecodingFormatFactory.class, FactoryUtil.FORMAT);
-
- helper.validateExcept(PROPERTIES_PREFIX); // Validate options, excluding properties.* keys
-
- StructType physicalDataType = context.getPhysicalDataType(); // column data types
- ReadableConfig config = context.getConfiguration();
-
- List<String> topics = config.get(TOPIC);
- final Properties properties = getKafkaProperties(context.getOptions());
-
- return new KafkaSourceProvider(physicalDataType, valueDecodingFormat, topics, properties);
- }
-
- @Override
- public SinkProvider getSinkProvider(Context context) {
- final TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
- // Obtain the value EncodingFormat
- EncodingFormat valueEncodingFormat = helper.discoverEncodingFormat(EncodingFormatFactory.class, FactoryUtil.FORMAT);
-
- helper.validateExcept(PROPERTIES_PREFIX, HEADERS_PREFIX); // Validate options, excluding properties.* and headers.* keys
-
- StructType dataType = context.getDataType();
- ReadableConfig config = context.getConfiguration();
-
- String topic = config.get(TOPIC).get(0);
- boolean logFailuresOnly = config.get(LOG_FAILURES_ONLY);
- final Properties properties = getKafkaProperties(context.getOptions());
- Map<String, String> headers = getKafkaHeaders(context.getOptions());
-
- return new KafkaSinkProvider(dataType, valueEncodingFormat, topic, properties, headers, logFailuresOnly, getRateLimitingStrategy(config));
- }
-
- @Override
- public Set<ConfigOption<?>> requiredOptions() {
- final Set<ConfigOption<?>> options = new HashSet<>();
- options.add(TOPIC);
- options.add(FactoryUtil.FORMAT);
- options.add(PROPS_BOOTSTRAP_SERVERS);
- return options;
- }
-
- @Override
- public Set<ConfigOption<?>> optionalOptions() {
- final Set<ConfigOption<?>> options = new HashSet<>();
- options.add(LOG_FAILURES_ONLY);
- options.add(RATE_LIMITING_STRATEGY);
- options.add(RATE_LIMITING_LIMIT_RATE);
- options.add(RATE_LIMITING_WINDOW_SIZE);
- options.add(RATE_LIMITING_BLOCK_DURATION);
- options.add(RATE_LIMITING_BLOCK_RESET_DURATION);
- return options;
- }
-
- private RateLimitingStrategy getRateLimitingStrategy(ReadableConfig config){
- RateLimitingStrategyType strategyType = config.get(RATE_LIMITING_STRATEGY);
- switch (strategyType){
- case NONE:
- return new NoRateLimitingStrategy();
- case SLIDING_WINDOW:
- return new BlockDropRateLimitingStrategy(
- config.get(RATE_LIMITING_WINDOW_SIZE),
- parseRateLimitingRate(config.get(RATE_LIMITING_LIMIT_RATE)),
- config.get(RATE_LIMITING_BLOCK_DURATION).toMillis(),
- config.get(RATE_LIMITING_BLOCK_RESET_DURATION).toMillis());
- default:
- throw new IllegalArgumentException("not supported strategy:" + strategyType);
- }
- }
-
- private long parseRateLimitingRate(String text){
- Preconditions.checkNotNull(text);
- final String trimmed = text.trim();
- Preconditions.checkArgument(!trimmed.isEmpty());
-
- final int len = trimmed.length();
- int pos = 0;
-
- char current;
- while (pos < len && (current = trimmed.charAt(pos)) >= '0' && current <= '9') {
- pos++;
- }
-
- final String number = trimmed.substring(0, pos);
- final String unit = trimmed.substring(pos).trim().toLowerCase();
-
- if (number.isEmpty()) {
- throw new NumberFormatException("text does not start with a number");
- }
-
- final long value;
- try {
- value = Long.parseLong(number); // this throws a NumberFormatException on overflow
- } catch (NumberFormatException e) {
- throw new IllegalArgumentException("The value '" + number+ "' cannot be re represented as long.");
- }
-
- long multiplier;
- if("mbps".equals(unit)){
- multiplier = 1 << 20;
- }else if("kbps".equals(unit)){
- multiplier = 1 << 10;
- }else if("bps".equals(unit)){
- multiplier = 1;
- }else if(unit.isEmpty()){
- multiplier = 1;
- }else{
- throw new IllegalArgumentException(text);
- }
-
- // Convert bits to bytes
- return value * multiplier / 8;
- }
-}
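The `parseRateLimitingRate` helper above (kept verbatim in the relocated class below) accepts a digit run followed by an optional `bps`/`kbps`/`mbps` unit and converts the bit rate to bytes per second. As a quick, self-contained illustration of that convention, under the caveat that the class and method names here are invented for the example:

```java
public class RateStringDemo {
    // Mirrors the sink's rate-string convention: digits, then an optional
    // bps/kbps/mbps unit; binary multipliers; result is bytes per second.
    static long parseBytesPerSecond(String text) {
        String trimmed = text.trim();
        int pos = 0;
        while (pos < trimmed.length() && trimmed.charAt(pos) >= '0' && trimmed.charAt(pos) <= '9') {
            pos++;
        }
        long value = Long.parseLong(trimmed.substring(0, pos));
        String unit = trimmed.substring(pos).trim().toLowerCase();
        long multiplier;
        switch (unit) {
            case "mbps": multiplier = 1 << 20; break; // 2^20 bits
            case "kbps": multiplier = 1 << 10; break; // 2^10 bits
            case "bps":
            case "":     multiplier = 1; break;
            default: throw new IllegalArgumentException("Unknown unit: " + unit);
        }
        return value * multiplier / 8; // bits -> bytes
    }

    public static void main(String[] args) {
        System.out.println(parseBytesPerSecond("100mbps")); // 13107200 bytes/s
        System.out.println(parseBytesPerSecond("512kbps")); // 65536 bytes/s
        System.out.println(parseBytesPerSecond("800"));     // 100 bytes/s
    }
}
```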
+package com.geedgenetworks.connectors.kafka;
+
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import com.geedgenetworks.connectors.kafka.rate.BlockDropRateLimitingStrategy;
+import com.geedgenetworks.connectors.kafka.rate.NoRateLimitingStrategy;
+import com.geedgenetworks.connectors.kafka.rate.RateLimitingStrategy;
+import com.geedgenetworks.connectors.kafka.rate.RateLimitingStrategyType;
+import com.geedgenetworks.api.connector.sink.SinkProvider;
+import com.geedgenetworks.api.connector.sink.SinkTableFactory;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.source.SourceTableFactory;
+import com.geedgenetworks.api.factory.DecodingFormatFactory;
+import com.geedgenetworks.api.factory.EncodingFormatFactory;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ReadableConfig;
+import org.apache.flink.util.Preconditions;
+
+import java.util.*;
+
+import static com.geedgenetworks.connectors.kafka.KafkaConnectorOptions.*;
+import static com.geedgenetworks.connectors.kafka.KafkaConnectorOptionsUtil.*;
+
+public class KafkaTableFactory implements SourceTableFactory, SinkTableFactory {
+    public static final String IDENTIFIER = "kafka";
+    @Override
+    public String type() {
+        return IDENTIFIER;
+    }
+
+    @Override
+    public SourceProvider getSourceProvider(Context context) {
+        final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
+        // Obtain the value DecodingFormat
+        DecodingFormat valueDecodingFormat = helper.discoverDecodingFormat(DecodingFormatFactory.class, FactoryUtil.FORMAT);
+
+        helper.validateExcept(PROPERTIES_PREFIX); // Validate options, excluding properties.* keys
+
+        StructType physicalDataType = context.getPhysicalDataType(); // column data types
+        ReadableConfig config = context.getConfiguration();
+
+        List<String> topics = config.get(TOPIC);
+        final Properties properties = getKafkaProperties(context.getOptions());
+
+        return new KafkaSourceProvider(physicalDataType, valueDecodingFormat, topics, properties);
+    }
+
+    @Override
+    public SinkProvider getSinkProvider(Context context) {
+        final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
+        // Obtain the value EncodingFormat
+        EncodingFormat valueEncodingFormat = helper.discoverEncodingFormat(EncodingFormatFactory.class, FactoryUtil.FORMAT);
+
+        helper.validateExcept(PROPERTIES_PREFIX, HEADERS_PREFIX); // Validate options, excluding properties.* and headers.* keys
+
+        StructType dataType = context.getDataType();
+        ReadableConfig config = context.getConfiguration();
+
+        String topic = config.get(TOPIC).get(0);
+        boolean logFailuresOnly = config.get(LOG_FAILURES_ONLY);
+        final Properties properties = getKafkaProperties(context.getOptions());
+        Map<String, String> headers = getKafkaHeaders(context.getOptions());
+
+        return new KafkaSinkProvider(dataType, valueEncodingFormat, topic, properties, headers, logFailuresOnly, getRateLimitingStrategy(config));
+    }
+
+    @Override
+    public Set<ConfigOption<?>> requiredOptions() {
+        final Set<ConfigOption<?>> options = new HashSet<>();
+        options.add(TOPIC);
+        options.add(FactoryUtil.FORMAT);
+        options.add(PROPS_BOOTSTRAP_SERVERS);
+        return options;
+    }
+
+    @Override
+    public Set<ConfigOption<?>> optionalOptions() {
+        final Set<ConfigOption<?>> options = new HashSet<>();
+        options.add(LOG_FAILURES_ONLY);
+        options.add(RATE_LIMITING_STRATEGY);
+        options.add(RATE_LIMITING_LIMIT_RATE);
+        options.add(RATE_LIMITING_WINDOW_SIZE);
+        options.add(RATE_LIMITING_BLOCK_DURATION);
+        options.add(RATE_LIMITING_BLOCK_RESET_DURATION);
+        return options;
+    }
+
+    private RateLimitingStrategy getRateLimitingStrategy(ReadableConfig config){
+        RateLimitingStrategyType strategyType = config.get(RATE_LIMITING_STRATEGY);
+        switch (strategyType){
+            case NONE:
+                return new NoRateLimitingStrategy();
+            case SLIDING_WINDOW:
+                return new BlockDropRateLimitingStrategy(
+                        config.get(RATE_LIMITING_WINDOW_SIZE),
+                        parseRateLimitingRate(config.get(RATE_LIMITING_LIMIT_RATE)),
+                        config.get(RATE_LIMITING_BLOCK_DURATION).toMillis(),
+                        config.get(RATE_LIMITING_BLOCK_RESET_DURATION).toMillis());
+            default:
+                throw new IllegalArgumentException("Unsupported strategy: " + strategyType);
+        }
+    }
+
+    private long parseRateLimitingRate(String text){
+        Preconditions.checkNotNull(text);
+        final String trimmed = text.trim();
+        Preconditions.checkArgument(!trimmed.isEmpty());
+
+        final int len = trimmed.length();
+        int pos = 0;
+
+        char current;
+        while (pos < len && (current = trimmed.charAt(pos)) >= '0' && current <= '9') {
+            pos++;
+        }
+
+        final String number = trimmed.substring(0, pos);
+        final String unit = trimmed.substring(pos).trim().toLowerCase();
+
+        if (number.isEmpty()) {
+            throw new NumberFormatException("text does not start with a number");
+        }
+
+        final long value;
+        try {
+            value = Long.parseLong(number); // this throws a NumberFormatException on overflow
+        } catch (NumberFormatException e) {
+            throw new IllegalArgumentException("The value '" + number + "' cannot be represented as a long.");
+        }
+
+        long multiplier;
+        if("mbps".equals(unit)){
+            multiplier = 1 << 20;
+        }else if("kbps".equals(unit)){
+            multiplier = 1 << 10;
+        }else if("bps".equals(unit)){
+            multiplier = 1;
+        }else if(unit.isEmpty()){
+            multiplier = 1;
+        }else{
+            throw new IllegalArgumentException(text);
+        }
+
+        // Convert bits to bytes
+        return value * multiplier / 8;
+    }
+}
diff --git a/groot-connectors/connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/GrootFlinkKafkaProducer.java b/groot-connectors/connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/GrootFlinkKafkaProducer.java
index 3b7e0c5..22af04f 100644
--- a/groot-connectors/connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/GrootFlinkKafkaProducer.java
+++ b/groot-connectors/connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/GrootFlinkKafkaProducer.java
@@ -1,25 +1,8 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */ - package org.apache.flink.streaming.connectors.kafka; import com.geedgenetworks.connectors.kafka.rate.RateLimitingStatus; import com.geedgenetworks.connectors.kafka.rate.RateLimitingStrategy; -import com.geedgenetworks.core.metrics.InternalMetrics; +import com.geedgenetworks.api.metrics.InternalMetrics; import org.apache.commons.lang3.StringUtils; import org.apache.flink.annotation.Internal; import org.apache.flink.annotation.PublicEvolving; diff --git a/groot-connectors/connector-kafka/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-connectors/connector-kafka/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory index 531df31..531df31 100644 --- a/groot-connectors/connector-kafka/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ b/groot-connectors/connector-kafka/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory diff --git a/groot-connectors/connector-mock/pom.xml b/groot-connectors/connector-mock/pom.xml index 4932eec..b13f7a5 100644 --- a/groot-connectors/connector-mock/pom.xml +++ b/groot-connectors/connector-mock/pom.xml @@ -18,6 +18,7 @@ <artifactId>datafaker</artifactId> <version>1.9.0</version> </dependency> + </dependencies> </project>
\ No newline at end of file diff --git a/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockSource.java b/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockSource.java index dc80141..61792a1 100644 --- a/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockSource.java +++ b/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockSource.java @@ -1,95 +1,95 @@ -package com.geedgenetworks.connectors.mock;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.connectors.mock.faker.ObjectFaker;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
-
-import java.util.Map;
-
-public class MockSource extends RichParallelSourceFunction<Event> {
- private final ObjectFaker faker;
- private final int rowsPerSecond;
- private final long numberOfRows;
- private final long millisPerRow;
- private volatile boolean stop;
-
- public MockSource(ObjectFaker faker, int rowsPerSecond, long numberOfRows, long millisPerRow) {
- this.faker = faker;
- this.rowsPerSecond = rowsPerSecond;
- this.numberOfRows = numberOfRows;
- this.millisPerRow = millisPerRow;
- }
-
- @Override
- public void open(Configuration parameters) throws Exception {
- faker.init(getRuntimeContext().getNumberOfParallelSubtasks(), getRuntimeContext().getIndexOfThisSubtask());
- }
-
- @Override
- public void run(SourceContext<Event> ctx) throws Exception {
- final long rowsForSubtask = getRowsForSubTask();
- final int rowsPerSecondForSubtask = getRowsPerSecondForSubTask();
-
- Event event;
- Map<String, Object> value;
- long rows = 0;
- int batchRows = 0;
- long nextReadTime = System.currentTimeMillis();
- long waitMs;
-
- while (!stop && rows < rowsForSubtask) {
- while (!stop && rows < rowsForSubtask){
- event = new Event();
- value = faker.geneValue();
- value.put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis());
- event.setExtractedFields(value);
- ctx.collect(event);
- rows += 1;
-
- if(millisPerRow > 0){
- Thread.sleep(millisPerRow);
- }else{
- batchRows += 1;
- if(batchRows >= rowsPerSecondForSubtask){
- batchRows = 0;
- nextReadTime += 1000;
- waitMs = Math.max(0, nextReadTime - System.currentTimeMillis());
- if(waitMs > 0) {
- Thread.sleep(waitMs);
- }
- }
- }
- }
- }
-
- }
-
- @Override
- public void close() throws Exception {
- faker.destroy();
- }
-
- @Override
- public void cancel() {
- stop = true;
- }
-
- private int getRowsPerSecondForSubTask() {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- int baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks;
- return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask;
- }
-
- private long getRowsForSubTask() {
- if (numberOfRows < 0) {
- return Long.MAX_VALUE;
- } else {
- int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
- int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask();
- final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks;
- return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask;
- }
- }
-}
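`MockSource` above divides both the total row budget (`numberOfRows`) and the per-second rate across parallel subtasks, handing the remainder to the lowest-indexed subtasks so the shares always sum back to the global value. A standalone sketch of that remainder rule, with names invented for the example:

```java
public class SubtaskShareDemo {
    // Same rule as MockSource: subtasks with index < (total % n) get one extra unit.
    static long shareFor(long total, int numSubtasks, int subtaskIndex) {
        long base = total / numSubtasks;
        return (total % numSubtasks > subtaskIndex) ? base + 1 : base;
    }

    public static void main(String[] args) {
        // 10 rows over 3 subtasks -> 4, 3, 3 (sums back to 10)
        for (int i = 0; i < 3; i++) {
            System.out.println("subtask " + i + " -> " + shareFor(10, 3, i));
        }
    }
}
```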
+package com.geedgenetworks.connectors.mock; + +import com.geedgenetworks.connectors.mock.faker.ObjectFaker; +import com.geedgenetworks.api.connector.event.Event; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction; + +import java.util.Map; + +public class MockSource extends RichParallelSourceFunction<Event> { + private final ObjectFaker faker; + private final int rowsPerSecond; + private final long numberOfRows; + private final long millisPerRow; + private volatile boolean stop; + + public MockSource(ObjectFaker faker, int rowsPerSecond, long numberOfRows, long millisPerRow) { + this.faker = faker; + this.rowsPerSecond = rowsPerSecond; + this.numberOfRows = numberOfRows; + this.millisPerRow = millisPerRow; + } + + @Override + public void open(Configuration parameters) throws Exception { + faker.init(getRuntimeContext().getNumberOfParallelSubtasks(), getRuntimeContext().getIndexOfThisSubtask()); + } + + @Override + public void run(SourceContext<Event> ctx) throws Exception { + final long rowsForSubtask = getRowsForSubTask(); + final int rowsPerSecondForSubtask = getRowsPerSecondForSubTask(); + + Event event; + Map<String, Object> value; + long rows = 0; + int batchRows = 0; + long nextReadTime = System.currentTimeMillis(); + long waitMs; + + while (!stop && rows < rowsForSubtask) { + while (!stop && rows < rowsForSubtask){ + event = new Event(); + value = faker.geneValue(); + value.put(Event.INTERNAL_TIMESTAMP_KEY, System.currentTimeMillis()); + event.setExtractedFields(value); + ctx.collect(event); + rows += 1; + + if(millisPerRow > 0){ + Thread.sleep(millisPerRow); + }else{ + batchRows += 1; + if(batchRows >= rowsPerSecondForSubtask){ + batchRows = 0; + nextReadTime += 1000; + waitMs = Math.max(0, nextReadTime - System.currentTimeMillis()); + if(waitMs > 0) { + Thread.sleep(waitMs); + } + } + } + } + } + + } + + @Override + public void close() throws Exception { + faker.destroy(); + } + + @Override + public void cancel() { + stop = true; + } + + private int getRowsPerSecondForSubTask() { + int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks(); + int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask(); + int baseRowsPerSecondPerSubtask = rowsPerSecond / numSubtasks; + return (rowsPerSecond % numSubtasks > indexOfThisSubtask) ? baseRowsPerSecondPerSubtask + 1 : baseRowsPerSecondPerSubtask; + } + + private long getRowsForSubTask() { + if (numberOfRows < 0) { + return Long.MAX_VALUE; + } else { + int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks(); + int indexOfThisSubtask = getRuntimeContext().getIndexOfThisSubtask(); + final long baseNumOfRowsPerSubtask = numberOfRows / numSubtasks; + return (numberOfRows % numSubtasks > indexOfThisSubtask) ? baseNumOfRowsPerSubtask + 1 : baseNumOfRowsPerSubtask; + } + } +} diff --git a/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockTableFactory.java b/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockTableFactory.java index f768f7f..43e3364 100644 --- a/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockTableFactory.java +++ b/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/MockTableFactory.java @@ -1,81 +1,81 @@ -package com.geedgenetworks.connectors.mock;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.connectors.mock.faker.FakerUtils;
-import com.geedgenetworks.connectors.mock.faker.ObjectFaker;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SourceTableFactory;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.commons.io.FileUtils;
-import org.apache.flink.configuration.ConfigOption;
-import org.apache.flink.configuration.ReadableConfig;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
-
-import java.io.File;
-import java.nio.charset.StandardCharsets;
-import java.util.HashSet;
-import java.util.Set;
-
-import static com.geedgenetworks.connectors.mock.MockConnectorOptions.*;
-
-public class MockTableFactory implements SourceTableFactory {
- public static final String IDENTIFIER = "mock";
-
- @Override
- public String factoryIdentifier() {
- return IDENTIFIER;
- }
-
- @Override
- public SourceProvider getSourceProvider(Context context) {
- final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
- helper.validate();
-
- final StructType physicalDataType = context.getPhysicalDataType();
- ReadableConfig config = context.getConfiguration();
-
- final String mockDescFilePath = config.get(MOCK_DESC_FILE_PATH).trim();
- final int rowsPerSecond = config.get(ROWS_PER_SECOND);
- final long numberOfRows = config.get(NUMBER_OF_ROWS);
- final long millisPerRow = config.get(MILLIS_PER_ROW);
-
- return new SourceProvider() {
- @Override
- public SingleOutputStreamOperator<Event> produceDataStream(StreamExecutionEnvironment env) {
- return env.addSource(new MockSource(parseFaker(mockDescFilePath), rowsPerSecond, numberOfRows, millisPerRow));
- }
-
- @Override
- public StructType getPhysicalDataType() {
- return physicalDataType;
- }
- };
- }
-
- @Override
- public Set<ConfigOption<?>> requiredOptions() {
- Set<ConfigOption<?>> options = new HashSet<>();
- options.add(MOCK_DESC_FILE_PATH);
- return options;
- }
-
- @Override
- public Set<ConfigOption<?>> optionalOptions() {
- Set<ConfigOption<?>> options = new HashSet<>();
- options.add(ROWS_PER_SECOND);
- options.add(NUMBER_OF_ROWS);
- options.add(MILLIS_PER_ROW);
- return options;
- }
-
- private ObjectFaker parseFaker(String mockDescFilePath){
- try {
- String json = FileUtils.readFileToString(new File(mockDescFilePath), StandardCharsets.UTF_8);
- return FakerUtils.parseObjectFakerFromJson(json);
- } catch (Exception e) {
- throw new RuntimeException(e);
- }
- }
-}
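`MockTableFactory`, like the Kafka and StarRocks factories in this merge, is registered through a `META-INF/services` descriptor, which this commit renames from `com.geedgenetworks.core.factories.Factory` to `com.geedgenetworks.api.factory.Factory`; the `factoryIdentifier()` method likewise becomes `type()`. A hedged sketch of how such identifier-keyed SPI lookup typically works with `java.util.ServiceLoader`; the `Factory` interface below is a minimal stand-in inferred from the diff, not the project's actual definition:

```java
import java.util.ServiceLoader;

// Assumed minimal shape of the SPI interface, inferred from the diff.
interface Factory {
    String type();
}

class FactoryLookupDemo {
    // Scan implementations listed in META-INF/services for a matching type().
    static Factory discover(String type) {
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            if (factory.type().equals(type)) {
                return factory;
            }
        }
        throw new IllegalArgumentException("No factory registered for type: " + type);
    }
}
```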
+package com.geedgenetworks.connectors.mock; + +import com.geedgenetworks.connectors.mock.faker.FakerUtils; +import com.geedgenetworks.connectors.mock.faker.ObjectFaker; +import com.geedgenetworks.api.connector.source.SourceProvider; +import com.geedgenetworks.api.connector.source.SourceTableFactory; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.connector.type.StructType; +import org.apache.commons.io.FileUtils; +import org.apache.flink.configuration.ConfigOption; +import org.apache.flink.configuration.ReadableConfig; +import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; + +import java.io.File; +import java.nio.charset.StandardCharsets; +import java.util.HashSet; +import java.util.Set; + +import static com.geedgenetworks.connectors.mock.MockConnectorOptions.*; + +public class MockTableFactory implements SourceTableFactory { + public static final String IDENTIFIER = "mock"; + + @Override + public String type() { + return IDENTIFIER; + } + + @Override + public SourceProvider getSourceProvider(Context context) { + final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context); + helper.validate(); + + final StructType physicalDataType = context.getPhysicalDataType(); + ReadableConfig config = context.getConfiguration(); + + final String mockDescFilePath = config.get(MOCK_DESC_FILE_PATH).trim(); + final int rowsPerSecond = config.get(ROWS_PER_SECOND); + final long numberOfRows = config.get(NUMBER_OF_ROWS); + final long millisPerRow = config.get(MILLIS_PER_ROW); + + return new SourceProvider() { + @Override + public SingleOutputStreamOperator<Event> produceDataStream(StreamExecutionEnvironment env) { + return env.addSource(new MockSource(parseFaker(mockDescFilePath), rowsPerSecond, numberOfRows, millisPerRow)); + } + + @Override + public StructType getPhysicalDataType() { + return physicalDataType; + } + }; + } + + @Override + public Set<ConfigOption<?>> requiredOptions() { + Set<ConfigOption<?>> options = new HashSet<>(); + options.add(MOCK_DESC_FILE_PATH); + return options; + } + + @Override + public Set<ConfigOption<?>> optionalOptions() { + Set<ConfigOption<?>> options = new HashSet<>(); + options.add(ROWS_PER_SECOND); + options.add(NUMBER_OF_ROWS); + options.add(MILLIS_PER_ROW); + return options; + } + + private ObjectFaker parseFaker(String mockDescFilePath){ + try { + String json = FileUtils.readFileToString(new File(mockDescFilePath), StandardCharsets.UTF_8); + return FakerUtils.parseObjectFakerFromJson(json); + } catch (Exception e) { + throw new RuntimeException(e); + } + } +} diff --git a/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/faker/FakerUtils.java b/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/faker/FakerUtils.java index 0a36100..1d3b517 100644 --- a/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/faker/FakerUtils.java +++ b/groot-connectors/connector-mock/src/main/java/com/geedgenetworks/connectors/mock/faker/FakerUtils.java @@ -1,252 +1,252 @@ -package com.geedgenetworks.connectors.mock.faker;
-
-import com.alibaba.fastjson2.JSONObject;
-import com.alibaba.fastjson2.JSONReader;
-import com.geedgenetworks.connectors.mock.faker.ObjectFaker.FieldFaker;
-import com.geedgenetworks.connectors.mock.faker.NumberFaker.*;
-import com.geedgenetworks.connectors.mock.faker.StringFaker.*;
-import com.geedgenetworks.connectors.mock.faker.TimestampFaker.*;
-
-import com.alibaba.fastjson2.JSON;
-import com.alibaba.fastjson2.JSONArray;
-import com.geedgenetworks.core.types.DataType;
-import com.geedgenetworks.core.types.Types;
-import org.apache.flink.util.Preconditions;
-
-import java.math.BigDecimal;
-import java.util.Arrays;
-import java.util.List;
-import java.util.stream.Collectors;
-
-public class FakerUtils {
-
- public static ObjectFaker parseObjectFakerFromJson(String json) {
- JSONArray jsonArray = JSON.parseArray(json, JSONReader.Feature.UseBigDecimalForDoubles);
- return parseObjectFaker(jsonArray);
- }
-
- private static Faker<?> parseFaker(JSONObject obj) {
- String type = obj.getString("type");
- Preconditions.checkNotNull(type, "type is required");
- type = type.trim();
-
- if ("Number".equalsIgnoreCase(type)) {
- return wrapFaker(parseNumberFaker(obj), obj);
- } else if ("Sequence".equalsIgnoreCase(type)) {
- return wrapFaker(parseSequenceFaker(obj), obj);
- } else if ("UniqueSequence".equalsIgnoreCase(type)) {
- return wrapFaker(parseUniqueSequenceFaker(obj), obj);
- } else if ("String".equalsIgnoreCase(type)) {
- return wrapFaker(parseStringFaker(obj), obj);
- } else if ("Timestamp".equalsIgnoreCase(type)) {
- return wrapFaker(parseTimestampFaker(obj), obj);
- } else if ("FormatTimestamp".equalsIgnoreCase(type)) {
- return wrapFaker(parseFormatTimestampFaker(obj), obj);
- } else if ("IPv4".equalsIgnoreCase(type)) {
- return wrapFaker(parseIPv4Faker(obj), obj);
- } else if ("Expression".equalsIgnoreCase(type)) {
- return wrapFaker(parseExpressionFaker(obj), obj);
- } else if ("Hlld".equalsIgnoreCase(type)) {
- return wrapFaker(parseHlldFaker(obj), obj);
- } else if ("HdrHistogram".equalsIgnoreCase(type)) {
- return wrapFaker(parseHdrHistogramFaker(obj), obj);
- } else if ("Object".equalsIgnoreCase(type)) {
- return wrapFaker(parseObjectFaker(obj.getJSONArray("fields")), obj);
- } else if ("Union".equalsIgnoreCase(type)) {
- return wrapFaker(parseUnionFaker(obj), obj);
- } else if ("Eval".equalsIgnoreCase(type)) {
- return parseEvalFaker(obj);
- }
-
- throw new UnsupportedOperationException("not support type:" + type);
- }
-
- private static Faker<?> wrapFaker(Faker<?> faker, JSONObject obj) {
- if(obj.getBooleanValue("array", false)){
- faker = new ArrayFaker((Faker<Object>) faker, obj.getIntValue("arrayLenMin", 0), obj.getIntValue("arrayLenMax", 5));
- }
- return NullAbleFaker.wrap(faker, obj.getDoubleValue("nullRate"));
- }
-
- private static ObjectFaker parseObjectFaker(JSONArray fieldJsonArray) {
- return new ObjectFaker(parseObjectFieldFakers(fieldJsonArray));
- }
-
- private static FieldFaker[] parseObjectFieldFakers(JSONArray fieldJsonArray) {
- FieldFaker[] fields = new FieldFaker[fieldJsonArray.size()];
-
- for (int i = 0; i < fieldJsonArray.size(); i++) {
- JSONObject jsonObject = fieldJsonArray.getJSONObject(i);
- String name = jsonObject.getString("name");
- fields[i] = new FieldFaker(name, parseFaker(jsonObject));
- }
-
- return fields;
- }
-
-
- private static UnionFaker parseUnionFaker(JSONObject obj) {
- JSONArray fieldsJsonArray = obj.getJSONArray("unionFields");
- boolean random = obj.getBooleanValue("random", true);
- UnionFaker.FieldsFaker[] fieldsFakers = new UnionFaker.FieldsFaker[fieldsJsonArray.size()];
-
- for (int i = 0; i < fieldsJsonArray.size(); i++) {
- JSONObject jsonObject = fieldsJsonArray.getJSONObject(i);
- int weight = jsonObject.getIntValue("weight", 1);
- Preconditions.checkArgument(weight >= 0 && weight < 10000000);
- FieldFaker[] fields = parseObjectFieldFakers(jsonObject.getJSONArray("fields"));
- fieldsFakers[i] = new UnionFaker.FieldsFaker(fields, weight);
- }
-
- return new UnionFaker(fieldsFakers, random);
- }
-
- private static Faker<?> parseEvalFaker(JSONObject obj) {
- String expression = obj.getString("expression");
- Preconditions.checkNotNull(expression);
- return new EvalFaker(expression);
- }
-
- private static Faker<?> parseExpressionFaker(JSONObject obj) {
- String expression = obj.getString("expression");
- Preconditions.checkNotNull(expression);
- return new ExpressionFaker(expression);
- }
-
- private static Faker<?> parseHlldFaker(JSONObject obj) {
- long itemCount = obj.getLongValue("itemCount", 1000000L);
- int batchCount = obj.getIntValue("batchCount", 10000);
- int precision = obj.getIntValue("precision", 12);
- return new HlldFaker(itemCount, batchCount, precision);
- }
-
- private static Faker<?> parseHdrHistogramFaker(JSONObject obj) {
- int max = obj.getIntValue("max", 100000);
- int batchCount = obj.getIntValue("batchCount", 1000);
- int numberOfSignificantValueDigits = obj.getIntValue("numberOfSignificantValueDigits", 1);
- return new HdrHistogramFaker(max, batchCount, numberOfSignificantValueDigits);
- }
-
- private static Faker<?> parseIPv4Faker(JSONObject obj) {
- String start = obj.getString("start");
- String end = obj.getString("end");
- if(start == null){
- start = "0.0.0.0";
- }
- if(end == null){
- start = "255.255.255.255";
- }
- return new IPv4Faker(IPv4Faker.ipv4ToLong(start), IPv4Faker.ipv4ToLong(end) + 1);
- }
-
- private static Faker<?> parseFormatTimestampFaker(JSONObject obj) {
- String format = obj.getString("format");
- boolean utc = obj.getBooleanValue("utc", false);
- if(format == null){
- format = FormatTimestamp.NORM_DATETIME_PATTERN;
- }
- return new FormatTimestamp(format, utc);
- }
-
- private static Faker<?> parseTimestampFaker(JSONObject obj) {
- String unit = obj.getString("unit");
- if("millis".equals(unit)){
- return new Timestamp();
- }else{
- return new UnixTimestamp();
- }
- }
-
- private static Faker<?> parseUniqueSequenceFaker(JSONObject obj) {
- long start = obj.getLongValue("start", 0L);
- return new UniqueSequenceFaker(start);
- }
-
- private static Faker<?> parseSequenceFaker(JSONObject obj) {
- long start = obj.getLongValue("start", 0L);
- long step = obj.getLongValue("step", 1L);
- int batch = obj.getIntValue("batch", 1);
- return new SequenceFaker(start, step, batch);
- }
-
- private static Faker<?> parseStringFaker(JSONObject obj) {
- String regex = obj.getString("regex");
- JSONArray options = obj.getJSONArray("options");
- boolean random = obj.getBooleanValue("random", true);
-
- if (options != null && options.size() > 0) {
- return new OptionString(options.stream().map(x -> x == null ? null : x.toString()).toArray(String[]::new), random);
- }else{
- if(regex == null){
- regex = "[a-zA-Z]{0,5}";
- }
- return new RegexString(regex);
- }
- }
-
- private static Faker<?> parseNumberFaker(JSONObject obj) {
- Number start = (Number) obj.get("min");
- Number end = (Number) obj.get("max");
- JSONArray options = obj.getJSONArray("options");
- boolean random = obj.getBooleanValue("random", true);
-
- DataType dataType;
- if (options != null && options.size() > 0) {
- dataType = getNumberDataType(options.stream().map(x -> (Number) x).collect(Collectors.toList()));
- if (dataType.equals(Types.INT)) {
- return new OptionIntNumber(options.stream().map(x -> x == null ? null : ((Number) x).intValue()).toArray(Integer[]::new), random);
- } else if (dataType.equals(Types.BIGINT)) {
- return new OptionLongNumber(options.stream().map(x -> x == null ? null : ((Number) x).longValue()).toArray(Long[]::new), random);
- } else {
- return new OptionDoubleNumber(options.stream().map(x -> x == null ? null : ((Number) x).doubleValue()).toArray(Double[]::new), random);
- }
- } else {
- if(start == null){
- start = 0;
- }
- if(end == null){
- end = Integer.MAX_VALUE;
- }
- Preconditions.checkArgument(end.doubleValue() > start.doubleValue());
- dataType = getNumberDataType(Arrays.asList(start, end));
- if (dataType.equals(Types.INT)) {
- return new RangeIntNumber(start.intValue(), end.intValue(), random);
- } else if (dataType.equals(Types.BIGINT)) {
- return new RangeLongNumber(start.longValue(), end.longValue(), random);
- } else {
- return new RangeDoubleNumber(start.doubleValue(), end.doubleValue());
- }
- }
- }
-
- private static DataType getNumberDataType(List<Number> list) {
- DataType dataType = Types.INT;
-
- for (Number number : list) {
- if (number == null) {
- continue;
- }
-
- if (number instanceof Short || number instanceof Integer) {
- continue;
- }
-
- if (number instanceof Long) {
- if (!dataType.equals(Types.DOUBLE)) {
- dataType = Types.BIGINT;
- }
- continue;
- }
-
- if (number instanceof Float || number instanceof Double || number instanceof BigDecimal) {
- dataType = Types.DOUBLE;
- continue;
- }
-
- throw new IllegalArgumentException(number.toString());
- }
-
- return dataType;
- }
-
-}
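For orientation, a small usage sketch of the parser above: a mock descriptor is a JSON array of field definitions, each carrying a `name`, a `type`, and type-specific options handled by the matching `parse*Faker` method. The field names and value ranges below are invented for the example, and the `init(parallelism, subtaskIndex)` call mimics what `MockSource#open` does:

```java
import com.geedgenetworks.connectors.mock.faker.FakerUtils;
import com.geedgenetworks.connectors.mock.faker.ObjectFaker;

import java.util.Map;

public class MockDescriptorDemo {
    public static void main(String[] args) {
        // Three fields: a bounded random int, a regex-generated string, an IPv4 range.
        String json = "["
                + "{\"name\":\"bytes\",\"type\":\"Number\",\"min\":0,\"max\":1500},"
                + "{\"name\":\"host\",\"type\":\"String\",\"regex\":\"[a-z]{3,8}\"},"
                + "{\"name\":\"client_ip\",\"type\":\"IPv4\",\"start\":\"10.0.0.0\",\"end\":\"10.0.255.255\"}"
                + "]";
        ObjectFaker faker = FakerUtils.parseObjectFakerFromJson(json);
        faker.init(1, 0); // single subtask, index 0 (as MockSource does in open())
        Map<String, Object> row = faker.geneValue();
        System.out.println(row); // e.g. {bytes=421, host=qwe, client_ip=...}
    }
}
```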
+package com.geedgenetworks.connectors.mock.faker; + +import com.alibaba.fastjson2.JSONObject; +import com.alibaba.fastjson2.JSONReader; +import com.geedgenetworks.connectors.mock.faker.ObjectFaker.FieldFaker; +import com.geedgenetworks.connectors.mock.faker.NumberFaker.*; +import com.geedgenetworks.connectors.mock.faker.StringFaker.*; +import com.geedgenetworks.connectors.mock.faker.TimestampFaker.*; + +import com.alibaba.fastjson2.JSON; +import com.alibaba.fastjson2.JSONArray; +import com.geedgenetworks.api.connector.type.DataType; +import com.geedgenetworks.api.connector.type.Types; +import org.apache.flink.util.Preconditions; + +import java.math.BigDecimal; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +public class FakerUtils { + + public static ObjectFaker parseObjectFakerFromJson(String json) { + JSONArray jsonArray = JSON.parseArray(json, JSONReader.Feature.UseBigDecimalForDoubles); + return parseObjectFaker(jsonArray); + } + + private static Faker<?> parseFaker(JSONObject obj) { + String type = obj.getString("type"); + Preconditions.checkNotNull(type, "type is required"); + type = type.trim(); + + if ("Number".equalsIgnoreCase(type)) { + return wrapFaker(parseNumberFaker(obj), obj); + } else if ("Sequence".equalsIgnoreCase(type)) { + return wrapFaker(parseSequenceFaker(obj), obj); + } else if ("UniqueSequence".equalsIgnoreCase(type)) { + return wrapFaker(parseUniqueSequenceFaker(obj), obj); + } else if ("String".equalsIgnoreCase(type)) { + return wrapFaker(parseStringFaker(obj), obj); + } else if ("Timestamp".equalsIgnoreCase(type)) { + return wrapFaker(parseTimestampFaker(obj), obj); + } else if ("FormatTimestamp".equalsIgnoreCase(type)) { + return wrapFaker(parseFormatTimestampFaker(obj), obj); + } else if ("IPv4".equalsIgnoreCase(type)) { + return wrapFaker(parseIPv4Faker(obj), obj); + } else if ("Expression".equalsIgnoreCase(type)) { + return wrapFaker(parseExpressionFaker(obj), obj); + } else if ("Hlld".equalsIgnoreCase(type)) { + return wrapFaker(parseHlldFaker(obj), obj); + } else if ("HdrHistogram".equalsIgnoreCase(type)) { + return wrapFaker(parseHdrHistogramFaker(obj), obj); + } else if ("Object".equalsIgnoreCase(type)) { + return wrapFaker(parseObjectFaker(obj.getJSONArray("fields")), obj); + } else if ("Union".equalsIgnoreCase(type)) { + return wrapFaker(parseUnionFaker(obj), obj); + } else if ("Eval".equalsIgnoreCase(type)) { + return parseEvalFaker(obj); + } + + throw new UnsupportedOperationException("not support type:" + type); + } + + private static Faker<?> wrapFaker(Faker<?> faker, JSONObject obj) { + if(obj.getBooleanValue("array", false)){ + faker = new ArrayFaker((Faker<Object>) faker, obj.getIntValue("arrayLenMin", 0), obj.getIntValue("arrayLenMax", 5)); + } + return NullAbleFaker.wrap(faker, obj.getDoubleValue("nullRate")); + } + + private static ObjectFaker parseObjectFaker(JSONArray fieldJsonArray) { + return new ObjectFaker(parseObjectFieldFakers(fieldJsonArray)); + } + + private static FieldFaker[] parseObjectFieldFakers(JSONArray fieldJsonArray) { + FieldFaker[] fields = new FieldFaker[fieldJsonArray.size()]; + + for (int i = 0; i < fieldJsonArray.size(); i++) { + JSONObject jsonObject = fieldJsonArray.getJSONObject(i); + String name = jsonObject.getString("name"); + fields[i] = new FieldFaker(name, parseFaker(jsonObject)); + } + + return fields; + } + + + private static UnionFaker parseUnionFaker(JSONObject obj) { + JSONArray fieldsJsonArray = obj.getJSONArray("unionFields"); + boolean random = 
obj.getBooleanValue("random", true); + UnionFaker.FieldsFaker[] fieldsFakers = new UnionFaker.FieldsFaker[fieldsJsonArray.size()]; + + for (int i = 0; i < fieldsJsonArray.size(); i++) { + JSONObject jsonObject = fieldsJsonArray.getJSONObject(i); + int weight = jsonObject.getIntValue("weight", 1); + Preconditions.checkArgument(weight >= 0 && weight < 10000000); + FieldFaker[] fields = parseObjectFieldFakers(jsonObject.getJSONArray("fields")); + fieldsFakers[i] = new UnionFaker.FieldsFaker(fields, weight); + } + + return new UnionFaker(fieldsFakers, random); + } + + private static Faker<?> parseEvalFaker(JSONObject obj) { + String expression = obj.getString("expression"); + Preconditions.checkNotNull(expression); + return new EvalFaker(expression); + } + + private static Faker<?> parseExpressionFaker(JSONObject obj) { + String expression = obj.getString("expression"); + Preconditions.checkNotNull(expression); + return new ExpressionFaker(expression); + } + + private static Faker<?> parseHlldFaker(JSONObject obj) { + long itemCount = obj.getLongValue("itemCount", 1000000L); + int batchCount = obj.getIntValue("batchCount", 10000); + int precision = obj.getIntValue("precision", 12); + return new HlldFaker(itemCount, batchCount, precision); + } + + private static Faker<?> parseHdrHistogramFaker(JSONObject obj) { + int max = obj.getIntValue("max", 100000); + int batchCount = obj.getIntValue("batchCount", 1000); + int numberOfSignificantValueDigits = obj.getIntValue("numberOfSignificantValueDigits", 1); + return new HdrHistogramFaker(max, batchCount, numberOfSignificantValueDigits); + } + + private static Faker<?> parseIPv4Faker(JSONObject obj) { + String start = obj.getString("start"); + String end = obj.getString("end"); + if(start == null){ + start = "0.0.0.0"; + } + if(end == null){ + end = "255.255.255.255"; + } + return new IPv4Faker(IPv4Faker.ipv4ToLong(start), IPv4Faker.ipv4ToLong(end) + 1); + } + + private static Faker<?> parseFormatTimestampFaker(JSONObject obj) { + String format = obj.getString("format"); + boolean utc = obj.getBooleanValue("utc", false); + if(format == null){ + format = FormatTimestamp.NORM_DATETIME_PATTERN; + } + return new FormatTimestamp(format, utc); + } + + private static Faker<?> parseTimestampFaker(JSONObject obj) { + String unit = obj.getString("unit"); + if("millis".equals(unit)){ + return new Timestamp(); + }else{ + return new UnixTimestamp(); + } + } + + private static Faker<?> parseUniqueSequenceFaker(JSONObject obj) { + long start = obj.getLongValue("start", 0L); + return new UniqueSequenceFaker(start); + } + + private static Faker<?> parseSequenceFaker(JSONObject obj) { + long start = obj.getLongValue("start", 0L); + long step = obj.getLongValue("step", 1L); + int batch = obj.getIntValue("batch", 1); + return new SequenceFaker(start, step, batch); + } + + private static Faker<?> parseStringFaker(JSONObject obj) { + String regex = obj.getString("regex"); + JSONArray options = obj.getJSONArray("options"); + boolean random = obj.getBooleanValue("random", true); + + if (options != null && options.size() > 0) { + return new OptionString(options.stream().map(x -> x == null ?
null : x.toString()).toArray(String[]::new), random); + }else{ + if(regex == null){ + regex = "[a-zA-Z]{0,5}"; + } + return new RegexString(regex); + } + } + + private static Faker<?> parseNumberFaker(JSONObject obj) { + Number start = (Number) obj.get("min"); + Number end = (Number) obj.get("max"); + JSONArray options = obj.getJSONArray("options"); + boolean random = obj.getBooleanValue("random", true); + + DataType dataType; + if (options != null && options.size() > 0) { + dataType = getNumberDataType(options.stream().map(x -> (Number) x).collect(Collectors.toList())); + if (dataType.equals(Types.INT)) { + return new OptionIntNumber(options.stream().map(x -> x == null ? null : ((Number) x).intValue()).toArray(Integer[]::new), random); + } else if (dataType.equals(Types.BIGINT)) { + return new OptionLongNumber(options.stream().map(x -> x == null ? null : ((Number) x).longValue()).toArray(Long[]::new), random); + } else { + return new OptionDoubleNumber(options.stream().map(x -> x == null ? null : ((Number) x).doubleValue()).toArray(Double[]::new), random); + } + } else { + if(start == null){ + start = 0; + } + if(end == null){ + end = Integer.MAX_VALUE; + } + Preconditions.checkArgument(end.doubleValue() > start.doubleValue()); + dataType = getNumberDataType(Arrays.asList(start, end)); + if (dataType.equals(Types.INT)) { + return new RangeIntNumber(start.intValue(), end.intValue(), random); + } else if (dataType.equals(Types.BIGINT)) { + return new RangeLongNumber(start.longValue(), end.longValue(), random); + } else { + return new RangeDoubleNumber(start.doubleValue(), end.doubleValue()); + } + } + } + + private static DataType getNumberDataType(List<Number> list) { + DataType dataType = Types.INT; + + for (Number number : list) { + if (number == null) { + continue; + } + + if (number instanceof Short || number instanceof Integer) { + continue; + } + + if (number instanceof Long) { + if (!dataType.equals(Types.DOUBLE)) { + dataType = Types.BIGINT; + } + continue; + } + + if (number instanceof Float || number instanceof Double || number instanceof BigDecimal) { + dataType = Types.DOUBLE; + continue; + } + + throw new IllegalArgumentException(number.toString()); + } + + return dataType; + } + +} diff --git a/groot-connectors/connector-mock/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-connectors/connector-mock/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory index eea834f..eea834f 100644 --- a/groot-connectors/connector-mock/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ b/groot-connectors/connector-mock/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory diff --git a/groot-connectors/connector-starrocks/src/main/java/com/geedgenetworks/connectors/starrocks/StarRocksTableFactory.java b/groot-connectors/connector-starrocks/src/main/java/com/geedgenetworks/connectors/starrocks/StarRocksTableFactory.java index fc41481..3bca2fa 100644 --- a/groot-connectors/connector-starrocks/src/main/java/com/geedgenetworks/connectors/starrocks/StarRocksTableFactory.java +++ b/groot-connectors/connector-starrocks/src/main/java/com/geedgenetworks/connectors/starrocks/StarRocksTableFactory.java @@ -1,85 +1,86 @@ -package com.geedgenetworks.connectors.starrocks;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.sink.SinkProvider;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SinkTableFactory;
-import com.starrocks.connector.flink.table.sink.EventStarRocksDynamicSinkFunctionV2;
-import com.starrocks.connector.flink.table.sink.SinkFunctionFactory;
-import com.starrocks.connector.flink.table.sink.StarRocksSinkOptions;
-import org.apache.flink.configuration.ConfigOption;
-import org.apache.flink.configuration.ConfigOptions;
-import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.DataStreamSink;
-import org.apache.flink.streaming.api.functions.sink.SinkFunction;
-import org.apache.flink.util.Preconditions;
-
-import java.util.HashSet;
-import java.util.Set;
-
-public class StarRocksTableFactory implements SinkTableFactory {
- public static final String IDENTIFIER = "starrocks";
- @Override
- public String factoryIdentifier() {
- return IDENTIFIER;
- }
-
- @Override
- public SinkProvider getSinkProvider(Context context) {
- final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
- helper.validateExcept(CONNECTION_INFO_PREFIX);
-
- final boolean logFailuresOnly = context.getConfiguration().get(LOG_FAILURES_ONLY);
- StarRocksSinkOptions.Builder builder = StarRocksSinkOptions.builder();
- context.getOptions().forEach((key, value) -> {
- if(key.startsWith(CONNECTION_INFO_PREFIX)){
- builder.withProperty(key.substring(CONNECTION_INFO_PREFIX.length()), value);
- }
- });
- builder.withProperty("sink.properties.format", "json");
- final StarRocksSinkOptions options = builder.build();
- SinkFunctionFactory.detectStarRocksFeature(options);
- Preconditions.checkArgument(options.isSupportTransactionStreamLoad());
- final SinkFunction<Event> sinkFunction = new EventStarRocksDynamicSinkFunctionV2(options, logFailuresOnly);
- return new SinkProvider() {
- @Override
- public DataStreamSink<?> consumeDataStream(DataStream<Event> dataStream) {
- /*DataStream<String> ds = dataStream.flatMap(new FlatMapFunction<Event, String>() {
- @Override
- public void flatMap(Event value, Collector<String> out) throws Exception {
- try {
- out.collect(JSON.toJSONString(value.getExtractedFields()));
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
- });
- SinkFunction<String> sink = StarRocksSink.sink(options);
- return ds.addSink(sink);
- */
- return dataStream.addSink(sinkFunction);
- }
- };
- }
-
- @Override
- public Set<ConfigOption<?>> requiredOptions() {
- return new HashSet<>();
- }
-
- @Override
- public Set<ConfigOption<?>> optionalOptions() {
- final Set<ConfigOption<?>> options = new HashSet<>();
- options.add(LOG_FAILURES_ONLY);
- return options;
- }
-
- public static final String CONNECTION_INFO_PREFIX = "connection.";
-
- public static final ConfigOption<Boolean> LOG_FAILURES_ONLY =
- ConfigOptions.key("log.failures.only")
- .booleanType()
- .defaultValue(true)
- .withDescription("Optional flag to whether the sink should fail on errors, or only log them;\n"
- + "If this is set to true, then exceptions will be only logged, if set to false, exceptions will be eventually thrown, true by default.");
-}
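The sink factory above forwards every option under the `connection.` prefix to the StarRocks builder with the prefix stripped, and forces the stream-load format to JSON. A self-contained sketch of the prefix-stripping step using plain maps; the option keys shown are examples only:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PrefixStripDemo {
    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connection.jdbc-url", "jdbc:mysql://fe-host:9030"); // example key
        options.put("connection.load-url", "fe-host:8030");              // example key
        options.put("log.failures.only", "true");                        // not forwarded

        Map<String, String> sinkProps = new LinkedHashMap<>();
        String prefix = "connection.";
        options.forEach((key, value) -> {
            if (key.startsWith(prefix)) {
                sinkProps.put(key.substring(prefix.length()), value);
            }
        });
        System.out.println(sinkProps); // {jdbc-url=..., load-url=...}
    }
}
```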
+package com.geedgenetworks.connectors.starrocks; + + +import com.geedgenetworks.api.connector.sink.SinkProvider; +import com.geedgenetworks.api.connector.sink.SinkTableFactory; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.starrocks.connector.flink.table.sink.EventStarRocksDynamicSinkFunctionV2; +import com.starrocks.connector.flink.table.sink.SinkFunctionFactory; +import com.starrocks.connector.flink.table.sink.StarRocksSinkOptions; +import org.apache.flink.configuration.ConfigOption; +import org.apache.flink.configuration.ConfigOptions; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.datastream.DataStreamSink; +import org.apache.flink.streaming.api.functions.sink.SinkFunction; +import org.apache.flink.util.Preconditions; + +import java.util.HashSet; +import java.util.Set; + +public class StarRocksTableFactory implements SinkTableFactory { + public static final String IDENTIFIER = "starrocks"; + @Override + public String type() { + return IDENTIFIER; + } + + @Override + public SinkProvider getSinkProvider(Context context) { + final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context); + helper.validateExcept(CONNECTION_INFO_PREFIX); + + final boolean logFailuresOnly = context.getConfiguration().get(LOG_FAILURES_ONLY); + StarRocksSinkOptions.Builder builder = StarRocksSinkOptions.builder(); + context.getOptions().forEach((key, value) -> { + if(key.startsWith(CONNECTION_INFO_PREFIX)){ + builder.withProperty(key.substring(CONNECTION_INFO_PREFIX.length()), value); + } + }); + builder.withProperty("sink.properties.format", "json"); + final StarRocksSinkOptions options = builder.build(); + SinkFunctionFactory.detectStarRocksFeature(options); + Preconditions.checkArgument(options.isSupportTransactionStreamLoad()); + final SinkFunction<Event> sinkFunction = new EventStarRocksDynamicSinkFunctionV2(options, logFailuresOnly); + return new SinkProvider() { + @Override + public DataStreamSink<?> consumeDataStream(DataStream<Event> dataStream) { + /*DataStream<String> ds = dataStream.flatMap(new FlatMapFunction<Event, String>() { + @Override + public void flatMap(Event value, Collector<String> out) throws Exception { + try { + out.collect(JSON.toJSONString(value.getExtractedFields())); + } catch (Exception e) { + e.printStackTrace(); + } + } + }); + SinkFunction<String> sink = StarRocksSink.sink(options); + return ds.addSink(sink); + */ + return dataStream.addSink(sinkFunction); + } + }; + } + + @Override + public Set<ConfigOption<?>> requiredOptions() { + return new HashSet<>(); + } + + @Override + public Set<ConfigOption<?>> optionalOptions() { + final Set<ConfigOption<?>> options = new HashSet<>(); + options.add(LOG_FAILURES_ONLY); + return options; + } + + public static final String CONNECTION_INFO_PREFIX = "connection."; + + public static final ConfigOption<Boolean> LOG_FAILURES_ONLY = + ConfigOptions.key("log.failures.only") + .booleanType() + .defaultValue(true) + .withDescription("Optional flag to whether the sink should fail on errors, or only log them;\n" + + "If this is set to true, then exceptions will be only logged, if set to false, exceptions will be eventually thrown, true by default."); +} diff --git a/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStarRocksDynamicSinkFunctionV2.java 
b/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStarRocksDynamicSinkFunctionV2.java index 71a9467..7bf57ab 100644 --- a/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStarRocksDynamicSinkFunctionV2.java +++ b/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStarRocksDynamicSinkFunctionV2.java @@ -1,8 +1,8 @@ package com.starrocks.connector.flink.table.sink; import com.alibaba.fastjson2.JSON; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.metrics.InternalMetrics; +import com.geedgenetworks.api.metrics.InternalMetrics; +import com.geedgenetworks.api.connector.event.Event; import com.starrocks.connector.flink.manager.StarRocksSinkBufferEntity; import com.starrocks.connector.flink.manager.StarRocksStreamLoadListener; import com.starrocks.connector.flink.tools.EnvUtils; diff --git a/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStreamLoadListener.java b/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStreamLoadListener.java index 337109b..d7b7ef2 100644 --- a/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStreamLoadListener.java +++ b/groot-connectors/connector-starrocks/src/main/java/com/starrocks/connector/flink/table/sink/EventStreamLoadListener.java @@ -1,28 +1,28 @@ -package com.starrocks.connector.flink.table.sink;
-
-import com.geedgenetworks.core.metrics.InternalMetrics;
-import com.starrocks.connector.flink.manager.StarRocksStreamLoadListener;
-import com.starrocks.data.load.stream.StreamLoadResponse;
-import org.apache.flink.api.common.functions.RuntimeContext;
-
-public class EventStreamLoadListener extends StarRocksStreamLoadListener {
- private transient InternalMetrics internalMetrics;
- public EventStreamLoadListener(RuntimeContext context, StarRocksSinkOptions sinkOptions, InternalMetrics internalMetrics) {
- super(context, sinkOptions);
- this.internalMetrics = internalMetrics;
- }
-
- @Override
- public void flushSucceedRecord(StreamLoadResponse response) {
- super.flushSucceedRecord(response);
- if (response.getFlushRows() != null) {
- internalMetrics.incrementOutEvents(response.getFlushRows());
- }
- }
-
- @Override
- public void flushFailedRecord() {
- super.flushFailedRecord();
- internalMetrics.incrementErrorEvents(1);
- }
-}
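`EventStreamLoadListener` hooks the StarRocks stream-load callbacks into internal counters: the flushed row count, when reported, feeds the out-events counter, and each failed flush adds one error event. A minimal sketch of that bookkeeping, with `AtomicLong` stand-ins for `InternalMetrics`:

```java
import java.util.concurrent.atomic.AtomicLong;

public class FlushMetricsDemo {
    private final AtomicLong outEvents = new AtomicLong();   // stand-in for incrementOutEvents
    private final AtomicLong errorEvents = new AtomicLong(); // stand-in for incrementErrorEvents

    void onFlushSucceeded(Long flushRows) {
        if (flushRows != null) {          // the listener above guards against null row counts
            outEvents.addAndGet(flushRows);
        }
    }

    void onFlushFailed() {
        errorEvents.incrementAndGet();    // one error event per failed flush
    }
}
```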
+package com.starrocks.connector.flink.table.sink; + +import com.geedgenetworks.api.metrics.InternalMetrics; +import com.starrocks.connector.flink.manager.StarRocksStreamLoadListener; +import com.starrocks.data.load.stream.StreamLoadResponse; +import org.apache.flink.api.common.functions.RuntimeContext; + +public class EventStreamLoadListener extends StarRocksStreamLoadListener { + private transient InternalMetrics internalMetrics; + public EventStreamLoadListener(RuntimeContext context, StarRocksSinkOptions sinkOptions, InternalMetrics internalMetrics) { + super(context, sinkOptions); + this.internalMetrics = internalMetrics; + } + + @Override + public void flushSucceedRecord(StreamLoadResponse response) { + super.flushSucceedRecord(response); + if (response.getFlushRows() != null) { + internalMetrics.incrementOutEvents(response.getFlushRows()); + } + } + + @Override + public void flushFailedRecord() { + super.flushFailedRecord(); + internalMetrics.incrementErrorEvents(1); + } +} diff --git a/groot-connectors/connector-starrocks/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-connectors/connector-starrocks/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory index c04c5dc..d5d12b5 100644 --- a/groot-connectors/connector-starrocks/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ b/groot-connectors/connector-starrocks/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory @@ -1 +1 @@ -com.geedgenetworks.connectors.starrocks.StarRocksTableFactory
+com.geedgenetworks.connectors.starrocks.StarRocksTableFactory
diff --git a/groot-connectors/pom.xml b/groot-connectors/pom.xml
index cf5381c..7fe10f7 100644
--- a/groot-connectors/pom.xml
+++ b/groot-connectors/pom.xml
@@ -20,23 +20,26 @@
         <module>connector-starrocks</module>
     </modules>
     <dependencies>
+
         <dependency>
             <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-common</artifactId>
+            <artifactId>groot-api</artifactId>
             <version>${revision}</version>
             <scope>provided</scope>
         </dependency>
 
         <dependency>
             <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-core</artifactId>
+            <artifactId>groot-common</artifactId>
             <version>${revision}</version>
             <scope>provided</scope>
         </dependency>
+
         <dependency>
             <groupId>org.apache.flink</groupId>
             <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId>
+            <scope>provided</scope>
         </dependency>
     </dependencies>
diff --git a/groot-core/pom.xml b/groot-core/pom.xml
index 184e148..3a8e712 100644
--- a/groot-core/pom.xml
+++ b/groot-core/pom.xml
@@ -12,41 +12,29 @@
     <name>Groot : Core </name>
 
     <dependencies>
-        <dependency>
-            <groupId>com.typesafe</groupId>
-            <artifactId>config</artifactId>
-        </dependency>
-
-        <dependency>
-            <groupId>com.fasterxml.uuid</groupId>
-            <artifactId>java-uuid-generator</artifactId>
-        </dependency>
 
         <dependency>
-            <groupId>com.uber</groupId>
-            <artifactId>h3</artifactId>
-            <version>4.1.1</version>
+            <groupId>com.geedgenetworks</groupId>
+            <artifactId>groot-api</artifactId>
+            <version>${revision}</version>
+            <scope>provided</scope>
         </dependency>
 
         <dependency>
-            <groupId>org.mock-server</groupId>
-            <artifactId>mockserver-netty</artifactId>
-            <version>5.11.2</version>
-            <scope>test</scope>
+            <groupId>com.geedgenetworks</groupId>
+            <artifactId>groot-common</artifactId>
+            <version>${revision}</version>
+            <scope>provided</scope>
         </dependency>
 
         <dependency>
-            <groupId>org.mockito</groupId>
-            <artifactId>mockito-core</artifactId>
-            <version>4.0.0</version>
-            <scope>test</scope>
+            <groupId>com.fasterxml.uuid</groupId>
+            <artifactId>java-uuid-generator</artifactId>
         </dependency>
 
         <dependency>
-            <groupId>org.mockito</groupId>
-            <artifactId>mockito-inline</artifactId>
-            <version>4.0.0</version>
-            <scope>test</scope>
+            <groupId>com.uber</groupId>
+            <artifactId>h3</artifactId>
         </dependency>
 
         <dependency>
@@ -60,50 +48,44 @@
         </dependency>
 
         <dependency>
-            <groupId>com.geedgenetworks</groupId>
-            <artifactId>http-client-shaded</artifactId>
-            <version>${project.version}</version>
-            <exclusions>
-                <exclusion>
-                    <groupId>commons-codec</groupId>
-                    <artifactId>commons-codec</artifactId>
-                </exclusion>
-            </exclusions>
-            <classifier>optional</classifier>
+            <groupId>org.quartz-scheduler</groupId>
+            <artifactId>quartz</artifactId>
         </dependency>
 
+        <dependency>
+            <groupId>com.googlecode.aviator</groupId>
+            <artifactId>aviator</artifactId>
+        </dependency>
 
         <dependency>
-            <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-common</artifactId>
-            <version>${revision}</version>
-            <scope>provided</scope>
+            <groupId>io.github.jopenlibs</groupId>
+            <artifactId>vault-java-driver</artifactId>
         </dependency>
 
         <dependency>
-            <groupId>com.geedgenetworks</groupId>
-            <artifactId>sketches</artifactId>
+            <groupId>org.bouncycastle</groupId>
+            <artifactId>bcpkix-jdk18on</artifactId>
         </dependency>
 
         <dependency>
-            <groupId>com.alibaba.nacos</groupId>
-            <artifactId>nacos-client</artifactId>
-            <exclusions>
-                <exclusion>
-                    <groupId>commons-codec</groupId>
-                    <artifactId>commons-codec</artifactId>
-                </exclusion>
-            </exclusions>
+            <groupId>org.mock-server</groupId>
+            <artifactId>mockserver-netty</artifactId>
+            <version>5.11.2</version>
+            <scope>test</scope>
         </dependency>
 
         <dependency>
-            <groupId>org.quartz-scheduler</groupId>
-            <artifactId>quartz</artifactId>
+            <groupId>org.mockito</groupId>
+            <artifactId>mockito-core</artifactId>
+            <version>4.0.0</version>
+            <scope>test</scope>
         </dependency>
 
         <dependency>
-            <groupId>com.googlecode.aviator</groupId>
-            <artifactId>aviator</artifactId>
+            <groupId>org.mockito</groupId>
+            <artifactId>mockito-inline</artifactId>
+            <version>4.0.0</version>
+            <scope>test</scope>
         </dependency>
 
         <dependency>
@@ -124,16 +106,7 @@
             <scope>provided</scope>
         </dependency>
 
-        <dependency>
-            <groupId>io.github.jopenlibs</groupId>
-            <artifactId>vault-java-driver</artifactId>
-            <version>6.2.0</version>
-        </dependency>
-        <dependency>
-            <groupId>org.bouncycastle</groupId>
-            <artifactId>bcpkix-jdk18on</artifactId>
-            <version>1.78.1</version>
-        </dependency>
+
    </dependencies>
 
    <build>
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineSourceProvider.java b/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineSourceProvider.java
index f88c321..4ed69b5 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineSourceProvider.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineSourceProvider.java
@@ -1,15 +1,14 @@
 package com.geedgenetworks.core.connector.inline;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
-
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import java.util.List;
 
 public class InlineSourceProvider implements SourceProvider {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineTableFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineTableFactory.java
index 50c6f65..7eab006 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineTableFactory.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/connector/inline/InlineTableFactory.java
@@ -1,118 +1,117 @@
-package com.geedgenetworks.core.connector.inline;
-
-import com.alibaba.fastjson2.JSON;
-import com.alibaba.fastjson2.JSONArray;
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SourceTableFactory;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.configuration.ConfigOption;
-import org.apache.flink.configuration.ConfigOptions;
-import org.apache.flink.configuration.ReadableConfig;
-import org.apache.flink.util.StringUtils;
-
-import java.nio.charset.StandardCharsets;
-import java.time.Duration;
-import java.util.*;
-
-import static org.apache.flink.configuration.ConfigOptions.key;
-
-/**
- * A source for testing; used for simple tests of formats, functions, etc.
- */
-public class InlineTableFactory implements SourceTableFactory {
- public static final String IDENTIFIER = "inline";
- @Override
- public String factoryIdentifier() {
- return IDENTIFIER;
- }
-
- @Override
- public SourceProvider getSourceProvider(Context context) {
- final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
- // Obtain the DecodingFormat
- DecodingFormat decodingFormat = helper.discoverDecodingFormat(DecodingFormatFactory.class, FactoryUtil.FORMAT);
-
- helper.validate(); // validate the options
-
- StructType physicalDataType = context.getPhysicalDataType(); // column types
- ReadableConfig config = context.getConfiguration();
-
- String data = config.get(DATA);
- String type = config.get(TYPE);
- long intervalPerRow = config.get(INTERVAL_PER_ROW).toMillis();
- int repeatCount = config.get(REPEAT_COUNT);
-
- return new InlineSourceProvider(physicalDataType, decodingFormat, parseData(data, type), intervalPerRow, repeatCount);
- }
-
- @Override
- public Set<ConfigOption<?>> requiredOptions() {
- final Set<ConfigOption<?>> options = new HashSet<>();
- options.add(DATA);
- options.add(FactoryUtil.FORMAT);
- return options;
- }
-
- @Override
- public Set<ConfigOption<?>> optionalOptions() {
- final Set<ConfigOption<?>> options = new HashSet<>();
- options.add(TYPE);
- options.add(INTERVAL_PER_ROW);
- options.add(REPEAT_COUNT);
- return options;
- }
-
- List<byte[]> parseData(String data, String type){
- List<byte[]> datas;
- if(JSON.isValidArray(data)){
- List<String> dataArray = JSON.parseArray(data, String.class);
- datas = new ArrayList<>(dataArray.size());
- for (int i = 0; i < dataArray.size(); i++) {
- datas.add(getDataBytes(dataArray.get(i), type));
- }
- }else{
- datas = new ArrayList<>(1);
- datas.add(getDataBytes(data, type));
- }
- return datas;
- }
-
- byte[] getDataBytes(String data, String type){
- if(DATA_TYPE_STRING.equalsIgnoreCase(type)){
- return data.getBytes(StandardCharsets.UTF_8);
- } else if (DATA_TYPE_HEX.equalsIgnoreCase(type)){
- return StringUtils.hexStringToByte(data);
- } else if (DATA_TYPE_BASE64.equalsIgnoreCase(type)) {
- return Base64.getDecoder().decode(data.getBytes(StandardCharsets.UTF_8));
- }else{
- throw new IllegalArgumentException("Unsupported type:" + type);
- }
- }
-
- final static String DATA_TYPE_STRING = "string";
- final static String DATA_TYPE_HEX = "hex";
- final static String DATA_TYPE_BASE64 = "base64";
- public static final ConfigOption<String> DATA =
- ConfigOptions.key("data")
- .stringType()
- .noDefaultValue()
- .withDescription("inline source的输入数据");
- public static final ConfigOption<String> TYPE =
- ConfigOptions.key("type")
- .stringType()
- .defaultValue(DATA_TYPE_STRING)
- .withDescription("数据类型:string(UTF8字符串),hex(十六进制编码),base64(base64编码)");
- public static final ConfigOption<Duration> INTERVAL_PER_ROW =
- ConfigOptions.key("interval.per.row")
- .durationType()
- .defaultValue(Duration.ofSeconds(1))
- .withDescription("sleep interval per row to control the emit rate.");
- public static final ConfigOption<Integer> REPEAT_COUNT =
- key("repeat.count")
- .intType()
- .defaultValue(-1)
- .withDescription("repeat emit data count. By default, the source is unbounded.");
-}
+package com.geedgenetworks.core.connector.inline;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.source.SourceTableFactory;
+import com.geedgenetworks.api.factory.DecodingFormatFactory;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ConfigOptions;
+import org.apache.flink.configuration.ReadableConfig;
+import org.apache.flink.util.StringUtils;
+
+import java.nio.charset.StandardCharsets;
+import java.time.Duration;
+import java.util.*;
+
+import static org.apache.flink.configuration.ConfigOptions.key;
+
+/**
+ * A source for testing; used for simple tests of formats, functions, etc.
+ */
+public class InlineTableFactory implements SourceTableFactory {
+    public static final String IDENTIFIER = "inline";
+    @Override
+    public String type() {
+        return IDENTIFIER;
+    }
+
+    @Override
+    public SourceProvider getSourceProvider(Context context) {
+        final FactoryUtil.TableFactoryHelper helper = FactoryUtil.createTableFactoryHelper(this, context);
+        // Obtain the DecodingFormat
+        DecodingFormat decodingFormat = helper.discoverDecodingFormat(DecodingFormatFactory.class, FactoryUtil.FORMAT);
+
+        helper.validate(); // validate the options
+
+        StructType physicalDataType = context.getPhysicalDataType(); // column types
+        ReadableConfig config = context.getConfiguration();
+
+        String data = config.get(DATA);
+        String type = config.get(TYPE);
+        long intervalPerRow = config.get(INTERVAL_PER_ROW).toMillis();
+        int repeatCount = config.get(REPEAT_COUNT);
+
+        return new InlineSourceProvider(physicalDataType, decodingFormat, parseData(data, type), intervalPerRow, repeatCount);
+    }
+
+    @Override
+    public Set<ConfigOption<?>> requiredOptions() {
+        final Set<ConfigOption<?>> options = new HashSet<>();
+        options.add(DATA);
+        options.add(FactoryUtil.FORMAT);
+        return options;
+    }
+
+    @Override
+    public Set<ConfigOption<?>> optionalOptions() {
+        final Set<ConfigOption<?>> options = new HashSet<>();
+        options.add(TYPE);
+        options.add(INTERVAL_PER_ROW);
+        options.add(REPEAT_COUNT);
+        return options;
+    }
+
+    List<byte[]> parseData(String data, String type) {
+        List<byte[]> datas;
+        if (JSON.isValidArray(data)) {
+            List<String> dataArray = JSON.parseArray(data, String.class);
+            datas = new ArrayList<>(dataArray.size());
+            for (int i = 0; i < dataArray.size(); i++) {
+                datas.add(getDataBytes(dataArray.get(i), type));
+            }
+        } else {
+            datas = new ArrayList<>(1);
+            datas.add(getDataBytes(data, type));
+        }
+        return datas;
+    }
+
+    byte[] getDataBytes(String data, String type) {
+        if (DATA_TYPE_STRING.equalsIgnoreCase(type)) {
+            return data.getBytes(StandardCharsets.UTF_8);
+        } else if (DATA_TYPE_HEX.equalsIgnoreCase(type)) {
+            return StringUtils.hexStringToByte(data);
+        } else if (DATA_TYPE_BASE64.equalsIgnoreCase(type)) {
+            return Base64.getDecoder().decode(data.getBytes(StandardCharsets.UTF_8));
+        } else {
+            throw new IllegalArgumentException("Unsupported type:" + type);
+        }
+    }
+
+    final static String DATA_TYPE_STRING = "string";
+    final static String DATA_TYPE_HEX = "hex";
+    final static String DATA_TYPE_BASE64 = "base64";
+    public static final ConfigOption<String> DATA =
+            ConfigOptions.key("data")
+                    .stringType()
+                    .noDefaultValue()
+                    .withDescription("input data for the inline source");
+    public static final ConfigOption<String> TYPE =
+            ConfigOptions.key("type")
+                    .stringType()
+                    .defaultValue(DATA_TYPE_STRING)
+                    .withDescription("data type: string (UTF-8 text), hex (hexadecimal encoding), base64 (Base64 encoding)");
+    public static final ConfigOption<Duration> INTERVAL_PER_ROW =
+            ConfigOptions.key("interval.per.row")
+                    .durationType()
+                    .defaultValue(Duration.ofSeconds(1))
+                    .withDescription("sleep interval per row to control the emit rate.");
+    public static final ConfigOption<Integer> REPEAT_COUNT =
+            key("repeat.count")
+                    .intType()
+                    .defaultValue(-1)
+                    .withDescription("repeat emit data count. By default, the source is unbounded.");
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintSinkFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintSinkFunction.java
index 0f2b3d5..22187bd 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintSinkFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintSinkFunction.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.connector.print;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.types.StructType;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
 import org.apache.flink.api.common.serialization.SerializationSchema;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.metrics.Counter;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintTableFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintTableFactory.java
index 3bc4910..e558aeb 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintTableFactory.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/connector/print/PrintTableFactory.java
@@ -1,18 +1,17 @@
 package com.geedgenetworks.core.connector.print;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.connector.sink.SinkProvider;
-import com.geedgenetworks.core.factories.EncodingFormatFactory;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SinkTableFactory;
-import com.geedgenetworks.core.types.StructType;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import com.geedgenetworks.api.connector.sink.SinkTableFactory;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
 import org.apache.flink.configuration.ConfigOption;
 import org.apache.flink.configuration.ConfigOptions;
 import org.apache.flink.configuration.ReadableConfig;
 import org.apache.flink.streaming.api.datastream.DataStream;
 import org.apache.flink.streaming.api.datastream.DataStreamSink;
-
+import com.geedgenetworks.api.connector.sink.SinkProvider;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.factory.EncodingFormatFactory;
 import java.util.HashSet;
 import java.util.Optional;
 import java.util.Set;
@@ -25,7 +24,7 @@ import static com.geedgenetworks.core.connector.print.PrintMode.STDOUT;
 public class PrintTableFactory implements SinkTableFactory {
     public static final String IDENTIFIER = "print";
     @Override
-    public String factoryIdentifier() {
+    public String type() {
         return IDENTIFIER;
     }
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/SchemaChangeAware.java b/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/SchemaChangeAware.java
deleted file mode 100644
index a70df24..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/connector/schema/SchemaChangeAware.java
+++ /dev/null
@@ -1,7 +0,0 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.geedgenetworks.core.types.StructType;
-
-public interface SchemaChangeAware {
- void schemaChange(StructType dataType);
-}
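Editor's note: the factory interfaces deleted below (DecodingFormatFactory, EncodingFormatFactory, Factory, SinkTableFactory, SourceTableFactory) reappear under `com.geedgenetworks.api.factory`, and the service registration file earlier in this diff was renamed to `META-INF/services/com.geedgenetworks.api.factory.Factory` to match. A minimal sketch of how such SPI-registered factories can be looked up, assuming the relocated Factory interface exposes the renamed `type()` identifier seen in the new implementations; the lookup helper itself is illustrative, not code from this merge:

```java
import java.util.ServiceLoader;

// Hypothetical helper: resolves a factory registered in
// META-INF/services/com.geedgenetworks.api.factory.Factory by its type()
// identifier (e.g. "inline", "print", "starrocks", "aggregate").
public final class FactoryDiscovery {
    private FactoryDiscovery() {
    }

    public static Factory findFactory(String type) {
        // ServiceLoader scans every service file on the classpath and
        // instantiates each listed implementation class.
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            if (factory.type().equals(type)) {
                return factory;
            }
        }
        throw new IllegalArgumentException("No factory registered for type: " + type);
    }
}
```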
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/DecodingFormatFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/factories/DecodingFormatFactory.java
deleted file mode 100644
index 298b6e0..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/DecodingFormatFactory.java
+++ /dev/null
@@ -1,9 +0,0 @@
-package com.geedgenetworks.core.factories;
-
-
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import org.apache.flink.configuration.ReadableConfig;
-
-public interface DecodingFormatFactory extends FormatFactory {
-    DecodingFormat createDecodingFormat(TableFactory.Context context, ReadableConfig formatOptions);
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/EncodingFormatFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/factories/EncodingFormatFactory.java
deleted file mode 100644
index ffd5af3..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/EncodingFormatFactory.java
+++ /dev/null
@@ -1,8 +0,0 @@
-package com.geedgenetworks.core.factories;
-
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import org.apache.flink.configuration.ReadableConfig;
-
-public interface EncodingFormatFactory extends FormatFactory {
-    EncodingFormat createEncodingFormat(TableFactory.Context context, ReadableConfig formatOptions);
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/Factory.java b/groot-core/src/main/java/com/geedgenetworks/core/factories/Factory.java
deleted file mode 100644
index a650b7e..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/Factory.java
+++ /dev/null
@@ -1,21 +0,0 @@
-package com.geedgenetworks.core.factories;
-
-import org.apache.flink.configuration.ConfigOption;
-
-import java.util.Set;
-
-/**
- * Base interface for registering source, sink, and format factories; a Factory creates instances from key/value configuration.
- * The list of available factories is discovered via the Java Service Provider Interface (SPI): add the implementation class to META-INF/services/com.geedgenetworks.core.factories.Factory.
- * Factory implementations and configuration follow the Flink SQL model.
- */
-public interface Factory {
-    // Returns the unique identifier of this Factory
-    String factoryIdentifier();
-
-    // Options that must be configured
-    Set<ConfigOption<?>> requiredOptions();
-
-    // Optional options
-    Set<ConfigOption<?>> optionalOptions();
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/SinkTableFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/factories/SinkTableFactory.java
deleted file mode 100644
index 819b57c..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/SinkTableFactory.java
+++ /dev/null
@@ -1,7 +0,0 @@
-package com.geedgenetworks.core.factories;
-
-import com.geedgenetworks.core.connector.sink.SinkProvider;
-
-public interface SinkTableFactory extends TableFactory {
-    SinkProvider getSinkProvider(Context context);
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/factories/SourceTableFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/factories/SourceTableFactory.java
deleted file mode 100644
index d5f6db3..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/factories/SourceTableFactory.java
+++ /dev/null
@@ -1,7 +0,0 @@
-package com.geedgenetworks.core.factories;
-
-import com.geedgenetworks.core.connector.source.SourceProvider;
-
-public interface SourceTableFactory extends TableFactory {
-    SourceProvider getSourceProvider(Context context);
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/filter/AviatorFilter.java b/groot-core/src/main/java/com/geedgenetworks/core/filter/AviatorFilter.java
deleted file mode 100644
index 06693c9..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/filter/AviatorFilter.java
+++ /dev/null
@@ -1,43 +0,0 @@
-package com.geedgenetworks.core.filter;
-
-import com.alibaba.fastjson.JSONObject;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.pojo.FilterConfig;
-import com.typesafe.config.Config;
-import org.apache.flink.streaming.api.datastream.DataStream;
-
-import java.util.Map;
-
-public class AviatorFilter implements Filter<FilterConfig> {
-
-    @Override
-    public DataStream<Event> filterFunction(
-            DataStream<Event> singleOutputStreamOperator, FilterConfig FilterConfig)
-            throws Exception {
-
-        if (FilterConfig.getParallelism() != 0) {
-            return singleOutputStreamOperator
-                    .filter(new FilterFunction(FilterConfig))
-                    .setParallelism(FilterConfig.getParallelism())
-                    .name(FilterConfig.getName());
-        } else {
-            return singleOutputStreamOperator
-                    .filter(new FilterFunction(FilterConfig))
-                    .name(FilterConfig.getName());
-        }
-    }
-
-    @Override
-    public String type() {
-        return "aviator";
-    }
-
-    @Override
-    public FilterConfig checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig) {
-
-        FilterConfig filterConfig = new JSONObject(configProperties).toJavaObject(FilterConfig.class);
-        filterConfig.setName(name);
-        return filterConfig;
-    }
-
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/filter/Filter.java b/groot-core/src/main/java/com/geedgenetworks/core/filter/Filter.java
deleted file mode 100644
index 41daf3d..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/filter/Filter.java
+++ /dev/null
@@ -1,22 +0,0 @@
-package com.geedgenetworks.core.filter;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.pojo.FilterConfig;
-
-import com.typesafe.config.Config;
-import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-
-import java.io.Serializable;
-import java.util.Map;
-
-public interface Filter<T extends FilterConfig> extends Serializable {
-
-    DataStream<Event> filterFunction(
-            DataStream<Event> singleOutputStreamOperator, T FilterConfig)
-            throws Exception;
-
-    String type();
-
-    T checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig);
-
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/FilterConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/FilterConfig.java
deleted file mode 100644
index 2d8a0d2..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/FilterConfig.java
+++ /dev/null
@@ -1,54 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import java.io.Serializable;
-import java.util.List;
-import java.util.Map;
-
-public class FilterConfig implements Serializable {
-
-    private List<String> output_fields;
-    private String type;
-    private Map<String, Object> properties;
-    private int parallelism;
-    private String name;
-
-    public String getName() {
-        return name;
-    }
-
-    public void setName(String name) {
-        this.name = name;
-    }
-
-    public List<String> getOutput_fields() {
-        return output_fields;
-    }
-
-    public void setOutput_fields(List<String> output_fields) {
-        this.output_fields = output_fields;
-    }
-
-    public String getType() {
-        return type;
-    }
-
-    public void setType(String type) {
-        this.type = type;
-    }
-
-    public Map<String, Object> getProperties() {
-        return properties;
-    }
-
-    public void setProperties(Map<String, Object> properties) {
-        this.properties = properties;
-    }
-
-    public int getParallelism() {
-        return parallelism;
-    }
-
-    public void setParallelism(int parallelism) {
-        this.parallelism = parallelism;
-    }
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/ProjectionConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/ProjectionConfig.java
deleted file mode 100644
index 48daefd..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/ProjectionConfig.java
+++ /dev/null
@@ -1,17 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import com.geedgenetworks.common.udf.UDFContext;
-import lombok.Data;
-import lombok.EqualsAndHashCode;
-
-import java.io.Serializable;
-import java.util.List;
-import java.util.Map;
-
-@EqualsAndHashCode(callSuper = true)
-@Data
-public class ProjectionConfig extends ProcessorConfig {
-
-    private List<UDFContext> functions;
-    private String format;
-
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SinkConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/SinkConfig.java
deleted file mode 100644
index 66275d9..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SinkConfig.java
+++ /dev/null
@@ -1,43 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import java.io.Serializable;
-import java.util.Map;
-
-public class SinkConfig implements Serializable {
- private String type;
- private Map<String, Object> schema;
- private Map<String, String> properties;
- private String name;
-
- public String getType() {
- return type;
- }
-
- public void setType(String type) {
- this.type = type;
- }
-
- public Map<String, Object> getSchema() {
- return schema;
- }
-
- public void setSchema(Map<String, Object> schema) {
- this.schema = schema;
- }
-
- public Map<String, String> getProperties() {
- return properties;
- }
-
- public void setProperties(Map<String, String> properties) {
- this.properties = properties;
- }
-
- public String getName() {
- return name;
- }
-
- public void setName(String name) {
- this.name = name;
- }
-}
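Editor's note: the configuration POJOs removed in this stretch carried hand-written getters and setters, while the classes that survive (SplitConfig, TableConfig, and the Topology/YamlEntity diffs below) switch to Lombok. A minimal sketch of that pattern, reusing the deleted SinkConfig's fields as the example; `@Data` generates the accessors the diff deletes:

```java
import lombok.Data;

import java.io.Serializable;
import java.util.Map;

// Lombok equivalent of the deleted SinkConfig above: @Data generates
// getters, setters, equals/hashCode and toString at compile time.
@Data
public class SinkConfig implements Serializable {
    private String type;
    private Map<String, Object> schema;
    private Map<String, String> properties;
    private String name;
}
```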
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SinkConfigOld.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/SinkConfigOld.java
deleted file mode 100644
index b2d2647..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SinkConfigOld.java
+++ /dev/null
@@ -1,63 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import java.io.Serializable;
-import java.util.List;
-import java.util.Map;
-
-public class SinkConfigOld implements Serializable {
-
-    private int parallelism;
-    private Map<String, Object> properties;
-    private String format;
-    private String type;
-    private List<String> output_fields;
-    private String name;
-
-    public String getName() {
-        return name;
-    }
-
-    public void setName(String name) {
-        this.name = name;
-    }
-
-    public int getParallelism() {
-        return parallelism;
-    }
-
-    public void setParallelism(int parallelism) {
-        this.parallelism = parallelism;
-    }
-
-    public Map<String, Object> getProperties() {
-        return properties;
-    }
-
-    public void setProperties(Map<String, Object> properties) {
-        this.properties = properties;
-    }
-
-    public String getFormat() {
-        return format;
-    }
-
-    public void setFormat(String format) {
-        this.format = format;
-    }
-
-    public String getType() {
-        return type;
-    }
-
-    public void setType(String type) {
-        this.type = type;
-    }
-
-    public List<String> getOutput_fields() {
-        return output_fields;
-    }
-
-    public void setOutput_fields(List<String> output_fields) {
-        this.output_fields = output_fields;
-    }
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SourceConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/SourceConfig.java
deleted file mode 100644
index dd18593..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SourceConfig.java
+++ /dev/null
@@ -1,69 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import java.io.Serializable;
-import java.util.Map;
-
-public class SourceConfig implements Serializable {
-    private String type;
-    private Map<String, Object> schema;
-    private String watermark_timestamp;
-    private String watermark_timestamp_unit = "ms";
-    private Long watermark_lag;
-    private Map<String, String> properties;
-    private String name;
-
-    public String getType() {
-        return type;
-    }
-
-    public void setType(String type) {
-        this.type = type;
-    }
-
-    public Map<String, Object> getSchema() {
-        return schema;
-    }
-
-    public void setSchema(Map<String, Object> schema) {
-        this.schema = schema;
-    }
-
-    public String getWatermark_timestamp() {
-        return watermark_timestamp;
-    }
-
-    public void setWatermark_timestamp(String watermark_timestamp) {
-        this.watermark_timestamp = watermark_timestamp;
-    }
-
-    public String getWatermark_timestamp_unit() {
-        return watermark_timestamp_unit;
-    }
-
-    public void setWatermark_timestamp_unit(String watermark_timestamp_unit) {
-        this.watermark_timestamp_unit = watermark_timestamp_unit;
-    }
-
-    public Long getWatermark_lag() {
-        return watermark_lag;
-    }
-
-    public void setWatermark_lag(Long watermark_lag) {
-        this.watermark_lag = watermark_lag;
-    }
-
-    public Map<String, String> getProperties() {
-        return properties;
-    }
-
-    public void setProperties(Map<String, String> properties) {
-        this.properties = properties;
-    }
-
-    public String getName() {
-        return name;
-    }
-
-    public void setName(String name) {
-        this.name = name;
-    }
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SourceConfigOld.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/SourceConfigOld.java
deleted file mode 100644
index 9186a22..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SourceConfigOld.java
+++ /dev/null
@@ -1,81 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import java.io.Serializable;
-import java.util.List;
-import java.util.Map;
-
-public class SourceConfigOld implements Serializable {
-
-    private String name;
-    private int parallelism;
-    private Map<String, Object> properties;
-    private String watermark_timestamp;
-    private Long watermark_lag;
-    private String type;
-    private String format;
-    private List<String> output_fields;
-
-    public String getName() {
-        return name;
-    }
-
-    public void setName(String name) {
-        this.name = name;
-    }
-
-    public int getParallelism() {
-        return parallelism;
-    }
-
-    public void setParallelism(int parallelism) {
-        this.parallelism = parallelism;
-    }
-
-    public Map<String, Object> getProperties() {
-        return properties;
-    }
-
-    public void setProperties(Map<String, Object> properties) {
-        this.properties = properties;
-    }
-
-    public String getWatermark_timestamp() {
-        return watermark_timestamp;
-    }
-
-    public void setWatermark_timestamp(String watermark_timestamp) {
-        this.watermark_timestamp = watermark_timestamp;
-    }
-
-    public Long getWatermark_lag() {
-        return watermark_lag;
-    }
-
-    public void setWatermark_lag(Long watermark_lag) {
-        this.watermark_lag = watermark_lag;
-    }
-
-    public String getType() {
-        return type;
-    }
-
-    public void setType(String type) {
-        this.type = type;
-    }
-
-    public String getFormat() {
-        return format;
-    }
-
-    public void setFormat(String format) {
-        this.format = format;
-    }
-
-    public List<String> getOutput_fields() {
-        return output_fields;
-    }
-
-    public void setOutput_fields(List<String> output_fields) {
-        this.output_fields = output_fields;
-    }
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SplitConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/SplitConfig.java
deleted file mode 100644
index 4381df5..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/SplitConfig.java
+++ /dev/null
@@ -1,17 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import com.geedgenetworks.common.udf.RuleContext;
-import lombok.Data;
-
-import java.io.Serializable;
-import java.util.List;
-import java.util.Map;
-
-@Data
-public class SplitConfig implements Serializable {
-
-    private String type;
-    private Map<String, Object> properties;
-    private int parallelism;
-    private String name;
-    private List<RuleContext> rules;
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/TableConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/TableConfig.java
deleted file mode 100644
index 3efb8e1..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/TableConfig.java
+++ /dev/null
@@ -1,15 +0,0 @@
-package com.geedgenetworks.core.pojo;
-
-import com.geedgenetworks.common.udf.UDFContext;
-import lombok.Data;
-import lombok.EqualsAndHashCode;
-
-import java.util.List;
-
-@EqualsAndHashCode(callSuper = true)
-@Data
-public class TableConfig extends ProcessorConfig {
-
-    private List<UDFContext> functions;
-    private String format;
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/Topology.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/Topology.java
index e40442b..fffbca6 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/Topology.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/pojo/Topology.java
@@ -1,26 +1,13 @@
 package com.geedgenetworks.core.pojo;
 
+import lombok.Data;
+
 import java.io.Serializable;
 import java.util.List;
 
+@Data
 public class Topology implements Serializable {
 
     private int parallelism;
     private List<String> next;
-
-    public int getParallelism() {
-        return parallelism;
-    }
-
-    public void setParallelism(int parallelism) {
-        this.parallelism = parallelism;
-    }
-
-    public List<String> getNext() {
-        return next;
-    }
-
-    public void setNext(List<String> next) {
-        this.next = next;
-    }
 }
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/YamlEntity.java b/groot-core/src/main/java/com/geedgenetworks/core/pojo/YamlEntity.java
index 94e5ae5..1d69479 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/YamlEntity.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/pojo/YamlEntity.java
@@ -1,8 +1,11 @@
 package com.geedgenetworks.core.pojo;
 
+import lombok.Data;
+
 import java.io.Serializable;
 import java.util.Map;
 
+@Data
 public class YamlEntity implements Serializable {
 
     private Map<String, Object> sources;
@@ -12,60 +15,4 @@ public class YamlEntity implements Serializable {
     private Map<String, Object> pre_processing_pipelines;
     private Map<String, Object> processing_pipelines;
     private Map<String, Object> post_processing_pipelines;
-
-    public Map<String, Object> getSources() {
-        return sources;
-    }
-
-    public void setSources(Map<String, Object> sources) {
-        this.sources = sources;
-    }
-
-    public Map<String, Object> getApplication() {
-        return application;
-    }
-
-    public void setApplication(Map<String, Object> application) {
-        this.application = application;
-    }
-
-    public Map<String, Object> getSinks() {
-        return sinks;
-    }
-
-    public void setSinks(Map<String, Object> sinks) {
-        this.sinks = sinks;
-    }
-
-    public Map<String, Object> getFilters() {
-        return filters;
-    }
-
-    public void setFilters(Map<String, Object> filters) {
-        this.filters = filters;
-    }
-
-    public Map<String, Object> getPre_processing_pipelines() {
-        return pre_processing_pipelines;
-    }
-
-    public void setPre_processing_pipelines(Map<String, Object> pre_processing_pipelines) {
-        this.pre_processing_pipelines = pre_processing_pipelines;
-    }
-
-    public Map<String, Object> getProcessing_pipelines() {
-        return processing_pipelines;
-    }
-
-    public void setProcessing_pipelines(Map<String, Object> processing_pipelines) {
-        this.processing_pipelines = processing_pipelines;
-    }
-
-    public Map<String, Object> getPost_processing_pipelines() {
-        return post_processing_pipelines;
-    }
-
-    public void setPost_processing_pipelines(Map<String, Object> post_processing_pipelines) {
-        this.post_processing_pipelines = post_processing_pipelines;
-    }
 }
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/Processor.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/Processor.java
deleted file mode 100644
index 1c9ba6f..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/Processor.java
+++ /dev/null
@@ -1,22 +0,0 @@
-package com.geedgenetworks.core.processor;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.pojo.ProcessorConfig;
-import com.typesafe.config.Config;
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-
-import java.io.Serializable;
-import java.util.Map;
-
-public interface Processor<T extends ProcessorConfig> extends Serializable {
-
-    DataStream<Event> processorFunction(
-            DataStream<Event> singleOutputStreamOperator,
-            T processorConfig, ExecutionConfig config)
-            throws Exception;
-
-    String type();
-
-    T checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig);
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AbstractFirstAggregation.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AbstractFirstAggregation.java
index ce77ee8..4dcea9d 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AbstractFirstAggregation.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AbstractFirstAggregation.java
@@ -1,19 +1,17 @@
 package com.geedgenetworks.core.processor.aggregate;
 
-
 import cn.hutool.crypto.SecureUtil;
 import com.alibaba.fastjson.JSON;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Constants;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.KeybyEntity;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.common.config.Constants;
+import com.geedgenetworks.common.config.KeybyEntity;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.metrics.InternalMetrics;
-import com.geedgenetworks.core.pojo.AggregateConfig;
-import com.geedgenetworks.core.processor.projection.UdfEntity;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UdfEntity;
+import com.geedgenetworks.api.metrics.InternalMetrics;
+import com.geedgenetworks.api.connector.event.Event;
 import com.google.common.collect.Lists;
 import com.googlecode.aviator.AviatorEvaluator;
 import com.googlecode.aviator.AviatorEvaluatorInstance;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/pojo/AggregateConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateConfig.java
index ebdb0bd..e8d5c26 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/pojo/AggregateConfig.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateConfig.java
@@ -1,7 +1,8 @@
-package com.geedgenetworks.core.pojo;
+package com.geedgenetworks.core.processor.aggregate;
 
 import com.alibaba.fastjson2.annotation.JSONField;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.processor.ProcessorConfig;
 import lombok.Data;
 import lombok.EqualsAndHashCode;
 
@@ -10,8 +11,8 @@ import java.util.List;
 @EqualsAndHashCode(callSuper = true)
 @Data
 public class AggregateConfig extends ProcessorConfig {
-
-
+    private List<String> output_fields;
+    private List<String> remove_fields;
     private List<String> group_by_fields;
     private String window_timestamp_field;
     private String window_type;
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/AggregateConfigOptions.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateConfigOptions.java
index af94abf..4f88d55 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/config/AggregateConfigOptions.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateConfigOptions.java
@@ -1,7 +1,9 @@
-package com.geedgenetworks.common.config;
+package com.geedgenetworks.core.processor.aggregate;
 
 import com.alibaba.fastjson2.TypeReference;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
 
 import java.util.List;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessor.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessor.java
index 0846ffe..498d833 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessor.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessor.java
@@ -1,7 +1,126 @@
 package com.geedgenetworks.core.processor.aggregate;
 
-import com.geedgenetworks.core.pojo.AggregateConfig;
-import com.geedgenetworks.core.processor.Processor;
 
-public interface AggregateProcessor extends Processor<AggregateConfig> {
+import com.alibaba.fastjson.JSONObject;
+import com.geedgenetworks.common.config.CheckConfigUtil;
+import com.geedgenetworks.common.config.CheckResult;
+import com.geedgenetworks.common.exception.CommonErrorCode;
+import com.geedgenetworks.common.exception.ConfigValidationException;
+import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.processor.Processor;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigUtil;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
+import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows;
+import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
+import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
+import org.apache.flink.streaming.api.windowing.time.Time;
+
+import java.util.Map;
+
+import static com.geedgenetworks.common.config.Constants.*;
+
+public class AggregateProcessor implements Processor<AggregateConfig> {
+
+    @Override
+    public DataStream<Event> process(StreamExecutionEnvironment env, DataStream<Event> input, AggregateConfig aggregateConfig) {
+
+        SingleOutputStreamOperator<Event> singleOutputStreamOperator;
+        if (aggregateConfig.getMini_batch()) {
+            switch (aggregateConfig.getWindow_type()) {
+                case TUMBLING_PROCESSING_TIME:
+                    singleOutputStreamOperator = input
+                            .process(new FirstAggregationProcessingTime(aggregateConfig, aggregateConfig.getWindow_size()))
+                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(TumblingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
+                            .aggregate(new SecondAggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                case TUMBLING_EVENT_TIME:
+                    singleOutputStreamOperator = input
+                            .process(new FirstAggregationEventTime(aggregateConfig, aggregateConfig.getWindow_size()))
+                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(TumblingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
+                            .aggregate(new SecondAggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                case SLIDING_PROCESSING_TIME:
+                    singleOutputStreamOperator = input
+                            .process(new FirstAggregationProcessingTime(aggregateConfig, aggregateConfig.getWindow_slide()))
+                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(SlidingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
+                            .aggregate(new SecondAggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                case SLIDING_EVENT_TIME:
+                    singleOutputStreamOperator = input
+                            .process(new FirstAggregationEventTime(aggregateConfig, aggregateConfig.getWindow_slide()))
+                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(SlidingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
+                            .aggregate(new SecondAggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                default:
+                    throw new GrootStreamRuntimeException(CommonErrorCode.UNSUPPORTED_OPERATION, "Invalid window type");
+
+            }
+        } else {
+            switch (aggregateConfig.getWindow_type()) {
+                case TUMBLING_PROCESSING_TIME:
+                    singleOutputStreamOperator = input
+                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(TumblingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
+                            .aggregate(new AggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                case TUMBLING_EVENT_TIME:
+                    singleOutputStreamOperator = input
+                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(TumblingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
+                            .aggregate(new AggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                case SLIDING_PROCESSING_TIME:
+                    singleOutputStreamOperator = input
+                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(SlidingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
+                            .aggregate(new AggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                case SLIDING_EVENT_TIME:
+                    singleOutputStreamOperator = input
+                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
+                            .window(SlidingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
+                            .aggregate(new AggregateProcessorFunction(env, aggregateConfig), new ProcessWindowFunction(aggregateConfig));
+                    break;
+                default:
+                    throw new GrootStreamRuntimeException(CommonErrorCode.UNSUPPORTED_OPERATION, "Invalid window type");
+            }
+        }
+        if (aggregateConfig.getParallelism() != 0) {
+            singleOutputStreamOperator.setParallelism(aggregateConfig.getParallelism());
+        }
+        return singleOutputStreamOperator.name(aggregateConfig.getName());
+
+    }
+
+
+    @Override
+    public AggregateConfig parseConfig(String name, Config config) {
+
+        CheckResult result = CheckConfigUtil.checkAllExists(config,
+                AggregateConfigOptions.GROUP_BY_FIELDS.key(),
+                AggregateConfigOptions.WINDOW_TYPE.key(),
+                AggregateConfigOptions.FUNCTIONS.key(),
+                AggregateConfigOptions.WINDOW_SIZE.key());
+        if (!result.isSuccess()) {
+            throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format(
+                    "Aggregate processor: %s, At least one of [%s] should be specified.",
+                    name, String.join(",",
+                            AggregateConfigOptions.OUTPUT_FIELDS.key(),
+                            AggregateConfigOptions.REMOVE_FIELDS.key(),
+                            AggregateConfigOptions.FUNCTIONS.key())));
+        }
+
+        AggregateConfig aggregateConfig = new JSONObject(config.root().unwrapped()).toJavaObject(AggregateConfig.class);
+        aggregateConfig.setName(name);
+        return aggregateConfig;
+    }
 }
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFactory.java
new file mode 100644
index 0000000..03cc1e1
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFactory.java
@@ -0,0 +1,30 @@
+package com.geedgenetworks.core.processor.aggregate;
+
+import com.geedgenetworks.api.processor.Processor;
+import com.geedgenetworks.api.factory.ProcessorFactory;
+import org.apache.flink.configuration.ConfigOption;
+
+import java.util.Set;
+
+public class AggregateProcessorFactory implements ProcessorFactory {
+
+    @Override
+    public String type() {
+        return "aggregate";
+    }
+
+    @Override
+    public Set<ConfigOption<?>> requiredOptions() {
+        return Set.of();
+    }
+
+    @Override
+    public Set<ConfigOption<?>> optionalOptions() {
+        return Set.of();
+    }
+
+    @Override
+    public Processor<?> createProcessor() {
+        return new AggregateProcessor();
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFunction.java
index 4f9535d..cf54c3f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorFunction.java
@@ -1,15 +1,14 @@
 package com.geedgenetworks.core.processor.aggregate;
 
 import com.alibaba.fastjson.JSON;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Constants;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.pojo.AggregateConfig;
-import com.geedgenetworks.core.processor.projection.UdfEntity;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UdfEntity;
+import com.geedgenetworks.api.connector.event.Event;
 import com.google.common.collect.Lists;
 import com.googlecode.aviator.AviatorEvaluator;
 import com.googlecode.aviator.AviatorEvaluatorInstance;
@@ -17,7 +16,7 @@ import com.googlecode.aviator.Expression;
 import com.googlecode.aviator.Options;
 import com.googlecode.aviator.exception.ExpressionRuntimeException;
 import lombok.extern.slf4j.Slf4j;
-import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 
 import java.util.HashMap;
 import java.util.LinkedList;
@@ -33,8 +32,8 @@ public class AggregateProcessorFunction implements org.apache.flink.api.common.f
     private final List<String> udfClassNameLists;
     private final LinkedList<UdfEntity> functions;
 
-    public AggregateProcessorFunction(AggregateConfig aggregateConfig, ExecutionConfig config) {
-        udfClassNameLists = JSON.parseObject(config.getGlobalJobParameters().toMap().get(Constants.SYSPROP_UDF_PLUGIN_CONFIG), List.class);
+    public AggregateProcessorFunction(StreamExecutionEnvironment env, AggregateConfig aggregateConfig) {
+        udfClassNameLists = JSON.parseObject(env.getConfig().getGlobalJobParameters().toMap().get(Constants.SYSPROP_UDF_PLUGIN_CONFIG), List.class);
         udfContexts = aggregateConfig.getFunctions();
         if (udfContexts == null || udfContexts.isEmpty()) {
             throw new RuntimeException();
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorImpl.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorImpl.java
deleted file mode 100644
index 4712d36..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/AggregateProcessorImpl.java
+++ /dev/null
@@ -1,129 +0,0 @@
-package com.geedgenetworks.core.processor.aggregate;
-
-import com.alibaba.fastjson.JSONObject;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.config.AggregateConfigOptions;
-import com.geedgenetworks.common.config.CheckConfigUtil;
-import com.geedgenetworks.common.config.CheckResult;
-import com.geedgenetworks.common.exception.CommonErrorCode;
-import com.geedgenetworks.common.exception.ConfigValidationException;
-import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.core.pojo.AggregateConfig;
-import com.typesafe.config.Config;
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
-import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows;
-import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
-import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
-import org.apache.flink.streaming.api.windowing.time.Time;
-
-import java.util.Map;
-
-import static com.geedgenetworks.common.Constants.*;
-
-public class AggregateProcessorImpl implements AggregateProcessor {
-
-    @Override
-    public DataStream<Event> processorFunction(DataStream<Event> grootEventSingleOutputStreamOperator, AggregateConfig aggregateConfig, ExecutionConfig config) throws Exception {
-
-        SingleOutputStreamOperator<Event> singleOutputStreamOperator;
-        if (aggregateConfig.getMini_batch()) {
-            switch (aggregateConfig.getWindow_type()) {
-                case TUMBLING_PROCESSING_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .process(new FirstAggregationProcessingTime(aggregateConfig, aggregateConfig.getWindow_size()))
-                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(TumblingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
-                            .aggregate(new SecondAggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                case TUMBLING_EVENT_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .process(new FirstAggregationEventTime(aggregateConfig, aggregateConfig.getWindow_size()))
-                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(TumblingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
-                            .aggregate(new SecondAggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                case SLIDING_PROCESSING_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .process(new FirstAggregationProcessingTime(aggregateConfig, aggregateConfig.getWindow_slide()))
-                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(SlidingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
-                            .aggregate(new SecondAggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                case SLIDING_EVENT_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .process(new FirstAggregationEventTime(aggregateConfig, aggregateConfig.getWindow_slide()))
-                            .keyBy(new PreKeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(SlidingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
-                            .aggregate(new SecondAggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                default:
-                    throw new GrootStreamRuntimeException(CommonErrorCode.UNSUPPORTED_OPERATION, "Invalid window type");
-
-            }
-        } else {
-            switch (aggregateConfig.getWindow_type()) {
-                case TUMBLING_PROCESSING_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(TumblingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
-                            .aggregate(new AggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                case TUMBLING_EVENT_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(TumblingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size())))
-                            .aggregate(new AggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                case SLIDING_PROCESSING_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(SlidingProcessingTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
-                            .aggregate(new AggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                case SLIDING_EVENT_TIME:
-                    singleOutputStreamOperator = grootEventSingleOutputStreamOperator
-                            .keyBy(new KeySelector(aggregateConfig.getGroup_by_fields()))
-                            .window(SlidingEventTimeWindows.of(Time.seconds(aggregateConfig.getWindow_size()), Time.seconds(aggregateConfig.getWindow_slide())))
-                            .aggregate(new AggregateProcessorFunction(aggregateConfig, config), new ProcessWindowFunctionImpl(aggregateConfig));
-                    break;
-                default:
-                    throw new GrootStreamRuntimeException(CommonErrorCode.UNSUPPORTED_OPERATION, "Invalid window type");
-            }
-        }
-        if (aggregateConfig.getParallelism() != 0) {
-            singleOutputStreamOperator.setParallelism(aggregateConfig.getParallelism());
-        }
-        return singleOutputStreamOperator.name(aggregateConfig.getName());
-
-    }
-
-    @Override
-    public String type() {
-        return "aggregate";
-    }
-
-    @Override
-    public AggregateConfig checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig) {
-        CheckResult result = CheckConfigUtil.checkAllExists(typeSafeConfig.getConfig(name),
-                AggregateConfigOptions.GROUP_BY_FIELDS.key(),
-                AggregateConfigOptions.WINDOW_TYPE.key(),
-                AggregateConfigOptions.FUNCTIONS.key(),
-                AggregateConfigOptions.WINDOW_SIZE.key());
-        if (!result.isSuccess()) {
-            throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format(
-                    "Aggregate processor: %s, At least one of [%s] should be specified.",
-                    name, String.join(",",
-                            AggregateConfigOptions.OUTPUT_FIELDS.key(),
-                            AggregateConfigOptions.REMOVE_FIELDS.key(),
-                            AggregateConfigOptions.FUNCTIONS.key())));
-        }
-
-        AggregateConfig aggregateConfig = new JSONObject(configProperties).toJavaObject(AggregateConfig.class);
-        aggregateConfig.setName(name);
-        return aggregateConfig;
-    }
-
-}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationEventTime.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationEventTime.java
index 5adc6d1..6e53bd1 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationEventTime.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationEventTime.java
@@ -1,10 +1,9 @@
 package com.geedgenetworks.core.processor.aggregate;
 
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.KeybyEntity;
-import com.geedgenetworks.core.pojo.AggregateConfig;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.common.config.KeybyEntity;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.streaming.api.functions.ProcessFunction;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationProcessingTime.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationProcessingTime.java
index 01c346f..2cd7c61 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationProcessingTime.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/FirstAggregationProcessingTime.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.processor.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.KeybyEntity;
-import com.geedgenetworks.core.pojo.AggregateConfig;
+import com.geedgenetworks.common.config.Accumulator;
+
+import com.geedgenetworks.common.config.KeybyEntity;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.streaming.api.functions.ProcessFunction;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/KeySelector.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/KeySelector.java
index a6fb294..2b5f1e3 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/KeySelector.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/KeySelector.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.processor.aggregate;
 
 import cn.hutool.crypto.SecureUtil;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.KeybyEntity;
+import com.geedgenetworks.common.config.KeybyEntity;
+import com.geedgenetworks.api.connector.event.Event;
 
 import java.util.HashMap;
 import java.util.List;
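Editor's note: the `mini_batch` branch of AggregateProcessor above wires a two-stage aggregation: a FirstAggregation* ProcessFunction pre-aggregates events locally, then the keyed window merges the partial results with SecondAggregateProcessorFunction. A compressed sketch of that topology, using names from this diff but arbitrarily picking the tumbling event-time variant; it mirrors the code above rather than adding behavior:

```java
// Two-phase (mini-batch) aggregation path from AggregateProcessor.process():
// stage 1 pre-aggregates per subtask, stage 2 merges per key inside the window.
DataStream<Event> result = input
        .process(new FirstAggregationEventTime(config, config.getWindow_size()))  // local pre-aggregation
        .keyBy(new PreKeySelector(config.getGroup_by_fields()))                   // key on the pre-aggregated entity
        .window(TumblingEventTimeWindows.of(Time.seconds(config.getWindow_size())))
        .aggregate(new SecondAggregateProcessorFunction(env, config),             // final merge
                new ProcessWindowFunction(config));                               // emit windowed Event
```

The pre-aggregation step shrinks the keyed shuffle when many events share a group-by key, at the cost of an extra operator in the pipeline.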
a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/PreKeySelector.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/PreKeySelector.java
index 4b21ba7..21964f4 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/PreKeySelector.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/PreKeySelector.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.processor.aggregate;
 
 import cn.hutool.crypto.SecureUtil;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.KeybyEntity;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.common.config.KeybyEntity;
 
 import java.util.HashMap;
 import java.util.List;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/ProcessWindowFunctionImpl.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/ProcessWindowFunction.java
index cd5c485..7e1bc8c 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/ProcessWindowFunctionImpl.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/ProcessWindowFunction.java
@@ -1,20 +1,18 @@
 package com.geedgenetworks.core.processor.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.KeybyEntity;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.common.config.KeybyEntity;
 import com.geedgenetworks.common.utils.ColumnUtil;
-import com.geedgenetworks.core.metrics.InternalMetrics;
-import com.geedgenetworks.core.pojo.AggregateConfig;
+import com.geedgenetworks.api.metrics.InternalMetrics;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.configuration.Configuration;
-import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
 import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
 import org.apache.flink.util.Collector;
 
-import static com.geedgenetworks.common.Event.WINDOW_END_TIMESTAMP;
-import static com.geedgenetworks.common.Event.WINDOW_START_TIMESTAMP;
+import static com.geedgenetworks.api.connector.event.Event.WINDOW_END_TIMESTAMP;
+import static com.geedgenetworks.api.connector.event.Event.WINDOW_START_TIMESTAMP;
 
-public class ProcessWindowFunctionImpl extends org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction<
+public class ProcessWindowFunction extends org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction<
         Accumulator,  // input type
         Event,        // output type
         KeybyEntity,  // key type
@@ -22,7 +20,7 @@ public class ProcessWindowFunctionImpl extends org.apache.flink.streaming.api.fu
     private final AggregateConfig aggregateConfig;
     private transient InternalMetrics internalMetrics;
 
-    public ProcessWindowFunctionImpl(AggregateConfig aggregateConfig) {
+    public ProcessWindowFunction(AggregateConfig aggregateConfig) {
         this.aggregateConfig = aggregateConfig;
     }
 
@@ -34,7 +32,7 @@ public class ProcessWindowFunctionImpl extends org.apache.flink.streaming.api.fu
     }
 
     @Override
-    public void process(KeybyEntity keybyEntity, ProcessWindowFunction<Accumulator, Event, KeybyEntity, TimeWindow>.Context context, Iterable<Accumulator> elements, Collector<Event> out) throws Exception {
+    public void process(KeybyEntity keybyEntity, org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction<Accumulator, Event, KeybyEntity, TimeWindow>.Context context, Iterable<Accumulator> elements, Collector<Event> out) throws Exception {
         Accumulator accumulator = elements.iterator().next();
         Event event = new Event();
         event.setExtractedFields(accumulator.getMetricsFields());
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/SecondAggregateProcessorFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/SecondAggregateProcessorFunction.java
index 68fa53e..86cf3f6 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/SecondAggregateProcessorFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/aggregate/SecondAggregateProcessorFunction.java
@@ -1,14 +1,13 @@
 package com.geedgenetworks.core.processor.aggregate;
 
 import com.alibaba.fastjson.JSON;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.pojo.AggregateConfig;
-import com.geedgenetworks.core.processor.projection.UdfEntity;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UdfEntity;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
 import com.google.common.collect.Lists;
 import com.googlecode.aviator.AviatorEvaluator;
 import com.googlecode.aviator.AviatorEvaluatorInstance;
@@ -16,14 +15,13 @@ import com.googlecode.aviator.Expression;
 import com.googlecode.aviator.Options;
 import com.googlecode.aviator.exception.ExpressionRuntimeException;
 import lombok.extern.slf4j.Slf4j;
-import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 
 import java.util.HashMap;
 import java.util.LinkedList;
 import java.util.List;
 import java.util.Map;
 
-import static com.geedgenetworks.core.utils.UDFUtils.filterExecute;
 import static com.geedgenetworks.core.utils.UDFUtils.getClassReflect;
 
 @Slf4j
@@ -32,8 +30,8 @@ public class SecondAggregateProcessorFunction implements org.apache.flink.api.co
     private final List<String> udfClassNameLists;
     private final LinkedList<UdfEntity> functions;
 
-    public SecondAggregateProcessorFunction(AggregateConfig aggregateConfig, ExecutionConfig config) {
-        udfClassNameLists = JSON.parseObject(config.getGlobalJobParameters().toMap().get(Constants.SYSPROP_UDF_PLUGIN_CONFIG), List.class);
+    public SecondAggregateProcessorFunction(StreamExecutionEnvironment env, AggregateConfig aggregateConfig) {
+        udfClassNameLists = JSON.parseObject(env.getConfig().getGlobalJobParameters().toMap().get(Constants.SYSPROP_UDF_PLUGIN_CONFIG), List.class);
         udfContexts = aggregateConfig.getFunctions();
         if (udfContexts == null || udfContexts.isEmpty()) {
             throw new RuntimeException();
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/AviatorFilterProcessor.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/AviatorFilterProcessor.java
new file mode 100644
index 0000000..8953c94
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/AviatorFilterProcessor.java
@@ -0,0 +1,32 @@
+package com.geedgenetworks.core.processor.filter;
+
+import com.alibaba.fastjson.JSONObject;
+import com.geedgenetworks.api.processor.Processor;
+import com.geedgenetworks.api.connector.event.Event;
+import com.typesafe.config.Config;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class AviatorFilterProcessor implements Processor<FilterConfig> {
+
+    @Override
+    public DataStream<Event> process(StreamExecutionEnvironment env,
+                                     DataStream<Event> input, FilterConfig filterConfig) {
+        SingleOutputStreamOperator<Event> resultStream =
+                input.filter(new FilterFunction(filterConfig)).name(filterConfig.getName());
+
+        if (filterConfig.getParallelism() != 0) {
+            resultStream.setParallelism(filterConfig.getParallelism());
+        }
+        return resultStream;
+    }
+
+    @Override
+    public FilterConfig parseConfig(String name, Config config) {
+        FilterConfig filterConfig = new JSONObject(config.root().unwrapped()).toJavaObject(FilterConfig.class);
+        filterConfig.setName(name);
+        return filterConfig;
+    }
+
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/AviatorFilterProcessorFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/AviatorFilterProcessorFactory.java
new file mode 100644
index 0000000..ea0c60b
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/AviatorFilterProcessorFactory.java
@@ -0,0 +1,30 @@
+package com.geedgenetworks.core.processor.filter;
+
+import com.geedgenetworks.api.processor.Processor;
+import com.geedgenetworks.api.factory.ProcessorFactory;
+import org.apache.flink.configuration.ConfigOption;
+
+import java.util.Set;
+
+public class AviatorFilterProcessorFactory implements ProcessorFactory {
+
+    @Override
+    public Processor<?> createProcessor() {
+        return new AviatorFilterProcessor();
+    }
+
+    @Override
+    public String type() {
+        return "aviator";
+    }
+
+    @Override
+    public Set<ConfigOption<?>> requiredOptions() {
+        return Set.of();
+    }
+
+    @Override
+    public Set<ConfigOption<?>> optionalOptions() {
+        return Set.of();
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterConfig.java
new file mode 100644
index 0000000..6291860
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterConfig.java
@@ -0,0 +1,13 @@
+package com.geedgenetworks.core.processor.filter;
+
+import com.geedgenetworks.api.processor.ProcessorConfig;
+import lombok.Data;
+import lombok.EqualsAndHashCode;
+
+import java.util.List;
+
+@EqualsAndHashCode(callSuper = true)
+@Data
+public class FilterConfig extends ProcessorConfig {
+    private List<String> output_fields;
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterConfigOptions.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterConfigOptions.java
new file mode 100644
index 0000000..3f7e01a
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterConfigOptions.java
@@ -0,0 +1,15 @@
+package com.geedgenetworks.core.processor.filter;
+
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
+
+import java.util.List;
+import java.util.Map;
+
+public interface FilterConfigOptions {
+    Option<List<String>> OUTPUT_FIELDS = Options.key("output_fields")
+            .listType()
+            .noDefaultValue()
+            .withDescription("The fields to be outputted.");
+
+}
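The ProcessorFactory implementations added in this merge only take effect once something on the classpath discovers them. Below is a minimal sketch of that lookup, assuming the factories are registered for java.util.ServiceLoader under a META-INF/services/com.geedgenetworks.api.factory.ProcessorFactory file; the registration files and the real loader are not part of this hunk, so the class below and its method names are illustrative only.

import com.geedgenetworks.api.factory.ProcessorFactory;
import com.geedgenetworks.api.processor.Processor;

import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

public final class ProcessorFactoryDiscovery {

    private ProcessorFactoryDiscovery() {
    }

    // Collects every ProcessorFactory visible to the ServiceLoader, keyed by its type() string.
    public static Map<String, ProcessorFactory> discover() {
        Map<String, ProcessorFactory> factories = new HashMap<>();
        for (ProcessorFactory factory : ServiceLoader.load(ProcessorFactory.class)) {
            factories.put(factory.type(), factory);
        }
        return factories;
    }

    // Resolves a `type:` key from the job YAML (e.g. "aviator") to a ready Processor instance.
    public static Processor<?> createProcessor(String type) {
        ProcessorFactory factory = discover().get(type);
        if (factory == null) {
            throw new IllegalArgumentException("No ProcessorFactory registered for type: " + type);
        }
        return factory.createProcessor();
    }
}

With such a registry, a pipeline entry declaring type: aviator resolves straight to AviatorFilterProcessorFactory, and the processor it creates carries the parseConfig/process pair that replaces the old checkConfig/processorFunction contract removed elsewhere in this diff.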
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/filter/FilterFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterFunction.java
index facb4af..9d5d6f3 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/filter/FilterFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/filter/FilterFunction.java
@@ -1,9 +1,8 @@
-package com.geedgenetworks.core.filter;
+package com.geedgenetworks.core.processor.filter;
 
 import com.geedgenetworks.common.utils.ColumnUtil;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.metrics.InternalMetrics;
-import com.geedgenetworks.core.pojo.FilterConfig;
+import com.geedgenetworks.api.metrics.InternalMetrics;
+import com.geedgenetworks.api.connector.event.Event;
 import com.googlecode.aviator.AviatorEvaluator;
 import com.googlecode.aviator.AviatorEvaluatorInstance;
 import com.googlecode.aviator.Expression;
@@ -12,24 +11,23 @@ import com.googlecode.aviator.exception.ExpressionRuntimeException;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RichFilterFunction;
 import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.Counter;
 
 import java.util.Map;
 
 @Slf4j
 public class FilterFunction extends RichFilterFunction<Event> {
-    private final com.geedgenetworks.core.pojo.FilterConfig FilterConfig;
+    private final FilterConfig filterConfig;
     private static Expression compiledExp;
     private static String expression;
     private transient InternalMetrics internalMetrics;
 
-    public FilterFunction(FilterConfig FilterConfig) {
-        this.FilterConfig = FilterConfig;
+    public FilterFunction(FilterConfig filterConfig) {
+        this.filterConfig = filterConfig;
     }
 
     @Override
     public void open(Configuration parameters) throws Exception {
         this.internalMetrics = new InternalMetrics(getRuntimeContext());
-        expression = FilterConfig.getProperties().getOrDefault("expression", "").toString();
+        expression = filterConfig.getProperties().getOrDefault("expression", "").toString();
         AviatorEvaluatorInstance instance = AviatorEvaluator.getInstance();
         instance.setCachedExpressionByDefault(true);
         instance.setOption(Options.OPTIMIZE_LEVEL, AviatorEvaluator.EVAL);
@@ -43,11 +41,11 @@ public class FilterFunction extends RichFilterFunction<Event> {
 
         boolean isFilter ;
         try {
-            if (FilterConfig.getOutput_fields() != null
-                    && !FilterConfig.getOutput_fields().isEmpty()) {
+            if (filterConfig.getOutput_fields() != null
+                    && !filterConfig.getOutput_fields().isEmpty()) {
                 value.setExtractedFields(
                         ColumnUtil.columnSelector(
-                                value.getExtractedFields(), FilterConfig.getOutput_fields()));
+                                value.getExtractedFields(), filterConfig.getOutput_fields()));
             }
 
             isFilter = aviatorExcute(value.getExtractedFields());
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionConfig.java
new file mode 100644
index 0000000..fd7ff41
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionConfig.java
@@ -0,0 +1,17 @@
+package com.geedgenetworks.core.processor.projection;
+
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.processor.ProcessorConfig;
+import lombok.Data;
+import lombok.EqualsAndHashCode;
+
+import java.util.List;
+
+@EqualsAndHashCode(callSuper = true)
+@Data
+public class ProjectionConfig extends ProcessorConfig {
+    private List<String> output_fields;
+    private List<String> remove_fields;
+    private List<UDFContext> functions;
+
+}
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/ProjectionConfigOptions.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionConfigOptions.java
index 1a813af..0607e16 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/config/ProjectionConfigOptions.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionConfigOptions.java
@@ -1,15 +1,13 @@
-package com.geedgenetworks.common.config;
+package com.geedgenetworks.core.processor.projection;
 
 import com.alibaba.fastjson2.TypeReference;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Option;
+import com.geedgenetworks.common.config.Options;
 
 import java.util.List;
 
 public interface ProjectionConfigOptions {
-    Option<String> TYPE = Options.key("type")
-            .stringType()
-            .noDefaultValue()
-            .withDescription("The type of processor.");
 
     Option<List<String>> OUTPUT_FIELDS = Options.key("output_fields")
             .listType()
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessFunction.java
index 55258b3..db0070f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessFunction.java
@@ -1,17 +1,18 @@
 package com.geedgenetworks.core.processor.projection;
 
 import com.alibaba.fastjson.JSON;
-import com.geedgenetworks.common.Constants;
+
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.*;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.common.utils.ColumnUtil;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.metrics.InternalMetrics;
-import com.geedgenetworks.core.pojo.ProjectionConfig;
-import com.geedgenetworks.common.udf.ScalarFunction;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseScheduler;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UdfEntity;
+import com.geedgenetworks.api.metrics.InternalMetrics;
+import com.geedgenetworks.api.connector.event.Event;
 import com.google.common.collect.Lists;
 import com.googlecode.aviator.AviatorEvaluator;
 import com.googlecode.aviator.AviatorEvaluatorInstance;
@@ -22,14 +23,13 @@ import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.streaming.api.functions.ProcessFunction;
 import org.apache.flink.util.Collector;
+import static com.geedgenetworks.core.utils.UDFUtils.filterExecute;
+import static com.geedgenetworks.core.utils.UDFUtils.getClassReflect;
 
 import java.util.LinkedList;
 import java.util.List;
 import java.util.Map;
 
-import static com.geedgenetworks.core.utils.UDFUtils.filterExecute;
-import static com.geedgenetworks.core.utils.UDFUtils.getClassReflect;
-
 @Slf4j
 public class ProjectionProcessFunction extends ProcessFunction<Event, Event> {
     private final ProjectionConfig projectionConfig;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessor.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessor.java
index f15d481..eb32786 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessor.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessor.java
@@ -1,8 +1,50 @@
 package com.geedgenetworks.core.processor.projection;
 
-import com.geedgenetworks.core.pojo.ProjectionConfig;
-import com.geedgenetworks.core.processor.Processor;
+import com.alibaba.fastjson.JSONObject;
+import com.geedgenetworks.common.config.CheckConfigUtil;
+import com.geedgenetworks.common.config.CheckResult;
+import com.geedgenetworks.common.exception.CommonErrorCode;
+import com.geedgenetworks.common.exception.ConfigValidationException;
+import com.geedgenetworks.api.processor.Processor;
+import com.geedgenetworks.api.connector.event.Event;
+import com.typesafe.config.Config;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 
-public interface ProjectionProcessor extends Processor<ProjectionConfig>{
+public class ProjectionProcessor implements Processor<ProjectionConfig> {
 
+    @Override
+    public DataStream<Event> process(StreamExecutionEnvironment env,
+                                     DataStream<Event> input, ProjectionConfig projectionConfig) {
+        SingleOutputStreamOperator<Event> resultStream =
+                input.process(new ProjectionProcessFunction(projectionConfig)).name(projectionConfig.getName());
+
+        if (projectionConfig.getParallelism() != 0) {
+            resultStream.setParallelism(projectionConfig.getParallelism());
+        }
+        return resultStream;
+    }
+
+    @Override
+    public ProjectionConfig parseConfig(String name, Config config) {
+
+        CheckResult result = CheckConfigUtil.checkAtLeastOneExists(config,
+                ProjectionConfigOptions.OUTPUT_FIELDS.key(),
+                ProjectionConfigOptions.REMOVE_FIELDS.key(),
+                ProjectionConfigOptions.FUNCTIONS.key());
+        if (!result.isSuccess()) {
+            throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format(
+                    "Processor: %s, At least one of [%s] should be specified.",
+                    name, String.join(",",
+                            ProjectionConfigOptions.OUTPUT_FIELDS.key(),
+                            ProjectionConfigOptions.REMOVE_FIELDS.key(),
+                            ProjectionConfigOptions.FUNCTIONS.key())));
+        }
+
+        ProjectionConfig projectionConfig = new JSONObject(config.root().unwrapped()).toJavaObject(ProjectionConfig.class);
+        projectionConfig.setName(name);
+
+        return projectionConfig;
+    }
 }
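For reference, this is how one of the YAML processor blocks from the job template could be pushed through the new parseConfig contract. A test-style sketch, assuming com.typesafe.config is on the classpath; the block name and field values are made up for illustration.

import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

import java.util.List;
import java.util.Map;

public class ProjectionParseConfigExample {

    public static void main(String[] args) {
        // Equivalent of a YAML block:
        //   projection_processor:
        //     type: projection
        //     output_fields: [client_ip, domain]
        Config block = ConfigFactory.parseMap(Map.of(
                "type", "projection",
                "output_fields", List.of("client_ip", "domain")));

        // parseConfig validates that at least one of output_fields /
        // remove_fields / functions is present, then binds the block to the POJO.
        ProjectionConfig parsed = new ProjectionProcessor().parseConfig("projection_processor", block);

        System.out.println(parsed.getName());           // projection_processor
        System.out.println(parsed.getOutput_fields());  // [client_ip, domain]
    }
}

Note that parseConfig now receives the processor's own config block directly, where the deleted checkConfig variants called typeSafeConfig.getConfig(name) themselves.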
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessorFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessorFactory.java
new file mode 100644
index 0000000..706eeea
--- /dev/null
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessorFactory.java
@@ -0,0 +1,30 @@
+package com.geedgenetworks.core.processor.projection;
+
+import com.geedgenetworks.api.processor.Processor;
+import com.geedgenetworks.api.factory.ProcessorFactory;
+import org.apache.flink.configuration.ConfigOption;
+
+import java.util.Set;
+
+public class ProjectionProcessorFactory implements ProcessorFactory {
+
+    @Override
+    public Processor<?> createProcessor() {
+        return new ProjectionProcessor();
+    }
+
+    @Override
+    public String type() {
+        return "projection";
+    }
+
+    @Override
+    public Set<ConfigOption<?>> requiredOptions() {
+        return Set.of();
+    }
+
+    @Override
+    public Set<ConfigOption<?>> optionalOptions() {
+        return Set.of();
+    }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessorImpl.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessorImpl.java
deleted file mode 100644
index 7b35566..0000000
--- a/groot-core/src/main/java/com/geedgenetworks/core/processor/projection/ProjectionProcessorImpl.java
+++ /dev/null
@@ -1,61 +0,0 @@
-package com.geedgenetworks.core.processor.projection;
-
-import com.alibaba.fastjson.JSONObject;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.config.CheckConfigUtil;
-import com.geedgenetworks.common.config.CheckResult;
-import com.geedgenetworks.common.config.ProjectionConfigOptions;
-import com.geedgenetworks.common.exception.CommonErrorCode;
-import com.geedgenetworks.common.exception.ConfigValidationException;
-import com.geedgenetworks.core.pojo.ProjectionConfig;
-
-import com.typesafe.config.Config;
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.streaming.api.datastream.DataStream;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import org.apache.flink.util.OutputTag;
-
-import java.util.Map;
-
-public class ProjectionProcessorImpl implements ProjectionProcessor {
-
-
-    @Override
-    public DataStream<Event> processorFunction(DataStream<Event> grootEventDataStream, ProjectionConfig projectionConfig, ExecutionConfig config) throws Exception {
-        if (projectionConfig.getParallelism() != 0) {
-            return grootEventDataStream
-                    .process(new ProjectionProcessFunction(projectionConfig))
-                    .setParallelism(projectionConfig.getParallelism())
-                    .name(projectionConfig.getName());
-        } else {
-            return grootEventDataStream
-                    .process(new ProjectionProcessFunction(projectionConfig))
-                    .name(projectionConfig.getName());
-        }
-    }
-    @Override
-    public String type() {
-        return "projection";
-    }
-
-    @Override
-    public ProjectionConfig checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig) {
-        CheckResult result = CheckConfigUtil.checkAtLeastOneExists(typeSafeConfig.getConfig(name),
-                ProjectionConfigOptions.OUTPUT_FIELDS.key(),
-                ProjectionConfigOptions.REMOVE_FIELDS.key(),
-                ProjectionConfigOptions.FUNCTIONS.key());
-        if (!result.isSuccess()) {
-            throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format(
-                    "Processor: %s, At least one of [%s] should be specified.",
-                    name, String.join(",",
-                            ProjectionConfigOptions.OUTPUT_FIELDS.key(),
-                            ProjectionConfigOptions.REMOVE_FIELDS.key(),
-                            ProjectionConfigOptions.FUNCTIONS.key())));
-        }
-
-        ProjectionConfig projectionConfig = new JSONObject(configProperties).toJavaObject(ProjectionConfig.class);
-        projectionConfig.setName(name);
-
-        return projectionConfig;
-    }
-}
diff --git a/groot-common/src/main/java/com/geedgenetworks/common/udf/RuleContext.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/RuleContext.java
index 6aa9e3d..99076f9 100644
--- a/groot-common/src/main/java/com/geedgenetworks/common/udf/RuleContext.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/RuleContext.java
@@ -1,6 +1,6 @@
-package com.geedgenetworks.common.udf;
+package com.geedgenetworks.core.processor.split;
 
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.api.connector.event.Event;
 import
com.googlecode.aviator.Expression; import lombok.Data; import org.apache.flink.util.OutputTag; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitConfig.java new file mode 100644 index 0000000..908cca4 --- /dev/null +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitConfig.java @@ -0,0 +1,13 @@ +package com.geedgenetworks.core.processor.split; + +import com.geedgenetworks.api.processor.ProcessorConfig; +import lombok.Data; +import lombok.EqualsAndHashCode; + +import java.util.List; + +@EqualsAndHashCode(callSuper = true) +@Data +public class SplitConfig extends ProcessorConfig { + private List<RuleContext> rules; +} diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/SplitConfigOptions.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitConfigOptions.java index a2acb71..51bb90c 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/SplitConfigOptions.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitConfigOptions.java @@ -1,15 +1,12 @@ -package com.geedgenetworks.common.config; +package com.geedgenetworks.core.processor.split; import com.alibaba.fastjson2.TypeReference; -import com.geedgenetworks.common.udf.RuleContext; +import com.geedgenetworks.common.config.Option; +import com.geedgenetworks.common.config.Options; + import java.util.List; public interface SplitConfigOptions { - Option<String> TYPE = Options.key("type") - .stringType() - .noDefaultValue() - .withDescription("The type of route ."); - Option<List<RuleContext>> RULES = Options.key("rules") .type(new TypeReference<List<RuleContext>>() {}) .noDefaultValue() diff --git a/groot-core/src/main/java/com/geedgenetworks/core/split/SplitFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitFunction.java index 07d4f9f..2e0fda6 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/split/SplitFunction.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitFunction.java @@ -1,9 +1,7 @@ -package com.geedgenetworks.core.split; +package com.geedgenetworks.core.processor.split; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.udf.RuleContext; -import com.geedgenetworks.core.metrics.InternalMetrics; -import com.geedgenetworks.core.pojo.SplitConfig; +import com.geedgenetworks.api.metrics.InternalMetrics; +import com.geedgenetworks.api.connector.event.Event; import com.googlecode.aviator.AviatorEvaluator; import com.googlecode.aviator.AviatorEvaluatorInstance; import com.googlecode.aviator.Expression; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitProcessor.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitProcessor.java new file mode 100644 index 0000000..e4ecb18 --- /dev/null +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitProcessor.java @@ -0,0 +1,35 @@ +package com.geedgenetworks.core.processor.split; + +import com.alibaba.fastjson.JSONObject; +import com.geedgenetworks.api.processor.Processor; +import com.geedgenetworks.api.connector.event.Event; +import com.typesafe.config.Config; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; + +import java.util.Map; + 
+public class SplitProcessor implements Processor<SplitConfig> { + + @Override + public DataStream<Event> process(StreamExecutionEnvironment env, + DataStream<Event> input, SplitConfig splitConfig) { + + SingleOutputStreamOperator<Event> resultStream = + input.process(new SplitFunction(splitConfig)).name(splitConfig.getName()); + + if (splitConfig.getParallelism() != 0) { + resultStream.setParallelism(splitConfig.getParallelism()); + } + return resultStream; + } + + @Override + public SplitConfig parseConfig(String name, Config config) { + SplitConfig splitConfig = new JSONObject(config.root().unwrapped()).toJavaObject(SplitConfig.class); + splitConfig.setName(name); + return splitConfig; + } + +} diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitProcessorFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitProcessorFactory.java new file mode 100644 index 0000000..ff85a45 --- /dev/null +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/split/SplitProcessorFactory.java @@ -0,0 +1,30 @@ +package com.geedgenetworks.core.processor.split; + +import com.geedgenetworks.api.processor.Processor; +import com.geedgenetworks.api.factory.ProcessorFactory; +import org.apache.flink.configuration.ConfigOption; + +import java.util.Set; + +public class SplitProcessorFactory implements ProcessorFactory { + + @Override + public Processor<?> createProcessor() { + return new SplitProcessor(); + } + + @Override + public String type() { + return "split"; + } + + @Override + public Set<ConfigOption<?>> requiredOptions() { + return Set.of(); + } + + @Override + public Set<ConfigOption<?>> optionalOptions() { + return Set.of(); + } +} diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableConfig.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableConfig.java new file mode 100644 index 0000000..e3c483a --- /dev/null +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableConfig.java @@ -0,0 +1,16 @@ +package com.geedgenetworks.core.processor.table; + +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.processor.ProcessorConfig; +import lombok.Data; +import lombok.EqualsAndHashCode; + +import java.util.List; + +@EqualsAndHashCode(callSuper = true) +@Data +public class TableConfig extends ProcessorConfig { + private List<String> output_fields; + private List<String> remove_fields; + private List<UDFContext> functions; +} diff --git a/groot-common/src/main/java/com/geedgenetworks/common/config/TableConfigOptions.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableConfigOptions.java index 480496d..0c46d1f 100644 --- a/groot-common/src/main/java/com/geedgenetworks/common/config/TableConfigOptions.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableConfigOptions.java @@ -1,15 +1,13 @@ -package com.geedgenetworks.common.config; +package com.geedgenetworks.core.processor.table; import com.alibaba.fastjson2.TypeReference; -import com.geedgenetworks.common.udf.UDFContext; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.common.config.Option; +import com.geedgenetworks.common.config.Options; import java.util.List; public interface TableConfigOptions { - Option<String> TYPE = Options.key("type") - .stringType() - .noDefaultValue() - .withDescription("The type of processor."); Option<List<String>> OUTPUT_FIELDS = Options.key("output_fields") .listType() diff --git 
a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessor.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessor.java index 4078997..273f6de 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessor.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessor.java @@ -1,7 +1,51 @@ package com.geedgenetworks.core.processor.table; -import com.geedgenetworks.core.pojo.TableConfig; -import com.geedgenetworks.core.processor.Processor; -public interface TableProcessor extends Processor<TableConfig> { +import com.alibaba.fastjson.JSONObject; +import com.geedgenetworks.common.config.CheckConfigUtil; +import com.geedgenetworks.common.config.CheckResult; +import com.geedgenetworks.common.exception.CommonErrorCode; +import com.geedgenetworks.common.exception.ConfigValidationException; +import com.geedgenetworks.api.processor.Processor; +import com.geedgenetworks.api.connector.event.Event; +import com.typesafe.config.Config; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; +import java.util.Map; + + +public class TableProcessor implements Processor<TableConfig> { + + @Override + public DataStream<Event> process(StreamExecutionEnvironment env, DataStream<Event> input, TableConfig tableConfig) { + + SingleOutputStreamOperator<Event> resultStream = + input.flatMap(new TableProcessorFunction(tableConfig)).name(tableConfig.getName()); + + if (tableConfig.getParallelism() != 0) { + resultStream.setParallelism(tableConfig.getParallelism()); + } + return resultStream; + } + + @Override + public TableConfig parseConfig(String name, Config config) { + CheckResult result = CheckConfigUtil.checkAtLeastOneExists(config, + TableConfigOptions.OUTPUT_FIELDS.key(), + TableConfigOptions.REMOVE_FIELDS.key(), + TableConfigOptions.FUNCTIONS.key()); + if (!result.isSuccess()) { + throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format( + "Table processor: %s, At least one of [%s] should be specified.", + name, String.join(",", + TableConfigOptions.OUTPUT_FIELDS.key(), + TableConfigOptions.REMOVE_FIELDS.key(), + TableConfigOptions.FUNCTIONS.key()))); + } + + TableConfig tableConfig = new JSONObject(config.root().unwrapped()).toJavaObject(TableConfig.class); + tableConfig.setName(name); + return tableConfig; + } } diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFactory.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFactory.java new file mode 100644 index 0000000..c9e1e81 --- /dev/null +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFactory.java @@ -0,0 +1,29 @@ +package com.geedgenetworks.core.processor.table; +import com.geedgenetworks.api.processor.Processor; +import com.geedgenetworks.api.factory.ProcessorFactory; +import org.apache.flink.configuration.ConfigOption; + +import java.util.Set; + +public class TableProcessorFactory implements ProcessorFactory { + + @Override + public Processor<?> createProcessor() { + return new TableProcessor(); + } + + @Override + public String type() { + return "table"; + } + + @Override + public Set<ConfigOption<?>> requiredOptions() { + return Set.of(); + } + + @Override + public Set<ConfigOption<?>> optionalOptions() { + return Set.of(); + } +} 
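Taken together, the factories above pin down the plugin contract this merge standardizes: each processor type ships a ProcessorFactory that names its `type` key and creates a Processor, and the Processor both parses its own config block and attaches the Flink operator. A sketch of what a third-party processor could look like under that contract; the passthrough processor, its config class, and the package name are hypothetical, not part of this merge.

package com.example.grootstream; // hypothetical plugin package

import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.api.factory.ProcessorFactory;
import com.geedgenetworks.api.processor.Processor;
import com.geedgenetworks.api.processor.ProcessorConfig;
import com.typesafe.config.Config;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.Set;

public class PassthroughProcessorFactory implements ProcessorFactory {

    @Override
    public Processor<?> createProcessor() {
        return new PassthroughProcessor();
    }

    @Override
    public String type() {
        return "passthrough"; // referenced from `type:` in the job YAML
    }

    @Override
    public Set<ConfigOption<?>> requiredOptions() {
        return Set.of();
    }

    @Override
    public Set<ConfigOption<?>> optionalOptions() {
        return Set.of();
    }

    // Config POJO: no options beyond the common name/parallelism fields.
    public static class PassthroughConfig extends ProcessorConfig {
    }

    // Forwards every event unchanged; a real plugin would attach its own operator here.
    public static class PassthroughProcessor implements Processor<PassthroughConfig> {

        @Override
        public DataStream<Event> process(StreamExecutionEnvironment env,
                                         DataStream<Event> input, PassthroughConfig config) {
            return input;
        }

        @Override
        public PassthroughConfig parseConfig(String name, Config config) {
            PassthroughConfig passthroughConfig = new PassthroughConfig();
            passthroughConfig.setName(name);
            return passthroughConfig;
        }
    }
}

To activate it, the factory's fully qualified name would go into META-INF/services/com.geedgenetworks.api.factory.ProcessorFactory, mirroring how the in-tree factories are presumably registered.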
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFunction.java index 7b6a5e2..b840739 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFunction.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorFunction.java @@ -1,16 +1,15 @@ package com.geedgenetworks.core.processor.table; import com.alibaba.fastjson.JSON; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.TableFunction; -import com.geedgenetworks.common.udf.UDFContext; import com.geedgenetworks.common.utils.ColumnUtil; -import com.geedgenetworks.core.metrics.InternalMetrics; -import com.geedgenetworks.core.pojo.TableConfig; -import com.geedgenetworks.core.processor.projection.UdfEntity; +import com.geedgenetworks.api.common.udf.TableFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.common.udf.UdfEntity; +import com.geedgenetworks.api.metrics.InternalMetrics; +import com.geedgenetworks.api.connector.event.Event; import com.google.common.collect.Lists; import com.googlecode.aviator.AviatorEvaluator; import com.googlecode.aviator.AviatorEvaluatorInstance; @@ -21,7 +20,6 @@ import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RichFlatMapFunction; import org.apache.flink.configuration.Configuration; import org.apache.flink.util.Collector; -import org.checkerframework.checker.units.qual.A; import java.util.*; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorImpl.java b/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorImpl.java deleted file mode 100644 index 84454cf..0000000 --- a/groot-core/src/main/java/com/geedgenetworks/core/processor/table/TableProcessorImpl.java +++ /dev/null @@ -1,60 +0,0 @@ -package com.geedgenetworks.core.processor.table; - -import com.alibaba.fastjson.JSONObject; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.config.CheckConfigUtil; -import com.geedgenetworks.common.config.CheckResult; -import com.geedgenetworks.common.config.TableConfigOptions; -import com.geedgenetworks.common.exception.CommonErrorCode; -import com.geedgenetworks.common.exception.ConfigValidationException; -import com.geedgenetworks.core.pojo.TableConfig; -import com.typesafe.config.Config; -import org.apache.flink.api.common.ExecutionConfig; -import org.apache.flink.streaming.api.datastream.DataStream; -import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; -import org.apache.flink.util.OutputTag; - -import java.util.Map; - - -public class TableProcessorImpl implements TableProcessor { - - @Override - public DataStream<Event> processorFunction(DataStream<Event> grootEventDataStream, TableConfig tableConfig, ExecutionConfig config) throws Exception { - if (tableConfig.getParallelism() != 0) { - return grootEventDataStream - .flatMap(new TableProcessorFunction(tableConfig)) - .setParallelism(tableConfig.getParallelism()) - .name(tableConfig.getName()); - } else { - return grootEventDataStream - .flatMap(new TableProcessorFunction(tableConfig)) - .name(tableConfig.getName()); - } - } - - 
@Override - public String type() { - return "table"; - } - - @Override - public TableConfig checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig) { - CheckResult result = CheckConfigUtil.checkAtLeastOneExists(typeSafeConfig.getConfig(name), - TableConfigOptions.OUTPUT_FIELDS.key(), - TableConfigOptions.REMOVE_FIELDS.key(), - TableConfigOptions.FUNCTIONS.key()); - if (!result.isSuccess()) { - throw new ConfigValidationException(CommonErrorCode.CONFIG_VALIDATION_FAILED, String.format( - "Table processor: %s, At least one of [%s] should be specified.", - name, String.join(",", - TableConfigOptions.OUTPUT_FIELDS.key(), - TableConfigOptions.REMOVE_FIELDS.key(), - TableConfigOptions.FUNCTIONS.key()))); - } - - TableConfig tableConfig = new JSONObject(configProperties).toJavaObject(TableConfig.class); - tableConfig.setName(name); - return tableConfig; - } -} diff --git a/groot-core/src/main/java/com/geedgenetworks/core/split/Split.java b/groot-core/src/main/java/com/geedgenetworks/core/split/Split.java deleted file mode 100644 index 4e4e387..0000000 --- a/groot-core/src/main/java/com/geedgenetworks/core/split/Split.java +++ /dev/null @@ -1,22 +0,0 @@ -package com.geedgenetworks.core.split; - -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.pojo.SplitConfig; -import com.typesafe.config.Config; -import org.apache.flink.streaming.api.datastream.DataStream; -import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator; - -import java.io.Serializable; -import java.util.Map; -import java.util.Set; - -public interface Split<T extends SplitConfig> extends Serializable { - - DataStream<Event> splitFunction( - DataStream<Event> dataStream, T splitConfig) - throws Exception; - String type(); - - T checkConfig(String name, Map<String, Object> configProperties, Config typeSafeConfig); - -} diff --git a/groot-core/src/main/java/com/geedgenetworks/core/split/SplitOperator.java b/groot-core/src/main/java/com/geedgenetworks/core/split/SplitOperator.java deleted file mode 100644 index 48ef92d..0000000 --- a/groot-core/src/main/java/com/geedgenetworks/core/split/SplitOperator.java +++ /dev/null @@ -1,41 +0,0 @@ -package com.geedgenetworks.core.split; - -import com.alibaba.fastjson.JSONObject; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.pojo.FilterConfig; -import com.geedgenetworks.core.pojo.SplitConfig; -import com.typesafe.config.Config; -import org.apache.flink.streaming.api.datastream.DataStream; - -import java.util.Map; - -public class SplitOperator implements Split<SplitConfig> { - - @Override - public DataStream<Event> splitFunction( - DataStream<Event> dataStream, SplitConfig splitConfig) - throws Exception { - if (splitConfig.getParallelism() != 0) { - return dataStream - .process(new SplitFunction(splitConfig)) - .setParallelism(splitConfig.getParallelism()) - .name(splitConfig.getName()); - } else { - return dataStream - .process(new SplitFunction(splitConfig)) - .name(splitConfig.getName()); - } - } - @Override - public String type() { - return "split"; - } - - @Override - public SplitConfig checkConfig(String name, Map<String, Object> configProperties, Config config) { - SplitConfig splitConfig = new JSONObject(configProperties).toJavaObject(SplitConfig.class); - splitConfig.setName(name); - return splitConfig; - } - -} diff --git a/groot-core/src/main/java/com/geedgenetworks/core/types/BooleanType.java b/groot-core/src/main/java/com/geedgenetworks/core/types/BooleanType.java deleted file mode 100644 
index f3de2d8..0000000 --- a/groot-core/src/main/java/com/geedgenetworks/core/types/BooleanType.java +++ /dev/null @@ -1,10 +0,0 @@ -package com.geedgenetworks.core.types; - -public class BooleanType extends DataType{ - BooleanType() { - } - @Override - public String simpleString() { - return "boolean"; - } -} diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/AsnLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/AsnLookup.java index ac282b3..808c6b2 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/AsnLookup.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/AsnLookup.java @@ -1,16 +1,18 @@ package com.geedgenetworks.core.udf; import com.alibaba.fastjson2.JSON; -import com.geedgenetworks.common.Constants; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.*; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.udf.UDFContext; import com.geedgenetworks.core.udf.knowlegdebase.handler.AsnKnowledgeBaseHandler; import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.configuration.CheckUDFContextUtil; +import com.geedgenetworks.api.configuration.UDFContextConfigOptions; +import com.geedgenetworks.api.connector.event.Event; import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RuntimeContext; import org.apache.flink.configuration.Configuration; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/CurrentUnixTimestamp.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/CurrentUnixTimestamp.java index 98b2d68..e59a7c9 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/CurrentUnixTimestamp.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/CurrentUnixTimestamp.java @@ -2,9 +2,9 @@ package com.geedgenetworks.core.udf; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.udf.UDFContext; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.connector.event.Event; import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RuntimeContext; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/DecodeBase64.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/DecodeBase64.java index bc8563a..f1ff72f 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/DecodeBase64.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/DecodeBase64.java @@ -2,9 +2,9 @@ package com.geedgenetworks.core.udf; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.udf.UDFContext; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.connector.event.Event; 
import com.geedgenetworks.utils.StringUtil; import lombok.extern.slf4j.Slf4j; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Domain.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Domain.java index 9046472..5e41135 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Domain.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Domain.java @@ -1,13 +1,13 @@ package com.geedgenetworks.core.udf; -import com.geedgenetworks.common.Event; import com.geedgenetworks.common.config.CheckResult; -import com.geedgenetworks.common.config.CheckUDFContextUtil; -import com.geedgenetworks.common.config.UDFContextConfigOptions; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.udf.UDFContext; import com.geedgenetworks.shaded.com.google.common.net.InternetDomainName; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.configuration.CheckUDFContextUtil; +import com.geedgenetworks.api.configuration.UDFContextConfigOptions; +import com.geedgenetworks.api.connector.event.Event; import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RuntimeContext; import java.util.List; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Drop.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Drop.java index c7f13c2..8b1c7a2 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Drop.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Drop.java @@ -1,13 +1,13 @@ package com.geedgenetworks.core.udf; import com.geedgenetworks.common.config.CheckResult; -import com.geedgenetworks.common.config.CheckUDFContextUtil; -import com.geedgenetworks.common.config.UDFContextConfigOptions; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.common.udf.UDFContext; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.configuration.CheckUDFContextUtil; +import com.geedgenetworks.api.configuration.UDFContextConfigOptions; +import com.geedgenetworks.api.connector.event.Event; import org.apache.flink.api.common.functions.RuntimeContext; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/EncodeBase64.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/EncodeBase64.java index a950252..7f1fe94 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/EncodeBase64.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/EncodeBase64.java @@ -1,17 +1,15 @@ package com.geedgenetworks.core.udf; -import com.geedgenetworks.common.Event; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.udf.UDFContext; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.connector.event.Event; import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RuntimeContext; import 
java.nio.charset.StandardCharsets; -import java.text.SimpleDateFormat; import java.util.Base64; -import java.util.List; @Slf4j public class EncodeBase64 implements ScalarFunction { diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Encrypt.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Encrypt.java index 2fa0804..80a2460 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Encrypt.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Encrypt.java @@ -4,19 +4,24 @@ import cn.hutool.core.util.URLUtil; import cn.hutool.json.JSONObject; import cn.hutool.json.JSONUtil; import com.alibaba.fastjson2.JSON; -import com.geedgenetworks.common.Constants; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.common.config.Constants; import com.geedgenetworks.common.config.*; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.udf.UDFContext; +import com.geedgenetworks.common.utils.HttpClientPoolUtil; import com.geedgenetworks.core.pojo.DataEncryptionKey; import com.geedgenetworks.core.udf.encrypt.Crypto; import com.geedgenetworks.core.utils.*; import com.geedgenetworks.shaded.org.apache.http.HttpHeaders; import com.geedgenetworks.shaded.org.apache.http.HttpStatus; import com.geedgenetworks.shaded.org.apache.http.message.BasicHeader; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.configuration.CheckUDFContextUtil; +import com.geedgenetworks.api.configuration.UDFContextConfigOptions; +import com.geedgenetworks.api.configuration.util.LoadIntervalDataOptions; +import com.geedgenetworks.api.configuration.util.LoadIntervalDataUtil; +import com.geedgenetworks.api.connector.event.Event; import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RuntimeContext; import org.apache.flink.configuration.Configuration; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Eval.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Eval.java index b04dc97..b0d2f73 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Eval.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Eval.java @@ -2,11 +2,11 @@ package com.geedgenetworks.core.udf; import com.geedgenetworks.common.exception.CommonErrorCode; import com.geedgenetworks.common.exception.GrootStreamRuntimeException; -import com.geedgenetworks.common.udf.ScalarFunction; -import com.geedgenetworks.common.udf.UDFContext; import com.geedgenetworks.core.expressions.Calc; import com.geedgenetworks.core.expressions.EvalExecutor; -import com.geedgenetworks.common.Event; +import com.geedgenetworks.api.common.udf.ScalarFunction; +import com.geedgenetworks.api.common.udf.UDFContext; +import com.geedgenetworks.api.connector.event.Event; import org.apache.flink.api.common.functions.RuntimeContext; import java.util.List; diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Flatten.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Flatten.java index d5d5761..38e0e98 100644 --- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Flatten.java +++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Flatten.java @@ -2,11 +2,11 @@ package com.geedgenetworks.core.udf; import com.alibaba.fastjson2.JSONArray; import com.alibaba.fastjson2.JSONObject; -import 
com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/FromUnixTimestamp.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/FromUnixTimestamp.java
index d8803c3..e31e44f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/FromUnixTimestamp.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/FromUnixTimestamp.java
@@ -2,9 +2,9 @@ package com.geedgenetworks.core.udf;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/GenerateStringArray.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/GenerateStringArray.java
index 366f204..7db582b 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/GenerateStringArray.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/GenerateStringArray.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.util.*;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/GeoIpLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/GeoIpLookup.java
index e800e5d..9c26527 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/GeoIpLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/GeoIpLookup.java
@@ -1,17 +1,19 @@
 package com.geedgenetworks.core.udf;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.*;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.GeoIpKnowledgeBaseHandler;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.alibaba.fastjson2.JSON;
 import com.geedgenetworks.model.LocationResponse;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.configuration.CheckUDFContextUtil;
+import com.geedgenetworks.api.configuration.UDFContextConfigOptions;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.apache.flink.configuration.Configuration;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Hmac.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Hmac.java
index a18d361..970d5b4 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Hmac.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Hmac.java
@@ -2,14 +2,14 @@ package com.geedgenetworks.core.udf;
 import cn.hutool.crypto.digest.HMac;
 import cn.hutool.crypto.digest.HmacAlgorithm;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.config.CheckResult;
-import com.geedgenetworks.common.config.CheckUDFContextUtil;
-import com.geedgenetworks.common.config.UDFContextConfigOptions;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.configuration.CheckUDFContextUtil;
+import com.geedgenetworks.api.configuration.UDFContextConfigOptions;
+import com.geedgenetworks.api.connector.event.Event;
 import com.geedgenetworks.utils.StringUtil;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/JsonExtract.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/JsonExtract.java
index 57fe847..a64b3d2 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/JsonExtract.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/JsonExtract.java
@@ -2,10 +2,10 @@ package com.geedgenetworks.core.udf;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.common.utils.JsonPathUtil;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 public class JsonExtract implements ScalarFunction {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/PathCombine.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/PathCombine.java
index 0141a46..18aa591 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/PathCombine.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/PathCombine.java
@@ -2,13 +2,13 @@ package com.geedgenetworks.core.udf;
 import com.alibaba.fastjson.JSON;
 import com.alibaba.fastjson2.JSONArray;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.apache.flink.configuration.Configuration;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/Rename.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/Rename.java
index 6a77c3a..662129c 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/Rename.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/Rename.java
@@ -3,9 +3,9 @@ package com.geedgenetworks.core.udf;
 import com.alibaba.fastjson2.JSONArray;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import com.googlecode.aviator.AviatorEvaluator;
 import com.googlecode.aviator.AviatorEvaluatorInstance;
 import com.googlecode.aviator.Expression;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/SnowflakeId.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/SnowflakeId.java
index b206f3b..520ba77 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/SnowflakeId.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/SnowflakeId.java
@@ -1,11 +1,11 @@
 package com.geedgenetworks.core.udf;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.utils.SnowflakeIdUtils;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.io.Serializable;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/StringJoiner.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/StringJoiner.java
index 7e4ab68..1adb68d 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/StringJoiner.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/StringJoiner.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.util.List;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/UnixTimestampConverter.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/UnixTimestampConverter.java
index 62c4dfa..04e0cfe 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/UnixTimestampConverter.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/UnixTimestampConverter.java
@@ -2,9 +2,9 @@ package com.geedgenetworks.core.udf;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeScalarFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeScalarFunction.java
index e54b612..be2f15d 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeScalarFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeScalarFunction.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.config.KnowledgeBaseConfig;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.metrics.Counter;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeWithRuleScalarFunction.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeWithRuleScalarFunction.java
index 3112ec7..fd32559 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeWithRuleScalarFunction.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AbstractKnowledgeWithRuleScalarFunction.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.config.KnowledgeBaseConfig;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.metrics.Counter;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AnonymityLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AnonymityLookup.java
index 6be1c90..eb6d66b 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AnonymityLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AnonymityLookup.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.IocDarkwebKnowledgeBaseHandler;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.RuleKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AppCategoryLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AppCategoryLookup.java
index 0052d82..89b3a6a 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AppCategoryLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/AppCategoryLookup.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.AppCategoryKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/ArrayElementsPrepend.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/ArrayElementsPrepend.java
index de64073..dcc2da2 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/ArrayElementsPrepend.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/ArrayElementsPrepend.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.util.ArrayList;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/BaseStationLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/BaseStationLookup.java
index 191edd5..09e141d 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/BaseStationLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/BaseStationLookup.java
@@ -1,11 +1,11 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.BaseStationKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.util.List;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookup.java
index 7136b71..dc9d27f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookup.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.DnsServerInfoKnowledgeBaseHandler;
+import com.geedgenetworks.api.connector.event.Event;
 import java.util.List;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FieldsMerge.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FieldsMerge.java
index f4338fc..0f705b7 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FieldsMerge.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FieldsMerge.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.util.ArrayList;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookup.java
index 0e52b8c..2e88d99 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookup.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.FqdnCategoryKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookup.java
index 0cc18a0..e06714e 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookup.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.FqdnWhoisKnowledgeBaseHandler;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * @author gujinkai
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/H3CellLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/H3CellLookup.java
index 7389f4a..c6f4e62 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/H3CellLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/H3CellLookup.java
@@ -1,11 +1,11 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import com.uber.h3core.H3Core;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import java.io.IOException;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IcpLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IcpLookup.java
index 5c1fc97..8eaadc6 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IcpLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IcpLookup.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.FqdnIcpKnowledgeBaseHandler;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * @author gujinkai
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IdcRenterLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IdcRenterLookup.java
index f7c5398..7269845 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IdcRenterLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IdcRenterLookup.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.IdcRenterKnowledgeBaseHandler;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * @author gujinkai
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookup.java
index 857ae74..d6f95ee 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookup.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.IntelligenceIndicatorKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IocLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IocLookup.java
index 4003297..9cc88cd 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IocLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IocLookup.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.IocMalwareKnowledgeBaseHandler;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.RuleKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IpZoneLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IpZoneLookup.java
index b9bd139..8803197 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IpZoneLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/IpZoneLookup.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.InternalIpKnowledgeBaseHandler;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * @author gujinkai
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/L7ProtocolAndAppExtract.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/L7ProtocolAndAppExtract.java
index 7983015..fcbb53a 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/L7ProtocolAndAppExtract.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/L7ProtocolAndAppExtract.java
@@ -1,8 +1,9 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookup.java
index e653820..77de477 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookup.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.LinkDirectionKnowledgeBaseHandler;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * @author gujinkai
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookup.java
index e5a3f7f..d926996 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookup.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.*;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.apache.flink.metrics.Counter;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/VpnLookup.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/VpnLookup.java
index 4cdb399..50e6586 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/VpnLookup.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/cn/VpnLookup.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.cn;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.DomainVpnKnowledgeBaseHandler;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.IpVpnKnowledgeBaseHandler;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/knowlegdebase/handler/AbstractKnowledgeBaseHandler.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/knowlegdebase/handler/AbstractKnowledgeBaseHandler.java
index 113e164..8a37f8d 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/knowlegdebase/handler/AbstractKnowledgeBaseHandler.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/knowlegdebase/handler/AbstractKnowledgeBaseHandler.java
@@ -6,9 +6,9 @@ import com.alibaba.fastjson2.JSONArray;
 import com.geedgenetworks.common.config.KnowledgeBaseConfig;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
+import com.geedgenetworks.common.utils.HttpClientPoolUtil;
 import com.geedgenetworks.core.utils.HdfsUtils;
 import com.geedgenetworks.core.pojo.KnowLedgeBaseFileMeta;
-import com.geedgenetworks.core.utils.HttpClientPoolUtil;
 import lombok.extern.slf4j.Slf4j;
 import java.net.URI;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectList.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectList.java
index 8219fbd..3624527 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectList.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectList.java
@@ -1,27 +1,11 @@
-/**
- * Copyright 2017 Hortonworks.
- * <p>
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- * <p>
- * http://www.apache.org/licenses/LICENSE-2.0
- * <p>
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- **/
 package com.geedgenetworks.core.udf.udaf;
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import java.util.ArrayList;
 import java.util.List;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectSet.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectSet.java
index c23f0ca..b16ae7e 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectSet.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/CollectSet.java
@@ -1,12 +1,13 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
+
 import java.util.HashSet;
 import java.util.Set;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/FirstValue.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/FirstValue.java
index f68448f..1f8698f 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/FirstValue.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/FirstValue.java
@@ -16,12 +16,12 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * Collects elements within a group and returns the list of aggregated objects
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogram.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogram.java
index 368e8c1..ee31f58 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogram.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogram.java
@@ -1,42 +1,42 @@
-package com.geedgenetworks.core.udf.udaf.HdrHistogram;
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.sketch.util.StringUtils;
-import org.HdrHistogram.Histogramer;
-
-import java.util.Map;
-
-public class HdrHistogram extends HdrHistogramBaseAggregate {
- boolean outputBase64;
-
- @Override
- public void open(UDFContext c) {
- super.open(c);
- Map<String, Object> params = c.getParameters();
- outputBase64 = "base64".equalsIgnoreCase(params.getOrDefault("output_format", "base64").toString());
- }
-
- @Override
- public Accumulator getResult(Accumulator acc) {
- Object agg = acc.getMetricsFields().get(outputField);
- if (agg == null) {
- return acc;
- }
-
- byte[] bytes = ((Histogramer) agg).toBytes();
- if (outputBase64) {
- acc.getMetricsFields().put(outputField, StringUtils.encodeBase64String(bytes));
- } else {
- acc.getMetricsFields().put(outputField, bytes);
- }
-
- return acc;
- }
-
- @Override
- public String functionName() {
- return "HDR_HISTOGRAM";
- }
-
-}
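The removed HDR_HISTOGRAM UDAF above, like its rewritten replacement below, either Base64-encodes the serialized histogram or emits raw bytes depending on the `output_format` parameter. As a self-contained sketch of that toggle, using only the JDK's Base64 codec (the byte payload here merely stands in for the project's `Histogramer.toBytes()` output):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;

public class OutputFormatSketch {
    public static void main(String[] args) {
        // Stand-in payload; in the UDAF this comes from Histogramer.toBytes().
        byte[] bytes = "serialized-histogram".getBytes(StandardCharsets.UTF_8);

        // Mirrors the UDAF's parameter handling: base64 is the default output format.
        Map<String, Object> params = Map.of("output_format", "base64");
        boolean outputBase64 =
                "base64".equalsIgnoreCase(params.getOrDefault("output_format", "base64").toString());

        // Emit a Base64 string or the raw bytes, as the getResult() branch does.
        Object result = outputBase64 ? Base64.getEncoder().encodeToString(bytes) : bytes;
        System.out.println(result);
    }
}
```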
+package com.geedgenetworks.core.udf.udaf.HdrHistogram;
+
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import org.HdrHistogram.Histogramer;
+
+import java.util.Map;
+
+public class HdrHistogram extends HdrHistogramBaseAggregate {
+ boolean outputBase64;
+
+ @Override
+ public void open(UDFContext c) {
+ super.open(c);
+ Map<String, Object> params = c.getParameters();
+ outputBase64 = "base64".equalsIgnoreCase(params.getOrDefault("output_format", "base64").toString());
+ }
+
+ @Override
+ public Accumulator getResult(Accumulator acc) {
+ Object agg = acc.getMetricsFields().get(outputField);
+ if (agg == null) {
+ return acc;
+ }
+
+ byte[] bytes = ((Histogramer) agg).toBytes();
+ if (outputBase64) {
+ acc.getMetricsFields().put(outputField, StringUtils.encodeBase64String(bytes));
+ } else {
+ acc.getMetricsFields().put(outputField, bytes);
+ }
+
+ return acc;
+ }
+
+ @Override
+ public String functionName() {
+ return "HDR_HISTOGRAM";
+ }
+
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramBaseAggregate.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramBaseAggregate.java
index d8656e0..1e26896 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramBaseAggregate.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramBaseAggregate.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.udaf.HdrHistogram;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.HdrHistogram.ArrayHistogram;
 import org.HdrHistogram.DirectMapHistogram;
 import org.HdrHistogram.Histogramer;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantile.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantile.java
index b9f7d5b..724e21b 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantile.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantile.java
@@ -1,36 +1,36 @@
-package com.geedgenetworks.core.udf.udaf.HdrHistogram;
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.udf.UDFContext;
-import org.HdrHistogram.Histogramer;
-
-import java.util.Map;
-
-public class HdrHistogramQuantile extends HdrHistogramBaseAggregate {
- Double probability;
-
- @Override
- public void open(UDFContext c) {
- super.open(c);
- Map<String, Object> params = c.getParameters();
- probability = Double.parseDouble(params.getOrDefault("probability", "0.5").toString());
- }
-
- @Override
- public Accumulator getResult(Accumulator acc) {
- Object agg = acc.getMetricsFields().get(outputField);
- if (agg == null) {
- return acc;
- }
-
- long percentile = ((Histogramer) agg).getValueAtPercentile(probability * 100);
- acc.getMetricsFields().put(outputField, percentile);
- return acc;
- }
-
- @Override
- public String functionName() {
- return "APPROX_QUANTILE_HDR";
- }
-
-}
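APPROX_QUANTILE_HDR, shown above and rewritten below, turns a `probability` parameter in [0, 1] into a percentile lookup in [0, 100]. A runnable sketch of the same math against the stock org.HdrHistogram `Histogram` class (the project's own `Histogramer` wrapper is assumed to behave equivalently):

```java
import org.HdrHistogram.Histogram;

public class QuantileSketch {
    public static void main(String[] args) {
        // Auto-resizing histogram tracking values to 3 significant digits.
        Histogram histogram = new Histogram(3);
        for (long latencyMicros = 1; latencyMicros <= 10_000; latencyMicros++) {
            histogram.recordValue(latencyMicros);
        }

        // A probability in [0, 1] becomes a percentile in [0, 100].
        double probability = 0.5;
        long median = histogram.getValueAtPercentile(probability * 100);
        System.out.println("approximate p50: " + median);
    }
}
```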
+package com.geedgenetworks.core.udf.udaf.HdrHistogram;
+
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import org.HdrHistogram.Histogramer;
+
+import java.util.Map;
+
+public class HdrHistogramQuantile extends HdrHistogramBaseAggregate {
+ Double probability;
+
+ @Override
+ public void open(UDFContext c) {
+ super.open(c);
+ Map<String, Object> params = c.getParameters();
+ probability = Double.parseDouble(params.getOrDefault("probability", "0.5").toString());
+ }
+
+ @Override
+ public Accumulator getResult(Accumulator acc) {
+ Object agg = acc.getMetricsFields().get(outputField);
+ if (agg == null) {
+ return acc;
+ }
+
+ long percentile = ((Histogramer) agg).getValueAtPercentile(probability * 100);
+ acc.getMetricsFields().put(outputField, percentile);
+ return acc;
+ }
+
+ @Override
+ public String functionName() {
+ return "APPROX_QUANTILE_HDR";
+ }
+
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantiles.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantiles.java
index ccfffd3..e4e9b09 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantiles.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantiles.java
@@ -1,51 +1,51 @@
-package com.geedgenetworks.core.udf.udaf.HdrHistogram;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.udf.UDFContext;
-import org.HdrHistogram.Histogramer;
-
-import java.util.ArrayList;
-import java.util.List;
-import java.util.Map;
-
-public class HdrHistogramQuantiles extends HdrHistogramBaseAggregate {
- double[] probabilities;
-
- @Override
- public void open(UDFContext c) {
- super.open(c);
- Map<String, Object> params = c.getParameters();
- Object ps = params.get("probabilities");
- if(ps == null){
- throw new IllegalArgumentException("probabilities param is requested");
- }
- List<Double> floats = JSON.parseArray(ps instanceof String ? ps.toString(): JSON.toJSONString(ps), Double.class);
- probabilities = new double[floats.size()];
- for (int i = 0; i < floats.size(); i++) {
- probabilities[i] = floats.get(i);
- }
- }
-
- @Override
- public Accumulator getResult(Accumulator acc) {
- Object agg = acc.getMetricsFields().get(outputField);
- if (agg == null) {
- return acc;
- }
-
- Histogramer his = ((Histogramer) agg);
- final List<Long> counts = new ArrayList<>(probabilities.length);
- for (int i = 0; i < probabilities.length; i++) {
- counts.add(his.getValueAtPercentile(probabilities[i] * 100));
- }
- acc.getMetricsFields().put(outputField, counts);
- return acc;
- }
-
- @Override
- public String functionName() {
- return "APPROX_QUANTILES_HDR";
- }
-
-}
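In APPROX_QUANTILES_HDR (old version above, rewritten version below), the `probabilities` parameter may arrive either as a JSON string or as an already-parsed list; the UDAF normalizes both shapes through fastjson2 before querying the histogram. A minimal sketch of just that normalization step:

```java
import com.alibaba.fastjson2.JSON;

import java.util.List;

public class ProbabilitiesParamSketch {
    public static void main(String[] args) {
        // A JSON string here; a List<Object> would be re-serialized and reparsed,
        // exactly as in HdrHistogramQuantiles.open().
        Object ps = "[0.5, 0.9, 0.99]";
        List<Double> values = JSON.parseArray(
                ps instanceof String ? ps.toString() : JSON.toJSONString(ps), Double.class);

        double[] probabilities = new double[values.size()];
        for (int i = 0; i < values.size(); i++) {
            probabilities[i] = values.get(i);
        }
        System.out.println(probabilities.length + " quantiles requested");
    }
}
```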
+package com.geedgenetworks.core.udf.udaf.HdrHistogram;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import org.HdrHistogram.Histogramer;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+public class HdrHistogramQuantiles extends HdrHistogramBaseAggregate {
+ double[] probabilities;
+
+ @Override
+ public void open(UDFContext c) {
+ super.open(c);
+ Map<String, Object> params = c.getParameters();
+ Object ps = params.get("probabilities");
+ if(ps == null){
+ throw new IllegalArgumentException("probabilities param is requested");
+ }
+ List<Double> floats = JSON.parseArray(ps instanceof String ? ps.toString(): JSON.toJSONString(ps), Double.class);
+ probabilities = new double[floats.size()];
+ for (int i = 0; i < floats.size(); i++) {
+ probabilities[i] = floats.get(i);
+ }
+ }
+
+ @Override
+ public Accumulator getResult(Accumulator acc) {
+ Object agg = acc.getMetricsFields().get(outputField);
+ if (agg == null) {
+ return acc;
+ }
+
+ Histogramer his = ((Histogramer) agg);
+ final List<Long> counts = new ArrayList<>(probabilities.length);
+ for (int i = 0; i < probabilities.length; i++) {
+ counts.add(his.getValueAtPercentile(probabilities[i] * 100));
+ }
+ acc.getMetricsFields().put(outputField, counts);
+ return acc;
+ }
+
+ @Override
+ public String functionName() {
+ return "APPROX_QUANTILES_HDR";
+ }
+
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LastValue.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LastValue.java
index f319e8d..a3b8382 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LastValue.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LastValue.java
@@ -16,12 +16,12 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 /**
  * Collects elements within a group and returns the list of aggregated objects
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LongCount.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LongCount.java
index 418eb9c..9cffe70 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LongCount.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/LongCount.java
@@ -1,11 +1,11 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 public class LongCount implements AggregateFunction {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Max.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Max.java
index 226df0a..decb770 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Max.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Max.java
@@ -1,15 +1,13 @@
 package com.geedgenetworks.core.udf.udaf;
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.connector.event.Event;
 import java.time.LocalDateTime;
-import java.time.format.DateTimeFormatter;
 import java.util.Map;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Mean.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Mean.java
index 88f693f..c61a908 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Mean.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Mean.java
@@ -1,12 +1,12 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.pojo.OnlineStatistics;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import java.text.DecimalFormat;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Min.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Min.java
index 6fd7046..acf4ba8 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Min.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/Min.java
@@ -1,15 +1,14 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import java.time.LocalDateTime;
-import java.time.format.DateTimeFormatter;
 import java.util.Map;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/NumberSum.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/NumberSum.java
index ab8f744..80adf67 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/NumberSum.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/NumberSum.java
@@ -1,12 +1,11 @@
 package com.geedgenetworks.core.udf.udaf;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
-
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 public class NumberSum implements AggregateFunction {
     private String lookupField;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/Hlld.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/Hlld.java
index e373a7a..4c1a340 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/Hlld.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/Hlld.java
@@ -1,40 +1,40 @@
-package com.geedgenetworks.core.udf.udaf.hlld;
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.sketch.hlld.Hll;
-import com.geedgenetworks.sketch.util.StringUtils;
-
-import java.util.Map;
-
-public class Hlld extends HlldBaseAggregate {
- boolean outputBase64;
-
- @Override
- public void open(UDFContext c) {
- super.open(c);
- Map<String, Object> params = c.getParameters();
- outputBase64 = "base64".equalsIgnoreCase(params.getOrDefault("output_format", "base64").toString());
- }
-
- @Override
- public Accumulator getResult(Accumulator acc) {
- Hll hll = getResultHll(acc);
- if (hll == null) {
- return acc;
- }
-
- if (outputBase64) {
- acc.getMetricsFields().put(outputField, StringUtils.encodeBase64String(hll.toBytes()));
- } else {
- acc.getMetricsFields().put(outputField, hll.toBytes());
- }
-
- return acc;
- }
-
- @Override
- public String functionName() {
- return "HLLD";
- }
-}
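The HLLD aggregate above maintains a HyperLogLog sketch per aggregation group and exports it as Base64 or raw bytes. The in-house `com.geedgenetworks.sketch.hlld.Hll` type is not public API, so Apache DataSketches stands in for it in this illustrative sketch of the same idea:

```java
import org.apache.datasketches.hll.HllSketch;

public class CardinalitySketch {
    public static void main(String[] args) {
        // lgK = 12 gives 4096 buckets; larger values trade memory for accuracy.
        HllSketch sketch = new HllSketch(12);
        for (int i = 0; i < 100_000; i++) {
            sketch.update("user-" + (i % 25_000)); // 25,000 distinct keys
        }

        // Like HLLD, the sketch can be reduced to an estimate or exported as bytes.
        byte[] bytes = sketch.toCompactByteArray();
        System.out.printf("estimate=%.0f, serialized=%d bytes%n",
                sketch.getEstimate(), bytes.length);
    }
}
```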
+package com.geedgenetworks.core.udf.udaf.hlld;
+
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.sketch.hlld.Hll;
+import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.UDFContext;
+
+import java.util.Map;
+
+public class Hlld extends HlldBaseAggregate {
+ boolean outputBase64;
+
+ @Override
+ public void open(UDFContext c) {
+ super.open(c);
+ Map<String, Object> params = c.getParameters();
+ outputBase64 = "base64".equalsIgnoreCase(params.getOrDefault("output_format", "base64").toString());
+ }
+
+ @Override
+ public Accumulator getResult(Accumulator acc) {
+ Hll hll = getResultHll(acc);
+ if (hll == null) {
+ return acc;
+ }
+
+ if (outputBase64) {
+ acc.getMetricsFields().put(outputField, StringUtils.encodeBase64String(hll.toBytes()));
+ } else {
+ acc.getMetricsFields().put(outputField, hll.toBytes());
+ }
+
+ return acc;
+ }
+
+ @Override
+ public String functionName() {
+ return "HLLD";
+ }
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinct.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinct.java
index ec003f8..3220c4c 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinct.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinct.java
@@ -1,25 +1,25 @@
-package com.geedgenetworks.core.udf.udaf.hlld;
-
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.sketch.hlld.Hll;
-
-public class HlldApproxCountDistinct extends HlldBaseAggregate {
-
- @Override
- public Accumulator getResult(Accumulator acc) {
- Hll hll = getResultHll(acc);
- if (hll == null) {
- return acc;
- }
-
- acc.getMetricsFields().put(outputField, (long)hll.size());
-
- return acc;
- }
-
- @Override
- public String functionName() {
- return "APPROX_COUNT_DISTINCT_HLLD";
- }
-
-}
+package com.geedgenetworks.core.udf.udaf.hlld;
+
+import com.geedgenetworks.common.config.Accumulator;
+import com.geedgenetworks.sketch.hlld.Hll;
+
+public class HlldApproxCountDistinct extends HlldBaseAggregate {
+
+ @Override
+ public Accumulator getResult(Accumulator acc) {
+ Hll hll = getResultHll(acc);
+ if (hll == null) {
+ return acc;
+ }
+
+ acc.getMetricsFields().put(outputField, (long)hll.size());
+
+ return acc;
+ }
+
+ @Override
+ public String functionName() {
+ return "APPROX_COUNT_DISTINCT_HLLD";
+ }
+
+}
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldBaseAggregate.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldBaseAggregate.java
index c113c4a..9d46ed9 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldBaseAggregate.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udaf/hlld/HlldBaseAggregate.java
@@ -1,12 +1,12 @@
 package com.geedgenetworks.core.udf.udaf.hlld;
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.hlld.Hll;
 import com.geedgenetworks.sketch.hlld.HllUnion;
 import com.geedgenetworks.sketch.hlld.HllUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.commons.collections.CollectionUtils;
 import java.util.Map;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/JsonUnroll.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/JsonUnroll.java
index de89e2c..e50dd12 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/JsonUnroll.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/JsonUnroll.java
@@ -3,16 +3,12 @@ package com.geedgenetworks.core.udf.udtf;
 import com.alibaba.fastjson.JSONArray;
 import com.alibaba.fastjson2.JSON;
 import com.alibaba.fastjson2.JSONObject;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.TableFunction;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.common.utils.JsonPathUtil;
-import com.googlecode.aviator.AviatorEvaluator;
-import com.googlecode.aviator.AviatorEvaluatorInstance;
-import com.googlecode.aviator.Expression;
-import com.googlecode.aviator.Options;
+import com.geedgenetworks.api.common.udf.TableFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
@@ -99,7 +95,7 @@ public class JsonUnroll implements TableFunction {
 return Collections.singletonList(event);
 }
- private List<Event> parseList(Object object,Event event) {
+ private List<Event> parseList(Object object, Event event) {
 List list = (List) object;
 List<Event> eventList = new ArrayList<>();
 for (Object obj : list) {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/PathUnroll.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/PathUnroll.java
index e5732e0..dcc8bfb 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/PathUnroll.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/PathUnroll.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.udtf;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.TableFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.TableFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.commons.collections.CollectionUtils;
 import org.apache.commons.lang3.StringUtils;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/Unroll.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/Unroll.java
index ff4a9ae..61bfff1 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/Unroll.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/udtf/Unroll.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.udtf;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.TableFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.TableFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUID.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUID.java
index 4e9a031..d5cd7ed 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUID.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUID.java
@@ -2,13 +2,14 @@ package com.geedgenetworks.core.udf.uuid;
 import com.fasterxml.uuid.Generators;
 import com.fasterxml.uuid.impl.RandomBasedGenerator;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+
 @Slf4j
 public class UUID implements ScalarFunction {
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv5.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv5.java
index 3a433b8..6f52928 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv5.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv5.java
@@ -2,11 +2,11 @@ package com.geedgenetworks.core.udf.uuid;
 import com.fasterxml.uuid.Generators;
 import com.fasterxml.uuid.impl.NameBasedGenerator;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv7.java b/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv7.java
index 60c388f..af03755 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv7.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/udf/uuid/UUIDv7.java
@@ -2,11 +2,11 @@ package com.geedgenetworks.core.udf.uuid;
 import com.fasterxml.uuid.Generators;
 import com.fasterxml.uuid.impl.TimeBasedEpochGenerator;
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.CommonErrorCode;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.ScalarFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.ScalarFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import lombok.extern.slf4j.Slf4j;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/main/java/com/geedgenetworks/core/utils/SchedulerUtils.java b/groot-core/src/main/java/com/geedgenetworks/core/utils/SchedulerUtils.java
index cba11ee..cc5fb67 100644
--- a/groot-core/src/main/java/com/geedgenetworks/core/utils/SchedulerUtils.java
+++ b/groot-core/src/main/java/com/geedgenetworks/core/utils/SchedulerUtils.java
@@ -2,6 +2,7 @@ package com.geedgenetworks.core.utils;
 import org.quartz.Scheduler;
 import org.quartz.SchedulerException;
 import org.quartz.impl.StdSchedulerFactory;
+
 public class SchedulerUtils {
 private static Scheduler scheduler;
diff --git a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory b/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
new file mode 100644
index 0000000..cbb5266
--- /dev/null
+++ b/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
@@ -0,0 +1,7 @@
+com.geedgenetworks.core.connector.inline.InlineTableFactory
+com.geedgenetworks.core.connector.print.PrintTableFactory
+com.geedgenetworks.core.processor.filter.AviatorFilterProcessorFactory
+com.geedgenetworks.core.processor.split.SplitProcessorFactory
+com.geedgenetworks.core.processor.projection.ProjectionProcessorFactory
+com.geedgenetworks.core.processor.table.TableProcessorFactory
+com.geedgenetworks.core.processor.aggregate.AggregateProcessorFactory
\ No newline at end of file
diff --git a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory
deleted file mode 100644
index 6d6a1bb..0000000
--- a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory
+++ /dev/null
@@ -1,2 +0,0 @@
-com.geedgenetworks.core.connector.inline.InlineTableFactory
-com.geedgenetworks.core.connector.print.PrintTableFactory
diff --git a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.filter.Filter b/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.filter.Filter
deleted file mode 100644
index 2268533..0000000
--- a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.filter.Filter
+++ /dev/null
@@ -1 +0,0 @@
-com.geedgenetworks.core.filter.AviatorFilter
\ No newline at end of file
diff --git a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.processor.Processor b/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.processor.Processor
deleted file mode 100644
index 1f32ffa..0000000
--- a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.processor.Processor
+++ /dev/null
@@ -1,3 +0,0 @@
-com.geedgenetworks.core.processor.aggregate.AggregateProcessorImpl
-com.geedgenetworks.core.processor.projection.ProjectionProcessorImpl
-com.geedgenetworks.core.processor.table.TableProcessorImpl
\ No newline at end of file
diff --git a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.split.Split b/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.split.Split
deleted file mode 100644
index 500c367..0000000
--- a/groot-core/src/main/resources/META-INF/services/com.geedgenetworks.core.split.Split
+++ /dev/null
@@ -1 +0,0 @@
-com.geedgenetworks.core.split.SplitOperator
\ No newline at end of file
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/SchemaParserTest.java b/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/SchemaParserTest.java
index fbeaed7..4d42f85 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/SchemaParserTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/SchemaParserTest.java
@@ -1,22 +1,21 @@
-package com.geedgenetworks.core.connector.schema;
-
-import com.geedgenetworks.core.types.StructType;
-import org.apache.commons.io.FileUtils;
-
-import java.io.File;
-import java.nio.charset.StandardCharsets;
-
-import static org.junit.jupiter.api.Assertions.*;
-
-class SchemaParserTest {
-
- public static void main(String[] args) throws Exception{
- String str = FileUtils.readFileToString(new File("D:\\WorkSpace\\groot-stream\\session_record_schema.json"), StandardCharsets.UTF_8);
- SchemaParser.Parser parser = SchemaParser.PARSER_AVRO;
-
- StructType structType = parser.parser(str);
- System.out.println(structType.treeString());
-
- }
-
+package com.geedgenetworks.core.connector.schema;
+
+import com.geedgenetworks.api.connector.schema.SchemaParser;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.commons.io.FileUtils;
+
+import java.io.File;
+import java.nio.charset.StandardCharsets;
+
+class SchemaParserTest {
+
+    public static void main(String[] args) throws Exception{
+        String str = FileUtils.readFileToString(new File("D:\\WorkSpace\\groot-stream\\session_record_schema.json"), StandardCharsets.UTF_8);
+        SchemaParser.Parser parser = SchemaParser.PARSER_AVRO;
+
+        StructType structType = parser.parser(str);
+        System.out.println(structType.treeString());
+
+    }
+
 }
\ No newline at end of file
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/utils/DynamicSchemaManagerTest.java b/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/utils/DynamicSchemaManagerTest.java
index b5b672e..252c2e1 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/utils/DynamicSchemaManagerTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/connector/schema/utils/DynamicSchemaManagerTest.java
@@ -1,218 +1,218 @@
-package com.geedgenetworks.core.connector.schema.utils;
-
-import com.geedgenetworks.core.connector.schema.DynamicSchema;
-import com.geedgenetworks.core.connector.schema.HttpDynamicSchema;
-import com.geedgenetworks.core.connector.schema.SchemaChangeAware;
-import com.geedgenetworks.core.connector.schema.SchemaParser;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.apache.flink.util.TimeUtils;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.time.Duration;
-import java.util.ArrayList;
-import java.util.List;
-import java.util.concurrent.ThreadLocalRandom;
-
-public class DynamicSchemaManagerTest {
- static final Logger LOG = LoggerFactory.getLogger(DynamicSchemaManagerTest.class.getSimpleName());
-
- public static void main(String[] args) throws Exception{
- //testOneThread();
- //testMultiThread();
- testMultiThreadForHttpDynamicSchema();
- }
-
- public static void testOneThread() throws Exception{
- RandomDynamicSchema schema1 = new RandomDynamicSchema( 1000 * 5, "Schema1", 0.25);
- RandomDynamicSchema schema11 = new RandomDynamicSchema( 1000 * 5, "Schema1", 0.25);
- RandomDynamicSchema schema2 = new RandomDynamicSchema( 1000 * 10, "Schema1", 0.9);
-
- LOG.info("start");
- PrintSchemaChangeAware aware1 = new PrintSchemaChangeAware("aware1");
- PrintSchemaChangeAware aware2 = new PrintSchemaChangeAware("aware2");
- PrintSchemaChangeAware aware11 = new PrintSchemaChangeAware("aware11");
- PrintSchemaChangeAware aware22 = new PrintSchemaChangeAware("aware22");
-
- schema1.registerSchemaChangeAware(aware1);
- schema1.registerSchemaChangeAware(aware2);
- schema1.registerSchemaChangeAware(aware1);
- schema1.registerSchemaChangeAware(aware2);
-
- schema11.registerSchemaChangeAware(aware11);
- schema11.registerSchemaChangeAware(aware22);
-
-
- schema2.registerSchemaChangeAware(aware1);
- schema2.registerSchemaChangeAware(aware2);
- schema2.registerSchemaChangeAware(aware1);
- schema2.registerSchemaChangeAware(aware2);
-
- Thread.sleep(1000 * 60 * 2);
- schema1.unregisterSchemaChangeAware(aware1);
- schema1.unregisterSchemaChangeAware(aware2);
- schema1.unregisterSchemaChangeAware(aware11);
- schema1.unregisterSchemaChangeAware(aware22);
-
- Thread.sleep(1000 * 20);
- schema2.unregisterSchemaChangeAware(aware1);
- schema2.unregisterSchemaChangeAware(aware2);
-
- Thread.sleep(1000 * 3);
-
-
- schema1.registerSchemaChangeAware(aware2);
- schema2.registerSchemaChangeAware(aware1);
- Thread.sleep(1000 * 60 * 1);
- schema1.unregisterSchemaChangeAware(aware2);
- schema2.unregisterSchemaChangeAware(aware1);
- Thread.sleep(1000 * 3);
- }
-
- public static void testMultiThreadForHttpDynamicSchema() throws Exception{
- LOG.info("start");
-
- List<Thread> threads = new ArrayList<>(10);
- Thread thread;
- for (int i = 0; i < 5; i++) {
- int finalI = i + 1;
- thread = new Thread(() -> {
- DynamicSchema schema1 = new HttpDynamicSchema( "http://127.0.0.1/session_record_schema.json", SchemaParser.PARSER_AVRO, 1000 * 5);
- System.out.println(schema1.getDataType());
- PrintSchemaChangeAware aware1 = new PrintSchemaChangeAware("aware1_" + finalI);
- schema1.registerSchemaChangeAware(aware1);
- try {
- Thread.sleep(1000 * 60 * 1);
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- schema1.unregisterSchemaChangeAware(aware1);
- });
- threads.add(thread);
-
- thread = new Thread(() -> {
- DynamicSchema schema2 = new HttpDynamicSchema( "http://127.0.0.1/session_record_schema.json", SchemaParser.PARSER_AVRO, 1000 * 5);
- System.out.println(schema2.getDataType());
- PrintSchemaChangeAware aware2 = new PrintSchemaChangeAware("aware2_" + finalI);
- schema2.registerSchemaChangeAware(aware2);
- try {
- Thread.sleep(1000 * 60 * 1);
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- schema2.unregisterSchemaChangeAware(aware2);
- });
- threads.add(thread);
- }
-
- for (int i = 0; i < threads.size(); i++) {
- thread = threads.get(i);
- thread.start();
- }
-
- for (int i = 0; i < threads.size(); i++) {
- thread = threads.get(i);
- thread.join();
- }
- Thread.sleep(1000 * 3);
- LOG.info("done");
- }
-
- public static void testMultiThread() throws Exception{
- LOG.info("start");
-
- List<Thread> threads = new ArrayList<>(10);
- Thread thread;
- for (int i = 0; i < 5; i++) {
- int finalI = i + 1;
- thread = new Thread(() -> {
- RandomDynamicSchema schema1 = new RandomDynamicSchema( 1000 * 5, "Schema1", 0.25);
- PrintSchemaChangeAware aware1 = new PrintSchemaChangeAware("aware1_" + finalI);
- schema1.registerSchemaChangeAware(aware1);
- try {
- Thread.sleep(1000 * 60 * 1);
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- schema1.unregisterSchemaChangeAware(aware1);
- });
- threads.add(thread);
-
- thread = new Thread(() -> {
- RandomDynamicSchema schema2 = new RandomDynamicSchema(1000 * 10, "Schema1", 0.9);
- PrintSchemaChangeAware aware2 = new PrintSchemaChangeAware("aware2_" + finalI);
- schema2.registerSchemaChangeAware(aware2);
- try {
- Thread.sleep(1000 * 60 * 1);
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- schema2.unregisterSchemaChangeAware(aware2);
- });
- threads.add(thread);
- }
-
- for (int i = 0; i < threads.size(); i++) {
- thread = threads.get(i);
- thread.start();
- }
-
- for (int i = 0; i < threads.size(); i++) {
- thread = threads.get(i);
- thread.join();
- }
- Thread.sleep(1000 * 3);
- LOG.info("done");
- }
-
- public static class PrintSchemaChangeAware implements SchemaChangeAware {
- private final String name;
-
- public PrintSchemaChangeAware(String name) {
- this.name = name;
- }
-
- @Override
- public void schemaChange(StructType dataType) {
- String info = String.format("%s receive schema change:%s", name, dataType);
- //System.out.println(info);
- LOG.info(info);
- }
-
- @Override
- public String toString() {
- return name;
- }
- }
-
- public static class RandomDynamicSchema extends DynamicSchema{
- private final String key;
- private final double probability;
-
- public RandomDynamicSchema(long intervalMs, String identifier, double probability) {
- super(null, intervalMs);
- this.key = identifier + "_" + TimeUtils.formatWithHighestUnit(Duration.ofMillis(intervalMs));
- this.probability = probability;
- }
-
- @Override
- public String getCacheKey() {
- return key;
- }
-
- @Override
- protected String getDataTypeContent() {
- return null;
- }
-
- @Override
- public StructType updateDataType() {
- if(ThreadLocalRandom.current().nextDouble() < probability){
- return (StructType) Types.parseDataType(String.format("struct<name_%s: string>", key));
- }
- return null;
- }
-
- }
+package com.geedgenetworks.core.connector.schema.utils;
+
+import com.geedgenetworks.api.connector.schema.DynamicSchema;
+import com.geedgenetworks.api.connector.schema.HttpDynamicSchema;
+import com.geedgenetworks.api.connector.schema.SchemaChangeAware;
+import com.geedgenetworks.api.connector.schema.SchemaParser;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
+import org.apache.flink.util.TimeUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.ThreadLocalRandom;
+
+public class DynamicSchemaManagerTest {
+    static final Logger LOG = LoggerFactory.getLogger(DynamicSchemaManagerTest.class.getSimpleName());
+
+    public static void main(String[] args) throws Exception{
+        //testOneThread();
+        //testMultiThread();
+        testMultiThreadForHttpDynamicSchema();
+    }
+
+    public static void testOneThread() throws Exception{
+        RandomDynamicSchema schema1 = new RandomDynamicSchema( 1000 * 5, "Schema1", 0.25);
+        RandomDynamicSchema schema11 = new RandomDynamicSchema( 1000 * 5, "Schema1", 0.25);
+        RandomDynamicSchema schema2 = new RandomDynamicSchema( 1000 * 10, "Schema1", 0.9);
+
+        LOG.info("start");
+        PrintSchemaChangeAware aware1 = new PrintSchemaChangeAware("aware1");
+        PrintSchemaChangeAware aware2 = new PrintSchemaChangeAware("aware2");
+        PrintSchemaChangeAware aware11 = new PrintSchemaChangeAware("aware11");
+        PrintSchemaChangeAware aware22 = new PrintSchemaChangeAware("aware22");
+
+        schema1.registerSchemaChangeAware(aware1);
+        schema1.registerSchemaChangeAware(aware2);
+        schema1.registerSchemaChangeAware(aware1);
+        schema1.registerSchemaChangeAware(aware2);
+
+        schema11.registerSchemaChangeAware(aware11);
+        schema11.registerSchemaChangeAware(aware22);
+
+
+        schema2.registerSchemaChangeAware(aware1);
+        schema2.registerSchemaChangeAware(aware2);
+        schema2.registerSchemaChangeAware(aware1);
+        schema2.registerSchemaChangeAware(aware2);
+
+        Thread.sleep(1000 * 60 * 2);
+        schema1.unregisterSchemaChangeAware(aware1);
+        schema1.unregisterSchemaChangeAware(aware2);
+        schema1.unregisterSchemaChangeAware(aware11);
+        schema1.unregisterSchemaChangeAware(aware22);
+
+        Thread.sleep(1000 * 20);
+        schema2.unregisterSchemaChangeAware(aware1);
+        schema2.unregisterSchemaChangeAware(aware2);
+
+        Thread.sleep(1000 * 3);
+
+
+        schema1.registerSchemaChangeAware(aware2);
+        schema2.registerSchemaChangeAware(aware1);
+        Thread.sleep(1000 * 60 * 1);
+        schema1.unregisterSchemaChangeAware(aware2);
+        schema2.unregisterSchemaChangeAware(aware1);
+        Thread.sleep(1000 * 3);
+    }
+
+    public static void testMultiThreadForHttpDynamicSchema() throws Exception{
+        LOG.info("start");
+
+        List<Thread> threads = new ArrayList<>(10);
+        Thread thread;
+        for (int i = 0; i < 5; i++) {
+            int finalI = i + 1;
+            thread = new Thread(() -> {
+                DynamicSchema schema1 = new HttpDynamicSchema( "http://127.0.0.1/session_record_schema.json", SchemaParser.PARSER_AVRO, 1000 * 5);
+                System.out.println(schema1.getDataType());
+                PrintSchemaChangeAware aware1 = new PrintSchemaChangeAware("aware1_" + finalI);
+                schema1.registerSchemaChangeAware(aware1);
+                try {
+                    Thread.sleep(1000 * 60 * 1);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+                schema1.unregisterSchemaChangeAware(aware1);
+            });
+            threads.add(thread);
+
+            thread = new Thread(() -> {
+                DynamicSchema schema2 = new HttpDynamicSchema( "http://127.0.0.1/session_record_schema.json", SchemaParser.PARSER_AVRO, 1000 * 5);
+                System.out.println(schema2.getDataType());
+                PrintSchemaChangeAware aware2 = new PrintSchemaChangeAware("aware2_" + finalI);
+                schema2.registerSchemaChangeAware(aware2);
+                try {
+                    Thread.sleep(1000 * 60 * 1);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+                schema2.unregisterSchemaChangeAware(aware2);
+            });
+            threads.add(thread);
+        }
+
+        for (int i = 0; i < threads.size(); i++) {
+            thread = threads.get(i);
+            thread.start();
+        }
+
+        for (int i = 0; i < threads.size(); i++) {
+            thread = threads.get(i);
+            thread.join();
+        }
+        Thread.sleep(1000 * 3);
+        LOG.info("done");
+    }
+
+    public static void testMultiThread() throws Exception{
+        LOG.info("start");
+
+        List<Thread> threads = new ArrayList<>(10);
+        Thread thread;
+        for (int i = 0; i < 5; i++) {
+            int finalI = i + 1;
+            thread = new Thread(() -> {
+                RandomDynamicSchema schema1 = new RandomDynamicSchema( 1000 * 5, "Schema1", 0.25);
+                PrintSchemaChangeAware aware1 = new PrintSchemaChangeAware("aware1_" + finalI);
+                schema1.registerSchemaChangeAware(aware1);
+                try {
+                    Thread.sleep(1000 * 60 * 1);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+                schema1.unregisterSchemaChangeAware(aware1);
+            });
+            threads.add(thread);
+
+            thread = new Thread(() -> {
+                RandomDynamicSchema schema2 = new RandomDynamicSchema(1000 * 10, "Schema1", 0.9);
+                PrintSchemaChangeAware aware2 = new PrintSchemaChangeAware("aware2_" + finalI);
+                schema2.registerSchemaChangeAware(aware2);
+                try {
+                    Thread.sleep(1000 * 60 * 1);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+                schema2.unregisterSchemaChangeAware(aware2);
+            });
+            threads.add(thread);
+        }
+
+        for (int i = 0; i < threads.size(); i++) {
+            thread = threads.get(i);
+            thread.start();
+        }
+
+        for (int i = 0; i < threads.size(); i++) {
+            thread = threads.get(i);
+            thread.join();
+        }
+        Thread.sleep(1000 * 3);
+        LOG.info("done");
+    }
+
+    public static class PrintSchemaChangeAware implements SchemaChangeAware {
+        private final String name;
+
+        public PrintSchemaChangeAware(String name) {
+            this.name = name;
+        }
+
+        @Override
+        public void schemaChange(StructType dataType) {
+            String info = String.format("%s receive schema change:%s", name, dataType);
+            //System.out.println(info);
+            LOG.info(info);
+        }
+
+        @Override
+        public String toString() {
+            return name;
+        }
+    }
+
+    public static class RandomDynamicSchema extends DynamicSchema{
+        private final String key;
+        private final double probability;
+
+        public RandomDynamicSchema(long intervalMs, String identifier, double probability) {
+            super(null, intervalMs);
+            this.key = identifier + "_" + TimeUtils.formatWithHighestUnit(Duration.ofMillis(intervalMs));
+            this.probability = probability;
+        }
+
+        @Override
+        public String getCacheKey() {
+            return key;
+        }
+
+        @Override
+        protected String getDataTypeContent() {
+            return null;
+        }
+
+        @Override
+        public StructType updateDataType() {
+            if(ThreadLocalRandom.current().nextDouble() < probability){
+                return (StructType) Types.parseDataType(String.format("struct<name_%s: string>", key));
+            }
+            return null;
+        }
+
+    }
 }
\ No newline at end of file
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/types/TypesTest.java b/groot-core/src/test/java/com/geedgenetworks/core/types/TypesTest.java
index 518a3f4..f144abc 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/types/TypesTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/types/TypesTest.java
@@ -1,6 +1,7 @@
 package com.geedgenetworks.core.types;
 
 import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.api.connector.type.*;
 import org.junit.jupiter.api.Test;
 
 import java.util.Arrays;
@@ -20,7 +21,7 @@ public class TypesTest {
     public void parseDataType(){
         String ddl = "struct< id: int, obj: struct <id:int, arr: array < int >, name:string>, name : string>";
 
-        StructType dataType = (StructType)Types.parseDataType(ddl);
+        StructType dataType = (StructType) Types.parseDataType(ddl);
 
         System.out.println(dataType);
         System.out.println(dataType.treeString());
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AnonymityLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AnonymityLookupTest.java
index ae74fea..bfcd7a0 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AnonymityLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AnonymityLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AppCategoryLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AppCategoryLookupTest.java
index 713be3f..d32d027 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AppCategoryLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/AppCategoryLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/BaseStationLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/BaseStationLookupTest.java
index 43b0bd5..f1936f6 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/BaseStationLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/BaseStationLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookupTest.java
index d1cca09..996302c 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/DnsServerInfoLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookupTest.java
index 0e64982..2ff200a 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnCategoryLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookupTest.java
index 93ee663..ba0b757 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/FqdnWhoisLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/H3CellLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/H3CellLookupTest.java
index a7b98ab..22431c3 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/H3CellLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/H3CellLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IcpLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IcpLookupTest.java
index c2032e0..a27c36b 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IcpLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IcpLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IdcRenterLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IdcRenterLookupTest.java
index 7409a2f..73a9660 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IdcRenterLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IdcRenterLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookupTest.java
index 7d643f4..1fde9a8 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IntelligenceIndicatorLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IocLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IocLookupTest.java
index 8c01bc7..e7cfb7e 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IocLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IocLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IpZoneLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IpZoneLookupTest.java
index e3024b5..46f8ddd 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IpZoneLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/IpZoneLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookupTest.java
index 4f2f551..f504e43 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LinkDirectionLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LookupTestUtils.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LookupTestUtils.java
index bf95c57..abed4e2 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LookupTestUtils.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/LookupTestUtils.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
 import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.Constants;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.config.KnowledgeBaseConfig;
 import com.geedgenetworks.core.pojo.KnowLedgeBaseFileMeta;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookupTest.java
index a586aeb..eccad83 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/UserDefineTagLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/VpnLookupTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/VpnLookupTest.java
index 50374f7..16bcd67 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/VpnLookupTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/cn/VpnLookupTest.java
@@ -1,7 +1,7 @@
 package com.geedgenetworks.core.udf.cn;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.apache.flink.api.common.functions.RuntimeContext;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/AsnLookupFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/AsnLookupFunctionTest.java
index f10fe2b..84d5cf5 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/AsnLookupFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/AsnLookupFunctionTest.java
@@ -3,10 +3,10 @@ package com.geedgenetworks.core.udf.test;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.config.KnowledgeBaseConfig;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.AsnLookup;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.AsnKnowledgeBaseHandler;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
+import com.geedgenetworks.api.common.udf.UDFContext;
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/GeoIpLookupFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/GeoIpLookupFunctionTest.java
index 83f8ab4..90e39f8 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/GeoIpLookupFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/GeoIpLookupFunctionTest.java
@@ -1,13 +1,12 @@
 package com.geedgenetworks.core.udf.test;
 
-import com.alibaba.fastjson2.JSONObject;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.config.KnowledgeBaseConfig;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.GeoIpLookup;
 import com.geedgenetworks.core.udf.knowlegdebase.handler.GeoIpKnowledgeBaseHandler;
 import com.geedgenetworks.core.udf.knowlegdebase.KnowledgeBaseUpdateJob;
+import com.geedgenetworks.api.common.udf.UDFContext;
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectListTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectListTest.java
index 16d5cce..0bd3d4a 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectListTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectListTest.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.CollectList;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.text.ParseException;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectSetTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectSetTest.java
index 8909794..edfd81a 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectSetTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/CollectSetTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.udaf.CollectList;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.CollectSet;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.text.ParseException;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/FirstValueTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/FirstValueTest.java
index 0acf1d5..1e51c8a 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/FirstValueTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/FirstValueTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.udaf.CollectSet;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.FirstValue;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.text.ParseException;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LastValueTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LastValueTest.java
index b2c9ceb..b6934b7 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LastValueTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LastValueTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.udaf.FirstValue;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.LastValue;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.text.ParseException;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LongCountTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LongCountTest.java
index 54d9dba..a02ba0f 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LongCountTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/LongCountTest.java
@@ -1,13 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.udaf.LastValue;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.LongCount;
-import com.geedgenetworks.core.udf.udaf.NumberSum;
-import com.ibm.icu.text.NumberFormat;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.text.ParseException;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MaxTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MaxTest.java
index 311d51f..9c98f08 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MaxTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MaxTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.Max;
-import com.geedgenetworks.core.udf.udaf.Min;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeEach;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MeanTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MeanTest.java
index 62efc0a..330c70b 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MeanTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MeanTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.Mean;
-import com.geedgenetworks.core.udf.udaf.NumberSum;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import com.ibm.icu.text.NumberFormat;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MinTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MinTest.java
index e5a1615..4bcda37 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MinTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/MinTest.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.Min;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeEach;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/NumberSumTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/NumberSumTest.java
index 7ccb365..7a931d9 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/NumberSumTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/aggregate/NumberSumTest.java
@@ -16,11 +16,10 @@
 package com.geedgenetworks.core.udf.test.aggregate;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.udaf.LongCount;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.core.udf.udaf.NumberSum;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import com.ibm.icu.text.NumberFormat;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DecodeBase64FunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DecodeBase64FunctionTest.java
index a5f31f7..4b20ff2 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DecodeBase64FunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DecodeBase64FunctionTest.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.test.simple;
 
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.DecodeBase64;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DomainFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DomainFunctionTest.java
index f8076cc..55f76b4 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DomainFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DomainFunctionTest.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.test.simple;
 
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.Domain;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DropFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DropFunctionTest.java
index 027533e..ef6cebd 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DropFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/DropFunctionTest.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.Drop;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
 import java.util.HashMap;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncodeBase64FunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncodeBase64FunctionTest.java
index 2bd6705..07203b5 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncodeBase64FunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncodeBase64FunctionTest.java
@@ -1,15 +1,11 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.DecodeBase64;
 import com.geedgenetworks.core.udf.EncodeBase64;
-import org.junit.jupiter.api.BeforeAll;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.nio.charset.StandardCharsets;
-import java.util.Arrays;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.Map;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncryptFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncryptFunctionTest.java
index 20f3c0d..c0a3ef9 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncryptFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/EncryptFunctionTest.java
@@ -2,19 +2,19 @@ package com.geedgenetworks.core.udf.test.simple;
 
 import cn.hutool.core.util.RandomUtil;
 import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.Constants;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.common.config.Constants;
 import com.geedgenetworks.common.config.CommonConfig;
 import com.geedgenetworks.common.config.KmsConfig;
 import com.geedgenetworks.common.config.ClientSSLConfig;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.utils.HttpClientPoolUtil;
 import com.geedgenetworks.core.pojo.DataEncryptionKey;
 import com.geedgenetworks.core.udf.Encrypt;
 import com.geedgenetworks.core.udf.encrypt.Crypto;
 import com.geedgenetworks.core.utils.CryptoProvider;
-import com.geedgenetworks.core.utils.HttpClientPoolUtil;
 import com.geedgenetworks.core.utils.KMSUtils;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import io.github.jopenlibs.vault.VaultException;
 import org.apache.flink.api.common.ExecutionConfig;
 import org.apache.flink.api.common.functions.RuntimeContext;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FlattenFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FlattenFunctionTest.java
index e829b1d..a73948f 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FlattenFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FlattenFunctionTest.java
@@ -1,9 +1,8 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.Flatten;
-import com.geedgenetworks.core.udf.StringJoiner;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FromUnixTimestampTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FromUnixTimestampTest.java
index 6cb1bf8..bc97a41 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FromUnixTimestampTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/FromUnixTimestampTest.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.FromUnixTimestamp;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/GenerateStringArrayFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/GenerateStringArrayFunctionTest.java
index 1fbe06c..2156bb7 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/GenerateStringArrayFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/GenerateStringArrayFunctionTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.GenerateStringArray;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
-import scala.Array;
 
 import java.util.*;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/HmacFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/HmacFunctionTest.java
index 5a4d0d3..841ae35 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/HmacFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/HmacFunctionTest.java
@@ -1,9 +1,9 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.Hmac;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/JsonExtractFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/JsonExtractFunctionTest.java
index dd661de..2460b77 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/JsonExtractFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/JsonExtractFunctionTest.java
@@ -3,9 +3,9 @@ package com.geedgenetworks.core.udf.test.simple;
 import com.alibaba.fastjson.JSON;
 import com.alibaba.fastjson.TypeReference;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.JsonExtract;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/RenameFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/RenameFunctionTest.java
index 24b3774..a9a147b 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/RenameFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/RenameFunctionTest.java
@@ -1,13 +1,12 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.Rename;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
 
 import java.util.HashMap;
-import java.util.List;
 import java.util.Map;
 
 import static org.junit.jupiter.api.Assertions.assertEquals;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/StringJoinerFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/StringJoinerFunctionTest.java
index e9cde8a..68ea214 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/StringJoinerFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/StringJoinerFunctionTest.java
@@ -1,9 +1,8 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
-import com.geedgenetworks.core.udf.Rename;
 import com.geedgenetworks.core.udf.StringJoiner;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UUIDTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UUIDTest.java
index 70216b4..5f1715f 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UUIDTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UUIDTest.java
@@ -1,11 +1,11 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
 import com.geedgenetworks.common.exception.GrootStreamRuntimeException;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.uuid.UUID;
 import com.geedgenetworks.core.udf.uuid.UUIDv5;
 import com.geedgenetworks.core.udf.uuid.UUIDv7;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UnixTimestampConverterTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UnixTimestampConverterTest.java
index a0d70d7..396fb93 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UnixTimestampConverterTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/simple/UnixTimestampConverterTest.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.test.simple;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.UnixTimestampConverter;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/JsonUnrollFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/JsonUnrollFunctionTest.java
index 3749eb1..288483a 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/JsonUnrollFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/JsonUnrollFunctionTest.java
@@ -1,11 +1,10 @@
 package com.geedgenetworks.core.udf.test.table;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.udtf.JsonUnroll;
-import com.geedgenetworks.core.udf.udtf.Unroll;
+import com.geedgenetworks.api.common.udf.UDFContext;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
+import com.geedgenetworks.api.connector.event.Event;
 
 import java.util.HashMap;
 import java.util.List;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/UnrollFunctionTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/UnrollFunctionTest.java
index db66e55..8774210 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/UnrollFunctionTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/test/table/UnrollFunctionTest.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.core.udf.test.table;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
 import com.geedgenetworks.core.udf.udtf.Unroll;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantileTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantileTest.java
index 990186d..830f4c8 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantileTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantileTest.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.udaf.HdrHistogram;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.HdrHistogram.ArrayHistogram;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantilesTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantilesTest.java
index a57645d..ec761f2 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantilesTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramQuantilesTest.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.udaf.HdrHistogram;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.HdrHistogram.ArrayHistogram;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramTest.java
index 5905138..3c4ac41 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/HdrHistogram/HdrHistogramTest.java
@@ -1,10 +1,10 @@
 package com.geedgenetworks.core.udf.udaf.HdrHistogram;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.HdrHistogram.ArrayHistogram;
 import org.junit.jupiter.api.Test;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinctTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinctTest.java
index b80d782..77556d9 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinctTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldApproxCountDistinctTest.java
@@ -1,12 +1,12 @@
 package com.geedgenetworks.core.udf.udaf.hlld;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.hlld.Hll;
 import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.util.Collections;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldTest.java
index d6ed4c1..67c9527 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/udaf/hlld/HlldTest.java
@@ -1,11 +1,11 @@
 package com.geedgenetworks.core.udf.udaf.hlld;
 
-import com.geedgenetworks.common.Accumulator;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.AggregateFunction;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.common.config.Accumulator;
 import com.geedgenetworks.sketch.hlld.Hll;
 import com.geedgenetworks.sketch.util.StringUtils;
+import com.geedgenetworks.api.common.udf.AggregateFunction;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.nio.charset.StandardCharsets;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/udf/udtf/UnrollTest.java b/groot-core/src/test/java/com/geedgenetworks/core/udf/udtf/UnrollTest.java
index 3320f38..0a46028 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/udf/udtf/UnrollTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/udf/udtf/UnrollTest.java
@@ -1,13 +1,12 @@
 package com.geedgenetworks.core.udf.udtf;
 
 import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.common.udf.UDFContext;
+import com.geedgenetworks.api.common.udf.UDFContext;
+import com.geedgenetworks.api.connector.event.Event;
 import org.junit.jupiter.api.Test;
 
 import java.util.Arrays;
 import java.util.HashMap;
-import java.util.List;
 import java.util.Map;
 import java.util.stream.Collectors;
diff --git a/groot-core/src/test/java/com/geedgenetworks/core/utils/LoadIntervalDataUtilTest.java b/groot-core/src/test/java/com/geedgenetworks/core/utils/LoadIntervalDataUtilTest.java
index b7c6306..ef9a21e 100644
--- a/groot-core/src/test/java/com/geedgenetworks/core/utils/LoadIntervalDataUtilTest.java
+++ b/groot-core/src/test/java/com/geedgenetworks/core/utils/LoadIntervalDataUtilTest.java
@@ -1,80 +1,83 @@
-package com.geedgenetworks.core.utils;
-
-
-import java.sql.Timestamp;
-
-public class LoadIntervalDataUtilTest {
-
- public static void main(String[] args) throws Exception{
- //testNoError();
- //testNotUpdateDataOnStart();
- //testWithErrorAndNotFail();
- testWithErrorAndFail();
- }
-
- public static void testNoError() throws Exception{
- LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> new Timestamp(System.currentTimeMillis()),
- LoadIntervalDataOptions.defaults("time", 3000));
-
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
-
- for (int i = 0; i < 10; i++) {
- Thread.sleep(1000);
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
- }
-
- util.stop();
- }
-
- public static void testNotUpdateDataOnStart() throws Exception{
- LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> new Timestamp(System.currentTimeMillis()),
- LoadIntervalDataOptions.builder().withName("time").withIntervalMs(3000).withUpdateDataOnStart(false).build());
-
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
-
- for (int i = 0; i < 10; i++) {
- Thread.sleep(1000);
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
- }
-
- util.stop();
- }
-
- public static void testWithErrorAndNotFail() throws Exception{
- final long start = System.currentTimeMillis();
- LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> {
- if(System.currentTimeMillis() - start >= 5000){
- throw new RuntimeException(new Timestamp(System.currentTimeMillis()).toString());
- }
- return new Timestamp(System.currentTimeMillis());
- }, LoadIntervalDataOptions.defaults("time", 3000));
-
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
-
- for (int i = 0; i < 10; i++) {
- Thread.sleep(1000);
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
- }
-
- util.stop();
- }
-
- public static void testWithErrorAndFail() throws Exception{
- final long start = System.currentTimeMillis();
- LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> {
- if(System.currentTimeMillis() - start >= 5000){
- throw new RuntimeException(new Timestamp(System.currentTimeMillis()).toString());
- }
- return new Timestamp(System.currentTimeMillis());
- }, LoadIntervalDataOptions.builder().withName("time").withIntervalMs(3000).withFailOnException(true).build());
-
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
-
- for (int i = 0; i < 10; i++) {
- Thread.sleep(1000);
- System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data());
- }
-
- util.stop();
- }
+package com.geedgenetworks.core.utils; + + +import com.geedgenetworks.api.configuration.util.LoadIntervalDataOptions; +import com.geedgenetworks.api.configuration.util.LoadIntervalDataUtil; + +import java.sql.Timestamp; + +public class LoadIntervalDataUtilTest { + + public static void main(String[] args) throws Exception{ + //testNoError(); + //testNotUpdateDataOnStart(); + //testWithErrorAndNotFail(); + testWithErrorAndFail(); + } + + public static void testNoError() throws Exception{ + LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> new Timestamp(System.currentTimeMillis()), + LoadIntervalDataOptions.defaults("time", 3000)); + + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + + for (int i = 0; i < 10; i++) { + Thread.sleep(1000); + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + } + + util.stop(); + } + + public static void testNotUpdateDataOnStart() throws Exception{ + LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> new Timestamp(System.currentTimeMillis()), + LoadIntervalDataOptions.builder().withName("time").withIntervalMs(3000).withUpdateDataOnStart(false).build()); + + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + + for (int i = 0; i < 10; i++) { + Thread.sleep(1000); + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + } + + util.stop(); + } + + public static void testWithErrorAndNotFail() throws Exception{ + final long start = System.currentTimeMillis(); + LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> { + if(System.currentTimeMillis() - start >= 5000){ + throw new RuntimeException(new Timestamp(System.currentTimeMillis()).toString()); + } + return new Timestamp(System.currentTimeMillis()); + }, LoadIntervalDataOptions.defaults("time", 3000)); + + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + + for (int i = 0; i < 10; i++) { + Thread.sleep(1000); + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + } + + util.stop(); + } + + public static void testWithErrorAndFail() throws Exception{ + final long start = System.currentTimeMillis(); + LoadIntervalDataUtil<Timestamp> util = LoadIntervalDataUtil.newInstance(() -> { + if(System.currentTimeMillis() - start >= 5000){ + throw new RuntimeException(new Timestamp(System.currentTimeMillis()).toString()); + } + return new Timestamp(System.currentTimeMillis()); + }, LoadIntervalDataOptions.builder().withName("time").withIntervalMs(3000).withFailOnException(true).build()); + + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + + for (int i = 0; i < 10; i++) { + Thread.sleep(1000); + System.out.println(new Timestamp(System.currentTimeMillis()) + " - " + util.data()); + } + + util.stop(); + } }
\ No newline at end of file diff --git a/groot-core/src/test/java/com/geedgenetworks/core/utils/SingleValueMapTest.java b/groot-core/src/test/java/com/geedgenetworks/core/utils/SingleValueMapTest.java index f5f1e7c..5d6994d 100644 --- a/groot-core/src/test/java/com/geedgenetworks/core/utils/SingleValueMapTest.java +++ b/groot-core/src/test/java/com/geedgenetworks/core/utils/SingleValueMapTest.java @@ -1,98 +1,100 @@ -package com.geedgenetworks.core.utils;
-
-import org.junit.jupiter.api.Assertions;
-
-import java.sql.Timestamp;
-import java.util.concurrent.ThreadLocalRandom;
-import java.util.concurrent.atomic.AtomicInteger;
-
-public class SingleValueMapTest {
-
- public static void main(String[] args) throws Exception {
- //testSingleValue();
- testSingleValueWithLoadIntervalDataUtil();
- }
-
- public static void testSingleValue() throws Exception {
- Thread[] threads = new Thread[20];
- for (int i = 0; i < threads.length; i++) {
- threads[i] = new Thread(() -> {
- SingleValueMap.Data<ConnDada> connDada = null;
- try {
- connDada = SingleValueMap.acquireData("conn_data", () -> new ConnDada(), x -> {
- System.out.println("close conn");
- });
- } catch (Exception e) {
- throw new RuntimeException(e);
- }
-
- try {
- Thread.sleep(ThreadLocalRandom.current().nextInt(5) * 10);
- } catch (InterruptedException e) {
- throw new RuntimeException(e);
- }
-
- connDada.release();
- }, "Thread-" + i);
- }
-
- for (int i = 0; i < threads.length; i++) {
- threads[i].start();
- }
-
- for (int i = 0; i < threads.length; i++) {
- threads[i].join();
- }
-
- System.out.println("initCnt:" + ConnDada.initCnt.get());
- Assertions.assertEquals(ConnDada.initCnt.get(), 1);
- }
-
- public static void testSingleValueWithLoadIntervalDataUtil() throws Exception {
- Thread[] threads = new Thread[20];
- for (int i = 0; i < threads.length; i++) {
- threads[i] = new Thread(() -> {
- SingleValueMap.Data<LoadIntervalDataUtil<Timestamp>> util = null;
- try {
- util = SingleValueMap.acquireData("LoadIntervalDataUtil",
- () -> LoadIntervalDataUtil.newInstance(() -> new Timestamp(System.currentTimeMillis()), LoadIntervalDataOptions.defaults("time", 3000)),
- LoadIntervalDataUtil::stop);
- } catch (Exception e) {
- throw new RuntimeException(e);
- }
-
-
- try {
- for (int j = 0; j < 10; j++) {
- Thread.sleep(1000);
- System.out.println(Thread.currentThread().getName() + " - " + new Timestamp(System.currentTimeMillis()) + " - " + util.getData().data());
- }
-
- Thread.sleep(ThreadLocalRandom.current().nextInt(5) * 10);
- } catch (Exception e) {
- throw new RuntimeException(e);
- }
-
- util.release();
- }, "Thread-" + i);
- }
-
- for (int i = 0; i < threads.length; i++) {
- threads[i].start();
- }
-
- for (int i = 0; i < threads.length; i++) {
- threads[i].join();
- }
-
- }
-
- public static class ConnDada {
- static AtomicInteger initCnt = new AtomicInteger(0);
- public ConnDada(){
- System.out.println("ConnDada init");
- initCnt.incrementAndGet();
- }
-
- }
+package com.geedgenetworks.core.utils; + +import com.geedgenetworks.api.configuration.util.LoadIntervalDataOptions; +import com.geedgenetworks.api.configuration.util.LoadIntervalDataUtil; +import org.junit.jupiter.api.Assertions; + +import java.sql.Timestamp; +import java.util.concurrent.ThreadLocalRandom; +import java.util.concurrent.atomic.AtomicInteger; + +public class SingleValueMapTest { + + public static void main(String[] args) throws Exception { + //testSingleValue(); + testSingleValueWithLoadIntervalDataUtil(); + } + + public static void testSingleValue() throws Exception { + Thread[] threads = new Thread[20]; + for (int i = 0; i < threads.length; i++) { + threads[i] = new Thread(() -> { + SingleValueMap.Data<ConnDada> connDada = null; + try { + connDada = SingleValueMap.acquireData("conn_data", () -> new ConnDada(), x -> { + System.out.println("close conn"); + }); + } catch (Exception e) { + throw new RuntimeException(e); + } + + try { + Thread.sleep(ThreadLocalRandom.current().nextInt(5) * 10); + } catch (InterruptedException e) { + throw new RuntimeException(e); + } + + connDada.release(); + }, "Thread-" + i); + } + + for (int i = 0; i < threads.length; i++) { + threads[i].start(); + } + + for (int i = 0; i < threads.length; i++) { + threads[i].join(); + } + + System.out.println("initCnt:" + ConnDada.initCnt.get()); + Assertions.assertEquals(ConnDada.initCnt.get(), 1); + } + + public static void testSingleValueWithLoadIntervalDataUtil() throws Exception { + Thread[] threads = new Thread[20]; + for (int i = 0; i < threads.length; i++) { + threads[i] = new Thread(() -> { + SingleValueMap.Data<LoadIntervalDataUtil<Timestamp>> util = null; + try { + util = SingleValueMap.acquireData("LoadIntervalDataUtil", + () -> LoadIntervalDataUtil.newInstance(() -> new Timestamp(System.currentTimeMillis()), LoadIntervalDataOptions.defaults("time", 3000)), + LoadIntervalDataUtil::stop); + } catch (Exception e) { + throw new RuntimeException(e); + } + + + try { + for (int j = 0; j < 10; j++) { + Thread.sleep(1000); + System.out.println(Thread.currentThread().getName() + " - " + new Timestamp(System.currentTimeMillis()) + " - " + util.getData().data()); + } + + Thread.sleep(ThreadLocalRandom.current().nextInt(5) * 10); + } catch (Exception e) { + throw new RuntimeException(e); + } + + util.release(); + }, "Thread-" + i); + } + + for (int i = 0; i < threads.length; i++) { + threads[i].start(); + } + + for (int i = 0; i < threads.length; i++) { + threads[i].join(); + } + + } + + public static class ConnDada { + static AtomicInteger initCnt = new AtomicInteger(0); + public ConnDada(){ + System.out.println("ConnDada init"); + initCnt.incrementAndGet(); + } + + } }
\ No newline at end of file diff --git a/groot-examples/end-to-end-example/src/main/resources/examples/kafka_to_print.yaml b/groot-examples/end-to-end-example/src/main/resources/examples/kafka_to_print.yaml index bf8ebf2..8eaa567 100644 --- a/groot-examples/end-to-end-example/src/main/resources/examples/kafka_to_print.yaml +++ b/groot-examples/end-to-end-example/src/main/resources/examples/kafka_to_print.yaml @@ -2,20 +2,18 @@ sources: kafka_source: type : kafka schema: - fields: # [array of object] Schema field projection, support read data only from specified fields. - - name: client_ip - type: string - - name: server_ip - type: string properties: # [object] Kafka source properties - topic: SESSION-RECORD - kafka.bootstrap.servers: 192.168.44.11:9092 + topic: DATAPATH-TELEMETRY-RECORD + kafka.bootstrap.servers: "192.168.44.12:9094" kafka.session.timeout.ms: 60000 kafka.max.poll.records: 3000 kafka.max.partition.fetch.bytes: 31457280 - kafka.group.id: GROOT-STREAM-EXAMPLE-KAFKA-TO-PRINT - kafka.auto.offset.reset: latest - format: json + kafka.security.protocol: SASL_PLAINTEXT + kafka.sasl.mechanism: PLAIN + kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252 + kafka.group.id: etl_datapath_telemetry_record_kafka_to_print_test_2024 + kafka.auto.offset.reset: earliest + format: msgpack sinks: # [object] Define connector sink print_sink: @@ -28,10 +26,10 @@ application: # [object] Define job configuration env: name: example-kafka-to-print parallelism: 1 + shade.identifier: aes pipeline: object-reuse: true - flink: - restart-strategy: none + topology: - name: kafka_source downstream: [print_sink] diff --git a/groot-examples/pom.xml b/groot-examples/pom.xml index 46ccaaa..d6cf698 100644 --- a/groot-examples/pom.xml +++ b/groot-examples/pom.xml @@ -22,6 +22,7 @@ </properties> <dependencies> + <dependency> <groupId>com.typesafe</groupId> <artifactId>config</artifactId> @@ -30,34 +31,35 @@ <dependency> <groupId>com.geedgenetworks</groupId> <artifactId>groot-bootstrap</artifactId> - <version>${project.version}</version> + <version>${revision}</version> + <scope>${scope}</scope> </dependency> <dependency> <groupId>com.geedgenetworks</groupId> <artifactId>connector-kafka</artifactId> - <version>${project.version}</version> + <version>${revision}</version> <scope>${scope}</scope> </dependency> <dependency> <groupId>com.geedgenetworks</groupId> <artifactId>connector-mock</artifactId> - <version>${project.version}</version> + <version>${revision}</version> <scope>${scope}</scope> </dependency> <dependency> <groupId>com.geedgenetworks</groupId> <artifactId>connector-clickhouse</artifactId> - <version>${project.version}</version> + <version>${revision}</version> <scope>${scope}</scope> </dependency> <dependency> <groupId>com.geedgenetworks</groupId> <artifactId>connector-ipfix-collector</artifactId> - <version>${project.version}</version> + <version>${revision}</version> <scope>${scope}</scope> </dependency> @@ -67,7 +69,12 @@ <version>${revision}</version> <scope>${scope}</scope> </dependency> - + <dependency> + <groupId>com.geedgenetworks</groupId> + <artifactId>format-msgpack</artifactId> + <version>${revision}</version> + <scope>${scope}</scope> + </dependency> <dependency> <groupId>com.geedgenetworks</groupId> @@ -141,7 +148,6 @@ </dependency> - </dependencies> diff --git a/groot-formats/format-csv/pom.xml 
b/groot-formats/format-csv/pom.xml index 4940bcf..509a9c1 100644 --- a/groot-formats/format-csv/pom.xml +++ b/groot-formats/format-csv/pom.xml @@ -19,5 +19,24 @@ <version>${flink.version}</version> <scope>${flink.scope}</scope> </dependency> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId> + <scope>${flink.scope}</scope> + </dependency> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-clients_${scala.version}</artifactId> + <scope>test</scope> + </dependency> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-table-planner-blink_${scala.version}</artifactId> + <scope>test</scope> + </dependency> + </dependencies> </project>
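The format-csv hunks that follow migrate the module onto the new api packages. For orientation, here is a minimal decode sketch built only from the public pieces visible in those hunks; the location of Types under com.geedgenetworks.api.connector.type is an assumption extrapolated from the core.types to api.connector.type rename elsewhere in this merge, and the schema string and input line are illustrative.

```java
import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.api.connector.type.StructType;
import com.geedgenetworks.api.connector.type.Types; // assumed new home of Types
import com.geedgenetworks.formats.csv.CsvEventDeserializationSchema;
import com.geedgenetworks.formats.csv.CsvFormatFactory;

import java.nio.charset.StandardCharsets;

public class CsvDecodeSketch {
    public static void main(String[] args) throws Exception {
        StructType dataType = Types.parseStructType("id:int,name:string");
        // CsvFormatFactory.convert(...) builds a Jackson CsvSchema whose
        // columns mirror the StructType fields (see the factory below).
        CsvEventDeserializationSchema deser = new CsvEventDeserializationSchema(
                dataType, CsvFormatFactory.convert(dataType), /* ignoreParseErrors */ true);

        Event event = deser.deserialize("1,alice".getBytes(StandardCharsets.UTF_8));
        System.out.println(event.getExtractedFields()); // e.g. {name=alice, id=1}
    }
}
```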
\ No newline at end of file diff --git a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventDeserializationSchema.java b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventDeserializationSchema.java index cae823f..8c73d9d 100644 --- a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventDeserializationSchema.java +++ b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventDeserializationSchema.java @@ -1,54 +1,54 @@ -package com.geedgenetworks.formats.csv;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.MapDeserialization;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.api.common.typeinfo.TypeInformation;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema;
-
-import java.io.IOException;
-import java.nio.charset.StandardCharsets;
-import java.util.Map;
-
-public class CsvEventDeserializationSchema implements DeserializationSchema<Event>, MapDeserialization {
- private final StructType dataType;
- private final CsvSchema csvSchema;
- private final boolean ignoreParseErrors;
- private final CsvToMapDataConverter converter;
-
- public CsvEventDeserializationSchema(StructType dataType, CsvSchema csvSchema, boolean ignoreParseErrors) {
- this.dataType = dataType;
- this.csvSchema = csvSchema;
- this.ignoreParseErrors = ignoreParseErrors;
- this.converter = new CsvToMapDataConverter(dataType, csvSchema, ignoreParseErrors);
- }
-
- @Override
- public Event deserialize(byte[] bytes) throws IOException {
- Map<String, Object> map = deserializeToMap(bytes);
- if (map == null) {
- return null;
- }
- Event event = new Event();
- event.setExtractedFields(map);
- return event;
- }
-
- @Override
- public Map<String, Object> deserializeToMap(byte[] bytes) throws IOException {
- String message = new String(bytes, StandardCharsets.UTF_8);
- return converter.convert(message);
- }
-
- @Override
- public boolean isEndOfStream(Event nextElement) {
- return false;
- }
-
- @Override
- public TypeInformation<Event> getProducedType() {
- return null;
- }
-
-}
+package com.geedgenetworks.formats.csv; + +import com.geedgenetworks.api.connector.serialization.MapDeserialization; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.connector.type.StructType; +import org.apache.flink.api.common.serialization.DeserializationSchema; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema; + +import java.io.IOException; +import java.nio.charset.StandardCharsets; +import java.util.Map; + +public class CsvEventDeserializationSchema implements DeserializationSchema<Event>, MapDeserialization { + private final StructType dataType; + private final CsvSchema csvSchema; + private final boolean ignoreParseErrors; + private final CsvToMapDataConverter converter; + + public CsvEventDeserializationSchema(StructType dataType, CsvSchema csvSchema, boolean ignoreParseErrors) { + this.dataType = dataType; + this.csvSchema = csvSchema; + this.ignoreParseErrors = ignoreParseErrors; + this.converter = new CsvToMapDataConverter(dataType, csvSchema, ignoreParseErrors); + } + + @Override + public Event deserialize(byte[] bytes) throws IOException { + Map<String, Object> map = deserializeToMap(bytes); + if (map == null) { + return null; + } + Event event = new Event(); + event.setExtractedFields(map); + return event; + } + + @Override + public Map<String, Object> deserializeToMap(byte[] bytes) throws IOException { + String message = new String(bytes, StandardCharsets.UTF_8); + return converter.convert(message); + } + + @Override + public boolean isEndOfStream(Event nextElement) { + return false; + } + + @Override + public TypeInformation<Event> getProducedType() { + return null; + } + +} diff --git a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventSerializationSchema.java b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventSerializationSchema.java index 1df31bb..bd1b69d 100644 --- a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventSerializationSchema.java +++ b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvEventSerializationSchema.java @@ -1,7 +1,7 @@ package com.geedgenetworks.formats.csv; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.api.common.serialization.SerializationSchema; import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema; diff --git a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvFormatFactory.java b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvFormatFactory.java index 7e5db4a..c501cb0 100644 --- a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvFormatFactory.java +++ b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvFormatFactory.java @@ -1,190 +1,190 @@ -package com.geedgenetworks.formats.csv;
-
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.EncodingFormatFactory;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.TableFactory;
-import com.geedgenetworks.core.types.*;
-import org.apache.commons.lang3.StringEscapeUtils;
-import org.apache.flink.configuration.ConfigOption;
-import org.apache.flink.configuration.ReadableConfig;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema;
-import org.apache.flink.table.api.ValidationException;
-import org.apache.flink.util.Preconditions;
-
-import java.util.Collections;
-import java.util.HashSet;
-import java.util.Optional;
-import java.util.Set;
-
-import static com.geedgenetworks.formats.csv.CsvFormatOptions.*;
-
-public class CsvFormatFactory implements DecodingFormatFactory, EncodingFormatFactory {
- public static final String IDENTIFIER = "csv";
-
- @Override
- public String factoryIdentifier() {
- return IDENTIFIER;
- }
-
- @Override
- public DecodingFormat createDecodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
- FactoryUtil.validateFactoryOptions(this, formatOptions);
- validateFormatOptions(formatOptions);
- final boolean ignoreParseErrors = formatOptions.get(IGNORE_PARSE_ERRORS);
- return dataType -> {
- Preconditions.checkNotNull(dataType, "csv format require schema");
- CsvSchema csvSchema = getCsvSchema(dataType, formatOptions);
- return new CsvEventDeserializationSchema(dataType, csvSchema, ignoreParseErrors);
- };
- }
-
- @Override
- public EncodingFormat createEncodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
- FactoryUtil.validateFactoryOptions(this, formatOptions);
- validateFormatOptions(formatOptions);
- return dataType -> {
- Preconditions.checkNotNull(dataType, "csv format require schema");
- CsvSchema csvSchema = getCsvSchema(dataType, formatOptions);
- return new CsvEventSerializationSchema(dataType, csvSchema);
- };
- }
-
- @Override
- public Set<ConfigOption<?>> requiredOptions() {
- return Collections.emptySet();
- }
-
- @Override
- public Set<ConfigOption<?>> optionalOptions() {
- Set<ConfigOption<?>> options = new HashSet<>();
- options.add(FIELD_DELIMITER);
- options.add(DISABLE_QUOTE_CHARACTER);
- options.add(QUOTE_CHARACTER);
- options.add(ALLOW_COMMENTS);
- options.add(IGNORE_PARSE_ERRORS);
- options.add(ARRAY_ELEMENT_DELIMITER);
- options.add(ESCAPE_CHARACTER);
- options.add(NULL_LITERAL);
- return options;
- }
-
- static CsvSchema getCsvSchema(StructType dataType, ReadableConfig options){
- CsvSchema.Builder builder = convert(dataType).rebuild();
-
- options.getOptional(FIELD_DELIMITER)
- .map(delimiter -> StringEscapeUtils.unescapeJava(delimiter).charAt(0))
- .ifPresent(builder::setColumnSeparator);
-
- if (options.get(DISABLE_QUOTE_CHARACTER)) {
- builder.disableQuoteChar();
- } else {
- options.getOptional(QUOTE_CHARACTER)
- .map(quote -> quote.charAt(0))
- .ifPresent(builder::setQuoteChar);
- }
-
- options.getOptional(ARRAY_ELEMENT_DELIMITER).ifPresent(builder::setArrayElementSeparator);
-
- options.getOptional(ESCAPE_CHARACTER).map(quote -> quote.charAt(0)).ifPresent(builder::setEscapeChar);
-
- Optional.ofNullable(options.get(NULL_LITERAL)).ifPresent(builder::setNullValue);
-
- CsvSchema csvSchema = builder.build();
-
- return csvSchema;
- }
-
- public static CsvSchema convert(StructType schema) {
- CsvSchema.Builder builder = new CsvSchema.Builder();
- StructType.StructField[] fields = schema.fields;
- for (int i = 0; i < fields.length; i++) {
- String fieldName = fields[i].name;
- DataType dataType = fields[i].dataType;
- builder.addColumn(new CsvSchema.Column(i, fieldName, convertType(fieldName, dataType)));
- }
- return builder.build();
- }
-
- private static CsvSchema.ColumnType convertType(String fieldName, DataType dataType) {
- if (dataType instanceof StringType) {
- return CsvSchema.ColumnType.STRING;
- } else if (dataType instanceof IntegerType || dataType instanceof LongType || dataType instanceof FloatType || dataType instanceof DoubleType) {
- return CsvSchema.ColumnType.NUMBER;
- } else if (dataType instanceof BooleanType) {
- return CsvSchema.ColumnType.BOOLEAN;
- } else if (dataType instanceof ArrayType) {
- validateNestedField(fieldName, ((ArrayType) dataType).elementType);
- return CsvSchema.ColumnType.ARRAY;
- } else if (dataType instanceof StructType) {
- StructType rowType = (StructType) dataType;
- for (StructType.StructField field : rowType.fields) {
- validateNestedField(fieldName, field.dataType);
- }
- return CsvSchema.ColumnType.ARRAY;
- } else {
- throw new IllegalArgumentException(
- "Unsupported type '" + dataType + "' for field '" + fieldName + "'.");
- }
- }
-
- private static void validateNestedField(String fieldName, DataType dataType) {
- if (!(dataType instanceof StringType || dataType instanceof IntegerType || dataType instanceof LongType ||
- dataType instanceof FloatType || dataType instanceof DoubleType || dataType instanceof BooleanType)) {
- throw new IllegalArgumentException(
- "Only simple types are supported in the second level nesting of fields '"
- + fieldName
- + "' but was: "
- + dataType);
- }
- }
-
- // ------------------------------------------------------------------------
- // Validation
- // ------------------------------------------------------------------------
-
- static void validateFormatOptions(ReadableConfig tableOptions) {
- final boolean hasQuoteCharacter = tableOptions.getOptional(QUOTE_CHARACTER).isPresent();
- final boolean isDisabledQuoteCharacter = tableOptions.get(DISABLE_QUOTE_CHARACTER);
- if (isDisabledQuoteCharacter && hasQuoteCharacter) {
- throw new ValidationException(
- "Format cannot define a quote character and disabled quote character at the same time.");
- }
- // Validate the option value must be a single char.
- validateCharacterVal(tableOptions, FIELD_DELIMITER, true);
- validateCharacterVal(tableOptions, ARRAY_ELEMENT_DELIMITER);
- validateCharacterVal(tableOptions, QUOTE_CHARACTER);
- validateCharacterVal(tableOptions, ESCAPE_CHARACTER);
- }
-
- /** Validates the option {@code option} value must be a Character. */
- private static void validateCharacterVal(
- ReadableConfig tableOptions, ConfigOption<String> option) {
- validateCharacterVal(tableOptions, option, false);
- }
-
- /**
- * Validates the option {@code option} value must be a Character.
- *
- * @param tableOptions the table options
- * @param option the config option
- * @param unescape whether to unescape the option value
- */
- private static void validateCharacterVal(
- ReadableConfig tableOptions, ConfigOption<String> option, boolean unescape) {
- if (tableOptions.getOptional(option).isPresent()) {
- final String value =
- unescape
- ? StringEscapeUtils.unescapeJava(tableOptions.get(option))
- : tableOptions.get(option);
- if (value.length() != 1) {
- throw new ValidationException(
- String.format(
- "Option '%s.%s' must be a string with single character, but was: %s",
- IDENTIFIER, option.key(), tableOptions.get(option)));
- }
- }
- }
-}
+package com.geedgenetworks.formats.csv; + +import com.geedgenetworks.api.connector.serialization.DecodingFormat; +import com.geedgenetworks.api.connector.serialization.EncodingFormat; +import com.geedgenetworks.api.factory.DecodingFormatFactory; +import com.geedgenetworks.api.factory.EncodingFormatFactory; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.factory.ConnectorFactory; +import com.geedgenetworks.api.connector.type.*; +import org.apache.commons.lang3.StringEscapeUtils; +import org.apache.flink.configuration.ConfigOption; +import org.apache.flink.configuration.ReadableConfig; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema; +import org.apache.flink.table.api.ValidationException; +import org.apache.flink.util.Preconditions; + +import java.util.Collections; +import java.util.HashSet; +import java.util.Optional; +import java.util.Set; + +import static com.geedgenetworks.formats.csv.CsvFormatOptions.*; + +public class CsvFormatFactory implements DecodingFormatFactory, EncodingFormatFactory { + public static final String IDENTIFIER = "csv"; + + @Override + public String type() { + return IDENTIFIER; + } + + @Override + public DecodingFormat createDecodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) { + FactoryUtil.validateFactoryOptions(this, formatOptions); + validateFormatOptions(formatOptions); + final boolean ignoreParseErrors = formatOptions.get(IGNORE_PARSE_ERRORS); + return dataType -> { + Preconditions.checkNotNull(dataType, "csv format require schema"); + CsvSchema csvSchema = getCsvSchema(dataType, formatOptions); + return new CsvEventDeserializationSchema(dataType, csvSchema, ignoreParseErrors); + }; + } + + @Override + public EncodingFormat createEncodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) { + FactoryUtil.validateFactoryOptions(this, formatOptions); + validateFormatOptions(formatOptions); + return dataType -> { + Preconditions.checkNotNull(dataType, "csv format require schema"); + CsvSchema csvSchema = getCsvSchema(dataType, formatOptions); + return new CsvEventSerializationSchema(dataType, csvSchema); + }; + } + + @Override + public Set<ConfigOption<?>> requiredOptions() { + return Collections.emptySet(); + } + + @Override + public Set<ConfigOption<?>> optionalOptions() { + Set<ConfigOption<?>> options = new HashSet<>(); + options.add(FIELD_DELIMITER); + options.add(DISABLE_QUOTE_CHARACTER); + options.add(QUOTE_CHARACTER); + options.add(ALLOW_COMMENTS); + options.add(IGNORE_PARSE_ERRORS); + options.add(ARRAY_ELEMENT_DELIMITER); + options.add(ESCAPE_CHARACTER); + options.add(NULL_LITERAL); + return options; + } + + static CsvSchema getCsvSchema(StructType dataType, ReadableConfig options){ + CsvSchema.Builder builder = convert(dataType).rebuild(); + + options.getOptional(FIELD_DELIMITER) + .map(delimiter -> StringEscapeUtils.unescapeJava(delimiter).charAt(0)) + .ifPresent(builder::setColumnSeparator); + + if (options.get(DISABLE_QUOTE_CHARACTER)) { + builder.disableQuoteChar(); + } else { + options.getOptional(QUOTE_CHARACTER) + .map(quote -> quote.charAt(0)) + .ifPresent(builder::setQuoteChar); + } + + options.getOptional(ARRAY_ELEMENT_DELIMITER).ifPresent(builder::setArrayElementSeparator); + + options.getOptional(ESCAPE_CHARACTER).map(quote -> quote.charAt(0)).ifPresent(builder::setEscapeChar); + + Optional.ofNullable(options.get(NULL_LITERAL)).ifPresent(builder::setNullValue); + + CsvSchema csvSchema = builder.build(); + + return 
csvSchema; + } + + public static CsvSchema convert(StructType schema) { + CsvSchema.Builder builder = new CsvSchema.Builder(); + StructType.StructField[] fields = schema.fields; + for (int i = 0; i < fields.length; i++) { + String fieldName = fields[i].name; + DataType dataType = fields[i].dataType; + builder.addColumn(new CsvSchema.Column(i, fieldName, convertType(fieldName, dataType))); + } + return builder.build(); + } + + private static CsvSchema.ColumnType convertType(String fieldName, DataType dataType) { + if (dataType instanceof StringType) { + return CsvSchema.ColumnType.STRING; + } else if (dataType instanceof IntegerType || dataType instanceof LongType || dataType instanceof FloatType || dataType instanceof DoubleType) { + return CsvSchema.ColumnType.NUMBER; + } else if (dataType instanceof BooleanType) { + return CsvSchema.ColumnType.BOOLEAN; + } else if (dataType instanceof ArrayType) { + validateNestedField(fieldName, ((ArrayType) dataType).elementType); + return CsvSchema.ColumnType.ARRAY; + } else if (dataType instanceof StructType) { + StructType rowType = (StructType) dataType; + for (StructType.StructField field : rowType.fields) { + validateNestedField(fieldName, field.dataType); + } + return CsvSchema.ColumnType.ARRAY; + } else { + throw new IllegalArgumentException( + "Unsupported type '" + dataType + "' for field '" + fieldName + "'."); + } + } + + private static void validateNestedField(String fieldName, DataType dataType) { + if (!(dataType instanceof StringType || dataType instanceof IntegerType || dataType instanceof LongType || + dataType instanceof FloatType || dataType instanceof DoubleType || dataType instanceof BooleanType)) { + throw new IllegalArgumentException( + "Only simple types are supported in the second level nesting of fields '" + + fieldName + + "' but was: " + + dataType); + } + } + + // ------------------------------------------------------------------------ + // Validation + // ------------------------------------------------------------------------ + + static void validateFormatOptions(ReadableConfig tableOptions) { + final boolean hasQuoteCharacter = tableOptions.getOptional(QUOTE_CHARACTER).isPresent(); + final boolean isDisabledQuoteCharacter = tableOptions.get(DISABLE_QUOTE_CHARACTER); + if (isDisabledQuoteCharacter && hasQuoteCharacter) { + throw new ValidationException( + "Format cannot define a quote character and disabled quote character at the same time."); + } + // Validate the option value must be a single char. + validateCharacterVal(tableOptions, FIELD_DELIMITER, true); + validateCharacterVal(tableOptions, ARRAY_ELEMENT_DELIMITER); + validateCharacterVal(tableOptions, QUOTE_CHARACTER); + validateCharacterVal(tableOptions, ESCAPE_CHARACTER); + } + + /** Validates the option {@code option} value must be a Character. */ + private static void validateCharacterVal( + ReadableConfig tableOptions, ConfigOption<String> option) { + validateCharacterVal(tableOptions, option, false); + } + + /** + * Validates the option {@code option} value must be a Character. + * + * @param tableOptions the table options + * @param option the config option + * @param unescape whether to unescape the option value + */ + private static void validateCharacterVal( + ReadableConfig tableOptions, ConfigOption<String> option, boolean unescape) { + if (tableOptions.getOptional(option).isPresent()) { + final String value = + unescape + ? 
StringEscapeUtils.unescapeJava(tableOptions.get(option)) + : tableOptions.get(option); + if (value.length() != 1) { + throw new ValidationException( + String.format( + "Option '%s.%s' must be a string with single character, but was: %s", + IDENTIFIER, option.key(), tableOptions.get(option))); + } + } + } +} diff --git a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvSerializer.java b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvSerializer.java index 170a2b6..a39b0ee 100644 --- a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvSerializer.java +++ b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvSerializer.java @@ -1,6 +1,7 @@ package com.geedgenetworks.formats.csv; -import com.geedgenetworks.core.types.*; +import com.geedgenetworks.api.connector.type.*; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonNode; import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectWriter; import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.*; diff --git a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvToMapDataConverter.java b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvToMapDataConverter.java index f0d2e79..71fdbf3 100644 --- a/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvToMapDataConverter.java +++ b/groot-formats/format-csv/src/main/java/com/geedgenetworks/formats/csv/CsvToMapDataConverter.java @@ -1,222 +1,222 @@ -package com.geedgenetworks.formats.csv;
-
-import com.geedgenetworks.core.types.*;
-import org.apache.commons.lang3.StringUtils;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonNode;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectReader;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ArrayNode;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvMapper;
-import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.Serializable;
-import java.util.*;
-
-public class CsvToMapDataConverter implements Serializable {
- private static final Logger LOG = LoggerFactory.getLogger(CsvToMapDataConverter.class);
- private final StructType dataType;
- private final CsvSchema csvSchema;
- private final boolean ignoreParseErrors;
- private final ValueConverter valueConverter;
- private transient ObjectReader objectReader;
-
- public CsvToMapDataConverter(StructType dataType, CsvSchema csvSchema, boolean ignoreParseErrors) {
- this.dataType = dataType;
- this.csvSchema = csvSchema;
- this.ignoreParseErrors = ignoreParseErrors;
- this.valueConverter = createRowConverter(dataType, true);
- this.objectReader = new CsvMapper().readerFor(JsonNode.class).with(csvSchema);
- }
-
- public Map<String, Object> convert(String message) {
- if (objectReader == null) {
- this.objectReader = new CsvMapper().readerFor(JsonNode.class).with(csvSchema);
- }
- try {
- final JsonNode root = objectReader.readValue(message);
- return (Map<String, Object>) valueConverter.convert(root);
- } catch (Throwable t) {
- if (ignoreParseErrors) {
- LOG.error(String.format("CSV Parse Errors:%s", message), t);
- return null;
- }
- throw new UnsupportedOperationException(String.format("CSV Parse Errors:%s", message), t);
- }
- }
-
- private ValueConverter createRowConverter(StructType rowType, boolean isTopLevel) {
- final ValueConverter[] fieldConverters = Arrays.stream(rowType.fields).map(f -> makeConverter(f.dataType)).toArray(ValueConverter[]::new);
- final String[] fields = Arrays.stream(rowType.fields).map(f -> f.name).toArray(String[]::new);
- final int arity = fields.length;
- return node -> {
- int nodeSize = node.size();
-
- if (nodeSize != 0) {
- validateArity(arity, nodeSize, ignoreParseErrors);
- } else {
- return null;
- }
-
- Map<String, Object> obj = new HashMap<>();
- Object value;
- for (int i = 0; i < arity; i++) {
- JsonNode field;
- // Jackson only supports mapping by name in the first level
- if (isTopLevel) {
- field = node.get(fields[i]);
- } else {
- field = node.get(i);
- }
- if (field != null && !field.isNull()) {
- value = fieldConverters[i].convert(field);
- if (value != null) {
- obj.put(fields[i], value);
- }
- }
- }
- return obj;
- };
- }
-
- private ValueConverter createArrayConverter(ArrayType arrayType) {
- final ValueConverter converter = makeConverter(arrayType.elementType);
- return node -> {
- final ArrayNode arrayNode = (ArrayNode) node;
- if (arrayNode.size() == 0) {
- return null;
- }
- List<Object> objs = new ArrayList<>(arrayNode.size());
- for (int i = 0; i < arrayNode.size(); i++) {
- final JsonNode innerNode = arrayNode.get(i);
- if (innerNode == null || innerNode.isNull()) {
- objs.add(null);
- }else{
- objs.add(converter.convert(innerNode));
- }
- }
- return objs;
- };
- }
-
- private ValueConverter makeConverter(DataType dataType) {
- if (dataType instanceof StringType) {
- return this::convertToString;
- }
-
- if (dataType instanceof IntegerType) {
- return this::convertToInteger;
- }
-
- if (dataType instanceof LongType) {
- return this::convertToLong;
- }
-
- if (dataType instanceof FloatType) {
- return this::convertToFloat;
- }
-
- if (dataType instanceof DoubleType) {
- return this::convertToDouble;
- }
-
- if (dataType instanceof BooleanType) {
- return this::convertToBoolean;
- }
-
- if (dataType instanceof StructType) {
- return createRowConverter((StructType) dataType, false);
- }
-
- if (dataType instanceof ArrayType) {
- return createArrayConverter((ArrayType) dataType);
- }
-
- throw new UnsupportedOperationException("unsupported dataType: " + dataType);
- }
-
- private String convertToString(JsonNode node) {
- return node.asText();
- }
-
- private Integer convertToInteger(JsonNode node) {
- if (node.canConvertToInt()) {
- // avoid redundant toString and parseInt, for better performance
- return node.asInt();
- } else {
- String text = node.asText().trim();
- if (StringUtils.isBlank(text)) {
- return null;
- }
- return Integer.parseInt(text);
- }
- }
-
- private Long convertToLong(JsonNode node) {
- if (node.canConvertToLong()) {
- // avoid redundant toString and parseLong, for better performance
- return node.asLong();
- } else {
- String text = node.asText().trim();
- if (StringUtils.isBlank(text)) {
- return null;
- }
- return Long.parseLong(text);
- }
- }
-
- private Float convertToFloat(JsonNode node) {
- if (node.isDouble()) {
- // avoid redundant toString and parseDouble, for better performance
- return (float) node.asDouble();
- } else {
- String text = node.asText().trim();
- if (StringUtils.isBlank(text)) {
- return null;
- }
- return Float.parseFloat(text);
- }
- }
-
- private Double convertToDouble(JsonNode node) {
- if (node.isDouble()) {
- // avoid redundant toString and parseDouble, for better performance
- return node.asDouble();
- } else {
- String text = node.asText().trim();
- if (StringUtils.isBlank(text)) {
- return null;
- }
- return Double.parseDouble(text);
- }
- }
-
- private Boolean convertToBoolean(JsonNode node) {
- if (node.isBoolean()) {
- // avoid redundant toString and parseBoolean, for better performance
- return node.asBoolean();
- } else {
- String text = node.asText().trim();
- if (StringUtils.isBlank(text)) {
- return null;
- }
- return Boolean.parseBoolean(text);
- }
- }
-
- private static void validateArity(int expected, int actual, boolean ignoreParseErrors) {
- if (expected > actual && !ignoreParseErrors) {
- throw new RuntimeException(
- "Row length mismatch. "
- + expected
- + " fields expected but was "
- + actual
- + ".");
- }
- }
-
- @FunctionalInterface
- public interface ValueConverter extends Serializable {
- Object convert(JsonNode node) throws Exception;
- }
-}
+package com.geedgenetworks.formats.csv; + +import com.geedgenetworks.api.connector.type.*; +import org.apache.commons.lang3.StringUtils; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonNode; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectReader; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ArrayNode; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvMapper; +import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Serializable; +import java.util.*; + +public class CsvToMapDataConverter implements Serializable { + private static final Logger LOG = LoggerFactory.getLogger(CsvToMapDataConverter.class); + private final StructType dataType; + private final CsvSchema csvSchema; + private final boolean ignoreParseErrors; + private final ValueConverter valueConverter; + private transient ObjectReader objectReader; + + public CsvToMapDataConverter(StructType dataType, CsvSchema csvSchema, boolean ignoreParseErrors) { + this.dataType = dataType; + this.csvSchema = csvSchema; + this.ignoreParseErrors = ignoreParseErrors; + this.valueConverter = createRowConverter(dataType, true); + this.objectReader = new CsvMapper().readerFor(JsonNode.class).with(csvSchema); + } + + public Map<String, Object> convert(String message) { + if (objectReader == null) { + this.objectReader = new CsvMapper().readerFor(JsonNode.class).with(csvSchema); + } + try { + final JsonNode root = objectReader.readValue(message); + return (Map<String, Object>) valueConverter.convert(root); + } catch (Throwable t) { + if (ignoreParseErrors) { + LOG.error(String.format("CSV Parse Errors:%s", message), t); + return null; + } + throw new UnsupportedOperationException(String.format("CSV Parse Errors:%s", message), t); + } + } + + private ValueConverter createRowConverter(StructType rowType, boolean isTopLevel) { + final ValueConverter[] fieldConverters = Arrays.stream(rowType.fields).map(f -> makeConverter(f.dataType)).toArray(ValueConverter[]::new); + final String[] fields = Arrays.stream(rowType.fields).map(f -> f.name).toArray(String[]::new); + final int arity = fields.length; + return node -> { + int nodeSize = node.size(); + + if (nodeSize != 0) { + validateArity(arity, nodeSize, ignoreParseErrors); + } else { + return null; + } + + Map<String, Object> obj = new HashMap<>(); + Object value; + for (int i = 0; i < arity; i++) { + JsonNode field; + // Jackson only supports mapping by name in the first level + if (isTopLevel) { + field = node.get(fields[i]); + } else { + field = node.get(i); + } + if (field != null && !field.isNull()) { + value = fieldConverters[i].convert(field); + if (value != null) { + obj.put(fields[i], value); + } + } + } + return obj; + }; + } + + private ValueConverter createArrayConverter(ArrayType arrayType) { + final ValueConverter converter = makeConverter(arrayType.elementType); + return node -> { + final ArrayNode arrayNode = (ArrayNode) node; + if (arrayNode.size() == 0) { + return null; + } + List<Object> objs = new ArrayList<>(arrayNode.size()); + for (int i = 0; i < arrayNode.size(); i++) { + final JsonNode innerNode = arrayNode.get(i); + if (innerNode == null || innerNode.isNull()) { + objs.add(null); + }else{ + objs.add(converter.convert(innerNode)); + } + } + return objs; + }; + } + + private ValueConverter makeConverter(DataType dataType) { + if (dataType instanceof 
StringType) { + return this::convertToString; + } + + if (dataType instanceof IntegerType) { + return this::convertToInteger; + } + + if (dataType instanceof LongType) { + return this::convertToLong; + } + + if (dataType instanceof FloatType) { + return this::convertToFloat; + } + + if (dataType instanceof DoubleType) { + return this::convertToDouble; + } + + if (dataType instanceof BooleanType) { + return this::convertToBoolean; + } + + if (dataType instanceof StructType) { + return createRowConverter((StructType) dataType, false); + } + + if (dataType instanceof ArrayType) { + return createArrayConverter((ArrayType) dataType); + } + + throw new UnsupportedOperationException("unsupported dataType: " + dataType); + } + + private String convertToString(JsonNode node) { + return node.asText(); + } + + private Integer convertToInteger(JsonNode node) { + if (node.canConvertToInt()) { + // avoid redundant toString and parseInt, for better performance + return node.asInt(); + } else { + String text = node.asText().trim(); + if (StringUtils.isBlank(text)) { + return null; + } + return Integer.parseInt(text); + } + } + + private Long convertToLong(JsonNode node) { + if (node.canConvertToLong()) { + // avoid redundant toString and parseLong, for better performance + return node.asLong(); + } else { + String text = node.asText().trim(); + if (StringUtils.isBlank(text)) { + return null; + } + return Long.parseLong(text); + } + } + + private Float convertToFloat(JsonNode node) { + if (node.isDouble()) { + // avoid redundant toString and parseDouble, for better performance + return (float) node.asDouble(); + } else { + String text = node.asText().trim(); + if (StringUtils.isBlank(text)) { + return null; + } + return Float.parseFloat(text); + } + } + + private Double convertToDouble(JsonNode node) { + if (node.isDouble()) { + // avoid redundant toString and parseDouble, for better performance + return node.asDouble(); + } else { + String text = node.asText().trim(); + if (StringUtils.isBlank(text)) { + return null; + } + return Double.parseDouble(text); + } + } + + private Boolean convertToBoolean(JsonNode node) { + if (node.isBoolean()) { + // avoid redundant toString and parseBoolean, for better performance + return node.asBoolean(); + } else { + String text = node.asText().trim(); + if (StringUtils.isBlank(text)) { + return null; + } + return Boolean.parseBoolean(text); + } + } + + private static void validateArity(int expected, int actual, boolean ignoreParseErrors) { + if (expected > actual && !ignoreParseErrors) { + throw new RuntimeException( + "Row length mismatch. " + + expected + + " fields expected but was " + + actual + + "."); + } + } + + @FunctionalInterface + public interface ValueConverter extends Serializable { + Object convert(JsonNode node) throws Exception; + } +} diff --git a/groot-formats/format-csv/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory b/groot-formats/format-csv/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory new file mode 100644 index 0000000..e0ac788 --- /dev/null +++ b/groot-formats/format-csv/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory @@ -0,0 +1 @@ +com.geedgenetworks.formats.csv.CsvFormatFactory
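The services-file change here (the new META-INF/services/com.geedgenetworks.api.factory.Factory registration above, replacing the core.factories.Factory entry deleted just below) is the SPI switch this merge is named for: format factories are now discovered under the new Factory interface. A minimal discovery sketch using the standard java.util.ServiceLoader; whether FactoryUtil wraps ServiceLoader exactly like this is an assumption.

```java
import com.geedgenetworks.api.factory.Factory;

import java.util.ServiceLoader;

public class FactoryDiscoverySketch {
    public static void main(String[] args) {
        // Scans META-INF/services/com.geedgenetworks.api.factory.Factory
        // on the classpath and instantiates each listed implementation.
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            // e.g. com.geedgenetworks.formats.csv.CsvFormatFactory
            System.out.println(factory.getClass().getName());
        }
    }
}
```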
\ No newline at end of file diff --git a/groot-formats/format-csv/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-formats/format-csv/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory deleted file mode 100644 index e417fa4..0000000 --- a/groot-formats/format-csv/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ /dev/null @@ -1 +0,0 @@ -com.geedgenetworks.formats.csv.CsvFormatFactory diff --git a/groot-formats/format-csv/src/test/java/com/geedgenetworks/formats/csv/CsvEventSerDeSchemaTest.java b/groot-formats/format-csv/src/test/java/com/geedgenetworks/formats/csv/CsvEventSerDeSchemaTest.java index 5142646..bf65a36 100644 --- a/groot-formats/format-csv/src/test/java/com/geedgenetworks/formats/csv/CsvEventSerDeSchemaTest.java +++ b/groot-formats/format-csv/src/test/java/com/geedgenetworks/formats/csv/CsvEventSerDeSchemaTest.java @@ -1,219 +1,219 @@ -package com.geedgenetworks.formats.csv;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.MapDeserialization;
-import com.geedgenetworks.core.connector.schema.Schema;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.EncodingFormatFactory;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.TableFactory;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.apache.commons.lang3.StringUtils;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.api.common.serialization.SerializationSchema;
-import org.apache.flink.configuration.Configuration;
-import org.junit.jupiter.api.Test;
-
-import java.nio.charset.StandardCharsets;
-import java.util.*;
-
-public class CsvEventSerDeSchemaTest {
-
- @Test
- public void testSimpleSerializeDeserialize() throws Exception {
- StructType dataType = Types.parseStructType("int:int,bigint:bigint,double:double,string:string");
- Map<String, String> options = new HashMap<>();
- TableFactory.Context context = new TableFactory.Context(Schema.newSchema(dataType), options, Configuration.fromMap(options));
-
- // obtain the deserialization and serialization schemas
- DeserializationSchema<Event> deserialization = FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv")
- .createDecodingFormat(context, context.getConfiguration()).createRuntimeDecoder(dataType);
- SerializationSchema<Event> serialization = FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv")
- .createEncodingFormat(context, context.getConfiguration()).createRuntimeEncoder(dataType);
-
- deserialization.open(null);
- serialization.open(null);
-
- Map<String, Object> map = new HashMap<>();
- map.put("int", 1);
- map.put("bigint", "2");
- map.put("double", "10.2");
- map.put("string", "utf-8字符串");
- Event row = new Event();
- row.setExtractedFields(map);
-
- byte[] bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- Map<String, Object> rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
- // deserialize back into a map
- if(deserialization instanceof MapDeserialization){
- MapDeserialization mapDeserialization = (MapDeserialization) deserialization;
- Map<String, Object> rstMap = mapDeserialization.deserializeToMap(bytes);
- System.out.println(rstMap);
- }
- }
-
- @Test
- public void testSerializeDeserialize() throws Exception {
- StructType dataType = Types.parseStructType("int:int,bigint:bigint,double:double,string:string,int_array:array<int>,struct:struct<int:int,string:string>");
- Map<String, String> options = new HashMap<>();
- TableFactory.Context context = new TableFactory.Context(Schema.newSchema(dataType), options, Configuration.fromMap(options));
-
- DeserializationSchema<Event> deserialization = FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv")
- .createDecodingFormat(context, context.getConfiguration()).createRuntimeDecoder(dataType);
- SerializationSchema<Event> serialization = FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv")
- .createEncodingFormat(context, context.getConfiguration()).createRuntimeEncoder(dataType);
-
- deserialization.open(null);
- serialization.open(null);
-
- Map<String, Object> map = new HashMap<>();
- map.put("int", 1);
- map.put("bigint", "2");
- map.put("double", "10.2");
- map.put("string", "utf-8字符串");
- map.put("int_array", Arrays.asList(1 , "2", 3));
- map.put("struct", Map.of("int", "1", "string", 22));
- Event row = new Event();
- row.setExtractedFields(map);
-
- byte[] bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- Map<String, Object> rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
- System.out.println(StringUtils.repeat('*', 60));
-
- map = new HashMap<>();
- row = new Event();
- map.put("int", 1);
- map.put("double", "10.2");
- map.put("int_array", Arrays.asList(1 , null, null));
- map.put("struct", Map.of( "string", 22));
- row.setExtractedFields(map);
-
- bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
-
- System.out.println(StringUtils.repeat('*', 60));
-
- map = new HashMap<>();
- row = new Event();
- row.setExtractedFields(map);
-
- bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
- System.out.println(StringUtils.repeat('*', 60));
-
- map = new HashMap<>();
- row = new Event();
- map.put("int", 1);
- map.put("bigint", "2");
- map.put("double", "10.2");
- map.put("string", "utf-8字符串");
- map.put("int_array", List.of(1 , "2", 3));
- map.put("struct", Map.of("int", "1", "string", 22));
- row.setExtractedFields(map);
-
- bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
- }
-
-
- @Test
- public void testNullableFieldSerializeDeserialize() throws Exception {
- StructType dataType = Types.parseStructType("int:int,bigint:bigint,double:double,string:string,int_array:array<int>,struct:struct<int:int,string:string>");
- Map<String, String> options = new HashMap<>();
- options.put(CsvFormatOptions.NULL_LITERAL.key(), "null");
- options.put(CsvFormatOptions.IGNORE_PARSE_ERRORS.key(), "true");
- TableFactory.Context context = new TableFactory.Context(Schema.newSchema(dataType), options, Configuration.fromMap(options));
-
- DeserializationSchema<Event> deserialization = FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv")
- .createDecodingFormat(context, context.getConfiguration()).createRuntimeDecoder(dataType);
- SerializationSchema<Event> serialization = FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv")
- .createEncodingFormat(context, context.getConfiguration()).createRuntimeEncoder(dataType);
-
- deserialization.open(null);
- serialization.open(null);
-
- Map<String, Object> map = new HashMap<>();
- map.put("int", 1);
- map.put("bigint", "2");
- map.put("double", "10.2");
- map.put("string", "utf-8字符串");
- map.put("int_array", Arrays.asList(1 , "2", 3));
- map.put("struct", Map.of("int", "1", "string", 22));
- Event row = new Event();
- row.setExtractedFields(map);
-
- byte[] bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- Map<String, Object> rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
- System.out.println(StringUtils.repeat('*', 60));
-
- map = new HashMap<>();
- row = new Event();
- map.put("int", 1);
- map.put("double", "10.2");
- map.put("int_array", Arrays.asList(1 , null, null));
- map.put("struct", Map.of( "string", 22));
- row.setExtractedFields(map);
-
- bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
-
- System.out.println(StringUtils.repeat('*', 60));
-
- map = new HashMap<>();
- row = new Event();
- row.setExtractedFields(map);
-
- bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
-
- System.out.println(StringUtils.repeat('*', 60));
-
- map = new HashMap<>();
- row = new Event();
- map.put("int", 1);
- map.put("bigint", "2");
- map.put("double", "10.2");
- map.put("string", "utf-8字符串");
- map.put("int_array", List.of(1 , "2", 3));
- map.put("struct", Map.of("int", "1", "string", 22));
- row.setExtractedFields(map);
-
- bytes = serialization.serialize(row);
- System.out.println(map);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- rst = deserialization.deserialize(bytes).getExtractedFields();
- System.out.println(rst);
- }
-
+package com.geedgenetworks.formats.csv; + +import com.geedgenetworks.api.connector.serialization.MapDeserialization; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.DecodingFormatFactory; +import com.geedgenetworks.api.factory.EncodingFormatFactory; +import com.geedgenetworks.api.factory.FactoryUtil; +import com.geedgenetworks.api.factory.ConnectorFactory; +import com.geedgenetworks.api.connector.schema.Schema; +import com.geedgenetworks.api.connector.type.StructType; +import com.geedgenetworks.api.connector.type.Types; +import org.apache.commons.lang3.StringUtils; +import org.apache.flink.api.common.serialization.DeserializationSchema; +import org.apache.flink.api.common.serialization.SerializationSchema; +import org.apache.flink.configuration.Configuration; +import org.junit.jupiter.api.Test; + +import java.nio.charset.StandardCharsets; +import java.util.*; + +public class CsvEventSerDeSchemaTest { + + @Test + public void testSimpleSerializeDeserialize() throws Exception { + StructType dataType = Types.parseStructType("int:int,bigint:bigint,double:double,string:string"); + Map<String, String> options = new HashMap<>(); + ConnectorFactory.Context context = new ConnectorFactory.Context(Schema.newSchema(dataType), options, Configuration.fromMap(options)); + + // 获取deserialization和serialization + DeserializationSchema<Event> deserialization = FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv") + .createDecodingFormat(context, context.getConfiguration()).createRuntimeDecoder(dataType); + SerializationSchema<Event> serialization = FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv") + .createEncodingFormat(context, context.getConfiguration()).createRuntimeEncoder(dataType); + + deserialization.open(null); + serialization.open(null); + + Map<String, Object> map = new HashMap<>(); + map.put("int", 1); + map.put("bigint", "2"); + map.put("double", "10.2"); + map.put("string", "utf-8字符串"); + Event row = new Event(); + row.setExtractedFields(map); + + byte[] bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + Map<String, Object> rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + // 反序列成map + if(deserialization instanceof MapDeserialization){ + MapDeserialization mapDeserialization = (MapDeserialization) deserialization; + Map<String, Object> rstMap = mapDeserialization.deserializeToMap(bytes); + System.out.println(rstMap); + } + } + + @Test + public void testSerializeDeserialize() throws Exception { + StructType dataType = Types.parseStructType("int:int,bigint:bigint,double:double,string:string,int_array:array<int>,struct:struct<int:int,string:string>"); + Map<String, String> options = new HashMap<>(); + ConnectorFactory.Context context = new ConnectorFactory.Context(Schema.newSchema(dataType), options, Configuration.fromMap(options)); + + DeserializationSchema<Event> deserialization = FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv") + .createDecodingFormat(context, context.getConfiguration()).createRuntimeDecoder(dataType); + SerializationSchema<Event> serialization = FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv") + .createEncodingFormat(context, context.getConfiguration()).createRuntimeEncoder(dataType); + + deserialization.open(null); + serialization.open(null); + + Map<String, Object> map = new HashMap<>(); + 
map.put("int", 1); + map.put("bigint", "2"); + map.put("double", "10.2"); + map.put("string", "utf-8字符串"); + map.put("int_array", Arrays.asList(1 , "2", 3)); + map.put("struct", Map.of("int", "1", "string", 22)); + Event row = new Event(); + row.setExtractedFields(map); + + byte[] bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + Map<String, Object> rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + System.out.println(StringUtils.repeat('*', 60)); + + map = new HashMap<>(); + row = new Event(); + map.put("int", 1); + map.put("double", "10.2"); + map.put("int_array", Arrays.asList(1 , null, null)); + map.put("struct", Map.of( "string", 22)); + row.setExtractedFields(map); + + bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + + System.out.println(StringUtils.repeat('*', 60)); + + map = new HashMap<>(); + row = new Event(); + row.setExtractedFields(map); + + bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + System.out.println(StringUtils.repeat('*', 60)); + + map = new HashMap<>(); + row = new Event(); + map.put("int", 1); + map.put("bigint", "2"); + map.put("double", "10.2"); + map.put("string", "utf-8字符串"); + map.put("int_array", List.of(1 , "2", 3)); + map.put("struct", Map.of("int", "1", "string", 22)); + row.setExtractedFields(map); + + bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + } + + + @Test + public void testNullableFieldSerializeDeserialize() throws Exception { + StructType dataType = Types.parseStructType("int:int,bigint:bigint,double:double,string:string,int_array:array<int>,struct:struct<int:int,string:string>"); + Map<String, String> options = new HashMap<>(); + options.put(CsvFormatOptions.NULL_LITERAL.key(), "null"); + options.put(CsvFormatOptions.IGNORE_PARSE_ERRORS.key(), "true"); + ConnectorFactory.Context context = new ConnectorFactory.Context(Schema.newSchema(dataType), options, Configuration.fromMap(options)); + + DeserializationSchema<Event> deserialization = FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv") + .createDecodingFormat(context, context.getConfiguration()).createRuntimeDecoder(dataType); + SerializationSchema<Event> serialization = FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv") + .createEncodingFormat(context, context.getConfiguration()).createRuntimeEncoder(dataType); + + deserialization.open(null); + serialization.open(null); + + Map<String, Object> map = new HashMap<>(); + map.put("int", 1); + map.put("bigint", "2"); + map.put("double", "10.2"); + map.put("string", "utf-8字符串"); + map.put("int_array", Arrays.asList(1 , "2", 3)); + map.put("struct", Map.of("int", "1", "string", 22)); + Event row = new Event(); + row.setExtractedFields(map); + + byte[] bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + Map<String, Object> rst = 
deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + System.out.println(StringUtils.repeat('*', 60)); + + map = new HashMap<>(); + row = new Event(); + map.put("int", 1); + map.put("double", "10.2"); + map.put("int_array", Arrays.asList(1 , null, null)); + map.put("struct", Map.of( "string", 22)); + row.setExtractedFields(map); + + bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + + System.out.println(StringUtils.repeat('*', 60)); + + map = new HashMap<>(); + row = new Event(); + row.setExtractedFields(map); + + bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + + System.out.println(StringUtils.repeat('*', 60)); + + map = new HashMap<>(); + row = new Event(); + map.put("int", 1); + map.put("bigint", "2"); + map.put("double", "10.2"); + map.put("string", "utf-8字符串"); + map.put("int_array", List.of(1 , "2", 3)); + map.put("struct", Map.of("int", "1", "string", 22)); + row.setExtractedFields(map); + + bytes = serialization.serialize(row); + System.out.println(map); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + rst = deserialization.deserialize(bytes).getExtractedFields(); + System.out.println(rst); + } + }
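The migrated test above exercises the relocated factory SPI end to end. As a standalone illustration, here is a minimal round-trip sketch against the post-migration package layout (`com.geedgenetworks.api.*`) introduced by this merge. The schema and field values are illustrative rather than taken from the test, and it assumes a `csv` format is registered under `META-INF/services/com.geedgenetworks.api.factory.Factory`.

```java
import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.api.connector.schema.Schema;
import com.geedgenetworks.api.connector.type.StructType;
import com.geedgenetworks.api.connector.type.Types;
import com.geedgenetworks.api.factory.ConnectorFactory;
import com.geedgenetworks.api.factory.DecodingFormatFactory;
import com.geedgenetworks.api.factory.EncodingFormatFactory;
import com.geedgenetworks.api.factory.FactoryUtil;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.configuration.Configuration;

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class CsvRoundTripSketch {
    public static void main(String[] args) throws Exception {
        StructType dataType = Types.parseStructType("id:int,name:string");
        Map<String, String> options = new HashMap<>();
        ConnectorFactory.Context context = new ConnectorFactory.Context(
                Schema.newSchema(dataType), options, Configuration.fromMap(options));

        // Formats are resolved through the ServiceLoader-based factory SPI.
        DeserializationSchema<Event> decoder =
                FactoryUtil.discoverDecodingFormatFactory(DecodingFormatFactory.class, "csv")
                        .createDecodingFormat(context, context.getConfiguration())
                        .createRuntimeDecoder(dataType);
        SerializationSchema<Event> encoder =
                FactoryUtil.discoverEncodingFormatFactory(EncodingFormatFactory.class, "csv")
                        .createEncodingFormat(context, context.getConfiguration())
                        .createRuntimeEncoder(dataType);
        decoder.open(null);
        encoder.open(null);

        Map<String, Object> fields = new HashMap<>();
        fields.put("id", 1);
        fields.put("name", "demo");
        Event event = new Event();
        event.setExtractedFields(fields);

        byte[] csv = encoder.serialize(event);                        // Event -> CSV bytes
        System.out.println(new String(csv, StandardCharsets.UTF_8));  // e.g. "1,demo"
        System.out.println(decoder.deserialize(csv).getExtractedFields()); // CSV bytes -> Event
    }
}
```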
\ No newline at end of file diff --git a/groot-formats/format-json/pom.xml b/groot-formats/format-json/pom.xml index 36fef72..1036832 100644 --- a/groot-formats/format-json/pom.xml +++ b/groot-formats/format-json/pom.xml @@ -12,6 +12,25 @@ <artifactId>format-json</artifactId> <name>Groot : Formats : Format-Json </name> <dependencies> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId> + <scope>test</scope> + </dependency> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-clients_${scala.version}</artifactId> + <scope>test</scope> + </dependency> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-table-planner-blink_${scala.version}</artifactId> + <scope>test</scope> + </dependency> + </dependencies> </project>
\ No newline at end of file diff --git a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventDeserializationSchema.java b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventDeserializationSchema.java index 2f7c352..11ce443 100644 --- a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventDeserializationSchema.java +++ b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventDeserializationSchema.java @@ -1,9 +1,9 @@ package com.geedgenetworks.formats.json; import com.alibaba.fastjson2.JSON; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.connector.format.MapDeserialization; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.connector.serialization.MapDeserialization; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.api.common.serialization.DeserializationSchema; import org.apache.flink.api.common.typeinfo.TypeInformation; import org.slf4j.Logger; diff --git a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventSerializationSchema.java b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventSerializationSchema.java index 260e35a..de5c4a1 100644 --- a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventSerializationSchema.java +++ b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonEventSerializationSchema.java @@ -2,9 +2,9 @@ package com.geedgenetworks.formats.json; import com.alibaba.fastjson2.JSON; import com.alibaba.fastjson2.filter.PropertyFilter; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.api.common.serialization.SerializationSchema; +import com.geedgenetworks.api.connector.event.Event; public class JsonEventSerializationSchema implements SerializationSchema<Event> { // __开头字段为内部字段,过滤掉 diff --git a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonFormatFactory.java b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonFormatFactory.java index 2a6e99d..15e48d6 100644 --- a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonFormatFactory.java +++ b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonFormatFactory.java @@ -1,12 +1,12 @@ package com.geedgenetworks.formats.json; -import com.geedgenetworks.common.Event; -import com.geedgenetworks.core.connector.format.DecodingFormat; -import com.geedgenetworks.core.connector.format.EncodingFormat; -import com.geedgenetworks.core.factories.DecodingFormatFactory; -import com.geedgenetworks.core.factories.EncodingFormatFactory; -import com.geedgenetworks.core.factories.TableFactory; -import com.geedgenetworks.core.types.StructType; +import com.geedgenetworks.api.connector.serialization.DecodingFormat; +import com.geedgenetworks.api.connector.serialization.EncodingFormat; +import com.geedgenetworks.api.connector.event.Event; +import com.geedgenetworks.api.factory.DecodingFormatFactory; +import com.geedgenetworks.api.factory.EncodingFormatFactory; +import com.geedgenetworks.api.factory.ConnectorFactory; +import com.geedgenetworks.api.connector.type.StructType; import org.apache.flink.api.common.serialization.DeserializationSchema; import 
org.apache.flink.api.common.serialization.SerializationSchema; import org.apache.flink.configuration.ConfigOption; @@ -22,12 +22,12 @@ public class JsonFormatFactory implements DecodingFormatFactory, EncodingFormatF public static final String IDENTIFIER = "json"; @Override - public String factoryIdentifier() { + public String type() { return IDENTIFIER; } @Override - public DecodingFormat createDecodingFormat(TableFactory.Context context, ReadableConfig formatOptions) { + public DecodingFormat createDecodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) { final boolean ignoreParseErrors = formatOptions.get(IGNORE_PARSE_ERRORS); return new DecodingFormat(){ @Override @@ -39,7 +39,7 @@ public class JsonFormatFactory implements DecodingFormatFactory, EncodingFormatF } @Override - public EncodingFormat createEncodingFormat(TableFactory.Context context, ReadableConfig formatOptions) { + public EncodingFormat createEncodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) { return new EncodingFormat() { @Override public SerializationSchema<Event> createRuntimeEncoder(StructType dataType) { diff --git a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonSerializer.java b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonSerializer.java index fac90c8..625f8f4 100644 --- a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonSerializer.java +++ b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonSerializer.java @@ -1,176 +1,176 @@ -package com.geedgenetworks.formats.json;
-
-import com.alibaba.fastjson2.JSONWriter;
-import com.geedgenetworks.core.types.*;
-
-import java.io.Serializable;
-import java.util.Arrays;
-import java.util.List;
-import java.util.Map;
-import java.util.stream.Collectors;
-
-public class JsonSerializer implements Serializable{
-
- private final StructType dataType;
- private final ValueWriter valueWriter;
-
- public JsonSerializer(StructType dataType) {
- this.dataType = dataType;
- this.valueWriter = makeWriter(dataType);
- }
-
- public byte[] serialize(Map<String, Object> data){
- try (JSONWriter writer = JSONWriter.ofUTF8()) {
- if (data == null) {
- writer.writeNull();
- } else {
- valueWriter.write(writer, data);
- }
- return writer.getBytes();
- }
- }
-
- private ValueWriter makeWriter(DataType dataType) {
- if (dataType instanceof StringType) {
- return JsonSerializer::writeString;
- }
-
- if (dataType instanceof IntegerType) {
- return JsonSerializer::writeInt;
- }
-
- if (dataType instanceof LongType) {
- return JsonSerializer::writeLong;
- }
-
- if (dataType instanceof FloatType) {
- return JsonSerializer::writeFloat;
- }
-
- if (dataType instanceof DoubleType) {
- return JsonSerializer::writeDouble;
- }
-
- if (dataType instanceof StructType) {
- final Map<String, ValueWriter> fieldWriters = Arrays.stream(((StructType) dataType).fields).collect(Collectors.toMap(f -> f.name, f -> this.makeWriter(f.dataType)));
- return (writer, obj) -> {
- writeObject(writer, obj, fieldWriters);
- };
- }
-
- if (dataType instanceof ArrayType) {
- final ValueWriter elementWriter = this.makeWriter(((ArrayType) dataType).elementType);
- return (writer, obj) -> {
- writeArray(writer, obj, elementWriter);
- };
- }
-
- throw new UnsupportedOperationException("unsupported dataType: " + dataType);
- }
-
- static void writeString(JSONWriter writer, Object obj) {
- writer.writeString(obj.toString());
- }
-
- static void writeInt(JSONWriter writer, Object obj){
- if(obj instanceof Number){
- writer.writeInt32(((Number) obj).intValue());
- } else if(obj instanceof String){
- writer.writeInt32(Integer.parseInt((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to int", obj));
- }
- }
-
- static void writeLong(JSONWriter writer, Object obj) {
- if(obj instanceof Number){
- writer.writeInt64(((Number) obj).longValue());
- } else if(obj instanceof String){
- writer.writeInt64(Long.parseLong((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to long", obj));
- }
- }
-
- static void writeFloat(JSONWriter writer, Object obj) {
- if(obj instanceof Number){
- writer.writeFloat(((Number) obj).floatValue());
- } else if(obj instanceof String){
- writer.writeFloat(Float.parseFloat((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to float", obj));
- }
- }
-
- static void writeDouble(JSONWriter writer, Object obj){
- if(obj instanceof Number){
- writer.writeDouble(((Number) obj).doubleValue());
- } else if(obj instanceof String){
- writer.writeDouble(Double.parseDouble((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to double", obj));
- }
- }
-
- static void writeObject(JSONWriter writer, Object obj, Map<String, ValueWriter> fieldWriters){
- if(obj instanceof Map){
- Map<String, Object> map = (Map<String, Object>) obj;
- writer.startObject();
-
- String key;
- Object value;
- ValueWriter valueWriter;
- for (Map.Entry<String, Object> entry : map.entrySet()) {
- key = entry.getKey();
- /*if (key.startsWith("__")) {
- continue;
- }*/
- value = entry.getValue();
- if(value == null){
- continue;
- }
- valueWriter = fieldWriters.get(key);
- if(valueWriter != null){
- writer.writeName(key);
- writer.writeColon();
- valueWriter.write(writer, value);
- }
- }
-
- writer.endObject();
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to map", obj));
- }
- }
-
- static void writeArray(JSONWriter writer, Object obj, ValueWriter elementWriter){
- if(obj instanceof List){
- List<Object> list = (List<Object>) obj;
- writer.startArray();
-
- Object element;
- for (int i = 0; i < list.size(); i++) {
- if (i != 0) {
- writer.writeComma();
- }
-
- element = list.get(i);
- if (element == null) {
- writer.writeNull();
- continue;
- }
-
- elementWriter.write(writer, element);
- }
-
- writer.endArray();
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to list", obj));
- }
- }
-
- @FunctionalInterface
- public interface ValueWriter extends Serializable {
- void write(JSONWriter writer, Object obj);
- }
-}
+package com.geedgenetworks.formats.json; + +import com.alibaba.fastjson2.JSONWriter; +import com.geedgenetworks.api.connector.type.*; + +import java.io.Serializable; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class JsonSerializer implements Serializable{ + + private final StructType dataType; + private final ValueWriter valueWriter; + + public JsonSerializer(StructType dataType) { + this.dataType = dataType; + this.valueWriter = makeWriter(dataType); + } + + public byte[] serialize(Map<String, Object> data){ + try (JSONWriter writer = JSONWriter.ofUTF8()) { + if (data == null) { + writer.writeNull(); + } else { + valueWriter.write(writer, data); + } + return writer.getBytes(); + } + } + + private ValueWriter makeWriter(DataType dataType) { + if (dataType instanceof StringType) { + return JsonSerializer::writeString; + } + + if (dataType instanceof IntegerType) { + return JsonSerializer::writeInt; + } + + if (dataType instanceof LongType) { + return JsonSerializer::writeLong; + } + + if (dataType instanceof FloatType) { + return JsonSerializer::writeFloat; + } + + if (dataType instanceof DoubleType) { + return JsonSerializer::writeDouble; + } + + if (dataType instanceof StructType) { + final Map<String, ValueWriter> fieldWriters = Arrays.stream(((StructType) dataType).fields).collect(Collectors.toMap(f -> f.name, f -> this.makeWriter(f.dataType))); + return (writer, obj) -> { + writeObject(writer, obj, fieldWriters); + }; + } + + if (dataType instanceof ArrayType) { + final ValueWriter elementWriter = this.makeWriter(((ArrayType) dataType).elementType); + return (writer, obj) -> { + writeArray(writer, obj, elementWriter); + }; + } + + throw new UnsupportedOperationException("unsupported dataType: " + dataType); + } + + static void writeString(JSONWriter writer, Object obj) { + writer.writeString(obj.toString()); + } + + static void writeInt(JSONWriter writer, Object obj){ + if(obj instanceof Number){ + writer.writeInt32(((Number) obj).intValue()); + } else if(obj instanceof String){ + writer.writeInt32(Integer.parseInt((String) obj)); + } else { + throw new IllegalArgumentException(String.format("can not convert %s to int", obj)); + } + } + + static void writeLong(JSONWriter writer, Object obj) { + if(obj instanceof Number){ + writer.writeInt64(((Number) obj).longValue()); + } else if(obj instanceof String){ + writer.writeInt64(Long.parseLong((String) obj)); + } else { + throw new IllegalArgumentException(String.format("can not convert %s to long", obj)); + } + } + + static void writeFloat(JSONWriter writer, Object obj) { + if(obj instanceof Number){ + writer.writeFloat(((Number) obj).floatValue()); + } else if(obj instanceof String){ + writer.writeFloat(Float.parseFloat((String) obj)); + } else { + throw new IllegalArgumentException(String.format("can not convert %s to float", obj)); + } + } + + static void writeDouble(JSONWriter writer, Object obj){ + if(obj instanceof Number){ + writer.writeDouble(((Number) obj).doubleValue()); + } else if(obj instanceof String){ + writer.writeDouble(Double.parseDouble((String) obj)); + } else { + throw new IllegalArgumentException(String.format("can not convert %s to double", obj)); + } + } + + static void writeObject(JSONWriter writer, Object obj, Map<String, ValueWriter> fieldWriters){ + if(obj instanceof Map){ + Map<String, Object> map = (Map<String, Object>) obj; + writer.startObject(); + + String key; + Object value; + ValueWriter valueWriter; + for (Map.Entry<String, 
Object> entry : map.entrySet()) { + key = entry.getKey(); + /*if (key.startsWith("__")) { + continue; + }*/ + value = entry.getValue(); + if(value == null){ + continue; + } + valueWriter = fieldWriters.get(key); + if(valueWriter != null){ + writer.writeName(key); + writer.writeColon(); + valueWriter.write(writer, value); + } + } + + writer.endObject(); + } else { + throw new IllegalArgumentException(String.format("can not convert %s to map", obj)); + } + } + + static void writeArray(JSONWriter writer, Object obj, ValueWriter elementWriter){ + if(obj instanceof List){ + List<Object> list = (List<Object>) obj; + writer.startArray(); + + Object element; + for (int i = 0; i < list.size(); i++) { + if (i != 0) { + writer.writeComma(); + } + + element = list.get(i); + if (element == null) { + writer.writeNull(); + continue; + } + + elementWriter.write(writer, element); + } + + writer.endArray(); + } else { + throw new IllegalArgumentException(String.format("can not convert %s to list", obj)); + } + } + + @FunctionalInterface + public interface ValueWriter extends Serializable { + void write(JSONWriter writer, Object obj); + } +} diff --git a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonToMapDataConverter.java b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonToMapDataConverter.java index f40d2e2..f5a6848 100644 --- a/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonToMapDataConverter.java +++ b/groot-formats/format-json/src/main/java/com/geedgenetworks/formats/json/JsonToMapDataConverter.java @@ -2,12 +2,12 @@ package com.geedgenetworks.formats.json; import com.alibaba.fastjson2.JSONException; import com.alibaba.fastjson2.JSONReader; -import com.geedgenetworks.core.types.*; +import com.geedgenetworks.api.connector.type.*; +import com.geedgenetworks.api.connector.type.StructType; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import java.io.Serializable; -import java.nio.charset.StandardCharsets; import java.util.*; import java.util.stream.Collectors; diff --git a/groot-formats/format-json/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-formats/format-json/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory index c965152..c965152 100644 --- a/groot-formats/format-json/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory +++ b/groot-formats/format-json/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory diff --git a/groot-formats/format-json/src/test/java/com/geedgenetworks/formats/json/JsonSerializerTest.java b/groot-formats/format-json/src/test/java/com/geedgenetworks/formats/json/JsonSerializerTest.java index e5d6c10..356608b 100644 --- a/groot-formats/format-json/src/test/java/com/geedgenetworks/formats/json/JsonSerializerTest.java +++ b/groot-formats/format-json/src/test/java/com/geedgenetworks/formats/json/JsonSerializerTest.java @@ -1,79 +1,79 @@ -package com.geedgenetworks.formats.json;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.junit.jupiter.api.Test;
-
-import java.nio.charset.StandardCharsets;
-import java.util.*;
-import java.util.concurrent.ThreadLocalRandom;
-
-public class JsonSerializerTest {
-
- @Test
- public void testSerSimpleData(){
- ThreadLocalRandom random = ThreadLocalRandom.current();
- Map<String, Object> map = new LinkedHashMap<>();
- map.put("int", random.nextInt(1, Integer.MAX_VALUE));
- map.put("int_null", null);
- map.put("int_str", Integer.toString(random.nextInt(1, Integer.MAX_VALUE)));
-
- map.put("int64", random.nextLong(1, Long.MAX_VALUE));
- map.put("int64_null", null);
- map.put("int64_str", Long.toString(random.nextLong(1, Long.MAX_VALUE)));
-
- map.put("double", random.nextDouble(1, Integer.MAX_VALUE));
- map.put("double_null", null);
- map.put("double_str", Double.toString(random.nextDouble(1, Integer.MAX_VALUE)));
-
- map.put("str", "ut8字符串");
- map.put("str_null", null);
- map.put("str_int", random.nextInt(1, Integer.MAX_VALUE));
-
- map.put("int32_array", Arrays.asList(1, 3, 5));
- map.put("int32_array_null", null);
- map.put("int32_array_empty", Collections.emptyList());
-
- map.put("int64_array", Arrays.asList(1, 3, 5));
- map.put("int64_array_null", null);
- map.put("int64_array_empty", Collections.emptyList());
-
- map.put("str_array", Arrays.asList(1, 3, 5));
-
- Map<String, Object> obj = new LinkedHashMap<>();
- obj.put("id", 1);
- obj.put("name", "name");
- map.put("obj", obj);
-
- List<Object> list = new ArrayList<>();
- list.add(obj);
- obj = new LinkedHashMap<>();
- obj.put("id", 2);
- obj.put("name", "name2");
- list.add(obj);
- map.put("obj_array", list);
-
- StructType dataType = Types.parseStructType("int: int, int_null: int, int_str: int, int64: bigint, int64_null: bigint, int64_str: bigint, double: double, double_null: double, double_str: double, " +
- "str: string, str_null: string, str_int: string, int32_array: array<int>, int32_array_null: array<int>, int32_array_empty: array<int>, int64_array: array<bigint>, int64_array_null: array<bigint>, int64_array_empty: array<bigint>," +
- " str_array : array<string>, obj : struct<id :int, name: string>, obj_array : array<struct<id :int, name: string>>");
- JsonSerializer serializer = new JsonSerializer(dataType);
-
- byte[] bytes = serializer.serialize(map);
- System.out.println(map);
- System.out.println(bytes.length);
- System.out.println(new String(bytes, StandardCharsets.UTF_8));
- System.out.println(JSON.toJSONString(map));
-
- JsonToMapDataConverter converter = new JsonToMapDataConverter(dataType, false);
- Map<String, Object> rst = converter.convert(new String(bytes, StandardCharsets.UTF_8));
- System.out.println(map);
- System.out.println(rst);
-
- System.out.println(serializer.serialize(rst).length);
- System.out.println(new String(serializer.serialize(rst), StandardCharsets.UTF_8));
- System.out.println(JSON.toJSONString(map));
- }
-
-
+package com.geedgenetworks.formats.json; + +import com.alibaba.fastjson2.JSON; +import com.geedgenetworks.api.connector.type.StructType; +import com.geedgenetworks.api.connector.type.Types; +import org.junit.jupiter.api.Test; + +import java.nio.charset.StandardCharsets; +import java.util.*; +import java.util.concurrent.ThreadLocalRandom; + +public class JsonSerializerTest { + + @Test + public void testSerSimpleData(){ + ThreadLocalRandom random = ThreadLocalRandom.current(); + Map<String, Object> map = new LinkedHashMap<>(); + map.put("int", random.nextInt(1, Integer.MAX_VALUE)); + map.put("int_null", null); + map.put("int_str", Integer.toString(random.nextInt(1, Integer.MAX_VALUE))); + + map.put("int64", random.nextLong(1, Long.MAX_VALUE)); + map.put("int64_null", null); + map.put("int64_str", Long.toString(random.nextLong(1, Long.MAX_VALUE))); + + map.put("double", random.nextDouble(1, Integer.MAX_VALUE)); + map.put("double_null", null); + map.put("double_str", Double.toString(random.nextDouble(1, Integer.MAX_VALUE))); + + map.put("str", "ut8字符串"); + map.put("str_null", null); + map.put("str_int", random.nextInt(1, Integer.MAX_VALUE)); + + map.put("int32_array", Arrays.asList(1, 3, 5)); + map.put("int32_array_null", null); + map.put("int32_array_empty", Collections.emptyList()); + + map.put("int64_array", Arrays.asList(1, 3, 5)); + map.put("int64_array_null", null); + map.put("int64_array_empty", Collections.emptyList()); + + map.put("str_array", Arrays.asList(1, 3, 5)); + + Map<String, Object> obj = new LinkedHashMap<>(); + obj.put("id", 1); + obj.put("name", "name"); + map.put("obj", obj); + + List<Object> list = new ArrayList<>(); + list.add(obj); + obj = new LinkedHashMap<>(); + obj.put("id", 2); + obj.put("name", "name2"); + list.add(obj); + map.put("obj_array", list); + + StructType dataType = Types.parseStructType("int: int, int_null: int, int_str: int, int64: bigint, int64_null: bigint, int64_str: bigint, double: double, double_null: double, double_str: double, " + + "str: string, str_null: string, str_int: string, int32_array: array<int>, int32_array_null: array<int>, int32_array_empty: array<int>, int64_array: array<bigint>, int64_array_null: array<bigint>, int64_array_empty: array<bigint>," + + " str_array : array<string>, obj : struct<id :int, name: string>, obj_array : array<struct<id :int, name: string>>"); + JsonSerializer serializer = new JsonSerializer(dataType); + + byte[] bytes = serializer.serialize(map); + System.out.println(map); + System.out.println(bytes.length); + System.out.println(new String(bytes, StandardCharsets.UTF_8)); + System.out.println(JSON.toJSONString(map)); + + JsonToMapDataConverter converter = new JsonToMapDataConverter(dataType, false); + Map<String, Object> rst = converter.convert(new String(bytes, StandardCharsets.UTF_8)); + System.out.println(map); + System.out.println(rst); + + System.out.println(serializer.serialize(rst).length); + System.out.println(new String(serializer.serialize(rst), StandardCharsets.UTF_8)); + System.out.println(JSON.toJSONString(map)); + } + + }
\ No newline at end of file diff --git a/groot-formats/format-msgpack/pom.xml b/groot-formats/format-msgpack/pom.xml index a58e919..7d70875 100644 --- a/groot-formats/format-msgpack/pom.xml +++ b/groot-formats/format-msgpack/pom.xml @@ -19,15 +19,30 @@ <version>0.9.8</version> </dependency> - <!--<dependency> + <dependency> + <groupId>com.geedgenetworks</groupId> + <artifactId>groot-core</artifactId> + <version>${revision}</version> + <scope>test</scope> + </dependency> + + <dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId> + <scope>test</scope> + </dependency> + + <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-clients_${scala.version}</artifactId> + <scope>test</scope> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-planner-blink_${scala.version}</artifactId> - </dependency>--> + <scope>test</scope> + </dependency> </dependencies> </project>
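The test-scoped dependencies added above support the `MessagePackDeserializer` shown in the next diff. As a usage sketch (field names are illustrative), the deserializer can be driven with msgpack-java directly: with a schema it prebuilds typed converters per field and skips unknown keys; without one it falls back to a static converter table keyed by msgpack value type.

```java
import com.geedgenetworks.api.connector.type.Types;
import com.geedgenetworks.formats.msgpack.MessagePackDeserializer;
import org.msgpack.core.MessageBufferPacker;
import org.msgpack.core.MessagePack;

import java.util.Map;

public class MsgpackDecodeSketch {
    public static void main(String[] args) throws Exception {
        // Pack a small map with msgpack-java (version 0.9.8, per the pom above).
        MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
        packer.packMapHeader(2)
              .packString("id").packInt(7)
              .packString("name").packString("demo");
        packer.close();
        byte[] bytes = packer.toByteArray();

        // With a schema: "id" is widened to Long by the bigint converter.
        MessagePackDeserializer typed =
                new MessagePackDeserializer(Types.parseStructType("id:bigint,name:string"));
        Map<String, Object> withSchema = typed.deserialize(bytes); // {id=7, name=demo}

        // Without a schema: values are decoded by their msgpack type alone.
        MessagePackDeserializer untyped = new MessagePackDeserializer(null);
        Map<String, Object> raw = untyped.deserialize(bytes);

        System.out.println(withSchema);
        System.out.println(raw);
    }
}
```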
\ No newline at end of file diff --git a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializer.java b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializer.java index 5bbe75e..0745a0a 100644 --- a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializer.java +++ b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializer.java @@ -1,343 +1,344 @@ -package com.geedgenetworks.formats.msgpack;
-
-import com.geedgenetworks.core.types.*;
-import org.msgpack.core.MessageFormat;
-import org.msgpack.core.MessagePack;
-import org.msgpack.core.MessageUnpacker;
-import org.msgpack.value.ValueType;
-
-import java.io.Serializable;
-import java.util.*;
-import java.util.stream.Collectors;
-
-public class MessagePackDeserializer implements Serializable{
- private final StructType dataType;
-    private final ValueConverter rootConverter; // converter used when a schema is provided
-
-    private static final ValueConverter[] converterTable = new ValueConverter[12]; // converters used when no schema is provided
-
-
- public MessagePackDeserializer(StructType dataType) {
- this.dataType = dataType;
- this.rootConverter = dataType == null ? null : makeConverterForMap(dataType);
- }
-
- static {
- initConverterTable();
- }
-
- public Map<String, Object> deserialize(byte[] bytes) throws Exception {
- MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(bytes);
- try {
- if(rootConverter == null){
- return MessagePackDeserializer.converterMap(unpacker, null);
- }else{
- return (Map<String, Object>) rootConverter.convert(unpacker, null);
- }
- } finally {
- unpacker.close();
- }
- }
-
- private ValueConverter[] makeConverter(DataType dataType) {
- ValueConverter[] converterTable = new ValueConverter[12];
-
- converterTable[ValueType.BOOLEAN.ordinal()] = makeConverterForBoolean(dataType);
- converterTable[ValueType.INTEGER.ordinal()] = makeConverterForInteger(dataType);
- converterTable[ValueType.FLOAT.ordinal()] = makeConverterForFloat(dataType);
- converterTable[ValueType.STRING.ordinal()] = makeConverterForString(dataType);
- converterTable[ValueType.BINARY.ordinal()] = makeConverterForBinary(dataType);
- converterTable[ValueType.ARRAY.ordinal()] = makeConverterForArray(dataType);
- converterTable[ValueType.MAP.ordinal()] = makeConverterForMap(dataType);
-
- return converterTable;
- }
-
- public ValueConverter makeConverterForBoolean(DataType dataType){
- if (dataType instanceof BooleanType) {
- return (unpacker, format) -> unpacker.unpackBoolean();
- } else if (dataType instanceof IntegerType) {
- return (unpacker, format) -> unpacker.unpackBoolean() ? 1 : 0;
- } else {
- //throw newCanNotConvertException(ValueType.BOOLEAN.name(), dataType);
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.BOOLEAN.name(), dataType);};
- }
- }
-
- public ValueConverter makeConverterForInteger(DataType dataType) {
- if (dataType instanceof IntegerType) {
- return (unpacker, format) -> {
- switch (format) {
- case UINT64:
- return unpacker.unpackBigInteger().intValue();
- case INT64:
- case UINT32:
- return (int)unpacker.unpackLong();
- default:
- return unpacker.unpackInt();
- }
- };
- } else if (dataType instanceof LongType) {
- return (unpacker, format) -> {
- switch (format) {
- case UINT64:
- return unpacker.unpackBigInteger().longValue();
- case INT64:
- case UINT32:
- return unpacker.unpackLong();
- default:
- return (long)unpacker.unpackInt();
- }
- };
- } else if (dataType instanceof FloatType) {
- return (unpacker, format) -> {
- switch (format) {
- case UINT64:
- return unpacker.unpackBigInteger().floatValue();
- case INT64:
- case UINT32:
- return (float)unpacker.unpackLong();
- default:
- return (float)unpacker.unpackInt();
- }
- };
- } else if (dataType instanceof DoubleType) {
- return (unpacker, format) -> {
- switch (format) {
- case UINT64:
- return unpacker.unpackBigInteger().doubleValue();
- case INT64:
- case UINT32:
- return (double)unpacker.unpackLong();
- default:
- return (double)unpacker.unpackInt();
- }
- };
- } else if (dataType instanceof StringType) {
- return (unpacker, format) -> {
- switch (format) {
- case UINT64:
- return unpacker.unpackBigInteger().toString();
- case INT64:
- case UINT32:
- return Long.toString(unpacker.unpackLong());
- default:
- return Integer.toString(unpacker.unpackInt());
- }
- };
- } else {
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.INTEGER.name(), dataType);};
- }
- }
-
- public ValueConverter makeConverterForFloat(DataType dataType) {
- if (dataType instanceof DoubleType) {
- return (unpacker, format) -> unpacker.unpackDouble();
- } else if (dataType instanceof FloatType) {
- return (unpacker, format) -> (float) unpacker.unpackDouble();
- } else if (dataType instanceof IntegerType) {
- return (unpacker, format) -> (int) unpacker.unpackDouble();
- } else if (dataType instanceof LongType) {
- return (unpacker, format) -> (long) unpacker.unpackDouble();
- } else if (dataType instanceof StringType) {
- return (unpacker, format) -> Double.toString(unpacker.unpackDouble());
- } else {
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.FLOAT.name(), dataType);};
- }
- }
-
- public ValueConverter makeConverterForString(DataType dataType) {
- if (dataType instanceof StringType) {
- return (unpacker, format) -> unpacker.unpackString();
- } else if (dataType instanceof IntegerType) {
- return (unpacker, format) -> Integer.parseInt(unpacker.unpackString());
- } else if (dataType instanceof LongType) {
- return (unpacker, format) -> Long.parseLong(unpacker.unpackString());
- } else if (dataType instanceof FloatType) {
- return (unpacker, format) -> Float.parseFloat(unpacker.unpackString());
- } else if (dataType instanceof DoubleType) {
- return (unpacker, format) -> Double.parseDouble(unpacker.unpackString());
- } else if (dataType instanceof BinaryType) {
- return (unpacker, format) -> unpacker.readPayload(unpacker.unpackRawStringHeader());
- } else {
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.STRING.name(), dataType);};
- }
- }
-
- public ValueConverter makeConverterForBinary(DataType dataType){
- if (dataType instanceof BinaryType) {
- return (unpacker, format) -> unpacker.readPayload(unpacker.unpackBinaryHeader());
- } else {
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.BINARY.name(), dataType);};
- }
- }
-
- public ValueConverter makeConverterForArray(DataType dataType) {
- if (dataType instanceof ArrayType) {
- ValueConverter[] converterTable = makeConverter(((ArrayType) dataType).elementType);
- return (unpacker, format) -> {
- int size = unpacker.unpackArrayHeader();
- List<Object> array = new ArrayList<>(size);
- MessageFormat mf;
- ValueType type;
- ValueConverter valueConverter;
- for (int i = 0; i < size; i++) {
- mf = unpacker.getNextFormat();
- type = mf.getValueType();
- if (type == ValueType.NIL) {
- unpacker.unpackNil();
- array.add(null);
- continue;
- }
- valueConverter = converterTable[type.ordinal()];
- if (valueConverter == null) {
- throw new UnsupportedOperationException(type.name());
- }
- array.add(valueConverter.convert(unpacker, mf));
- }
- return array;
- };
- } else {
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.ARRAY.name(), dataType);};
- }
- }
-
- public ValueConverter makeConverterForMap(DataType dataType){
- if (!(dataType instanceof StructType)) {
- return (unpacker, format) -> {throw newCanNotConvertException(ValueType.MAP.name(), dataType);};
- }
- final Map<String, ValueConverter[]> filedConverters = Arrays.stream(((StructType) dataType).fields).collect(Collectors.toMap(f -> f.name, f -> this.makeConverter(f.dataType)));
- return (unpacker, format) -> {
- int size = unpacker.unpackMapHeader();
- Map<String, Object> map = new HashMap<>((int) (size / 0.75));
- MessageFormat mf;
- ValueType type;
- ValueConverter[] converterTable;
- ValueConverter valueConverter;
-
- String key;
- Object value;
- for (int i = 0; i < size; i++) {
- key = unpacker.unpackString();
- converterTable = filedConverters.get(key);
- if(converterTable == null){
- unpacker.skipValue();
- continue;
- }
-
- mf = unpacker.getNextFormat();
- type = mf.getValueType();
- if (type == ValueType.NIL) {
- unpacker.unpackNil();
- continue;
- }
- valueConverter = converterTable[type.ordinal()];
- if (valueConverter == null) {
- throw new UnsupportedOperationException(type.name());
- }
- value = valueConverter.convert(unpacker, mf);
- map.put(key, value);
- }
-
- return map;
- };
- }
-
- private static void initConverterTable() {
- converterTable[ValueType.BOOLEAN.ordinal()] = MessagePackDeserializer::converterBoolean;
- converterTable[ValueType.INTEGER.ordinal()] = MessagePackDeserializer::converterInteger;
- converterTable[ValueType.FLOAT.ordinal()] = MessagePackDeserializer::converterFloat;
- converterTable[ValueType.STRING.ordinal()] = MessagePackDeserializer::converterString;
- converterTable[ValueType.BINARY.ordinal()] = MessagePackDeserializer::converterBinary;
- converterTable[ValueType.ARRAY.ordinal()] = MessagePackDeserializer::converterArray;
- converterTable[ValueType.MAP.ordinal()] = MessagePackDeserializer::converterMap;
- }
-
- public static Object converterBoolean(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- return unpacker.unpackBoolean();
- }
-
- public static Object converterInteger(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- switch (format) {
- case UINT64:
- return unpacker.unpackBigInteger().longValue();
- case INT64:
- case UINT32:
- return unpacker.unpackLong();
- default:
- return unpacker.unpackInt();
- }
- }
-
- public static Object converterFloat(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- return unpacker.unpackDouble();
- }
-
- public static Object converterString(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- return unpacker.unpackString();
- }
-
- public static Object converterBinary(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- return unpacker.readPayload(unpacker.unpackBinaryHeader());
- }
-
- public static Object converterArray(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- int size = unpacker.unpackArrayHeader();
- List<Object> array = new ArrayList<>(size);
- MessageFormat mf;
- ValueType type;
- ValueConverter valueConverter;
- for (int i = 0; i < size; i++) {
- mf = unpacker.getNextFormat();
- type = mf.getValueType();
- if (type == ValueType.NIL) {
- unpacker.unpackNil();
- array.add(null);
- continue;
- }
- valueConverter = converterTable[type.ordinal()];
- if (valueConverter == null) {
- throw new UnsupportedOperationException(type.name());
- }
- array.add(valueConverter.convert(unpacker, mf));
- }
- return array;
- }
-
- public static Map<String, Object> converterMap(MessageUnpacker unpacker, MessageFormat format) throws Exception {
- int size = unpacker.unpackMapHeader();
- Map<String, Object> map = new HashMap<>((int) (size / 0.75));
- MessageFormat mf;
- ValueType type;
- ValueConverter valueConverter;
-
- String key;
- Object value;
- for (int i = 0; i < size; i++) {
- key = unpacker.unpackString();
- mf = unpacker.getNextFormat();
- type = mf.getValueType();
- if (type == ValueType.NIL) {
- unpacker.unpackNil();
- continue;
- }
- valueConverter = converterTable[type.ordinal()];
- if (valueConverter == null) {
- throw new UnsupportedOperationException(type.name());
- }
- value = valueConverter.convert(unpacker, mf);
- map.put(key, value);
- }
-
- return map;
- }
-
- private static IllegalArgumentException newCanNotConvertException(String type, DataType dataType) {
- return new IllegalArgumentException(String.format("%s can not convert to type:%s", type, dataType));
- }
-
- @FunctionalInterface
- public interface ValueConverter extends Serializable {
- Object convert(MessageUnpacker unpacker, MessageFormat format) throws Exception;
- }
-}
+package com.geedgenetworks.formats.msgpack; + +import com.geedgenetworks.api.connector.type.StructType; +import com.geedgenetworks.api.connector.type.*; +import org.msgpack.core.MessageFormat; +import org.msgpack.core.MessagePack; +import org.msgpack.core.MessageUnpacker; +import org.msgpack.value.ValueType; + +import java.io.Serializable; +import java.util.*; +import java.util.stream.Collectors; + +public class MessagePackDeserializer implements Serializable{ + private final StructType dataType; + private final ValueConverter rootConverter; // 带Schema时的converter + + private static final ValueConverter[] converterTable = new ValueConverter[12]; // 无Schema时的converter + + + public MessagePackDeserializer(StructType dataType) { + this.dataType = dataType; + this.rootConverter = dataType == null ? null : makeConverterForMap(dataType); + } + + static { + initConverterTable(); + } + + public Map<String, Object> deserialize(byte[] bytes) throws Exception { + MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(bytes); + try { + if(rootConverter == null){ + return MessagePackDeserializer.converterMap(unpacker, null); + }else{ + return (Map<String, Object>) rootConverter.convert(unpacker, null); + } + } finally { + unpacker.close(); + } + } + + private ValueConverter[] makeConverter(DataType dataType) { + ValueConverter[] converterTable = new ValueConverter[12]; + + converterTable[ValueType.BOOLEAN.ordinal()] = makeConverterForBoolean(dataType); + converterTable[ValueType.INTEGER.ordinal()] = makeConverterForInteger(dataType); + converterTable[ValueType.FLOAT.ordinal()] = makeConverterForFloat(dataType); + converterTable[ValueType.STRING.ordinal()] = makeConverterForString(dataType); + converterTable[ValueType.BINARY.ordinal()] = makeConverterForBinary(dataType); + converterTable[ValueType.ARRAY.ordinal()] = makeConverterForArray(dataType); + converterTable[ValueType.MAP.ordinal()] = makeConverterForMap(dataType); + + return converterTable; + } + + public ValueConverter makeConverterForBoolean(DataType dataType){ + if (dataType instanceof BooleanType) { + return (unpacker, format) -> unpacker.unpackBoolean(); + } else if (dataType instanceof IntegerType) { + return (unpacker, format) -> unpacker.unpackBoolean() ? 
1 : 0; + } else { + //throw newCanNotConvertException(ValueType.BOOLEAN.name(), dataType); + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.BOOLEAN.name(), dataType);}; + } + } + + public ValueConverter makeConverterForInteger(DataType dataType) { + if (dataType instanceof IntegerType) { + return (unpacker, format) -> { + switch (format) { + case UINT64: + return unpacker.unpackBigInteger().intValue(); + case INT64: + case UINT32: + return (int)unpacker.unpackLong(); + default: + return unpacker.unpackInt(); + } + }; + } else if (dataType instanceof LongType) { + return (unpacker, format) -> { + switch (format) { + case UINT64: + return unpacker.unpackBigInteger().longValue(); + case INT64: + case UINT32: + return unpacker.unpackLong(); + default: + return (long)unpacker.unpackInt(); + } + }; + } else if (dataType instanceof FloatType) { + return (unpacker, format) -> { + switch (format) { + case UINT64: + return unpacker.unpackBigInteger().floatValue(); + case INT64: + case UINT32: + return (float)unpacker.unpackLong(); + default: + return (float)unpacker.unpackInt(); + } + }; + } else if (dataType instanceof DoubleType) { + return (unpacker, format) -> { + switch (format) { + case UINT64: + return unpacker.unpackBigInteger().doubleValue(); + case INT64: + case UINT32: + return (double)unpacker.unpackLong(); + default: + return (double)unpacker.unpackInt(); + } + }; + } else if (dataType instanceof StringType) { + return (unpacker, format) -> { + switch (format) { + case UINT64: + return unpacker.unpackBigInteger().toString(); + case INT64: + case UINT32: + return Long.toString(unpacker.unpackLong()); + default: + return Integer.toString(unpacker.unpackInt()); + } + }; + } else { + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.INTEGER.name(), dataType);}; + } + } + + public ValueConverter makeConverterForFloat(DataType dataType) { + if (dataType instanceof DoubleType) { + return (unpacker, format) -> unpacker.unpackDouble(); + } else if (dataType instanceof FloatType) { + return (unpacker, format) -> (float) unpacker.unpackDouble(); + } else if (dataType instanceof IntegerType) { + return (unpacker, format) -> (int) unpacker.unpackDouble(); + } else if (dataType instanceof LongType) { + return (unpacker, format) -> (long) unpacker.unpackDouble(); + } else if (dataType instanceof StringType) { + return (unpacker, format) -> Double.toString(unpacker.unpackDouble()); + } else { + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.FLOAT.name(), dataType);}; + } + } + + public ValueConverter makeConverterForString(DataType dataType) { + if (dataType instanceof StringType) { + return (unpacker, format) -> unpacker.unpackString(); + } else if (dataType instanceof IntegerType) { + return (unpacker, format) -> Integer.parseInt(unpacker.unpackString()); + } else if (dataType instanceof LongType) { + return (unpacker, format) -> Long.parseLong(unpacker.unpackString()); + } else if (dataType instanceof FloatType) { + return (unpacker, format) -> Float.parseFloat(unpacker.unpackString()); + } else if (dataType instanceof DoubleType) { + return (unpacker, format) -> Double.parseDouble(unpacker.unpackString()); + } else if (dataType instanceof BinaryType) { + return (unpacker, format) -> unpacker.readPayload(unpacker.unpackRawStringHeader()); + } else { + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.STRING.name(), dataType);}; + } + } + + public ValueConverter makeConverterForBinary(DataType dataType){ + 
if (dataType instanceof BinaryType) { + return (unpacker, format) -> unpacker.readPayload(unpacker.unpackBinaryHeader()); + } else { + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.BINARY.name(), dataType);}; + } + } + + public ValueConverter makeConverterForArray(DataType dataType) { + if (dataType instanceof ArrayType) { + ValueConverter[] converterTable = makeConverter(((ArrayType) dataType).elementType); + return (unpacker, format) -> { + int size = unpacker.unpackArrayHeader(); + List<Object> array = new ArrayList<>(size); + MessageFormat mf; + ValueType type; + ValueConverter valueConverter; + for (int i = 0; i < size; i++) { + mf = unpacker.getNextFormat(); + type = mf.getValueType(); + if (type == ValueType.NIL) { + unpacker.unpackNil(); + array.add(null); + continue; + } + valueConverter = converterTable[type.ordinal()]; + if (valueConverter == null) { + throw new UnsupportedOperationException(type.name()); + } + array.add(valueConverter.convert(unpacker, mf)); + } + return array; + }; + } else { + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.ARRAY.name(), dataType);}; + } + } + + public ValueConverter makeConverterForMap(DataType dataType){ + if (!(dataType instanceof StructType)) { + return (unpacker, format) -> {throw newCanNotConvertException(ValueType.MAP.name(), dataType);}; + } + final Map<String, ValueConverter[]> filedConverters = Arrays.stream(((StructType) dataType).fields).collect(Collectors.toMap(f -> f.name, f -> this.makeConverter(f.dataType))); + return (unpacker, format) -> { + int size = unpacker.unpackMapHeader(); + Map<String, Object> map = new HashMap<>((int) (size / 0.75)); + MessageFormat mf; + ValueType type; + ValueConverter[] converterTable; + ValueConverter valueConverter; + + String key; + Object value; + for (int i = 0; i < size; i++) { + key = unpacker.unpackString(); + converterTable = filedConverters.get(key); + if(converterTable == null){ + unpacker.skipValue(); + continue; + } + + mf = unpacker.getNextFormat(); + type = mf.getValueType(); + if (type == ValueType.NIL) { + unpacker.unpackNil(); + continue; + } + valueConverter = converterTable[type.ordinal()]; + if (valueConverter == null) { + throw new UnsupportedOperationException(type.name()); + } + value = valueConverter.convert(unpacker, mf); + map.put(key, value); + } + + return map; + }; + } + + private static void initConverterTable() { + converterTable[ValueType.BOOLEAN.ordinal()] = MessagePackDeserializer::converterBoolean; + converterTable[ValueType.INTEGER.ordinal()] = MessagePackDeserializer::converterInteger; + converterTable[ValueType.FLOAT.ordinal()] = MessagePackDeserializer::converterFloat; + converterTable[ValueType.STRING.ordinal()] = MessagePackDeserializer::converterString; + converterTable[ValueType.BINARY.ordinal()] = MessagePackDeserializer::converterBinary; + converterTable[ValueType.ARRAY.ordinal()] = MessagePackDeserializer::converterArray; + converterTable[ValueType.MAP.ordinal()] = MessagePackDeserializer::converterMap; + } + + public static Object converterBoolean(MessageUnpacker unpacker, MessageFormat format) throws Exception { + return unpacker.unpackBoolean(); + } + + public static Object converterInteger(MessageUnpacker unpacker, MessageFormat format) throws Exception { + switch (format) { + case UINT64: + return unpacker.unpackBigInteger().longValue(); + case INT64: + case UINT32: + return unpacker.unpackLong(); + default: + return unpacker.unpackInt(); + } + } + + public static Object 
diff --git a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventDeserializationSchema.java b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventDeserializationSchema.java
index 2fc9c64..8791682 100644
--- a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventDeserializationSchema.java
+++ b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventDeserializationSchema.java
@@ -1,52 +1,53 @@
-package com.geedgenetworks.formats.msgpack;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.MapDeserialization;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.api.common.typeinfo.TypeInformation;
-import org.apache.flink.util.StringUtils;
-
-import java.io.IOException;
-import java.util.Map;
-
-public class MessagePackEventDeserializationSchema implements DeserializationSchema<Event>, MapDeserialization {
- private final StructType dataType;
- private final MessagePackDeserializer deserializer;
-
- public MessagePackEventDeserializationSchema(StructType dataType) {
- this.dataType = dataType;
- this.deserializer = new MessagePackDeserializer(dataType);
- }
-
- @Override
- public Event deserialize(byte[] bytes) throws IOException {
- try {
- Map<String, Object> map = deserializer.deserialize(bytes);
- Event event = new Event();
- event.setExtractedFields(map);
- return event;
- } catch (Exception e) {
- throw new IOException(StringUtils.byteToHexString(bytes), e);
- }
- }
-
- @Override
- public Map<String, Object> deserializeToMap(byte[] bytes) throws IOException {
- try {
- return deserializer.deserialize(bytes);
- } catch (Exception e) {
- throw new IOException(StringUtils.byteToHexString(bytes), e);
- }
- }
-
- @Override
- public boolean isEndOfStream(Event nextElement) {
- return false;
- }
-
- @Override
- public TypeInformation<Event> getProducedType() {
- return null;
- }
-}
+package com.geedgenetworks.formats.msgpack;
+
+
+import com.geedgenetworks.api.connector.serialization.MapDeserialization;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.util.StringUtils;
+
+import java.io.IOException;
+import java.util.Map;
+
+public class MessagePackEventDeserializationSchema implements DeserializationSchema<Event>, MapDeserialization {
+    private final StructType dataType;
+    private final MessagePackDeserializer deserializer;
+
+    public MessagePackEventDeserializationSchema(StructType dataType) {
+        this.dataType = dataType;
+        this.deserializer = new MessagePackDeserializer(dataType);
+    }
+
+    @Override
+    public Event deserialize(byte[] bytes) throws IOException {
+        try {
+            Map<String, Object> map = deserializer.deserialize(bytes);
+            Event event = new Event();
+            event.setExtractedFields(map);
+            return event;
+        } catch (Exception e) {
+            throw new IOException(StringUtils.byteToHexString(bytes), e);
+        }
+    }
+
+    @Override
+    public Map<String, Object> deserializeToMap(byte[] bytes) throws IOException {
+        try {
+            return deserializer.deserialize(bytes);
+        } catch (Exception e) {
+            throw new IOException(StringUtils.byteToHexString(bytes), e);
+        }
+    }
+
+    @Override
+    public boolean isEndOfStream(Event nextElement) {
+        return false;
+    }
+
+    @Override
+    public TypeInformation<Event> getProducedType() {
+        return null;
+    }
+}
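The Flink-facing schema is a thin wrapper: it delegates to `MessagePackDeserializer` and repackages any failure as an `IOException` carrying a hex dump of the offending bytes. A hedged sketch of the call path, with an illustrative payload (class and method names as in this diff):

```java
import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.api.connector.type.Types;
import org.msgpack.core.MessageBufferPacker;
import org.msgpack.core.MessagePack;

public class EventSchemaSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative payload: {"id": 7} as msgpack.
        MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
        packer.packMapHeader(1);
        packer.packString("id");
        packer.packLong(7L);
        byte[] bytes = packer.toByteArray();
        packer.close();

        MessagePackEventDeserializationSchema schema =
                new MessagePackEventDeserializationSchema(Types.parseStructType("struct<id: bigint>"));

        Event event = schema.deserialize(bytes); // malformed input surfaces as IOException with a hex dump
        System.out.println(event.getExtractedFields()); // {id=7}
    }
}
```

Note that `getProducedType()` returns null here, so the surrounding connector code presumably supplies the `TypeInformation` for `Event`.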
diff --git a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventSerializationSchema.java b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventSerializationSchema.java
index 9fd5669..149a751 100644
--- a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventSerializationSchema.java
+++ b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackEventSerializationSchema.java
@@ -1,20 +1,20 @@
-package com.geedgenetworks.formats.msgpack;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.api.common.serialization.SerializationSchema;
-
-public class MessagePackEventSerializationSchema implements SerializationSchema<Event> {
- private final StructType dataType;
- private final MessagePackSerializer serializer;
-
- public MessagePackEventSerializationSchema(StructType dataType) {
- this.dataType = dataType;
- this.serializer = new MessagePackSerializer(dataType);
- }
-
- @Override
- public byte[] serialize(Event element) {
- return serializer.serialize(element.getExtractedFields());
- }
-}
+package com.geedgenetworks.formats.msgpack;
+
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+
+public class MessagePackEventSerializationSchema implements SerializationSchema<Event> {
+    private final StructType dataType;
+    private final MessagePackSerializer serializer;
+
+    public MessagePackEventSerializationSchema(StructType dataType) {
+        this.dataType = dataType;
+        this.serializer = new MessagePackSerializer(dataType);
+    }
+
+    @Override
+    public byte[] serialize(Event element) {
+        return serializer.serialize(element.getExtractedFields());
+    }
+}
diff --git a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactory.java b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactory.java
index f5641c0..cfb47f6 100644
--- a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactory.java
+++ b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactory.java
@@ -1,12 +1,13 @@
package com.geedgenetworks.formats.msgpack;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.EncodingFormatFactory;
-import com.geedgenetworks.core.factories.TableFactory;
-import com.geedgenetworks.core.types.StructType;
+
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.factory.DecodingFormatFactory;
+import com.geedgenetworks.api.factory.EncodingFormatFactory;
+import com.geedgenetworks.api.factory.ConnectorFactory;
+import com.geedgenetworks.api.connector.type.StructType;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.configuration.ConfigOption;
@@ -19,12 +20,12 @@ public class MessagePackFormatFactory implements DecodingFormatFactory, Encoding
    public static final String IDENTIFIER = "msgpack";

    @Override
-    public String factoryIdentifier() {
+    public String type() {
        return IDENTIFIER;
    }

    @Override
-    public DecodingFormat createDecodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
+    public DecodingFormat createDecodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) {
        return new DecodingFormat() {

            @Override
@@ -35,7 +36,7 @@ public class MessagePackFormatFactory implements DecodingFormatFactory, Encoding
    }

    @Override
-    public EncodingFormat createEncodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
+    public EncodingFormat createEncodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) {
        return new EncodingFormat() {

            @Override
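The SPI surface changes here: `factoryIdentifier()` becomes `type()`, and the context type moves from `TableFactory.Context` to `ConnectorFactory.Context`, matching the `META-INF/services` key rename to `com.geedgenetworks.api.factory.Factory` further down. As an illustration only, a `ServiceLoader`-style lookup keyed on the new method might look like the sketch below; `FactoryUtil`'s real implementation is not part of this diff, and the assumption that `type()` is declared on the shared `Factory` interface is mine:

```java
import com.geedgenetworks.api.factory.DecodingFormatFactory;
import com.geedgenetworks.api.factory.Factory;

import java.util.ServiceLoader;

public final class FormatLookupSketch {
    // Hypothetical resolver: scans the Factory service registrations for a
    // decoding format whose type() matches the requested identifier.
    public static DecodingFormatFactory findDecodingFormat(String identifier) {
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            if (factory instanceof DecodingFormatFactory && identifier.equals(factory.type())) {
                return (DecodingFormatFactory) factory;
            }
        }
        throw new IllegalStateException("no format factory for: " + identifier);
    }
}
```

In the tests at the bottom of this diff, the real entry point is `FactoryUtil.discoverConnectorFactory(...)`, which replaces the old `discoverTableFactory(...)`.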
diff --git a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackSerializer.java b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackSerializer.java
index 6848a8d..4dc8316 100644
--- a/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackSerializer.java
+++ b/groot-formats/format-msgpack/src/main/java/com/geedgenetworks/formats/msgpack/MessagePackSerializer.java
@@ -1,332 +1,332 @@
-package com.geedgenetworks.formats.msgpack;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.core.types.*;
-import org.apache.commons.io.IOUtils;
-import org.msgpack.core.MessageBufferPacker;
-import org.msgpack.core.MessagePack;
-import org.msgpack.core.MessagePacker;
-
-import java.io.Serializable;
-import java.nio.charset.StandardCharsets;
-import java.util.ArrayDeque;
-import java.util.Arrays;
-import java.util.List;
-import java.util.Map;
-import java.util.stream.Collectors;
-
-public class MessagePackSerializer implements Serializable {
- private final StructType dataType;
- private final ValueWriter valueWriter;
- private ArrayDeque<MessageBufferPacker> bufferPackers;
-
- public MessagePackSerializer(StructType dataType) {
- this.dataType = dataType;
- this.valueWriter = dataType == null ? null : makeWriter(dataType);
- this.bufferPackers = new ArrayDeque<>();
- }
-
- public byte[] serialize(Map<String, Object> data){
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- try {
- if (dataType == null) {
- writeMapValue(packer, data);
- return packer.toByteArray();
- } else {
- valueWriter.write(packer, data);
- return packer.toByteArray();
- }
- } catch (Exception e){
- throw new RuntimeException(e);
- } finally {
- //packer.close();
- IOUtils.closeQuietly(packer);
- }
- }
-
- private ValueWriter makeWriter(DataType dataType) {
- if (dataType instanceof StringType) {
- return this::writeString;
- }
-
- if (dataType instanceof IntegerType) {
- return this::writeInt;
- }
-
- if (dataType instanceof LongType) {
- return this::writeLong;
- }
-
- if (dataType instanceof FloatType) {
- return this::writeFloat;
- }
-
- if (dataType instanceof DoubleType) {
- return this::writeDouble;
- }
-
- if (dataType instanceof BooleanType) {
- return this::writeBoolean;
- }
-
- if (dataType instanceof BinaryType) {
- return this::writeBinary;
- }
-
- if (dataType instanceof StructType) {
- final Map<String, ValueWriter> fieldWriters = Arrays.stream(((StructType) dataType).fields).collect(Collectors.toMap(f -> f.name, f -> this.makeWriter(f.dataType)));
- return (packer, obj) -> {
- if (obj instanceof Map) {
- writeObject(packer, (Map<String, Object>) obj, fieldWriters);
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to map", obj));
- }
- };
- }
-
- if (dataType instanceof ArrayType) {
- final ValueWriter elementWriter = this.makeWriter(((ArrayType) dataType).elementType);
- return (packer, obj) -> {
- if (obj instanceof List) {
- writeArray(packer, (List<Object>) obj, elementWriter);
- }
- };
- }
-
- throw new UnsupportedOperationException("unsupported dataType: " + dataType);
- }
-
- void writeString(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof String) {
- packer.packString((String) obj);
- } else if (obj instanceof byte[]) {
- byte[] bytes = (byte[]) obj;
- packer.packRawStringHeader(bytes.length);
- packer.writePayload(bytes);
- } else {
- packer.packString(JSON.toJSONString(obj));
- }
- }
-
- void writeInt(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof Number) {
- packer.packInt(((Number) obj).intValue());
- } else if (obj instanceof String) {
- packer.packInt(Integer.parseInt((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to int", obj));
- }
- }
-
- void writeLong(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof Number) {
- packer.packLong(((Number) obj).longValue());
- } else if (obj instanceof String) {
- packer.packLong(Long.parseLong((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to long", obj));
- }
- }
-
- void writeFloat(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof Number) {
- packer.packFloat(((Number) obj).floatValue());
- } else if (obj instanceof String) {
- packer.packFloat(Float.parseFloat((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to float", obj));
- }
- }
-
- void writeDouble(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof Number) {
- packer.packDouble(((Number) obj).doubleValue());
- } else if (obj instanceof String) {
- packer.packDouble(Double.parseDouble((String) obj));
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to double", obj));
- }
- }
-
- void writeBoolean(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof Boolean) {
- packer.packBoolean((Boolean) obj);
- } else if (obj instanceof Number) {
- packer.packBoolean(((Number) obj).intValue() != 0);
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to bool", obj));
- }
- }
-
- void writeBinary(MessagePacker packer, Object obj) throws Exception {
- if (obj instanceof byte[]) {
- byte[] bytes = (byte[]) obj;
- packer.packBinaryHeader(bytes.length);
- packer.writePayload(bytes);
- } else if (obj instanceof String) {
- byte[] bytes = obj.toString().getBytes(StandardCharsets.UTF_8);
- packer.packBinaryHeader(bytes.length);
- packer.writePayload(bytes);
- } else {
- throw new IllegalArgumentException(String.format("can not convert %s to byte[]", obj));
- }
- }
-
- void writeObject(MessagePacker packer, Map<String, Object> map, Map<String, ValueWriter> fieldWriters) throws Exception {
- MessageBufferPacker bufferPacker = getBufferPacker();
- try {
- String key;
- Object value;
- ValueWriter valueWriter;
- int size = 0;
- for (Map.Entry<String, Object> entry : map.entrySet()) {
- key = entry.getKey();
- if (key.startsWith("__")) {
- continue;
- }
- value = entry.getValue();
- if (value == null) {
- continue;
- }
- valueWriter = fieldWriters.get(key);
- if (valueWriter != null) {
- bufferPacker.packString(key);
- valueWriter.write(bufferPacker, value);
- size++;
- }
- }
- byte[] bytes = bufferPacker.toByteArray();
- packer.packMapHeader(size);
- packer.writePayload(bytes);
- } finally {
- recycleBufferPacker(bufferPacker);
- }
- }
-
- void writeArray(MessagePacker packer, List<Object> array, ValueWriter elementWriter) throws Exception {
- packer.packArrayHeader(array.size());
- Object value;
- for (int i = 0; i < array.size(); i++) {
- value = array.get(i);
- if (value == null) {
- packer.packNil();
- continue;
- }
- elementWriter.write(packer, value);
- }
- }
-
- private MessageBufferPacker getBufferPacker() {
- if (bufferPackers.isEmpty()) {
- return MessagePack.newDefaultBufferPacker();
- }
-
- return bufferPackers.pollLast();
- }
-
- private void recycleBufferPacker(MessageBufferPacker bufferPacker) {
- bufferPacker.clear();
- bufferPackers.addLast(bufferPacker);
- }
-
- public void writeValue(MessagePacker packer, Object value) throws Exception {
- if (value instanceof String) {
- packer.packString((String) value);
- return;
- }
-
- if (value instanceof Integer) {
- packer.packInt((Integer) value);
- return;
- }
-
- if (value instanceof Long) {
- packer.packLong((Long) value);
- return;
- }
-
- if (value instanceof Float) {
- packer.packFloat((Float) value);
- return;
- }
-
- if (value instanceof Double) {
- packer.packDouble((Double) value);
- return;
- }
-
- if (value instanceof Number) {
- packer.packLong(((Number) value).longValue());
- return;
- }
-
- if (value instanceof Boolean) {
- packer.packBoolean((Boolean) value);
- return;
- }
-
- if (value instanceof byte[]) {
- byte[] bytes = (byte[]) value;
- packer.packBinaryHeader(bytes.length);
- packer.writePayload(bytes);
- return;
- }
-
- if (value instanceof Map) {
- writeMapValue(packer, (Map<String, Object>) value);
- return;
- }
-
- if (value instanceof List) {
- writeArrayValue(packer, (List<Object>) value);
- return;
- }
-
- throw new UnsupportedOperationException("can not write class:" + value.getClass());
- }
-
- public void writeMapValue(MessagePacker packer, Map<String, Object> map) throws Exception {
- MessageBufferPacker bufferPacker = getBufferPacker();
- try {
- String key;
- Object value;
- int size = 0;
- for (Map.Entry<String, Object> entry : map.entrySet()) {
- key = entry.getKey();
- if (key.startsWith("__")) {
- continue;
- }
- value = entry.getValue();
- if (value == null) {
- continue;
- }
- bufferPacker.packString(key);
- writeValue(bufferPacker, value);
- size++;
- }
- byte[] bytes = bufferPacker.toByteArray();
- packer.packMapHeader(size);
- packer.writePayload(bytes);
- } finally {
- recycleBufferPacker(bufferPacker);
- }
- }
-
- public void writeArrayValue(MessagePacker packer, List<Object> array) throws Exception {
- packer.packArrayHeader(array.size());
- Object value;
- for (int i = 0; i < array.size(); i++) {
- value = array.get(i);
- if (value == null) {
- packer.packNil();
- continue;
- }
- writeValue(packer, value);
- }
- }
-
- @FunctionalInterface
- public interface ValueWriter extends Serializable {
- void write(MessagePacker packer, Object obj) throws Exception;
- }
-}
+package com.geedgenetworks.formats.msgpack;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.api.connector.type.*;
+import org.apache.commons.io.IOUtils;
+import org.msgpack.core.MessageBufferPacker;
+import org.msgpack.core.MessagePack;
+import org.msgpack.core.MessagePacker;
+
+import java.io.Serializable;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayDeque;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+public class MessagePackSerializer implements Serializable {
+    private final StructType dataType;
+    private final ValueWriter valueWriter;
+    private ArrayDeque<MessageBufferPacker> bufferPackers;
+
+    public MessagePackSerializer(StructType dataType) {
+        this.dataType = dataType;
+        this.valueWriter = dataType == null ? null : makeWriter(dataType);
+        this.bufferPackers = new ArrayDeque<>();
+    }
+
+    public byte[] serialize(Map<String, Object> data){
+        MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+        try {
+            if (dataType == null) {
+                writeMapValue(packer, data);
+                return packer.toByteArray();
+            } else {
+                valueWriter.write(packer, data);
+                return packer.toByteArray();
+            }
+        } catch (Exception e){
+            throw new RuntimeException(e);
+        } finally {
+            //packer.close();
+            IOUtils.closeQuietly(packer);
+        }
+    }
+
+    private ValueWriter makeWriter(DataType dataType) {
+        if (dataType instanceof StringType) {
+            return this::writeString;
+        }
+
+        if (dataType instanceof IntegerType) {
+            return this::writeInt;
+        }
+
+        if (dataType instanceof LongType) {
+            return this::writeLong;
+        }
+
+        if (dataType instanceof FloatType) {
+            return this::writeFloat;
+        }
+
+        if (dataType instanceof DoubleType) {
+            return this::writeDouble;
+        }
+
+        if (dataType instanceof BooleanType) {
+            return this::writeBoolean;
+        }
+
+        if (dataType instanceof BinaryType) {
+            return this::writeBinary;
+        }
+
+        if (dataType instanceof StructType) {
+            final Map<String, ValueWriter> fieldWriters = Arrays.stream(((StructType) dataType).fields).collect(Collectors.toMap(f -> f.name, f -> this.makeWriter(f.dataType)));
+            return (packer, obj) -> {
+                if (obj instanceof Map) {
+                    writeObject(packer, (Map<String, Object>) obj, fieldWriters);
+                } else {
+                    throw new IllegalArgumentException(String.format("can not convert %s to map", obj));
+                }
+            };
+        }
+
+        if (dataType instanceof ArrayType) {
+            final ValueWriter elementWriter = this.makeWriter(((ArrayType) dataType).elementType);
+            return (packer, obj) -> {
+                if (obj instanceof List) {
+                    writeArray(packer, (List<Object>) obj, elementWriter);
+                }
+            };
+        }
+
+        throw new UnsupportedOperationException("unsupported dataType: " + dataType);
+    }
+
+    void writeString(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof String) {
+            packer.packString((String) obj);
+        } else if (obj instanceof byte[]) {
+            byte[] bytes = (byte[]) obj;
+            packer.packRawStringHeader(bytes.length);
+            packer.writePayload(bytes);
+        } else {
+            packer.packString(JSON.toJSONString(obj));
+        }
+    }
+
+    void writeInt(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof Number) {
+            packer.packInt(((Number) obj).intValue());
+        } else if (obj instanceof String) {
+            packer.packInt(Integer.parseInt((String) obj));
+        } else {
+            throw new IllegalArgumentException(String.format("can not convert %s to int", obj));
+        }
+    }
+
+    void writeLong(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof Number) {
+            packer.packLong(((Number) obj).longValue());
+        } else if (obj instanceof String) {
+            packer.packLong(Long.parseLong((String) obj));
+        } else {
+            throw new IllegalArgumentException(String.format("can not convert %s to long", obj));
+        }
+    }
+
+    void writeFloat(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof Number) {
+            packer.packFloat(((Number) obj).floatValue());
+        } else if (obj instanceof String) {
+            packer.packFloat(Float.parseFloat((String) obj));
+        } else {
+            throw new IllegalArgumentException(String.format("can not convert %s to float", obj));
+        }
+    }
+
+    void writeDouble(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof Number) {
+            packer.packDouble(((Number) obj).doubleValue());
+        } else if (obj instanceof String) {
+            packer.packDouble(Double.parseDouble((String) obj));
+        } else {
+            throw new IllegalArgumentException(String.format("can not convert %s to double", obj));
+        }
+    }
+
+    void writeBoolean(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof Boolean) {
+            packer.packBoolean((Boolean) obj);
+        } else if (obj instanceof Number) {
+            packer.packBoolean(((Number) obj).intValue() != 0);
+        } else {
+            throw new IllegalArgumentException(String.format("can not convert %s to bool", obj));
+        }
+    }
+
+    void writeBinary(MessagePacker packer, Object obj) throws Exception {
+        if (obj instanceof byte[]) {
+            byte[] bytes = (byte[]) obj;
+            packer.packBinaryHeader(bytes.length);
+            packer.writePayload(bytes);
+        } else if (obj instanceof String) {
+            byte[] bytes = obj.toString().getBytes(StandardCharsets.UTF_8);
+            packer.packBinaryHeader(bytes.length);
+            packer.writePayload(bytes);
+        } else {
+            throw new IllegalArgumentException(String.format("can not convert %s to byte[]", obj));
+        }
+    }
+
+    void writeObject(MessagePacker packer, Map<String, Object> map, Map<String, ValueWriter> fieldWriters) throws Exception {
+        MessageBufferPacker bufferPacker = getBufferPacker();
+        try {
+            String key;
+            Object value;
+            ValueWriter valueWriter;
+            int size = 0;
+            for (Map.Entry<String, Object> entry : map.entrySet()) {
+                key = entry.getKey();
+                if (key.startsWith("__")) {
+                    continue;
+                }
+                value = entry.getValue();
+                if (value == null) {
+                    continue;
+                }
+                valueWriter = fieldWriters.get(key);
+                if (valueWriter != null) {
+                    bufferPacker.packString(key);
+                    valueWriter.write(bufferPacker, value);
+                    size++;
+                }
+            }
+            byte[] bytes = bufferPacker.toByteArray();
+            packer.packMapHeader(size);
+            packer.writePayload(bytes);
+        } finally {
+            recycleBufferPacker(bufferPacker);
+        }
+    }
+
+    void writeArray(MessagePacker packer, List<Object> array, ValueWriter elementWriter) throws Exception {
+        packer.packArrayHeader(array.size());
+        Object value;
+        for (int i = 0; i < array.size(); i++) {
+            value = array.get(i);
+            if (value == null) {
+                packer.packNil();
+                continue;
+            }
+            elementWriter.write(packer, value);
+        }
+    }
+
+    private MessageBufferPacker getBufferPacker() {
+        if (bufferPackers.isEmpty()) {
+            return MessagePack.newDefaultBufferPacker();
+        }
+
+        return bufferPackers.pollLast();
+    }
+
+    private void recycleBufferPacker(MessageBufferPacker bufferPacker) {
+        bufferPacker.clear();
+        bufferPackers.addLast(bufferPacker);
+    }
+
+    public void writeValue(MessagePacker packer, Object value) throws Exception {
+        if (value instanceof String) {
+            packer.packString((String) value);
+            return;
+        }
+
+        if (value instanceof Integer) {
+            packer.packInt((Integer) value);
+            return;
+        }
+
+        if (value instanceof Long) {
+            packer.packLong((Long) value);
+            return;
+        }
+
+        if (value instanceof Float) {
+            packer.packFloat((Float) value);
+            return;
+        }
+
+        if (value instanceof Double) {
+            packer.packDouble((Double) value);
+            return;
+        }
+
+        if (value instanceof Number) {
+            packer.packLong(((Number) value).longValue());
+            return;
+        }
+
+        if (value instanceof Boolean) {
+            packer.packBoolean((Boolean) value);
+            return;
+        }
+
+        if (value instanceof byte[]) {
+            byte[] bytes = (byte[]) value;
+            packer.packBinaryHeader(bytes.length);
+            packer.writePayload(bytes);
+            return;
+        }
+
+        if (value instanceof Map) {
+            writeMapValue(packer, (Map<String, Object>) value);
+            return;
+        }
+
+        if (value instanceof List) {
+            writeArrayValue(packer, (List<Object>) value);
+            return;
+        }
+
+        throw new UnsupportedOperationException("can not write class:" + value.getClass());
+    }
+
+    public void writeMapValue(MessagePacker packer, Map<String, Object> map) throws Exception {
+        MessageBufferPacker bufferPacker = getBufferPacker();
+        try {
+            String key;
+            Object value;
+            int size = 0;
+            for (Map.Entry<String, Object> entry : map.entrySet()) {
+                key = entry.getKey();
+                if (key.startsWith("__")) {
+                    continue;
+                }
+                value = entry.getValue();
+                if (value == null) {
+                    continue;
+                }
+                bufferPacker.packString(key);
+                writeValue(bufferPacker, value);
+                size++;
+            }
+            byte[] bytes = bufferPacker.toByteArray();
+            packer.packMapHeader(size);
+            packer.writePayload(bytes);
+        } finally {
+            recycleBufferPacker(bufferPacker);
+        }
+    }
+
+    public void writeArrayValue(MessagePacker packer, List<Object> array) throws Exception {
+        packer.packArrayHeader(array.size());
+        Object value;
+        for (int i = 0; i < array.size(); i++) {
+            value = array.get(i);
+            if (value == null) {
+                packer.packNil();
+                continue;
+            }
+            writeValue(packer, value);
+        }
+    }
+
+    @FunctionalInterface
+    public interface ValueWriter extends Serializable {
+        void write(MessagePacker packer, Object obj) throws Exception;
+    }
+}
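Two behaviors of the serializer (unchanged here apart from the type-package imports) are worth noting: keys prefixed with `__` and null values are silently dropped, and nested maps are first written into a pooled `MessageBufferPacker` so the map header can be sized before the payload is emitted. A round-trip sketch, not from the commit, with made-up field names and schema:

```java
import com.geedgenetworks.api.connector.type.StructType;
import com.geedgenetworks.api.connector.type.Types;

import java.util.HashMap;
import java.util.Map;

public class SerializerSketch {
    public static void main(String[] args) throws Exception {
        Map<String, Object> fields = new HashMap<>();
        fields.put("id", 42L);
        fields.put("__internal", "dropped"); // skipped: "__" prefix
        fields.put("missing", null);         // skipped: null value

        StructType schema = Types.parseStructType("struct<id: bigint>");
        byte[] bytes = new MessagePackSerializer(schema).serialize(fields);

        Map<String, Object> back = new MessagePackDeserializer(schema).deserialize(bytes);
        System.out.println(back);            // {id=42}
    }
}
```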
diff --git a/groot-formats/format-msgpack/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-formats/format-msgpack/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
index 6be6a2c..83ace6c 100644
--- a/groot-formats/format-msgpack/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory
+++ b/groot-formats/format-msgpack/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
@@ -1 +1 @@
-com.geedgenetworks.formats.msgpack.MessagePackFormatFactory
+com.geedgenetworks.formats.msgpack.MessagePackFormatFactory
diff --git a/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializerTest.java b/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializerTest.java
index cb45ab4..f0603f5 100644
--- a/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializerTest.java
+++ b/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackDeserializerTest.java
@@ -1,231 +1,231 @@
-package com.geedgenetworks.formats.msgpack;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.junit.jupiter.api.Test;
-import org.msgpack.core.MessageBufferPacker;
-import org.msgpack.core.MessagePack;
-import org.msgpack.value.MapValue;
-import org.msgpack.value.ValueFactory;
-
-import java.nio.charset.StandardCharsets;
-import java.util.List;
-import java.util.Map;
-
-import static org.junit.jupiter.api.Assertions.*;
-
-public class MessagePackDeserializerTest {
- @Test
- public void testDeserSimpleData() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
- .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
-
- MessagePackDeserializer deserializer = new MessagePackDeserializer(null);
- Map<String, Object> rst = deserializer.deserialize(bytes);
- System.out.println(mapValue.toJson());
- System.out.println(JSON.toJSONString(rst));
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), 512);
- assertEquals(rst.get("uint32"), 33554432L);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), -512);
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), true);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432L});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432L );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
-
- }
-
- @Test
- public void testDeserSimpleDataWithSchema() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
- .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
-
- StructType dataType = Types.parseStructType("struct<uint8: int, uint16: int, uint32: int, uint64: bigint, int8: int, int16: int, int32: int, int64: bigint, double: double," +
- "bool_true: boolean, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
- "obj:struct<uint8: int, uint32: int, double: double, str: string>>");
- MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
- Map<String, Object> rst = deserializer.deserialize(bytes);
- System.out.println(mapValue.toJson());
- System.out.println(JSON.toJSONString(rst));
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), 512);
- assertEquals(rst.get("uint32"), 33554432);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), -512);
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), true);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
-
- }
-
- @Test
- public void testDeserSimpleDataWithSchemaTypeConvert() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newString("123"));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newString("17179869184"));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newString("-17179869184"));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newString("123.2"));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newString("512"), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"))
- .put(ValueFactory.newString("double"), ValueFactory.newString("123.2"))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
-
- StructType dataType = Types.parseStructType("struct<uint8: int, uint16: string, uint32: int, uint64: bigint, int8: int, int16: string, int32: int, int64: bigint, double: double," +
- "bool_true: int, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
- "obj:struct<uint8: string, uint32: int, double: double, str: binary>>");
- MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
- Map<String, Object> rst = deserializer.deserialize(bytes);
- System.out.println(mapValue.toJson());
- System.out.println(JSON.toJSONString(rst));
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), "512");
- assertEquals(rst.get("uint32"), 33554432);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), "-512");
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), 1);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), "123");
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertArrayEquals((byte[])((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
-
- }
+package com.geedgenetworks.formats.msgpack;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
+import org.junit.jupiter.api.Test;
+import org.msgpack.core.MessageBufferPacker;
+import org.msgpack.core.MessagePack;
+import org.msgpack.value.MapValue;
+import org.msgpack.value.ValueFactory;
+
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.jupiter.api.Assertions.*;
+
+public class MessagePackDeserializerTest {
+    @Test
+    public void testDeserSimpleData() throws Exception{
+        ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+        map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
+        map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+        map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
+        map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
+        map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+        map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+        map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+        map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
+        map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+        map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
+
+        map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+        map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+        map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+        map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+        map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+        map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+        map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+                .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+                .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
+                .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
+                .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+                .build());
+
+
+        MapValue mapValue = map.build();
+        MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+        packer.packValue(mapValue);
+        byte[] bytes = packer.toByteArray();
+        packer.close();
+
+        MessagePackDeserializer deserializer = new MessagePackDeserializer(null);
+        Map<String, Object> rst = deserializer.deserialize(bytes);
+        System.out.println(mapValue.toJson());
+        System.out.println(JSON.toJSONString(rst));
+
+        assertEquals(rst.get("uint8"), 123);
+        assertEquals(rst.get("uint16"), 512);
+        assertEquals(rst.get("uint32"), 33554432L);
+        assertEquals(rst.get("uint64"), 17179869184L);
+        assertEquals(rst.get("int8"), -123);
+        assertEquals(rst.get("int16"), -512);
+        assertEquals(rst.get("int32"), -33554432);
+        assertEquals(rst.get("int64"), -17179869184L);
+
+        assertEquals(rst.get("double"), 123.2);
+        assertEquals(rst.get("bool_true"), true);
+        assertEquals(rst.get("bool_false"), false);
+
+        assertEquals(rst.get("str"), "ut8字符串");
+        assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+        assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432L});
+        assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432L );
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
+
+    }
+
+    @Test
+    public void testDeserSimpleDataWithSchema() throws Exception{
+        ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+        map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
+        map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+        map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
+        map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
+        map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+        map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+        map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+        map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
+        map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+        map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
+
+        map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+        map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+        map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+        map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+        map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+        map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+        map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+                .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+                .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
+                .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
+                .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+                .build());
+
+
+        MapValue mapValue = map.build();
+        MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+        packer.packValue(mapValue);
+        byte[] bytes = packer.toByteArray();
+        packer.close();
+
+        StructType dataType = Types.parseStructType("struct<uint8: int, uint16: int, uint32: int, uint64: bigint, int8: int, int16: int, int32: int, int64: bigint, double: double," +
+                "bool_true: boolean, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
+                "obj:struct<uint8: int, uint32: int, double: double, str: string>>");
+        MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
+        Map<String, Object> rst = deserializer.deserialize(bytes);
+        System.out.println(mapValue.toJson());
+        System.out.println(JSON.toJSONString(rst));
+
+        assertEquals(rst.get("uint8"), 123);
+        assertEquals(rst.get("uint16"), 512);
+        assertEquals(rst.get("uint32"), 33554432);
+        assertEquals(rst.get("uint64"), 17179869184L);
+        assertEquals(rst.get("int8"), -123);
+        assertEquals(rst.get("int16"), -512);
+        assertEquals(rst.get("int32"), -33554432);
+        assertEquals(rst.get("int64"), -17179869184L);
+
+        assertEquals(rst.get("double"), 123.2);
+        assertEquals(rst.get("bool_true"), true);
+        assertEquals(rst.get("bool_false"), false);
+
+        assertEquals(rst.get("str"), "ut8字符串");
+        assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+        assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
+        assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
+
+    }
+
+    @Test
+    public void testDeserSimpleDataWithSchemaTypeConvert() throws Exception{
+        ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+        map.put(ValueFactory.newString("uint8"), ValueFactory.newString("123"));
+        map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+        map.put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"));
+        map.put(ValueFactory.newString("uint64"), ValueFactory.newString("17179869184"));
+        map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+        map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+        map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+        map.put(ValueFactory.newString("int64"), ValueFactory.newString("-17179869184"));
+        map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+        map.put(ValueFactory.newString("double"), ValueFactory.newString("123.2"));
+
+        map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+        map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+        map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+        map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+        map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newString("512"), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+        map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+        map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+                .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+                .put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"))
+                .put(ValueFactory.newString("double"), ValueFactory.newString("123.2"))
+                .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+                .build());
+
+
+        MapValue mapValue = map.build();
+        MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+        packer.packValue(mapValue);
+        byte[] bytes = packer.toByteArray();
+        packer.close();
+
+        StructType dataType = Types.parseStructType("struct<uint8: int, uint16: string, uint32: int, uint64: bigint, int8: int, int16: string, int32: int, int64: bigint, double: double," +
+                "bool_true: int, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
+                "obj:struct<uint8: string, uint32: int, double: double, str: binary>>");
+        MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
+        Map<String, Object> rst = deserializer.deserialize(bytes);
+        System.out.println(mapValue.toJson());
+        System.out.println(JSON.toJSONString(rst));
+
+        assertEquals(rst.get("uint8"), 123);
+        assertEquals(rst.get("uint16"), "512");
+        assertEquals(rst.get("uint32"), 33554432);
+        assertEquals(rst.get("uint64"), 17179869184L);
+        assertEquals(rst.get("int8"), -123);
+        assertEquals(rst.get("int16"), "-512");
+        assertEquals(rst.get("int32"), -33554432);
+        assertEquals(rst.get("int64"), -17179869184L);
+
+        assertEquals(rst.get("double"), 123.2);
+        assertEquals(rst.get("bool_true"), 1);
+        assertEquals(rst.get("bool_false"), false);
+
+        assertEquals(rst.get("str"), "ut8字符串");
+        assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+        assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
+        assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), "123");
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
+        assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+        assertArrayEquals((byte[])((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+
+    }
}
\ No newline at end of file
diff --git a/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactoryTest.java b/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactoryTest.java
index fbdce2d..9119317 100644
--- a/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactoryTest.java
+++ b/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackFormatFactoryTest.java
@@ -1,100 +1,100 @@
-package com.geedgenetworks.formats.msgpack;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.sink.SinkProvider;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SinkTableFactory;
-import com.geedgenetworks.core.factories.SourceTableFactory;
-import com.geedgenetworks.core.factories.TableFactory;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.streaming.api.datastream.DataStreamSink;
-import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
-import org.msgpack.core.MessageBufferPacker;
-import org.msgpack.core.MessagePack;
-import org.msgpack.value.MapValue;
-import org.msgpack.value.ValueFactory;
-
-import java.nio.charset.StandardCharsets;
-import java.util.Base64;
-import java.util.HashMap;
-import java.util.Map;
-
-public class MessagePackFormatFactoryTest {
-
- private static byte[] getTestBytes() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
- .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
- return bytes;
- }
-
- public static void main(String[] args) throws Exception{
- byte[] bytes = getTestBytes();
-
- SourceTableFactory tableFactory = FactoryUtil.discoverTableFactory(SourceTableFactory.class, "inline");
- Map<String, String> options = new HashMap<>();
- options.put("data", Base64.getEncoder().encodeToString(bytes));
- options.put("type", "base64");
- options.put("format", "msgpack");
-
- Configuration configuration = Configuration.fromMap(options);
- TableFactory.Context context = new TableFactory.Context( null, options, configuration);
- SourceProvider sourceProvider = tableFactory.getSourceProvider(context);
-
-
- SinkTableFactory sinkTableFactory = FactoryUtil.discoverTableFactory(SinkTableFactory.class, "print");
- options = new HashMap<>();
- options.put("format", "msgpack");
- configuration = Configuration.fromMap(options);
- context = new TableFactory.Context( null, options, configuration);
- SinkProvider sinkProvider = sinkTableFactory.getSinkProvider(context);
-
- StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
- env.setParallelism(1);
- SingleOutputStreamOperator<Event> dataStream = sourceProvider.produceDataStream(env);
-
- DataStreamSink<?> dataStreamSink = sinkProvider.consumeDataStream(dataStream);
- dataStreamSink.uid("sink").setParallelism(1);
-
- env.execute("test");
- }
-
-
-
-}
+package com.geedgenetworks.formats.msgpack;
+
+import com.geedgenetworks.api.connector.sink.SinkProvider;
+import com.geedgenetworks.api.connector.sink.SinkTableFactory;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.source.SourceTableFactory;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.factory.ConnectorFactory;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.datastream.DataStreamSink;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.msgpack.core.MessageBufferPacker;
+import org.msgpack.core.MessagePack;
+import org.msgpack.value.MapValue;
+import org.msgpack.value.ValueFactory;
+
+import java.nio.charset.StandardCharsets;
+import java.util.Base64;
+import java.util.HashMap;
+import java.util.Map;
+
+public class MessagePackFormatFactoryTest {
+
+ private static byte[] getTestBytes() throws Exception{
+ ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+ map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
+ map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+ map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
+ map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
+ map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+ map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+ map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+ map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
+ map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+ map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
+
+ map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+ map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+ map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+ map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+ map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+ map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+ map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+ .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+ .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
+ .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
+ .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+ .build());
+
+
+ MapValue mapValue = map.build();
+ MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+ packer.packValue(mapValue);
+ byte[] bytes = packer.toByteArray();
+ packer.close();
+ return bytes;
+ }
+
+ public static void main(String[] args) throws Exception{
+ byte[] bytes = getTestBytes();
+
+ SourceTableFactory tableFactory = FactoryUtil.discoverConnectorFactory(SourceTableFactory.class, "inline");
+ Map<String, String> options = new HashMap<>();
+ options.put("data", Base64.getEncoder().encodeToString(bytes));
+ options.put("repeat.count", "3");
+ options.put("type", "base64");
+ options.put("format", "msgpack");
+
+ Configuration configuration = Configuration.fromMap(options);
+ ConnectorFactory.Context context = new ConnectorFactory.Context( null, options, configuration);
+ SourceProvider sourceProvider = tableFactory.getSourceProvider(context);
+
+
+ SinkTableFactory sinkTableFactory = FactoryUtil.discoverConnectorFactory(SinkTableFactory.class, "print");
+ options = new HashMap<>();
+ options.put("format", "msgpack");
+ configuration = Configuration.fromMap(options);
+ context = new ConnectorFactory.Context( null, options, configuration);
+ SinkProvider sinkProvider = sinkTableFactory.getSinkProvider(context);
+
+ StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+ env.setParallelism(1);
+ SingleOutputStreamOperator<Event> dataStream = sourceProvider.produceDataStream(env);
+
+ DataStreamSink<?> dataStreamSink = sinkProvider.consumeDataStream(dataStream);
+ dataStreamSink.uid("sink").setParallelism(1);
+
+ env.execute("test");
+ }
+
+
+
+}
diff --git a/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackSerializerTest.java b/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackSerializerTest.java
index 2b897e9..767301d 100644
--- a/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackSerializerTest.java
+++ b/groot-formats/format-msgpack/src/test/java/com/geedgenetworks/formats/msgpack/MessagePackSerializerTest.java
@@ -1,407 +1,407 @@
-package com.geedgenetworks.formats.msgpack;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.junit.jupiter.api.Test;
-import org.msgpack.core.MessageBufferPacker;
-import org.msgpack.core.MessagePack;
-import org.msgpack.value.MapValue;
-import org.msgpack.value.ValueFactory;
-
-import java.nio.charset.StandardCharsets;
-import java.util.Arrays;
-import java.util.Base64;
-import java.util.List;
-import java.util.Map;
-
-import static org.junit.jupiter.api.Assertions.*;
-import static org.junit.jupiter.api.Assertions.assertArrayEquals;
-
-public class MessagePackSerializerTest {
-
- public static void main(String[] args) throws Exception {
- // '{"log_id": 1, "recv_time":"111", "client_ip":"192.168.0.1"}'
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("log_id"), ValueFactory.newInteger(1));
- map.put(ValueFactory.newString("recv_time"), ValueFactory.newInteger(System.currentTimeMillis() / 1000));
- map.put(ValueFactory.newString("client_ip"), ValueFactory.newString("192.168.0.1"));
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
- String str = Base64.getEncoder().encodeToString(bytes);
- System.out.println(mapValue);
- System.out.println(str);
- }
-
- @Test
- public void testStringEncodeDecodeReversibility() throws Exception {
- byte[] bytes1 = "一个utf-8字符串".getBytes(StandardCharsets.UTF_8);
- byte[] bytes2 = new byte[256];
- for (int i = 0; i < bytes2.length; i++) {
- bytes2[i] = (byte) i;
- }
- byte[] bytes3 = new byte[128];
- for (int i = 0; i < bytes3.length; i++) {
- bytes3[i] = (byte) i;
- }
-
- List<byte[]> bytesList = Arrays.asList(bytes1, bytes2, bytes3);
- for (byte[] bytes : bytesList) {
- String str = new String(bytes, StandardCharsets.UTF_8);
- byte[] bytesEncodeDecode = str.getBytes(StandardCharsets.UTF_8);
- System.out.println(str);
- System.out.println(bytes.length + "," + bytesEncodeDecode.length + "," + Arrays.equals(bytes, bytesEncodeDecode));
- System.out.println("--------");
- }
- }
-
- @Test
- public void testJsonToString() throws Exception {
- Object[] objs = new Object[]{1, 512, 33554432, 17179869184L,123.2 ,1233333.23, "abc", "ut8字符串"};
- for (Object obj : objs) {
- System.out.println(obj.toString() + " , " + JSON.toJSONString(obj)+ " , " + obj.toString().equals(JSON.toJSONString(obj)));
- }
- }
-
- @Test
- public void testSerSimpleData() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
- .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
-
- MessagePackDeserializer deserializer = new MessagePackDeserializer(null);
- Map<String, Object> data = deserializer.deserialize(bytes);
-
- MessagePackSerializer serializer = new MessagePackSerializer(null);
- byte[] bytes2 = serializer.serialize(data);
- Map<String, Object> rst = deserializer.deserialize(bytes2);
-
- System.out.println(mapValue.toJson());
- System.out.println(JSON.toJSONString(data));
- System.out.println(JSON.toJSONString(rst));
-
- System.out.println(bytes.length + "," + bytes2.length);
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), 512);
- assertEquals(rst.get("uint32"), 33554432L);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), -512);
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), true);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432L});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432L );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
-
- for (int i = 0; i < 10; i++) {
- //System.out.println("###########" + i);
- bytes2 = serializer.serialize(data);
- rst = deserializer.deserialize(bytes2);
-
- System.out.println(bytes.length + "," + bytes2.length);
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), 512);
- assertEquals(rst.get("uint32"), 33554432L);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), -512);
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), true);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432L});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432L );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
- }
- }
-
- @Test
- public void testSerSimpleDataWithSchema() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
- .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
-
- StructType dataType = Types.parseStructType("struct<uint8: int, uint16: int, uint32: int, uint64: bigint, int8: int, int16: int, int32: int, int64: bigint, double: double," +
- "bool_true: boolean, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
- "obj:struct<uint8: int, uint32: int, double: double, str: string>>");
-
- MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
- Map<String, Object> data = deserializer.deserialize(bytes);
-
- MessagePackSerializer serializer = new MessagePackSerializer(dataType);
- byte[] bytes2 = serializer.serialize(data);
- Map<String, Object> rst = deserializer.deserialize(bytes2);
-
- String str = new String(bytes2, StandardCharsets.UTF_8);
- byte[] bytes3 = str.getBytes(StandardCharsets.UTF_8);
- System.out.println(bytes2.length + "," + bytes3.length + "," + Arrays.equals(bytes2, bytes3));
-
- System.out.println(mapValue.toJson());
- System.out.println(JSON.toJSONString(data));
- System.out.println(JSON.toJSONString(rst));
-
- System.out.println(bytes.length + "," + bytes2.length);
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), 512);
- assertEquals(rst.get("uint32"), 33554432);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), -512);
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), true);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
-
- for (int i = 0; i < 10; i++) {
- //System.out.println("###########" + i);
- bytes2 = serializer.serialize(data);
- rst = deserializer.deserialize(bytes2);
-
- System.out.println(bytes.length + "," + bytes2.length);
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), 512);
- assertEquals(rst.get("uint32"), 33554432);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), -512);
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), true);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
-
- }
- }
-
- @Test
- public void testSerSimpleDataWithSchemaTypeConvert() throws Exception{
- ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
- map.put(ValueFactory.newString("uint8"), ValueFactory.newString("123"));
- map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
- map.put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"));
- map.put(ValueFactory.newString("uint64"), ValueFactory.newString("17179869184"));
- map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
- map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
- map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
- map.put(ValueFactory.newString("int64"), ValueFactory.newString("-17179869184"));
- map.put(ValueFactory.newString("null"), ValueFactory.newNil());
-
- map.put(ValueFactory.newString("double"), ValueFactory.newString("123.2"));
-
- map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
- map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
-
- map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
-
- map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
-
- map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newString("512"), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
- map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
-
- map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
- .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
- .put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"))
- .put(ValueFactory.newString("double"), ValueFactory.newString("123.2"))
- .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
- .build());
-
-
- MapValue mapValue = map.build();
- MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
- packer.packValue(mapValue);
- byte[] bytes = packer.toByteArray();
- packer.close();
-
- StructType dataType = Types.parseStructType("struct<uint8: int, uint16: string, uint32: int, uint64: bigint, int8: int, int16: string, int32: int, int64: bigint, double: double," +
- "bool_true: int, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
- "obj:struct<uint8: string, uint32: int, double: double, str: binary>>");
-
- StructType dataType2 = Types.parseStructType("struct<uint8: int, uint16: int, uint32: int, uint64: bigint, int8: int, int16: int, int32: int, int64: bigint, double: double," +
- "bool_true: boolean, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
- "obj:struct<uint8: int, uint32: int, double: double, str: string>>");
-
- MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
- Map<String, Object> data = deserializer.deserialize(bytes);
-
- MessagePackSerializer serializer = new MessagePackSerializer(dataType2);
- byte[] bytes2 = serializer.serialize(data);
- Map<String, Object> rst = deserializer.deserialize(bytes2);
-
- System.out.println(mapValue.toJson());
- System.out.println(JSON.toJSONString(data));
- System.out.println(JSON.toJSONString(rst));
-
- System.out.println(bytes.length + "," + bytes2.length);
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), "512");
- assertEquals(rst.get("uint32"), 33554432);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), "-512");
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), 1);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), "123");
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertArrayEquals((byte[])((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
-
- for (int i = 0; i < 10; i++) {
- //System.out.println("###########" + i);
- bytes2 = serializer.serialize(data);
- rst = deserializer.deserialize(bytes2);
-
- System.out.println(bytes.length + "," + bytes2.length);
-
- assertEquals(rst.get("uint8"), 123);
- assertEquals(rst.get("uint16"), "512");
- assertEquals(rst.get("uint32"), 33554432);
- assertEquals(rst.get("uint64"), 17179869184L);
- assertEquals(rst.get("int8"), -123);
- assertEquals(rst.get("int16"), "-512");
- assertEquals(rst.get("int32"), -33554432);
- assertEquals(rst.get("int64"), -17179869184L);
-
- assertEquals(rst.get("double"), 123.2);
- assertEquals(rst.get("bool_true"), 1);
- assertEquals(rst.get("bool_false"), false);
-
- assertEquals(rst.get("str"), "ut8字符串");
- assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
- assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
- assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
-
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), "123");
- assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
- assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
- assertArrayEquals((byte[])((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
-
- }
- }
+package com.geedgenetworks.formats.msgpack;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
+import org.junit.jupiter.api.Test;
+import org.msgpack.core.MessageBufferPacker;
+import org.msgpack.core.MessagePack;
+import org.msgpack.value.MapValue;
+import org.msgpack.value.ValueFactory;
+
+import java.nio.charset.StandardCharsets;
+import java.util.Arrays;
+import java.util.Base64;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.jupiter.api.Assertions.*;
+import static org.junit.jupiter.api.Assertions.assertArrayEquals;
+
+public class MessagePackSerializerTest {
+
+ public static void main(String[] args) throws Exception {
+ // '{"log_id": 1, "recv_time":"111", "client_ip":"192.168.0.1"}'
+ ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+ map.put(ValueFactory.newString("log_id"), ValueFactory.newInteger(1));
+ map.put(ValueFactory.newString("recv_time"), ValueFactory.newInteger(System.currentTimeMillis() / 1000));
+ map.put(ValueFactory.newString("client_ip"), ValueFactory.newString("192.168.0.1"));
+ MapValue mapValue = map.build();
+ MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+ packer.packValue(mapValue);
+ byte[] bytes = packer.toByteArray();
+ packer.close();
+ String str = Base64.getEncoder().encodeToString(bytes);
+ System.out.println(mapValue);
+ System.out.println(str);
+ }
+
+ @Test
+ public void testStringEncodeDecodeReversibility() throws Exception {
+ byte[] bytes1 = "一个utf-8字符串".getBytes(StandardCharsets.UTF_8);
+ byte[] bytes2 = new byte[256];
+ for (int i = 0; i < bytes2.length; i++) {
+ bytes2[i] = (byte) i;
+ }
+ byte[] bytes3 = new byte[128];
+ for (int i = 0; i < bytes3.length; i++) {
+ bytes3[i] = (byte) i;
+ }
+
+ List<byte[]> bytesList = Arrays.asList(bytes1, bytes2, bytes3);
+ for (byte[] bytes : bytesList) {
+ String str = new String(bytes, StandardCharsets.UTF_8);
+ byte[] bytesEncodeDecode = str.getBytes(StandardCharsets.UTF_8);
+ System.out.println(str);
+ System.out.println(bytes.length + "," + bytesEncodeDecode.length + "," + Arrays.equals(bytes, bytesEncodeDecode));
+ System.out.println("--------");
+ }
+ }
+
+ @Test
+ public void testJsonToString() throws Exception {
+ Object[] objs = new Object[]{1, 512, 33554432, 17179869184L,123.2 ,1233333.23, "abc", "ut8字符串"};
+ for (Object obj : objs) {
+ System.out.println(obj.toString() + " , " + JSON.toJSONString(obj)+ " , " + obj.toString().equals(JSON.toJSONString(obj)));
+ }
+ }
+
+ @Test
+ public void testSerSimpleData() throws Exception{
+ ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+ map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
+ map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+ map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
+ map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
+ map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+ map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+ map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+ map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
+ map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+ map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
+
+ map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+ map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+ map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+ map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+ map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+ map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+ map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+ .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+ .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
+ .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
+ .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+ .build());
+
+
+ MapValue mapValue = map.build();
+ MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+ packer.packValue(mapValue);
+ byte[] bytes = packer.toByteArray();
+ packer.close();
+
+ MessagePackDeserializer deserializer = new MessagePackDeserializer(null);
+ Map<String, Object> data = deserializer.deserialize(bytes);
+
+ MessagePackSerializer serializer = new MessagePackSerializer(null);
+ byte[] bytes2 = serializer.serialize(data);
+ Map<String, Object> rst = deserializer.deserialize(bytes2);
+
+ System.out.println(mapValue.toJson());
+ System.out.println(JSON.toJSONString(data));
+ System.out.println(JSON.toJSONString(rst));
+
+ System.out.println(bytes.length + "," + bytes2.length);
+
+ assertEquals(rst.get("uint8"), 123);
+ assertEquals(rst.get("uint16"), 512);
+ assertEquals(rst.get("uint32"), 33554432L);
+ assertEquals(rst.get("uint64"), 17179869184L);
+ assertEquals(rst.get("int8"), -123);
+ assertEquals(rst.get("int16"), -512);
+ assertEquals(rst.get("int32"), -33554432);
+ assertEquals(rst.get("int64"), -17179869184L);
+
+ assertEquals(rst.get("double"), 123.2);
+ assertEquals(rst.get("bool_true"), true);
+ assertEquals(rst.get("bool_false"), false);
+
+ assertEquals(rst.get("str"), "ut8字符串");
+ assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+ assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432L});
+ assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432L );
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
+
+ for (int i = 0; i < 10; i++) {
+ //System.out.println("###########" + i);
+ bytes2 = serializer.serialize(data);
+ rst = deserializer.deserialize(bytes2);
+
+ System.out.println(bytes.length + "," + bytes2.length);
+
+ assertEquals(rst.get("uint8"), 123);
+ assertEquals(rst.get("uint16"), 512);
+ assertEquals(rst.get("uint32"), 33554432L);
+ assertEquals(rst.get("uint64"), 17179869184L);
+ assertEquals(rst.get("int8"), -123);
+ assertEquals(rst.get("int16"), -512);
+ assertEquals(rst.get("int32"), -33554432);
+ assertEquals(rst.get("int64"), -17179869184L);
+
+ assertEquals(rst.get("double"), 123.2);
+ assertEquals(rst.get("bool_true"), true);
+ assertEquals(rst.get("bool_false"), false);
+
+ assertEquals(rst.get("str"), "ut8字符串");
+ assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+ assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432L});
+ assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432L );
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
+ }
+ }
+
+ @Test
+ public void testSerSimpleDataWithSchema() throws Exception{
+ ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+ map.put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123));
+ map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+ map.put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432));
+ map.put(ValueFactory.newString("uint64"), ValueFactory.newInteger(17179869184L));
+ map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+ map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+ map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+ map.put(ValueFactory.newString("int64"), ValueFactory.newInteger(-17179869184L));
+ map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+ map.put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2));
+
+ map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+ map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+ map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+ map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+ map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newInteger(512), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+ map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+ map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+ .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+ .put(ValueFactory.newString("uint32"), ValueFactory.newInteger(33554432))
+ .put(ValueFactory.newString("double"), ValueFactory.newFloat(123.2))
+ .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+ .build());
+
+
+ MapValue mapValue = map.build();
+ MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+ packer.packValue(mapValue);
+ byte[] bytes = packer.toByteArray();
+ packer.close();
+
+ StructType dataType = Types.parseStructType("struct<uint8: int, uint16: int, uint32: int, uint64: bigint, int8: int, int16: int, int32: int, int64: bigint, double: double," +
+ "bool_true: boolean, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
+ "obj:struct<uint8: int, uint32: int, double: double, str: string>>");
+
+ MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
+ Map<String, Object> data = deserializer.deserialize(bytes);
+
+ MessagePackSerializer serializer = new MessagePackSerializer(dataType);
+ byte[] bytes2 = serializer.serialize(data);
+ Map<String, Object> rst = deserializer.deserialize(bytes2);
+
+ String str = new String(bytes2, StandardCharsets.UTF_8);
+ byte[] bytes3 = str.getBytes(StandardCharsets.UTF_8);
+ System.out.println(bytes2.length + "," + bytes3.length + "," + Arrays.equals(bytes2, bytes3));
+
+ System.out.println(mapValue.toJson());
+ System.out.println(JSON.toJSONString(data));
+ System.out.println(JSON.toJSONString(rst));
+
+ System.out.println(bytes.length + "," + bytes2.length);
+
+ assertEquals(rst.get("uint8"), 123);
+ assertEquals(rst.get("uint16"), 512);
+ assertEquals(rst.get("uint32"), 33554432);
+ assertEquals(rst.get("uint64"), 17179869184L);
+ assertEquals(rst.get("int8"), -123);
+ assertEquals(rst.get("int16"), -512);
+ assertEquals(rst.get("int32"), -33554432);
+ assertEquals(rst.get("int64"), -17179869184L);
+
+ assertEquals(rst.get("double"), 123.2);
+ assertEquals(rst.get("bool_true"), true);
+ assertEquals(rst.get("bool_false"), false);
+
+ assertEquals(rst.get("str"), "ut8字符串");
+ assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+ assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
+ assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
+
+ for (int i = 0; i < 10; i++) {
+ //System.out.println("###########" + i);
+ bytes2 = serializer.serialize(data);
+ rst = deserializer.deserialize(bytes2);
+
+ System.out.println(bytes.length + "," + bytes2.length);
+
+ assertEquals(rst.get("uint8"), 123);
+ assertEquals(rst.get("uint16"), 512);
+ assertEquals(rst.get("uint32"), 33554432);
+ assertEquals(rst.get("uint64"), 17179869184L);
+ assertEquals(rst.get("int8"), -123);
+ assertEquals(rst.get("int16"), -512);
+ assertEquals(rst.get("int32"), -33554432);
+ assertEquals(rst.get("int64"), -17179869184L);
+
+ assertEquals(rst.get("double"), 123.2);
+ assertEquals(rst.get("bool_true"), true);
+ assertEquals(rst.get("bool_false"), false);
+
+ assertEquals(rst.get("str"), "ut8字符串");
+ assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+ assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
+ assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), 123);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串");
+
+ }
+ }
+
+ @Test
+ public void testSerSimpleDataWithSchemaTypeConvert() throws Exception{
+ ValueFactory.MapBuilder map = ValueFactory.newMapBuilder();
+ map.put(ValueFactory.newString("uint8"), ValueFactory.newString("123"));
+ map.put(ValueFactory.newString("uint16"), ValueFactory.newInteger(512));
+ map.put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"));
+ map.put(ValueFactory.newString("uint64"), ValueFactory.newString("17179869184"));
+ map.put(ValueFactory.newString("int8"), ValueFactory.newInteger(-123));
+ map.put(ValueFactory.newString("int16"), ValueFactory.newInteger(-512));
+ map.put(ValueFactory.newString("int32"), ValueFactory.newInteger(-33554432));
+ map.put(ValueFactory.newString("int64"), ValueFactory.newString("-17179869184"));
+ map.put(ValueFactory.newString("null"), ValueFactory.newNil());
+
+ map.put(ValueFactory.newString("double"), ValueFactory.newString("123.2"));
+
+ map.put(ValueFactory.newString("bool_true"), ValueFactory.newBoolean(true));
+ map.put(ValueFactory.newString("bool_false"), ValueFactory.newBoolean(false));
+
+ map.put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"));
+
+ map.put(ValueFactory.newString("binary"), ValueFactory.newBinary("ut8字符串".getBytes(StandardCharsets.UTF_8)));
+
+ map.put(ValueFactory.newString("int32_array"), ValueFactory.newArray(ValueFactory.newInteger(123), ValueFactory.newString("512"), ValueFactory.newNil(), ValueFactory.newInteger(33554432)));
+ map.put(ValueFactory.newString("str_array"), ValueFactory.newArray(ValueFactory.newString("ut8字符串1"), ValueFactory.newNil(), ValueFactory.newString("ut8字符串2")));
+
+ map.put(ValueFactory.newString("obj"), ValueFactory.newMapBuilder()
+ .put(ValueFactory.newString("uint8"), ValueFactory.newInteger(123))
+ .put(ValueFactory.newString("uint32"), ValueFactory.newString("33554432"))
+ .put(ValueFactory.newString("double"), ValueFactory.newString("123.2"))
+ .put(ValueFactory.newString("str"), ValueFactory.newString("ut8字符串"))
+ .build());
+
+
+ MapValue mapValue = map.build();
+ MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
+ packer.packValue(mapValue);
+ byte[] bytes = packer.toByteArray();
+ packer.close();
+
+ StructType dataType = Types.parseStructType("struct<uint8: int, uint16: string, uint32: int, uint64: bigint, int8: int, int16: string, int32: int, int64: bigint, double: double," +
+ "bool_true: int, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
+ "obj:struct<uint8: string, uint32: int, double: double, str: binary>>");
+
+ StructType dataType2 = Types.parseStructType("struct<uint8: int, uint16: int, uint32: int, uint64: bigint, int8: int, int16: int, int32: int, int64: bigint, double: double," +
+ "bool_true: boolean, bool_false: boolean, str: string, binary: binary, int32_array:array<int>, str_array:array<string>, " +
+ "obj:struct<uint8: int, uint32: int, double: double, str: string>>");
+
+ MessagePackDeserializer deserializer = new MessagePackDeserializer(dataType);
+ Map<String, Object> data = deserializer.deserialize(bytes);
+
+ MessagePackSerializer serializer = new MessagePackSerializer(dataType2);
+ byte[] bytes2 = serializer.serialize(data);
+ Map<String, Object> rst = deserializer.deserialize(bytes2);
+
+ System.out.println(mapValue.toJson());
+ System.out.println(JSON.toJSONString(data));
+ System.out.println(JSON.toJSONString(rst));
+
+ System.out.println(bytes.length + "," + bytes2.length);
+
+ assertEquals(rst.get("uint8"), 123);
+ assertEquals(rst.get("uint16"), "512");
+ assertEquals(rst.get("uint32"), 33554432);
+ assertEquals(rst.get("uint64"), 17179869184L);
+ assertEquals(rst.get("int8"), -123);
+ assertEquals(rst.get("int16"), "-512");
+ assertEquals(rst.get("int32"), -33554432);
+ assertEquals(rst.get("int64"), -17179869184L);
+
+ assertEquals(rst.get("double"), 123.2);
+ assertEquals(rst.get("bool_true"), 1);
+ assertEquals(rst.get("bool_false"), false);
+
+ assertEquals(rst.get("str"), "ut8字符串");
+ assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+ assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
+ assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), "123");
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+ assertArrayEquals((byte[])((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+
+ for (int i = 0; i < 10; i++) {
+ //System.out.println("###########" + i);
+ bytes2 = serializer.serialize(data);
+ rst = deserializer.deserialize(bytes2);
+
+ System.out.println(bytes.length + "," + bytes2.length);
+
+ assertEquals(rst.get("uint8"), 123);
+ assertEquals(rst.get("uint16"), "512");
+ assertEquals(rst.get("uint32"), 33554432);
+ assertEquals(rst.get("uint64"), 17179869184L);
+ assertEquals(rst.get("int8"), -123);
+ assertEquals(rst.get("int16"), "-512");
+ assertEquals(rst.get("int32"), -33554432);
+ assertEquals(rst.get("int64"), -17179869184L);
+
+ assertEquals(rst.get("double"), 123.2);
+ assertEquals(rst.get("bool_true"), 1);
+ assertEquals(rst.get("bool_false"), false);
+
+ assertEquals(rst.get("str"), "ut8字符串");
+ assertArrayEquals((byte[]) rst.get("binary"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+ assertArrayEquals(((List<Object>) rst.get("int32_array")).toArray(), new Object[]{123,512,null,33554432});
+ assertArrayEquals(((List<Object>) rst.get("str_array")).toArray(), new Object[]{"ut8字符串1",null,"ut8字符串2"});
+
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint8"), "123");
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("uint32"), 33554432 );
+ assertEquals(((Map<String, Object>)rst.get("obj")).get("double"), 123.2);
+ assertArrayEquals((byte[])((Map<String, Object>)rst.get("obj")).get("str"), "ut8字符串".getBytes(StandardCharsets.UTF_8));
+
+ }
+ }
+}
\ No newline at end of file
diff --git a/groot-formats/format-protobuf/pom.xml b/groot-formats/format-protobuf/pom.xml
index f14e1d1..9902ada 100644
--- a/groot-formats/format-protobuf/pom.xml
+++ b/groot-formats/format-protobuf/pom.xml
@@ -13,17 +13,11 @@
<name>Groot : Formats : Format-Protobuf </name>

<properties>
- <protobuf.version>3.23.4</protobuf.version>
+
</properties>

<dependencies>
- <!--<dependency>
- <groupId>com.google.protobuf</groupId>
- <artifactId>protobuf-java</artifactId>
- <version>${protobuf.version}</version>
- </dependency>-->
- <!--
- -->
+
<dependency>
<groupId>com.geedgenetworks</groupId>
<artifactId>protobuf-shaded</artifactId>
@@ -37,10 +31,45 @@
</exclusions>
</dependency>
<dependency>
+ <groupId>com.geedgenetworks</groupId>
+ <artifactId>groot-core</artifactId>
+ <version>${revision}</version>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>com.geedgenetworks</groupId>
+ <artifactId>format-json</artifactId>
+ <version>${revision}</version>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-clients_${scala.version}</artifactId>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-table-planner-blink_${scala.version}</artifactId>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka_${scala.version}</artifactId>
<version>${flink.version}</version>
<scope>test</scope>
</dependency>
+
+
+
</dependencies>
</project>
\ No newline at end of file
diff --git a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventDeserializationSchema.java b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventDeserializationSchema.java
index 0e477a1..c2e4437 100644
--- a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventDeserializationSchema.java
+++ b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventDeserializationSchema.java
@@ -1,14 +1,13 @@
package com.geedgenetworks.formats.protobuf;

-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.MapDeserialization;
-import com.geedgenetworks.core.types.StructType;
import com.geedgenetworks.shaded.com.google.protobuf.Descriptors.Descriptor;
+import com.geedgenetworks.api.connector.serialization.MapDeserialization;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
-
import java.io.IOException;
import java.util.Base64;
import java.util.Map;
diff --git a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventSerializationSchema.java b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventSerializationSchema.java
index 7f33dad..ccfe850 100644
--- a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventSerializationSchema.java
+++ b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufEventSerializationSchema.java
@@ -1,8 +1,8 @@
package com.geedgenetworks.formats.protobuf;

import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.common.Event;
import com.geedgenetworks.shaded.com.google.protobuf.Descriptors;
+import com.geedgenetworks.api.connector.event.Event;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
diff --git a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactory.java b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactory.java
index f68c8c3..9f008e9 100644
--- a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactory.java
+++ b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactory.java
@@ -1,13 +1,13 @@
package com.geedgenetworks.formats.protobuf;

-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.EncodingFormatFactory;
-import com.geedgenetworks.core.factories.TableFactory;
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.StructType;
import com.geedgenetworks.shaded.com.google.protobuf.Descriptors;
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.factory.DecodingFormatFactory;
+import com.geedgenetworks.api.factory.EncodingFormatFactory;
+import com.geedgenetworks.api.factory.ConnectorFactory;
+import com.geedgenetworks.api.connector.type.StructType;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.configuration.ConfigOption;
@@ -22,12 +22,12 @@ public class ProtobufFormatFactory implements DecodingFormatFactory, EncodingFor
public static final String IDENTIFIER = "protobuf";

@Override
- public String factoryIdentifier() {
+ public String type() {
return IDENTIFIER;
}

@Override
- public DecodingFormat createDecodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
+ public DecodingFormat createDecodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) {
final String messageName = formatOptions.get(MESSAGE_NAME);
final String descFilePath = formatOptions.get(DESC_FILE_PATH);
final boolean ignoreParseErrors = formatOptions.get(IGNORE_PARSE_ERRORS);
@@ -52,7 +52,7 @@
}

@Override
- public EncodingFormat createEncodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
+ public EncodingFormat createEncodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) {
final String messageName = formatOptions.get(MESSAGE_NAME);
final String descFilePath = formatOptions.get(DESC_FILE_PATH);
final byte[] fileContent = ProtobufUtils.readDescriptorFileContent(descFilePath);
diff --git a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/SchemaConverters.java b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/SchemaConverters.java
index 196b0c9..44a8140 100644
--- a/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/SchemaConverters.java
+++ b/groot-formats/format-protobuf/src/main/java/com/geedgenetworks/formats/protobuf/SchemaConverters.java
@@ -1,807 +1,807 @@
-package com.geedgenetworks.formats.protobuf;
-
-import com.geedgenetworks.shaded.com.google.protobuf.Descriptors;
-import com.geedgenetworks.core.types.*;
-import com.geedgenetworks.core.types.StructType.StructField;
-import com.geedgenetworks.shaded.com.google.protobuf.CodedInputStream;
-import com.geedgenetworks.shaded.com.google.protobuf.Descriptors.Descriptor;
-import com.geedgenetworks.shaded.com.google.protobuf.Descriptors.FieldDescriptor;
-import com.geedgenetworks.shaded.com.google.protobuf.WireFormat;
-import org.apache.commons.lang3.StringUtils;
-import org.apache.flink.util.Preconditions;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.nio.charset.StandardCharsets;
-import java.util.*;
-import java.util.stream.Collectors;
-
-public class SchemaConverters {
- static final Logger LOG = LoggerFactory.getLogger(SchemaConverters.class);
- public static StructType toStructType(Descriptor descriptor) {
- StructField[] fields = descriptor.getFields().stream().map(f -> structFieldFor(f)).toArray(StructField[]::new);
- return new StructType(fields);
- }
-
- private static StructField structFieldFor(FieldDescriptor fd) {
- WireFormat.FieldType type = fd.getLiteType();
- DataType dataType;
- switch (type) {
- case DOUBLE:
- dataType = Types.DOUBLE;
- break;
- case FLOAT:
- dataType = Types.FLOAT;
- break;
- case INT64:
- case UINT64:
- case FIXED64:
- case SINT64:
- case SFIXED64:
- dataType = Types.BIGINT;
- break;
- case INT32:
- case UINT32:
- case FIXED32:
- case SINT32:
- case SFIXED32:
- dataType = Types.INT;
- break;
- case BOOL:
- dataType = Types.BOOLEAN;
- break;
- case STRING:
- dataType = Types.STRING;
- break;
- case BYTES:
- dataType = Types.BINARY;
- break;
- case ENUM:
- dataType = Types.INT;
- break;
- case MESSAGE:
- if (fd.isRepeated() && fd.getMessageType().getOptions().hasMapEntry()) {
- throw new IllegalArgumentException(String.format("not supported type:%s(%s)", type, fd.getName()));
- } else {
- StructField[] fields = fd.getMessageType().getFields().stream().map(f -> structFieldFor(f)).toArray(StructField[]::new);
- dataType = new StructType(fields);
- }
- break;
- default:
- throw new IllegalArgumentException(String.format("not supported type:%s(%s)", type, fd.getName()));
- }
- if (fd.isRepeated()) {
- return new StructField(fd.getName(), new ArrayType(dataType));
- } else {
- return new StructField(fd.getName(), dataType);
- }
- }
-
- // Verify that dataType matches the descriptor: every field defined in dataType must exist in the descriptor, and each field's type must match (identical, or convertible)
- public static void checkMatch(Descriptor descriptor, StructType dataType) throws Exception {
- checkMatch(descriptor, dataType, null);
- }
-
- private static void checkMatch(Descriptor descriptor, StructType dataType, String prefix) throws Exception {
- List<FieldDescriptor> fieldDescriptors = descriptor.getFields();
- Map<String, FieldDescriptor> fdMap = fieldDescriptors.stream().collect(Collectors.toMap(x -> x.getName(), x -> x));
- StructField[] fields = dataType.fields;
-
- for (int i = 0; i < fields.length; i++) {
- StructField field = fields[i];
- FieldDescriptor fd = fdMap.get(field.name);
- if(fd == null){
- throw new IllegalArgumentException(String.format("%s ' field:%s not found in proto descriptor", StringUtils.isBlank(prefix)? "root": prefix, field));
- }
- WireFormat.FieldType type = fd.getLiteType();
- DataType fieldDataType;
- if(fd.isRepeated()){
- if(!(field.dataType instanceof ArrayType)){
- throw newNotMatchException(field, fd, prefix);
- }
- fieldDataType = ((ArrayType)field.dataType).elementType;
- }else{
- fieldDataType = field.dataType;
- }
- switch (type) {
- case DOUBLE:
- case FLOAT:
- if(!(fieldDataType instanceof DoubleType || fieldDataType instanceof FloatType
- || fieldDataType instanceof IntegerType || fieldDataType instanceof LongType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case INT64:
- case UINT64:
- case FIXED64:
- case SINT64:
- case SFIXED64:
- if(!(fieldDataType instanceof IntegerType || fieldDataType instanceof LongType
- || fieldDataType instanceof FloatType || fieldDataType instanceof DoubleType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case INT32:
- case UINT32:
- case FIXED32:
- case SINT32:
- case SFIXED32:
- if(!(fieldDataType instanceof IntegerType || fieldDataType instanceof LongType
- || fieldDataType instanceof FloatType || fieldDataType instanceof DoubleType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case BOOL:
- if(!(fieldDataType instanceof BooleanType || fieldDataType instanceof IntegerType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case STRING:
- if(!(fieldDataType instanceof StringType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case BYTES:
- if(!(fieldDataType instanceof BinaryType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case ENUM:
- if(!(fieldDataType instanceof IntegerType)){
- throw newNotMatchException(field, fd, prefix);
- }
- break;
- case MESSAGE:
- if(!(fieldDataType instanceof StructType)){
- throw newNotMatchException(field, fd, prefix);
- }
- checkMatch(fd.getMessageType(), (StructType) fieldDataType, StringUtils.isBlank(prefix)? field.name: prefix + "." + field.name);
- }
- }
- }
-
- private static IllegalArgumentException newNotMatchException(StructField field, FieldDescriptor fd, String prefix){
- return new IllegalArgumentException(String.format("%s ' field:%s not match with proto field descriptor:%s(%s)", StringUtils.isBlank(prefix)? "root": prefix, field, fd, fd.getType()));
- }
-
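- // Decodes one protobuf message directly from the wire into a Map<String, Object>,
- // dispatching on each tag's field number through fieldDescArray. A hypothetical
- // usage sketch (descriptor/schema variable names assumed, not part of this file):
- //   MessageConverter converter = new MessageConverter(descriptor, structType, false);
- //   Map<String, Object> event = converter.converter(payloadBytes);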
- public static class MessageConverter {
- private static final int MAX_CHARS_LENGTH = 1024 * 4;
- FieldDesc[] fieldDescArray; // FieldDesc per message field, indexed by field number
- int initialCapacity = 0;
- final boolean emitDefaultValues;
- final DefaultValue[] defaultValues;
- private final char[] tmpDecodeChars = new char[MAX_CHARS_LENGTH]; // conversion of a single message runs on one thread, so all FieldDescs share one temporary char buffer
-
- public MessageConverter(Descriptor descriptor, StructType dataType, boolean emitDefaultValues) {
- ProtobufUtils.checkSupportParseDescriptor(descriptor);
- List<FieldDescriptor> fields = descriptor.getFields();
- int maxNumber = fields.stream().mapToInt(f -> f.getNumber()).max().getAsInt();
- Preconditions.checkArgument(maxNumber < 10000, maxNumber);
- fieldDescArray = new FieldDesc[maxNumber + 1];
-
- this.emitDefaultValues = emitDefaultValues;
- if(this.emitDefaultValues){
- defaultValues = new DefaultValue[dataType.fields.length];
- }else{
- defaultValues = null;
- }
-
- for (FieldDescriptor field : fields) {
- // Optional<StructField> structFieldOptional = Arrays.stream(dataType.fields).filter(f -> f.name.equals(field.getName())).findFirst();
- // if(structFieldOptional.isPresent()){
- int position = -1;
- for (int i = 0; i < dataType.fields.length; i++) {
- if(dataType.fields[i].name.equals(field.getName())){
- position = i;
- break;
- }
- }
- if(position >= 0){
- fieldDescArray[field.getNumber()] = new FieldDesc(field, dataType.fields[position].dataType, position, emitDefaultValues, tmpDecodeChars);
- if(this.emitDefaultValues){
- defaultValues[position] = new DefaultValue(dataType.fields[position].name, getDefaultValue(field, dataType.fields[position].dataType));
- }
- }
- }
- if(dataType.fields.length / 3 > 16){
- initialCapacity = (dataType.fields.length / 3) ;
- }
- if(this.emitDefaultValues){
- LOG.warn("enable emitDefaultValues will seriously affect performance !!!");
- for (int i = 0; i < defaultValues.length; i++) {
- if (defaultValues[i] == null) {
- throw new IllegalArgumentException(String.format("%s and %s not match", dataType, descriptor));
- }
- }
- }
- }
-
- public Map<String, Object> converter(byte[] bytes) throws Exception {
- CodedInputStream input = CodedInputStream.newInstance(bytes);
- return emitDefaultValues ? converterEmitDefaultValues(input): converterNoEmitDefaultValues(input);
- }
-
- public Map<String, Object> converter(CodedInputStream input) throws Exception {
- return emitDefaultValues ? converterEmitDefaultValues(input): converterNoEmitDefaultValues(input);
- }
-
- private Map<String, Object> converterNoEmitDefaultValues(CodedInputStream input) throws Exception {
- Map<String, Object> data = initialCapacity == 0? new HashMap<>(): new HashMap<>(initialCapacity);
-
- while (true) {
- int tag = input.readTag();
- if (tag == 0) {
- break;
- }
-
- final int wireType = WireFormat.getTagWireType(tag);
- final int fieldNumber = WireFormat.getTagFieldNumber(tag);
-
- FieldDesc fieldDesc = null;
- if (fieldNumber < fieldDescArray.length) {
- fieldDesc = fieldDescArray[fieldNumber];
- }
-
- boolean unknown = false;
- boolean packed = false;
- if (fieldDesc == null) {
- unknown = true; // Unknown field.
- } else if (wireType == fieldDesc.field.getLiteType().getWireType()) {
- packed = false;
- } else if (fieldDesc.field.isPackable() && wireType == WireFormat.WIRETYPE_LENGTH_DELIMITED) {
- packed = true;
- } else {
- unknown = true; // Unknown wire type.
- }
-
- if (unknown) { // Unknown field or wrong wire type. Skip.
- input.skipField(tag);
- continue;
- }
-
- String name = fieldDesc.name;
- if (packed) {
- final int length = input.readRawVarint32();
- final int limit = input.pushLimit(length);
- List<Object> array = (List<Object>) fieldDesc.valueConverter.convert(input, true);
- input.popLimit(limit);
- List<Object> oldArray = (List<Object>)data.get(name);
- if(oldArray == null){
- data.put(name, array);
- }else{
- oldArray.addAll(array);
- }
- } else {
- final Object value = fieldDesc.valueConverter.convert(input, false);
- if(!fieldDesc.field.isRepeated()){
- data.put(name, value);
- }else{
- List<Object> array = (List<Object>)data.get(name);
- if(array == null){
- array = new ArrayList<>();
- data.put(name, array);
- }
- array.add(value);
- }
- }
-
- }
-
- return data;
- }
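- // Same decode loop as converterNoEmitDefaultValues, plus bookkeeping that records
- // which fields appeared so proto3 defaults can be filled in for the omitted ones.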
- private Map<String, Object> converterEmitDefaultValues(CodedInputStream input) throws Exception {
- Map<String, Object> data = initialCapacity == 0? new HashMap<>(): new HashMap<>(initialCapacity);
-
- // extra step compared to converterNoEmitDefaultValues
- for (int i = 0; i < defaultValues.length; i++) {
- defaultValues[i].hasValue = false;
- }
-
- while (true) {
- int tag = input.readTag();
- if (tag == 0) {
- break;
- }
-
- final int wireType = WireFormat.getTagWireType(tag);
- final int fieldNumber = WireFormat.getTagFieldNumber(tag);
-
- FieldDesc fieldDesc = null;
- if (fieldNumber < fieldDescArray.length) {
- fieldDesc = fieldDescArray[fieldNumber];
- }
-
- boolean unknown = false;
- boolean packed = false;
- if (fieldDesc == null) {
- unknown = true; // Unknown field.
- } else if (wireType == fieldDesc.field.getLiteType().getWireType()) {
- packed = false;
- } else if (fieldDesc.field.isPackable() && wireType == WireFormat.WIRETYPE_LENGTH_DELIMITED) {
- packed = true;
- } else {
- unknown = true; // Unknown wire type.
- }
-
- if (unknown) { // Unknown field or wrong wire type. Skip.
- input.skipField(tag);
- continue;
- }
-
- // extra step compared to converterNoEmitDefaultValues
- defaultValues[fieldDesc.fieldPosition].hasValue = true;
-
- String name = fieldDesc.name;
- if (packed) {
- final int length = input.readRawVarint32();
- final int limit = input.pushLimit(length);
- List<Object> array = (List<Object>) fieldDesc.valueConverter.convert(input, true);
- input.popLimit(limit);
- List<Object> oldArray = (List<Object>)data.get(name);
- if(oldArray == null){
- data.put(name, array);
- }else{
- oldArray.addAll(array);
- }
- } else {
- final Object value = fieldDesc.valueConverter.convert(input, false);
- if(!fieldDesc.field.isRepeated()){
- data.put(name, value);
- }else{
- List<Object> array = (List<Object>)data.get(name);
- if(array == null){
- array = new ArrayList<>();
- data.put(name, array);
- }
- array.add(value);
- }
- }
-
- }
-
- // extra step compared to converterNoEmitDefaultValues
- DefaultValue defaultValue;
- for (int i = 0; i < defaultValues.length; i++) {
- defaultValue = defaultValues[i];
- if(!defaultValue.hasValue && defaultValue.defaultValue != null){
- data.put(defaultValue.name, defaultValue.defaultValue);
- }
- }
-
- return data;
- }
-
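- // Default emitted when a field is absent from the wire: null for message, repeated
- // and explicitly optional fields (they stay unset), otherwise the proto3 zero value
- // coerced to the declared DataType.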
- private Object getDefaultValue(FieldDescriptor field, DataType fieldDataType){
- if(field.getJavaType() == Descriptors.FieldDescriptor.JavaType.MESSAGE){
- return null;
- }
- if(field.isRepeated()){
- return null;
- }
- if(field.hasOptionalKeyword()){
- return null;
- }
-
- switch (field.getType()) {
- case DOUBLE:
- case FLOAT:
- case INT64:
- case UINT64:
- case FIXED64:
- case SFIXED64:
- case SINT64:
- case INT32:
- case UINT32:
- case FIXED32:
- case SFIXED32:
- case SINT32:
- Number number = 0L;
- if (fieldDataType instanceof DoubleType) {
- return number.doubleValue();
- } else if (fieldDataType instanceof FloatType) {
- return number.floatValue();
- } else if (fieldDataType instanceof IntegerType) {
- return number.intValue();
- } else if (fieldDataType instanceof LongType) {
- return number.longValue();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case BOOL:
- if (fieldDataType instanceof BooleanType) {
- return false;
- } else if (fieldDataType instanceof IntegerType) {
- return 0;
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case BYTES:
- if (fieldDataType instanceof BinaryType) {
- return new byte[0];
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case STRING:
- if (fieldDataType instanceof StringType) {
- return "";
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case ENUM:
- if (fieldDataType instanceof IntegerType) {
- return ((Descriptors.EnumValueDescriptor) field.getDefaultValue()).getNumber();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- default:
- throw new IllegalArgumentException(String.format("not supported proto type:%s(%s)", field.getType(), field.getName()));
- }
- }
- }
-
- public static class DefaultValue{
- boolean hasValue;
- final String name;
-
- final Object defaultValue;
-
- public DefaultValue(String name, Object defaultValue) {
- this.name = name;
- this.defaultValue = defaultValue;
- }
- }
-
- public static class FieldDesc {
- final FieldDescriptor field;
- final String name;
- final DataType fieldDataType; // DataType of the field; for arrays, the element type
- final int fieldPosition; // position of the field in the StructType
-
- final ValueConverter valueConverter;
- private final char[] tmpDecodeChars;
-
- public FieldDesc(FieldDescriptor field, DataType dataType, int fieldPosition, boolean emitDefaultValues, char[] tmpDecodeChars) {
- this.field = field;
- this.name = field.getName();
- if (dataType instanceof ArrayType) {
- this.fieldDataType = ((ArrayType) dataType).elementType;
- } else {
- this.fieldDataType = dataType;
- }
- this.fieldPosition = fieldPosition;
- this.tmpDecodeChars = tmpDecodeChars;
- valueConverter = makeConverter(emitDefaultValues);
- }
-
- private ValueConverter makeConverter(boolean emitDefaultValues) {
- switch (field.getType()) {
- case ENUM:
- if(!(fieldDataType instanceof IntegerType)){
- throw newCanNotConvertException(field, fieldDataType);
- }
- return (input, packed) -> {
- if (packed) {
- List<Object> array = new ArrayList<>();
- while (input.getBytesUntilLimit() > 0) {
- array.add(input.readEnum());
- }
- return array;
- } else {
- return input.readEnum();
- }
- };
- case MESSAGE:
- final Descriptor descriptor = field.getMessageType();
- final MessageConverter messageConverter = new MessageConverter(descriptor, (StructType) fieldDataType, emitDefaultValues);
- return (input, packed) -> {
- final int length = input.readRawVarint32();
- final int oldLimit = input.pushLimit(length);
- Object message = messageConverter.converter(input);
- input.checkLastTagWas(0);
- if (input.getBytesUntilLimit() != 0) {
- throw new RuntimeException("parse");
- }
- input.popLimit(oldLimit);
- return message;
- };
- default:
- ValueConverter fieldConverter = makePrimitiveFieldConverter();
- return (input, packed) -> {
- if (packed) {
- List<Object> array = new ArrayList<>();
- while (input.getBytesUntilLimit() > 0) {
- array.add(fieldConverter.convert(input, false));
- }
- return array;
- } else {
- return fieldConverter.convert(input, false);
- }
- };
- }
- }
-
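- // Builds a reader lambda that both reads the wire value and coerces it to the
- // declared DataType (e.g. an INT64 field can be read into an int or a double).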
- private ValueConverter makePrimitiveFieldConverter() {
- switch (field.getType()) {
- case DOUBLE:
- if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> input.readDouble();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readDouble();
- } else if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readDouble();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readDouble();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case FLOAT:
- if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readFloat();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> input.readFloat();
- } else if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readFloat();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readFloat();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case INT64:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readInt64();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> input.readInt64();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readInt64();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readInt64();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case UINT64:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readUInt64();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> input.readUInt64();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readUInt64();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readUInt64();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case FIXED64:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readFixed64();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> input.readFixed64();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readFixed64();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readFixed64();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case SFIXED64:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readSFixed64();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> input.readSFixed64();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readSFixed64();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readSFixed64();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case SINT64:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> (int) input.readSInt64();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> input.readSInt64();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readSInt64();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readSInt64();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case INT32:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> input.readInt32();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readInt32();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readInt32();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readInt32();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case UINT32:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> input.readUInt32();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readUInt32();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readUInt32();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readUInt32();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case FIXED32:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> input.readFixed32();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readFixed32();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readFixed32();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readFixed32();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case SFIXED32:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> input.readSFixed32();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readSFixed32();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readSFixed32();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readSFixed32();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case SINT32:
- if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> input.readSInt32();
- } else if (fieldDataType instanceof LongType) {
- return (input, packed) -> (long) input.readSInt32();
- } else if (fieldDataType instanceof FloatType) {
- return (input, packed) -> (float) input.readSInt32();
- } else if (fieldDataType instanceof DoubleType) {
- return (input, packed) -> (double) input.readSInt32();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case BOOL:
- if (fieldDataType instanceof BooleanType) {
- return (input, packed) -> input.readBool();
- } else if (fieldDataType instanceof IntegerType) {
- return (input, packed) -> input.readBool() ? 1 : 0;
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case BYTES:
- if (fieldDataType instanceof BinaryType) {
- return (input, packed) -> input.readByteArray();
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- case STRING:
- if (fieldDataType instanceof StringType) {
- return (input, packed) -> {
- //return input.readString();
- byte[] bytes = input.readByteArray();
- return decodeUTF8(bytes, 0, bytes.length);
- };
- } else {
- throw newCanNotConvertException(field, fieldDataType);
- }
- default:
- throw new IllegalArgumentException(String.format("not supported proto type:%s(%s)", field.getType(), field.getName()));
- }
- }
-
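- // Decodes UTF-8 through a strict fast path into the reusable char buffer, falling
- // back to new String(bytes, UTF_8) on malformed input.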
- private String decodeUTF8(byte[] input, int offset, int byteLen) {
- char[] chars = MessageConverter.MAX_CHARS_LENGTH < byteLen? new char[byteLen]: tmpDecodeChars;
- int len = decodeUTF8Strict(input, offset, byteLen, chars);
- if (len < 0) {
- return defaultDecodeUTF8(input, offset, byteLen);
- }
- return new String(chars, 0, len);
- }
-
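- // Strict incremental decoder: returns the number of chars written, or -1 at the
- // first malformed or truncated sequence so the caller can fall back.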
- private static int decodeUTF8Strict(byte[] sa, int sp, int len, char[] da) {
- final int sl = sp + len;
- int dp = 0;
- int dlASCII = Math.min(len, da.length);
-
- // ASCII only optimized loop
- while (dp < dlASCII && sa[sp] >= 0) {
- da[dp++] = (char) sa[sp++];
- }
-
- while (sp < sl) {
- int b1 = sa[sp++];
- if (b1 >= 0) {
- // 1 byte, 7 bits: 0xxxxxxx
- da[dp++] = (char) b1;
- } else if ((b1 >> 5) == -2 && (b1 & 0x1e) != 0) {
- // 2 bytes, 11 bits: 110xxxxx 10xxxxxx
- if (sp < sl) {
- int b2 = sa[sp++];
- if ((b2 & 0xc0) != 0x80) { // isNotContinuation(b2)
- return -1;
- } else {
- da[dp++] = (char) (((b1 << 6) ^ b2) ^ (((byte) 0xC0 << 6) ^ ((byte) 0x80)));
- }
- continue;
- }
- return -1;
- } else if ((b1 >> 4) == -2) {
- // 3 bytes, 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
- if (sp + 1 < sl) {
- int b2 = sa[sp++];
- int b3 = sa[sp++];
- if ((b1 == (byte) 0xe0 && (b2 & 0xe0) == 0x80)
- || (b2 & 0xc0) != 0x80
- || (b3 & 0xc0) != 0x80) { // isMalformed3(b1, b2, b3)
- return -1;
- } else {
- char c =
- (char)
- ((b1 << 12)
- ^ (b2 << 6)
- ^ (b3
- ^ (((byte) 0xE0 << 12)
- ^ ((byte) 0x80 << 6)
- ^ ((byte) 0x80))));
- if (Character.isSurrogate(c)) {
- return -1;
- } else {
- da[dp++] = c;
- }
- }
- continue;
- }
- return -1;
- } else if ((b1 >> 3) == -2) {
- // 4 bytes, 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
- if (sp + 2 < sl) {
- int b2 = sa[sp++];
- int b3 = sa[sp++];
- int b4 = sa[sp++];
- int uc =
- ((b1 << 18)
- ^ (b2 << 12)
- ^ (b3 << 6)
- ^ (b4
- ^ (((byte) 0xF0 << 18)
- ^ ((byte) 0x80 << 12)
- ^ ((byte) 0x80 << 6)
- ^ ((byte) 0x80))));
- // isMalformed4 and shortest form check
- if (((b2 & 0xc0) != 0x80 || (b3 & 0xc0) != 0x80 || (b4 & 0xc0) != 0x80)
- || !Character.isSupplementaryCodePoint(uc)) {
- return -1;
- } else {
- da[dp++] = Character.highSurrogate(uc);
- da[dp++] = Character.lowSurrogate(uc);
- }
- continue;
- }
- return -1;
- } else {
- return -1;
- }
- }
- return dp;
- }
-
- private static String defaultDecodeUTF8(byte[] bytes, int offset, int len) {
- return new String(bytes, offset, len, StandardCharsets.UTF_8);
- }
- }
-
- private static IllegalArgumentException newCanNotConvertException(FieldDescriptor field, DataType fieldDataType){
- return new IllegalArgumentException(String.format("proto field:%s(%s) can not convert to type:%s", field.getName(), field.getType(), fieldDataType.simpleString()));
- }
-
- @FunctionalInterface
- public interface ValueConverter {
- Object convert(CodedInputStream input, boolean packed) throws Exception;
- }
-}
+package com.geedgenetworks.formats.protobuf;
+
+import com.geedgenetworks.shaded.com.google.protobuf.Descriptors;
+import com.geedgenetworks.shaded.com.google.protobuf.CodedInputStream;
+import com.geedgenetworks.shaded.com.google.protobuf.Descriptors.Descriptor;
+import com.geedgenetworks.shaded.com.google.protobuf.Descriptors.FieldDescriptor;
+import com.geedgenetworks.shaded.com.google.protobuf.WireFormat;
+import com.geedgenetworks.api.connector.type.*;
+import com.geedgenetworks.api.connector.type.StructType.StructField;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.flink.util.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.nio.charset.StandardCharsets;
+import java.util.*;
+import java.util.stream.Collectors;
+
+public class SchemaConverters {
+ static final Logger LOG = LoggerFactory.getLogger(SchemaConverters.class);
+ public static StructType toStructType(Descriptor descriptor) {
+ StructField[] fields = descriptor.getFields().stream().map(f -> structFieldFor(f)).toArray(StructField[]::new);
+ return new StructType(fields);
+ }
+
+ private static StructType.StructField structFieldFor(FieldDescriptor fd) {
+ WireFormat.FieldType type = fd.getLiteType();
+ DataType dataType;
+ switch (type) {
+ case DOUBLE:
+ dataType = Types.DOUBLE;
+ break;
+ case FLOAT:
+ dataType = Types.FLOAT;
+ break;
+ case INT64:
+ case UINT64:
+ case FIXED64:
+ case SINT64:
+ case SFIXED64:
+ dataType = Types.BIGINT;
+ break;
+ case INT32:
+ case UINT32:
+ case FIXED32:
+ case SINT32:
+ case SFIXED32:
+ dataType = Types.INT;
+ break;
+ case BOOL:
+ dataType = Types.BOOLEAN;
+ break;
+ case STRING:
+ dataType = Types.STRING;
+ break;
+ case BYTES:
+ dataType = Types.BINARY;
+ break;
+ case ENUM:
+ dataType = Types.INT;
+ break;
+ case MESSAGE:
+ if (fd.isRepeated() && fd.getMessageType().getOptions().hasMapEntry()) {
+ throw new IllegalArgumentException(String.format("not supported type:%s(%s)", type, fd.getName()));
+ } else {
+ StructField[] fields = fd.getMessageType().getFields().stream().map(f -> structFieldFor(f)).toArray(StructField[]::new);
+ dataType = new StructType(fields);
+ }
+ break;
+ default:
+ throw new IllegalArgumentException(String.format("not supported type:%s(%s)", type, fd.getName()));
+ }
+ if (fd.isRepeated()) {
+ return new StructField(fd.getName(), new ArrayType(dataType));
+ } else {
+ return new StructField(fd.getName(), dataType);
+ }
+ }
+
+ // Verifies that dataType matches the descriptor: every field defined in dataType must exist in the descriptor, and each field's type must match (identical or convertible).
+ public static void checkMatch(Descriptor descriptor, StructType dataType) throws Exception {
+ checkMatch(descriptor, dataType, null);
+ }
+
+ private static void checkMatch(Descriptor descriptor, StructType dataType, String prefix) throws Exception {
+ List<FieldDescriptor> fieldDescriptors = descriptor.getFields();
+ Map<String, FieldDescriptor> fdMap = fieldDescriptors.stream().collect(Collectors.toMap(x -> x.getName(), x -> x));
+ StructField[] fields = dataType.fields;
+
+ for (int i = 0; i < fields.length; i++) {
+ StructField field = fields[i];
+ FieldDescriptor fd = fdMap.get(field.name);
+ if(fd == null){
+ throw new IllegalArgumentException(String.format("%s ' field:%s not found in proto descriptor", StringUtils.isBlank(prefix)? "root": prefix, field));
+ }
+ WireFormat.FieldType type = fd.getLiteType();
+ DataType fieldDataType;
+ if(fd.isRepeated()){
+ if(!(field.dataType instanceof ArrayType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ fieldDataType = ((ArrayType)field.dataType).elementType;
+ }else{
+ fieldDataType = field.dataType;
+ }
+ switch (type) {
+ case DOUBLE:
+ case FLOAT:
+ if(!(fieldDataType instanceof DoubleType || fieldDataType instanceof FloatType
+ || fieldDataType instanceof IntegerType || fieldDataType instanceof LongType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case INT64:
+ case UINT64:
+ case FIXED64:
+ case SINT64:
+ case SFIXED64:
+ if(!(fieldDataType instanceof IntegerType || fieldDataType instanceof LongType
+ || fieldDataType instanceof FloatType || fieldDataType instanceof DoubleType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case INT32:
+ case UINT32:
+ case FIXED32:
+ case SINT32:
+ case SFIXED32:
+ if(!(fieldDataType instanceof IntegerType || fieldDataType instanceof LongType
+ || fieldDataType instanceof FloatType || fieldDataType instanceof DoubleType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case BOOL:
+ if(!(fieldDataType instanceof BooleanType || fieldDataType instanceof IntegerType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case STRING:
+ if(!(fieldDataType instanceof StringType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case BYTES:
+ if(!(fieldDataType instanceof BinaryType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case ENUM:
+ if(!(fieldDataType instanceof IntegerType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ break;
+ case MESSAGE:
+ if(!(fieldDataType instanceof StructType)){
+ throw newNotMatchException(field, fd, prefix);
+ }
+ checkMatch(fd.getMessageType(), (StructType) fieldDataType, StringUtils.isBlank(prefix)? field.name: prefix + "." + field.name);
+ }
+ }
+ }
+
+ private static IllegalArgumentException newNotMatchException(StructField field, FieldDescriptor fd, String prefix){
+ return new IllegalArgumentException(String.format("%s ' field:%s not match with proto field descriptor:%s(%s)", StringUtils.isBlank(prefix)? "root": prefix, field, fd, fd.getType()));
+ }
+
+ public static class MessageConverter {
+ private static final int MAX_CHARS_LENGTH = 1024 * 4;
+ FieldDesc[] fieldDescArray; // FieldDesc per message field, indexed by field number
+ int initialCapacity = 0;
+ final boolean emitDefaultValues;
+ final DefaultValue[] defaultValues;
+ private final char[] tmpDecodeChars = new char[MAX_CHARS_LENGTH]; // conversion of a single message runs on one thread, so all FieldDescs share one temporary char buffer
+
+ public MessageConverter(Descriptor descriptor, StructType dataType, boolean emitDefaultValues) {
+ ProtobufUtils.checkSupportParseDescriptor(descriptor);
+ List<FieldDescriptor> fields = descriptor.getFields();
+ int maxNumber = fields.stream().mapToInt(f -> f.getNumber()).max().getAsInt();
+ Preconditions.checkArgument(maxNumber < 10000, maxNumber);
+ fieldDescArray = new FieldDesc[maxNumber + 1];
+
+ this.emitDefaultValues = emitDefaultValues;
+ if(this.emitDefaultValues){
+ defaultValues = new DefaultValue[dataType.fields.length];
+ }else{
+ defaultValues = null;
+ }
+
+ for (FieldDescriptor field : fields) {
+ // Optional<StructField> structFieldOptional = Arrays.stream(dataType.fields).filter(f -> f.name.equals(field.getName())).findFirst();
+ // if(structFieldOptional.isPresent()){
+ int position = -1;
+ for (int i = 0; i < dataType.fields.length; i++) {
+ if(dataType.fields[i].name.equals(field.getName())){
+ position = i;
+ break;
+ }
+ }
+ if(position >= 0){
+ fieldDescArray[field.getNumber()] = new FieldDesc(field, dataType.fields[position].dataType, position, emitDefaultValues, tmpDecodeChars);
+ if(this.emitDefaultValues){
+ defaultValues[position] = new DefaultValue(dataType.fields[position].name, getDefaultValue(field, dataType.fields[position].dataType));
+ }
+ }
+ }
+ if(dataType.fields.length / 3 > 16){
+ initialCapacity = (dataType.fields.length / 3) ;
+ }
+ if(this.emitDefaultValues){
+ LOG.warn("enable emitDefaultValues will seriously affect performance !!!");
+ for (int i = 0; i < defaultValues.length; i++) {
+ if (defaultValues[i] == null) {
+ throw new IllegalArgumentException(String.format("%s and %s not match", dataType, descriptor));
+ }
+ }
+ }
+ }
+
+ public Map<String, Object> converter(byte[] bytes) throws Exception {
+ CodedInputStream input = CodedInputStream.newInstance(bytes);
+ return emitDefaultValues ? converterEmitDefaultValues(input): converterNoEmitDefaultValues(input);
+ }
+
+ public Map<String, Object> converter(CodedInputStream input) throws Exception {
+ return emitDefaultValues ? converterEmitDefaultValues(input): converterNoEmitDefaultValues(input);
+ }
+
+ private Map<String, Object> converterNoEmitDefaultValues(CodedInputStream input) throws Exception {
+ Map<String, Object> data = initialCapacity == 0? new HashMap<>(): new HashMap<>(initialCapacity);
+
+ while (true) {
+ int tag = input.readTag();
+ if (tag == 0) {
+ break;
+ }
+
+ final int wireType = WireFormat.getTagWireType(tag);
+ final int fieldNumber = WireFormat.getTagFieldNumber(tag);
+
+ FieldDesc fieldDesc = null;
+ if (fieldNumber < fieldDescArray.length) {
+ fieldDesc = fieldDescArray[fieldNumber];
+ }
+
+ boolean unknown = false;
+ boolean packed = false;
+ if (fieldDesc == null) {
+ unknown = true; // Unknown field.
+ } else if (wireType == fieldDesc.field.getLiteType().getWireType()) {
+ packed = false;
+ } else if (fieldDesc.field.isPackable() && wireType == WireFormat.WIRETYPE_LENGTH_DELIMITED) {
+ packed = true;
+ } else {
+ unknown = true; // Unknown wire type.
+ }
+
+ if (unknown) { // Unknown field or wrong wire type. Skip.
+ input.skipField(tag);
+ continue;
+ }
+
+ String name = fieldDesc.name;
+ if (packed) {
+ final int length = input.readRawVarint32();
+ final int limit = input.pushLimit(length);
+ List<Object> array = (List<Object>) fieldDesc.valueConverter.convert(input, true);
+ input.popLimit(limit);
+ List<Object> oldArray = (List<Object>)data.get(name);
+ if(oldArray == null){
+ data.put(name, array);
+ }else{
+ oldArray.addAll(array);
+ }
+ } else {
+ final Object value = fieldDesc.valueConverter.convert(input, false);
+ if(!fieldDesc.field.isRepeated()){
+ data.put(name, value);
+ }else{
+ List<Object> array = (List<Object>)data.get(name);
+ if(array == null){
+ array = new ArrayList<>();
+ data.put(name, array);
+ }
+ array.add(value);
+ }
+ }
+
+ }
+
+ return data;
+ }
+ private Map<String, Object> converterEmitDefaultValues(CodedInputStream input) throws Exception {
+ Map<String, Object> data = initialCapacity == 0? new HashMap<>(): new HashMap<>(initialCapacity);
+
+ // extra step compared to converterNoEmitDefaultValues
+ for (int i = 0; i < defaultValues.length; i++) {
+ defaultValues[i].hasValue = false;
+ }
+
+ while (true) {
+ int tag = input.readTag();
+ if (tag == 0) {
+ break;
+ }
+
+ final int wireType = WireFormat.getTagWireType(tag);
+ final int fieldNumber = WireFormat.getTagFieldNumber(tag);
+
+ FieldDesc fieldDesc = null;
+ if (fieldNumber < fieldDescArray.length) {
+ fieldDesc = fieldDescArray[fieldNumber];
+ }
+
+ boolean unknown = false;
+ boolean packed = false;
+ if (fieldDesc == null) {
+ unknown = true; // Unknown field.
+ } else if (wireType == fieldDesc.field.getLiteType().getWireType()) {
+ packed = false;
+ } else if (fieldDesc.field.isPackable() && wireType == WireFormat.WIRETYPE_LENGTH_DELIMITED) {
+ packed = true;
+ } else {
+ unknown = true; // Unknown wire type.
+ }
+
+ if (unknown) { // Unknown field or wrong wire type. Skip.
+ input.skipField(tag);
+ continue;
+ }
+
+ // extra step compared to converterNoEmitDefaultValues
+ defaultValues[fieldDesc.fieldPosition].hasValue = true;
+
+ String name = fieldDesc.name;
+ if (packed) {
+ final int length = input.readRawVarint32();
+ final int limit = input.pushLimit(length);
+ List<Object> array = (List<Object>) fieldDesc.valueConverter.convert(input, true);
+ input.popLimit(limit);
+ List<Object> oldArray = (List<Object>)data.get(name);
+ if(oldArray == null){
+ data.put(name, array);
+ }else{
+ oldArray.addAll(array);
+ }
+ } else {
+ final Object value = fieldDesc.valueConverter.convert(input, false);
+ if(!fieldDesc.field.isRepeated()){
+ data.put(name, value);
+ }else{
+ List<Object> array = (List<Object>)data.get(name);
+ if(array == null){
+ array = new ArrayList<>();
+ data.put(name, array);
+ }
+ array.add(value);
+ }
+ }
+
+ }
+
+ // extra step compared to converterNoEmitDefaultValues
+ DefaultValue defaultValue;
+ for (int i = 0; i < defaultValues.length; i++) {
+ defaultValue = defaultValues[i];
+ if(!defaultValue.hasValue && defaultValue.defaultValue != null){
+ data.put(defaultValue.name, defaultValue.defaultValue);
+ }
+ }
+
+ return data;
+ }
+
+ private Object getDefaultValue(FieldDescriptor field, DataType fieldDataType){
+ if(field.getJavaType() == Descriptors.FieldDescriptor.JavaType.MESSAGE){
+ return null;
+ }
+ if(field.isRepeated()){
+ return null;
+ }
+ if(field.hasOptionalKeyword()){
+ return null;
+ }
+
+ switch (field.getType()) {
+ case DOUBLE:
+ case FLOAT:
+ case INT64:
+ case UINT64:
+ case FIXED64:
+ case SFIXED64:
+ case SINT64:
+ case INT32:
+ case UINT32:
+ case FIXED32:
+ case SFIXED32:
+ case SINT32:
+ Number number = 0L;
+ if (fieldDataType instanceof DoubleType) {
+ return number.doubleValue();
+ } else if (fieldDataType instanceof FloatType) {
+ return number.floatValue();
+ } else if (fieldDataType instanceof IntegerType) {
+ return number.intValue();
+ } else if (fieldDataType instanceof LongType) {
+ return number.longValue();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case BOOL:
+ if (fieldDataType instanceof BooleanType) {
+ return false;
+ } else if (fieldDataType instanceof IntegerType) {
+ return 0;
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case BYTES:
+ if (fieldDataType instanceof BinaryType) {
+ return new byte[0];
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case STRING:
+ if (fieldDataType instanceof StringType) {
+ return "";
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case ENUM:
+ if (fieldDataType instanceof IntegerType) {
+ return ((Descriptors.EnumValueDescriptor) field.getDefaultValue()).getNumber();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ default:
+ throw new IllegalArgumentException(String.format("not supported proto type:%s(%s)", field.getType(), field.getName()));
+ }
+ }
+ }
+
+ public static class DefaultValue{
+ boolean hasValue;
+ final String name;
+
+ final Object defaultValue;
+
+ public DefaultValue(String name, Object defaultValue) {
+ this.name = name;
+ this.defaultValue = defaultValue;
+ }
+ }
+
+ public static class FieldDesc {
+ final FieldDescriptor field;
+ final String name;
+ final DataType fieldDataType; // DataType of the field; for arrays, the element type
+ final int fieldPosition; // position of the field in the StructType
+
+ final ValueConverter valueConverter;
+ private final char[] tmpDecodeChars;
+
+ public FieldDesc(FieldDescriptor field, DataType dataType, int fieldPosition, boolean emitDefaultValues, char[] tmpDecodeChars) {
+ this.field = field;
+ this.name = field.getName();
+ if (dataType instanceof ArrayType) {
+ this.fieldDataType = ((ArrayType) dataType).elementType;
+ } else {
+ this.fieldDataType = dataType;
+ }
+ this.fieldPosition = fieldPosition;
+ this.tmpDecodeChars = tmpDecodeChars;
+ valueConverter = makeConverter(emitDefaultValues);
+ }
+
+ private ValueConverter makeConverter(boolean emitDefaultValues) {
+ switch (field.getType()) {
+ case ENUM:
+ if(!(fieldDataType instanceof IntegerType)){
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ return (input, packed) -> {
+ if (packed) {
+ List<Object> array = new ArrayList<>();
+ while (input.getBytesUntilLimit() > 0) {
+ array.add(input.readEnum());
+ }
+ return array;
+ } else {
+ return input.readEnum();
+ }
+ };
+ case MESSAGE:
+ final Descriptor descriptor = field.getMessageType();
+ final MessageConverter messageConverter = new MessageConverter(descriptor, (StructType) fieldDataType, emitDefaultValues);
+ return (input, packed) -> {
+ final int length = input.readRawVarint32();
+ final int oldLimit = input.pushLimit(length);
+ Object message = messageConverter.converter(input);
+ input.checkLastTagWas(0);
+ if (input.getBytesUntilLimit() != 0) {
+ throw new RuntimeException("parse");
+ }
+ input.popLimit(oldLimit);
+ return message;
+ };
+ default:
+ ValueConverter fieldConverter = makePrimitiveFieldConverter();
+ return (input, packed) -> {
+ if (packed) {
+ List<Object> array = new ArrayList<>();
+ while (input.getBytesUntilLimit() > 0) {
+ array.add(fieldConverter.convert(input, false));
+ }
+ return array;
+ } else {
+ return fieldConverter.convert(input, false);
+ }
+ };
+ }
+ }
+
+ private ValueConverter makePrimitiveFieldConverter() {
+ switch (field.getType()) {
+ case DOUBLE:
+ if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> input.readDouble();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readDouble();
+ } else if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readDouble();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readDouble();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case FLOAT:
+ if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readFloat();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> input.readFloat();
+ } else if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readFloat();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readFloat();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case INT64:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readInt64();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> input.readInt64();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readInt64();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readInt64();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case UINT64:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readUInt64();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> input.readUInt64();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readUInt64();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readUInt64();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case FIXED64:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readFixed64();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> input.readFixed64();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readFixed64();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readFixed64();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case SFIXED64:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readSFixed64();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> input.readSFixed64();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readSFixed64();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readSFixed64();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case SINT64:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> (int) input.readSInt64();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> input.readSInt64();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readSInt64();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readSInt64();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case INT32:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> input.readInt32();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readInt32();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readInt32();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readInt32();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case UINT32:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> input.readUInt32();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readUInt32();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readUInt32();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readUInt32();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case FIXED32:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> input.readFixed32();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readFixed32();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readFixed32();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readFixed32();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case SFIXED32:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> input.readSFixed32();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readSFixed32();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readSFixed32();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readSFixed32();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case SINT32:
+ if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> input.readSInt32();
+ } else if (fieldDataType instanceof LongType) {
+ return (input, packed) -> (long) input.readSInt32();
+ } else if (fieldDataType instanceof FloatType) {
+ return (input, packed) -> (float) input.readSInt32();
+ } else if (fieldDataType instanceof DoubleType) {
+ return (input, packed) -> (double) input.readSInt32();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case BOOL:
+ if (fieldDataType instanceof BooleanType) {
+ return (input, packed) -> input.readBool();
+ } else if (fieldDataType instanceof IntegerType) {
+ return (input, packed) -> input.readBool() ? 1 : 0;
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case BYTES:
+ if (fieldDataType instanceof BinaryType) {
+ return (input, packed) -> input.readByteArray();
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ case STRING:
+ if (fieldDataType instanceof StringType) {
+ return (input, packed) -> {
+ //return input.readString();
+ byte[] bytes = input.readByteArray();
+ return decodeUTF8(bytes, 0, bytes.length);
+ };
+ } else {
+ throw newCanNotConvertException(field, fieldDataType);
+ }
+ default:
+ throw new IllegalArgumentException(String.format("not supported proto type:%s(%s)", field.getType(), field.getName()));
+ }
+ }
+
+ private String decodeUTF8(byte[] input, int offset, int byteLen) {
+ char[] chars = MessageConverter.MAX_CHARS_LENGTH < byteLen? new char[byteLen]: tmpDecodeChars;
+ int len = decodeUTF8Strict(input, offset, byteLen, chars);
+ if (len < 0) {
+ return defaultDecodeUTF8(input, offset, byteLen);
+ }
+ return new String(chars, 0, len);
+ }
+
+ private static int decodeUTF8Strict(byte[] sa, int sp, int len, char[] da) {
+ final int sl = sp + len;
+ int dp = 0;
+ int dlASCII = Math.min(len, da.length);
+
+ // ASCII only optimized loop
+ while (dp < dlASCII && sa[sp] >= 0) {
+ da[dp++] = (char) sa[sp++];
+ }
+
+ while (sp < sl) {
+ int b1 = sa[sp++];
+ if (b1 >= 0) {
+ // 1 byte, 7 bits: 0xxxxxxx
+ da[dp++] = (char) b1;
+ } else if ((b1 >> 5) == -2 && (b1 & 0x1e) != 0) {
+ // 2 bytes, 11 bits: 110xxxxx 10xxxxxx
+ if (sp < sl) {
+ int b2 = sa[sp++];
+ if ((b2 & 0xc0) != 0x80) { // isNotContinuation(b2)
+ return -1;
+ } else {
+ da[dp++] = (char) (((b1 << 6) ^ b2) ^ (((byte) 0xC0 << 6) ^ ((byte) 0x80)));
+ }
+ continue;
+ }
+ return -1;
+ } else if ((b1 >> 4) == -2) {
+ // 3 bytes, 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
+ if (sp + 1 < sl) {
+ int b2 = sa[sp++];
+ int b3 = sa[sp++];
+ if ((b1 == (byte) 0xe0 && (b2 & 0xe0) == 0x80)
+ || (b2 & 0xc0) != 0x80
+ || (b3 & 0xc0) != 0x80) { // isMalformed3(b1, b2, b3)
+ return -1;
+ } else {
+ char c =
+ (char)
+ ((b1 << 12)
+ ^ (b2 << 6)
+ ^ (b3
+ ^ (((byte) 0xE0 << 12)
+ ^ ((byte) 0x80 << 6)
+ ^ ((byte) 0x80))));
+ if (Character.isSurrogate(c)) {
+ return -1;
+ } else {
+ da[dp++] = c;
+ }
+ }
+ continue;
+ }
+ return -1;
+ } else if ((b1 >> 3) == -2) {
+ // 4 bytes, 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
+ if (sp + 2 < sl) {
+ int b2 = sa[sp++];
+ int b3 = sa[sp++];
+ int b4 = sa[sp++];
+ int uc =
+ ((b1 << 18)
+ ^ (b2 << 12)
+ ^ (b3 << 6)
+ ^ (b4
+ ^ (((byte) 0xF0 << 18)
+ ^ ((byte) 0x80 << 12)
+ ^ ((byte) 0x80 << 6)
+ ^ ((byte) 0x80))));
+ // isMalformed4 and shortest form check
+ if (((b2 & 0xc0) != 0x80 || (b3 & 0xc0) != 0x80 || (b4 & 0xc0) != 0x80)
+ || !Character.isSupplementaryCodePoint(uc)) {
+ return -1;
+ } else {
+ da[dp++] = Character.highSurrogate(uc);
+ da[dp++] = Character.lowSurrogate(uc);
+ }
+ continue;
+ }
+ return -1;
+ } else {
+ return -1;
+ }
+ }
+ return dp;
+ }
+
+ private static String defaultDecodeUTF8(byte[] bytes, int offset, int len) {
+ return new String(bytes, offset, len, StandardCharsets.UTF_8);
+ }
+ }
+
+ private static IllegalArgumentException newCanNotConvertException(FieldDescriptor field, DataType fieldDataType){
+ return new IllegalArgumentException(String.format("proto field:%s(%s) can not convert to type:%s", field.getName(), field.getType(), fieldDataType.simpleString()));
+ }
+
+ @FunctionalInterface
+ public interface ValueConverter {
+ Object convert(CodedInputStream input, boolean packed) throws Exception;
+ }
+}
diff --git a/groot-formats/format-protobuf/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-formats/format-protobuf/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
index b6c459c..b6c459c 100644
--- a/groot-formats/format-protobuf/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory
+++ b/groot-formats/format-protobuf/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
diff --git a/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufEventSchemaTest.java b/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufEventSchemaTest.java
index 9638bd6..df3c30a 100644
--- a/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufEventSchemaTest.java
+++ b/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufEventSchemaTest.java
@@ -1,700 +1,700 @@
-package com.geedgenetworks.formats.protobuf;
-
-import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.shaded.com.google.protobuf.ByteString;
-import com.geedgenetworks.shaded.com.google.protobuf.Descriptors;
-import org.apache.commons.io.IOUtils;
-import org.apache.commons.io.LineIterator;
-import org.apache.flink.util.Preconditions;
-import org.apache.kafka.common.record.CompressionType;
-import org.apache.kafka.common.utils.ByteBufferOutputStream;
-import org.junit.jupiter.api.Test;
-import com.geedgenetworks.formats.protobuf.SchemaConverters.MessageConverter;
-
-import java.io.FileInputStream;
-import java.io.OutputStream;
-import java.nio.charset.StandardCharsets;
-import java.util.*;
-import java.util.concurrent.ThreadLocalRandom;
-
-import static org.junit.jupiter.api.Assertions.*;
-
-/**
- * Commands used to regenerate the descriptor sets and Java bindings consumed by these tests:
- * protoc --descriptor_set_out=proto3_types.desc --java_out=./ proto3_types.proto
- * protoc --descriptor_set_out=session_record_test.desc session_record_test.proto
- */
-public class ProtobufEventSchemaTest {
-
- public static class InputDatas{
- Proto3TypesProtos.Proto3Types msg;
- Proto3TypesProtos.StructMessage subMsg1;
- Proto3TypesProtos.StructMessage subMsg2;
- Map<String, Object> map;
- Map<String, Object> subMap1;
- Map<String, Object> subMap2;
- }
-
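- // Builds a fully populated Proto3Types message together with the map of values the
- // decoder is expected to produce for it.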
- public static InputDatas geneInputDatas(){
- ThreadLocalRandom random = ThreadLocalRandom.current();
- Proto3TypesProtos.Proto3Types.Builder msgBuilder = Proto3TypesProtos.Proto3Types.newBuilder();
- Map<String, Object> map = new HashMap<>();
- Proto3TypesProtos.StructMessage.Builder subMsgBuilder1 = Proto3TypesProtos.StructMessage.newBuilder();
- Proto3TypesProtos.StructMessage.Builder subMsgBuilder2 = Proto3TypesProtos.StructMessage.newBuilder();
- Map<String, Object> subMap1 = new HashMap<>();
- Map<String, Object> subMap2 = new HashMap<>();
-
- long int64 = random.nextLong(1, Long.MAX_VALUE);
- msgBuilder.setInt64(int64);
- map.put("int64", int64);
-
- int int32 = random.nextInt(1, Integer.MAX_VALUE);
- msgBuilder.setInt32(int32);
- map.put("int32", int32);
-
- String text = "ut8字符串";
- msgBuilder.setText(text);
- map.put("text", text);
-
- byte[] bytes = new byte[]{1, 2, 3, 4, 5};
- msgBuilder.setBytes(ByteString.copyFrom(bytes));
- map.put("bytes", bytes);
-
- int enum_val = 1;
- msgBuilder.setEnumValValue(enum_val);
- map.put("enum_val", enum_val);
-
- // subMsg start
- long id = random.nextLong(1, Long.MAX_VALUE);
- subMsgBuilder1.setId(id);
- subMap1.put("id", id);
-
- String name = "ut8字符串1";
- subMsgBuilder1.setName(name);
- subMap1.put("name", name);
-
- int age = random.nextInt(1, Integer.MAX_VALUE);
- subMsgBuilder1.setAge(age);
- subMap1.put("age", age);
-
- double score = random.nextDouble(1, Integer.MAX_VALUE);
- subMsgBuilder1.setScore(score);
- subMap1.put("score", score);
-
- long optional_id = random.nextLong(1, Long.MAX_VALUE);
- subMsgBuilder1.setOptionalId(optional_id);
- subMap1.put("optional_id", optional_id);
-
- int optional_age = random.nextInt(1, Integer.MAX_VALUE);
- subMsgBuilder1.setOptionalAge(optional_age);
- subMap1.put("optional_age", optional_age);
-
- id = random.nextLong(1, Long.MAX_VALUE);
- subMsgBuilder2.setId(id);
- subMap2.put("id", id);
-
- name = "ut8字符串1";
- subMsgBuilder2.setName(name);
- subMap2.put("name", name);
-
- age = random.nextInt(1, Integer.MAX_VALUE);
- subMsgBuilder2.setAge(age);
- subMap2.put("age", age);
-
- score = random.nextDouble(1, Integer.MAX_VALUE);
- subMsgBuilder2.setScore(score);
- subMap2.put("score", score);
-
- optional_id = random.nextLong(1, Long.MAX_VALUE);
- subMsgBuilder2.setOptionalId(optional_id);
- subMap2.put("optional_id", optional_id);
-
- optional_age = random.nextInt(1, Integer.MAX_VALUE);
- subMsgBuilder2.setOptionalAge(optional_age);
- subMap2.put("optional_age", optional_age);
-
- Proto3TypesProtos.StructMessage subMsg1 = subMsgBuilder1.build();
- Proto3TypesProtos.StructMessage subMsg2 = subMsgBuilder2.build();
- // subMsg end
-
- msgBuilder.setMessage(subMsg1);
- map.put("message", subMap1);
-
- long optional_int64 = random.nextLong(1, Long.MAX_VALUE);
- msgBuilder.setOptionalInt64(optional_int64);
- map.put("optional_int64", optional_int64);
-
- int optional_int32 = random.nextInt(1, Integer.MAX_VALUE);
- msgBuilder.setOptionalInt32(optional_int32);
- map.put("optional_int32", optional_int32);
-
- String optional_text = "ut8字符串";
- msgBuilder.setOptionalText(optional_text);
- map.put("optional_text", optional_text);
-
- byte[] optional_bytes = new byte[]{1, 2, 3, 4, 5};
- msgBuilder.setOptionalBytes(ByteString.copyFrom(optional_bytes));
- map.put("optional_bytes", optional_bytes);
-
- int optional_enum_val = 1;
- msgBuilder.setOptionalEnumValValue(optional_enum_val);
- map.put("optional_enum_val", optional_enum_val);
-
- msgBuilder.setOptionalMessage(subMsg2);
- map.put("optional_message", subMap2);
-
- List<Long> repeated_int64 = Arrays.asList(1L, 3L, 5L);
- msgBuilder.addAllRepeatedInt64(repeated_int64);
- map.put("repeated_int64", repeated_int64);
-
- List<Integer> repeated_int32 = Arrays.asList(1, 3, 5);
- msgBuilder.addAllRepeatedInt32(repeated_int32);
- map.put("repeated_int32", repeated_int32);
-
- msgBuilder.addAllRepeatedMessage(Arrays.asList(subMsg1, subMsg2));
- map.put("repeated_message", Arrays.asList(subMap1, subMap2));
-
- InputDatas datas = new InputDatas();
- datas.msg = msgBuilder.build();
- datas.subMsg1 = subMsg1;
- datas.subMsg2 = subMsg2;
- datas.map = map;
- datas.subMap1 = subMap1;
- datas.subMap2 = subMap2;
- return datas;
- }
-
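- // Variant of geneInputDatas in which every field is set to its proto3 zero value.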
- public static InputDatas geneInputDatasDefaultValue(){
- ThreadLocalRandom random = ThreadLocalRandom.current();
- Proto3TypesProtos.Proto3Types.Builder msgBuilder = Proto3TypesProtos.Proto3Types.newBuilder();
- Map<String, Object> map = new HashMap<>();
- Proto3TypesProtos.StructMessage.Builder subMsgBuilder1 = Proto3TypesProtos.StructMessage.newBuilder();
- Proto3TypesProtos.StructMessage.Builder subMsgBuilder2 = Proto3TypesProtos.StructMessage.newBuilder();
- Map<String, Object> subMap1 = new HashMap<>();
- Map<String, Object> subMap2 = new HashMap<>();
-
- long int64 = 0;
- msgBuilder.setInt64(int64);
- map.put("int64", int64);
-
- int int32 = 0;
- msgBuilder.setInt32(int32);
- map.put("int32", int32);
-
- String text = "";
- msgBuilder.setText(text);
- map.put("text", text);
-
- byte[] bytes = new byte[]{};
- msgBuilder.setBytes(ByteString.copyFrom(bytes));
- map.put("bytes", bytes);
-
- int enum_val = 0;
- msgBuilder.setEnumValValue(enum_val);
- map.put("enum_val", enum_val);
-
- // subMsg start
- long id = 0;
- subMsgBuilder1.setId(id);
- subMap1.put("id", id);
-
- String name = "";
- subMsgBuilder1.setName(name);
- subMap1.put("name", name);
-
- int age = 0;
- subMsgBuilder1.setAge(age);
- subMap1.put("age", age);
-
- double score = 0;
- subMsgBuilder1.setScore(score);
- subMap1.put("score", score);
-
- long optional_id = 0;
- subMsgBuilder1.setOptionalId(optional_id);
- subMap1.put("optional_id", optional_id);
-
- int optional_age = 0;
- /*subMsgBuilder1.setOptionalAge(optional_age);
- subMap1.put("optional_age", optional_age);*/
-
- id = 0;
- subMsgBuilder2.setId(id);
- subMap2.put("id", id);
-
- name = "";
- subMsgBuilder2.setName(name);
- subMap2.put("name", name);
-
- age = 0;
- subMsgBuilder2.setAge(age);
- subMap2.put("age", age);
-
- score = 0;
- subMsgBuilder2.setScore(score);
- subMap2.put("score", score);
-
- optional_id = 0;
- subMsgBuilder2.setOptionalId(optional_id);
- subMap2.put("optional_id", optional_id);
-
- optional_age = 0;
- subMsgBuilder2.setOptionalAge(optional_age);
- subMap2.put("optional_age", optional_age);
-
- Proto3TypesProtos.StructMessage subMsg1 = subMsgBuilder1.build();
- Proto3TypesProtos.StructMessage subMsg2 = subMsgBuilder2.build();
- // subMsg end
-
- msgBuilder.setMessage(subMsg1);
- map.put("message", subMap1);
-
- long optional_int64 = 0;
- msgBuilder.setOptionalInt64(optional_int64);
- map.put("optional_int64", optional_int64);
-
- int optional_int32 = 0;
- msgBuilder.setOptionalInt32(optional_int32);
- map.put("optional_int32", optional_int32);
-
- String optional_text = "";
- msgBuilder.setOptionalText(optional_text);
- map.put("optional_text", optional_text);
-
- byte[] optional_bytes = new byte[]{};
- msgBuilder.setOptionalBytes(ByteString.copyFrom(optional_bytes));
- map.put("optional_bytes", optional_bytes);
-
- int optional_enum_val = 0;
- msgBuilder.setOptionalEnumValValue(optional_enum_val);
- map.put("optional_enum_val", optional_enum_val);
-
- msgBuilder.setOptionalMessage(subMsg2);
- map.put("optional_message", subMap2);
-
- List<Long> repeated_int64 = Arrays.asList();
- msgBuilder.addAllRepeatedInt64(repeated_int64);
- map.put("repeated_int64", repeated_int64);
-
- List<Integer> repeated_int32 = Arrays.asList();
- msgBuilder.addAllRepeatedInt32(repeated_int32);
- map.put("repeated_int32", repeated_int32);
-
- msgBuilder.addAllRepeatedMessage(Arrays.asList());
- map.put("repeated_message", Arrays.asList());
-
- InputDatas datas = new InputDatas();
- datas.msg = msgBuilder.build();
- datas.subMsg1 = subMsg1;
- datas.subMsg2 = subMsg2;
- datas.map = map;
- datas.subMap1 = subMap1;
- datas.subMap2 = subMap2;
- return datas;
- }
-
- public static InputDatas geneInputDatasUsePartialField(){
- ThreadLocalRandom random = ThreadLocalRandom.current();
- Proto3TypesProtos.Proto3Types.Builder msgBuilder = Proto3TypesProtos.Proto3Types.newBuilder();
- Map<String, Object> map = new HashMap<>();
- Proto3TypesProtos.StructMessage.Builder subMsgBuilder1 = Proto3TypesProtos.StructMessage.newBuilder();
- Proto3TypesProtos.StructMessage.Builder subMsgBuilder2 = Proto3TypesProtos.StructMessage.newBuilder();
- Map<String, Object> subMap1 = new HashMap<>();
- Map<String, Object> subMap2 = new HashMap<>();
-
- /*long int64 = random.nextLong(1, Long.MAX_VALUE);
- msgBuilder.setInt64(int64);
- map.put("int64", int64);*/
-
- int int32 = random.nextInt(1, Integer.MAX_VALUE);
- msgBuilder.setInt32(int32);
- map.put("int32", int32);
-
- String text = "ut8字符串";
- msgBuilder.setText(text);
- map.put("text", text);
-
- /*byte[] bytes = new byte[]{1, 2, 3, 4, 5};
- msgBuilder.setBytes(ByteString.copyFrom(bytes));
- map.put("bytes", bytes);*/
-
- /*int enum_val = 1;
- msgBuilder.setEnumValValue(enum_val);
- map.put("enum_val", enum_val);*/
-
- // subMsg start
- long id = random.nextLong(1, Long.MAX_VALUE);
- subMsgBuilder1.setId(id);
- subMap1.put("id", id);
-
- String name = "ut8字符串1";
- /*subMsgBuilder1.setName(name);
- subMap1.put("name", name);*/
-
- int age = random.nextInt(1, Integer.MAX_VALUE);
- subMsgBuilder1.setAge(age);
- subMap1.put("age", age);
-
- double score = random.nextDouble(1, Integer.MAX_VALUE);
- /*subMsgBuilder1.setScore(score);
- subMap1.put("score", score);*/
-
- long optional_id = random.nextLong(1, Long.MAX_VALUE);
- subMsgBuilder1.setOptionalId(optional_id);
- subMap1.put("optional_id", optional_id);
-
- int optional_age = random.nextInt(1, Integer.MAX_VALUE);
- /*subMsgBuilder1.setOptionalAge(optional_age);
- subMap1.put("optional_age", optional_age);*/
-
- id = random.nextLong(1, Long.MAX_VALUE);
- /*subMsgBuilder2.setId(id);
- subMap2.put("id", id);*/
-
- name = "ut8字符串1";
- subMsgBuilder2.setName(name);
- subMap2.put("name", name);
-
- age = random.nextInt(1, Integer.MAX_VALUE);
- /*subMsgBuilder2.setAge(age);
- subMap2.put("age", age);*/
-
- score = random.nextDouble(1, Integer.MAX_VALUE);
- subMsgBuilder2.setScore(score);
- subMap2.put("score", score);
-
- optional_id = random.nextLong(1, Long.MAX_VALUE);
- /*subMsgBuilder2.setOptionalId(optional_id);
- subMap2.put("optional_id", optional_id);*/
-
- optional_age = random.nextInt(1, Integer.MAX_VALUE);
- subMsgBuilder2.setOptionalAge(optional_age);
- subMap2.put("optional_age", optional_age);
-
- Proto3TypesProtos.StructMessage subMsg1 = subMsgBuilder1.build();
- Proto3TypesProtos.StructMessage subMsg2 = subMsgBuilder2.build();
- // subMsg end
-
- /*msgBuilder.setMessage(subMsg1);
- map.put("message", subMap1);*/
-
- long optional_int64 = random.nextLong(1, Long.MAX_VALUE);
- msgBuilder.setOptionalInt64(optional_int64);
- map.put("optional_int64", optional_int64);
-
- /*int optional_int32 = random.nextInt(1, Integer.MAX_VALUE);
- msgBuilder.setOptionalInt32(optional_int32);
- map.put("optional_int32", optional_int32);*/
-
- String optional_text = "ut8字符串";
- msgBuilder.setOptionalText(optional_text);
- map.put("optional_text", optional_text);
-
- /*byte[] optional_bytes = new byte[]{1, 2, 3, 4, 5};
- msgBuilder.setOptionalBytes(ByteString.copyFrom(optional_bytes));
- map.put("optional_bytes", optional_bytes);*/
-
- int optional_enum_val = 1;
- msgBuilder.setOptionalEnumValValue(optional_enum_val);
- map.put("optional_enum_val", optional_enum_val);
-
- msgBuilder.setOptionalMessage(subMsg2);
- map.put("optional_message", subMap2);
-
- /*List<Long> repeated_int64 = Arrays.asList(1L, 3L, 5L);
- msgBuilder.addAllRepeatedInt64(repeated_int64);
- map.put("repeated_int64", repeated_int64);*/
-
- List<Integer> repeated_int32 = Arrays.asList(1, 3, 5);
- msgBuilder.addAllRepeatedInt32(repeated_int32);
- map.put("repeated_int32", repeated_int32);
-
- msgBuilder.addAllRepeatedMessage(Arrays.asList(subMsg1, subMsg2));
- map.put("repeated_message", Arrays.asList(subMap1, subMap2));
-
- InputDatas datas = new InputDatas();
- datas.msg = msgBuilder.build();
- datas.subMsg1 = subMsg1;
- datas.subMsg2 = subMsg2;
- datas.map = map;
- datas.subMap1 = subMap1;
- datas.subMap2 = subMap2;
- return datas;
- }
-
- @Test
- public void testSerializeAndDeserialize() throws Exception{
- String path = getClass().getResource("/proto3_types.desc").getPath();
- Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"Proto3Types");
- InputDatas inputDatas = geneInputDatas();
-
- byte[] bytesSerByApi = inputDatas.msg.toByteArray();
-
- ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
- byte[] bytesSer = serializer.serialize(inputDatas.map);
-
- System.out.println(String.format("built-in ser bytes size:%d\nmy ser bytes size:%d", bytesSerByApi.length, bytesSer.length));
- assertArrayEquals(bytesSerByApi, bytesSer);
-
- MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
- Map<String, Object> rstMap = messageConverter.converter(bytesSer);
-
- assertTrue(objEquals(inputDatas.map, rstMap, false), () -> "\n" + inputDatas.map.toString() + "\n" + rstMap.toString());
- System.out.println(inputDatas.map.toString());
- System.out.println(rstMap.toString());
- System.out.println(JSON.toJSONString(inputDatas.map));
- System.out.println(JSON.toJSONString(rstMap));
-
- System.out.println(JSON.toJSONString(inputDatas.map).equals(JSON.toJSONString(rstMap)));
- }
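The test above exercises the full round trip: a descriptor built from a compiled `.desc` file drives both map-to-bytes serialization and bytes-to-map decoding. A condensed sketch of that usage pattern, using only the APIs visible in this diff (the descriptor path and field value here are illustrative, not part of the commit):

```java
// Round-trip sketch using the APIs under test (illustrative values).
Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(
        ProtobufUtils.readDescriptorFileContent("/path/to/proto3_types.desc"), "Proto3Types");

ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
byte[] wire = serializer.serialize(Collections.singletonMap("int32", 42)); // Map -> protobuf bytes

SchemaConverters.MessageConverter converter = new SchemaConverters.MessageConverter(
        descriptor, SchemaConverters.toStructType(descriptor), false);
Map<String, Object> decoded = converter.converter(wire);                   // protobuf bytes -> Map
```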
-
- @Test
- public void testSerializeAndDeserializeDefaultValue() throws Exception{
- String path = getClass().getResource("/proto3_types.desc").getPath();
- Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"Proto3Types");
- InputDatas inputDatas = geneInputDatasDefaultValue();
-
- byte[] bytesSerByApi = inputDatas.msg.toByteArray();
-
- ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
- byte[] bytesSer = serializer.serialize(inputDatas.map);
-
- System.out.println(String.format("built-in ser bytes size:%d\nmy ser bytes size:%d", bytesSerByApi.length, bytesSer.length));
- assertArrayEquals(bytesSerByApi, bytesSer);
-
- MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
- Map<String, Object> rstMap = messageConverter.converter(bytesSer);
- messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), true);
- Map<String, Object> rstMapEmitDefaultValue = messageConverter.converter(bytesSer);
-
- // message fields are emitted when non-null, arrays only when length > 0, optional fields once set, and optional bytes even when length is 0
- System.out.println(inputDatas.map.toString());
- System.out.println(rstMap.toString());
- System.out.println(rstMapEmitDefaultValue.toString());
- System.out.println(JSON.toJSONString(inputDatas.map));
- System.out.println(JSON.toJSONString(rstMap));
- System.out.println(JSON.toJSONString(rstMapEmitDefaultValue));
-
- System.out.println(JSON.toJSONString(inputDatas.map).equals(JSON.toJSONString(rstMap)));
- }
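The emit-default behavior checked above hinges on proto3 field presence. A hedged illustration with the generated `Proto3TypesProtos` types, assuming `optional_age` is declared with the proto3 `optional` keyword (which the generated set/has accessors suggest):

```java
// Proto3 presence sketch: plain scalars drop their default from the wire, proto3 optionals keep it.
Proto3TypesProtos.StructMessage msg = Proto3TypesProtos.StructMessage.newBuilder()
        .setAge(0)         // plain int32: 0 is the implicit default, nothing is written
        .setOptionalAge(0) // proto3 optional: presence is tracked, the 0 is written
        .build();
boolean tracked = msg.hasOptionalAge(); // true - explicit presence survives a serialize/parse round trip
```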
-
- @Test
- public void testSerializeAndDeserializeUsePartialField() throws Exception{
- String path = getClass().getResource("/proto3_types.desc").getPath();
- Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"Proto3Types");
- InputDatas inputDatas = geneInputDatasUsePartialField();
-
- byte[] bytesSerByApi = inputDatas.msg.toByteArray();
-
- ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
- byte[] bytesSer = serializer.serialize(inputDatas.map);
- System.out.println(Base64.getEncoder().encodeToString(bytesSer));
-
- System.out.println(String.format("built-in ser bytes size:%d\nmy ser bytes size:%d", bytesSerByApi.length, bytesSer.length));
- assertArrayEquals(bytesSerByApi, bytesSer);
-
- MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
- Map<String, Object> rstMap = messageConverter.converter(bytesSer);
-
- assertTrue(objEquals(inputDatas.map, rstMap, false), () -> "\n" + inputDatas.map.toString() + "\n" + rstMap.toString());
- System.out.println(inputDatas.map.toString());
- System.out.println(rstMap.toString());
- System.out.println(JSON.toJSONString(inputDatas.map));
- System.out.println(JSON.toJSONString(rstMap));
-
- System.out.println(JSON.toJSONString(inputDatas.map).equals(JSON.toJSONString(rstMap)));
- }
-
- @Test
- public void testSerializeAndDeserializeSessionRecord() throws Exception{
- String path = getClass().getResource("/session_record_test.desc").getPath();
- Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"SessionRecord");
- String json = "{\"recv_time\": 1704350600, \"log_id\": 185826449998479360, \"decoded_as\": \"BASE\", \"session_id\": 290502878495441820, \"start_timestamp_ms\": 1704350566378, \"end_timestamp_ms\": 1704350570816, \"duration_ms\": 4438, \"tcp_handshake_latency_ms\": 1105, \"ingestion_time\": 1704350600, \"processing_time\": 1704350600, \"device_id\": \"21426003\", \"out_link_id\": 65535, \"in_link_id\": 65535, \"device_tag\": \"{\\\"tags\\\":[{\\\"tag\\\":\\\"data_center\\\",\\\"value\\\":\\\"center-xxg-9140\\\"},{\\\"tag\\\":\\\"device_group\\\",\\\"value\\\":\\\"group-xxg-9140\\\"}]}\", \"data_center\": \"center-xxg-9140\", \"device_group\": \"group-xxg-9140\", \"sled_ip\": \"192.168.40.81\", \"address_type\": 4, \"vsys_id\": 1, \"t_vsys_id\": 1, \"flags\": 24592, \"flags_identify_info\": \"[1,1,2]\", \"statistics_rule_list\": [406583], \"client_ip\": \"192.56.151.80\", \"client_port\": 62241, \"client_os_desc\": \"Windows\", \"client_geolocation\": \"\\u7f8e\\u56fd.Unknown.Unknown..\", \"server_ip\": \"192.56.222.93\", \"server_port\": 14454, \"server_os_desc\": \"Linux\", \"server_geolocation\": \"\\u7f8e\\u56fd.Unknown.Unknown..\", \"ip_protocol\": \"tcp\", \"decoded_path\": \"ETHERNET.IPv4.TCP\", \"sent_pkts\": 4, \"received_pkts\": 5, \"sent_bytes\": 246, \"received_bytes\": 1809, \"tcp_rtt_ms\": 128, \"tcp_client_isn\": 568305009, \"tcp_server_isn\": 4027331180, \"in_src_mac\": \"a2:fa:dc:56:c7:b3\", \"out_src_mac\": \"48:73:97:96:38:20\", \"in_dest_mac\": \"48:73:97:96:38:20\", \"out_dest_mac\": \"a2:fa:dc:56:c7:b3\"}";
- Map<String, Object> map = JSON.parseObject(json);
-
- ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
- byte[] bytesSer = serializer.serialize(map);
- System.out.println(Base64.getEncoder().encodeToString(bytesSer));
-
- System.out.println(String.format("my ser bytes size:%d", bytesSer.length));
-
- MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
- Map<String, Object> rstMap = messageConverter.converter(bytesSer);
-
- assertTrue(objEquals(map, rstMap, true), () -> "\n" + JSON.toJSONString(map) + "\n" + JSON.toJSONString(rstMap));
- System.out.println(map.toString());
- System.out.println(rstMap.toString());
- System.out.println(JSON.toJSONString(new TreeMap<>(map)));
- System.out.println(JSON.toJSONString(new TreeMap<>(rstMap)));
-
- System.out.println(JSON.toJSONString(new TreeMap<>(map)).equals(JSON.toJSONString(new TreeMap<>(rstMap))));
- }
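The `numConvert` flag matters here because fastjson2 parses integral JSON numbers into the narrowest boxed type, while protobuf decoding hands values back at the schema's width. A small illustration (fastjson2 behavior; the field is taken from the test JSON):

```java
// Integral width mismatch between JSON parsing and protobuf decoding (illustrative).
Map<String, Object> parsed = JSON.parseObject("{\"vsys_id\": 1}");
Object v = parsed.get("vsys_id"); // Integer 1 from fastjson2
// Decoded from protobuf, the same field comes back at the schema width, e.g. Long 1L for int64,
// so objEquals(map, rstMap, true) compares longValue() instead of requiring identical classes.
```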
-
-
- public static void main(String[] args) throws Exception{
- ProtobufEventSchemaTest test = new ProtobufEventSchemaTest();
- String path = test.getClass().getResource("/session_record_test.desc").getPath();
- Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"SessionRecord");
- MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
- ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
-
- FileInputStream inputStream = new FileInputStream("D:\\doc\\groot\\SESSION-RECORD-24-0104.json");
- LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
- int count = 0;
- long jsonBytesTotalSize = 0;
- long protoBytesTotalSize = 0;
- long jsonBytesMinSize = Long.MAX_VALUE;
- long protoBytesMinSize = Long.MAX_VALUE;
- long jsonBytesMaxSize = 0;
- long protoBytesMaxSize = 0;
- long totalFieldCount = 0;
- long minFieldCount = Long.MAX_VALUE;
- long maxFieldCount = 0;
-
- CompressionType[] compressionTypes = new CompressionType[]{
- CompressionType.NONE, CompressionType.SNAPPY, CompressionType.LZ4, CompressionType.GZIP, CompressionType.ZSTD
- };
- long[][] compressionBytesSize = new long[compressionTypes.length][6];
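- // per compression type: [0]=json total, [1]=proto total, [2]=json min, [3]=proto min, [4]=json max, [5]=proto max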
- for (int i = 0; i < compressionBytesSize.length; i++) {
- compressionBytesSize[i][0] = 0;
- compressionBytesSize[i][1] = 0;
- compressionBytesSize[i][2] = Long.MAX_VALUE;
- compressionBytesSize[i][3] = Long.MAX_VALUE;
- compressionBytesSize[i][4] = 0;
- compressionBytesSize[i][5] = 0;
- }
-
- while (lines.hasNext()){
- String line = lines.next().trim();
- if(line.isEmpty()){
- continue;
- }
-
- Map<String, Object> map = JSON.parseObject(line);
- int fieldCount = map.size();
- byte[] bytesProto = serializer.serialize(map);
- byte[] bytesJson = JSON.toJSONString(map).getBytes(StandardCharsets.UTF_8);
- jsonBytesTotalSize += bytesJson.length;
- protoBytesTotalSize += bytesProto.length;
- jsonBytesMinSize = Math.min(jsonBytesMinSize, bytesJson.length);
- protoBytesMinSize = Math.min(protoBytesMinSize, bytesProto.length);
- jsonBytesMaxSize = Math.max(jsonBytesMaxSize, bytesJson.length);
- protoBytesMaxSize = Math.max(protoBytesMaxSize, bytesProto.length);
- totalFieldCount += fieldCount;
- minFieldCount = Math.min(minFieldCount, fieldCount);
- maxFieldCount = Math.max(maxFieldCount, fieldCount);
-
- Map<String, Object> rstMap = messageConverter.converter(bytesProto);
- Preconditions.checkArgument(test.objEquals(map, rstMap, true), "\n" + JSON.toJSONString(new TreeMap<>(map)) + "\n" + JSON.toJSONString(new TreeMap<>(rstMap)));
- Preconditions.checkArgument(JSON.toJSONString(new TreeMap<>(map)).equals(JSON.toJSONString(new TreeMap<>(rstMap))), "\n" + JSON.toJSONString(new TreeMap<>(map)) + "\n" + JSON.toJSONString(new TreeMap<>(rstMap)));
- count++;
-
- for (int i = 0; i < compressionTypes.length; i++) {
- CompressionType compressionType = compressionTypes[i];
- ByteBufferOutputStream bufferStream = new ByteBufferOutputStream(1024 * 16);
- OutputStream outputStream = compressionType.wrapForOutput(bufferStream, (byte) 2);
- outputStream.write(bytesJson);
- outputStream.close();
- int jsonCompressSize = bufferStream.position();
-
- bufferStream = new ByteBufferOutputStream(1024 * 16);
- outputStream = compressionType.wrapForOutput(bufferStream, (byte) 2);
- outputStream.write(bytesProto);
- outputStream.close();
- int protoCompressSize = bufferStream.position();
-
- compressionBytesSize[i][0] += jsonCompressSize;
- compressionBytesSize[i][1] += protoCompressSize;
- compressionBytesSize[i][2] = Math.min(compressionBytesSize[i][2], jsonCompressSize);
- compressionBytesSize[i][3] = Math.min(compressionBytesSize[i][3], protoCompressSize);
- compressionBytesSize[i][4] = Math.max(compressionBytesSize[i][4], jsonCompressSize);
- compressionBytesSize[i][5] = Math.max(compressionBytesSize[i][5], protoCompressSize);
- }
-
- }
- System.out.println(String.format("count:%d, avgFieldCount:%d, minFieldCount:%d, maxFieldCount:%d, jsonBytesAvgSize:%d, protoBytesAvgSize:%d, jsonBytesMinSize:%d, protoBytesMinSize:%d, jsonBytesMaxSize:%d, protoBytesMaxSize:%d",
- count, totalFieldCount/count, minFieldCount, maxFieldCount, jsonBytesTotalSize/count, protoBytesTotalSize/count,
- jsonBytesMinSize, protoBytesMinSize, jsonBytesMaxSize, protoBytesMaxSize));
- for (int i = 0; i < compressionTypes.length; i++) {
- CompressionType compressionType = compressionTypes[i];
- System.out.println(String.format("compression(%s): count:%d, jsonBytesAvgSize:%d, protoBytesAvgSize:%d, avgRatio:%.2f, jsonBytesMinSize:%d, protoBytesMinSize:%d, minRatio:%.2f, jsonBytesMaxSize:%d, protoBytesMaxSize:%d, maxRatio:%.2f",
- compressionType, count, compressionBytesSize[i][0]/count, compressionBytesSize[i][1]/count, (((double)compressionBytesSize[i][1])/count)/(compressionBytesSize[i][0]/count),
- compressionBytesSize[i][2], compressionBytesSize[i][3], ((double)compressionBytesSize[i][3])/(compressionBytesSize[i][2]),
- compressionBytesSize[i][4], compressionBytesSize[i][5], ((double)compressionBytesSize[i][5])/(compressionBytesSize[i][4])));
- }
- }
-
- @Test
- public void testArrayInstance() throws Exception{
- Object bytes = new byte[]{1, 2, 3, 4, 5};
- Object ints = new int[]{1, 2, 3, 4, 5};
-
- System.out.println(bytes.getClass().isArray());
- System.out.println(bytes instanceof byte[]);
- System.out.println(bytes instanceof int[]);
- System.out.println(ints.getClass().isArray());
- System.out.println(ints instanceof byte[]);
- System.out.println(ints instanceof int[]);
- }
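For reference, these checks print `true, true, false` for `bytes` and `true, false, true` for `ints`: `getClass().isArray()` is true for any array, while `instanceof` also discriminates the component type.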
-
- private boolean objEquals(Object value1, Object value2, boolean numConvert){
- if(value1 == null){
- if(value1 != value2){
- return false;
- }
- }else if(value2 == null){
- return false;
- }else if(value1 instanceof Map){
- if(!mapEquals((Map<String, Object>) value1, (Map<String, Object>) value2, numConvert)){
- return false;
- }
- }else if(value1 instanceof List){
- if(!listEquals((List<Object>) value1, (List<Object>) value2, numConvert)){
- return false;
- }
- }else if(value1 instanceof byte[]){
- if(!Arrays.equals((byte[]) value1, (byte[]) value2)){
- return false;
- }
- }
- else{
- if(value1.getClass() != value2.getClass() || !value1.equals(value2)){
- if(numConvert && value1 instanceof Number && value2 instanceof Number && ((Number) value1).longValue() == ((Number) value2).longValue()){
- // numeric values match after widening to long; treat as equal despite different boxed types
- }else{
- return false;
- }
- }
- }
- return true;
- }
- private boolean mapEquals(Map<String, Object> map1, Map<String, Object> map2, boolean numConvert){
- if(map1.size() != map2.size()){
- return false;
- }
-
- for (Map.Entry<String, Object> entry : map1.entrySet()) {
- Object value1 = entry.getValue();
- Object value2 = map2.get(entry.getKey());
- if(!objEquals(value1, value2, numConvert)){
- return false;
- }
- }
-
- return true;
- }
-
- private boolean listEquals(List<Object> list1, List<Object> list2, boolean numConvert){
- if(list1.size() != list2.size()){
- return false;
- }
-
- for (int i = 0; i < list1.size(); i++) {
- if(!objEquals(list1.get(i), list2.get(i), numConvert)){
- return false;
- }
- }
-
- return true;
- }
-}
+package com.geedgenetworks.formats.protobuf;
+
+import com.alibaba.fastjson2.JSON;
+import com.geedgenetworks.shaded.com.google.protobuf.ByteString;
+import com.geedgenetworks.shaded.com.google.protobuf.Descriptors;
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.io.LineIterator;
+import org.apache.flink.util.Preconditions;
+import org.apache.kafka.common.record.CompressionType;
+import org.apache.kafka.common.utils.ByteBufferOutputStream;
+import org.junit.jupiter.api.Test;
+import com.geedgenetworks.formats.protobuf.SchemaConverters.MessageConverter;
+
+import java.io.FileInputStream;
+import java.io.OutputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.*;
+import java.util.concurrent.ThreadLocalRandom;
+
+import static org.junit.jupiter.api.Assertions.*;
+
+/**
+ * protoc --descriptor_set_out=proto3_types.desc --java_out=./ proto3_types.proto
+ * protoc --descriptor_set_out=session_record_test.desc session_record_test.proto
+ *
+ */
+public class ProtobufEventSchemaTest {
+
+ public static class InputDatas{
+ Proto3TypesProtos.Proto3Types msg;
+ Proto3TypesProtos.StructMessage subMsg1;
+ Proto3TypesProtos.StructMessage subMsg2;
+ Map<String, Object> map;
+ Map<String, Object> subMap1;
+ Map<String, Object> subMap2;
+ }
+
+ public static InputDatas geneInputDatas(){
+ ThreadLocalRandom random = ThreadLocalRandom.current();
+ Proto3TypesProtos.Proto3Types.Builder msgBuilder = Proto3TypesProtos.Proto3Types.newBuilder();
+ Map<String, Object> map = new HashMap<>();
+ Proto3TypesProtos.StructMessage.Builder subMsgBuilder1 = Proto3TypesProtos.StructMessage.newBuilder();
+ Proto3TypesProtos.StructMessage.Builder subMsgBuilder2 = Proto3TypesProtos.StructMessage.newBuilder();
+ Map<String, Object> subMap1 = new HashMap<>();
+ Map<String, Object> subMap2 = new HashMap<>();
+
+ long int64 = random.nextLong(1, Long.MAX_VALUE);
+ msgBuilder.setInt64(int64);
+ map.put("int64", int64);
+
+ int int32 = random.nextInt(1, Integer.MAX_VALUE);
+ msgBuilder.setInt32(int32);
+ map.put("int32", int32);
+
+ String text = "ut8字符串";
+ msgBuilder.setText(text);
+ map.put("text", text);
+
+ byte[] bytes = new byte[]{1, 2, 3, 4, 5};
+ msgBuilder.setBytes(ByteString.copyFrom(bytes));
+ map.put("bytes", bytes);
+
+ int enum_val = 1;
+ msgBuilder.setEnumValValue(enum_val);
+ map.put("enum_val", enum_val);
+
+ // subMsg start
+ long id = random.nextLong(1, Long.MAX_VALUE);
+ subMsgBuilder1.setId(id);
+ subMap1.put("id", id);
+
+ String name = "ut8字符串1";
+ subMsgBuilder1.setName(name);
+ subMap1.put("name", name);
+
+ int age = random.nextInt(1, Integer.MAX_VALUE);
+ subMsgBuilder1.setAge(age);
+ subMap1.put("age", age);
+
+ double score = random.nextDouble(1, Integer.MAX_VALUE);
+ subMsgBuilder1.setScore(score);
+ subMap1.put("score", score);
+
+ long optional_id = random.nextLong(1, Long.MAX_VALUE);
+ subMsgBuilder1.setOptionalId(optional_id);
+ subMap1.put("optional_id", optional_id);
+
+ int optional_age = random.nextInt(1, Integer.MAX_VALUE);
+ subMsgBuilder1.setOptionalAge(optional_age);
+ subMap1.put("optional_age", optional_age);
+
+ id = random.nextLong(1, Long.MAX_VALUE);
+ subMsgBuilder2.setId(id);
+ subMap2.put("id", id);
+
+ name = "ut8字符串1";
+ subMsgBuilder2.setName(name);
+ subMap2.put("name", name);
+
+ age = random.nextInt(1, Integer.MAX_VALUE);
+ subMsgBuilder2.setAge(age);
+ subMap2.put("age", age);
+
+ score = random.nextDouble(1, Integer.MAX_VALUE);
+ subMsgBuilder2.setScore(score);
+ subMap2.put("score", score);
+
+ optional_id = random.nextLong(1, Long.MAX_VALUE);
+ subMsgBuilder2.setOptionalId(optional_id);
+ subMap2.put("optional_id", optional_id);
+
+ optional_age = random.nextInt(1, Integer.MAX_VALUE);
+ subMsgBuilder2.setOptionalAge(optional_age);
+ subMap2.put("optional_age", optional_age);
+
+ Proto3TypesProtos.StructMessage subMsg1 = subMsgBuilder1.build();
+ Proto3TypesProtos.StructMessage subMsg2 = subMsgBuilder2.build();
+ // subMsg end
+
+ msgBuilder.setMessage(subMsg1);
+ map.put("message", subMap1);
+
+ long optional_int64 = random.nextLong(1, Long.MAX_VALUE);
+ msgBuilder.setOptionalInt64(optional_int64);
+ map.put("optional_int64", optional_int64);
+
+ int optional_int32 = random.nextInt(1, Integer.MAX_VALUE);
+ msgBuilder.setOptionalInt32(optional_int32);
+ map.put("optional_int32", optional_int32);
+
+ String optional_text = "ut8字符串";
+ msgBuilder.setOptionalText(optional_text);
+ map.put("optional_text", optional_text);
+
+ byte[] optional_bytes = new byte[]{1, 2, 3, 4, 5};
+ msgBuilder.setOptionalBytes(ByteString.copyFrom(optional_bytes));
+ map.put("optional_bytes", optional_bytes);
+
+ int optional_enum_val = 1;
+ msgBuilder.setOptionalEnumValValue(optional_enum_val);
+ map.put("optional_enum_val", optional_enum_val);
+
+ msgBuilder.setOptionalMessage(subMsg2);
+ map.put("optional_message", subMap2);
+
+ List<Long> repeated_int64 = Arrays.asList(1L, 3L, 5L);
+ msgBuilder.addAllRepeatedInt64(repeated_int64);
+ map.put("repeated_int64", repeated_int64);
+
+ List<Integer> repeated_int32 = Arrays.asList(1, 3, 5);
+ msgBuilder.addAllRepeatedInt32(repeated_int32);
+ map.put("repeated_int32", repeated_int32);
+
+ msgBuilder.addAllRepeatedMessage(Arrays.asList(subMsg1, subMsg2));
+ map.put("repeated_message", Arrays.asList(subMap1, subMap2));
+
+ InputDatas datas = new InputDatas();
+ datas.msg = msgBuilder.build();
+ datas.subMsg1 = subMsg1;
+ datas.subMsg2 = subMsg2;
+ datas.map = map;
+ datas.subMap1 = subMap1;
+ datas.subMap2 = subMap2;
+ return datas;
+ }
+
+ public static InputDatas geneInputDatasDefaultValue(){
+ ThreadLocalRandom random = ThreadLocalRandom.current();
+ Proto3TypesProtos.Proto3Types.Builder msgBuilder = Proto3TypesProtos.Proto3Types.newBuilder();
+ Map<String, Object> map = new HashMap<>();
+ Proto3TypesProtos.StructMessage.Builder subMsgBuilder1 = Proto3TypesProtos.StructMessage.newBuilder();
+ Proto3TypesProtos.StructMessage.Builder subMsgBuilder2 = Proto3TypesProtos.StructMessage.newBuilder();
+ Map<String, Object> subMap1 = new HashMap<>();
+ Map<String, Object> subMap2 = new HashMap<>();
+
+ long int64 = 0;
+ msgBuilder.setInt64(int64);
+ map.put("int64", int64);
+
+ int int32 = 0;
+ msgBuilder.setInt32(int32);
+ map.put("int32", int32);
+
+ String text = "";
+ msgBuilder.setText(text);
+ map.put("text", text);
+
+ byte[] bytes = new byte[]{};
+ msgBuilder.setBytes(ByteString.copyFrom(bytes));
+ map.put("bytes", bytes);
+
+ int enum_val = 0;
+ msgBuilder.setEnumValValue(enum_val);
+ map.put("enum_val", enum_val);
+
+ // subMsg start
+ long id = 0;
+ subMsgBuilder1.setId(id);
+ subMap1.put("id", id);
+
+ String name = "";
+ subMsgBuilder1.setName(name);
+ subMap1.put("name", name);
+
+ int age = 0;
+ subMsgBuilder1.setAge(age);
+ subMap1.put("age", age);
+
+ double score = 0;
+ subMsgBuilder1.setScore(score);
+ subMap1.put("score", score);
+
+ long optional_id = 0;
+ subMsgBuilder1.setOptionalId(optional_id);
+ subMap1.put("optional_id", optional_id);
+
+ int optional_age = 0;
+ /*subMsgBuilder1.setOptionalAge(optional_age);
+ subMap1.put("optional_age", optional_age);*/
+
+ id = 0;
+ subMsgBuilder2.setId(id);
+ subMap2.put("id", id);
+
+ name = "";
+ subMsgBuilder2.setName(name);
+ subMap2.put("name", name);
+
+ age = 0;
+ subMsgBuilder2.setAge(age);
+ subMap2.put("age", age);
+
+ score = 0;
+ subMsgBuilder2.setScore(score);
+ subMap2.put("score", score);
+
+ optional_id = 0;
+ subMsgBuilder2.setOptionalId(optional_id);
+ subMap2.put("optional_id", optional_id);
+
+ optional_age = 0;
+ subMsgBuilder2.setOptionalAge(optional_age);
+ subMap2.put("optional_age", optional_age);
+
+ Proto3TypesProtos.StructMessage subMsg1 = subMsgBuilder1.build();
+ Proto3TypesProtos.StructMessage subMsg2 = subMsgBuilder2.build();
+ // subMsg end
+
+ msgBuilder.setMessage(subMsg1);
+ map.put("message", subMap1);
+
+ long optional_int64 = 0;
+ msgBuilder.setOptionalInt64(optional_int64);
+ map.put("optional_int64", optional_int64);
+
+ int optional_int32 = 0;
+ msgBuilder.setOptionalInt32(optional_int32);
+ map.put("optional_int32", optional_int32);
+
+ String optional_text = "";
+ msgBuilder.setOptionalText(optional_text);
+ map.put("optional_text", optional_text);
+
+ byte[] optional_bytes = new byte[]{};
+ msgBuilder.setOptionalBytes(ByteString.copyFrom(optional_bytes));
+ map.put("optional_bytes", optional_bytes);
+
+ int optional_enum_val = 0;
+ msgBuilder.setOptionalEnumValValue(optional_enum_val);
+ map.put("optional_enum_val", optional_enum_val);
+
+ msgBuilder.setOptionalMessage(subMsg2);
+ map.put("optional_message", subMap2);
+
+ List<Long> repeated_int64 = Arrays.asList();
+ msgBuilder.addAllRepeatedInt64(repeated_int64);
+ map.put("repeated_int64", repeated_int64);
+
+ List<Integer> repeated_int32 = Arrays.asList();
+ msgBuilder.addAllRepeatedInt32(repeated_int32);
+ map.put("repeated_int32", repeated_int32);
+
+ msgBuilder.addAllRepeatedMessage(Arrays.asList());
+ map.put("repeated_message", Arrays.asList());
+
+ InputDatas datas = new InputDatas();
+ datas.msg = msgBuilder.build();
+ datas.subMsg1 = subMsg1;
+ datas.subMsg2 = subMsg2;
+ datas.map = map;
+ datas.subMap1 = subMap1;
+ datas.subMap2 = subMap2;
+ return datas;
+ }
+
+ public static InputDatas geneInputDatasUsePartialField(){
+ ThreadLocalRandom random = ThreadLocalRandom.current();
+ Proto3TypesProtos.Proto3Types.Builder msgBuilder = Proto3TypesProtos.Proto3Types.newBuilder();
+ Map<String, Object> map = new HashMap<>();
+ Proto3TypesProtos.StructMessage.Builder subMsgBuilder1 = Proto3TypesProtos.StructMessage.newBuilder();
+ Proto3TypesProtos.StructMessage.Builder subMsgBuilder2 = Proto3TypesProtos.StructMessage.newBuilder();
+ Map<String, Object> subMap1 = new HashMap<>();
+ Map<String, Object> subMap2 = new HashMap<>();
+
+ /*long int64 = random.nextLong(1, Long.MAX_VALUE);
+ msgBuilder.setInt64(int64);
+ map.put("int64", int64);*/
+
+ int int32 = random.nextInt(1, Integer.MAX_VALUE);
+ msgBuilder.setInt32(int32);
+ map.put("int32", int32);
+
+ String text = "ut8字符串";
+ msgBuilder.setText(text);
+ map.put("text", text);
+
+ /*byte[] bytes = new byte[]{1, 2, 3, 4, 5};
+ msgBuilder.setBytes(ByteString.copyFrom(bytes));
+ map.put("bytes", bytes);*/
+
+ /*int enum_val = 1;
+ msgBuilder.setEnumValValue(enum_val);
+ map.put("enum_val", enum_val);*/
+
+ // subMsg start
+ long id = random.nextLong(1, Long.MAX_VALUE);
+ subMsgBuilder1.setId(id);
+ subMap1.put("id", id);
+
+ String name = "ut8字符串1";
+ /*subMsgBuilder1.setName(name);
+ subMap1.put("name", name);*/
+
+ int age = random.nextInt(1, Integer.MAX_VALUE);
+ subMsgBuilder1.setAge(age);
+ subMap1.put("age", age);
+
+ double score = random.nextDouble(1, Integer.MAX_VALUE);
+ /*subMsgBuilder1.setScore(score);
+ subMap1.put("score", score);*/
+
+ long optional_id = random.nextLong(1, Long.MAX_VALUE);
+ subMsgBuilder1.setOptionalId(optional_id);
+ subMap1.put("optional_id", optional_id);
+
+ int optional_age = random.nextInt(1, Integer.MAX_VALUE);
+ /*subMsgBuilder1.setOptionalAge(optional_age);
+ subMap1.put("optional_age", optional_age);*/
+
+ id = random.nextLong(1, Long.MAX_VALUE);
+ /*subMsgBuilder2.setId(id);
+ subMap2.put("id", id);*/
+
+ name = "ut8字符串1";
+ subMsgBuilder2.setName(name);
+ subMap2.put("name", name);
+
+ age = random.nextInt(1, Integer.MAX_VALUE);
+ /*subMsgBuilder2.setAge(age);
+ subMap2.put("age", age);*/
+
+ score = random.nextDouble(1, Integer.MAX_VALUE);
+ subMsgBuilder2.setScore(score);
+ subMap2.put("score", score);
+
+ optional_id = random.nextLong(1, Long.MAX_VALUE);
+ /*subMsgBuilder2.setOptionalId(optional_id);
+ subMap2.put("optional_id", optional_id);*/
+
+ optional_age = random.nextInt(1, Integer.MAX_VALUE);
+ subMsgBuilder2.setOptionalAge(optional_age);
+ subMap2.put("optional_age", optional_age);
+
+ Proto3TypesProtos.StructMessage subMsg1 = subMsgBuilder1.build();
+ Proto3TypesProtos.StructMessage subMsg2 = subMsgBuilder2.build();
+ // subMsg end
+
+ /*msgBuilder.setMessage(subMsg1);
+ map.put("message", subMap1);*/
+
+ long optional_int64 = random.nextLong(1, Long.MAX_VALUE);
+ msgBuilder.setOptionalInt64(optional_int64);
+ map.put("optional_int64", optional_int64);
+
+ /*int optional_int32 = random.nextInt(1, Integer.MAX_VALUE);
+ msgBuilder.setOptionalInt32(optional_int32);
+ map.put("optional_int32", optional_int32);*/
+
+ String optional_text = "ut8字符串";
+ msgBuilder.setOptionalText(optional_text);
+ map.put("optional_text", optional_text);
+
+ /*byte[] optional_bytes = new byte[]{1, 2, 3, 4, 5};
+ msgBuilder.setOptionalBytes(ByteString.copyFrom(optional_bytes));
+ map.put("optional_bytes", optional_bytes);*/
+
+ int optional_enum_val = 1;
+ msgBuilder.setOptionalEnumValValue(optional_enum_val);
+ map.put("optional_enum_val", optional_enum_val);
+
+ msgBuilder.setOptionalMessage(subMsg2);
+ map.put("optional_message", subMap2);
+
+ /*List<Long> repeated_int64 = Arrays.asList(1L, 3L, 5L);
+ msgBuilder.addAllRepeatedInt64(repeated_int64);
+ map.put("repeated_int64", repeated_int64);*/
+
+ List<Integer> repeated_int32 = Arrays.asList(1, 3, 5);
+ msgBuilder.addAllRepeatedInt32(repeated_int32);
+ map.put("repeated_int32", repeated_int32);
+
+ msgBuilder.addAllRepeatedMessage(Arrays.asList(subMsg1, subMsg2));
+ map.put("repeated_message", Arrays.asList(subMap1, subMap2));
+
+ InputDatas datas = new InputDatas();
+ datas.msg = msgBuilder.build();
+ datas.subMsg1 = subMsg1;
+ datas.subMsg2 = subMsg2;
+ datas.map = map;
+ datas.subMap1 = subMap1;
+ datas.subMap2 = subMap2;
+ return datas;
+ }
+
+ @Test
+ public void testSerializeAndDeserialize() throws Exception{
+ String path = getClass().getResource("/proto3_types.desc").getPath();
+ Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"Proto3Types");
+ InputDatas inputDatas = geneInputDatas();
+
+ byte[] bytesSerByApi = inputDatas.msg.toByteArray();
+
+ ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
+ byte[] bytesSer = serializer.serialize(inputDatas.map);
+
+ System.out.println(String.format("built-in ser bytes size:%d\nmy ser bytes size:%d", bytesSerByApi.length, bytesSer.length));
+ assertArrayEquals(bytesSerByApi, bytesSer);
+
+ MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
+ Map<String, Object> rstMap = messageConverter.converter(bytesSer);
+
+ assertTrue(objEquals(inputDatas.map, rstMap, false), () -> "\n" + inputDatas.map.toString() + "\n" + rstMap.toString());
+ System.out.println(inputDatas.map.toString());
+ System.out.println(rstMap.toString());
+ System.out.println(JSON.toJSONString(inputDatas.map));
+ System.out.println(JSON.toJSONString(rstMap));
+
+ System.out.println(JSON.toJSONString(inputDatas.map).equals(JSON.toJSONString(rstMap)));
+ }
+
+ @Test
+ public void testSerializeAndDeserializeDefaultValue() throws Exception{
+ String path = getClass().getResource("/proto3_types.desc").getPath();
+ Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"Proto3Types");
+ InputDatas inputDatas = geneInputDatasDefaultValue();
+
+ byte[] bytesSerByApi = inputDatas.msg.toByteArray();
+
+ ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
+ byte[] bytesSer = serializer.serialize(inputDatas.map);
+
+ System.out.println(String.format("built-in ser bytes size:%d\nmy ser bytes size:%d", bytesSerByApi.length, bytesSer.length));
+ assertArrayEquals(bytesSerByApi, bytesSer);
+
+ MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
+ Map<String, Object> rstMap = messageConverter.converter(bytesSer);
+ messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), true);
+ Map<String, Object> rstMapEmitDefaultValue = messageConverter.converter(bytesSer);
+
+ // message fields are emitted when non-null, arrays only when length > 0, optional fields once set, and optional bytes even when length is 0
+ System.out.println(inputDatas.map.toString());
+ System.out.println(rstMap.toString());
+ System.out.println(rstMapEmitDefaultValue.toString());
+ System.out.println(JSON.toJSONString(inputDatas.map));
+ System.out.println(JSON.toJSONString(rstMap));
+ System.out.println(JSON.toJSONString(rstMapEmitDefaultValue));
+
+ System.out.println(JSON.toJSONString(inputDatas.map).equals(JSON.toJSONString(rstMap)));
+ }
+
+ @Test
+ public void testSerializeAndDeserializeUsePartialField() throws Exception{
+ String path = getClass().getResource("/proto3_types.desc").getPath();
+ Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"Proto3Types");
+ InputDatas inputDatas = geneInputDatasUsePartialField();
+
+ byte[] bytesSerByApi = inputDatas.msg.toByteArray();
+
+ ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
+ byte[] bytesSer = serializer.serialize(inputDatas.map);
+ System.out.println(Base64.getEncoder().encodeToString(bytesSer));
+
+ System.out.println(String.format("built-in ser bytes size:%d\nmy ser bytes size:%d", bytesSerByApi.length, bytesSer.length));
+ assertArrayEquals(bytesSerByApi, bytesSer);
+
+ MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
+ Map<String, Object> rstMap = messageConverter.converter(bytesSer);
+
+ assertTrue(objEquals(inputDatas.map, rstMap, false), () -> "\n" + inputDatas.map.toString() + "\n" + rstMap.toString());
+ System.out.println(inputDatas.map.toString());
+ System.out.println(rstMap.toString());
+ System.out.println(JSON.toJSONString(inputDatas.map));
+ System.out.println(JSON.toJSONString(rstMap));
+
+ System.out.println(JSON.toJSONString(inputDatas.map).equals(JSON.toJSONString(rstMap)));
+ }
+
+ @Test
+ public void testSerializeAndDeserializeSessionRecord() throws Exception{
+ String path = getClass().getResource("/session_record_test.desc").getPath();
+ Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"SessionRecord");
+ String json = "{\"recv_time\": 1704350600, \"log_id\": 185826449998479360, \"decoded_as\": \"BASE\", \"session_id\": 290502878495441820, \"start_timestamp_ms\": 1704350566378, \"end_timestamp_ms\": 1704350570816, \"duration_ms\": 4438, \"tcp_handshake_latency_ms\": 1105, \"ingestion_time\": 1704350600, \"processing_time\": 1704350600, \"device_id\": \"21426003\", \"out_link_id\": 65535, \"in_link_id\": 65535, \"device_tag\": \"{\\\"tags\\\":[{\\\"tag\\\":\\\"data_center\\\",\\\"value\\\":\\\"center-xxg-9140\\\"},{\\\"tag\\\":\\\"device_group\\\",\\\"value\\\":\\\"group-xxg-9140\\\"}]}\", \"data_center\": \"center-xxg-9140\", \"device_group\": \"group-xxg-9140\", \"sled_ip\": \"192.168.40.81\", \"address_type\": 4, \"vsys_id\": 1, \"t_vsys_id\": 1, \"flags\": 24592, \"flags_identify_info\": \"[1,1,2]\", \"statistics_rule_list\": [406583], \"client_ip\": \"192.56.151.80\", \"client_port\": 62241, \"client_os_desc\": \"Windows\", \"client_geolocation\": \"\\u7f8e\\u56fd.Unknown.Unknown..\", \"server_ip\": \"192.56.222.93\", \"server_port\": 14454, \"server_os_desc\": \"Linux\", \"server_geolocation\": \"\\u7f8e\\u56fd.Unknown.Unknown..\", \"ip_protocol\": \"tcp\", \"decoded_path\": \"ETHERNET.IPv4.TCP\", \"sent_pkts\": 4, \"received_pkts\": 5, \"sent_bytes\": 246, \"received_bytes\": 1809, \"tcp_rtt_ms\": 128, \"tcp_client_isn\": 568305009, \"tcp_server_isn\": 4027331180, \"in_src_mac\": \"a2:fa:dc:56:c7:b3\", \"out_src_mac\": \"48:73:97:96:38:20\", \"in_dest_mac\": \"48:73:97:96:38:20\", \"out_dest_mac\": \"a2:fa:dc:56:c7:b3\"}";
+ Map<String, Object> map = JSON.parseObject(json);
+
+ ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
+ byte[] bytesSer = serializer.serialize(map);
+ System.out.println(Base64.getEncoder().encodeToString(bytesSer));
+
+ System.out.println(String.format("my ser bytes size:%d", bytesSer.length));
+
+ MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
+ Map<String, Object> rstMap = messageConverter.converter(bytesSer);
+
+ assertTrue(objEquals(map, rstMap, true), () -> "\n" + JSON.toJSONString(map) + "\n" + JSON.toJSONString(rstMap));
+ System.out.println(map.toString());
+ System.out.println(rstMap.toString());
+ System.out.println(JSON.toJSONString(new TreeMap<>(map)));
+ System.out.println(JSON.toJSONString(new TreeMap<>(rstMap)));
+
+ System.out.println(JSON.toJSONString(new TreeMap<>(map)).equals(JSON.toJSONString(new TreeMap<>(rstMap))));
+ }
+
+
+ public static void main(String[] args) throws Exception{
+ ProtobufEventSchemaTest test = new ProtobufEventSchemaTest();
+ String path = test.getClass().getResource("/session_record_test.desc").getPath();
+ Descriptors.Descriptor descriptor = ProtobufUtils.buildDescriptor(ProtobufUtils.readDescriptorFileContent(path),"SessionRecord");
+ MessageConverter messageConverter = new MessageConverter(descriptor, SchemaConverters.toStructType(descriptor), false);
+ ProtobufSerializer serializer = new ProtobufSerializer(descriptor);
+
+ FileInputStream inputStream = new FileInputStream( test.getClass().getResource("/format_protobuf_test_data.json").getPath());
+ LineIterator lines = IOUtils.lineIterator(inputStream, "utf-8");
+ int count = 0;
+ long jsonBytesTotalSize = 0;
+ long protoBytesTotalSize = 0;
+ long jsonBytesMinSize = Long.MAX_VALUE;
+ long protoBytesMinSize = Long.MAX_VALUE;
+ long jsonBytesMaxSize = 0;
+ long protoBytesMaxSize = 0;
+ long totalFieldCount = 0;
+ long minFieldCount = Long.MAX_VALUE;
+ long maxFieldCount = 0;
+
+ CompressionType[] compressionTypes = new CompressionType[]{
+ CompressionType.NONE, CompressionType.SNAPPY, CompressionType.LZ4, CompressionType.GZIP, CompressionType.ZSTD
+ };
+ long[][] compressionBytesSize = new long[compressionTypes.length][6];
+ // per compression type: [0]=json total, [1]=proto total, [2]=json min, [3]=proto min, [4]=json max, [5]=proto max
+ for (int i = 0; i < compressionBytesSize.length; i++) {
+ compressionBytesSize[i][0] = 0;
+ compressionBytesSize[i][1] = 0;
+ compressionBytesSize[i][2] = Long.MAX_VALUE;
+ compressionBytesSize[i][3] = Long.MAX_VALUE;
+ compressionBytesSize[i][4] = 0;
+ compressionBytesSize[i][5] = 0;
+ }
+
+ while (lines.hasNext()){
+ String line = lines.next().trim();
+ if(line.isEmpty()){
+ continue;
+ }
+
+ Map<String, Object> map = JSON.parseObject(line);
+ int fieldCount = map.size();
+ byte[] bytesProto = serializer.serialize(map);
+ byte[] bytesJson = JSON.toJSONString(map).getBytes(StandardCharsets.UTF_8);
+ jsonBytesTotalSize += bytesJson.length;
+ protoBytesTotalSize += bytesProto.length;
+ jsonBytesMinSize = Math.min(jsonBytesMinSize, bytesJson.length);
+ protoBytesMinSize = Math.min(protoBytesMinSize, bytesProto.length);
+ jsonBytesMaxSize = Math.max(jsonBytesMaxSize, bytesJson.length);
+ protoBytesMaxSize = Math.max(protoBytesMaxSize, bytesProto.length);
+ totalFieldCount += fieldCount;
+ minFieldCount = Math.min(minFieldCount, fieldCount);
+ maxFieldCount = Math.max(maxFieldCount, fieldCount);
+
+ Map<String, Object> rstMap = messageConverter.converter(bytesProto);
+ Preconditions.checkArgument(test.objEquals(map, rstMap, true), "\n" + JSON.toJSONString(new TreeMap<>(map)) + "\n" + JSON.toJSONString(new TreeMap<>(rstMap)));
+ Preconditions.checkArgument(JSON.toJSONString(new TreeMap<>(map)).equals(JSON.toJSONString(new TreeMap<>(rstMap))), "\n" + JSON.toJSONString(new TreeMap<>(map)) + "\n" + JSON.toJSONString(new TreeMap<>(rstMap)));
+ count++;
+
+ for (int i = 0; i < compressionTypes.length; i++) {
+ CompressionType compressionType = compressionTypes[i];
+ ByteBufferOutputStream bufferStream = new ByteBufferOutputStream(1024 * 16);
+ OutputStream outputStream = compressionType.wrapForOutput(bufferStream, (byte) 2);
+ outputStream.write(bytesJson);
+ outputStream.close();
+ int jsonCompressSize = bufferStream.position();
+
+ bufferStream = new ByteBufferOutputStream(1024 * 16);
+ outputStream = compressionType.wrapForOutput(bufferStream, (byte) 2);
+ outputStream.write(bytesProto);
+ outputStream.close();
+ int protoCompressSize = bufferStream.position();
+
+ compressionBytesSize[i][0] += jsonCompressSize;
+ compressionBytesSize[i][1] += protoCompressSize;
+ compressionBytesSize[i][2] = Math.min(compressionBytesSize[i][2], jsonCompressSize);
+ compressionBytesSize[i][3] = Math.min(compressionBytesSize[i][3], protoCompressSize);
+ compressionBytesSize[i][4] = Math.max(compressionBytesSize[i][4], jsonCompressSize);
+ compressionBytesSize[i][5] = Math.max(compressionBytesSize[i][5], protoCompressSize);
+ }
+
+ }
+ System.out.println(String.format("count:%d, avgFieldCount:%d, minFieldCount:%d, maxFieldCount:%d, jsonBytesAvgSize:%d, protoBytesAvgSize:%d, jsonBytesMinSize:%d, protoBytesMinSize:%d, jsonBytesMaxSize:%d, protoBytesMaxSize:%d",
+ count, totalFieldCount/count, minFieldCount, maxFieldCount, jsonBytesTotalSize/count, protoBytesTotalSize/count,
+ jsonBytesMinSize, protoBytesMinSize, jsonBytesMaxSize, protoBytesMaxSize));
+ for (int i = 0; i < compressionTypes.length; i++) {
+ CompressionType compressionType = compressionTypes[i];
+ System.out.println(String.format("compression(%s): count:%d, jsonBytesAvgSize:%d, protoBytesAvgSize:%d, avgRatio:%.2f, jsonBytesMinSize:%d, protoBytesMinSize:%d, minRatio:%.2f, jsonBytesMaxSize:%d, protoBytesMaxSize:%d, maxRatio:%.2f",
+ compressionType, count, compressionBytesSize[i][0]/count, compressionBytesSize[i][1]/count, (((double)compressionBytesSize[i][1])/count)/(compressionBytesSize[i][0]/count),
+ compressionBytesSize[i][2], compressionBytesSize[i][3], ((double)compressionBytesSize[i][3])/(compressionBytesSize[i][2]),
+ compressionBytesSize[i][4], compressionBytesSize[i][5], ((double)compressionBytesSize[i][5])/(compressionBytesSize[i][4])));
+ }
+ }
+
+ @Test
+ public void testArrayInstance() throws Exception{
+ Object bytes = new byte[]{1, 2, 3, 4, 5};
+ Object ints = new int[]{1, 2, 3, 4, 5};
+
+ System.out.println(bytes.getClass().isArray());
+ System.out.println(bytes instanceof byte[]);
+ System.out.println(bytes instanceof int[]);
+ System.out.println(ints.getClass().isArray());
+ System.out.println(ints instanceof byte[]);
+ System.out.println(ints instanceof int[]);
+ }
+
+ private boolean objEquals(Object value1, Object value2, boolean numConvert){
+ if(value1 == null){
+ if(value1 != value2){
+ return false;
+ }
+ }else if(value2 == null){
+ return false;
+ }else if(value1 instanceof Map){
+ if(!mapEquals((Map<String, Object>) value1, (Map<String, Object>) value2, numConvert)){
+ return false;
+ }
+ }else if(value1 instanceof List){
+ if(!listEquals((List<Object>) value1, (List<Object>) value2, numConvert)){
+ return false;
+ }
+ }else if(value1 instanceof byte[]){
+ if(!Arrays.equals((byte[]) value1, (byte[]) value2)){
+ return false;
+ }
+ }
+ else{
+ if(value1.getClass() != value2.getClass() || !value1.equals(value2)){
+ if(numConvert && value1 instanceof Number && value2 instanceof Number && ((Number) value1).longValue() == ((Number) value2).longValue()){
+ // numeric values match after widening to long; treat as equal despite different boxed types
+ }else{
+ return false;
+ }
+ }
+ }
+ return true;
+ }
+ private boolean mapEquals(Map<String, Object> map1, Map<String, Object> map2, boolean numConvert){
+ if(map1.size() != map2.size()){
+ return false;
+ }
+
+ for (Map.Entry<String, Object> entry : map1.entrySet()) {
+ Object value1 = entry.getValue();
+ Object value2 = map2.get(entry.getKey());
+ if(!objEquals(value1, value2, numConvert)){
+ return false;
+ }
+ }
+
+ return true;
+ }
+
+ private boolean listEquals(List<Object> list1, List<Object> list2, boolean numConvert){
+ if(list1.size() != list2.size()){
+ return false;
+ }
+
+ for (int i = 0; i < list1.size(); i++) {
+ if(!objEquals(list1.get(i), list2.get(i), numConvert)){
+ return false;
+ }
+ }
+
+ return true;
+ }
+}
diff --git a/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactoryTest.java b/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactoryTest.java
index 72f4ac9..1359f85 100644
--- a/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactoryTest.java
+++ b/groot-formats/format-protobuf/src/test/java/com/geedgenetworks/formats/protobuf/ProtobufFormatFactoryTest.java
@@ -1,13 +1,13 @@
 package com.geedgenetworks.formats.protobuf;
 
 import com.alibaba.fastjson2.JSON;
-import com.geedgenetworks.core.connector.sink.SinkProvider;
-import com.geedgenetworks.core.connector.source.SourceProvider;
-import com.geedgenetworks.core.factories.FactoryUtil;
-import com.geedgenetworks.core.factories.SinkTableFactory;
-import com.geedgenetworks.core.factories.SourceTableFactory;
-import com.geedgenetworks.core.factories.TableFactory;
-import com.geedgenetworks.common.Event;
+import com.geedgenetworks.api.connector.sink.SinkProvider;
+import com.geedgenetworks.api.connector.sink.SinkTableFactory;
+import com.geedgenetworks.api.connector.source.SourceProvider;
+import com.geedgenetworks.api.connector.source.SourceTableFactory;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.factory.FactoryUtil;
+import com.geedgenetworks.api.factory.ConnectorFactory;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.streaming.api.datastream.DataStreamSink;
 import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
@@ -25,8 +25,9 @@ class ProtobufFormatFactoryTest {
 String path = ProtobufFormatFactoryTest.class.getResource("/proto3_types.desc").getPath();
 String messageName = "Proto3Types";
 
-SourceTableFactory tableFactory = FactoryUtil.discoverTableFactory(SourceTableFactory.class, "inline");
+SourceTableFactory tableFactory = FactoryUtil.discoverConnectorFactory(SourceTableFactory.class, "inline");
 Map<String, String> options = new HashMap<>();
+options.put("repeat.count", "3");
 options.put("data", Base64.getEncoder().encodeToString(inputDatas.msg.toByteArray()));
 options.put("type", "base64");
 options.put("format", "protobuf");
@@ -34,14 +35,14 @@ class ProtobufFormatFactoryTest {
 options.put("protobuf.message.name", messageName);
 
 Configuration configuration = Configuration.fromMap(options);
-TableFactory.Context context = new TableFactory.Context( null, options, configuration);
+ConnectorFactory.Context context = new ConnectorFactory.Context( null, options, configuration);
 SourceProvider sourceProvider = tableFactory.getSourceProvider(context);
 
-SinkTableFactory sinkTableFactory = FactoryUtil.discoverTableFactory(SinkTableFactory.class, "print");
+SinkTableFactory sinkTableFactory = FactoryUtil.discoverConnectorFactory(SinkTableFactory.class, "print");
 options = new HashMap<>();
 options.put("format", "json");
 configuration = Configuration.fromMap(options);
-context = new TableFactory.Context( null, options, configuration);
+context = new ConnectorFactory.Context( null, options, configuration);
 SinkProvider sinkProvider = sinkTableFactory.getSinkProvider(context);
 
 StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
diff --git a/groot-formats/format-protobuf/src/test/resources/format_protobuf_test_data.json b/groot-formats/format-protobuf/src/test/resources/format_protobuf_test_data.json
new file mode 100644
index 0000000..51dac53
--- /dev/null
+++ b/groot-formats/format-protobuf/src/test/resources/format_protobuf_test_data.json
@@ -0,0 +1 @@
+{"in_src_mac":"58:b3:8f:fa:3b:11","in_dest_mac":"48:73:97:96:38:27","out_src_mac":"48:73:97:96:38:27","out_dest_mac":"58:b3:8f:fa:3b:11","ip_protocol":"tcp","address_type":4,"client_ip":"192.168.32.110","server_ip":"180.163.210.217","client_port":54570,"server_port":8081,"tcp_client_isn":3530397760,"tcp_server_isn":1741812485,"tcp_rtt_ms":28,"tcp_handshake_latency_ms":28,"direction":"Outbound","in_link_id":29,"out_link_id":29,"start_timestamp_ms":1731167469371,"end_timestamp_ms":1731167474466,"duration_ms":5095,"sent_pkts":6,"sent_bytes":572,"tcp_c2s_ip_fragments":0,"received_pkts":4,"tcp_s2c_ip_fragments":0,"received_bytes":266,"tcp_c2s_rtx_pkts":0,"tcp_c2s_rtx_bytes":0,"tcp_c2s_o3_pkts":0,"tcp_c2s_lost_bytes":0,"tcp_s2c_rtx_pkts":0,"tcp_s2c_rtx_bytes":0,"tcp_s2c_o3_pkts":0,"tcp_s2c_lost_bytes":0,"flags":28680,"flags_identify_info":[1,4,1,2],"app_transition":"unknown","decoded_as":"BASE","app_content":"unknown","app":"unknown","decoded_path":"ETHERNET.IPv4.TCP","client_country":"Private Network","server_country":"CN","app_category":"networking","server_asn":4812,"c2s_ttl":127,"s2c_ttl":47,"t_vsys_id":1,"vsys_id":1,"statistics_rule_list":[7731,7689,7532,7531,7372],"session_id":290530145510806375,"client_os_desc":"Windows","server_os_desc":"Linux","data_center":"XXG-TSG-BJ","device_group":"XXG-TSG-BJ","device_tag":"{\"tags\":[{\"tag\":\"data_center\",\"value\":\"XXG-TSG-BJ\"},{\"tag\":\"device_group\",\"value\":\"XXG-TSG-BJ\"}]}","device_id":"9800165603191146","sled_ip":"192.168.40.62","dup_traffic_flag":0,"sc_rule_list":[4303],"sc_rsp_raw":[2002],"encapsulation":"[{\"tunnels_schema_type\":\"MULTIPATH_ETHERNET\",\"c2s_source_mac\":\"48:73:97:96:38:27\",\"c2s_destination_mac\":\"58:b3:8f:fa:3b:11\",\"s2c_source_mac\":\"58:b3:8f:fa:3b:11\",\"s2c_destination_mac\":\"48:73:97:96:38:27\"}]","client_ip_tags":["Country Code:Private Network"],"server_ip_tags":["Country:China","ASN:4812","Country Code:CN"]}
\ No newline at end of file
diff --git a/groot-formats/format-raw/pom.xml b/groot-formats/format-raw/pom.xml
index 3433e64..11aa4d1 100644
--- a/groot-formats/format-raw/pom.xml
+++ b/groot-formats/format-raw/pom.xml
@@ -13,6 +13,30 @@
 <name>Groot : Formats : Format-Raw </name>
 
 <dependencies>
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-clients_${scala.version}</artifactId>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-table-planner-blink_${scala.version}</artifactId>
+ <scope>test</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-connector-kafka_${scala.version}</artifactId>
+ <version>${flink.version}</version>
+ <scope>test</scope>
+ </dependency>
 </dependencies>
 
 </project>
\ No newline at end of file
diff --git a/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventDeserializationSchema.java b/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventDeserializationSchema.java
index 14947d4..b299535 100644
--- a/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventDeserializationSchema.java
+++ b/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventDeserializationSchema.java
@@ -1,42 +1,42 @@
-package com.geedgenetworks.formats.raw;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.Types;
-import com.geedgenetworks.core.types.StructType;
-import org.apache.flink.api.common.serialization.DeserializationSchema;
-import org.apache.flink.api.common.typeinfo.TypeInformation;
-import org.apache.flink.util.Preconditions;
-
-import java.io.IOException;
-import java.util.HashMap;
-import java.util.Map;
-
-public class RawEventDeserializationSchema implements DeserializationSchema<Event> {
- private final StructType dataType;
- private final String name;
-
- public RawEventDeserializationSchema(StructType dataType) {
- Preconditions.checkArgument(dataType.fields.length == 1 && dataType.fields[0].dataType.equals(Types.BINARY), "schema must contain exactly one binary-type field");
- this.dataType = dataType;
- this.name = dataType.fields[0].name;
- }
-
- @Override
- public Event deserialize(byte[] message) throws IOException {
- Event event = new Event();
- Map<String, Object> map = new HashMap<>(8);
- map.put(name, message);
- event.setExtractedFields(map);
- return event;
- }
-
- @Override
- public boolean isEndOfStream(Event nextElement) {
- return false;
- }
-
- @Override
- public TypeInformation<Event> getProducedType() {
- return null;
- }
-}
+package com.geedgenetworks.formats.raw;
+
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+public class RawEventDeserializationSchema implements DeserializationSchema<Event> {
+    private final StructType dataType;
+    private final String name;
+
+    public RawEventDeserializationSchema(StructType dataType) {
+        Preconditions.checkArgument(dataType.fields.length == 1 && dataType.fields[0].dataType.equals(Types.BINARY), "must be exactly one binary type field");
+        this.dataType = dataType;
+        this.name = dataType.fields[0].name;
+    }
+
+    @Override
+    public Event deserialize(byte[] message) throws IOException {
+        Event event = new Event();
+        Map<String, Object> map = new HashMap<>(8);
+        map.put(name, message);
+        event.setExtractedFields(map);
+        return event;
+    }
+
+    @Override
+    public boolean isEndOfStream(Event nextElement) {
+        return false;
+    }
+
+    @Override
+    public TypeInformation<Event> getProducedType() {
+        return null;
+    }
+}
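> Reviewer note: the deserializer moved from the old `groot-core` types onto the new `groot-api` surface, but its behaviour is unchanged: it stores the incoming bytes, untouched, under the schema's single binary field. Below is a minimal usage sketch built only from the calls visible in this diff; the class name and `main` wrapper are illustrative, not part of the change.

```java
// Sketch only, not part of this merge request. Assumes the groot-api
// classes behave exactly as the diff above shows.
import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.api.connector.type.StructType;
import com.geedgenetworks.api.connector.type.StructType.StructField;
import com.geedgenetworks.api.connector.type.Types;
import com.geedgenetworks.formats.raw.RawEventDeserializationSchema;

public class RawDeserializeSketch {
    public static void main(String[] args) throws Exception {
        // The constructor requires exactly one BINARY field; "raw" mirrors
        // RawFormatFactory.DEFAULT_DATATYPE elsewhere in this change set.
        StructType dataType = new StructType(
                new StructField[]{new StructField("raw", Types.BINARY)});
        RawEventDeserializationSchema schema = new RawEventDeserializationSchema(dataType);

        Event event = schema.deserialize("hello".getBytes());
        // The payload lands under the single field's name, byte for byte.
        byte[] payload = (byte[]) event.getExtractedFields().get("raw");
        System.out.println(new String(payload)); // hello
    }
}
```

> One point worth flagging in review: `getProducedType()` returns `null` in both the old and new versions; presumably the surrounding runtime supplies the `TypeInformation` itself, since any caller that consulted this method would fail.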
diff --git a/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventSerializationSchema.java b/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventSerializationSchema.java
index 8dfbe41..c964a5c 100644
--- a/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventSerializationSchema.java
+++ b/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawEventSerializationSchema.java
@@ -1,25 +1,26 @@
-package com.geedgenetworks.formats.raw;
-
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
-import org.apache.flink.api.common.serialization.SerializationSchema;
-import org.apache.flink.util.Preconditions;
-
-public class RawEventSerializationSchema implements SerializationSchema<Event> {
- private final StructType dataType;
- private final String name;
-
- public RawEventSerializationSchema(StructType dataType) {
- Preconditions.checkArgument(dataType.fields.length == 1 && dataType.fields[0].dataType.equals(Types.BINARY), "must is one binary type field");
- this.dataType = dataType;
- this.name = dataType.fields[0].name;
- }
-
- @Override
- public byte[] serialize(Event element) {
- byte[] data = (byte[])element.getExtractedFields().get(name);
- return data;
- }
-
-}
+package com.geedgenetworks.formats.raw;
+
+
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.util.Preconditions;
+
+public class RawEventSerializationSchema implements SerializationSchema<Event> {
+    private final StructType dataType;
+    private final String name;
+
+    public RawEventSerializationSchema(StructType dataType) {
+        Preconditions.checkArgument(dataType.fields.length == 1 && dataType.fields[0].dataType.equals(Types.BINARY), "must be exactly one binary type field");
+        this.dataType = dataType;
+        this.name = dataType.fields[0].name;
+    }
+
+    @Override
+    public byte[] serialize(Event element) {
+        byte[] data = (byte[]) element.getExtractedFields().get(name);
+        return data;
+    }
+
+}
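> Reviewer note: the encoder is the mirror image and returns the single binary field as-is, so encode-after-decode is byte-identical. A round-trip sketch under the same assumptions as the previous sketch, using the `DEFAULT_DATATYPE` that `RawFormatFactory` declares below:

```java
// Round-trip sketch (illustrative only): serialize() hands back the single
// binary field untouched, so the raw format adds no framing or metadata.
import com.geedgenetworks.api.connector.event.Event;
import com.geedgenetworks.formats.raw.RawEventDeserializationSchema;
import com.geedgenetworks.formats.raw.RawEventSerializationSchema;
import com.geedgenetworks.formats.raw.RawFormatFactory;

import java.util.Arrays;

public class RawRoundTripSketch {
    public static void main(String[] args) throws Exception {
        byte[] original = "payload".getBytes();

        RawEventDeserializationSchema decoder =
                new RawEventDeserializationSchema(RawFormatFactory.DEFAULT_DATATYPE);
        RawEventSerializationSchema encoder =
                new RawEventSerializationSchema(RawFormatFactory.DEFAULT_DATATYPE);

        Event event = decoder.deserialize(original);
        byte[] restored = encoder.serialize(event);
        System.out.println(Arrays.equals(original, restored)); // true
    }
}
```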
diff --git a/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawFormatFactory.java b/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawFormatFactory.java
index 10e7b21..6e493bb 100644
--- a/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawFormatFactory.java
+++ b/groot-formats/format-raw/src/main/java/com/geedgenetworks/formats/raw/RawFormatFactory.java
@@ -1,14 +1,14 @@
 package com.geedgenetworks.formats.raw;
 
-import com.geedgenetworks.common.Event;
-import com.geedgenetworks.core.connector.format.DecodingFormat;
-import com.geedgenetworks.core.connector.format.EncodingFormat;
-import com.geedgenetworks.core.factories.DecodingFormatFactory;
-import com.geedgenetworks.core.factories.EncodingFormatFactory;
-import com.geedgenetworks.core.factories.TableFactory;
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.StructType.StructField;
-import com.geedgenetworks.core.types.Types;
+import com.geedgenetworks.api.connector.serialization.DecodingFormat;
+import com.geedgenetworks.api.connector.serialization.EncodingFormat;
+import com.geedgenetworks.api.connector.event.Event;
+import com.geedgenetworks.api.factory.DecodingFormatFactory;
+import com.geedgenetworks.api.factory.EncodingFormatFactory;
+import com.geedgenetworks.api.factory.ConnectorFactory;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.StructType.StructField;
+import com.geedgenetworks.api.connector.type.Types;
 import org.apache.flink.api.common.serialization.DeserializationSchema;
 import org.apache.flink.api.common.serialization.SerializationSchema;
 import org.apache.flink.configuration.ConfigOption;
@@ -22,12 +22,12 @@ public class RawFormatFactory implements DecodingFormatFactory, EncodingFormatFa
     public static final StructType DEFAULT_DATATYPE = new StructType(new StructField[]{new StructField("raw", Types.BINARY)});
 
     @Override
-    public String factoryIdentifier() {
+    public String type() {
         return IDENTIFIER;
     }
 
     @Override
-    public DecodingFormat createDecodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
+    public DecodingFormat createDecodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) {
         return new DecodingFormat(){
             @Override
             public DeserializationSchema<Event> createRuntimeDecoder(StructType dataType) {
@@ -41,7 +41,7 @@ public class RawFormatFactory implements DecodingFormatFactory, EncodingFormatFa
     }
 
     @Override
-    public EncodingFormat createEncodingFormat(TableFactory.Context context, ReadableConfig formatOptions) {
+    public EncodingFormat createEncodingFormat(ConnectorFactory.Context context, ReadableConfig formatOptions) {
         return new EncodingFormat() {
             @Override
             public SerializationSchema<Event> createRuntimeEncoder(StructType dataType) {
diff --git a/groot-formats/format-raw/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory b/groot-formats/format-raw/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
index fb82c79..f4523f2 100644
--- a/groot-formats/format-raw/src/main/resources/META-INF/services/com.geedgenetworks.core.factories.Factory
+++ b/groot-formats/format-raw/src/main/resources/META-INF/services/com.geedgenetworks.api.factory.Factory
@@ -1 +1 @@
-com.geedgenetworks.formats.raw.RawFormatFactory
+com.geedgenetworks.formats.raw.RawFormatFactory
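> Reviewer note: the file rename above is the heart of the SPI switch. For `java.util.ServiceLoader`, the service file's name must be the fully qualified interface, so moving from `com.geedgenetworks.core.factories.Factory` to `com.geedgenetworks.api.factory.Factory` changes the path while the single line inside, the implementation class, stays identical. A sketch of how such an entry is typically consumed; the nested `Factory` interface here is a hypothetical stand-in for the real `groot-api` one, inferred only from the `factoryIdentifier()` to `type()` rename in `RawFormatFactory`:

```java
// Sketch of ServiceLoader-based discovery, the mechanism behind
// META-INF/services files. The Factory interface below is a hypothetical
// stand-in for com.geedgenetworks.api.factory.Factory.
import java.util.ServiceLoader;

public class FactoryDiscoverySketch {

    public interface Factory {
        String type(); // the identifier the diff renames factoryIdentifier() to
    }

    public static Factory find(String type) {
        // ServiceLoader reads every META-INF/services/<interface FQN> file on
        // the classpath and instantiates the classes listed inside.
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            if (factory.type().equals(type)) { // "raw" would match RawFormatFactory
                return factory;
            }
        }
        throw new IllegalArgumentException("No factory found for type: " + type);
    }
}
```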
diff --git a/groot-formats/pom.xml b/groot-formats/pom.xml
index 31f15a1..5f20e42 100644
--- a/groot-formats/pom.xml
+++ b/groot-formats/pom.xml
@@ -21,24 +21,21 @@
     </modules>
 
     <dependencies>
+
         <dependency>
             <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-common</artifactId>
+            <artifactId>groot-api</artifactId>
             <version>${revision}</version>
             <scope>provided</scope>
         </dependency>
 
         <dependency>
             <groupId>com.geedgenetworks</groupId>
-            <artifactId>groot-core</artifactId>
+            <artifactId>groot-common</artifactId>
             <version>${revision}</version>
             <scope>provided</scope>
         </dependency>
-
-        <dependency>
-            <groupId>org.apache.flink</groupId>
-            <artifactId>flink-table-api-java-bridge_${scala.version}</artifactId>
-        </dependency>
     </dependencies>
diff --git a/groot-release/src/main/assembly/assembly-bin.xml b/groot-release/src/main/assembly/assembly-bin-deprecated.xml
index cea6575..cea6575 100644
--- a/groot-release/src/main/assembly/assembly-bin.xml
+++ b/groot-release/src/main/assembly/assembly-bin-deprecated.xml
diff --git a/groot-tests/pom.xml b/groot-tests/pom.xml
index b46ad10..47b9177 100644
--- a/groot-tests/pom.xml
+++ b/groot-tests/pom.xml
@@ -24,10 +24,10 @@
         <maven.deploy.skip>true</maven.deploy.skip>
         <maven-jar-plugin.version>2.4</maven-jar-plugin.version>
         <rest-assured.version>4.3.1</rest-assured.version>
-        <snappy-java.version>1.1.8.3</snappy-java.version>
     </properties>
 
     <dependencies>
+
         <dependency>
             <groupId>com.geedgenetworks</groupId>
             <artifactId>groot-bootstrap</artifactId>
diff --git a/groot-tests/test-e2e-clickhouse/pom.xml b/groot-tests/test-e2e-clickhouse/pom.xml
index aef4470..d575f15 100644
--- a/groot-tests/test-e2e-clickhouse/pom.xml
+++ b/groot-tests/test-e2e-clickhouse/pom.xml
@@ -79,7 +79,6 @@
         <dependency>
             <groupId>org.xerial.snappy</groupId>
             <artifactId>snappy-java</artifactId>
-            <version>${snappy-java.version}</version>
             <scope>test</scope>
         </dependency>
diff --git a/groot-tests/test-e2e-kafka/pom.xml b/groot-tests/test-e2e-kafka/pom.xml
index 4592f79..3d66b2a 100644
--- a/groot-tests/test-e2e-kafka/pom.xml
+++ b/groot-tests/test-e2e-kafka/pom.xml
@@ -39,7 +39,6 @@
         <dependency>
             <groupId>org.xerial.snappy</groupId>
             <artifactId>snappy-java</artifactId>
-            <version>${snappy-java.version}</version>
             <scope>test</scope>
         </dependency>
diff --git a/groot-tests/test-e2e-kafka/src/test/java/com/geedgenetworks/test/e2e/kafka/KafkaIT.java b/groot-tests/test-e2e-kafka/src/test/java/com/geedgenetworks/test/e2e/kafka/KafkaIT.java
index 50ff9d8..e60d34d 100644
--- a/groot-tests/test-e2e-kafka/src/test/java/com/geedgenetworks/test/e2e/kafka/KafkaIT.java
+++ b/groot-tests/test-e2e-kafka/src/test/java/com/geedgenetworks/test/e2e/kafka/KafkaIT.java
@@ -1,8 +1,8 @@
 package com.geedgenetworks.test.e2e.kafka;
 
-import com.geedgenetworks.core.types.StructType;
-import com.geedgenetworks.core.types.Types;
 import com.geedgenetworks.formats.json.JsonSerializer;
+import com.geedgenetworks.api.connector.type.StructType;
+import com.geedgenetworks.api.connector.type.Types;
 import com.geedgenetworks.test.common.TestResource;
 import com.geedgenetworks.test.common.TestSuiteBase;
 import com.geedgenetworks.test.common.container.TestContainer;
@@ -30,6 +30,7 @@
 import org.testcontainers.containers.Container;
 import org.testcontainers.containers.KafkaContainer;
 import org.testcontainers.containers.output.Slf4jLogConsumer;
 import org.testcontainers.lifecycle.Startables;;
+import org.testcontainers.shaded.org.apache.commons.lang3.RandomStringUtils;
 import org.testcontainers.utility.DockerImageName;
 import org.testcontainers.utility.DockerLoggerFactory;
 import org.testcontainers.utility.MountableFile;
@@ -153,7 +154,7 @@ public class KafkaIT extends TestSuiteBase implements TestResource {
         await().atMost(60000, TimeUnit.MILLISECONDS)
                 .untilAsserted(
                         () -> {
-                            data.addAll(getKafkaConsumerListData("test_sink_topic"));
+                            data.addAll(getKafkaConsumerListData("test_sink_topic", null));
                            Assertions.assertEquals(10, data.size()); // Check if all 10 records are consumed
                         });
 
@@ -162,8 +163,8 @@
     @TestTemplate
     public void testKafkaAsSinkProducerQuota(TestContainer container) throws IOException, InterruptedException {
-        //Create topic with 3 partitions
-        executeShell("kafka-topics --create --topic SESSION-RECORD-QUOTA-TEST --bootstrap-server kafkaCluster:9092 --partitions 3 --replication-factor 1 --command-config /etc/kafka/kafka_client_jass_cli.properties");
+        //Create topic with a single partition
+        executeShell("kafka-topics --create --topic SESSION-RECORD-QUOTA-TEST --bootstrap-server kafkaCluster:9092 --partitions 1 --replication-factor 1 --command-config /etc/kafka/kafka_client_jass_cli.properties");
 
         //Set producer quota to 2KB/s
         executeShell("kafka-configs --bootstrap-server kafkaCluster:9092 --alter --add-config 'producer_byte_rate=2048' --entity-type users --entity-name admin --entity-type clients --entity-name SESSION-RECORD-QUOTA-TEST --command-config /etc/kafka/kafka_client_jass_cli.properties ");
@@ -179,12 +180,12 @@ public class KafkaIT extends TestSuiteBase implements TestResource {
                 });
 
         List<String> data = Lists.newArrayList();
-        await().atMost(300000, TimeUnit.MILLISECONDS)
-                .untilAsserted(
-                        () -> {
-                            data.addAll(getKafkaConsumerListData("SESSION-RECORD-QUOTA-TEST"));
-                            Assertions.assertTrue(StringUtils.contains(container.getServerLogs(), "TimeoutException") && data.size()>100);
-                        });
+        await().atMost(600000, TimeUnit.MILLISECONDS)
+                .untilAsserted(
+                        () -> {
+                            data.addAll(getKafkaConsumerListData("SESSION-RECORD-QUOTA-TEST", "test-consume-group-quota" + RandomStringUtils.randomAlphabetic(5)));
+                            Assertions.assertTrue(StringUtils.contains(container.getServerLogs(), "TimeoutException") && data.size() > 100);
+                        });
     }
 
@@ -209,7 +210,7 @@ public class KafkaIT extends TestSuiteBase implements TestResource {
         await().atMost(60000, TimeUnit.MILLISECONDS)
                 .untilAsserted(
                         () -> {
-                            data.addAll(getKafkaConsumerListData("test_handle_error_json_format_topic"));
+                            data.addAll(getKafkaConsumerListData("test_handle_error_json_format_topic", null));
                             Assertions.assertTrue(StringUtils.contains(container.getServerLogs(), "UnsupportedOperationException"));
                             Assertions.assertEquals(0, data.size());
                         });
@@ -237,7 +238,7 @@ public class KafkaIT extends TestSuiteBase implements TestResource {
         await().atMost(60000, TimeUnit.MILLISECONDS)
                 .untilAsserted(
                         () -> {
-                            data.addAll(getKafkaConsumerListData("test_skip_error_json_format_topic"));
+                            data.addAll(getKafkaConsumerListData("test_skip_error_json_format_topic", null));
                             Assertions.assertTrue(StringUtils.contains(container.getServerLogs(), "NullPointerException"));
                             Assertions.assertEquals(0, data.size());
                         });
@@ -337,9 +338,10 @@ public class KafkaIT extends TestSuiteBase implements TestResource {
         return data;
     }
 
-    private List<String> getKafkaConsumerListData(String topicName) {
+    private List<String> getKafkaConsumerListData(String topicName, String consumeGroup) {
         List<String> data = new ArrayList<>();
-        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(kafkaConsumerConfig(DEFAULT_TEST_TOPIC_CONSUME_GROUP))) {
+        consumeGroup = StringUtils.isBlank(consumeGroup) ? DEFAULT_TEST_TOPIC_CONSUME_GROUP : consumeGroup;
+        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(kafkaConsumerConfig(consumeGroup))) {
             consumer.subscribe(Arrays.asList(topicName));
 
             Map<TopicPartition, Long> offsets = consumer.endOffsets(Arrays.asList(new TopicPartition(topicName, 0)));
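> Reviewer note: the quota-test changes read as a de-flaking fix. Each awaitility retry previously reused `DEFAULT_TEST_TOPIC_CONSUME_GROUP`, so later polls resumed from committed offsets and could struggle to accumulate the required 100+ records; a fresh random group per attempt re-reads the topic from the start, and the timeout doubles to 600 s to ride out the 2 KB/s producer quota. A sketch of the fresh-group pattern; `consumerConfig` below stands in for the test's own `kafkaConsumerConfig`, which is assumed to set `auto.offset.reset=earliest` since the fix relies on re-reading from offset 0:

```java
// Sketch: reading a topic from the beginning with a unique consumer group,
// as the updated quota test does on every awaitility retry.
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.testcontainers.shaded.org.apache.commons.lang3.RandomStringUtils;

public class FreshGroupReadSketch {

    // Stand-in for the test's kafkaConsumerConfig(...) helper.
    static Properties consumerConfig(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        props.put("group.id", groupId);
        props.put("auto.offset.reset", "earliest"); // unknown group => start at offset 0
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        String group = "test-consume-group-quota" + RandomStringUtils.randomAlphabetic(5);
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerConfig(group))) {
            consumer.subscribe(Collections.singletonList("SESSION-RECORD-QUOTA-TEST"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.value());
            }
        }
    }
}
```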
diff --git a/groot-tests/test-e2e-kafka/src/test/resources/kafka_producer_quota.yaml b/groot-tests/test-e2e-kafka/src/test/resources/kafka_producer_quota.yaml
index 16d34a5..8c2ad8d 100644
--- a/groot-tests/test-e2e-kafka/src/test/resources/kafka_producer_quota.yaml
+++ b/groot-tests/test-e2e-kafka/src/test/resources/kafka_producer_quota.yaml
@@ -3,7 +3,7 @@ sources: # [object] Define connector source
     type: mock
     properties:
       mock.desc.file.path: /tmp/grootstream/config/template/mock_schema/session_record_mock_desc.json
-      rows.per.second: 100000
+      rows.per.second: 10000
 
 processing_pipelines:
   etl_processor:
diff --git a/pom.xml b/pom.xml
--- a/pom.xml
+++ b/pom.xml
@@ -20,6 +20,7 @@
         <module>groot-examples</module>
         <module>groot-formats</module>
         <module>groot-tests</module>
+        <module>groot-api</module>
     </modules>
 
     <properties>
@@ -52,9 +53,12 @@
         <hbase.version>2.2.3</hbase.version>
         <scala.version>2.12</scala.version>
         <opencsv.version>3.3</opencsv.version>
-        <jsonpath.version>2.4.0</jsonpath.version>
+        <jsonpath.version>2.9.0</jsonpath.version>
         <fastjson2.version>2.0.32</fastjson2.version>
         <hutool.version>5.8.22</hutool.version>
+        <uber-h3.version>4.1.1</uber-h3.version>
+        <vault-java-driver.version>6.2.0</vault-java-driver.version>
+        <bouncycastle.version>1.78.1</bouncycastle.version>
         <uuid-generator.version>5.1.0</uuid-generator.version>
         <bouncycastle.version>1.78.1</bouncycastle.version>
         <galaxy.version>2.0.2</galaxy.version>
@@ -62,10 +66,11 @@
         <ipaddress.version>5.3.3</ipaddress.version>
         <aviator.version>5.4.1</aviator.version>
         <snakeyaml.version>1.29</snakeyaml.version>
-        <nacos.version>1.2.0</nacos.version>
+        <nacos.version>2.0.4</nacos.version>
         <antlr4.version>4.8</antlr4.version>
         <jcommander.version>1.81</jcommander.version>
         <avro.version>1.9.1</avro.version>
+        <snappy-java.version>1.1.10.4</snappy-java.version>
         <lombok.version>1.18.24</lombok.version>
         <config.version>1.3.3</config.version>
         <hazelcast.version>5.1</hazelcast.version>
@@ -319,6 +324,12 @@
             <version>${avro.version}</version>
         </dependency>
 
+        <dependency>
+            <groupId>org.xerial.snappy</groupId>
+            <artifactId>snappy-java</artifactId>
+            <version>${snappy-java.version}</version>
+        </dependency>
+
         <!-- flink dependencies -->
         <dependency>
             <groupId>org.apache.flink</groupId>
@@ -393,6 +404,25 @@
             <version>${hutool.version}</version>
         </dependency>
 
+        <!--Java bindings for H3, a hierarchical hexagonal geospatial indexing system.-->
+        <dependency>
+            <groupId>com.uber</groupId>
+            <artifactId>h3</artifactId>
+            <version>${uber-h3.version}</version>
+        </dependency>
+
+        <dependency>
+            <groupId>io.github.jopenlibs</groupId>
+            <artifactId>vault-java-driver</artifactId>
+            <version>${vault-java-driver.version}</version>
+        </dependency>
+
+        <!-- bouncycastle cryptographic algorithms -->
+        <dependency>
+            <groupId>org.bouncycastle</groupId>
+            <artifactId>bcpkix-jdk18on</artifactId>
+            <version>${bouncycastle.version}</version>
+        </dependency>
+
         <dependency>
             <groupId>com.fasterxml.uuid</groupId>
             <artifactId>java-uuid-generator</artifactId>
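> Reviewer note: the root pom now declares `snappy-java` 1.1.10.4 once (plausibly prompted by the 2023 snappy-java CVEs), which is why the clickhouse and kafka e2e poms above can drop their per-module `<version>` tags and why `<snappy-java.version>` left `groot-tests/pom.xml`. A quick way to confirm what actually resolves on the test classpath; `Snappy.getNativeLibraryVersion()` is snappy-java's own version reporter, the rest of the class is illustrative:

```java
// Sketch: sanity-check the snappy-java that Maven resolves after the
// version was centralized in the root pom.
import org.xerial.snappy.Snappy;

public class SnappyResolutionCheck {
    public static void main(String[] args) throws Exception {
        byte[] compressed = Snappy.compress("groot-stream".getBytes("UTF-8"));
        byte[] restored = Snappy.uncompress(compressed);
        System.out.println(new String(restored, "UTF-8"));    // groot-stream
        System.out.println(Snappy.getNativeLibraryVersion()); // expect 1.1.10.4
    }
}
```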
