# Fieldstat 4.0
## design
Fieldstat is a library for collecting and exporting statistics about running states. Compared to the previous version, fieldstat 3.0, this version adds support for new functionality. The following concepts are borrowed from OLAP (Online Analytical Processing) and are used in fieldstat 4.0:
- Instance: Corresponds to the `struct fieldstat` and represents the instance handle on a thread.
- Cube: A cube is a multi-dimensional dataset that allows for complex analysis of data. It represents data through the use of dimensions, the categorical attributes by which data is analyzed, and metrics. The cube structure allows users to view data from different perspectives, slice and dice data along various dimensions, and perform advanced analytical operations.
- Cell: Each cell represents the intersection of all the dimensions in the cube. Cells contain the actual data values that are being analyzed.
- Metric: The numerical data points being analyzed. Each metric corresponds to a data value in a cell. Currently, three types of metric statistics are supported: counter, HyperLogLog, and histogram.

### sampling mode
Although cells are added dynamically, there is a maximum limit on the number of cells per cube. Currently, there are three modes of limitation, known as sampling modes:
- SAMPLING_MODE_COMPREHENSIVE
- SAMPLING_MODE_TOPK
- SAMPLING_MODE_TOP_CARDINALITY
In comprehensive mode, once a cube reaches its maximum number of cells, no new cells can be added to that cube; any subsequent new cells are discarded.
In top-K mode, the Heavy Keeper algorithm is used to retain complete ranking information for cells based on a primary metric. Even after the maximum number of cells is reached, new cells are still accepted: a new cell with a higher ranking is added while a cell with a lower ranking is evicted.
In top-cardinality mode, cell sorting behaves similarly to top-K mode, but cells are ranked by the cardinality of a primary HyperLogLog metric. The [SpreadSketch](https://ieeexplore.ieee.org/abstract/document/9858870) algorithm is used to select entries of higher cardinality.
### metrics
#### Counter
A simple counting unit that supports increment and set operations. When registering a counter metric, you must specify whether the type is a gauge or a general counter. A gauge tracks state variables and is typically used with set operations; a counter accumulates scalar values and is typically used with increment operations. The two types are distinguished explicitly because they are handled differently during the merge process.
#### Histogram
Outputs the distribution of numerical values for events, such as latency or file length. Depending on the nature of the measured quantity, it can report the distribution either as value counts or as proportions of the entire dataset.
#### HyperLogLog
Uses statistical techniques to estimate the number of elements in a set that contains a large number of values. For example, it can be used to estimate the number of unique client IP addresses in a set of network flows.
### Why not fieldstat 3
Compared to version 3.0, version 4.0 introduces the concept of a cube. A cube is a collection of metrics serving the same purpose, and is also the unit of metric management.
Version 4.0 no longer supports shared-instance multithreading. In version 3.0, a counter metric could be shared and operated on by multiple threads. In version 4.0, each instance is written by only one thread; fieldstat 4.0 instead supports distributed statistics through methods like serialize and merge, and it is the caller's responsibility to aggregate statistics from different instances. For simplified multi-threaded support, see [fieldstat easy](readme_fieldstat_easy.md).
## Usage
### installation
Download the fieldstat4 rpm from https://repo.geedge.net/pulp/content/ and install the package.
### Statistic
``` C
#include "fieldstat.h"

struct fieldstat *instance = fieldstat_new();
int cube_id = fieldstat_cube_create(instance, YOUR_DIMENSION, YOUR_DIMENSION_LENGTH);
// last argument (0/1) selects general counter vs gauge
int metric_counter_id = fieldstat_register_counter(instance, cube_id, "any metric name", 0/1);
int metric_histogram_id = fieldstat_register_histogram(instance, cube_id, "any metric name", THE_MINIMUM_NUMBER_TO_RECORD, THE_MAXIMUM_NUMBER_TO_RECORD, PRECISION);
int metric_hll_id = fieldstat_register_hll(instance, cube_id, "any metric name", PRECISION);
fieldstat_cube_set_sampling(instance, cube_id, SAMPLING_MODE_COMPREHENSIVE/SAMPLING_MODE_TOPK/SAMPLING_MODE_TOP_CARDINALITY, max_cell_number);
fieldstat_counter_incrby(instance, cube_id, metric_counter_id, cell_id, VALUE);
fieldstat_histogram_record(instance, cube_id, metric_histogram_id, cell_id, VALUE);
fieldstat_hll_add(instance, cube_id, metric_hll_id, cell_id, VALUE, VALUE_STR_LENGTH);
fieldstat_free(instance);
```
### Used in multi-threading
``` C
struct fieldstat *master = fieldstat_new();
// other operations like cube_create and metric_register
for (int i = 0; i < INSTANCE_NUM; i++) {
    if (instances[i] == NULL) {
        instances[i] = fieldstat_fork(master); // create a new instance with the same configuration as master
    } else {
        fieldstat_calibrate(master, instances[i]); // sync cube and metric changes to instances[i] without changing the data
    }
}
// After operations on these instances, to sum up the results:
struct fieldstat *dest = fieldstat_new();
for (int i = 0; i < INSTANCE_NUM; i++) {
    fieldstat_merge(dest, instances[i]);
}
```
### Export
``` C
struct fieldstat_json_exporter *fieldstat_json_exporter = fieldstat_json_exporter_new(instance);
// optional
fieldstat_json_exporter_set_global_dimension(fieldstat_json_exporter, YOUR_GLOBAL_TAG, YOUR_GLOBAL_TAG_LENGTH);
char *json_string = fieldstat_json_exporter_export_flat(fieldstat_json_exporter);
printf("fieldstat_json_exporter_export json_string: %s\n", json_string);
free(json_string);
fieldstat_json_exporter_free(fieldstat_json_exporter);
```
## benchmark
### memory
The total memory consumption comprises approximately two parts: the memory used to store data, and the memory used while processing merge and export.
Memory figures are calculated for a 64-bit system, in bytes unless otherwise specified.
#### Total usage on storing data
Memory is consumed by:
- the array and hash table that store the tags of every cell
- metrics
- a Heavy Keeper instance (top-K sampling mode) or a Spread Sketch instance (top-cardinality sampling mode)
Note that every cell has its own metrics, so the memory cost of metrics should be multiplied by the number of cells.
#### metrics
- **Counter**
Counter is a very simple data structure; each counter uses 24 bytes.
- **Hyper Log Log**
The size of an HLL depends on its precision:
$$ 4 \cdot 2 ^ {precision}/ 6$$
- **Histogram**
The size of a histogram can be approximated by
$$ \log_2(\frac{MAX}{MIN}) \cdot 2^{precision} $$
#### cells
Each successfully added cell increases memory usage by
$$ 2 \cdot sizeof(tag) + 112$$
#### Heavy Keeper
As shown in the table below, for K above 1000 the sketch size hardly increases; the marginal memory cost comes from storing cells. Memory is the cost of a freshly initialized Heavy Keeper (sketch memory), while Memory_max is the cost after the cube is full (including keys and hash handles in the sorted set).
| K | w | d | Memory(kB) | Memory_max(kB) |
| ---- | ---- | --- | ---------- | -------------- |
| 10 | 150 | 3 | 3.768 | 5.028 |
| 100 | 400 | 3 | 9.768 | 22.368 |
| 500 | 1125 | 3 | 27.168 | 90.168 |
| 1000 | 1951 | 3 | 46.992 | 172.992 |
#### Spread Sketch
Spread Sketch memory cost consists of two parts: the cost of an initialized Spread Sketch (sketch memory), and the cost of storing cells in a hash list. The hash-list cost is quite unstable, but can be estimated as K * 1.7 * 7 bytes, where 1.7 is the ratio of actually stored entries to the maximum number of entries, and 7 bytes is the size of a hash handle and 2 pointers.
| K | w | d | Precision |Memory(kB) | Memory_max(kB) |
| ---- | ---- | --- | ---------- | ---------- | ---------- |
| 10 | 40 | 3 | 6 | 1.75 | 1.82 |
| 100 | 150 | 3 | 6 | 6.60 | 7.76 |
| 500 | 550 | 3 | 6 | 24.16 | 29.97 |
| 1000 | 1050 | 3 | 6 | 46.14 | 57.76 |
## performance
### environment
- **OS**: x86_64 Linux 4.18.0-425.3.1.el8
- **CPU**: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
- **Total Memory**: 29GB
- **GCC Version**: 8.5.0 20210514 (Red Hat 8.5.0-15) (GCC)
- **Compiling Option**: -O3
### result
| operation | time(us) | corresponding api |
| --------------------------------------------- | --------- | -------------------------------------------------------------------------------- |
| add a cell(topk) | 0.836 | fieldstat_counter_incrby(new sampling) |
| add a cell(comprehensive) | 1.191 | fieldstat_counter_incrby(new sampling) |
| histogram record | 0.2946 | fieldstat_histogram_record |
| HLL add | 0.3306 | fieldstat_hll_add |
| add a cell(comprehensive) with 5 tags | 1.8751 | fieldstat_counter_incrby(new sampling) |
| histogram record with 5 tags | 0.5956 | fieldstat_histogram_record |
| HLL add with 5 tags | 0.6825 | fieldstat_hll_add |
| copy 1 cell of a counter metric | 0.7814 | fieldstat_merge(empty destination instance) |
| copy 1 cell of a HLL metric | 1.96 | fieldstat_merge(empty destination instance) |
| copy 1 cell of a histogram metric | 0.844 | fieldstat_merge(empty destination instance) |
| copy 1 cell of a counter metric on topk cube | 1.77 | fieldstat_merge(empty destination instance) |
| merge 1 cell of a counter metric | 0.448 | fieldstat_merge(destination also has registered cube, metric and cells) |
| merge 1 cell of a HLL metric | 1.70 | fieldstat_merge(destination also has registered cube, metric and cells) |
| merge 1 cell of a histogram metric | 0.659 | fieldstat_merge(destination instance also has registered cube, metric and cells) |
| merge 1 cell of a counter metric on topk cube | 0.659 | fieldstat_merge(destination instance also has registered cube, metric and cells) |
| export a cell (with 10 counter) | 0.118 | fieldstat_json_exporter_export |
| merge 1000 empty cubes(with 10 counter)(comprehensive) | 0.434000 | fieldstat_merge(empty src) |
| merge 1000 empty cubes(with 10 counter)(topk) | 0.838000 | fieldstat_merge(empty src) |
| reset 1000 empty cubes(comprehensive) | 0.163000 | fieldstat_reset |
| reset 1000 empty cubes(topk) | 0.315000 | fieldstat_reset |
| calibrate two identical instances with 1000 cubes | 0.196000 | fieldstat_calibrate |
| reset 1000 counter cells | 0.036000 | fieldstat_reset |
### Explanation
#### add a cell(topk)
Generate 100000 random tags and call fieldstat_counter_incrby for each tag; calculate the mean time per tag. The max cell number of the cube (K) is 1000.
#### add a cell(comprehensive)
Generate 100000 random tags and call fieldstat_counter_incrby for each tag; calculate the mean time per tag. Every add succeeds (the max cell number of the cube is 100000).
The time is larger than in top-K mode because every tag is added to the hash table, while top-K only selects some.
#### histogram record
For the same cube, the same metric, and the same cell, call fieldstat_histogram_record 100000 times. Calculate the mean time per record.
#### HLL add
For the same cube, the same metric, and the same cell, call fieldstat_hll_add 100000 times. Calculate the mean time per add.
#### copy 1 cell of a counter metric
There are 1000 cells, 1 cube, and 1 metric on the instance to be merged. The metric type is counter. Record the time of one merge and calculate the mean time per cell.
#### copy 1 cell of a HLL metric
There are 1000 cells, 1 cube, and 1 metric on the instance to be merged. The metric type is HLL, with precision set to 6. Record the time of one merge and calculate the mean time per cell.
#### copy 1 cell of a histogram metric
There are 1000 cells, 1 cube, and 1 metric on the instance to be merged. The metric type is histogram, with min set to 1, max set to 100000, and precision set to 1. Record the time of one merge and calculate the mean time per cell. Note that merging histograms is time-consuming.
#### copy 1 cell of a counter metric on topk cube
Register a cube in top-K sampling mode, followed by the same operations as "copy 1 cell of a counter metric".
#### merge 1 cell of a counter metric
First perform the same registration as in "copy 1 cell of a counter metric". Merge once without timing to create a duplicate of the source instance, then merge twice with the timer running to measure performance. The second merge does not add new cells, so it is faster.
#### merge 1 cell of a HLL metric
The same as "copy 1 cell of a HLL metric" but merge twice.
#### merge 1 cell of a histogram metric
The same as "copy 1 cell of a histogram metric" but merge twice.
#### merge 1 cell of a counter metric on topk cube
The same as "copy 1 cell of a counter metric on topk cube" but merge twice.
Merging a Heavy Keeper is slower than copying it, which is why the execution time here is unusually higher than in the copy case.
#### export a cell
Create an instance with 100 cubes and register 10 metrics on each. The max cell number of each cube is 1000, and the metric operation is called on every added cell, so there are 100 * 10 * 1000 cells in total. After exporting the instance, calculate the average time spent per cell.