Components of metadata

Target definition

Target is defined as an aggregate in the aggregates table by having window_type=0. Id of the aggregate must be specified in the models table.

Target example

Aggregates table:
id variable_id aggregate_type window_type name
1 1 1 0 D_count_all
Models table:
id used target_aggregate_id target_start target_length target_length_exact ...
1 1 1 0 0 0 ...

In example above, the target window starts directly after an event setting trigger true (because of target_start=0) and is counted to the end of data (target_length=0 and target_length_exact=0). The target is an D_count_all aggregate, which counts the number of occurences of the "D" event.

Models

Information about models is stored in metadata in the models table. Active models (used in on-line scoring) must have "used" flag set on "true". The models table also contains a scoring code of a model, saved in a string format.

Deployment of a new model

After computation of a new model, it is immediately implemented - new model owerwrites an old model with the same id.

If a model uses other variables than the previous one, every used aggregate must be updated. Aggregates are calculated iteratively in order to minimize the delay of processing a new event by the new model. At first, the engine calculate necessary aggregates using saved retail events, then it updates the values of aggregates if a new event appears. The process is repeated several times.

Scoring

Scoring runs when a high-level event occurs (see High-level events section below). Events are defined in metadata in the triggers table. Different events can induce scoring by different models (assignment of high-level events to models is described in metadata). Also, different groups of users can be scored by different models. Definitions of groups of users are also stored in the triggers table. Given user can be scored by multiple models at once (every event inducing scoring can be related to many models in the models_triggers table). Scoring code is kept as a string in the models table. If the Event Engine is integrated with Scoring Engine, a reference to a model stored in Scoring Engine database is kept in the models table in place of the scoring code.

In programmatic, the modeling table is build basing on unique user id and impression id. Therefore, scoring is based on impressions as well. Events inducing scoring (bid requests) may include several bid requests (impression list). In this case, at the beginning of processing the event is divided into several events (one for each impression) passed to scoring.

High-level events

High-level events are events inducing scoring or target calculating. They are defined in metadata (the triggers table) as Java expressions. Expressions will be defined by analytics. All variables, that can be used in expressions, will be listed in a separate dictionary.

Dictionaries

Dictionaries can be used in order to faciliate the use of aggregates. Dictionaries are defined in metadata in the dictionary table. When defining an aggregate, a dictionary_id of a suitable entry must be included.

Example of a dictionary

Dictionary:
variable1 cat1 cat2
1 D AA
2 F AA
3 D BB
4 F BB
5 D AA
The dictionary table:
id categorical value start start_inclusive end end_inclusive mapped_value
1 1 1 null null null null D
1 1 3 null null null null D
1 1 5 null null null null D
2 1 2 null null null null F
2 1 2 null null null null F
3 1 1 null null null null AA
3 1 2 null null null null AA
3 1 5 null null null null AA
4 1 3 null null null null BB
4 1 4 null null null null BB
5 1 1 null null null null D_AA
5 1 5 null null null null D_AA
6 1 3 null null null null D_BB
7 1 2 null null null null F_AA
8 1 4 null null null null F_BB
The aggregates table:
variable_id aggregate_type dictionary_id name
1 1 1 D_count_all
1 1 2 F_count_all
1 1 3 AA_count_all
1 1 4 BB_count_all
1 1 5 D_AA_count_all
1 1 6 D_BB_count_all
1 1 7 F_AA_count_all
1 1 8 F_BB_count_all
1 2 8 F_BB_sum_all