[Feature][Docs] Add metric docs (#23)
Showing 22 changed files with 1,022 additions and 10 deletions.
6 changes: 3 additions & 3 deletions
docs/04-features/01-catalog/02-connector/10-connector-databend.md
File renamed without changes.
62 changes: 62 additions & 0 deletions
docs/04-features/02-metric/01-single-table-metric/03-column-null.md
@@ -0,0 +1,62 @@
---
id: 'column-null'
title: 'column_null'
---
## Usage
- Click Create Rule Job, then select Data Quality Job
- On the job page, select the Null Check rule
- Select the data source to check

## Parameters
### Options

| name | type | required | default value |
|:----------------------------:|:------:|:----------:|:-------------:|
| [database](#database-string) | string | yes | - |
| [table](#table-string) | string | yes | - |
| [column](#column-string) | string | yes | - |

#### database [string]
Name of the database that contains the source table
#### table [string]
Name of the table in the source database
#### column [string]
Column to check

### Configuration example
```
{
  "metricType": "column_null",
  "metricParameter": {
    "database": "datavines",
    "table": "dv_catalog_entity_instance",
    "column": "type"
  }
}
```

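A configuration like the one above can be checked before it is submitted. The following is a minimal sketch, assuming the three options in the table are the only required keys; `validate_metric_config` is a hypothetical helper written for illustration and is not part of the project.

```python
import json

# Required keys, taken from the Options table above.
REQUIRED_PARAMETERS = ("database", "table", "column")


def validate_metric_config(raw: str) -> dict:
    """Parse a metric config and check that the required parameters are set.

    Hypothetical helper for illustration only.
    """
    config = json.loads(raw)
    parameters = config.get("metricParameter", {})
    missing = [key for key in REQUIRED_PARAMETERS if not parameters.get(key)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return config


example = """
{
  "metricType": "column_null",
  "metricParameter": {
    "database": "datavines",
    "table": "dv_catalog_entity_instance",
    "column": "type"
  }
}
"""

config = validate_metric_config(example)
print(config["metricType"], config["metricParameter"]["column"])
```

A config that leaves out `database` or `column` raises `ValueError` naming the missing keys.
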
### `SQL` statements generated automatically during the check

The check uses several automatically generated parameters to tell the individual check rules apart.
- uniqueKey: a unique key generated from each rule's configuration
- invalidate_items_table: a view created to hold the intermediate data, which is normally the rows that match the rule, that is, the invalid rows. The view is named invalidate_items_${uniqueKey}

Intermediate table invalidate_items_${uniqueKey}
```
select * from ${table} where ${column} is null and ${filter}
```
`SQL` that computes the actual value
```
select count(1) as actual_value_${uniqueKey} from ${invalidate_items_table}
```

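The two statements above can be tried end to end against an in-memory SQLite database. This is only a sketch of the flow, not the project's implementation: the `run_null_check` helper, the sample rows, the `demo1` key, and the default filter `1 = 1` (used when no filter is configured) are all assumptions made for the example.

```python
import sqlite3
from string import Template

# Templates as documented above; ${name} is also string.Template syntax.
INVALIDATE_ITEMS_SQL = Template(
    "select * from ${table} where ${column} is null and ${filter}"
)
ACTUAL_VALUE_SQL = Template(
    "select count(1) as actual_value_${uniqueKey} from ${invalidate_items_table}"
)


def run_null_check(conn, table, column, unique_key, row_filter="1 = 1"):
    """Create the invalidate_items view, then count the rows it holds."""
    view = f"invalidate_items_{unique_key}"
    select_sql = INVALIDATE_ITEMS_SQL.substitute(
        table=table, column=column, filter=row_filter
    )
    conn.execute(f"create temp view {view} as {select_sql}")
    count_sql = ACTUAL_VALUE_SQL.substitute(
        uniqueKey=unique_key, invalidate_items_table=view
    )
    return conn.execute(count_sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("create table dv_catalog_entity_instance (id integer, type text)")
conn.executemany(
    "insert into dv_catalog_entity_instance values (?, ?)",
    [(1, "table"), (2, None), (3, "column"), (4, None)],
)

null_count = run_null_check(conn, "dv_catalog_entity_instance", "type", "demo1")
print(null_count)
```

Substituting identifiers into SQL text like this is only safe when the table and column names come from trusted configuration.
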
## Use case

### Scenario
...

### Approach
...

### Steps
...
57 changes: 57 additions & 0 deletions
docs/04-features/02-metric/01-single-table-metric/04-column-avg.md
@@ -0,0 +1,57 @@
---
id: 'column-avg'
title: 'column_avg'
---
## Usage
- Click Create Rule Job, then select Data Quality Job
- On the job page, select the Average Value Check rule
- Select the data source to check

## Parameters
### Options

| name | type | required | default value |
|:----------------------------:|:------:|:----------:|:-------------:|
| [database](#database-string) | string | yes | - |
| [table](#table-string) | string | yes | - |
| [column](#column-string) | string | yes | - |

#### database [string]
Name of the database that contains the source table
#### table [string]
Name of the table in the source database
#### column [string]
Column to check

### Configuration example
```
{
  "metricType": "column_avg",
  "metricParameter": {
    "database": "datavines",
    "table": "dv_catalog_entity_instance",
    "column": "type"
  }
}
```

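### `SQL` statements generated automatically during the check

The check uses several automatically generated parameters to tell the individual check rules apart.
- uniqueKey: a unique key generated from each rule's configuration
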
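The document does not say how uniqueKey is derived from the configuration. One plausible approach, shown here purely as an assumption, is to hash a canonical serialization of the config so that the same rule always maps to the same key. `build_unique_key` and the 12-character length are hypothetical.

```python
import hashlib
import json


def build_unique_key(config: dict) -> str:
    """Derive a stable key from a rule config (hypothetical scheme).

    Keys are sorted before hashing so that key order does not matter.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # A short hex prefix keeps names such as actual_value_<key> readable.
    return digest[:12]


rule = {
    "metricType": "column_avg",
    "metricParameter": {
        "database": "datavines",
        "table": "dv_catalog_entity_instance",
        "column": "type",
    },
}

unique_key = build_unique_key(rule)
print(f"actual_value_{unique_key}")
```

Because the key contains only hexadecimal characters, it can be appended to a prefix such as `actual_value_` to form a valid SQL alias.
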
`SQL` that computes the actual value
```
select avg(${column}) as actual_value_${uniqueKey} from ${table} where ${filter}
```

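To see what the rendered statement returns, the template can be filled in and run against SQLite. This is a sketch under stated assumptions: the `orders` table, its rows, the `demo2` key, and the filter are invented for the example, and a numeric column is used because an average is only meaningful for numeric data.

```python
import sqlite3
from string import Template

# Template as documented above.
AVG_SQL = Template(
    "select avg(${column}) as actual_value_${uniqueKey} from ${table} where ${filter}"
)

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (amount real, status text)")
conn.executemany(
    "insert into orders values (?, ?)",
    [(10.0, "paid"), (20.0, "paid"), (60.0, "refunded")],
)

sql = AVG_SQL.substitute(
    column="amount", uniqueKey="demo2", table="orders", filter="status = 'paid'"
)
cursor = conn.execute(sql)
result_column = cursor.description[0][0]  # alias produced by the template
avg_paid = cursor.fetchone()[0]
print(result_column, avg_paid)
```

The filter restricts the average to the two paid orders, so the refunded row does not affect the result.
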
## Use case

### Scenario
...

### Approach
...

### Steps
...
57 changes: 57 additions & 0 deletions
docs/04-features/02-metric/01-single-table-metric/05-column-avg-length.md
@@ -0,0 +1,57 @@
---
id: 'column-avg-length'
title: 'column_avg_length'
---
## Usage
- Click Create Rule Job, then select Data Quality Job
- On the job page, select the Average Length Check rule
- Select the data source to check

## Parameters
### Options

| name | type | required | default value |
|:----------------------------:|:------:|:----------:|:-------------:|
| [database](#database-string) | string | yes | - |
| [table](#table-string) | string | yes | - |
| [column](#column-string) | string | yes | - |

#### database [string]
Name of the database that contains the source table
#### table [string]
Name of the table in the source database
#### column [string]
Column to check

### Configuration example
```
{
  "metricType": "column_avg_length",
  "metricParameter": {
    "database": "datavines",
    "table": "dv_catalog_entity_instance",
    "column": "type"
  }
}
```

### `SQL` statements generated automatically during the check

The check uses several automatically generated parameters to tell the individual check rules apart.
- uniqueKey: a unique key generated from each rule's configuration

`SQL` that computes the actual value
```
select avg(length(${column})) as actual_value_${uniqueKey} from ${table} where ${filter}
```

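The average-length statement treats empty strings and NULLs differently, which is easy to overlook. The sketch below runs the template in SQLite, where `length(NULL)` is NULL and `avg` skips NULL inputs; the sample rows, the `demo3` key, and the `1 = 1` filter are assumptions made for the example.

```python
import sqlite3
from string import Template

# Template as documented above.
AVG_LENGTH_SQL = Template(
    "select avg(length(${column})) as actual_value_${uniqueKey} "
    "from ${table} where ${filter}"
)

conn = sqlite3.connect(":memory:")
conn.execute("create table dv_catalog_entity_instance (type text)")
conn.executemany(
    "insert into dv_catalog_entity_instance values (?)",
    [("ab",), ("abcd",), ("",), (None,)],
)

sql = AVG_LENGTH_SQL.substitute(
    column="type",
    uniqueKey="demo3",
    table="dv_catalog_entity_instance",
    filter="1 = 1",
)
# Lengths are 2, 4, 0 and NULL; the NULL row is left out of the average.
avg_length = conn.execute(sql).fetchone()[0]
print(avg_length)
```

The empty string contributes a length of 0 to the average, while the NULL row is ignored entirely.
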
## Use case

### Scenario
...

### Approach
...

### Steps
...
62 changes: 62 additions & 0 deletions
docs/04-features/02-metric/01-single-table-metric/06-column-blank.md
@@ -0,0 +1,62 @@
---
id: 'column-blank'
title: 'column_blank'
---
## Usage
- Click Create Rule Job, then select Data Quality Job
- On the job page, select the Blank Check rule
- Select the data source to check

## Parameters
### Options

| name | type | required | default value |
|:----------------------------:|:------:|:----------:|:-------------:|
| [database](#database-string) | string | yes | - |
| [table](#table-string) | string | yes | - |
| [column](#column-string) | string | yes | - |

#### database [string]
Name of the database that contains the source table
#### table [string]
Name of the table in the source database
#### column [string]
Column to check

### Configuration example
```
{
  "metricType": "column_blank",
  "metricParameter": {
    "database": "datavines",
    "table": "dv_catalog_entity_instance",
    "column": "type"
  }
}
```

### `SQL` statements generated automatically during the check

The check uses several automatically generated parameters to tell the individual check rules apart.
- uniqueKey: a unique key generated from each rule's configuration
- invalidate_items_table: a view created to hold the intermediate data, which is normally the rows that match the rule, that is, the invalid rows. The view is named invalidate_items_${uniqueKey}

Intermediate table invalidate_items_${uniqueKey}
```
select * from ${table} where (${column} is null or ${column} = '') and ${filter}
```
`SQL` that computes the actual value
```
select count(1) as actual_value_${uniqueKey} from ${invalidate_items_table}
```

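The blank check differs from the null check only in also matching empty strings. A small SQLite run makes the boundary visible; this is a sketch, and the sample rows, the `demo4` key, and the `1 = 1` filter are assumptions. Note that in SQLite a whitespace-only value is not equal to `''`, so it is not counted; other engines may compare such values differently.

```python
import sqlite3
from string import Template

# Templates as documented above.
INVALIDATE_ITEMS_SQL = Template(
    "select * from ${table} where (${column} is null or ${column} = '') and ${filter}"
)
ACTUAL_VALUE_SQL = Template(
    "select count(1) as actual_value_${uniqueKey} from ${invalidate_items_table}"
)

conn = sqlite3.connect(":memory:")
conn.execute("create table dv_catalog_entity_instance (id integer, type text)")
conn.executemany(
    "insert into dv_catalog_entity_instance values (?, ?)",
    [(1, "table"), (2, ""), (3, None), (4, "  ")],
)

view = "invalidate_items_demo4"
select_sql = INVALIDATE_ITEMS_SQL.substitute(
    table="dv_catalog_entity_instance", column="type", filter="1 = 1"
)
conn.execute(f"create temp view {view} as {select_sql}")

count_sql = ACTUAL_VALUE_SQL.substitute(
    uniqueKey="demo4", invalidate_items_table=view
)
blank_count = conn.execute(count_sql).fetchone()[0]
blank_ids = [row[0] for row in conn.execute(f"select id from {view} order by id")]
print(blank_count, blank_ids)
```

Keeping the matched rows in a view means the invalid records can be inspected after the count is taken.
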
## Use case

### Scenario
...

### Approach
...

### Steps
...
57 changes: 57 additions & 0 deletions
docs/04-features/02-metric/01-single-table-metric/07-column-distinct.md
@@ -0,0 +1,57 @@
---
id: 'column-distinct'
title: 'column_distinct'
---
## Usage
- Click Create Rule Job, then select Data Quality Job
- On the job page, select the Distinct Check rule
- Select the data source to check

## Parameters
### Options

| name | type | required | default value |
|:----------------------------:|:------:|:----------:|:-------------:|
| [database](#database-string) | string | yes | - |
| [table](#table-string) | string | yes | - |
| [column](#column-string) | string | yes | - |

#### database [string]
Name of the database that contains the source table
#### table [string]
Name of the table in the source database
#### column [string]
Column to check

### Configuration example
```
{
  "metricType": "column_distinct",
  "metricParameter": {
    "database": "datavines",
    "table": "dv_catalog_entity_instance",
    "column": "type"
  }
}
```

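### `SQL` statements generated automatically during the check

The check uses several automatically generated parameters to tell the individual check rules apart.
- uniqueKey: a unique key generated from each rule's configuration

`SQL` that computes the actual value, returning the number of distinct values in the column
```
select count(distinct(${column})) as actual_value_${uniqueKey} from ${table} where ${filter}
```
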
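As a quick illustration of what the distinct count returns, the template can be rendered and run in SQLite. This is a sketch with assumed sample rows, an assumed `demo5` key, and an assumed `1 = 1` filter. As in standard SQL, `count(distinct ...)` does not count NULL, so duplicates collapse and NULL rows are ignored.

```python
import sqlite3
from string import Template

# Template as documented above.
DISTINCT_SQL = Template(
    "select count(distinct(${column})) as actual_value_${uniqueKey} "
    "from ${table} where ${filter}"
)

conn = sqlite3.connect(":memory:")
conn.execute("create table dv_catalog_entity_instance (type text)")
conn.executemany(
    "insert into dv_catalog_entity_instance values (?)",
    [("table",), ("table",), ("column",), (None,)],
)

sql = DISTINCT_SQL.substitute(
    column="type",
    uniqueKey="demo5",
    table="dv_catalog_entity_instance",
    filter="1 = 1",
)
# Distinct non-NULL values are "table" and "column".
distinct_count = conn.execute(sql).fetchone()[0]
print(distinct_count)
```
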
## Use case

### Scenario
...

### Approach
...

### Steps
...