1,kettle下载
kettle本身是Java开发的开源项目,由于某种原因,目前官网已经不支持下载了,可到我的CSDN中下载
下载完后,还需要安装JDK环境以及对应的数据库驱动包,需要注意的是 该kettle的版本较低,如果你是用的 mysql数据库,需要下载 5.4.7以下的驱动包,下载地址参照以下:
2,一切准备就绪后就可以开始使用啦
假设现在有一个场景:需要将本地的所有库同步到目标服务器上对应的数据库中
其中涉及到 数据表的自动创建 以及 数据的自动同步,这里以全量同步为例。
实现步骤分为三大块:定义一个入口 任务,该任务中涵盖 start、库表初始化(删除表及数据)、全量同步、结束
3,main 任务脚本内容
- "1.0" encoding="UTF-8"?>
-
main -
-
-
-
/ -
- -
2024/06/03 09:54:18.592 -
- -
2024/06/03 09:54:18.592 -
-
-
-
-
-
-
-
-
-
-
-
-
ID_JOB -
Y -
ID_JOB -
-
-
CHANNEL_ID -
Y -
CHANNEL_ID -
-
-
JOBNAME -
Y -
JOBNAME -
-
-
STATUS -
Y -
STATUS -
-
-
LINES_READ -
Y -
LINES_READ -
-
-
LINES_WRITTEN -
Y -
LINES_WRITTEN -
-
-
LINES_UPDATED -
Y -
LINES_UPDATED -
-
-
LINES_INPUT -
Y -
LINES_INPUT -
-
-
LINES_OUTPUT -
Y -
LINES_OUTPUT -
-
-
LINES_REJECTED -
Y -
LINES_REJECTED -
-
-
ERRORS -
Y -
ERRORS -
-
-
STARTDATE -
Y -
STARTDATE -
-
-
ENDDATE -
Y -
ENDDATE -
-
-
LOGDATE -
Y -
LOGDATE -
-
-
DEPDATE -
Y -
DEPDATE -
-
-
REPLAYDATE -
Y -
REPLAYDATE -
-
-
LOG_FIELD -
Y -
LOG_FIELD -
-
-
EXECUTING_SERVER -
N -
EXECUTING_SERVER -
-
-
EXECUTING_USER -
N -
EXECUTING_USER -
-
-
START_JOB_ENTRY -
N -
START_JOB_ENTRY -
-
-
CLIENT -
N -
CLIENT -
-
-
-
-
-
-
-
-
ID_BATCH -
Y -
ID_BATCH -
-
-
CHANNEL_ID -
Y -
CHANNEL_ID -
-
-
LOG_DATE -
Y -
LOG_DATE -
-
-
JOBNAME -
Y -
TRANSNAME -
-
-
JOBENTRYNAME -
Y -
STEPNAME -
-
-
LINES_READ -
Y -
LINES_READ -
-
-
LINES_WRITTEN -
Y -
LINES_WRITTEN -
-
-
LINES_UPDATED -
Y -
LINES_UPDATED -
-
-
LINES_INPUT -
Y -
LINES_INPUT -
-
-
LINES_OUTPUT -
Y -
LINES_OUTPUT -
-
-
LINES_REJECTED -
Y -
LINES_REJECTED -
-
-
ERRORS -
Y -
ERRORS -
-
-
RESULT -
Y -
RESULT -
-
-
NR_RESULT_ROWS -
Y -
NR_RESULT_ROWS -
-
-
NR_RESULT_FILES -
Y -
NR_RESULT_FILES -
-
-
LOG_FIELD -
N -
LOG_FIELD -
-
-
COPY_NR -
N -
COPY_NR -
-
-
-
-
-
-
-
-
ID_BATCH -
Y -
ID_BATCH -
-
-
CHANNEL_ID -
Y -
CHANNEL_ID -
-
-
LOG_DATE -
Y -
LOG_DATE -
-
-
LOGGING_OBJECT_TYPE -
Y -
LOGGING_OBJECT_TYPE -
-
-
OBJECT_NAME -
Y -
OBJECT_NAME -
-
-
OBJECT_COPY -
Y -
OBJECT_COPY -
-
-
REPOSITORY_DIRECTORY -
Y -
REPOSITORY_DIRECTORY -
-
-
FILENAME -
Y -
FILENAME -
-
-
OBJECT_ID -
Y -
OBJECT_ID -
-
-
OBJECT_REVISION -
Y -
OBJECT_REVISION -
-
-
PARENT_CHANNEL_ID -
Y -
PARENT_CHANNEL_ID -
-
-
ROOT_CHANNEL_ID -
Y -
ROOT_CHANNEL_ID -
-
-
N -
-
-
-
START -
-
SPECIAL -
Y -
N -
N -
0 -
0 -
60 -
12 -
0 -
1 -
1 -
N -
Y -
0 -
176 -
208 -
-
-
同步aav -
-
TRANS -
filename -
-
${Internal.Entry.Current.Directory}/sync_aav.ktr -
-
N -
N -
N -
N -
N -
N -
-
-
N -
N -
Basic -
N -
-
N -
Y -
N -
N -
N -
Pentaho local -
-
Y -
-
N -
Y -
0 -
640 -
48 -
-
-
初始化所有表 -
-
JOB -
filename -
-
${Internal.Entry.Current.Directory}/init.kjb -
-
N -
N -
N -
N -
-
-
N -
N -
Nothing -
-
Y -
N -
N -
N -
N -
-
Y -
-
N -
N -
Y -
0 -
352 -
208 -
-
-
DUMMY -
-
SPECIAL -
N -
Y -
N -
0 -
0 -
60 -
12 -
0 -
1 -
1 -
N -
Y -
0 -
928 -
208 -
-
-
同步aps -
-
TRANS -
filename -
-
${Internal.Entry.Current.Directory}/sync_aps.ktr -
-
N -
N -
N -
N -
N -
N -
-
-
N -
N -
Basic -
N -
-
N -
Y -
N -
N -
N -
Pentaho local -
-
Y -
-
N -
Y -
0 -
640 -
128 -
-
-
同步douban -
-
TRANS -
filename -
-
${Internal.Entry.Current.Directory}/sync_douban.ktr -
-
N -
N -
N -
N -
N -
N -
-
-
N -
N -
Basic -
N -
-
N -
Y -
N -
N -
N -
Pentaho local -
-
Y -
-
N -
Y -
0 -
640 -
208 -
-
-
同步sp-dev -
-
TRANS -
filename -
-
${Internal.Entry.Current.Directory}/sync_sp-dev.ktr -
-
N -
N -
N -
N -
N -
N -
-
-
N -
N -
Basic -
N -
-
N -
Y -
N -
N -
N -
Pentaho local -
-
Y -
-
N -
Y -
0 -
640 -
368 -
-
-
同步lottery -
-
TRANS -
filename -
-
${Internal.Entry.Current.Directory}/sync_lottery.ktr -
-
N -
N -
N -
N -
N -
N -
-
-
N -
N -
Basic -
N -
-
N -
Y -
N -
N -
N -
Pentaho local -
-
Y -
-
N -
Y -
0 -
640 -
288 -
-
-
-
-
START -
初始化所有表 -
0 -
0 -
Y -
Y -
Y -
-
-
初始化所有表 -
同步aav -
0 -
0 -
Y -
Y -
N -
-
-
同步aav -
DUMMY -
0 -
0 -
Y -
Y -
N -
-
-
初始化所有表 -
同步douban -
0 -
0 -
Y -
Y -
N -
-
-
初始化所有表 -
同步lottery -
0 -
0 -
Y -
Y -
N -
-
-
初始化所有表 -
同步sp-dev -
0 -
0 -
Y -
Y -
N -
-
-
初始化所有表 -
同步aps -
0 -
0 -
Y -
Y -
N -
-
-
同步aps -
DUMMY -
0 -
0 -
Y -
Y -
N -
-
-
同步douban -
DUMMY -
0 -
0 -
Y -
Y -
N -
-
-
同步lottery -
DUMMY -
0 -
0 -
Y -
Y -
N -
-
-
同步sp-dev -
DUMMY -
0 -
0 -
Y -
Y -
N -
-
-
-
-
-
-
METASTORE.pentaho -
-
Default Run Configuration -
{"namespace":"pentaho","id":"Default Run Configuration","name":"Default Run Configuration","description":"Defines a default run configuration","metaStoreName":null} -
-
-
-
{"_":"Embedded MetaStore Elements","namespace":"pentaho","type":"Default Run Configuration"} -
-
Pentaho local -
{"children":[{"children":[],"id":"server","value":null},{"children":[],"id":"clustered","value":"N"},{"children":[],"id":"name","value":"Pentaho local"},{"children":[],"id":"description","value":null},{"children":[],"id":"readOnly","value":"Y"},{"children":[],"id":"sendResources","value":"N"},{"children":[],"id":"logRemoteExecutionLocally","value":"N"},{"children":[],"id":"remote","value":"N"},{"children":[],"id":"local","value":"Y"},{"children":[],"id":"showTransformations","value":"N"}],"id":"Pentaho local","value":null,"name":"Pentaho local","owner":null,"ownerPermissionsList":[]} -
-
-
4,kettle.properties变量
- 名称 值
- local_ip localhost
- local_user root
- password 123456
- schemeName_aav aav
- schemeName_aav1 aav1
- schemeName_aps aps
- schemeName_aps1 aps1
- schemeName_douban douban
- schemeName_douban1 douban1
- schemeName_lottery lottery
- schemeName_lottery1 lottery1
- schemeName_sp-dev sp-dev
- schemeName_sp-dev1 sp-dev1
5,在使用组件时,经常会遇到一个问题:数据库中某个值设置的是非null,在同步数据时,起始源数据库的表字段值为空字符串,在进入到kettle中时会默认编程null,这时直接同步会报错,需要更改kettle的配置变量。
kettle.properties -》KETTLE_EMPTY_STRING_DIFFERS_FROM_NULL=Y
更改后需要重启kettle