1. Prerequisites
A Linux system
Python (preferably Python 2)
JDK 1.8 or later
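Before moving on, it can help to confirm that the required tools are actually on the PATH. The following is a minimal sketch, not part of DataX itself. It assumes the helper is run with Python 3 (it relies on `shutil.which`) and that the executables are named `python2` and `java`. The function name `missing_tools` is made up for illustration.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them if your system uses different ones.
    missing = missing_tools(["python2", "java"])
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("python2 and java were both found")
```

This only checks that the commands exist. It does not verify the JDK version, which still needs to be 1.8 or later.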
2. Install Python 2
# Update the package index
sudo apt update
# Install Python 2
sudo apt install python2
# Check the Python version
python2 --version
3. Download DataX
Download DataX on Linux:
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
Extract the archive:
tar -zxvf datax.tar.gz
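After extraction, a quick check that the pieces this article relies on are present can save a failed run later. The sketch below assumes the archive unpacks into a `datax` directory in the current working directory and that plugins live under `plugin/reader` and `plugin/writer`. Both are assumptions about the archive layout, and the name `missing_datax_paths` is hypothetical. It is written for Python 3.

```python
import os

# Paths this article relies on, relative to the extracted `datax` directory.
# The plugin locations are an assumption about the archive layout.
EXPECTED_PATHS = [
    os.path.join("bin", "datax.py"),
    "job",
    os.path.join("plugin", "reader", "sqlserverreader"),
    os.path.join("plugin", "writer", "mongodbwriter"),
]


def missing_datax_paths(datax_home, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under datax_home."""
    return [p for p in expected if not os.path.exists(os.path.join(datax_home, p))]


if __name__ == "__main__":
    # Assumes the archive was extracted into ./datax.
    missing = missing_datax_paths("datax")
    print("Missing: " + ", ".join(missing) if missing else "DataX layout looks complete")
```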
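4. Add a DataX job (a DataX data migration task)
Source (reader): SQL Server
Target (writer): MongoDB
Create SqlServerToMongodb.json (for example in DataX's job directory, which is the location the command in step 5 assumes) with the following contents: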
{
    "job": {
        "setting": {
            "speed": {
                "channel": 5
            }
        },
        "content": [{
            "reader": {
                "name": "sqlserverreader",
                "parameter": {
                    "username": "your_username",
                    "password": "your_password",
                    "column": [
                        "id",
                        "version",
                        "created",
                        "modified",
                        "code",
                        "name"
                    ],
                    "splitPk": "pk",
                    "where": "",
                    "connection": [{
                        "table": ["EMPLOYEE"],
                        "jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=TEST"]
                    }]
                }
            },
            "writer": {
                "name": "mongodbwriter",
                "parameter": {
                    "address": ["127.0.0.1:27017"],
                    "userName": "datax",
                    "userPassword": "datax",
                    "dbName": "TEST",
                    "collectionName": "employee",
                    "column": [{
                        "name": "id",
                        "type": "string"
                    }, {
                        "name": "version",
                        "type": "int"
                    }, {
                        "name": "created",
                        "type": "date"
                    }, {
                        "name": "modified",
                        "type": "date"
                    }, {
                        "name": "code",
                        "type": "string"
                    }, {
                        "name": "name",
                        "type": "string"
                    }]
                }
            }
        }]
    }
}
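Field | Description |
channel | Number of concurrent DataX channels (how many threads the job is split across) |
For the other parameters, see the references below.
This article is a SQL Server to MongoDB example. For other databases, see the reader and writer documentation for the corresponding plugins in the GitHub repository.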
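Because the reader and writer each carry their own column list, it is easy for the two to drift apart when the job file is edited by hand. As far as I understand, DataX pairs reader and writer columns by position, so a count mismatch is worth catching before a run. The sketch below is a Python 3 helper under that assumption. The names `check_columns` and `check_job_file` are invented for illustration and are not part of DataX.

```python
import json


def check_columns(job):
    """Return a list of human-readable problems found in a parsed job dict."""
    problems = []
    for i, content in enumerate(job["job"]["content"]):
        reader_cols = content["reader"]["parameter"]["column"]
        writer_cols = content["writer"]["parameter"]["column"]
        # Assumption: columns are paired by position, so the counts should match.
        if len(reader_cols) != len(writer_cols):
            problems.append(
                "content[%d]: reader lists %d columns but writer lists %d"
                % (i, len(reader_cols), len(writer_cols))
            )
        # Each writer column in this article's job has a name and a type.
        for j, col in enumerate(writer_cols):
            for key in ("name", "type"):
                if key not in col:
                    problems.append(
                        "content[%d]: writer column %d has no '%s'" % (i, j, key)
                    )
    return problems


def check_job_file(path):
    """Parse the job file (which also catches JSON syntax errors) and check it."""
    with open(path, encoding="utf-8") as handle:
        return check_columns(json.load(handle))
```

Calling `check_job_file("SqlServerToMongodb.json")` returns an empty list when nothing suspicious is found. It does not connect to either database, so it cannot confirm that the columns or the `splitPk` field actually exist in the table.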
References:
GitHub - alibaba/DataX: DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
https://github.com/alibaba/DataX/blob/master/sqlserverreader/doc/sqlserverreader.md
https://github.com/alibaba/DataX/blob/master/mongodbwriter/doc/mongodbwriter.md
5. Start the job
Go to the bin directory.
Run the command (adjust the paths to match where you placed the files):
python2 datax.py ../job/SqlServerToMongodb.json
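If the job needs to be launched from another script (a scheduler hook, for example), the same command can be assembled with absolute paths so it no longer depends on the current directory. This is a minimal Python 3 sketch. The function names and the `/opt/datax` location used in the note below are assumptions, and treating a non-zero exit status as failure is my reading of how the launcher behaves rather than something this article states.

```python
import os
import subprocess


def build_datax_command(datax_home, job_name, python_exe="python2"):
    """Build the argument list equivalent to `python2 datax.py ../job/<job_name>`."""
    launcher = os.path.join(datax_home, "bin", "datax.py")
    job_file = os.path.join(datax_home, "job", job_name)
    return [python_exe, launcher, job_file]


def run_datax(datax_home, job_name, python_exe="python2"):
    """Run the job and return True if the process exited with status 0."""
    command = build_datax_command(datax_home, job_name, python_exe)
    # subprocess.call waits for the process and returns its exit status.
    return subprocess.call(command) == 0
```

With DataX extracted to a hypothetical `/opt/datax`, `run_datax("/opt/datax", "SqlServerToMongodb.json")` would start the same job as the command above.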