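We have been doing a big data migration lately. The MySQL migration is already done, but the disks in the Shanghai data center are not as fast as SSDs… HBase is being exported and imported through Thrift… Today I need to export some business data from MongoDB for testing. Below is what the data in mongo looks like; created_on is the time field. The goal is to export from MongoDB by date.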
> it
{ "_id" : 121, "status" : 0, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:38:11.589Z"), "main_key" : "英国 品牌 捷豹", "not_keys" : [ ] }
{ "_id" : 122, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:38:23.143Z"), "main_key" : "无可复制的生命力", "not_keys" : [ ] }
{ "_id" : 123, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:38:46.523Z"), "main_key" : "无可复制", "not_keys" : [ ] }
{ "_id" : 124, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:38:54.731Z"), "main_key" : "奢华的灵魂", "not_keys" : [ ] }
{ "_id" : 125, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:39:02.628Z"), "main_key" : "茨维考", "not_keys" : [ ] }
{ "_id" : 126, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:39:18.237Z"), "main_key" : "生命的咆哮", "not_keys" : [ ] }
{ "_id" : 127, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:39:32.075Z"), "main_key" : "How Alive Are You", "not_keys" : [ ] }
{ "_id" : 128, "status" : 1, "project_id" : 15, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T14:39:44.529Z"), "main_key" : "Alastair", "not_keys" : [ ] }
{ "_id" : 129, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:44:55.261Z"), "main_key" : "东风日产", "not_keys" : [ ] }
{ "_id" : 130, "and_keys" : [ ], "created_on" : ISODate("2013-04-02T20:45:25.829Z"), "main_key" : "楼兰", "not_keys" : [ ], "or_keys" : [ "日产", "nissan" ], "project_id" : 17, "status" : 1, "synonyms" : [ ] }
{ "_id" : 131, "and_keys" : [ ], "created_on" : ISODate("2013-04-02T20:45:26.119Z"), "main_key" : "楼兰", "not_keys" : [ ], "or_keys" : [ "日产", "nissan" ], "project_id" : 17, "status" : 0, "synonyms" : [ ] }
{ "_id" : 132, "and_keys" : [ ], "created_on" : ISODate("2013-04-02T20:47:12.137Z"), "main_key" : "天籁", "not_keys" : [ ], "or_keys" : [ "日产", "nissan", "气囊", "事故", "维权" ], "project_id" : 17, "status" : 0, "synonyms" : [ ] }
{ "_id" : 133, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:48:47.076Z"), "main_key" : "奇骏", "not_keys" : [ ] }
{ "_id" : 134, "and_keys" : [ ], "created_on" : ISODate("2013-04-02T20:50:37.309Z"), "main_key" : "逍客", "not_keys" : [ [ "逍客旅行网" ] ], "or_keys" : [ "日产", "nissan" ], "project_id" : 17, "status" : 0, "synonyms" : [ ] }
{ "_id" : 135, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:52:16.256Z"), "main_key" : "轩逸", "not_keys" : [ ] }
{ "_id" : 136, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:52:26.142Z"), "main_key" : "骐达", "not_keys" : [ ] }
{ "_id" : 137, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:52:40.670Z"), "main_key" : "骊威", "not_keys" : [ ] }
{ "_id" : 138, "and_keys" : [ ], "created_on" : ISODate("2013-04-02T20:53:05.306Z"), "main_key" : "阳光", "not_keys" : [ ], "or_keys" : [ "日产", "nissan" ], "project_id" : 17, "status" : 0, "synonyms" : [ ] }
{ "_id" : 139, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:53:16.477Z"), "main_key" : "启辰", "not_keys" : [ ] }
{ "_id" : 140, "status" : 0, "project_id" : 17, "or_keys" : [ ], "and_keys" : [ ], "synonyms" : [ ], "created_on" : ISODate("2013-04-02T20:53:28.351Z"), "main_key" : "颐达", "not_keys" : [ ] }
Original article: http://xiaorui.cc/?p=1838
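mongodump's --query option takes a query condition, so you can export data by date range. In the version used here it does not accept the ISODate format, though, so the usual approach is to convert the time with the Date function.
First turn your time into a Unix timestamp. Date in MongoDB expects the timestamp multiplied by 1000, that is, in milliseconds.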
[ruifengyun@bj-buzz-dev01:~]$ date -d 2014-05-11 +%s
1399737600
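The seconds-to-milliseconds step is easy to get wrong by hand, so here is a minimal sketch of it as a small shell helper. The function name epoch_to_millis is made up for illustration and is not part of any MongoDB tool.

```shell
#!/bin/sh
# Hypothetical helper: convert epoch seconds into the milliseconds
# that Date() expects in the mongodump query.
epoch_to_millis() {
    echo $(( $1 * 1000 ))
}

# Using the timestamp printed by the date command above:
epoch_to_millis 1399737600   # prints 1399737600000
```

With GNU date it can be chained as epoch_to_millis "$(date -d 2014-05-11 +%s)". Keep in mind that date -d interprets the day in the machine's local time zone, so the same date gives a different number on a server in another zone.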
Now let's run the mongodump export.
[ruifengyun@bj-buzz-dev01:~]$ mongodump --port 27017 -d buzz_master -c feed -q '{"created_on":{$gte:Date(1399737600000)}}' -o 8-7
connected to: 127.0.0.1:27017
DATABASE: buzz_master to 8-7/buzz_master
buzz_master.feed to 8-7/buzz_master/feed.bson
89 objects
[ruifengyun@bj-buzz-dev01:~]$
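The run above only sets a lower bound. To export a closed-off range, say a single day, one option is to add an upper bound with $lt. The sketch below only builds and prints the query string, using the same Date(milliseconds) syntax as the command above; build_range_query is a hypothetical helper name, and the $lt bound is my addition rather than something from the original run. Newer mongodump releases may expect a different query syntax, so check the documentation for your version.

```shell
#!/bin/sh
# Hypothetical helper: build a --query string that selects documents
# with start <= created_on < end. Both arguments are epoch seconds.
build_range_query() {
    start_ms=$(( $1 * 1000 ))
    end_ms=$(( $2 * 1000 ))
    printf '{"created_on":{$gte:Date(%s),$lt:Date(%s)}}\n' "$start_ms" "$end_ms"
}

# One day starting at the timestamp used above (86400 seconds later):
build_range_query 1399737600 1399824000
# prints {"created_on":{$gte:Date(1399737600000),$lt:Date(1399824000000)}}
```

The printed string can then be passed to mongodump with -q, wrapped in single quotes or a quoted variable so the shell leaves $gte and $lt alone.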
Besides running mongodump strictly by time range, we can also go by ObjectId. As you probably know, the first four bytes of a MongoDB ObjectId encode its creation time.
mongodump -umonitor -p'xxxxxx' --port 27017 -d buzz_master -c topic -q '{"_id" : {$gte:ObjectId("51031c80c8665789e6a9b7fe")}}' -o xiaorui.cc
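To come up with an ObjectId boundary for a given time, you can write the epoch seconds as 8 hex digits and pad the remaining 16 digits with zeros, which gives the smallest possible ObjectId for that second. A minimal sketch follows; both function names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: smallest ObjectId hex string for a given epoch second.
# The first 4 bytes (8 hex digits) hold the timestamp; the rest is zero-padded.
objectid_from_epoch() {
    printf '%08x0000000000000000\n' "$1"
}

# Hypothetical helper: read the epoch seconds back out of an ObjectId
# by decoding its first 8 hex digits.
objectid_to_epoch() {
    prefix=$(printf '%s' "$1" | cut -c1-8)
    printf '%d\n' "0x$prefix"
}

objectid_from_epoch 1359158400                 # prints 51031c800000000000000000
objectid_to_epoch 51031c80c8665789e6a9b7fe     # prints 1359158400
```

Decoding the ObjectId from the command above gives 1359158400, which is 2013-01-26 00:00:00 UTC. This only helps when _id is a real ObjectId generated at insert time; collections with integer _id values, like the sample documents at the top, have to be filtered on created_on instead.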