September 06, 2019
This post is about syncing your MongoDB data to ElasticSearch. There are several scenarios where you might want this: searching your data quickly, exposing a search API, or running Grafana to visualize your data.
MongoConnector is an open source tool that syncs MongoDB data to ElasticSearch. You can run it periodically or continuously, and it syncs all the changes in your MongoDB data, so ElasticSearch holds a replica of the MongoDB data.
You can configure which MongoDB collections to sync and what names their indexes should have in ElasticSearch.
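The collection-to-index mapping can also be generated rather than typed by hand. The following is a minimal sketch, assuming the "namespaces" layout used in the config file later in this post, where each "db.collection" key is renamed to "index._doc". The helper name build_namespaces and the "<db>_<collection>" index naming are illustrative choices, not part of mongo-connector.

```python
import json


def build_namespaces(db, collections, doc_type="_doc"):
    """Map each 'db.collection' namespace to an 'index.type' rename target.

    Hypothetical helper: the index name '<db>_<collection>' mirrors the
    naming used in this post's config file, it is not required by the tool.
    """
    namespaces = {}
    for coll in collections:
        # ElasticSearch index names must be lowercase.
        index = f"{db}_{coll}".lower()
        namespaces[f"{db}.{coll}"] = {"rename": f"{index}.{doc_type}"}
    return namespaces


# Print a fragment that can be pasted into mongoconnector.json.
fragment = {"namespaces": build_namespaces("mydb", ["coll1", "trainings"])}
print(json.dumps(fragment, indent=2))
```

Generating this section is mostly useful when the list of collections is long or changes often; for two collections, writing it by hand is just as good.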
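MongoDB Replica Set. You need a MongoDB replica set. A standalone instance will not work, because mongo-connector reads changes from the oplog, which only a replica set maintains. See: Run MongoDB replica set with Docker
ElasticSearch Cluster. See: Run Elastic Search Cluster with Docker
mongo-connector utility. See the installation steps below.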
To install mongo-connector, you need Python installed. Then install it via pip:
pip install 'mongo-connector[elastic5]' 'elastic2-doc-manager[elastic5]'
Alternatively, you can build a Docker image for it. See the Dockerfile below:
FROM python:3-alpine
RUN apk add --no-cache curl sed && pip install 'mongo-connector[elastic5]' 'elastic2-doc-manager[elastic5]'
ENTRYPOINT ["mongo-connector"]
To build the Docker image:
docker build -t my_mongoconnector .
Prepare a config file named mongoconnector.json:
{
  "oplogFile": "<your desired path>/oplog.timestamp",
  "noDump": false,
  "batchSize": 50,
  "verbosity": 2,
  "continueOnError": true,
  "logging": {
    "type": "stream"
  },
  "namespaces": {
    "mydb.coll1": {
      "rename": "mydb_coll1._doc"
    },
    "mydb.trainings": {
      "rename": "mydb_trainings._doc"
    }
  },
  "docManagers": [
    {
      "docManager": "elastic2_doc_manager",
      "targetURL": "<elastic search hostname>:9200",
      "bulkSize": 10,
      "uniqueKey": "_id",
      "args": {
        "clientOptions": {"timeout": 5000}
      }
    }
  ]
}
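In the above config file, oplogFile is the file where mongo-connector records the timestamp of the last oplog entry it processed, namespaces maps each MongoDB collection to the ElasticSearch index and type it is synced to (here mydb.coll1 goes to index mydb_coll1 with type _doc), and docManagers points at the target ElasticSearch host. To start the sync, run: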
mongo-connector -m "mongodb://<mongoset1>:27017,<mongoset2>:27018,<mongoset3>:27019/<your db>?replicaSet=your-replicaset-name" -c ./mongoconnector.json
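The connection string in the command above is easy to get wrong when there are several replica set members. Here is a small sketch that assembles it from its parts and builds the same command line. The function names and the example values (rs0, mydb, the host names) are assumptions for illustration; only the -m and -c flags come from the command above.

```python
import shlex
from urllib.parse import quote


def build_mongo_uri(hosts, database, replica_set):
    """Build a replica set connection string from 'host:port' entries."""
    host_list = ",".join(hosts)
    return f"mongodb://{host_list}/{quote(database)}?replicaSet={quote(replica_set)}"


def build_command(uri, config_path):
    """Return the mongo-connector invocation as an argument list."""
    return ["mongo-connector", "-m", uri, "-c", config_path]


# Example values are placeholders; substitute your own hosts and names.
uri = build_mongo_uri(
    ["mongoset1:27017", "mongoset2:27018", "mongoset3:27019"], "mydb", "rs0"
)
cmd = build_command(uri, "./mongoconnector.json")

# shlex.join quotes the URI so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping the command as an argument list also makes it straightforward to hand to subprocess.run if you want to launch mongo-connector from a script.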
If everything is fine, mongo-connector will start syncing your MongoDB data to the ElasticSearch cluster you specified, and log output like this:
2019-09-06 08:17:05,189 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2019-09-06 08:17:05,189 [ALWAYS] mongo_connector.connector:50 - Python version: 3.6.8 (default, Apr 25 2019, 21:02:35)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
2019-09-06 08:17:05,190 [ALWAYS] mongo_connector.connector:50 - Platform: Linux-3.10.0-957.21.3.el7.x86_64-x86_64-with-centos-7.6.1810-Core
2019-09-06 08:17:05,191 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.9.0
2019-09-06 08:17:05,204 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.0
2019-09-06 08:17:05,204 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.elastic2_doc_manager version: 1.0.0
2019-09-06 08:17:05,225 [INFO] mongo_connector.oplog_manager:137 - OplogThread: Initializing oplog thread
2019-09-06 08:17:05,227 [INFO] mongo_connector.connector:402 - MongoConnector: Starting connection thread MongoClient(host=['mongoset1:27018', 'mongoset1:27017', 'mongoset1:27019'], document_class=dict, tz_aware=False, connect=True, replicaset='your-replica-set')
2019-09-06 08:17:05,241 [INFO] elasticsearch:83 - GET http://<es-hostname>:9200/_mget?realtime=true [status:200 request:0.007s]
2019-09-06 08:17:05,356 [INFO] elasticsearch:83 - POST http://<es-hostname>:9200/_bulk [status:200 request:0.110s]
2019-09-06 08:17:05,477 [INFO] elasticsearch:83 - POST http://<es-hostname>:9200/_refresh [status:200 request:0.121s]
2019-09-06 08:17:05,484 [INFO] elasticsearch:83 - GET http://<es-hostname>:9200/_mget?realtime=true [status:200 request:0.006s]
2019-09-06 08:17:05,616 [INFO] elasticsearch:83 - POST http://<es-hostname>:9200/_bulk [status:200 request:0.129s]
2019-09-06 08:17:05,744 [INFO] elasticsearch:83 - POST http://<es-hostname>:9200/_refresh [status:200 request:0.128s]
.
.
.
.
.
2019-09-06 08:18:35,294 [INFO] mongo_connector.oplog_manager:78 - OplogThread for replica set 'your replica set' is up to date with the oplog.
2019-09-06 08:19:05,324 [INFO] mongo_connector.oplog_manager:78 - OplogThread for replica set 'your replica set' is up to date with the oplog.
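The "is up to date with the oplog" lines above are a handy signal that the initial dump has finished and the connector has caught up. As a rough sketch, assuming the log format shown in this post, a script could watch for that message like this. The function name first_up_to_date is hypothetical, and the regular expression is written against the sample lines only.

```python
import re

# Matches lines such as:
# 2019-09-06 08:18:35,294 [INFO] ... OplogThread for replica set 'x' is up to date with the oplog.
UP_TO_DATE = re.compile(
    r"^(?P<ts>\S+ \S+) \[INFO\] .*"
    r"OplogThread for replica set '(?P<rs>[^']*)' is up to date with the oplog"
)


def first_up_to_date(lines):
    """Return (timestamp, replica set name) of the first caught-up line, or None."""
    for line in lines:
        match = UP_TO_DATE.match(line)
        if match:
            return match.group("ts"), match.group("rs")
    return None


sample = [
    "2019-09-06 08:17:05,225 [INFO] mongo_connector.oplog_manager:137 - "
    "OplogThread: Initializing oplog thread",
    "2019-09-06 08:18:35,294 [INFO] mongo_connector.oplog_manager:78 - "
    "OplogThread for replica set 'your replica set' is up to date with the oplog.",
]
print(first_up_to_date(sample))
```

The same function works on a live stream by passing it an open file or sys.stdin instead of a list.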
As it syncs, mongo-connector also updates the timestamp in the oplog file (the oplogFile path from the config).
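To see how far the sync has progressed, the stored timestamp can be turned into a readable date. The sketch below assumes the value in the oplog file is a BSON timestamp packed into one 64-bit integer, with the seconds since the epoch in the high 32 bits and an increment counter in the low 32 bits. That layout is an assumption here, so check the actual contents of your file before relying on it. The function name decode_checkpoint is illustrative.

```python
from datetime import datetime, timezone


def decode_checkpoint(value):
    """Split a packed 64-bit timestamp into (UTC datetime, increment).

    Assumes: high 32 bits = seconds since epoch, low 32 bits = increment.
    """
    seconds = value >> 32
    increment = value & 0xFFFFFFFF
    return datetime.fromtimestamp(seconds, tz=timezone.utc), increment


# Illustrative value built by hand, not taken from a real oplog file.
packed = (1567757825 << 32) + 7
when, inc = decode_checkpoint(packed)
print(when.isoformat(), inc)
```

If the decoded date stops advancing while the database is still receiving writes, that is a hint the connector has stalled or stopped.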