How we at eXelate built an ETL pipeline for Elasticsearch using Spark, including:
* Processing the data using Spark.
* Indexing the processed data directly into Elasticsearch using the elasticsearch-hadoop plug-in for Spark.
* Managing the flow using some of the services provided by AWS (EMR, Data Pipeline, etc.).
The presentation includes tips and discusses some of the pitfalls we encountered while setting up this process.
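As a rough illustration of the indexing step listed above, here is a minimal PySpark sketch of writing a DataFrame to Elasticsearch through elasticsearch-hadoop. It is not the code used at eXelate. The helper names (`es_write_options`, `index_dataframe`), the host names, the `events/event` resource, and the `event_id` field are all hypothetical. The `es.*` keys are standard elasticsearch-hadoop settings, and the sketch assumes the connector jar is already on the Spark classpath.

```python
def es_write_options(nodes, port=9200, id_field=None, batch_entries=1000):
    """Build elasticsearch-hadoop write options as a dict of strings.

    Hypothetical helper: only the "es.*" keys come from the connector.
    """
    options = {
        "es.nodes": ",".join(nodes),                   # Elasticsearch hosts to connect to
        "es.port": str(port),                          # HTTP port of those hosts
        "es.batch.size.entries": str(batch_entries),   # documents per bulk request, per task
    }
    if id_field is not None:
        # Use a DataFrame column as the document _id, so re-running the job
        # overwrites documents instead of duplicating them.
        options["es.mapping.id"] = id_field
    return options


def index_dataframe(df, resource, options):
    """Append the rows of a Spark DataFrame to an Elasticsearch resource.

    `df` is assumed to be a pyspark.sql.DataFrame and `resource` an
    "index/type" string (illustrative, e.g. "events/event").
    """
    (df.write
       .format("org.elasticsearch.spark.sql")  # data source provided by elasticsearch-hadoop
       .options(**options)
       .mode("append")
       .save(resource))


# Illustrative usage inside a Spark job (host names and field are placeholders):
#   opts = es_write_options(["es1.example.com", "es2.example.com"], id_field="event_id")
#   index_dataframe(processed_df, "events/event", opts)
```

The connector is typically supplied to the job with `spark-submit --jars` or `--packages`. Keeping the options in a plain dict makes it easy to vary settings such as the bulk batch size per environment without touching the job logic.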
Building an ETL pipeline for Elasticsearch using Spark
©2014 eXelate Inc. Confidential and Proprietary
Itai Yaffe, Big Data Infrastructure Developer
December 2015