For the last couple of months I have been exploring Elasticsearch, and I have shared a few articles about how impressive the technology behind it is. It can also be combined with other projects such as Spark, extending the search capabilities of Elasticsearch with the real-time distributed analytics and machine learning that Spark offers.
In the previous article (Part 1), we installed the ELK stack along with the ES-Hadoop connector and Spark, then built some visualizations in Kibana using the house price prediction data set from Kaggle. In this part we will start by adding Search Guard to the stack to define permissions and control access to our data and configuration. We will then implement our models with Spark MLlib, and finish by deploying them in a pipeline that predicts prices for new entries added to Elasticsearch.
ELK is an acronym for a combination of three widely used projects: E = Elasticsearch (built on Apache Lucene), L = Logstash and K = Kibana. Elasticsearch is written in Java, Logstash runs on the JVM (largely JRuby), and Kibana is a JavaScript application running on Node.js. All three were originally released as open source under the Apache 2.0 license. The addition of Beats turned the stack into a four-legged project and led to its renaming as the “Elastic Stack”, but in this article we will keep using the familiar name ELK.
Having been a machine learning practitioner and enthusiast for a while now, I have discovered some interesting facts about life: a lot of what I used to describe as random actually follows well-known patterns and behaves in a fairly predictable way.