SlideShare a Scribd company logo
1 of 19
Building Big Data applications with Stratio Sparta
SPARTA 2.0
José Carlos García Serrano
Big Data Architect in Stratio.
I am from Granada and Computer Science Engineer in the
ETSII, post graduate in Big Data and certificate in Spark and
AWS
def fanBoy(): Seq[Skills] = {
val functional = Seq(Scala, Akka)
val processing = Seq(Spark)
val noSql = Seq(MongoDB, Cassandra)
functional ++ processing ++ noSql
}
def aLongTimeAgo(): Seq[Skills] = {
val programming = Seq(Delphi, C++)
val processing = Seq(Hadoop)
val sql = Seq(Interbase, FireBird)
programming ++ processing ++ sql
}
Sparktan one
Sparktan two
Javier Yuste Checa
Product Owner at Stratio.
I am from Madrid and I have a Master in Computer Science by
the UPM, with 10+ years of hands-on software development
experience.
I love travelling and motorbikes
I like:
● Software development
● Agile methodologies
● Product development
● Make things happen
What is Sparta?1
Questions4
New version2 Use case - Demo3
Index
WHAT IS SPARTA?
1
© Stratio 2016. Confidential, All Rights Reserved.
Towards a generic real-time aggregation platform
7
At Stratio, we have implemented several real-time analytic projects based on Apache Spark,
Kafka, Flume, Cassandra, or MongoDB.
These technologies were always a perfect fit, but soon we found ourselves writing the same
pieces of integration code over and over again.
This is how SPARTA was born
© Stratio 2016. Confidential, All Rights Reserved.
SPARTA - Beginning
8
The goals
○ Pure Spark!
○ No need of coding, only declarative workflows
○ Data continuously streamed in and processed in near real-time
○ Ready to use out of the box
○ Plug & play: flexible workflows (inputs, outputs, parsers, etc…)
○ High performance
○ Scalable and fault tolerant
○ Stateful operations with OLAP engine
○ Execute SQL over Streaming and batch data
© Stratio 2016. Confidential, All Rights Reserved.
SPARTA 1.0
9
Kafka
Flume
Crossdata
RabbitMQ
Socket
WebSocket
HDFS/S3
Twitter
MongoDB
Cassandra
ElasticSearch
Redis
JDBC
Crossdata
CSV
Parquet
Http
Kafka
HDFS/S3
Http Rest
Avro
© Stratio 2016. Confidential, All Rights Reserved.
SPARTA 1.0
10
NEW VERSION
2
© Stratio 2016. Confidential, All Rights Reserved.
SPARTA 2.0
12
Security
○ Data
○ Resource
○ Service
Distributed
Multiprocessing
○ Streaming
○ Batch
○ SQL / OLAP
○ Machine learning
SaaS & Multi
Tenancy
○ Sparta as a Service
○ Multi User
○ Multi Tenant
○ Multi Instance
© Stratio 2016. Confidential, All Rights Reserved.
EOS - Data centric Suite
13
USE CASES
DEMO
3
© Stratio 2017. Confidential, All Rights Reserved.
Big Data use case
15
• Different technologies and skills
needed
• Experience solving complex problems
• Knowledge of Big data architectures
© Stratio 2017. Confidential, All Rights Reserved.
Big Data use case
16
© Stratio 2017. Confidential, All Rights Reserved.
Simple use case
17
Input Test:
{
"event":"Big data spain",
"speech":"Sparta",
"attendees":10,
“address”: ”Kinepolis”
}
Select:
speech, attendees
Cube:
dimensions:
speech
operations:
sum(attendees)
BIG DATA
CHILD`S PLAY
THANKS
Contacts:
jcgarcia@stratio.com
es.linkedin.com/in/gserranojc
jyuste@stratio.com
es.linkedin.com/in/jyustecheca
Stratio Sparta 2.0

More Related Content

What's hot

Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingAmir Sedighi
 
Using druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scaleUsing druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scaleItai Yaffe
 
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015Institute e-Austria Timisoara
 
Technology behind-real-time-log-analytics
Technology behind-real-time-log-analytics Technology behind-real-time-log-analytics
Technology behind-real-time-log-analytics Data Science Thailand
 
Talend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark MeetupTalend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark MeetupModern Data Stack France
 
Open source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applicationsOpen source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applicationsSoftwareMill
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Anton Nazaruk
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Spark Summit
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkDataWorks Summit
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Spark Summit
 
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...Databricks
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learningRajesh Muppalla
 
Trends for Big Data and Apache Spark in 2017 by Matei Zaharia
Trends for Big Data and Apache Spark in 2017 by Matei ZahariaTrends for Big Data and Apache Spark in 2017 by Matei Zaharia
Trends for Big Data and Apache Spark in 2017 by Matei ZahariaSpark Summit
 
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
 Application and Challenges of Streaming Analytics and Machine Learning on Mu... Application and Challenges of Streaming Analytics and Machine Learning on Mu...
Application and Challenges of Streaming Analytics and Machine Learning on Mu...Databricks
 
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...sparktc
 
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...Spark Summit
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alina Vilk
 
British Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data EngineeringBritish Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data EngineeringDataStax Academy
 

What's hot (20)

Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData Processing
 
Using druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scaleUsing druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scale
 
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
 
Technology behind-real-time-log-analytics
Technology behind-real-time-log-analytics Technology behind-real-time-log-analytics
Technology behind-real-time-log-analytics
 
Talend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark MeetupTalend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark Meetup
 
Open source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applicationsOpen source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applications
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
 
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
 
Trends for Big Data and Apache Spark in 2017 by Matei Zaharia
Trends for Big Data and Apache Spark in 2017 by Matei ZahariaTrends for Big Data and Apache Spark in 2017 by Matei Zaharia
Trends for Big Data and Apache Spark in 2017 by Matei Zaharia
 
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
 Application and Challenges of Streaming Analytics and Machine Learning on Mu... Application and Challenges of Streaming Analytics and Machine Learning on Mu...
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
 
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
 
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier Dominguez
 
Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.
 
British Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data EngineeringBritish Gas Connected Homes: Data Engineering
British Gas Connected Homes: Data Engineering
 

Similar to Stratio Sparta 2.0

Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Anant Corporation
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaItai Yaffe
 
An introduction To Apache Spark
An introduction To Apache SparkAn introduction To Apache Spark
An introduction To Apache SparkAmir Sedighi
 
Analyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkAnalyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkNicola Ferraro
 
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaDatabricks
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache SparkMammoth Data
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaDataWorks Summit
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Taro L. Saito
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analyticsinoshg
 
PyconZA19-Distributed-workloads-challenges-with-PySpark-and-Airflow
PyconZA19-Distributed-workloads-challenges-with-PySpark-and-AirflowPyconZA19-Distributed-workloads-challenges-with-PySpark-and-Airflow
PyconZA19-Distributed-workloads-challenges-with-PySpark-and-AirflowChetan Khatri
 
Jump Start into Apache® Spark™ and Databricks
Jump Start into Apache® Spark™ and DatabricksJump Start into Apache® Spark™ and Databricks
Jump Start into Apache® Spark™ and DatabricksDatabricks
 
Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark
Project Hydrogen: State-of-the-Art Deep Learning on Apache SparkProject Hydrogen: State-of-the-Art Deep Learning on Apache Spark
Project Hydrogen: State-of-the-Art Deep Learning on Apache SparkDatabricks
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Michael Rys
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Paulo Gutierrez
 
Getting Started with Spark Scala
Getting Started with Spark ScalaGetting Started with Spark Scala
Getting Started with Spark ScalaKnoldus Inc.
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoTJim Haughwout
 
Sanath pabba hadoop resume 1.0
Sanath pabba hadoop resume 1.0Sanath pabba hadoop resume 1.0
Sanath pabba hadoop resume 1.0Pabba Gupta
 

Similar to Stratio Sparta 2.0 (20)

Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and Kafka
 
An introduction To Apache Spark
An introduction To Apache SparkAn introduction To Apache Spark
An introduction To Apache Spark
 
Analyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkAnalyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache Spark
 
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analytics
 
PyconZA19-Distributed-workloads-challenges-with-PySpark-and-Airflow
PyconZA19-Distributed-workloads-challenges-with-PySpark-and-AirflowPyconZA19-Distributed-workloads-challenges-with-PySpark-and-Airflow
PyconZA19-Distributed-workloads-challenges-with-PySpark-and-Airflow
 
Jump Start into Apache® Spark™ and Databricks
Jump Start into Apache® Spark™ and DatabricksJump Start into Apache® Spark™ and Databricks
Jump Start into Apache® Spark™ and Databricks
 
Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark
Project Hydrogen: State-of-the-Art Deep Learning on Apache SparkProject Hydrogen: State-of-the-Art Deep Learning on Apache Spark
Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark
 
ASPgems - kappa architecture
ASPgems - kappa architectureASPgems - kappa architecture
ASPgems - kappa architecture
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
 
Data streaming
Data streamingData streaming
Data streaming
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要
 
Getting Started with Spark Scala
Getting Started with Spark ScalaGetting Started with Spark Scala
Getting Started with Spark Scala
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
 
Sanath pabba hadoop resume 1.0
Sanath pabba hadoop resume 1.0Sanath pabba hadoop resume 1.0
Sanath pabba hadoop resume 1.0
 

More from Stratio

Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...
Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...
Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...Stratio
 
Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18Stratio
 
Kafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka MeetupKafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka MeetupStratio
 
Wild Data - The Data Science Meetup
Wild Data - The Data Science MeetupWild Data - The Data Science Meetup
Wild Data - The Data Science MeetupStratio
 
Using Kafka on Event-driven Microservices Architectures - Apache Kafka Meetup
Using Kafka on Event-driven Microservices Architectures - Apache Kafka MeetupUsing Kafka on Event-driven Microservices Architectures - Apache Kafka Meetup
Using Kafka on Event-driven Microservices Architectures - Apache Kafka MeetupStratio
 
Ensemble methods in Machine Learning
Ensemble methods in Machine Learning Ensemble methods in Machine Learning
Ensemble methods in Machine Learning Stratio
 
Introduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksIntroduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksStratio
 
“A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack” “A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack” Stratio
 
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...Stratio
 
Lunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelosLunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelosStratio
 
Meetup: Spark + Kerberos
Meetup: Spark + KerberosMeetup: Spark + Kerberos
Meetup: Spark + KerberosStratio
 
Distributed Logistic Model Trees
Distributed Logistic Model TreesDistributed Logistic Model Trees
Distributed Logistic Model TreesStratio
 
Multiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesMultiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesStratio
 
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016Stratio
 
[Strata] Sparkta
[Strata] Sparkta[Strata] Sparkta
[Strata] SparktaStratio
 
Introduction to Asynchronous scala
Introduction to Asynchronous scalaIntroduction to Asynchronous scala
Introduction to Asynchronous scalaStratio
 
Functional programming in scala
Functional programming in scalaFunctional programming in scala
Functional programming in scalaStratio
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Stratio
 
Advanced search and Top-K queries in Cassandra
Advanced search and Top-K queries in CassandraAdvanced search and Top-K queries in Cassandra
Advanced search and Top-K queries in CassandraStratio
 
[Spark meetup] Spark Streaming Overview
[Spark meetup] Spark Streaming Overview[Spark meetup] Spark Streaming Overview
[Spark meetup] Spark Streaming OverviewStratio
 

More from Stratio (20)

Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...
Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...
Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...
 
Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18
 
Kafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka MeetupKafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka Meetup
 
Wild Data - The Data Science Meetup
Wild Data - The Data Science MeetupWild Data - The Data Science Meetup
Wild Data - The Data Science Meetup
 
Using Kafka on Event-driven Microservices Architectures - Apache Kafka Meetup
Using Kafka on Event-driven Microservices Architectures - Apache Kafka MeetupUsing Kafka on Event-driven Microservices Architectures - Apache Kafka Meetup
Using Kafka on Event-driven Microservices Architectures - Apache Kafka Meetup
 
Ensemble methods in Machine Learning
Ensemble methods in Machine Learning Ensemble methods in Machine Learning
Ensemble methods in Machine Learning
 
Introduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksIntroduction to Artificial Neural Networks
Introduction to Artificial Neural Networks
 
“A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack” “A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack”
 
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
 
Lunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelosLunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelos
 
Meetup: Spark + Kerberos
Meetup: Spark + KerberosMeetup: Spark + Kerberos
Meetup: Spark + Kerberos
 
Distributed Logistic Model Trees
Distributed Logistic Model TreesDistributed Logistic Model Trees
Distributed Logistic Model Trees
 
Multiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesMultiplaform Solution for Graph Datasources
Multiplaform Solution for Graph Datasources
 
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
 
[Strata] Sparkta
[Strata] Sparkta[Strata] Sparkta
[Strata] Sparkta
 
Introduction to Asynchronous scala
Introduction to Asynchronous scalaIntroduction to Asynchronous scala
Introduction to Asynchronous scala
 
Functional programming in scala
Functional programming in scalaFunctional programming in scala
Functional programming in scala
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
 
Advanced search and Top-K queries in Cassandra
Advanced search and Top-K queries in CassandraAdvanced search and Top-K queries in Cassandra
Advanced search and Top-K queries in Cassandra
 
[Spark meetup] Spark Streaming Overview
[Spark meetup] Spark Streaming Overview[Spark meetup] Spark Streaming Overview
[Spark meetup] Spark Streaming Overview
 

Recently uploaded

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 

Recently uploaded (20)

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 

Stratio Sparta 2.0

  • 1.
  • 2. Building Big Data applications with Stratio Sparta SPARTA 2.0
  • 3. José Carlos García Serrano Big Data Architect in Stratio. I am from Granada and Computer Science Engineer in the ETSII, post graduate in Big Data and certificate in Spark and AWS def fanBoy(): Seq[Skills] = { val functional = Seq(Scala, Akka) val processing = Seq(Spark) val noSql = Seq(MongoDB, Cassandra) functional ++ processing ++ noSql } def aLongTimeAgo(): Seq[Skills] = { val programming = Seq(Delphi, C++) val processing = Seq(Hadoop) val sql = Seq(Interbase, FireBird) programming ++ processing ++ sql } Sparktan one
  • 4. Sparktan two Javier Yuste Checa Product Owner at Stratio. I am from Madrid and I have a Master in Computer Science by the UPM, with 10+ years of hands-on software development experience. I love travelling and motorbikes I like: ● Software development ● Agile methodologies ● Product development ● Make things happen
  • 5. What is Sparta?1 Questions4 New version2 Use case - Demo3 Index
  • 7. © Stratio 2016. Confidential, All Rights Reserved. Towards a generic real-time aggregation platform 7 At Stratio, we have implemented several real-time analytic projects based on Apache Spark, Kafka, Flume, Cassandra, or MongoDB. These technologies were always a perfect fit, but soon we found ourselves writing the same pieces of integration code over and over again. This is how SPARTA was born
  • 8. © Stratio 2016. Confidential, All Rights Reserved. SPARTA - Beginning 8 The goals ○ Pure Spark! ○ No need of coding, only declarative workflows ○ Data continuously streamed in and processed in near real-time ○ Ready to use out of the box ○ Plug & play: flexible workflows (inputs, outputs, parsers, etc…) ○ High performance ○ Scalable and fault tolerant ○ Stateful operations with OLAP engine ○ Execute SQL over Streaming and batch data
  • 9. © Stratio 2016. Confidential, All Rights Reserved. SPARTA 1.0 9 Kafka Flume Crossdata RabbitMQ Socket WebSocket HDFS/S3 Twitter MongoDB Cassandra ElasticSearch Redis JDBC Crossdata CSV Parquet Http Kafka HDFS/S3 Http Rest Avro
  • 10. © Stratio 2016. Confidential, All Rights Reserved. SPARTA 1.0 10
  • 12. © Stratio 2016. Confidential, All Rights Reserved. SPARTA 2.0 12 Security ○ Data ○ Resource ○ Service Distributed Multiprocessing ○ Streaming ○ Batch ○ SQL / OLAP ○ Machine learning SaaS & Multi Tenancy ○ Sparta as a Service ○ Multi User ○ Multi Tenant ○ Multi Instance
  • 13. © Stratio 2016. Confidential, All Rights Reserved. EOS - Data centric Suite 13
  • 15. © Stratio 2017. Confidential, All Rights Reserved. Big Data use case 15 • Different technologies and skills needed • Experience solving complex problems • Knowledge of Big data architectures
  • 16. © Stratio 2017. Confidential, All Rights Reserved. Big Data use case 16
  • 17. © Stratio 2017. Confidential, All Rights Reserved. Simple use case 17 Input Test: { "event":"Big data spain", "speech":"Sparta", "attendees":10, “address”: ”Kinepolis” } Select: speech, attendees Cube: dimensions: speech operations: sum(attendees)