Serverless Clojure and ML prototyping: an experience report
Toni Väisänen & Kimmo Koskinen
Helsinki Clojure Meetup
21.6.2022
Table of Contents
● Background
● The Team
● Technologies
○ Serverless Infrastructure, AWS, Terraform
○ CI/CD, GitHub Actions
○ Clojure
○ ML, NLP
● Closing Thoughts
Background
https://repliance.com/product.html
The Team
Toni: Fullstacker with an ML angle
● The main dev in the project
Clojurians, Koodiklinikka: @tvaisanen
Kimmo: Long-time Clojure enthusiast, likes to dabble in data projects
● The mentor in the project
https://twitter.com/KimmoKoskinen & https://github.com/viesti
The Application
Serverless Infrastructure
● AWS Organizations
○ AWS Root account and AWS account per client
■ Separate infra for each client
○ Terraform modules for logical parts of the infra
● DynamoDB is the database; ML runs on demand with SageMaker
○ Serverless infra requires a serverless database
○ GPUs on demand
● API Gateway
○ Frontend uses AWS services directly via API Gateway
● Terraform defining everything
Serverless Infrastructure Simplified
CI / CD
● GitHub Actions
○ Build & Test
○ Deploys the development environment
■ GitHub Actions assumes an AWS IAM Role
● Production deploy
○ Publish build
■ Triggered by new tag push
■ Publish versioned release artifacts to S3
○ Deployment
■ Manually triggered workflow
■ Artifacts are downloaded from S3 and
■ Deployed with Terraform
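The publish step above can be sketched as a small helper that maps a pushed tag to a versioned S3 key. This is only an illustration in Python, not the project's workflow code: the `releases/` prefix, the artifact name and the function name are hypothetical. The only fact relied on is that GitHub Actions exposes a pushed tag as `refs/tags/<tag>` in `GITHUB_REF`.

```python
from typing import Optional

TAG_PREFIX = "refs/tags/"


def release_key(github_ref: str, artifact: str) -> Optional[str]:
    """Return the S3 key for a tagged release, or None for non-tag refs.

    The "releases/<tag>/<artifact>" layout is an assumption made for this
    sketch; the talk only says that versioned artifacts are published to S3.
    """
    if not github_ref.startswith(TAG_PREFIX):
        return None
    tag = github_ref[len(TAG_PREFIX):]
    if not tag:
        return None
    return f"releases/{tag}/{artifact}"


# A tag push yields a key; a branch push yields nothing to publish.
tag_key = release_key("refs/tags/v1.2.0", "event-processor.jar")
branch_key = release_key("refs/heads/main", "event-processor.jar")
```

Keeping the key derivation deterministic means the manually triggered deployment can find the same artifact again from nothing but the version.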
Clojure Applications
● Dashboard
○ ClojureScript Reagent Single Page Application
● Dashboard Backend
○ Node/ClojureScript Lambda
■ Presigns S3 URLs for upload and download
■ Node.js for faster cold start
● Event Processor
○ JVM/Clojure Lambda
○ Processes events from services such as:
■ SES, SQS, S3, SageMaker etc…
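One way to picture an event processor that serves several services is a dispatch on the event's source. The sketch below is in Python for illustration only (the project's Event Processor is a JVM/Clojure Lambda), and the handler table and return values are hypothetical. It relies on the record-level `eventSource` field that S3, SQS and SES events carry, the capitalised `EventSource` used by SNS, and the top-level `source` field of EventBridge events such as SageMaker job state changes.

```python
from typing import Any, Callable, Dict


def event_source(event: Dict[str, Any]) -> str:
    """Classify a Lambda event by the AWS service that produced it."""
    records = event.get("Records")
    if records:
        first = records[0]
        # S3, SQS and SES use "eventSource"; SNS capitalises the key.
        source = first.get("eventSource") or first.get("EventSource") or ""
        return source.split(":")[-1] or "unknown"
    # EventBridge events (e.g. from SageMaker) carry a top-level "source".
    source = event.get("source", "")
    if source.startswith("aws."):
        return source[len("aws."):]
    return "unknown"


# Hypothetical handlers; each would do the real work for its service.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "s3": lambda event: "process uploaded object",
    "sqs": lambda event: "start inference",
    "ses": lambda event: "process e-mail event",
    "sagemaker": lambda event: "record job status",
}


def dispatch(event: Dict[str, Any]) -> str:
    """Route an event to its handler, failing loudly on unknown sources."""
    source = event_source(event)
    handler = HANDLERS.get(source)
    if handler is None:
        raise ValueError(f"no handler for event source: {source}")
    return handler(event)
```

In Clojure the same shape would typically be a multimethod keyed on the classified source.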
Clojure REPL
Clojure Tooling
● ClojureScript
○ Shadow-CLJS
■ Builds the Dashboard SPA and
■ The Lambda that runs on Node.js
● JVM/Clojure
○ deps.edn for project configuration and
○ depstar for building the uberjar
● Babashka
○ Build, test and release tasks
○ bb.edn files stay small; task code is required (and shared) rather than inlined
● Kaocha for testing
Machine learning
Natural Language Processing
Deepset AI’s Haystack for NLP tasks
https://haystack.deepset.ai/pipeline_nodes/retriever
Natural Language Processing
● NLP tasks rely on source material against which natural language queries are asked:
○ Natural language texts (policy files)
○ Question-answer pairs (FAQ items)
● The source material collection is called the document store, which stores the data in an SQLite database
● In addition to the DB, the document store has another component (a FAISS index) that stores the vectorized representations (embeddings) of the text passages
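The split between a text store and a vector index can be illustrated with a brute-force similarity search. This is a toy sketch under stated assumptions: it uses neither Haystack nor FAISS, the three-dimensional vectors are made up, and the ids and passages are hypothetical. A plain dict stands in for the SQLite side, and a linear cosine-similarity scan stands in for what a FAISS index does far more efficiently.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], index: Dict[str, Sequence[float]], k: int = 1
) -> List[Tuple[str, float]]:
    """Return the k passage ids most similar to the query embedding."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Text side of the document store (SQLite in the talk, a dict here).
passages = {
    "policy-1#p3": "Backups are taken nightly.",
    "faq-17": "Q: Is data encrypted at rest? A: Yes.",
    "policy-2#p1": "Access is reviewed quarterly.",
}

# Vector side of the document store (a FAISS index in the talk).
# Real embeddings come from a model; these values are invented.
vector_index = {
    "policy-1#p3": [0.9, 0.1, 0.0],
    "faq-17": [0.1, 0.8, 0.3],
    "policy-2#p1": [0.0, 0.2, 0.9],
}

# The query embedding would also come from the model.
best_id, score = top_k([0.2, 0.9, 0.2], vector_index)[0]
best_passage = passages[best_id]
```

The vector lookup only returns ids, so the text itself is still fetched from the database side, which is why the document store needs both components.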
Natural Language Processing
“FAISS (Facebook AI Similarity Search) is a library that allows developers to quickly search for
embeddings of multimedia documents that are similar to each other. It solves limitations of
traditional query search engines that are optimized for hash-based searches, and provides more
scalable similarity search functions.”
User Workflow
● The user uploads a questionnaire file (Excel)
○ This triggers the Event Processor Lambda
○ The file is transformed to JSON and stored for later use
● A UI tool lets the user
○ select the question rows and
○ pick the property columns
● The selection is saved and stored for later use
● The user can also upload pre-answered questions
○ these go into the document store, where the answers are searched for
● The user can trigger inference
○ An event is sent to SQS, which fires a Lambda that triggers a SageMaker Batch Transform job 🧠
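The last step of the workflow, an SQS message turning into a Batch Transform job, might look roughly like the following. This is a sketch in Python rather than the project's Clojure, and it only builds the request instead of calling AWS. The message fields (`questionnaire_id`, `bucket`), the S3 key layout, the model name, the content type and the instance type are all assumptions; the dictionary keys follow the parameters of SageMaker's CreateTransformJob API.

```python
import json
from typing import Any, Dict, List


def transform_job_request(
    job_name: str, model_name: str, input_uri: str, output_uri: str
) -> Dict[str, Any]:
    """Build the parameters for a SageMaker CreateTransformJob call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_uri}
            },
            # Assumption: one JSON question per line, sent line by line.
            "ContentType": "application/jsonlines",
            "SplitType": "Line",
        },
        "TransformOutput": {"S3OutputPath": output_uri},
        # Assumption: a single GPU instance; the talk names no instance type.
        "TransformResources": {"InstanceType": "ml.g4dn.xlarge", "InstanceCount": 1},
    }


def handler(event: Dict[str, Any], context: Any = None) -> List[Dict[str, Any]]:
    """Turn each SQS record into a transform job request.

    A real handler would pass each request on to SageMaker, for example
    with boto3's create_transform_job(**request).
    """
    requests = []
    for record in event["Records"]:
        message = json.loads(record["body"])  # SQS delivers the body as a string
        qid = message["questionnaire_id"]     # hypothetical message field
        bucket = message["bucket"]            # hypothetical message field
        requests.append(
            transform_job_request(
                job_name=f"inference-{qid}",
                model_name="nlp-model",       # hypothetical model name
                input_uri=f"s3://{bucket}/questions/{qid}.jsonl",
                output_uri=f"s3://{bucket}/answers/{qid}/",
            )
        )
    return requests
```

Separating the request construction from the API call keeps the handler easy to test without AWS credentials.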
AWS SageMaker
Batch Transform
“Run inference when you don't need a persistent endpoint”
Inference Workflow
- On startup, the Batch Transform job fetches
- Policy files from S3
- FAQ items from DynamoDB
- Initializes the document store
- Pre-process policy files
- Create the embeddings
- Starts a web server (Flask)
- SageMaker reads the questions from S3
- SageMaker writes the answers to S3
- An S3 Put notification is triggered on each new object
- The Event Processor listens to these events and writes the results to DynamoDB
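The web server in the workflow above exists because SageMaker talks to an inference container over HTTP: it checks health with `GET /ping` and sends data with `POST /invocations` on port 8080. The sketch below shows that contract as a plain WSGI callable from the standard library rather than the Flask app used in the project. The request and response shape (`question`, `answer`) and the `answer_question` placeholder are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Iterable


def answer_question(question: str) -> str:
    """Hypothetical placeholder for the document-store lookup."""
    return f"(no answer found for: {question})"


def app(environ: Dict[str, Any], start_response: Callable) -> Iterable[bytes]:
    """Minimal WSGI app exposing the two routes SageMaker calls."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]

    # Health check: any 200 response tells SageMaker the container is ready.
    if method == "GET" and path == "/ping":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b""]

    # Inference: one request per input record when the job splits by line.
    if method == "POST" and path == "/invocations":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length))
        question = payload["question"]
        result = {"question": question, "answer": answer_question(question)}
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps(result).encode("utf-8")]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


# To serve it locally, one option is the standard library's
# wsgiref.simple_server.make_server("0.0.0.0", 8080, app).serve_forever()
```

Because the document store is initialized before the server starts, every `/invocations` request can be answered from memory without touching S3 or DynamoDB again.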
End Result
Closing Thoughts
● The project continues; the next phase will reveal how this actually works 😬
○ But there are more angles, too
● Pros
○ Interesting technology
○ Exploratory coding
○ Full stack: Infra, Backend, Frontend, ML, Design, UX, you name it!
● Cons
○ Complexity creeping, how to maintain…
○ See last pros bullet :D
● Learnings
○ Using tools that fit the job is good
○ ML & Serverless are not too difficult with Clojure

The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 

Serverless Clojure ML Prototyping Report

  • 1. Serverless Clojure and ML prototyping: an experience report Toni Väisänen & Kimmo Koskinen Helsinki Clojure Meetup 21.6.2022
  • 2. Table of Contents ● Background ● The Team ● Technologies ○ Serverless Infrastructure, AWS, Terraform ○ CI/CD, GitHub Actions ○ Clojure ○ ML, NLP ● Closing Thoughts
  • 4. The Team Toni: Fullstacker with an ML angle ● The main dev in the project Clojurians, Koodiklinikka: @tvaisanen Kimmo: Long time Clojure enthusiast, likes to dabble in data projects ● The mentor in the project https://twitter.com/KimmoKoskinen & https://github.com/viesti
  • 6. Serverless Infrastructure ● AWS Organizations ○ AWS Root account and an AWS account per client ■ Separate infra for each client ○ Terraform modules for logical parts of the infra ● DynamoDB is used as the database; ML is run on demand with SageMaker ○ Serverless infra requires a serverless database ○ GPUs on demand ● API Gateway ○ Frontend uses AWS services directly via API Gateway ● Terraform defines everything
  • 8. CI / CD ● GitHub Actions ○ Build & Test ○ Deploys the development environment ■ GitHub Actions assumes an AWS IAM Role ● Production deploy ○ Publish build ■ Triggered by new tag push ■ Publish versioned release artifacts to S3 ○ Deployment ■ Manually triggered workflow ■ Artifacts are downloaded from S3 and ■ Deployed with Terraform
  • 9. Clojure Applications ● Dashboard ○ ClojureScript Reagent Single Page Application ● Dashboard Backend ○ Node/ClojureScript Lambda ■ Used to presign S3 URLs for upload and download ■ Node.js for faster cold start ● Event Processor ○ JVM/Clojure Lambda ○ Processes events from services such as: ■ SES, SQS, S3, SageMaker etc…
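The Event Processor described above receives events from several AWS services in one Lambda. The project's Lambda is JVM/Clojure; the following is only a minimal Python sketch of how such routing could look, assuming the standard event shapes (S3, SQS and SES events carry `Records[].eventSource`, EventBridge events such as SageMaker job state changes carry a top-level `source`). The handler bodies and the names `event_source`, `HANDLERS` and `handler` are hypothetical.

```python
from typing import Any, Callable, Dict


def event_source(event: Dict[str, Any]) -> str:
    """Return a short label for the AWS service that produced a Lambda event."""
    # S3, SQS and SES deliver a "Records" list whose items carry "eventSource".
    records = event.get("Records") or []
    if records:
        source = records[0].get("eventSource", "")
        if source.startswith("aws:"):
            return source[len("aws:"):]
    # EventBridge events (e.g. SageMaker job state changes) use a top-level "source".
    source = event.get("source", "")
    if source.startswith("aws."):
        return source[len("aws."):]
    return "unknown"


# Hypothetical handlers; real ones would parse the payload and persist results.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "s3": lambda event: "handled S3 object event",
    "sqs": lambda event: "handled SQS message",
    "ses": lambda event: "handled SES mail event",
    "sagemaker": lambda event: "handled SageMaker state change",
}


def handler(event: Dict[str, Any], context: Any = None) -> str:
    """Lambda entry point: dispatch on the service that produced the event."""
    source = event_source(event)
    handle = HANDLERS.get(source)
    if handle is None:
        raise ValueError(f"unsupported event source: {source}")
    return handle(event)
```

Keeping the source detection in a small pure function makes the dispatch easy to test at a REPL without deploying anything.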
  • 11. Clojure Tooling ● ClojureScript ○ Shadow-CLJS ■ Builds the Dashboard SPA and ■ The Lambda that runs on Node.js ● JVM/Clojure ○ deps.edn for project configuration and ○ depstar for building the uberjar ● Babashka ○ Build, test and release tasks ○ bb.edn files stay small; task code is required and shared ● Kaocha for testing
  • 13. Natural Language Processing Deepset AI’s Haystack for NLP tasks https://haystack.deepset.ai/pipeline_nodes/retriever
  • 16. Natural Language Processing ● NLP tasks are based on having source material against which natural language queries are asked. ○ Natural language texts (policy files) ○ Question-answer pairs (FAQ items) ● The source material collection is called the document store, which stores the data in an SQLite database ● In addition to the DB, the document store has another component (a FAISS index) that stores the vectorized representations (embeddings) of text passages
  • 17. Natural Language Processing “FAISS (Facebook AI Similarity Search) is a library that allows developers to quickly search for embeddings of multimedia documents that are similar to each other. It solves limitations of traditional query search engines that are optimized for hash-based searches, and provides more scalable similarity search functions.”
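To make the idea of embedding similarity search concrete, here is a toy sketch in plain Python. It is not the FAISS API and not the project's code: it does a brute-force cosine-similarity scan over hand-written three-dimensional vectors, which is roughly the operation a flat index performs at much larger scale. The passages, vectors and function names are invented for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    passages: Sequence[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k passages whose embeddings are most similar to the query."""
    scored = [(text, cosine(query, vector)) for text, vector in passages]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Invented passages with made-up embeddings; a real model produces
# vectors with hundreds of dimensions.
PASSAGES = [
    ("Backups are taken daily.", [0.9, 0.1, 0.0]),
    ("Data is encrypted at rest.", [0.1, 0.9, 0.1]),
    ("Offices are in Helsinki.", [0.0, 0.1, 0.9]),
]

# A query embedding that points mostly along the second dimension.
best = top_k([0.2, 0.8, 0.0], PASSAGES, k=1)
print(best[0][0])  # Data is encrypted at rest.
```

In the setup described in the slides, the text of each passage would live in the SQLite database while the vectors live in the FAISS index.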
  • 22. User Workflow ● User uploads a questionnaire file (Excel) ○ This triggers the Event Processor Lambda ○ The file is transformed to JSON format and stored for later use ● There's a UI tool that enables the user to ○ select the question rows and ○ pick the property columns ● The selection is saved and stored for later use ● User can also upload pre-answered questions ○ to be used in the document store where the answers are searched for ● User can trigger inference ○ An event is sent to SQS, which fires a Lambda that triggers a SageMaker Batch Transform job 🧠
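The last step of the workflow above (SQS message in, Batch Transform job out) can be sketched as a function that builds the request for SageMaker's CreateTransformJob API. This is an illustration in Python, not the project's Lambda. The top-level keys follow the CreateTransformJob request shape; everything else is an assumption: the message field `questionnaire_id`, the S3 key layout, the JSON Lines content type and the instance type.

```python
import json
from typing import Any, Dict


def transform_job_request(message_body: str, model_name: str, bucket: str) -> Dict[str, Any]:
    """Build CreateTransformJob parameters from an SQS message body."""
    message = json.loads(message_body)
    job_id = message["questionnaire_id"]  # hypothetical field name
    return {
        "TransformJobName": f"inference-{job_id}",
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    # Assumed layout: one JSON Lines file of questions per job.
                    "S3Uri": f"s3://{bucket}/questions/{job_id}.jsonl",
                }
            },
            "ContentType": "application/jsonlines",
            "SplitType": "Line",
        },
        # SageMaker writes the answers under this prefix.
        "TransformOutput": {"S3OutputPath": f"s3://{bucket}/answers/{job_id}/"},
        # Assumed GPU instance type; pick whatever the model needs.
        "TransformResources": {"InstanceType": "ml.g4dn.xlarge", "InstanceCount": 1},
    }
```

With boto3 installed, the resulting dictionary could be passed on as `boto3.client("sagemaker").create_transform_job(**request)`; building it separately keeps the mapping testable without AWS credentials.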
  • 23. AWS SageMaker Batch Transform “Run inference when you don't need a persistent endpoint”
  • 24. Inference Workflow - On startup, the Batch Transform job fetches - Policy files from S3 - FAQ items from DynamoDB - Initializes the document store - Pre-processes policy files - Creates the embeddings - Starts a web server (Flask) - SageMaker reads the questions from S3 - SageMaker writes the answers to S3 - PutNotification is triggered on each new object - Event Processor listens to these events and writes the results to DynamoDB
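The web server step above exists because SageMaker talks to an inference container over HTTP: it checks health with GET /ping and sends data to POST /invocations on port 8080. The slides use Flask; the sketch below swaps in Python's standard-library `http.server` to stay dependency-free. The `answer` function is a hypothetical stand-in for the document-store lookup, and the `{"question": ...}` JSON Lines payload is an assumed format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def answer(question: str) -> str:
    """Hypothetical placeholder for the document-store lookup."""
    return f"no answer found for: {question}"


def handle_invocation(body: bytes) -> bytes:
    """Answer each JSON line of the form {"question": ...}; return JSON lines."""
    lines = [line for line in body.decode("utf-8").splitlines() if line.strip()]
    results = []
    for line in lines:
        question = json.loads(line)["question"]
        results.append(json.dumps({"question": question, "answer": answer(question)}))
    return ("\n".join(results) + "\n").encode("utf-8")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # SageMaker polls /ping to check that the container is healthy.
        self.send_response(200 if self.path == "/ping" else 404)
        self.end_headers()

    def do_POST(self) -> None:
        if self.path != "/invocations":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = handle_invocation(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/jsonlines")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Start the server; blocks until the container is stopped."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling `serve()` would be the container's last startup step, after the policy files and FAQ items have been fetched and the document store initialized.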
  • 27. Closing Thoughts ● The project continues; the next phase will reveal how this actually works 😬 ○ But there are more angles as well ● Pros ○ Interesting technology ○ Exploratory coding ○ Full stack: Infra, Backend, Frontend, ML, Design, UX, you name it! ● Cons ○ Creeping complexity, how to maintain… ○ See the last Pros bullet :D ● Learnings ○ Using tools that fit the job is good ○ ML & Serverless are not too difficult with Clojure