
Informix MQTT Streaming

Informix Spark Streaming is an extension of Informix that allows data to be streamed out of the database as soon as it is inserted, updated, or deleted.
The protocol currently used to stream the changes is MQTT v3.1.1 (older versions not supported!). This extension is able to stream data to any MQTT broker where it can be processed or passed on to subscribing clients for processing.



  1. Spark Analytics with Informix. Pradeep Natarajan, IBM (@pradeepnatara). IBM Analytics, © 2015 IBM Corporation.
  2. Agenda
     • Context: Informix / Spark high-level value propositions
     • IoT use cases
     • Challenges
     • Prototype and implementation
     • What's next?
  3. Informix to Spark: Context
  4. Informix for Internet of Things
     • Optimized database for environments such as:
       • Low or no database administration
       • Embedded: gateways, routers
       • Very high transaction rates and uptime characteristics
     • Widely deployed in the retail sector, where the low administration overhead makes it essential for in-store deployments
     • Informix supports key Internet-of-Things solutions:
       • Native support for time-based data: Timeseries
       • Small footprint
       • Low administration requirements
  5. Apache Spark
     • Speed
     • Ease of use, unified engine
     • Sophisticated analytics
  6. Apache Spark
     • Cluster computing framework
     • Fast and general engine for large-scale data processing
     • In-memory computing
  7. Apache Spark Streaming
     • Extends Spark for big data stream processing
     • Scaling, low latency, recovery
     • Integrates batch and interactive processing
     [Diagram: raw data stream → distributed stream processing system → processed data]
  8. Informix to Spark: Use cases
  9. Real-Time Operational Database Streaming Analytics with Spark
     • Applications that drive business have positioned relational databases at the center of operations.
     • To continue their success, businesses need to use streaming analytics to gain real-time insights into their operations and take actions to optimize outcomes.
     • Infrequent batch analytics on "stale" data is losing its competitive edge. Demand for real-time analytics is increasing among businesses that want to stay in the lead.
  10. Sense → Analyze → Act
     • As data ages, business value diminishes.
     • Sense → Analyze → Act in seconds or milliseconds, not days or weeks.
     [Diagram: Sense → Analyze → Act timelines, batch (days) vs. real-time (seconds)]
  11. Increasing demand for real-time analytics: continuously streaming data from IBM Informix to an analytics platform. Streaming analytics service sample scenarios:
     • Connected vehicles (driving behavior matching): detect anomalous driving behavior that causes higher fuel consumption
     • Energy & utilities (power consumption): how does power consumption correlate between houses A, B, C, and D?
     • Health care (heart attack prevention): detect abnormal patterns in ECG series
     • Finance (market manipulation detection): detect anomalies by price change rate in a time window (steady price change vs. vibration in a short period)
     • Cloud service operation (server health diagnosis): detect system resource peaks and valleys and correlate them with workload information
     • …
  12. Real-time analytics: Industry
     • Information technology: systems and network monitoring
     • IoT: sensor data analytics and processing
     • Financial transactions: authentication, fraud detection, validation
     • Inventory control: consumer trends and demands
     • Website analytics: ad targeting
     • Many others
  13. Real-time analytics: Applications
     • Data is analyzed as it arrives (data in motion)
     • Simple: monitoring, alerts/reports, statistics
     • Complex: predictive analytics (regressions, machine learning, etc.), k-means clustering (classification, anomaly detection)
     • Many applications store events as well and combine them with later batch processing
     • Immediate actions are possible
  14. Informix to Spark: Challenges
  15. Exploring data and discovering actionable business insights
     • The problem: users often do not know exactly which analytics they want to run
     • It is difficult to justify the cost and risk of a complex solution without specific business value
     • We need to reduce the cost and risk of adding a real-time data analytics pipeline to the application architecture
     • Let data scientists explore the data to find useful analytics without interfering with the existing business
  16. We're running an Informix database. How do we incorporate real-time analytics into our application architecture? [Diagram: application server and database]
  17. Outdated approach: requires additional complexity, with increased risk and cost. [Diagram: application server plus additional components]
  18. Informix to Spark: Prototype Implementation
  19. Real-Time Operational Database Streaming Analytics with Spark
     • Newly prototyped feature for the Informix database.
     • Enables Informix customers to stream data added to their database in real time via MQTT, which can then be consumed by an analytics platform such as Apache Spark.
  20. Informix MQTT Streamer: enables a real-time analytics pipeline that drastically reduces complexity, cost, and risk
  21. How is it implemented?
     • Uses the Informix Virtual-Index Interface (VII)
     • VII allows us to write UDRs that are triggered whenever certain SQL statements are executed
     • This is typically used to create indexes for custom data types. Instead, we use it to write data to a socket during INSERT/UPDATE statements
     [Diagram: VII UDR publishes to MQTT broker]
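To make "write data to a socket" concrete, here is a minimal Python sketch of the bytes an MQTT 3.1.1 PUBLISH packet (QoS 0) consists of on the wire. It is not the prototype's UDR code, which is not shown in the slides; the function names, the topic `informix/demo`, and the JSON-style payload are assumptions made for illustration only. A real client would also send a CONNECT packet first and wait for CONNACK before publishing.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode the MQTT 'Remaining Length' field (7 bits per byte, high bit = more bytes follow)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range for MQTT 3.1.1")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: another length byte follows
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_publish(topic: str, payload: bytes) -> bytes:
    """Build a QoS 0 PUBLISH packet: fixed header, length-prefixed topic, then payload."""
    topic_bytes = topic.encode("utf-8")
    if len(topic_bytes) > 0xFFFF:
        raise ValueError("topic too long")
    # Variable header: 2-byte big-endian topic length + topic (no packet id at QoS 0).
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    # Fixed header: 0x30 = packet type 3 (PUBLISH), DUP=0, QoS=0, RETAIN=0.
    fixed_header = b"\x30" + encode_remaining_length(len(variable_header) + len(payload))
    return fixed_header + variable_header + payload


if __name__ == "__main__":
    # Hypothetical topic and payload; the prototype's actual message format may differ.
    packet = encode_publish("informix/demo", b'{"col1": 1, "col2": "a"}')
    print(packet.hex())
```

Because the packet is only a few header bytes around the row data, a UDR can build it cheaply inside the INSERT/UPDATE path and hand it to any MQTT 3.1.1 broker.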
  22. Installation and basic usage
     • Open sourced!
     • Available on GitHub –
     • Run the install script
     • Add the streaming index to the columns whose values you want to stream: create index stream on table(col1, col2) USING streaming_index;
  23. The nitty-gritty
     • A set of custom UDRs installed into Informix converts data into MQTT messages and sends them to a specified address
     • Virtual Table Indexes detect data inserts/updates/deletes as they happen and trigger the messages to be sent
     • Once in an MQTT broker, almost anything can consume the data: MQTT clients are available for most programming languages (including Java for Apache Spark)
     • Spark can analyze the data, compare it to historical data, and use streaming k-means algorithms to determine changes in the data
  24. The nitty-gritty, continued
     • Once installed, the custom "streaming_index" index type is available for use.
     • Running the "create index" command with the "streaming_index" index type runs the code in the custom UDRs that push the data via MQTT.
     • From then on, whenever you run an INSERT statement against the columns the streaming index was created on, the inserted data is automatically published to an MQTT broker.
     • See the "IBM Informix Virtual-Index Interface Programmer's Guide" for more details.
  25. In-depth
     • Does the prototype work for temporary tables? There are no specific index-related restrictions on temporary tables.
     • Do we lock the tables? The VII lengthens the amount of time a lock is held. Future item: multiple concurrent writers to a per-table queue, flushed asynchronously by a separate thread.
     • Would this work for multiple nodes (sharding)? The current prototype delegates this to Spark, where multiple input streams can be merged into one.
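The "future item" above (concurrent writers feeding a per-table queue that a separate thread flushes) can be sketched in Python as follows. This only illustrates the pattern and is not part of the prototype; `AsyncPublisher`, its `publish` callback, and `batch_size` are hypothetical names. The point is that writers only pay for an in-memory enqueue, so the network send no longer happens while the lock is held.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the flusher thread to exit


class AsyncPublisher:
    """Per-table queue: many writers enqueue rows, one background thread flushes them."""

    def __init__(self, publish, batch_size=100):
        self._publish = publish          # callable taking a list of rows (e.g. an MQTT send)
        self._batch_size = batch_size
        self._queue = queue.Queue()      # thread-safe, so writers need no extra locking
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, row):
        """Called by writers; returns immediately without touching the network."""
        self._queue.put(row)

    def close(self):
        """Flush everything enqueued so far and stop the background thread."""
        self._queue.put(_STOP)
        self._worker.join()

    def _run(self):
        while True:
            item = self._queue.get()     # block until there is work
            if item is _STOP:
                return
            batch, stop = [item], False
            # Drain whatever else is already waiting, up to the batch size.
            while len(batch) < self._batch_size:
                try:
                    nxt = self._queue.get_nowait()
                except queue.Empty:
                    break
                if nxt is _STOP:
                    stop = True
                    break
                batch.append(nxt)
            self._publish(batch)
            if stop:
                return


if __name__ == "__main__":
    publisher = AsyncPublisher(lambda rows: print("flushing", len(rows), "rows"))
    for i in range(5):
        publisher.enqueue({"col1": i})
    publisher.close()
```

The trade-off is that a row may be committed slightly before its message is published, so a crash between the two could lose messages unless the queue is made durable.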
  26. In-depth
     • Installs in seconds
     • No need to upgrade the database
     • No need to restart the database server
     • Can be installed and activated on a live production database!
     • Minimal interference with the existing business application
  27. Informix to Spark: Demo
  28. Heart To Spark
     • Demonstrates real-time streaming of data from the Informix engine into a message broker for digestion by one or more services
     • Simulates IoT data from a heart rate monitor
     • Watches for trends in heart rates: poor health or stress can cause a measurable rise in baseline heart rate
     • Uses Spark analytics to determine baseline heart rates and plot the trend (heart rate rising, steady, or falling)
     • Graphing tools in the browser show us a view of the data
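As a rough illustration of the baseline and trend idea, the sketch below classifies a stream of heart rate readings as rising, steady, or falling by comparing the mean of the most recent window with the mean of the window before it. It is plain Python rather than Spark, and it is not the demo's actual algorithm; the function name, `window`, and `tolerance` (in beats per minute) are assumptions chosen for the example.

```python
from collections import deque
from statistics import mean


def heart_rate_trends(readings, window=5, tolerance=2.0):
    """Yield (baseline, trend) for each reading once two full windows have been seen.

    baseline: mean of the most recent `window` readings
    trend:    'rising', 'falling', or 'steady' relative to the previous window
    """
    recent = deque(maxlen=2 * window)  # keeps only the last two windows
    for bpm in readings:
        recent.append(bpm)
        if len(recent) < 2 * window:
            continue  # not enough history yet
        values = list(recent)
        previous = mean(values[:window])
        current = mean(values[window:])
        change = current - previous
        if change > tolerance:
            trend = "rising"
        elif change < -tolerance:
            trend = "falling"
        else:
            trend = "steady"
        yield current, trend


if __name__ == "__main__":
    # Simulated readings: a resting baseline followed by a sustained increase.
    simulated = [62, 61, 63, 62, 62, 64, 66, 69, 71, 72, 73, 73]
    for baseline, trend in heart_rate_trends(simulated):
        print(f"baseline={baseline:.1f} bpm, trend={trend}")
```

Comparing window means rather than individual readings keeps single noisy samples from flipping the trend, which matters for a slowly drifting signal such as baseline heart rate.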
  29. Demo: Installation
  30. Overview
     • IoT devices send data into the Informix server
     • Data streams from Informix into an MQTT broker
     • From MQTT, data is streamed into Spark for real-time analysis
     • Results from both Informix and Spark are available to the end user
  31. Not limited to Apache Spark
     • Can be used by any application or platform that can consume TCP socket data:
       • IBM InfoSphere Streams
       • Apache Storm
       • Custom applications (most programming languages have MQTT libraries)
       • Many, many others
  32. Informix to Spark: What's next?
  33. Endless possibilities
     • Check out Apache Spark for more information about analytics and machine learning
     • Learn more about machine learning and its potential
     • Contact IBM Informix
  34. Questions? Pradeep Natarajan, @pradeepnatara