Just because you can doesn't mean that you should - thingmonk 2016


Big data! Fast data! Real-time analytics! These are buzzwords commonly associated with platform offerings around IoT.

Although the law of large numbers always applies, just because you can deploy more sensors doesn't automatically mean that you should. After all, sensors cost money and bandwidth, and can be a pain to maintain. Using the example of the Westminster Parking Trial, I'd like to show how analytics on preliminary survey data could have reduced the number of deployed sensors significantly.
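
The trade-off between sample size and cost can be made concrete with the standard error of the mean, which shrinks only with the square root of the number of samples. The following is a minimal Python sketch, not material from the talk; the function names and the spread of 15 units per reading are illustrative assumptions.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def required_samples(std_dev, target_se):
    """Smallest n for which std_dev / sqrt(n) <= target_se."""
    return math.ceil((std_dev / target_se) ** 2)


if __name__ == "__main__":
    # Assumed spread of 15 units per reading (hypothetical value).
    for target in (5.0, 2.5, 1.0):
        n = required_samples(15.0, target)
        print(f"standard error <= {target}: at least {n} samples")
```

Halving the target error quadruples the number of samples needed, so each extra bit of precision costs more sensors or readings than the last.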

A similar logic applies to fast and real-time analytics. Although these are advertised as killer features, many people new to IoT and analytics are not even aware that they might get away with batch processing. Using the example of flying a drone, I'd like to discuss the use cases for which I'd apply edge processing (on the drone), stream or micro-batch analytics (when data arrives at the platform), or work on batched data (stored in a database).
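
One way to read this is that the choice of tier follows from how quickly an answer is needed. The Python sketch below illustrates that idea only; the `Tier` names, the `choose_tier` function, the cut-offs of one second and one hour, and the latency assigned to each question are assumptions made for illustration, while the example questions are taken from the drone slide in the transcript.

```python
from enum import Enum


class Tier(Enum):
    EDGE = "edge processing (on the drone)"
    STREAM = "stream or micro-batch analytics (on arrival at the platform)"
    BATCH = "batch analytics (on data stored in a database)"


def choose_tier(max_latency_s):
    """Pick a processing tier from the tolerable answer latency in seconds.

    The thresholds are hypothetical and would depend on the use case.
    """
    if max_latency_s < 1.0:
        return Tier.EDGE
    if max_latency_s < 3600.0:
        return Tier.STREAM
    return Tier.BATCH


if __name__ == "__main__":
    # Assumed latency budgets for the example questions.
    questions = {
        "Am I falling?": 0.01,
        "Battery level: should I land?": 30.0,
        "How many times did I stall?": 6 * 3600.0,
        "What's the best weather for flying?": 7 * 24 * 3600.0,
    }
    for question, latency in questions.items():
        print(f"{question} -> {choose_tier(latency).value}")
```

Only the first question has to be answered on the device; the others tolerate the round trip to a platform or a database.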

Published in: Technology

  1. Just because you can doesn’t mean that you should. Dr. Boris Adryan, @BorisAdryan
  2. The logarithmic history of things. Boris the Academic: “Give me £50M and I’ll build you the best IoT ontology money can buy.”
  3. Talking about inflated expectations: “I wonder if anyone is making money with IoT”; “There may be money in IoT”; “I’m going to get rich with IoT”; “I’m making a decent salary with IoT”.
  4. The logarithmic history of things. Boris the Academic: “Give me £50M and I’ll build you the best IoT ontology money can buy.” Boris the Freelancer: “If you want to pay £5M for machine learning - make sure it isn’t rude or annoying.” Boris at Zühlke: “Don’t pay anyone £0.5M - I’ll show you how we can do it for half.”
  5. Peanuts: “a spoonful”. How many peanuts is that on average? (Figure: scale from 0 to 100; 3 samples; “on average” marker.)
  6. Do I get more peanuts at Thing Monk or at Monki Gras? (Figure: scale from 0 to 100; 3 samples; “on average” markers for thingmonk and monkigras.)
  7. Do I get more peanuts at Thing Monk or at Monki Gras? (Figure: scale from 0 to 100; 4 samples; “on average” markers for thingmonk and monkigras.)
  8. Do I get more peanuts at Thing Monk or at Monki Gras? Statistical power through large numbers of samples. (Figure: scale from 0 to 100; n samples; “on average” markers for thingmonk and monkigras; deviation.)
  9. Statisticians and data scientists LOVE larger sample sizes! …but if sampling costs time and resources, we need a compromise.
  10. Sampling strategy: the precision and accuracy that can be achieved theoretically; the precision and accuracy that is needed to get a job done. (Figure panels: “accurate and precise”; “not accurate, but precise”; “accurate, not precise”; “not what you want”.)
  11. Deployment patterns and analytics strategies to maximise profit. Dr. Boris Adryan, @BorisAdryan
  12. “Why aren’t you doing IoT?” 39% of survey participants are worried about the upfront investment for an industrial IoT solution.
  13. Sweetening IoT for your customer. A few recommendations from the trenches: • how to cut down on hardware costs • how to cut down on software costs. Insights from a project with OpenSensors.
  14. Westminster Parking Trial. (Diagram labels: IoT solution; service company.) Access to ~750 independent parking lots with a total of >3,500 individual spaces.
  15. Can we learn an optimal deployment and sampling pattern? Data: • sampling rate of 5-10 min • data over 2 weeks in May 2015 • overall 2.6 million data points. Can we make Ethos’ budget go further by • distributing a given number of sensors over a wider geographic area? • lowering the sampling rate for better battery life? Labour: expensive. Sensor: cheap.
  16. Correlation and clustering. (Figure: three example plots labelled “correlated”, “anti-correlated” and “independent”.) Hierarchical clustering on the basis of a feature matrix (example items: lorry, coach, car, bike, skateboard).
  17. Good news: the temporal occupancy pattern roughly predicts neighbours. (Figure labels: 750 parking lots; lots in Southampton; lots around the corner of each other.)
  18. A caveat: is a high degree of correlation a function of parking lot size? Finding two lots of 20 spaces that correlate; finding two lots of 3 spaces that correlate. (Figure: occupancy over the day, 0:00 to 23:59; labels “more likely” and “less likely”.)
  19. Bootstrapping in DBSCAN clusters. Simulation: swap the occupancy vectors between parking lots of similar size and test per grid cell whether lots still correlate.
  20. Verdict: in some grid cells, the occupancy level of one parking lot predicts the occupancy of most parking spaces. We suggested that about ONE THIRD of the sensors may be sufficient. (Figure labels: better for navigation; better predictive power.)
  21. Suggested technology for trials. A temporary survey would have allowed us to make the same recommendation, including the insight that the provided 5-minute resolution is probably not required.
  22. Monte Carlo simulations are great tools to assess the business value of IoT. “A tour of my assets every Friday.” ‘Cost function’: sum of all edges. “A demand-driven tour of my assets.” ‘Cost function’: sum of edges needed in 7 days. (Figure labels: base; assets; p1(need today) to p6(need today).)
  23. This talk is about unfounded IoT fears. Hardware is often perceived as an investment that customers understand, so they anticipate the cost. There’s an air of magic around data and analytics.
  24. “My data problem must be special!” Customers fear costs because they’re thinking about: ✓ unstructured data ✓ distributed ingestion and storage. Or they believe from hearsay that IoT automatically requires: ✓ real-time analytics ✓ sophisticated machine learning. (Slogan: “My company went to an IoT conference & all I got was this t-shirt and a bunch of buzzwords.”)
  25. “I need to do real-time analytics!” Time scales: microseconds to seconds; seconds to minutes; minutes to hours; hours to weeks. Where: on device; on stream; in batch (in process; in database). Example questions: am I falling? (counteract); battery level: should I land?; how many times did I stall?; what’s the best weather for flying? Insight: operational insight; performance insight; strategic insight. Methods: e.g. Kalman filter; e.g. with machine learning; e.g. rules engine; e.g. summary stats.
  26. Can IoT ever be real-time? Zone 1: real-time [µs]. Zone 2: real-time [ms]. Zone 3: real-time [s].
  27. Edge, fog and cloud computing. Edge. Pro: immediate compression from raw data to actionable information; cuts down traffic; fast response. Con: loses potentially valuable raw data; developing analytics on embedded systems requires specialists; compute costs valuable battery life. Fog. Pro: same as Edge; closer to ‘normal’ development work; gateways often mains-powered. Con: loses potentially valuable raw data. Cloud. Pro: compute power; scalability; familiarity for developers; integration centre across all data sources; cheapest ‘real-time’ option. Con: traffic.
  28. Options for real-time in cloud. Some features can cost a bit, especially when you don’t really know what you’re doing and want to ‘try it out’. A badly configured SMACK stack on your own commodity hardware can be slow and unreliable. (Figure label: your pre-trained classifier.)
  29. My current pet hate: Deep Learning. Deep learning has delivered impressive results mimicking human reasoning, strategic thinking and creativity. At the same time, big players have released libraries such that even ‘script kiddies’ can apply deep learning. It’s already leading to uncritical use of deep learning when other methods would be more appropriate.
  30. Summary. ‣ Preliminary surveys, data analysis and simulation can help to minimise the number of sensors and develop an optimal deployment strategy and sampling schedule. ‣ Faster analytics on bigger and better hardware are not automatically the most useful solution. ‣ A good understanding of the type of insight that is required by the business model is essential. Zühlke can advise on options around IoT and data analytics, and provide complete solutions where needed. Dr. Boris Adryan, @BorisAdryan
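
The comparison on slide 22 between a fixed weekly tour and a demand-driven tour could be explored with a small Monte Carlo simulation. The sketch below is an illustration under stated assumptions, not the model used in the talk: asset coordinates and daily need probabilities are hypothetical, each asset's need is drawn independently per day, and assets are visited in their listed order rather than along an optimised route.

```python
import math
import random


def tour_length(depot, stops):
    """Length of the closed tour depot -> stops (in listed order) -> depot."""
    if not stops:
        return 0.0
    path = [depot, *stops, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def demand_driven_week(depot, assets, p_need, rng):
    """Travel cost of one week in which only assets needing service are visited."""
    total = 0.0
    for _day in range(7):
        needed = [a for a, p in zip(assets, p_need) if rng.random() < p]
        total += tour_length(depot, needed)
    return total


def compare(depot, assets, p_need, n_weeks=10_000, seed=42):
    """Return (fixed weekly tour cost, mean demand-driven cost per week)."""
    rng = random.Random(seed)
    fixed = tour_length(depot, assets)  # one full tour every Friday
    simulated = sum(
        demand_driven_week(depot, assets, p_need, rng) for _ in range(n_weeks)
    )
    return fixed, simulated / n_weeks


if __name__ == "__main__":
    depot = (0.0, 0.0)
    assets = [(1.0, 0.0), (2.0, 1.0), (1.0, 2.0), (0.0, 1.0)]  # hypothetical
    p_need = [0.05, 0.02, 0.10, 0.01]  # hypothetical p_i(need today)
    fixed, demand = compare(depot, assets, p_need)
    print(f"fixed weekly tour: {fixed:.2f}")
    print(f"demand-driven, mean per week: {demand:.2f}")
```

Whether the demand-driven tour is cheaper depends on the need probabilities: when assets rarely need attention, daily targeted trips cost less than the full tour, and when they need attention most days, the fixed tour wins. Running such a simulation before deployment gives a rough estimate of what the sensor data would be worth.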