Why Smart Meters Need
Informix TimeSeries




IBM Data Server Day, Stockholm 2012-05-22   Cosmo@uk.ibm.com
                                                               © 2012 IBM Corporation
Please Note:
 IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal
 without notice at IBM’s sole discretion.
 Information regarding potential future products is intended to outline our general product direction
 and it should not be relied on in making a purchasing decision.
 The information mentioned regarding potential future products is not a commitment, promise, or
 legal obligation to deliver any material, code or functionality. Information about potential future
 products may not be incorporated into any contract. The development, release, and timing of any
 future features or functionality described for our products remains at our sole discretion.




  Performance is based on measurements and projections using standard
  IBM benchmarks in a controlled environment. The actual throughput or
  performance that any user will experience will vary depending upon many
  factors, including considerations such as the amount of multiprogramming
  in the user's job stream, the I/O configuration, the storage configuration,
  and the workload processed. Therefore, no assurance can be given that an
  individual user will achieve results similar to those stated here.


Acknowledgements and Disclaimers:
Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all
countries in which IBM operates.


The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are
provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice
to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it
is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use
of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have
the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the
applicable license agreement governing the use of IBM software.

All customer examples described are presented as illustrations of how those customers have used IBM products and the results they
may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these
materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific
sales, revenue growth or other results.


© Copyright IBM Corporation 2012. All rights reserved.
    –     U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
          with IBM Corp.
IBM, the IBM logo, ibm.com and IBM Informix are trademarks or registered trademarks of International Business Machines Corporation
in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this
information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the
time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list
of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml

    Other company, product, or service names may be trademarks or service marks of others.




Why Smart Meters Need Informix TimeSeries
      What challenges are being faced in the Energy & Utilities Sector today?

      What is a Smart Meter and how can it help?

      How does Informix TimeSeries fit in?

      Case studies
        –1M Oncor PoC
        –35M Internal benchmark
        –100M AMT Sybex benchmark




Consumers need Smart Meters




     Samuel Palmisano, chief executive officer of International Business Machines Corp., said
     improving the U.S. electric-transmission grid depends on providing better information to
     consumers. Companies shouldn’t wait for government to set standards for data and
     technologies to create a "smart grid," which lets consumers monitor their energy use and take
     conservation steps that can save energy and money. (September 21, 2010, at the Gridwise
     Global Forum in Washington)




Energy Usage Issues

     Emission reduction goals:
       – EU 20% emissions reduction by 2020 as compared to 1990.
       – UK: 60% reduction by 2050.
     Long lead times for new, “clean” energy supply.
     Lasting legacy of energy inefficiency:
        – 80% of refrigerators bought in 2007 will be in use in 2020.
        – Less than 1/3 of industrial infrastructure will be replaced by 2020.
        – Over 20% of cars bought in 2007 will still be on the road in 2020.
     Household efficiency a priority:
       – 25-30% of carbon emissions are from regular households.
       – 80% of home energy usage is heating.
       – EC projects 27% savings through efficiency in buildings.




Why Smart Meters

     Access to near-real-time electricity usage information.


     Better control and management of electricity usage.


     Enable retail electric providers to develop and offer new, innovative plans
      that will lower consumer bills.

     Help make smarter decisions and change behaviours to help reduce
      consumption, or modify usage patterns.

    Smart meter often refers to an electrical meter, but it can also mean a device
    measuring natural gas or water consumption.




Who is Using Smart Meters

     Utility Companies:
       – In the U.S. – stimulus money used for smart meters.
       – Main drive is not reducing billing costs.
       – Better analysis of usage patterns.
       – Can different tariffs change energy consumption?
     Consumers:
       – Looking to reduce energy costs.
       – Wanting to improve their green credentials.


     Governments:
       – Need to show improvements in emissions.
       – Want to reduce energy consumption/reliance.




Smart Meters Solve Real Problems

     Real time information on Energy Usage.
     Gain control over personal energy usage
       – Modify electrical consumption:
          • California study – reductions 5.7% to 8.7%.
          • Norwegian study - reductions of 9%.
          • UK study reduction of 12%.
          • Oncor Texas, reductions of 5%-10%.


     Power companies:
       – Develop new innovative rate plans.
       – Avoid building new plants.
       – Avoid buying power from other sources.
       – Meet Green standards.
       – Reliable power restored quicker after outages.




Data issues with Smart Meters

  Data Issues - Terabytes of new Data:
    – Ability to bring on new meters.
    – Store data for new regulatory reasons.
    – Analyse usage.
    – Automatically Read Meters.


  New Data, New Applications:
    – Billing
    – Portal
    – Compliance
    – New Analytics
    – Combine Meter and Weather data




Informix Timeseries Overview




What is Time Series Data?

      Time series data is:
        – A set of data where each item is time-stamped
           • Think of an array where each element can be indexed by time or
             by a timestamp

         “Give me the Jan 1st element from time series X”

      Most useful when a range of data is normally read
         “Give me the Jan 1st thru Jan 10th elements from time series X”
      Access to one time series is usually completed before moving to the
       next time series.
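
A minimal Python sketch of the "array indexed by timestamp" idea, for illustration only. The `ToySeries` class and its `put`, `get`, and `clip` methods are hypothetical names and are not part of Informix; they only mirror the two requests quoted above (a single element, and a range of elements).

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


class ToySeries:
    """Time-stamped values kept sorted by timestamp (illustrative only)."""

    def __init__(self):
        self.stamps = []  # sorted timestamps
        self.values = []  # values[i] belongs to stamps[i]

    def put(self, stamp, value):
        i = bisect_left(self.stamps, stamp)
        if i < len(self.stamps) and self.stamps[i] == stamp:
            self.values[i] = value  # overwrite an existing element
        else:
            self.stamps.insert(i, stamp)
            self.values.insert(i, value)

    def get(self, stamp):
        """One element by timestamp, or None if nothing is stored there."""
        i = bisect_left(self.stamps, stamp)
        if i < len(self.stamps) and self.stamps[i] == stamp:
            return self.values[i]
        return None

    def clip(self, begin, end):
        """All (timestamp, value) pairs with begin <= timestamp <= end."""
        lo = bisect_left(self.stamps, begin)
        hi = bisect_right(self.stamps, end)
        return list(zip(self.stamps[lo:hi], self.values[lo:hi]))


# Time series "X": one value per day for the first 15 days of January.
x = ToySeries()
for day in range(1, 16):
    x.put(datetime(2010, 1, day), float(day))

jan_1st = x.get(datetime(2010, 1, 1))
jan_1st_to_10th = x.clip(datetime(2010, 1, 1), datetime(2010, 1, 10))
```

Because the elements are held in time order, a range read is one contiguous slice, which is the access pattern the slide calls most useful.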




How are Time Series Used?

      They access the data by time range
        – Look at a range of data in the past
        – Make predictions about a range in the future

      Their analysis is often very proprietary

      Many keep large volumes of data online

      Many take in huge volumes of data each second

      All these markets use relational data as well

      All need to combine their relational data with time series data




Key Strengths of Informix Timeseries

      Performance
        – Extremely fast data access
            • Data layout optimized on disk
        – Handles operations hard or impossible to do in standard SQL

      Space Savings
        – Can be over 70% space savings over standard relational layout

      Toolkit approach allows users to develop their own algorithms
        – Algorithms run in the database to leverage buffer pool

      Conceptually closer to how users think of time series




Relational Time Series Representation
     Meter_ID TimeStamp         phase1   phase2   ...   temp
     1       2010-06-01 00:00   1.3      0              15.6
     1       2010-06-01 00:30   1.6      0              15.6
     1       2010-06-01 01:00   1.4      0              15.5
     1       2010-06-01 01:30   1.4      0              15.4
     1       2010-06-01 02:00   1.4      0              15.5




     (Growth: downward, one new row per reading)
     ...
     2       2010-06-01 00:00   0.4      0              12.3
     2       2010-06-01 00:30   0.3      0              12.3
     2       2010-06-01 01:00   0.2      0              12.2
     2       2010-06-01 01:30   0.5      0              12.3
     ...
     3       2010-06-01 00:00   0.0      3.5            13.6
     3       2010-06-01 00:30   0.0      4.3            12.2




Same Table Stored as a Time Series

 Meter_ID       Origin         00:00           00:30             01:00      01:30            ...
 1              2010-06-01     (1.3,0...15.6) (1.6,0...15.6) (1.4,0...15.5) (1.4,0...15.4)
 2              2010-06-01     (0.4,0...12.3) (0.3,0...12.3) (0.2,0...12.2) (0.5,0...12.3)
 3              2010-06-01     (0,3.5... 13.6) (0,4.3... 12.2)



     There are only as many rows as meters

     Each row is very long and grows as data is inserted

     Very fast access to a timeslot once the Meter_ID is selected

     Very fast to read time-ordered set of values




Informix TimeSeries

      A “timeseries” datatype is available in Informix
        – First introduced by Illustra in 1996

      Additional “objects” associated with timeseries:
        – “Calendar” datatype
           • For defining when data can be collected
        – Row types
           • For defining what should be collected
        – Containers
           • For defining where the data should be stored
        – Several Support tables:
           • Calendar, tsinstancestable, tscontainertable



Key Concepts: Regular Time Series

      Data collected uniformly over time intervals is a “regular” time series
        – For example: daily, hourly, etc...

      A regular time series always has exactly one record per interval

      If an interval is missing data then:
          – Missing data on an existing page takes up (a little) space
          – If all the intervals for a page are missing data then the page takes no space

      Values in one interval typically do not carry into the next

      Can be thought of as an array of data
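
A minimal Python sketch of a regular series as an array, assuming a fixed interval with no "off" periods. `RegularSeries` is a hypothetical name and says nothing about Informix's on-disk format; it only shows that the slot position can be computed from the origin and the interval, so no timestamp has to be stored with each value.

```python
from datetime import datetime, timedelta


class RegularSeries:
    """Regular series: slot i holds the value for origin + i * interval."""

    def __init__(self, origin, interval):
        self.origin = origin
        self.interval = interval
        self.slots = []  # no timestamps stored; None marks a missing interval

    def _index(self, stamp):
        offset = stamp - self.origin
        if offset < timedelta(0) or offset % self.interval:
            raise ValueError("timestamp is not on an interval boundary")
        return offset // self.interval

    def put(self, stamp, value):
        i = self._index(stamp)
        if i >= len(self.slots):
            # Pad any skipped intervals so there is exactly one slot each.
            self.slots.extend([None] * (i + 1 - len(self.slots)))
        self.slots[i] = value

    def get(self, stamp):
        i = self._index(stamp)
        return self.slots[i] if i < len(self.slots) else None


series = RegularSeries(datetime(2010, 6, 1), timedelta(minutes=30))
series.put(datetime(2010, 6, 1, 0, 0), 1.3)
series.put(datetime(2010, 6, 1, 1, 0), 1.4)  # 00:30 was never reported
```

In this toy version a missing interval still occupies a list slot; how cheaply the real product stores such gaps is described on the slides, not modelled here.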




Key Concepts: Irregular Time Series

      Irregular time series also use intervals of time, however:
          – Unlike regular time series, irregular time series can store more than one
            record into a given time interval
              • For instance, multiple alerts can occur in the same second
          – Missing data never takes any room on disk

      Values in an irregular time series can be treated in two ways:
        – Values may persist until next value arrives (stair step)
            • Total usage counter
        – Values are only valid at their given time point and do not “persist” (discrete)
            • Power outage alert

      Can also be thought of as an array of data
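
The two treatments of irregular values can be sketched in Python as follows. This is an illustration under stated assumptions: `value_at` and its `stair_step` flag are hypothetical names, not Informix functions, and the series is modelled as two parallel lists sorted by timestamp.

```python
from bisect import bisect_right
from datetime import datetime


def value_at(stamps, values, when, stair_step):
    """Look up an irregular series at `when`.

    stamps must be sorted ascending. With stair_step=True the latest value
    at or before `when` persists; with stair_step=False only a value
    recorded exactly at `when` is returned.
    """
    i = bisect_right(stamps, when)
    if i == 0:
        return None  # nothing recorded at or before `when`
    if stair_step or stamps[i - 1] == when:
        return values[i - 1]
    return None


stamps = [
    datetime(2010, 6, 1, 1, 3),
    datetime(2010, 6, 1, 1, 45),
    datetime(2010, 6, 1, 2, 6),
]
values = [1.6, 1.8, 1.9]

# A total usage counter persists between readings (stair step).
counter = value_at(stamps, values, datetime(2010, 6, 1, 1, 30), stair_step=True)
# An alert exists only at its own time point (discrete).
alert = value_at(stamps, values, datetime(2010, 6, 1, 1, 30), stair_step=False)
```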




Key Concepts: Calendar Datatype

      Every Timeseries has an associated calendar

      A calendar is made up of several parts:
        – A name
        – A pattern of intervals
        – A start date
        For instance, to create a calendar called “weekday” that starts on
        Jan 1 2010 and defines regular work days you would issue this query:

           INSERT INTO Calendartable (c_name, c_calendar) VALUES
           ('weekday', 'start(2010-01-01 00:00:00), pattern({5 on, 2 off}, day)');

      The system catalog called “calendartable” holds all the calendars that have
       been defined
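
To make the "pattern of intervals" concrete, here is a small Python sketch that expands an on/off pattern into the timestamps at which data may be collected. It models only the simplified pattern notation shown on the slide; `on_intervals` is a hypothetical helper, not the Informix calendar implementation.

```python
from datetime import datetime, timedelta
from itertools import cycle


def on_intervals(start, pattern, unit, count):
    """Return the first `count` timestamps that fall on an 'on' interval.

    pattern is a list of (length, is_on) pairs, so {5 on, 2 off} becomes
    [(5, True), (2, False)]; unit is the length of one interval.
    """
    flags = [is_on for length, is_on in pattern for _ in range(length)]
    if not any(flags):
        raise ValueError("pattern has no 'on' intervals")
    result = []
    stamp = start
    for is_on in cycle(flags):
        if len(result) == count:
            break
        if is_on:
            result.append(stamp)
        stamp += unit
    return result


# {5 on, 2 off}, day: five collection days, then two days skipped.
weekdays = on_intervals(datetime(2010, 1, 1), [(5, True), (2, False)],
                        timedelta(days=1), 6)
# {1 on, 29 off}, minute: one reading every 30 minutes.
half_hours = on_intervals(datetime(2010, 1, 1), [(1, True), (29, False)],
                          timedelta(minutes=1), 3)
```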



Key Concepts: Row Types

      A Timeseries is made up of a series of timestamped rows


      The granularity of the timestamp is 10 microseconds (.00001 seconds)

      The SQL syntax that defines a row type is:

            CREATE ROW TYPE reading (tstamp DATETIME, phase1 DECIMAL,…)
        – NOTE: Timeseries requires the type of the first column (the type of tstamp) to be
          “datetime year to fraction(5)”

      Data in the row can be missing (NULL)
        – Missing data takes no space in a time series

      Rows can be marked as hidden
        – Useful for holidays and other times where data is not available




Key Concepts: Containers

      A “container” is the name given to the data structure that holds data for one or more time
       series.
      It guarantees that time series data is stored clustered and sorted on disk

      A container is explicitly created using this SQL syntax:

         EXECUTE PROCEDURE TsContainerCreate('cont_name', 'dbspace_name',
          'rowtype_name', first_extent, next_extent);

        – rowtype_name is the name of an existing row type
        – dbspace_name is the name of an existing dbspace (predefined area of disk)
        – first_extent is the size of the first extent of storage
        – next_extent is the size of the subsequent extents of storage
      TimeSeries 5.00 has an automatic container allocation mechanism
        – With no container definition the dbspace of the table is used
        – Otherwise user defined pools can be used
        – Policy can be Round Robin or user defined




Putting it all Together
      Create a calendar for 30 minute intervals:

        INSERT INTO Calendartable (c_name, c_calendar) VALUES
              ('interval', 'start(2010-01-01 00:00:00), pattern({1 on, 29 off}, minute)');

      Create a row type:
        CREATE ROW TYPE reading (tstamp datetime year to fraction(5),
            phase1 DECIMAL, phase2 DECIMAL, phase3 DECIMAL, temp DECIMAL);

      Create a container:

        EXECUTE PROCEDURE TsContainerCreate ('int_cont1', 'tsdbs', 'reading', 1024, 1024);

      Create a table and insert a “blank” row for 1 meter:
        CREATE TABLE meters (Meter_ID char(64), Actual timeseries(reading));
        INSERT INTO meters VALUES ('9908898',
              'origin (2010-01-01 00:00:00), calendar(interval), container(int_cont1), regular');




Relational Storage – Traditional Index Method
 Data pages have mixed Meter_IDs
 Multiple page access required                    Meter_ID     Start    End
 Key stored in both index and data                root         2010




     Meter_ID   Start   End        Meter_ID   Start   End                         Meter_ID   Start     End
     MX001      00:00 23:30        MX002      00:00 23:30                         MX1209980 00:00 23:30

 Index Page                                                   Data Page
Meter_ID        TStamp             Pointer                   Meter_ID         TStamp             usage
MX001           2010-06-01 00:00                             MX001            2010-06-01 00:00   1.6
MX001           2010-06-01 00:30                             MX001            2010-06-01 00:30   1.8
MX001           2010-06-01 01:00                             MX002            2010-06-01 12:30   3.6
MX001           2010-06-01 01:30                             MX003            2010-06-01 06:00   8.2
MX001           2010-06-01 02:00                             MX001            2010-06-01 01:00   4.7




Relational Storage – “High Performance” Index Method
 Only index page access required
 But all data is stored in both index              Meter_ID     Start    End
 and data pages                                    root         2010




     Meter_ID   Start   End         Meter_ID   Start   End                         Meter_ID   Start     End
     MX001      00:00 23:30         MX002      00:00 23:30                         MX1209980 00:00 23:30

 Index Page                                                    Data Page
Meter_ID  TStamp            Usage  Pointer                    Meter_ID  TStamp            Usage
MX001     2010-06-01 00:00  1.6                               MX001     2010-06-01 00:00  1.6
MX001     2010-06-01 00:30  1.8                               MX001     2010-06-01 00:30  1.8
MX001     2010-06-01 01:00  4.7                               MX002     2010-06-01 12:30  3.6
MX001     2010-06-01 01:30  2.5                               MX003     2010-06-01 06:00  8.2
MX001     2010-06-01 02:00  2.1                               MX001     2010-06-01 01:00  4.7




An Informix Table Containing a Timeseries Column
 The timeseries in the table is a physical reference to a mini-btree in a container



 Meter_ID Timeseries(reading)                                        Container “A”
 MX001      [container_A, 1]
 MX002      [container_B, 2]
 MX003      [container_A, 3]
 MX004      [container_C, 4]                                         Container “B”
 MX234      [container_C, 5]
 MX239      [container_B, 6]
 MX675      [container_C, 7]
                                                                     Container “C”
 MX521      [container_C, 8]




Timeseries Container Layout
 The btree index key is the time series id plus either:
 • An integer for regular time series
 • A timestamp for irregular time series

 Each low-level page holds sorted data for exactly one time series

 [Diagram: index twig pages, with example keys 4, 5, 7, 8, 12, 16]


Irregular Timeseries Storage Compared to Relational
 Data values only stored once
 No data pointers or pages
 Multiple, smaller btrees                    TS_ID Start             End
                                             root      2010-01-01


     TS_ID   Start End         TS_ID       Start End                       TS_ID   Start End
     1       00:00 23:30       2           00:00 23:30                     1000    00:00 23:30

 Timeseries Page           (irregular)                      Data Page
Meter_ID     TStamp                  Usage Pointer           Meter_ID TStamp
MX001        2010-06-01 01:03        1.6                     MX001      2010-06-01 01:03 1.6
MX001        2010-06-01 01:45        1.8                     MX001      2010-06-01 01:45 1.8
MX001        2010-06-01 02:06        1.9                     MX002      2010-06-01 02:06 3.6
MX001        2010-06-01 02:08        2.1                     MX003      2010-06-01 02:08 8.2
MX001        2010-06-01 02:25        1.8                     MX001      2010-06-01 02:25 1.9



Regular Timeseries Storage Compared to Relational
 Data is only stored once
 No timestamps or data pages
 Multiple, smaller btrees                 TS_ID Start             End
                                          root      2010-01-01


     TS_ID   Start End         TS_ID    Start End                       TS_ID   Start End
     1       00:00 23:30       2        00:00 23:30                     1000    00:00 23:30

 Timeseries Page           (regular)                     Data Page
Meter_ID     TStamp                Usage Pointer          Meter_ID TStamp
MX001        2010-06-01 00:00 1.6                         MX001      2010-06-01 00:00 1.6
MX001        2010-06-01 00:30 1.8                         MX001      2010-06-01 00:30 1.8
MX001        2010-06-01 01:00 1.9                         MX002      2010-06-01 12:30 3.6
MX001        2010-06-01 01:30 2.1                         MX003      2010-06-01 06:00 8.2
MX001        2010-06-01 02:00 1.8                         MX001      2010-06-01 01:00 1.9



Informix Timeseries Space Saving

  There is a small overhead for the b-tree pages
    – Meter_ID and Timestamp stored
    – Also pointer to Timeseries page
  Irregular Timeseries must store Timestamp for each element
      – 8 Bytes Extra overhead per element

  Regular Timeseries uses known offsets
    – No Timestamp stored
    – Even more efficient
  NULL data is compressed
    – NULL elements (missing regular elements) take zero space
    – Sparse arrays are not stored at all if no elements in time range
    – Unlike relational storage NULL values take NO SPACE
    – A row type of (DECIMAL(12), INTEGER, INTEGER) is 7 + 4 + 4 = 15 bytes
    – Storing (NULL, 1, NULL) would only require 4 bytes




Worked Example – Relational Method

     Number of meters:   3,000,000
     Interval:           15 minutes (96 readings per day)
     Meter ID length:    8 bytes
     Timestamp length:   12 bytes
     Data length:        8 + 6 bytes + 2 bytes slot overhead

     Data space:         3000000 * 96 * ( 8 + 12 + 8 + 6 + 2 )       = 10GB
     Index space:        3000000 * 96 * ( 8 + 12 + 8 + 2 )
                                            + 10% b+tree overhead    = 9GB
     Total storage:                                                  = 19GB



                            19GB per day




Worked Example – Informix Timeseries

 Number of meters:      3,000,000
 Interval:              15 minutes (96 readings per day)
 Meter ID length:       64 bytes
 Timestamp length:      12 bytes
 Timeseries metadata:   86 bytes
 Data length:           8 + 6 bytes + 2 bytes slot overhead

 Fixed data space:      3000000 * ( 64 + 86 )                 = 429MB
 Timeseries overhead:   3000000 * ( 12 + 4 + 2 ) + 10%        = 66MB
 Variable data space:   3000000 * 96 * ( 8 + 6 + 2 )          = 4.4GB




                        That is a huge saving of 76%
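
The arithmetic of the two worked examples can be re-run with a short Python sketch. All byte counts are taken from the slides; the variable names are illustrative, and reading the third 8 in the index formula as a per-entry pointer is an assumption. The results are exact byte totals, so they differ slightly from the rounded figures shown above.

```python
METERS = 3_000_000
READINGS_PER_DAY = 96
ROWS_PER_DAY = METERS * READINGS_PER_DAY  # one relational row per reading

# Relational method: every row repeats the key, and the index repeats it again.
rel_data = ROWS_PER_DAY * (8 + 12 + 8 + 6 + 2)
rel_index = ROWS_PER_DAY * (8 + 12 + 8 + 2) * 110 // 100  # +10% b+tree overhead
rel_total = rel_data + rel_index

# Informix TimeSeries: fixed cost per meter, plus only the data per reading.
ts_fixed = METERS * (64 + 86)
ts_overhead = METERS * (12 + 4 + 2) * 110 // 100
ts_variable = ROWS_PER_DAY * (8 + 6 + 2)
ts_total = ts_fixed + ts_overhead + ts_variable

# Saving on the daily growth alone (per-meter costs are paid once).
saving_daily = 1 - ts_variable / rel_total
# Saving on the first day, including the per-meter costs.
saving_first_day = 1 - ts_total / rel_total

print(f"relational: {rel_total / 1e9:.2f} GB per day")
print(f"timeseries: {ts_variable / 1e9:.2f} GB per day "
      f"plus {(ts_fixed + ts_overhead) / 1e9:.2f} GB once")
print(f"saving: {saving_daily:.1%} daily, {saving_first_day:.1%} on day one")
```

Under these assumptions the daily growth shrinks by roughly three quarters, in line with the figure quoted on the slide.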




Timeseries Simplicity – Example
        • Much simpler SQL – Apply a tariff

 Relational:

     SELECT meter_id, sum (value * 1.76)
        FROM meters WHERE (tstamp BETWEEN '2010-06-02 00:00' AND '2010-06-02 06:59')
         OR (tstamp BETWEEN '2010-06-02 21:00' AND '2010-06-02 23:59')
       GROUP BY 1;

 Timeseries:
        SELECT meter_id, apply_tariff (readings, tariff,
                   '2010-06-02 00:00', '2010-06-02 23:59')::Timeseries(applied_cost)
          FROM meters;


      But what if there is a missing value in the interval data?
      What if you want to reference data outside the query range?
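
One way to think about the missing-value question is sketched below in Python. It is not the `apply_tariff` routine from the slide, whose implementation is not shown; `tariff_cost` and `in_tariff_window` are hypothetical helpers that apply the 1.76 rate to the same two time windows as the relational query and count, rather than silently skip, any missing readings.

```python
from datetime import datetime, time

RATE = 1.76  # rate used in the relational query above


def in_tariff_window(stamp):
    """True for 00:00-06:59 and 21:00-23:59, the windows in the query."""
    t = stamp.time()
    return t < time(7, 0) or t >= time(21, 0)


def tariff_cost(readings, rate=RATE):
    """Return (cost, missing) for an iterable of (timestamp, value) pairs.

    A value of None marks an interval with no reading; it adds nothing to
    the cost but is counted so the caller can decide how to handle the gap.
    """
    cost = 0.0
    missing = 0
    for stamp, value in readings:
        if not in_tariff_window(stamp):
            continue
        if value is None:
            missing += 1
        else:
            cost += value * rate
    return cost, missing


readings = [
    (datetime(2010, 6, 2, 2, 0), 1.0),
    (datetime(2010, 6, 2, 12, 0), 5.0),   # outside the tariff windows
    (datetime(2010, 6, 2, 22, 0), None),  # missing reading
    (datetime(2010, 6, 2, 23, 30), 2.0),
]
cost, missing = tariff_cost(readings)
```

A plain SQL `sum` ignores NULLs without reporting them, so the relational query above gives no signal that a bill is based on incomplete data.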


Building Applications with the TimeSeries Datablade
      Standard client access to server
        – ESQL/C
        – ODBC, JDBC, .NET
        – Perl DBD::Informix, PHP, Ruby
      Several Timeseries specific interfaces are available:
        – SQL
        – VTI
        – SPL
        – Java (client & server)
        – C-API (client & server)
      It’s a toolkit approach!
          – Allow people to build their analytics in the server




Informix Timeseries SQL Interface
  Timeseries data is usually accessed through user defined routines (UDRs)
   from SQL
  Over 80 predefined functions come with Informix Timeseries:

     – Clip() - clip a range of a time series and return it
     – GetLastElem(), GetFirstElem() - return the last (first) element in the time series
     – Apply() – run a query across a time series
        • Apply filters, project only subset of columns, apply functions to elements,
          etc…
     – AggregateBy() – Roll up or down values
        • Change the frequency of a Timeseries from hourly to daily for instance
     – SetContainerName() - move a Timeseries from one container to another
     – BulkLoad() - load data into a Timeseries from a file
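
The roll-up idea behind AggregateBy() can be illustrated with a few lines of Python. This is a sketch of the concept only: `aggregate_by_day` is a hypothetical helper, it always rolls up to calendar days, and it does not reproduce the real routine's calendar handling or options.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_by_day(readings, func):
    """Roll (timestamp, value) pairs up to one value per calendar day.

    func is any aggregate over a list of values, such as sum or max.
    Missing values (None) are left out of the aggregate.
    """
    buckets = defaultdict(list)
    for stamp, value in readings:
        if value is not None:
            buckets[stamp.date()].append(value)
    return {day: func(vals) for day, vals in sorted(buckets.items())}


readings = [
    (datetime(2010, 6, 1, 0, 0), 2),
    (datetime(2010, 6, 1, 0, 30), 3),
    (datetime(2010, 6, 1, 1, 0), None),
    (datetime(2010, 6, 2, 0, 0), 5),
    (datetime(2010, 6, 2, 0, 30), 4),
]
daily_total = aggregate_by_day(readings, sum)
daily_peak = aggregate_by_day(readings, max)
```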




TimeSeries SQL Examples

      Get all meter data for meter 3 for the last day

        SELECT Clip(reading, CURRENT - 1 units day, CURRENT)
         FROM meters WHERE Meter_ID = '3';

      Get the last meter record for meter 3

        SELECT GetLastElem (reading)
         FROM meters WHERE Meter_ID = '3';

      Find the maximum usage by week for meter 3 over the last 30 days

        SELECT AggregateBy ('max($usage)', 'weeklycal', reading,
         CURRENT - 30 units day, CURRENT)
         FROM meters WHERE Meter_ID = '3';




Informix Timeseries VTI Interface
      Makes time series data look like standard relational data
        – useful for programs that can’t read our proprietary Timeseries format
        – There is a small penalty for using VTI

      Restrictions
        – No secondary indices are allowed
        – No triggers allowed

      SQL to create a VTI table:
        – If you have a table called “meters” with a time series column the
          following query will create an equivalent VTI table:

             EXECUTE PROCEDURE tscreatevirtualtab('readings', 'meters');




VTI Interface: Continued
 Meters – The Timeseries data
 Meter_ID     Origin           00:00     01:00         02:00        03:00         ...
 MX001        2010-06-01       1.3       1.6           1.4          1.5
 MX002        2010-06-01       0.4       0.3           0.2          0.5
 MX003        2010-06-01       3.5       4.3

 Readings – A virtual view of the Timeseries data
 Meter_ID   TStamp              usage
 MX001      2010-06-01 00:00    1.3
 MX001      2010-06-01 01:00    1.6
 MX001      2010-06-01 02:00    1.4
 MX001      2010-06-01 03:00    1.5
 ...
 MX002      2010-06-01 00:00    0.4
 MX002      2010-06-01 01:00    0.3

 The VTI view is equivalent to the tall thin relational table
 and can be easily accessed by any SQL client
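
The relationship between the two layouts can be sketched in Python: a generator that walks each meter's series and yields one (Meter_ID, TStamp, usage) row per stored element. `virtual_rows` is a hypothetical name and this is only a model of what the virtual table presents, not of how VTI is implemented. The sample data is copied from the Meters table above.

```python
from datetime import datetime, timedelta


def virtual_rows(meters, interval):
    """Yield (meter_id, timestamp, usage) rows from per-meter series.

    meters maps meter_id to (origin, values), where values[i] is the
    reading for origin + i * interval and None marks a missing element.
    """
    for meter_id, (origin, values) in meters.items():
        for i, usage in enumerate(values):
            if usage is not None:
                yield meter_id, origin + i * interval, usage


origin = datetime(2010, 6, 1)
meters = {
    "MX001": (origin, [1.3, 1.6, 1.4, 1.5]),
    "MX002": (origin, [0.4, 0.3, 0.2, 0.5]),
    "MX003": (origin, [3.5, 4.3]),
}
rows = list(virtual_rows(meters, timedelta(hours=1)))
```

Nothing is copied into a second table in this model: the rows are produced on demand from the series, which matches the "virtual view" wording on the slide.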



Informix Timeseries 5.00 VTI Interface
      TimeSeries 5.00 VTI Enhancements
        – Update regular VTI using primary key only
        – Use of TimeSeries expressions (read only)

      SQL to create a VTI table with an expression:

       EXECUTE PROCEDURE TSCreateExpressionVirtualTab(
          'day_agg_readings',
          'devices',
          'AggregateBy("sum($kwh),avg($phase_a),avg($phase_b),avg($phase_c)",
                       "cal1day", readings, 0)',
          'reading',
          1024,
          'readings');




Comparison of VTI vs Native Time Series Queries
  Select a range of data for a meter:
     – Native:
              SELECT Clip (reading, '2010-01-01', '2010-01-10')
               FROM Meters WHERE Meter_ID = '2';
     – VTI:
              SELECT * FROM readings WHERE tstamp
               BETWEEN '2010-01-01' AND '2010-01-10' AND Meter_ID = '2';

  Find the max usage for a given meter in a given period of time
     – Native:
              SELECT Apply ('max($usage)', '2010-01-01', '2010-01-10', reading)
               FROM Meters WHERE Meter_ID = '2';
     – VTI:
              SELECT max(usage) FROM readings
               WHERE tstamp BETWEEN '2010-01-01' AND '2010-01-10' AND Meter_ID = '2';


 Note:
    – Native will normally be faster than VTI, probably in the 5 to 10% range
    – It is often much faster to write custom user defined functions
    – VTI functions are very convenient for standard SQL clients



TimeSeries C-API Interface


      Client and server versions of the API

      Treats a time series like a table (sort of)
        – Functions to open and close a time series
        – Functions to scan a time series between 2 timestamps
        – Functions to create a time series
        – Functions to retrieve, insert, delete, update

      Plus another 70 functions defined




Timeseries Data Loading

  Timeseries is a specialist type and benefits from a specialist data loading mechanism

  Traditionally the Real Time Loader has been used for high speed Timeseries data
   insert
     – Developed for stock market trade data
     – Good for irregular Timeseries
     – Small symbol universe – 10s of thousands of stocks
     – Data arriving in timestamp order
     – Small number of active stocks
     – Needs to cope with very high peak loads at exchange open & close


  Smart Meter Data is a new challenge
     – Timeseries is regular
     – Many millions of meters
     – Data batched by Meter Identifier
     – All meters equally active




Smart Meter Data Loader

  Uses an internal mechanism similar to the Real Time Loader (RTL) to access containers directly


  Builds internal map of Meter ID and Timeseries ID

  Can use fragmentation of base table for better parallelism

  Parallel sessions can work on separate disks to reduce contention


  Load rates can be in excess of 50,000 intervals per second per core




Smart Meter Data Loader – Architecture
  Data flow: Meter Data -> Loaders -> Hash table -> Containers -> Physical Disks

  Hash table (random distribution of meter IDs):
      Meter_ID   TS ID
      7898765    1
      2168768    2
      9879821    3
      1656578    4
      8787987    5
      4678768    6
      7354658    7
      2537591    8
      8973547    9
      1352857    10
      3451759    11
      7656472    12
      6543897    13
      3324516    14
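
The meter-to-TimeSeries lookup in this architecture can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the round-robin container assignment and the sample readings are hypothetical, and the real loader's internals are not described in the slides.

```python
from collections import defaultdict


def build_meter_map(meter_ids):
    """Assign sequential TimeSeries IDs to meter IDs in arrival order."""
    return {meter_id: ts_id for ts_id, meter_id in enumerate(meter_ids, start=1)}


def assign_container(ts_id, n_containers):
    """Spread TimeSeries IDs round-robin across containers (an assumption)."""
    return (ts_id - 1) % n_containers


def route_batch(readings, meter_map, n_containers):
    """Group (meter_id, value) readings by their target container."""
    per_container = defaultdict(list)
    for meter_id, value in readings:
        ts_id = meter_map[meter_id]  # hash lookup: Meter_ID -> TS ID
        per_container[assign_container(ts_id, n_containers)].append((ts_id, value))
    return dict(per_container)


# First six meter IDs from the slide; the readings are made up.
meter_ids = [7898765, 2168768, 9879821, 1656578, 8787987, 4678768]
meter_map = build_meter_map(meter_ids)
batch = [(2168768, 0.42), (7898765, 1.10), (4678768, 0.73)]
routed = route_batch(batch, meter_map, n_containers=4)
print(routed)
```

Because each container receives its own list, parallel sessions could in principle work on separate containers without contending for the same disk.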


Oncor PoC




Oncor PoC Details
  Simulation
     – 90 days' worth of meter data for 1 million meters
         • 15 minute intervals
         • One value stored per interval
     – 200 locations
     – 500 feeders
     – 34 substations

  Hardware
     – Power7 with 2 sockets each with 8 cores
     – 64 bit SUSE Linux 11
     – 128 GB of memory
         • Memory actually needed: 44GB, although it could probably be less
     – 6 disks dedicated to the database
         • Disk space actually used by the database: about 350GB (110 days)
     – 2 additional disks for the operating system and the LSE file staging area

  Software
     – Informix Ultimate Edition 11.7
     – Informix Timeseries




Informix Time Series Schema

A Meter reading looks like this:

CREATE ROW TYPE meter_data (
   tstamp datetime year to fraction(5),
   value  decimal(14,3)
);

An update (correction) record looks like this:

CREATE ROW TYPE update_day (
   tstamp      datetime year to fraction(5),
   last_update datetime year to fraction(5)
);

The Meter table looks like this:

CREATE TABLE meters (
  esi_id       char(64) not null primary key,
  suffix       char(32),
  location     char(16),
  feeder       char(16),
  sub_station  char(16),
  dbspace      varchar(128),
  container    varchar(128),
  actual       Timeseries(meter_data),
  estimated    Timeseries(meter_data),
  valid        Timeseries(update_day)
);

 Hierarchy is sub_station -> feeder -> meter.
 There are also tables for location, sub_station and feeder, not shown above.



Primary Use Cases
  Load 90 days' worth of data for 1 million meters from LSE files
     – Original set of LSE files massaged to generate 1 million distinct meters
     – Oracle: 6 hours; Timeseries: 18 minutes
  6-Day ERCOT Settlement Extract
     – Show support for the ERCOT settlement processes by creating an LSE file consisting of every record
       (every meter) for operating day - 6 (the calendar day that occurred 6 days prior to the current day). Must be
       able to extract and create the LSE files for 1M meters for a specific day.
     – Oracle: 5 hours; Timeseries: <7 minutes
  22-Day Update ERCOT Settlement Extract
     – Show support for the ERCOT settlement processes by creating LSE files consisting of every record
       that has had a consumption interval record update since the prior extract / pull (6-Day). Only extract
       the last or most current update for each meter, so if a meter has been updated four times, only the
       last / current record is sent. All 96 15-minute intervals for the day are sent each time as well.
     – Oracle: 8 hours; Timeseries: 4 minutes (90 day 11 minutes)
  Missing Record ERCOT Settlement Extract
     – Show support for the ERCOT settlement processes by creating an LSE file consisting of only the
       meter IDs and dates that are provided in a missing meter ID file from ERCOT. The dates will be
       between 28 and 90 days back in time.
     – Timeseries: 4000 random reads on one day, 6 seconds
     – Timeseries: 4000 random reads over many days, 24 seconds
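
As a rough cross-check of the load figures, the volume and throughput can be worked out from the numbers quoted in the PoC details (1 million meters, 90 days, 15 minute intervals). This is back-of-envelope arithmetic derived from the slides, not a measured result; the variable names are illustrative.

```python
METERS = 1_000_000
DAYS = 90
INTERVALS_PER_DAY = 24 * 60 // 15  # 96 readings per meter per day

total_intervals = METERS * DAYS * INTERVALS_PER_DAY


def per_second(count, minutes):
    """Average throughput in intervals per second for a load of given duration."""
    return count / (minutes * 60)


timeseries_rate = per_second(total_intervals, 18)   # 18 minute load
oracle_rate = per_second(total_intervals, 6 * 60)   # 6 hour load
load_speedup = (6 * 60) / 18

print(f"{total_intervals:,} intervals loaded")
print(f"Timeseries: {timeseries_rate:,.0f} intervals/s")
print(f"Oracle:     {oracle_rate:,.0f} intervals/s")
print(f"Speedup:    {load_speedup:.0f}x")
```

The same ratio calculation applies to the extract timings listed above.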


Other Use Cases

      Determine the count and the list of meter IDs for all meters with missing intervals
       and / or register reads on a given day
         – Oracle: 3-4 hours; Timeseries: <7 minutes

      Determine the 90 day history for a given meter (90 day aggregation)
         – Oracle: > 1 second; Timeseries: 0.04 seconds

      Determine the count and list of meter IDs that exceeded a given high interval
       value for a given day or given time period (multiple days). For example, count and
       list of meters that had an interval value of 12 or higher for a given period of time.
         – Timeseries: <6 minutes

      Determine the list of meters that have 5 or more consecutive days with estimated
       values only (no actual interval reads during a period of 5 or more days)
         – Oracle: 6 hours; Timeseries: 17 minutes

Internal Benchmark




Internal Benchmark - Requirements


  35 Million meters
     – 10 minute intervals with 5 values
     – 5 billion intervals per day

  12 Months data storage
     – Over 1.8 trillion intervals
     – Regular TimeSeries 30TB
     – Predicted Relational 84TB

  OLTP concurrent users
    – All running while data is loading

  Complex aggregations
    – Required new TSRollUp function
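
The volume figures on this slide follow from simple arithmetic, sketched below. The bytes-per-interval values are derived here and are not stated in the slides; they assume 1 TB = 10^12 bytes and a 365 day year.

```python
METERS = 35_000_000
INTERVALS_PER_DAY = 24 * 60 // 10  # 144 ten-minute intervals per meter per day

intervals_per_day = METERS * INTERVALS_PER_DAY
intervals_per_year = intervals_per_day * 365

TB = 10**12  # decimal terabyte (assumption)
bytes_per_interval_ts = 30 * TB / intervals_per_year    # regular TimeSeries
bytes_per_interval_rel = 84 * TB / intervals_per_year   # predicted relational

print(f"{intervals_per_day:,} intervals per day")
print(f"{intervals_per_year:,} intervals per year")
print(f"TimeSeries: {bytes_per_interval_ts:.1f} bytes per interval")
print(f"Relational: {bytes_per_interval_rel:.1f} bytes per interval")
```

Each interval carries 5 values, so the per-interval sizes cover all five readings.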




Internal Benchmark - Hardware

      IBM P780 with AIX 7.1
      Storage: IBM DS8000 - 576 HDD 146GB/15krpm
      Space used
        – TimeSeries intervals: 30TB
          • Split over 64 logical devices, 768 containers
        – Relational tables: 112GB
          • 1 main data dbspace, 70 fragmentation dbspaces
        – System use: 148GB
          • Root, log dbspaces + 6 temp dbspaces
      64 cores, primary CPU threads bound to 64 virtual processors
      1TB main memory, up to 950GB assigned to database server
        – 80GB relational data buffers
        – 680GB TimeSeries data buffers
        – 45GB system memory




Internal Benchmark - Results


  Data loading
    – Single day load:                20 minutes (64 Cores used)
    – Historical load of 12 months:   <6 days
    – Daily load during queries:      160 minutes (8 Cores used)
    – Data cleansing after load:      2 minutes

  Query performance
    – 3,000 concurrent sessions
    – Single meter queries sub-second response time
    – Larger summary queries executed in <5 seconds
    – No performance degradation during data load
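
The two daily load timings can be converted to a per-core rate for comparison with the loader's quoted figure of more than 50,000 intervals per second per core. The calculation below is derived from the slide numbers and assumes the daily volume is 35 million meters at 144 intervals each; it is an average, not a measured peak.

```python
INTERVALS_PER_DAY = 35_000_000 * 144  # daily volume from the requirements slide


def rate_per_core(intervals, minutes, cores):
    """Average intervals loaded per second per core."""
    return intervals / (minutes * 60) / cores


dedicated = rate_per_core(INTERVALS_PER_DAY, 20, 64)   # single day load, 64 cores
concurrent = rate_per_core(INTERVALS_PER_DAY, 160, 8)  # load during queries, 8 cores

print(f"Dedicated load:  {dedicated:,.0f} intervals/s/core")
print(f"Concurrent load: {concurrent:,.0f} intervals/s/core")
```

Both cases use the same number of core-minutes (20 x 64 and 160 x 8 are both 1,280), so the per-core rate comes out identical.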




AMT Sybex Benchmark

      Most ambitious Smart Meter Benchmark to date
      100 Million Meters
         – 30 minute intervals
         – 1, 2 or 3 daily registers
      Target was to confirm a 24hr operational window
        – Load data
        – Validate data
        – Calculate estimated corrections
        – Billing run for 6% of the meters


      [Diagram: Validation, Load, VEE and Query processes working against the Database,
       all on a single IBM Power 750 server]




AMT Sybex Benchmark
  Hardware
    – IBM Power 750 32 cores (3.5GHz) running AIX 7.1
    – 1 x Gb LAN Fibre adapter (dual port, using 1 port)
    – 2 x 8Gb FC adapters (dual port, using 4 ports)
    – 512GB memory
    – 1 x IBM XIV Storage System with 15 x 2TB data modules
  Software
    – IBM Informix Dynamic Server 11.70.FC3
    – IBM Informix TimeSeries 5.00.FC1
    – AMT-SYBEX SmartDTS v 6.0
  Database Server
    – 101,000,000 x 4KB buffers
    – 16 CPU VPs
    – 30 x 2GB logical logs
    – 40GB physical log
    – The time series were stored over 16 logical disks




AMT Sybex Benchmark – Processing time


  [Chart: Daily Processing Time, showing predictability of processing as database size increases.
   Left axis: minutes (0 to 540). Right axis: space used in TB (0.0 to 4.0).
   Series: Validation, Loading, VEE, Space Used.]




AMT Sybex Benchmark – Performance Results
 Individual operations

     Operation       Time in hrs   CPU

     Validate        2:18          100%
     Load            3:15          80%
     VEE             2:10          100%
     Total           7:43


     Billing Query   4:21          5%

     Overall total   12:04




AMT Sybex Benchmark – Performance Results
 Combined operations

     The Billing Query and the load can be run concurrently


     Operation         Time in hrs          CPU

     Validate          2:18                 100%
     Load + Billing    4:41                 85%
     VEE               2:10                 100%
     Overall Total     9:09


     This result confirmed that a processing window of just over 9 hours (9:09) was sufficient for the daily processing, well within the 24hr target
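
The totals in the results tables can be re-derived from the individual timings. The sketch below does only that arithmetic; the helper names hm and fmt are illustrative and the timings are copied from the slides.

```python
from datetime import timedelta


def hm(text):
    """Parse an 'h:mm' string into a timedelta."""
    hours, minutes = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))


def fmt(delta):
    """Format a timedelta back to 'h:mm'."""
    total_minutes = int(delta.total_seconds()) // 60
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


validate, load, vee, billing = hm("2:18"), hm("3:15"), hm("2:10"), hm("4:21")
load_plus_billing = hm("4:41")  # load and billing query run concurrently

core_total = validate + load + vee
sequential_total = core_total + billing
combined_total = validate + load_plus_billing + vee
saving = sequential_total - combined_total

print("Validate + Load + VEE:", fmt(core_total))
print("Run sequentially:     ", fmt(sequential_total))
print("Load and billing overlapped:", fmt(combined_total))
print("Time saved:           ", fmt(saving))
```

Overlapping the billing query with the load is what brings the overall total down from the sequential figure.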




How Does This Benchmark Compare?
Comparison of Published Benchmarks for Meter Data Management

                             Meters   Daily    Total   Total   DB      App        DB     App
                                      Reads    Cores   RAM     cores   cores      RAM    RAM
      Informix TimeSeries    100M     4.9B     16      500     16      (shared)   500    (shared)
      The Competition *      10M      970M     456     3668    48      <180       384    1.5TB

  Daily Readings (meters * registers * intervals)
     – Informix TimeSeries: 4,900,000,000
     – The Competition: 970,000,000
     – 5 times the performance

  Database Resources (CPU cores)
     – Informix TimeSeries, total cores: 16
     – The Competition, db cores: 48
     – The Competition, app server cores: 180
     – < 1/5 the resources

      … with significantly simpler management using a single node system
     * Based on latest published Oracle benchmark
       http://www.oracle.com/us/industries/utilities/ultilities-exadata-exalogic-wp-1499854.pdf
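
The two headline ratios can be recomputed from the chart values. This sketch uses the database and app server core counts shown in the chart (48 and 180, the latter an upper bound since the table lists it as <180); the dictionary layout and names are illustrative only.

```python
informix = {"daily_reads": 4_900_000_000, "cores": 16}
competition = {"daily_reads": 970_000_000, "db_cores": 48, "app_cores": 180}

# Throughput: daily readings handled by each system.
reads_ratio = informix["daily_reads"] / competition["daily_reads"]

# Resources: Informix cores against the competition's db plus app server cores.
competition_cores = competition["db_cores"] + competition["app_cores"]
core_ratio = informix["cores"] / competition_cores

# Daily readings handled per core, for a like-for-like view.
informix_reads_per_core = informix["daily_reads"] / informix["cores"]
competition_reads_per_core = competition["daily_reads"] / competition_cores

print(f"Daily reads ratio: {reads_ratio:.2f}x")
print(f"Core ratio:        {core_ratio:.2f}")
print(f"Reads per core:    {informix_reads_per_core:,.0f} vs {competition_reads_per_core:,.0f}")
```

On these inputs the reads ratio is a little over 5 and the core ratio is well under 1/5, in line with the two captions above.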
78        22 May 2012                                                                                                                                        2012
http://www.ibm.com/informix
           Cosmo@uk.ibm.com



 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 

Recently uploaded (20)

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Why Smart Meters Need Informix TimeSeries

  • 1. Why Smart Meters Need Informix TimeSeries IBM Data Server Day, Stockholm 2012-05-22 Cosmo@uk.ibm.com © 2012 IBM Corporation
  • 2. Please Note: IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  • 3. Acknowledgements and Disclaimers: Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results. © Copyright IBM Corporation 2012. All rights reserved. – U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. IBM, the IBM logo, ibm.com and IBM Informix are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. 
If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml Other company, product, or service names may be trademarks or service marks of others.
  • 4. Why Smart Meters Need Informix TimeSeries
      ▪ What challenges are being faced in the Energy & Utilities sector today?
      ▪ What is a Smart Meter and how can it help?
      ▪ How does Informix TimeSeries fit in?
      ▪ Case studies
        – 1M Oncor PoC
        – 35M internal benchmark
        – 100M AMT Sybex benchmark
  • 5. Consumers need Smart Meters
      ▪ Samuel Palmisano, chief executive officer of International Business Machines Corp., said improving the U.S. electric-transmission grid depends on providing better information to consumers. Companies shouldn’t wait for government to set standards for data and technologies to create a "smart grid," which lets consumers monitor their energy use and take conservation steps that can save energy and money. [September 21, 2010, at the Gridwise Global Forum in Washington]
  • 6. Energy Usage Issues
      ▪ Emission reduction goals:
        – EU: 20% emissions reduction by 2020 as compared to 1990.
        – UK: 60% reduction by 2050.
      ▪ Long lead times for new, “clean” energy supply.
      ▪ Lasting legacy of energy inefficiency:
        – 80% of refrigerators bought in 2007 will be in use in 2020.
        – Less than 1/3 of industrial infrastructure will be replaced by 2020.
        – Over 20% of cars bought in 2007 will still be on the road in 2020.
      ▪ Household efficiency a priority:
        – 25-30% of carbon emissions are from regular households.
        – 80% of home energy usage is heating.
        – EC projects 27% savings through efficiency in buildings.
  • 7. Why Smart Meters
      ▪ Access to near-real-time electricity usage information.
      ▪ Better control and management of electricity usage.
      ▪ Enable retail electric providers to develop and offer new, innovative plans that will lower consumer bills.
      ▪ Help make smarter decisions and change behaviours to help reduce consumption, or modify usage patterns.
      “Smart meter” often refers to an electrical meter, but it can also mean a device measuring natural gas or water consumption.
  • 8. Who is Using Smart Meters
      ▪ Utility Companies:
        – In the U.S., stimulus money used for smart meters.
        – Main drive is not reducing billing costs.
        – Better analysis of usage patterns.
        – Can different tariffs change energy consumption?
      ▪ Consumers:
        – Looking to reduce energy costs.
        – Wanting to improve their green credentials.
      ▪ Governments:
        – Need to show improvements in emissions.
        – Want to reduce energy consumption/reliance.
  • 9. Smart Meters Solve Real Problems
      ▪ Real-time information on energy usage.
      ▪ Gain control over personal energy usage
        – Modify electrical consumption:
          • California study: reductions of 5.7% to 8.7%.
          • Norwegian study: reductions of 9%.
          • UK study: reduction of 12%.
          • Oncor Texas: reductions of 5%-10%.
      ▪ Power companies:
        – Develop new innovative rate plans.
        – Avoid building new plants.
        – Avoid buying power from other sources.
        – Meet Green standards.
        – Reliable power restored quicker after outages.
  • 10. Data issues with Smart Meters
      ▪ Data issues: terabytes of new data
        – Ability to bring on new meters.
        – Store data for new regulatory reasons.
        – Analyse usage.
        – Automatically read meters.
      ▪ New data, new applications:
        – Billing
        – Portal
        – Compliance
        – New analytics
        – Combine meter and weather data
  • 12. What is Time Series Data?
      ▪ Time series data is:
        – A set of data where each item is time-stamped
          • Think of an array where each element can be indexed by time or by a timestamp: “Give me the Jan 1st element from time series X”
      ▪ Most useful when a range of data is normally read: “Give me the Jan 1st thru Jan 10th elements from time series X”
      ▪ Access to one time series is usually completed before moving to the next time series.
  • 13. How are Time Series Used?
      ▪ Users access the data by time range
        – Look at a range of data in the past
        – Make predictions about a range in the future
      ▪ Their analysis is often very proprietary
      ▪ Many keep large volumes of data online
      ▪ Many take in huge volumes of data each second
      ▪ All these markets use relational data as well
      ▪ All need to combine their relational data with time series data
  • 14. Key Strengths of Informix Timeseries
      ▪ Performance
        – Extremely fast data access
          • Data layout optimized on disk
        – Handles operations hard or impossible to do in standard SQL
      ▪ Space Savings
        – Can be over 70% space savings over standard relational layout
      ▪ Toolkit approach allows users to develop their own algorithms
        – Algorithms run in the database to leverage buffer pool
      ▪ Conceptually closer to how users think of time series
  • 15. Relational Time Series Representation
      Meter_ID  TimeStamp         phase1  phase2  ...  temp
      1         2010-06-01 00:00  1.3     0            15.6
      1         2010-06-01 00:30  1.6     0            15.6
      1         2010-06-01 01:00  1.4     0            15.5
      1         2010-06-01 01:30  1.4     0            15.4
      1         2010-06-01 02:00  1.4     0            15.5
      ...
      2         2010-06-01 00:00  0.4     0            12.3
      2         2010-06-01 00:30  0.3     0            12.3
      2         2010-06-01 01:00  0.2     0            12.2
      2         2010-06-01 01:30  0.5     0            12.3
      ...
      3         2010-06-01 00:00  0.0     3.5          13.6
      3         2010-06-01 00:30  0.0     4.3          12.2
      (Growth: the table gains one row per meter for every new reading.)
  • 16. Same Table Stored as a Time Series
      Meter_ID  Origin      00:00           00:30           01:00           01:30           ...
      1         2010-06-01  (1.3,0...15.6)  (1.6,0...15.5)  (1.4,0...15.5)  (1.4,0...15.4)
      2         2010-06-01  (0.4,0...12.3)  (0.3,0...12.3)  (0.2,0...12.2)  (0.5,0...12.3)
      3         2010-06-01  (0,3.5...13.6)  (0,4.3...12.2)
      ▪ There are only as many rows as meters
      ▪ Growth: each row is very long and grows as data is inserted
      ▪ Very fast access to a timeslot once the Meter_ID is selected
      ▪ Very fast to read a time-ordered set of values
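The change of shape between the two layouts can be sketched in plain Python. This is only an illustration of the idea (one row per meter, readings kept in time order); the function name `to_series_per_meter` and the tuple layout are assumptions made for this example and are not part of Informix.

```python
from collections import defaultdict

# Rows in the "tall, thin" relational layout: (meter_id, timestamp, phase1, temp).
# Values come from the example table; phase2 is omitted for brevity.
relational_rows = [
    (1, "2010-06-01 00:00", 1.3, 15.6),
    (1, "2010-06-01 00:30", 1.6, 15.6),
    (2, "2010-06-01 00:00", 0.4, 12.3),
    (1, "2010-06-01 01:00", 1.4, 15.5),
    (2, "2010-06-01 00:30", 0.3, 12.3),
]

def to_series_per_meter(rows):
    """Group rows by meter and keep each meter's readings in time order."""
    series = defaultdict(list)
    for meter_id, tstamp, phase1, temp in rows:
        series[meter_id].append((tstamp, phase1, temp))
    for readings in series.values():
        readings.sort()  # ISO-style timestamps sort correctly as strings
    return dict(series)

meters = to_series_per_meter(relational_rows)
```

After the pivot there are only as many entries as meters, and reading one meter's history is a single lookup followed by a sequential scan.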
  • 17. Informix TimeSeries
      ▪ A “timeseries” datatype is available in Informix
        – First introduced by Illustra in 1996
      ▪ Additional “objects” associated with timeseries:
        – “Calendar” datatype
          • For defining when data can be collected
        – Row types
          • For defining what should be collected
        – Containers
          • For defining where the data should be stored
        – Several support tables:
          • Calendar, tsinstancestable, tscontainertable
  • 18. Key Concepts: Regular Time Series
      ▪ Data collected uniformly over time intervals is a “regular” time series
        – For example: daily, hourly, etc.
      ▪ A regular time series always has exactly one record per interval
      ▪ If an interval is missing data then:
        – Missing data on an existing page takes up (a little) space
        – If all the intervals for a page are missing data then the page takes no space
      ▪ Values in one interval typically do not carry into the next
      ▪ Can be thought of as an array of data
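A minimal Python model of the "array of data" view, assuming an origin and a fixed interval: the timestamp is never stored, only the position derived from it. The class `RegularSeries` is hypothetical and models addressing only; it does not reproduce how Informix lays out or compresses pages.

```python
from datetime import datetime, timedelta

class RegularSeries:
    """Toy regular time series: one slot per interval, addressed by offset."""

    def __init__(self, origin, interval):
        self.origin = origin
        self.interval = interval
        self.slots = []  # slots[i] holds the value for origin + i * interval

    def offset(self, ts):
        return (ts - self.origin) // self.interval

    def put(self, ts, value):
        i = self.offset(ts)
        # Pad skipped intervals with None so positions stay aligned with time.
        self.slots.extend([None] * (i + 1 - len(self.slots)))
        self.slots[i] = value

    def get(self, ts):
        i = self.offset(ts)
        return self.slots[i] if 0 <= i < len(self.slots) else None

hourly = RegularSeries(datetime(2010, 6, 1), timedelta(hours=1))
hourly.put(datetime(2010, 6, 1, 0), 1.3)
hourly.put(datetime(2010, 6, 1, 3), 1.5)  # 01:00 and 02:00 were never reported
```

Because each interval has exactly one slot, writing to an interval twice overwrites the earlier value rather than adding a second record.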
  • 19. Key Concepts: Irregular Time Series
      ▪ Irregular time series also use intervals of time, however:
        – Unlike regular time series, irregular time series can store more than one record in a given time interval
          • For instance, multiple alerts can occur in the same second
        – Missing data never takes any room on disk
      ▪ Values in an irregular time series can be treated in two ways:
        – Values may persist until the next value arrives (stair step)
          • Example: total usage counter
        – Values are only valid at their given time point and do not “persist” (discrete)
          • Example: power outage alert
      ▪ Can also be thought of as an array of data
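The two ways of reading an irregular series can be illustrated with a short Python sketch. The data and the function names `stair_step` and `discrete` are invented for this example; it shows the lookup semantics only, not an Informix interface.

```python
from bisect import bisect_right

# Irregular series as a time-sorted list of (seconds_since_origin, value).
# Two records share t=120, which a regular series could not hold.
events = [(0, 100.0), (45, 101.5), (120, 103.0), (120, 103.2), (300, 104.0)]
times = [t for t, _ in events]

def stair_step(t):
    """Latest value at or before t: it persists until the next record."""
    i = bisect_right(times, t)
    return events[i - 1][1] if i else None

def discrete(t):
    """Only records stamped exactly t; nothing carries forward."""
    return [v for ts, v in events if ts == t]
```

A usage counter read at t=200 returns the value recorded at t=120 under the stair-step reading, while an alert series read at t=200 returns nothing under the discrete reading.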
  • 20. Key Concepts: Calendar Datatype
      ▪ Every Timeseries has an associated calendar
      ▪ A calendar is made up of several parts:
        – A name
        – A pattern of intervals
        – A start date
      For instance, to create a calendar called “weekday” that starts on Jan 1 2010 and defines regular work days you would issue this query:
        INSERT INTO Calendartable (c_name, c_calendar) VALUES ('weekday', 'start(2010-01-01 00:00:00), pattern({5 on, 2 off}, day)');
      ▪ The system catalog called “calendartable” holds all the calendars that have been defined
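How a repeating on/off pattern selects valid intervals can be sketched as follows. This is a plain Python model under the assumption that the pattern simply repeats from the calendar start; the helper `is_on` and the list-of-runs encoding are hypothetical and are not the Informix calendar implementation.

```python
from datetime import datetime, timedelta

def is_on(ts, start, pattern, unit):
    """Return True if ts falls in an 'on' interval of a repeating pattern.

    pattern is a list of (count, is_on) runs, e.g. [(5, True), (2, False)]
    for "{5 on, 2 off}". unit is the interval length as a timedelta.
    """
    cycle = sum(count for count, _ in pattern)
    position = ((ts - start) // unit) % cycle
    for count, on in pattern:
        if position < count:
            return on
        position -= count
    raise AssertionError("unreachable: position is always inside the cycle")

START = datetime(2010, 1, 1)
five_two = [(5, True), (2, False)]      # {5 on, 2 off}, day
half_hour = [(1, True), (29, False)]    # {1 on, 29 off}, minute
```

With `half_hour` and a one-minute unit, only timestamps on the hour and half hour are "on", which is the 30-minute pattern the deck uses for meter readings.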
  • 21. Key Concepts: Row Types
      ▪ A Timeseries is made up of a series of timestamped rows
      ▪ The granularity of the timestamp is 10 microseconds (.00001 seconds)
      ▪ The SQL syntax that defines a row type is:
        CREATE ROW TYPE reading (tstamp DATETIME, phase1 DECIMAL,…)
        – NOTE: Timeseries requires the type of the first column (the type of tstamp) to be “datetime year to fraction(5)”
      ▪ Data in the row can be missing (NULL)
        – Missing data takes no space in a time series
      ▪ Rows can be marked as hidden
        – Useful for holidays and other times where data is not available
  • 22. Key Concepts: Containers
      ▪ A “container” is the name given to the data structure that holds data for one or more time series.
      ▪ It guarantees that time series data is stored clustered and sorted on disk
      ▪ A container is explicitly created using this SQL syntax:
        EXECUTE PROCEDURE TsContainerCreate('cont_name', 'dbspace_name', 'rowtype_name', first_extent, next_extent);
        – cont_name is the name given to the new container
        – dbspace_name is the name of an existing dbspace (predefined area of disk)
        – rowtype_name is the name of an existing row type
        – first_extent is the size of the first extent of storage
        – next_extent is the size of the subsequent extents of storage
      ▪ TimeSeries 5.00 has an automatic container allocation mechanism
        – With no container definition the dbspace of the table is used
        – Otherwise user defined pools can be used
        – Policy can be Round Robin or user defined
  • 23. Putting it all Together
      ▪ Create a calendar for 30 minute intervals:
        INSERT INTO Calendartable (c_name, c_calendar) VALUES ('interval', 'start(2010-01-01 00:00:00), pattern({1 on, 29 off}, minute)');
      ▪ Create a row type:
        CREATE ROW TYPE reading (tstamp datetime year to fraction(5), phase1 DECIMAL, phase2 DECIMAL, phase3 DECIMAL, temp DECIMAL);
      ▪ Create a container:
        EXECUTE PROCEDURE TsContainerCreate ('int_cont1', 'tsdbs', 'reading', 1024, 1024);
      ▪ Create a table and insert a “blank” row for 1 meter:
        CREATE TABLE meters (Meter_ID char(64), Actual timeseries(reading));
        INSERT INTO meters VALUES ("9908898", "origin (2010-01-01 00:00:00), calendar(interval), container(int_cont1), regular");
  • 24. Relational Storage – Traditional Index Method
      ▪ Data pages have mixed Meter_IDs
      ▪ Multiple page access required
      ▪ Key stored in both index and data
      [Figure: a B-tree leading from a root page, through Meter_ID/Start/End branch entries (MX001, MX002, ... MX1209980), to an index page of (Meter_ID, TStamp, Pointer) entries. The pointers lead to a data page of (Meter_ID, TStamp, usage) rows in which readings for MX001, MX002 and MX003 are interleaved.]
  • 25. Relational Storage – “High Performance” Index Method
      ▪ Only index page access required
      ▪ But all data is stored in both index and data pages
      [Figure: the same B-tree as on the previous slide, but each index page entry now carries (Meter_ID, TStamp, Usage, Pointer), so the usage values appear both in the index page and in the data page.]
  • 26. An Informix Table Containing a Timeseries Column
      The timeseries in the table is a physical reference to a mini-btree in a container
      Meter_ID  Timeseries(reading)
      MX001     [container_A, 1]
      MX002     [container_B, 2]
      MX003     [container_A, 3]
      MX004     [container_C, 4]
      MX234     [container_C, 5]
      MX239     [container_B, 6]
      MX675     [container_C, 7]
      MX521     [container_C, 8]
      [Figure: each table entry points into one of three containers, “A”, “B” and “C”.]
  • 27. Timeseries Container Layout
      ▪ The btree index key is the time series id plus either:
        • An integer for regular time series
        • A timestamp for irregular time series
      ▪ Each low-level page holds sorted data for exactly one time series
      [Figure: index twig pages above low-level pages labelled 4, 5, 7, 8, 12 and 16.]
  • 28. Irregular Timeseries Storage Compared to Relational
      ▪ Data values only stored once
      ▪ No data pointers or pages
      ▪ Multiple, smaller btrees
      [Figure: a B-tree keyed on TS_ID/Start/End (TS_ID 1, 2, ... 1000) leading to an irregular timeseries page that holds the timestamped usage values for MX001 in time order (01:03, 01:45, 02:06, 02:08, 02:25), shown next to a relational data page in which rows for MX001, MX002 and MX003 are interleaved.]
  • 29. Regular Timeseries Storage Compared to Relational
      ▪ Data is only stored once
      ▪ No timestamps or data pages
      ▪ Multiple, smaller btrees
      [Figure: a B-tree keyed on TS_ID/Start/End (TS_ID 1, 2, ... 1000) leading to a regular timeseries page that holds the usage values for MX001 at 30-minute intervals (00:00 to 02:00), shown next to a relational data page in which rows for MX001, MX002 and MX003 are interleaved.]
  • 30. Informix Timeseries Space Saving
      ▪ There is a small overhead for the b-tree pages
        – Meter_ID and Timestamp stored
        – Also pointer to Timeseries page
      ▪ Irregular Timeseries must store Timestamp for each element
        – 8 bytes extra overhead per element
      ▪ Regular Timeseries uses known offsets
        – No Timestamp stored
        – Even more efficient
      ▪ NULL data is compressed
        – NULL elements (missing regular elements) take zero space
        – Sparse arrays are not stored at all if no elements in time range
        – Unlike relational storage, NULL values take no space
        – A row type of (DECIMAL(12), INTEGER, INTEGER) is 7 + 4 + 4 = 15 bytes
        – Storing (NULL, 1, NULL) would only require 4 bytes
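The NULL arithmetic on this slide can be checked with a few lines of Python. The sketch assumes the column widths quoted on the slide and ignores any per-row bookkeeping the server may add; `relational_size` and `compressed_size` are names invented for this example.

```python
# Column widths in bytes for the slide's row type (DECIMAL(12), INTEGER, INTEGER).
COLUMN_BYTES = (7, 4, 4)

def relational_size(row):
    """Fixed-width relational row: every column takes its full width."""
    return sum(COLUMN_BYTES)

def compressed_size(row):
    """Model of the slide's claim: NULL columns take no space."""
    return sum(width for width, value in zip(COLUMN_BYTES, row) if value is not None)
```

For the row (NULL, 1, NULL) the first function gives 15 bytes and the second 4 bytes, matching the figures above.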
  • 31. Worked Example – Relational Method
      Number of meters:  3,000,000
      Interval:          15 minutes (96 readings per day)
      Meter ID length:   8 bytes
      Timestamp length:  12 bytes
      Data length:       8 + 6 bytes + 2 bytes slot overhead
      Data space:        3000000 * 96 * (8 + 12 + 8 + 6 + 2) = 10GB
      Index space:       3000000 * 96 * (8 + 12 + 8 + 2) + 10% b+tree overhead = 9GB
      Total storage:     19GB per day
  • 32. Worked Example – Informix Timeseries
      Number of meters:     3,000,000
      Interval:             15 minutes (96 readings per day)
      Meter ID length:      64 bytes
      Timestamp length:     12 bytes
      Timeseries metadata:  86 bytes
      Data length:          8 + 6 bytes + 2 bytes slot overhead
      Fixed data space:     3000000 * (64 + 86) = 429MB
      Timeseries overhead:  3000000 * (12 + 4 + 2) + 10% = 66MB
      Variable data space:  3000000 * 96 * (8 + 6 + 2) = 4.4GB
      That is a huge saving of 76%
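The two worked examples can be re-run from the formulas as printed. The Python sketch below is only a check of the arithmetic; it assumes the "+ 10%" is applied to the whole bracketed product and that one day of data is compared. Evaluated this way, some intermediate values differ slightly from the rounded figures on the slides, and the saving comes out at roughly three quarters, in the same region as the quoted 76%.

```python
METERS = 3_000_000
READINGS_PER_DAY = 96  # one reading every 15 minutes

# Relational layout: every reading repeats the meter id (8) and timestamp (12).
rel_data = METERS * READINGS_PER_DAY * (8 + 12 + 8 + 6 + 2)
rel_index = METERS * READINGS_PER_DAY * (8 + 12 + 8 + 2) * 110 // 100  # +10% b+tree overhead
rel_total = rel_data + rel_index

# TimeSeries layout: a fixed cost per meter plus 16 bytes per reading.
ts_fixed = METERS * (64 + 86)
ts_overhead = METERS * (12 + 4 + 2) * 110 // 100  # +10%
ts_variable = METERS * READINGS_PER_DAY * (8 + 6 + 2)
ts_total = ts_fixed + ts_overhead + ts_variable

saving = 1 - ts_total / rel_total
print(f"relational: {rel_total:,} bytes, timeseries: {ts_total:,} bytes, saving: {saving:.1%}")
```

The dominant term in both layouts is the per-reading cost (36 + 33 bytes against 16 bytes), which is why dropping the repeated meter id, timestamp and index entry accounts for almost all of the saving.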
  • 33. Timeseries Simplicity – Example
      • Much simpler SQL – Apply a tariff
      Relational:
        SELECT meter_id, sum(value * 1.76) FROM meters WHERE (tstamp BETWEEN '2010-06-02 00:00' AND '2010-06-02 06:59') OR (tstamp BETWEEN '2010-06-02 21:00' AND '2010-06-02 23:59') GROUP BY 1;
      Timeseries:
        SELECT meter_id, apply_tariff (readings, tariff, '2010-06-02 00:00', '2010-06-02 23:59')::Timeseries(applied_cost) FROM meters;
      ▪ But what if there is a missing value in the interval data?
      ▪ What if you want to reference data outside the query range?
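What the relational query computes for a single meter can be written out in Python. This is a sketch of the calculation only: the rate 1.76 and the two time windows are taken from the query above, while the function names, the sample readings and the choice to skip missing (None) values are assumptions of this example, not the behaviour of `apply_tariff`.

```python
from datetime import datetime, time

RATE = 1.76  # rate taken from the relational query on the slide

def in_tariff_window(ts):
    """Windows from the slide: 00:00-06:59 and 21:00-23:59."""
    return ts.time() < time(7, 0) or ts.time() >= time(21, 0)

def tariff_cost(readings):
    """Sum usage * rate over one meter's readings inside the windows.

    Missing values (None) are skipped here; a real tariff routine might
    instead estimate them from neighbouring intervals.
    """
    return sum(
        value * RATE
        for ts, value in readings
        if value is not None and in_tariff_window(ts)
    )

readings = [
    (datetime(2010, 6, 2, 6, 30), 1.0),
    (datetime(2010, 6, 2, 12, 0), 5.0),   # outside both windows, ignored
    (datetime(2010, 6, 2, 21, 0), 2.0),
    (datetime(2010, 6, 2, 22, 0), None),  # missing interval
]
cost = tariff_cost(readings)
```

The explicit handling of the missing interval is the kind of decision the slide's closing questions point at: plain SQL aggregation silently ignores it, whereas a routine running over the whole series can apply a deliberate policy.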
  • 34. Building Applications with the TimeSeries Datablade
      ▪ Standard client access to server
        – ESQL/C
        – ODBC, JDBC, .NET
        – Perl DBD::Informix, PHP, Ruby
      ▪ Several Timeseries specific interfaces are available:
        – SQL
        – VTI
        – SPL
        – Java (client & server)
        – C-API (client & server)
      ▪ It’s a toolkit approach!
        – Allow people to build their analytics in the server
  • 35. Informix Timeseries SQL Interface
      ▪ Timeseries data is usually accessed through user defined routines (UDRs) from SQL
      ▪ Over 80 predefined functions come with Informix Timeseries:
        – Clip(): clip a range of a time series and return it
        – GetLastElem(), GetFirstElem(): return the last (first) element in the time series
        – Apply(): run a query across a time series
          • Apply filters, project only subset of columns, apply functions to elements, etc.
        – AggregateBy(): roll up or down values
          • Change the frequency of a Timeseries from hourly to daily, for instance
        – SetContainerName(): move a Timeseries from one container to another
        – BulkLoad(): load data into a Timeseries from a file
  • 36. TimeSeries SQL Examples
      ▪ Get all meter data for meter 3 for the last day:
        SELECT Clip(reading, CURRENT - 1 UNITS DAY, CURRENT) FROM meters WHERE Meter_ID = '3';
      ▪ Get the last meter record for meter 3:
        SELECT GetLastElem(reading) FROM meters WHERE Meter_ID = '3';
      ▪ Find the maximum usage by week for meter 3 over the last 30 days:
        SELECT AggregateBy('max($usage)', 'weeklycal', reading, CURRENT - 30 UNITS DAY, CURRENT) FROM meters WHERE Meter_ID = '3';
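The intent of these three queries can be modelled on a plain Python list of (timestamp, value) pairs. The functions below are stand-ins written for this illustration and are not the Informix routines; in particular `max_by_week` counts 7-day buckets from the start of the range, whereas the SQL example relies on a calendar named `weeklycal` to define the weeks.

```python
from datetime import datetime, timedelta

def clip(series, start, end):
    """Elements with start <= timestamp <= end (series is time-sorted)."""
    return [(ts, v) for ts, v in series if start <= ts <= end]

def last_element(series):
    """Most recent element, or None for an empty series."""
    return series[-1] if series else None

def max_by_week(series, start, end):
    """Maximum value per 7-day bucket counted from start."""
    buckets = {}
    for ts, v in clip(series, start, end):
        week = (ts - start).days // 7
        buckets[week] = max(v, buckets.get(week, v))
    return [buckets[w] for w in sorted(buckets)]

# Hypothetical daily usage for one meter: 1.0, 2.0, ... over 14 days.
day0 = datetime(2010, 6, 1)
usage = [(day0 + timedelta(days=d), float(d + 1)) for d in range(14)]
```

Each function touches only one meter's series, which mirrors the access pattern described earlier in the deck: select the meter first, then work on a contiguous time range.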
  • 37. Informix Timeseries VTI Interface
      ▪ Makes time series data look like standard relational data
        – Useful for programs that can’t read our proprietary Timeseries format
        – There is a small penalty for using VTI
      ▪ Restrictions
        – No secondary indices are allowed
        – No triggers allowed
      ▪ SQL to create a VTI table:
        – If you have a table called “meters” with a time series column, the following statement will create an equivalent VTI table:
          EXECUTE PROCEDURE tscreatevirtualtab('readings', 'meters');
  • 38. VTI Interface: Continued
      Meters – The Timeseries data
      Meter_ID  Origin      00:00  01:00  02:00  03:00  ...
      MX001     2010-06-01  1.3    1.6    1.4    1.5
      MX002     2010-06-01  0.4    0.3    0.2    0.5
      MX003     2010-06-01  3.5    4.3
      Readings – A virtual view of the Timeseries data
      Meter_ID  TStamp            usage
      MX001     2010-06-01 00:00  1.3
      MX001     2010-06-01 01:00  1.6
      MX001     2010-06-01 02:00  1.4
      MX001     2010-06-01 03:00  1.5
      ...
      MX002     2010-06-01 00:00  0.4
      MX002     2010-06-01 01:00  0.3
      The VTI view is equivalent to the tall thin relational table and can be easily accessed by any SQL client
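The relationship between the two tables can be sketched in Python as a generator that expands each stored series into relational-looking rows on demand. The dictionary layout and the name `virtual_rows` are assumptions of this example; it illustrates the idea of a virtual view and says nothing about how the VTI is implemented inside the server.

```python
from datetime import datetime, timedelta

# One entry per meter: (origin, interval, values), as in the "Meters" table.
meters = {
    "MX001": (datetime(2010, 6, 1), timedelta(hours=1), [1.3, 1.6, 1.4, 1.5]),
    "MX002": (datetime(2010, 6, 1), timedelta(hours=1), [0.4, 0.3, 0.2, 0.5]),
}

def virtual_rows(table):
    """Yield (meter_id, tstamp, usage) rows, skipping missing elements."""
    for meter_id, (origin, interval, values) in table.items():
        for i, usage in enumerate(values):
            if usage is not None:
                yield meter_id, origin + i * interval, usage

rows = list(virtual_rows(meters))
```

Nothing is copied into a second table: the rows are computed from the stored series each time they are requested, so the timestamps exist only in the view.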
  • 39. Informix TimeSeries 5.00 VTI Interface
    TimeSeries 5.00 VTI enhancements
      – Update regular VTI using primary key only
      – Use of TimeSeries expressions (read only)
    SQL to create a VTI table with an expression:
      EXECUTE PROCEDURE TSCreateExpressionVirtualTab(
        'day_agg_readings',
        'devices',
        'AggregateBy("sum($kwh),avg($phase_a),avg($phase_b),avg($phase_c)", "cal1day", readings, 0)',
        'reading',
        1024,
        'readings');
  • 40. Comparison of VTI vs Native Time Series Queries
    Select a range of data for a meter:
      – Native:
        SELECT Clip(reading, '2010-01-01', '2010-01-10')
        FROM Meters WHERE Meter_ID = '2';
      – VTI:
        SELECT * FROM readings
        WHERE tstamp BETWEEN '2010-01-01' AND '2010-01-10' AND Meter_ID = '2';
    Find the max usage for a given meter in a given period of time:
      – Native:
        SELECT Apply('max($usage)', '2010-01-01', '2010-01-10', reading)
        FROM Meters WHERE Meter_ID = '2';
      – VTI:
        SELECT max(usage) FROM readings
        WHERE tstamp BETWEEN '2010-01-01' AND '2010-01-10' AND Meter_ID = '2';
    Note:
      – Native queries are normally faster than VTI, probably by 5 to 10%
      – It is often much faster to write custom user-defined functions
      – VTI functions are very convenient for standard SQL clients
  • 41. TimeSeries C-API Interface
    Client and server versions of the API
    Treats a time series like a table (sort of)
      – Functions to open and close a time series
      – Functions to scan a time series between 2 timestamps
      – Functions to create a time series
      – Functions to retrieve, insert, delete, update
    Plus another 70 functions defined
  • 43. TimeSeries Data Loading
    TimeSeries is a specialist type and benefits from a specialist data loading mechanism
    Traditionally the Real Time Loader has been used for high-speed TimeSeries data insert
      – Developed for stock market trade data
      – Good for irregular time series
      – Small symbol universe (tens of thousands of stocks)
      – Data arriving in timestamp order
      – Small number of active stocks
      – Needs to cope with very high peak loads at exchange open and close
    Smart meter data is a new challenge
      – Time series is regular
      – Many millions of meters
      – Data batched by meter identifier
      – All meters equally active
  • 44. Smart Meter Data Loader
    Uses an internal mechanism similar to the Real Time Loader (RTL) to access containers directly
    Builds an internal map of Meter ID to TimeSeries ID
    Can use fragmentation of the base table for better parallelism
    Parallel sessions can work on separate disks to reduce contention
    Load rates can be in excess of 50,000 intervals per second per core
  • 45. Smart Meter Data Loader – Architecture
    [Diagram: randomly distributed Meter IDs are mapped through a hash table to TimeSeries IDs; the Meter Data Loaders write to Containers, which sit on Physical Disks]
    Hash table
      Meter_ID  TS ID
      7898765   1
      2168768   2
      9879821   3
      1656578   4
      8787987   5
      4678768   6
      7354658   7
      2537591   8
      8973547   9
      1352857   10
      3451759   11
      7656472   12
      6543897   13
      3324516   14
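The lookup structure in the diagram can be sketched as a Python dictionary from Meter ID to TimeSeries ID. The slide does not say how work is assigned to individual loaders, so the modulo routing in `loader_for` is a hypothetical rule used only to show why a dense TS ID is convenient for parallel loading.

```python
def build_meter_map(meter_ids):
    """Assign sequential TimeSeries IDs (starting at 1) to meter IDs in arrival order."""
    return {meter_id: ts_id for ts_id, meter_id in enumerate(meter_ids, start=1)}

def loader_for(ts_id, n_loaders):
    """Pick a loader index by modulo. Hypothetical routing rule, not from the slides."""
    return ts_id % n_loaders

# First four Meter IDs from the slide's hash table.
meter_map = build_meter_map([7898765, 2168768, 9879821, 1656578])
ts_id = meter_map[9879821]           # constant-time lookup on the random Meter ID
loader = loader_for(ts_id, n_loaders=2)
```

A hash lookup keeps the cost per incoming record constant even though the Meter IDs arrive in no particular order.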
  • 46. Oncor PoC
  • 47. Oncor PoC Details
    Simulation
      – 90 days' worth of meter data for 1 million meters
        • 15-minute intervals
        • One value stored per interval
      – 200 locations
      – 500 feeders
      – 34 substations
    Hardware
      – Power7 with 2 sockets, each with 8 cores
      – 64-bit SUSE Linux 11
      – 128 GB of memory
        • Memory actually needed: 44 GB, although it could probably be less
      – 6 disks dedicated to the database, 2 additional for OS and LSE staging
        • Disk space actually used by the database: about 350 GB (110 days)
      – Additional disks for the operating system and staging area for files
    Software
      – Informix Ultimate Edition 11.7
      – Informix TimeSeries
  • 48. Informix Time Series Schema
    The Meter table looks like this:
      CREATE TABLE meters (
        esi_id char(64) not null primary key,
        suffix char(32),
        location char(16),
        feeder char(16),
        sub_station char(16),
        dbspace varchar(128),
        container varchar(128),
        actual Timeseries(meter_data),
        estimated Timeseries(meter_data),
        valid Timeseries(update_day)
      );
    A Meter reading looks like this:
      CREATE ROW TYPE meter_data (
        tstamp datetime year to fraction(5),
        value decimal(14,3)
      );
    An update (correction) record looks like:
      CREATE ROW TYPE update_day (
        tstamp datetime year to fraction(5),
        last_update datetime year to fraction(5)
      );
    Hierarchy is sub_station->feeder->meter. There are also tables for location, sub_station and feeder, not shown above.
  • 49. Primary Use Cases
    Load 90 days' worth of data for 1 million meters from LSE files
      – Original set of LSE files massaged to generate 1 million distinct meters
      – Oracle: 6 hours; TimeSeries: 18 minutes
    6-Day ERCOT Settlement Extract
      – Show support for the ERCOT settlement processes by creating an LSE file consisting of every record (every meter) for operating day - 6 (the calendar day that occurred 6 days prior to the current day). Must be able to extract and create the LSE files for 1M meters for a specific day.
      – Oracle: 5 hours; TimeSeries: <7 minutes
    22-Day Update ERCOT Settlement Extract
      – Show support for the ERCOT settlement processes by creating LSE files consisting of every record that has had a consumption interval record update since the prior extract / pull (6-Day). Only the last or most current update for each meter is extracted, so if a meter has been updated four times, only the last / current record is sent. All 96 15-minute intervals are sent each time as well.
      – Oracle: 8 hours; TimeSeries: 4 minutes (90 day: 11 minutes)
    Missing Record ERCOT Settlement Extract
      – Show support for the ERCOT settlement processes by creating an LSE file consisting of only the meter IDs and dates provided in a missing meter ID file from ERCOT. The dates range from 28 to 90 days in the past.
      – 4000 random reads on one day: 6 seconds
      – 4000 random reads over many days: 24 seconds
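The selection rule of the 22-day update extract (send only the most recent update per meter, with its full day of 96 intervals) can be sketched as follows. This is a minimal Python illustration of the rule, not the PoC implementation; the record layout `(meter_id, updated_at, intervals)` and the sample values are assumptions.

```python
from datetime import datetime

def latest_updates(updates):
    """Keep only the most recent update record for each meter."""
    latest = {}
    for meter_id, updated_at, intervals in updates:
        if meter_id not in latest or updated_at > latest[meter_id][0]:
            latest[meter_id] = (updated_at, intervals)
    return latest

# Hypothetical updates: meter M1 was corrected twice, M2 once.
updates = [
    ("M1", datetime(2012, 5, 1, 8), [1.0] * 96),
    ("M1", datetime(2012, 5, 3, 9), [1.5] * 96),
    ("M2", datetime(2012, 5, 2, 7), [0.4] * 96),
]
extract = latest_updates(updates)
```

Only the 2012-05-03 record survives for M1, and each surviving record still carries all 96 intervals for the day.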
  • 50. Other Use Cases
    Determine the count and the list of meter IDs for all meters with missing intervals and / or register reads on a given day
      – Oracle: 3-4 hours; TimeSeries: <7 minutes
    Determine the 90-day history for a given meter (90-day aggregation)
      – Oracle: > 1 second; TimeSeries: 0.04 seconds
    Determine the count and list of meter IDs that exceeded a given high interval value for a given day or time period (multiple days). For example, the count and list of meters that had an interval value of 12 or higher for a given period of time.
      – TimeSeries: <6 minutes
    Determine the list of meters that have 5 or more consecutive days with estimated values only (no actual interval reads during a period of 5 or more days)
      – Oracle: 6 hours; TimeSeries: 17 minutes
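The last use case is a run-length check over a meter's daily history. A minimal Python sketch of that logic is shown below; the input format (one boolean per calendar day, True when the day has estimated values only) is an assumption chosen for illustration.

```python
def has_estimated_run(days, min_run=5):
    """Return True if `days` contains at least `min_run` consecutive True values.

    `days` is a list of booleans in calendar order; True means the day has
    estimated values only (no actual interval reads).
    """
    run = 0
    for estimated_only in days:
        run = run + 1 if estimated_only else 0
        if run >= min_run:
            return True
    return False

# Hypothetical histories: the first has five estimated days in a row, the second does not.
flagged = has_estimated_run([False, True, True, True, True, True, False])
not_flagged = has_estimated_run([True, True, False, True, True, True])
```

A single pass per meter is enough, which suits data that is already stored in time order.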
  • 52. Internal Benchmark - Requirements
    35 million meters
      – 10-minute intervals with 5 values
      – 5 billion intervals per day
    12 months of data storage
      – Over 1.8 trillion intervals
      – Regular TimeSeries: 30 TB
      – Predicted relational: 84 TB
    OLTP concurrent users
      – All running while data is loading
    Complex aggregations
      – Required new TSRollUp function
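The interval counts on this slide follow directly from the meter count and the interval length. A short Python check, assuming a 365-day year:

```python
METERS = 35_000_000
INTERVAL_MINUTES = 10
DAYS_PER_YEAR = 365  # assumption: non-leap year

intervals_per_meter_per_day = 24 * 60 // INTERVAL_MINUTES
intervals_per_day = METERS * intervals_per_meter_per_day
intervals_per_year = intervals_per_day * DAYS_PER_YEAR

print(f"{intervals_per_meter_per_day} intervals per meter per day")
print(f"{intervals_per_day:,} intervals per day")
print(f"{intervals_per_year:,} intervals per year")
```

This gives 144 intervals per meter per day, about 5.04 billion intervals per day, and about 1.84 trillion per year, in line with the "5 billion" and "over 1.8 trillion" figures above.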
  • 53. Internal Benchmark - Hardware
    IBM P780 with AIX 7.1
    Storage: IBM DS8000 with 576 HDDs (146 GB, 15k rpm)
    Space used
      – TimeSeries intervals: 30 TB
        • Split over 64 logical devices, 768 containers
      – Relational tables: 112 GB
        • 1 main data dbspace, 70 fragmentation dbspaces
      – System use: 148 GB
        • Root, log dbspaces + 6 temp dbspaces
    64 cores, primary CPU threads affinitized to 64 virtual processors
    1 TB main memory, up to 950 GB assigned to the database server
      – 80 GB relational data buffers
      – 680 GB TimeSeries data buffers
      – 45 GB system memory
  • 54. Internal Benchmark - Results
    Data loading
      – Single day load: 20 minutes (64 cores used)
      – Historical load of 12 months: <6 days
      – Daily load during queries: 160 minutes (8 cores used)
      – Data cleansing after load: 2 minutes
    Query performance
      – 3,000 concurrent sessions
      – Single meter queries: sub-second response time
      – Larger summary queries executed in <5 seconds
      – No performance degradation during data load
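The load times can be turned into a per-core rate with a little arithmetic. The Python sketch below assumes that one day's load is the full set of intervals for 35 million meters at 10-minute intervals (the benchmark requirement) and that the quoted times are exact; both are simplifications.

```python
INTERVALS_PER_DAY = 35_000_000 * (24 * 60 // 10)  # 35M meters, 10-minute intervals

def per_core_rate(minutes, cores, intervals=INTERVALS_PER_DAY):
    """Intervals loaded per second per core for a load taking `minutes` on `cores` cores."""
    return intervals / (minutes * 60) / cores

full_speed = per_core_rate(minutes=20, cores=64)      # single day load
during_queries = per_core_rate(minutes=160, cores=8)  # daily load while queries run

print(f"{full_speed:,.0f} intervals/s/core at full speed")
print(f"{during_queries:,.0f} intervals/s/core during queries")
```

Under these assumptions both cases work out to roughly 65,000 intervals per second per core, which is consistent with the loader claim of more than 50,000 intervals per second per core.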
  • 56. AMT Sybex Benchmark
    Most ambitious smart meter benchmark to date
    100 million meters
      – 30-minute intervals
      – 1, 2 or 3 daily registers
    Target was to confirm a 24hr operational window
      – Load data
      – Validate data
      – Calculate estimated corrections
      – Billing run for 6% of the meters
    [Diagram labels: Validation, Load, VEE, Database, Query; single IBM Power 750 server]
  • 57. AMT Sybex Benchmark
    Hardware
      – IBM Power 750, 32 cores (3.5GHz), running AIX 7.1
      – 1 x Gb LAN Fibre adapter (dual port, using 1 port)
      – 2 x 8Gb FC adapters (dual port, using 4 ports)
      – 512 GB memory
      – 1 x IBM XIV Storage System with 15 x 2 TB data modules
    Software
      – IBM Informix Dynamic Server 11.70.FC3
      – IBM Informix TimeSeries 5.00.FC1
      – AMT-SYBEX SmartDTS v 6.0
    Database Server
      – 101,000,000 x 4 KB buffers
      – 16 CPU VPs
      – 30 x 2 GB logical logs
      – 40 GB physical log
      – The time series were stored over 16 logical disks
  • 58. AMT Sybex Benchmark – Processing time
    Daily Processing Time: showing predictability of processing as database size increases
    [Chart: left axis Minutes (0 to 540), right axis TB (0.0 to 4.0); series: Validation, Loading, VEE, Space Used]
  • 59. AMT Sybex Benchmark – Performance Results
    Individual operations
      Operation  Time in hrs  CPU
      Validate   2:18         100%
      Load       3:15         80%
      VEE        2:10         100%
      Total      7:43
  • 60. AMT Sybex Benchmark – Performance Results
    Individual operations
      Operation      Time in hrs  CPU
      Validate       2:18         100%
      Load           3:15         80%
      VEE            2:10         100%
      Total          7:43
      Billing Query  4:21         5%
      Overall total  12:04
  • 61. AMT Sybex Benchmark – Performance Results
    Combined operations
    The Billing Query and the load can be run concurrently
      Operation       Time in hrs  CPU
      Validate        2:18         100%
      Load + Billing  4:41         85%
      VEE             2:10         100%
      Overall Total   9:09
    This result confirmed that a 9hr processing window was sufficient for the daily processing
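The totals for the sequential and combined runs can be checked by adding the per-operation times. The small Python helpers below (`to_minutes`, `to_hhmm`) are written just for this check.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' duration string to whole minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def to_hhmm(total_minutes):
    """Convert whole minutes back to an 'H:MM' duration string."""
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"

# Validate, Load, VEE, then Billing Query run one after another.
sequential = sum(to_minutes(t) for t in ["2:18", "3:15", "2:10", "4:21"])
# Validate, Load + Billing run concurrently, VEE.
combined = sum(to_minutes(t) for t in ["2:18", "4:41", "2:10"])

print(to_hhmm(sequential), to_hhmm(combined), to_hhmm(sequential - combined))
```

Running the billing query alongside the load shortens the daily cycle from 12:04 to 9:09, a saving of 2:55.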
  • 62. How Does This Benchmark Compare?
    Comparison of Published Benchmarks for Meter Data Management
                           Meters  Daily Reads  Total Cores  Total RAM  DB cores  App cores  DB RAM  App RAM
      Informix TimeSeries  100M    4.9B         16           500        16        (shared)   500     (shared)
      The Competition *    10M     970M         456          3668       48        <180       384     1.5TB
    Daily Readings (meters * registers * intervals)
      – Informix TimeSeries: 4,900,000,000
      – The Competition: 970,000,000
    Database Resources (CPU cores)
      – Informix TimeSeries, total cores: 16
      – The Competition, db cores: 48
      – The Competition, app server cores: 180
    5 times the performance with less than 1/5 of the resources, and significantly simpler management using a single-node system
    * Based on latest published Oracle benchmark http://www.oracle.com/us/industries/utilities/ultilities-exadata-exalogic-wp-1499854.pdf
  • 63. http://www.ibm.com/informix Cosmo@uk.ibm.com