Spark Meetup - 2017-05-16, Paris
Mathias @Herberts - CTO, Cityzen Data
Warp 10 - Simplifying analysis of
time series data on top of
`whoami`
Former Senior SRE on Big Table at Google
Former head of Big Data at Crédit Mutuel Arkéa
Pioneer in the use of Hadoop & HBase in production since 2009
Co-Founder and CTO of Cityzen Data, maker of Warp 10
@herberts
Time Series
Time Series are everywhere
IoT & time series data management and analysis
Versatile Data Model
Geo Time Series®
Digital Twin Paradigm
Multiple Versions
Embeddable for Edge Analytics
Standalone Version
HA
Datalog
in-memory
Distributed Version
Secure Solution
Security
Encryption and authentication/authorization mechanisms
sandboxed environment for analytics
Rich Analytics
Analytics
A stack based language dedicated to time series analytics
Advanced stack based language
■ Result is a JSON array of the various stack levels
■ Support for variables and context saving
■ Code serialization
■ Loops, conditionals, macros - Data Flow model
■ Secure code execution, resource limits
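To make the stack model concrete, here is a minimal Python 3 sketch of a postfix evaluator. It is not WarpScript and not how Warp 10 is implemented: the function names run and as_json are hypothetical, only a handful of words (+, *, DUP, SWAP, STORE, $name) are imitated, and the "top of the stack first" ordering of the JSON result is an assumption about the output format.

```python
import json

def run(tokens):
    """Evaluate postfix tokens on a stack and return the final stack (bottom first)."""
    stack = []
    symbols = {}  # variables created with STORE
    for tok in tokens:
        if tok == '+':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == '*':
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif tok == 'DUP':
            stack.append(stack[-1])
        elif tok == 'SWAP':
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif tok == 'STORE':
            # the name is on top of the stack, the value just below it
            name, value = stack.pop(), stack.pop()
            symbols[name] = value
        elif isinstance(tok, str) and tok.startswith('$'):
            stack.append(symbols[tok[1:]])
        else:
            # anything else is a literal pushed onto the stack
            stack.append(tok)
    return stack

def as_json(stack):
    """Serialize the stack levels as a JSON array, top of the stack first (assumed order)."""
    return json.dumps(stack[::-1])
```

For example, run([2, 'x', 'STORE', '$x', 3, '*']) stores 2 under the name x, reads it back and multiplies it by 3, leaving a single level on the stack.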
5 high level frameworks
■ BUCKETIZE - transform a series so it has regularly spaced ticks
■ MAP - apply a function on a sliding window
■ REDUCE - tick by tick computation on multiple series, producing a single one
■ FILTER - select series based on various criteria
■ APPLY - tick by tick application of an n-ary function
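The idea behind BUCKETIZE and REDUCE can be sketched in plain Python 3. This is only an illustration under stated assumptions, not Warp 10 code: a series is modelled as a list of (tick, value) pairs, buckets are assumed to cover the interval (end - span, end] and to be identified by their end tick, and the helper names bucketize_last and reduce_sum are hypothetical.

```python
def bucketize_last(points, lastbucket, bucketspan, bucketcount):
    """Keep the last value of each bucket; empty buckets are simply omitted.

    Assumption: bucket i ends at lastbucket - i * bucketspan and covers
    the interval (end - bucketspan, end].
    """
    first_start = lastbucket - bucketcount * bucketspan
    buckets = {}
    for tick, value in sorted(points):
        if tick <= first_start or tick > lastbucket:
            continue  # outside the bucketized range
        index = (lastbucket - tick) // bucketspan
        end = lastbucket - index * bucketspan
        buckets[end] = value  # later ticks overwrite earlier ones
    return sorted(buckets.items())

def reduce_sum(series_list):
    """Tick by tick sum over several series, producing a single series."""
    totals = {}
    for series in series_list:
        for tick, value in series:
            totals[tick] = totals.get(tick, 0) + value
    return sorted(totals.items())
```

Once two series have been bucketized with the same parameters their ticks are aligned, which is what makes the tick by tick reduction meaningful.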
! != % & && * ** + +! - ->B64 ->B64URL ->BIN ->BYTES ->DOUBLEBITS ->FLOATBITS ->GEOHASH ->HEX ->HHCODE ->HHCODELONG ->JSON ->LIST ->MAP ->MAT ->OPB64 ->PICKLE ->Q ->SET ->TSELEMENTS ->V ->VEC ->Z / < << <=
== > >= >> >>> ABS ACOS ADDDAYS ADDMONTHS ADDVALUE ADDYEARS AESUNWRAP AESWRAP AGO AND APPEND APPLY ASIN ASSERT ATAN ATBUCKET ATINDEX ATTICK ATTRIBUTES AUTHENTICATE B64-> B64TOHEX B64URL-> BBOX
BIN-> BINTOHEX BITCOUNT BITGET BITSTOBYTES BOOTSTRAP BREAK BUCKETCOUNT BUCKETIZE BUCKETSPAN BYTES-> BYTESTOBITS BYTESTOBITS CALL CBRT CEIL CHUNK CLEAR CLEARDEFS CLEARSYMBOLS CLEARTOMARK CLIP CLONE
CLONEEMPTY CLONEREVERSE COMMONTICKS COMPACT CONTAINS CONTAINSKEY CONTAINSVALUE CONTINUE COPYGEO COPYSIGN CORRELATE COS COSH COUNTER COUNTERDELTA COUNTERVALUE COUNTTOMARK CPROB CROP CSTORE
CUDF DEBUGOFF DEBUGON DEDUP DEF DEFINED DEFINEDMACRO DELETE DEPTH DET DIFFERENCE DISCORDS DOC DOCMODE DOUBLEBITS-> DOUBLEEXPONENTIALSMOOTHING DROP DROPN DTW DUMP DUP DUPN DURATION DWTSPLIT E
ELAPSED ELEVATIONS EMPTY ESDTEST EVAL EVALSECURE EVERY EXP EXPM1 EXPORT FAIL FDWT FETCH FETCHBOOLEAN FETCHDOUBLE FETCHLONG FETCHSTRING FFT FFTAP FILLNEXT FILLPREVIOUS FILLTICKS FILLVALUE FILTER FIND
FINDSETS FINDSTATS FIRSTTICK FLATTEN FLOATBITS-> FLOOR FOR FOREACH FORGET FORSTEP FROMBIN FROMBITS FROMHEX FUSE GEO.DIFFERENCE GEO.INTERSECTION GEO.INTERSECTS GEO.REGEXP GEO.UNION GEO.WITHIN GEO.WKT
GEOHASH-> GEOPACK GEOUNPACK GET GETHOOK GETSECTION GRUBBSTEST GZIP HASH HAVERSINE HEADER HEX-> HEXTOB64 HEXTOBIN HHCODE-> HUMANDURATION HYBRIDTEST HYBRIDTEST2 HYPOT IDENT IDWT IEEEREMAINDER IFFT
IFT IFTE IMMUTABLE INTEGRATE INTERPOLATE INTERSECTION INV ISNULL ISNaN ISO8601 ISODURATION ISONORMALIZE JOIN JSON-> JSONLOOSE JSONSTRICT KEYLIST LABELS LASTBUCKET LASTSORT LASTTICK LBOUNDS LFLATMAP LIMIT
LIST-> LMAP LOAD LOCATIONOFFSET LOCATIONS LOCSTRINGS LOG LOG10 LOG1P LORAENC LORAMIC LOWESS LR LSORT LTTB MACROBUCKETIZER MACROFILTER MACROMAPPER MACROREDUCER MAKEGTS MAP MAP-> MAPID MARK
MAT-> MATCH MATCHER MAX MAXBUCKETS MAXDEPTH MAXGTS MAXLONG MAXLOOP MAXOPS MAXPIXELS MAXSYMBOLS MD5 MERGE META METASET METASORT MIN MINLONG MODE MONOTONIC MSGFAIL MSORT MSTU MUSIGMA
NAME NBOUNDS NDEBUGON NEWGTS NEXTAFTER NEXTUP NONEMPTY NOOP NORMALIZE NOT NOTAFTER NOTBEFORE NOTIMINGS NOW NPDF NRETURN NSUMSUMSQ NULL NaN ONLYBUCKETS OPB64-> OPB64TOHEX OPS OPTDTW OR
PACK PAPPLY PARSE PARSESELECTOR PARTITION PATTERNDETECTION PATTERNS PFILTER PGraphics PI PICK PICKLE-> PIGSCHEMA PREDUCE PROB PROBABILITY PUT Palpha Parc Pbackground PbeginContour PbeginShape Pbezier
PbezierDetail PbezierPoint PbezierTangent PbezierVertex Pblend PblendMode Pblue Pbox Pbrightness Pclear Pclip Pcolor PcolorMode Pconstrain Pcopy PcreateFont Pcurve PcurveDetail PcurvePoint PcurveTangent PcurveTightness
PcurveVertex Pdecode Pdist Pellipse PellipseMode Pencode PendContour PendShape Pfill Pget Pgreen Phue Pimage PimageMode Plerp PlerpColor Pline Pmag Pmap PnoClip PnoFill PnoStroke PnoTint Pnorm Ppixels Ppoint PpopMatrix
PpopStyle PpushMatrix PpushStyle Pquad PquadraticVertex Prect PrectMode Pred PresetMatrix Protate ProtateX ProtateY ProtateZ Psaturation Pscale Pset PshapeMode PshearX PshearY Psphere PsphereDetail Pstroke PstrokeCap
PstrokeJoin PstrokeWeight Ptext PtextAlign PtextAscent PtextDescent PtextFont PtextLeading PtextMode PtextSize PtextWidth Ptint Ptranslate Ptriangle PupdatePixels Pvertex Q-> QCONJUGATE QDIVIDE QMULTIPLY QROTATE
QROTATION QUANTIZE RAND RANDPDF RANGE RANGECOMPACT REDEFS REDUCE RELABEL REMOVE RENAME REPLACE REPLACEALL RESET RESETS RESTORE RETURN REV REVBITS REVERSE REXEC REXECZ RINT RLOWESS ROLL ROLLD
ROT ROTATIONQ ROUND RSADECRYPT RSAENCRYPT RSAGEN RSAPRIVATE RSAPUBLIC RSASIGN RSAVERIFY RSORT RTFM RUN RUNNERNONCE RVALUESORT SAVE SECTION SECUREKEY SET SET-> SETATTRIBUTES SETVALUE SHA1
SHA1HMAC SHA256 SHA256HMAC SHRINK SIGNUM SIN SINGLEEXPONENTIALSMOOTHING SINH SIZE SNAPSHOT SNAPSHOTALL SNAPSHOTALLTOMARK SNAPSHOTCOPY SNAPSHOTCOPYALL SNAPSHOTCOPYALLTOMARK
SNAPSHOTCOPYTOMARK SNAPSHOTTOMARK SORT SORTBY SPLIT SQRT STACKATTRIBUTE STACKTOLIST STANDARDIZE STL STLESDTEST STOP STORE STRICTMAPPER STRICTPARTITION STRICTREDUCER STU SUBLIST SUBMAP SUBSTRING
SWAP SWITCH TAN TANH TEMPLATE TEMPLATE THRESHOLDTEST TICKINDEX TICKLIST TICKS TIMECLIP TIMEMODULO TIMESCALE TIMESHIFT TIMESPLIT TIMINGS TLTTB TOBIN TOBITS TOBOOLEAN TODEGREES TODOUBLE TOHEX
TOKENINFO TOLONG TOLOWER TORADIANS TOSELECTOR TOSTRING TOTIMESTAMP TOTIMESTAMP TOUPPER TR TRANSPOSE TRIM TSELEMENTS TSELEMENTS-> TYPEOF UDF ULP UNBUCKETIZE UNGZIP UNION UNIQUE UNLIST UNMAP
UNPACK UNSECURE UNTIL UNWRAP UNWRAPEMPTY UNWRAPSIZE UPDATE URLDECODE URLENCODE UUID V-> VALUEDEDUP VALUEHISTOGRAM VALUELIST VALUES VALUESORT VALUESPLIT VEC-> WEBCALL WHILE WRAP WRAPOPT
WRAPRAW WRAPRAWOPT Z-> ZDISCORDS ZIP ZPATTERNDETECTION ZPATTERNS ZSCORE ZSCORETEST [ [] ] ^ bucketizer.and bucketizer.count bucketizer.count.exclude-nulls bucketizer.count.include-nulls bucketizer.count.nonnull
bucketizer.first bucketizer.join bucketizer.join.forbid-nulls bucketizer.last bucketizer.mad bucketizer.max bucketizer.max.forbid-nulls bucketizer.mean bucketizer.mean.circular bucketizer.mean.circular.exclude-nulls
bucketizer.mean.exclude-nulls bucketizer.median bucketizer.min bucketizer.min.forbid-nulls bucketizer.or bucketizer.percentile bucketizer.sum bucketizer.sum.forbid-nulls d e filter.byattr filter.byclass filter.bylabels filter.bylabelsattr
filter.bymetadata filter.last.eq filter.last.ge filter.last.gt filter.last.le filter.last.lt filter.last.ne filter.latencies h m mapper.abs mapper.abscissa mapper.add mapper.and mapper.ceil mapper.count mapper.count.exclude-nulls
mapper.count.include-nulls mapper.count.nonnull mapper.day mapper.delta mapper.distinct mapper.dotproduct mapper.dotproduct.positive mapper.dotproduct.sigmoid mapper.dotproduct.tanh mapper.eq mapper.exp mapper.finite
mapper.first mapper.floor mapper.ge mapper.geo.approximate mapper.geo.clear mapper.geo.outside mapper.geo.within mapper.gt mapper.hdist mapper.highest mapper.hour mapper.hspeed mapper.join mapper.join.forbid-nulls
mapper.kernel.cosine mapper.kernel.epanechnikov mapper.kernel.gaussian mapper.kernel.logistic mapper.kernel.quartic mapper.kernel.silverman mapper.kernel.triangular mapper.kernel.tricube mapper.kernel.triweight
mapper.kernel.uniform mapper.last mapper.le mapper.log mapper.lowest mapper.lt mapper.mad mapper.max mapper.max.forbid-nulls mapper.max.x mapper.mean mapper.mean.circular mapper.mean.circular.exclude-nulls
mapper.mean.exclude-nulls mapper.median mapper.min mapper.min.forbid-nulls mapper.min.x mapper.minute mapper.mod mapper.month mapper.mul mapper.ne mapper.npdf mapper.or mapper.parsedouble mapper.percentile
mapper.pow mapper.product mapper.rate mapper.replace mapper.round mapper.sd mapper.sd.forbid-nulls mapper.second mapper.sigmoid mapper.sum mapper.sum.forbid-nulls mapper.tanh mapper.tick mapper.toboolean
mapper.todouble mapper.tolong mapper.tostring mapper.truecourse mapper.var mapper.var.forbid-nulls mapper.vdist mapper.vspeed mapper.weekday mapper.year max.tick.sliding.window max.time.sliding.window ms ns op.add
op.add.ignore-nulls op.and op.and.ignore-nulls op.div op.eq op.ge op.gt op.le op.lt op.mask op.mul op.mul.ignore-nulls op.ne op.negmask op.or op.or.ignore-nulls op.sub pi ps reducer.and reducer.and.exclude-nulls reducer.argmax
reducer.argmin reducer.count reducer.count.exclude-nulls reducer.count.include-nulls reducer.count.nonnull reducer.join reducer.join.forbid-nulls reducer.join.nonnull reducer.join.urlencoded reducer.mad reducer.max
reducer.max.forbid-nulls reducer.max.nonnull reducer.mean reducer.mean.circular reducer.mean.circular.exclude-nulls reducer.mean.exclude-nulls reducer.median reducer.min reducer.min.forbid-nulls reducer.min.nonnull reducer.or
reducer.or.exclude-nulls reducer.percentile reducer.product reducer.sd reducer.sd.forbid-nulls reducer.shannonentropy.0 reducer.shannonentropy.1 reducer.sum reducer.sum.forbid-nulls reducer.sum.nonnull reducer.var
reducer.var.forbid-nulls s us w { {} | || } ~ ~=
800 functions
Compact expressiveness
<%
'Display write requests count for each region' DOC
SAVE 'context' STORE
'cell' STORE
'PT60m' DURATION 'duration' STORE
'@TOKEN_READ@' 'TOKEN' STORE
NOW 'now' STORE
[ $TOKEN 'writeRequestCount' { 'cell' $cell 'Context'
'regionserver' } $now $duration ] FETCH
// Remove resets
false RESETS
// Align ticks
[ SWAP bucketizer.last $now 60 STU * 0 ] BUCKETIZE
// Sum by hname
[ SWAP [ 'hname' ] reducer.sum ] REDUCE
FILLNEXT FILLPREVIOUS
// Compute rates
[ SWAP mapper.rate 1 0 0 ] MAP
$context RESTORE
%>
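The two numeric steps of the macro above, removing counter resets and computing rates, can be sketched in Python 3. This is a rough illustration, not the behaviour of RESETS or mapper.rate: the reset convention used here (add the previous raw value whenever the counter drops) is one common choice and an assumption, ticks are assumed to be in microseconds, and the names remove_resets and rates are hypothetical.

```python
def remove_resets(points):
    """Make an increasing counter monotonic again after resets.

    Assumption: when a value is lower than the previous raw value, the
    counter restarted from zero, so the previous raw value is added to a
    running offset.
    """
    out = []
    offset = 0
    previous = None
    for tick, value in points:
        if previous is not None and value < previous:
            offset += previous
        previous = value
        out.append((tick, value + offset))
    return out

def rates(points, units_per_second=1_000_000):
    """Per-second rate of change between consecutive points."""
    out = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        out.append((t1, (v1 - v0) * units_per_second / (t1 - t0)))
    return out
```

With one point per minute, a counter going 10, 70, 5, 65 becomes 10, 70, 75, 135 once the reset is compensated, and the rate series no longer shows a large negative spike at the reset.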
Extensibility
WarpScript Server Side Macros
<%
<'
This macro does such and such…
@param xxx
@param yyy
'>
DOC
// Store the current context so we can create symbols freely
SAVE '_context' STORE
// Insert your code here
// Restore original context
$_context RESTORE
%> 'macro' STORE
// Unit tests
// Leave the macro on the stack
$macro
// Use via @path/to/macro in your scripts
WarpScript Extensions
import java.util.HashMap;
import java.util.Map;
import io.warp10.script.sdk.WarpScriptExtension;
import io.warp10.script.NamedWarpScriptFunction;
import io.warp10.script.WarpScriptException;
import io.warp10.script.WarpScriptStack;
import io.warp10.script.WarpScriptStackFunction;

public class MyExtension extends WarpScriptExtension {
  private static Map<String,Object> functions = new HashMap<String,Object>();

  private static class MyStackFunction extends NamedWarpScriptFunction
      implements WarpScriptStackFunction {
    // Pass the function name to NamedWarpScriptFunction
    public MyStackFunction(String name) {
      super(name);
    }

    @Override
    public Object apply(WarpScriptStack stack) throws WarpScriptException {
      ….
      return stack;
    }
  }

  // Expose the function under the name XXX
  static { functions.put("XXX", new MyStackFunction("XXX")); }

  @Override
  public Map<String, Object> getFunctions() {
    return functions;
  }
}
CALLing external programs
#!/usr/bin/env python -u
import cPickle, sys, urllib, base64
# Output the maximum number of instances of this 'callable' to spawn
print 10
# Loop, reading stdin, doing our stuff and outputting to stdout
while True:
    try:
        line = sys.stdin.readline()
        line = line.strip()
        line = urllib.unquote(line.decode('utf-8'))
        # Remove Base64 encoding
        data = base64.b64decode(line)
        args = cPickle.loads(data)
        # Do our stuff
        output = ….
        # Output result (URL encoded UTF-8).
        print urllib.quote(output.encode('utf-8'))
    except Exception as err:
        # A leading space marks the line as an error message
        print ' ' + urllib.quote(repr(err).encode('utf-8'))
...
->PICKLE 'UTF-8'
BYTES-> ->B64
'path/to/file' CALL
B64-> PICKLE->
....
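The encoding used on both sides of CALL (pickle, then Base64, then URL encoding, one message per line) can be checked with a small round trip. The sketch below is Python 3, whereas the script above targets Python 2; the helper names encode_message and decode_message are hypothetical, and pickle protocol 2 is an assumption chosen to stay readable from Python 2.

```python
import base64
import pickle
from urllib.parse import quote, unquote

def encode_message(obj):
    """Pickle, Base64 encode, then URL encode an object into one text line."""
    pickled = pickle.dumps(obj, protocol=2)  # protocol 2 is an assumption
    return quote(base64.b64encode(pickled).decode('ascii'))

def decode_message(line):
    """Reverse encode_message: URL decode, Base64 decode, then unpickle.

    Unpickling is only safe when the input comes from a trusted source.
    """
    return pickle.loads(base64.b64decode(unquote(line.strip())))
```

Because the Base64 alphabet contains + and =, the URL encoding step is what keeps each message a single whitespace-free token on its line.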
Visualization
Quantum IDE
QuantumViz Web Component
<!doctype html>
<html>
<head>
<meta name="viewport" content="width=device-width, minimum-scale=1.0, initial-scale=1.0, user-scalable=yes">
<script src="https://api0.cityzendata.net/widgets/quantumviz/webcomponentsjs/webcomponents.js"></script>
<link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/polymer/polymer.html">
<link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/warp10-quantumviz/warp10-quantumviz.html">
</head>
<body>
<warp10-quantumviz
width="500" height="400"
show-axis="true" tooltip="true" line-width="2" reload="0"
host="https://warp.cityzendata.net/" >
NEWGTS
1 720
<% DUP 'i' STORE 10000000 * NaN NaN NaN $i TORADIANS COS ADDVALUE %>
FOR
[ SWAP ] 'gts' STORE
[ { 'color' '#00d4ff' 'key' 'Sine' } ] 'params' STORE
{ 'interpolate' 'linear' } 'globalParams' STORE
{ 'gts' $gts 'params' $params 'globalParams' $globalParams }
</warp10-quantumviz>
</body>
</html>
Grafana Integration
Timelion Integration
Processing Integration
800 'width' STORE 800 'height' STORE
400.0 'maxspeed' STORE 40000.0 'maxalt' STORE
3.0 2.0 2.0 @orbit/heatmap/kernel/triangular 'kernel' STORE
@orbit/heatmap/palette/classic 'palette' STORE
'TOKEN' 'token' STORE
$width $height '2D' PGraphics
'MULTIPLY' PblendMode 'CENTER' PimageMode
[ $token '~(ALT|CAS)' {} NOW -2000000 ] FETCH
DUP 0 GET LASTTICK 'now' STORE
[ SWAP bucketizer.last $now STU 0 ] BUCKETIZE
// Create heatmap
<%
7 GET LIST-> DROP 'CAS' STORE 'ALT' STORE
<% $CAS ISNULL NOT $ALT ISNULL NOT && %>
<% $kernel $CAS $maxspeed / $width * $ALT $maxalt / 1.0
SWAP - $height * Pimage %>
IFT
0 NaN NaN NaN NULL
%> MACROREDUCER 'GRAPHER' STORE
[ SWAP [] $GRAPHER ] REDUCE DROP
// Colorize
Ppixels <% DROP Palpha $palette SWAP GET %> LMAP
PupdatePixels Pencode Pdecode
$width $height '2D' PGraphics
// Do the grid
PnoFill 0 0 $width 1 - $height 1 - Prect
2.0 PstrokeWeight 200.0 Pcolor Pstroke
250.0 $maxspeed / $width * DUP 0 SWAP $height Pline
0 10000 $maxalt / 1.0 SWAP - $height * DUP $width SWAP Pline
SWAP 0 0 Pimage Pencode
QuantumImg Web Component
<!doctype html>
<html>
<head>
<link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">
<link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css">
<script src="//cdnjs.cloudflare.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
<script src="//cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.min.js"></script>
<script src="https://api0.cityzendata.net/widgets/quantumviz/webcomponentsjs/webcomponents.js"></script>
<link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/polymer/polymer.html">
<link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/warp10-quantumviz/quantumviz-warpscript-image.html">
</head>
<body>
<warp10-img
width="300" height="300" reload="0"
host="https://warp.cityzendata.net/">
200 'width' CSTORE 200 'height' CSTORE
$width $height '2D' PGraphics
Ppixels
<% DROP DROP RAND 0xFFFFFFFF * TOLONG %> LMAP
PupdatePixels Pencode
</warp10-img>
</body>
</html>
Ok, what about Hadoop?
Dealing with time series data in Hadoop is difficult!
Most, if not all, approaches do it wrong!
Either too narrow in focus...
think econometric time series
...or providing too little value...
because moving average is simply a beginning
...or limited to a specific tool
think xxxRDD
Warp 10 brings the power of
to
Warp10InputFormat
■ Read data stored in Warp 10 at millions of datapoints per second
■ Standard Hadoop InputFormat
■ Compatible with any tool relying on such an InputFormat
■ Compact representation of time series, lower memory footprint
Integration with Spark
■ Enable the use of WarpScript code in the Spark DAG
■ Provide both WarpScriptFunction and WarpScriptFlatMapFunction
■ Manipulate RDD/DataSet/DataFrame elements on the WarpScript stack
■ Extend WarpScript to support custom types if needed
■ Load time series data from any source (Parquet, SQL, …)
DataFrame df = sqlc.read().parquet(...);
RDD<Row> rdd = df.rdd();
JavaRDD<Row> jrdd = rdd.toJavaRDD();
JavaRDD<Row> out = jrdd.mapPartitions(
    new WarpScriptFlatMapFunction<Iterator<Row>,Row>("@ext-macro.mc2"));
JavaPairRDD<Row, Iterable<Row>> grouped = out.groupBy(
    new WarpScriptFunction<Row, Row>("[ 0 1 ] SUBLIST ->SPARKROW"));
JavaRDD<Row> merged = grouped.map(
    new WarpScriptFunction<Tuple2<Row,Iterable<Row>>, Row>(
        "LIST-> DROP 0 GET [] SWAP <% SPARK-> 2 GET UNWRAP +! %> FOREACH MERGE WRAPRAW + 2 GET 1 ->LIST ->SPARKROW"));
List<StructField> fields = new ArrayList<StructField>();
fields.add(DataTypes.createStructField("wrapper", DataTypes.BinaryType, false));
StructType st = new StructType(fields.toArray(new StructField[0]));
DataFrame df2 = sqlc.createDataFrame(merged, st);
df2.write().parquet("/path/to/output/parquetfile");
Integration with Pig
■ Enable the use of WarpScript code in Pig scripts
■ Provide a WarpScriptRun UDF
■ Manipulate Pig types (tuples, bags, …) on the WarpScript stack
■ Represent time series in a very compact form to speed up processing
■ Load time series data from any source
REGISTER warp10-pig-0.0.10-rc2.jar;
SET warp.timeunits 'us';
DEFINE WarpScriptRun io.warp10.pig.WarpScriptRun();
GTS = LOAD '$input' USING PigStorage() AS (gts: chararray);
-- Retain only the 'frequency' GTS and chunk them by 5 minutes
FREQCHUNKS = FOREACH GTS GENERATE
FLATTEN( WarpScriptRun('DUP UNWRAPEMPTY NAME "frequency" == <% UNWRAP 0 5 m 0 0 "chunkid" false CHUNK WRAP %> <% [] %> IFTE ->V ', gts));
-- Flatten the bag
CHUNKS = FOREACH FREQCHUNKS GENERATE FLATTEN($0);
-- Generate station id, chunk id, gts
BYSTATIONCHUNK = FOREACH CHUNKS GENERATE FLATTEN( WarpScriptRun('DUP UNWRAP LABELS DUP "chunkid" GET SWAP "stationid" GET ', $0))
AS (stationid: chararray, chunkid: chararray, gts: chararray);
-- Group by station id, chunk id
STATIONCHUNKGROUP = GROUP BYSTATIONCHUNK BY (stationid, chunkid) PARALLEL 20;
-- Merge the GTS to reconstruct the chunk and emit station id, chunk id, gts
FULLCHUNKS = FOREACH STATIONCHUNKGROUP GENERATE
FLATTEN(
WarpScriptRun('V-> <% DROP 2 GET UNWRAP %> LMAP MERGE DUP LABELS SWAP WRAP SWAP DUP "chunkid" GET SWAP "stationid" GET ', BYSTATIONCHUNK))
AS (stationid: chararray, chunkid: chararray, gts: chararray);
STORE FULLCHUNKS INTO '$output' USING PigStorage('\t');
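The shape of this Pig job (chunk each series into 5 minute pieces, then regroup the pieces by station id and chunk id) can be sketched in Python 3. This only mirrors the data flow, not WarpScript's CHUNK: ticks are assumed to be in microseconds as set by warp.timeunits, the chunk id is assumed to be the start tick of the chunk, and chunk_and_group is a hypothetical helper.

```python
from collections import defaultdict

FIVE_MINUTES_US = 5 * 60 * 1_000_000  # 5 minutes in microseconds

def chunk_and_group(records, width=FIVE_MINUTES_US):
    """Group (station_id, tick, value) records by (station_id, chunk_id).

    Assumption: the chunk id is the first tick of the chunk the record
    falls into.
    """
    groups = defaultdict(list)
    for station_id, tick, value in records:
        chunk_id = tick - tick % width
        groups[(station_id, chunk_id)].append((tick, value))
    # Sort each chunk by tick, as a merged series would be
    return {key: sorted(points) for key, points in groups.items()}
```

Each resulting key plays the role of the (stationid, chunkid) pair emitted by the script, and each value corresponds to the merged chunk.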
Integration with Storm
{
'type' 'spout'
'id' 'spout-0'
'output' { 'stream-0' [ 'field-2' 'field-1' ] }
'parallelism' 1
'every' 500
'debug' true
'macro'
0 'counter' STORE
<%
$counter 1 + 'counter' STORE
'NOW' 'https://host:port/api/v0/exec' REXEC 'now'
STORE
{ 'stream-0' [ [ 'now' $now ] ] }
%>
}
{
'type' 'bolt'
'id' 'bolt-0'
'parallelism' 2
'debug' true
'input' { 'spout-0' { 'stream-0' 'shuffle' } }
'output' { 'stream-1' [ 'outfield' ] }
'macro' <%
SNAPSHOT [ SWAP ] 'value' STORE
$value 0 GET _storm.LOG
{ 'stream-1' [ $value ] }
%>
}
Integration with stream processing engines
And also...
■ Integration with Flink
■ Integration with Zeppelin via a WarpScript interpreter
■ Warp 10 sink to push data to Warp 10 once it has been processed
■ Coherent approach in ad-hoc, batch, and streaming modes
■ Reduce amount of code needed to be written, focus on business problems
Open Source Distribution
Thank you!
curl -O -L https://dl.bintray.com/cityzendata/generic/io/warp10/warp10/1.2.7/warp10-1.2.7.tar.gz
tar zxpf warp10-1.2.7.tar.gz
export JAVA_HOME=/path/to/java/home; cd warp10-1.2.7; ./bin/warp10-standalone.init start
3 steps to get you started with Warp 10
A set of resources to learn, ask and share
@warp10io
http://www.warp10.io/
http://groups.google.com/forum/#!forum/warp10-users
https://github.com/cityzendata
contact @ cityzendata . com

Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 

Warp 10 - Time Series Analysis on top of Hadoop - HUG France - Paris Spark Meetup 2017-05-16

  • 1. Spark Meetup - 2017-05-16, Paris. Mathias @Herberts - CTO, Cityzen Data. Warp 10 - Simplifying analysis of time series data on top of Hadoop
  • 2. `whoami`: Former Senior SRE on Big Table at Google; former head of Big Data at Crédit Mutuel Arkéa; pioneer in the use of Hadoop & HBase in production since 2009; Co-Founder and CTO of Cityzen Data, maker of Warp 10; @herberts
  • 5. Time Series are everywhere
  • 6. IoT & time series data management and analysis
  • 12. Embeddable for Edge Analytics
  • 17. Security: encryption and authentication/authorization mechanisms; sandboxed environment for analytics
  • 19. Analytics: a stack-based language dedicated to time series analytics
  • 20. Advanced stack-based language
      ■ Result is a JSON array of the various stack levels
      ■ Support for variables and context saving
      ■ Code serialization
      ■ Loops, conditionals, macros - Data Flow model
      ■ Secure code execution, resource limits
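To make the stack model on this slide concrete, here is a minimal Python sketch of a postfix evaluator. It is not WarpScript and not the Warp 10 engine: the function `run`, its tiny instruction set, and the choice to list the top of the stack first in the JSON output are all assumptions made for illustration.

```python
import json


def run(program):
    """Evaluate a whitespace-separated postfix program on a stack (toy model)."""
    stack = []
    symbols = {}  # variables, written with STORE and read with $name
    for token in program.split():
        if token == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif token == "DUP":
            stack.append(stack[-1])
        elif token == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif token == "STORE":
            name, value = stack.pop(), stack.pop()
            symbols[name] = value
        elif token.startswith("$"):
            stack.append(symbols[token[1:]])
        elif token.startswith("'") and token.endswith("'"):
            stack.append(token[1:-1])  # quoted string literal, no spaces
        else:
            stack.append(int(token))
    # Assumption: the result lists the stack levels with the top first
    return json.dumps(stack[::-1])


print(run("2 3 + 'x' STORE $x DUP *"))
```

The program stores 5 under `x`, pushes it back, duplicates it and multiplies, so the only remaining stack level holds 25.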
  • 21. 5 high level frameworks
      ■ BUCKETIZE - transform a series so it has regularly spaced ticks
      ■ MAP - apply a function on a sliding window
      ■ REDUCE - tick by tick computation on multiple series, producing a single one
      ■ FILTER - select series based on various criteria
      ■ APPLY - tick by tick application of an n-ary function
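The following Python sketch illustrates what three of these frameworks do to a series, modelled here as a list of `(tick, value)` pairs. The function names, the argument order, and the convention that a bucket covers the interval ending at (and including) its end tick are assumptions for illustration; they are not the WarpScript signatures.

```python
def bucketize_last(series, last_bucket, span, count):
    """BUCKETIZE-like: keep the last value in each of `count` buckets of width `span`.

    Assumption: the most recent bucket ends at `last_bucket` (inclusive).
    """
    out = {}
    lower = last_bucket - count * span  # exclusive lower bound of the oldest bucket
    for tick, value in sorted(series):
        if lower < tick <= last_bucket:
            end = last_bucket - ((last_bucket - tick) // span) * span
            out[end] = value  # later ticks overwrite earlier ones
    return sorted(out.items())


def map_mean(series, pre, post):
    """MAP-like: mean over a sliding window of `pre` points before and `post` after."""
    out = []
    for i, (tick, _) in enumerate(series):
        window = [v for _, v in series[max(0, i - pre): i + post + 1]]
        out.append((tick, sum(window) / len(window)))
    return out


def reduce_sum(series_list):
    """REDUCE-like: tick by tick sum of several series, producing a single one."""
    totals = {}
    for series in series_list:
        for tick, value in series:
            totals[tick] = totals.get(tick, 0) + value
    return sorted(totals.items())


raw = [(72, 1), (78, 2), (85, 3), (99, 4)]
regular = bucketize_last(raw, last_bucket=100, span=10, count=3)
print(regular)
print(map_mean(regular, pre=1, post=0))
```

Bucketizing first is what makes the tick by tick operations meaningful: once every series has ticks at 80, 90 and 100, a reduce across series can line values up without interpolation.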
  • 22. ! != % & && * ** + +! - ->B64 ->B64URL ->BIN ->BYTES ->DOUBLEBITS ->FLOATBITS ->GEOHASH ->HEX ->HHCODE ->HHCODELONG ->JSON ->LIST ->MAP ->MAT ->OPB64 ->PICKLE ->Q ->SET ->TSELEMENTS ->V ->VEC ->Z / < << <= == > >= >> >>> ABS ACOS ADDDAYS ADDMONTHS ADDVALUE ADDYEARS AESUNWRAP AESWRAP AGO AND APPEND APPLY ASIN ASSERT ATAN ATBUCKET ATINDEX ATTICK ATTRIBUTES AUTHENTICATE B64-> B64TOHEX B64URL-> BBOX BIN-> BINTOHEX BITCOUNT BITGET BITSTOBYTES BOOTSTRAP BREAK BUCKETCOUNT BUCKETIZE BUCKETSPAN BYTES-> BYTESTOBITS BYTESTOBITS CALL CBRT CEIL CHUNK CLEAR CLEARDEFS CLEARSYMBOLS CLEARTOMARK CLIP CLONE CLONEEMPTY CLONEREVERSE COMMONTICKS COMPACT CONTAINS CONTAINSKEY CONTAINSVALUE CONTINUE COPYGEO COPYSIGN CORRELATE COS COSH COUNTER COUNTERDELTA COUNTERVALUE COUNTTOMARK CPROB CROP CSTORE CUDF DEBUGOFF DEBUGON DEDUP DEF DEFINED DEFINEDMACRO DELETE DEPTH DET DIFFERENCE DISCORDS DOC DOCMODE DOUBLEBITS-> DOUBLEEXPONENTIALSMOOTHING DROP DROPN DTW DUMP DUP DUPN DURATION DWTSPLIT E ELAPSED ELEVATIONS EMPTY ESDTEST EVAL EVALSECURE EVERY EXP EXPM1 EXPORT FAIL FDWT FETCH FETCHBOOLEAN FETCHDOUBLE FETCHLONG FETCHSTRING FFT FFTAP FILLNEXT FILLPREVIOUS FILLTICKS FILLVALUE FILTER FIND FINDSETS FINDSTATS FIRSTTICK FLATTEN FLOATBITS-> FLOOR FOR FOREACH FORGET FORSTEP FROMBIN FROMBITS FROMHEX FUSE GEO.DIFFERENCE GEO.INTERSECTION GEO.INTERSECTS GEO.REGEXP GEO.UNION GEO.WITHIN GEO.WKT GEOHASH-> GEOPACK GEOUNPACK GET GETHOOK GETSECTION GRUBBSTEST GZIP HASH HAVERSINE HEADER HEX-> HEXTOB64 HEXTOBIN HHCODE-> HUMANDURATION HYBRIDTEST HYBRIDTEST2 HYPOT IDENT IDWT IEEEREMAINDER IFFT IFT IFTE IMMUTABLE INTEGRATE INTERPOLATE INTERSECTION INV ISNULL ISNaN ISO8601 ISODURATION ISONORMALIZE JOIN JSON-> JSONLOOSE JSONSTRICT KEYLIST LABELS LASTBUCKET LASTSORT LASTTICK LBOUNDS LFLATMAP LIMIT LIST-> LMAP LOAD LOCATIONOFFSET LOCATIONS LOCSTRINGS LOG LOG10 LOG1P LORAENC LORAMIC LOWESS LR LSORT LTTB MACROBUCKETIZER MACROFILTER MACROMAPPER MACROREDUCER MAKEGTS MAP MAP-> MAPID MARK MAT-> MATCH MATCHER MAX 
MAXBUCKETS MAXDEPTH MAXGTS MAXLONG MAXLOOP MAXOPS MAXPIXELS MAXSYMBOLS MD5 MERGE META METASET METASORT MIN MINLONG MODE MONOTONIC MSGFAIL MSORT MSTU MUSIGMA NAME NBOUNDS NDEBUGON NEWGTS NEXTAFTER NEXTUP NONEMPTY NOOP NORMALIZE NOT NOTAFTER NOTBEFORE NOTIMINGS NOW NPDF NRETURN NSUMSUMSQ NULL NaN ONLYBUCKETS OPB64-> OPB64TOHEX OPS OPTDTW OR PACK PAPPLY PARSE PARSESELECTOR PARTITION PATTERNDETECTION PATTERNS PFILTER PGraphics PI PICK PICKLE-> PIGSCHEMA PREDUCE PROB PROBABILITY PUT Palpha Parc Pbackground PbeginContour PbeginShape Pbezier PbezierDetail PbezierPoint PbezierTangent PbezierVertex Pblend PblendMode Pblue Pbox Pbrightness Pclear Pclip Pcolor PcolorMode Pconstrain Pcopy PcreateFont Pcurve PcurveDetail PcurvePoint PcurveTangent PcurveTightness PcurveVertex Pdecode Pdist Pellipse PellipseMode Pencode PendContour PendShape Pfill Pget Pgreen Phue Pimage PimageMode Plerp PlerpColor Pline Pmag Pmap PnoClip PnoFill PnoStroke PnoTint Pnorm Ppixels Ppoint PpopMatrix PpopStyle PpushMatrix PpushStyle Pquad PquadraticVertex Prect PrectMode Pred PresetMatrix Protate ProtateX ProtateY ProtateZ Psaturation Pscale Pset PshapeMode PshearX PshearY Psphere PsphereDetail Pstroke PstrokeCap PstrokeJoin PstrokeWeight Ptext PtextAlign PtextAscent PtextDescent PtextFont PtextLeading PtextMode PtextSize PtextWidth Ptint Ptranslate Ptriangle PupdatePixels Pvertex Q-> QCONJUGATE QDIVIDE QMULTIPLY QROTATE QROTATION QUANTIZE RAND RANDPDF RANGE RANGECOMPACT REDEFS REDUCE RELABEL REMOVE RENAME REPLACE REPLACEALL RESET RESETS RESTORE RETURN REV REVBITS REVERSE REXEC REXECZ RINT RLOWESS ROLL ROLLD ROT ROTATIONQ ROUND RSADECRYPT RSAENCRYPT RSAGEN RSAPRIVATE RSAPUBLIC RSASIGN RSAVERIFY RSORT RTFM RUN RUNNERNONCE RVALUESORT SAVE SECTION SECUREKEY SET SET-> SETATTRIBUTES SETVALUE SHA1 SHA1HMAC SHA256 SHA256HMAC SHRINK SIGNUM SIN SINGLEEXPONENTIALSMOOTHING SINH SIZE SNAPSHOT SNAPSHOTALL SNAPSHOTALLTOMARK SNAPSHOTCOPY SNAPSHOTCOPYALL SNAPSHOTCOPYALLTOMARK SNAPSHOTCOPYTOMARK SNAPSHOTTOMARK SORT 
SORTBY SPLIT SQRT STACKATTRIBUTE STACKTOLIST STANDARDIZE STL STLESDTEST STOP STORE STRICTMAPPER STRICTPARTITION STRICTREDUCER STU SUBLIST SUBMAP SUBSTRING SWAP SWITCH TAN TANH TEMPLATE TEMPLATE THRESHOLDTEST TICKINDEX TICKLIST TICKS TIMECLIP TIMEMODULO TIMESCALE TIMESHIFT TIMESPLIT TIMINGS TLTTB TOBIN TOBITS TOBOOLEAN TODEGREES TODOUBLE TOHEX TOKENINFO TOLONG TOLOWER TORADIANS TOSELECTOR TOSTRING TOTIMESTAMP TOTIMESTAMP TOUPPER TR TRANSPOSE TRIM TSELEMENTS TSELEMENTS-> TYPEOF UDF ULP UNBUCKETIZE UNGZIP UNION UNIQUE UNLIST UNMAP UNPACK UNSECURE UNTIL UNWRAP UNWRAPEMPTY UNWRAPSIZE UPDATE URLDECODE URLENCODE UUID V-> VALUEDEDUP VALUEHISTOGRAM VALUELIST VALUES VALUESORT VALUESPLIT VEC-> WEBCALL WHILE WRAP WRAPOPT WRAPRAW WRAPRAWOPT Z-> ZDISCORDS ZIP ZPATTERNDETECTION ZPATTERNS ZSCORE ZSCORETEST [ [] ] ^ bucketizer.and bucketizer.count bucketizer.count.exclude-nulls bucketizer.count.include-nulls bucketizer.count.nonnull bucketizer.first bucketizer.join bucketizer.join.forbid-nulls bucketizer.last bucketizer.mad bucketizer.max bucketizer.max.forbid-nulls bucketizer.mean bucketizer.mean.circular bucketizer.mean.circular.exclude-nulls bucketizer.mean.exclude-nulls bucketizer.median bucketizer.min bucketizer.min.forbid-nulls bucketizer.or bucketizer.percentile bucketizer.sum bucketizer.sum.forbid-nulls d e filter.byattr filter.byclass filter.bylabels filter.bylabelsattr filter.bymetadata filter.last.eq filter.last.ge filter.last.gt filter.last.le filter.last.lt filter.last.ne filter.latencies h m mapper.abs mapper.abscissa mapper.add mapper.and mapper.ceil mapper.count mapper.count.exclude-nulls mapper.count.include-nulls mapper.count.nonnull mapper.day mapper.delta mapper.distinct mapper.dotproduct mapper.dotproduct.positive mapper.dotproduct.sigmoid mapper.dotproduct.tanh mapper.eq mapper.exp mapper.finite mapper.first mapper.floor mapper.ge mapper.geo.approximate mapper.geo.clear mapper.geo.outside mapper.geo.within mapper.gt mapper.hdist mapper.highest mapper.hour 
mapper.hspeed mapper.join mapper.join.forbid-nulls mapper.kernel.cosine mapper.kernel.epanechnikov mapper.kernel.gaussian mapper.kernel.logistic mapper.kernel.quartic mapper.kernel.silverman mapper.kernel.triangular mapper.kernel.tricube mapper.kernel.triweight mapper.kernel.uniform mapper.last mapper.le mapper.log mapper.lowest mapper.lt mapper.mad mapper.max mapper.max.forbid-nulls mapper.max.x mapper.mean mapper.mean.circular mapper.mean.circular.exclude-nulls mapper.mean.exclude-nulls mapper.median mapper.min mapper.min.forbid-nulls mapper.min.x mapper.minute mapper.mod mapper.month mapper.mul mapper.ne mapper.npdf mapper.or mapper.parsedouble mapper.percentile mapper.pow mapper.product mapper.rate mapper.replace mapper.round mapper.sd mapper.sd.forbid-nulls mapper.second mapper.sigmoid mapper.sum mapper.sum.forbid-nulls mapper.tanh mapper.tick mapper.toboolean mapper.todouble mapper.tolong mapper.tostring mapper.truecourse mapper.var mapper.var.forbid-nulls mapper.vdist mapper.vspeed mapper.weekday mapper.year max.tick.sliding.window max.time.sliding.window ms ns op.add op.add.ignore-nulls op.and op.and.ignore-nulls op.div op.eq op.ge op.gt op.le op.lt op.mask op.mul op.mul.ignore-nulls op.ne op.negmask op.or op.or.ignore-nulls op.sub pi ps reducer.and reducer.and.exclude-nulls reducer.argmax reducer.argmin reducer.count reducer.count.exclude-nulls reducer.count.include-nulls reducer.count.nonnull reducer.join reducer.join.forbid-nulls reducer.join.nonnull reducer.join.urlencoded reducer.mad reducer.max reducer.max.forbid-nulls reducer.max.nonnull reducer.mean reducer.mean.circular reducer.mean.circular.exclude-nulls reducer.mean.exclude-nulls reducer.median reducer.min reducer.min.forbid-nulls reducer.min.nonnull reducer.or reducer.or.exclude-nulls reducer.percentile reducer.product reducer.sd reducer.sd.forbid-nulls reducer.shannonentropy.0 reducer.shannonentropy.1 reducer.sum reducer.sum.forbid-nulls reducer.sum.nonnull reducer.var reducer.var.forbid-nulls 
s us w { {} | || } ~ ~= 800 functions
  • 23. Compact expressiveness
      <%
        'Display write requests count for each region' DOC
        SAVE 'context' STORE
        'cell' STORE
        'PT60m' DURATION 'duration' STORE
        '@TOKEN_READ@' 'TOKEN' STORE
        NOW 'now' STORE
        [ $TOKEN 'writeRequestCount' { 'cell' $cell 'Context' 'regionserver' } $now $duration ] FETCH
        // Remove resets
        false RESETS
        // Align ticks
        [ SWAP bucketizer.last $now 60 STU * 0 ] BUCKETIZE
        // Sum by hname
        [ SWAP [ 'hname' ] reducer.sum ] REDUCE
        FILLNEXT FILLPREVIOUS
        // Compute rates
        [ SWAP mapper.rate 1 0 0 ] MAP
        $context RESTORE
      %>
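Two steps of this macro, removing counter resets and computing rates, can be sketched in plain Python as follows. This is an illustration of the idea only: the helper names are hypothetical, the counter is assumed to be increasing, the rate is expressed per time unit of the ticks, and the first point is dropped because it has no predecessor, which may differ from what `RESETS` and `mapper.rate` do in edge cases.

```python
def remove_resets(series):
    """Compensate for restarts of an increasing counter.

    Whenever the value drops, the last value seen before the drop is
    added to an offset that is applied to all later points.
    """
    out, offset, previous = [], 0, None
    for tick, value in series:
        if previous is not None and value < previous:
            offset += previous  # counter restarted
        out.append((tick, value + offset))
        previous = value
    return out


def rate(series):
    """Rate of change between each point and the one before it."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(series, series[1:])
    ]


counter = [(0, 10), (60, 40), (120, 5), (180, 35)]  # restart at tick 120
monotonic = remove_resets(counter)
print(monotonic)
print(rate(monotonic))
```

Without the reset compensation, the drop from 40 to 5 would show up as a large negative rate at tick 120 instead of a small positive one.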
  • 25. WarpScript Server Side Macros
      <%
        <'
        This macro does such and such…
        @param xxx
        @param yyy
        '>
        DOC
        // Store the current context so we can create symbols freely
        SAVE '_context' STORE
        // Insert your code here
        // Restore original context
        $_context RESTORE
      %>
      'macro' STORE
      // Unit tests
      // Leave the macro on the stack
      $macro
      // Use via @path/to/macro in your scripts
  • 26. WarpScript Extensions
      import java.util.HashMap;
      import java.util.Map;

      import io.warp10.script.sdk.WarpScriptExtension;
      import io.warp10.script.NamedWarpScriptFunction;
      import io.warp10.script.WarpScriptException;
      import io.warp10.script.WarpScriptStack;
      import io.warp10.script.WarpScriptStackFunction;

      public class MyExtension extends WarpScriptExtension {
        // Function name -> implementation, exposed through getFunctions()
        private static Map<String,Object> functions = new HashMap<String,Object>();

        private static class MyStackFunction extends NamedWarpScriptFunction implements WarpScriptStackFunction {
          // Constructor needed by the new MyStackFunction("XXX") call below
          public MyStackFunction(String name) {
            super(name);
          }

          @Override
          public Object apply(WarpScriptStack stack) throws WarpScriptException {
            ….
            return stack;
          }
        }

        static {
          functions.put("XXX", new MyStackFunction("XXX"));
        }

        @Override
        public Map<String, Object> getFunctions() {
          return functions;
        }
      }
  • 27. CALLing external programs
      #!/usr/bin/env python -u
      # Python 2 script (cPickle, print statement)
      import cPickle, sys, urllib, base64
      # Output the maximum number of instances of this 'callable' to spawn
      print 10
      # Loop, reading stdin, doing our stuff and outputting to stdout
      while True:
        try:
          line = sys.stdin.readline()
          line = line.strip()
          line = urllib.unquote(line.decode('utf-8'))
          # Remove Base64 encoding
          payload = base64.b64decode(line)
          args = cPickle.loads(payload)
          # Do our stuff
          output = ….
          # Output result (URL encoded UTF-8).
          print urllib.quote(output.encode('utf-8'))
        except Exception as err:
          # Errors are reported on a line starting with a space
          print ' ' + urllib.quote(repr(err).encode('utf-8'))

      ... ->PICKLE 'UTF-8' BYTES-> ->B64 'path/to/file' CALL B64-> PICKLE-> ....
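The encoding chain used by the callable above (URL encoding around Base64 around a pickled object) can be exercised on its own. The sketch below is a Python 3 rendition of just that chain; the helper names are hypothetical, and `encode_request` is only a stand-in for what the WarpScript side sends, written here so the round trip can be checked.

```python
import base64
import pickle
from urllib.parse import quote, unquote


def decode_request(line):
    """Undo the request encoding: URL-decode, Base64-decode, then unpickle.

    Only unpickle input coming from a trusted Warp 10 instance.
    """
    payload = base64.b64decode(unquote(line.strip()))
    return pickle.loads(payload)


def encode_response(text):
    """Encode a result the way the slide does: URL-encoded UTF-8."""
    return quote(text.encode("utf-8"))


def encode_request(obj):
    """Stand-in for the caller side, used here for testing only."""
    b64 = base64.b64encode(pickle.dumps(obj, protocol=2)).decode("ascii")
    return quote(b64)


line = encode_request([1, "a", 2.5]) + "\n"
print(decode_request(line))
print(encode_response("é ok"))
```

URL encoding the reply keeps it on a single line, which matters because the exchange is line oriented: one line read from stdin, one line written to stdout.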
  • 31. QuantumViz Web Component
      <!doctype html>
      <html>
      <head>
        <meta name="viewport" content="width=device-width, minimum-scale=1.0, initial-scale=1.0, user-scalable=yes">
        <script src="https://api0.cityzendata.net/widgets/quantumviz/webcomponentsjs/webcomponents.js"></script>
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/polymer/polymer.html">
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/warp10-quantumviz/warp10-quantumviz.html">
      </head>
      <body>
        <warp10-quantumviz width="500" height="400" show-axis="true" tooltip="true" line-width="2" reload="0" host="https://warp.cityzendata.net/">
          NEWGTS 1 720 <% DUP 'i' STORE 10000000 * NaN NaN NaN $i TORADIANS COS ADDVALUE %> FOR
          [ SWAP ] 'gts' STORE
          [ { 'color' '#00d4ff' 'key' 'Sine' } ] 'params' STORE
          { 'interpolate' 'linear' } 'globalParams' STORE
          { 'gts' $gts 'params' $params 'globalParams' $globalParams }
        </warp10-quantumviz>
      </body>
      </html>
  • 34. Processing Integration
      800 'width' STORE
      800 'height' STORE
      400.0 'maxspeed' STORE
      40000.0 'maxalt' STORE
      3.0 2.0 2.0 @orbit/heatmap/kernel/triangular 'kernel' STORE
      @orbit/heatmap/palette/classic 'palette' STORE
      'TOKEN' 'token' STORE
      $width $height '2D' PGraphics
      'MULTIPLY' PblendMode
      'CENTER' PimageMode
      [ $token '~(ALT|CAS)' {} NOW -2000000 ] FETCH
      DUP 0 GET LASTTICK 'now' STORE
      [ SWAP bucketizer.last $now STU 0 ] BUCKETIZE
      // Create heatmap
      <%
        7 GET LIST-> DROP 'CAS' STORE 'ALT' STORE
        <% $CAS ISNULL NOT $ALT ISNULL NOT && %>
        <% $kernel $CAS $maxspeed / $width * $ALT $maxalt / 1.0 SWAP - $height * Pimage %>
        IFT
        0 NaN NaN NaN NULL
      %> MACROREDUCER 'GRAPHER' STORE
      [ SWAP [] $GRAPHER ] REDUCE DROP
      // Colorize
      Ppixels <% DROP Palpha $palette SWAP GET %> LMAP PupdatePixels
      Pencode Pdecode
      $width $height '2D' PGraphics
      // Do the grid
      PnoFill 0 0 $width 1 - $height 1 - Prect
      2.0 PstrokeWeight
      200.0 Pcolor Pstroke
      250.0 $maxspeed / $width * DUP 0 SWAP $height Pline
      0 10000 $maxalt / 1.0 SWAP - $height * DUP $width SWAP Pline
      SWAP 0 0 Pimage
      Pencode
  • 35. QuantumImg Web Component
      <!doctype html>
      <html>
      <head>
        <link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">
        <link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css">
        <script src="//cdnjs.cloudflare.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
        <script src="//cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.min.js"></script>
        <script src="https://api0.cityzendata.net/widgets/quantumviz/webcomponentsjs/webcomponents.js"></script>
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/polymer/polymer.html">
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/warp10-quantumviz/quantumviz-warpscript-image.html">
      </head>
      <body>
        <warp10-img width="300" height="300" reload="0" host="https://warp.cityzendata.net/">
          200 'width' CSTORE
          200 'height' CSTORE
          $width $height '2D' PGraphics
          Ppixels <% DROP DROP RAND 0xFFFFFFFF * TOLONG %> LMAP PupdatePixels
          Pencode
        </warp10-img>
      </body>
      </html>
  • 36. Ok, what about Hadoop?
  • 37. Dealing with time series data in Hadoop is difficult!
  • 38. Most, if not all, approaches do it wrong!
  • 39. Either too narrow in focus... think econometric time series
  • 40. ...or providing too little value... because a moving average is only a beginning
  • 41. ...or limited to a specific tool... think xxxRDD
  • 42. Warp 10 brings the power of to
  • 43. Warp10InputFormat
      ■ Read data stored in Warp 10 at millions of datapoints per second
      ■ Standard Hadoop InputFormat
      ■ Compatible with any tool relying on such an InputFormat
      ■ Compact representation of time series, lower memory footprint
  • 44. Integration with Spark
      ■ Enable the use of WarpScript code in the Spark DAG
      ■ Provide both WarpScriptFunction and WarpScriptFlatMapFunction
      ■ Manipulate RDD/DataSet/DataFrame elements on the WarpScript stack
      ■ Extend WarpScript to support custom types if needed
      ■ Load time series data from any source (Parquet, SQL, …)
  • 45. Integration with Spark
      DataFrame df = sqlc.read().parquet(...);
      RDD<Row> rdd = df.rdd();
      JavaRDD<Row> jrdd = rdd.toJavaRDD();
      JavaRDD<Row> out = jrdd.mapPartitions(
        new WarpScriptFlatMapFunction<Iterator<Row>,Row>("@ext-macro.mc2"));
      JavaPairRDD<Row, Iterable<Row>> grouped = out.groupBy(
        new WarpScriptFunction<Row, Row>("[ 0 1 ] SUBLIST ->SPARKROW"));
      JavaRDD<Row> merged = grouped.map(
        new WarpScriptFunction<Tuple2<Row,Iterable<Row>>, Row>(
          "LIST-> DROP 0 GET [] SWAP <% SPARK-> 2 GET UNWRAP +! %> FOREACH MERGE WRAPRAW + 2 GET 1 ->LIST ->SPARKROW"));
      List<StructField> fields = new ArrayList<StructField>();
      fields.add(DataTypes.createStructField("wrapper", DataTypes.BinaryType, false));
      StructType st = new StructType(fields.toArray(new StructField[0]));
      DataFrame df2 = sqlc.createDataFrame(merged, st);
      df2.write().parquet("/path/to/output/parquetfile");
  • 46. Integration with Pig
      ■ Enable the use of WarpScript code in Pig scripts
      ■ Provide a WarpScriptRun UDF
      ■ Manipulate Pig types (tuples, bags, …) on the WarpScript stack
      ■ Represent time series in a very compact form to speed up processing
      ■ Load time series data from any source
  • 47. Integration with Pig
      REGISTER warp10-pig-0.0.10-rc2.jar;
      SET warp.timeunits 'us';
      DEFINE WarpScriptRun io.warp10.pig.WarpScriptRun();
      GTS = LOAD '$input' USING PigStorage() AS (gts: chararray);
      -- Retain only the 'frequency' GTS and chunk them by 5 minutes
      FREQCHUNKS = FOREACH GTS GENERATE FLATTEN(
        WarpScriptRun('DUP UNWRAPEMPTY NAME "frequency" == <% UNWRAP 0 5 m 0 0 "chunkid" false CHUNK WRAP %> <% [] %> IFTE ->V ', gts));
      -- Flatten the bag
      CHUNKS = FOREACH FREQCHUNKS GENERATE FLATTEN($0);
      -- Generate station id, chunk id, gts
      BYSTATIONCHUNK = FOREACH CHUNKS GENERATE FLATTEN(
        WarpScriptRun('DUP UNWRAP LABELS DUP "chunkid" GET SWAP "stationid" GET ', $0))
        AS (stationid: chararray, chunkid: chararray, gts: chararray);
      -- Group by station id, chunk id
      STATIONCHUNKGROUP = GROUP BYSTATIONCHUNK BY (stationid, chunkid) PARALLEL 20;
      -- Merge the GTS to reconstruct the chunk and emit station id, chunk id, gts
      FULLCHUNKS = FOREACH STATIONCHUNKGROUP GENERATE FLATTEN(
        WarpScriptRun('V-> <% DROP 2 GET UNWRAP %> LMAP MERGE DUP LABELS SWAP WRAP SWAP DUP "chunkid" GET SWAP "stationid" GET ', BYSTATIONCHUNK))
        AS (stationid: chararray, chunkid: chararray, gts: chararray);
      STORE FULLCHUNKS INTO '$output' USING PigStorage('\t');
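The Pig script above follows a chunk, group, merge pattern: split each series into 5 minute chunks, key the pieces by station and chunk, then reassemble one series per key. A minimal Python sketch of that pattern is shown below. The function name and the data layout are hypothetical, and labelling each chunk by the tick at which it starts is an assumption made for simplicity; the `CHUNK` function may identify chunks differently.

```python
from collections import defaultdict

MICROS_PER_MINUTE = 60 * 1_000_000  # the script sets warp.timeunits to 'us'


def chunk_by_station(points, width):
    """Group (station, tick, value) points by (station, chunk id).

    Assumption: the chunk id is the first tick of the window containing the point.
    """
    chunks = defaultdict(list)
    for station, tick, value in points:
        chunk_id = tick - tick % width
        chunks[(station, chunk_id)].append((tick, value))
    # Merging step: one time-ordered series per (station, chunk id)
    return {key: sorted(values) for key, values in chunks.items()}


points = [
    ("A", 0, 50.0),
    ("A", 299_999_999, 50.1),
    ("A", 300_000_000, 49.9),
    ("B", 10, 50.0),
]
for key, series in sorted(chunk_by_station(points, 5 * MICROS_PER_MINUTE).items()):
    print(key, series)
```

Keying by (station, chunk id) is what lets the grouping step run in parallel, since chunks of the same station can be reassembled independently of one another.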
  • 48. Integration with stream processing engines
      {
        'type' 'spout'
        'id' 'spout-0'
        'output' { 'stream-0' [ 'field-2' 'field-1' ] }
        'parallelism' 1
        'every' 500
        'debug' true
        'macro'
        0 'counter' STORE
        <%
          $counter 1 + 'counter' STORE
          'NOW' 'https://host:port/api/v0/exec' REXEC 'now' STORE
          { 'stream-0' [ [ 'now' $now ] ] }
        %>
      }
      {
        'type' 'bolt'
        'id' 'bolt-0'
        'parallelism' 2
        'debug' true
        'input' { 'spout-0' { 'stream-0' 'shuffle' } }
        'output' { 'stream-1' [ 'outfield' ] }
        'macro'
        <%
          SNAPSHOT [ SWAP ] 'value' STORE
          $value 0 GET _storm.LOG
          { 'stream-1' [ $value ] }
        %>
      }
  • 49. And also...
      ■ Integration with Flink
      ■ Integration with Zeppelin via a WarpScript interpreter
      ■ Warp 10 sink to push data to Warp 10 once it has been processed
      ■ Coherent approach in ad-hoc, batch, and streaming modes
      ■ Reduce the amount of code that needs to be written, focus on business problems
  • 51. Thank you!
      3 steps to get you started with Warp 10:
        curl -O -L https://dl.bintray.com/cityzendata/generic/io/warp10/warp10/1.2.7/warp10-1.2.7.tar.gz
        tar zxpf warp10-1.2.7.tar.gz
        export JAVA_HOME=/path/to/java/home; cd warp10-1.2.7; ./bin/warp10-standalone.init start
      A set of resources to learn, ask and share:
        @warp10io
        http://www.warp10.io/
        http://groups.google.com/forum/#!forum/warp10-users
        https://github.com/cityzendata