DML	Syntax	&	Invocation
Nakul	Jindal
Spark	Technology	Center,	San	Francisco
Goal	of	These	Slides
• Provide	you	with	basic	DML	syntax
• Link	to	important	resources
• Invocation	
Non-Goals
• Comprehensive	syntax	and	API	coverage
Resources
• Google	“Apache	Systemml”
• Documentation	- https://apache.github.io/incubator-systemml/
• DML Language Reference - https://apache.github.io/incubator-systemml/dml-language-reference.html
• MLContext - https://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html#spark-shell-scala-example
• Github - https://github.com/apache/incubator-systemml
Note
• Some	documentation	 is	outdated
• If	you	find	a	typo	or	want	to	update	the	document,	consider	making	a	Pull	Request
• All	docs	are	in	Markdown	format
• https://github.com/apache/incubator-systemml/tree/master/docs
About	DML	Briefly	
• DML	=	Declarative	Machine	Learning
• R-like	syntax,	some	subtle	differences	from	R
• Dynamically	typed
• Data	Structures
• Scalars	– Boolean,	Integers,	Strings,	Double	Precision
• Cacheable	– Matrices,	DataFrames
• Data	Structure	Terminology	in	DML
• Value	Type	- Boolean,	Integers,	Strings,	Double	Precision
• Data	Type	– Scalar,	Matrices,	DataFrames*
• You	can	have	a	DataType[ValueType],	not	all	combinations	are	supported
• For	instance	– matrix[double]
• Scoping
• One	global	scope,	except	inside	functions
*	Coming	soon
About	DML	Briefly	
• Control	Flow
• Sequential	imperative	control	flow	(like	most	other	languages)
• Looping	–
• while (<condition>)	{	…	}
• for (var in <for_predicate>)	{	…	}
• parfor (var in <for_predicate>)	{	…	} //	Iterations	in	parallel
• Guards	–
• if (<condition>)	{	...	}	[ else if (<condition>)	{	...	}	...	else {	…	}	]
• Functions
• Built-in	– List	available	in	language	reference
• User	Defined	– (multiple	return	parameters)
• functionName =	function (<formal_parameters>…)	return (<formal_parameters>)	{	...	}
• Can	only	access	variables	defined	in	the	formal_parameters in	the	body	of	the	function	
• External	Function	– same	as	user	defined,	can	call	external	Java	Package
About	DML	Briefly
• Imports
• Can	import	user	defined/external	functions from	other	source	files
• Disambiguation	using	namespaces
• Command	Line	Arguments
• By	position	- $1,	$2 …
• By	name	- $X,	$Y ...
• Limitations
• A user-defined function can only be called on the right-hand side of an assignment, as the only expression
• Cannot	write
• X	<- Y	+	bar()
• for (i in foo(1,2,3))	{	…	}
Sample	Code
A = 1.0 # A is a double (1.0 is a double literal; 1 would be an integer)
X <- matrix("4 3 2 5 7 8", rows=3, cols=2) # X = matrix of size 3,2; '<-' is assignment
Y = matrix(1, rows=3, cols=2) # Y = matrix of size 3,2 with all 1s
b <- t(X) %*% Y # %*% is matrix multiply, t(X) is transpose
S = "hello world"
i=0
while(i < max_iteration) {
  H = (H * (t(W) %*% (V/(W%*%H))))/t(colSums(W)) # * is element by element mult
  W = (W * ((V/(W%*%H)) %*% t(H)))/t(rowSums(H))
  i = i + 1; # i is an integer
}
print (toString(H)) # toString converts a matrix to a string
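The two update lines inside the while loop above can be traced in plain Python to see how the shapes line up. This is only a sketch: the helper names (matmul, transpose, ratio, pnmf_step, divergence) and the small V, W, H values are made up for illustration and are not part of SystemML, and the divergence function is an assumed way of measuring fit that the slide does not show.

```python
import math

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices (DML: A %*% B)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """DML: t(A)."""
    return [list(col) for col in zip(*A)]

def ratio(V, WH):
    """Element-wise division, DML: V / (W %*% H)."""
    return [[v / wh for v, wh in zip(v_row, wh_row)] for v_row, wh_row in zip(V, WH)]

def pnmf_step(V, W, H):
    """One pass of the two updates from the slide's while loop."""
    # H = (H * (t(W) %*% (V/(W%*%H)))) / t(colSums(W))
    num = matmul(transpose(W), ratio(V, matmul(W, H)))
    col_sums_W = [sum(col) for col in zip(*W)]      # one value per row of H
    H = [[h * n / col_sums_W[a] for h, n in zip(H[a], num[a])] for a in range(len(H))]
    # W = (W * ((V/(W%*%H)) %*% t(H))) / t(rowSums(H)), using the updated H
    num = matmul(ratio(V, matmul(W, H)), transpose(H))
    row_sums_H = [sum(row) for row in H]            # one value per column of W
    W = [[w * n / s for w, n, s in zip(W[i], num[i], row_sums_H)] for i in range(len(W))]
    return W, H

def divergence(V, WH):
    """Generalized KL divergence between V and its approximation WH (assumed metric)."""
    return sum(v * math.log(v / wh) - v + wh
               for v_row, wh_row in zip(V, WH) for v, wh in zip(v_row, wh_row))

# Made-up data: V is 4x3, W is 4x2, H is 2x3 (rank 2)
V = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0], [2.0, 1.0, 1.0]]
W = [[0.5, 0.4], [0.3, 0.9], [0.8, 0.2], [0.6, 0.7]]
H = [[0.2, 0.7, 0.5], [0.9, 0.1, 0.6]]

before = divergence(V, matmul(W, H))
for _ in range(20):
    W, H = pnmf_step(V, W, H)
after = divergence(V, matmul(W, H))
print(before, after)
```

Note that t(colSums(W)) is a column vector and t(rowSums(H)) is a row vector, so the divisions broadcast across columns and rows respectively; the Python version spells that broadcasting out by hand.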
Sample	Code
source("nn/layers/affine.dml") as affine # import a file in the "affine" namespace
[W, b] = affine::init(D, M) # calls the init function, multiple return
parfor (i in 1:nrow(X)) { # i iterates over 1 through num rows in X in parallel
  for (j in 1:ncol(X)) { # j iterates over 1 through num cols in X
    # Computation ...
  }
}
write (M, fileM, format="text") # M=matrix, fileM=file, also writes to HDFS
X = read (fileX) # fileX=file, also reads from HDFS
if (ncol (A) > 1) {
  # Matrix A is being sliced by a given range of columns
  A[,1:(ncol (A) - 1)] = A[,1:(ncol (A) - 1)] - A[,2:ncol (A)];
}
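The column-slicing assignment at the end of the sample above may be easier to follow next to a 0-based equivalent. A minimal Python sketch, assuming a matrix stored as a list of rows; the function name subtract_next_column is hypothetical:

```python
def subtract_next_column(A):
    """Return a copy of A where every column except the last has the
    following column subtracted from it.

    DML (1-based, inclusive ranges):
        A[,1:(ncol(A)-1)] = A[,1:(ncol(A)-1)] - A[,2:ncol(A)]
    The right-hand side is computed from the original A before anything
    is written back, so each row is built from the untouched input row.
    """
    ncol = len(A[0])
    if ncol <= 1:
        # Mirrors the guard "if (ncol(A) > 1)": nothing to do for one column
        return [row[:] for row in A]
    return [[row[j] - row[j + 1] for j in range(ncol - 1)] + [row[-1]] for row in A]

A = [[5, 3, 1],
     [10, 4, 4]]
print(subtract_next_column(A))
```

The last column is left unchanged because the DML range 1:(ncol(A)-1) on the left-hand side does not include it.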
Sample	Code
interpSpline = function(
    double x, matrix[double] X, matrix[double] Y, matrix[double] K) return (double q) {
  i = as.integer(nrow(X) - sum(ppred(X, x, ">=")) + 1)
  # misc computation …
  q = as.scalar(qm)
}
eigen = externalFunction(Matrix[Double] A)
  return(Matrix[Double] eval, Matrix[Double] evec)
  implemented in (classname="org.apache.sysml.udf.lib.EigenWrapper", exectype="mem")
Sample	Code	(From	LinearRegDS.dml*)
A = t(X) %*% X
b = t(X) %*% y
if (intercept_status == 2) {
  A = t(diag (scale_X) %*% A + shift_X %*% A [m_ext, ])
  A = diag (scale_X) %*% A + shift_X %*% A [m_ext, ]
  b = diag (scale_X) %*% b + shift_X %*% b [m_ext, ]
}
A = A + diag (lambda)
print ("Calling the Direct Solver...")
beta_unscaled = solve (A, b)
*https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/LinearRegDS.dml#L133
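Leaving out the intercept rescaling branch, the excerpt above forms the normal equations, adds a regularization term to the diagonal, and calls a direct solver. A rough Python sketch of that core path follows. It assumes a single scalar regularization value (the script adds diag(lambda) with a vector), and the helper names and the tiny data set are made up for illustration.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def linreg_direct(X, y, reg=0.0):
    """A = t(X) %*% X + reg * I;  b = t(X) %*% y;  beta = solve(A, b)."""
    Xt = transpose(X)
    A = matmul(Xt, X)
    b = [sum(x * yi for x, yi in zip(col, y)) for col in Xt]
    for i in range(len(A)):
        A[i][i] += reg          # scalar stand-in for "A = A + diag(lambda)"
    return solve(A, b)

# Made-up data lying exactly on y = 1 + 2x; the first column of X is the intercept
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [3.0, 5.0, 7.0]
beta = linreg_direct(X, y)
print(beta)
```

With reg left at 0 this is ordinary least squares; a positive value shrinks the coefficients, which is the role diag(lambda) plays in the excerpt.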
MLContext API
• You	can	invoke	SystemML from	the	
• Command	line	or	a	
• Spark	Program
• The	MLContext API	lets	you	invoke	it	from	a	Spark	Program
• Command	line	invocation	described	later
• Available	as	a	Scala	API	and	a	Python	API
• These	slides	will	only	talk	about	the	Scala	API
MLContext API	– Example	Usage
val ml = new MLContext(sc)
val X_train = sc.textFile("amazon0601.txt")
  .filter(!_.startsWith("#"))
  .map(_.split("\t") match { case Array(prod1, prod2) => (prod1.toInt, prod2.toInt, 1.0) })
  .toDF("prod_i", "prod_j", "x_ij")
  .filter("prod_i < 5000 AND prod_j < 5000") // Change to smaller number
  .cache()
MLContext API	– Example	Usage
val pnmf =
"""
# data & args
X = read($X)
rank = as.integer($rank)
# Computation ....
write(negloglik, $negloglikout)
write(W, $Wout)
write(H, $Hout)
"""
ml.registerInput("X", X_train)
ml.registerOutput("W")
ml.registerOutput("H")
ml.registerOutput("negloglik")
val outputs = ml.executeScript(pnmf,
Map("maxiter" -> "100", "rank" -> "10"))
val negloglik = getScalarDouble(outputs,
"negloglik")
Invocation	– How	to	run	a	DML	file
• SystemML can	run	on
• Your	laptop	(Standalone)
• Spark
• Hybrid	Spark	– using	the	better	choice	between	the	driver	and	the	cluster
• Hadoop
• Hybrid	Hadoop	
• For this presentation, we care about standalone, spark & hybrid_spark
• Documentation	has	detailed	instructions	on	the	others
Invocation	– How	to	run	a	DML	file
Standalone	
In	the	systemml directory
bin/systemml <dml-filename>	[arguments]
Example	invocations:
bin/systemml LinearRegCG.dml -nvargs X=X.mtx Y=Y.mtx B=B.mtx   # named arguments
bin/systemml oddsRatio.dml -args X.mtx 50 B.mtx   # positional arguments
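If the standalone launcher is driven from another program, the two argument styles can be assembled as an argument list. The helper below is a hypothetical convenience, not part of SystemML; it assumes the flags are spelled -nvargs and -args with a plain ASCII hyphen, and it only builds the command without running it.

```python
def build_standalone_command(script, named=None, positional=None):
    """Build the argument list for a standalone bin/systemml invocation.

    named:      dict of name -> value, passed after -nvargs as name=value
    positional: list of values, passed after -args in order
    This sketch accepts one style per call to keep the example simple.
    """
    if named and positional:
        raise ValueError("pass either named or positional arguments, not both")
    cmd = ["bin/systemml", script]
    if named:
        cmd.append("-nvargs")
        cmd.extend(f"{name}={value}" for name, value in named.items())
    elif positional:
        cmd.append("-args")
        cmd.extend(str(value) for value in positional)
    return cmd

named_cmd = build_standalone_command(
    "LinearRegCG.dml", named={"X": "X.mtx", "Y": "Y.mtx", "B": "B.mtx"})
positional_cmd = build_standalone_command(
    "oddsRatio.dml", positional=["X.mtx", 50, "B.mtx"])
print(" ".join(named_cmd))
print(" ".join(positional_cmd))
```

Inside the DML script, named arguments are read as $X, $Y and so on, while positional ones are read as $1, $2 in the order given. The resulting list could be handed to subprocess.run from the systemml directory.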
Invocation	– How	to	run	a	DML	file
Spark/ Hybrid	Spark	
Define	SPARK_HOME	to	point	to	your	Apache	Spark	Installation
Define	SYSTEMML_HOME	to	point	to	your	Apache	SystemML installation
In	the	systemml directory
scripts/sparkDML.sh <dml-filename> [systemml arguments]
Example	invocations:
scripts/sparkDML.sh LinearRegCG.dml --nvargs X=X.mtx Y=Y.mtx B=B.mtx   # named arguments
scripts/sparkDML.sh oddsRatio.dml --args X.mtx 50 B.mtx   # positional arguments
Invocation	– How	to	run	a	DML	file
Spark/ Hybrid	Spark	
Define	SPARK_HOME	to	point	to	your	Apache	Spark	Installation
Define	SYSTEMML_HOME	to	point	to	your	Apache	SystemML installation
Using	the	spark-submit	script
$SPARK_HOME/bin/spark-submit \
  --master <master-url> \
  --class org.apache.sysml.api.DMLScript \
  ${SYSTEMML_HOME}/SystemML.jar -f <dml-filename> <systemml arguments> -exec {hybrid_spark,spark}
Example	invocation:
$SPARK_HOME/bin/spark-submit \
  --master local[*] \
  --class org.apache.sysml.api.DMLScript \
  ${SYSTEMML_HOME}/SystemML.jar -f LinearRegCG.dml --nvargs X=X.mtx Y=Y.mtx B=B.mtx
Editor	Support
• Very	rudimentary	editor	support
• Bit	of	shameless	self-promotion	:	
• Atom	– Hackable	Text	editor
• Install	package	- https://atom.io/packages/language-dml
• From	GUI	- http://flight-manual.atom.io/using-atom/sections/atom-packages/
• Or	from	command	line	– apm install	language-dml
• Rudimentary snippet-based completion of built-in functions
• Vim
• Install	package	- https://github.com/nakul02/vim-dml
• Works with Vundle (Vim package manager)
• There	is	an	experimental	Zeppelin	Notebook	integration	with	DML	–
• https://issues.apache.org/jira/browse/SYSTEMML-542
• Available as a Docker image to play with - https://hub.docker.com/r/nakul02/incubator-zeppelin/
• Please	send	feedback	when	using	these,	requests	for	features,	bugs
• I’ll	work	on	them	when	I	can
Other	Information
• All scripts are in - https://github.com/apache/incubator-systemml/tree/master/scripts
• Algorithm Scripts - https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
• Test Scripts - https://github.com/apache/incubator-systemml/tree/master/src/test/scripts
• Look inside the test folder for programs that run the tests, play around with some of them - https://github.com/apache/incubator-systemml/tree/master/src/test/java/org/apache/sysml/test
Thanks!
• The	documentation	might	be	outdated	and	have	typos
• Please	submit	fixes
• If	a	language	feature	does	not	make	sense	or	is	missing,	ask	a	
SystemML team	member
• Have	Fun!
BACKUP SLIDES
Editor Support
• There was an attempt at an Eclipse plugin late last year -
• https://www.mail-archive.com/dev%40systemml.incubator.apache.org/msg00147.html
• The project is largely dead

More Related Content

What's hot

Chapter 10 Library Function
Chapter 10 Library FunctionChapter 10 Library Function
Chapter 10 Library Function
Deepak Singh
 
Python Programming - IX. On Randomness
Python Programming - IX. On RandomnessPython Programming - IX. On Randomness
Python Programming - IX. On Randomness
Ranel Padon
 

What's hot (18)

Programming in Scala: Notes
Programming in Scala: NotesProgramming in Scala: Notes
Programming in Scala: Notes
 
A Tour Of Scala
A Tour Of ScalaA Tour Of Scala
A Tour Of Scala
 
CNIT 127: Ch 2: Stack overflows on Linux
CNIT 127: Ch 2: Stack overflows on LinuxCNIT 127: Ch 2: Stack overflows on Linux
CNIT 127: Ch 2: Stack overflows on Linux
 
2CPP15 - Templates
2CPP15 - Templates2CPP15 - Templates
2CPP15 - Templates
 
Advanced Functional Programming in Scala
Advanced Functional Programming in ScalaAdvanced Functional Programming in Scala
Advanced Functional Programming in Scala
 
Spark Schema For Free with David Szakallas
 Spark Schema For Free with David Szakallas Spark Schema For Free with David Szakallas
Spark Schema For Free with David Szakallas
 
Advance Scala - Oleg Mürk
Advance Scala - Oleg MürkAdvance Scala - Oleg Mürk
Advance Scala - Oleg Mürk
 
Introduction to programming in scala
Introduction to programming in scalaIntroduction to programming in scala
Introduction to programming in scala
 
The Evolution of Scala
The Evolution of ScalaThe Evolution of Scala
The Evolution of Scala
 
Spark workshop
Spark workshopSpark workshop
Spark workshop
 
Demystifying functional programming with Scala
Demystifying functional programming with ScalaDemystifying functional programming with Scala
Demystifying functional programming with Scala
 
Functional programming in Scala
Functional programming in ScalaFunctional programming in Scala
Functional programming in Scala
 
Pune Clojure Course Outline
Pune Clojure Course OutlinePune Clojure Course Outline
Pune Clojure Course Outline
 
Chapter 10 Library Function
Chapter 10 Library FunctionChapter 10 Library Function
Chapter 10 Library Function
 
Python Programming - IX. On Randomness
Python Programming - IX. On RandomnessPython Programming - IX. On Randomness
Python Programming - IX. On Randomness
 
Functional Programming in Scala
Functional Programming in ScalaFunctional Programming in Scala
Functional Programming in Scala
 
Scalax
ScalaxScalax
Scalax
 
Functional Programming With Scala
Functional Programming With ScalaFunctional Programming With Scala
Functional Programming With Scala
 

Viewers also liked

PandJ Final Essay .
PandJ Final Essay .PandJ Final Essay .
PandJ Final Essay .
Cori Muller
 

Viewers also liked (10)

A Schedule Optimization Tool for Destructive and Non-Destructive Vehicle Tests
A Schedule Optimization Tool for Destructive and Non-Destructive Vehicle Tests  A Schedule Optimization Tool for Destructive and Non-Destructive Vehicle Tests
A Schedule Optimization Tool for Destructive and Non-Destructive Vehicle Tests
 
E irfan
E irfanE irfan
E irfan
 
Forum SDM Bali - Struktur KHL Permen 21 2016 ttg khl
Forum SDM Bali - Struktur KHL Permen 21 2016 ttg khlForum SDM Bali - Struktur KHL Permen 21 2016 ttg khl
Forum SDM Bali - Struktur KHL Permen 21 2016 ttg khl
 
Ashisdeb analytics new_cv_doc
Ashisdeb analytics new_cv_docAshisdeb analytics new_cv_doc
Ashisdeb analytics new_cv_doc
 
Cmp2015 ritsumei takeda
Cmp2015 ritsumei takedaCmp2015 ritsumei takeda
Cmp2015 ritsumei takeda
 
Apache SystemML 2016 Summer class primer by Berthold Reinwald
Apache SystemML 2016 Summer class primer by Berthold ReinwaldApache SystemML 2016 Summer class primer by Berthold Reinwald
Apache SystemML 2016 Summer class primer by Berthold Reinwald
 
eCommerce and Your Business in 2020
eCommerce and Your Business in 2020eCommerce and Your Business in 2020
eCommerce and Your Business in 2020
 
PandJ Final Essay .
PandJ Final Essay .PandJ Final Essay .
PandJ Final Essay .
 
Manifestação pelo Veto Parcial do Projeto de Lei de Reforma da Lei Complement...
Manifestação pelo Veto Parcial do Projeto de Lei de Reforma da Lei Complement...Manifestação pelo Veto Parcial do Projeto de Lei de Reforma da Lei Complement...
Manifestação pelo Veto Parcial do Projeto de Lei de Reforma da Lei Complement...
 
my cv
my cvmy cv
my cv
 

Similar to DML Syntax and Invocation process

From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...
From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...
From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...
Databricks
 
CS 23001 Computer Science II Data Structures & AbstractionPro.docx
CS 23001 Computer Science II Data Structures & AbstractionPro.docxCS 23001 Computer Science II Data Structures & AbstractionPro.docx
CS 23001 Computer Science II Data Structures & AbstractionPro.docx
faithxdunce63732
 
TI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific LanguagesTI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific Languages
Eelco Visser
 
Writing Continuous Applications with Structured Streaming Python APIs in Apac...
Writing Continuous Applications with Structured Streaming Python APIs in Apac...Writing Continuous Applications with Structured Streaming Python APIs in Apac...
Writing Continuous Applications with Structured Streaming Python APIs in Apac...
Databricks
 

Similar to DML Syntax and Invocation process (20)

Overview of Apache SystemML by Berthold Reinwald and Nakul Jindal
Overview of Apache SystemML by Berthold Reinwald and Nakul JindalOverview of Apache SystemML by Berthold Reinwald and Nakul Jindal
Overview of Apache SystemML by Berthold Reinwald and Nakul Jindal
 
Overview of Apache SystemML by Berthold Reinwald and Nakul Jindal
Overview of Apache SystemML by Berthold Reinwald and Nakul JindalOverview of Apache SystemML by Berthold Reinwald and Nakul Jindal
Overview of Apache SystemML by Berthold Reinwald and Nakul Jindal
 
From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...
From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...
From HelloWorld to Configurable and Reusable Apache Spark Applications in Sca...
 
Building an ML Platform with Ray and MLflow
Building an ML Platform with Ray and MLflowBuilding an ML Platform with Ray and MLflow
Building an ML Platform with Ray and MLflow
 
CS 23001 Computer Science II Data Structures & AbstractionPro.docx
CS 23001 Computer Science II Data Structures & AbstractionPro.docxCS 23001 Computer Science II Data Structures & AbstractionPro.docx
CS 23001 Computer Science II Data Structures & AbstractionPro.docx
 
PuttingItAllTogether
PuttingItAllTogetherPuttingItAllTogether
PuttingItAllTogether
 
TI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific LanguagesTI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific Languages
 
Writing Continuous Applications with Structured Streaming Python APIs in Apac...
Writing Continuous Applications with Structured Streaming Python APIs in Apac...Writing Continuous Applications with Structured Streaming Python APIs in Apac...
Writing Continuous Applications with Structured Streaming Python APIs in Apac...
 
Scala for Java Programmers
Scala for Java ProgrammersScala for Java Programmers
Scala for Java Programmers
 
Meta Object Protocols
Meta Object ProtocolsMeta Object Protocols
Meta Object Protocols
 
Scalable Data Science in Python and R on Apache Spark
Scalable Data Science in Python and R on Apache SparkScalable Data Science in Python and R on Apache Spark
Scalable Data Science in Python and R on Apache Spark
 
Tackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in RTackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in R
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
 
Short intro to scala and the play framework
Short intro to scala and the play frameworkShort intro to scala and the play framework
Short intro to scala and the play framework
 
Using existing language skillsets to create large-scale, cloud-based analytics
Using existing language skillsets to create large-scale, cloud-based analyticsUsing existing language skillsets to create large-scale, cloud-based analytics
Using existing language skillsets to create large-scale, cloud-based analytics
 
Terraform modules restructured
Terraform modules restructuredTerraform modules restructured
Terraform modules restructured
 
Terraform Modules Restructured
Terraform Modules RestructuredTerraform Modules Restructured
Terraform Modules Restructured
 
Go Faster With Native Compilation
Go Faster With Native CompilationGo Faster With Native Compilation
Go Faster With Native Compilation
 
Go faster with_native_compilation Part-2
Go faster with_native_compilation Part-2Go faster with_native_compilation Part-2
Go faster with_native_compilation Part-2
 
TEMPLATES IN JAVA
TEMPLATES IN JAVATEMPLATES IN JAVA
TEMPLATES IN JAVA
 

More from Arvind Surve

More from Arvind Surve (18)

Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
 
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmApache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
 
Apache SystemML Architecture by Niketan Panesar
Apache SystemML Architecture by Niketan PanesarApache SystemML Architecture by Niketan Panesar
Apache SystemML Architecture by Niketan Panesar
 
Clustering and Factorization using Apache SystemML by Prithviraj Sen
Clustering and Factorization using Apache SystemML by  Prithviraj SenClustering and Factorization using Apache SystemML by  Prithviraj Sen
Clustering and Factorization using Apache SystemML by Prithviraj Sen
 
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by  Alexandre V EvfimievskiClustering and Factorization using Apache SystemML by  Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
 
Classification using Apache SystemML by Prithviraj Sen
Classification using Apache SystemML by Prithviraj SenClassification using Apache SystemML by Prithviraj Sen
Classification using Apache SystemML by Prithviraj Sen
 
Regression using Apache SystemML by Alexandre V Evfimievski
Regression using Apache SystemML by Alexandre V EvfimievskiRegression using Apache SystemML by Alexandre V Evfimievski
Regression using Apache SystemML by Alexandre V Evfimievski
 
Data preparation, training and validation using SystemML by Faraz Makari Mans...
Data preparation, training and validation using SystemML by Faraz Makari Mans...Data preparation, training and validation using SystemML by Faraz Makari Mans...
Data preparation, training and validation using SystemML by Faraz Makari Mans...
 
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
 
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmApache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
 
Apache SystemML Architecture by Niketan Panesar
Apache SystemML Architecture by Niketan PanesarApache SystemML Architecture by Niketan Panesar
Apache SystemML Architecture by Niketan Panesar
 
Clustering and Factorization using Apache SystemML by Prithviraj Sen
Clustering and Factorization using Apache SystemML by  Prithviraj SenClustering and Factorization using Apache SystemML by  Prithviraj Sen
Clustering and Factorization using Apache SystemML by Prithviraj Sen
 
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by  Alexandre V EvfimievskiClustering and Factorization using Apache SystemML by  Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
 
Classification using Apache SystemML by Prithviraj Sen
Classification using Apache SystemML by Prithviraj SenClassification using Apache SystemML by Prithviraj Sen
Classification using Apache SystemML by Prithviraj Sen
 
Regression using Apache SystemML by Alexandre V Evfimievski
Regression using Apache SystemML by Alexandre V EvfimievskiRegression using Apache SystemML by Alexandre V Evfimievski
Regression using Apache SystemML by Alexandre V Evfimievski
 
Data preparation, training and validation using SystemML by Faraz Makari Mans...
Data preparation, training and validation using SystemML by Faraz Makari Mans...Data preparation, training and validation using SystemML by Faraz Makari Mans...
Data preparation, training and validation using SystemML by Faraz Makari Mans...
 
S1 DML Syntax and Invocation
S1 DML Syntax and InvocationS1 DML Syntax and Invocation
S1 DML Syntax and Invocation
 
Apache SystemML 2016 Summer class primer by Berthold Reinwald
Apache SystemML 2016 Summer class primer by Berthold ReinwaldApache SystemML 2016 Summer class primer by Berthold Reinwald
Apache SystemML 2016 Summer class primer by Berthold Reinwald
 

Recently uploaded

Recently uploaded (20)

Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 

DML Syntax and Invocation process

  • 2. Goal of These Slides • Provide you with basic DML syntax • Link to important resources • Invocation Non-Goals • Comprehensive syntax and API coverage
  • 3. Resources • Google “Apache Systemml” • Documentation - https://apache.github.io/incubator-systemml/ • DML Language Reference - https://apache.github.io/incubator-systemml/dml- language-reference.html • MLContext- https://apache.github.io/incubator-systemml/spark-mlcontext- programming-guide.html#spark-shell-scala-example • Github - https://github.com/apache/incubator-systemml Note • Some documentation is outdated • If you find a typo or want to update the document, consider making a Pull Request • All docs are in Markdown format • https://github.com/apache/incubator-systemml/tree/master/docs
  • 4. About DML Briefly • DML = Declarative Machine Learning • R-like syntax, some subtle differences from R • Dynamically typed • Data Structures • Scalars – Boolean, Integers, Strings, Double Precision • Cacheable – Matrices, DataFrames • Data Structure Terminology in DML • Value Type - Boolean, Integers, Strings, Double Precision • Data Type – Scalar, Matrices, DataFrames* • You can have a DataType[ValueType], not all combinations are supported • For instance – matrix[double] • Scoping • One global scope, except inside functions * Coming soon
  • 5. About DML Briefly • Control Flow • Sequential imperative control flow (like most other languages) • Looping – • while (<condition>) { … } • for (var in <for_predicate>) { … } • parfor (var in <for_predicate>) { … } // Iterations in parallel • Guards – • if (<condition>) { ... } [ else if (<condition>) { ... } ... else { … } ] • Functions • Built-in – List available in language reference • User Defined – (multiple return parameters) • functionName = function (<formal_parameters>…) return (<formal_parameters>) { ... } • Can only access variables defined in the formal_parameters in the body of the function • External Function – same as user defined, can call external Java Package
  • 6. About DML Briefly • Imports • Can import user defined/external functions from other source files • Disambiguation using namespaces • Command Line Arguments • By position - $1, $2 … • By name - $X, $Y ... • Limitations • A user defined functions can only be called on the right hand side of assignments as the only expression • Cannot write • X <- Y + bar() • for (i in foo(1,2,3)) { … }
  • 7. Sample Code A = 1.0 # A is an integer X <- matrix(“4 3 2 5 7 8”, rows=3, cols=2) # X = matrix of size 3,2 '<-' is assignment Y = matrix(1, rows=3, cols=2) # Y = matrix of size 3,2 with all 1s b <- t(X) %*% Y # %*% is matrix multiply, t(X) is transpose S = "hello world" i=0 while(i < max_iteration) { H = (H * (t(W) %*% (V/(W%*%H))))/t(colSums(W)) # * is element by element mult W = (W * ((V/(W%*%H)) %*% t(H)))/t(rowSums(H)) i = i + 1; # i is an integer } print (toString(H)) # toString converts a matrix to a string
  • 8. Sample Code source("nn/layers/affine.dml") as affine # import a file in the “affine“ namespace [W, b] = affine::init(D, M) # calls the init function, multiple return parfor (i in 1:nrow(X)) { # i iterates over 1 through num rows in X in parallel for (j in 1:ncol(X)) { # j iterates over 1 through num cols in X # Computation ... } } write (M, fileM, format=“text”) # M=matrix, fileM=file, also writes to HDFS X = read (fileX) # fileX=file, also reads from HDFS if (ncol (A) > 1) { # Matrix A is being sliced by a given range of columns A[,1:(ncol (A) - 1)] = A[,1:(ncol (A) - 1)] - A[,2:ncol (A)]; }
  • 9. Sample Code interpSpline = function( double x, matrix[double] X, matrix[double] Y, matrix[double] K) return (double q) { i = as.integer(nrow(X) - sum(ppred(X, x, ">=")) + 1) # misc computation … q = as.scalar(qm) } eigen = externalFunction(Matrix[Double] A) return(Matrix[Double] eval, Matrix[Double] evec) implemented in (classname="org.apache.sysml.udf.lib.EigenWrapper", exectype="mem")
  • 10. Sample Code (From LinearRegDS.dml*) A = t(X) %*% X b = t(X) %*% y if (intercept_status == 2) { A = t(diag (scale_X) %*% A + shift_X %*% A [m_ext, ]) A = diag (scale_X) %*% A + shift_X %*% A [m_ext, ] b = diag (scale_X) %*% b + shift_X %*% b [m_ext, ] } A = A + diag (lambda) print ("Calling the Direct Solver...") beta_unscaled = solve (A, b) *https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/LinearRegDS.dml#L133
  • 11. MLContext API
    • You can invoke SystemML from the
      • Command line or a
      • Spark Program
    • The MLContext API lets you invoke it from a Spark Program
    • Command line invocation is described later
    • Available as a Scala API and a Python API
    • These slides will only talk about the Scala API
  • 12. MLContext API – Example Usage
    val ml = new MLContext(sc)
    val X_train = sc.textFile("amazon0601.txt")
      .filter(!_.startsWith("#"))
      .map(_.split("\t") match { case Array(prod1, prod2) => (prod1.toInt, prod2.toInt, 1.0) })
      .toDF("prod_i", "prod_j", "x_ij")
      .filter("prod_i < 5000 AND prod_j < 5000") // Change to smaller number
      .cache()
  • 13. MLContext API – Example Usage
    val pnmf = """
    # data & args
    X = read($X)
    rank = as.integer($rank)
    # Computation ....
    write(negloglik, $negloglikout)
    write(W, $Wout)
    write(H, $Hout)
    """
  • 14. MLContext API – Example Usage
    val pnmf = """
    # data & args
    X = read($X)
    rank = as.integer($rank)
    # Computation ....
    write(negloglik, $negloglikout)
    write(W, $Wout)
    write(H, $Hout)
    """
    ml.registerInput("X", X_train)
    ml.registerOutput("W")
    ml.registerOutput("H")
    ml.registerOutput("negloglik")
    val outputs = ml.executeScript(pnmf, Map("maxiter" -> "100", "rank" -> "10"))
    val negloglik = getScalarDouble(outputs, "negloglik")
  • 15. Invocation – How to run a DML file
    • SystemML can run on
      • Your laptop (Standalone)
      • Spark
      • Hybrid Spark – using the better choice between the driver and the cluster
      • Hadoop
      • Hybrid Hadoop
    • For this presentation, we care about standalone, spark & hybrid_spark
    • Documentation has detailed instructions on the others
  • 16. Invocation – How to run a DML file
    Standalone
    In the systemml directory:
      bin/systemml <dml-filename> [arguments]
    Example invocations:
      bin/systemml LinearRegCG.dml -nvargs X=X.mtx Y=Y.mtx B=B.mtx   (named arguments)
      bin/systemml oddsRatio.dml -args X.mtx 50 B.mtx                (positional arguments)
  • 17. Invocation – How to run a DML file
    Spark / Hybrid Spark
    Define SPARK_HOME to point to your Apache Spark installation
    Define SYSTEMML_HOME to point to your Apache SystemML installation
    In the systemml directory:
      scripts/sparkDML.sh <dml-filename> [systemml arguments]
    Example invocations:
      scripts/sparkDML.sh LinearRegCG.dml --nvargs X=X.mtx Y=Y.mtx B=B.mtx   (named arguments)
      scripts/sparkDML.sh oddsRatio.dml --args X.mtx 50 B.mtx                (positional arguments)
  • 18. Invocation – How to run a DML file
    Spark / Hybrid Spark
    Define SPARK_HOME to point to your Apache Spark installation
    Define SYSTEMML_HOME to point to your Apache SystemML installation
    Using the spark-submit script:
      $SPARK_HOME/bin/spark-submit \
        --master <master-url> \
        --class org.apache.sysml.api.DMLScript \
        ${SYSTEMML_HOME}/SystemML.jar -f <dml-filename> <systemml arguments> -exec {hybrid_spark,spark}
    Example invocation:
      $SPARK_HOME/bin/spark-submit \
        --master local[*] \
        --class org.apache.sysml.api.DMLScript \
        ${SYSTEMML_HOME}/SystemML.jar -f LinearRegCG.dml --nvargs X=X.mtx Y=Y.mtx B=B.mtx
  • 19. Editor Support
    • Very rudimentary editor support
    • Bit of shameless self-promotion:
      • Atom – Hackable Text editor
        • Install package - https://atom.io/packages/language-dml
          • From GUI - http://flight-manual.atom.io/using-atom/sections/atom-packages/
          • Or from command line – apm install language-dml
        • Rudimentary snippet-based completion of built-in functions
      • Vim
        • Install package - https://github.com/nakul02/vim-dml
        • Works with Vundle (Vim package manager)
    • There is an experimental Zeppelin Notebook integration with DML –
      • https://issues.apache.org/jira/browse/SYSTEMML-542
      • Available as a Docker image to play with - https://hub.docker.com/r/nakul02/incubator-zeppelin/
    • Please send feedback when using these, requests for features, bugs
      • I’ll work on them when I can
  • 20. Other Information
    • All scripts are in - https://github.com/apache/incubator-systemml/tree/master/scripts
      • Algorithm Scripts - https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
      • Test Scripts - https://github.com/apache/incubator-systemml/tree/master/src/test/scripts
    • Look inside the test folder for programs that run the tests, play around with some of them - https://github.com/apache/incubator-systemml/tree/master/src/test/java/org/apache/sysml/test
  • 21. Thanks!
    • The documentation might be outdated and have typos
      • Please submit fixes
    • If a language feature does not make sense or is missing, ask a SystemML team member
    • Have Fun!