An LSTM-Based Neural Network Architecture for Model Transformations by Lola Burgueño
Model transformations are a key element in any model-driven engineering approach, but writing them is a time-consuming and error-prone activity that requires specific knowledge of the transformation language semantics. We propose to take advantage of the advances in Artificial Intelligence and, in particular, Long Short-Term Memory Neural Networks (LSTM), to automatically infer model transformations from sets of input-output model pairs. Once the transformation mappings have been learned, the LSTM system is able to autonomously transform new input models into their corresponding output models without the need to write any transformation-specific code. We evaluate the correctness and performance of our approach and discuss its advantages and limitations.
Robust and declarative machine learning pipelines for predictive buying at Ba... by Gianmario Spacagna
Proof of concept of how to use Scala, Spark and the recent library Sparkz for building production quality machine learning pipelines for predicting buyers of financial products.
The pipelines are implemented through custom declarative APIs that give us greater control, transparency and testability of the whole process.
The example follows the validation and evaluation principles defined in The Data Science Manifesto, available in beta at www.datasciencemanifesto.org.
Using AI to build AI is a promising solution to give the power of AI to those who can't afford it the way multinational corporations can. The technology is also known as Automatic Machine Learning (AutoML). OneClick.ai is the first deep learning AutoML platform that makes the latest AI technology accessible to anyone, with or without an AI background. The deck gives a 30-minute overview of the recent history of AutoML, and how OneClick.ai innovates on it. Check out our platform at http://www.oneclick.ai
Intro to Deep Learning with Keras - using TensorFlow backend by Amin Golnari
An overview of deep learning. Keras installation on Windows and how to use it.
Creating a sequential network and training it on the MNIST data.
Visualization and optimization in Keras, with examples.
An overview of deep learning. Installing and setting up Keras on Windows. Building a multilayer neural network and training it on the Latin handwritten digits dataset.
Keras is a high-level framework that runs on top of AI libraries such as TensorFlow, Theano, or CNTK. The key feature of Keras is that it allows switching out the underlying library without any code changes. Keras contains commonly used neural-network building blocks such as layers, optimizers, activation functions, etc., and supports convolutional and recurrent neural networks. In addition, Keras bundles datasets and some pre-trained deep learning applications that make it easier for beginners to learn. Essentially, Keras is democratizing deep learning by lowering the barrier to entry.
Separating Hype from Reality in Deep Learning with Sameer Farooqui (Databricks)
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network, and what are the common techniques to avoid overfitting (like L1/L2 regularization, dropout and early stopping)?
The key challenge in making AI technology more accessible to the broader community is the scarcity of AI experts. Most businesses simply don’t have the much-needed resources or skills for modeling and engineering. This is why automated machine learning and deep learning technologies (AutoML and AutoDL) are increasingly valued by academia and industry. The core of AI is model design. Automated machine learning technology reduces the barriers to AI application, enabling developers with no AI expertise to independently and easily develop and deploy AI models. Automated machine learning is expected to completely overturn the AI industry in the next few years, making AI ubiquitous.
DataMass Summit - Machine Learning for Big Data in SQL Server by Łukasz Grala
A session showing Machine Learning Server (machine learning algorithms in R and Python), as well as the ability to work with JSON data in SQL Server and to connect to data stored on HDFS, Hadoop or Spark via PolyBase in SQL Server, so that these data can be used for analysis and prediction through models in R or Python.
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016 (MLconf)
DL4J and DataVec for Enterprise Deep Learning Workflows: Applications in NLP, sensor processing (IoT), image processing, and audio processing have all emerged as prime deep learning applications. In this session we will take a look at a practical review of building practical and secure Deep Learning workflows in the enterprise. We’ll see how DL4J’s DataVec tool enables scalable ETL and vectorization pipelines to be created for a single machine or scale out to Spark on Hadoop. We’ll also see how Deep Networks such as Recurrent Neural Networks are able to leverage DataVec to more quickly process data for modeling.
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas... by Sujit Pal
Slides for talk at PyData Seattle 2017 about Matthew Honnibal's 4-step recipe for Deep Learning NLP pipelines. Description of the stages in pipeline as well as 3 examples of document classification, document similarity and sentence similarity. Examples include Keras custom layers for different types of attention.
TensorFlow Extension (TFX) and Apache Beam by markgrover
Talk on TFX and Beam by Robert Crowe, developer advocate at Google, focused on TensorFlow.
Learn how the TensorFlow Extended (TFX) project is utilizing Apache Beam to simplify pre- and post-processing for ML pipelines. TFX provides a framework for managing all of the necessary pieces of a real-world machine learning project beyond simply training and utilizing models. Robert will provide an overview of TFX and talk in a little more detail about the pieces of the framework (tf.Transform and tf.ModelAnalysis) which are powered by Apache Beam.
Ballerina is a general purpose language optimized for integration and writing network services and applications. While it looks like Java and other popular languages in some ways, it is very different from those in fundamental ways. This explores how Ballerina is different, why it is different and how those differences give Ballerina an unfair advantage when it comes to writing resilient, performant and secure network services and applications.
This presentation discusses the following topics:
Basic features of R
Exploring R GUI
Data Frames & Lists
Handling Data in R Workspace
Reading Data Sets & Exporting Data from R
Manipulating & Processing Data in R
Evolving a Medical Image Similarity Search by Sujit Pal
Slides for talk at Haystack Conference 2018. Covers evolution of an Image Similarity Search Proof of Concept built to identify similar medical images. Discusses various image vectorizing techniques that were considered in order to convert images into searchable entities, an evaluation strategy to rank these techniques, as well as various indexing strategies to allow searching for similar images at scale.
A seminar in advanced Software Engineering on using models to guide the development process, and QVT to transform one model into another automatically.
An LSTM-Based Neural Network Architecture for Model Transformations by Jordi Cabot
We propose to take advantage of the advances in Artificial Intelligence and, in particular, Long Short-Term Memory Neural Networks (LSTM), to automatically infer model transformations from sets of input-output model pairs.
Natural Language Query to SQL conversion using Machine Learning Approach by Minhazul Arefin
Natural Language Processing is a computer science and artificial intelligence field concerned with computer-human language interactions and, in particular, with how computers are designed to process and explore a variety of natural language data. The Structured Query Language is usually challenging for non-expert users, who may not know the database structure. A new intelligent interface is therefore necessary for database applications to improve the interaction between database and user. The idea of using a natural language instead of a structured query language has led to the creation of natural language interfaces to database systems as a new form of processing. The aim of this research is to build a query-generating process using a machine learning algorithm to represent information according to the user's demands for answering queries and obtaining information. For the conversion of a Natural Language Query into a Structured Query, we use lowercase conversion, removal of escaped words, tokenization, PoS tagging, word similarity, the Jaro-Winkler matching algorithm, and the Naive Bayes method.
The Power of Auto ML and How Does it Work by Ivo Andreev
Automated ML is an approach to minimize the need for data science effort by enabling domain experts to build ML models without deep knowledge of algorithms, mathematics or programming skills. The mechanism works by allowing end-users to simply provide data, and the system automatically does the rest by determining the approach to perform the particular ML task. At first this may sound discouraging to those aiming at the “sexiest job of the 21st century” - the data scientists. However, Auto ML should be considered as the democratization of ML, rather than automatic data science.
In this session we will talk about how Auto ML works, how it is implemented by Microsoft and how it could improve the productivity of even professional data scientists.
How Skroutz S.A. utilizes Deep Learning and Machine Learning techniques to efficiently serve product categorization! Based on my talk at the Athens PyData meetup!
The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization
Distributed Model Validation with Epsilon by Sina Madani
Scalable performance is a major challenge with current model management tools. As the size and complexity of models and model management programs increase and the cost of computing falls, one solution for improving the performance of model management programs is to perform computations on multiple computers. The developed prototype demonstrates a low-overhead data-parallel approach for distributed model validation in the context of an OCL-like language. The approach minimises communication costs by exploiting the deterministic structure of programs and can take advantage of multiple cores on each (heterogeneous) machine with highly configurable computational granularity. Performance evaluation shows linear improvements with more machines and processor cores, being up to 340x faster than the baseline sequential program with 88 computers.
Operationalizing Machine Learning: Serving ML Models by Lightbend
Join O’Reilly author and Lightbend Principal Architect, Boris Lublinsky, as he discusses one of the hottest topics in software engineering today: serving machine learning models.
Typically with machine learning, different groups are responsible for model training and model serving. Data scientists often introduce their own machine-learning tools, causing software engineers to create complementary model-serving frameworks to keep pace. It’s not a very efficient system. In this webinar, Boris demonstrates a more standardized approach to model serving and model scoring:
* How to develop an architecture for serving models in real time as part of input stream processing
* How this approach enables data science teams to update models without restarting existing applications
* Different ways to build this model-scoring solution, using several popular stream processing engines and frameworks
2. Motivation
• Models capture relevant properties of systems
• During their life-cycle, models are the subject of manipulations
• Purposes:
  • managing software evolution,
  • performing analysis,
  • increasing developers’ productivity,
  • reducing human errors,
  • etc.
• These manipulations are usually implemented as model transformations:
  • Model-to-Model
  • Model-to-Text
  • Text-to-Model
3. Motivation
• Creating model transformations remains a challenging task [1]; it requires:
  • a high level of expertise,
  • competences in language engineering, and
  • extensive domain knowledge
• Developers are reluctant to adopt automatic model generators (e.g., from UML to Java):
  • they do not trust them,
  • the generated artifacts look foreign to them,
  • they do not follow the company’s coding style
• AI-based solution:
  • heterogeneous model transformations can be automatically inferred from input-output pairs alone
  • the outputs will comply with the company’s or project’s standards

[1] L. Burgueño, J. Cabot, S. Gérard, “The future of model transformation languages: An open community discussion”
7. Artificial Neural Networks
• Graph structure: neurons + directed weighted connections
• Neurons are mathematical functions that:
  • receive a set of values through their input connections,
  • compute an output value, and
  • transfer the output value to other neurons through their output connections
• Connections have associated weights (i.e., real numbers):
  • adjusted during the learning process to increase/decrease the strength of the connection
8. Artificial Neural Networks
• The learning process basically means finding the right weights
• Supervised learning methods. Training phase:
  • example input-output pairs are used; the dataset is split into training, validation and test subsets
• The training dataset contains most of the input-output pairs and is used to train the ANN
9. Artificial Neural Networks
• The test dataset is used only once the training has finished:
  • it checks the quality of the ANN’s predictions for inputs it has not seen before, and hence measures its accuracy
• The accuracy is calculated as:
  accuracy = (# predictions our model gets right) / (# total predictions)
• Thus, it is a number in the range [0, 1]; the closer to 1, the better
10. Artificial Neural Networks
• The validation dataset plays a similar role to the test dataset, but during the training process:
  • it controls that the learning process is correct and avoids overfitting
  • any accuracy increase over the training dataset should yield an accuracy increase over the validation dataset
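The dataset split and the accuracy formula from the last three slides can be sketched in plain Python. The 80/10/10 proportions and the toy input-output pairs are illustrative assumptions, not values taken from the slides:

```python
import random

def split_dataset(pairs, train=0.8, val=0.1, seed=42):
    """Shuffle input-output pairs and split them into training,
    validation, and test subsets."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_train = int(len(pairs) * train)
    n_val = int(len(pairs) * val)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

def accuracy(predictions, expected):
    """# predictions the model gets right / # total predictions -> [0, 1]."""
    right = sum(p == e for p, e in zip(predictions, expected))
    return right / len(expected)

pairs = [(i, i * 2) for i in range(100)]  # toy input-output pairs
train_set, val_set, test_set = split_dataset(pairs)
print(len(train_set), len(val_set), len(test_set))  # 80 10 10
print(accuracy([2, 4, 7], [2, 4, 6]))  # 2 of 3 correct -> 0.666...
```

The training subset fits the weights, the validation subset monitors for overfitting during training, and the test subset is touched only once at the end.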
11. Architecture
• Encoder-decoder architecture (avoids fixed-size input/output constraints)
+
• Long short-term memory (LSTM) neural networks (longer memory than their predecessors)

[Diagram: Input Model → Encoder LSTM network → Decoder LSTM network → Output Model]
12. Architecture
• Sequence-to-sequence transformations
• Tree-to-tree transformations:
  • an embedding layer embeds the input tree into numeric vectors
+
  • an output layer obtains the output model from the numeric vectors produced by the decoder

[Diagram: Input Model → Input Tree → Embedding Layer → Encoder LSTM network → Decoder LSTM network → Output Tree Layer → Output Model]
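To make the embedding-encoder-decoder pipeline concrete, here is a minimal NumPy sketch of the forward pass: an embedding table feeds an encoder LSTM cell whose final state seeds a decoder LSTM cell. The sizes, the random (untrained) weights, and the start-symbol convention are all illustrative assumptions; a real implementation would use a trained model in a deep learning framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """One step of a plain LSTM recurrence (input, forget, cell, output gates)."""
    def __init__(self, input_size, hidden_size, rng):
        # one stacked weight block for all four gates
        self.W = rng.standard_normal((4 * hidden_size, input_size + hidden_size)) * 0.1
        self.b = np.zeros(4 * hidden_size)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # new hidden state
        return h, c

rng = np.random.default_rng(0)
vocab, emb_size, hidden = 20, 8, 16
embedding = rng.standard_normal((vocab, emb_size)) * 0.1  # embedding layer

encoder = LSTMCell(emb_size, hidden, rng)
decoder = LSTMCell(emb_size, hidden, rng)

# Encoder: read the serialized input tree token by token
input_tokens = [3, 7, 1, 12]
h = c = np.zeros(hidden)
for t in input_tokens:
    h, c = encoder.step(embedding[t], h, c)

# Decoder: seeded with the encoder's final state; one step shown,
# using token 0 as a hypothetical start symbol
h_dec, c_dec = decoder.step(embedding[0], h, c)
print(h_dec.shape)  # (16,)
```

The encoder compresses the whole input sequence into its final (h, c) state, which is exactly what lets the architecture avoid fixed-size input constraints.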
13. Architecture
• Attention mechanism:
  • pays more attention to (remembers better) specific parts
  • helps the decoder recognize the relevant information in the vectorial representation of the input AST at each step

[Diagram: Input Model → Input Tree → Embedding Layer → Encoder LSTM network → Attention Layer → Decoder LSTM network → Output Tree Layer → Output Model]
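A minimal sketch of the attention idea on this slide, assuming simple dot-product scoring (the concrete scoring function is not specified here): the current decoder state is compared against every encoder state, the scores are softmax-normalized into weights, and a weighted context vector is produced for the decoder to use.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # shift for numerical stability
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """Weight each encoder state by its relevance to the current decoder state."""
    scores = encoder_states @ decoder_state  # one score per input position
    weights = softmax(scores)                # probabilities over positions
    context = weights @ encoder_states       # weighted sum of encoder states
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((5, 16))  # 5 input positions, hidden size 16
decoder_state = rng.standard_normal(16)

context, weights = attend(decoder_state, encoder_states)
print(round(weights.sum(), 6))  # 1.0 (the weights form a distribution)
print(context.shape)            # (16,)
```

Positions with higher weights are the parts of the input representation the decoder is "paying attention to" at this step.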
14. Architecture
• Softmax:
  • multi-class classification using a softmax activation function
  • maps each real number to a number in the (0, 1) range, interpreted as a probability
  • in each iteration, the component with the highest probability is selected and its corresponding token becomes part of the output

[Diagram: Input Model → Input Tree → Embedding Layer → Encoder LSTM network → Attention Layer → Decoder LSTM network → Output Tree Layer → Softmax → Output Model]
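The softmax step can be illustrated numerically. The three-token vocabulary and the raw scores below are made-up values standing in for real decoder outputs:

```python
import math

def softmax(scores):
    """Map real-valued scores to probabilities in (0, 1) that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["Table", "Column", "Type"]  # hypothetical output tokens
scores = [2.0, 0.5, -1.0]            # made-up decoder outputs
probs = softmax(scores)

# The component with the highest probability contributes the next output token
chosen = vocab[probs.index(max(probs))]
print(chosen)  # Table
```

Repeating this selection at every decoding step is what assembles the output tree token by token.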
15. Model pre- and post-processing
• Pre- and post-processing are required to…
• represent models as trees
• reduce the size of the training dataset by using a canonical form
• rename variables to avoid the “dictionary problem”
[Diagram: Input Model → Preprocessing → Input Model (preprocessed) → Input Tree Embedding Layer → Encoder (LSTM network) → Attention Layer → Decoder (LSTM network) → Output Tree Layer (Softmax) → Output Model (non-postprocessed) → Postprocessing → Output Model]
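The variable-renaming step can be sketched as follows: identifiers that are not part of the metamodel vocabulary are replaced by positional placeholders, so the network never has to memorize an open-ended dictionary of names. The keyword set and placeholder scheme are illustrative assumptions, not the paper's exact preprocessing:

```python
def canonicalize(tokens, keywords=frozenset({"Class", "Attribute", "Datatype"})):
    """Rename free identifiers to positional placeholders (v0, v1, ...)."""
    mapping, out = {}, []
    for t in tokens:
        if t in keywords:
            out.append(t)                   # metamodel vocabulary stays as-is
        else:
            out.append(mapping.setdefault(t, f"v{len(mapping)}"))
    return out, mapping

tokens, mapping = canonicalize(["Class", "family", "Attribute", "surname", "family"])
print(tokens)  # ['Class', 'v0', 'Attribute', 'v1', 'v0']
```

Postprocessing would invert `mapping` to restore the original names in the generated output.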
16. Hyperparameters
• Parameters that are not learned during training, but adjusted manually
• There is no rule for choosing the best hyperparameters for a specific task
• Choosing the right values has a critical impact on the success and performance of the network
17. Hyperparameters

Hyperparameter       | Description                                                                                                                                              | Value
Epoch                | Number of times the complete training dataset is passed through the neural networks. In each epoch, the training dataset is randomly shuffled and split into batches. | 30
Batch                | Set of input-output pairs. Each time a batch is passed, an iteration is completed.                                                                       | 64
Neural network depth | Number of hidden layers in the neural networks.                                                                                                          | 1
Embedding size       | Size of the vectors. Needs to be higher than the vocabulary size.                                                                                        | 64
Dropout              | Selects which weights are updated and which are not in each iteration. If all weights were adjusted in every iteration, overfitting would be more likely. | 0.75 (probability of a weight being ignored)
Learning rate        | Controls how much the weights are adjusted. Value in the range [0, 1].                                                                                   | 0.005
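A toy sketch of how the learning-rate and dropout values from the table act on a single update. Note this follows the slide's description of dropout as skipping weight updates; standard dropout instead zeroes activations, so treat this strictly as an illustration of the table's values:

```python
import numpy as np

EPOCHS, BATCH_SIZE = 30, 64           # values from the table (not used in this toy step)
LEARNING_RATE, P_DROP = 0.005, 0.75   # P_DROP: probability a weight update is skipped

rng = np.random.default_rng(42)
weights = rng.normal(size=8)
grads = rng.normal(size=8)
w_before = weights.copy()

keep = rng.random(weights.size) >= P_DROP   # on average only ~25% of weights updated
weights -= LEARNING_RATE * grads * keep     # step size controlled by the learning rate
```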
18. Cases and Results
• We illustrate the feasibility and potential of our approach through its application
in two main operations on models:
• Model-to-Model transformation Class 2 Relational
• Code generation UML 2 Java
Results: neural networks are able to faithfully learn how to perform these tasks, as long as enough data is provided and no contradictory examples are given
19. Class 2 Relational
Some of the transformation rules that need to be learned are:
• Each Class is transformed into a Table;
• Each DataType is transformed into a Type;
• Each single-valued Attribute of type DataType is transformed into a Column;
• Each multi-valued Attribute instance of the type DataType is transformed into a Table, …
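For intuition, here is a hypothetical hand-written version of the mapping the network must infer from examples alone; the element encoding (`kind`, `name`, `multivalued` fields) is an illustrative assumption, not the paper's representation:

```python
def transform(element):
    """Toy Class-to-Relational rules, one branch per slide bullet."""
    kind = element["kind"]
    if kind == "Class":
        return {"kind": "Table", "name": element["name"]}
    if kind == "DataType":
        return {"kind": "Type", "name": element["name"]}
    if kind == "Attribute":
        if element["multivalued"]:
            # multi-valued attributes get a table of their own
            return {"kind": "Table", "name": element["name"]}
        return {"kind": "Column", "name": element["name"]}
    raise ValueError(f"no rule for {kind}")

print(transform({"kind": "Class", "name": "Family"}))  # {'kind': 'Table', 'name': 'Family'}
```

The point of the approach is that none of these branches is written by hand: the LSTM learns them from input-output model pairs.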
20. Model representation
[Tree diagram of the example model:
MODEL
├─ OBJ c : Class, ATTS { isAbstract = false, name = family }
├─ OBJ a : Attribute, ATTS { multivalued = false, name = surname }
├─ OBJ dt : Datatype, ATTS { name = String }
├─ ASSOC att (c, a)
└─ ASSOC type (a, dt)]
{
  "source_ast": {
    "root": "<MODEL>",
    "children": [
      {
        "root": "<OBJ>",
        "children": [
          {
            "root": "D",
            "children": [
              …
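The `{"root": ..., "children": [...]}` shape shown above is straightforward to build programmatically. A small sketch that encodes a fragment of the example model; the helper name `node` is ours, not from the paper:

```python
import json

def node(root, children=None):
    """Build one tree node in the slide's {"root": ..., "children": [...]} shape."""
    return {"root": root, "children": children or []}

# Fragment of the example model from the slide, encoded as a tree
source_ast = node("<MODEL>", [
    node("<OBJ>", [
        node("c"), node("Class"),
        node("<ATTS>", [node("isAbstract"), node("false"),
                        node("name"), node("family")]),
    ]),
])
print(json.dumps({"source_ast": source_ast}, indent=1)[:60])
```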
21. Class 2 Relational: Results
• Correctness
• Measured through the accuracy and validation loss
22. Class 2 Relational: Results
• Performance
1. How long does it take for the training phase to complete?
2. How long does it take to transform an input model once the network is trained?
24. Class 2 Relational: Results
• Performance
• Comparison with ATL
• Synthetic models
• Transformation and model from [1]
• ATL: 0.033 seconds
• Our approach: 0.722 seconds
• Although slower, both run in well under a second, a reasonable time
• The advantages of our approach may pay off
[1] AtlanMod (Inria), Class to relational transformation example,
https://www.eclipse.org/atl/atlTransformations/#Class2Relational
25. UML 2 Java: Results
Example translation rules:
• UML classes are transformed into Java classes;
• UML attributes into Java attributes;
• UML associations into Java attributes referencing the class at the other end of the association
Abstraction gap: many variability points in the translation
• E.g., primitive data type conversions
• A UML attribute whose type is Real could be mapped to a Java attribute with type double, Double or …
26. UML 2 Java: Results
• Training dataset
• Creation:
1. Downloaded the source code of the Eclipse IDE
2. Reverse engineered it with MoDisco
3. Removed the low-level details (e.g., method implementations)
4. Obtained the UML models
• The initial dataset (D1) contains 25,375 input-output pairs
27. UML 2 Java: Results
• Correctness
Our definition of accuracy:
• worst-case scenario
• we sometimes discard generated code that is semantically equivalent to the expected output, even if it presents slight syntactic differences
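This worst-case, exact-match notion of accuracy can be sketched in a few lines; the function and its inputs are illustrative:

```python
def exact_match_accuracy(generated, expected):
    """Worst-case accuracy: any syntactic difference counts as a miss,
    even when the generated code is semantically equivalent."""
    hits = sum(g == e for g, e in zip(generated, expected))
    return hits / len(expected)

# "int x=1;" and "int x = 1;" are equivalent Java, but count as a miss here
print(exact_match_accuracy(["int x=1;", "int y;"], ["int x = 1;", "int y;"]))  # 0.5
```

The reported accuracy is therefore a lower bound on the fraction of usable outputs.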
28. UML 2 Java: Results
• Training dataset curation, starting from the initial dataset (D1) of 25,375 input-output pairs:
1. Discard pairs that exceed a size limit (dataset D2)
2. Discard examples with inconsistencies (dataset D3)
• E.g., due to inheritance and the location of getters/setters in the hierarchy
• The curated dataset (D3) contains 8,937 input-output pairs
30. UML 2 Java: Results
• Performance
• Training: [plot omitted]
• Generating code:
• Average time of 18 milliseconds
• Standard deviation of 3 milliseconds
• Efficient enough to be part of any continuous software development process
31. Limitations/Discussion
• Generalization problem
• predicting output solutions for input models very different from the training distribution it has learned from
• Social acceptance
• Size of the training dataset
• Size of the input-output pairs
• Diversity in the training set
• Computational limitations of ANNs
• i.e., mathematical operations
32. Future work
• Study of transformers
• Validation of our approach with MT chains
• Connectors for EMF models
• Pretrained networks for an easy quick start
33. References
[1] Loli Burgueño, Jordi Cabot, Sébastien Gérard. An LSTM-Based Neural Network Architecture for
Model Transformations. In Proc. of MoDELS 2019: 294-299
[2] Loli Burgueño, Jordi Cabot, Shuai Li, Sébastien Gérard. A Generic LSTM Neural Network
Architecture to Infer Heterogeneous Model Transformations. Software and Systems Modeling. 2021.
DOI: 10.1007/s10270-021-00893-y
Editor's Notes
The loss is another metric that monitors the quality of the training. It reports the sum of the errors made over the examples in the dataset, so it should be as close to 0 as possible. A decrease in training loss is expected to be accompanied by a decrease in validation loss.
In our scenario, we may need this long-term memory to remember previous mappings as part of a more complex mapping pattern.
The correctness of ANNs is studied through accuracy and overfitting (the latter measured through the validation loss). The accuracy should be as close to 1 as possible, and the validation loss as close to 0 as possible.
The accuracy is calculated by comparing, for each input model in the test dataset, whether the output of the network corresponds to the expected output. If it does, the network was able to successfully predict the target model for the given input model.
The accuracy grows and the loss decreases with the size of the dataset, i.e., the more input-output pairs we provide for training, the better our software learns and predicts (transforms). In this concrete case, with a dataset of 1,000 models, the accuracy is 1 and the loss 0 (meaning that no overfitting was taking place), which means that the ANNs are perfectly trained and ready to use. Note that we show the size of the complete dataset, but we split it using 80% of the pairs for training, 10% for validation and 10% for testing.
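The 80/10/10 split mentioned in the notes can be sketched as follows (the pairs should be shuffled before splitting; the helper name is ours):

```python
def split_dataset(pairs, train=0.8, val=0.1):
    """Split (pre-shuffled) input-output pairs into train/validation/test."""
    n_train = int(len(pairs) * train)
    n_val = int(len(pairs) * val)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

train, val, test = split_dataset(list(range(1000)))
print(len(train), len(val), len(test))  # 800 100 100
```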