An LSTM-Based Neural Network Architecture for Model Transformations by Lola Burgueño
Model transformations are a key element in any model-driven engineering approach, but writing them is a time-consuming and error-prone activity that requires specific knowledge of the transformation language semantics. We propose to take advantage of the advances in Artificial Intelligence and, in particular, Long Short-Term Memory Neural Networks (LSTM), to automatically infer model transformations from sets of input-output model pairs. Once the transformation mappings have been learned, the LSTM system is able to autonomously transform new input models into their corresponding output models without the need to write any transformation-specific code. We evaluate the correctness and performance of our approach and discuss its advantages and limitations.
Robust and declarative machine learning pipelines for predictive buying at Ba... by Gianmario Spacagna
Proof of concept of how to use Scala, Spark and the recent library Sparkz for building production quality machine learning pipelines for predicting buyers of financial products.
The pipelines are implemented through custom declarative APIs that give us greater control, transparency and testability of the whole process.
The example follows the validation and evaluation principles defined in The Data Science Manifesto, available in beta at www.datasciencemanifesto.org.
Using AI to build AI is a promising solution to give the power of AI to those who can't afford it the way multinational corporations can. The technology is also known as Automatic Machine Learning (AutoML). OneClick.ai is the first deep learning AutoML platform that makes the latest AI technology accessible to anyone, with or without an AI background. The deck gives a 30-minute overview of the recent history of AutoML, and how OneClick.ai innovates on it. Check out our platform at http://www.oneclick.ai
Intro to Deep Learning with Keras - using TensorFlow backend by Amin Golnari
An overview of deep learning. Keras installation on Windows and how to use it.
Creating a sequential network and training it on the MNIST data.
Visualization and optimization in Keras, with examples.
An overview of deep learning. Installing and setting up Keras on Windows. Building a multilayer neural network and training it on the Latin handwritten digits dataset.
Keras is a high-level framework that runs on top of AI libraries such as TensorFlow, Theano, or CNTK. The key feature of Keras is that it allows switching out the underlying library without any code changes. Keras contains commonly used neural-network building blocks such as layers, optimizers, activation functions, etc., and supports convolutional and recurrent neural networks. In addition, Keras bundles datasets and some pre-trained deep learning applications that make it easier for beginners to learn. Essentially, Keras is democratizing deep learning by lowering the barrier to entry.
Separating Hype from Reality in Deep Learning with Sameer Farooqui (Databricks)
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network, and what are the common techniques to avoid overfitting (like L1/L2 regularization, dropout and early stopping)?
The key challenge in making AI technology more accessible to the broader community is the scarcity of AI experts. Most businesses simply don’t have the much-needed resources or skills for modeling and engineering. This is why automated machine learning and deep learning technologies (AutoML and AutoDL) are increasingly valued by academia and industry. The core of AI is model design. Automated machine learning technology reduces the barriers to AI application, enabling developers with no AI expertise to independently and easily develop and deploy AI models. Automated machine learning is expected to completely overturn the AI industry in the next few years, making AI ubiquitous.
DataMass Summit - Machine Learning for Big Data in SQL Server by Łukasz Grala
A session showing Machine Learning Server (machine learning algorithms in R and Python), as well as the ability to work with JSON data in SQL Server and to connect to data stored on HDFS, Hadoop or Spark via PolyBase in SQL Server, so that these data can be used for analysis and prediction through models in R or Python.
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016 (MLconf)
DL4J and DataVec for Enterprise Deep Learning Workflows: Applications in NLP, sensor processing (IoT), image processing, and audio processing have all emerged as prime deep learning applications. In this session we will take a look at a practical review of building practical and secure Deep Learning workflows in the enterprise. We’ll see how DL4J’s DataVec tool enables scalable ETL and vectorization pipelines to be created for a single machine or scale out to Spark on Hadoop. We’ll also see how Deep Networks such as Recurrent Neural Networks are able to leverage DataVec to more quickly process data for modeling.
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas... by Sujit Pal
Slides for talk at PyData Seattle 2017 about Matthew Honnibal's 4-step recipe for Deep Learning NLP pipelines. Description of the stages in pipeline as well as 3 examples of document classification, document similarity and sentence similarity. Examples include Keras custom layers for different types of attention.
TensorFlow Extension (TFX) and Apache Beam by markgrover
Talk on TFX and Beam by Robert Crowe, developer advocate at Google, focused on TensorFlow.
Learn how the TensorFlow Extended (TFX) project is utilizing Apache Beam to simplify pre- and post-processing for ML pipelines. TFX provides a framework for managing all of the necessary pieces of a real-world machine learning project beyond simply training and utilizing models. Robert will provide an overview of TFX and talk in a little more detail about the pieces of the framework (tf.Transform and tf.ModelAnalysis) which are powered by Apache Beam.
Ballerina is a general purpose language optimized for integration and writing network services and applications. While it looks like Java and other popular languages in some ways, it is very different from those in fundamental ways. This explores how Ballerina is different, why it is different and how those differences give Ballerina an unfair advantage when it comes to writing resilient, performant and secure network services and applications.
This presentation discusses the following topics:
Basic features of R
Exploring R GUI
Data Frames & Lists
Handling Data in R Workspace
Reading Data Sets & Exporting Data from R
Manipulating & Processing Data in R
Evolving a Medical Image Similarity Search by Sujit Pal
Slides for talk at Haystack Conference 2018. Covers evolution of an Image Similarity Search Proof of Concept built to identify similar medical images. Discusses various image vectorizing techniques that were considered in order to convert images into searchable entities, an evaluation strategy to rank these techniques, as well as various indexing strategies to allow searching for similar images at scale.
A seminar in advanced Software Engineering on using models to guide the development process, and QVT to transform one model into another automatically.
An LSTM-Based Neural Network Architecture for Model Transformations by Jordi Cabot
We propose to take advantage of the advances in Artificial Intelligence and, in particular, Long Short-Term Memory Neural Networks (LSTM), to automatically infer model transformations from sets of input-output model pairs.
Natural Language Query to SQL conversion using Machine Learning Approach by Minhazul Arefin
Natural Language Processing is a computer science and artificial intelligence field concerned with computer-human language interactions and, in particular, with how computers are designed to process and explore a variety of natural language data. The Structured Query Language is usually challenging for non-expert users, who may not know the database structure. A new intelligent interface is therefore necessary for database applications to improve the interaction between database and user. The idea of using a natural language instead of a structured query language has led to the creation of natural language interfaces to database systems as a new form of processing. The aim of this research is to build a query-generating process using a machine learning algorithm to represent information according to the user's demands for answering queries and obtaining information. For the conversion of a Natural Language Query into a Structured Query, we use lowercase conversion, removal of escaped words, tokenization, PoS tagging, word similarity, the Jaro-Winkler matching algorithm, and the Naive Bayes method.
The Power of Auto ML and How Does it Work by Ivo Andreev
Automated ML is an approach to minimize the need for data science effort by enabling domain experts to build ML models without deep knowledge of algorithms, mathematics or programming skills. The mechanism works by allowing end-users to simply provide data, and the system automatically does the rest by determining the approach to perform the particular ML task. At first this may sound discouraging to those aiming at the “sexiest job of the 21st century” - the data scientists. However, Auto ML should be considered as the democratization of ML, rather than automatic data science.
In this session we will talk about how Auto ML works, how it is implemented by Microsoft and how it could improve the productivity of even professional data scientists.
How Skroutz S.A. utilizes Deep Learning and Machine Learning techniques to efficiently serve product categorization! Based on my talk at the Athens PyData meetup!
The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization
Distributed Model Validation with Epsilon by Sina Madani
Scalable performance is a major challenge with current model management tools. As the size and complexity of models and model management programs increase and the cost of computing falls, one solution for improving the performance of model management programs is to perform computations on multiple computers. The developed prototype demonstrates a low-overhead data-parallel approach for distributed model validation in the context of an OCL-like language. The approach minimises communication costs by exploiting the deterministic structure of programs and can take advantage of multiple cores on each (heterogeneous) machine with highly configurable computational granularity. Performance evaluation shows linear improvements with more machines and processor cores, being up to 340x faster than the baseline sequential program with 88 computers.
Operationalizing Machine Learning: Serving ML Models by Lightbend
Join O’Reilly author and Lightbend Principal Architect, Boris Lublinsky, as he discusses one of the hottest topics in software engineering today: serving machine learning models.
Typically with machine learning, different groups are responsible for model training and model serving. Data scientists often introduce their own machine-learning tools, causing software engineers to create complementary model-serving frameworks to keep pace. It’s not a very efficient system. In this webinar, Boris demonstrates a more standardized approach to model serving and model scoring:
* How to develop an architecture for serving models in real time as part of input stream processing
* How this approach enables data science teams to update models without restarting existing applications
* Different ways to build this model-scoring solution, using several popular stream processing engines and frameworks
2. Motivation
• Models capture relevant properties of systems
• During their life-cycle, models are the subject of manipulations
• Purposes:
  • managing software evolution,
  • performing analysis,
  • increasing developers’ productivity,
  • reducing human errors,
  • etc.
• These manipulations are usually implemented as model transformations:
  • Model-to-Model
  • Model-to-Text
  • Text-to-Model
3. Motivation
• Creating model transformations remains a challenging task [1]; it requires:
  • a high level of expertise,
  • competences in language engineering, and
  • extensive domain knowledge
• Developers are reluctant to adopt automatic model generators (e.g., from UML to Java):
  • they do not trust them,
  • the generated artifacts look foreign to them,
  • they do not follow the company’s coding style
• AI-based solution:
  • heterogeneous model transformations can be automatically inferred from input-output pairs alone
  • the outputs will comply with the company’s or project’s standards

[1] L. Burgueño, J. Cabot, S. Gérard, “The future of model transformation languages: An open community discussion”
7. Artificial Neural Networks
• Graph structure: neurons + directed weighted connections
• Neurons are mathematical functions that:
  • receive a set of values through their input connections,
  • compute an output value, and
  • transfer the output value to other neurons through their output connections
• Connections have associated weights (i.e., real numbers):
  • adjusted during the learning process to increase/decrease the strength of the connection
8. Artificial Neural Networks
• The learning process basically means finding the right weights
• Supervised learning methods. Training phase:
  • example input-output pairs are used; the dataset is split into training, validation and test subsets
• The training dataset contains most of the input-output pairs and is used to train the ANN
9. Artificial Neural Networks
• The test dataset is used only once the training has finished:
  • it checks the quality of the ANN’s predictions for inputs it has not seen before, and hence measures its accuracy
• The accuracy is calculated as:
  accuracy = (# predictions our model gets right) / (# total predictions)
• Thus, it is a number in the range [0, 1]; the closer to 1, the better
10. Artificial Neural Networks
• The validation dataset plays a similar role to the test dataset, but during the training process:
  • it controls that the learning process is correct and avoids overfitting
  • any accuracy increase over the training dataset should yield an accuracy increase over the validation dataset
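The dataset split and the accuracy formula from the last three slides can be sketched in plain Python. The 80/10/10 proportions and the toy input-output pairs are illustrative assumptions, not values taken from the slides:

```python
import random

def split_dataset(pairs, train=0.8, val=0.1, seed=42):
    """Shuffle input-output pairs and split them into training,
    validation, and test subsets."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_train = int(len(pairs) * train)
    n_val = int(len(pairs) * val)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

def accuracy(predictions, expected):
    """# predictions the model gets right / # total predictions -> [0, 1]."""
    right = sum(p == e for p, e in zip(predictions, expected))
    return right / len(expected)

pairs = [(i, i * 2) for i in range(100)]  # toy input-output pairs
train_set, val_set, test_set = split_dataset(pairs)
print(len(train_set), len(val_set), len(test_set))  # 80 10 10
print(accuracy([2, 4, 7], [2, 4, 6]))  # 2 of 3 correct -> 0.666...
```

The training subset fits the weights, the validation subset monitors for overfitting during training, and the test subset is touched only once at the end.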
11. Architecture
• Encoder-decoder architecture (avoids fixed-size input/output constraints)
+
• Long short-term memory (LSTM) neural networks (longer memory than their predecessors)

[Diagram: Input Model → Encoder LSTM network → Decoder LSTM network → Output Model]
12. Architecture
• Sequence-to-sequence transformations
• Tree-to-tree transformations:
  • an embedding layer embeds the input tree into numeric vectors
+
  • an output layer obtains the output model from the numeric vectors produced by the decoder

[Diagram: Input Model → Input Tree → Embedding Layer → Encoder LSTM network → Decoder LSTM network → Output Tree Layer → Output Model]
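To make the embedding-encoder-decoder pipeline concrete, here is a minimal NumPy sketch of the forward pass: an embedding table feeds an encoder LSTM cell whose final state seeds a decoder LSTM cell. The sizes, the random (untrained) weights, and the start-symbol convention are all illustrative assumptions; a real implementation would use a trained model in a deep learning framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """One step of a plain LSTM recurrence (input, forget, cell, output gates)."""
    def __init__(self, input_size, hidden_size, rng):
        # one stacked weight block for all four gates
        self.W = rng.standard_normal((4 * hidden_size, input_size + hidden_size)) * 0.1
        self.b = np.zeros(4 * hidden_size)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # new hidden state
        return h, c

rng = np.random.default_rng(0)
vocab, emb_size, hidden = 20, 8, 16
embedding = rng.standard_normal((vocab, emb_size)) * 0.1  # embedding layer

encoder = LSTMCell(emb_size, hidden, rng)
decoder = LSTMCell(emb_size, hidden, rng)

# Encoder: read the serialized input tree token by token
input_tokens = [3, 7, 1, 12]
h = c = np.zeros(hidden)
for t in input_tokens:
    h, c = encoder.step(embedding[t], h, c)

# Decoder: seeded with the encoder's final state; one step shown,
# using token 0 as a hypothetical start symbol
h_dec, c_dec = decoder.step(embedding[0], h, c)
print(h_dec.shape)  # (16,)
```

The encoder compresses the whole input sequence into its final (h, c) state, which is exactly what lets the architecture avoid fixed-size input constraints.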
13. Architecture
• Attention mechanism:
  • pays more attention to (remembers better) specific parts
  • helps the decoder recognize the relevant information in the vectorial representation of the input AST at each step

[Diagram: Input Model → Input Tree → Embedding Layer → Encoder LSTM network → Attention Layer → Decoder LSTM network → Output Tree Layer → Output Model]
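A minimal sketch of the attention idea on this slide, assuming simple dot-product scoring (the concrete scoring function is not specified here): the current decoder state is compared against every encoder state, the scores are softmax-normalized into weights, and a weighted context vector is produced for the decoder to use.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # shift for numerical stability
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """Weight each encoder state by its relevance to the current decoder state."""
    scores = encoder_states @ decoder_state  # one score per input position
    weights = softmax(scores)                # probabilities over positions
    context = weights @ encoder_states       # weighted sum of encoder states
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((5, 16))  # 5 input positions, hidden size 16
decoder_state = rng.standard_normal(16)

context, weights = attend(decoder_state, encoder_states)
print(round(weights.sum(), 6))  # 1.0 (the weights form a distribution)
print(context.shape)            # (16,)
```

Positions with higher weights are the parts of the input representation the decoder is "paying attention to" at this step.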
14. Architecture
• Softmax:
  • multi-class classification using a softmax activation function
  • maps each real number to a number in the (0, 1) range, interpreted as a probability
  • in each iteration, the component with the highest probability is selected and its corresponding token becomes part of the output

[Diagram: Input Model → Input Tree → Embedding Layer → Encoder LSTM network → Attention Layer → Decoder LSTM network → Output Tree Layer → Softmax → Output Model]
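The softmax step can be illustrated numerically. The three-token vocabulary and the raw scores below are made-up values standing in for real decoder outputs:

```python
import math

def softmax(scores):
    """Map real-valued scores to probabilities in (0, 1) that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["Table", "Column", "Type"]  # hypothetical output tokens
scores = [2.0, 0.5, -1.0]            # made-up decoder outputs
probs = softmax(scores)

# The component with the highest probability contributes the next output token
chosen = vocab[probs.index(max(probs))]
print(chosen)  # Table
```

Repeating this selection at every decoding step is what assembles the output tree token by token.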
15. Model pre- and post-processing
• Pre- and post-processing are required to…
• represent models as trees
• reduce the size of the training dataset by using a canonical form
• rename variables to avoid the “dictionary problem”
[Diagram: Input Model → Preprocessing → Input Model (preprocessed) → Input Tree Embedding Layer → Encoder (LSTM network) → Attention Layer → Decoder (LSTM network) → Output Tree Layer (Softmax) → Output Model (non-postprocessed) → Postprocessing → Output Model]
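The variable-renaming step can be sketched as follows: identifiers that are not part of the metamodel vocabulary are replaced by positional placeholders, so the network never has to memorize an open-ended dictionary of names. The keyword set and placeholder scheme are illustrative assumptions, not the paper's exact preprocessing:

```python
def canonicalize(tokens, keywords=frozenset({"Class", "Attribute", "Datatype"})):
    """Rename free identifiers to positional placeholders (v0, v1, ...)."""
    mapping, out = {}, []
    for t in tokens:
        if t in keywords:
            out.append(t)                   # metamodel vocabulary stays as-is
        else:
            out.append(mapping.setdefault(t, f"v{len(mapping)}"))
    return out, mapping

tokens, mapping = canonicalize(["Class", "family", "Attribute", "surname", "family"])
print(tokens)  # ['Class', 'v0', 'Attribute', 'v1', 'v0']
```

Postprocessing would invert `mapping` to restore the original names in the generated output.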
16. Hyperparameters
• Parameters that are not learned during training, but adjusted manually
• There is no rule for choosing the best hyperparameters for a specific task
• Choosing the right values has a critical impact on the success and performance of the network
17. Hyperparameters

Hyperparameter       | Description                                                                                                                                              | Value
Epoch                | Number of times the complete training dataset is passed through the neural networks. In each epoch, the training dataset is randomly shuffled and split into batches. | 30
Batch                | Set of input-output pairs. Each time a batch is passed, an iteration is completed.                                                                       | 64
Neural network depth | Number of hidden layers in the neural networks.                                                                                                          | 1
Embedding size       | Size of the vectors. Needs to be higher than the vocabulary size.                                                                                        | 64
Dropout              | Selects which weights are updated and which are not in each iteration. If all weights were adjusted in every iteration, overfitting would be more likely. | 0.75 (probability of a weight being ignored)
Learning rate        | Controls how much the weights are adjusted. Value in the range [0, 1].                                                                                   | 0.005
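A toy sketch of how the learning-rate and dropout values from the table act on a single update. Note this follows the slide's description of dropout as skipping weight updates; standard dropout instead zeroes activations, so treat this strictly as an illustration of the table's values:

```python
import numpy as np

EPOCHS, BATCH_SIZE = 30, 64           # values from the table (not used in this toy step)
LEARNING_RATE, P_DROP = 0.005, 0.75   # P_DROP: probability a weight update is skipped

rng = np.random.default_rng(42)
weights = rng.normal(size=8)
grads = rng.normal(size=8)
w_before = weights.copy()

keep = rng.random(weights.size) >= P_DROP   # on average only ~25% of weights updated
weights -= LEARNING_RATE * grads * keep     # step size controlled by the learning rate
```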
18. Cases and Results
• We illustrate the feasibility and potential of our approach through its application
in two main operations on models:
• Model-to-Model transformation Class 2 Relational
• Code generation UML 2 Java
Results: neural networks are able to faithfully learn how to perform these tasks, as long as enough data is provided and no contradictory examples are given
19. Class 2 Relational
Some of the transformation rules that need to be learned are:
• Each Class is transformed into a Table;
• Each DataType is transformed into a Type;
• Each single-valued Attribute of type DataType is transformed into a Column;
• Each multi-valued Attribute instance of the type DataType is transformed into a Table, …
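For intuition, here is a hypothetical hand-written version of the mapping the network must infer from examples alone; the element encoding (`kind`, `name`, `multivalued` fields) is an illustrative assumption, not the paper's representation:

```python
def transform(element):
    """Toy Class-to-Relational rules, one branch per slide bullet."""
    kind = element["kind"]
    if kind == "Class":
        return {"kind": "Table", "name": element["name"]}
    if kind == "DataType":
        return {"kind": "Type", "name": element["name"]}
    if kind == "Attribute":
        if element["multivalued"]:
            # multi-valued attributes get a table of their own
            return {"kind": "Table", "name": element["name"]}
        return {"kind": "Column", "name": element["name"]}
    raise ValueError(f"no rule for {kind}")

print(transform({"kind": "Class", "name": "Family"}))  # {'kind': 'Table', 'name': 'Family'}
```

The point of the approach is that none of these branches is written by hand: the LSTM learns them from input-output model pairs.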
20. Model representation
[Tree diagram of the example model:
MODEL
├─ OBJ c : Class, ATTS { isAbstract = false, name = family }
├─ OBJ a : Attribute, ATTS { multivalued = false, name = surname }
├─ OBJ dt : Datatype, ATTS { name = String }
├─ ASSOC att (c, a)
└─ ASSOC type (a, dt)]
{
  "source_ast": {
    "root": "<MODEL>",
    "children": [
      {
        "root": "<OBJ>",
        "children": [
          {
            "root": "D",
            "children": [
              …
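The `{"root": ..., "children": [...]}` shape shown above is straightforward to build programmatically. A small sketch that encodes a fragment of the example model; the helper name `node` is ours, not from the paper:

```python
import json

def node(root, children=None):
    """Build one tree node in the slide's {"root": ..., "children": [...]} shape."""
    return {"root": root, "children": children or []}

# Fragment of the example model from the slide, encoded as a tree
source_ast = node("<MODEL>", [
    node("<OBJ>", [
        node("c"), node("Class"),
        node("<ATTS>", [node("isAbstract"), node("false"),
                        node("name"), node("family")]),
    ]),
])
print(json.dumps({"source_ast": source_ast}, indent=1)[:60])
```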
21. Class 2 Relational: Results
• Correctness
• Measured through the accuracy and validation loss
22. Class 2 Relational: Results
• Performance
1. How long does it take for the training phase to complete?
2. How long does it take to transform an input model once the network is trained?
24. Class 2 Relational: Results
• Performance
• Comparison with ATL
• Synthetic models
• Transformation and model from [1]
• ATL: 0.033 seconds
• Our approach: 0.722 seconds
• Although slower, both run in well under a second, a reasonable time
• The advantages of our approach may pay off
[1] AtlanMod (Inria), Class to relational transformation example,
https://www.eclipse.org/atl/atlTransformations/#Class2Relational
25. UML 2 Java: Results
Example translation rules:
• UML classes are transformed into Java classes;
• UML attributes into Java attributes;
• UML associations into Java attributes referencing the class at the other end of the association
Abstraction gap: many variability points in the translation
• E.g., primitive data type conversions
• A UML attribute whose type is Real could be mapped to a Java attribute with type double, Double or …
26. UML 2 Java: Results
• Training dataset
• Creation:
1. Downloaded the source code of the Eclipse IDE
2. Reverse engineered it with MoDisco
3. Removed the low-level details (e.g., method implementations)
4. Obtained the UML models
• The initial dataset (D1) contains 25,375 input-output pairs
27. UML 2 Java: Results
• Correctness
Our definition of accuracy:
• worst-case scenario
• we sometimes discard generated code that is semantically equivalent to the expected output, even if it presents slight syntactic differences
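This worst-case, exact-match notion of accuracy can be sketched in a few lines; the function and its inputs are illustrative:

```python
def exact_match_accuracy(generated, expected):
    """Worst-case accuracy: any syntactic difference counts as a miss,
    even when the generated code is semantically equivalent."""
    hits = sum(g == e for g, e in zip(generated, expected))
    return hits / len(expected)

# "int x=1;" and "int x = 1;" are equivalent Java, but count as a miss here
print(exact_match_accuracy(["int x=1;", "int y;"], ["int x = 1;", "int y;"]))  # 0.5
```

The reported accuracy is therefore a lower bound on the fraction of usable outputs.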
28. UML 2 Java: Results
• Training dataset curation, starting from the initial dataset (D1) of 25,375 input-output pairs:
1. Discard pairs that exceed a size limit (dataset D2)
2. Discard examples with inconsistencies (dataset D3)
• E.g., due to inheritance and the location of getters/setters in the hierarchy
• The curated dataset (D3) contains 8,937 input-output pairs
30. UML 2 Java: Results
• Performance
• Training: [plot omitted]
• Generating code:
• Average time of 18 milliseconds
• Standard deviation of 3 milliseconds
• Efficient enough to be part of any continuous software development process
31. Limitations/Discussion
• Generalization problem
• predicting output solutions for input models very different from the training distribution it has learned from
• Social acceptance
• Size of the training dataset
• Size of the input-output pairs
• Diversity in the training set
• Computational limitations of ANNs
• i.e., mathematical operations
32. Future work
• Study of transformers
• Validation of our approach with MT chains
• Connectors for EMF models
• Pretrained networks for an easy quick start
33. References
[1] Loli Burgueño, Jordi Cabot, Sébastien Gérard. An LSTM-Based Neural Network Architecture for
Model Transformations. In Proc. of MoDELS 2019: 294-299
[2] Loli Burgueño, Jordi Cabot, Shuai Li, Sébastien Gérard. A Generic LSTM Neural Network
Architecture to Infer Heterogeneous Model Transformations. Software and Systems Modeling. 2021.
DOI: 10.1007/s10270-021-00893-y
Editor's Notes
The loss is another metric that monitors the quality of the training. It reports the sum of the errors made over the examples in the dataset, so it should be as close to 0 as possible. A decrease in training loss is expected to be accompanied by a decrease in validation loss.
In our scenario, we may need this long-term memory to remember previous mappings as part of a more complex mapping pattern.
The correctness of ANNs is studied through accuracy and overfitting (the latter measured through the validation loss). The accuracy should be as close to 1 as possible, and the validation loss as close to 0 as possible.
The accuracy is calculated by comparing, for each input model in the test dataset, whether the output of the network corresponds to the expected output. If it does, the network was able to successfully predict the target model for the given input model.
The accuracy grows and the loss decreases with the size of the dataset, i.e., the more input-output pairs we provide for training, the better our software learns and predicts (transforms). In this concrete case, with a dataset of 1,000 models, the accuracy is 1 and the loss 0 (meaning that no overfitting was taking place), which means that the ANNs are perfectly trained and ready to use. Note that we show the size of the complete dataset, but we split it using 80% of the pairs for training, 10% for validation and 10% for testing.
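The 80/10/10 split mentioned in the notes can be sketched as follows (the pairs should be shuffled before splitting; the helper name is ours):

```python
def split_dataset(pairs, train=0.8, val=0.1):
    """Split (pre-shuffled) input-output pairs into train/validation/test."""
    n_train = int(len(pairs) * train)
    n_val = int(len(pairs) * val)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

train, val, test = split_dataset(list(range(1000)))
print(len(train), len(val), len(test))  # 800 100 100
```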