1. Using Semantic Annotation of Web Services for Analyzing Information Diffusion in Web Services Networks
Shahab Mokarizadeh, Royal Institute of Technology (KTH), Sweden
Peep Küngas, University of Tartu (UT), Estonia
Mihhail Matskin, Royal Institute of Technology (KTH), Sweden
Marco Crasso, Marcelo Campo, Alejandro Zunino, UNICEN University, Argentina
Contact: shahabm@kth.se
2. Outline
Background of Information Flow Analysis
Roadmap and Computational Model
Web service Annotation
Web service Categorization
Experimental Results
Discussion & Conclusion
3. Background – Information Diffusion
Information Diffusion: the communication of knowledge over
time among members of a social system
It reveals intrinsic properties of real-world phenomena.
Already studied in contexts such as the biosphere, microblogs,
publication citations, … wherever a network structure is present.
4. Information Diffusion among Web service Domains
Observation: Services published in the Web form a conceptual
ecology of knowledge where information is shared and flows
along input and output parameters of service operations.
Case study: How have Web services in different commodities
been designed from an information exchange perspective?
Introducing value-add Web services
Web service adoption spots
5. Roadmap
1- Semantically annotate Web services
2- Assign Web services to their respective categories
3- Construct the Web service network
4- Compute the information flow matrix
5- Analyze the matrix
6. 1-Web service Annotation
-Only the basic elements of the input and output parameters of
Web service operations are semantically annotated
-SAWSDL annotation model
-We use our semi-automated ontology learning method, which
relies on lexico-syntactic patterns:
“Ontology Learning for Cost-Effective Large-Scale Semantic Annotation
of Web Service Interfaces”, EKAW 2010, pp. 401-410
Image from: Web Services and Security, 1/17/2006, Marco Cova
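As a minimal sketch of what reading SAWSDL annotations can look like: the `sawsdl:modelReference` attribute and its namespace come from the SAWSDL recommendation, while the schema fragment, element names, and ontology URIs below are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SAWSDL_NS = "http://www.w3.org/ns/sawsdl"

# Hypothetical schema fragment; element names and ontology URIs are made up.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="companyName" type="xs:string"
              sawsdl:modelReference="http://example.org/onto#CompanyName"/>
  <xs:element name="taxDebt" type="xs:decimal"
              sawsdl:modelReference="http://example.org/onto#TaxDebt http://example.org/onto#Amount"/>
  <xs:element name="requestId" type="xs:string"/>
</xs:schema>
"""

def model_references(schema_text):
    """Map each annotated xs:element name to its list of modelReference URIs."""
    root = ET.fromstring(schema_text)
    refs = {}
    for elem in root.iter(f"{{{XSD_NS}}}element"):
        value = elem.get(f"{{{SAWSDL_NS}}}modelReference")
        if value:
            # SAWSDL allows several whitespace-separated URIs in one attribute.
            refs[elem.get("name")] = value.split()
    return refs

annotations = model_references(SCHEMA)
```

Elements without a `modelReference` (here `requestId`) are simply left out of the result, so only annotated basic elements would feed the later steps.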
7. Tax and Customs Board service
Output message content fragment
11. 2-Web service Categorization
A category (a.k.a. commodity) describes the general kind of service
that is provided, for example “B2B”, “Health”, “E-Commerce”, etc.
Each Web service can belong to multiple categories.
Standard software taxonomy, e.g. UNSPSC: http://www.unspsc.org/
We use the classifier from: “AWSC: An approach to Web Service classification
based on machine learning techniques”, Inteligencia Artificial, ISSN 1137-3601, vol.
12, no. 37, pp. 25-36, Asociación Española para la Inteligencia Artificial, Valencia, España,
2008.
UNSPSC example categories:
-Instant messaging
-Calendar and scheduling
-Adventure games
-Mobile operator specific
-Internet directory services
-Medical software
-Music or sound editing
-Video conferencing software
12. 3-Web service Network Construction
1- Present annotated Web services as bipartite (2-mode) graph
2- Create Semantic Network (1-mode graph)
3- Create Weighted Category Network using Semantic network
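The first two steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that the bipartite graph links semantic concepts to the operations that consume or produce them, and that the 1-mode semantic network gets an edge from concept u to concept v whenever some operation takes u as input and returns v. The services, operations, and concepts are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical annotated operations:
# (service, operation) -> (input concepts, output concepts)
OPERATIONS = {
    ("GeoService", "geocode"): ({"Address"}, {"Latitude", "Longitude"}),
    ("WeatherService", "forecast"): ({"Latitude", "Longitude"}, {"Temperature"}),
    ("MapService", "locate"): ({"Address"}, {"Latitude"}),
}

def bipartite_edges(operations):
    """2-mode graph: directed edges between concept nodes and operation nodes."""
    edges = []
    for op, (inputs, outputs) in operations.items():
        edges += [(c, op) for c in sorted(inputs)]   # concept -> operation
        edges += [(op, c) for c in sorted(outputs)]  # operation -> concept
    return edges

def semantic_network(operations):
    """1-mode projection: edge (u, v) counts operations that take u and return v."""
    counts = Counter()
    for inputs, outputs in operations.values():
        counts.update(product(inputs, outputs))
    return counts

two_mode = bipartite_edges(OPERATIONS)
one_mode = semantic_network(OPERATIONS)
```

In this toy data the edge `("Address", "Latitude")` has count 2 because two operations map an address to a latitude.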
15. Network Transformation
Semantic Network → Category Network
Propagate the categories to semantic nodes.
$D_s$, $D_t$: category nodes; $C_u$: semantic node
$q_k$: weight of the node in category $k$
$$Q_u = (q_1, \ldots, q_k, \ldots, q_n)$$
$$q_s = \frac{\text{frequency of } C_u \text{ in } D_s}{\sum_{i=1}^{n} \text{frequency of } C_u \text{ in } D_i}$$
Label each category edge with weights:
$$\omega_{u,v}(D_s, D_t) = q_{u,s} \cdot q_{v,t}$$
$$W(D_s, D_t) = \sum_{edge(u,v)} \omega_{u,v}(D_s, D_t)$$
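A small sketch of this transformation, assuming the node weight of a concept in a category is its relative frequency in that category, and that the weight of a category edge sums the products of source and target node weights over all semantic-network edges. The frequencies, concept names, and category names are hypothetical.

```python
from collections import defaultdict

# Hypothetical counts: how often each semantic concept occurs in each category.
FREQ = {
    "Address": {"Maps": 3, "Weather": 1},
    "Latitude": {"Maps": 2, "Weather": 2},
    "Temperature": {"Weather": 4},
}
# Hypothetical semantic-network edges (input concept -> output concept).
EDGES = [("Address", "Latitude"), ("Latitude", "Temperature")]

def membership(freq):
    """q[u][s]: share of concept u's occurrences that fall in category s."""
    return {u: {s: n / sum(by_cat.values()) for s, n in by_cat.items()}
            for u, by_cat in freq.items()}

def category_weights(edges, q):
    """W[(s, t)]: sum over edges (u, v) of q[u][s] * q[v][t]."""
    W = defaultdict(float)
    for u, v in edges:
        for s, q_us in q[u].items():
            for t, q_vt in q[v].items():
                W[(s, t)] += q_us * q_vt
    return dict(W)

q = membership(FREQ)
W = category_weights(EDGES, q)
```

Because each concept's weights sum to 1 across categories, every semantic edge contributes a total weight of exactly 1, so the weights in `W` sum to the number of edges.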
16. 4-Normalizing Weights (Z-score)
Edge category weight $W(D_i, D_j)$: $W_{i,j}$
Sum of the weights of all links from category $i$:
$$W_{i*} = \sum_{j} W(D_i, D_j)$$
Sum of the weights of all links to category $j$:
$$W_{*j} = \sum_{i} W(D_i, D_j)$$
Sum of the weights of all categories:
$$W = \sum_{i,j} W(D_i, D_j)$$
Expected weight from category $i$ to category $j$:
$$\frac{W_{i*} W_{*j}}{W}$$
Normalize category weights (Z-score):
$$\Phi_{i,j} = \left( W_{i,j} - \frac{W_{i*} W_{*j}}{W} \right) \Big/ \sqrt{\frac{W_{i*} W_{*j}}{W}}$$
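The normalization can be sketched as below. It assumes the usual z-score form, that is, the observed weight minus the expected weight, divided by the square root of the expected weight. The category names and weights are hypothetical.

```python
import math

def z_scores(W, categories):
    """Return phi[(i, j)] = (W_ij - E_ij) / sqrt(E_ij), with E_ij = W_i* W_*j / W."""
    out_sum = {i: sum(W.get((i, j), 0.0) for j in categories) for i in categories}
    in_sum = {j: sum(W.get((i, j), 0.0) for i in categories) for j in categories}
    total = sum(out_sum.values())
    phi = {}
    for i in categories:
        for j in categories:
            expected = out_sum[i] * in_sum[j] / total
            if expected > 0:  # pairs with no expected flow are left out
                phi[(i, j)] = (W.get((i, j), 0.0) - expected) / math.sqrt(expected)
    return phi

# Hypothetical category edge weights.
W = {("A", "A"): 4.0, ("A", "B"): 1.0, ("B", "A"): 1.0, ("B", "B"): 4.0}
phi = z_scores(W, ["A", "B"])
```

Here every pair has the same expected weight (2.5), so the within-category pairs come out positive and the cross-category pairs negative.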
17. Matrix of Information flow
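Matrix of information flow between pairs of categories:
$$\Phi = \begin{pmatrix}
\Phi_{1,1} & \cdots & \Phi_{1,j} & \cdots & \Phi_{1,n} \\
\vdots & & \vdots & & \vdots \\
\Phi_{i,1} & \cdots & \Phi_{i,j} & \cdots & \Phi_{i,n} \\
\vdots & & \vdots & & \vdots \\
\Phi_{n,1} & \cdots & \Phi_{n,j} & \cdots & \Phi_{n,n}
\end{pmatrix}$$
A high proximity ($\Phi_{i,j}$) between categories i and j reveals a strong
tendency for semantic concepts associated with category j to result
from invoking services that take semantic concepts associated with
category i as input.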
21. Information Exchange Patterns - 1:
Self-Referential Pattern: A category mainly provides inputs
for its own services and consumes mostly the information
provided by itself (i.e. it is self-contained).
Appears on the main diagonal of the matrix
Categories: Financial Analysis Software, Web Platform Development
Software, Map Creation Software, Video Conferencing Software and
Accounting Software
The APIs exposed by these Web services frequently use
domain-specific concepts as input and output elements
22. Information Exchange Patterns - 2:
Outside the main diagonal:
-Foreign Language category, Presentation category
-Financial Analysis category, Enterprise Resource Planning category
Least volume of information flow:
-Video Conferencing software and Financial Analysis software
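A minimal sketch of how such patterns might be read off a normalized flow matrix. The matrix values are hypothetical, and the self-referential test used here (the diagonal entry is the largest value in both its row and its column) is an assumed criterion for illustration, not one stated in the slides.

```python
CATEGORIES = ["Financial Analysis", "Foreign Language", "Presentation"]
# Hypothetical normalized flow matrix; PHI[i][j] is the flow from category i to j.
PHI = [
    [5.0, -1.0, -0.5],
    [-0.8, 0.2, 3.1],
    [-0.3, 0.4, 1.0],
]

def self_referential(phi):
    """Indices i whose diagonal entry is the largest value in row i and column i."""
    n = len(phi)
    result = []
    for i in range(n):
        row_max = max(phi[i])
        col_max = max(phi[k][i] for k in range(n))
        if phi[i][i] >= row_max and phi[i][i] >= col_max:
            result.append(i)
    return result

def off_diagonal_extremes(phi):
    """Return the (i, j) pairs with the strongest and weakest flow, i != j."""
    n = len(phi)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    strongest = max(pairs, key=lambda p: phi[p[0]][p[1]])
    weakest = min(pairs, key=lambda p: phi[p[0]][p[1]])
    return strongest, weakest

diagonal = [CATEGORIES[i] for i in self_referential(PHI)]
strongest, weakest = off_diagonal_extremes(PHI)
```

With these made-up numbers, only the first category passes the diagonal test, and the strongest off-diagonal flow runs from the second category to the third.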
23. Threats to Validity
The presented model relies heavily on the accuracy of the underlying
semantic annotation and matching scheme.
The examined Web services account for only a small proportion
of those existing on the Web.
The collection of Web service interface descriptions may also
suffer from an unintentional bias toward some specific
categories.
In the absence of a timing factor, our analysis is a rather static
analysis of information flow.
24. Conclusion and Future Work
The presented approach can discover information exchange
patterns.
In general, our approach is applicable to any other kind of
machine-understandable API, not just WSDL.
Future work:
To examine how the presence of service composition or mashups
influences the information exchange patterns
To recommend value-add Web services based on the identified
information exchange patterns and Web service network
properties
26. Partial Category Weight for Edge $(D_s, D_t)$: $\omega_{u,v}(D_s, D_t) = q_{u,s} \cdot q_{v,t}$
Augmented Category Weight for Edge $(D_s, D_t)$: $W(D_s, D_t) = \sum_{edge(u,v)} \omega_{u,v}(D_s, D_t)$
27. Ontology Learning for Information Elicitation
Web service Annotation [1]
Ontology Learning Input:
- Message part names of input/output parameters
- XML Schema leaf element names of complex types
Ontology learning steps (as labeled in the diagram):
- Term Extraction
- Syntactic Refinement
- Ontology Discovery
- Pattern-based Semantic Analysis
- Term Disambiguation
- Class and Relation Determination
- Ontology Organization
- Adding Relations
Output: Reference Ontology
[1] “Ontology Learning for Cost-Effective Large-scale Semantic
Annotation of XML Schemas and Web Service Interfaces”, in Proc.
EKAW 2010, LNAI 6317, pp. 401-410, 2010
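To illustrate the kind of input the term extraction step works on, here is a rough sketch that splits parameter or element names into lowercase terms. The splitting rules (camelCase boundaries, underscores, hyphens, dots, digits) and the example names are assumptions made for illustration; the published method may tokenize differently.

```python
import re

def extract_terms(name):
    """Split an identifier such as 'getCompanyTaxDebt' into lowercase terms."""
    terms = []
    # First split on explicit separators, then on case and digit boundaries.
    for part in re.split(r"[_\-.\s]+", name):
        terms += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [t.lower() for t in terms]

# Hypothetical message part and schema element names.
examples = {n: extract_terms(n) for n in ["getCompanyTaxDebt", "tax_debt_ID", "XMLSchema"]}
```

Runs of capitals are kept together as acronyms, so `XMLSchema` yields `xml` and `schema` rather than single letters.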