This is the very first presentation I've given at an international conference. It surely has errors, but it was inspiring work for me anyway.
I still work with visualization :)
A Social Network Based Model for e-Mail Information Visualization - Sunbelt 2007
1. Introduction
Information Contained in E-mail Messages
Visualization Model
Conclusions
Thank You
A Social Network Based Model for e-Mail Information Visualization
Juan Cruz, Fabio González
Intelligent Systems Research Laboratory
National University of Colombia
May, 2007
Outline
1 Introduction
2 Information Contained in E-mail Messages
3 Visualization Model
4 Conclusions
Introduction
E-mail is a growing technology for exchanging information between people and organizations.
E-mail messages contain information that is not visible to their owners.
That information could be extracted and delivered to the user in order to answer questions like "Who knows what?".
Types of Information
The information contained in e-mail messages can be divided into two groups:
Topics treated: information contained in the body and the attached files.
Information about the contacts: the people with whom the messages' owner communicates.
Information Contained in Messages’ Body
The text body of a message contains the ideas that the sender wants other people to know.
In a few cases the text is written in a formal, structured way; it could say "I attach the document that we discussed this morning".
Attached documents are included in this group.
Several types of attachments contain information that is easy to extract:
Word documents
PowerPoint presentations
PDFs
Excel spreadsheets
Text files
Contact Information in E-mail Messages
Every e-mail message has a sender, one or more addressees, a subject and the date when it was created.
Addressees can appear in any of these fields: To, Cc (Carbon Copy), Bcc (Blind Carbon Copy).
From the senders and addressees of all the messages it is possible to build the personal social network of the messages' owner.
In that social network the user can identify the people with whom the user communicates and probably knows, and can also see communications between direct contacts and people whom the user perhaps does not know.
This is possible when the user receives a message from a direct contact that has additional addressees.
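The construction described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `build_contact_network` and the choice of an undirected adjacency map are hypothetical, and the messages are assumed to be available as raw RFC 822 style strings.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import getaddresses


def build_contact_network(raw_messages):
    """Return an undirected graph as {address: set of addresses}.

    An edge links the sender of a message to every addressee found in
    the To, Cc and Bcc headers (hypothetical representation).
    """
    graph = defaultdict(set)
    for raw in raw_messages:
        msg = message_from_string(raw)
        senders = [a.lower() for _, a in getaddresses(msg.get_all("From", []))]
        fields = (msg.get_all("To", []) + msg.get_all("Cc", [])
                  + msg.get_all("Bcc", []))
        addressees = [a.lower() for _, a in getaddresses(fields)]
        for sender in senders:
            for addressee in addressees:
                # Skip empty addresses and self-addressed copies.
                if sender and addressee and addressee != sender:
                    graph[sender].add(addressee)
                    graph[addressee].add(sender)
    return graph
```

For a message sent by a direct contact to the owner with a copy to a third person, the resulting graph links the contact to both, which is how the owner can see people beyond the direct contacts.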
E-mail Information Extraction
E-mail messages can be stored in many ways, and each store can use its own format:
POP3 servers use the mbox format.
Netscape, Mozilla and Thunderbird use the Mork format.
Outlook and Outlook Express use their own proprietary formats.
XML is a structured markup language that can be used to define the structure of an e-mail message.
Thus, the first step is to develop a system capable of taking the e-mail messages from a given store and translating them into a structured language.
With that translation, the messages can be read in a standard way.
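As a rough illustration of this translation step, the sketch below reads an mbox store with Python's standard `mailbox` module and emits XML with `xml.etree.ElementTree`. The element names (`mailbox`, `message`, `body` and the lower-cased header names) are an assumed, hypothetical schema; the slides do not specify one, and other stores (Mork, Outlook) would need their own readers.

```python
import mailbox
import xml.etree.ElementTree as ET


def message_to_xml(msg):
    """Build an XML element for one e-mail message (hypothetical schema)."""
    root = ET.Element("message")
    for header in ("From", "To", "Cc", "Bcc", "Subject", "Date"):
        for value in msg.get_all(header, []):
            ET.SubElement(root, header.lower()).text = str(value)
    # Keep only the plain-text parts of the body; attachments are skipped.
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain" and not part.get_filename():
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            try:
                parts.append(payload.decode(charset, errors="replace"))
            except LookupError:
                # Unknown charset name: fall back to UTF-8.
                parts.append(payload.decode("utf-8", errors="replace"))
    ET.SubElement(root, "body").text = "\n".join(parts)
    return root


def mbox_to_xml(path):
    """Translate every message of an mbox file into one XML tree."""
    root = ET.Element("mailbox")
    for msg in mailbox.mbox(path):
        root.append(message_to_xml(msg))
    return ET.ElementTree(root)
```

Calling `mbox_to_xml("inbox.mbox").write("inbox.xml", encoding="utf-8")` would write the result to disk; both file names are placeholders.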
E-mail Information Extraction System
Ring Force Layout
The messages' owner is the central node of the network.
People who communicate directly with the central user lie in the first ring.
People in the second ring are those who communicate only with people in the first ring.
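The ring assignment above can be sketched as a breadth-first search from the owner, where the hop distance gives the ring index. This is a minimal sketch under assumptions: the function name `ring_layout`, the `ring_gap` parameter and the even angular spacing are hypothetical, and no forces are modeled here.

```python
import math
from collections import deque


def ring_layout(graph, owner, ring_gap=1.0):
    """Place `owner` at the origin and every other node on the ring given
    by its hop distance from the owner. `graph` maps a node to the set of
    its neighbours. Returns (ring index per node, position per node).
    """
    # Breadth-first search yields the ring index of every reachable node.
    ring = {owner: 0}
    queue = deque([owner])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in ring:
                ring[neighbour] = ring[node] + 1
                queue.append(neighbour)

    # Group the nodes by ring and spread each group evenly on its circle.
    by_ring = {}
    for node, index in ring.items():
        by_ring.setdefault(index, []).append(node)

    positions = {owner: (0.0, 0.0)}
    for index, nodes in by_ring.items():
        if index == 0:
            continue
        radius = index * ring_gap
        for slot, node in enumerate(sorted(nodes)):
            angle = 2 * math.pi * slot / len(nodes)
            positions[node] = (radius * math.cos(angle),
                               radius * math.sin(angle))
    return ring, positions
```

Nodes that cannot be reached from the owner are left out of the result in this sketch.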
Force Field Layout
This layout uses a SOM (self-organizing map) as a force field (gravitational or magnetic).
Each SOM centroid acts as an attractor point, so each node of the social network is attracted to the topics that influence it.
The central user (the owner) is influenced, to different degrees, by all the topics treated in the messages.
The positions of the nodes are determined by the influence of the topics, so a node lies near its most treated topic.
Within each topic area (cluster), every node exerts a repulsive force on the others in order to keep the representation clear.
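One possible reading of this layout is sketched below: every centroid pulls a node in proportion to the node's weight for that topic, and nodes whose strongest topic is the same push each other apart. The slides do not give the force equations, so the spring-like attraction, the softened inverse-square repulsion and all parameter names (`attract`, `repel`, `softening`, `steps`) are assumptions made for illustration.

```python
import math


def force_field_layout(centroids, weights, steps=200, attract=0.1,
                       repel=0.01, softening=0.01):
    """centroids: list of (x, y) attractor points, one per topic.
    weights: {node: [w_0, w_1, ...]} with a positive sum per node.
    Returns {node: (x, y)} after `steps` simultaneous updates.
    """
    nodes = list(weights)
    # Normalise the topic weights of each node so that they sum to 1.
    norm = {n: [w / sum(weights[n]) for w in weights[n]] for n in nodes}
    # A node belongs to the area (cluster) of its most treated topic.
    cluster = {n: max(range(len(norm[n])), key=norm[n].__getitem__)
               for n in nodes}

    # Start near the weighted mean of the centroids; the tiny offset keeps
    # nodes with identical weights from starting at exactly the same point.
    pos = {}
    for i, n in enumerate(nodes):
        x = sum(w * cx for w, (cx, _) in zip(norm[n], centroids))
        y = sum(w * cy for w, (_, cy) in zip(norm[n], centroids))
        pos[n] = (x + 1e-3 * math.cos(i), y + 1e-3 * math.sin(i))

    for _ in range(steps):
        updated = {}
        for n in nodes:
            x, y = pos[n]
            fx = fy = 0.0
            # Attraction: each centroid pulls in proportion to the weight.
            for w, (cx, cy) in zip(norm[n], centroids):
                fx += attract * w * (cx - x)
                fy += attract * w * (cy - y)
            # Repulsion: only between nodes that share a topic area.
            for m in nodes:
                if m == n or cluster[m] != cluster[n]:
                    continue
                dx, dy = x - pos[m][0], y - pos[m][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repel / (dist * dist + softening)
                fx += push * dx / dist
                fy += push * dy / dist
            updated[n] = (x + fx, y + fy)
        pos = updated
    return pos
```

With these assumed forces, a node that has no neighbours in its cluster settles at the weighted mean of the centroids, while nodes with the same dominant topic end up spread around it.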
Force Field Layout
Using the SOM, topic areas were identified, so each message belongs to one of those areas.
Each area has a centroid, which acts as an attractor point used later for drawing the contact network.
The experimental setup consists of a random rectangle generator and a random weight vector for each node.
The first experiment was built with 25 randomly located and sized rectangles and a social network with 74 nodes.
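A setup of this kind could be generated as sketched below. The plane size, the side-length limits, the use of the rectangle centre as the centroid and the normalisation of the weight vectors are all assumptions; the slides only state the counts (25 rectangles, 74 nodes), and the function names are hypothetical.

```python
import random


def random_rectangles(count, plane=100.0, max_side=20.0, seed=None):
    """Return `count` rectangles as (x, y, width, height) that fit inside
    a square plane of side `plane`."""
    rng = random.Random(seed)
    rects = []
    for _ in range(count):
        w = rng.uniform(1.0, max_side)
        h = rng.uniform(1.0, max_side)
        x = rng.uniform(0.0, plane - w)
        y = rng.uniform(0.0, plane - h)
        rects.append((x, y, w, h))
    return rects


def centroid(rect):
    """Centre of a rectangle, used here as the centroid of a topic area."""
    x, y, w, h = rect
    return (x + w / 2.0, y + h / 2.0)


def random_weights(node_count, topic_count, seed=None):
    """Return one random weight vector per node, normalised to sum to 1."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(node_count):
        raw = [rng.random() for _ in range(topic_count)]
        total = sum(raw)
        vectors.append([v / total for v in raw])
    return vectors
```

The first experiment would then correspond to `random_rectangles(25)` and `random_weights(74, 25)`.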
Force Field Layout
Twenty-five topics were identified.
A 15x15 SOM was used.
Each topic (area) has a centroid.
Force Field Layout
Each point in the plane has a similarity given by:
$$\mathrm{Sim}(p, C_i) = \exp\left(-\frac{\lVert C_i - p \rVert^2}{\sigma^2}\right)$$
where $p$ is the test point, $C_i$ is the centroid $i$, and $\sigma^2$ defines the radius of each area and is given by:
$$\sigma^2 = \frac{\#\,\text{e-mail messages}}{k}$$
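The similarity can be evaluated numerically as in the short sketch below. It assumes the Euclidean norm in the exponent and treats `k` as a given parameter, since its meaning is not stated on the slide; the function names are hypothetical.

```python
import math


def similarity(p, centroid, sigma2):
    """Sim(p, C_i) = exp(-||C_i - p||^2 / sigma^2) for points given as
    coordinate tuples (Euclidean norm assumed)."""
    squared_distance = sum((c - x) ** 2 for c, x in zip(centroid, p))
    return math.exp(-squared_distance / sigma2)


def sigma_squared(message_count, k):
    """sigma^2 = (number of e-mail messages) / k, with k as on the slide."""
    return message_count / k
```

The similarity equals 1 at the centroid itself and decays towards 0 as the test point moves away, more slowly for larger values of `sigma2`.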
Conclusions
E-mail messages contain a great deal of information that is not evident to users.
Social network information reveals people with whom the user has no direct interaction.
When the contact and topic information are merged, the user can see other people and the topics they discuss.
Thus the user can identify not only distant contacts but also the topics of their conversations.
Thank You
Juan David Cruz Gómez: jdcruzg@unal.edu.co
Fabio González: fagonzalezo@unal.edu.co