2. AN OPEN SOURCE PLATFORM AS FOUNDATION FOR DH TOOLS
An open source platform to bind them all
The digital humanities pose a major challenge to the social sciences: bringing information technologies into extraction, archiving, automated analysis, corpus qualification, data visualization…
CREATE A UNIFYING DYNAMIC
Too many one-shot projects and high-value innovations leave no consolidated experience behind. One platform to bind them all?
OPEN GOVERNANCE FROM DAY ONE
My Web Intelligence is built around collaborative tools (GitHub, Trello, etc.). These have been public from day one, making all research progress visible.
FOR THE COMMON GOOD
My Web Intelligence is meant to be as widely shared as possible so that intelligence tools benefit everyone (easy to install, well documented, etc.).
COLLABORATIVE FIRST
My Web Intelligence chooses openness and collaboration to answer the challenges posed by new technologies and media.
7. The content manager: the heterogeneous archive management challenge
Enabling the social sciences to study the digital humanities means first offering a platform able to extract and retain huge amounts of expressions from heterogeneous sources.
MASTERING THE EXTRACTION AND ARCHIVING AGENTS (CRAWLERS) AMID BIG DATA
AUTOMATICALLY EXTRACTING CORPORA ON DEMAND
Provide a crawler that accesses heterogeneous sources with enough modularity to meet every user's project.
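The modularity described above can be sketched as a crawler whose link-fetching step is injected as a function, so any source type (HTML, API, files) can plug in. This is an illustrative sketch, not the platform's actual crawler; the function names are assumptions.

```python
from collections import deque
from typing import Callable, List, Set

def crawl(seeds: List[str], fetch_links: Callable[[str], List[str]],
          max_pages: int = 100) -> Set[str]:
    """Breadth-first crawl: visit each URL at most once, following links
    returned by a pluggable fetch_links function (hypothetical name)."""
    frontier = deque(seeds)
    visited: Set[str] = set()
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        if url in visited:
            continue
        visited.add(url)
        for link in fetch_links(url):
            if link not in visited:
                frontier.append(link)
    return visited

# Usage with a stubbed link graph instead of real HTTP requests:
graph = {"a": ["b", "c"], "b": ["c"], "c": []}
print(sorted(crawl(["a"], lambda u: graph.get(u, []))))  # ['a', 'b', 'c']
```

Swapping the stub for an HTTP fetcher, an RSS reader, or a file scanner changes the source without touching the traversal logic.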
GIVE USERS AN INTERFACE TO MANAGE THE CORPUS
Cleaning, deleting, sorting, and rearranging according to one's own heuristics is a must for any DH project.
A COLLABORATIVE MANAGEMENT TOOL FOR DATA STUDIES
The DH challenge cannot be won alone. A platform of this ambition must integrate a team management module with the data processing service.
RECRUITING INTELLIGENT AGENTS
The democratization of machine learning and artificial intelligence now makes it possible to enlist processing algorithms that assist in the mass management of your data.
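As a minimal sketch of such an agent, a relevance scorer can rank documents against a project's vocabulary; a real deployment would use a trained classifier, and the example data here is invented for illustration.

```python
import re
from typing import List

def relevance_score(text: str, project_terms: List[str]) -> float:
    """Fraction of the project vocabulary present in the document:
    a crude stand-in for a trained relevance classifier."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if not project_terms:
        return 0.0
    hits = sum(1 for term in project_terms if term.lower() in words)
    return hits / len(project_terms)

docs = [
    "Vaccine hesitancy debates on social media",
    "Cheap flights and hotel deals this summer",
]
terms = ["vaccine", "hesitancy", "media"]
ranked = sorted(docs, key=lambda d: relevance_score(d, terms), reverse=True)
print(ranked[0])  # the vaccine document ranks first
```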
12. Analysis of content: the challenge of automating qualification
Language processing has made enormous progress, and open solutions now offer opportunities to qualify mass corpora. Our project aims to bring together the foundations of research in this area.
QUALIFY DATA ABOUT COMMUNICATION SITUATIONS AUTOMATICALLY
QUALIFY THE COMMUNICATION SITUATION
Each expression has to be contextualized within a mediated communication situation and needs to be qualified automatically.
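One way to model such a qualified expression is a record carrying its communication context; the field names below are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Expression:
    """A single utterance captured from the web, qualified with its
    mediated communication situation (all field names illustrative)."""
    url: str
    text: str
    author: Optional[str] = None    # who speaks
    media: Optional[str] = None     # blog, forum, press, social network...
    audience: Optional[int] = None  # estimated reach of the channel
    qualifiers: Dict[str, str] = field(default_factory=dict)

    def qualify(self, key: str, value: str) -> None:
        """Attach an automatic annotation, e.g. language or genre."""
        self.qualifiers[key] = value

e = Expression(url="https://example.org/post", text="...")
e.qualify("media_type", "blog")
print(e.qualifiers)  # {'media_type': 'blog'}
```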
ANALYZE THE IMPACT OF DISCURSIVE ACTS
Store impact indicators for all expressions so as to measure not only their influence but also their resonance with the representations of the message's receivers.
ANALYZE THE CONTENT AUTOMATICALLY
Lemmatization of texts, the main objects of expressions, argument trees… Content analysis allows automatic classification of the corpus, serving the detection of collective representations.
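The first step of such content analysis can be sketched as tokenizing and counting terms across a corpus; a real pipeline would lemmatize, whereas here lowercase tokens and a tiny stopword list stand in.

```python
import re
from collections import Counter
from typing import List, Tuple

STOPWORDS = {"the", "of", "a", "and", "to", "in"}

def term_profile(texts: List[str], top: int = 5) -> List[Tuple[str, int]]:
    """Rough content-analysis step: tokenize, drop stopwords, count
    the most frequent terms across the corpus."""
    counts: Counter = Counter()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts.most_common(top)

corpus = ["The crisis of the press", "The press and the crisis of trust"]
print(term_profile(corpus))  # [('crisis', 2), ('press', 2), ('trust', 1)]
```

Such term profiles are the raw material for clustering the corpus into candidate collective representations.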
ANALYZE STYLISTIC FORMS TO IDENTIFY SPEAKER PATTERNS
Style, sentiment, language level, type of vocabulary… style detection enriches speaker patterns so as to better identify their communicative intention.
17. The algorithms of speech: at the source of the positions
The generation of discourse follows more or less stereotyped behaviors. The algorithms that detect these patterns can be used to measure them, but also to predict them…
DETECTING AND MEASURING PATTERNS AT THE SOURCE OF SPEECH TO UNDERSTAND THE GENERATIVE ECONOMY
ANALYZE THE POSITIONS OF SPEAKERS
By qualifying expressions according to the discursive act model, it becomes possible to quantify the production of discourse through multivariate statistical processing (correspondence analysis, PCA, decision trees…).
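The usual input of such multivariate processing is a contingency table crossing speakers with discursive-act categories; a minimal sketch, with invented speaker and category names:

```python
from collections import Counter
from typing import Dict, List, Tuple

def contingency(acts: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Cross-tabulate speakers by discursive-act category. Tables like
    this one feed correspondence analysis or PCA downstream."""
    table: Dict[str, Counter] = {}
    for speaker, category in acts:
        table.setdefault(speaker, Counter())[category] += 1
    return table

acts = [("alice", "claim"), ("alice", "claim"), ("alice", "question"),
        ("bob", "rebuttal")]
table = contingency(acts)
print(table["alice"]["claim"])  # 2
```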
PREDICTING THE PRODUCTION OF SPEECH
Predictive algorithms make it possible not only to qualify incomplete data but also to generate hypotheses about future position-taking by developing forward scenarios.
SOCIAL NETWORK ANALYSIS AS THE SOCIAL CONTEXT OF SPEECH
The structural analysis of networks, applied to discourses through their co-citations, retrieves the frame that binds and socializes enunciators.
SNA AS THE ANALYSIS OF THE COGNITIVE STRUCTURES OF SPEECH
SNA provides a new perspective on the analysis of argumentative co-presence in large corpora by introducing its own notions (centrality, betweenness, etc.).
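The simplest of these SNA measures, normalized degree centrality, can be sketched in a few lines over co-citation ties between enunciators (the edge data below is invented for illustration):

```python
from typing import Dict, List, Tuple

def degree_centrality(edges: List[Tuple[str, str]]) -> Dict[str, float]:
    """Normalized degree centrality: each node's share of the possible
    ties to the other nodes in an undirected graph."""
    neighbors: Dict[str, set] = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n <= 1:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Co-citation ties between four enunciators:
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]
print(degree_centrality(edges)["a"])  # 1.0: 'a' is tied to every other node
```

Betweenness and the other measures mentioned above follow the same pattern on the same graph structure.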
22. Data visualization: the gaze as a source of intelligence?
The data visualization challenge is to provide interpretive schemes for large masses of data in a specific study context. My Web Intelligence explores the relationship between visualization and digital expression.
VIEWING AND INTERPRETING DIGITAL EXPRESSIONS ON THE WEB
NAVIGATING THE CORPUS OF EXPRESSIONS
View and navigate the expressions through dashboards of the act of enunciation (type, media, speakers, audience, etc.).
SORTING AND INDEXING CONTENT
Explore through keyword clouds, dynamic indexes, and other representations of the text that facilitate conceptual analysis.
MAPPING THE SOURCES OF INFORMATION
Mapping the speakers enables contextual navigation of the supporting media by analyzing their relevant relationships as the social context of utterance.
MAPPING COLLECTIVE REPRESENTATIONS
The use of SNA in concept mapping opens the prospect of a new visualization of collective representations, and therefore of the context of knowledge and the episteme of the discourses under study.
27. MY WEB INTELLIGENCE
Architecture, patterns, issues
PROJECT MANAGER: territories and requests
ORACLES: the first list of approved expressions seeding the graph
READER: download and index a document as an expression
CRAWLER: deep crawling of the web
SCRAPER: read heterogeneous files to build an expression
APPROBATION: algorithm for approving linked expressions
QUALIFICATION: enrich expressions and domains with data
RANKING: build KPIs to rank expressions and domains
APIs: bridge to third-party software
EXPORT FILES: CSV, GEXF, and all kinds of models
VISUALIZATION: use visualization libraries to navigate the data (graph, tree, etc.)
Input: absorb the corpus. Annotate: qualify your data. Output: show the patterns.
The My Web Intelligence challenge is to absorb heterogeneous corpora, then archive and index them in an Author-Media-Expression data model.
The point is both to respect the specificity of each medium and, at the same time, to use communication meta-analysis models to analyze both the meaning of speech and the pragmatics of communicative acts.
It is not only about what is said, in a kind of naive sociology, but also about understanding the social dynamics and strategies at work in the production of meaning.
The second part of the development of My Web Intelligence is to build intelligent agents able to play the role of librarians in their functions:
- Analysis of the relevance of a document with respect to the project. One of the major issues in data management is the cleaning of noise and digital debris. Beyond that, relevance to the research questions is the key to the usability of DH platforms.
- Data enrichment (or annotation). The second "librarian function" of each research project is to work on and annotate the data, by algorithms as well as by human agents, from both external and internal sources.
My Web Intelligence is designed as a core framework within the ecosystem of data analysis projects. A major issue is interconnecting the process with third-party applications, both upstream and downstream (inputs/outputs).
Design an input/output API to make third-party management and data processing solutions compatible.
Facilitate the production and export of files compatible with processing in third-party software (e.g. Gephi, SPSS, R, etc.).
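The Gephi export mentioned above can be sketched with the standard library as a minimal GEXF serializer; this is an illustration of the idea, not the platform's actual exporter, and it emits only the bare node/edge subset of the GEXF 1.2 draft.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

def to_gexf(nodes: List[Tuple[str, str]],
            edges: List[Tuple[str, str]]) -> str:
    """Serialize a graph to a minimal GEXF document readable by Gephi."""
    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft",
                      version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")
    xnodes = ET.SubElement(graph, "nodes")
    for node_id, label in nodes:
        ET.SubElement(xnodes, "node", id=node_id, label=label)
    xedges = ET.SubElement(graph, "edges")
    for i, (src, dst) in enumerate(edges):
        ET.SubElement(xedges, "edge", id=str(i), source=src, target=dst)
    return ET.tostring(gexf, encoding="unicode")

xml = to_gexf([("1", "speaker A"), ("2", "speaker B")], [("1", "2")])
print(xml[:40])
```

A CSV export for SPSS or R follows the same pattern with the csv module instead of ElementTree.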
Use data visualization solutions to navigate, process, and analyze large corpora and to identify meaningful patterns.