Time series analysis of collaborative activities-CRIWG2012

Time series analysis of
collaborative activities
Irene-Angelica Chounta, Nikolaos Avouris
HCI Group, University of Patras
{houren, avouris}@upatras.gr

Outline

• Objective
• Time series and collaborative activities
• Methodology of Analysis
• Results
• Conclusions and future work

Objective

• Use of time series as a tool of analysis

• Real time assessment of activity

• Classification of collaborative sessions

Time series and collaborative activities
• Time: important aspect of collaboration

• Analysis regarding time can
describe/reveal underlying group
dynamics

• Phenomena that may affect the quality of
collaboration can be captured in this way
(Vasileiadou, E., 2009)

Methodology of Analysis (1)
Memory-based learning model

Collaborative session X is compared with each assessed session in the pool:

  ts_A / CQA_A  ->  Distance(X, A)
  ts_B / CQA_B  ->  Distance(X, B)
  …
  ts_N / CQA_N  ->  Distance(X, N)

IF (Distance(X, Y) is minimum) then {
  CQA_X ≈ CQA_Y
}
where CQA: Collaboration Quality Assessment
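
The lookup described on this slide can be sketched as a one-nearest-neighbour search. This is only an illustration under stated assumptions, not the authors' implementation: the names `predict_cqa` and `toy_distance`, the layout of the pool as (series, CQA) pairs, and the toy values are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

Series = Sequence[float]


def predict_cqa(
    query: Series,
    pool: Sequence[Tuple[Series, float]],
    distance: Callable[[Series, Series], float],
) -> float:
    """Return the CQA of the pool session whose series is closest to `query`."""
    if not pool:
        raise ValueError("the pool of assessed sessions is empty")
    # Pick the stored (series, cqa) pair with the minimum distance to the query.
    _, nearest_cqa = min(pool, key=lambda item: distance(query, item[0]))
    return nearest_cqa


def toy_distance(a: Series, b: Series) -> float:
    """Placeholder distance (sum of absolute differences), for illustration only."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical pool: three assessed sessions with made-up series and CQA values.
pool = [([3, 5, 4], 1.2), ([0, 1, 0], -1.5), ([2, 2, 2], 0.3)]
predicted = predict_cqa([3, 4, 4], pool, toy_distance)
```

Any distance function with the same signature can be passed in place of `toy_distance`.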

• A data pool of 212 collaborative sessions
(collaboration quality assessed by a rating
scheme) (Kahrimanis, G., et al., 2009)
• Groupware application: shared workspace +
chat tool
– Task: dyads constructing flow charts
– Duration: 1h30’
• The same conditions applied to all
clients/collaborators

• time series (multivariate) of aggregated
sequences of events of collaborative activities per
time interval
– Number of chat messages and workspace actions
– Role alternations in chat and workspace activity
– Their differences between consecutive time intervals
• Various time intervals (1, 5, 8 and 10 minutes)
• distance measure: Dynamic Time Warping (DTW)
distance (Giorgino, T., 2009)
• two dissimilarity functions (Euclidean and
Manhattan)
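
One way to build such a multivariate series from an event log is sketched below. The event format `(time_s, actor, kind)`, the function name `build_series`, and above all the definition of a role alternation (the actor of an event differs from the actor of the previous event of the same kind) are assumptions made for illustration; the slides do not give the exact definitions.

```python
def build_series(events, interval_s, duration_s):
    """Aggregate (time_s, actor, kind) events into one feature row per interval.

    Each row is [chat_count, workspace_count, chat_alternations,
    workspace_alternations]. Any kind other than "chat" is counted as a
    workspace action. The alternation rule is an assumption (see lead-in).
    """
    n_bins = -(-duration_s // interval_s)  # ceiling division
    rows = [[0, 0, 0, 0] for _ in range(n_bins)]
    last_actor = {}  # most recent actor seen for each kind of event
    for time_s, actor, kind in sorted(events):
        col = 0 if kind == "chat" else 1
        bin_idx = min(int(time_s // interval_s), n_bins - 1)
        rows[bin_idx][col] += 1
        if kind in last_actor and last_actor[kind] != actor:
            rows[bin_idx][col + 2] += 1
        last_actor[kind] = actor
    return rows


# Hypothetical two-minute log, aggregated in 1-minute intervals.
events = [
    (10, "A", "chat"),
    (30, "B", "chat"),
    (50, "A", "workspace"),
    (70, "A", "chat"),
    (110, "B", "workspace"),
]
series = build_series(events, interval_s=60, duration_s=120)
```

Changing `interval_s` to 300, 480 or 600 would give the 5, 8 and 10 minute variants.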

Results (1)
Model evaluation:

• the correlation matrix of CQA (predicted vs.
true value)
• the root mean squared error (RMSE)
• the mean absolute error (MAE)

Results (2)
• The two variables (predicted vs. real CQA
value) are significantly and positively
correlated (p<0.05, Rho>0) for all time
intervals
                      Manhattan                   Euclidean
Time interval (min)   p value   Spearman’s Rho    p value   Spearman’s Rho
1                     0.000     0.296             0.029     0.150
5                     0.002     0.202             0.021     0.154
8                     0.000     0.235             0.005     0.187
10                    0.011     0.168             0.010     0.170
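
Spearman's Rho is the Pearson correlation of the ranks of the two variables. A minimal standard-library sketch follows; the function names and the toy values are illustrative and are not taken from the study (significance testing is not covered).

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: two swapped pairs lower Rho from 1.0 to 0.8.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```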

Results (3)
• MAE and RMSE (CQA ∈ [-2, 2])
                      MAE                     RMSE
Time interval (min)   Manhattan   Euclidean   Manhattan   Euclidean
1                     0.89*       0.97        1.14        1.21
5                     1.19        1.21        1.48        1.5
8                     1.18        1.16        1.5         1.48
10                    1.17        1.19        1.44        1.47

Results (4)
For time interval = 1 minute and Manhattan
distance (CQA ∈ [-2, 2]):

|CQA_eval - CQA_pred|   % cases
< 0.5                   41
< 1                     68.4
< 2                     92
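
The percentages on this slide are shares of sessions whose absolute prediction error falls below a threshold. A minimal sketch of that computation, with a hypothetical function name and made-up values (not the study's data):

```python
def share_below(true_values, predicted_values, thresholds):
    """Percentage of cases whose absolute error is strictly below each threshold."""
    errors = [abs(t - p) for t, p in zip(true_values, predicted_values)]
    return {
        th: 100.0 * sum(e < th for e in errors) / len(errors)
        for th in thresholds
    }


# Made-up values, for illustration only.
true_cqa = [1.0, -0.5, 0.0, 2.0]
pred_cqa = [0.8, 0.2, 1.5, 0.0]
shares = share_below(true_cqa, pred_cqa, [0.5, 1, 2])
```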

Conclusions & Future Work
• Significant positive correlations between the
evaluated and the predicted CQA values
• Best results occur for the 1-minute time interval
and Manhattan distance
(Rho: 0.30, MAE: 0.89, RMSE: 1.14, CQA ∈ [-2, 2])

• Advanced classification techniques (k-nearest
neighbor) are expected to improve the results
• Further explore real-time assessment and
how feedback affects the way collaboration unfolds

Thank you

…Questions are welcome!

Euclidean vs. Manhattan
• The best distance measure depends heavily on
the nature of the data
• Euclidean distance performs poorly on
high-dimensional data
Euclidean: d(x, y) = sqrt( Σ_i (x_i - y_i)^2 )
Manhattan: d(x, y) = Σ_i |x_i - y_i|
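
For reference, the two dissimilarity functions in their standard textbook form, as a small sketch (the function names are illustrative):

```python
import math


def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


x, y = [3, 0, 4], [0, 0, 0]
d_euclidean = euclidean(x, y)  # 5.0
d_manhattan = manhattan(x, y)  # 7
```

The Manhattan distance is never smaller than the Euclidean distance for the same pair of points.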

Dynamic Time Warping

• Popular technique for comparing time series
• The series are "warped" non-linearly in the
time dimension in order to find best match
• Provides a distance measure that can be further
used for classification
• Applies to both univariate and multivariate
time series
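
A minimal sketch of the basic textbook DTW recursion is given below. It uses the simplest step pattern (match, insertion, deletion with equal weights) and a Manhattan local cost by default; these choices and the names are assumptions for illustration and may differ from the configuration used in the study or from the defaults of any particular DTW library.

```python
import math


def manhattan(x, y):
    """Local cost between two observation vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def dtw_distance(a, b, local_cost=manhattan):
    """DTW distance between two series of observation vectors.

    acc[i][j] holds the cost of the cheapest warping path that aligns
    the first i observations of `a` with the first j observations of `b`.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = local_cost(a[i - 1], b[j - 1])
            acc[i][j] = cost + min(
                acc[i - 1][j],      # a advances, b stays
                acc[i][j - 1],      # b advances, a stays
                acc[i - 1][j - 1],  # both advance
            )
    return acc[n][m]


# Two multivariate series with the same shape but different pacing.
slow = [[0, 0], [0, 0], [1, 1], [2, 2]]
fast = [[0, 0], [1, 1], [2, 2]]
same_shape = dtw_distance(fast, slow)
```

Because the repeated first observation of `slow` can be absorbed by warping, the two series end up at distance zero, which a point-by-point comparison would not allow.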

Rating Scheme
• provides quantitative judgments of the quality
of collaboration
• proposes the rating of seven collaborative
dimensions on a 5-point scale
• Collaboration Quality Average (CQA) is defined
as the average value of six dimensions
(Collaboration Flow, Sustaining Mutual Understanding,
Knowledge Exchange, Argumentation, Structuring
Problem Solving Process, Cooperative Orientation)
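
As a small illustration, the CQA of one session can be computed as the mean of the six listed dimension ratings. The dictionary keys, the function name and the example ratings are hypothetical; the scale from -2 to 2 follows the CQA range reported in the results.

```python
CQA_DIMENSIONS = (
    "collaboration_flow",
    "sustaining_mutual_understanding",
    "knowledge_exchange",
    "argumentation",
    "structuring_problem_solving_process",
    "cooperative_orientation",
)


def collaboration_quality_average(ratings):
    """Mean of the six CQA dimension ratings; any other keys are ignored."""
    missing = [d for d in CQA_DIMENSIONS if d not in ratings]
    if missing:
        raise KeyError("missing ratings for: " + ", ".join(missing))
    return sum(ratings[d] for d in CQA_DIMENSIONS) / len(CQA_DIMENSIONS)


# Made-up ratings for one session, each between -2 and 2.
example_ratings = {
    "collaboration_flow": 1,
    "sustaining_mutual_understanding": 1,
    "knowledge_exchange": 0,
    "argumentation": 2,
    "structuring_problem_solving_process": -1,
    "cooperative_orientation": 0,
}
cqa = collaboration_quality_average(example_ratings)
```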

Time series
• Time series:
any sequence of observations recorded at
successive time intervals
(univariate, multivariate)

• Examples of use:
– Network traffic monitored by a web server per hour
– Shares’ price in a stock market per week
– Gene activity in biological processes

RMSE, MAE
• MAE: all the individual differences are
weighted equally in the average.
• RMSE: the RMSE gives a relatively high weight
to large errors.
• The MAE and the RMSE can be used together
to diagnose the variation in the errors in a set
of forecasts.
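
The difference between the two measures shows up when the same total error is spread evenly or concentrated in one case. A minimal sketch with made-up values:

```python
import math


def mae(true_values, predicted_values):
    """Mean absolute error: every difference has the same weight."""
    pairs = list(zip(true_values, predicted_values))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def rmse(true_values, predicted_values):
    """Root mean squared error: squaring gives large differences more weight."""
    pairs = list(zip(true_values, predicted_values))
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


truth = [0, 0, 0, 0]
even_errors = [1, 1, 1, 1]    # four errors of size 1
uneven_errors = [0, 0, 0, 4]  # one error of size 4

mae_even, rmse_even = mae(truth, even_errors), rmse(truth, even_errors)
mae_uneven, rmse_uneven = mae(truth, uneven_errors), rmse(truth, uneven_errors)
```

Both sets of predictions have an MAE of 1, but the RMSE rises from 1 to 2 when the error is concentrated in a single case, so a large gap between the two points to a few large errors.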

Model evaluation
Best MAE=0.89 where:
– a previous post-assessment with machine learning
techniques scored MAE = 0.74
– MAE < 1 is considered acceptable for similar applications
(Kahrimanis, 2010)

– Simplicity of the model
– Real time results

Differences

Chat messages: a_1, a_2, a_3, …, a_(N-1), a_N

Differences of chat messages: a_2 - a_1, a_3 - a_2, …, a_N - a_(N-1)
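
The differencing step can be sketched in one line of Python; the function name and the counts are made up for illustration.

```python
def consecutive_differences(series):
    """Return the N-1 differences between consecutive values of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


chat_messages = [4, 7, 5, 5, 9]  # hypothetical counts per time interval
chat_differences = consecutive_differences(chat_messages)
```

A series of N values yields N-1 differences, so the differenced series is one interval shorter than the original.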
