Using R Shiny for Data Visualization "Research" and Explainability

Jaeseong Yoo

Overview
"Visualization" as a research field
How far can R Shiny be useful in visualization research?
Visualization for model explainability
Tips for using R Shiny in the visualization research process

People Who Do Visualization Research

Visualization-Related Journals
• IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis)
• Journal of Computational and Graphical Statistics
• EuroVA
• SIAM Journal on Scientific Computing (SISC)
• ACM Conference on Human Factors in Computing Systems (CHI)
• The R Journal
• The ASA Data Science Journal
• IEEE Access
• Journal of Open Source Software
• ...

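Conditions for a Beautiful Visualization [2]
• Novelty: it must offer a form that sparks the reader's interest or gives a fresh understanding.
• Informativeness: it must create points of access to the information so that the reader can gain knowledge.
• Efficiency: access to the information should be as simple as possible, although a more complex presentation is justified when it is relevant to the content.
• Aesthetics: beyond making the result visually attractive, the graphical components [1] must be used appropriately to guide the reader, communicate meaning, show relationships, and highlight conclusions.
[1] Axes, layout, shape, color, lines, typography, etc.
[2] Noah Iliinsky (2010), "Beautiful Visualization: Looking at Data through the Eyes of Experts", O'Reilly Media.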

Classification of Visualization "Research"

Figure: [Left] Higher-order orbital mixture of a hydrogen atom [3]
[Right] Jaegul Choo's Identity
[3] Noah Iliinsky (2010), "Beautiful Visualization: Looking at Data through the Eyes of Experts", O'Reilly Media.

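The Visualization Process
Figure: The visualization process [4]
[4] 신희숙, 임정묵, 박준석 (2013), "Information Visualization Technology and Information Representation Technology for the Visually Impaired" (in Korean), 전자통신동향분석 (Electronics and Telecommunications Trends), 28(1).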

The Visualization Process
Figure: Information organization stage [5]
[5] Image source: https://blog.naver.com/samsjang/220693792607

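The Visualization Process
Category                      Main visualization methods
Time visualization            Bar chart, stacked bar chart, dot plot, ...
Distribution visualization    Pie chart, donut chart, treemap, stacked area chart, ...
Relationship visualization    Scatter plot, bubble chart, histogram, ...
Comparison visualization      Heatmap, star chart, parallel coordinates, multidimensional scaling, ...
Spatial visualization         Map-based plotting, multidimensional scaling, ...
...                           ...
Table: Information visualization stage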
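
As a rough illustration of the table above, the following base R sketch draws one chart for four of the categories, using datasets that ship with R. The datasets (Nile, mtcars) and the particular chart chosen per category are assumptions made for illustration and are not taken from the slides.

```r
# One base R chart per category; Nile and mtcars are built-in datasets.
op <- par(mfrow = c(2, 2))

# Time: bar chart of the first 12 years of the annual Nile flow series
nile12 <- window(Nile, start = 1871, end = 1882)
barplot(as.numeric(nile12), names.arg = 1871:1882, las = 2,
        main = "Time: bar chart")

# Distribution: pie chart of how many cars have 4, 6, or 8 cylinders
cyl_counts <- table(mtcars$cyl)
pie(cyl_counts, main = "Distribution: pie chart")

# Relationship: scatter plot of weight against fuel economy
plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg",
     main = "Relationship: scatter plot")

# Comparison: classical multidimensional scaling of the standardized features
mds <- cmdscale(dist(scale(mtcars)), k = 2)
plot(mds, type = "n", xlab = "MDS 1", ylab = "MDS 2",
     main = "Comparison: MDS")
text(mds, labels = rownames(mtcars), cex = 0.6)

par(op)
```

Spatial visualization is left out because map plotting normally relies on add-on packages such as leaflet.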

The Visualization Process
Figure: Interaction stage [6]
[6] Image source: https://www.dotmatics.com

• Examples of R Shiny being used in visualization-related journals [7]
[7] Cook, Dianne, Andreas Buja, and Deborah F. Swayne (2007), "Interactive High-Dimensional Data Visualization." Journal of Computational and Graphical Statistics, December, 1–23.

[8] Hafen, R., L. Gosink, J. McDermott, K. Rodland, K. K. V. Dam, and W. S. Cleveland (2013), "Trelliscope: A System for Detailed Visualization in the Deep Analysis of Large Complex Data." 2013 IEEE Symposium on Large-Scale Data Analysis and Visualization (LDAV), 105–12.

[9] Moritz, Dominik, Bill Howe, and Jeffrey Heer (2019), "Falcon: Balancing Interactive Latency and Resolution Sensitivity for Scalable Linked Visualizations." In ACM Human Factors in Computing Systems (CHI).

How Far Can R Shiny Be Useful in Visualization Research? [10]
[10] Image source: https://www.pinterest.se/wired/

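• Question: "Can I build my own original visualization with Shiny?"
Figure: Jacques Bertin's visual components [11] [12]
[11] Jacques Bertin (1967), "Sémiologie Graphique", l'École des Hautes Études en Sciences
[12] Image source: 장유희 (2016), "Data Visualization Techniques Considering Audience and Progression" (in Korean), master's thesis, Graduate School of Software, Soongsil University.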

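• Value (lightness): the viridis package, etc.
• Color (hue): try searching Google for "colors in R".
• Pattern: the col, angle, and density arguments of barplot()
• Shape: the pch argument of plot(), the symbols() function
• Position: the axis() function, the par() function, etc.
• Size: the cex and lwd arguments of plot()
• Orientation: the arrows() function, etc.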
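
The mapping above can be tried out in a few lines of base R. This is a minimal sketch with made-up data. It uses `hcl.colors()` with the "Viridis" palette from grDevices (available since R 3.6.0) as a stand-in for the viridis package, which is an assumption made so that the example has no extra dependencies.

```r
# Base R sketch of Bertin's visual variables; all data values are made up.
op <- par(mfrow = c(2, 2))

# Value and color: a sequential palette running from dark to light
pal <- hcl.colors(5, palette = "Viridis")
barplot(1:5, col = pal, main = "Value / color")

# Pattern: shading lines controlled by density and angle
barplot(c(3, 5, 2), density = c(10, 20, 30), angle = c(0, 45, 90),
        col = "black", main = "Pattern")

# Shape and size: pch selects the symbol, cex scales it, lwd sets stroke width
x <- 1:5
plot(x, x, pch = c(0, 1, 2, 5, 6), cex = x / 2 + 0.5, lwd = 2,
     xlim = c(0, 6), ylim = c(0, 6), main = "Shape / size")

# Position and orientation: custom axis ticks plus arrows at varying angles
plot(NULL, xlim = c(0, 6), ylim = c(-1, 1), axes = FALSE,
     xlab = "", ylab = "", main = "Position / orientation")
axis(1, at = x)
theta <- seq(0, pi / 2, length.out = 5)
arrows(x0 = x, y0 = 0, x1 = x + 0.6 * cos(theta), y1 = 0.6 * sin(theta),
       length = 0.08)

par(op)
```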

How Far Can R Shiny Be Useful in Visualization Research? [13]
• Answer: "It is possible, more or less."
We already know people who do exactly this kind of work.
[13] Image source: https://www.ggplot2-exts.org/gallery/

• Answer: "It is possible, more or less."
Figure: The D3.js version (left) and the R version (right) of "UpSet" [14]
[14] Alexander Lex, Nils Gehlenborg, Hendrik Strobelt, Romain Vuillemot, Hanspeter Pfister (2014), "UpSet: Visualization of Intersecting Sets", IEEE Transactions on Visualization and Computer Graphics (InfoVis '14), 20(12), 1983–1992.

• Answer: "It is possible, more or less. However, it is extremely difficult."
Figure: Source code of the R version of "UpSet" [15]
[15] Image source: https://github.com/hms-dbmi/UpSetR/

From now on, this is the life you will be living for a while.
(Nobody knows when it will end...)

• "It's okay. There is this direction, too."

Figure: Sergio Peçanha works for the Graphics Department of The New York Times

Figure: An infographic made with R [16]
[16] https://www.r-bloggers.com/r-how-to-layout-and-design-an-infographic/

Figure: How to Explain? Accuracy vs. Explainability [17]
[17] Image source: Freddy Lecue, Pasquale Minervini (2019), "On Explainable AI: From Theory to Motivation, Applications and Limitations"

Figure: Interaction visualization of a model using R Shiny [18]
[18] Image source:
• Jaeseong Yoo, Juntae Kim, Jihoon Moon, Eenjun Hwang, Jaegul Choo (Working), "Prediction under cold-start problem in the partitioned time-series using Bayesian model selection", IEEE Access

Figure: Recent challenges in Explainable AI [19]
[19] Image source: David Gunning (2016), "Explainable Artificial Intelligence (XAI)", DARPA-BAA-16-53.

Figure: RETAIN to RetainVis [20]
[20] Image sources:
• (1): Edward Choi, Mohammad Taha Bahadori, Joshua A. Kulas, Andy Schuetz, Walter F. Stewart, Jimeng Sun (2016), "RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism", Neural Information Processing Systems (NIPS)
• (2), (3): B. Kwon, M. Choi, J. Kim, E. Choi, Y. Kim, S. Kwon, J. Sun, J. Choo (2019), "RetainVis: Visual Analytics with Interpretable and Interactive Recurrent Neural Networks on Electronic Medical Records",

Visualization for Model Explainability [21]
Explainability by Design
[21] KDD 2019 Tutorial on Explainable AI in Industry - 5 https://sites.google.com/view/kdd19-explainable-ai-tutorial

Visualization for Model Explainability [22]
[22] Brent D. Mittelstadt, Chris Russell, Sandra Wachter (2018), "Explaining Explanations in AI." CoRR abs/1811.01439

What is an Explanation?
• No formal, technical, agreed-upon definition!
• A comprehensive philosophical overview is out of scope of the tutorial
• Not limited to machine learning!

What about Interpretability?
• Interpretability as Multi-faceted Concept
Interpretability is an ill-defined term!
Not a monolithic concept

Levels of Model Transparency
• Simulatability: Transparent Model
Understanding of the functioning of the model
: Can a human easily predict outputs?
: Can a human examine the model all at once?
• Decomposability: Transparent Model Components
Understanding at the level of single components (e.g. parameters)
• Algorithmic Transparency: Transparent Training Algorithm
Understanding at the level of training algorithm
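
The three levels above can be made concrete with a model that satisfies all of them. The sketch below uses an ordinary linear regression on the built-in mtcars data. The choice of model, dataset, and the hypothetical `new_car` profile are assumptions made for illustration, not something taken from the tutorial slides.

```r
# Transparent model: ordinary least squares on the built-in mtcars data.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Decomposability: each parameter can be read and interpreted on its own.
beta <- coef(fit)
print(round(beta, 3))

# Simulatability: a person can reproduce a prediction by hand
# from the three coefficients alone (new_car is a hypothetical input).
new_car <- data.frame(wt = 3.0, hp = 110)
by_hand <- beta[["(Intercept)"]] + beta[["wt"]] * new_car$wt +
  beta[["hp"]] * new_car$hp
by_model <- unname(predict(fit, newdata = new_car))
stopifnot(isTRUE(all.equal(by_hand, by_model)))

# Algorithmic transparency: the training procedure has a known closed form
# (the normal equations), so the estimates can be re-derived independently.
X <- model.matrix(fit)
beta_ne <- drop(solve(crossprod(X), crossprod(X, mtcars$mpg)))
stopifnot(isTRUE(all.equal(unname(beta_ne), unname(beta), tolerance = 1e-6)))
```

A deep network would typically fail all three checks: its parameters carry no individual meaning, a prediction cannot be traced by hand, and the training outcome has no closed form.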

Interpretability Goes Beyond the Model

Desired Properties of Explainability
• Informativeness: to what extent the model / prediction can be of use.
• Interpretability (or Comprehensibility): to what extent the model and/or its
predictions are human understandable. It is measured by the complexity of the
model.
• Fidelity: to what extent the model imitates a black-box predictor.
• Accuracy: to what extent the model correctly predicts unseen instances.
• Fairness: the model guarantees the protection of groups against discrimination.
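
Fidelity and accuracy are easy to confuse, so here is a minimal sketch that computes both on held-out data. Everything in it is an assumption made for illustration: the synthetic data, the polynomial logistic model standing in for a "black box", and the plain logistic model used as the interpretable surrogate.

```r
set.seed(1)

# Synthetic binary classification data with a curved class boundary.
n <- 600
d <- data.frame(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
d$y <- as.integer(d$x1^2 + d$x2 + rnorm(n, sd = 0.3) > 0.3)
train <- d[1:400, ]
test <- d[401:n, ]

# Stand-in "black box": a flexible logistic model with polynomial interactions.
black_box <- glm(y ~ poly(x1, 3) * poly(x2, 3), family = binomial, data = train)
predict_label <- function(model, newdata) {
  as.integer(predict(model, newdata = newdata, type = "response") > 0.5)
}

# Interpretable surrogate: a plain logistic model trained to imitate
# the black box's labels rather than the true labels.
train$bb_label <- predict_label(black_box, train)
surrogate <- glm(bb_label ~ x1 + x2, family = binomial, data = train)

bb_test <- predict_label(black_box, test)
sur_test <- predict_label(surrogate, test)

# Fidelity: agreement with the black box. Accuracy: agreement with the truth.
fidelity <- mean(sur_test == bb_test)
accuracy_surrogate <- mean(sur_test == test$y)
accuracy_black_box <- mean(bb_test == test$y)
print(c(fidelity = fidelity, surrogate = accuracy_surrogate,
        black_box = accuracy_black_box))
```

The surrogate is scored twice on purpose: a surrogate can track the black box closely (high fidelity) while both are wrong about the true labels, so the two numbers answer different questions.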

Desired Properties of Explainability (Cont'd)
• Privacy: the model does not reveal sensitive information about people.
• Respect Monotonicity: increasing the value of an attribute either increases or
decreases, in a monotonic way, the probability that a record is a member of a
class.
• Usability: an interactive and queryable explanation is more usable than a textual
and fixed explanation.
• Low cognitive load: the explanation should be easy to understand.
• Reliability and Robustness: the interpretable model should maintain high levels of
performance independently of small variations in the parameters or in the input
data.
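
The monotonicity property above can be checked empirically by sweeping one attribute while the others are held fixed. The sketch below is one simple way to do that. The synthetic loan-style data, the attribute names `age` and `income`, and the helper `is_monotone()` are all hypothetical and introduced only for illustration.

```r
set.seed(42)

# Hypothetical data: approval probability rises with income
# and is hump-shaped in age.
n <- 500
d <- data.frame(age = runif(n, 20, 70), income = runif(n, 1, 10))
p <- plogis(0.6 * (d$income - 5) - 0.004 * (d$age - 45)^2 + 1)
d$approved <- rbinom(n, 1, p)

model <- glm(approved ~ income + poly(age, 2), family = binomial, data = d)

# Sweep one attribute over a grid while holding the others at their medians,
# then test whether the predicted probability only moves in one direction.
is_monotone <- function(model, data, attribute, n_grid = 50) {
  grid <- as.data.frame(lapply(data, median))[rep(1, n_grid), , drop = FALSE]
  grid[[attribute]] <- seq(min(data[[attribute]]), max(data[[attribute]]),
                           length.out = n_grid)
  prob <- predict(model, newdata = grid, type = "response")
  steps <- diff(prob)
  all(steps >= 0) || all(steps <= 0)
}

print(c(income = is_monotone(model, d, "income"),
        age = is_monotone(model, d, "age")))
```

This only inspects a single profile (the medians), so passing the check is evidence of monotonic behavior and not a proof. A stricter check would repeat the sweep from many starting records.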

Desired Properties of Explainability (Cont'd)
• Non-misleading: the interpretation sticks to the model and does not hallucinate
behavior.
• Causality: controlled changes in the input due to a perturbation should affect the
model behavior.
• Scalability: the interpretable model should be able to scale to large input data with
large input spaces.
• Generality: the model should not require special training or restrictions.
• Interactivity / Conversational: the explanation should be refined based on user profile,
preference and experience.

One Objective, Many Metrics

One Objective, Many Metrics, Many Definitions, Many Approaches [23]
[23] Image source: Freddy Lecue, Pasquale Minervini (2019), "On Explainable AI: From Theory to Motivation, Applications and Limitations"

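Tips for Using R Shiny in the Visualization Research Process
• Make the processing the user never sees as fast as possible:
avoid loops, use distributed processing, and reuse data and analysis results saved in advance.
• Make active use of the attractive packages others have built:
ggplot2, lattice, highcharter, leaflet, plotly, sunburstR, dygraphs, ...
• Make active use of the attractive R Shiny add-on packages others have built [24]:
shinyWidgets, shinydashboard, dashboardthemes, shinyBS, shinyjs, shinyalert,
...
[24] Image source: https://shiny.rstudio.com/gallery/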
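
The first tip can be sketched as follows: vectorize instead of looping, compute expensive results once, and let the Shiny app only read what was saved. The cache file name, the `make_app()` wrapper, and the toy summary table are hypothetical choices made for illustration. The app itself needs the shiny package, but the rest runs in base R.

```r
# Vectorize instead of looping over elements.
x <- seq_len(1e5)
slow_squares <- function(v) {
  out <- numeric(length(v))
  for (i in seq_along(v)) out[i] <- v[i]^2
  out
}
fast_squares <- function(v) v^2
stopifnot(isTRUE(all.equal(slow_squares(x), fast_squares(x))))

# Compute expensive results once, store them, and only read them in the app.
cache_file <- file.path(tempdir(), "precomputed_summary.rds")  # hypothetical path
if (!file.exists(cache_file)) {
  precomputed <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
  saveRDS(precomputed, cache_file)
}
summary_by_cyl <- readRDS(cache_file)

# The app only looks up the cached table; nothing heavy runs inside server().
make_app <- function() {
  ui <- shiny::fluidPage(
    shiny::selectInput("cyl", "Cylinders", choices = summary_by_cyl$cyl),
    shiny::textOutput("mean_mpg")
  )
  server <- function(input, output, session) {
    output$mean_mpg <- shiny::renderText({
      row <- summary_by_cyl[summary_by_cyl$cyl == as.numeric(input$cyl), ]
      sprintf("Mean mpg: %.1f", row$mpg)
    })
  }
  shiny::shinyApp(ui, server)
}
# To launch interactively (requires the shiny package):
# shiny::runApp(make_app())
```

Wrapping the app in a function keeps the script loadable on machines without shiny, since the `shiny::` calls are only evaluated when `make_app()` is actually called.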
