Approaches to big data research in the social and political fields:
1. Data governance and privacy
2. Media Analysis
3. Social Network Analysis
4. Complex System Analysis
1. Big Data for Social and Political Research
widyawan@ugm.ac.id
2. • Lecturer, DTETI UGM
• Director, DSSDI UGM
• Commissioner, PT Gamatechno and Datains
• Head of the Big Data Working Group, Forum Masyarakat Statistik
• Bachelor's degree (S1), Electrical Engineering, UGM
• Master's degree (S2), Erasmus University, Netherlands
• Doctorate (S3), Electronics, Cork Institute of Technology, Ireland
9. Why Data Silos
• Structural
  • Governments and organizations are increasingly divided into units and teams
  • Units are often left to implement their own processes and software
  • This creates data separation
  • Examples: BPS, Dukcapil, MDP, health data
• Social
  • No incentives for sharing data
  • People withhold data to assert control and power
• Technological
  • Legacy systems
  • Vendor lock-in
10. How
• Stop the silo mentality
• Clear regulation on data transparency
• Encourage open data
11. “Personal data is a new currency of the digital world”
Meglena Kuneva, European Consumer Commissioner
12. People's Privacy is the Loser
• Protection of personal data (privacy) is weak
• Users have no control over how their personal data is used, shared, commercialized, and disseminated
• Data sovereignty (related to the physical location of data) is often an issue between the state and corporations
[Diagram: People, State, and Corporation, with the labels: personal data, surveillance, monetize, sovereignty]
13. Data Privacy Protection Map
Endemic surveillance societies
Extensive surveillance societies
Systemic failure to uphold safeguards
Some safeguards but weakened protections
Adequate safeguards against abuse
https://www.privacyinternational.org/
17. “In God we trust, all others must bring data.”
W. Edwards Deming
18.
19. Notes about data sources
Data Source        Availability   Veracity
Sensor IoT         Closed         Medium to High
Software Database  Closed         High
Social Media       Open           Low
Online news        Open           Medium
• Twitter's public API provides access to 1% of its data
• Facebook and Instagram only provide access to public groups or pages; since the Cambridge Analytica case, access has become more difficult
• WhatsApp does not provide access to its data, at least not legally
• Online news can be obtained from RSS feeds or by scraping
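As a minimal sketch of the RSS route mentioned above, the following parses an RSS 2.0 feed using only Python's standard library. The feed text, headlines, and URLs are made-up placeholders; a real pipeline would first download the feed and should respect the publisher's terms of use.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; titles and links are placeholders, not real articles.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/news/1</link>
      <pubDate>Wed, 03 Oct 2018 08:00:00 +0700</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/news/2</link>
      <pubDate>Thu, 04 Oct 2018 09:30:00 +0700</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss_items(rss_text):
    """Return a list of dicts with title, link, and publication date per <item>."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


articles = parse_rss_items(SAMPLE_RSS)
for article in articles:
    print(article["published"], article["title"], article["link"])
```

Each resulting record keeps the publication date, so articles can later be filtered to a collection period.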
20. Mathematical Modelling in Social Data
• Media Analysis
• Social Network Analysis
• Complexity Analysis
• Social Simulations
C. Cioffi-Revilla, Introduction to Computational Social Science: Principles and Applications. London, U.K.: Springer, 2013.
21. Media Analysis
• Comprises information extraction and classification
Information extraction:
• An unobtrusive method of parsing and coding documents to extract information from data
• Mainly text, but increasingly all types of media
• Wisdom of the crowd → the vast majority of the extant literature uses Twitter datasets, with only 5% of the papers analyzing Facebook
Social Set Analysis: A Set Theoretical Approach to Big Data Analytics, Ravi Vatrapu et al., IEEE Access, 2016
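To make "parsing and coding documents" concrete, here is a small sketch that turns free-text posts into structured fields (hashtags, URLs, word tokens). The posts are invented for illustration, and the regular expressions are a simplification, not a full tokenizer.

```python
import re
from collections import Counter

# Made-up example posts; not taken from any real dataset.
POSTS = [
    "Traffic jam near Malioboro this morning #jogja #traffic",
    "BMKG issues a weather warning for #jogja https://example.com/w",
    "Great match tonight! #football #jogja",
]

HASHTAG = re.compile(r"#(\w+)")
URL = re.compile(r"https?://\S+")


def extract(post):
    """Code one post into structured fields: hashtags, URLs, word tokens."""
    hashtags = [h.lower() for h in HASHTAG.findall(post)]
    urls = URL.findall(post)
    # Strip URLs and hashtags before tokenizing the remaining text.
    text = HASHTAG.sub(" ", URL.sub(" ", post))
    tokens = re.findall(r"[a-z]+", text.lower())
    return {"hashtags": hashtags, "urls": urls, "tokens": tokens}


records = [extract(p) for p in POSTS]
hashtag_counts = Counter(h for r in records for h in r["hashtags"])
print(hashtag_counts.most_common(3))
```

The structured records can then feed a classification step or a frequency analysis per topic.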
22. Wisdom of the crowd: considerations
• Information spreads (diffuses) across media. Content on online news, Facebook, or Twitter reflects the content and resonance of other media.
• The difference → online news goes through curation by an editor. Therefore (ideally) the content is factual (not gossip or speculation), covers both sides, is impartial, and is free of personal opinion.
• Big Data, Big Noise → on social media such as Twitter and Facebook, by contrast, a status or tweet is a personal expression. It contains opinions, and there is no mechanism for curation or fact-checking, so it is more prone to hoaxes.
26. Classification
• The action or process of classifying something according to shared qualities or characteristics
• A computational linguistic approach using various mathematical methods:
  • Regression
  • Logical/rule-based
  • Geometric models
  • Probabilistic models
  • Neural networks → evolved into deep learning
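As one illustration of the probabilistic family listed above, the sketch below implements a tiny multinomial Naive Bayes text classifier with add-one smoothing. The training sentences and the two labels are invented for the example; a real study would need far more labeled data and a proper evaluation.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up training set (label, text); purely illustrative.
TRAIN = [
    ("football", "the team won the match with a late goal"),
    ("football", "coach praises striker after the match"),
    ("weather", "heavy rain and strong wind expected tonight"),
    ("weather", "agency issues rain warning for the weekend"),
]


def train(examples):
    """Count documents and words per label for a multinomial Naive Bayes model."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("rain warning tonight", model))
print(classify("late goal in the match", model))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.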
29. Metcalfe's Law & Network Economics
• The value or power of a network grows quadratically (roughly as n²) with the number of network members
• As the number of members increases, more people want to use the network
network value = n(n-1)/2
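The formula counts the distinct pairwise links that n members can form, so it grows roughly with the square of n. A short sketch (the function name is ours, chosen for illustration):

```python
def metcalfe_value(n):
    """Number of distinct pairwise links among n members: n(n-1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


for n in (2, 10, 100, 1000):
    print(n, metcalfe_value(n))
```

Growing the membership tenfold, from 10 to 100, raises the value from 45 to 4950 links, a 110-fold increase.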
30. Social Network Analysis
• The study of social structures using network and graph theory
• Explores basic relations in dyadic structures
• A node can be an actor/person, and the relations/edges are relationships between nodes (e.g. retweet or mention)
[Diagram: nodes @gusmusgusmu and @me connected by a Retweet edge]
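A minimal sketch of how retweets and mentions can become graph edges: each tweet yields (source, target, relation) triples, and repeated interactions add up to an edge weight. The tweets and handles below are hypothetical, and the "RT @user" prefix is assumed as the retweet marker in the raw text.

```python
import re
from collections import Counter

# Made-up tweets (author, text); the handles are hypothetical.
TWEETS = [
    ("alice", "RT @bob: Jogja anniversary parade tonight"),
    ("carol", "RT @bob: Jogja anniversary parade tonight"),
    ("alice", "Thanks @carol for the update"),
    ("dave", "RT @bob: Jogja anniversary parade tonight"),
    ("alice", "RT @bob: road closed near the station"),
]

RETWEET = re.compile(r"^RT @(\w+)")
MENTION = re.compile(r"@(\w+)")


def tweet_edges(author, text):
    """Yield (source, target, relation) edges for one tweet."""
    rt = RETWEET.match(text)
    if rt:
        # A retweet links the retweeting author to the original author.
        yield (author, rt.group(1).lower(), "retweet")
        return
    for user in MENTION.findall(text):
        yield (author, user.lower(), "mention")


# Edge weight = how often the same relation occurs between two actors.
edge_weights = Counter(
    edge for author, text in TWEETS for edge in tweet_edges(author, text)
)
for (src, dst, rel), weight in edge_weights.items():
    print(f"{src} -> {dst} [{rel}] weight={weight}")
```

The resulting weighted edge list can be loaded into a graph tool for layout and clustering.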
34. Statistics from tweets and online news
Period: 3-8 October 2018
Twitter keywords: jogja, yogya, jogjakarta, DIY
Hits: 30,435 tweets
Online news keywords: jogja, yogya, jogjakarta, DIY
Hits: 356 news articles
@YogyakartaCity
@JogjaUpdate
Topic: football
Topic: BMKG
Topic: anniversary (ulang tahun)
SNA
Layout: Yifan Hu algorithm
Note: colors in the SNA graph indicate topic clusters
37. Notes about SNA
• Polarization is clearly visible on a divisive political issue
• Some nodes have a higher degree centrality → influencers
• Some nodes need to act as boundary spanners
• A boundary spanner is a node that connects (bridges) two different communities that would otherwise not communicate with each other (MAC case: republikaonline, vivacoid, maklambeturah, MbahUyok)
• Without boundary spanners, an echo chamber effect emerges
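The two notions above, degree centrality and boundary spanning, can be sketched on a toy graph. The graph and the community labels are invented; in practice the labels would come from a clustering step, and "boundary spanner" is operationalized here simply as "has at least one neighbor in another community", which is one possible definition among several.

```python
from collections import defaultdict

# Hypothetical undirected graph: two tightly knit groups joined via "bridge".
EDGES = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "bridge"), ("bridge", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]
# Community labels are assumed to be known (e.g. from a clustering step).
COMMUNITY = {"a": "left", "b": "left", "c": "left", "bridge": "left",
             "d": "right", "e": "right", "f": "right"}

neighbors = defaultdict(set)
for u, v in EDGES:
    neighbors[u].add(v)
    neighbors[v].add(u)


def degree_centrality(node):
    """Share of the other nodes that this node is directly connected to."""
    return len(neighbors[node]) / (len(neighbors) - 1)


def is_boundary_spanner(node):
    """True if the node has at least one neighbor outside its own community."""
    return any(COMMUNITY[n] != COMMUNITY[node] for n in neighbors[node])


centrality = {node: degree_centrality(node) for node in neighbors}
top = max(centrality.values())
influencers = sorted(n for n, c in centrality.items() if c == top)
spanners = sorted(n for n in neighbors if is_boundary_spanner(n))

print("most central nodes:", influencers)
print("boundary spanners:", spanners)
```

Removing the edge between "bridge" and "d" would leave two disconnected groups, which is the structural picture behind an echo chamber.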
42. Takeaways
• To get value from data, data governance is needed
• Privacy needs more protection from commercial interests and state surveillance
• Media analysis comprises information extraction and classification
• SNA is the study of social structures using network and graph theory