Personal Information Management Systems
Serge Abiteboul
INRIA & ENS Cachan
serge.abiteboul@inria.fr
Amélie Marian
Rutgers University
amelie@cs.rutgers.edu
Personal data is everywhere
Amélie & Serge, EDBT, 11111011111
Personal data is exploding
•  Actively: Data and metadata we produce
–  Pictures, reports, emails, calendars, tweets, annotations, recommendation, social network…
•  Actively: Data we like/buy
–  Books, music, movies…
•  Passively: Data others produce about us
–  Public administration, schools, insurances, banks…
–  Amazon, banks, retailers, applestore…
•  Stealthily: sensors
–  GPS, web navigation, phone, "quantified self" measurements, contactless card readings, surveillance camera pictures…
•  Stealthily: data analysis
–  Clicks, Searches, TV viewing habits (e.g., Netflix)
–  NSA inference
Personal data is heterogeneous
•  Structured: relational
•  Semistructured: HTML, XML, JSON…
•  Not structured: text (pdf), pictures, music, video…
•  Metadata: date, location…
•  Semantic: RDF, RDFS, OWL
•  Different languages, terminologies, ontologies, structures
•  Different systems, protocols
•  Varying quality
Bad news
•  Loss of functionalities because of fragmentation
–  You don’t know where your data is, how to keep it up to date, or sometimes how to get it
–  Difficult to do global search, maintenance, synchronization, archiving...
•  Loss of control over the data
–  Difficult to control privacy
–  Difficult to control sharing
–  Leaks of private information
•  Loss of freedom
–  Vendor lock-in
Alternatives
1.  Continue with this increasing mess
–  Use a shrink to overcome the frustration
2.  Regroup all your data on the same platform
–  Google, Apple, Facebook, …, a newcomer
–  Use a shrink to overcome resentment
3.  Study 2 years to become a geek
–  Geeks know how to manage their information
–  Use a shrink to survive the experience
Where do you keep your data?
The time for PIMS is now!
A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.
Vannevar Bush, The Atlantic Monthly, 1945
Definition for this talk: a Personal Information Management System is a (cloud) system that manages all the information of a person
The PIMS: A change in paradigm
Using Web services today
•  Your data
•  Running with an external service
•  On some unknown machines
Your PIMS
•  Your data
•  Running a local service
•  On your machine
Possibly for external services
•  A replica of the data
•  On a wrapper
•  On your machine
PIMS in the Past
Saving Personal Data – Old School
Searching Personal Data – Old School…
File cabinet around 1888
Personal Information Management – the Digital Age
% grep PIMS /home/amelie/presentations
First-generation Personal Information Management Systems
•  Storage
–  Archival, safe-keeping
•  Organization
–  Structure
–  Different file types
•  Finding and re-finding information
–  Different from traditional IR/Web search systems
–  Keyword searches not ideal
Desktop Search Tools
•  Google Desktop Search (defunct)
•  Apple Spotlight
•  Windows Search
•  Lead to frustration when users cannot find information they know they have
Use IR-style keyword searches
Some metadata filtering
Past PIMS projects (late 1990s, 2000s)
•  Lifestreams
–  Time-oriented streams
•  Haystack
–  Uniform data model
•  Stuff I’ve Seen
–  History of web behavior
•  Dataspaces
–  Semantic connections. Data integration
•  Connections, Seetrieve
–  Task-based organization
•  deskWeb
–  Looks at the social network graph
Various uses of
–  Context
–  Time
–  Social network
LifeStreams
(Freeman and Gelernter, Yale, 1996-1997)
Help users manage their information
Time-centric view of documents
Haystack
(Karger et al., MIT CSAIL 1997-2005)
Allows users to store, examine and manipulate their information
•  Uniform Data Model
•  Semi-structured Data
•  Captures relationships
•  Separate Workspaces
Stuff I’ve Seen
(Dumais et al., Microsoft, 2003-2004)
•  Unified Index
•  Integration of sources
•  Re-find information
•  Focus on UI
A changing landscape
Cloud-based model
Heterogeneous data types and formats
Need for richer functionalities and semantic analysis
A vision for the Future of PIMS
All the digital life of an individual
From Memex to MyLifeBits
Memex
–  Memory index or memory extender
–  Hypertext system by Vannevar Bush in 1945
–  Compress and store all of their books, records, and communications…
–  Provide an "enlarged intimate supplement to one's memory"
MyLifeBits
–  Microsoft Research project with Gordon Bell (2006)
–  Life-logging
–  All documents read or produced by Bell, CDs, emails, web pages browsed, phone and instant messaging conversations, etc.
Some of the digital life?
•  The “Total Capture vision” has its detractors
•  Advantages of selective human memory
–  Ignore irrelevant information to avoid flooding when searching for something
–  Choose what to forget, e.g., unpleasant memories
•  Perhaps PIMS should also be selective
•  More complicated than Total Capture
Hypermnesia
Exceptionally exact or vivid memory, especially as associated with certain mental illnesses
For a user: We cannot live knowing that any word, any move will leave a trace?
For the ecosystem: We cannot store all the data we produce – lack of storage resources
Forgetting is Key to a Healthy Mind
Scientific American
Image: Aaron Goodman
A main issue is to select the information we choose to keep
Nature and value of information
w5h model (context-based)
•  Changes with time
•  Depends on many dimensions: nature of info, rarity, age, personal bias/taste/opinions…
•  Difficult to estimate the cost to get some info
–  To estimate how much you would spend before you give up
•  Difficult to estimate the value of information you don't have yet
•  Difficult for the system to know what a human remembers
–  Makes crowd sourcing difficult
Storage and Archival
•  Fully under user’s control
•  Fully available on the cloud
–  Without privacy risk
•  Fully resilient to failure
–  Automatic back-ups
–  Automatic synchronization with other systems/devices
•  Support of access control
–  Simple and intuitive definition across systems/devices
•  Use of encryption
–  Data is stored encrypted in the cloud or on a personal machine connected to the cloud
Data integration
•  Old problems revisited
Person-centric information integration
[Figure: Sue’s PIMS, with wrappers W1 … Wn and W’1 … W’n, local services L1 … Lp (e.g., Analytics), data D1 … Dm, a secured net to Bob, Joe, …, decentralized services (e.g., Diaspora), external services (e.g., Facebook), and a server-centric service S]
Classical data integration problem
•  Choose a schema for the PIMS
•  Choose a mapping between the sources and the mediated schema
•  Extract & load & maintain
–  Data and metadata from sources
Lots of work
–  On digital libraries
–  On database integration
[Figure: sources S1 … Sn connected to Sue’s PIMS through wrappers]
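The three steps on this slide (choose a schema, choose mappings, extract and load) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Item` schema, the two source formats, the wrapper names, and the sample records are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Item:
    """One record in a hypothetical mediated schema for the PIMS."""
    source: str
    kind: str
    when: date
    who: str
    text: str


def wrap_mail(msg: dict) -> Item:
    """Mapping from a hypothetical mail source to the mediated schema."""
    return Item("mail", "message", date.fromisoformat(msg["Date"]),
                msg["From"], msg["Subject"])


def wrap_calendar(event: dict) -> Item:
    """Mapping from a hypothetical calendar source to the mediated schema."""
    return Item("calendar", "event", date.fromisoformat(event["start"]),
                event["organizer"], event["title"])


def load(mails, events):
    """Extract and load: apply each wrapper, then merge and sort by date."""
    items = [wrap_mail(m) for m in mails] + [wrap_calendar(e) for e in events]
    return sorted(items, key=lambda item: item.when)


# Made-up sample data, only to exercise the wrappers.
pims = load(
    mails=[{"Date": "2015-03-24", "From": "serge.abiteboul@inria.fr",
            "Subject": "Slides"}],
    events=[{"start": "2015-03-23", "organizer": "amelie@cs.rutgers.edu",
             "title": "EDBT tutorial"}],
)
```

Once every source is mapped to the same record type, a single query (here, a sort by date) runs over mail and calendar data alike. The "maintain" step, keeping the loaded data in sync with the sources, is not covered by this sketch.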
  
Classical knowledge integration problem
•  Enrich the ontology
–  Align concepts and relations in schemas
–  Align objects
•  Reference to external data
Lots of work
–  On knowledge representation
–  On knowledge integration
[Figure: a personal ontology together with imported ontologies, imported knowledge, curated knowledge, and alignments (computed or curated)]
Illustration: entity resolution
•  Mail: from amelie@gmail.com; body “… Nikki de Saint-Phalle …”
•  Contact: Amelie Marian
•  Websearch: url grandpalais.fr/ndsp/; “… Nikki de Saint-Phalle …”
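A minimal sketch of the kind of matching this illustration suggests: linking a mail sender to a contact, and linking a mail body to a web search through a shared mention. The token-overlap heuristics and function names are assumptions made for illustration; real entity resolution would need far more robust similarity measures.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and return the set of alphanumeric tokens."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return set(re.findall(r"[a-z0-9]+", no_accents.lower()))


def same_person(email, contact_name):
    """Heuristic: the e-mail local part shares a token with the contact name."""
    local_part = email.split("@", 1)[0]
    return bool(normalize(local_part) & normalize(contact_name))


def shared_mention(text_a, text_b, mention):
    """True when every token of the mention appears in both texts."""
    tokens = normalize(mention)
    return tokens <= normalize(text_a) and tokens <= normalize(text_b)


# Hypothetical mail body and web page text built around the slide's example.
mail_body = "Shall we go see Nikki de Saint-Phalle at the Grand Palais?"
page_text = "Exhibition: Nikki de Saint-Phalle"
linked = shared_mention(mail_body, page_text, "Nikki de Saint-Phalle")
```

With these heuristics, `same_person("amelie@gmail.com", "Amélie Marian")` holds because accent stripping makes "Amélie" and "amelie" the same token.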
  
Searching Personal Information
Memory Tasks
•  The “five Rs” memory tasks (Sellen and Whittaker, CACM 2010)
–  Recollecting
–  Reminiscing
–  Retrieving
–  Reflecting
–  Remembering intentions
Recollecting
•  Task-based memory process
•  Retracing steps to recollect information
–  “Where did I leave my keys”
–  “When was the last time I saw Pierre”
•  Follow a series of cues to identify information
Need: Connections between memory objects (integration and navigation)
Reminiscing
•  Browsing through past memories to re-live them
•  Experience-based (no specific goal in mind)
–  E.g., looking at old photos
Need: Connections between memory objects (integration and navigation)
Retrieving
•  Retrieving specific information
–  Files, documents, pictures
–  Data snippets
•  Use of metadata
•  Can be combined with recollection
Need: Query model, Indexes, and Search algorithms
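To make the "query model, indexes, and search algorithms" need concrete, here is a small sketch of keyword retrieval combined with metadata filtering. The `PersonalIndex` class, its methods, and the sample documents are hypothetical; the index is a plain in-memory inverted index, not a description of any system named in the talk.

```python
from collections import defaultdict


class PersonalIndex:
    """Toy inverted index over personal items, with metadata filters."""

    def __init__(self):
        self.docs = {}                    # doc_id -> metadata dict
        self.postings = defaultdict(set)  # token -> set of doc_ids

    def add(self, doc_id, text, **metadata):
        self.docs[doc_id] = metadata
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, query, **filters):
        """Return ids of items containing all query tokens and matching all filters."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(d for d in hits
                      if all(self.docs[d].get(k) == v for k, v in filters.items()))


# Made-up items, only to exercise the index.
index = PersonalIndex()
index.add("report.pdf", "PIMS project report", kind="pdf", year=2014)
index.add("mail-17", "draft of the PIMS report", kind="mail", year=2015)
index.add("photo-3", "Brussels trip", kind="picture", year=2015)

all_hits = index.search("pims report")
mail_hits = index.search("pims report", kind="mail")
```

The metadata filter narrows the keyword hits, which is the "use of metadata" bullet in its simplest form.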
   	
   	
  	
  
Reflecting
•  Learning from the past
–  Identify patterns
–  Personal data analysis
•  Towards a Personal Knowledge Base (PKB)
–  Individual vs. shared knowledge
–  Privacy concerns
Need: Knowledge Discovery and Mining techniques designed for personal data
Remembering Intentions
•  Focus on prospective memory
–  To-do lists
–  Appointment reminders
•  Active focus of commercial companies
–  Google Now
–  Notification apps (time- or location-based)
–  Microsoft Personal Agent project?
Need: NLP techniques designed for personal data
Explaining
•  Users want to understand the information they see, the answers they are given
–  In their professional/social life
•  Difficulties
–  Reasoning with large number of facts
–  Information is often probabilistic and not public
–  Requires knowing how the information was obtained (its provenance)
Serendipity
•  You may hear by chance a song that is going to totally obsess you
•  A librarian may suggest a book that will change your life
This is serendipity
•  A perfect search engine
•  A perfect recommendation system
•  A perfect computer assistant
Such systems are boring
They lack serendipity
Design programs that would help introduce serendipity in our lives
Answer Personalization
•  Modifying the query based on the user’s ontology and preferences
•  Ranking the result based on the user’s preferences
•  Example: How do I get to Alice’s place?
–  Modify
•  Alice is Alice.Doe@gmail.com
–  Rank
•  Choose to bike if possible (user’s preference if the weather is nice)
•  Choose the route by the river if it is open
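The modify-then-rank idea can be sketched as follows, reusing the slide's example. Everything beyond that example is an assumption made for illustration: the query is a list of terms, the personal ontology is a plain dictionary, and each preference is a pair of predicates (when it applies, what it favors).

```python
def modify(query, ontology):
    """Replace terms that the user's personal ontology can resolve."""
    return [ontology.get(term, term) for term in query]


def rank(routes, preferences, context):
    """Order routes by how many applicable preferences each one satisfies."""
    def score(route):
        return sum(1 for applies, satisfied in preferences
                   if applies(context) and satisfied(route))
    return sorted(routes, key=score, reverse=True)


ontology = {"Alice": "Alice.Doe@gmail.com"}
resolved = modify(["route", "to", "Alice"], ontology)

# (applies-in-context, satisfied-by-route) pairs; hypothetical encoding.
preferences = [
    (lambda ctx: ctx["weather"] == "nice", lambda r: r["mode"] == "bike"),
    (lambda ctx: ctx["river_path_open"], lambda r: r["via"] == "river"),
]
routes = [
    {"mode": "car", "via": "highway"},
    {"mode": "bike", "via": "river"},
    {"mode": "bike", "via": "city"},
]
ranked = rank(routes, preferences, {"weather": "nice", "river_path_open": True})
```

Counting satisfied preferences is the simplest possible scoring; weighted or learned preferences would slot into `score` without changing the rest.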
  
Rich search/queries
Context-aware
•  We remember our data based on contextual cues
•  Personal information is rich in contextual information
–  Metadata
–  Application data
–  Environment knowledge
•  Cognitive Psychology
–  contextual cues are strong triggers for autobiographical memories
Interactive
-  I am looking for a great movie I saw about a month ago
-  Was it on TV?
-  No, in a theater.
-  Was it Turkish?
-  Yes.
-  It must be Winter Sleep.
Digital Self Architecture @ Rutgers
Architecture
•  Data Collection
–  Identification, retrieval, storage
–  Personal Extraction Tool: https://github.com/ameliemarian/DigitalSelf
•  Data Integration
–  Multidimensional, context-aware, unified data model
–  w5h Model
•  Search
–  based on the natural memory retrieval process
–  Context-aware, approximate
–  w5h Search
•  Knowledge Discovery
–  Find connections and patterns
–  Integrates user behavior and feedback
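A rough sketch of a multidimensional record and an approximate, cue-based search. The slides do not spell out the w5h dimensions; this sketch assumes they are the six classic ones (what, who, when, where, why, how). The record layout, scoring, and sample data are hypothetical and are not taken from the DigitalSelf code.

```python
# Assumed dimensions; the slides only name the model "w5h".
W5H = ("what", "who", "when", "where", "why", "how")


def make_record(**dims):
    """Build a record with one slot per dimension (None when unknown)."""
    unknown = set(dims) - set(W5H)
    if unknown:
        raise ValueError(f"not a w5h dimension: {sorted(unknown)}")
    return {d: dims.get(d) for d in W5H}


def search(records, **cues):
    """Approximate match: rank records by the number of cues they satisfy."""
    scored = []
    for rec in records:
        score = sum(1 for dim, value in cues.items() if rec.get(dim) == value)
        if score:
            scored.append((score, rec))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored]


# Made-up records, loosely following the movie dialogue earlier in the talk.
records = [
    make_record(what="Winter Sleep", when="2015-02", where="theater",
                how="ticket e-mail"),
    make_record(what="Winter Sleep trailer", when="2015-02", where="home",
                how="web page"),
    make_record(what="EDBT slides", when="2015-03", where="office", how="file"),
]
results = search(records, when="2015-02", where="theater")
```

A record that matches only some cues is still returned, just ranked lower, which is one simple reading of "context-aware, approximate".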
  
Personal data analytics
Aka Small data
Elliott Hedman, Design Research Conference
Personal data analytics
•  Relatively new topic
–  First International Workshop on Personal Data Analytics in the Internet of Things in 2014
•  Learn from personal data and make predictions
–  Personal health and well-being
–  Personal transportation
–  Home automation
•  Issues
–  Data privacy
–  Complexity of “small” data analytics: Less is harder
–  Combine with vertical analytics: large groups of people
–  Varying data quality: imprecision, inconsistencies
Focus: Quantified self
•  From sensors & all kinds of data
•  Health and well-being model of the person
•  Provide alerts and counseling
•  Monitoring and support for patients with chronic conditions
•  Preventive medicine
•  Active participation of the person
•  Large-scale learning – privacy issues
Towards a Personal Knowledge Base
•  Combine information from different sources to infer facts
–  Personal Facts
–  Personal Rules
–  Personal Ontology
•  Example Query « When was the last time I was in Brussels? »
•  Can use existing tools, RDF, RDFS, SPARQL
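The example query can be answered from a handful of facts plus one personal rule. The sketch below uses plain Python tuples as a stand-in for RDF triples and a function as a stand-in for a SPARQL query; the predicates, the rule, and the facts are all hypothetical.

```python
from datetime import date

# Made-up (subject, predicate, object) facts from different sources.
facts = {
    ("evt1", "type", "CalendarEvent"),
    ("evt1", "location", "Brussels"),
    ("evt1", "date", date(2015, 3, 23)),
    ("pic9", "type", "Photo"),
    ("pic9", "location", "Brussels"),
    ("pic9", "date", date(2013, 7, 2)),
    ("evt2", "type", "CalendarEvent"),
    ("evt2", "location", "Paris"),
    ("evt2", "date", date(2015, 4, 1)),
}


def infer_presence(facts):
    """Personal rule: an item with a location and a date implies I was there that day."""
    places = {s: o for s, p, o in facts if p == "location"}
    days = {s: o for s, p, o in facts if p == "date"}
    return {("me", "wasIn", (places[s], days[s])) for s in places if s in days}


def last_time_in(place, facts):
    """Answer 'When was the last time I was in <place>?' from inferred facts."""
    visits = [day for _, pred, (where, day) in infer_presence(facts)
              if pred == "wasIn" and where == place]
    return max(visits, default=None)


answer = last_time_in("Brussels", facts)
```

The rule is deliberately naive: a calendar entry does not prove attendance, which ties back to the varying quality and probabilistic nature of personal data noted elsewhere in the talk.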
  
Access control and security
Is privacy needed?
•  Because young people are more likely than adults to expose their personal life online, privacy is no longer the social norm (M. Zuckerberg)
•  Proved totally wrong
–  E.g., the young turn to ephemeral means of communication (Snapchat)
•  Privacy paradox: Internet users are concerned about privacy but mostly ignore it in their behaviors
Different architectures
•  Connection with vendors (same for other services)
•  Secure P2P
[Figure: two-tier (PIMS with a vendor relation system, vendors V1, V2, V3) and three-tier (PIMS, trusted intermediary, vendors V1, V2, V3); distributed network (P2P); secure hardware (e.g., FreedomBox)]
Secure devices
•  Secure portable tokens: Secure MCU + Flash storage
–  Issues: limitations of the device
–  Example: personal medical folder
•  Works of [Anciaux, Pucheral]
Reducing or increasing the security risk?
•  An intrusion on my PIMS puts all my information at risk
•  Hard to be riskier than today’s model
–  Hardly comforting
•  The PIMS is run by a professional operator
–  Security/privacy is guaranteed by contract
–  Application code is verified by the operator
–  The PIMS monitors the user’s actions to prevent security violations
•  Data of different users are isolated
–  Less tempting for pirates
•  The PIMS does not solve the security issues
•  It provides a better environment to address them
Other issues
•  Self administration
•  Synchronization and task sequencing
•  Internet of things
Support for system administration
•  It should require epsilon competence
–  Users are often incompetent and in particular understand little about access control/security
•  It should be epsilon work
–  Users are not interested
•  The PIMS helps
•  Administrate external applications
•  Synchronize/backup data
•  Select services and options
•  Manage access rights
–  Works on self-tuning systems/databases
–  Need for works on automatically generating access control policies from behavior of users
Synchronization and task sequencing across devices
•  Many possible approaches
•  Service-oriented architecture
•  Workflow
–  Transfer workflow technology to the masses
•  Mashup
–  uses content from more than one source to create a single new service displayed in a single graphical interface
–  E.g., Yahoo Pipes
•  If-this-then-that style
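The "if this then that" style of task sequencing amounts to trigger/action rules evaluated against events coming from devices. A minimal sketch, in which the `Rule` class, the event fields, and the two rules are all hypothetical:

```python
class Rule:
    """One "if this then that" rule: a trigger predicate and an action."""

    def __init__(self, name, trigger, action):
        self.name = name
        self.trigger = trigger
        self.action = action


def dispatch(event, rules):
    """Run every rule whose trigger matches the event; return (rule name, result) pairs."""
    return [(rule.name, rule.action(event)) for rule in rules if rule.trigger(event)]


# Hypothetical rules; actions only describe what a real PIMS would do.
rules = [
    Rule("backup-photo",
         lambda e: e["type"] == "new_photo",
         lambda e: f"copy {e['path']} to PIMS archive"),
    Rule("remind-at-office",
         lambda e: e["type"] == "location" and e["place"] == "office",
         lambda e: "show to-do list"),
]

log = dispatch({"type": "new_photo", "device": "phone", "path": "IMG_0042.jpg"},
               rules)
```

Rules of this shape are easy for non-experts to read and write, which matches the "epsilon competence" requirement stated for system administration.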
  
A hub for the IoT
•  Internet of things: Interconnection of identifiable computing devices within the existing Internet infrastructure
•  Control of connected objects
•  Explosion of things
–  E.g., heart monitoring implants, biochip transponders on farm animals, automobiles with built-in sensors, field operation devices…
•  According to Gartner, there will be nearly 26 billion devices on the Internet of Things by 2020
•  Many will be personal devices that the PIMS should integrate/control
•  Possibly a killer app for the PIMS
Conclusion: The PIMS are arriving
For societal, technical, industrial reasons
They will change our lives
Society is ready to move
•  Growing resentment
–  Against companies: intrusive marketing, cryptic personalization and business decisions (e.g., on pricing), creepy "big data" inferences
–  Against governments: NSA and its European counterparts
•  Increasing awareness of the asymmetry
–  between what these systems know about a person, and what the person actually knows
•  Emerging understanding of the value of personal data for individuals
Society is ready to move (2)
•  Privacy control: regulations in Europe
•  Information symmetry: Vendor relation management
•  Many reports/proposals that affirm the ownership of personal data by the person
•  Personal data disclosure initiatives
–  Smart Disclosure (US); MiData (UK), MesInfos (France)
–  Several large companies (network operators, banks, retailers, insurers…) agreeing to share with customers the personal data that they have about them
Technology is gearing up
•  System administration is easier
–  Abstraction technologies for servers
–  Virtualization and configuration management tools
•  Open source technology more and more available for services
•  Price of machines is going down
–  A low-cost hosted server is as cheap as 5€/month
–  Paying is no longer a barrier for a majority of people
You may have friends already doing it
Technology is gearing up (2)
•  Many systems & projects
–  Lifestreams, Stuff-I’ve-Seen, Haystack, MyLifeBits, Connections, Seetrieve, Personal Dataspaces, or deskWeb.
–  YunoHost, Amahi, ArkOS, OwnCloud or Cozy Cloud
•  Some on particular aspects
–  Mailpile for mail
–  Lima for a Dropbox-like service, but at home.
–  Personal NAS (network-attached storage) e.g. Synology
–  Personal data store SAMI of Samsung...
•  Many more
Industry is interested
(1) Pre-digital companies
•  E.g., hotels or banks
•  Disintermediated from their customers by pure Internet players such as Google, Amazon, Booking.com, Mint.
•  In PIMS, they can rebuild direct interaction
•  The playing field is neutral
–  Unlike on the Internet where they have less data
•  They can offer new services without compromising privacy
Industry is interested
(2) Home appliances companies
•  Many boxes deployed at home or in datacenters
–  Internet access and TV "boxes", NAS servers, "smart" meters provided by energy vendors, home automation systems, "digital lockers"…
•  Personal data spaces dedicated to specific usage
•  Could evolve to become more generic
•  Control of private Internet of objects
Industry is interested
(3) Pure Internet players
•  Amazon: great know-how in providing services
•  Facebook, Google: cannot afford to be out of a movement in personal data management
•  Very far from their business model based on personalized advertising
•  Moving to this new market would require major changes & the clarification of the relationship with users w.r.t. data monetization
They will change our lives:
(1) rebalance the Web
•  User control over their data
–  Who has access to what, under what rules, to do what
•  User empowerment
–  They choose freely services & they can leave a service
•  Participation in a more “neutral” Web
–  With the "network effects", the main platforms are accumulating data/customers and distorting competition
–  The PIMS bring back fairness on the Web
–  Good practices are encouraged, e.g., interoperability, portability
They will change our lives:
(2) new functionalities
1.  Data integration
2.  Search and queries
3.  Access control and security
4.  Personal data analytics
5.  Self administration
6.  Synchronization and task sequencing
7.  Control of Internet of things
…
(3) So watch out for the killer apps
•  Personal assistant
–  Google Now enhanced
–  Appointments, trips, shopping
–  Tax, financial, insurance, pension…
•  Health monitoring
–  Quantified self
–  Digital medical records
•  Smart home
•  Elder care monitoring and advising
Come and share PIMS
•  Lots of cool problems
•  Lots of opportunities for your favorite data management techno
•  Lots of super useful applications
•  And some killer apps to invent
References
Data Integration:
•  A survey of approaches to automatic schema matching, Rahm & Bernstein 2001.
•  Principles of Data integration, Doan, Halevy, Ives, 2012.
•  Principles of dataspace systems, Halevy, Franklin, and Maier. CACM, 2006.
•  Schema matching (Rahm & Bernstein 2001).
•  Data integration, Halevy, Ashish, Bitton, et al. (2005)
Security and trust
•  Management of Personal Information Disclosure: The Interdependence of Privacy, Security, and Trust, Clare-Marie Karat, John Karat, and Carolyn Brodie
•  Secure Personal Data Servers: a Vision Paper. T Allard et al. VLDB, 2010.
Knowledge management
•  Web Data Management, Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart, Cambridge University Press, 2011.
•  Ontology for PIMS: OntoPIM, Katifori, Poggi, Scannapieco, et al. 2005
•  Networked Environment for Personal, Ontology-based Management of Unified Knowledge (NEPOMUK).
References
Data extraction
•  A tool for personal data extraction. D. Vianna, A.-M. Yong, C. Xia, A. Marian, and T. Nguyen
•  Visual Web Information Extraction with Lixto, R. Baumgartner, S. Flesca, G. Gottlob. VLDB 2001
Societal issues
•  Managing your digital life with a Personal information management system, Serge Abiteboul, Benjamin André, Daniel Kaplan, Comm. of the ACM, to appear
•  http://mesinfos.fing.org
•  http://www.midatalab.org.uk
•  https://www.data.gov/consumer/smart-disclosure-policy
•  http://socialsafe.net

References
PIMS:
•  As we may think, Vannevar Bush, The Atlantic Monthly, 1945.
•  Personal Information Management. W. Jones and J. Teevan, editors. University of Washington Press, 2007.
•  Beyond total capture: a constructive critique of Lifelogging, Sellen and Whittaker, CACM 2010.
•  A tool for personal data extraction. Vianna, Yong, Xia, Marian, and Nguyen, IIWeb 2014.
•  Microsoft’s Stuff I’ve Seen project, Dumais et al. SIGIR 2003.
•  MyLifeBits, Gemmell, Bell and Lueder, CACM 2006.
•  deskWeb, Zerr et al. SIGIR 2010.
•  Connections, Soules and Ganger, SOSP 2005.
•  Seetrieve, Gyllstrom and Soules, IUI 2008.
•  LifeStreams, Fertig, Freeman, and Gelernter, CHI 1996.
•  Haystack, Karger et al. CIDR 2005.
•  Understanding What Works: Evaluating PIM Tools, Diane Kelly and Jaime Teevan


Personal Information Management Systems - EDBT/ICDT'15 Tutorial

  • 1. Personal  Informa-on     Management  Systems   Serge  Abiteboul   INRIA  &  ENS  Cachan   serge.abiteboul@inria.fr   Amélie  Marian   Rutgers  University   amelie@cs.rutgers.edu  
  • 2. Personal  data  is  everywhere   Amélie  &  Serge,  EDBT,  11111011111     2  
  • 3. Personal  data  is  exploding     •  Ac-vely:  Data  and  metadata  we  produce   –  Pictures,  reports,  emails,  calendars,  tweets,  annota-ons,   recommenda-on,  social  network…              Ac-vely:  Data  we  like/buy   –  Books,  music,  movies…   •  Passively:  Data  others  produce  about  us   –  Public  administra-on,  schools,  insurances,  banks…   –  Amazon,  banks,  retailers,  applestore…     •  Stealthily:  sensors   –  GPS,  web  naviga-on,  phone,  "quan-fied  self"  measurements,   contactless  card  readings,  surveillance  camera  pictures…   •  Stealthily:  data  analysis   –  Clicks,  Searches,  TV  viewing  habits  (e.g.,  NeYlix)   –  NSA  inference   3  Amélie  &  Serge,  EDBT,  11111011111    
  • 4. Personal  data  is  heterogeneous   •  Structured:  rela-onal   •  Semistructured:  HTML,  XML,  Jason…   •  Not  structured:  text  (pdf),  pictures,  music,  video…   •  Metadata:  date,  loca-on…     •  Seman-c:  RDF,  RDFS,  Owl   •  Different  languages,  terminologies,  ontologies,  structures   •  Different  systems,  protocols     •  Varying  quality   4  Amélie  &  Serge,  EDBT,  11111011111    
  • 5. •  Loss  of  func-onali-es  because  of  fragmenta-on   –  You  don’t  know  where  your  data  is,  how  to  maintain  it  up   to  date,  how  to  get  it  some-mes   –  Difficult  to  do  global  search,  maintenance,   synchroniza-on,  archiving...   •  Loss  of  control  over  the  data   –  Difficult  to  control  privacy   –  Difficult  to  control  sharing     –  Leaks  of  private  informa-on   •  Loss  of  freedom   –  Vendor  lock-­‐in   Bad  news   5  Amélie  &  Serge,  EDBT,  11111011111    
  • 6. Alterna-ves   1.  Con-nue  with  this  increasing              mess   –  Use  a  shrink  to  overcome                    the  frustra-on   2.  Regroup  all  your  data  on  the  same  plaYorm   –  Google,  Apple,  Facebook,  …,  a  new  comer   –  Use  a  shrink  to  overcome  resentment   3.  Study  2  years  to  become  a  geek   –  Geeks  know  how  to  manage  their  informa-on     –  Use  a  shrink  to  survive  the  experience   6  Amélie  &  Serge,  EDBT,  11111011111     Where  do  you   keep  your  data?  
  • 7. The  -me  for  PIMS  is  now!   A  memex  is  a  device  in  which  an  individual  stores  all  his  books,  records,  and   communica7ons,  and  which  is  mechanized  so  that  it  may  be  consulted  with   exceeding  speed  and  flexibility.  It  is  an  enlarged  in7mate  supplement  to  his   memory.          Vannevar  Bush,  The  Atlan-c  Monthly,  1945     Defini-on  for  this  talk:  a  Personal  Informa-on   Management  System  is  a  (cloud)  system  that  manages   all  the  informa7on  of  a  person   Amélie  &  Serge,  EDBT,  11111011111     7  
  • 8. The  PIMS:  A  change  in  paradigm   Using  Web  services  today   •  Your  data   •  Running  with  an  external   service   •  On  some  unknown   machines     Your  PIMS     •  Your  data     •  Running  a  local  service   •  On  your  machine   Possibly  for  external  services   •  A  replica  of  the  data   •  On  a  wrapper     •  On  your  machine   Amélie  &  Serge,  EDBT,  11111011111     8  
  • 9. PIMS  in  the  Past  
  • 10. Saving  Personal  Data  –  Old  School   Amélie  &  Serge,  EDBT,  11111011111     10  
  • 11. Searching  Personal  Data  –  Old  School…   Amélie  &  Serge,  EDBT,  11111011111     11  File  cabinet   around  1888  
  • 12. Personal Information Management – the Digital Age
      % grep -r PIMS /home/amelie/presentations
  • 13. First-generation Personal Information Management Systems
      • Storage
         – Archival, safe-keeping
      • Organization
         – Structure
         – Different file types
      • Finding and re-finding information
         – Different from traditional IR/Web search systems
         – Keyword searches not ideal
  • 14. Desktop Search Tools
      • Google Desktop Search (defunct)
      • Apple Spotlight
      • Windows Search
      These tools use IR-style keyword searches, with some metadata filtering
      • They lead to frustration when users cannot find information they know they have
  • 15. Past PIMS projects (late 1990s, 2000s)
      • Lifestreams
         – Time-oriented streams
      • Haystack
         – Uniform data model
      • Stuff I've Seen
         – History of web behavior
      • Dataspaces
         – Semantic connections, data integration
      • Connections, Seetrieve
         – Task-based organization
      • deskWeb
         – Looks at the social network graph
      Various uses of
         – Context
         – Time
         – Social network
  • 16. LifeStreams (Freeman and Gelernter, Yale, 1996-1997)
      • Helps users manage their information
      • Time-centric view of documents
  • 17. Haystack (Karger et al., MIT CSAIL, 1997-2005)
      Allows users to store, examine and manipulate their information
      • Uniform Data Model
      • Semi-structured Data
      • Captures relationships
      • Separate Workspaces
  • 18. Stuff I've Seen (Dumais et al., Microsoft, 2003-2004)
      • Unified Index
      • Integration of sources
      • Re-find information
      • Focus on UI
  • 19. A changing landscape
      • Cloud-based model
      • Heterogeneous data types and formats
      • Need for richer functionalities and semantic analysis
  • 20. A vision for the Future of PIMS
  • 21. All the digital life of an individual: From Memex to MyLifeBits
      Memex
         – Memory index or memory extender
         – Hypertext system by Vannevar Bush in 1945
         – Compress and store all of their books, records, and communications…
         – Provide an "enlarged intimate supplement to one's memory"
      MyLifeBits
         – Microsoft Research project with Gordon Bell (2006)
         – Life-logging
         – All documents read or produced by Bell, CDs, emails, web pages browsed, phone and instant messaging conversations, etc.
  • 22. Some of the digital life?
      • The "Total Capture vision" has its detractors
      • Advantages of selective human memory
         – Ignore irrelevant information to avoid flooding when searching for something
         – Choose what to forget, e.g., unpleasant memories
      • Perhaps PIMS should also be selective
      • More complicated than Total Capture
  • 23. Hypermnesia
      Exceptionally exact or vivid memory, especially as associated with certain mental illnesses
      • For a user: We cannot live knowing that any word, any move will leave a trace?
      • For the ecosystem: We cannot store all the data we produce – lack of storage resources
      "Forgetting is Key to a Healthy Mind", Scientific American (image: Aaron Goodman)
      A main issue is to select the information we choose to keep
  • 24. Nature and value of information
      w5h model (context-based)
      • Changes with time
      • Depends on many dimensions: nature of info, rarity, age, personal bias/taste/opinions…
      • Difficult to estimate the cost to get some info
         – To estimate how much you would spend before you give up
      • Difficult to estimate the value of information you don't have yet
      • Difficult for the system to know what a human remembers
         – Makes crowd sourcing difficult
  • 25. Storage and Archival
      • Fully under user's control
      • Fully available on the cloud
         – Without privacy risk
      • Fully resilient to failure
         – Automatic back-ups
         – Automatic synchronization with other systems/devices
      • Support of access control
         – Simple and intuitive definition across systems/devices
      • Use of encryption
         – Data is stored encrypted in the cloud or on a personal machine connected to the cloud
  • 26. Data integration
      • Old problems revisited
  • 27. Person-centric information integration
      [Figure: architecture diagram centered on Sue's PIMS. Labels: Sue; wrappers W1 … Wn and W'1 … W'n; External Services (e.g., Facebook); Decentralized services D1 … Dm (e.g., Diaspora); Local Services L1 … Lp (e.g., Analytics); Secured net; Bob, Joe, …; Server-centric (S)]
  • 28. Classical data integration problem
      • Choose a schema for the PIMS
      • Choose a mapping between the sources and the mediated schema
      • Extract & load & maintain
         – Data and metadata from sources
      Lots of work
         – On digital libraries
         – On database integration
      [Figure: sources S1 … Sn feeding Sue's PIMS through wrappers]
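The three steps on this slide (choose a mediated schema, map each source to it, extract and load) can be sketched in a few lines. This is only an illustration under stated assumptions: the mediated fields, the two wrapper functions, and the source record layouts are all hypothetical and do not come from the talk.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mediated schema: every item in the PIMS has these four fields.
MEDIATED_FIELDS = ("kind", "who", "when", "text")

Record = Dict[str, str]


def mail_wrapper(raw: Record) -> Record:
    """Map a record from an assumed mail source onto the mediated schema."""
    return {"kind": "mail", "who": raw["from"], "when": raw["date"], "text": raw["body"]}


def calendar_wrapper(raw: Record) -> Record:
    """Map a record from an assumed calendar source onto the mediated schema."""
    return {"kind": "event", "who": raw["organizer"], "when": raw["start"], "text": raw["title"]}


def load(sources: List[Tuple[Callable[[Record], Record], List[Record]]]) -> List[Record]:
    """Extract every source through its wrapper and load the results into one store."""
    store: List[Record] = []
    for wrapper, records in sources:
        for raw in records:
            item = wrapper(raw)
            # Each wrapper must produce exactly the mediated fields.
            assert set(item) == set(MEDIATED_FIELDS)
            store.append(item)
    # ISO dates sort correctly as plain strings.
    return sorted(store, key=lambda item: item["when"])


pims = load([
    (mail_wrapper, [{"from": "bob@example.org", "date": "2015-03-02", "body": "Slides attached"}]),
    (calendar_wrapper, [{"organizer": "sue@example.org", "start": "2015-03-01", "title": "Tutorial"}]),
])
```

Maintenance, the third part of "extract & load & maintain", is left out here; a real system would also need to re-run wrappers when sources change.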
  • 29. Classical knowledge integration problem
      • Enrich the ontology
         – Align concepts and relations in schemas
         – Align objects
      • Reference to external data
      Lots of work
         – On knowledge representation
         – On knowledge integration
      [Figure: personal ontology built from curated knowledge, imported knowledge, imported ontologies, and alignments (computed or curated)]
  • 30. Illustration: entity resolution
      • Mail
         – from: amelie@gmail.com
         – body: … Nikki de Saint-Phalle …
      • Contact
         – Amelie Marian
      • Websearch
         – url: grandpalais.fr/ndsp/
         – … Nikki de Saint-Phalle …
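A minimal sketch of the kind of matching this illustration suggests: linking a mail sender to a contact, and a mention in a mail body to a web search. The normalization and the first-name heuristic are assumptions made for the example, not the method used by the authors; real entity resolution needs far more robust similarity measures.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, and turn punctuation into spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def same_person(contact_name: str, email: str) -> bool:
    """Assumed heuristic: the contact's first name appears in the email's local part."""
    local_part = normalize(email.split("@")[0])
    name_parts = normalize(contact_name).split()
    return bool(name_parts) and name_parts[0] in local_part


def mentions(text: str, entity: str) -> bool:
    """True if the normalized entity name occurs in the normalized text."""
    return normalize(entity) in normalize(text)


# The mail sender resolves to the contact despite the missing accent.
linked = same_person("Amélie Marian", "amelie@gmail.com")
# The mail body and the web search refer to the same entity despite the hyphen.
related = mentions("… Nikki de Saint-Phalle …", "nikki de saint phalle")
```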
  • 32. Memory Tasks
      • The "five Rs" memory tasks (Sellen and Whittaker, CACM 2010)
         – Recollecting
         – Reminiscing
         – Retrieving
         – Reflecting
         – Remembering intentions
  • 33. Recollecting
      • Task-based memory process
      • Retracing steps to recollect information
         – "Where did I leave my keys?"
         – "When was the last time I saw Pierre?"
      • Follow a series of cues to identify information
      Need: Connections between memory objects (integration and navigation)
  • 34. Reminiscing
      • Browsing through past memories to re-live them
      • Experience-based (no specific goal in mind)
         – E.g., looking at old photos
      Need: Connections between memory objects (integration and navigation)
  • 35. Retrieving
      • Retrieving specific information
         – Files, documents, pictures
         – Data snippets
      • Use of metadata
      • Can be combined with recollection
      Need: Query model, Indexes, and Search algorithms
  • 36. Reflecting
      • Learning from the past
         – Identify patterns
         – Personal data analysis
      • Towards a Personal Knowledge Base (PKB)
         – Individual vs. shared knowledge
         – Privacy concerns
      Need: Knowledge Discovery and Mining techniques designed for personal data
  • 37. Remembering Intentions
      • Focus on prospective memory
         – To-do lists
         – Appointment reminders
      • Active focus of commercial companies
         – Google Now
         – Notification apps (time- or location-based)
         – Microsoft Personal Agent project?
      Need: NLP techniques designed for personal data
  • 38. Explaining
      • Users want to understand the information they see, the answers they are given
         – In their professional/social life
      • Difficulties
         – Reasoning with a large number of facts
         – Information is often probabilistic and not public
         – Requires knowing how the information was obtained (its provenance)
  • 39. Serendipity
      • You may hear by chance a song that is going to totally obsess you
      • A librarian may suggest that you read a book that will change your life
      This is serendipity
      • A perfect search engine
      • A perfect recommendation system
      • A perfect computer assistant
      Such systems are boring: they lack serendipity
      Design programs that would help introduce serendipity in our lives
  • 40. Answer Personalization
      • Modifying the query based on the user's ontology and preferences
      • Ranking the result based on the user's preferences
      • Example: How do I get to Alice's place?
         – Modify
            • Alice is Alice.Doe@gmail.com
         – Rank
            • Choose to bike if possible (user's preference if the weather is nice)
            • Choose the route by the river if it is open
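The "modify, then rank" example above can be sketched as follows. The `Route` type, the scoring weights, and the ontology dictionary are hypothetical names introduced for illustration; the slide only gives the example of resolving "Alice" and preferring a bike ride by the river.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Route:
    mode: str            # e.g. "bike" or "car"
    via_river: bool
    is_open: bool = True


def rewrite(target: str, ontology: Dict[str, str]) -> str:
    """Modify step: resolve a name through the user's personal ontology."""
    return ontology.get(target, target)


def rank(routes: List[Route], weather_is_nice: bool) -> List[Route]:
    """Rank step: order usable routes by the user's (assumed) preferences."""
    def score(route: Route) -> int:
        points = 0
        if route.mode == "bike" and weather_is_nice:
            points += 2   # the user prefers biking when the weather is nice
        if route.via_river:
            points += 1   # the user prefers the route by the river
        return points

    usable = [route for route in routes if route.is_open]
    return sorted(usable, key=score, reverse=True)


ontology = {"Alice": "Alice.Doe@gmail.com"}
routes = [
    Route("car", via_river=False),
    Route("bike", via_river=False),
    Route("bike", via_river=True),
    Route("bike", via_river=True, is_open=False),
]
destination = rewrite("Alice", ontology)
best = rank(routes, weather_is_nice=True)[0]
```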
  • 41. Rich search/queries
      Context-aware
      • We remember our data based on contextual cues
      • Personal information is rich in contextual information
         – Metadata
         – Application data
         – Environment knowledge
      • Cognitive Psychology
         – Contextual cues are strong triggers for autobiographical memories
      Interactive
         - I am looking for a great movie I saw about a month ago
         - Was it on TV?
         - No, in a theater.
         - Was it Turkish?
         - Yes.
         - It must be Winter Sleep.
  • 42. Digital Self Architecture @ Rutgers
      • Data Collection
         – Identification, retrieval, storage
         – Personal Extraction Tool: https://github.com/ameliemarian/DigitalSelf
      • Data Integration
         – Multidimensional, context-aware, unified data model
         – w5h Model
      • Search
         – Based on the natural memory retrieval process
         – Context-aware, approximate
         – w5h Search
      • Knowledge Discovery
         – Find connections and patterns
         – Integrates user behavior and feedback
  • 43. Personal data analytics
      Aka Small data
      Elliott Hedman, Design Research Conference
  • 44. Personal data analytics
      • Relatively new topic
         – First International Workshop on Personal Data Analytics in the Internet of Things in 2014
      • Learn from personal data and predictions
         – Personal health and well-being
         – Personal transportation
         – Home automation
      • Issues
         – Data privacy
         – Complexity of "small" data analytics: Less is harder
         – Combine with vertical analytics: large groups of people
         – Varying data quality: imprecision, inconsistencies
  • 45. Focus: Quantified self
      • From sensors & all kinds of data
      • Health and well-being model of the person
      • Provide alerts and counseling
      • Monitoring and support for patients with chronic conditions
      • Preventive medicine
      • Active participation of the person
      • Large-scale learning – privacy issues
  • 46. Towards a Personal Knowledge Base
      • Combine information from different sources to infer facts
         – Personal Facts
         – Personal Rules
         – Personal Ontology
      • Example query: "When was the last time I was in Brussels?"
      • Can use existing tools: RDF, RDFS, SPARQL
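To make the example query concrete, here is a small sketch that combines facts from several sources with one personal rule and then answers "When was the last time I was in Brussels?". It uses plain Python tuples rather than RDF/SPARQL to stay self-contained; all facts, dates, source names, and the rule itself are invented for illustration.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

Visit = Tuple[str, date, str]  # (place, day, source of the fact)

# Personal facts extracted directly from sources (invented sample data).
visits: List[Visit] = [
    ("Brussels", date(2014, 3, 24), "photo GPS tag"),
    ("Paris", date(2015, 1, 10), "train ticket email"),
]

# Facts that only imply a visit once a rule is applied.
events: Dict[str, Tuple[str, date]] = {"workshop": ("Brussels", date(2015, 3, 23))}
attended: List[str] = ["workshop"]


def infer_visits(attended: List[str], events: Dict[str, Tuple[str, date]]) -> List[Visit]:
    """Personal rule (assumed): attending an event implies being at its location that day."""
    return [(events[e][0], events[e][1], "rule: attended event") for e in attended if e in events]


def last_time_in(place: str, known_visits: List[Visit]) -> Optional[date]:
    """Answer the query by taking the most recent known visit to the place."""
    days = [day for (visited, day, _source) in known_visits if visited == place]
    return max(days, default=None)


all_visits = visits + infer_visits(attended, events)
answer = last_time_in("Brussels", all_visits)
```

Keeping the source alongside each fact is a deliberate choice: it is the minimum needed to explain an answer later, in the spirit of the provenance requirement mentioned under "Explaining".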
  • 47. Access control and security
  • 48. Is privacy needed?
      • Because young people are more likely than adults to expose their personal life online, privacy is no longer the social norm (M. Zuckerberg)
      • Proved totally wrong
         – E.g., young people turn to ephemeral means of communication (Snapchat)
      • Privacy paradox: Internet users are concerned about privacy but mostly ignore it in their behavior
  • 49. Different architectures
      • Connection with vendors (same for other services)
         – Two-tier: PIMS with a vendor relation system, connected to vendors V1, V2, V3
         – Three-tier: PIMS connected to vendors V1, V2, V3 through a trusted intermediary
      • Secure P2P
         – Distributed network (P2P)
         – Secure hardware (e.g., FreedomBox)
  • 50. Secure devices
      • Secure portable tokens: Secure MCU + Flash storage
         – Issues: limitations of the device
         – Example: personal medical folder
      • Work of [Anciaux, Pucheral]
  • 51. Reducing or increasing the security risk?
      • An intrusion on my PIMS puts all my information at risk
      • Hard to be riskier than today's model
         – Hardly comforting
      • The PIMS is run by a professional operator
         – Security/privacy is guaranteed by contract
         – Application code is verified by the operator
         – The PIMS monitors the user's actions to prevent security violations
      • Data of different users are isolated
         – Less tempting for pirates
      • The PIMS does not solve the security issues
      • It provides a better environment to address them
  • 52. Other issues
      • Self administration
      • Synchronization and task sequencing
      • Internet of things
  • 53. Support for system administration
      • It should require epsilon competence
         – Users are often incompetent and in particular understand little about access control/security
      • It should be epsilon work
         – Users are not interested
      • The PIMS helps
         – Administrate external applications
         – Synchronize/backup data
         – Select services and options
         – Manage access rights
      • Related work
         – Work on self-tuning systems/databases
         – Need for work on automatically generating access control policies from the behavior of users
  • 54. Synchronization and task sequencing across devices
      • Many possible approaches
      • Service-oriented architecture
      • Workflow
         – Transfer workflow technology to the masses
      • Mashup
         – Uses content from more than one source to create a single new service displayed in a single graphical interface
         – E.g., Yahoo Pipes
      • "If this then that" style
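The "if this then that" style listed above amounts to pairing a trigger with an action and running every matching rule on each incoming event. The sketch below is an assumption-laden illustration: the event fields, the rule list, and the action strings are all hypothetical, and actions only return a description instead of performing real work.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Rule = Tuple[Callable[[Event], bool], Callable[[Event], str]]  # (trigger, action)


def run_rules(event: Event, rules: List[Rule]) -> List[str]:
    """Apply, in order, the action of every rule whose trigger matches the event."""
    return [action(event) for trigger, action in rules if trigger(event)]


rules: List[Rule] = [
    # If a photo is taken, then back it up to the PIMS.
    (lambda e: e["type"] == "photo_taken",
     lambda e: f"backup {e['name']} to PIMS"),
    # If the photo was taken on the phone, then also sync it to the laptop.
    (lambda e: e["type"] == "photo_taken" and e.get("device") == "phone",
     lambda e: f"sync {e['name']} to laptop"),
    # If a meeting is created, then add a reminder.
    (lambda e: e["type"] == "meeting_created",
     lambda e: f"add reminder for {e['name']}"),
]

actions = run_rules({"type": "photo_taken", "name": "IMG_1.jpg", "device": "phone"}, rules)
```

Ordering the rules in a list gives a simple form of task sequencing; a workflow engine would add dependencies and failure handling on top of this.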
  • 55. A hub for the IoT
      • Internet of things: Interconnection of identifiable computing devices within the existing Internet infrastructure
      • Control of connected objects
      • Explosion of things
         – E.g., heart monitoring implants, biochip transponders on farm animals, automobiles with built-in sensors, field operation devices…
      • According to Gartner, there will be nearly 26 billion devices on the Internet of Things by 2020
      • Many will be personal devices that the PIMS should integrate/control
      • Possibly a killer app for the PIMS
  • 56. Conclusion: The PIMS are arriving
      For societal, technical, industrial reasons
      They will change our lives
  • 57. Society is ready to move
      • Growing resentment
         – Against companies: intrusive marketing, cryptic personalization and business decisions (e.g., on pricing), creepy "big data" inferences
         – Against governments: NSA and its European counterparts
      • Increasing awareness of the dissymmetry
         – Between what these systems know about a person, and what the person actually knows
      • Emerging understanding of the value of personal data for individuals
  • 58. Society is ready to move (2)
      • Privacy control: regulations in Europe
      • Information symmetry: Vendor relation management
      • Many reports/proposals that affirm the ownership of personal data by the person
      • Personal data disclosure initiatives
         – Smart Disclosure (US), MiData (UK), MesInfos (France)
         – Several large companies (network operators, banks, retailers, insurers…) agreeing to share with customers the personal data that they have about them
  • 59. Technology is gearing up
      • System administration is easier
         – Abstraction technologies for servers
         – Virtualization and configuration management tools
      • Open source technology more and more available for services
      • Price of machines is going down
         – A hosted low-cost server is as cheap as 5€/month
         – Paying is no longer a barrier for a majority of people
      You may have friends already doing it
  • 60. Technology is gearing up (2)
      • Many systems & projects
         – Lifestreams, Stuff I've Seen, Haystack, MyLifeBits, Connections, Seetrieve, Personal Dataspaces, or deskWeb
         – YunoHost, Amahi, ArkOS, OwnCloud or Cozy Cloud
      • Some on particular aspects
         – Mailpile for mail
         – Lima for a Dropbox-like service, but at home
         – Personal NAS (network-attached storage), e.g., Synology
         – Personal data store SAMI of Samsung...
      • Many more
  • 61. Industry is interested (1): Pre-digital companies
      • E.g., hotels or banks
      • Disintermediated from their customers by pure Internet players such as Google, Amazon, Booking.com, Mint
      • In PIMS, they can rebuild direct interaction
      • The playing field is neutral
         – Unlike on the Internet where they have less data
      • They can offer new services without compromising privacy
  • 62. Industry is interested (2): Home appliances companies
      • Many boxes deployed at home or in datacenters
         – Internet access and TV "boxes", NAS servers, "smart" meters provided by energy vendors, home automation systems, "digital lockers"…
      • Personal data spaces dedicated to specific usage
      • Could evolve to become more generic
      • Control of private Internet of objects
  • 63. Industry is interested (3): Pure Internet players
      • Amazon: great know-how in providing services
      • Facebook, Google: cannot afford to be left out of a movement in personal data management
      • Very far from their business model based on personalized advertising
      • Moving to this new market would require major changes & the clarification of the relationship with users w.r.t. data monetization
  • 64. They will change our lives: (1) rebalance the Web
      • User control over their data
         – Who has access to what, under what rules, to do what
      • User empowerment
         – They choose services freely & they can leave a service
      • Participation in a more "neutral" Web
         – With the "network effects", the main platforms are accumulating data/customers and distorting competition
         – The PIMS bring back fairness on the Web
         – Good practices are encouraged, e.g., interoperability, portability
  • 65. They will change our lives: (2) new functionalities
      1. Data integration
      2. Search and queries
      3. Access control and security
      4. Personal data analytics
      5. Self administration
      6. Synchronization and task sequencing
      7. Control of Internet of things
      …
  • 66. (3) So watch out for the killer apps
      • Personal assistant
         – Google Now, enhanced
         – Appointments, trips, shopping
         – Tax, financial, insurance, pension…
      • Health monitoring
         – Quantified self
         – Digital medical records
      • Smart home
      • Elder care monitoring and advising
  • 67. Come and share PIMS
      • Lots of cool problems
      • Lots of opportunities for your favorite data management techno
      • Lots of super useful applications
      • And some killer apps to invent
  • 69. References
      Data Integration
      • A survey of approaches to automatic schema matching, Rahm & Bernstein, 2001.
      • Principles of Data Integration, Doan, Halevy, Ives, 2012.
      • Principles of dataspace systems, Halevy, Franklin, and Maier, CACM, 2006.
      • Data integration, Halevy, Ashish, Bitton, et al., 2005.
      Security and trust
      • Management of Personal Information Disclosure: The Interdependence of Privacy, Security, and Trust, Clare-Marie Karat, John Karat, and Carolyn Brodie.
      • Secure Personal Data Servers: a Vision Paper, T. Allard et al., VLDB, 2010.
      Knowledge management
      • Web Data Management, Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart, Cambridge University Press, 2011.
      • Ontology for PIMS: OntoPIM, Katifori, Poggi, Scannapieco, et al., 2005.
      • Networked Environment for Personal, Ontology-based Management of Unified Knowledge (NEPOMUK).
  • 70. References
      Data extraction
      • A tool for personal data extraction, D. Vianna, A.-M. Yong, C. Xia, A. Marian, and T. Nguyen, IIWeb 2014.
      • Visual Web Information Extraction with Lixto, R. Baumgartner, S. Flesca, G. Gottlob, VLDB 2001.
      Societal issues
      • Managing your digital life with a Personal information management system, Serge Abiteboul, Benjamin André, Daniel Kaplan, Comm. of the ACM, to appear.
      • http://mesinfos.fing.org
      • http://www.midatalab.org.uk
      • https://www.data.gov/consumer/smart-disclosure-policy
      • http://socialsafe.net
  • 71. References
      PIMS
      • As we may think, Vannevar Bush, The Atlantic Monthly, 1945.
      • Personal Information Management, W. Jones and J. Teevan, editors, University of Washington Press, 2007.
      • Beyond total capture: a constructive critique of Lifelogging, Sellen and Whittaker, CACM 2010.
      • Microsoft's Stuff I've Seen project, Dumais et al., SIGIR 2003.
      • MyLifeBits, Gemmell, Bell and Lueder, CACM 2006.
      • deskWeb, Zerr et al., SIGIR 2010.
      • Connections, Soules and Ganger, SOSP 2005.
      • Seetrieve, Gyllstrom and Soules, IUI 2008.
      • LifeStreams, Fertig, Freeman, and Gelernter, CHI 1996.
      • Haystack, Karger et al., CIDR 2005.
      • Understanding What Works: Evaluating PIM Tools, Diane Kelly and Jaime Teevan.