Unique Identifier Generation in Distributed Environment
Jianhan Zhu
• In distributed systems, sequential IDs are not always an option
• IDs should be as short as possible for sharing
• A 36-character GUID could be too long: 00017071-8786-42a5-94d9-dc0f62f585fc
• A balance between ID length and probability of collision
• The shorter the ID, the higher the probability of collision
[Figure: probability of collision (%, 0 to 100) plotted against ID length (0 to 36 characters)]
Birthday Paradox
• For 𝑛 randomly chosen persons, the probability that at least two of them have the same birthday
$$P(\text{collision}) \approx 1 - e^{-\frac{n^2}{2x}}$$
• 𝑥: all possible ID values
• 𝑛: number of IDs we plan to have
$x = 62^8$
• 52 Alphabetic and 10 numeric characters
• ID of length 8
𝑛
• Currently 40K, giving a collision probability of about 0.0004%
• At 1 million, the probability is 0.23%
• Expected to grow to tens of millions or more in future
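The approximation above is easy to evaluate directly. The following is a minimal Python sketch, not the author's code; the function name `collision_probability` and its parameters are illustrative assumptions. It plugs the slide's values (a 62-character alphabet, ID length 8) into the birthday-paradox formula.

```python
import math

ALPHABET_SIZE = 62  # 52 alphabetic + 10 numeric characters, as on the slide


def collision_probability(n: int, length: int = 8,
                          alphabet_size: int = ALPHABET_SIZE) -> float:
    """Birthday-paradox approximation 1 - exp(-n^2 / (2x)).

    x is the number of possible IDs, alphabet_size ** length.
    (Hypothetical helper name, used here for illustration only.)
    """
    x = alphabet_size ** length
    # -expm1(-t) equals 1 - exp(-t) but stays accurate when t is tiny.
    return -math.expm1(-(n * n) / (2 * x))


if __name__ == "__main__":
    for n in (40_000, 1_000_000):
        print(f"{n:>9,} IDs: {collision_probability(n) * 100:.4f}%")
```

For 1 million IDs this gives roughly 0.23%, in line with the figure quoted above; the probability grows with the square of 𝑛, which is why the expected growth to tens of millions matters.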
In triple store:
• Generated ID: 1AFu55Hs
• Prefix: https://id.parliament.uk
• Resource URI: https://id.parliament.uk/1AFu55Hs
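As an illustration of how such an ID and its resource URI might be produced, here is a short Python sketch. It is an assumption-laden example rather than the platform's implementation: Python's `secrets` module stands in for the cryptographic random source, and the names `generate_id` and `resource_uri` are hypothetical.

```python
import secrets
import string

# 52 alphabetic + 10 numeric characters = 62 symbols
ALPHABET = string.ascii_letters + string.digits
ID_PREFIX = "https://id.parliament.uk"  # prefix taken from the slide


def generate_id(length: int = 8) -> str:
    """Draw `length` characters uniformly from the 62-symbol alphabet
    using a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def resource_uri(identifier: str, prefix: str = ID_PREFIX) -> str:
    """Join the prefix and the generated ID into a resource URI."""
    return f"{prefix}/{identifier}"


if __name__ == "__main__":
    new_id = generate_id()
    print(new_id, resource_uri(new_id))
```

Because each generator draws independently, no coordination between nodes is needed; the price is the small collision probability discussed above.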
• Results for different ID lengths:
• Random data source: Crypto Random

ID length   IDs generated before a collision (simulation)   Probability of collision (number of IDs)
5           36K                                             51% (36K)
6           289K                                            52% (289K)
7           2.3 million                                     51% (2.3 million)
8           Out of memory                                   0.002% (100K)
                                                            0.06% (0.5 million)
                                                            0.23% (1 million)
                                                            5.56% (5 million)
                                                            20.5% (10 million)
9           -                                               0.37% (10 million)
                                                            8.82% (50 million)
                                                            30.88% (100 million)
10          -                                               0.59% (100 million)
                                                            2.35% (200 million)
                                                            5.22% (300 million)
                                                            13.84% (500 million)
                                                            44.88% (1000 million)
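A simulation of the kind summarised here can be sketched as follows. This is a hedged reconstruction, not the original experiment: the function name `ids_until_collision` is hypothetical, and `random.SystemRandom` (backed by the operating system's secure random source) is assumed as a stand-in for "Crypto Random".

```python
import random
import string
from typing import Optional

ALPHABET = string.ascii_letters + string.digits  # 62 symbols


def ids_until_collision(length: int,
                        rng: Optional[random.Random] = None) -> int:
    """Generate random IDs until one repeats.

    Returns the number of IDs drawn, including the one that collided.
    Every ID seen so far is kept in memory, so the set grows with
    each draw.
    """
    rng = rng or random.SystemRandom()
    seen = set()
    while True:
        candidate = "".join(rng.choices(ALPHABET, k=length))
        if candidate in seen:
            return len(seen) + 1
        seen.add(candidate)


if __name__ == "__main__":
    for length in (3, 4, 5):
        print(length, ids_until_collision(length))
```

Since the set of seen IDs must be held in memory, the approach becomes expensive for longer IDs, which is consistent with the out-of-memory result reported for length 8. A single run is also just one sample; averaging several runs gives a steadier estimate.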
• Data estimates on current triple store http://indexing.parliament.uk
• 174 million triples
• 9.2 million unique subjects (2.9 million blank nodes)
Subject Prefix                                            Num of Triples   Num of Unique Subjects   Average Num of Triples per Subject
http://data.parliament.uk/pimsdata/                       92,708,852       2,960,851                31.3
http://data.parliament.uk/edms/                           24,196,297       1,939,024                12.5
http://hansard.intranet.data.parliament.uk/               18,115,694       552,505                  32.8
http://tabledpq.indexing.parliament.uk/                   6,967,006        191,173                  36.4
http://data.parliament.uk/writtenparliamentaryquestion/   3,716,166        70,199                   52.9
http://esid.parliament.uk/EUDocument/                     3,247,707        149,035                  21.8
http://data.parliament.uk/depositedpapers/                2,551,168        80,185                   31.8
http://services.paperslaid.devci.dev.parliament.uk/       644,193          23,121                   27.9
http://data.parliament.uk/terms/uncontrolled/             606,951          172,373                  3.5
http://data.parliament.uk/resources/                      490,192          31,509                   15.6
http://data.parliament.uk/currentawareness/               487,227          22,044                   22.1
http://paperslaidpoller.parliament.uk/                    396,153          9,636                    41.1
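To relate these volumes to ID length, one can ask what the shortest length is that keeps the collision probability under a chosen threshold. The sketch below is illustrative only: the 1% threshold is an arbitrary assumption, not a figure from the slides, and `min_id_length` is a hypothetical helper built on the birthday-paradox approximation.

```python
import math


def collision_probability(n: int, length: int, alphabet_size: int = 62) -> float:
    """Birthday-paradox approximation 1 - exp(-n^2 / (2x)), x = alphabet_size ** length."""
    x = alphabet_size ** length
    return -math.expm1(-(n * n) / (2 * x))


def min_id_length(n: int, max_probability: float,
                  alphabet_size: int = 62, max_length: int = 36) -> int:
    """Smallest ID length whose estimated collision probability for n IDs
    does not exceed max_probability (threshold chosen by the caller)."""
    for length in range(1, max_length + 1):
        if collision_probability(n, length, alphabet_size) <= max_probability:
            return length
    raise ValueError("no length up to max_length meets the threshold")


if __name__ == "__main__":
    subjects = 9_200_000  # unique subjects in the current triple store
    print(f"length 8: {collision_probability(subjects, 8) * 100:.1f}%")
    print("shortest length under 1%:", min_id_length(subjects, 0.01))
```

Under the assumption that every one of the 9.2 million unique subjects received a random ID, length 8 would carry a collision probability well above 1%, while length 9 would bring it below that threshold.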
• Conclusions:
• Use 8-character IDs for the near future
• The ID length will need to increase to accommodate more IDs
• At 1 million (0.23%)?
• Will data be structured differently from the previous two triple stores?
• In future, add an ID collision check against the triple store if the effect on performance is acceptable
• Challenges:
• If a collision occurs, how do we spot it? (Log generated IDs?)
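The collision check and the logging idea can be combined in one place. The following Python sketch is a hypothetical illustration: the `exists` callback stands in for whatever lookup the triple store would provide (no real store API is assumed), and `generate_unique_id` and the logger name are invented for the example.

```python
import logging
import secrets
import string
from typing import Callable

ALPHABET = string.ascii_letters + string.digits  # 62 symbols
logger = logging.getLogger("id_generation")  # hypothetical logger name


def generate_unique_id(exists: Callable[[str], bool],
                       length: int = 8, max_attempts: int = 5) -> str:
    """Generate an ID, retrying when `exists` reports it is already taken.

    `exists` is a placeholder for a lookup against the triple store.
    Every generated ID and every detected collision is logged.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not exists(candidate):
            logger.info("generated id %s", candidate)
            return candidate
        logger.warning("collision on %s (attempt %d)", candidate, attempt)
    raise RuntimeError(f"no free ID found in {max_attempts} attempts")


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    used = set()  # in-memory stand-in for the triple store
    new_id = generate_unique_id(lambda candidate: candidate in used)
    used.add(new_id)
    print(new_id)
```

Note that a check followed by a separate write is not atomic: two generators could both pass the check with the same ID before either stores it, so a log of generated IDs remains useful for spotting collisions after the fact.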
Further Reading
• https://en.wikipedia.org/wiki/Birthday_problem
• https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator
• https://en.wikipedia.org/wiki/Universally_unique_identifier
• https://eager.io/blog/how-long-does-an-id-need-to-be/
• https://github.com/twitter/snowflake
• Parliament Data Platform: https://api.parliament.uk/openapi.json