The current DocGraph social graph was built in Neo4j. With the enhancements in Neo4j 2.0, now is a good time to rebuild it. The goal of this session is to show participants how simple it is to perform basic graph analysis of a healthcare dataset.
2. About Me
• My Blog: http://www.intelliwareness.org
• Find me on Twitter: @davefauth
• Email me: dsfauth@gmail.com
• GitHub: http://github.com/davidfauth
10. Healthcare Data
• Recommend watching Fred Trotter speak at GraphConnect – SF
• Moving from no data -> bad data -> better data -> good data
• Claims Data
– Hard to accurately describe what a doctor is doing and how they are getting paid without claims data
– Limited and not a good data set by any standard
11. Examples of Bad Data
• Not enough data – More transparency without having to FOIA
• State level data is hard to get
12. Better Data Sets
• DocGraph Data
– One of the “best” available
– “Best” does not mean “good”
• DocGraph Rx
– Prescribing patterns for Medicare Part D patients
• NPPES
• NUCC
13. DocGraph Dataset
• DocGraph by the numbers
– Directed graph
– Average total degree 52.8
– 940,492 providers (graph nodes/vertices)
– 49,685,810 shared edges
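As a quick check on the figures above, the sketch below derives per-provider averages from the node and edge counts. It assumes each of the 49,685,810 edges is a single directed edge. Under that assumption the quoted 52.8 matches edges per provider (the mean in-degree, which equals the mean out-degree), while counting both incoming and outgoing edges per provider would give roughly twice that.

```python
# Counts taken from the slide above.
providers = 940_492      # graph nodes
edges = 49_685_810       # assumed to be directed edges, counted once each

# In a directed graph, mean in-degree == mean out-degree == edges / nodes.
edges_per_provider = edges / providers

# Mean of (in-degree + out-degree) counts every edge at both endpoints.
in_plus_out_per_provider = 2 * edges / providers

print(round(edges_per_provider, 1))        # 52.8
print(round(in_plus_out_per_provider, 1))  # 105.7
```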
17. NPPES
• National Plan and Provider Enumeration System
• Source of NPI (National Provider Identifier)
• No cost download
• Information is entered and updated by provider
• Data quality is good to poor
• CSV file with 314 columns
18. NUCC
• National Uniform Claim Committee
– Healthcare Provider Taxonomy
– No cost download
• CSV file with 5 columns and 830 rows
– Link taxonomy to NPPES reported taxonomy
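Linking the NUCC taxonomy to the taxonomy codes reported in NPPES amounts to a lookup join on the taxonomy code. The following is a minimal sketch using only the standard library. The column names (`NPI`, `Taxonomy Code`, `Code`, `Classification`) and the sample rows are assumptions for illustration and are not copied from the real files, which are much wider.

```python
import csv
import io

# Made-up sample rows standing in for the real downloads.
# Column names are assumptions; the real NPPES file has 314 columns.
NPPES_SAMPLE = """NPI,Provider Last Name,Taxonomy Code
1234567890,DOE,207Q00000X
1234567891,ROE,UNKNOWN000
"""

NUCC_SAMPLE = """Code,Classification
207Q00000X,Family Medicine
"""


def load_taxonomy(nucc_file):
    """Build a taxonomy-code -> classification lookup from the NUCC CSV."""
    return {row["Code"]: row["Classification"] for row in csv.DictReader(nucc_file)}


def label_providers(nppes_file, taxonomy):
    """Attach a classification to each provider; flag codes with no match."""
    labeled = []
    for row in csv.DictReader(nppes_file):
        code = row["Taxonomy Code"]
        labeled.append((row["NPI"], taxonomy.get(code, "UNMATCHED")))
    return labeled


taxonomy = load_taxonomy(io.StringIO(NUCC_SAMPLE))
providers = label_providers(io.StringIO(NPPES_SAMPLE), taxonomy)
for npi, classification in providers:
    print(npi, classification)
```

Keeping unmatched codes visible rather than dropping them is a deliberate choice here, since provider-entered data can contain codes that do not appear in the taxonomy file.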
30. Fraud Referrals
April 2013 - The owner and another senior executive of Sacred Heart Hospital and four physicians affiliated with the west side facility were arrested today for allegedly conspiring to pay and receive illegal kickbacks, including more than $225,000 in cash, along with other forms of payment, in exchange for the referral of patients insured by Medicare and Medicaid to the hospital, announced U.S. Attorney for the Northern District of Illinois Gary S. Shapiro.
32. DocGraph RX Data
• Originally obtained by ProPublica
• Prescribing pattern for all physicians for Medicare Part D – 2011
• Largest publicly released prescribing database
• 2 sets of data - 30M edges each
• Related to business name and NDC-9 code
– NDC-9 code allows for aggregation of drugs
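To illustrate how an NDC-9 code aggregates drugs, the sketch below rolls package-level codes up to the product level. It assumes codes arrive in the 11-digit 5-4-2 layout (labeler, product, package), so that the first nine digits identify the product regardless of package size. The rows, prescriber ids, and the `ndc9` helper are hypothetical and do not reflect the actual DocGraph RX file layout.

```python
from collections import Counter


def ndc9(ndc11: str) -> str:
    """Return labeler + product digits from an 11-digit (5-4-2) NDC."""
    digits = ndc11.replace("-", "")
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError(f"expected an 11-digit NDC, got {ndc11!r}")
    return digits[:9]  # drop the 2-digit package segment


# Hypothetical rows: (prescriber id, 11-digit NDC, claim count).
rows = [
    ("A", "00000-1111-01", 10),
    ("A", "00000-1111-02", 5),   # same product as above, different package
    ("A", "00000-2222-01", 7),
    ("B", "00000-1111-01", 3),
]

# Sum claims per (prescriber, NDC-9) pair.
totals = Counter()
for prescriber, ndc, claims in rows:
    totals[(prescriber, ndc9(ndc))] += claims

for key, claims in sorted(totals.items()):
    print(key, claims)
```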
38. DocGraph RX Data
• http://whnt.com/2013/03/27/follow-updecatur-family-claims-prescription-drugsfrom-dr-shelinder-aggarwal-killed-their-son/
• http://www.palmbeachpost.com/news/news/state-regional/doctors-booted-fom-medicaidfor-massive-oxy-doses-/nPpMf/
39. DocGraph RX Data
• Back to “bad data”
• http://www.albme.org/actions.html
41. Combine additional datasets
• Medical data
– Doctor referral data
– Medicare doctor prescription practices
– “Dollars for Doctors” – Drug company promotional payments
• Census Data
– Income data
– Poverty data
42. Recommendation Engine?
• Build a graph model of the data
• Build a recommender model from the graph model
• Graphs can be visualized, explained, discussed and debugged collaboratively
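One way to picture a recommender built on a graph model is a shared-neighbor score over referral edges. The sketch below is an illustration only, not the approach used in the talk: the `referrals` graph, the provider names, and the `recommend` function are hypothetical, and the scoring rule (sum of overlaps with similar providers) is one simple choice among many.

```python
from collections import Counter

# Hypothetical referral graph: provider -> set of providers they refer to.
referrals = {
    "dr_a": {"dr_x", "dr_y"},
    "dr_b": {"dr_x", "dr_y", "dr_z"},
    "dr_c": {"dr_y", "dr_z"},
    "dr_d": {"dr_w"},
}


def recommend(graph, provider, top_n=3):
    """Suggest referral targets used by providers with overlapping referrals."""
    mine = graph.get(provider, set())
    scores = Counter()
    for other, theirs in graph.items():
        if other == provider:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # no shared targets, so no evidence of similarity
        for candidate in theirs - mine:
            if candidate != provider:
                scores[candidate] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


print(recommend(referrals, "dr_a"))  # [('dr_z', 3)]
```

Because the model is just nodes and edges, each score can be traced back to the specific shared referrals that produced it, which fits the point above about graphs being easy to explain and debug.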