2. Energy Mission at LBNL
• Li-ion Batteries
• Photovoltaics (Solar Cells)
• Thermoelectrics
• Biofuels
• New Computational Tools
• Cutting-edge Spectroscopic Tools (Advanced Light Source)
http://carboncycle2.lbl.gov/
3. Current Material Design Model is Slow
18 years, on average, from new materials discovery to commercialization
Bringing New Materials to the Market: Eagar, T.W.
Technology Review Feb 1995, 98, 42.
4. Materials Genome Initiative:
A Renaissance of American Manufacturing
“To help businesses discover, develop, and deploy new
materials twice as fast, we're launching what we call the
Materials Genome Initiative. The invention of silicon
circuits and lithium-ion batteries made computers and iPods
and iPads possible -- but it took years to get those
technologies from the drawing board to the marketplace.
We can do it faster.”
- President Obama at Carnegie Mellon
University 6/24/2011
12. Machine Learning
[Figure: materials.bson feeds a learning algorithm, which proposes new materials (Structures 1 to 6). Example questions: How often can you substitute Mg for Ca? What about Na, V, P, O?]
Prof. Gerbrand Ceder (DOI: 10.1103/PhysRevLett.91.135503)
13. Materials Project:
A Play in Three Acts
I. Data generation using HTC
II. Data storage
III. Data analysis/logging
14. Act I: Managing Calculations
• A centralized hub with distributed workers is the only way to go
• Hub is at LBNL
• Store the state in the database
• Overview of running many MPI jobs at many different HPC centers
15. MasterQueue
[Diagram: builder.x creates a new engine and adds it to the queue in master_queue.bson ('The Brain'). manager.x processes at each HPC site pull crystals from the queue. HPC sites: Franklin, Hopper, Carver at NERSC (Oakland); lr1, lr2 on Lawrencium (Berkeley).]
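The queue pattern in the diagram can be sketched in a few lines. This is a minimal in-memory stand-in, not the project's code: the class name, the state names (WAITING, RUNNING, DONE, FAILED), and the crystal ids are all hypothetical. The key idea is that a manager claims a job in a single atomic step so that two sites never run the same crystal. With a real MongoDB backend, that claim would typically be one atomic update (for example PyMongo's `find_one_and_update`) instead of a lock.

```python
import threading


class MasterQueue:
    """In-memory stand-in for a central job queue (hypothetical sketch)."""

    def __init__(self):
        self._jobs = []
        self._lock = threading.Lock()

    def add(self, crystal_id):
        """Builder side: register a new calculation in the WAITING state."""
        with self._lock:
            self._jobs.append(
                {"crystal_id": crystal_id, "state": "WAITING", "host": None}
            )

    def pull(self, host):
        """Manager side: atomically claim the oldest WAITING job, or return None."""
        with self._lock:
            for job in self._jobs:
                if job["state"] == "WAITING":
                    job["state"] = "RUNNING"
                    job["host"] = host
                    return job
        return None

    def finish(self, crystal_id, ok):
        """Manager side: record the outcome so the hub always holds the state."""
        with self._lock:
            for job in self._jobs:
                if job["crystal_id"] == crystal_id:
                    job["state"] = "DONE" if ok else "FAILED"


queue = MasterQueue()
for cid in ["crystal-1", "crystal-2", "crystal-3"]:
    queue.add(cid)

first = queue.pull("hopper")   # a manager on one machine claims a job
second = queue.pull("lr1")     # a manager elsewhere gets a different job
queue.finish(first["crystal_id"], ok=True)
```

Because all state lives in the central queue, a manager process can crash or be restarted without losing track of which crystals are waiting, running, or finished.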
16. Centralized Logging and Management
[Diagram: manager.x processes at O1 and Cathode (MIT); Hopper, Franklin, Carver (NERSC, Oakland); lr1, lr2 (LBNL); and DLX (Kentucky).]
Example MongoDB query:
query = {'elements': {'$all': ['Li', 'O']}, 'nelectrons': {'$lte': 200}}
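To make the semantics of that query concrete, here is a small sketch that evaluates the two operators it uses (`$all` and `$lte`) against plain Python dictionaries. The `matches` helper and the sample documents are illustrative assumptions, not part of MongoDB or of the project's schema. Against a real server the same dictionary would simply be passed to PyMongo's `collection.find(query)`.

```python
def matches(doc, query):
    """Evaluate a tiny subset of MongoDB operators ($all, $lte) on one document."""
    for field, condition in query.items():
        value = doc.get(field)
        if value is None:
            return False
        for op, operand in condition.items():
            if op == "$all":
                # every listed item must appear in the array field
                if not all(item in value for item in operand):
                    return False
            elif op == "$lte":
                if not value <= operand:
                    return False
            else:
                raise ValueError(f"unsupported operator: {op}")
    return True


query = {"elements": {"$all": ["Li", "O"]}, "nelectrons": {"$lte": 200}}

# Made-up sample documents; the field names follow the query above.
materials = [
    {"formula": "Li2O", "elements": ["Li", "O"], "nelectrons": 14},
    {"formula": "LiCoO2", "elements": ["Li", "Co", "O"], "nelectrons": 46},
    {"formula": "NaCl", "elements": ["Na", "Cl"], "nelectrons": 28},
    {"formula": "Li40O20", "elements": ["Li", "O"], "nelectrons": 280},
]

hits = [m["formula"] for m in materials if matches(m, query)]
print(hits)
```

Both conditions must hold at once: NaCl is rejected for missing Li and O, and the large cell is rejected for exceeding 200 electrons.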
19. Powerful Querying
Every crystal that has (Li or Na or K), (Mn), (O or S or F or Si)
plus one other element except (Zn or Ni or Fe or Cu or Co)
{
  "lattice.volume": {"$lt": 500},
  "elements": {"$all": ["Mn"], "$size": 4, "$nin": ["Zn", "Ni", "Fe", "Cu", "Co"]},
  "atoms": {"$elemMatch": {"oxidation_state": 3, "symbol": "Mn"}},
  "$where": "match_all(this.element_names, ['Li', 'Na', 'K'], ['Mn'], ['O', 'S', 'F', 'Si'])"
}
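The slide does not show how `match_all` is defined. A plausible reading, and it is only an assumption, is that it returns true when every listed group contributes at least one element to the crystal. The Python sketch below implements that reading together with the size and exclusion rules; the names `is_candidate` and `EXCLUDED` are hypothetical, and the volume and oxidation-state conditions are left out for brevity.

```python
def match_all(element_names, *groups):
    """Return True if every group has at least one member in element_names.

    Assumed behavior; the original function is not shown in the slides.
    """
    present = set(element_names)
    return all(present & set(group) for group in groups)


EXCLUDED = {"Zn", "Ni", "Fe", "Cu", "Co"}


def is_candidate(elements):
    """Apply the element rules from the query: 4 elements, none excluded, all groups hit."""
    return (
        len(elements) == 4
        and not EXCLUDED & set(elements)
        and match_all(elements, ["Li", "Na", "K"], ["Mn"], ["O", "S", "F", "Si"])
    )


print(is_candidate(["Li", "Mn", "P", "O"]))   # all groups present, fourth element allowed
print(is_candidate(["Li", "Mn", "Fe", "O"]))  # Fe is excluded
print(is_candidate(["Ca", "Mn", "P", "O"]))   # no Li, Na, or K
```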
20. pre-MongoDB :(
((SELECT structure.structureid FROM structure NATURAL INNER JOIN
database NATURAL INNER JOIN databaseentry WHERE structureid IN
((select structure.structureid from structure NATURAL INNER JOIN
elemententry where elemententry.symbol='Li' INTERSECT select
structure.structureid from structure NATURAL INNER JOIN elemententry
where elemententry.symbol='O') INTERSECT select structure.structureid
from structure NATURAL INNER JOIN database NATURAL INNER JOIN
databaseentry where database.title='ICSD')) EXCEPT (SELECT
structure.structureid FROM structure where structure.entryid IN
(select duplicateentry.entryid from duplicateentry))) EXCEPT (SELECT
structure.structureid FROM structure where structure.entryid IN
(select entryid from removals))
Search for materials with Li and O,
excluding duplicates
27. Integrated logging just makes sense
• Semi-structured data easily stored
• Can correlate with all other data
• Automation Layer: Failed tasks
• Web/App Layer
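As a rough illustration of the points above, the sketch below treats log entries as semi-structured records (each entry carries only the fields relevant to it), picks out failed tasks for an automation layer, and correlates them with task data. Every field name, host, message, and formula here is made up for the example; the slides do not show the actual log schema.

```python
from collections import Counter

# Semi-structured log entries: fields differ from one entry to the next.
log_entries = [
    {"task_id": 1, "host": "hopper", "level": "INFO", "msg": "started"},
    {"task_id": 1, "host": "hopper", "level": "ERROR",
     "msg": "walltime exceeded", "walltime_s": 3600},
    {"task_id": 2, "host": "lr1", "level": "INFO",
     "msg": "finished", "energy_ev": -12.5},
    {"task_id": 3, "host": "hopper", "level": "ERROR",
     "msg": "convergence failure", "nsteps": 200},
]

# Other data stored alongside the logs (hypothetical task records).
tasks = {
    1: {"formula": "LiMnO2"},
    2: {"formula": "Li2O"},
    3: {"formula": "NaMnO2"},
}

# Automation layer: find failed tasks that may need resubmission.
failed = [entry for entry in log_entries if entry["level"] == "ERROR"]
to_resubmit = sorted({entry["task_id"] for entry in failed})

# Correlate logs with other data: which materials failed, and on which hosts?
failed_formulas = [tasks[task_id]["formula"] for task_id in to_resubmit]
failures_by_host = Counter(entry["host"] for entry in failed)

print(to_resubmit, failed_formulas, dict(failures_by_host))
```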
28. Conclusions
• MongoDB is a very versatile tool
• Used in several different cases
• Elegant query syntax
• Very useful for scientific data storage
• A lot of exciting future ideas