3. Scaling Search in Oak with Solr
adaptTo() 2013 3
Ā§ļ§āÆ Following up last yearās āOak / Solr
integrationā session
4. Looking at last yearās agenda
adaptTo() 2013 4
Ā§ļ§āÆ What we have:
Ā§ļ§āÆ Search Oak content with Solr
independently
Ā§ļ§āÆ Oak running with a Solr index
Ā§ļ§āÆ Embedded
Ā
Ā
Ā§ļ§āÆ Remote
Ā
Ā§ļ§āÆ What we miss:
Ā§ļ§āÆ Solr based MicroKernel
5. Apache Jackrabbit Oak
adaptTo() 2013 5
Ā§ļ§āÆ Scalable content repository
Ā§ļ§āÆ JCR compatibility (to some degree)
Ā§ļ§āÆ Performance especially for concurrent
access
Ā§ļ§āÆ Scalability for huge repositories (> 100M
nodes)
Ā§ļ§āÆ Support managed environments (e.g. OSGi)
Ā§ļ§āÆ Cloud deployments
6. Apache Solr
adaptTo() 2013 6
Ā§ļ§āÆ Enterprise search platform
Ā§ļ§āÆ Based on Apache Lucene
Ā§ļ§āÆ Full text, faceting, highlighting, etc.
Ā§ļ§āÆ Dynamic cluster architecture
Ā§ļ§āÆ HTTP API
Ā§ļ§āÆ Latest release 4.4.0
9. IndexEditor API
adaptTo() 2013 9
Ā§ļ§āÆ an
Ā Editor
Ā receives
Ā info
Ā about
Ā changes
Ā to
Ā the
Ā
content
Ā tree
Ā
Ā§ļ§āÆ the
Ā Editor
Ā evaluates
Ā the
Ā status
Ā before
Ā and
Ā a6er
Ā a
Ā
speciļ¬c
Ā Oak
Ā commit
Ā
Ā§ļ§āÆ can
Ā reject
Ā or
Ā accept
Ā the
Ā changes
Ā (by
Ā even
Ā
modifying
Ā the
Ā tree
Ā itself)
Ā
Ā§ļ§āÆ an
Ā Editor
Ā is
Ā executed
Ā right
Ā before
Ā an
Ā Oak
Ā commit
Ā
is
Ā persisted
Ā
Ā§ļ§āÆ the
Ā SolrIndexHook
Ā maps
Ā changes
Ā between
Ā
statuses
Ā in
Ā the
Ā content
Ā tree
Ā to
Ā Solr
Ā documents
Ā
and
Ā sends
Ā them
Ā to
Ā the
Ā Solr
Ā instance(s)
Ā
11. QueryIndex API
adaptTo() 2013 11
Ā§ļ§āÆ evaluate the query (Filter) cost for each
available index to find the ābestā
Ā§ļ§āÆ execute the query (Filter) against a specific
revision and root node state
Ā§ļ§āÆ internally
Ā the
Ā Filter
Ā is
Ā usually
Ā mapped
Ā to
Ā the
Ā
underlying
Ā implementaNon
Ā counterpart
Ā
Ā§ļ§āÆ improved
Ā support
Ā for
Ā full
Ā text
Ā queries
Ā in
Ā Oak
Ā
Ā§ļ§āÆ eventually view the query āplanā
12. QueryIndex API
adaptTo() 2013 12
Ā§ļ§āÆ mapping Filter restrictions:
Ā§ļ§āÆ Property restrictions:
Ā§ļ§āÆ Each
Ā property
Ā is
Ā mapped
Ā as
Ā a
Ā ļ¬eld
Ā
āāÆ Can
Ā use
Ā term
Ā queries
Ā for
Ā simple
Ā value
Ā matching
Ā or
Ā range
Ā
queries
Ā for
Ā āļ¬rst
Ā to
Ā lastā
Ā
Ā§ļ§āÆ Path restrictions
āāÆ indexed
Ā as
Ā strings
Ā and
Ā with
Ā special
Ā ļ¬elds
Ā for
Ā parent
Ā /
Ā
children
Ā /
Ā descendant
Ā matching
Ā
Ā§ļ§āÆ Full text expressions
āāÆ use
Ā (E)DisMax
Ā query
Ā parser
Ā and/or
Ā fallback
Ā ļ¬elds
Ā
Ā§ļ§āÆ NodeType restrictions TBD
13. Configuring the Solr index in Oak
adaptTo() 2013 13
Ā§ļ§āÆ create an index configuration under
Ā§ļ§āÆ /oak:index/solrIdx
Ā§ļ§āÆ jcr:primaryType
Ā =
Ā oak:queryIndexDeļ¬niNon
Ā
Ā§ļ§āÆ type
Ā =
Ā solr
Ā
āāÆ plus
Ā some
Ā mandatory
Ā props
Ā (e.g.
Ā reindex)
Ā
Ā§ļ§āÆ AddiNonal
Ā properNes
Ā if
Ā want
Ā to
Ā run
Ā an
Ā
embedded
Ā Solr
Ā server
Ā (more
Ā on
Ā this
Ā later)
Ā
14. Configuring the Solr index in Oak
adaptTo() 2013 14
Ā§ļ§āÆ Pluggable Solr server providers and
configuration
Ā§ļ§āÆ to allow different deployment scenarios
Ā§ļ§āÆ to allow custom configuration
15. Oak Solr core bundle
adaptTo() 2013 15
Ā§ļ§āÆ provides basic API and implementation
to index and search Oak content on Solr
Ā§ļ§āÆ Solr implementation of IndexEditor
Ā§ļ§āÆ Solr implementation of QueryIndex
Ā§ļ§āÆ allows configurable mapping between
Ā§ļ§āÆ property types and fields
Ā§ļ§āÆ e.g.
Ā all
Ā binaries
Ā should
Ā be
Ā indexed
Ā in
Ā speciļ¬c
Ā ļ¬eld
Ā
Ā§ļ§āÆ filter restrictions and fields
Ā§ļ§āÆ e.g.
Ā path
Ā restricNons
Ā for
Ā children
Ā should
Ā hit
Ā a
Ā
certain
Ā ļ¬eld
Ā
16. Oak Solr Embedded bundle
adaptTo() 2013 16
Ā§ļ§āÆ provides support for indexing and
searching on an embedded Solr instance
Ā§ļ§āÆ running inside the Oak repository
Ā§ļ§āÆ configuration can be done
Ā§ļ§āÆ via
Ā the
Ā repository
Ā
āāÆ stored
Ā in
Ā the
Ā index
Ā deļ¬niNon
Ā node
Ā
Ā§ļ§āÆ via
Ā OSGi
Ā
17. Oak Solr Remote bundle
adaptTo() 2013 17
Ā§ļ§āÆ provides support for indexing and
searching on remote Solr instances
Ā§ļ§āÆ single Solr instance
Ā§ļ§āÆ distributed / replicated Solr cluster
Ā§ļ§āÆ SolrCloud deployments
Ā§ļ§āÆ configuration is done via OSGi
18. OSGi platform running on Oak with Solr
adaptTo() 2013 18
Ā§ļ§āÆ Star
Ā instance
Ā with
Ā Oak
Ā repository
Ā
Ā§ļ§āÆ Add
Ā a
Ā bunch
Ā of
Ā bundles
Ā for
Ā Solr
Ā
āāÆ oak-Āāsolr-Āācore,
Ā oak-Āāsolr-Āāremote,
Ā zookeeper,
Ā
servicemix.bundles.solr-Āāsolrj,
Ā etc.
Ā
Ā§ļ§āÆ Conļ¬gure
Ā the
Ā Solr
Ā instance
Ā
Ā§ļ§āÆ Conļ¬gure
Ā oak-Āāsolr-Āāremote
Ā providers
Ā via
Ā OSGi
Ā
21. What needs to be done
adaptTo() 2013 21
Ā§ļ§āÆ Easy OSGi deployment
Ā§ļ§āÆ Move common configuration stuff in
oak-solr-core
Ā§ļ§āÆ Leverage new full text expression API
Ā§ļ§āÆ Solr MK?