Things Made Easy: One Click CMS Integration with Solr & Drupal


Presented by Peter Wolanin | Acquia, Inc - See conference video -

If you have a new web project or an existing Drupal site, the combination of Drupal and Apache Solr is both powerful and easy to set up thanks to the existing integration code. The module allows for substantial customization through the administrative UI. Drupal facilitates further customization of the UI, indexing, and boosting because its open architecture provides multiple opportunities for custom code to alter the behavior. A couple of code snippets will be followed by a review of other contributed Drupal modules that further enhance the search capability.

Finally, this session will showcase some examples of Drupal sites using Solr, including Acquia's own sites and many well-known enterprise and government sites.

Published in: Technology

  1. 1. May 10, 2012
     Things Made Easy: One Click CMS Integration with Solr & Drupal
     Peter M. Wolanin, Ph.D.
     Momentum Specialist (principal engineer), Acquia, Inc.
     Drupal contributor; co-maintainer of the Drupal Apache Solr Search Integration module
  2. 2. Key Questions to Be Answered
     • What is Drupal?
     • What Apache Solr features are integrated with Drupal?
     • Why is Drupal plus Apache Solr better than starting from scratch?
     • What elements of the search can you configure in the UI without code?
  3. 3. Why Are You Here?
     • You are starting a new website project?
     • You are wondering how hard it is to actually integrate Apache Solr with a website?
     • You already use Drupal but not with Apache Solr?
     • You like things that are easy yet powerful?
  4. 4. Drupal: Web Application Framework + CMS == Social Publishing Platform
     Drupal “… is as much a Social Software platform as it is a web content management system.”
     CMS Watch, The Web CMS Report 2009
     [Diagram: Content Mgmt Systems and Social Software Tools, with labels: content, users, blogs / wikis, workflow, forums / comments, taxonomy, social ranking, semantic web, RSS, social tagging, social networks, analytics]
  5. 5. Drupal + Solr Provides Immediate Access to Rich Search Features
     • Dynamic content requires dynamic navigation, which is provided by an effective search
     • Search facets mean no dead ends
     • Solr provides better keyword relevancy in results
     • Much faster searches for sites with lots of content
     • By avoiding database queries, Drupal with Solr scales better
  6. 6. DEMO: A Drupal 7 partial copy of the conference site with Apache Solr integration
  7. 7. Drupal Has User Accounts, Roles & Permissions
     • Define custom roles
     • Set granular access controls by role
     • Configure user behavior:
       – Registration
       – Email
       – Profiles
       – Pictures
  8. 8. Drupal Modules Add Functionality
     • “There’s a module for that”
     • More than 4100 Drupal 7 community modules
     • Often controlled by role-based permissions
     • Drupal core and modules are GPL v2+, and have a huge, active community
  9. 9. Drupal is Written in PHP, Which Makes for Easy Customization
     • The Drupal architecture encourages and provides many avenues for customization by writing modules rather than patching Drupal core
     • Drupal has a huge community of users. Approximately 10,000 sites report that they use the Apache Solr Search Integration module.
  10. 10. Drupal Adapts to You!!
  11. 11. Drupal Entities are Content + Data
      • Nodes are the basic entity used for text content
      • The entity system is extensible - can represent any data
      • Examples of data stored within Drupal entities:
        – Text
        – Geographic location
        – Node reference
      [Diagram: grid of Node 1 through Node 9]
  12. 12. Entity Types are Enriched With User-configurable Data Fields
      • Define new data fields on a node using the Field API module.
        – Text, images, integers, date, reference, etc.
      • Flexible and configurable in the UI
      • No programming required (many existing modules)
  13. 13. A Strong Framework for Content Classification
      • Core taxonomy system
      • Modules provide taxonomy-based appearance, access control
      • Standard input options include free tagging, flat-controlled, and hierarchical-controlled
  14. 14. Drupal + Solr Search for Business, Government and NGOs apachesolr_search/
  15. 15. Drupal Has Already Solved Many Solr Integration Challenges
      • The most important - content indexing.
      • Facets, sorting, and highlighting of results.
      • Immediate integration with the More Like This and spell-check handlers.
      • An included sub-module integrates content access permissions by indexing them and filtering Solr results based on the current user.
  16. 16. Easy Content Recommendation!
      • Uses the MLT handler
      • Picks fields from the currently viewed node
  17. 17. The Module Has a Pipeline for Indexing Drupal Content to Solr
      • Drupal entities are processed into one (or more) document objects.
      • Each document object is converted to XML and sent to Solr.
      [Diagram: Node object -> Document object -> XML string, converted by Drupal functions]
      Node object fields: title, nid, type
      Document object fields: entity_type, label, entity_id, bundle
      XML string:
        <doc>
          <field name="entity_type">node</field>
          <field name="label">Hello Drupal</field>
          <field name="entity_id">101</field>
          <field name="bundle">session</field>
        </doc>
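
The node-to-document-to-XML flow on slide 17 can be sketched outside Drupal. The example below is a minimal illustration in Python rather than the module's PHP, so that it runs on its own. The function names `node_to_document` and `document_to_xml` are hypothetical, and the field mapping (title to label, nid to entity_id, type to bundle) is read off the slide's diagram rather than taken from the module's source.

```python
import xml.etree.ElementTree as ET


def node_to_document(node):
    """Map node properties onto the document field names shown on the slide."""
    return {
        "entity_type": "node",
        "label": node["title"],
        "entity_id": node["nid"],
        "bundle": node["type"],
    }


def document_to_xml(document):
    """Serialize a flat field mapping into a Solr <doc> element string."""
    doc = ET.Element("doc")
    for name, value in document.items():
        # A multi-valued field becomes several <field> elements with one name.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(item)
    return ET.tostring(doc, encoding="unicode")


# Sample node using the values from the slide.
node = {"nid": 101, "title": "Hello Drupal", "type": "session"}
xml_string = document_to_xml(node_to_document(node))
print(xml_string)
```

In an actual Solr XML update request, one or more `<doc>` elements are wrapped in an `<add>` element before being posted to the update handler.
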
  18. 18. Entity Meta-data Gives Automatic Facets!
      • Content types
      • Taxonomy terms per vocabulary
      • Content authors
      • Posted and modified dates
      • Text and numbers selected via select list/radios/check boxes
  19. 19. Drupal Modules Implement Hooks to Control Indexing and Display
      HOOK_apachesolr_index_document_build($document, $entity, $entity_type, $env_id)
      • By creating a Drupal module (in PHP), you can implement module and theme “hooks” to extend or alter Drupal behavior.
      • Change or replace the data normally indexed.
      • Modify the search results and their appearance.
  20. 20. Updates to an Entity or Related Meta-data Cause Reindexing
      • Drupal entities are indexed during Drupal cron (typically invoked via *nix cron).
      • By using a specialized tracking table, content can automatically be queued for reindex when changed, and subsets of content can potentially be sent to different Solr indexes.
      • Entities include many ID-based reference fields (e.g. the User ID of the author). Changes to the referenced data are also watched.
  21. 21. Indexing Tracking Tables Maintain Order
      +-------------+-----------+-------------+--------+------------+
      | entity_type | entity_id | bundle      | status | changed    |
      +-------------+-----------+-------------+--------+------------+
      | node        |        36 | session     |      1 | 1336520756 |
      | node        |        37 | session     |      1 | 1336510489 |
      | node        |        38 | session     |      1 | 1336510456 |
      | node        |        39 | session     |      1 | 1336510456 |
      | node        |        40 | speaker_bio |      1 | 1336510456 |
      +-------------+-----------+-------------+--------+------------+
      • When a node is updated, the “changed” timestamp is updated.
      • The indexing pipeline tracks the largest timestamp and entity_id which has been indexed.
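
One way to read slide 21: the pair (changed, entity_id) acts as a cursor, so each cron run can resume exactly where the previous one stopped, even when several rows share a timestamp. The sketch below, again in Python rather than PHP, loads the slide's rows into an in-memory SQLite table and walks them in batches. The table name `tracking`, the helper names, the batch size, and the `status = 1` filter are assumptions made for illustration, not the module's actual schema or query.

```python
import sqlite3

# Rows copied from the slide's tracking table.
ROWS = [
    ("node", 36, "session", 1, 1336520756),
    ("node", 37, "session", 1, 1336510489),
    ("node", 38, "session", 1, 1336510456),
    ("node", 39, "session", 1, 1336510456),
    ("node", 40, "speaker_bio", 1, 1336510456),
]


def make_tracking_table():
    """Create an in-memory table shaped like the one on the slide."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tracking (entity_type TEXT, entity_id INTEGER, "
        "bundle TEXT, status INTEGER, changed INTEGER)"
    )
    conn.executemany("INSERT INTO tracking VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def next_batch(conn, last_changed, last_entity_id, limit):
    """Return rows past the cursor, oldest change first, ties broken by ID."""
    return conn.execute(
        "SELECT entity_id, changed FROM tracking "
        "WHERE status = 1 "
        "AND (changed > ? OR (changed = ? AND entity_id > ?)) "
        "ORDER BY changed ASC, entity_id ASC LIMIT ?",
        (last_changed, last_changed, last_entity_id, limit),
    ).fetchall()


conn = make_tracking_table()
position = (0, 0)  # (largest "changed" timestamp, entity_id) indexed so far
indexed_order = []
while True:
    batch = next_batch(conn, position[0], position[1], limit=2)
    if not batch:
        break
    indexed_order.extend(entity_id for entity_id, _ in batch)
    last_id, last_changed = batch[-1]
    position = (last_changed, last_id)

print(indexed_order)
```

Tracking the entity_id alongside the timestamp matters here because nodes 38, 39, and 40 share one "changed" value; a timestamp-only cursor with a batch size of 2 would either skip node 40 or index node 39 twice.
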
  22. 22. Example: Taxonomy Term Classifying a Node is Changed
      [Diagram: term “Grapefruit” changed to “Citrus fruit”]
      function apachesolr_taxonomy_term_update($term)
      • All nodes classified with this term are queued to be re-indexed by setting the “changed” column to the current time.
      • Thus you will correctly match ‘Citrus’ instead of ‘Grapefruit’ for those documents.
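
The queueing step described on slide 22 amounts to bumping the "changed" value for every node that carries the edited term. A minimal Python sketch follows, using plain dictionaries in place of Drupal's database tables. The function name `queue_nodes_for_term`, the term ID 7, and the term-to-node mapping are all hypothetical.

```python
import time


def queue_nodes_for_term(tracking, term_index, term_id, now=None):
    """Set "changed" to the current time for every node tagged with term_id."""
    now = int(time.time()) if now is None else now
    queued = []
    for entity_id in term_index.get(term_id, ()):
        row = tracking.get(entity_id)
        if row is not None:
            row["changed"] = now
            queued.append(entity_id)
    return queued


# Tracking rows keyed by entity_id (values taken from the slide-21 table).
tracking = {
    38: {"bundle": "session", "changed": 1336510456},
    39: {"bundle": "session", "changed": 1336510456},
    40: {"bundle": "speaker_bio", "changed": 1336510456},
}
# Hypothetical mapping: term ID 7 classifies nodes 38 and 40.
term_index = {7: [38, 40]}

queued = queue_nodes_for_term(tracking, term_index, 7, now=1336600000)
print(queued)
```

Because the new timestamp is larger than anything already indexed, the queued nodes move to the end of the indexing order and are picked up on a later cron run.
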
  23. 23. When Unpublished, Content is Purged
      • Drupal core includes a simple editorial workflow where content may be toggled between published (visible) and unpublished (incomplete, removed, spam, etc).
      • The module immediately removes content from the index when unpublished, and also tracks it for future removal in case the Solr server is unavailable.
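
The "remove now, or remember for later" behavior on slide 23 can be sketched as a delete attempt with a fallback queue. The Python example below is only an illustration: `FlakyClient` is a hypothetical stand-in for a Solr client and not a real library API, and the document ID format `node/36` is an assumption.

```python
class SolrUnavailable(Exception):
    """Raised by the stand-in client when the server cannot be reached."""


class FlakyClient:
    """Hypothetical stand-in for a Solr client; not a real library API."""

    def __init__(self, available):
        self.available = available
        self.deleted = []

    def delete_by_id(self, doc_id):
        if not self.available:
            raise SolrUnavailable(doc_id)
        self.deleted.append(doc_id)


def unpublish(client, pending, doc_id):
    """Try to delete immediately; remember the ID if the server is down."""
    try:
        client.delete_by_id(doc_id)
    except SolrUnavailable:
        pending.add(doc_id)


def retry_pending(client, pending):
    """Flush remembered deletions later, for example during a cron run."""
    for doc_id in sorted(pending):
        try:
            client.delete_by_id(doc_id)
        except SolrUnavailable:
            return  # still down; keep the remaining IDs queued
        pending.discard(doc_id)


client = FlakyClient(available=False)
pending = set()
unpublish(client, pending, "node/36")  # server down: the ID is queued
queued_while_down = set(pending)

client.available = True
retry_pending(client, pending)  # server back: the queued deletion is sent
print(client.deleted, pending)
```
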
  24. 24. Search Using Dismax Query Parsing & Boosting Features
      • Dynamic fields in schema.xml used to index standard and custom entity data fields
      • Dismax (or EDismax) handler used for keyword searching across multiple fields and per-field boosts
      • Query-time boosting options available in the UI
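
Per-field boosts with the Dismax family of parsers are expressed through Solr's `qf` request parameter, where each field name is followed by `^` and a boost factor. The Python sketch below builds such a parameter set. `defType` and `qf` are standard Solr parameters; the function name `build_dismax_params`, the `content` field, and the boost values are assumptions chosen for illustration, not the module's defaults.

```python
from urllib.parse import parse_qs, urlencode


def build_dismax_params(keywords, field_boosts, use_edismax=True):
    """Build request parameters for a keyword search with per-field boosts."""
    # qf lists the fields to search, each with an optional ^boost suffix.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {
        "q": keywords,
        "defType": "edismax" if use_edismax else "dismax",
        "qf": qf,
    }


# Weight matches in the title field ("label") above matches in body text.
params = build_dismax_params("solr drupal", {"label": 5.0, "content": 1.0})
query_string = urlencode(params)
decoded = parse_qs(query_string)
print(query_string)
```

With these weights, a keyword match in `label` contributes more to the relevancy score than the same match in `content`, which is the kind of adjustment the slide describes as configurable in the UI.
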
  25. 25. A Query Object Is Used to Prepare and Run Searches
      HOOK_apachesolr_query_prepare($query)
      $query->setParam('hl.fl', $field);
      $keys = $query->getParam('q');
      $response = $query->search();
  26. 26. More Modules Available to Add More Features
      A few examples:
      • ApacheSolr Attachments
      • Apache Solr Multisite Search
      • Apache Solr Organic Groups Integration
      • Apachesolr User indexing
      • Apachesolr Commerce
  27. 27. To Wrap Up!
      • Drupal has extensive Apache Solr integration already, and is highly customizable.
      • The Drupal platform is widely adopted, and the Drupal community drives rapid innovation.
      • Acquia provides Enterprise Drupal support and a network of partners.
      • Acquia includes a secure, hosted Solr index with every support subscription.
  28. 28. Did I Answer These?
      • What is Drupal?
      • What Apache Solr features are integrated with Drupal?
      • Why is Drupal plus Apache Solr better than starting from scratch?
      • What elements of the search can you configure in the UI without code?
  29. 29. Other PHP Integration Tools
      •
      •
      •
      Don’t use the serialized PHP response format in a custom integration - use the JSON writer.
  30. 30. Acquia is Hiring!
      • Do you love Drupal, Solr, the LAMP stack, DevOps or anything related, and working at a fast-growing and successful startup?
      • Boston and Portland area U.S. offices.
      • Some remote opportunities as well.
      • Come talk to me! pwolanin in IRC #drupal or #solr
  31. 31. Resources ... Questions?
      drupalconchi_day2_attain_apache_solr_coding_chops