Presented by: Gaston Gonzalez, headwire.com, Inc.
Advanced AEM Search
Consuming External Content and Enriching Content with Apache
Camel
About Me
• Senior Technical Architect at
headwire.com, Inc.
• Search Engineer / Developer
• AEM Architect / Developer
• Creator of AEM Solr Search
• Tech Blogger
• UNIX Systems Administrator
Typical AEM + Search
Integration
Typical AEM + Search Architecture
Pros
• Straightforward implementation
• Simple architecture (AEM + Search)
Cons
• Complete data model in AEM?
  • Not all data may be in AEM
• Processing overhead
  • Data cleansing, transformation and enrichment handled in AEM
• Fault tolerance
  • What if Solr is down?
• Tight coupling to search platform
Is there another
way?
Goals for a better Architecture
• Offload processing outside of AEM
• Improve fault tolerance
• Provide flexible platform for data cleansing,
transformation and aggregation
• Allow for changes to indexing logic without impacting
AEM
• Search engine agnostic
Introduce an ETL / Document Processor
Document Processing
Document Processing Platform
• Roles & Responsibilities
• Enriches submitted documents prior to indexing.
• Submits documents for indexing.
• Terms & Definitions
• Enrichment: Data cleansing, filtering, transformation,
aggregation, etc.
• Processing Stage: Independent processing unit
responsible for contributing to the enrichment process.
• Pipeline: Consists of one or more processing stages or
sub pipelines.
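The terms above can be sketched in a few lines of plain Java. This is a minimal illustration, not code from the talk: the `Stage` and `Pipeline` types, the two example stages and the `title` / `title_sort` field names are all assumptions made for the example. The point it shows is that a pipeline is itself a stage, which is what allows sub-pipelines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class PipelineSketch {
    /** Hypothetical cleansing stage: trims whitespace from the title. */
    static final Stage TRIM_TITLE = doc -> {
        Map<String, Object> out = new HashMap<>(doc);
        Object title = out.get("title");
        if (title instanceof String) {
            out.put("title", ((String) title).trim());
        }
        return out;
    };

    /** Hypothetical transformation stage: adds a lower-cased sort field. */
    static final Stage ADD_SORT_TITLE = doc -> {
        Map<String, Object> out = new HashMap<>(doc);
        Object title = out.get("title");
        if (title instanceof String) {
            out.put("title_sort", ((String) title).toLowerCase(Locale.ROOT));
        }
        return out;
    };

    /** Runs a document through a pipeline that contains a sub-pipeline. */
    static Map<String, Object> run(String rawTitle) {
        Pipeline cleansing = new Pipeline(TRIM_TITLE);
        Pipeline full = new Pipeline(cleansing, ADD_SORT_TITLE);
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", rawTitle);
        return full.process(doc);
    }

    public static void main(String[] args) {
        System.out.println(run("  Advanced AEM Search  "));
    }
}

/** A processing stage: takes a document and returns the enriched document. */
interface Stage {
    Map<String, Object> process(Map<String, Object> doc);
}

/** A pipeline is one or more stages; it is also a stage, so pipelines can nest. */
final class Pipeline implements Stage {
    private final List<Stage> stages;

    Pipeline(Stage... stages) {
        this.stages = new ArrayList<>(Arrays.asList(stages));
    }

    @Override
    public Map<String, Object> process(Map<String, Object> doc) {
        Map<String, Object> current = doc;
        for (Stage stage : stages) {
            current = stage.process(current);
        }
        return current;
    }
}
```

Each stage copies the document instead of mutating it, which keeps stages independent of each other; that is a design choice of this sketch, not a requirement of the pattern.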
Document Processing Platform
Document processing is really an
integration problem, right?
Integration options, from low to high complexity:
• Integration Library: Apache Camel
• Integration Framework & Stream Processing: Spring Integration; Spring Cloud Data Flow & Cloud Stream
• Enterprise Service Bus: Mule ESB
Apache Camel
Apache Camel
• A lightweight, open source
integration library.
• Mediation engine
• Implements well-known Enterprise
Integration Patterns (EIPs)
• Aggregator
• Content Enricher
• Content-based router
• Message
• Message Translator
• Pipes and Filters
• Splitter…
Why Apache Camel?
• Lightweight: it’s a JAR
• Imposes no runtime constraints
• Routing engine
• Powerful, fluent Java DSL
• Mature open source project
• Extensive list of integration components
• Avoid writing boilerplate code by leveraging EIPs
Apache Camel & EIP Concepts
Message
• Unit of information exchange between applications
Exchange
• Wraps inbound & outbound message + headers
Message Channel
• Allows applications to communicate using messaging
Pipes and Filters
• Perform loosely coupled processing on a message
• Routes and Processors in Camel
Camel’s Data Model
Camel’s Architecture
Importing Product Content into Solr
Problem: “As an AEM developer, I need to import product
content into Solr so that I can display products via search
and on PDPs on my AEM-powered site.”
Let’s use Best Buy’s Product API as an example…
1. Fetch product data ZIP file via HTTP request.
2. Unzip product data.
3. Parse each JSON file to extract individual products.
4. Transform, enrich and cleanse each product as necessary.
5. Submit each product to Solr for indexing.
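Step 2 and the per-file split that feeds steps 3 to 5 can be sketched with the JDK alone. This is only an illustration under stated assumptions: the HTTP fetch, JSON parsing and Solr submission are left out, the ZIP is built in memory so the example needs no network access, and the entry names and JSON bodies are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class UnzipSplitSketch {

    /** Step 2: reads every file entry of a ZIP archive into memory, keyed by entry name. */
    static Map<String, String> unzip(byte[] zipped) {
        Map<String, String> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            byte[] buffer = new byte[4096];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int read;
                // read() returns -1 at the end of the current entry, not of the archive
                while ((read = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, read);
                }
                files.put(entry.getName(), new String(content.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Builds a small in-memory ZIP with two hypothetical product files. */
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("products_0001.json"));
            zip.write("[{\"sku\": 1}]".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("products_0002.json"));
            zip.write("[{\"sku\": 2}]".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Steps 3-5 (parse, enrich, index) would run once per file; here each file is only printed.
        unzip(sampleZip()).forEach((name, json) -> System.out.println(name + " -> " + json));
    }
}
```

Even this fragment is mostly stream handling, which is the kind of boilerplate an integration library is meant to take over.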
A solution using EIPs
A solution using Camel
A short list of Camel Components
AMQP | Git | RabbitMQ
ATOM | HTTP / HTTP4 | Rest
AWS | JCR | RSS
Bean | JDBC | Solr
Box | JMS | Apache Spark
Cache | Jsch | SQL
CouchDB | Log | Timer
Elasticsearch | MongoDB | XSLT
File | Netty / Netty4 | Quartz
http://camel.apache.org/components.html
Back to AEM and
indexing AEM content…
A Better AEM + Search Architecture
Enrichment Use Cases for AEM
• Search Relevancy
• Merge ratings and review signals
• Merge analytics signals (visits, page views…)
• Merge social signals (likes, shares, …)
• Cleanse data for search
• Rich content processing (Tika)
• Natural Language Processing (OpenNLP)
• Filter / drop documents
• Classify content
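Two of these use cases, merging a ratings signal and filtering out documents, can be sketched as small functions over a key/value document. This is an illustration only: the `id`, `title` and `rating` field names, the in-memory ratings lookup and the rule "drop documents without a title" are assumptions for the example, not part of the talk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class EnrichmentSketch {

    /** Content enricher: merges a rating, looked up by document id, into a copy of the document. */
    static Map<String, Object> mergeRating(Map<String, Object> doc, Map<String, Double> ratingsById) {
        Map<String, Object> enriched = new HashMap<>(doc);
        Double rating = ratingsById.get(String.valueOf(doc.get("id")));
        if (rating != null) {
            enriched.put("rating", rating);
        }
        return enriched;
    }

    /** Filter: drops documents that have no usable title by returning an empty Optional. */
    static Optional<Map<String, Object>> dropUntitled(Map<String, Object> doc) {
        Object title = doc.get("title");
        boolean hasTitle = title instanceof String && !((String) title).trim().isEmpty();
        return hasTitle ? Optional.of(doc) : Optional.empty();
    }

    /** Filters first, then enriches only the documents that were kept. */
    static Optional<Map<String, Object>> enrich(Map<String, Object> doc, Map<String, Double> ratingsById) {
        return dropUntitled(doc).map(kept -> mergeRating(kept, ratingsById));
    }

    public static void main(String[] args) {
        Map<String, Double> ratings = new HashMap<>();
        ratings.put("/content/site/en/products", 4.5);

        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "/content/site/en/products");
        doc.put("title", "Products");

        System.out.println(enrich(doc, ratings));
    }
}
```

Filtering before enriching avoids paying for lookups on documents that will never be indexed.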
AEM: Data Model (1/3)
• Use a serializable object to represent your document
• In fact, use a HashMap
• No dependency object graph
• Most search platforms already think of documents as a
series of key/value pairs
• Use key name prefixes to model:
• Index operation type (aem.op)
• Document Fields (aem.field.<field>)
• Metadata (aem.meta.<field>)
AEM: Data Model (1/3)
import java.util.HashMap;

// `page` is the AEM Page (com.day.cq.wcm.api.Page) being indexed.
// The variable is typed as HashMap, not Map, because the document must be serializable.
HashMap<String, Object> jmsDoc = new HashMap<>();
// Operation type
jmsDoc.put("aem.op.type", "ADD_DOC");
// Document fields
jmsDoc.put("aem.field.id", page.getPath());
jmsDoc.put("aem.field.crxPath", page.getPath());
jmsDoc.put("aem.field.url", page.getPath() + ".html");
jmsDoc.put("aem.field.title", page.getTitle());
jmsDoc.put("aem.field.description", page.getDescription());
// Metadata
jmsDoc.put("aem.meta.foo", "bar");
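Why a HashMap works well as the wire format can be checked with a serialization round trip. The sketch below assumes plain Java serialization, as would be used if the document were sent as a JMS `ObjectMessage`; the slides do not say which JMS message type the demo uses, and the page path and values here are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

final class SerializableDocSketch {

    /** Builds a document with the key prefixes from the slide, using hypothetical values. */
    static HashMap<String, Object> sampleDoc() {
        HashMap<String, Object> jmsDoc = new HashMap<>();
        jmsDoc.put("aem.op.type", "ADD_DOC");
        jmsDoc.put("aem.field.id", "/content/site/en/products");
        jmsDoc.put("aem.field.title", "Products");
        jmsDoc.put("aem.meta.foo", "bar");
        return jmsDoc;
    }

    /** Serializes the document the way it would have to be to leave the AEM JVM. */
    static byte[] serialize(HashMap<String, Object> doc) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(doc);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Restores the document on the consumer side; no AEM classes are needed on the classpath. */
    @SuppressWarnings("unchecked")
    static HashMap<String, Object> deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (HashMap<String, Object>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, Object> restored = deserialize(serialize(sampleDoc()));
        System.out.println(restored.equals(sampleDoc()));
    }
}
```

Because the map holds only JDK types, the consumer can deserialize it without any shared domain classes, which is the "no dependency object graph" point from the previous slide.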
AEM: Listener / JMS Producer (2/3)
• Create an AEM Listener
• Implement EventHandler interface
• Listen for the PageEvent topics
• Convert the Page resource to our data model
• Add operation type
• Add document fields
• Add metadata fields
• Send the message to JMS index topic
• Example: JmsIndexListener.java
AEM: JMS Camel Consumer (3/3)
• Define your Camel runtime (e.g., standalone, OSGi, etc.)
• Define your Camel routes
• Consume JMS topic
• Route operation type using content-based router
• Enrich document as needed
• Convert JMS document model to Solr model
• Submit index request
• Example: AemToSolr.java
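The routing and conversion steps can be sketched without Camel to show the logic involved. This is not the demo's `AemToSolr.java`: the in-memory map stands in for Solr, and the `DELETE_DOC` operation name is hypothetical (the slides only show `ADD_DOC`). In a Camel route the `switch` below would typically be expressed with the content-based router (`choice()` / `when()`).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class OperationRouterSketch {
    static final String OP_KEY = "aem.op.type";
    static final String FIELD_PREFIX = "aem.field.";

    /** Stand-in for the search index, keyed by document id. */
    private final Map<String, Map<String, Object>> index = new HashMap<>();

    /** Converts the JMS document model to index fields by keeping only aem.field.* keys. */
    static Map<String, Object> toSolrFields(Map<String, Object> jmsDoc) {
        Map<String, Object> fields = new TreeMap<>();
        for (Map.Entry<String, Object> entry : jmsDoc.entrySet()) {
            if (entry.getKey().startsWith(FIELD_PREFIX)) {
                fields.put(entry.getKey().substring(FIELD_PREFIX.length()), entry.getValue());
            }
        }
        return fields;
    }

    /** Content-based router: picks the index operation from the aem.op.type key. */
    void route(Map<String, Object> jmsDoc) {
        Map<String, Object> fields = toSolrFields(jmsDoc);
        String id = String.valueOf(fields.get("id"));
        String op = String.valueOf(jmsDoc.get(OP_KEY));
        switch (op) {
            case "ADD_DOC":
                index.put(id, fields);
                break;
            case "DELETE_DOC": // hypothetical operation name, not shown in the slides
                index.remove(id);
                break;
            default:
                throw new IllegalArgumentException("Unsupported operation: " + op);
        }
    }

    Map<String, Map<String, Object>> index() {
        return index;
    }

    public static void main(String[] args) {
        OperationRouterSketch router = new OperationRouterSketch();
        Map<String, Object> add = new HashMap<>();
        add.put(OP_KEY, "ADD_DOC");
        add.put("aem.field.id", "/content/site/en/products");
        add.put("aem.field.title", "Products");
        add.put("aem.meta.foo", "bar");
        router.route(add);
        System.out.println(router.index());
    }
}
```

Metadata keys (`aem.meta.*`) stay out of the index fields here; they remain available on the original message for stages that need them.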
Demo
Demo Prerequisites
• Java 8 / Maven 3.2.x
• AEM 6.1
• http://www.aemsolrsearch.com
• https://github.com/GastonGonzalez/aem-solr-search-product-sample
• Best Buy API Key
• Vagrant and VirtualBox
Camel Runtime Options
Java main: CamelContext
Java main: Wrapper
OSGi Runtime
Resources
• My Blog - http://www.gastongonzalez.com/
• AEM Solr Search - http://www.aemsolrsearch.com
• Apache Camel
• http://camel.apache.org/index.html
• https://www.manning.com/books/camel-in-action-second-edition
• Contact Us: aemsolr@headwire.com
In summary…
• If you do not need enrichment, keep it simple and
use a direct indexing approach.
• If you have a need to enrich your AEM content,
consider using Camel as your document processing
platform.
• This architecture is NOT search-specific!
• Syndicate AEM content to other systems
• Workflow replacement
THANK YOU.

Editor's Notes

  1. This is how AEM Solr Search 2.0.0 behaves.
  2. This is how AEM Solr Search 2.0.0 behaves.
  3. Can be thought of as an ETL. Terms & Definitions Processing Stage – Typically reusable. DoT – Do One Thing.
  4. Can be thought of as an ETL. Terms & Definitions Processing Stage – Typically reusable. DoT – Do One Thing.
  5. Many defunct, search-specific projects: OpenPipe, Pypes, OpenPipeline Other interesting search-specific pipelines include: Hydra by Findwise
  6. Mediation EIPs Aggregator - Content Enricher - Content-based router - Message - Message Translator - Pipes and Filters - Polling Consumer - Splitter -
  7. Declarative Spring-based, route definition also available
  8. Declarative Spring-based, route definition also available
  9. Declarative Spring-based, route definition also available
  10. Take a minute and visually think about how much code would be needed to achieve this goal? Is most of it boilerplate (e.g., setting up HTTP client, dealing with file input/output, marshaling/unmarshaling JSON, etc.)?
  11. TODO: Add transform
  12. 3 routes defined, all of which are asynchronous Demo code available
  13. Declarative Spring-based, route definition also available