Database Indexing Framework (Version 1.0)
Incremental Indexing Process (the need for Database Views)

The following slides discuss an incremental indexing approach that we thought would work well for our requirements. In this approach the data to be indexed for search is exposed through Database Views, and the indexing is done as a Batch Process rather than in real time.

First we need to understand the need for the Database Views. When a term is searched for in the index, the results page shows some details and a summary of each result. For instant results these details need to be stored in the index itself, so we don't have to hit the database just to display collated results on the results page.

It therefore doesn't make much sense to index all the tables individually when creating the Solr index. Each table has its own dependencies on child and parent tables, so we would either have to recreate those dependencies in the index or build our indexes around the search needs. The latter involves creating appropriate joins across tables to fetch all the data relevant to a search result in one shot.

A database view can do this job of collating data from the parent and child tables into a representation that exactly matches the requirements of the search index. This keeps the application layer simple: it just picks everything from the view and indexes it as is.
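As a rough illustration of the idea above, the sketch below builds a view that collates a parent table and its child tables into one row per search result, then reads the view as is. The tables, columns and the view name `order_search_view` are hypothetical and are not taken from the slides; SQLite is used only so that the example runs on its own.

```python
import sqlite3

# In-memory database with a hypothetical parent/child schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    placed_on TEXT
);
CREATE TABLE order_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(id),
    product TEXT
);

INSERT INTO customers VALUES (1, 'Asha');
INSERT INTO orders VALUES (10, 1, '2010-03-01');
INSERT INTO order_items VALUES (100, 10, 'Keyboard');
INSERT INTO order_items VALUES (101, 10, 'Mouse');

-- One row per order, carrying everything the results page needs.
CREATE VIEW order_search_view AS
SELECT o.id        AS order_id,
       c.name      AS customer_name,
       o.placed_on AS placed_on,
       group_concat(i.product, ', ') AS products
FROM orders o
JOIN customers c ON c.id = o.customer_id
LEFT JOIN order_items i ON i.order_id = o.id
GROUP BY o.id, c.name, o.placed_on;
""")


def fetch_view_rows(connection, view_name):
    """Return every row of a view as a dict keyed by column name.

    view_name is assumed to come from trusted configuration, never
    from user input, because it is interpolated into the SQL text.
    """
    cursor = connection.execute(f"SELECT * FROM {view_name}")
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


rows = fetch_view_rows(conn, "order_search_view")
```

Each dictionary in `rows` is already shaped like a search document, so the application layer needs no knowledge of the underlying joins.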
Incremental Indexing Process (the need for Batch indexing)

Next we need to understand why the Batch Indexing process can work well for us. Most of our search requirements involve searching historic data. Only rarely would we search for data that has just been entered, and even those cases can be handled by setting the Batch Process interval to a very small value.

Real-time indexing can become expensive when a large amount of data is entered at short intervals. The batch process also gives us the flexibility of working on a copy of the database, so that the whole indexing process can run offline.
Incremental Indexing Batch Process (the flow)

[Diagram] Components: Indexing Job Scheduler, Database Indexer (the controller class), Data Fetcher, Result Set to XML Converter, Index Manager, Database, SOLR.

Data passed between the components, as numbered on the diagram:
(1) Indexing Job Name
(2) Database View Name
(3) Query
(4) Result Set
(5) Result Set
(6) Solr XML
(7) Solr XML
(8) Solr XML
(9) Solr XML

Components in green are explained in detail in the next slide.

Indexing Job - Trigger Config file (Indexing Job Schedules)
Trigger Time 1 - Indexing Job 1
Trigger Time 2 - Indexing Job 2
Trigger Time 3 - Indexing Job 3

Indexing Job – Database View Mapping file
More than one DB view might need to be indexed at the same time, so these can be grouped as an Indexing Job.
Indexing Job 1 – Database View1, Database View2, Database View3, Database View4
Indexing Job 2 – Database View5, Database View6

DB View Column name to Solr field mapping
- Database View 1
  Column 1 - Solr Field 1
  Column 2 - Solr Field 2
  Column 3 - Solr Field 3
- Database View 2
  Column 1 - Solr Field 3
  Column 2 - Solr Field 2
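The three configuration files listed on this slide could be represented in memory roughly as follows. This is only a sketch: the slides do not specify a file format, so plain Python dictionaries stand in for the parsed files, and the start times, intervals, key names and helper functions are placeholders invented for illustration.

```python
# Indexing Job - Trigger Config: which trigger drives which job.
# The start times and intervals are made-up placeholder values.
JOB_TRIGGERS = {
    "indexing_job_1": {"start_time": "00:00", "interval_minutes": 60},
    "indexing_job_2": {"start_time": "00:00", "interval_minutes": 5},
}

# Indexing Job - Database View Mapping: views indexed together.
JOB_VIEWS = {
    "indexing_job_1": [
        "database_view_1",
        "database_view_2",
        "database_view_3",
        "database_view_4",
    ],
    "indexing_job_2": ["database_view_5", "database_view_6"],
}

# DB View Column name to Solr field mapping, one entry per view.
VIEW_FIELD_MAP = {
    "database_view_1": {
        "column_1": "solr_field_1",
        "column_2": "solr_field_2",
        "column_3": "solr_field_3",
    },
    "database_view_2": {
        "column_1": "solr_field_3",
        "column_2": "solr_field_2",
    },
}


def views_for_job(job_name):
    """Return the database views that belong to an indexing job."""
    return list(JOB_VIEWS[job_name])


def map_row_to_solr_fields(view_name, row):
    """Rename a row's columns to Solr field names, dropping unmapped columns."""
    mapping = VIEW_FIELD_MAP[view_name]
    return {mapping[col]: value for col, value in row.items() if col in mapping}
```

Keeping the job grouping separate from the column mapping means a view can move to a job with a different trigger without touching its field mapping.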
Incremental Indexing Batch Process (the components)

An Indexing Job is defined as the indexing of a set of Database Views that need to be indexed at the same time and at the same time interval.

Triggers hold the timing information: the start time, the time interval and other time-related details. When an Indexing Job is associated with a trigger, the job runs according to the start time and time interval specified in that trigger.

The Indexing Job - Trigger Config file holds all Indexing Job Schedules. It maps triggers to indexing jobs.

The Indexing Job – Database View Mapping file defines the Indexing Jobs. It associates Database Views with each Indexing Job. If a database view, such as the one for the messages module, needs to be picked up at a shorter time interval than the one for the shopping module, the two views will belong to different indexing jobs with different Triggers.

The Database Indexer acts as the controller of the database indexing process. It calls the Data Fetcher to get database records in XML format, then sends that XML to the Index Manager, which posts it to Solr.

The Data Fetcher communicates with the database to get all the new and updated records for a given database view, along with those records that have been marked for deletion. It then feeds this data to the Result Set to XML Converter to have it converted to the XML format that Solr recognizes.

The Result Set to XML Converter is a utility class that converts database records to XML. A record that is new or updated is put in an <add> tag; a record that is marked for deletion is put in a <delete> tag. The converter looks up the Solr field name for each DB view column in the DB View Column name to Solr field mapping file.
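A minimal sketch of what the Result Set to XML Converter might look like is given below. It emits Solr's XML update messages (`<add>` with `<doc>`/`<field>` children, and `<delete>` with `<id>` children). The function name, the `is_deleted` flag column and the `id` key column are assumptions made for illustration; the slides do not say how deletions are marked.

```python
import xml.etree.ElementTree as ET


def rows_to_solr_xml(rows, field_map, id_column="id", deleted_column="is_deleted"):
    """Convert database rows into Solr XML update messages.

    rows           -- list of dicts keyed by view column name
    field_map      -- view column name -> Solr field name
    id_column      -- hypothetical column holding the unique key
    deleted_column -- hypothetical flag column marking deletions

    Returns a list with up to two XML strings: an <add> message and
    a <delete> message, in that order. Empty messages are omitted.
    """
    add = ET.Element("add")
    delete = ET.Element("delete")

    for row in rows:
        if row.get(deleted_column):
            # Deleted records only need their unique key.
            ET.SubElement(delete, "id").text = str(row[id_column])
            continue
        doc = ET.SubElement(add, "doc")
        for column, solr_field in field_map.items():
            value = row.get(column)
            if value is None:
                continue  # skip NULL columns instead of indexing "None"
            field = ET.SubElement(doc, "field", name=solr_field)
            field.text = str(value)

    messages = []
    if len(add):
        messages.append(ET.tostring(add, encoding="unicode"))
    if len(delete):
        messages.append(ET.tostring(delete, encoding="unicode"))
    return messages


example_rows = [
    {"id": 1, "title": "Hello", "is_deleted": 0},
    {"id": 2, "title": "Old", "is_deleted": 1},
]
messages = rows_to_solr_xml(example_rows, {"id": "id", "title": "title"})
```

The add and delete messages are returned separately so that each can be posted to Solr's update handler as its own request.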
Incremental Indexing Batch Process (the flow)

The indexing process is started by the Indexing Job Scheduler. The scheduler triggers an indexing job according to the settings of the trigger with which the job is associated in the Indexing Job - Trigger Config file.

The Indexing Job Scheduler calls the Database Indexer, passing the name of the job to be run as an argument.

The Database Indexer acts as the controller for the whole process. From the Indexing Job – Database View Mapping file it picks up the names of the Database Views to be indexed for the Indexing Job sent by the scheduler.

The Database Indexer loops over the set of Database Views and calls the Data Fetcher for each view.

The Data Fetcher queries the database for all the latest records from the view. The result set is sent to the Result Set to XML Converter, which returns the Solr XML.

This Solr XML is passed back to the Database Indexer, which in turn sends it to the Index Manager for posting to Solr.
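The controller loop described above can be sketched as follows. The class and parameter names are hypothetical, the collaborators are passed in as plain callables so that the example stays self-contained, and the per-view "last run" timestamp is an assumption: the slides do not state how the Data Fetcher identifies the latest records.

```python
from datetime import datetime, timezone


class DatabaseIndexer:
    """Controller: runs one indexing job across its database views."""

    def __init__(self, job_views, fetch_changes, to_solr_xml, post_to_solr):
        self.job_views = job_views          # job name -> list of view names
        self.fetch_changes = fetch_changes  # (view_name, since) -> rows
        self.to_solr_xml = to_solr_xml      # (view_name, rows) -> XML strings
        self.post_to_solr = post_to_solr    # (xml_string) -> None
        self.last_run = {}                  # view name -> time of previous run

    def run_job(self, job_name, now=None):
        """Index every view of the job, then remember when it ran."""
        now = now or datetime.now(timezone.utc)
        for view_name in self.job_views[job_name]:
            since = self.last_run.get(view_name)  # None on the first run
            rows = self.fetch_changes(view_name, since)
            for message in self.to_solr_xml(view_name, rows):
                self.post_to_solr(message)
            self.last_run[view_name] = now


# Stand-ins for the Data Fetcher, the converter and the Index Manager.
calls = []
posted = []


def fake_fetch(view_name, since):
    calls.append((view_name, since))
    return [{"id": 1}]


def fake_convert(view_name, rows):
    return [f"<add><!-- {len(rows)} doc(s) from {view_name} --></add>"]


indexer = DatabaseIndexer(
    {"indexing_job_2": ["database_view_5", "database_view_6"]},
    fake_fetch,
    fake_convert,
    posted.append,
)
indexer.run_job("indexing_job_2")
```

A scheduler would call `run_job` with the job name at each trigger time. On later runs the stored timestamp is handed to the fetcher so that it can restrict its query to records changed since the previous run.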
[Diagram: Incremental Indexing Batch Process. The Indexing Job Scheduler triggers the Database Indexer with an Indexing Job based on the trigger times in the Job - Trigger Config file (Indexing Job Schedules). Numbered flows: (1) Indexing Job Name, (2) Database View Name, (3) View Query, (4) Result Set, (5) Result Set, (6) Solr XML, (7) Solr XML, (8) Solr XML, (9) Solr XML. Also shown: Indexing Job to Database Views mapping file, DB View Column name to Solr field mapping, Database, Index Manager, SOLR.]
