SPHINX AND THINKING SPHINX
10 Minute Intro
HAYES DAVIS
Founder, Appozite
cheaptweet.com | @cheaptweet
@hayesdavis
SPHINX
• Open Source full-text search engine
• Designed around SQL
• Standalone daemon (searchd)


THINKING SPHINX
• Rails plugin
• Integrates Active Record with Sphinx
• Makes talking to Sphinx basically painless
BASIC IDEA
• Configure your indexes
• Index
• Query
• Repeat
CONFIGURING INDEXES

• Add indexes on your AR classes using define_index
• Fields (indexes) contain text you can search
• Attributes (has) allow you to sort and constrain your searches
• Careful! Column names aren’t symbols

    class Article < ActiveRecord::Base
      define_index do
        # fields
        indexes subject, :sortable => true
        indexes content
        indexes author.name, :as => :author, :sortable => true

        # attributes
        has author_id, created_at, updated_at
      end
    end
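The "column names aren't symbols" point is easier to remember once you see how a block DSL can accept bare names like subject or author.name. The following is a toy sketch only, not Thinking Sphinx's actual implementation: ColumnRef, IndexBuilder and sketch_define_index are hypothetical names invented here, and the approach (method_missing plus instance_eval) is an assumption about how such a DSL could work.

```ruby
# Toy stand-in for an index DSL. Hypothetical names; NOT the plugin's real code.
class ColumnRef
  attr_reader :path

  def initialize(path)
    @path = path
  end

  # A chained call such as author.name extends the column path.
  def method_missing(name, *args)
    return super if !args.empty? || name.to_s.start_with?("to_")
    ColumnRef.new(@path + [name])
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end
end

class IndexBuilder
  attr_reader :fields, :attributes

  def initialize
    @fields = []
    @attributes = []
  end

  # Records searchable text fields, with an optional trailing options hash.
  def indexes(*args)
    options = args.last.is_a?(Hash) ? args.pop : {}
    args.each { |col| @fields << { :column => col.path, :options => options } }
  end

  # Records attributes used for sorting and filtering.
  def has(*args)
    args.each { |col| @attributes << col.path }
  end

  # Any unknown bare name inside the block becomes a column reference.
  def method_missing(name, *args)
    return super if !args.empty? || name.to_s.start_with?("to_")
    ColumnRef.new([name])
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end
end

# Evaluates the block with the builder as self, so bare names resolve on it.
def sketch_define_index(&block)
  builder = IndexBuilder.new
  builder.instance_eval(&block)
  builder
end

index = sketch_define_index do
  indexes subject, :sortable => true
  indexes content
  indexes author.name, :as => :author, :sortable => true
  has author_id, created_at, updated_at
end

index.fields.each { |f| puts "field: #{f[:column].join('.')}" }
index.attributes.each { |a| puts "attribute: #{a.join('.')}" }
```

In this sketch a bare name is a method call on the builder, which is why author.name can walk an association; a symbol like :author has no such chaining.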
Run the indexer
rake thinking_sphinx:index
MORE ABOUT INDEXING
Thinking Sphinx generates a config file for Sphinx in which the indexes and their “sources” are defined. It’s a little complicated.
source twitterer_core_0
{
  type = mysql
  sql_host = 127.0.0.1
  sql_user = cheaptweet
  sql_pass = cheaptweet
  sql_db = cheaptweet_development2
  sql_query_pre = UPDATE `twitterer` SET `delta` = 0
  sql_query_pre = SET NAMES utf8
  sql_query = SELECT `twitterer`.`id` * 1 + 0 AS `id` , \
    CAST(`twitterer`.`screen_name` AS CHAR) AS `screen_name`, \
    CAST(`twitterer`.`name` AS CHAR) AS `name`, \
    CAST(`twitterer`.`description` AS CHAR) AS `description`, \
    CAST(`twitterer`.`url` AS CHAR) AS `url`, \
    CAST(`twitterer`.`location` AS CHAR) AS `location`, \
    `twitterer`.`id` AS `sphinx_internal_id`, \
    283224142 AS `class_crc`, \
    '283224142' AS `subclass_crcs`, \
    0 AS `sphinx_deleted` \
    FROM twitterer \
    WHERE `twitterer`.`id` >= $start AND `twitterer`.`id` <= $end AND `twitterer`.`delta` = 0 \
    GROUP BY `twitterer`.`id` ORDER BY NULL
  sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `twitterer` WHERE `twitterer`.`delta` = 0
  sql_attr_uint = sphinx_internal_id
  sql_attr_uint = class_crc
  sql_attr_uint = sphinx_deleted
  sql_attr_multi = uint subclass_crcs from field
  sql_query_info = SELECT * FROM `twitterer` WHERE `id` = (($id - 0) / 1)
}

index twitterer_core
{
  source = twitterer_core_0
  path = /Users/hayesdavis/Appozite/workspace/CheapTweet/data/sphinx/development/twitterer_core
  morphology = stem_en
  charset_type = utf-8
}
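One piece of the generated config that looks odd is the arithmetic on the id: sql_query selects `id` * 1 + 0 and sql_query_info reverses it with (($id - 0) / 1). A plausible reading, and it is only an assumption here, is that the multiplier and offset keep Sphinx document ids unique when more than one model is indexed (with a single model they come out as 1 and 0). The helper names below are hypothetical and just mirror that arithmetic.

```ruby
# Hypothetical helpers mirroring the arithmetic in the generated SQL.
# Assumption: multiplier = number of indexed models, offset = this model's slot.

# Mirrors `id` * multiplier + offset AS `id` in sql_query.
def sphinx_document_id(record_id, multiplier, offset)
  record_id * multiplier + offset
end

# Mirrors (($id - offset) / multiplier) in sql_query_info.
def record_id_from(document_id, multiplier, offset)
  (document_id - offset) / multiplier
end

# Single indexed model, as in the config on the slide: ids pass through unchanged.
single = sphinx_document_id(42, 1, 0)

# Three indexed models (made-up example): the same row id 7 in two different
# models maps to two different document ids.
first_model  = sphinx_document_id(7, 3, 0)
second_model = sphinx_document_id(7, 3, 1)

puts single
puts first_model
puts second_model
puts record_id_from(second_model, 3, 1)
```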
Start Sphinx
rake thinking_sphinx:start
SEARCHING
Use the search method on AR classes
# Searches all fields for "pants"
Article.search "pants"

# Conditions are allowed on fields, but must be given as a hash
Article.search "pants", :conditions => {
  :subject => "How To Wear"
}

# Query attributes using :with
Article.search "pants", :with => {
  :author_id => 1, :created_at => 1.week.ago..Time.now
}
BUT WAIT
HOW DO I KEEP INDEXES (ESPECIALLY BIG ONES) UP TO DATE?
DELTA INDEXES TO THE RESCUE
• Mini index of only rows that have been updated
• Must merge into “core” index periodically or it’ll get slow
• Simplest approach: add delta boolean column to model
• Add set_property :delta => true to define_index block
• Delta index is rebuilt on model saves, which can cause a performance hit
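To make the core/delta split concrete, here is a small in-memory simulation. It is a sketch of the idea only: Row and ToySearch are hypothetical names, nothing here talks to Sphinx, and the behaviour is modelled on the bullets above plus the generated config (which resets delta to 0 before building the core index).

```ruby
# In-memory toy model of core + delta indexing. Hypothetical; no Sphinx involved.
Row = Struct.new(:id, :text, :delta)

class ToySearch
  attr_reader :core, :delta

  def initialize(rows)
    @rows = rows
    @core = {}   # id => text, rebuilt only by full_reindex
    @delta = {}  # id => text, rebuilt on every save
  end

  # Full reindex: clear every delta flag, rebuild core from all rows,
  # and start again with an empty delta index.
  def full_reindex
    @rows.each { |row| row.delta = false }
    @core = @rows.to_h { |row| [row.id, row.text] }
    @delta = {}
  end

  # Saving flags the row, then rebuilds the delta index from all flagged rows.
  # The more rows are flagged, the more work every save does.
  def save(id, text)
    row = @rows.find { |r| r.id == id }
    if row
      row.text = text
    else
      row = Row.new(id, text, false)
      @rows << row
    end
    row.delta = true
    @delta = @rows.select(&:delta).to_h { |r| [r.id, r.text] }
  end

  # Search both indexes; a delta entry supersedes the stale core entry.
  def search(term)
    @core.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end
end

rows = [Row.new(1, "how to wear pants", false), Row.new(2, "hats", false)]
toy = ToySearch.new(rows)
toy.full_reindex

toy.save(2, "hats and pants")   # updated row lands in the delta index
toy.save(3, "more pants")       # new row lands in the delta index

hits = toy.search("pants")
delta_size_before = toy.delta.size

toy.full_reindex                # periodic full reindex empties the delta
delta_size_after = toy.delta.size

puts hits.inspect
puts delta_size_before
puts delta_size_after
```

The delta index only stays "mini" if the full reindex runs regularly; otherwise every save rebuilds an ever larger set of flagged rows.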
DEPLOYMENT & PRODUCTION
• Must schedule full re-indexing periodically
• Have god or monit keep an eye on things
• Consider adding some cap tasks to help out with reindexing and restarting
TIPS, TRICKS, GOTCHAS

• Simplest delta indexing can lead to performance issues
• Indexer assumes you have sequential ids on your DB rows and iterates through them in chunks - very bad if you have big gaps
• Run full indexing as often as you can without hurting performance - it’s usually pretty fast
• You can hand-edit config files if you need to tune - but be careful not to regenerate them afterwards
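The "big gaps" gotcha can be put in numbers with a rough sketch. The generated sql_query is run once per id range ($start to $end) between the minimum and maximum id, so sparse ids mean many queries that return nothing. The helper names are hypothetical, and the step of 1024 is an assumption for illustration; check sql_range_step in your own config.

```ruby
# Rough sketch of ranged indexing. Hypothetical helpers; step size is assumed.

# Builds the [$start, $end] pairs the indexer would walk through.
def id_ranges(min_id, max_id, step)
  (min_id..max_id).step(step).map do |start|
    [start, [start + step - 1, max_id].min]
  end
end

# Counts ranges that contain no row at all, i.e. wasted queries.
def empty_range_count(ids, step)
  min_id, max_id = ids.minmax
  id_ranges(min_id, max_id, step).count do |start, stop|
    ids.none? { |id| id.between?(start, stop) }
  end
end

STEP = 1024 # assumed chunk size

dense_ids  = (1..4096).to_a               # sequential ids, no gaps
sparse_ids = [1, 2, 3, 10_000_000]        # four rows, one huge gap

dense_total  = id_ranges(dense_ids.min, dense_ids.max, STEP).size
sparse_total = id_ranges(sparse_ids.min, sparse_ids.max, STEP).size

puts "dense:  #{dense_total} ranges, #{empty_range_count(dense_ids, STEP)} empty"
puts "sparse: #{sparse_total} ranges, #{empty_range_count(sparse_ids, STEP)} empty"
```

Four sparse rows cost far more queries than four thousand sequential ones, which is the situation the bullet warns about.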
RESOURCES


Sphinx http://www.sphinxsearch.com/
Thinking Sphinx http://freelancing-god.github.com/ts/en/
Railscast http://railscasts.com/episodes/120-thinking-sphinx
