www.globalbigdataconference.com
Twitter : @bigdataconf
AI in multi billion search
engines. Building AI and
Search teams
My goal for this talk is to explain
1. How AI/ML in search engines improves customer experience, revenue/GMV, and
operational costs, and helps to get more customers and serve them better
2. How to build AI systems for search
3. How to build global AI teams and make them successful
My message
Successful AI in Search is an AI infrastructure, an engineering and science
culture, and toolsets that make it possible to continuously introduce, measure,
and improve AI features in every part of the search engine, rather than a few
SOTA models in ranking or query understanding
The same applies to every large scale consumer or business facing platform
(recommendation engines, call center analytics etc)
Developing this type of AI solution places certain requirements on teams
that want to succeed in building large scale consumer facing AI software
systems
Multi billion? (customers, dollars, documents)
We focus on search engines with many billions of dollars in revenue/GMV, billions
(or hundreds of millions) of users, and billions of documents. At this scale,
investment in AI infrastructure is justified because it improves revenue by
hundreds of millions or billions of dollars or saves infrastructure costs at large
scale, which in turn justifies developing many AI applications for search
Typical Search Engine - High Level View
Search Engine - high level view
Many other *key* parts are not in the picture
● Experimentation and other framework to support ‘Search Science’ design,
analysis, debugging, deployment, of various query understanding, ranking etc
models
● Evaluation - continuous monitoring of the quality of results and analysis of user
behaviour
● Logging, monitoring, alerting (to serve all others, feed consumer behavior
systems, and react to operational problems)
etc
AI is everywhere
Using AI components to improve data acquisition by 10%, indexing by 10%, query
understanding by 10%, ranking by 10%, and the result page by 10% gives more gains
in customer satisfaction and revenue/GMV than applying SOTA and improving only
one component, such as query understanding or ranking, by 30%.
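A quick worked check of this arithmetic, as a sketch. It assumes the per-stage gains multiply along the pipeline, which the slide does not state; with purely additive gains the comparison would be 50% vs 30%. The function name is illustrative.

```python
# Assumption: stage-level gains compound multiplicatively along the search stack.
STAGES = ["data acquisition", "indexing", "query understanding", "ranking", "result page"]

def compound_gain(per_stage_gain: float, n_stages: int) -> float:
    """Overall multiplier when each of n_stages improves by per_stage_gain."""
    return (1.0 + per_stage_gain) ** n_stages

broad = compound_gain(0.10, len(STAGES))  # 10% in each of the five stages
narrow = compound_gain(0.30, 1)           # 30% in a single stage
print(f"broad: {broad:.3f}x, narrow: {narrow:.3f}x")  # broad: 1.611x, narrow: 1.300x
```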
The AI development should be driven by creating infrastructure, culture and
toolsets for continuous AI deployment, improvement, measurement everywhere in
the search stack. There are no ‘engineering’ teams, every team is an AI team
AI is not separable from engineering
AI development is not separable from other engineering developments.
Improving index selection by 10% lets you either accommodate other data
sources to improve coverage (by getting more documents) or improve ranking
(with fewer documents to rank, there is more computation available for more
advanced functions)
Improving infrastructure to decrease latency by 10% lets you deploy more
sophisticated ranking or query understanding functions (10% more time)
Good engineering quality and culture is not separable from AI development but a
mandatory part of AI culture
AI ‘rank labs’ and AI platforms
Search engine teams benefit from a unified environment to train, deploy, and
serve models: it reduces work on infrastructure and MLOps, lets teams share
metrics, and makes it easy to measure end to end metrics
There is no single environment which will handle all types of AI development, AI
serving, AI measurement.
Search systems are naturally complex, with different tasks, different environments,
different languages, and different CI/CD systems, but the never ending work on
unifying AI development infrastructure across search teams helps
Multiple ways to ‘deploy’ AI models
Deploy models to TF Serving, TorchServe etc
Deploy models served in container
Compile a model directly into machine code or into the source code of a search
component (GBDT models into C++/Java code to be used in ranking)
Relearn and change parameters of existing models served
Tons of other deployment scenarios
Etc etc etc
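To illustrate the "compile a model into source code" scenario, here is a minimal sketch. It emits Python rather than C++/Java to stay self-contained, handles a single tree rather than a full GBDT ensemble, and the tree format (`feature`, `threshold`, `left`, `right`, `value`) is a hypothetical one invented for this example.

```python
def tree_to_source(node, indent="    "):
    """Recursively emit source lines for one decision tree (hypothetical dict format)."""
    if "value" in node:  # leaf node: return the stored score
        return [f"{indent}return {node['value']!r}"]
    lines = [f"{indent}if features[{node['feature']!r}] <= {node['threshold']!r}:"]
    lines += tree_to_source(node["left"], indent + "    ")
    lines.append(f"{indent}else:")
    lines += tree_to_source(node["right"], indent + "    ")
    return lines

def compile_tree(node, name="score"):
    """Generate a scoring function as source text and load it with exec."""
    src = "\n".join([f"def {name}(features):"] + tree_to_source(node))
    namespace = {}
    exec(src, namespace)  # a real system would write the source out and compile it
    return namespace[name], src

toy_tree = {
    "feature": "bm25", "threshold": 2.5,
    "left": {"value": 0.1},
    "right": {"feature": "clicks", "threshold": 10,
              "left": {"value": 0.4}, "right": {"value": 0.9}},
}
score, source = compile_tree(toy_tree)
print(source)
print(score({"bm25": 3.0, "clicks": 25}))
```

The generated function is plain branching code with no model library at serving time, which is the point of this deployment style.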
Multiple ways to serve AI models in Search
Streaming (ex: document updates)
Batch (ex: offline processing of queries or users or documents)
Serving runtime services (ranking, query understanding)
Multiple ways to improve AI in Search
Change evaluation methods and metrics, train models to new metrics
Change sampling procedures
Change training procedures
Change modeling techniques
Model previously unmodeled tasks
New data and new features
Infrastructure changes in serving etc etc etc
AI platforms
So, it’s almost impossible to make one platform that handles all types of AI
development and deployment (see the variety in previous slides)
But unifying some of the tasks reduces development and operation effort and
costs, and increases the velocity of AI development, bringing a lot of money and
customer satisfaction
Every AI driven Search company created a “Rank Lab” AI platform for Search
Now there are open source tools, such as Kubeflow and MLflow, that simplify
development
AI services in prod
Good if they are decoupled, so multiple small teams can work on services
independently
But wiring is needed (a ‘signal’ from query understanding to be used in ranking
etc)
Processing of a query may call ~dozens of AI services, processing of a document
in data acquisition and index may call 100s of AI services. Performance
considerations are extremely important
AI infrastructure benefits greatly from common software practices, protocols,
orchestration to organize this ‘AI chaos’ and make order out of it
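A minimal sketch of one way to wire many decoupled AI services under a latency budget: call them concurrently and drop any signal that misses the deadline. The service names, delays, and the `call_service` stub are hypothetical stand-ins for real network calls.

```python
import asyncio

async def call_service(name: str, delay_s: float) -> dict:
    """Stand-in for a network call to one AI service (hypothetical)."""
    await asyncio.sleep(delay_s)
    return {"service": name, "signal": f"{name}-ok"}

async def fan_out(services: dict, timeout_s: float) -> dict:
    """Call all services concurrently; skip any that miss the deadline."""
    async def guarded(name, delay_s):
        try:
            return await asyncio.wait_for(call_service(name, delay_s), timeout_s)
        except asyncio.TimeoutError:
            return None  # degrade gracefully: ranking proceeds without this signal

    results = await asyncio.gather(*(guarded(n, d) for n, d in services.items()))
    return {r["service"]: r["signal"] for r in results if r is not None}

# Simulated per-service latencies in seconds (assumed values).
services = {"spell": 0.01, "classifier": 0.02, "slow_entity_linker": 0.5}
signals = asyncio.run(fan_out(services, timeout_s=0.1))
print(signals)
```

Total latency is bounded by the timeout rather than by the sum of service latencies, at the cost of occasionally ranking with fewer signals.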
AI in prod
Besides ML objective functions and metrics
a. Latency
b. Resiliency
c. Throughput
d. Resource utilization
are super important factors in the design of every AI service in the search engine
stack
AI services should be tested and benchmarked against them during development
Your model will serve billions of document updates or billions of queries. Every
1 ms of delay, 1 ms of downtime, etc will cost either bad user experience or
millions in ops
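A minimal latency benchmark sketch for a model function, reporting tail percentiles rather than only the mean. `toy_model` is a placeholder, not a real model; a production benchmark would also cover throughput, resource utilization, and behaviour under failure.

```python
import statistics
import time

def summarize(samples_ms):
    """Latency percentiles from a list of per-call timings in milliseconds."""
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98], "max": max(samples_ms)}

def benchmark(fn, inputs, warmup=10):
    """Time fn on every input after a short warmup and summarize the timings."""
    for x in inputs[:warmup]:
        fn(x)
    samples_ms = []
    for x in inputs:
        start = time.perf_counter()
        fn(x)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples_ms)

def toy_model(query):
    """Placeholder for a real model call (hypothetical)."""
    return sum(ord(c) for c in query) % 7

report = benchmark(toy_model, [f"query {i}" for i in range(1000)])
print(report)
```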
AI in Search
Several use cases to demonstrate that AI is
driving search in every part of search stack
Indexing - index selection
Perhaps one of the first AI applications in the search domain. It started in the 90s,
when web volume increased and index selection strategy became important
Which documents should be indexed? In which index layers should they be
placed? (many modern search engines are multi level: a smaller index for frequently
searched items, a very big and comprehensive index for rarely searched items)
AI for quality, popularity assessment of ‘documents’
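A minimal sketch of tier assignment in a two level index. It assumes each document already has a predicted quality/popularity score (in practice the output of a model); the tier names, scores, and capacity are hypothetical.

```python
def assign_tiers(doc_scores, hot_capacity):
    """Put the highest scoring documents in the small 'hot' tier, the rest in 'cold'."""
    ranked = sorted(doc_scores, key=doc_scores.get, reverse=True)
    hot = set(ranked[:hot_capacity])
    return {doc: ("hot" if doc in hot else "cold") for doc in doc_scores}

# Hypothetical predicted scores for four documents.
doc_scores = {"d1": 0.92, "d2": 0.15, "d3": 0.71, "d4": 0.05}
tiers = assign_tiers(doc_scores, hot_capacity=2)
print(tiers)
```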
Indexing - duplicate resolution
Which ‘documents’ are essentially the same (represent the same item)? Or have
highly duplicated content, so the second document does not carry more
information?
Which documents are the same wrt to a particular query? (we do not want to show
collocated Target and Target Pharmacy for local search query target but they are
different entities for query pharmacy)
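One classic baseline for the "highly duplicated content" case is Jaccard similarity over word shingles, sketched below. This is a swapped-in textbook technique, not necessarily what the author's systems use, and it does not cover the query dependent case. The example documents and the 0.7 threshold are assumptions.

```python
def shingles(text, k=3):
    """Set of k-word shingles of a text (the whole text if it is shorter than k words)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def is_near_duplicate(text1, text2, threshold=0.7):
    return jaccard(shingles(text1), shingles(text2)) >= threshold

doc1 = "Spacious two bedroom apartment with a view of the bay"
doc2 = "Spacious two bedroom apartment with a view of the park"
doc3 = "Mountain bike with 29 inch wheels and aluminium frame"
print(is_near_duplicate(doc1, doc2), is_near_duplicate(doc1, doc3))
```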
Indexing - attributes extraction
Given documents - full text descriptions of houses, ecommerce products,
businesses etc - extract significant attributes important for search to understand
items of interest
(size, wheel size, weight, number of pages, location, view)
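A minimal rule based baseline for attribute extraction, as a sketch. Production systems typically use learned extractors; the patterns and attribute names here are hypothetical and cover only a few of the attributes listed above.

```python
import re

# Hypothetical patterns; each captures one numeric attribute.
PATTERNS = {
    "wheel_size_in": re.compile(r"(\d+(?:\.\d+)?)\s*(?:inch|in)\s*wheels?", re.I),
    "weight_kg": re.compile(r"(\d+(?:\.\d+)?)\s*kg\b", re.I),
    "pages": re.compile(r"(\d+)\s*pages?\b", re.I),
}

def extract_attributes(text):
    """Return the first match of every known attribute found in the text."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = float(match.group(1))
    return found

print(extract_attributes("Mountain bike, 29 inch wheels, weight 13.5 kg"))
```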
Index - statistical tasks
Evaluation of quality and size of the index.
Does our index provide good coverage? What categories are missing? What are the
data quality problems?
Evaluation of Index size of external systems
Index - data quality
What attributes are important and must be mandatory and which attributes can be
optional in data acquisition? Which types of data, which categories to acquire?
AI processes continuously look into search logs to determine customer priorities
and what drives conversion, and use this information to drive data acquisition
Also detection of spam, fraud, adult content.
Demand generation beyond just data
The same type of AI evaluation procedures can compute and forecast future
demand for items, to drive purchase decisions if the search engine is used to sell
items
AI for query understanding
Query understanding is mapping of customer’s query into a machine
understandable format to retrieve a set of relevant items and rank them with
highest probability of customer engagement (view, purchase, etc)
Synonym expansion for better retrieval, removal of insignificant terms, correcting
spelling and other errors, term weighting, attribute and entity extraction, compound
and phrase extraction, classification (novelty, price range etc ) etc
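A toy sketch of two of these steps, removal of insignificant terms and synonym expansion. The stopword list and synonym table are hypothetical; real systems learn both from data and add spelling correction, term weighting, and entity extraction.

```python
# Hypothetical resources for illustration only.
STOPWORDS = {"a", "an", "the", "for", "in", "of"}
SYNONYMS = {"sofa": ["couch"], "tv": ["television"]}

def understand(query):
    """Map a raw query to kept terms and, per term, the alternatives to retrieve on."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    expanded = [[t] + SYNONYMS.get(t, []) for t in terms]
    return {"terms": terms, "expanded": expanded}

print(understand("a sofa for the living room"))
```

Each inner list in `expanded` can be read as an OR group, and the groups as an AND across terms, when building the retrieval query.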
Query understanding Classification
Mapping a query into a certain set of categories to be used in retrieval and ranking
-> most probable document category (italian -> restaurants in local search),
-> most probable distance (gas -> 5 miles distance, michelin restaurant -> 50
miles distance in local search)
-> novelty: printers -> released within 1 year, pillows -> release date does not
matter
Typically: 100s of classifiers per search engine with significant impact on quality /
revenue
Query understanding Similar Queries
Given queries q1 and q2, how similar are they (how good would the results for one
query be as results for the other query)?
Tons of applications in query understanding and ranking: given features for one
query, apply them to another query for ranking, extend retrieval set etc etc
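One simple way to estimate query similarity from behaviour, sketched under assumptions: represent each query by the click counts on documents and compare with cosine similarity. The click data is invented, and this is only one of many possible signals (embeddings, shared result sets, reformulation chains).

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse click-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical click logs: query -> clicks per document.
clicks = {
    "cheap flights": Counter({"d1": 8, "d2": 2}),
    "low cost flights": Counter({"d1": 6, "d3": 1}),
    "pillows": Counter({"d9": 5}),
}
print(cosine(clicks["cheap flights"], clicks["low cost flights"]))
print(cosine(clicks["cheap flights"], clicks["pillows"]))
```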
Query understanding: entity and attribute extraction
Given a query: map it into structured representation of entities and attributes to be
used for better retrieval and ranking
AI for ranking
Learning to rank / Machine Learning based ranking technologies to rank document
(LeToR/ MLR)
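A minimal pairwise learning to rank sketch: a linear scorer trained with a logistic loss on document pairs, in the spirit of RankNet. It is a toy illustration, not the method of any particular engine; the features, labels, and hyperparameters are assumptions.

```python
import math

def train_pairwise(queries, n_features, lr=0.1, epochs=50):
    """Fit linear weights so that higher labelled docs outscore lower ones per query."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for docs in queries:                      # docs: list of (features, label)
            for xi, yi in docs:
                for xj, yj in docs:
                    if yi <= yj:
                        continue                  # keep only pairs where i should outrank j
                    diff = [a - b for a, b in zip(xi, xj)]
                    margin = sum(wk * dk for wk, dk in zip(w, diff))
                    grad = 1.0 - 1.0 / (1.0 + math.exp(-margin))  # d/dm of log sigmoid(m)
                    w = [wk + lr * grad * dk for wk, dk in zip(w, diff)]
    return w

# One toy query with three documents: (features, relevance label).
toy_query = [([1.0, 0.0], 2), ([0.5, 0.5], 1), ([0.0, 1.0], 0)]
w = train_pairwise([toy_query], n_features=2)
scores = [sum(wk * xk for wk, xk in zip(w, x)) for x, _ in toy_query]
order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print(w, order)
```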
AI for unbiased ranking
User interactions based LeToR/Counterfactual
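A common counterfactual correction is inverse propensity scoring (IPS): each click is weighted by the inverse of the probability that its position was examined, so clicks at low positions count more. The sketch below assumes the examination propensities are already known (in practice they are estimated, for example from randomization experiments); the log format and values are hypothetical.

```python
def ips_weighted_clicks(log, propensity):
    """Sum 1 / P(position examined) over the clicks of each document."""
    totals = {}
    for doc, position, clicked in log:
        if clicked:
            totals[doc] = totals.get(doc, 0.0) + 1.0 / propensity[position]
    return totals

# Assumed examination propensities per result position.
propensity = {1: 1.0, 2: 0.5, 3: 0.25}
# Hypothetical log entries: (document, position shown, clicked?).
log = [("a", 1, True), ("a", 1, True), ("b", 3, True), ("b", 3, False), ("a", 1, False)]
print(ips_weighted_clicks(log, propensity))
```

Here document "b" gets one click at position 3 but a larger debiased weight than "a" with two clicks at position 1, which is the position bias correction at work.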
AI for search assistance
Typeahead prediction - language modeling, other contextual information, location
of user, previous searches of users,
Query dependent , user dependent navigational panels and guided search
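A minimal typeahead sketch: prefix lookup over past queries, ranked by frequency. It ignores the language model, location, and user history signals mentioned above; the query counts are invented.

```python
import bisect

class Typeahead:
    """Prefix completion over a sorted list of past queries."""

    def __init__(self, query_counts):
        self._counts = dict(query_counts)
        self._sorted = sorted(self._counts)

    def suggest(self, prefix, k=3):
        # All queries sharing a prefix are contiguous in sorted order.
        start = bisect.bisect_left(self._sorted, prefix)
        matches = []
        for query in self._sorted[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        # Most frequent first, ties broken alphabetically.
        return sorted(matches, key=lambda q: (-self._counts[q], q))[:k]

# Hypothetical query frequencies.
typeahead = Typeahead({"pillow": 50, "pillow cases": 30, "printer": 80,
                       "printer ink": 40, "pizza": 100})
print(typeahead.suggest("pi"))
```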
AI Whole page
Given an output of several search engines: how to combine them to construct the
best customer experience.
Ex: music, video, book, podcasts as in iTunes;
web search, maps, youtube, news, image, books, scholar etc in Google
AI SERP snippet
How to generate the best descriptions of items in the search result page so
customers understand relevance of items without clicking on them
How to select the best chunk of text representing the item, picture, formats, -
depending on the query and the user
AI price prediction
Predict the price of the item (for selling search engines)
Which will maximize item conversion and customer satisfaction and revenue of the
company
(economics problems, but tightly connected to search, depends on item position in
search, relevance, exposure, prices of other search results)
AI conversational search
Conversational interfaces for search, multi turn interactions with customers to
understand customer search intent and help her/him to express their intent or
even to find it by making latent intent explicit
NLP/NLU, dialog state management, deep reinforcement learning, text generation
ASR for voice based systems
Post search AI
Given a set of queries relevant to user (saved queries, previous sessions) and a
set of items relevant to users
Generate email and other notifications about new items, price changes, availability
changes - which will help users to find/buy/discover what they want
Building Search Teams
Running search engines which are the front face of a business, for example, real
estate (Zillow, Trulia) or eCommerce (Walmart, eBay), is both different from and
similar to running search engines such as Google Web. I’ll focus mostly on search
engines which are the front face of a business
Do you need a search team?
Some companies buy a Search SaaS service, some companies ask consultants to
build a search engine for them.
It might work when search is not the core of your business, and your customer
satisfaction and profits do not depend on it. Say, search of web forum threads on
your web site, or any other non-critical search
When a multi billion business and the satisfaction of hundreds of millions of users
depend on your search engine, and improving it may significantly improve revenue
and customer satisfaction numbers, the only way is to control and own the search
engine and have your organization own and develop it
Do you need a search team?
“We believe that we need to own and control the primary technologies behind the
products we make,” was added to Apple values when Tim Cook became the CEO
of Apple
Quite a good value, and proven to be a very effective one. It is just as effective
in any other business besides Apple
If the search engine is the primary technology behind your business/products, the
only way to operate it is to own and control it
Typical roles in the Search team
There are multiple critical roles in Search. I’ll describe some of them. There are
more roles. The exact composition of your search team depends on your
business: some teams are more AI ranking heavy where the search ranking is the
most business critical, some teams are more backend engineering heavy when
your business success depends on integration with many other systems
(availability, pricing)
Success factor: all roles must be ‘AI aware’ and know how AI works to make the
whole team successful
Skills - search infrastructure
When building a search engine, you’ll build it either on top of existing open source
such as Solr or Elasticsearch, or from scratch.
You’ll need experts in the technology you use, since you can improve
performance, operation costs, resilience, etc only if your team knows this
technology deeply
Performance is an important characteristic of a search engine, so you’ll need experts
in the technologies used (for example, Java backend engineers for Solr, which is
written in Java, or experts in building high load distributed systems in C++ if you
implement your engine in C++)
Skills - search infrastructure
Depending on the scale of your operations, a lot of search operations and
performance improvements may come from algorithmic improvements in particular,
for example, more efficient index structures or more efficient fuzzy match algorithms.
You’ll need experts in algorithms in those particular areas
Skills - Search Quality aka Search Science
Search quality consists of multiple subcomponents: natural language processing
for query understanding, machine learned ranking, and others (index
selection etc - depends on your business)
You’ll need good Machine Learning engineers and applied scientists, preferably with
a background in IR, Search, LTR, NLP (the distribution depends on your business). All
of them are well established, long standing areas, and there are people who are
experts already. Typically, a good generic ML engineer catches up quite fast, but you
need a set of people who are deep experts in search quality to be the core of your
team and to guide the others
Skills - Search Quality aka Search Science
What I have noticed: abstract Applied Scientists who know only how to train models
do not work well in search
You need more of an MLE type profile: good in science, but capable of building and
improving efficient systems. A lot of search development is not about mining new
features or training new models, but about building new components
Skills - Operations / SRE
Do not forget - typically, the search engine is the core of your business.
If it goes down, customers cannot find what they want and go to competitors. It
costs a lot
Having a good SRE/ops team who can operate a high load, complex distributed
system with multiple dependencies on other systems is indispensable for your
search engine operations
Skills - operations / SRE
There are multiple dimensions of complexity.
Your scientists, search quality, and search infrastructure people will be continuously
improving your search engine from performance, efficiency, search quality, and other
points of view.
You need to build a DevOps and operations team who can support such complexity
Skills - UX
Everybody talks about LTR and Query understanding, but satisfaction of your
customers and revenues of your business depends a lot on UX
I saw surprisingly trivial UX changes which caused huge conversion/revenue gains
You need designers who know how to build Search UX
But these designers must be data driven, understanding how to run UX A/B or
other experiments and how to interpret their results.
Skills - UX
Search pages are necessarily complex. There are many search results, there is a
lot of information about each search result in every snippet, and there are other
interaction elements (filters, maps)
Any UX performance problem causes lost customer satisfaction and revenue
You need ‘full stack’ engineers who know how to build the whole stack, from the
search engine API/query language to the final efficient rendering of pages
Skills - Product managers
Product managers have multiple different roles in search engine
development - perhaps they even deserve different titles
1 Getting a continuous stream of feature requests from businesses. Working with
data and business leadership to understand whether the business truly needs these
features or not. Frequently, businesses are disconnected from consumer, behavior and
other data and may not have the right assessment of the importance of a certain
feature. Good PMs create a good connection between business and engineering,
sometimes giving higher priority to a feature request, sometimes proving that it should
not be implemented (engineering and other costs do not justify the business gain, or
the feature may actually cause negative business results - it’s not obvious without
data)
Skills - Product managers
2 Building search metrics which reflect the true interests of the business but which
are implementable and pursuable for the engineering team. It’s not enough to say
we want to have higher revenue/profit, higher CSAT score etc; many other metrics
may serve customers and the business, be understandable and usable by the business,
and be useful to train ML models
3 Designing a roadmap which will improve search metrics while taking into account
tons of development constraints from efficiency, data, and other areas
Being a good search PM requires deep technical skills, business
understanding, and communication with both sides and more
Skills - statisticians/Data Scientists
Design and analysis of search experiments, getting insights from search
experiments,
Analysing metrics and connecting them with customer experience and business.
Analyzing customer behaviour, getting insights, what’s right or wrong in the
search,
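As one concrete instance of experiment analysis, here is a sketch of a two-sided two-proportion z-test on conversion rates from an A/B test. The counts are invented, and real search experiments often need more care (ratio metrics, per-user variance, multiple comparisons).

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """z statistic and two-sided p-value for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical experiment: control 1000/20000 conversions, treatment 1100/20000.
z, p_value = two_proportion_z_test(1000, 20000, 1100, 20000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```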
Skill - Data Engineering
A typical search engine produces billions of customer based events (search, click
on result page, refinement, map view etc) per day.
It consumes billions of other events (update of a web page or other source of
information) per day.
A typical search engine lives on streams of petabytes of data per day, and their
processing is crucial both for operations and improvements (model training etc)
Running a search team
Invest heavily in continuous training and professional development in every
stream. Each area is actively developing and there is huge margin between good
and better in performance and impact on your business in every workstream. All
education efforts pay back well
Invest heavily in a good collaboration culture, visibility, alignment, team
connections - create a clique of connections. In Search, everyone may
surprisingly help (or hurt) the performance of anyone else and may contribute a lot to
your business. Most areas are heavily interdependent. High visibility/alignment
within the whole org helps launch bigger impact features/products with better
quality. People are happier when they know all the details of what they are doing.
Running a search team
Invest heavily in an engineering culture. Search engines are very complex systems
(at a certain moment Google was the biggest system by lines of code, I believe) and
such complex systems cannot function well without high quality engineering at
every step. People are happier when they produce high quality work
Invest heavily in experimentation infrastructure (everybody knows about it) and
experimentation culture (little known) - available to everybody. Businesses got
huge gains from search experiments run by PM and even business owners, rather
than by scientists only. But it’s a culture and education across whole org, not
limited to engineering
Running a search team
Invest in high visibility of the search team’s work to other teams, stakeholders,
and business owners. Search has a huge impact on business, but due to its natural
complexity, its impact is not always fully understood by or visible to non
engineering people. Visibility affects prioritization, resource allocation, and many
other things. It is important that anybody else in the organization has high
visibility into what happens in search, what results it brought, and how it works
QA
Addendum
What makes a good Search Engineer
What makes a good search engineer
This part of the presentation is about the qualities of a good search engineer
and how to build a career in AI/Search
1. How to be successful in your search projects and what makes you a good
search engineer
2. How to be successful in a long term career building
Qualities of a good Search engineer
Required knowledge for long term success in search (to be able to deliver
multiple successful projects with company level impact):
1. Machine Learning, new models, new features,
2. Engineering, implementing software solutions with performance, quality, etc
requirements
3. Metrics / Customer, transforming customer experience into metrics which can
be used for ML training, experiments/analysis
4. Statistics, design and analysis of experiments
5. Business, understanding business, how to transform business development
into metrics/OKRs, and consequently into new search features, new
search products
Qualities of a good Search engineer
Many search features require changes in many parts of search stack: indexing,
ranking, query understanding, evaluation setups
Requires collaboration with many different teams: engineering, MLE, research,
statisticians.
Ability to collaborate at large scale with multiple diverse teams: communications,
document writing, project organization at multiple levels from coding to project
management to product management
Qualities of a good Search engineer
Sometimes, search development work requires a long effort by one person or a small
team, where help from management or from colleagues will not change much
This requires the ability to keep a long term focus and to work in an isolated,
result focused environment (PhD style work)
Qualities of a good Search engineer
Ability to work on long term projects with no guaranteed outcomes
Many search projects are focused on improving certain customer satisfaction
metrics, (the number of local results, the number of new relevant results etc etc),
improving the model, feature set, something else.
Frequently, there is no guarantee that it’s achievable. Some search projects
require work with multiple unsuccessful tries before finding a good solution
Requires certain persistence to go through failure to failure before finding a
successful solution
Qualities of a good Search engineer
Understanding the customer, and the skill of transforming an understanding of
customer needs into actionable metrics
A lot of search development is not about continuous improvement of one
relevance, query understanding, index size etc metric, but about discovering and
understanding of various aspects of customer satisfaction and transforming this
understanding into new metrics, which can be used for training models,
measurement and improvements of the search
Qualities of a good Search engineer
Continuous awareness of new developments in many areas of
ML/IR/NLP/statistics which can be used to improve search
Continuous professional development, learning, reading, in machine
learning/AI/NLP/IR, engineering/programming, and other professional skills
Qualities of a good Search engineer
Success of many big projects and initiatives depends on collaboration with
multiple teams from other technology teams to business departments (legal,
marketing, etc)
Ability to find support and convince people with very different points of view
about the importance, criteria of success, and impact of technology projects
and
Ability to listen to feedback and proposals of very different people from business to
tech, objectively understand it and incorporate it into technology development
Qualities of a good Search engineer
The engineering part is super important and frequently underestimated in many
articles and books. Only a small part of search development is training new
models. The other part is development of new product features, building
infrastructure to serve models, etc. Software engineering is a part of the job.
Search engines have strict performance limits, and the search engine is the face of
your business. If it’s down, the business is down. Quality engineering matters.
Knowing how to write good, high quality, performant code, and how to test it, tune
it, document it, etc is a crucial part of search engineering success.
Long term career success as a Search engineer
Reputation is the number 1 success criterion of a long term career.
Your reputation as an engineer, MLE, leader, collaborator. The reputation of you,
the teams you built, etc
Reputation among engineering teams, business teams, your peers, partners and
your leadership.
Reputation based on different qualities, from building large scale systems to
success in ML projects to understanding business needs and transforming them
into engineering products
The first 15 years of a career are focused on building a reputation
Long term career success
Select only jobs which truly suit you
Next job offer: analyze the company: values, culture, technology area, business
vision - is it what you want?
This is very important for the first job you get after college or a PhD - a good
initial fit is crucial
Assess companies: will you relate to the business, culture and people?
What you learn there will define your career for several decades
Do a systematic assessment of every job offer - but the first job after
college or a PhD is especially important
Long term career success
The best job is a job with a company that suits you
When you select your next step, be sure that the company’s culture, values, product,
and engineering fit you, your development goals, and your values. Do not move because
of a popular technology, a big title, a sudden unexpected salary increase, hype, or
other reasons incidental to your long term career
Long term career success
Focus on development of long term professional relationships
Develop a diverse base of meaningful work connections with colleagues from
different technology departments, different lines of business, marketing, legal,
recruiters etc, based on joint work and the reputation you earn as you work with them
Long term career success
Within your company, move to more strategic projects with big impact on the
company business
Strategic projects - More opportunities for career development, more meaningful
work connections, more things to learn for long term career goals , typically more
interesting technologies, more to learn about business, technology, customer,
more opportunity for self development, more skills, more knowledge
Long term career success
Move to more strategic and bigger impact contributions in your area of work
First job - develop models, develop software features as requested by mentor,
manager
Move from individual projects to team projects, from coding and model training to
defining vision, strategy, roadmap, execution, building teams
In *every* role and project, widen your scope, do more challenging tasks, bigger
impact on the company business
Long term career success
Do not complain, make changes
It applies to code, technology, org structure, culture, relationships, products,
anything you believe can be improved
Do not just complain about things going wrong. Fix them whenever possible, by
coding, writing documentation, making people aware of what is wrong and
proposing solutions. At every level of your career, you can make bigger changes
than are expected of you at that stage. Bring changes rather than whine.
Even if a problem is well above your role, propose solutions, notify relevant
people, bring value to solve it, rather than just complain.
Long term career success
Continuous professional development is crucial at every step of the career
Every year ask yourself questions,
over last 12 months
1. How much have I learned about the technologies, the products, the services, the
markets? What part of this knowledge is relevant to my work? How much did
it help to improve my performance (or the performance of my team)?
2. How many new people have I gotten to know at work? How diverse is this
set of people? How many people have I improved relationships with?
Long term career success
Continuous professional development is crucial at every step of the career
Over last 12 months
1. What new results, accomplishments have I achieved? What have I launched,
improved? How much does it add to my reputation? Track record?
2. What new skills have I developed? Am I better in communications?
Technology? Analytics skills? Judgement? In which areas?
How can I do it better next year? What should I improve? How to apply these new
skills, relationships, knowledge?

Datasciencein E-commerce industry
 
INSEAD Sharing on Lazada Data Science and my Journey
INSEAD Sharing on Lazada Data Science and my JourneyINSEAD Sharing on Lazada Data Science and my Journey
INSEAD Sharing on Lazada Data Science and my Journey
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine Learning
 
Getting Your Supply Chain Back on Track with AI
Getting Your Supply Chain Back on Track with AIGetting Your Supply Chain Back on Track with AI
Getting Your Supply Chain Back on Track with AI
 
SMU BIA Sharing on Data Science
SMU BIA Sharing on Data ScienceSMU BIA Sharing on Data Science
SMU BIA Sharing on Data Science
 
Self Service Reporting & Analytics For an Enterprise
Self Service Reporting & Analytics For an EnterpriseSelf Service Reporting & Analytics For an Enterprise
Self Service Reporting & Analytics For an Enterprise
 
How Lazada ranks products to improve customer experience and conversion
How Lazada ranks products to improve customer experience and conversionHow Lazada ranks products to improve customer experience and conversion
How Lazada ranks products to improve customer experience and conversion
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning Models
 
Ecr presentation ss chain - jeffrey - final
Ecr presentation   ss chain - jeffrey - finalEcr presentation   ss chain - jeffrey - final
Ecr presentation ss chain - jeffrey - final
 
20150118 s snet analytics vca
20150118 s snet analytics vca20150118 s snet analytics vca
20150118 s snet analytics vca
 
Carmelo Iaria, AI Academy - How The AI Academy is accelerating NLP projects w...
Carmelo Iaria, AI Academy - How The AI Academy is accelerating NLP projects w...Carmelo Iaria, AI Academy - How The AI Academy is accelerating NLP projects w...
Carmelo Iaria, AI Academy - How The AI Academy is accelerating NLP projects w...
 
Deep learning for e-commerce: current status and future prospects
Deep learning for e-commerce: current status and future prospectsDeep learning for e-commerce: current status and future prospects
Deep learning for e-commerce: current status and future prospects
 
HP Vertica
HP Vertica HP Vertica
HP Vertica
 
What's New in Predictive Analytics IBM SPSS - Apr 2016
What's New in Predictive Analytics IBM SPSS - Apr 2016What's New in Predictive Analytics IBM SPSS - Apr 2016
What's New in Predictive Analytics IBM SPSS - Apr 2016
 
AWS Summit Webinar Edition | Modern Data Architecture | Microsoft Application...
AWS Summit Webinar Edition | Modern Data Architecture | Microsoft Application...AWS Summit Webinar Edition | Modern Data Architecture | Microsoft Application...
AWS Summit Webinar Edition | Modern Data Architecture | Microsoft Application...
 
IBM SPSS Overview Text Analytics Brief
IBM SPSS Overview Text Analytics BriefIBM SPSS Overview Text Analytics Brief
IBM SPSS Overview Text Analytics Brief
 
Guiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning PipelineGuiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning Pipeline
 

AI in multi billion search engines. Building AI and Search teams

  • 2. AI in multi billion search engines. Building AI and Search teams
  • 3. My goal for this talk is to explain: 1. how AI/ML in search engines improves customer experience, revenue/GMV, and operational costs, and helps to win more customers and serve them better; 2. how to build AI systems for search; 3. how to build global AI teams and make them successful.
  • 4. My message: Successful AI in Search is an AI infrastructure, an engineering and science culture, and toolsets that make it possible to continuously introduce, measure, and improve AI features in every part of the search engine, rather than several SOTA models in ranking or query understanding. The same applies to every large-scale consumer- or business-facing platform (recommendation engines, call center analytics, etc.). Developing this type of AI solution places certain requirements on the teams that build large-scale consumer-facing AI software systems.
  • 5. Multi billion? (customers, dollars, documents) We focus on search engines with many billions of dollars in revenue/GMV, billions (or hundreds of millions) of users, and billions of documents. At this scale, investment in AI infrastructure is justified because it improves revenue by hundreds of millions or billions of dollars, or saves infrastructure costs on a large scale, which in turn justifies developing many AI applications for search.
  • 6. Typical Search Engine - High Level View
  • 7. Search Engine - high level view. Many other *key* parts are not in the picture: ● Experimentation and other frameworks to support ‘Search Science’: design, analysis, debugging, and deployment of various query understanding, ranking, etc. models ● Evaluation: continuous monitoring of result quality and analysis of user behaviour ● Logging, monitoring, alerting (to serve all the others, feed consumer behavior systems, and react to operational problems), etc.
  • 8. AI is everywhere. Using AI components to improve data acquisition by 10%, indexing by 10%, query understanding by 10%, ranking by 10%, and the result page by 10% gives more gains in customer satisfaction and revenue/GMV than applying SOTA to improve only one component, such as query understanding or ranking, by 30%. AI development should be driven by creating infrastructure, culture, and toolsets for continuous AI deployment, improvement, and measurement everywhere in the search stack. There are no ‘engineering’ teams; every team is an AI team.
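A quick arithmetic sketch of the claim on this slide. It assumes, which the slide does not state, that per-stage gains compound multiplicatively from data acquisition through to the result page; the function name is illustrative only.

```python
# Assumption (not from the talk): gains in each stage multiply end to end.
stages = ["data acquisition", "indexing", "query understanding",
          "ranking", "result page"]


def compounded_gain(per_stage_gain: float, n_stages: int) -> float:
    """Total relative gain when each of n_stages improves by per_stage_gain."""
    return (1.0 + per_stage_gain) ** n_stages - 1.0


broad = compounded_gain(0.10, len(stages))  # 10% in each of five stages
narrow = compounded_gain(0.30, 1)           # 30% in a single stage
print(f"five stages at 10%: {broad:.1%}, one stage at 30%: {narrow:.1%}")
```

Under this assumption the broad approach yields roughly 61% against 30%. Even if the gains were only additive, five stages at 10% would still exceed a single 30% improvement.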
  • 9. AI is not separable from engineering. AI development is not separable from other engineering development. Improving index selection by 10% lets us either accommodate other data sources to improve coverage (by getting more documents) or improve ranking (by spending more computation on more advanced functions, since there are fewer documents to rank). Improving infrastructure to decrease latency by 10% lets us deploy more sophisticated ranking or query understanding functions (10% more time). Good engineering quality and culture are not separable from AI development; they are a mandatory part of AI culture.
  • 10. AI ‘rank labs’ and AI platforms. Search engine teams benefit from a unified environment to train, deploy, and serve models: it reduces work on infrastructure and MLOps, allows sharing metrics, and makes it easy to measure end-to-end metrics. No single environment will handle all types of AI development, AI serving, and AI measurement. Search systems are naturally complex, with different tasks, environments, languages, and CI/CD systems, but the never-ending work of unifying AI development infrastructure across search teams helps.
  • 11. Multiple ways to ‘deploy’ AI models: Deploy models to TF Serving, TorchServe, etc. Deploy models served in a container. Compile a model directly into machine code or into the source code of a search component (GBDT models into C++/Java code to be used in ranking). Relearn and change the parameters of existing served models. Tons of other deployment scenarios, etc.
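To make the "compile a GBDT model into source code" option concrete, here is a minimal sketch that turns a toy tree ensemble into C++ source text. The dictionary tree format, the function names, and the generated `score` signature are all hypothetical and chosen for illustration; real GBDT libraries have their own model formats and export tools.

```python
# Hypothetical tree format: internal nodes carry feature/threshold/left/right,
# leaves carry a single "value". This is not the format of any real library.
def tree_to_cpp(node, indent=1):
    """Emit nested if/else C++ statements for one decision tree."""
    pad = "    " * indent
    if "value" in node:
        return f"{pad}return {node['value']};\n"
    return (
        f"{pad}if (f[{node['feature']}] <= {node['threshold']}) {{\n"
        + tree_to_cpp(node["left"], indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_cpp(node["right"], indent + 1)
        + f"{pad}}}\n"
    )


def ensemble_to_cpp(trees):
    """Emit one C++ function per tree plus a score() that sums them."""
    parts = [
        f"static double tree_{i}(const double* f) {{\n{tree_to_cpp(t)}}}\n"
        for i, t in enumerate(trees)
    ]
    calls = " + ".join(f"tree_{i}(f)" for i in range(len(trees))) or "0.0"
    parts.append(f"double score(const double* f) {{\n    return {calls};\n}}\n")
    return "\n".join(parts)


toy_trees = [
    {"feature": 0, "threshold": 0.5,
     "left": {"value": -1.0}, "right": {"value": 1.0}},
    {"feature": 2, "threshold": 10.0,
     "left": {"value": 0.25}, "right": {"value": 0.75}},
]
cpp_source = ensemble_to_cpp(toy_trees)
print(cpp_source)
```

The generated code has no model-loading step and no interpreter at query time, which is the appeal of this deployment style for latency-sensitive ranking.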
  • 12. Multiple ways to serve AI models in Search: Streaming (ex: document updates). Batch (ex: offline processing of queries, users, or documents). Runtime serving services (ranking, query understanding).
  • 13. Multiple ways to improve AI in Search: Change evaluation methods and metrics, and train models to the new metrics. Change sampling procedures. Change training procedures. Change modeling techniques. Model previously unmodeled tasks. New data and new features. Infrastructure changes in serving, etc.
  • 14. AI platforms. So, it’s almost impossible to make one platform that handles all types of AI development and deployment (see the variety in the previous slides). But unifying some of the tasks reduces development and operation effort and cost and increases the velocity of AI development, bringing a lot of money and customer satisfaction. Every AI-driven search company has created a “Rank Lab” AI platform for search. Now there are open source tools, such as Kubeflow and MLflow, that simplify development.
  • 15. AI services in prod. It is good if they are decoupled, so that multiple small teams can work on services independently. But wiring is needed (a ‘signal’ from query understanding to be used in ranking, etc.). Processing a query may call dozens of AI services; processing a document in data acquisition and indexing may call hundreds of AI services. Performance considerations are extremely important. AI infrastructure benefits greatly from common software practices, protocols, and orchestration to organize this ‘AI chaos’ and make order out of it.
  • 16. AI in prod. Besides ML objective functions and metrics, a. latency, b. resiliency, c. throughput, and d. resource utilization are super important factors in the design of every AI service in the search engine stack. AI services should be tested and benchmarked against them during development. Your model will serve billions of document updates or billions of queries; every 1 ms of delay, 1 ms of downtime, etc. will cost either a bad user experience or millions in ops.
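As a rough illustration of benchmarking a model call against latency and throughput, the sketch below times a stand-in scoring function and reports nearest-rank percentiles. `fake_model`, the request shape, and the choice of p50/p99 are assumptions for illustration; a production benchmark would also cover concurrency, warm-up, and resource utilization.

```python
import math
import time


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, with q in (0, 100]."""
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def benchmark(fn, inputs):
    """Call fn on every input once; report latency percentiles and throughput."""
    latencies_ms = []
    start = time.perf_counter()
    for x in inputs:
        t0 = time.perf_counter()
        fn(x)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    latencies_ms.sort()
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
        "throughput_per_s": len(inputs) / elapsed if elapsed > 0 else float("inf"),
    }


def fake_model(features):
    # Hypothetical stand-in for a real scoring call.
    return sum(w * v for w, v in zip((0.2, 0.5, 0.3), features))


report = benchmark(fake_model, [(1.0, 2.0, 3.0)] * 1000)
print(report)
```

Tail percentiles such as p99 are reported rather than the mean because, at billions of calls, the slow tail is what users and downstream services actually experience.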
  • 17. AI in Search. Several use cases to demonstrate that AI is driving search in every part of the search stack.
  • 18. Indexing - index selection. Perhaps one of the first AI applications in the search domain. It started in the 90s, when web volume increased and index selection strategy started to be important. Which documents should be indexed? In which index layers should they be placed? (Many modern search engines are multi-level: a smaller index for frequently searched items, and a very big and comprehensive index for rarely searched items.) AI for quality and popularity assessment of ‘documents’.
  • 19. Indexing - duplicate resolution. Which ‘documents’ are essentially the same (represent the same item)? Or have highly duplicated content, so that the second document does not carry more information? Which documents are the same with respect to a particular query? (We do not want to show the collocated Target and Target Pharmacy for the local search query ‘target’, but they are different entities for the query ‘pharmacy’.)
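One common baseline for the "highly duplicated content" question is Jaccard similarity over word shingles. The sketch below is that baseline only, not the method used in the talk; the shingle size and the 0.8 threshold are arbitrary assumptions, and it does not address query-dependent duplicates such as the Target example.

```python
def shingles(text, k=3):
    """Set of k-word shingles of a text (lowercased, whitespace tokenized)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a, doc_b, threshold=0.8):
    # The threshold is an illustrative assumption, to be tuned on labeled pairs.
    return jaccard(shingles(doc_a), shingles(doc_b)) >= threshold


a = "red mountain bike 26 inch wheels"
b = "red mountain bike 26 inch wheels for sale"
similarity = jaccard(shingles(a), shingles(b))
print(similarity, is_near_duplicate(a, b))
```

At index scale, pairwise comparison is too expensive, so systems typically approximate this with sketches such as MinHash; the exact set version here is only meant to show the underlying measure.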
  • 20. Indexing - attribute extraction. Given documents (full-text descriptions of houses, ecommerce products, businesses, etc.), extract the significant attributes that search needs to understand the items of interest (size, wheel size, weight, number of pages, location, view).
  • 21. Index - statistical tasks. Evaluation of the quality and size of the index. Does our index provide good coverage? What categories are missing? What data quality problems are there? Evaluation of the index size of external systems.
  • 22. Index - data quality. Which attributes are important and must be mandatory, and which can be optional in data acquisition? Which types of data and which categories should be acquired? AI processes continuously look into search logs to determine customer priorities and what drives conversion, and use this information to drive data acquisition. Also detection of spam, fraud, and adult content.
  • 23. Demand generation beyond just data. The same type of AI evaluation procedures can compute and forecast future demand for items, to drive purchase decisions if the search engine is used to sell items.
  • 24. AI for query understanding. Query understanding is the mapping of a customer’s query into a machine-understandable format, in order to retrieve a set of relevant items and rank them by the probability of customer engagement (view, purchase, etc.). Synonym expansion for better retrieval, removal of insignificant terms, correction of spelling and other errors, term weighting, attribute and entity extraction, compound and phrase extraction, classification (novelty, price range, etc.), etc.
  • 25. Query understanding: Classification. Mapping a query into a certain set of categories to be used in retrieval and ranking: -> most probable document category (italian -> restaurants in local search); -> most probable distance (gas -> 5 miles distance, Michelin restaurant -> 50 miles distance in local search); -> novelty (printers -> released within 1 year, pillows -> release date does not matter). Typically: 100s of classifiers per search engine, with significant impact on quality / revenue.
  • 26. Query understanding: Similar Queries. Given queries q1 and q2, how similar are they (how good will the results for one query be as results for the other query)? Tons of applications in query understanding and ranking: given features for one query, apply them to another query for ranking, extend the retrieval set, etc.
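One simple way to approximate "results for one query are good for the other" is to compare the documents users engaged with for each query. The sketch below uses cosine similarity over click counts; the click data, document ids, and the choice of cosine are illustrative assumptions, not the approach described in the talk.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse click-count vectors."""
    dot = sum(a[d] * b[d] for d in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical click logs: query -> clicks per document id.
clicks = {
    "cheap flights": Counter({"doc1": 30, "doc2": 10}),
    "low cost flights": Counter({"doc1": 15, "doc2": 5}),
    "pillows": Counter({"doc9": 12}),
}

sim_close = cosine(clicks["cheap flights"], clicks["low cost flights"])
sim_far = cosine(clicks["cheap flights"], clicks["pillows"])
print(sim_close, sim_far)
```

Behavior-based similarity only works for queries with enough traffic; rare queries usually need a text- or embedding-based fallback.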
  • 27. Query understanding: entity and attribute extraction. Given a query, map it into a structured representation of entities and attributes to be used for better retrieval and ranking.
  • 28. AI for ranking. Learning to rank / machine-learning-based ranking technologies (LeToR / MLR) to rank documents. AI for unbiased ranking. User interaction based LeToR / Counterfactual.
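Learning-to-rank models are commonly evaluated (and sometimes trained) against list-wise metrics such as NDCG. The slide does not name a metric, so the sketch below is an assumption: it shows the standard exponential-gain form of DCG and NDCG on graded relevance labels.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with exponential gain 2^rel - 1."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Relevance labels of documents in the order the ranker returned them.
perfect = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
print(perfect, reversed_order)
```

Labels from clicks are biased by position, which is why the slide pairs ranking with unbiased and counterfactual methods; this metric alone does not correct for that.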
  • 29. AI for search assistance Typeahead prediction - language modeling, other contextual information, location of the user, previous searches of the user. Query dependent, user dependent navigational panels and guided search
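As a baseline for typeahead prediction, completions can be ranked by how often they appear in past queries. This is a minimal sketch assuming an in-memory query log; the log contents and the function name `suggest` are hypothetical, and it ignores the language-model, location and personalization signals listed on the slide.

```python
from collections import Counter

def suggest(prefix, query_log, k=3):
    """Return up to k past queries starting with prefix, most frequent first."""
    counts = Counter(q.lower() for q in query_log)
    prefix = prefix.lower()
    matches = [(q, c) for q, c in counts.items() if q.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for stable ties.
    matches.sort(key=lambda qc: (-qc[1], qc[0]))
    return [q for q, _ in matches[:k]]

# Hypothetical query log.
log = ["iphone case", "iphone case", "iphone charger", "ipad",
       "iphone 12", "iphone charger", "iphone case"]
print(suggest("iph", log, k=2))  # ['iphone case', 'iphone charger']
```

Contextual signals would enter as extra terms in the sort key, or as features of a learned ranker over the candidate completions.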
  • 30. AI Whole page Given the output of several search engines: how to combine them to construct the best customer experience. Ex: music, video, books, podcasts as in iTunes; web search, maps, YouTube, news, images, books, scholar etc in Google
  • 31. AI SERP snippet How to generate the best descriptions of items in the search result page so customers understand relevance of items without clicking on them How to select the best chunk of text representing the item, picture, formats, - depending on the query and the user
  • 32. AI price prediction Predict the price of the item (for search engines that sell items) that will maximize item conversion, customer satisfaction, and company revenue (an economics problem, but tightly connected to search: it depends on item position in search, relevance, exposure, and prices of other search results)
  • 33. AI conversational search Conversational interfaces for search: multi-turn interactions with customers to understand their search intent and help them express it, or even find it by making latent intent explicit. NLP/NLU, dialog state management, deep reinforcement learning, text generation, ASR for voice based systems
  • 34. Post search AI Given a set of queries relevant to user (saved queries, previous sessions) and a set of items relevant to users Generate email and other notifications about new items, price changes, availability changes - which will help users to find/buy/discover what they want
  • 35. Building Search Teams Running search engines which are the front face of a business, for example real estate (Zillow, Trulia) or eCommerce (Walmart, eBay), is both different from and similar to running search engines such as Google Web search. I’ll focus mostly on search engines which are the front face of a business
  • 36. Do you need a search team? Some companies buy a Search SaaS service, some companies ask consultants to build a search engine for them. It might work when search is not core to your business, that is, when your customers’ satisfaction and your profits do not depend on it. Say, search of web forum threads on your web site, or any other non-critical search. When a multi billion business and the satisfaction of hundreds of millions of users depend on your search engine, and improving it may significantly improve revenue and customer satisfaction numbers, the only way is to control and own the search engine and have your organization own and develop it
  • 37. Do you need a search team? “We believe that we need to own and control the primary technologies behind the products we make,” was added to Apple values when Tim Cook became the CEO of Apple. Quite a good value, and proven to be a very effective one. This value is effective in any other business besides Apple. If the search engine is the primary technology behind your business/products, the only way to operate it is to own and control it
  • 38. Typical roles in the Search team There are multiple critical roles in Search. I’ll describe some of them. There are more roles. The exact composition of your search team depends on your business: some teams are more AI ranking heavy where the search ranking is the most business critical, some teams are more backend engineering heavy when your business success depends on integration with many other systems (availability, pricing) Success factor: all roles must be ‘AI aware’ and know how AI works to make the whole team successful
  • 39. Skills - search infrastructure When building a search engine, you’ll either build it on top of existing open source such as Solr or Elasticsearch, or build it from scratch. You’ll need experts in the technology you use, since you can improve performance, operation costs, resilience, etc only if your team knows this technology deeply. Performance is an important characteristic of a search engine, so you’ll need experts in the technologies used (for example, Java backend engineers for Solr, which is written in Java, or experts in building high load distributed systems in C++ if you implement your engine in C++)
  • 40. Skills - search infrastructure Depending on the scale of your operations, a lot of search operations and performance improvements may come from algorithmic improvements in particular, for example more efficient index structures or more efficient fuzzy match algorithms. You’ll need experts in algorithms in those particular areas
  • 41. Skills - Search Quality aka Search Science Search quality consists of multiple subcomponents: natural language processing for query understanding, machine learning based ranking, and others (index selection etc, depending on your business). You’ll need good Machine Learning engineers and applied scientists, preferably with a background in IR, Search, LTR, NLP (the distribution depends on your business). All of these are well established, long-standing areas and there are people who are experts already. Typically, a good generalist ML engineer catches up quite fast, but you need a set of people who are deep experts in search quality to be the core of your team and to guide the others
  • 42. Skills - Search Quality aka Search Science What I noticed: abstract Applied Scientists who only know how to train models do not work well in search. You need more of an MLE type profile: good in science, but capable of building and improving efficient systems. A lot of search development is not about mining new features or training new models, but about building new components
  • 43. Skills - Operations / SRE Do not forget: typically, the search engine is the core of your business. If it goes down, customers cannot find what they want and they go to competitors, which costs a lot. Having a good SRE/ops team who can operate a high load, complex distributed system with multiple dependencies on other systems is indispensable for your search engine operations
  • 44. Skills - operations / SRE There are multiple dimensions of complexity. Your scientists, search quality and search infrastructure people will be continuously improving your search engine from performance, efficiency, search quality, and other points of view. You need to build a devops and operations team who can support such complexity
  • 45. Skills - UX Everybody talks about LTR and query understanding, but the satisfaction of your customers and the revenues of your business depend a lot on UX. I saw surprisingly trivial UX changes which caused huge conversion/revenue gains. You need designers who know how to build Search UX. But these designers must be data driven, understanding how to run UX A/B or other experiments and how to interpret their results.
  • 46. Skills - UX Search pages are necessarily complex. There are many search results, there is a lot of information about each search result in every snippet, and there are other interaction elements (filters, maps). Any UX performance problem causes lost customer satisfaction and revenue. You need ‘full stack’ engineers who know how to build the whole stack, from the search engine API/query language to the final efficient rendering of pages
  • 47. Skills - Product managers Product managers have multiple different roles in search engine development, which perhaps even deserve different titles. 1 Getting a continuous stream of feature requests from the business. Working with data and business leadership to understand whether the business truly needs these features or not. Frequently, the business is disconnected from consumer behavior and other data and may not have the right assessment of the importance of a certain feature. Good PMs create a good connection between business and engineering: sometimes giving higher priority to a feature request, sometimes proving that it should not be implemented (engineering and other costs do not justify the business gain, or the feature may actually cause negative business results, which is not obvious without data)
  • 48. Skills - Product managers 2 Building search metrics which reflect the true interests of the business but which are implementable and pursuable by the engineering team. It’s not enough to say we want higher revenue/profit, a higher CSAT score etc; many other metrics may serve customers and the business, be understandable and usable by the business, and be useful for training ML models. 3 Designing a roadmap which will improve search metrics while taking into account the many development constraints from efficiency, data and other areas. Being a good PM in search requires deep technical skills, business understanding, and communication with both sides and more
  • 49. Skills - statisticians/Data Scientists Design and analysis of search experiments, getting insights from search experiments. Analyzing metrics and connecting them with customer experience and business. Analyzing customer behaviour, getting insights into what’s right or wrong in the search
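The analysis of a search experiment often starts with a significance check on a conversion metric. Below is a minimal sketch of a two-sided two-proportion z-test for an A/B test, assuming independent users per arm and large samples; the function name and the example counts are hypothetical, and real analyses usually also handle multiple metrics, variance reduction and repeated looks at the data.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for conversion rates of arms A and B.

    Assumes the pooled conversion rate is strictly between 0 and 1.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: control converts 100/1000, treatment 150/1000.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z={z:.2f}, p={p:.4f}")
```

A small p-value only says the difference is unlikely under the null hypothesis; connecting it to customer experience and business impact is the part of the role the slide emphasizes.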
  • 50. Skill - Data Engineering A typical search engine produces billions of customer based events (search, click on result page, refinement, map view etc) per day. It consumes billions of other events (update of a web page or other source of information) per day. A typical search engine lives on data streams of petabytes per day, and their processing is crucial both for operations and for improvements (model training etc)
  • 51. Running a search team Invest heavily in continuous training and professional development in every stream. Each area is actively developing and there is a huge margin between good and better in performance and impact on your business in every workstream. All education efforts pay back well. Invest heavily in a good collaboration culture, visibility, alignment, team connections - create a clique of connections. In Search, everyone may surprisingly improve (or hurt) the performance of anyone else and may contribute a lot to your business. Most areas are heavily interdependent. High visibility/alignment within the whole org helps launch bigger impact features/products with better quality. People are happier when they know all the details of what they are doing.
  • 52. Running a search team Invest heavily in an engineering culture. Search engines are very complex systems (at a certain moment Google was the biggest system by lines of code, I believe) and such complex systems cannot function well without high quality engineering at every step. People are happier when they produce high quality work. Invest heavily in experimentation infrastructure (everybody knows about it) and experimentation culture (little known), available to everybody. Businesses have gotten huge gains from search experiments run by PMs and even business owners, rather than by scientists only. But it’s a culture and education across the whole org, not limited to engineering
  • 53. Running a search team Invest in high visibility of the search team’s work to other teams, stakeholders, and business owners. Search has a huge impact on the business. But due to its natural complexity, its impact is not always fully understandable and visible to non-engineers. Visibility affects prioritization, resource allocation, and many other things. It is important that anybody else in the organization has high visibility into what happens in search, what results it brought, and how it works
  • 54. QA
  • 55. Addendum What makes a good Search Engineer
  • 56. What makes a good search engineer This part of the presentation is about what are qualities of a good search engineer and how to build career in AI/Search 1. How to be successful in your search projects and what makes you a good search engineer 2. How to be successful in a long term career building
  • 57. Qualities of a good Search engineer Required knowledge for long term success in search (to be able to deliver multiple successful projects with company level impact): 1. Machine Learning: new models, new features 2. Engineering: implementing software solutions with performance, quality, etc requirements 3. Metrics / Customer: transforming customer experience into metrics which can be used for ML training, experiments/analysis 4. Statistics: design and analysis of experiments 5. Business: understanding the business, how to transform business development into metrics/OKRs, and consequently into new search features, new search products
  • 58. Qualities of a good Search engineer Many search features require changes in many parts of search stack: indexing, ranking, query understanding, evaluation setups Requires collaboration with many different teams: engineering, MLE, research, statisticians. Ability to collaborate at large scale with multiple diverse teams: communications, document writing, project organization at multiple levels from coding to project management to product management
  • 59. Qualities of a good Search engineer Sometimes, search development work requires a long effort by one person or a small team, where help from management or from colleagues will not change much. This requires the ability to keep a long term focus and to work in an isolated, result focused environment (PhD style work)
  • 60. Qualities of a good Search engineer Ability to work on long term projects with no guaranteed outcomes Many search projects are focused on improving certain customer satisfaction metrics, (the number of local results, the number of new relevant results etc etc), improving the model, feature set, something else. Frequently, there is no guarantee that it’s achievable. Some search projects require work with multiple unsuccessful tries before finding a good solution Requires certain persistence to go through failure to failure before finding a successful solution
  • 61. Qualities of a good Search engineer Understanding the customer, and the skill of transforming an understanding of customer needs into actionable metrics. A lot of search development is not about continuous improvement of one relevance, query understanding, index size etc metric, but about discovering and understanding various aspects of customer satisfaction and transforming this understanding into new metrics, which can be used for training models and for measuring and improving the search
  • 62. Qualities of a good Search engineer Continuous awareness of new developments in the many areas of ML/IR/NLP/statistics which can be used to improve search. Continuous professional development, learning, and reading in machine learning/AI/NLP/IR, engineering/programming, and other professional skills
  • 63. Qualities of a good Search engineer Success of many big projects and initiatives depends on collaboration with multiple teams, from other technology teams to business departments (legal, marketing, etc). Ability to find support and convince people with very different points of view about the importance, criteria of success, and impact of technology projects. Ability to listen to feedback and proposals from very different people, from business to tech, objectively understand them and incorporate them into technology development
  • 64. Qualities of a good Search engineer The engineering part is super important and frequently underestimated in many articles and books. Only a small part of search development is the training of new models. The other part is development of new product features, building infrastructure to serve models, etc; software engineering is a part of the job. Search engines have strict performance limits, and the search engine is the face of your business. If it’s down, the business is down. Quality engineering: knowing how to write good, high quality, performant code, how to test it, tune it, document it, etc is a crucial part of search engineering success.
  • 65. Long term career success as a Search engineer Reputation is the number 1 success criterion for long term career success. Your reputation as an engineer, MLE, leader, collaborator. The reputation of you, the teams you built, etc. Reputation among engineering teams, business teams, your peers, partners and your leadership. Reputation based on different qualities, from building large scale systems to success in ML projects to understanding business needs and transforming them into engineering products. The first 15 years of a career are focused on building a reputation
  • 66. Long term career success Select only jobs which truly suit you. Next job offer: analyze the company: values, culture, technology area, business vision - is it what you want? Assess the company: will you relate to its business, culture and people? What you learn there will define your career for several decades. Do a systematic assessment of every job offer, but especially the first job after college or a PhD, where a good initial fit is crucial
  • 67. Long term career success The best job is a job with a company that suits you. When you select your next step, be sure that the company culture, values, product, and engineering fit you, your development goals, and your values. Do not move because of a popular technology, a big title, a sudden unexpected salary increase, hype, or other reasons incidental to your long term career
  • 68. Long term career success Focus on development of long term professional relationships. Develop a diverse base of meaningful work connections with colleagues from different technology departments, different lines of business, marketing, legal, recruiters etc, based on joint work and your reputation as you work with them
  • 69. Long term career success Within your company, Move to more strategic projects with big impact on the company business Strategic projects - More opportunities for career development, more meaningful work connections, more things to learn for long term career goals , typically more interesting technologies, more to learn about business, technology, customer, more opportunity for self development, more skills, more knowledge
  • 70. Long term career success Move to more strategic and bigger impact contributions in your area of work. First job: develop models, develop software features as requested by your mentor or manager. Move from individual projects to team projects, from coding and model training to defining vision, strategy, roadmap, execution, building teams. In *every* role and project, widen your scope, do more challenging tasks, have a bigger impact on the company business
  • 71. Long term career success Do not complain, make changes. It applies to code, technology, org structure, culture, relationships, products, anything you believe can be improved. Do not just complain about things going wrong. Fix them whenever possible. By coding, writing documentation, making people aware of wrong things and proposing solutions, at every level of your career, you can make bigger changes than are expected of you at this step of your career. Bring changes rather than whine. Even if a problem is well above your role, propose solutions, notify relevant people, bring value to solve it, rather than just complain.
  • 72. Long term career success Continuous professional development is crucial at every step of the career. Every year ask yourself questions about the last 12 months: 1. How much have I learned about the technologies, the products, the services, the markets? What part of this knowledge is relevant to my work? How much did it help to improve my performance (or the performance of my team)? 2. How many new people have I gotten to know at work? How diverse is this set of people? How many people have I improved relationships with?
  • 73. Long term career success Continuous professional development is crucial at every step of the career. Over the last 12 months: 1. What new results and accomplishments have I achieved? What have I launched or improved? How much does it add to my reputation? To my track record? 2. What new skills have I developed? Am I better in communications? Technology? Analytics skills? Judgement? In which areas? How can I do it better next year? What should I improve? How can I apply these new skills, relationships, and knowledge?