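Paradigm Shift in Search Engines
- Convergence of Big Data Analytics and Search -
Department of Computer Science, College of Informatics, Korea University
Jaewoo Kang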
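Research Background
Changing information needs of users
With the arrival of the Web 2.0 era of participation, sharing, and openness, information production and consumption have shifted to a user-centered structure
The volume of personal opinions and subjective information on the Web and SNS has exploded
Demand is growing for subjective information such as "a good Korean restaurant in Bundang for a formal meeting between the couple's families", "a thriller with a good plot twist", and "handbags in fashion"
•Demand for factual search (e.g., 'action movie') is flat or irregular, whereas subjective queries such as 'best action movie' and 'best SUV' keep increasing
Google search trend graph for the queries 'action movie' and 'best action movie'
(Google Trends, http://www.google.com/trends/)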
Aardvark: Large-Scale Social Search Engine 
(Horowitz and Kamvar, WWW2010) 
“64% of queries contain subjective element in Aardvark” 
(e.g., “Do you know of any great delis in Baltimore, MD?” 
“What are the things/crafts/toys your children have made that made them really proud of themselves?”) 
Acquired by Google in 2010 for USD 50,000,000 (about KRW 53 billion)
Factual search vs. consensus search
Growing demand for consensus search
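Search Engine vs. Consensus Engine
Limits of conventional document-based search engines
Objective information (e.g., 'action movies' or 'handbag prices') can be found with today's search engines, but subjective queries ('fun action movies', 'handbags in fashion these days') cannot be answered adequately
This calls for a fundamental paradigm shift to a new search technology that identifies the objects described in documents, treats those objects as the units to index, and aggregates and analyzes the user opinions scattered across many documents, object by object, in order to rank them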
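Simple product sorting by:
•lowest price
•highest price
•date listed
•most reviews
User reviews listed one after another
•hard to grasp the content
•hard to synthesize the information
Complicated option selection
Result list with no useful information beyond the TV's size in inches and its price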
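Search: "TV with a good price-performance ratio"
Purchase review | 2013.04.12
I hesitated a lot because this was an expensive electronics product bought online. After seeing the product installed, I am very satisfied. The screen is big, the picture is good, and I feel I got it at a low price, so I am happy.
LG Electronics 47LM6200
A TV with powerful performance for the price. | 2013.04.01
The product itself is an entry-level model at a low price. It has powerful features such as Internet and 3D, and after reading reviews here and there I found that everyone was satisfied, so I bought it with confidence. I think I bought a good product at a reasonable price. Thank you.
An excellent choice... LG Smart TV 47LM6200... | 2012.09.10
In particular, the remote control functions and the 3D glasses are much more convenient to use than those from company S. The 3D glasses are far more comfortable than other companies' battery-powered 3D glasses, and the clip-on type, which is handy for people who wear glasses, is a clever idea.
Clean picture quality and wall-mount installation: Good. Delivery delayed due to product supply | 2012.07.02
The picture is clean and clear, and above all it was installed very well as a wall mount, so I am satisfied.
Not bad. | 2013.04.19
For the price, this seems decent.
However, the mouse remote is a bit of a mixed blessing. A smart TV definitely needs one, but its sensitivity makes it quite uncomfortable to use. The basic remote is also ultra-simple, so simple that it is awkward to operate. Apart from the remote control system, it is not bad.
Samsung Electronics UN50ES6800F
Truly the best product & service. | 2013.07.31
I ordered yesterday and never expected delivery to be this fast!!! The delivery technician also installed it just the way I wanted. Above all, I want to say it is the best product for the price. Satisfied with everything!!
Satisfied with the friendly price. | 2012.12.18
I am satisfied to have bought it at a very good price. It is a Samsung Smart TV, and its performance and appearance are no different from what I saw in the department store. I have been using it for about two weeks and am satisfied with both its features and its appearance.
The best-value model for the price | 2013.03.21
I ordered in the evening and it was delivered the next morning!!! I bought the wall-mount version; it is big and should be great for watching movies. Good picture quality, good size, and lightning-fast delivery!!
Segments matched to the query
•"The product itself is an entry-level model at a low price" (LG 47LM)
•"It was a very good choice for the price." (LG 47LM)
•"I think it is a 3D smart LED TV with excellent performance for the price." (LG 47LM)
•"The screen is big, the picture is good, and I feel I got it at a low price, so I am happy." (Samsung UN50)
•"Above all, I want to say it is the best product for the price" (Samsung UN50)
•"I am satisfied to have bought it at a very good price" (Samsung UN50)
•"Size and picture quality are good for the price." (Samsung UN50)
Aspects matched to the query terms
•bought at a low price
•best for the price
•low price
•excellent performance for the price
•good size and picture quality for the price
•very good price
•decent for the price
•very good choice for the price
Segment Score (in slide order): 0.5, 0.8, 0.9, 0.7, 0.5, 0.8, 0.7, 0.6
Samsung total: 2.9
LG total: 2.6
Final search ranking
1. Samsung UN50ES6800F
2. LG 47LM6200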
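The aggregation in the TV example can be checked with a few lines of Python. This is only a sketch: the slide does not say which score belongs to which segment, so the grouping below (first four scores for Samsung, last four for LG) is an assumption, chosen because it reproduces the totals of 2.9 and 2.6 shown on the slide.

```python
# Segment scores in the order they appear on the slide.
# ASSUMPTION: the first four belong to Samsung UN50ES6800F and the last
# four to LG 47LM6200; the slide does not state this mapping, but this
# grouping reproduces its totals.
segment_scores = {
    "Samsung UN50ES6800F": [0.5, 0.8, 0.9, 0.7],
    "LG 47LM6200": [0.5, 0.8, 0.7, 0.6],
}

# Aggregate the segment scores per entity, then rank entities by total.
totals = {entity: round(sum(scores), 2) for entity, scores in segment_scores.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for rank, entity in enumerate(ranking, start=1):
    print(rank, entity, totals[entity])
```

Summing per entity rather than per review is what lets many short, individually weak opinions add up to a ranking signal.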
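Consensus Search
Users increasingly rely on Internet search when making decisions about purchases and cultural activities
Before attending a performance or buying a product, they consult other users' reviews and comments
Each review is written from its author's "subjective opinion"
Reading as many reviews as possible helps the decision
What is a consensus engine?
It analyzes in advance the many reviews that other users have already written
A search system that analyzes and aggregates other users' reviews from the viewpoint (query) the user cares about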
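Consensus Engine
Current search engines are not enough!
The information you want may be in the top few documents
But each document is only its author's opinion
It cannot represent the consensus of the public
But the answer already exists on the Web!
Many users post their opinions online in various forms (SNS, blogs, reviews)
Treat these online opinions as "latent votes"
Collecting and analyzing these already-expressed online opinions at query time makes consensus search possible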
Uhm.. Yeah.. It is noisy, but… 
Online Consumer Posts: 2nd most trusted form of advertising (The Nielsen Company, Q3 2011)
Is consensus search ever possible…? 
“Best Action Movies in 2013” 
Not immediately answerable with conventional search engines 
Because the answer should be based on consensus, which cannot be found in any single one of the "top-10" documents 
However, the answers are already on the Web 
Numerous implicit votes from people on the Web and Social Networks 
Only if we can process them …. 
… ONLINE! 
CONSENTO Overview 
The Key Ideas (I) 
Subdocument-level Indexing 
Capture semantics from user opinion more precisely 
The indexing unit is no longer a page but:
•a review within a page, if more than one review exists on the page,
•or a sentence within a review,
•or even a clause or phrase within a sentence discussing one aspect of the target entity
Maximal Coherent Semantic Unit (MCSU)
•the finest-grained indexing unit used in CONSENTO indexing
•the maximal subsequence of words within a sentence that carries a single coherent meaning
Indexing MCSUs instead of documents enables semantic analysis to be performed at indexing time
•facilitating the online processing of consensus search at query time
The Key Ideas (II) 
ConsensusRank: A Unique Ranking Method based on Public Sentiment 
Virtually all existing ranking methods rank target objects (either documents or entities) directly, based on their relevance to the query terms
In contrast, ConsensusRank ranks the entities indirectly by aggregating the scores of the referring segments (e.g., MCSUs) that match the query context
It can be viewed as a voting process where each reviewer casts a weighted vote on an entity with respect to a query by expressing positive or negative opinions about that entity
CONSENTO Architecture
(A) Indexing Subsystem (input: Web Documents; output: Entity Search Index)
1. Parsing & Preprocessing: DOM-tree Parsing, Contents Extraction (produces Review Contents)
2. Contents Segmentation: Sentence Splitter, MCSU Extraction (produces MCSUs)
3. Indexing: Inverted Entry Construction & Indexing
(B) Searching Subsystem (input: User Query; output: Entity List)
4. Query Parsing: Query Preprocessing & Expansion (produces Expanded Query)
5. Retrieval: Matching MCSU Retrieval (produces MCSU Posting List)
6. Ranking: Segment Grouping, Score Aggregation
I: Parsing & Preprocessing
The current working prototype of CONSENTO is built for the movie domain
CONSENTO crawled review pages from popular movie review sites such as IMDb and Metacritic
Review contents are extracted using DOM-tree parsing and XPath queries
Extracted information includes:
entity name (i.e., movie name),
review text,
date and time,
review quality (e.g., "20 out of 30 people found the review helpful")
II: Contents Segmentation
Split the review contents into MCSUs
e.g., "The storyline is ridiculous, the acting is laughable, and the camera work is terrible."
s1) "The storyline is ridiculous"
s2) "the acting is laughable"
s3) "the camera work is terrible"
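The segmentation step can be illustrated with a deliberately naive splitter. The slides do not describe how CONSENTO actually extracts MCSUs, so the comma-and-conjunction rule below is only an assumed stand-in (the function name `split_into_clauses` is hypothetical); real MCSU extraction would presumably need syntactic or semantic analysis.

```python
import re

def split_into_clauses(sentence: str) -> list[str]:
    """Naive stand-in for MCSU extraction: split on commas, dropping an
    'and' / 'but' that directly follows the comma."""
    body = sentence.strip().rstrip(".!?")
    parts = re.split(r",\s*(?:and\s+|but\s+)?", body)
    return [part.strip() for part in parts if part.strip()]

example = ("The storyline is ridiculous, the acting is laughable, "
           "and the camera work is terrible.")
for i, segment in enumerate(split_into_clauses(example), start=1):
    print(f"s{i}) {segment}")
```

On the example sentence this yields the same three segments as the slide; it would split incorrectly on sentences where a comma does not separate aspects (for instance, a list of actors).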
III: Indexing
CONSENTO indexes MCSUs in a conventional inverted index of the kind used in most modern search engines.
Only the mapping needs to be redefined logically, from (terms → documents) to (terms → MCSUs)
III: Indexing
* Conventional Indexing Method Example
Document #1 (Entity Name: Transformer 3): "excellent visual effects, but plot was hard to follow"
Feature 1: visual effects (sentiment: excellent)
Feature 2: plot (sentiment: hard to follow)
Bag of words for Doc#1: excellent, visual, effects, plot, hard, follow
Traditional inverted index
Term        Doc
excellent   #1
hard        #1
follow      #1
plot        #1
visual      #1
effects     #1
Query: "excellent plot". The system returns this document, even though "excellent" describes the visual effects rather than the plot.
III: Indexing
* Subdocument-level Indexing Example
"excellent visual effects, but plot was hard to follow"
Segment 1: "excellent visual effects"
Segment 2: "plot was hard to follow"
SegmentID   ObjectName      Feature          Sentiment
Segment 1   Transformer 3   visual effects   excellent
Segment 2   Transformer 3   plot             hard to follow
Sub-document level indexing
Term        SegmentID   ObjectName      Feature          Sentiment
excellent   SID1        Transformer 3   visual effects   excellent
visual      SID1        Transformer 3   visual effects   excellent
effect      SID1        Transformer 3   visual effects   excellent
plot        SID2        Transformer 3   plot             hard
hard        SID2        Transformer 3   plot             hard
follow      SID2        Transformer 3   plot             hard
Query: "excellent plot" doesn't match any segment
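The contrast between the two indexing examples can be reproduced in a small sketch. It assumes conjunctive (AND) matching of query terms and skips stop-word removal and stemming for brevity; `build_index` and `match_all` are hypothetical helper names, not part of CONSENTO.

```python
from collections import defaultdict

def build_index(units: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased term to the ids of the units that contain it."""
    index = defaultdict(set)
    for unit_id, text in units.items():
        for term in text.lower().replace(",", " ").split():
            index[term].add(unit_id)
    return index

def match_all(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of units containing every query term (assumed AND semantics)."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

# Same text indexed once as a whole document and once as two segments.
documents = {"Doc#1": "excellent visual effects, but plot was hard to follow"}
segments = {
    "SID1": "excellent visual effects",
    "SID2": "plot was hard to follow",
}

doc_index = build_index(documents)
seg_index = build_index(segments)

print(match_all(doc_index, "excellent plot"))     # document-level match
print(match_all(seg_index, "excellent plot"))     # no single segment has both terms
print(match_all(seg_index, "excellent effects"))  # both terms fall in SID1
```

The only change between the two indexes is the unit being indexed, which mirrors the slide's point that the mapping is redefined from documents to segments while the inverted index itself stays conventional.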
III: Indexing 
Simply treating an MCSU as a document 
Store additional information in each posting for use in the ranking stage 
MCSU posting structure
rid (Review ID)   ts    rq
r1                ts1   0.8
r2                ts2   0.4
r3                ts3   0.6
r4                ts4   0.9
r5                ts5   0.4
r6                ts6   0.5
r7                ts7   0.7
r8                ts8   0.6
r9                ts9   0.8

Site Name    Source ID
IMDb         w1
Flixster     w2
Metacritic   w3
Yahoo!       w4

Feature       id
music         a1
soundtrack    a2
story         a3
plot          a4
performance   a5
acting        a6

Sentiword   id
great       m1
excellent   m2
superb      m3
tragic      m4

Entity               id
Titanic              e1
Brokeback Mountain   e2
Dark Knight          e3
Avatar               e4

Term          Postings
Cameron       <s19, e4, [−], [m3], r7, w3>
Pandora       <s16, e4, [a2], [−], r6, w3>,
              <s18, e4, [−], [−], r6, w3>
tragic        <s7, e2, [a3], [m4], r3, w1>
performance   <s5, e1, [a6], [m6], r2, w1>,
              <s9, e2, [a6], [m3], r3, w1>,
              <s11, e2, [a6], [m1], r4, w1>,
              <s13, e3, [a6], [−], r5, w2>,
              <s15, e4, [a6], [−], r5, w3>,
              <s20, e3, [a6], [−], r8, w4>,
              <s21, e3, [a6], [m6], r9, w4>
soundtrack    <s4, e1, [a2], [−], r2, w1>,
              <s10, e2, [a2], [m2], r4, w1>,
              <s16, e4, [a2], [−], r6, w2>,
              <s22, e3, [a2], [m1], r9, w4>
plot          <s14, e3, [a4], [−], r5, w2>
acting        <s13, e4, [a6], [−], r9, w4>,
music         <s2, e1, [a1], [m1], r1, w1>,
              <s8, e2, [a1], [m1], r3, w1>
Yeston        <s2, e1, [a1], [−], r1, w1>,
story         <s1, e1, [a3], [m1], r1, w1>,
              <s7, e2, [a3], [−], r3, w1>,
              <s12, e2, [a3], [m2], r4, w1>,
              <s17, e4, [a3], [−], r6, w3>

Titanic
r1: (s1) the greatest love stories of all // (s2) and beautiful music from Yeston. // (s3) Everything about this movie was excellent...
r2: (s4) touching soundtrack, // (s5) and perfect handling of the known tragedy with nice performance. // (s6) This has the best love scene I have ever seen…
Brokeback Mountain
r3: (s7) beautiful tragic love story, // (s8) with great music. // (s9) superb performances in movies ever!
r4: (s10) The soundtrack is also excellent, // (s11) great performance, // (s12) excellent presentation of a love story…
The Dark Knight
r5: (s13) The performance by Heath Ledger was outstanding // (s14) and plot is amazing too…
r8: (s20) Joker shows phonemically awesome performance!…
r9: (s21) nice performance // (s22) and backed up with great soundtrack. // (s23) excellent casting!
Avatar
r6: (s15) Navi looks very real, good performance, // (s16) beautiful soundtrack that emphasize the vastness of the Pandora, // (s17) with love story. // (s18) The world of Pandora is stunning
r7: (s19) James Cameron deserves high praise for this creation…
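One way to picture the MCSU posting structure is as a small record type plus side tables. The field names below are an interpretation, not something the slide states: each posting tuple is read as (segment id, entity id, feature ids, sentiword ids, review id, source id), inferred from the id prefixes used in the tables. The `Posting` class itself is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Posting:
    """One inverted-index entry for a term, pointing at an MCSU.
    ASSUMPTION: field meanings are inferred from the slide's id prefixes."""
    segment_id: str            # s*: the MCSU that contains the term
    entity_id: str             # e*: the entity the review is about
    feature_ids: tuple = ()    # a*: features mentioned in the segment
    sentiword_ids: tuple = ()  # m*: sentiment words in the segment
    review_id: str = ""        # r*: key into the review table (ts, rq)
    source_id: str = ""        # w*: key into the site table

# Side tables, copied from the slide (only the rows needed here).
review_quality = {"r1": 0.8, "r3": 0.6}
entities = {"e1": "Titanic", "e2": "Brokeback Mountain"}

# Postings for two terms, copied from the slide's posting lists.
index = {
    "music": [
        Posting("s2", "e1", ("a1",), ("m1",), "r1", "w1"),
        Posting("s8", "e2", ("a1",), ("m1",), "r3", "w1"),
    ],
    "tragic": [
        Posting("s7", "e2", ("a3",), ("m4",), "r3", "w1"),
    ],
}

# Everything the ranking stage needs is reachable from the posting itself.
for posting in index["music"]:
    print(entities[posting.entity_id], review_quality[posting.review_id])
```

Keeping the entity and review ids inside each posting means the ranking stage can group and weight segments without going back to the original review text.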
IV: Query Parsing 
CONSENTO preprocesses the query and performs query expansion
stop-word removal,
polarity-only word removal,
feature expansion,
stemming
Polarity-only word removal
"good action movie" and "great action movie" should be treated as the same query
Feature words expanded for better recall 
‘plot’ → {plot, story} 
‘music’ → {music, soundtrack}
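The query parsing steps can be sketched as follows. The word lists and the plural-stripping stemmer are small hypothetical stand-ins (the slides do not give CONSENTO's actual lexicons or stemmer); only the two feature expansions are taken from the slide. Each query term becomes a set of alternatives, so an expanded feature can match any of its synonyms.

```python
# ASSUMPTION: tiny illustrative lexicons, not CONSENTO's real resources.
STOP_WORDS = {"a", "an", "the", "of", "in", "with"}
POLARITY_ONLY_WORDS = {"good", "great", "best", "nice"}
FEATURE_SYNONYMS = {  # these two expansions come from the slide
    "plot": {"plot", "story"},
    "music": {"music", "soundtrack"},
}

def naive_stem(term: str) -> str:
    """Very rough stemmer: strip a trailing plural 's' from longer words."""
    return term[:-1] if term.endswith("s") and len(term) > 3 else term

def preprocess_query(query: str) -> list[set[str]]:
    """Return one set of alternative terms per surviving query term."""
    expanded = []
    for term in query.lower().split():
        if term in STOP_WORDS or term in POLARITY_ONLY_WORDS:
            continue  # stop-word and polarity-only word removal
        term = naive_stem(term)
        expanded.append(FEATURE_SYNONYMS.get(term, {term}))  # feature expansion
    return expanded

print(preprocess_query("good action movie"))
print(preprocess_query("great action movie"))
print(preprocess_query("movies with great plot"))
```

Stemming is applied before expansion here so that a plural feature word still finds its synonym set; the slide lists stemming last, so the real order may differ.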
V: Retrieval 
Retrieve the MCSU segments that match the query terms
This works the same way that conventional systems retrieve document posting lists
VI: Ranking 
Group MCSU postings by entity and aggregate the scores of the postings to compute the score of the corresponding entity
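The grouping and aggregation step might look like the sketch below. The postings are a subset of the slide's posting list for "performance" and the rq values come from its review table, but the scoring rule is an assumption: each matching segment simply votes with its review's quality score, and sentiment polarity is ignored. The actual ConsensusRank formula is not given in the text of the slides.

```python
from collections import defaultdict

# (segment id, entity id, review id) for segments matching "performance";
# a subset of the slide's posting list.
matched_postings = [
    ("s5", "e1", "r2"),
    ("s9", "e2", "r3"),
    ("s11", "e2", "r4"),
    ("s13", "e3", "r5"),
    ("s20", "e3", "r8"),
    ("s21", "e3", "r9"),
]
review_quality = {"r2": 0.4, "r3": 0.6, "r4": 0.9, "r5": 0.4, "r8": 0.6, "r9": 0.8}
entity_names = {"e1": "Titanic", "e2": "Brokeback Mountain", "e3": "Dark Knight"}

def rank_entities(postings, quality):
    """Group matching segments by entity and sum their votes."""
    scores = defaultdict(float)
    for _segment_id, entity_id, review_id in postings:
        # ASSUMPTION: vote weight = review quality (rq) of the source review.
        scores[entity_id] += quality[review_id]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_entities(matched_postings, review_quality)
for entity_id, score in ranking:
    print(entity_names[entity_id], round(score, 2))
```

A fuller version would also use the sentiment words stored in each posting, so that negative opinions lower an entity's score instead of raising it.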
Experimental Setup: Data Set
Movie data sets
Source
•Amazon, IMDb, Metacritic, Flixster, Rotten Tomatoes and Yahoo Movies
Period
•2008 ~ 2010
More than 740 movies and 30K reviews
Hotel data sets
Hotel data set from Ganesan and Zhai
Reviews for hotels in 10 major cities from TripAdvisor
The authors provided us with the corrected judgment set for our test
Experiment 
Methods 
Ganesan and Zhai's OE and QAM methods 
•Opinion expansion word 
•Query aspect model 
Baseline 
1) BM25 
•b = 0.75 
•k1 = 2 
2) VSMBM (Lucene default) 
•Vector space model + Boolean model 
3) ConsensusRank
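For reference, the BM25 baseline with the parameters listed above (b = 0.75, k1 = 2) can be sketched as below. This is the standard Okapi BM25 formula, not the authors' exact baseline code; the IDF variant (with +1 inside the logarithm, which keeps it non-negative) and the toy documents are assumptions.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=2.0, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # ASSUMPTION: smoothed IDF variant, always non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy documents, hypothetical and only for illustration.
docs = [
    "great performance and great soundtrack".split(),
    "the plot is amazing".split(),
    "nice performance".split(),
]
scores = bm25_scores(["performance"], docs)
print(scores)
```

Unlike ConsensusRank, this scores each document on its own, so a baseline of this kind ranks individual reviews rather than the entities they discuss.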
Experimental Result - Movie
Experimental Result - Hotel
[Figure: destinations Hawaii, Cebu, and Gold Coast with their associated tags: Honeymoon, Snorkeling, Whale Watching, Active Volcano]
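1. Analyze and index the diverse information on the Web and social networks in advance
Thriller movie?
Thriller movie with a plot twist?
Backpack for college students?
Trustworthy used-car dealer?
Trustworthy daycare center nearby
2. Produce real-time results for ad-hoc decision-making queries
Hair salon for job-interview makeup
Good study spot near the cram school
Korean restaurant in Gangnam for a formal meeting between the couple's families
Accommodation for a backpacking trip
Good personal trainer in my neighborhood?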
best thriller with plot twist
The Artist vs. Jack and Jill 
good pizza restaurant
CONSENTO Local service example
'Napk-In' service example
'슝' service example
The latent consensus search market
Factual search
Consensus search
ENGINEERING KNOWLEDGE
SEARCHING WISDOM
CONSENTO
THANK YOU

Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 

Recently uploaded (20)

Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 

[2B1] A Paradigm Shift in Search Engines

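  • 2. Research background. Changing user information needs: the Web 2.0 era of participation, sharing, and openness has arrived; information production and consumption have shifted to a user-centered structure; and the volume of personal opinions and subjective information on the Web and SNS has exploded. Demand is growing for subjective information such as “a Korean restaurant in Bundang suitable for a formal meeting between the two families of an engaged couple”, “a thriller with a good twist”, and “trendy handbags”. Demand for factual search (e.g., ‘action movie’) is flat or irregular, whereas subjective queries such as ‘best action movie’ and ‘best SUV’ are increasing steadily. Figure: Google search trend graph for the queries ‘action movie’ and ‘best action movie’ (Google Trends, http://www.google.com/trends/)
  • 3. Factual search vs. consensus search: demand for consensus search is increasing. Aardvark: Large-Scale Social Search Engine (Horowitz and Kamvar, WWW2010): “64% of queries contain subjective element in Aardvark” (e.g., “Do you know of any great delis in Baltimore, MD?” “What are the things/crafts/toys your children have made that made them really proud of themselves?”). Aardvark was acquired by Google in 2010 for $50,000,000 USD (about 53 billion KRW).
  • 4. Search engine vs. consensus engine. Limits of existing document-based search engines: objective information (e.g., ‘action movie’ or ‘handbag price’) can be found with current search engines, but they cannot respond adequately to subjective queries (‘fun action movie’, ‘handbags in fashion these days’). This calls for a fundamental paradigm shift to a new search technology that identifies the objects described within documents, treats those objects as the units of indexing, and ranks them by aggregating and analyzing, per object, the user opinions scattered across many documents.
  • 5. Simple product sorting only (lowest price, highest price, registration date, most reviews). User reviews are merely listed, so the content is hard to grasp and the information is hard to synthesize. Complicated option selection. A result list with no useful information beyond the TV's size in inches and its price.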
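  • 6. Example: consensus search over TV reviews
      Reviews of LG Electronics 47LM6200:
        Purchase review | 2013.04.12: I hesitated a lot because this was an online purchase of an expensive electronics product. Having seen the product after installation, I am very satisfied. The screen is large, the picture is good, and I am pleased because I seem to have bought it cheaply.
        A TV with powerful performance for the price. | 2013.04.01: The product is an entry-level model, so the price is low. It has powerful features such as Internet and 3D, and after reading reviews here and there I found that everyone was satisfied, so I bought it with confidence. I think I got a good product at a reasonable price. Thank you.
        An excellent choice... LG Smart TV 47LM6200... | 2012.09.10: In particular, the remote control functions and the 3D glasses are much more convenient and usable than those of company S. The 3D glasses are far more comfortable than other companies' battery-powered 3D glasses, and the clip-on type, which is convenient for people who wear glasses, is a standout idea.
        Clean picture quality and wall-mount installation: good. Delivery delayed due to product supply. | 2012.07.02: The picture is clean, and above all I am satisfied that it was installed very well as a wall mount.
        Not bad. | 2013.04.19: For the price, this seems decent. However, the mouse remote is of doubtful value. A smart TV certainly needs one, but its sensitivity makes it quite uncomfortable to use. The regular remote is also extremely simple, so simple that it is hard to operate. Apart from the remote control system, it is not bad.
      Reviews of Samsung Electronics UN50ES6800F:
        Truly the best product & service. | 2013.07.31: I ordered yesterday and never expected delivery this fast!!! The delivery technician also installed it just the way I wanted. Above all, I want to say it is the best product for the price. Satisfied with everything!!
        Satisfied with the fair price. | 2012.12.18: I am satisfied to have bought it at a very good price. It is a Samsung smart TV, and its performance and appearance are not much different from what I had seen in department stores, so I am satisfied. I have been using it for about two weeks and am satisfied with both its features and its appearance.
        The best-value model for the price | 2013.03.21: I ordered in the evening and it was delivered the next morning!!! I bought the wall-mount version; it is large and should be great for watching movies. Good picture quality, good size, and lightning-fast delivery!!
      Search query: “TV with a good price-performance ratio”
      Matched segments, LG 47LM: “The product is an entry-level model, so the price is low”; “It was a very good choice for the price.”; “I think it is a 3D smart LED TV with excellent performance for the price.”; “The screen is large, the picture is good, and I am pleased because I seem to have bought it cheaply.”
      Matched segments, Samsung UN50: “Above all, I want to say it is the best product for the price”; “I am satisfied to have bought it at a very good price”; “Good size and picture quality for the price.”
      Aspect segments matched to the query terms: bought cheaply; best for the price; low price; excellent performance for the price; good size and picture quality for the price; very good price; decent for the price; very good choice for the price
      Segment scores: 0.5, 0.8, 0.9, 0.7, 0.5, 0.8, 0.7, 0.6
      Samsung total: 2.9; LG total: 2.6
      Final search ranking: 1. Samsung UN50ES6800F; 2. LG 47LM6200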
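  • 7. Consensus Search. Users now actively rely on Internet search to make decisions about purchases and cultural activities. Before attending a performance or buying a product, they consult other users' reviews and write-ups. Each review is based on its author's “subjective opinion”, so a user must read as many reviews as possible for them to help with a decision. What is a consensus engine? A search system that analyzes in advance the many reviews other users have already written, then analyzes and aggregates those reviews from the perspective (query) the user asks for.
  • 8. Consensus Engine. Current search engines are not enough! The top few documents may contain the desired information, but each document is only its author's opinion and cannot represent the consensus of the public. Yet the answer already exists on the Web! Many users post their opinions online in various forms (SNS, blogs, reviews). Treat these online opinions as “implicit votes”: collecting and analyzing already-expressed online opinions at query time makes consensus search possible.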
  • 9. Uhm.. Yeah.. It is noisy, but… Online Consumer Posts: 2nd most trusted form of advertising (The Nielsen Company, Q3 2011)
  • 10. Is consensus search ever possible…? “Best Action Movies in 2013” is not immediately answerable with conventional search engines, because the answer should be based on consensus, which cannot be found in any one of the “top-10” documents. However, the answers are already on the Web: numerous implicit votes from people on the Web and social networks. Only if we can process them … ONLINE!
  • 13. The Key Ideas (I): Subdocument-level Indexing. Capture semantics from user opinion more precisely. The indexing unit is no longer a page but: •a review within a page, if more than one review exists on the page, •or a sentence within a review, •or even a clause or phrase within a sentence discussing one aspect of the target entity. Maximal Coherent Semantic Unit (MCSU): •the finest-grained indexing unit used in CONSENTO indexing •a maximal subsequence of words within a sentence that carries a single coherent meaning. Indexing MCSUs instead of documents enables semantic analysis to be performed at indexing time, •facilitating the online processing of consensus search at query time.
  • 14. The Key Ideas (II): ConsensusRank, a Unique Ranking Method Based on Public Sentiment. Virtually all existing ranking methods rank target objects (either documents or entities) directly, based on their relevance to the query terms. In contrast, ConsensusRank ranks entities indirectly, by aggregating the scores of the referring segments (e.g., MCSUs) that match the query context. It can be viewed as a voting process in which each reviewer casts a weighted vote on an entity with respect to a query by expressing positive or negative opinions about that entity.
  • 15. CONSENTO Architecture. (A) Indexing Subsystem: Web Documents; 1. Parsing & Preprocessing (DOM-tree Parsing, Contents Extraction), yielding Review Contents; 2. Contents Segmentation (Sentence Splitter, MCSU Extraction), yielding MCSUs; 3. Indexing (Inverted Entry Construction & Indexing), into the Entity Search Index. (B) Searching Subsystem: User Query; 4. Query Parsing (Query Preprocessing & Expansion), yielding the Expanded Query; 5. Retrieval (Matching MCSU Retrieval), yielding the MCSU Posting List; 6. Ranking (Segment Grouping, Score Aggregation), yielding the Entity List.
  • 16. I: Parsing & Preprocessing. The current working prototype of CONSENTO is built for the movie domain. CONSENTO crawled review pages from popular movie review sites such as IMDB and Metacritic. Review contents are extracted using DOM-tree parsing and XPath queries. The extracted information includes: entity name (i.e., movie name); review text, date, and time; review quality (e.g., “20 out of 30 people found the review helpful”).
  • 17. II: Contents Segmentation. Split the review contents into MCSUs, e.g., “The storyline is ridiculous, the acting is laughable, and the camera work is terrible.” becomes s1) “The storyline is ridiculous” s2) “the acting is laughable” s3) “the camera work is terrible”
  • 19. III: Indexing. CONSENTO indexes MCSUs in a conventional inverted index of the kind used in most modern search engines. Only the mapping needs to be redefined logically, from (terms → documents) to (terms → MCSUs).
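The segmentation step above can be illustrated with a deliberately naive Python sketch. The slides do not describe how CONSENTO actually extracts MCSUs, so the rule used here (split on commas and semicolons, dropping a leading "and" or "but") and the function name `split_into_segments` are assumptions for illustration only.

```python
import re

# Assumed clause boundary: a comma (optionally followed by "and"/"but") or a semicolon.
_BOUNDARY = re.compile(r",\s*(?:(?:and|but)\b\s*)?|;\s*")


def split_into_segments(sentence: str) -> list[str]:
    """Split one sentence into clause-like segments (a rough stand-in for MCSUs)."""
    body = sentence.strip().rstrip(".!?")
    parts = _BOUNDARY.split(body)
    return [part.strip() for part in parts if part.strip()]


segments = split_into_segments(
    "The storyline is ridiculous, the acting is laughable, "
    "and the camera work is terrible."
)
for number, segment in enumerate(segments, start=1):
    print(f"s{number}) {segment}")
```

A real extractor would need linguistic analysis, since commas also separate list items and modifiers that do not form independent opinion units.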
  • 20. III: Indexing. Conventional Indexing Method Example
      Document #1 (Entity Name: Transformer 3): “excellent visual effects, but plot was hard to follow” (Feature 1 with its sentiment, Feature 2 with its sentiment)
      Bag of words for Doc#1: excellent, visual, effects, plot, hard, follow
      Traditional inverted index (Term → Doc): excellent → #1; hard → #1; follow → #1; plot → #1; visual → #1; effects → #1
      Query: “excellent plot”. The system returns this document.
  • 21. III: Indexing. Subdocument-level Indexing Example
      “excellent visual effects, but plot was hard to follow” is split into Segment 1 and Segment 2.
      Segments (SegmentID, ObjectName, Feature, Sentiment):
        Segment 1, Transformer 3, visual effects, excellent
        Segment 2, Transformer 3, plot, hard to follow
      Sub-document level index (Term, SegmentID, ObjectName, Feature, Sentiment):
        excellent, SID1, Transformer 3, visual effects, excellent
        visual, SID1, Transformer 3, visual effects, excellent
        effect, SID1, Transformer 3, visual effects, excellent
        plot, SID2, Transformer 3, plot, hard
        hard, SID2, Transformer 3, plot, hard
        follow, SID2, Transformer 3, plot, hard
      Query: “excellent plot” does not match any segment.
  • 22. III: Indexing. An MCSU is simply treated as a document. Additional information is stored in each posting for use in the ranking stage (MCSU posting structure).
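The contrast between the two indexing examples can be reproduced with a small Python sketch. It assumes a tiny stop-word list and requires every query term to appear in the same indexing unit; the helper names (`tokenize`, `build_index`, `match_all`) are illustrative and not part of CONSENTO.

```python
from collections import defaultdict

STOP_WORDS = {"but", "was", "to"}  # assumed minimal stop-word list


def tokenize(text):
    """Lowercase, drop commas, and remove stop words."""
    words = text.lower().replace(",", " ").split()
    return [word for word in words if word not in STOP_WORDS]


def build_index(units):
    """Map each term to the set of unit IDs (documents or segments) containing it."""
    index = defaultdict(set)
    for unit_id, text in units.items():
        for term in tokenize(text):
            index[term].add(unit_id)
    return index


def match_all(index, query):
    """Return the unit IDs that contain every query term."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


documents = {"Doc#1": "excellent visual effects, but plot was hard to follow"}
segments = {"SID1": "excellent visual effects", "SID2": "plot was hard to follow"}

doc_hits = match_all(build_index(documents), "excellent plot")
seg_hits = match_all(build_index(segments), "excellent plot")
print(doc_hits)  # the whole document matches, although the plot is criticized
print(seg_hits)  # no single segment pairs "excellent" with "plot"
```

Indexing at the segment level keeps the sentiment word and the feature it describes in the same unit, which is what prevents the false match.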
  • 23. Lookup tables, inverted index, and source reviews
      Review table (rid, ts, rq): r1, ts1, 0.8; r2, ts2, 0.4; r3, ts3, 0.6; r4, ts4, 0.9; r5, ts5, 0.4; r6, ts6, 0.5; r7, ts7, 0.7; r8, ts8, 0.6; r9, ts9, 0.8
      Site Name → Source ID: IMDb → w1; Flixster → w2; Metacritic → w3; Yahoo! → w4
      Feature → id: music → a1; soundtrack → a2; story → a3; plot → a4; performance → a5; acting → a6
      Sentiword → id: great → m1; excellent → m2; superb → m3; tragic → m4
      Entity → id: Titanic → e1; Brokeback Mountain → e2; Dark Knight → e3; Avatar → e4
      Term → Postings:
        Cameron: <s19, e4, [−], [m3], r7, w3>
        Pandora: <s16, e4, [a2], [−], r6, w3>, <s18, e4, [−], [−], r6, w3>
        tragic: <s7, e2, [a3], [m4], r3, w1>
        performance: <s5, e1, [a6], [m6], r2, w1>, <s9, e2, [a6], [m3], r3, w1>, <s11, e2, [a6], [m1], r4, w1>, <s13, e3, [a6], [−], r5, w2>, <s15, e4, [a6], [−], r5, w3>, <s20, e3, [a6], [−], r8, w4>, <s21, e3, [a6], [m6], r9, w4>
        soundtrack: <s4, e1, [a2], [−], r2, w1>, <s10, e2, [a2], [m2], r4, w1>, <s16, e4, [a2], [−], r6, w2>, <s22, e3, [a2], [m1], r9, w4>
        plot: <s14, e3, [a4], [−], r5, w2>
        acting: <s13, e4, [a6], [−], r9, w4>
        music: <s2, e1, [a1], [m1], r1, w1>, <s8, e2, [a1], [m1], r3, w1>
        Yeston: <s2, e1, [a1], [−], r1, w1>
        story: <s1, e1, [a3], [m1], r1, w1>, <s7, e2, [a3], [−], r3, w1>, <s12, e2, [a3], [m2], r4, w1>, <s17, e4, [a3], [−], r6, w3>
      Source reviews (by Review ID):
        Titanic, r1: (s1) the greatest love stories of all // (s2) and beautiful music from Yeston. // (s3) Everything about this movie was excellent...
        Titanic, r2: (s4) touching soundtrack, // (s5) and perfect handling of the known tragedy with nice performance. // (s6) This has the best love scene I have ever seen…
        Brokeback Mountain, r3: (s7) beautiful tragic love story, // (s8) with great music. // (s9) superb performances in movies ever!
        Brokeback Mountain, r4: (s10) The soundtrack is also excellent, // (s11) great performance, // (s12) excellent presentation of a love story…
        The Dark Knight, r5: (s13) The performance by Heath Ledger was outstanding // (s14) and plot is amazing too…
        Avatar, r6: (s15) Navi looks very real, good performance, // (s16) beautiful soundtrack that emphasize the vastness of the Pandora, // (s17) with love story. // (s18) The world of Pandora is stunning
        Avatar, r7: (s19) James Cameron deserves high praise for this creation…
        The Dark Knight, r8: (s20) Joker shows phonemically awesome performance!…
        The Dark Knight, r9: (s21) nice performance // (s22) and backed up with great soundtrack. // (s23) excellent casting!
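The posting tuples on this slide each carry six fields. A minimal Python sketch of such a posting follows, assuming the fields are, in order, segment ID, entity ID, feature IDs, sentiment-word IDs, review ID, and source ID. The field names are inferred from the lookup tables and are not taken from CONSENTO itself; an empty tuple stands for the slide's [−].

```python
from typing import NamedTuple


class Posting(NamedTuple):
    """One MCSU posting; field names are assumptions inferred from the lookup tables."""
    segment_id: str
    entity_id: str
    feature_ids: tuple
    sentiword_ids: tuple
    review_id: str
    source_id: str


# Two entries copied from the slide's inverted index.
index = {
    "tragic": [Posting("s7", "e2", ("a3",), ("m4",), "r3", "w1")],
    "plot": [Posting("s14", "e3", ("a4",), (), "r5", "w2")],
}

# Lookup tables from the slide, used to resolve IDs at ranking time.
entities = {"e1": "Titanic", "e2": "Brokeback Mountain", "e3": "Dark Knight", "e4": "Avatar"}
review_quality = {"r3": 0.6, "r5": 0.4}

for term, postings in index.items():
    for posting in postings:
        print(term, entities[posting.entity_id], review_quality[posting.review_id])
```

Keeping the entity and review IDs inside each posting lets the ranking stage group segments by entity and look up review quality without going back to the source text.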
  • 24. IV: Query Parsing. CONSENTO preprocesses the query and performs query expansion: stop-word removal, polarity-only word removal, feature expansion, and stemming. Polarity-only word removal: "good action movie" and "great action movie" should be treated as the same query. Feature words are expanded for better recall: ‘plot’ → {plot, story}, ‘music’ → {music, soundtrack}
  • 25. V: Retrieval. Retrieve the MCSU segments that match the query terms, in the same way that conventional systems retrieve document posting lists.
  • 26. VI: Ranking. Group the MCSU postings by entity and aggregate the scores of the postings to compute the score of the corresponding entity.
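The query-parsing steps above can be sketched in Python as follows. The word lists are small illustrative assumptions, only the two expansions shown on the slide are included, and stemming is left out for brevity; `preprocess_query` is a hypothetical name.

```python
STOP_WORDS = {"a", "an", "the", "of", "in", "with"}          # assumed list
POLARITY_ONLY_WORDS = {"good", "great", "best", "excellent"}  # assumed list
FEATURE_EXPANSIONS = {
    "plot": {"plot", "story"},
    "music": {"music", "soundtrack"},
}


def preprocess_query(query: str) -> list[set[str]]:
    """Return one set of alternative terms per remaining query word."""
    groups = []
    for word in query.lower().split():
        if word in STOP_WORDS or word in POLARITY_ONLY_WORDS:
            continue  # stop words and polarity-only words carry no retrieval signal here
        groups.append(set(FEATURE_EXPANSIONS.get(word, {word})))
    return groups


print(preprocess_query("good action movie"))
print(preprocess_query("great action movie"))
print(preprocess_query("movie with great plot"))
```

Returning a set per query word keeps the expansion explicit: a segment can satisfy the ‘plot’ slot by containing either "plot" or "story".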
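The grouping and aggregation step can be sketched as below. This assumes the aggregate is a plain sum of segment scores, which is what the per-brand totals in the TV example on slide 6 suggest; the entity names and scores here are made up for illustration, and `consensus_rank` is a hypothetical name rather than the authors' implementation.

```python
from collections import defaultdict


def consensus_rank(matched_postings):
    """Sum segment scores per entity and sort entities by total, highest first."""
    totals = defaultdict(float)
    for entity, segment_score in matched_postings:
        totals[entity] += segment_score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical (entity, segment score) pairs returned by the retrieval stage.
matched = [
    ("Movie A", 0.5),
    ("Movie B", 0.75),
    ("Movie A", 0.5),
    ("Movie B", 0.5),
    ("Movie C", 0.25),
]

ranking = consensus_rank(matched)
for rank, (entity, total) in enumerate(ranking, start=1):
    print(rank, entity, total)
```

Each matched segment acts as one weighted vote, so an entity with many moderately positive segments can outrank one with a single strong segment.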
  • 31. Experimental Setup: Data Set. Movie data sets: •sources: Amazon, IMDB, Metacritic, Flixster, Rotten Tomatoes, and Yahoo Movies •period: 2008 ~ 2010 •more than 740 movies and 30K reviews. Hotel data sets: the hotel data set from Ganesan and Zhai, containing TripAdvisor reviews for hotels in 10 major cities. The authors provided us with the corrected judgment set for our test.
  • 32. Experiment Methods. Ganesan and Zhai’s OE and QAM methods: •Opinion expansion word •Query aspect model. Baseline: 1) BM25 •b = 0.75 •k1 = 2; 2) VSMBM (Lucene default) •Vector space model + Boolean model; 3) ConsensusRank
  • 36. Figure: “Hawaii!” surrounded by repeated topic labels: Honeymoon, Snorkeling, Whale Watching, Active Volcano
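  • 37. 1. Analyze and index in advance the diverse information on the Web and social networks. 2. Derive real-time results for ad-hoc decision-making queries. Example queries: Thriller movie? Thriller movie with a twist? Backpack for college students? Trustworthy used-car dealer? Trustworthy daycare center nearby? Hair salon for job-interview makeup? Study spot worth visiting near the academy? Korean restaurant in Gangnam for a formal meeting between the two families of an engaged couple? Accommodation for a backpacking trip? A trainer in my neighborhood who is good at PT?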
  • 38. best thriller with plot twist
  • 39. The Artist vs. Jack and Jill
  • 40. good pizza restaurant