5. Pricing
■ Free (3 indexes, up to 50 MB storage)
■ Basic – $75/month (15 indexes, 2 GB storage, up to 3 search
units, up to 3 replicas)
■ Standard – $250/month (50 indexes, 25 GB storage, up to 36
search units, up to 12 replicas, up to 12 )
■ … up to $2850 (with High Density support)
6. Create datasource with portal UI
■ If you use Cosmos DB, add
Database=YOUR_DATABASE_NAME to the connection string
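A minimal sketch of the connection string adjustment described above. The helper name `with_database` and the sample account values are hypothetical placeholders, not part of any Azure SDK.

```python
def with_database(connection_string: str, database: str) -> str:
    """Append Database=<name> to a Cosmos DB connection string if it is missing."""
    parts = [p for p in connection_string.strip().split(";") if p]
    if any(p.lower().startswith("database=") for p in parts):
        # Already present: return the string unchanged (normalized to end with ';').
        return ";".join(parts) + ";"
    parts.append(f"Database={database}")
    return ";".join(parts) + ";"


# Placeholder values, not a real account.
base = "AccountEndpoint=https://example.documents.azure.com;AccountKey=PLACEHOLDER;"
print(with_database(base, "hotels-db"))
```

The resulting string is what would be pasted into the connection string field when creating the data source in the portal.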
10. Suggester vs Autocomplete
The Suggestions API suggests documents: it returns the IDs of documents
that contain the query term.
The Autocomplete API returns potential terms from the index that match the
partial term in the query.
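The difference can be illustrated with a toy in-memory model. This is only a sketch of the two behaviours, not the service's implementation; the documents and the function names `suggest` and `autocomplete` are made up for illustration.

```python
docs = {
    "1": "luxury hotel with wifi",
    "2": "budget motel",
    "3": "lakeside lodge with luxury spa",
}


def suggest(partial: str) -> list[str]:
    """Suggestions-style lookup: return IDs of documents containing a matching term."""
    p = partial.lower()
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if any(word.startswith(p) for word in text.split())
    )


def autocomplete(partial: str) -> list[str]:
    """Autocomplete-style lookup: return index terms that complete the partial term."""
    p = partial.lower()
    terms = {word for text in docs.values() for word in text.split()}
    return sorted(t for t in terms if t.startswith(p))


print(suggest("lu"))       # ['1', '3']
print(autocomplete("lu"))  # ['luxury']
```

The same partial input yields documents in one case and terms in the other, which is the distinction the slide draws.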
11. Simple query syntax
wifi+luxury            searches for wifi and luxury together
"luxury hotel"         searches for the exact phrase
wifi | luxury          searches for wifi or luxury
wifi -luxury           searches for wifi without luxury
motel+(wifi | luxury)  terms can be grouped with parentheses
lux*                   searches for words starting with lux
12. Lucene query syntax - queryType=full
Range searches are constructed in Azure Search
through $filter expressions
OR or ||
AND, && or +
NOT, ! or -
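As a rough sketch, a full-syntax query with a range condition could be sent as a POST body like the one below. The field names `category` and `rating` are hypothetical; the range is expressed as an OData filter rather than Lucene's `[a TO b]` form.

```python
import json

# Request body for a search POST call (field names are hypothetical).
body = {
    # Fielded search plus a boolean operator, both part of the full Lucene syntax.
    "search": "category:luxury AND wifi",
    "queryType": "full",
    "searchMode": "all",
    # The range condition goes into an OData filter instead of a Lucene range.
    "filter": "rating ge 3 and rating le 5",
}
print(json.dumps(body, indent=2))
```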
13. NOT, ! or -
searchMode=any
wifi -luxury matches documents that contain wifi OR do not contain luxury
searchMode=all
wifi -luxury matches documents that contain wifi AND do not contain luxury
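The effect of searchMode on the query `wifi -luxury` can be modelled with plain boolean logic. This is a toy evaluation over sets of terms, assuming hypothetical documents A to D, not the service's scoring code.

```python
def matches(doc_terms: set[str], search_mode: str) -> bool:
    """Evaluate the query `wifi -luxury` against a set of document terms."""
    has_wifi = "wifi" in doc_terms
    lacks_luxury = "luxury" not in doc_terms
    if search_mode == "any":
        return has_wifi or lacks_luxury
    if search_mode == "all":
        return has_wifi and lacks_luxury
    raise ValueError(f"unknown searchMode: {search_mode}")


docs = {
    "A": {"wifi", "luxury"},
    "B": {"wifi"},
    "C": {"pool"},
    "D": {"luxury"},
}
for mode in ("any", "all"):
    print(mode, [name for name, terms in docs.items() if matches(terms, mode)])
# any ['A', 'B', 'C']
# all ['B']
```

With searchMode=any, document C matches even though it never mentions wifi, which is usually the surprising case.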
14. Stemming and lemmatization
Stemming is the process of finding the stem of a
given source word.
Lemmatization is the process of reducing a word form to its lemma,
that is, its normal (dictionary) form.
15. Analyzers
The default analyzer is Standard Lucene.
Lucene's English analyzer applies stemming using the Porter
stemming algorithm.
Microsoft's English analyzer performs lemmatization instead of
stemming.
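An analyzer is chosen per field in the index definition. The sketch below assumes a hypothetical index called `hotels` with made-up field names; `en.lucene` and `en.microsoft` are the analyzer names for the two English analyzers mentioned above.

```python
import json

# Field names are hypothetical; a field without "analyzer" uses the default (Standard Lucene).
fields = [
    {"name": "id", "type": "Edm.String", "key": True},
    {"name": "description_default", "type": "Edm.String", "searchable": True},
    {"name": "description_stemmed", "type": "Edm.String", "searchable": True,
     "analyzer": "en.lucene"},
    {"name": "description_lemmatized", "type": "Edm.String", "searchable": True,
     "analyzer": "en.microsoft"},
]
index_definition = {"name": "hotels", "fields": fields}
print(json.dumps(index_definition, indent=2))
```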
16. Escaping and encoding special characters
The following characters
+ - && || ! ( ) { } [ ] ^ " ~ * ? : /
must be escaped with a backslash (\).
Characters that are unsafe in URLs
" ` < > # % { } | ^ ~ [ ]
must be encoded. For example, the # character becomes %23.
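Both steps can be sketched in a few lines. The helper names `escape_term` and `encode_for_url` are hypothetical. The sketch assumes a backslash as the escape character, and it also escapes the backslash itself, which is an addition to the list above.

```python
import re
from urllib.parse import quote

# Special query characters; && and || are covered by escaping each & and | separately.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:/\\])')


def escape_term(term: str) -> str:
    """Prefix every query-syntax special character with a backslash."""
    return _SPECIAL.sub(r"\\\1", term)


def encode_for_url(query: str) -> str:
    """Percent-encode a query string so it is safe to place in a URL."""
    return quote(query, safe="")


print(escape_term("wi-fi (free)"))  # wi\-fi \(free\)
print(encode_for_url("c#"))         # c%23
```

Escaping is about the query parser and encoding is about the URL, so a term sent in a GET request generally needs escaping first and encoding second.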
17. Fuzzy search
"blue~" or "blue~1" would return "blue", "blues", and "glue"
but…
"business~analyst" means business OR analyst (a ~ in the middle of a term is not a fuzzy operator).
22. Data Change Detection Policy for SQL
"dataChangeDetectionPolicy" : {
  "@odata.type" : "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
  "highWaterMarkColumnName" : "[a rowversion or last_updated column name]"
}
"dataChangeDetectionPolicy" : {
  "@odata.type" : "#Microsoft.Azure.Search.SqlIntegratedChangeTrackingPolicy"
}
23. SQL Server ChangeTracking
ALTER DATABASE AdventureWorks
SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE AdventureWorks
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
24. Data Change Detection Policy for Cosmos DB
{
  "@odata.type" : "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
  "highWaterMarkColumnName" : "_ts"
}
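The idea behind a high-water mark can be sketched as follows. This is an illustrative model of incremental pulls keyed on `_ts`, not the indexer's actual code; `incremental_pull` and the sample rows are hypothetical.

```python
def incremental_pull(rows, high_water_mark):
    """Return rows changed since the last run, plus the new high-water mark."""
    changed = sorted(
        (r for r in rows if r["_ts"] > high_water_mark),
        key=lambda r: r["_ts"],
    )
    new_mark = changed[-1]["_ts"] if changed else high_water_mark
    return changed, new_mark


rows = [
    {"id": "1", "_ts": 100},
    {"id": "2", "_ts": 105},
    {"id": "3", "_ts": 110},
]
changed, mark = incremental_pull(rows, high_water_mark=0)
print([r["id"] for r in changed], mark)  # ['1', '2', '3'] 110

rows[1]["_ts"] = 120  # document 2 is updated later
changed, mark = incremental_pull(rows, mark)
print([r["id"] for r in changed], mark)  # ['2'] 120
```

A row that is physically deleted never appears in `changed`, so this scheme alone cannot detect deletions.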
28. Natural language processing skills
entity recognition
language detection
key phrase extraction
text manipulation
sentiment detection
29. Image processing
Optical Character Recognition (OCR)
textExtractionAlgorithm "handwritten" (for English only)
textExtractionAlgorithm "printed"
Identification of visual features
provided by ComputerVision in Cognitive Services
Indexers are available for Azure Cosmos DB, Azure SQL Database, Azure Blob Storage, and SQL Server hosted in an Azure VM.
The High Density tier is targeted at SaaS providers who build applications that support a large number of relatively small indexes in a single search service.
Max indexes per partition: 1000 (max 3000 per service).
Simple query syntax is the default.
https://docs.microsoft.com/en-us/azure/search/query-simple-syntax
The only difference between the Azure syntax and classic Lucene syntax is the lack of range searches ( mod_date:[20020101 TO 20030101] is not allowed in Azure Search ).
https://docs.microsoft.com/en-us/azure/search/query-lucene-syntax
Stem (English): the base of a word; also a plant stalk, or an origin.
Lemma (linguistics): the canonical, base form of a word.
Stemming uses algorithms (it often trims words by removing suffixes and endings to obtain the stem).
Lemmatization uses lookups in dictionaries that contain the various forms of words.
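The contrast between the two approaches can be shown with a toy example. The suffix rules below are a deliberately crude stand-in and not the Porter algorithm, and the lemma dictionary is a tiny hypothetical sample.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # toy rule set, not the Porter algorithm
LEMMAS = {"went": "go", "better": "good", "mice": "mouse", "running": "run"}


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Look the word up in a dictionary of known forms; fall back to the word itself."""
    return LEMMAS.get(word, word)


for w in ("running", "hotels", "went", "mice"):
    print(w, stem(w), lemmatize(w))
```

The rule-based stemmer turns "running" into "runn" and leaves "went" untouched, while the dictionary lookup maps them to "run" and "go" but knows nothing about words missing from its dictionary, such as "hotels".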
For English text, Lucene's English and Microsoft's English analyzers generally work better than the default analyzer.
An OData filter can be used together with simple or full query syntax in the search parameter.
https://docs.microsoft.com/en-us/rest/api/searchservice/support-for-odata
https://docs.microsoft.com/en-us/azure/search/query-odata-filter-orderby-syntax
Two options for SQL.
The second works only for tables, and not for tables with a composite primary key.
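Choosing between the two SQL policies could be sketched as a small helper that emits the matching JSON fragment. The function name `change_detection_policy` and the column name `RowVersion` are hypothetical; the `@odata.type` values are the ones shown on the slide.

```python
import json
from typing import Optional

HIGH_WATER_MARK = "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy"
INTEGRATED = "#Microsoft.Azure.Search.SqlIntegratedChangeTrackingPolicy"


def change_detection_policy(column: Optional[str] = None) -> dict:
    """High-water mark if a column is given, otherwise integrated change tracking."""
    if column:
        return {"@odata.type": HIGH_WATER_MARK, "highWaterMarkColumnName": column}
    return {"@odata.type": INTEGRATED}


# Option 1: high-water mark on a hypothetical rowversion column.
print(json.dumps({"dataChangeDetectionPolicy": change_detection_policy("RowVersion")}, indent=2))
# Option 2: SQL integrated change tracking (requires change tracking to be enabled).
print(json.dumps({"dataChangeDetectionPolicy": change_detection_policy()}, indent=2))
```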
https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/work-with-change-tracking-sql-server?view=sql-server-2017
https://docs.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN
-- Verify that version of the previous synchronization is valid.
-- Obtain the version to use next time.
-- Obtain changes.
COMMIT TRAN
Use soft delete if HighWaterMarkChangeDetectionPolicy was selected.
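A soft-delete setup might look like the sketch below. The column name `IsDeleted` and the marker value are hypothetical, and `split_rows` only models how marked rows would be treated; it is not the indexer's code.

```python
import json

# Column name and marker value are hypothetical examples.
soft_delete_policy = {
    "dataDeletionDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
        "softDeleteColumnName": "IsDeleted",
        "softDeleteMarkerValue": "true",
    }
}


def split_rows(rows, policy):
    """Separate rows to index from rows whose documents should be removed."""
    p = policy["dataDeletionDetectionPolicy"]
    col, marker = p["softDeleteColumnName"], p["softDeleteMarkerValue"]
    keep = [r for r in rows if str(r.get(col, "")).lower() != marker]
    drop = [r for r in rows if str(r.get(col, "")).lower() == marker]
    return keep, drop


rows = [{"id": 1, "IsDeleted": False}, {"id": 2, "IsDeleted": True}]
keep, drop = split_rows(rows, soft_delete_policy)
print(json.dumps(soft_delete_policy, indent=2))
print([r["id"] for r in keep], [r["id"] for r in drop])  # [1] [2]
```

Because the row stays in the table with a marker, a high-water mark pull still sees it as a change and the deletion can be propagated to the index.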