Best Practices and
Performance Tuning of
XML Queries in SQL Server
AD-501-M

Michael Rys
Principal Program Manager
Microsoft Corp

mrys@microsoft.com
@SQLServerMike




                            October 11-14, Seattle, WA
Session Objectives
• Understand when and how
  to use XML in SQL Server
• Understand and correct common
  performance problems with XML and
  XQuery
Session Agenda

XML Scenarios and when to store XML


XML Design Optimizations


General Optimizations


XML Datatype method Optimizations


XQuery Optimizations


XML Index Optimizations

                                      AD-501-M| XQuery Performance   3
XML Scenarios

Data Exchange between loosely-coupled systems
•   XML is a ubiquitous, extensible, platform-independent transport format
•   Message Envelope in XML
    Simple Object Access Protocol (SOAP), RSS, REST
•   Message Payload/Business Data in XML
•   Vertical Industry Exchange schemas
Document Management
•   XHTML, DocBook, home-grown, domain-specific markup (e.g.
    contracts), OpenOffice, Microsoft Office XML (both default and
    user-extended)
Ad-hoc modeling of semistructured data
•   Storing and querying heterogeneous complex objects
•   Semistructured data with sparse, highly-varying
    structure at the instance level
•   XML provides self-describing format and extensible schemas

     →Transport, Store, and Query XML data
Decision Tree: Processing XML In SQL Server
•   Does the data fit the relational model?
    Yes → Shred the XML into relations
    No → next question
•   Is the data semi-structured?
    Yes → Shred the structured XML into relations, store semistructured
          aspects as XML and/or sparse columns
          Known sparse → Shred known sparse data into sparse columns
          Open schema → continue with the XML questions
    No → next question
•   Is the data a document?
    Yes → continue with the XML questions
•   XML questions (overlaid on the slide)
    Query into the XML?                  No → Store as varbinary(max)
                                         Yes → Store as XML
    Search within the XML?               Yes → Define a full-text index
    Is the XML constrained by schemas?   Yes → Constrain XML if validation cost is ok
•   When querying into the XML
    Promote frequently queried properties relationally
    Use primary and secondary XML indexes as needed
SQL Server XML Data Type Architecture

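[Diagram: XML side and relational side.
 XML side: incoming XML passes through the XML Parser and optional Validation
 against an XML Schema Collection (which holds the XML Schemata) and is stored
 as the XML data type (binary XML). XML-DML and XQuery operate on the XML data type.
 Relational side: the PRIMARY XML INDEX materializes a Node Table; secondary
 PATH, PROP, and VALUE indexes are built over it.
 Bridges: OpenXML/nodes() turn XML into rowsets; FOR XML with the TYPE directive
 turns rowsets into XML.]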
General Impacts
Concurrency Control
•   Locks on both XML data type and relevant
    rows in primary and secondary XML Indices
•   Lock escalation on indices
•   Snapshot Isolation reduces locks and lock contention
Transaction Logs
•   Bulk insert into XML Indices may fill the transaction log
•   Delay the creation of the XML indexes and use the SIMPLE recovery
    model
•   Preallocate database file instead of dynamically growing
•   Place log on different disk
In-Row/Out-of-Row of XML large object
•   Moving XML into side table or out-of-row if
    mixed with relational data reduces scan time
Due to clustering, insertion into XML Index may not be linear
•   Choose an integer/bigint identity column as key
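
Some of the settings above can be sketched in T-SQL. This is only an illustration: the database name XmlDemo and the table dbo.Docs are hypothetical, and whether each setting helps depends on the workload.

```sql
-- Hypothetical database and table names, used only for illustration.

-- Allow snapshot isolation so that readers take fewer locks on the
-- XML column and on the XML index rows.
ALTER DATABASE XmlDemo SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

-- An integer identity column as the clustered primary key: XML index rows
-- are clustered on this key first, so new documents are appended at the end.
CREATE TABLE dbo.Docs (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    name nvarchar(100) NOT NULL,
    xDoc xml NULL
);
GO

-- Store large values (including xml) out of row so that scans over the
-- relational columns do not have to read the XML.
EXEC sp_tableoption N'dbo.Docs', 'large value types out of row', 1;
```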
Choose The Right XML Model
• Element-centric versus attribute-centric
     <Customer><name>Joe</name></Customer>
     <Customer name="Joe" />
  +: Attributes → often better query performance
  –: Parsing attributes → uniqueness check

• Generic element names with type attribute vs Specific
  element names
     <Entity type="Customer">
       <Prop type="Name">Joe</Prop>
     </Entity>
  <Customer><name>Joe</name></Customer>
  +: Specific names → shorter path expressions
  +: Specific names → no filter on type attribute
  /Entity[@type="Customer"]/Prop[@type="Name"] vs /Customer/name

• Wrapper elements
     <Orders><Order id="1"/></Orders>
  +: No wrapper elements → smaller XML, shorter path expressions
Use an XML Schema Collection?

Using no XML Schema (untyped XML)
•   Can still use XQuery and XML Index!!!
•   Atomic values are always weakly typed strings
    → compare as strings to avoid runtime
    conversions and loss of index usage
•   No schema validation overhead
•   No schema evolution revalidation costs

XML Schema provides structural information
•   Atomic typed elements are now using only one instead of two
    rows in node table/XML index (closer to attributes)
•   Static typing can detect cardinality and feasibility of expression

XML Schema provides semantic information
•   Elements/attributes have correct atomic
    type for comparison and order semantics
•   No runtime casts required and better use of index for value lookup
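
A minimal sketch of a typed XML column follows. The schema collection PersonSchema and the table dbo.TypedPeople are hypothetical names chosen for illustration; the schema only declares a person element with an xs:int age.

```sql
-- Hypothetical schema collection: <person><age>…</age></person>, age typed as xs:int.
CREATE XML SCHEMA COLLECTION PersonSchema AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="age" type="xs:int"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>';
GO

-- DOCUMENT restricts each instance to a single top-level element,
-- which helps static singleton inference.
CREATE TABLE dbo.TypedPeople (
    id  int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    doc xml(DOCUMENT PersonSchema) NOT NULL
);
GO

INSERT INTO dbo.TypedPeople (doc) VALUES (N'<person><age>42</age></person>');

-- age is known to be xs:int, so the comparison with 42 is numeric
-- and needs no explicit cast in the XQuery.
SELECT id
FROM dbo.TypedPeople
WHERE doc.exist('/person[age = 42]') = 1;
```

Inserting a document that violates the schema (for example, a non-numeric age) fails validation at insert time, which is the validation cost mentioned above.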

XQuery Methods

query() creates new, untyped XML data type
instance

exist() returns 1 if the XQuery expression returns
at least one item, 0 otherwise

value() extracts an XQuery value into the SQL
value and type space
• Expression has to statically be a singleton
• String value of atomized XQuery item is cast to
  SQL type
• SQL type has to be SQL scalar type
  (no XML or CLR UDT)
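
The three methods can be tried side by side on an XML variable. The sample document is made up for illustration.

```sql
-- Made-up sample document.
DECLARE @x xml = N'<doc><customer id="1" name="Joe"><order id="10"/></customer></doc>';

SELECT
    -- query(): returns a new, untyped xml instance
    @x.query('/doc/customer/order')           AS OrdersXml,
    -- exist(): 1 if the expression returns at least one item, 0 otherwise
    @x.exist('/doc/customer[@name = "Joe"]')  AS HasJoe,
    -- value(): the [1] makes the expression a static singleton
    @x.value('(/doc/customer/@id)[1]', 'int') AS CustomerId;
```

Dropping the `[1]` in the value() call raises a static error on untyped XML, because the expression can no longer be proven to return a single item.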
XQuery: nodes()

Returns a row per selected node as a special
XML data type instance
• Preserves the original structure and types
• Can only be used with the XQuery methods (but not
  modify()), count(*), and IS (NOT) NULL

Appears as a Table-valued Function (TVF) in the
query plan if no index is present
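
A small sketch of nodes() on an XML variable, using a made-up document. Each selected customer element becomes one row, and the other XML methods are then applied to the row's context node.

```sql
-- Made-up sample document.
DECLARE @x xml = N'<doc>
  <customer id="1" name="Joe"/>
  <customer id="2" name="Ann"/>
</doc>';

-- One row per /doc/customer node; c is the context node for each row.
SELECT c.value('@id', 'int')            AS CustID,
       c.value('@name', 'nvarchar(50)') AS CName,
       c.query('.')                     AS CustomerXml
FROM @x.nodes('/doc/customer') AS N(c);
```

Because a variable has no XML index, the plan for this query shows the table-valued function described above.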




sql:column()/sql:variable()

Map SQL value and type into XQuery values and types in context of XQuery or
XML-DML
• sql:variable(): accesses a SQL variable/parameter
  declare @value int
  set @value=42
  select * from T
  where
  T.x.exist('/a/b[@id=sql:variable("@value")]')=1
• sql:column(): accesses another column value

   tables: T(key int, x xml), S(key int, val int)

   select * from T join S on T.key=S.key
   where T.x.exist('/a/b[@id=sql:column("S.val")]')=1

• Restrictions in SQL Server:
   No XML, CLR UDT, datetime, or deprecated text/ntext/image
Improving Slow XQueries, Bad
FOR XML
demo




Optimal Use Of Methods
How to Cast from XML to SQL

BAD:
CAST( CAST(xmldoc.query('/a/b/text()') as
      nvarchar(500)) as int)
GOOD:
xmldoc.value('(/a/b/text())[1]', 'int')
BAD:
node.query('.').value('@attr',
                      'nvarchar(50)')
GOOD:
node.value('@attr', 'nvarchar(50)')



Optimal Use Of Methods
Grouping value() method
Group value() methods on the same XML instance next to
each other if the path expressions in the value()
methods are
• Simple path expressions that only use the child and attribute axes
  and do not contain wildcards, predicates, node tests, or ordinals
• Path expressions from which a singleton can be statically inferred

The singleton can be statically inferred from
• the DOCUMENT and XML Schema Collection
• Relative paths on the context node provided by the nodes()
  method

Requires XML index to be present
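
A sketch of what grouped value() calls can look like. The table dbo.CustomerDocs and its index name are hypothetical; the paths are simple attribute steps relative to the context node from nodes(), so each is statically a singleton.

```sql
-- Hypothetical table with a primary XML index (required for this optimization).
CREATE TABLE dbo.CustomerDocs (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    xDoc xml NOT NULL
);
CREATE PRIMARY XML INDEX pxi_CustomerDocs ON dbo.CustomerDocs (xDoc);
GO

-- The value() calls on the same context node are kept next to each other
-- and use only simple attribute paths without predicates or ordinals.
SELECT d.id,
       c.value('@id', 'int')            AS CustID,
       c.value('@name', 'nvarchar(50)') AS CName,
       c.value('@city', 'nvarchar(50)') AS City
FROM dbo.CustomerDocs AS d
CROSS APPLY d.xDoc.nodes('/doc/customer') AS N(c);
```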
Optimal Use of Methods
Using the right method to join and compare

  Use exist() method, sql:column()/sql:variable() and an
  XQuery comparison for checking for a value or joining
  if secondary XML indices present
    BAD:*
    select doc
    from doc_tab join authors
    on doc.value('(/doc/mainauthor/lname/text())[1]',
    'nvarchar(50)') = lastname
    GOOD:
    select doc
    from doc_tab join authors
    on 1 = doc.exist('/doc/mainauthor/lname/text()[. =
    sql:column("lastname")]')
  * If applied on XML variable/no index present, value()
  method is most of the time more efficient
Optimal Use of Methods
Avoiding bad costing with nodes()
nodes() without XML index is a Table-valued function (details later)
Bad cardinality estimates can lead to bad plans
   •   BAD:
       select c.value('@id', 'int') as CustID
            , c.value('@name', 'nvarchar(50)') as CName
       from Customer, @x.nodes('/doc/customer') as N(c)
       where Customer.ID = c.value('@id', 'int')
   •   BETTER (if only one wrapper doc element):
      select c.value('@id', 'int') as CustID
           , c.value('@name', 'nvarchar(50)') as CName
      from Customer, @x.nodes('/doc[1]') as D(d)
      cross apply d.nodes('customer') as N(c)
      where Customer.ID = c.value('@id', 'int')
Use temp table (insert into #temp select … from nodes()) or Table-
valued parameter instead of XML to get better estimates
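
The temp table variant can be sketched as follows. The table variable @Customer stands in for the Customer table of the slide and the data is made up; the point is only that the join runs against a temp table with real row counts instead of against nodes().

```sql
-- Stand-in for the Customer table used above (made-up data).
DECLARE @Customer TABLE (ID int PRIMARY KEY, Region nvarchar(20));
INSERT INTO @Customer VALUES (1, N'West'), (2, N'East');

DECLARE @x xml = N'<doc><customer id="1" name="Joe"/><customer id="2" name="Ann"/></doc>';

-- Shred once into a temp table.
SELECT c.value('@id', 'int')            AS CustID,
       c.value('@name', 'nvarchar(50)') AS CName
INTO #cust
FROM @x.nodes('/doc/customer') AS N(c);

-- Join against the temp table rather than against nodes().
SELECT t.CustID, t.CName, cu.Region
FROM @Customer AS cu
JOIN #cust AS t ON cu.ID = t.CustID;

DROP TABLE #cust;
```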
Optimal Use Of Methods
Avoiding multiple method evaluations
Use subqueries
   • BAD:
     SELECT CASE isnumeric (doc.value(
       '(/doc/customer/order/price)[1]', 'nvarchar(32)'))
      WHEN 1 THEN doc.value(
       '(/doc/customer/order/price)[1]', 'decimal(5,2)')
      ELSE 0 END
     FROM T
   • GOOD:
     SELECT CASE isnumeric (Price)
       WHEN 1 THEN CAST(Price as decimal(5,2))
       ELSE 0 END
     FROM (SELECT doc.value(
             '(/doc/customer/order/price)[1]',
             'nvarchar(32)') as Price FROM T) X

Use subqueries also with NULLIF()
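
NULLIF(a, b) is defined as CASE WHEN a = b THEN NULL ELSE a END, so its first argument appears twice once expanded. A sketch of the subquery form, reusing the table T and column doc from the example above (both assumed to exist):

```sql
-- T(doc xml) is assumed from the example above.
-- The value() call is written once in the derived table;
-- NULLIF then works on the extracted column.
SELECT NULLIF(Price, N'') AS Price
FROM (SELECT doc.value('(/doc/customer/order/price)[1]',
                       'nvarchar(32)') AS Price
      FROM T) AS X;
```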
Combined SQL And XQuery/DML Processing
         SELECT x.query('…'), y FROM T WHERE …

[Diagram: Static phase: the SQL Parser and the XQuery Parser each feed
 Static Typing and then Algebrization, using metadata including the
 XML Schema Collection. Both branches merge into the Static Optimization of
 the combined Logical and Physical Operation Tree.
 Dynamic phase: Runtime Optimization and Execution of the physical Op Tree,
 using XML and relational indices.]
New XQuery Algebra Operators
XML Reader TVF
Table-Valued Function XML Reader UDF with XPath Filter
Used if no Primary XML Index is present
Creates node table rowset in query flow
Multiple XPath filters can be pushed in to reduce node table
to subtree
Base cardinality estimate is always 10,000 rows!
Some adjustment based on pushed path filters

XMLReader node table format example (simplified)

 ID      TAG ID      Node      Type-ID         VALUE            HID
 1.3.1   4 (TITLE)   Element   2 (xs:string)   Bad Bugs         #title#section#book

New XQuery Algebra Operators
UDX

• Serializer UDX
  serializes the query result as XML
• XQuery String UDX
  evaluates the XQuery string() function
• XQuery Data UDX
  evaluates the XQuery data() function
• Check UDX
  validates XML being inserted

•   UDX name visible in SSMS properties window
Optimal Use Of XQuery
Atomization of nodes
Value comparisons, XQuery casts and value() method
casts require atomization of item
  • Attribute:
    /person[@age = 42]  →  /person[data(@age) = 42]
  • Atomic typed element:
    /person[age = 42]  →  /person[data(age) = 42]
  • Untyped, mixed content typed element (adds UDX):
    /person[age = 42]  →  /person[data(age) = 42]  →  /person[string(age) = 42]
  • If only one text node for untyped element (better):
    /person[age/text() = 42]  →  /person[data(age/text()) = 42]
  • value() method on untyped elements:
    value('/person/age', 'int')  →  value('/person/age/text()', 'int')

string() aggregates all text nodes, prohibits index use
Optimal Use Of XQuery
Casting Values
Value comparisons require casts and type promotion
  • Untyped attribute:
    /person[@age = 42]  →  /person[xs:decimal(@age) = 42]
  • Untyped text node():
    /person[age/text() = 42]  →  /person[xs:decimal(age/text()) = 42]
  • Typed element (typed as xs:int):
    /person[salary = 3e4]  →  /person[xs:double(salary) = 3e4]

Casting is expensive and prohibits index lookup

Tips to avoid casting
  • Use appropriate types for comparison (string for untyped)
  • Use schema to declare type
Optimal Use Of XQuery
Maximize XPath expressions

Single paths are more efficient than twig paths
Avoid predicates in the middle of path expressions
    /book[@ISBN = "1-8610-0157-6"]/author[first-name = "Davis"]
  is evaluated as
    /book[@ISBN = "1-8610-0157-6"] "∩" /book/author[first-name = "Davis"]

Move ordinals to the end of path expressions
  • Make sure you get the same semantics!
  • /a[1]/b[1] ≠ (/a/b)[1] ≠ /a/b[1]
  • (/book/@isbn)[1] is better than /book[1]/@isbn
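
The difference between the three ordinal placements can be checked on a small made-up fragment in which the first a element has no b child:

```sql
-- Made-up fragment: the first <a> is empty.
DECLARE @x xml = N'<a/><a><b>1</b><b>2</b></a><a><b>3</b></a>';

SELECT @x.query('/a[1]/b[1]') AS FirstBOfFirstA,  -- empty: the first <a> has no <b>
       @x.query('(/a/b)[1]')  AS FirstBOverall,   -- <b>1</b>
       @x.query('/a/b[1]')    AS FirstBOfEachA;   -- <b>1</b><b>3</b>
```

Moving an ordinal to the end of a path is therefore only safe when the data guarantees that both forms select the same node.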
Optimal Use Of XQuery
Maximize XPath expressions in exist()
Use context item in predicate to lengthen path in exist()
   • Existential quantification makes returned node irrelevant

• BAD:
     SELECT * FROM docs WHERE 1 = xCol.exist
       ('/book/subject[text() = "security"]')
• GOOD:
     SELECT * FROM docs WHERE 1 = xCol.exist
       ('/book/subject/text()[. = "security"]')
• BAD:
     SELECT * FROM docs WHERE 1 = xCol.exist
       ('/book[@price > 9.99 and @price < 49.99]')
• GOOD:
     SELECT * FROM docs WHERE 1 = xCol.exist
       ('/book/@price[. > 9.99 and . < 49.99]')

This does not work with or-predicates
Optimal Use Of XQuery
Inefficient operations: Parent axis

Most frequent offender: parent axis with nodes()

• BAD:
  select o.value('../@id', 'int') as CustID
       , o.value('@id', 'int') as OrdID
  from T
  cross apply x.nodes('/doc/customer/orders') as N(o)

• GOOD:
  select c.value('@id', 'int') as CustID
       , o.value('@id', 'int') as OrdID
  from T cross apply x.nodes('/doc/customer') as N1(c)
         cross apply c.nodes('orders') as N2(o)
Optimal Use Of XQuery
Inefficient operations
Avoid descendant axes and // in the middle of path
expressions if the data structure is known.
  • // still can use the HID lookup, but is less efficient

XQuery construction performs worse than FOR XML
  • BAD:
     SELECT notes.query('
       <Customer cid="{sql:column(''cid'')}">{
         <name>{sql:column("name")}</name>, /
       }</Customer>')
     FROM Customers WHERE cid=1
  • GOOD:
     SELECT cid as "@cid", name, notes as "*"
     FROM Customers WHERE cid=1
     FOR XML PATH('Customer'), TYPE
Optimal Use Of FOR XML
Use TYPE directive when assigning result to XML
  • BAD:
    declare @x xml;
    set @x =
         (select * from Customers for xml raw);
  • GOOD:
    declare @x xml;
    set @x =
         (select * from Customers for xml raw,
          type);

Use FOR XML PATH for complex grouping and additional
hierarchy levels over FOR XML EXPLICIT

Use FOR XML EXPLICIT for complex nesting if FOR XML PATH
performance is not appropriate
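
A sketch of an additional hierarchy level built with FOR XML PATH and a nested subquery that uses the TYPE directive. The table variables and their columns are made up for illustration.

```sql
-- Made-up sample data.
DECLARE @Customers TABLE (cid int PRIMARY KEY, name nvarchar(50));
DECLARE @Orders    TABLE (oid int PRIMARY KEY, cid int, total decimal(9,2));
INSERT INTO @Customers VALUES (1, N'Joe'), (2, N'Ann');
INSERT INTO @Orders    VALUES (10, 1, 19.99), (11, 1, 5.00), (12, 2, 42.00);

-- Column aliases map to attributes ("@cid") and child elements ("name").
-- The nested FOR XML ... TYPE subquery is embedded as XML under <Orders>
-- instead of being escaped as a string.
SELECT c.cid  AS "@cid",
       c.name AS "name",
       (SELECT o.oid   AS "@id",
               o.total AS "total"
        FROM @Orders AS o
        WHERE o.cid = c.cid
        FOR XML PATH('Order'), TYPE) AS "Orders"
FROM @Customers AS c
FOR XML PATH('Customer'), ROOT('Customers'), TYPE;
```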

XML Indices
Create XML index on XML column
        CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc)
Create secondary indexes on tags, values, paths
Creation:
  • Single-threaded only for primary XML index
  • Multi-threaded for secondary XML indexes
Uses:
  •     Primary Index will always be used if defined (not a cost
        based decision)
  •     Results can be served directly from index
  •     SQL’s cost based optimizer will consider secondary indexes
Maintenance:
  •     Primary and Secondary Indices will be efficiently maintained
        during updates
  •     Only subtree that changes will be updated
  •     No online index rebuild
  •     Clustered key may lead to non-linear maintenance cost
Schema revalidation still checks whole instance
Example Index Contents

insert into Person values (42,
'<book ISBN="1-55860-438-3">
    <section>
      <title>Bad Bugs</title>
      Nobody loves bad bugs.
    </section>
    <section>
      <title>Tree Frogs</title>
      All right-thinking people
      <bold>love</bold> tree frogs.
    </section>
</book>')

Primary XML Index
 CREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc)
PK   XID     TAG ID        Node        Type-ID         VALUE                       HID
42   1       1 (book)      Element     1 (bookT)       null                        #book
42   1.1     2 (ISBN)      Attribute   2 (xs:string)   1-55860-438-3               #@ISBN#book
42   1.3     3 (section)   Element     3 (sectionT)    null                        #section#book
42   1.3.1   4 (title)     Element     2 (xs:string)   Bad Bugs                    #title#section#book
42   1.3.3   --            Text        --              Nobody loves bad bugs.      #text()#section#book
42   1.5     3 (section)   Element     3 (sectionT)    null                        #section#book
42   1.5.1   4 (title)     Element     2 (xs:string)   Tree Frogs                  #title#section#book
42   1.5.3   --            Text        --              All right-thinking people   #text()#section#book
42   1.5.5   7 (bold)      Element     4 (boldT)       love                        #bold#section#book
42   1.5.7   --            Text        --              tree frogs                  #text()#section#book

 Assumes typed data; columns and values are simplified, see VLDB 2004 paper for details
Secondary XML Indices

[Diagram: The XML column in table T(id, x) stores one binary XML value per row
 (ids 1, 2, 3). The primary XML index (1 per XML column) holds several rows per
 document and is clustered on the primary key of table T plus XID. Its columns
 are PK, XID, NID, TID, VALUE, LVALUE, HID, xsinil, …
 Non-clustered secondary indices (n per primary index) are built over it:
 Value Index, Property Index, Path Index.]
XQueries And XML
Indices
demo




Takeaway: XML Indices

PRIMARY XML Index – Use when many XQuery queries run against the column
FOR VALUE – Useful for queries where values are
more selective than paths, such as
//*[.="Seattle"]
FOR PATH – Useful for path expressions: avoids
joins by mapping paths to hierarchical index
(HID) numbers. Example: /person/address/zip
FOR PROPERTY – Useful when the optimizer chooses
another index (for example, on a relational column,
or a full-text index) in addition, so the row is already known
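
For reference, a sketch of the DDL for each index type. The table dbo.IndexedDocs and the index names are hypothetical; in practice only the secondary indexes that match the workload would be created.

```sql
-- Hypothetical table; a clustered primary key is required for XML indexes.
CREATE TABLE dbo.IndexedDocs (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    xDoc xml NOT NULL
);
GO

-- The primary XML index must exist before any secondary XML index.
CREATE PRIMARY XML INDEX pxi_docs ON dbo.IndexedDocs (xDoc);

-- Secondary indexes are defined over the primary XML index.
CREATE XML INDEX sxi_docs_path  ON dbo.IndexedDocs (xDoc)
    USING XML INDEX pxi_docs FOR PATH;
CREATE XML INDEX sxi_docs_value ON dbo.IndexedDocs (xDoc)
    USING XML INDEX pxi_docs FOR VALUE;
CREATE XML INDEX sxi_docs_prop  ON dbo.IndexedDocs (xDoc)
    USING XML INDEX pxi_docs FOR PROPERTY;
```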



Shredding Approaches
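SQLXML Bulkload with annotated schema
•    Complex shapes: Yes, with limits
•    Bulkload: Yes
•    Server vs midtier: midtier
•    Business logic: staging tables on server, XSLT on midtier
•    Programming: annotated XSD and small API
•    Scale/Performance: very good/very good
ADO.Net DataSet
•    Complex shapes: No
•    Bulkload: No
•    Server vs midtier: midtier
•    Business logic: midtier, SSIS
•    Programming: DataSet API or SSIS
•    Scale/Performance: good/good
CLR Table-valued function
•    Complex shapes: Yes
•    Bulkload: No
•    Server vs midtier: Server or midtier
•    Business logic: Server or midtier
•    Programming: C#, VB custom code
•    Scale/Performance: limited/good
OpenXML
•    Complex shapes: Yes
•    Bulkload: No
•    Server vs midtier: Server
•    Business logic: T-SQL
•    Programming: declarative T-SQL, XPath against variable
•    Scale/Performance: limited/good
nodes()
•    Complex shapes: Yes
•    Bulkload: No
•    Server vs midtier: Server
•    Business logic: T-SQL
•    Programming: declarative SQL, XQuery against variable or table
•    Scale/Performance: good/careful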
To Promote or Not Promote…
Promotion pre-calculates paths
Requires relational query
•    XQuery does not know about promotion

Promotion during loading of the data
•    Using any of the shredding mechanisms
•    1-to-1 or 1-to-many relationships

Promotion using computed columns
•    1-to-1 only
•    Persist computed column: Fast lookup and retrieval
•    Relational index on persisted computed column: Fast lookup
•    Have to be precise

Promotion using Triggers
•    1-to-1 or 1-to-many relationships
•    Trigger overhead

Relational View over XML data
•    Filters on relational view are not pushed down due to different type/value system
Promotion using computed columns
Use a schema-bound UDF that encapsulates XQuery

Persist computed column
 •   Fast lookup and retrieval


Relational index on persisted computed column
 •   Fast lookup


Query will have to use the schema-bound UDF to match

CAVEAT: No parallel plans with a persisted computed
column based on a UDF
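
A sketch of this promotion pattern, assuming book documents with an ISBN attribute on the root element. The table dbo.BookDocs, the function name, and the index name are hypothetical.

```sql
-- Hypothetical table holding book documents.
CREATE TABLE dbo.BookDocs (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    xCol xml NOT NULL
);
GO

-- Schema-bound scalar UDF that encapsulates the XQuery.
CREATE FUNCTION dbo.udf_get_book_isbn (@x xml)
RETURNS varchar(20)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @x.value('(/book/@ISBN)[1]', 'varchar(20)');
END;
GO

-- Persisted computed column (1-to-1 promotion) plus a relational index on it.
ALTER TABLE dbo.BookDocs
    ADD ISBN AS dbo.udf_get_book_isbn(xCol) PERSISTED;
GO
CREATE INDEX ix_BookDocs_ISBN ON dbo.BookDocs (ISBN);
GO

-- The query goes through the same UDF expression (or the ISBN column)
-- so that it can match the promoted value.
SELECT id
FROM dbo.BookDocs
WHERE dbo.udf_get_book_isbn(xCol) = '1-55860-438-3';
```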


Use of Full-Text Index for Optimization

 Can provide improvement for XQuery contains() queries

 Query for documents where section title contains “optimization”

 Use Fulltext index to prefilter candidates (includes false positives)




 SELECT * FROM docs
 WHERE contains(xCol, 'optimization')
 AND 1 = xCol.exist('
 /book/section/title/text()[contains(.,"optimization")]
 ')
Futures: Selective XML Index
CREATE SELECTIVE XML INDEX pxi_index ON Tbl(xmlcol)
FOR (
-- the first four match XQuery predicates
-- in all XML data type methods

-- simple flavor - default mapping (xs:untypedAtomic),
-- no optimization hints
node42 = '/a/b',
pathatc = '/a/b/c/@atc',

-- advanced flavor - use of optimization hints
path02 = '/a/b/c' as XQUERY 'xs:string' MAXLENGTH(25),
node13 = '/a/b/d' as XQUERY 'xs:double' SINGLETON,

-- the next two match value() method
-- require regular SQL Server type semantics
-- they can be mixed with the XQUERY ones
-- specifying a type is mandatory for the SQL type semantics

pathfloat = '/a/b/c' as SQL FLOAT,
pathabd = '/a/b/d' as SQL VARCHAR(200)
)
Session Takeaways

• Understand when and how
  to use XML in SQL Server
• Understand and correct common
  performance problems with XML and
  XQuery
• Shred “relational” XML to relations
• Use XML datatype for semistructured
  and markup scenarios
• Write your XQueries so that XML
  Indices can be used
• Use persisted computed columns to
  promote XQuery results (with caveat)
Related Content
Optimization whitepapers
http://msdn2.microsoft.com/en-us/library/ms345118.aspx
http://msdn2.microsoft.com/en-us/library/ms345121.aspx
General XML and Databases whitepapers
http://msdn2.microsoft.com/en-us/xml/bb190603.aspx
Online WebCasts
http://www.microsoft.com/events/series/msdnsqlserver2005.mspx#SQLXML
Newsgroups & Forum:
microsoft.public.sqlserver.xml
http://communities.microsoft.com/newsgroups/default.asp?ICP=sqlserver2005&sLCID=us
http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=89

My E-mail: mrys@microsoft.com
My Weblog: http://sqlblog.com/blogs/michael_rys/


Complete the Evaluation Form to Win!



 Win a Dell Mini Netbook every day, just for
 submitting your completed form. Each session
 evaluation form represents a chance to win.
 Sponsored by Dell

 Pick up your evaluation form:
 • In each presentation room
 • Online on the PASS Summit website
 Drop off your completed form:
 • Near the exit of each presentation room
 • At the Registration desk
 • Online on the PASS Summit website


Thank you
for attending this session and the
2011 PASS Summit in Seattle




Microsoft SQL Server Clinic (Room 611)
•   Work through your technical issues with SQL Server CSS & get
    architectural guidance from SQLCAT
Microsoft Product Pavilion (Expo Hall)
•   Talk with Microsoft SQL Server & BI experts to learn about the next
    version of SQL Server and check out the new Database Consolidation
    Appliance
Expert Pods (6th Floor Lobby)
•   Meet Microsoft SQL Server Engineering team members & SQL MVPs
Hands-on Labs (Room 618-620)
•   Get experienced through self-paced & instructor-led labs on our cloud
    based lab platform - bring your laptop or use HP provided hardware

More Related Content

What's hot

Eo gaddis java_chapter_16_5e
Eo gaddis java_chapter_16_5eEo gaddis java_chapter_16_5e
Eo gaddis java_chapter_16_5eGina Bullock
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)Michael Rys
 
Easy Data Object Relational Mapping Tool
Easy Data Object Relational Mapping ToolEasy Data Object Relational Mapping Tool
Easy Data Object Relational Mapping ToolHasitha Guruge
 
A comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesA comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesJoão Gabriel Lima
 
Intro to T-SQL - 1st session
Intro to T-SQL - 1st sessionIntro to T-SQL - 1st session
Intro to T-SQL - 1st sessionMedhat Dawoud
 
White paper for High Performance Messaging App Dev with Oracle AQ
SQLPASS AD501-M XQuery MRys

  • 1. Best Practices and Performance Tuning of XML Queries in SQL Server AD-501-M Michael Rys Principal Program Manager Microsoft Corp mrys@microsoft.com @SQLServerMike October 11-14, Seattle, WA
  • 2. Session Objectives • Understand when and how to use XML in SQL Server • Understand and correct common performance problems with XML and XQuery
  • 3. Session Agenda
    •   XML Scenarios and when to store XML
    •   XML Design Optimizations
    •   General Optimizations
    •   XML Datatype method Optimizations
    •   XQuery Optimizations
    •   XML Index Optimizations
  • 5. XML Scenarios
    Data Exchange between loosely-coupled systems
    •   XML is ubiquitous, extensible, platform independent transport format
    •   Message Envelope in XML: Simple Object Access Protocol (SOAP), RSS, REST
    •   Message Payload/Business Data in XML
    •   Vertical Industry Exchange schemas
    Document Management
    •   XHTML, DocBook, Home-grown, domain-specific markup (e.g. contracts), OpenOffice, Microsoft Office XML (both default and user-extended)
    Ad-hoc modeling of semistructured data
    •   Storing and querying heterogeneous complex objects
    •   Semistructured data with sparse, highly-varying structure at the instance level
    •   XML provides self-describing format and extensible schemas
    → Transport, Store, and Query XML data
  • 6. Decision Tree: Processing XML In SQL Server
    [Flowchart; the layout was lost in extraction, so only the node text is listed.]
    Decision nodes:
    •   Does the data fit the relational model?
    •   Is the data semi-structured? (branch labels: structured, known sparse, open schema)
    •   Is the data a document?
    •   Search within the XML?
    •   Query into the XML?
    •   Is the XML constrained by schemas?
    Outcomes:
    •   Shred the XML into relations
    •   Shred the structured XML into relations, store semistructured aspects as XML and/or sparse columns
    •   Shred known sparse data into sparse columns
    •   Promote frequently queried properties relationally
    •   Define a full-text index
    •   Use primary and secondary XML indexes as needed
    •   Constrain XML if validation cost is ok
    •   Store as XML
    •   Store as varbinary(max)
  • 7. SQL Server XML Data Type Architecture
    [Architecture diagram; the layout was lost in extraction, so only the component labels are listed.]
    •   XML side: XML, XML Parser, Schema Validation, XML Schemata, XML Schema Collection
    •   Storage: XML data type (binary XML)
    •   Indexes: PRIMARY XML INDEX (Node Table); PATH Index, PROP Index, VALUE Index
    •   Processing: XQuery, XML-DML, OpenXML/nodes(), FOR XML with TYPE directive
    •   Relational side: Rowsets
  • 8. General Impacts
      Concurrency Control
        • Locks on both XML data type and relevant rows in primary and secondary XML Indices
        • Lock escalation on indices
        • Snapshot Isolation reduces locks and lock contention
      Transaction Logs
        • Bulk insert into XML Indices may fill the transaction log
        • Delay the creation of the XML indexes and use the SIMPLE recovery model
        • Preallocate database file instead of dynamically growing
        • Place log on different disk
      In-Row/Out-of-Row of XML large object
        • Moving XML into side table or out-of-row if mixed with relational data reduces scan time
      Due to clustering, insertion into XML Index may not be linear
        • Choose integer/bigint identity column as key
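The storage advice above might look roughly like the following T-SQL sketch. The table `dbo.Orders`, its columns, and the index name are hypothetical and not from the slides; `sp_tableoption` with `'large value types out of row'` is the standard way to push `xml` and `(max)` values out of the data row.

```sql
-- Hypothetical table mixing relational columns with an XML column.
-- An integer identity column serves as the clustered key, as the slide recommends.
CREATE TABLE dbo.Orders (
    OrderID  int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Customer nvarchar(50) NOT NULL,
    Details  xml NULL
);
GO

-- Store xml (and varchar(max)/nvarchar(max)/varbinary(max)) values out of row,
-- so scans over the relational columns do not have to read the XML.
EXEC sp_tableoption N'dbo.Orders', 'large value types out of row', 1;
GO

-- ... bulk load the data here ...

-- Create the XML index only after the load, to limit transaction log growth.
CREATE PRIMARY XML INDEX pxi_Orders_Details ON dbo.Orders (Details);
GO
```

Whether out-of-row storage helps depends on how often queries touch only the relational columns; it is worth measuring on a representative workload.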
  • 9. Choose The Right XML Model
      • Element-centric versus attribute-centric
          <Customer><name>Joe</name></Customer>
          <Customer name="Joe" />
          +: Attributes often perform better when querying
          -: Parsing attributes requires a uniqueness check
      • Generic element names with type attribute vs specific element names
          <Entity type="Customer"> <Prop type="Name">Joe</Prop> </Entity>
          <Customer><name>Joe</name></Customer>
          +: Specific names give shorter path expressions
          +: Specific names need no filter on type attribute
          /Entity[@type="Customer"]/Prop[@type="Name"] vs /Customer/name
      • Wrapper elements
          <Orders><Order id="1"/></Orders>
          +: No wrapper elements means smaller XML and shorter path expressions
  • 10. Use an XML Schema Collection?
      Using no XML Schema (untyped XML)
        • Can still use XQuery and XML Index!
        • Atomic values are always weakly typed strings; compare as strings to avoid runtime conversions and loss of index usage
        • No schema validation overhead
        • No schema evolution revalidation costs
      XML Schema provides structural information
        • Atomic typed elements use only one row instead of two in the node table/XML index (closer to attributes)
        • Static typing can detect cardinality and feasibility of expression
      XML Schema provides semantic information
        • Elements/attributes have correct atomic type for comparison and order semantics
        • No runtime casts required and better use of index for value lookup
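As an illustration of the typed case, here is a minimal sketch that binds an XML column to a schema collection. The schema, the collection name `dbo.PersonSchema`, and the table `dbo.People` are assumptions made up for this example.

```sql
-- Hypothetical schema: a <person> document with one xs:int <age> element.
CREATE XML SCHEMA COLLECTION dbo.PersonSchema AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="age" type="xs:int"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>';
GO

-- DOCUMENT restricts each instance to a single top-level element,
-- which helps static typing infer singletons.
CREATE TABLE dbo.People (
    id  int IDENTITY(1,1) PRIMARY KEY,
    doc xml(DOCUMENT dbo.PersonSchema)
);
GO

INSERT INTO dbo.People (doc) VALUES (N'<person><age>42</age></person>');

-- age is typed as xs:int, so the comparison needs no runtime cast.
SELECT id FROM dbo.People WHERE doc.exist('/person[age = 42]') = 1;
```

Inserts into the typed column are validated against the schema, which is the validation overhead the slide weighs against the typing benefits.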
  • 11. XQuery Methods
      • query() creates new, untyped XML data type instance
      • exist() returns 1 if the XQuery expression returns at least one item, 0 otherwise
      • value() extracts an XQuery value into the SQL value and type space
          • Expression has to statically be a singleton
          • String value of atomized XQuery item is cast to SQL type
          • SQL type has to be SQL scalar type (no XML or CLR UDT)
  • 12. XQuery: nodes()
      Returns a row per selected node as a special XML data type instance
        • Preserves the original structure and types
        • Can only be used with the XQuery methods (but not modify()), count(*), and IS (NOT) NULL
      Appears as Table-valued Function (TVF) in query plan if no index present
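The four read methods described above can be tried on a variable. This is a small sketch with made-up sample data; the variable name and the XML content are assumptions for illustration.

```sql
DECLARE @x xml =
    N'<doc><customer id="1" name="Joe"/><customer id="2" name="Ann"/></doc>';

SELECT
    -- query(): returns a new untyped XML instance (the customer element with id 2)
    @x.query('/doc/customer[@id = 2]') AS matching_customer,
    -- exist(): 1 because at least one customer has id 2
    @x.exist('/doc/customer[@id = 2]') AS has_customer_2,
    -- value(): the path must statically be a singleton, hence the (...)[1]
    @x.value('(/doc/customer/@name)[1]', 'nvarchar(50)') AS first_name;

-- nodes(): one row per <customer>; the value() paths are relative to each node.
SELECT c.value('@id', 'int')            AS CustID,
       c.value('@name', 'nvarchar(50)') AS CName
FROM @x.nodes('/doc/customer') AS N(c);
```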
  • 13. sql:column()/sql:variable()
      Map SQL value and type into XQuery values and types in context of XQuery or XML-DML
      • sql:variable(): accesses a SQL variable/parameter
          declare @value int
          set @value=42
          select * from T
          where T.x.exist('/a/b[@id=sql:variable("@value")]')=1
      • sql:column(): accesses another column value
          tables: T(key int, x xml), S(key int, val int)
          select * from T join S on T.key=S.key
          where T.x.exist('/a/b[@id=sql:column("S.val")]')=1
      • Restrictions in SQL Server: No XML, CLR UDT, datetime, or deprecated text/ntext/image
  • 14. Improving Slow XQueries, Bad FOR XML demo October 11-14, Seattle, WA
  • 15. Optimal Use Of Methods: How to Cast from XML to SQL
      BAD:  CAST(CAST(xmldoc.query('/a/b/text()') as nvarchar(500)) as int)
      GOOD: xmldoc.value('(/a/b/text())[1]', 'int')
      BAD:  node.query('.').value('@attr', 'nvarchar(50)')
      GOOD: node.value('@attr', 'nvarchar(50)')
  • 16. Optimal Use Of Methods: Grouping value() method
      Group value() methods on the same XML instance next to each other if the path expressions in the value() methods
        • are simple path expressions that only use child and attribute axis and do not contain wildcards, predicates, node tests, ordinals
        • statically infer a singleton
      The singleton can be statically inferred from
        • the DOCUMENT and XML Schema Collection
        • relative paths on the context node provided by the nodes() method
      Requires XML index to be present
  • 17. Optimal Use of Methods: Using the right method to join and compare
      Use exist() method, sql:column()/sql:variable() and an XQuery comparison for checking for a value or joining if secondary XML indices present
      BAD:*
        select doc from doc_tab join authors
        on doc.value('(/doc/mainauthor/lname/text())[1]', 'nvarchar(50)') = lastname
      GOOD:
        select doc from doc_tab join authors
        on 1 = doc.exist('/doc/mainauthor/lname/text()[. = sql:column("lastname")]')
      * If applied on XML variable/no index present, value() method is most of the time more efficient
  • 18. Optimal Use of Methods: Avoiding bad costing with nodes()
      nodes() without XML index is a Table-valued function (details later)
      Bad cardinality estimates can lead to bad plans
      • BAD:
          select c.value('@id', 'int') as CustID
               , c.value('@name', 'nvarchar(50)') as CName
          from Customer, @x.nodes('/doc/customer') as N(c)
          where Customer.ID = c.value('@id', 'int')
      • BETTER (if only one wrapper doc element):
          select c.value('@id', 'int') as CustID
               , c.value('@name', 'nvarchar(50)') as CName
          from Customer, @x.nodes('/doc[1]') as D(d)
          cross apply d.nodes('customer') as N(c)
          where Customer.ID = c.value('@id', 'int')
      Use temp table (insert into #temp select … from nodes()) or Table-valued parameter instead of XML to get better estimates
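The temp-table suggestion above could be sketched as follows. The sample XML and the temp table name `#cust` are assumptions; `Customer` is the table from the slide and is assumed to have an integer `ID` column.

```sql
DECLARE @x xml =
    N'<doc><customer id="1" name="Joe"/><customer id="2" name="Ann"/></doc>';

-- Shred once into a temp table. The optimizer then sees real row counts
-- instead of the fixed estimate it uses for nodes() without an XML index.
SELECT c.value('@id', 'int')            AS CustID,
       c.value('@name', 'nvarchar(50)') AS CName
INTO #cust
FROM @x.nodes('/doc/customer') AS N(c);

-- Join relationally against the shredded rows.
SELECT t.CustID, t.CName
FROM Customer
JOIN #cust AS t ON Customer.ID = t.CustID;

DROP TABLE #cust;
```

The extra write to tempdb is a trade-off: it tends to pay off when the shredded rows feed a join whose plan is sensitive to cardinality.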
  • 19. Optimal Use Of Methods: Avoiding multiple method evaluations
      Use subqueries
      • BAD:
          SELECT CASE isnumeric(doc.value('(/doc/customer/order/price)[1]', 'nvarchar(32)'))
                 WHEN 1 THEN doc.value('(/doc/customer/order/price)[1]', 'decimal(5,2)')
                 ELSE 0 END
          FROM T
      • GOOD:
          SELECT CASE isnumeric(Price)
                 WHEN 1 THEN CAST(Price as decimal(5,2))
                 ELSE 0 END
          FROM (SELECT doc.value('(/doc/customer/order/price)[1]', 'nvarchar(32)') as Price
                FROM T) X
      Use subqueries also with NULLIF()
  • 20. Combined SQL And XQuery/DML Processing
      [Pipeline diagram for: SELECT x.query('…'), y FROM T WHERE …]
      Static Phase (uses metadata and the XML Schema Collection)
        • SQL Parser and XQuery Parser
        • Static Typing (SQL and XQuery)
        • Algebrization (SQL and XQuery)
        • Static optimization of the combined logical and physical operation tree
      Dynamic Phase (uses XML and relational indices)
        • Runtime optimization and execution of the physical operation tree
  • 21. New XQuery Algebra Operators
      XML Reader TVF (Table-Valued Function XML Reader UDF with XPath Filter)
        • Used if no Primary XML Index is present
        • Creates node table rowset in query flow
        • Multiple XPath filters can be pushed in to reduce node table to subtree
        • Base cardinality estimate is always 10,000 rows! Some adjustment based on pushed path filters
      XMLReader node table format example (simplified)
        ID    | TAG ID    | Node    | Type-ID       | VALUE    | HID
        1.3.1 | 4 (title) | Element | 2 (xs:string) | Bad Bugs | #title#section#book
  • 22. New XQuery Algebra Operators: UDX
      • Serializer UDX serializes the query result as XML
      • XQuery String UDX evaluates the XQuery string() function
      • XQuery Data UDX evaluates the XQuery data() function
      • Check UDX validates XML being inserted
      • UDX name visible in SSMS properties window
  • 23. Optimal Use Of XQuery: Atomization of nodes
      Value comparisons, XQuery casts and value() method casts require atomization of item
      • Attribute:
          /person[@age = 42] → /person[data(@age) = 42]
      • Atomic typed element:
          /person[age = 42] → /person[data(age) = 42]
      • Untyped, mixed content typed element (adds UDX):
          /person[age = 42] → /person[data(age) = 42] → /person[string(age) = 42]
      • If only one text node for untyped element (better):
          /person[age/text() = 42] → /person[data(age/text()) = 42]
      • value() method on untyped elements:
          value('/person/age', 'int') → value('/person/age/text()', 'int')
      string() aggregates all text nodes and prohibits index use
  • 24. Optimal Use Of XQuery: Casting Values
      Value comparisons require casts and type promotion
      • Untyped attribute:
          /person[@age = 42] → /person[xs:decimal(@age) = 42]
      • Untyped text node():
          /person[age/text() = 42] → /person[xs:decimal(age/text()) = 42]
      • Typed element (typed as xs:int):
          /person[salary = 3e4] → /person[xs:double(salary) = 3e4]
      Casting is expensive and prohibits index lookup
      Tips to avoid casting
      • Use appropriate types for comparison (string for untyped)
      • Use schema to declare type
  • 25. Optimal Use Of XQuery: Maximize XPath expressions
      Single paths are more efficient than twig paths
      Avoid predicates in the middle of path expressions
          book[@ISBN = "1-8610-0157-6"]/author[first-name = "Davis"]
          → /book[@ISBN = "1-8610-0157-6"] "∩" /book/author[first-name = "Davis"]
      Move ordinals to the end of path expressions
      • Make sure you get the same semantics!
      • /a[1]/b[1] ≠ (/a/b)[1] ≠ /a/b[1]
      • (/book/@isbn)[1] is better than /book[1]/@isbn
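The warning about ordinal semantics can be checked with a small experiment. The sample fragment below is made up so that all three expressions return different results; the first `<a>` deliberately has no `<b>` child.

```sql
DECLARE @x xml = N'<a/><a><b>1</b><b>2</b></a><a><b>3</b></a>';

SELECT
    -- first <b> of the first <a>: empty, because the first <a> has no <b>
    @x.query('/a[1]/b[1]') AS first_b_of_first_a,
    -- first <b> in document order across all <a> elements: <b>1</b>
    @x.query('(/a/b)[1]')  AS first_b_overall,
    -- first <b> of every <a> that has one: <b>1</b><b>3</b>
    @x.query('/a/b[1]')    AS first_b_of_each_a;
```

Moving an ordinal to the end of the path is therefore only safe when the data guarantees the results coincide, for example when there is a single wrapper element.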
  • 26. Optimal Use Of XQuery: Maximize XPath expressions in exist()
      Use context item in predicate to lengthen path in exist()
      • Existential quantification makes returned node irrelevant
      • BAD:
          SELECT * FROM docs WHERE 1 = xCol.exist('/book/subject[text() = "security"]')
      • GOOD:
          SELECT * FROM docs WHERE 1 = xCol.exist('/book/subject/text()[. = "security"]')
      • BAD:
          SELECT * FROM docs WHERE 1 = xCol.exist('/book[@price > 9.99 and @price < 49.99]')
      • GOOD:
          SELECT * FROM docs WHERE 1 = xCol.exist('/book/@price[. > 9.99 and . < 49.99]')
      This does not work with or-predicate
  • 27. Optimal Use Of XQuery: Inefficient operations: Parent axis
      Most frequent offender: parent axis with nodes()
      • BAD:
          select o.value('../@id', 'int') as CustID
               , o.value('@id', 'int') as OrdID
          from T cross apply x.nodes('/doc/customer/orders') as N(o)
      • GOOD:
          select c.value('@id', 'int') as CustID
               , o.value('@id', 'int') as OrdID
          from T cross apply x.nodes('/doc/customer') as N1(c)
          cross apply c.nodes('orders') as N2(o)
  • 28. Optimal Use Of XQuery: Inefficient operations
      Avoid descendant axes and // in the middle of path expressions if the data structure is known
      • // still can use the HID lookup, but is less efficient
      XQuery construction performs worse than FOR XML
      • BAD:
          SELECT notes.query('
            <Customer cid="{sql:column(''cid'')}">{
              <name>{sql:column("name")}</name>, /
            }</Customer>')
          FROM Customers WHERE cid=1
      • GOOD:
          SELECT cid as "@cid", name, notes as "*"
          FROM Customers WHERE cid=1
          FOR XML PATH('Customer'), TYPE
  • 29. Optimal Use Of FOR XML
      Use TYPE directive when assigning result to XML
      • BAD:
          declare @x xml;
          set @x = (select * from Customers for xml raw);
      • GOOD:
          declare @x xml;
          set @x = (select * from Customers for xml raw, type);
      Use FOR XML PATH for complex grouping and additional hierarchy levels over FOR XML EXPLICIT
      Use FOR XML EXPLICIT for complex nesting if FOR XML PATH performance is not appropriate
  • 30. XML Indices
      Create XML index on XML column
          CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc)
      Create secondary indexes on tags, values, paths
      Creation:
        • Single-threaded only for primary XML index
        • Multi-threaded for secondary XML indexes
      Uses:
        • Primary Index will always be used if defined (not a cost based decision)
        • Results can be served directly from index
        • SQL's cost based optimizer will consider secondary indexes
      Maintenance:
        • Primary and Secondary Indices will be efficiently maintained during updates
        • Only subtree that changes will be updated
        • No online index rebuild
        • Clustered key may lead to non-linear maintenance cost
        • Schema revalidation still checks whole instance
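The slide shows only the primary index statement. A sketch of the full set, including the three secondary index kinds, might look like this. The table definition and the secondary index names are assumptions; `docs`, `xDoc`, and `idx_1` follow the slide.

```sql
-- The base table needs a clustered primary key before an XML index can be created.
CREATE TABLE docs (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    xDoc xml NULL
);
GO

-- Primary XML index: the shredded node table for the column.
CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc);
GO

-- Secondary XML indexes are built on top of the primary one.
CREATE XML INDEX idx_1_path  ON docs (xDoc) USING XML INDEX idx_1 FOR PATH;
CREATE XML INDEX idx_1_value ON docs (xDoc) USING XML INDEX idx_1 FOR VALUE;
CREATE XML INDEX idx_1_prop  ON docs (xDoc) USING XML INDEX idx_1 FOR PROPERTY;
GO
```

Each secondary index adds maintenance and storage cost, so it is usually worth creating only the kinds the workload's queries can use.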
  • 31. Example Index Contents
      insert into Person values (42,
      '<book ISBN="1-55860-438-3">
         <section>
           <title>Bad Bugs</title>
           Nobody loves bad bugs.
         </section>
         <section>
           <title>Tree Frogs</title>
           All right-thinking people <bold>love</bold> tree frogs.
         </section>
       </book>')
  • 32. Primary XML Index
      CREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc)
      PK | XID   | TAG ID      | Node      | Type-ID       | VALUE                     | HID
      42 | 1     | 1 (book)    | Element   | 1 (bookT)     | null                      | #book
      42 | 1.1   | 2 (ISBN)    | Attribute | 2 (xs:string) | 1-55860-438-3             | #@ISBN#book
      42 | 1.3   | 3 (section) | Element   | 3 (sectionT)  | null                      | #section#book
      42 | 1.3.1 | 4 (title)   | Element   | 2 (xs:string) | Bad Bugs                  | #title#section#book
      42 | 1.3.3 | --          | Text      | --            | Nobody loves bad bugs.    | #text()#section#book
      42 | 1.5   | 3 (section) | Element   | 3 (sectionT)  | null                      | #section#book
      42 | 1.5.1 | 4 (title)   | Element   | 2 (xs:string) | Tree Frogs                | #title#section#book
      42 | 1.5.3 | --          | Text      | --            | All right-thinking people | #text()#section#book
      42 | 1.5.5 | 7 (bold)    | Element   | 4 (boldT)     | love                      | #bold#section#book
      42 | 1.5.7 | --          | Text      | --            | tree frogs                | #text()#section#book
      Assumes typed data; columns and values are simplified, see VLDB 2004 paper for details
  • 33. Secondary XML Indices
      [Diagram. Table T(id, x) stores the XML column as binary XML. The primary XML index (1 per XML column) is clustered on the primary key of table T plus XID, with columns PK, XID, NID, TID, VALUE, LVALUE, HID, xsinil, … Non-clustered secondary indices (n per primary index): Value Index, Property Index, Path Index]
  • 34. XQueries And XML Indices demo October 11-14, Seattle, WA
  • 35. Takeaway: XML Indices
      • PRIMARY XML Index: use when there is lots of XQuery
      • FOR VALUE: useful for queries where values are more selective than paths, such as //*[.="Seattle"]
      • FOR PATH: useful for path expressions; avoids joins by mapping paths to hierarchical index (HID) numbers. Example: /person/address/zip
      • FOR PROPERTY: useful when optimizer chooses other index (for example, on relational column, or FT Index) in addition, so row is already known
  • 36. Shredding Approaches
      Approach | Complex Shapes | Bulkload | Server vs Midtier | Business logic | Programming | Scale/Performance
      SQLXML Bulkload | Yes with limits with annotated schema | Yes | midtier | staging tables on server, XSLT on midtier | annotated XSD and small API | very good/very good
      ADO.Net DataSet | No | No | midtier | midtier, SSIS | DataSet API or SSIS | good/good
      CLR Table-valued function | Yes | No | Server or midtier | Server or midtier | C#, VB custom code | limited/good
      OpenXML | Yes | No | Server | T-SQL | declarative T-SQL, XPath against variable | limited/good
      nodes() | Yes | No | Server | T-SQL | declarative SQL, XQuery against var or table | good/careful
  • 37. To Promote or Not Promote…
      Promotion pre-calculates paths
      Requires relational query
        • XQuery does not know about promotion
      Promotion during loading of the data
        • Using any of the shredding mechanisms
        • 1-to-1 or 1-to-many relationships
      Promotion using computed columns
        • 1-to-1 only
        • Persist computed column: Fast lookup and retrieval
        • Relational index on persisted computed column: Fast lookup
        • Have to be precise
      Promotion using Triggers
        • 1-to-1 or 1-to-many relationships
        • Trigger overhead
      Relational View over XML data
        • Filters on relational view are not pushed down due to different type/value system
  • 38. Promotion using computed columns
      Use a schema-bound UDF that encapsulates XQuery
      Persist computed column
        • Fast lookup and retrieval
      Relational index on persisted computed column
        • Fast lookup
      Query will have to use the schema-bound UDF to match
      CAVEAT: No parallel plans with a persisted computed column based on a UDF
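The promotion steps above might be put together as in the following sketch. The table `dbo.Books`, the function `dbo.udf_BookIsbn`, the column and index names, and the promoted `@ISBN` attribute are all assumptions chosen for illustration.

```sql
-- Hypothetical table holding one <book> document per row.
CREATE TABLE dbo.Books (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    xCol xml NOT NULL
);
GO

-- Schema-bound scalar UDF that encapsulates the XQuery (1-to-1 promotion only).
CREATE FUNCTION dbo.udf_BookIsbn (@doc xml)
RETURNS nvarchar(20)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @doc.value('(/book/@ISBN)[1]', 'nvarchar(20)');
END;
GO

-- Persisted computed column based on the UDF, plus a relational index on it.
ALTER TABLE dbo.Books ADD isbn AS dbo.udf_BookIsbn(xCol) PERSISTED;
GO
CREATE INDEX ix_Books_isbn ON dbo.Books (isbn);
GO

-- The query uses the same schema-bound UDF expression so it can be matched
-- to the computed column.
SELECT id FROM dbo.Books WHERE dbo.udf_BookIsbn(xCol) = N'1-55860-438-3';
```

As the slide's caveat notes, plans touching a persisted computed column based on a UDF do not run in parallel, so this suits selective lookups better than large scans.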
  • 39. Use of Full-Text Index for Optimization
      Can provide improvement for XQuery contains() queries
      Query for documents where section title contains "optimization"
      Use full-text index to prefilter candidates (includes false positives)
          SELECT * FROM docs
          WHERE contains(xCol, 'optimization')
            AND 1 = xCol.exist('/book/section/title/text()[contains(.,"optimization")]')
  • 40. Futures: Selective XML Index
      CREATE SELECTIVE XML INDEX pxi_index ON Tbl(xmlcol)
      FOR (
        -- the first four match XQuery predicates in all XML data type methods
        -- simple flavor: default mapping (xs:untypedAtomic), no optimization hints
        node42 = '/a/b',
        pathatc = '/a/b/c/@atc',
        -- advanced flavor: use of optimization hints
        path02 = '/a/b/c' as XQUERY 'xs:string' MAXLENGTH(25),
        node13 = '/a/b/d' as XQUERY 'xs:double' SINGLETON,
        -- the next two match value() method
        -- require regular SQL Server type semantics
        -- they can be mixed with the XQUERY ones
        -- specifying a type is mandatory for the SQL type semantics
        pathfloat = '/a/b/c' as SQL FLOAT,
        pathabd = '/a/b/d' as SQL VARCHAR(200)
      )
  • 41. Session Takeaways • Understand when and how to use XML in SQL Server • Understand and correct common performance problems with XML and XQuery • Shred “relational” XML to relations • Use XML datatype for semistructured and markup scenarios • Write your XQueries so that XML Indices can be used • Use persisted computed columns to promote XQuery results (with caveat)
  • 43. Related Content
      Optimization whitepapers
        http://msdn2.microsoft.com/en-us/library/ms345118.aspx
        http://msdn2.microsoft.com/en-us/library/ms345121.aspx
      General XML and Databases whitepapers
        http://msdn2.microsoft.com/en-us/xml/bb190603.aspx
      Online WebCasts
        http://www.microsoft.com/events/series/msdnsqlserver2005.mspx#SQLXML
      Newsgroups & Forum: microsoft.public.sqlserver.xml
        http://communities.microsoft.com/newsgroups/default.asp?ICP=sqlserver2005&sLCID=us
        http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=89
      My E-mail: mrys@microsoft.com
      My Weblog: http://sqlblog.com/blogs/michael_rys/
  • 45. Thank you for attending this session and the 2011 PASS Summit in Seattle October 11-14, Seattle, WA