SlideShare a Scribd company logo
1 of 40
Download to read offline
Tagging and Folksonomy
  Schema Design for Scalability and Performance
                                     Jay Pipes
                     Community Relations Manager, North America
                                 (jay@mysql.com)
                                    MySQL, Inc.
   TELECONFERENCE: please dial a number to hear the audio portion of this presentation.

   Toll-free US/Canada: 866-469-3239        Direct US/Canada: 650-429-3300 :
   0800-295-240 Austria                     0800-71083 Belgium
   80-884912 Denmark                        0-800-1-12585 Finland
   0800-90-5571 France                      0800-101-6943 Germany
   00800-12-6759 Greece                     1-800-882019 Ireland
   800-780-632 Italy                        0-800-9214652 Israel
   800-2498 Luxembourg                      0800-022-6826 Netherlands
   800-15888 Norway                         900-97-1417 Spain
   020-79-7251 Sweden                       0800-561-201 Switzerland
   0800-028-8023 UK                         1800-093-897 Australia
   800-90-3575 Hong Kong                    00531-12-1688 Japan

                                   EVENT NUMBER/ACCESS CODE:

Copyright MySQL AB                                            The World’s Most Popular Open Source Database   1
Communicate ...

Connect ...

Share ...

Play ...

Trade ...                                            craigslist


Search & Look Up



Copyright MySQL AB   The World’s Most Popular Open Source Database   2
Software Industry Evolution
                                   WEB 1.0                     WEB 2.0




                                                                FREE & OPEN
                                                                SOURCE
                                                                SOFTWARE


                                    CLOSED SOURCE SOFTWARE



                           1990          2000
                                  1995             2005

Copyright MySQL AB                              The World’s Most Popular Open Source Database   3
How MySQL Enables Web 2.0

                                   ubiquitous with developers
             lower tco
                                                           large objects
                                  tag databases
                        lamp                      XML
  full-text search                                                language support
                        ease of use
     world-wide
                                                  performance
     community
                                                                                       scale out
                           interoperable
                                                  connectors
      open source                                                         bundled

                                reliability          session management
             replication
                                                                      query cache
   partitioning (5.1)
                      pluggable storage engines




Copyright MySQL AB                                      The World’s Most Popular Open Source Database   4
Agenda
     •   Web 2.0: more than just rounded corners
     •   Tagging Concepts in SQL
     •   Folksonomy Concepts in SQL
     •   Scaling out sensibly




Copyright MySQL AB                        The World’s Most Popular Open Source Database   5
More than just rounded corners
   • What is Web 2.0?
          – Participation
          – Interaction
          – Connection
   • A changing of design patterns
          –   AJAX and XML-RPC are changing the way data is queried
          –   Rich web-based clients
          –   Everyone's got an API
          –   API leads to increased requests per second. Must deal with
              growth!




Copyright MySQL AB                                  The World’s Most Popular Open Source Database   6
AJAX and XML-RPC change the game
   • Traditional (Web 1.0) page request builds entire page
          – Lots of HTML, style and data in each page request
          – Lots of data processed or queried in each request
          – Complex page requests and application logic
   • AJAX/XML-RPC request returns small data set, or
     page fragment
          – Smaller amount of data being passed on each request
          – But, often many more requests compared to Web 1.0!
          – Exposed APIs mean data services distribute your data to
            syndicated sites
          – RSS feeds supply data, not full web page




Copyright MySQL AB                                The World’s Most Popular Open Source Database   7
flickr Tag Cloud(s)




Copyright MySQL AB                  The World’s Most Popular Open Source Database   8
Participation, Interaction and Connection
   • Multiple users editing the same content
          – Content explosion
          – Lots of textual data to manage. How do we organize it?
   • User-driven data
          – “Tracked” pages/items
          – Allows interlinking of content via the user to user relationship
            (folksonomy)
   • Tags becoming the new categorization/filing of this
     content
          – Anyone can tag the data
          – Tags connect one thing to another, similar to the way that the
            folksonomy relationships link users together
   • So, the common concept here is “linking”...

Copyright MySQL AB                                    The World’s Most Popular Open Source Database   9
What The User Dimension Gives You
   • Folksonomy adds the “user dimension”
   • Tags often provide the “glue” between user and item
     dimensions
   • Answers questions such as:
          – Who shares my interests
          – What are the other interests of users who share my interests
          – Someone tagged my bookmark (or any other item) with
            something. What other items did that person tag with the
            same thing?
          – What is the user's immediate “ecosystem”? What are the
            fringes of that ecosystem?
                • Ecosystem radius expansion (like zipcode radius expansion...)
   • Marks a new aspect of web applications into the world
     of data warehousing and analysis
Copyright MySQL AB                                       The World’s Most Popular Open Source Database   10
Linking Is the “R” in RDBMS
   • The schema becomes the absolute driving force behind
     Web 2.0 applications
   • Understanding of many-to-many relationships is critical
   • Take advantage of MySQL's architecture so that our
     schema is as efficient as possible
   • Ensure your data store is normalized, standardized,
     and consistent
   • So, let's digg in! (pun intended)




Copyright MySQL AB                      The World’s Most Popular Open Source Database   11
Comparison of Tag Schema Designs
   • “MySQLicious”
          –   Entirely denormalized
          –   One main fact table with one field storing delimited list of tags
          –   Unless using FULLTEXT indexing, does not scale well at all
          –   Very inflexible
   • “Scuttle” solution
          – Two tables. Lookup one-way from item table to tag table
          – Somewhat inflexible
          – Doesn't represent a many-to-many relationship correctly
   • “Toxi” solution
          – Almost gets it right, except uses surrogate keys in mapping
            tables ...
          – Flexible, normalized approach
          – Closest to our recommended architecture ...
Copyright MySQL AB                                     The World’s Most Popular Open Source Database   12
Tagging Concepts in SQL




Copyright MySQL AB                    The World’s Most Popular Open Source Database   13
The Tags Table
   • The “Tags” table is the foundational table upon which
     all tag links are built
   • Lean and mean
   • Make primary key an INT!

    CREATE TABLE Tags (
    tag_id INT UNSIGNED NOT NULL AUTO_INCREMENT
    , tag_text VARCHAR(50) NOT NULL
    , PRIMARY KEY pk_Tags (tag_id)
    , UNIQUE INDEX uix_TagText (tag_text)
    ) ENGINE=InnoDB;



Copyright MySQL AB                       The World’s Most Popular Open Source Database   14
Example Tag2Post Mapping Table
   • The mapping table creates the link between a tag and
     anything else
   • In other terms, it maps a many-to-many relationship
   • Important to index from both “sides”

    CREATE TABLE Tag2Post (
    tag_id INT UNSIGNED NOT NULL
    , post_id INT UNSIGNED NOT NULL
    , PRIMARY KEY pk_Tag2Post (tag_id, post_id)
    , INDEX (post_id)
    ) ENGINE=InnoDB;



Copyright MySQL AB                     The World’s Most Popular Open Source Database   15
The Tag Cloud
   • Tag density typically represented by larger fonts or
     different colors

      SELECT tag_text
      , COUNT(*) as num_tags
      FROM Tag2Post t2p
      INNER JOIN Tags t
      ON t2p.tag_id = t.tag_id
      GROUP BY tag_text;




Copyright MySQL AB                        The World’s Most Popular Open Source Database   16
Efficiency Issues with the Tag Cloud
   • With InnoDB tables, you don't want to be issuing
     COUNT() queries, even on an indexed field
       CREATE TABLE TagStat
       tag_id INT UNSIGNED NOT NULL
       , num_posts INT UNSIGNED NOT NULL
       // ... num_xxx INT UNSIGNED NOT NULL 
       , PRIMARY KEY (tag_id)
       ) ENGINE=InnoDB;

        SELECT tag_text, ts.num_posts
        FROM Tag2Post t2p
        INNER JOIN Tags t
        ON t2p.tag_id = t.tag_id
        INNER JOIN TagStat ts
        ON t.tag_id = ts.tag_id
        GROUP BY tag_text;

Copyright MySQL AB                      The World’s Most Popular Open Source Database   17
The Typical Related Items Query
   • Get all posts tagged with any tag attached to Post #6
   • In other words “Get me all posts related to post #6”


       SELECT p2.post_id 
       FROM Tag2Post p1
       INNER JOIN Tag2Post p2
       ON p1.tag_id = p2.tag_id
       WHERE p1.post_id = 6
       GROUP BY p2.post_id;




Copyright MySQL AB                       The World’s Most Popular Open Source Database   18
Problems With Related Items Query
   • Joining small to medium sized “tag sets” works great
   • But... when you've got a large tag set on either “side” of
     the join, problems can occur with scalablity
   • One way to solve is via derived tables

       SELECT p2.post_id 
       FROM (
       SELECT tag_id FROM Tag2Post
       WHERE post_id = 6 LIMIT 10
       ) AS p1
       INNER JOIN Tag2Post p2
       ON p1.tag_id = p2.tag_id
       GROUP BY p2.post_id LIMIT 10;

Copyright MySQL AB                        The World’s Most Popular Open Source Database   19
The Typical Related Tags Query
   • Get all tags related to a particular tag via an item
   • The “reverse” of the related items query; we want a set
     of related tags, not related posts

       SELECT t2p2.tag_id, t2.tag_text  
       FROM (SELECT post_id FROM Tags t1
       INNER JOIN Tag2Post 
       ON t1.tag_id = Tag2Post.tag_id
       WHERE t1.tag_text = 'beach' LIMIT 10
       ) AS t2p1
       INNER JOIN Tag2Post t2p2
       ON t2p1.post_id = t2p2.post_id
       INNER JOIN Tags t2
       ON t2p2.tag_id = t2.tag_id
       GROUP BY t2p2.tag_id LIMIT 10;


Copyright MySQL AB                       The World’s Most Popular Open Source Database   20
Dealing With More Than One Tag
   • What if we want only items related to each of a set of
     tags?
   • Here is the typical way of dealing with this problem

       SELECT t2p.post_id
       FROM Tags t1 
       INNER JOIN Tag2Post t2p
       ON t1.tag_id = t2p.tag_id
       WHERE t1.tag_text IN ('beach','cloud')
       GROUP BY t2p.post_id 
       HAVING COUNT(DISTINCT t2p.tag_id) = 2;




Copyright MySQL AB                       The World’s Most Popular Open Source Database   21
Dealing With More Than One Tag (cont'd)
   • The GROUP BY and the HAVING COUNT(DISTINCT
     ...) can be eliminated through joins
   • Thus, you eliminate the Using temporary, using filesort
     in the query execution

        SELECT t2p2.post_id
        FROM Tags t1 CROSS JOIN Tags t2
        INNER JOIN Tag2Post t2p1
        ON t1.tag_id = t2p1.tag_id
        INNER JOIN Tag2Post t2p2
        ON t2p1.post_id = t2p2.post_id
        AND t2p2.tag_id = t2.tag_id
        WHERE t1.tag_text = 'beach'
        AND t2.tag_text = 'cloud';


Copyright MySQL AB                        The World’s Most Popular Open Source Database   22
Excluding Tags From A Resultset
   • Here, we want have a search query like “Give me all
     posts tagged with “beach” and “cloud” but not tagged
     with “flower”. Typical solution you will see:

        SELECT t2p1.post_id
        FROM Tag2Post t2p1
        INNER JOIN Tags t1 ON t2p1.tag_id = t1.tag_id
        WHERE t1.tag_text IN ('beach','cloud')
        AND t2p1.post_id NOT IN
        (SELECT post_id FROM Tag2Post t2p2
        INNER JOIN Tags t2 
        ON t2p2.tag_id = t2.tag_id
        WHERE t2.tag_text = 'flower')
        GROUP BY t2p1.post_id 
        HAVING COUNT(DISTINCT t2p1.tag_id) = 2;


Copyright MySQL AB                      The World’s Most Popular Open Source Database   23
Excluding Tags From A Resultset (cont'd)
   • More efficient to use an outer join to filter out the
     “minus” operator, plus get rid of the GROUP BY...
       SELECT t2p2.post_id
       FROM Tags t1 CROSS JOIN Tags t2 CROSS JOIN Tags t3
       INNER JOIN Tag2Post t2p1
       ON t1.tag_id = t2p1.tag_id
       INNER JOIN Tag2Post t2p2
       ON t2p1.post_id = t2p2.post_id
       AND t2p2.tag_id = t2.tag_id
       LEFT JOIN Tag2Post t2p3
       ON t2p2.post_id = t2p3.post_id
       AND t2p3.tag_id = t3.tag_id
       WHERE t1.tag_text = 'beach'
       AND t2.tag_text = 'cloud'
       AND t3.tag_text = 'flower'
       AND t2p3.post_id IS NULL;

Copyright MySQL AB                         The World’s Most Popular Open Source Database   24
Summary of SQL Tips for Tagging
   • Index fields in mapping tables properly. Ensure that
     each GROUP BY can access an index from the left
     side of the index
   • Use summary or statistic tables to eliminate the use of
     COUNT(*) expressions in tag clouding
   • Get rid of GROUP BY and HAVING COUNT(...) by
     using standard join techniques
   • Get rid of NOT IN expressions via a standard outer join
   • Use derived tables with an internal LIMIT expression to
     prevent wild relation queries from breaking scalability




Copyright MySQL AB                       The World’s Most Popular Open Source Database   25
Folksonomy Concepts in SQL




Copyright MySQL AB                     The World’s Most Popular Open Source Database   26
Here we see
                        tagging and
                        folksonomy
                     together: the user
                        dimension...




Copyright MySQL AB                        The World’s Most Popular Open Source Database   27
Folksonomy Adds The User Dimension
   • Adding the user dimension to our schema
   • The tag is the relationship glue between the user and
     item dimensions

       CREATE TABLE UserTagPost (
       user_id INT UNSIGNED NOT NULL
       , tag_id INT UNSIGNED NOT NULL
       , post_id INT UNSIGNED NOT NULL
       , PRIMARY KEY (user_id, tag_id, post_id)
       , INDEX (tag_id)
       , INDEX (post_id)
       ) ENGINE=InnoDB;


Copyright MySQL AB                       The World’s Most Popular Open Source Database   28
Who Shares My Interest Directly?
   • Find out the users who have linked to the same item I
     have
   • Direct link; we don't go through the tag glue


    SELECT user_id 
    FROM UserTagPost
    WHERE post_id = @my_post_id;




Copyright MySQL AB                      The World’s Most Popular Open Source Database   29
Who Shares My Interests Indirectly?
   • Find out the users who have similar tag sets...
   • But, how much matching do we want to do? In other
     words, what radius do we want to match on...
   • The first step is to find my tags that are within the
     search radius... this yields my “top” or most popular
     tags

           SELECT tag_id 
           FROM UserTagPost
           WHERE user_id = @my_user_id
           GROUP BY tag_id
           HAVING COUNT(tag_id) >= @radius;


Copyright MySQL AB                       The World’s Most Popular Open Source Database   30
Who Shares My Interests (cont'd)
   • Now that we have our “top” tag set, we want to find
     users who match all of our top tags:


           SELECT others.user_id 
           FROM UserTagPost others 
           INNER JOIN (SELECT tag_id 
           FROM UserTagPost
           WHERE user_id = @my_user_id
           GROUP BY tag_id
           HAVING COUNT(tag_id) >= @radius) AS my_tags
           ON others.tag_id = my_tags.tag_id
           GROUP BY others.user_id;




Copyright MySQL AB                       The World’s Most Popular Open Source Database   31
Ranking Ecosystem Matches
   • What about finding our “closest” ecosystem matches
   • We can “rank” other users based on whether they have
     tagged items a number of times similar to ourselves
           SELECT others.user_id
           , (COUNT(*) ­ my_tags.num_tags) AS rank
           FROM UserTagPost others 
           INNER JOIN (
           SELECT tag_id, COUNT(*) AS num_tags 
           FROM UserTagPost
           WHERE user_id = @my_user_id
           GROUP BY tag_id
           HAVING COUNT(tag_id) >= @radius
           ) AS my_tags
           ON others.tag_id = my_tags.tag_id
           GROUP BY others.user_id
           ORDER BY rank DESC;

Copyright MySQL AB                       The World’s Most Popular Open Source Database   32
Ranking Ecosystem Matches Efficiently
   • But... we've still got our COUNT(*) problem
   • How about another summary table ...

                 CREATE TABLE UserTagStat (
                 user_id INT UNSIGNED NOT NULL
                 , tag_id INT UNSIGNED NOT NUL
                 , num_posts INT UNSIGNED NOT NULL
                 , PRIMARY KEY (user_id, tag_id)
                 , INDEX (tag_id)
                 ) ENGINE=InnoDB;




Copyright MySQL AB                          The World’s Most Popular Open Source Database   33
Ranking Ecosystem Matches Efficiently 2
   • Hey, we've eliminated the aggregation!


        SELECT others.user_id
        , (others.num_posts ­ my_tags.num_posts) AS rank
        FROM UserTagStat others 
        INNER JOIN (
        SELECT tag_id, num_posts
        FROM UserTagStat
        WHERE user_id = @my_user_id
        AND  num_posts >= @radius
        ) AS my_tags
        ON others.tag_id = my_tags.tag_id
        ORDER BY rank DESC;




Copyright MySQL AB                      The World’s Most Popular Open Source Database   34
Scaling Out Sensibly




Copyright MySQL AB                   The World’s Most Popular Open Source Database   35
MySQL Replication (Scale Out)
                                                             • Write to one master
                      Web/App       Web/App     Web/App
                                                             • Read from many slaves
                       Server        Server      Server
                                                             • Excellent for read
                                                               intensive apps
                            Load Balancer


                                        Reads             Reads
     Writes & Reads




                                                                         …
             Master                       Slave            Slave
             MySQL                        MySQL            MySQL
             Server                       Server           Server



                      Replication




Copyright MySQL AB                                         The World’s Most Popular Open Source Database   36
Scale Out Using Replication
     • Master DB stores all writes
     • Master has InnoDB tables
     • Slaves handle aggregate reads, non-realtime reads
     • Web servers can be load balanced (directed) to one or
       more slaves
     • Just plug in another slave to increase read performance
       (that's scaling out...)
     • Slave can provide hot standby as well as backup server




Copyright MySQL AB                       The World’s Most Popular Open Source Database   37
Scale Out Strategies
                     • Slave storage engine can differ from
                       Master!
                        – InnoDB on Master (great update/insert/delete
                          performance)
                        – MyISAM on Slave (fantastic read performance
                          and well as excellent concurrent insert
                          performance, plus can use FULLTEXT
                          indexing)
                     • Push aggregated summary data in
                       batches onto slaves for excellent read
                       performance of semi-static data
                        – Example: “this week's popular tags”
                           • Generate the data via cron job on each slave.
                             No need to burden the master server
                           • Truncate every week

Copyright MySQL AB                               The World’s Most Popular Open Source Database   38
Scale Out Strategies (cont'd)
                        • Offload FULLTEXT indexing onto a FT
                          indexer such as Apache Lucene,
                          Mnogosearch, Sphinx FT Engine, etc.
                        • Use Partitioning feature of 5.1 to
                          segment tag data across multiple
                          partitions, allowing you to spread disk
                          load sensibly based on your tag text
                          density
                        • Use the MySQL Query Cache effectively
                           – Use SQL_NO_CACHE when selecting from
                             frequently updated tables (ex: TagStat)
                           – Very effective for high-read environments:
                             can yield 200-250% performance
                             improvement


Copyright MySQL AB                                The World’s Most Popular Open Source Database   39
Questions?
 MySQL Forge
 http://forge.mysql.com/

 MySQL Forge Tag Schema Wiki pages
 http://forge.mysql.com/wiki/TagSchema

 PlanetMySQL
 http://www.planetmysql.org

                                   Jay Pipes
                               (jay@mysql.com)
                                  MySQL AB




Copyright MySQL AB                               The World’s Most Popular Open Source Database   40

More Related Content

What's hot

Oracle Cloud Infrastructure:2021年11月度サービス・アップデート
Oracle Cloud Infrastructure:2021年11月度サービス・アップデートOracle Cloud Infrastructure:2021年11月度サービス・アップデート
Oracle Cloud Infrastructure:2021年11月度サービス・アップデートオラクルエンジニア通信
 
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)NTT DATA Technology & Innovation
 
KubernetesでRedisを使うときの選択肢
KubernetesでRedisを使うときの選択肢KubernetesでRedisを使うときの選択肢
KubernetesでRedisを使うときの選択肢Naoyuki Yamada
 
意識の低い自動化
意識の低い自動化意識の低い自動化
意識の低い自動化greenasparagus
 
微服務架構|01|入門微服務|到底什麼是微服務?
微服務架構|01|入門微服務|到底什麼是微服務?微服務架構|01|入門微服務|到底什麼是微服務?
微服務架構|01|入門微服務|到底什麼是微服務?Kuo Lung Chu
 
JSON:APIについてざっくり入門
JSON:APIについてざっくり入門JSON:APIについてざっくり入門
JSON:APIについてざっくり入門iPride Co., Ltd.
 
DDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVCDDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVCAndy Butland
 
CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定
CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定
CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定Katsuhisa Ogawa
 
アプリエンジニアからクラウド専用のインフラエンジニアになってみて
アプリエンジニアからクラウド専用のインフラエンジニアになってみてアプリエンジニアからクラウド専用のインフラエンジニアになってみて
アプリエンジニアからクラウド専用のインフラエンジニアになってみてSato Shun
 
3分でわかる Azure Managed Diskのしくみ
3分でわかる Azure Managed Diskのしくみ3分でわかる Azure Managed Diskのしくみ
3分でわかる Azure Managed DiskのしくみToru Makabe
 
Migrating Monoliths to Microservices -- M3
Migrating Monoliths to Microservices -- M3Migrating Monoliths to Microservices -- M3
Migrating Monoliths to Microservices -- M3Asir Selvasingh
 
使ってみた!ioMemoryで実現する噂のAtomic write!
使ってみた!ioMemoryで実現する噂のAtomic write!使ってみた!ioMemoryで実現する噂のAtomic write!
使ってみた!ioMemoryで実現する噂のAtomic write!IIJ
 
OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介VirtualTech Japan Inc.
 
[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita
[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita
[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya MoritaInsight Technology, Inc.
 
クラウドのためのアーキテクチャ設計 - ベストプラクティス -
クラウドのためのアーキテクチャ設計 - ベストプラクティス - クラウドのためのアーキテクチャ設計 - ベストプラクティス -
クラウドのためのアーキテクチャ設計 - ベストプラクティス - SORACOM, INC
 
Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編Yuki Morishita
 
Fluentdのお勧めシステム構成パターン
Fluentdのお勧めシステム構成パターンFluentdのお勧めシステム構成パターン
Fluentdのお勧めシステム構成パターンKentaro Yoshida
 

What's hot (20)

Oracle Cloud Infrastructure:2021年11月度サービス・アップデート
Oracle Cloud Infrastructure:2021年11月度サービス・アップデートOracle Cloud Infrastructure:2021年11月度サービス・アップデート
Oracle Cloud Infrastructure:2021年11月度サービス・アップデート
 
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
 
KubernetesでRedisを使うときの選択肢
KubernetesでRedisを使うときの選択肢KubernetesでRedisを使うときの選択肢
KubernetesでRedisを使うときの選択肢
 
意識の低い自動化
意識の低い自動化意識の低い自動化
意識の低い自動化
 
微服務架構|01|入門微服務|到底什麼是微服務?
微服務架構|01|入門微服務|到底什麼是微服務?微服務架構|01|入門微服務|到底什麼是微服務?
微服務架構|01|入門微服務|到底什麼是微服務?
 
JSON:APIについてざっくり入門
JSON:APIについてざっくり入門JSON:APIについてざっくり入門
JSON:APIについてざっくり入門
 
DDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVCDDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVC
 
Nutanix 概要紹介
Nutanix 概要紹介Nutanix 概要紹介
Nutanix 概要紹介
 
nginxの紹介
nginxの紹介nginxの紹介
nginxの紹介
 
CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定
CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定
CDNで高速化!Drupal認証ユーザーむけページキャッシュ設定
 
アプリエンジニアからクラウド専用のインフラエンジニアになってみて
アプリエンジニアからクラウド専用のインフラエンジニアになってみてアプリエンジニアからクラウド専用のインフラエンジニアになってみて
アプリエンジニアからクラウド専用のインフラエンジニアになってみて
 
3分でわかる Azure Managed Diskのしくみ
3分でわかる Azure Managed Diskのしくみ3分でわかる Azure Managed Diskのしくみ
3分でわかる Azure Managed Diskのしくみ
 
Migrating Monoliths to Microservices -- M3
Migrating Monoliths to Microservices -- M3Migrating Monoliths to Microservices -- M3
Migrating Monoliths to Microservices -- M3
 
使ってみた!ioMemoryで実現する噂のAtomic write!
使ってみた!ioMemoryで実現する噂のAtomic write!使ってみた!ioMemoryで実現する噂のAtomic write!
使ってみた!ioMemoryで実現する噂のAtomic write!
 
OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介
 
[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita
[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita
[B31] LOGMinerってレプリケーションソフトで使われているけどどうなってる? by Toshiya Morita
 
クラウドのためのアーキテクチャ設計 - ベストプラクティス -
クラウドのためのアーキテクチャ設計 - ベストプラクティス - クラウドのためのアーキテクチャ設計 - ベストプラクティス -
クラウドのためのアーキテクチャ設計 - ベストプラクティス -
 
Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編
 
KubeConRecap_nakamura.pdf
KubeConRecap_nakamura.pdfKubeConRecap_nakamura.pdf
KubeConRecap_nakamura.pdf
 
Fluentdのお勧めシステム構成パターン
Fluentdのお勧めシステム構成パターンFluentdのお勧めシステム構成パターン
Fluentdのお勧めシステム構成パターン
 

Viewers also liked

Tagging and Folksonomies
Tagging and FolksonomiesTagging and Folksonomies
Tagging and Folksonomieshhunt75
 
Tagging & Folksonomies poster
Tagging & Folksonomies posterTagging & Folksonomies poster
Tagging & Folksonomies posterPrattSILS
 
Understanding Tagging and Folksonomy - SharePoint Saturday DC
Understanding Tagging and Folksonomy - SharePoint Saturday DCUnderstanding Tagging and Folksonomy - SharePoint Saturday DC
Understanding Tagging and Folksonomy - SharePoint Saturday DCThomas Vander Wal
 
Folksonomy Presentation
Folksonomy PresentationFolksonomy Presentation
Folksonomy Presentationamit_nathwani
 
Bridging Social Web and Sem Web : 2 application cases in the field of sustain...
Bridging Social Web and Sem Web : 2 application cases in the field of sustain...Bridging Social Web and Sem Web : 2 application cases in the field of sustain...
Bridging Social Web and Sem Web : 2 application cases in the field of sustain...Freddy Limpens
 
Folksonomy Presentation
Folksonomy PresentationFolksonomy Presentation
Folksonomy PresentationVanessa Yau
 
Next Generation Search: Social Bookmarking and Tagging
Next Generation Search: Social Bookmarking and TaggingNext Generation Search: Social Bookmarking and Tagging
Next Generation Search: Social Bookmarking and TaggingThomas Vander Wal
 
Taxonomies and Folksonomies
Taxonomies and FolksonomiesTaxonomies and Folksonomies
Taxonomies and FolksonomiesHeather Hedden
 
Introduction to jQuery Mobile - Web Deliver for All
Introduction to jQuery Mobile - Web Deliver for AllIntroduction to jQuery Mobile - Web Deliver for All
Introduction to jQuery Mobile - Web Deliver for AllMarc Grabanski
 
Social Web and Semantic Web: towards synergy between folksonomies and ontologies
Social Web and Semantic Web: towards synergy between folksonomies and ontologiesSocial Web and Semantic Web: towards synergy between folksonomies and ontologies
Social Web and Semantic Web: towards synergy between folksonomies and ontologiesFreddy Limpens
 

Viewers also liked (11)

Tagging and Folksonomies
Tagging and FolksonomiesTagging and Folksonomies
Tagging and Folksonomies
 
Tagging & Folksonomies poster
Tagging & Folksonomies posterTagging & Folksonomies poster
Tagging & Folksonomies poster
 
Understanding Tagging and Folksonomy - SharePoint Saturday DC
Understanding Tagging and Folksonomy - SharePoint Saturday DCUnderstanding Tagging and Folksonomy - SharePoint Saturday DC
Understanding Tagging and Folksonomy - SharePoint Saturday DC
 
Folksonomy Presentation
Folksonomy PresentationFolksonomy Presentation
Folksonomy Presentation
 
Bridging Social Web and Sem Web : 2 application cases in the field of sustain...
Bridging Social Web and Sem Web : 2 application cases in the field of sustain...Bridging Social Web and Sem Web : 2 application cases in the field of sustain...
Bridging Social Web and Sem Web : 2 application cases in the field of sustain...
 
Folksonomy
FolksonomyFolksonomy
Folksonomy
 
Folksonomy Presentation
Folksonomy PresentationFolksonomy Presentation
Folksonomy Presentation
 
Next Generation Search: Social Bookmarking and Tagging
Next Generation Search: Social Bookmarking and TaggingNext Generation Search: Social Bookmarking and Tagging
Next Generation Search: Social Bookmarking and Tagging
 
Taxonomies and Folksonomies
Taxonomies and FolksonomiesTaxonomies and Folksonomies
Taxonomies and Folksonomies
 
Introduction to jQuery Mobile - Web Deliver for All
Introduction to jQuery Mobile - Web Deliver for AllIntroduction to jQuery Mobile - Web Deliver for All
Introduction to jQuery Mobile - Web Deliver for All
 
Social Web and Semantic Web: towards synergy between folksonomies and ontologies
Social Web and Semantic Web: towards synergy between folksonomies and ontologiesSocial Web and Semantic Web: towards synergy between folksonomies and ontologies
Social Web and Semantic Web: towards synergy between folksonomies and ontologies
 

Similar to Tagging and Folksonomy Schema Design for Scalability and Performance

Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQLUlf Wendel
 
Why no sql_ibm_cloudant
Why no sql_ibm_cloudantWhy no sql_ibm_cloudant
Why no sql_ibm_cloudantPeter Tutty
 
Oracle no sql database bigdata
Oracle no sql database   bigdataOracle no sql database   bigdata
Oracle no sql database bigdataJoão Gabriel Lima
 
MySQL 8: Ready for Prime Time
MySQL 8: Ready for Prime TimeMySQL 8: Ready for Prime Time
MySQL 8: Ready for Prime TimeArnab Ray
 
Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
MySQL Ecosystem in 2020
MySQL Ecosystem in 2020MySQL Ecosystem in 2020
MySQL Ecosystem in 2020Alkin Tezuysal
 
MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesBernd Ocklin
 
http://www.hfadeel.com/Blog/?p=151
http://www.hfadeel.com/Blog/?p=151http://www.hfadeel.com/Blog/?p=151
http://www.hfadeel.com/Blog/?p=151xlight
 
My sql crashcourse_intro_kdl
My sql crashcourse_intro_kdlMy sql crashcourse_intro_kdl
My sql crashcourse_intro_kdlsqlhjalp
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsClusterpoint
 
MySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the CommunityMySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the CommunityFrederic Descamps
 
20090425mysqlslides 12593434194072-phpapp02
20090425mysqlslides 12593434194072-phpapp0220090425mysqlslides 12593434194072-phpapp02
20090425mysqlslides 12593434194072-phpapp02Vinamra Mittal
 
How to Radically Simplify Your Business Data Management
How to Radically Simplify Your Business Data ManagementHow to Radically Simplify Your Business Data Management
How to Radically Simplify Your Business Data ManagementClusterpoint
 

Similar to Tagging and Folksonomy Schema Design for Scalability and Performance (20)

Mysql
MysqlMysql
Mysql
 
Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQL
 
MySQL Cluster
MySQL ClusterMySQL Cluster
MySQL Cluster
 
Brandon
BrandonBrandon
Brandon
 
Why no sql_ibm_cloudant
Why no sql_ibm_cloudantWhy no sql_ibm_cloudant
Why no sql_ibm_cloudant
 
Oracle no sql database bigdata
Oracle no sql database   bigdataOracle no sql database   bigdata
Oracle no sql database bigdata
 
MySQL 8: Ready for Prime Time
MySQL 8: Ready for Prime TimeMySQL 8: Ready for Prime Time
MySQL 8: Ready for Prime Time
 
Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2
 
MySQL Ecosystem in 2020
MySQL Ecosystem in 2020MySQL Ecosystem in 2020
MySQL Ecosystem in 2020
 
The NoSQL Movement
The NoSQL MovementThe NoSQL Movement
The NoSQL Movement
 
MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion Queries
 
http://www.hfadeel.com/Blog/?p=151
http://www.hfadeel.com/Blog/?p=151http://www.hfadeel.com/Blog/?p=151
http://www.hfadeel.com/Blog/?p=151
 
Introduction to Mysql
Introduction to MysqlIntroduction to Mysql
Introduction to Mysql
 
My sql crashcourse_intro_kdl
My sql crashcourse_intro_kdlMy sql crashcourse_intro_kdl
My sql crashcourse_intro_kdl
 
Apache ignite v1.3
Apache ignite v1.3Apache ignite v1.3
Apache ignite v1.3
 
No sql3 rmoug
No sql3 rmougNo sql3 rmoug
No sql3 rmoug
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutions
 
MySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the CommunityMySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the Community
 
20090425mysqlslides 12593434194072-phpapp02
20090425mysqlslides 12593434194072-phpapp0220090425mysqlslides 12593434194072-phpapp02
20090425mysqlslides 12593434194072-phpapp02
 
How to Radically Simplify Your Business Data Management
How to Radically Simplify Your Business Data ManagementHow to Radically Simplify Your Business Data Management
How to Radically Simplify Your Business Data Management
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

Tagging and Folksonomy Schema Design for Scalability and Performance

  • 1. Tagging and Folksonomy Schema Design for Scalability and Performance Jay Pipes Community Relations Manager, North America (jay@mysql.com) MySQL, Inc. TELECONFERENCE: please dial a number to hear the audio portion of this presentation. Toll-free US/Canada: 866-469-3239 Direct US/Canada: 650-429-3300 : 0800-295-240 Austria 0800-71083 Belgium 80-884912 Denmark 0-800-1-12585 Finland 0800-90-5571 France 0800-101-6943 Germany 00800-12-6759 Greece 1-800-882019 Ireland 800-780-632 Italy 0-800-9214652 Israel 800-2498 Luxembourg 0800-022-6826 Netherlands 800-15888 Norway 900-97-1417 Spain 020-79-7251 Sweden 0800-561-201 Switzerland 0800-028-8023 UK 1800-093-897 Australia 800-90-3575 Hong Kong 00531-12-1688 Japan EVENT NUMBER/ACCESS CODE: Copyright MySQL AB The World’s Most Popular Open Source Database 1
  • 2. Communicate ... Connect ... Share ... Play ... Trade ... craigslist Search & Look Up Copyright MySQL AB The World’s Most Popular Open Source Database 2
  • 3. Software Industry Evolution WEB 1.0 WEB 2.0 FREE & OPEN SOURCE SOFTWARE CLOSED SOURCE SOFTWARE 1990 2000 1995 2005 Copyright MySQL AB The World’s Most Popular Open Source Database 3
  • 4. How MySQL Enables Web 2.0 ubiquitous with developers lower tco large objects tag databases lamp XML full-text search language support ease of use world-wide performance community scale out interoperable connectors open source bundled reliability session management replication query cache partitioning (5.1) pluggable storage engines Copyright MySQL AB The World’s Most Popular Open Source Database 4
  • 5. Agenda • Web 2.0: more than just rounded corners • Tagging Concepts in SQL • Folksonomy Concepts in SQL • Scaling out sensibly Copyright MySQL AB The World’s Most Popular Open Source Database 5
  • 6. More than just rounded corners • What is Web 2.0? – Participation – Interaction – Connection • A changing of design patterns – AJAX and XML-RPC are changing the way data is queried – Rich web-based clients – Everyone's got an API – API leads to increased requests per second. Must deal with growth! Copyright MySQL AB The World’s Most Popular Open Source Database 6
  • 7. AJAX and XML-RPC change the game • Traditional (Web 1.0) page request builds entire page – Lots of HTML, style and data in each page request – Lots of data processed or queried in each request – Complex page requests and application logic • AJAX/XML-RPC request returns small data set, or page fragment – Smaller amount of data being passed on each request – But, often many more requests compared to Web 1.0! – Exposed APIs mean data services distribute your data to syndicated sites – RSS feeds supply data, not full web page Copyright MySQL AB The World’s Most Popular Open Source Database 7
  • 8. flickr Tag Cloud(s) Copyright MySQL AB The World’s Most Popular Open Source Database 8
  • 9. Participation, Interaction and Connection • Multiple users editing the same content – Content explosion – Lots of textual data to manage. How do we organize it? • User-driven data – “Tracked” pages/items – Allows interlinking of content via the user to user relationship (folksonomy) • Tags becoming the new categorization/filing of this content – Anyone can tag the data – Tags connect one thing to another, similar to the way that the folksonomy relationships link users together • So, the common concept here is “linking”... Copyright MySQL AB The World’s Most Popular Open Source Database 9
  • 10. What The User Dimension Gives You • Folksonomy adds the “user dimension” • Tags often provide the “glue” between user and item dimensions • Answers questions such as: – Who shares my interests – What are the other interests of users who share my interests – Someone tagged my bookmark (or any other item) with something. What other items did that person tag with the same thing? – What is the user's immediate “ecosystem”? What are the fringes of that ecosystem? • Ecosystem radius expansion (like zipcode radius expansion...) • Marks a new aspect of web applications into the world of data warehousing and analysis Copyright MySQL AB The World’s Most Popular Open Source Database 10
  • 11. Linking Is the “R” in RDBMS • The schema becomes the absolute driving force behind Web 2.0 applications • Understanding of many-to-many relationships is critical • Take advantage of MySQL's architecture so that our schema is as efficient as possible • Ensure your data store is normalized, standardized, and consistent • So, let's digg in! (pun intended) Copyright MySQL AB The World’s Most Popular Open Source Database 11
  • 12. Comparison of Tag Schema Designs • “MySQLicious” – Entirely denormalized – One main fact table with one field storing delimited list of tags – Unless using FULLTEXT indexing, does not scale well at all – Very inflexible • “Scuttle” solution – Two tables. Lookup one-way from item table to tag table – Somewhat inflexible – Doesn't represent a many-to-many relationship correctly • “Toxi” solution – Almost gets it right, except uses surrogate keys in mapping tables ... – Flexible, normalized approach – Closest to our recommended architecture ... Copyright MySQL AB The World’s Most Popular Open Source Database 12
  • 13. Tagging Concepts in SQL Copyright MySQL AB The World’s Most Popular Open Source Database 13
  • 14. The Tags Table • The “Tags” table is the foundational table upon which all tag links are built • Lean and mean • Make primary key an INT! CREATE TABLE Tags ( tag_id INT UNSIGNED NOT NULL AUTO_INCREMENT , tag_text VARCHAR(50) NOT NULL , PRIMARY KEY pk_Tags (tag_id) , UNIQUE INDEX uix_TagText (tag_text) ) ENGINE=InnoDB; Copyright MySQL AB The World’s Most Popular Open Source Database 14
  • 15. Example Tag2Post Mapping Table • The mapping table creates the link between a tag and anything else • In other terms, it maps a many-to-many relationship • Important to index from both “sides” CREATE TABLE Tag2Post ( tag_id INT UNSIGNED NOT NULL , post_id INT UNSIGNED NOT NULL , PRIMARY KEY pk_Tag2Post (tag_id, post_id) , INDEX (post_id) ) ENGINE=InnoDB; Copyright MySQL AB The World’s Most Popular Open Source Database 15
  • 16. The Tag Cloud • Tag density typically represented by larger fonts or different colors SELECT tag_text , COUNT(*) as num_tags FROM Tag2Post t2p INNER JOIN Tags t ON t2p.tag_id = t.tag_id GROUP BY tag_text; Copyright MySQL AB The World’s Most Popular Open Source Database 16
  • 17. Efficiency Issues with the Tag Cloud • With InnoDB tables, you don't want to be issuing COUNT() queries, even on an indexed field CREATE TABLE TagStat tag_id INT UNSIGNED NOT NULL , num_posts INT UNSIGNED NOT NULL // ... num_xxx INT UNSIGNED NOT NULL  , PRIMARY KEY (tag_id) ) ENGINE=InnoDB; SELECT tag_text, ts.num_posts FROM Tag2Post t2p INNER JOIN Tags t ON t2p.tag_id = t.tag_id INNER JOIN TagStat ts ON t.tag_id = ts.tag_id GROUP BY tag_text; Copyright MySQL AB The World’s Most Popular Open Source Database 17
  • 18. The Typical Related Items Query • Get all posts tagged with any tag attached to Post #6 • In other words “Get me all posts related to post #6” SELECT p2.post_id  FROM Tag2Post p1 INNER JOIN Tag2Post p2 ON p1.tag_id = p2.tag_id WHERE p1.post_id = 6 GROUP BY p2.post_id; Copyright MySQL AB The World’s Most Popular Open Source Database 18
  • 19. Problems With Related Items Query • Joining small to medium sized “tag sets” works great • But... when you've got a large tag set on either “side” of the join, problems can occur with scalablity • One way to solve is via derived tables SELECT p2.post_id  FROM ( SELECT tag_id FROM Tag2Post WHERE post_id = 6 LIMIT 10 ) AS p1 INNER JOIN Tag2Post p2 ON p1.tag_id = p2.tag_id GROUP BY p2.post_id LIMIT 10; Copyright MySQL AB The World’s Most Popular Open Source Database 19
  • 20. The Typical Related Tags Query • Get all tags related to a particular tag via an item • The “reverse” of the related items query; we want a set of related tags, not related posts SELECT t2p2.tag_id, t2.tag_text   FROM (SELECT post_id FROM Tags t1 INNER JOIN Tag2Post  ON t1.tag_id = Tag2Post.tag_id WHERE t1.tag_text = 'beach' LIMIT 10 ) AS t2p1 INNER JOIN Tag2Post t2p2 ON t2p1.post_id = t2p2.post_id INNER JOIN Tags t2 ON t2p2.tag_id = t2.tag_id GROUP BY t2p2.tag_id LIMIT 10; Copyright MySQL AB The World’s Most Popular Open Source Database 20
  • 21. Dealing With More Than One Tag • What if we want only items related to each of a set of tags? • Here is the typical way of dealing with this problem SELECT t2p.post_id FROM Tags t1  INNER JOIN Tag2Post t2p ON t1.tag_id = t2p.tag_id WHERE t1.tag_text IN ('beach','cloud') GROUP BY t2p.post_id  HAVING COUNT(DISTINCT t2p.tag_id) = 2; Copyright MySQL AB The World’s Most Popular Open Source Database 21
  • 22. Dealing With More Than One Tag (cont'd) • The GROUP BY and the HAVING COUNT(DISTINCT ...) can be eliminated through joins • Thus, you eliminate the Using temporary, using filesort in the query execution SELECT t2p2.post_id FROM Tags t1 CROSS JOIN Tags t2 INNER JOIN Tag2Post t2p1 ON t1.tag_id = t2p1.tag_id INNER JOIN Tag2Post t2p2 ON t2p1.post_id = t2p2.post_id AND t2p2.tag_id = t2.tag_id WHERE t1.tag_text = 'beach' AND t2.tag_text = 'cloud'; Copyright MySQL AB The World’s Most Popular Open Source Database 22
  • 23. Excluding Tags From A Resultset • Here, we want have a search query like “Give me all posts tagged with “beach” and “cloud” but not tagged with “flower”. Typical solution you will see: SELECT t2p1.post_id FROM Tag2Post t2p1 INNER JOIN Tags t1 ON t2p1.tag_id = t1.tag_id WHERE t1.tag_text IN ('beach','cloud') AND t2p1.post_id NOT IN (SELECT post_id FROM Tag2Post t2p2 INNER JOIN Tags t2  ON t2p2.tag_id = t2.tag_id WHERE t2.tag_text = 'flower') GROUP BY t2p1.post_id  HAVING COUNT(DISTINCT t2p1.tag_id) = 2; Copyright MySQL AB The World’s Most Popular Open Source Database 23
  • 24. Excluding Tags From A Resultset (cont'd) • More efficient to use an outer join to filter out the “minus” operator, plus get rid of the GROUP BY... SELECT t2p2.post_id FROM Tags t1 CROSS JOIN Tags t2 CROSS JOIN Tags t3 INNER JOIN Tag2Post t2p1 ON t1.tag_id = t2p1.tag_id INNER JOIN Tag2Post t2p2 ON t2p1.post_id = t2p2.post_id AND t2p2.tag_id = t2.tag_id LEFT JOIN Tag2Post t2p3 ON t2p2.post_id = t2p3.post_id AND t2p3.tag_id = t3.tag_id WHERE t1.tag_text = 'beach' AND t2.tag_text = 'cloud' AND t3.tag_text = 'flower' AND t2p3.post_id IS NULL; Copyright MySQL AB The World’s Most Popular Open Source Database 24
  • 25. Summary of SQL Tips for Tagging • Index fields in mapping tables properly. Ensure that each GROUP BY can access an index from the left side of the index • Use summary or statistic tables to eliminate the use of COUNT(*) expressions in tag clouding • Get rid of GROUP BY and HAVING COUNT(...) by using standard join techniques • Get rid of NOT IN expressions via a standard outer join • Use derived tables with an internal LIMIT expression to prevent wild relation queries from breaking scalability Copyright MySQL AB The World’s Most Popular Open Source Database 25
  • 26. Folksonomy Concepts in SQL Copyright MySQL AB The World’s Most Popular Open Source Database 26
  • 27. Here we see tagging and folksonomy together: the user dimension... Copyright MySQL AB The World’s Most Popular Open Source Database 27
  • 28. Folksonomy Adds The User Dimension • Adding the user dimension to our schema • The tag is the relationship glue between the user and item dimensions CREATE TABLE UserTagPost ( user_id INT UNSIGNED NOT NULL , tag_id INT UNSIGNED NOT NULL , post_id INT UNSIGNED NOT NULL , PRIMARY KEY (user_id, tag_id, post_id) , INDEX (tag_id) , INDEX (post_id) ) ENGINE=InnoDB; Copyright MySQL AB The World’s Most Popular Open Source Database 28
  • 29. Who Shares My Interest Directly? • Find out the users who have linked to the same item I have • Direct link; we don't go through the tag glue SELECT user_id  FROM UserTagPost WHERE post_id = @my_post_id; Copyright MySQL AB The World’s Most Popular Open Source Database 29
  • 30. Who Shares My Interests Indirectly? • Find out the users who have similar tag sets... • But, how much matching do we want to do? In other words, what radius do we want to match on... • The first step is to find my tags that are within the search radius... this yields my “top” or most popular tags SELECT tag_id  FROM UserTagPost WHERE user_id = @my_user_id GROUP BY tag_id HAVING COUNT(tag_id) >= @radius; Copyright MySQL AB The World’s Most Popular Open Source Database 30
  • 31. Who Shares My Interests (cont'd) • Now that we have our “top” tag set, we want to find users who match all of our top tags: SELECT others.user_id  FROM UserTagPost others  INNER JOIN (SELECT tag_id  FROM UserTagPost WHERE user_id = @my_user_id GROUP BY tag_id HAVING COUNT(tag_id) >= @radius) AS my_tags ON others.tag_id = my_tags.tag_id GROUP BY others.user_id; Copyright MySQL AB The World’s Most Popular Open Source Database 31
  • 32. Ranking Ecosystem Matches • What about finding our “closest” ecosystem matches • We can “rank” other users based on whether they have tagged items a number of times similar to ourselves SELECT others.user_id , (COUNT(*) ­ my_tags.num_tags) AS rank FROM UserTagPost others  INNER JOIN ( SELECT tag_id, COUNT(*) AS num_tags  FROM UserTagPost WHERE user_id = @my_user_id GROUP BY tag_id HAVING COUNT(tag_id) >= @radius ) AS my_tags ON others.tag_id = my_tags.tag_id GROUP BY others.user_id ORDER BY rank DESC; Copyright MySQL AB The World’s Most Popular Open Source Database 32
  • 33. Ranking Ecosystem Matches Efficiently • But... we've still got our COUNT(*) problem • How about another summary table ... CREATE TABLE UserTagStat ( user_id INT UNSIGNED NOT NULL , tag_id INT UNSIGNED NOT NUL , num_posts INT UNSIGNED NOT NULL , PRIMARY KEY (user_id, tag_id) , INDEX (tag_id) ) ENGINE=InnoDB; Copyright MySQL AB The World’s Most Popular Open Source Database 33
  • 34. Ranking Ecosystem Matches Efficiently 2 • Hey, we've eliminated the aggregation! SELECT others.user_id , (others.num_posts ­ my_tags.num_posts) AS rank FROM UserTagStat others  INNER JOIN ( SELECT tag_id, num_posts FROM UserTagStat WHERE user_id = @my_user_id AND  num_posts >= @radius ) AS my_tags ON others.tag_id = my_tags.tag_id ORDER BY rank DESC; Copyright MySQL AB The World’s Most Popular Open Source Database 34
  • 35. Scaling Out Sensibly Copyright MySQL AB The World’s Most Popular Open Source Database 35
  • 36. MySQL Replication (Scale Out) • Write to one master Web/App Web/App Web/App • Read from many slaves Server Server Server • Excellent for read intensive apps Load Balancer Reads Reads Writes & Reads … Master Slave Slave MySQL MySQL MySQL Server Server Server Replication Copyright MySQL AB The World’s Most Popular Open Source Database 36
  • 37. Scale Out Using Replication • Master DB stores all writes • Master has InnoDB tables • Slaves handle aggregate reads, non-realtime reads • Web servers can be load balanced (directed) to one or more slaves • Just plug in another slave to increase read performance (that's scaling out...) • Slave can provide hot standby as well as backup server Copyright MySQL AB The World’s Most Popular Open Source Database 37
  • 38. Scale Out Strategies • Slave storage engine can differ from Master! – InnoDB on Master (great update/insert/delete performance) – MyISAM on Slave (fantastic read performance and well as excellent concurrent insert performance, plus can use FULLTEXT indexing) • Push aggregated summary data in batches onto slaves for excellent read performance of semi-static data – Example: “this week's popular tags” • Generate the data via cron job on each slave. No need to burden the master server • Truncate every week Copyright MySQL AB The World’s Most Popular Open Source Database 38
  • 39. Scale Out Strategies (cont'd) • Offload FULLTEXT indexing onto a FT indexer such as Apache Lucene, Mnogosearch, Sphinx FT Engine, etc. • Use Partitioning feature of 5.1 to segment tag data across multiple partitions, allowing you to spread disk load sensibly based on your tag text density • Use the MySQL Query Cache effectively – Use SQL_NO_CACHE when selecting from frequently updated tables (ex: TagStat) – Very effective for high-read environments: can yield 200-250% performance improvement Copyright MySQL AB The World’s Most Popular Open Source Database 39
  • 40. Questions? MySQL Forge http://forge.mysql.com/ MySQL Forge Tag Schema Wiki pages http://forge.mysql.com/wiki/TagSchema PlanetMySQL http://www.planetmysql.org Jay Pipes (jay@mysql.com) MySQL AB Copyright MySQL AB The World’s Most Popular Open Source Database 40