
Migration from SQL to MongoDB - A Case Study at TheKnot.com

8 out of 10 couples use TheKnot.com to help plan their wedding. A key part of planning involves selecting articles, photographs, and other resources and storing them in the user's Favorites. Recently we migrated major parts of our technology stack to open source technologies. As part of our migration strategy, we zeroed in on MongoDB, since it better suited our requirements for speed and data structure and eliminated the need for a caching layer. The transition required a period in which both our legacy and new APIs were working concurrently, with data being persisted in both databases (SQL and Mongo) and all records being synced with every request. We resorted to many strategies and applications to achieve this goal, including Pentaho, AWS SQS and SNS, a queue messenger system, and some proprietary Ruby gems. In this session we will review our strategy and some of the lessons we learned about successfully migrating with zero downtime.

Published in: Technology

Migration from SQL to MongoDB - A Case Study at TheKnot.com

  1. 1. | 1 © 2014 XO GROUP INC. ALL RIGHTS RESERVED.
  2. XO Group Inc. Membership and Community Team
       Alexander Copquin - Senior Software Engineer
       Vladimir Carballo - Senior Software Engineer
  3. Favorites API Re-platforming …a case study
  4. Favorites API Re-platforming
       • Architectures: SQL .NET / Ruby Mongo
       • Reasons for migration
       • Schema design
       • RoR model design and implementation
       • Migration strategies and systems
       • Lessons learned
  5. Our Favorites Feature
  6. Favorites API: Features
       • Add / Edit / Delete Object
       • Manage Boards
       • Get counts & stats
       • RESTful API
       • Rails
       • JavaScript
       • iOS
       • Android
  7. Stats
       • 100,000,000+ “favorited” objects
       • 760,000+ boards
       • Avg. 55,000 new objects per day
  8. Legacy Architecture
  9. Legacy Benchmarks
       • Database 55 GB and growing…
       • Avg. 45 rpm at peak times
       • Avg. 80 ms response for POST
       • Avg. 460 ms response for GET
  10. Maxed
       • DB reaching max. capacity for its setup
       • Scalability problems
       • Hard to modify schema
       • Bad response times
       • Very complex caching layer
       • Out of line with company’s strategy
  11. New Architecture
  12. Scalable
       • Easy to scale
       • Flexible schema
       • Fast response
       • No cache layer
       • Fast iteration / deploy
       • TDD first and foremost
       • At-a-glance monitoring of all layers
  13. Implementation
  14. What we persisted in the legacy schema
       • UserId (primary key)
       • UniqueId
       • Url (unique per user)
       • ImageUrl
       • Name
       • Description
       • ObjectId (unique per application adding favorites)
       • Category
       • Timestamps
       • Other
  15. Favorites DB Legacy Schema
  16. Sample queries
       select top 10 UserFavoriteId, Name, Description, Url, ImageUrl
       from userFavorites
       where userId = '5174181997807393'
  17. Sample queries
       select top 5 grp.groupId, grp.Name as GroupName, fav.userFavoriteId,
              fav.name, fav.Description, fav.Url, fav.ImageUrl
       from userFavoritesGroups grp
       inner join userFavoritesGroupsItems grpItm on grp.GroupId = grpItm.GroupId
       inner join userFavorites fav on grpItm.userFavoriteId = fav.userFavoriteId
       where grp.userId = '5174181997807393'
  18. Towards a new Schema and Persistence Layer
       • Start with a clean slate
       • Break with the past
       • Persist only relevant minimum data points
       • Think and rethink relationships
       • High performance
       • Flexible
       • Prototype different scenarios
  19. First attempt
  20. UserFavorites
  21. Cons
       • Document contains embedded documents that need to be accessed on their own
       • Documents would grow without bound
       • Most queries would be slow
       • Indexes would be very expensive
       • Tries too hard to imitate legacy
  22. Second attempt
  24. Board document with one recent favorite
  25. Board document with more recent favorites
  27. Favorite document located on different boards
  28. Pros
       • Document structure matches the data required by the view
       • A Board document includes the 4 most recent favorites
       • A Favorite document includes the list of boards it was added to
       • Faster queries
       • More control over the size of each document
       • Better implementation of UX intent
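
The two-collection layout described on the Pros slide can be sketched with plain Python dictionaries. This is only an illustration under assumptions: the field names `member`, `boards`, `application`, `default_board`, and `name` come from the deck's sample queries, while `recent_favorites`, `url`, and the helper functions are hypothetical names introduced here, not the team's actual Mongoid models.

```python
import uuid

MAX_RECENT = 4  # the slide states a Board embeds its 4 most recent favorites


def new_board(member, name, default_board=False):
    """Build a Board document; 'recent_favorites' is a hypothetical field name."""
    return {
        "_id": str(uuid.uuid4()),  # UUID instead of a native ObjectId
        "member": member,
        "name": name,
        "default_board": default_board,
        "recent_favorites": [],
    }


def new_favorite(member, url, name, application):
    """Build a Favorite document that lists the boards it belongs to."""
    return {
        "_id": str(uuid.uuid4()),
        "member": member,
        "url": url,
        "name": name,
        "application": application,
        "boards": [],
    }


def add_to_board(board, favorite):
    """Link a favorite to a board and refresh the board's capped preview list."""
    if board["_id"] not in favorite["boards"]:
        favorite["boards"].append(board["_id"])
    summary = {"_id": favorite["_id"], "name": favorite["name"], "url": favorite["url"]}
    # Drop any older copy of this favorite, put the new one first, keep 4 entries.
    others = [f for f in board["recent_favorites"] if f["_id"] != favorite["_id"]]
    board["recent_favorites"] = ([summary] + others)[:MAX_RECENT]
```

Capping the embedded list is what keeps each Board document bounded in size. Against a real MongoDB this kind of cap is commonly expressed with `$push` combined with `$each` and `$slice`; the deck does not say whether the team did it that way.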
  29. Sample queries
       db.favorites.find({'member': 'e1606ed5-4ac8-48b4-aee6-bc4203937903'}).limit(1)
       db.favorites.find({'boards': '7557acf8-b7b1-4eab-a64d-57449034cfc6'}).limit(1)
       db.favorites.find({'application': 'marketplace'}).limit(1)
       db.boards.find({'member': 'e1606ed5-4ac8-48b4-aee6-bc4203937903'}).limit(1)
       db.boards.find({'member': 'e1606ed5-4ac8-48b4-aee6-bc4203937903', 'default_board': true})
       db.boards.find({'name' : 'Simple Reception Decor'}).limit(1)
  30. Some implementation details
       ● Rails web application framework
       ● We speak RoR and JS
       ● MongoDB as a data repository (we love NoSQL)
       ● Two collections, one for Boards and one for Favorites
       ● No joins, no foreign keys
       ● Referential integrity is handled in a different fashion
       ● Mongoid gem (pros & cons)
  31. Favorites re-platform
  32. Board class
  34. Favorite class
  36. Scaling reads with replica sets
  37. Scaling reads with sharding
  38. Migration
  39. Clients switchover
       [Diagram labels: four Clients, Legacy API, New API, ONE WAY]
  40. Migration Timeline
       [Timeline diagram labels: Development, new API, Bulk Migr., Continuous Migr., Implement Monitors, Bulk Migr., Turn on Continuous Migration, Data Catch-up, Plug Clients]
  41. Bulk Migration ETL
  42. Bulk Migration
       SQL Tables                                       Mongo Collections
       UserFavorites                                    Favorites
       UserFavoritesGroups, UserFavoritesGroupsItems    Boards
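
The table-to-collection mapping on the Bulk Migration slide can be illustrated in a few lines of Python. The team ran this step in Pentaho, so the code below is not their ETL; it is a sketch of the same mapping, assuming each SQL row already carries a pre-assigned UUID in a column named `Uuid` (a hypothetical name) and copying `UserId` into `member` unchanged, which may differ from how member ids were really translated.

```python
def build_documents(user_favorites, groups, group_items):
    """Turn rows from the three SQL tables into Favorites and Boards documents."""
    # GroupId (auto-increment) -> UUID assigned to the board before migration.
    board_uuid = {g["GroupId"]: g["Uuid"] for g in groups}

    # Collect, per favorite, the UUIDs of every board it was added to.
    boards_by_favorite = {}
    for item in group_items:
        boards_by_favorite.setdefault(item["UserFavoriteId"], []).append(
            board_uuid[item["GroupId"]]
        )

    favorites = [
        {
            "_id": f["Uuid"],
            "member": f["UserId"],
            "name": f["Name"],
            "url": f["Url"],
            "boards": boards_by_favorite.get(f["UserFavoriteId"], []),
        }
        for f in user_favorites
    ]
    boards = [
        {"_id": g["Uuid"], "member": g["UserId"], "name": g["Name"]}
        for g in groups
    ]
    return favorites, boards
```

The join table disappears in the output: its rows become the `boards` array on each Favorite document, which is why no joins are needed on the Mongo side.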
  43. Favorites Job
  44. Pentaho Steps
  45. Continuous Migration: Auto-increment Id vs. UUID
       Auto-increment Ids: UserFavoriteId, GroupId
       UUIDs: Favorites UUID, Groups UUID
  46. Migration Systems: Get UUID from the Get Go
       • Add a column to the legacy DB (100M+ recs!!) with the new Mongo UUID
       • Then migrations will take care of inserting into new documents
       SQL has all new ids (xxxxx-xxxx-xxxx); Mongo ids are inserted
  47. Add UUID Columns in SQL (100 M recs!!!)
       • Alter table add UUID uniqueidentifier
       • New Favs temp table with UUID:
           SELECT *, uuid = NEWID() INTO NewUserFavorites FROM UserFavorites
       • Add indexes
       • Rename & drop
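
The copy-then-swap steps listed on this slide (new table with a UUID column, add indexes, rename and drop) can be tried out on a toy table. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the legacy SQL Server database; SQLite has no `NEWID()`, so a user-defined function named `new_uuid` (an assumption of this example) takes its place, and the table columns and index name are illustrative.

```python
import sqlite3
import uuid


def add_uuid_column_by_copy(conn):
    """Rebuild UserFavorites with an extra uuid column, then swap the tables."""
    # Stand-in for SQL Server's NEWID(): called once per copied row.
    conn.create_function("new_uuid", 0, lambda: str(uuid.uuid4()))
    conn.execute(
        "CREATE TABLE NewUserFavorites AS "
        "SELECT *, new_uuid() AS uuid FROM UserFavorites"
    )
    # The copy carries no keys or indexes, so they are added afterwards.
    conn.execute(
        "CREATE UNIQUE INDEX idx_userfavorites_uuid ON NewUserFavorites (uuid)"
    )
    conn.execute("DROP TABLE UserFavorites")
    conn.execute("ALTER TABLE NewUserFavorites RENAME TO UserFavorites")
    conn.commit()


def demo():
    """Create a three-row toy table, migrate it, and return (id, uuid) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE UserFavorites ("
        "UserFavoriteId INTEGER PRIMARY KEY, UserId TEXT, Name TEXT, Url TEXT)"
    )
    conn.executemany(
        "INSERT INTO UserFavorites (UserId, Name, Url) VALUES (?, ?, ?)",
        [
            ("u1", "Cake", "http://example.com/cake"),
            ("u1", "Dress", "http://example.com/dress"),
            ("u2", "Venue", "http://example.com/venue"),
        ],
    )
    add_uuid_column_by_copy(conn)
    return conn.execute(
        "SELECT UserFavoriteId, uuid FROM UserFavorites ORDER BY UserFavoriteId"
    ).fetchall()
```

Calling `demo()` returns one `(UserFavoriteId, uuid)` pair per original row, each with a distinct UUID. Once every legacy row has its UUID, the bulk and continuous migrations can both refer to the same identifier.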
  48. Facts
       • SQL needed some sanitation
       • SQL prep scripts: approx. 4 h
       • Pentaho ETL on local workstation: 8 h
       • Restore into production Mongo cluster: 4 h
  49. We’ve got data!!
  50. Continuous Migration Architecture
       [Diagram labels: Clients, Legacy API, New API, SQS Queue, Messenger, ONE WAY SYNC…]
  51. Continuous Migration: Favorites Legacy Messenger
       • Ruby
       • Consumer of an SQS queue fed by Legacy, which generates 1 message per operation
       • Issues an API call to the new app for each operation
       • Runs as a worker in the background
       Legacy API → SQS → Legacy Messenger → New API → Mongo
  52. SQS Queue is not a FIFO Friend
       Sent by Legacy:        1 2 3 4 5 6 7
       Consumed by Messenger: 5 3 1 2 4 6 7
  53. Challenges
       • Queue is not FIFO
       • Referenced objects may not exist yet
       • Queue bloats fast
       • Syncing can fall behind real time
       • Data is different
  54. Solutions
       • Verify that the entity exists (API call); otherwise, throw the message back in the queue
       • Set message expiration
       • Sanitize data
       • Run multiple workers to achieve near-real-time syncing
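
The first three solutions above (check that the entity exists, requeue if it does not, expire messages that never become valid, sanitize data) can be shown with a small in-memory simulation. This is a sketch and not the team's Ruby messenger: it makes no AWS calls, the queue is a `deque`, the operation names `create_board` and `add_favorite` are invented for the example, and an attempt counter stands in for message expiration.

```python
from collections import deque

MAX_ATTEMPTS = 5  # stand-in for "message expiration"; the real limit is not given


def sanitize(payload):
    """Trim stray whitespace from string fields before applying a message."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in payload.items()}


def consume(messages, store, max_attempts=MAX_ATTEMPTS):
    """Apply out-of-order messages to `store`; return the messages that expired."""
    queue = deque({"attempts": 0, **m} for m in messages)
    dropped = []
    while queue:
        msg = queue.popleft()
        payload = sanitize(msg["payload"])
        if msg["op"] == "create_board":
            store["boards"][payload["board"]] = {"favorites": []}
        elif msg["op"] == "add_favorite":
            board = store["boards"].get(payload["board"])
            if board is None:
                # The board has not been created yet: retry later or give up.
                msg["attempts"] += 1
                if msg["attempts"] >= max_attempts:
                    dropped.append(msg)
                else:
                    queue.append(msg)
                continue
            board["favorites"].append(payload["favorite"])
    return dropped
```

A message that refers to a board created later in the stream succeeds on a retry, while one that refers to a board that never appears is dropped after `MAX_ATTEMPTS` tries instead of circulating forever, which is the queue-bloat problem named on the Challenges slide.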
  55. Take away
       • Favor a simple document structure
       • Try different schema paradigms
       • Bypass native ObjectId generation in favor of UUID
       • Break with the past
       • Queues can be deceiving
       • Gems can simplify the application layer implementation
       • Manage referential integrity in the application layer
       • No cache required
  56. New Benchmarks
       • Avg. 85 rpm at peak times
       • Avg. 58 ms response for POST
       • Avg. 18 ms response for GET
  57. New vs. Legacy: Good
       • Overall performance increase
       • 18 ms vs. 460 ms for GET
       • 58 ms vs. 80 ms for POST
       • Easy schema changes
       • Scalable
       • Simpler architecture
       • No cache layer
       • Fast code iteration, testing and deployment
       • In line with company’s technology strategy
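
To put the quoted averages in proportion, the short calculation below derives speedup factors and latency reductions from the two benchmark slides. Only the millisecond figures come from the deck; the variable and function names are illustrative.

```python
# Average response times in milliseconds, as quoted on the benchmark slides.
LEGACY_MS = {"GET": 460, "POST": 80}
NEW_MS = {"GET": 18, "POST": 58}


def speedup(method):
    """How many times faster the new API responds for the given HTTP method."""
    return LEGACY_MS[method] / NEW_MS[method]


def latency_reduction(method):
    """Fraction of the legacy response time that was eliminated."""
    return 1 - NEW_MS[method] / LEGACY_MS[method]
```

With these inputs, GET comes out roughly 25.6 times faster (about a 96% reduction in average response time) and POST roughly 1.38 times faster (a 27.5% reduction), so the gain is concentrated on the read path.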
  58. Acknowledgments
       • Dmitri Nesterenko
       • Jason Sherman
       • Nelly Santoso
       • Phillip Chiu
       • Sean Lipkin
       • George Taveras
       • Alison Fay
       • Diana Taykhman
       • Rajendra Prashad
       • Josh Keys
       • Lewis DiFelice
  59. Contact, questions, inquiries? memcomtech@xogrp.com
