[BrightonSEO 2019] Restructuring Websites to Improve Indexability



[BrightonSEO April 2019] This talk shares the detailed methodology used to restructure a job aggregator site that, like many large websites, had huge indexability problems, along with the rules used to direct robot crawling, which took the site from over 2.5 million URLs to just 20,000. It covers crawling and indexing issues, then dives into the case study, aided by flow charts, to explain the full approach used to restructure the site and how to implement it.

[BrightonSEO 2019] Restructuring Websites to Improve Indexability

  2. 2.
  3. 3. I’m here to talk to you about a framework that I came up with to fix a client’s indexability challenge
  4. 4. But more importantly… (And I only realised this after I’d finished preparing this talk)
  5. 5. I’m also here to talk to you about how “Technical Problems are People Problems”
  6. 6. I first met this client back in 2017
  7. 7. They’re a job aggregator site
  8. 8. My CEO introduced me, saying: “Areej is the best Tech SEO you’ll find.”
  9. 9. Needless to say, the pressure was on!
  10. 10. They wanted to work with us because we’re known to produce cool creative campaigns
  11. 11. And we’re good at outreach and link building
  12. 12. But we said that from some quick analysis, it was too early for that…
  13. 13. “Your website is struggling”
  14. 14. Here’s where you are…
  15. 15. 89% decrease in organic visibility
  16. 16. Organic Traffic: 2017 = X, 2016 = Y, Change = -45.6%
  17. 17. You barely rank…
  18. 18. And your site is so massive; we struggled to crawl it
  19. 19. On the other hand, here’s where your competitors are…
  20. 20. To make this work, we need to fix the fundamentals first
  21. 21. We need to fix your tech
  22. 22. So, let’s spend 6 months purely on tech
  23. 23. Once that’s sorted, then we can work on creative campaigns together
  24. 24. This talk is about my 18-month relationship with this client
  25. 25. It’s about what worked, what didn’t work and what I would have done differently
  26. 26. The Initial Findings
  27. 27. I started working on a comprehensive audit: Links, Tech, Content
  28. 28. I ended up with a 70-page document
  29. 29. There was a total of 50 recommendations
  30. 30. Some of the main things included were…
  31. 31. 72% of backlinks came from only 3 referring domains
  32. 32. Their on-page content was full of duplication and missing the basics
  33. 33. There were NO sitemaps
  34. 34. Canonical tags were not set up correctly
  35. 35. And their internal linking structure was a nightmare
  36. 36. Every recommendation was outlined using a traffic light system
  37. 37. We had a half-day audit handover meeting where I walked them through all of our recommendations
  38. 38. Everyone was in good spirits
  39. 39. Yet I couldn’t help but feel it was not enough…
  40. 40. Something was missing
  41. 41. Everything I recommended up till now was solid
  42. 42. But I had a gut feeling that due to the nature of their site…
  43. 43. These recommendations wouldn’t quite cut it
  44. 44. I had to go back to the drawing board
  45. 45. The Supplementary Findings
  46. 46. Also Known As…
  47. 47. The Findings I Should’ve Found The First Time Round But Didn’t So I’m Choosing To Call It Supplementary Findings To Sound Like An Expert.
  48. 48. They’re a job aggregator site – in essence, they’re a job search engine
  49. 49. That means that every single search conducted could, potentially, be crawled and/or indexed if it wasn’t built right
  50. 50. That’s equivalent to an infinite number of potential searches!
  51. 51. I needed to go back to basics
  52. 52. In my first six months in SEO, I used the terms ‘Crawling’ and ‘Indexing’ interchangeably
  53. 53. Until someone on Twitter called me out on it, and said…
  54. 54. “You do realise they’re not the same thing, right?”
  55. 55. So let’s make sure we’re all on the same page
  56. 56. Crawling (V) The process of gathering information from billions of webpages through their link structure.
  57. 57. Indexing (V) To add crawled webpages to the search engine index.
  58. 58. Which part was their site not getting right?
  59. 59. I knew that it was impossible to crawl
  60. 60. The one time I tried to fully crawl the site, it returned over 2.5 million URLs
  61. 61. I could only crawl it by excluding massive sections
  62. 62. And if their pages couldn’t be crawled, then they would never be indexed properly…
  63. 63. And they won’t rank
  64. 64. So there were three problems that needed to be fixed
  65. 65. Crawling, Indexing, Ranking
  66. 66. It was apparent that there were no rules in place to help direct robots
  67. 67. This can create a potentially-unlimited number of URLs
  68. 68. Google was wasting crawl budget by crawling duplicate thin pages and attempting to index them
  69. 69. The ‘Aha!’ Moment
  70. 70. It was apparent that there were no rules in place to help direct robots
  71. 71. I needed to create a customised framework that instructs bots on what to do and what not to do
  72. 72. The job aggregator industry seemed to be doing one of two things
  73. 73. Limiting indexable pages → miss out on ranking opportunity
  74. 74. Limiting indexable pages → miss out on ranking opportunity; not limiting indexable pages → wind up with reduced link equity
  75. 75. My framework was going to do neither
  76. 76. Instead it would use search volume data to determine which pages are valuable to search engines
  77. 77. So how exactly would this work?
  78. 78. It starts off by passing the search query through a keyword combination script
  79. 79. This script outputs different combinations for the search conducted
  80. 80. It does that by changing the order of keywords to see all possible combinations
  81. 81. Digital Marketing Manager London
  82. 82. Digital Marketing Manager London / London Digital Marketing Manager / Digital Marketing London Manager
  83. 83. 4! = 24 combinations
  84. 84. These combinations will increase based on the search query and filters applied
  85. 85. Even though Google will regard most of these combinations as the same…
  86. 86. The script will help avoid duplicating pages that have different versions of the same thing
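
The reordering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used on the client's site: the function name `keyword_combinations` and the decision to lowercase the query and split it on whitespace are hypothetical choices.

```python
from itertools import permutations


def keyword_combinations(query: str) -> list[str]:
    """Return every distinct ordering of the words in a search query."""
    words = query.lower().split()
    # A set drops duplicate orderings when a word appears twice in the query.
    unique = {" ".join(order) for order in permutations(words)}
    return sorted(unique)


combos = keyword_combinations("Digital Marketing Manager London")
print(len(combos))  # 24, i.e. 4! orderings of four distinct words
```

Because the count grows factorially with the number of words, a real implementation would probably want to cap the query length before generating orderings.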
  87. 87. It then searches the database to see if this job is available
  88. 88. If (Job ≠ Available): load a page that says so, and set it to no-index and no-crawl
  89. 89. If (Job = Available): search the keyword database for all keyword combinations from the script
  90. 90. Fetch search volume data for these keyword combinations
  91. 91. For search volume data, we recommended using an API
  92. 92. If (SearchVolume > 50): a page that is both indexed and crawled is created using the keyword combination with the highest search volume
  93. 93. If (SearchVolume < 50): load the page for users but no-index and no-crawl it
  94. 94. Your search volume cut-off can be updated at any time and is based on what makes sense for your industry
  95. 95. There is always the possibility of errors occurring
  96. 96. What if there’s a tie-breaker? (Several keyword combinations having the same search volume)
  97. 97. Create an indexed crawlable page based on the keyword used in the user search query
  98. 98. What if the API was down? (And you’re unable to generate search volume data)
  99. 99. Load a no-index no-crawl page for usability and don’t store the query in the database
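
The decision flow above, including the tie-breaker and API-down cases, might look roughly like this. It is a sketch under assumptions rather than the implemented framework: `PageDecision`, `decide_page` and `fetch_volume` are hypothetical names, a volume of exactly 50 is treated as below the cut-off because the slides do not cover it, and `store_query` is only specified by the talk for the API-down case (its value elsewhere is an assumption).

```python
from dataclasses import dataclass
from typing import Callable

SV_CUTOFF = 50  # the talk's example cut-off; adjust per industry


@dataclass
class PageDecision:
    keyword: str       # keyword the page is built around
    indexable: bool    # False means serve it as no-index / no-crawl
    store_query: bool  # whether to save the query in the keyword database


def decide_page(query: str, combinations: list[str], job_available: bool,
                fetch_volume: Callable[[str], int]) -> PageDecision:
    # No matching job: show users a page saying so, keep bots away.
    if not job_available:
        return PageDecision(query, indexable=False, store_query=False)
    try:
        volumes = {kw: fetch_volume(kw) for kw in combinations}
    except Exception:
        # Search volume API is down: usable page for the visitor,
        # no-index / no-crawl, and the query is not stored.
        return PageDecision(query, indexable=False, store_query=False)
    best = max(volumes.values(), default=0)
    if best <= SV_CUTOFF:
        return PageDecision(query, indexable=False, store_query=True)
    top = [kw for kw, volume in volumes.items() if volume == best]
    # Tie-breaker: fall back to the wording the user actually searched for.
    keyword = top[0] if len(top) == 1 else query
    return PageDecision(keyword, indexable=True, store_query=True)


volumes = {"london marketing manager": 90, "marketing manager london": 320}
decision = decide_page("london marketing manager", list(volumes), True, volumes.get)
print(decision.keyword, decision.indexable)  # marketing manager london True
```

Passing the volume lookup in as a function keeps the cut-off logic testable without calling a live search volume API.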
  100. 100. But I still knew that this was not enough…
  101. 101. What about their internal linking structure?
  102. 102. This was the status of their header
  103. 103. This is what we recommended
  104. 104. We also provided internal linking recommendations for their footer and job advertisement pages
  105. 105. And an exact breakdown of their filter system on job search pages
  106. 106. As for their sitemaps – they had none
  107. 107. We recommended creating and splitting them up
  108. 108. Blog Sitemap, Job Advertisements Sitemap, Job Results Sitemap, Ancillary Sitemap
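
One common way to split sitemaps like this is a sitemap index file that points at one child sitemap per section. The sketch below builds such an index with Python's standard library; the domain and the four file names are placeholders chosen for illustration, not the client's actual URLs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url: str, sitemap_files: list[str]) -> str:
    """Build a sitemap index with one <sitemap> entry per child file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in sitemap_files:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url.rstrip('/')}/{name}"
    return ET.tostring(root, encoding="unicode")


index_xml = build_sitemap_index(
    "https://www.example.com",  # placeholder domain
    [
        "sitemap-blog.xml",         # hypothetical file names, one per
        "sitemap-job-ads.xml",      # sitemap type named in the talk
        "sitemap-job-results.xml",
        "sitemap-ancillary.xml",
    ],
)
print(index_xml)
```

Splitting by section also makes it easier to see, per sitemap, how many submitted URLs actually end up indexed.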
  109. 109. The final step was to help them sort out their content
  110. 110. Even if they were only indexing high search volume pages, their content was very thin
  111. 111. Their chances to rank would still be minimal
  112. 112. They had the same H1, Title Tag & Meta Description for every filtered indexed page
  113. 113. This was creating thin pages and partially duplicated content
  114. 114. Most competitors automatically generate optimised meta tags
  115. 115. They needed to do the same for indexable filter pages
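
Automatically generated meta tags for indexable filter pages could be as simple as filling a template from the active filters. The following is a minimal sketch; the `FilterPage` fields, the site name `ExampleJobs` and the wording of each template are all hypothetical and would need to come from keyword research.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class FilterPage:
    job_title: str
    location: str
    job_count: int


def meta_tags(page: FilterPage, site_name: str = "ExampleJobs") -> dict[str, str]:
    """Build a unique title, H1 and meta description from the page's filters."""
    tags = {
        "title": f"{page.job_title} Jobs in {page.location} | {site_name}",
        "h1": f"{page.job_title} Jobs in {page.location}",
        "description": (
            f"Browse {page.job_count} {page.job_title} jobs in {page.location}. "
            f"Find and apply for the latest vacancies on {site_name}."
        ),
    }
    # Escape values so filter text cannot break the surrounding HTML.
    return {name: escape(value) for name, value in tags.items()}


tags = meta_tags(FilterPage("Finance Manager", "Liverpool", 42))
print(tags["title"])  # Finance Manager Jobs in Liverpool | ExampleJobs
```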
  116. 116. They also needed to create pages targeting specific category search terms
  117. 117. Other than a company page and a handful of blog posts, there were no core content pages
  118. 118. So we performed in-depth keyword research and opportunity analysis to see what content generates traffic
  119. 119. And we provided a content audit and strategy to go with it
  120. 120. My ‘Aha!’ Moment felt complete
  121. 121. I was happy, I was pleased, I was ecstatic!
  122. 122. This was the piece of the puzzle that was missing
  123. 123. Now all that was left was for the client to implement it!
  124. 124. The Implementation Phase
  125. 125. Four months later, the client confirmed that they implemented everything
  126. 126. By that point, we had started working on creative campaigns and outreach
  127. 127. The first thing that caught my attention was their site went from over 2.5M crawled pages to 20K
  128. 128. Which initially seemed like good news…
  129. 129. Until I realised that their traffic had declined…
  130. 130. Remember this? Organic Traffic: 2017 = X, 2016 = Y, Change = -45.6%
  131. 131. Organic Traffic: 2018 = X, 2017 = Y, 2016 = Z, Change = -85.7%
  132. 132. The Mini Audit Findings
  133. 133. Also Known As…
  134. 134. The Findings I’m Rushing To Find In A Panic To Prove That They Haven’t Implemented My Recommendations Accurately, Hence I Am Still An Expert.
  135. 135. I went through the original list of 50 recommended actions
  136. 136. At that point, there were 29 that had not been implemented
  137. 137. I also discovered ten new issues that were affecting their indexability
  138. 138. Google was choosing to only index 20% of the submitted sitemap
  139. 139. Crawling is one of the most expensive things that Google do
  140. 140. Crawl budget is precious – it should not be wasted
  141. 141. Googlebot will choose to visit their site less if their indexability is not in check
  142. 142. The client had added canonical tags and felt that was enough
  143. 143. They’re relying on Googlebot to: 1) Crawl the pages
  144. 144. They’re relying on Googlebot to: 1) Crawl the pages 2) Find the canonical tags
  145. 145. They’re relying on Googlebot to: 1) Crawl the pages 2) Find the canonical tags 3) Then choose to ignore them
  146. 146. Canonical tags are simply hints for bots
  147. 147. Google might decide to ignore the tags and pick other pages to index
  148. 148. Canonical tags do not preserve crawl budget
  149. 149. Almost 80K pages had been indexed despite not being submitted via the sitemap
  150. 150. And some pages were included in the sitemap but not indexed
  151. 151. /job/finance-manager-liverpool
  152. 152. /job/finance-manager-liverpool?index=57271b7c~4&utm_campaign=job-detail&utm_source=search-result&utm_medium=website
  153. 153. Because this page was the one discovered via internal links
  154. 154. Over 5K similar pages with parameters were getting indexed
  155. 155. Your main goal is to maximise crawl budget
  156. 156. You cannot use canonical tags as a sticking plaster to fix that
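
Since the parameterised URL was the version discovered through internal links, one possible fix (not stated in the talk, so treat it as a suggestion) is to strip non-content parameters before links are rendered. The sketch below uses Python's `urllib.parse`; the set `NON_CONTENT_PARAMS` is an assumption that `index` and the `utm_*` parameters never change what the page shows.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only carry tracking or session state.
NON_CONTENT_PARAMS = {"index", "utm_campaign", "utm_source", "utm_medium"}


def clean_internal_link(url: str) -> str:
    """Return the URL without non-content query parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CONTENT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


link = (
    "/job/finance-manager-liverpool?index=57271b7c~4&utm_campaign=job-detail"
    "&utm_source=search-result&utm_medium=website"
)
print(clean_internal_link(link))  # /job/finance-manager-liverpool
```

Linking only to the clean URL means bots never need to crawl the parameterised version in the first place, which is the crawl budget saving that canonical tags alone cannot provide.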
  157. 157. This implementation was still incomplete
  158. 158. I had to change my way of conveying this message
  159. 159. I put a stop to the endless stream of emails and scheduled a face-to-face meeting
  160. 160. We reviewed each and every single remaining task and discussed them in detail
  161. 161. We re-prioritised tasks and set estimated completion dates
  162. 162. It was not an easy meeting but it felt productive
  163. 163. Where are we now?
  164. 164. On a personal level, I discovered that I suffer from Imposter Syndrome
  165. 165. This was my constant state of mind
  166. 166. I was working closely with the CTO throughout this project
  167. 167. I felt he didn’t trust me or my knowledge
  168. 168. No matter how much I tried to assure him that he’s in safe hands
  169. 169. The fact of the matter is, this site is his baby
  170. 170. And I was constantly attacking his baby
  171. 171. So, what would I have done differently?
  172. 172. If I could go back in time, I would realise what the actual problem was
  173. 173. All technical problems are people problems
  174. 174. The SEO recommendations were solid
  175. 175. Getting it implemented was the hard part
  176. 176. As a tech SEO, the most you can do is to influence priorities
  177. 177. You have no control
  178. 178. In this instance, I didn’t manage to persuade him to implement the recommendations
  179. 179. I also learned that the way I’ve been doing SEO audits is plain wrong
  180. 180. I always focused on delivering a set of comprehensive actions
  181. 181. Instead, maybe I should just deliver a SINGLE recommendation
  182. 182. And once that’s implemented…
  183. 183. Then, and only then, will I recommend another
  184. 184. And maybe I shouldn’t recommend Nice-To-Do’s…
  185. 185. Until there are only Nice-To-Do’s left to do
  186. 186. Because they are simply a distraction from the main problem
  187. 187. Over the past year, we created 6 creative campaigns for them
  188. 188. These campaigns generated 261 links and had 1.6M estimated views
  189. 189. We got coverage on all of these sites and more…
  190. 190. Yet it barely had an impact…
  191. 191. On a daily basis, I was being asked: “Why aren’t our rankings improving?”
  192. 192. On a daily basis, I was constantly saying…
  193. 193. “The core reason your rankings aren’t improving is due to incomplete tech actions…”
  194. 194. “Let’s work together to get these implemented ASAP as they will directly impact your traffic and rankings.”
  195. 195. This talk doesn’t have a happy ending…
  196. 196. It isn’t a successful case study…
  197. 197. It isn’t a successful case study… YET
  198. 198. There’s no upward visibility graph and page one rankings for me to show off…
  199. 199. There’s no upward visibility graph and page one rankings for me to show off… YET
  200. 200. This talk is about real life…
  201. 201. It’s about a long-overdue project where I learned a lot about working alongside CTOs
  202. 202. It’s about how I created a framework for fixing indexability issues, one that I’m proud of and that I know in my gut *works*
  203. 203. So I wanted to share it with you
  204. 204. Because I can see this applied across many sites in plenty of industries
  205. 205. And I’d love for YOU to implement it
  206. 206. So I’m going to share my full methodology with you
  207. 207.
  208. 208. And just remember…
  209. 209. Getting the basics right is so fundamental
  210. 210. If you can do nothing else, just do the tech.
  211. 211. Areej AbuAli, Head of SEO | Verve Search. Slides: Questions: Tweet things @areej_abuali