LOG FILE ANALYSIS 
The most powerful tool in your SEO toolkit 
Tom Bennet 
Consultant, Builtvisible 
@tomcbennet
Getting Started
What is a log file? 
A record of all hits that a server has received – humans and robots. 
http://www.brightonseo.com/about/ 
1. Protocol 
2. Host name 
3. File name 
Host name -> IP address via DNS -> Connection to server -> 
HTTP GET request via protocol for file -> HTML to browser
They’re not pretty…
…but they’re very powerful. 
188.65.114.122 - - [30/Sep/2013:08:07:05 -0400] "GET /resources/whitepapers/retail-whitepaper/ HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 
Server IP 
Timestamp (date & time) 
Method (GET / POST) 
Request URI 
HTTP status code 
User-agent
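A minimal Python sketch of pulling those fields out of a line. The pattern is an assumption inferred from the sample line alone (IP, two placeholder fields, timestamp, request, status, an optional byte count, referrer, user-agent); real server log formats vary, so treat the field names as illustrative.

```python
import re

# Assumed layout, inferred from the sample line in the slide. The byte-count
# group is optional because the sample line does not include one.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3})(?: (?P<bytes>\d+|-))? '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    '188.65.114.122 - - [30/Sep/2013:08:07:05 -0400] '
    '"GET /resources/whitepapers/retail-whitepaper/ HTTP/1.1" 200 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
fields = parse_line(sample)
```

Here `fields["uri"]` holds the request URI and `fields["status"]` the status code as a string; lines that do not fit the assumed layout come back as `None` rather than raising.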
Log Files & SEO
What is Crawl Budget? 
Crawl Budget = The number of URLs crawled on each visit to your site. 
Higher Authority = Higher Crawl Budget
Crawl Budget Utilisation 
http://example.com/thin-product-page-1 
http://example.com/category/thin-product-page-1 
http://example.com/category/subcategory/thin-product-page-1 
http://example.com/category/subcategory/thin-product-page-1?colour=blue 
Etc… 
Conservation of crawl budget is key.
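One rough way to spot this pattern in a list of crawled URLs is to group them by their final path segment. This is only a heuristic sketch: it assumes that URLs sharing a last segment serve the same content, which will not hold on every site, and `group_by_slug` is a hypothetical helper name.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_slug(urls):
    """Group URLs that share a final path segment, ignoring query strings."""
    groups = defaultdict(list)
    for url in urls:
        path = urlsplit(url).path.rstrip("/")
        slug = path.rsplit("/", 1)[-1]
        groups[slug].append(url)
    return dict(groups)


urls = [
    "http://example.com/thin-product-page-1",
    "http://example.com/category/thin-product-page-1",
    "http://example.com/category/subcategory/thin-product-page-1",
    "http://example.com/category/subcategory/thin-product-page-1?colour=blue",
]

# Keep only slugs reachable through more than one URL.
duplicates = {
    slug: hits for slug, hits in group_by_slug(urls).items() if len(hits) > 1
}
```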
Working With Logs
Preparing Your Data 
Extraction: Varies by server. See accompanying guide. 
Filter: By Googlebot user-agent, then validate the IP range, since a user-agent string can be spoofed. https://support.google.com/webmasters/answer/80553?hl=en 
Tools: Gamut and Splunk are great, but you can’t beat Excel.
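For the validation step, a common approach is a reverse DNS lookup on the requesting IP followed by a forward lookup on the returned host name. The sketch below assumes that genuine Googlebot hosts resolve under `googlebot.com` or `google.com`; check the linked Google page for the current list. The two lookup parameters are hypothetical hooks added so the function can be exercised without network access.

```python
import socket


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-DNS the IP, check the domain, then forward-confirm the host."""
    # Default to real DNS lookups; callers may inject stand-ins for testing.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or socket.gethostbyname
    try:
        host = reverse_lookup(ip)
        # Assumed domain list; see Google's documentation for the current one.
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # The host name must resolve back to the original IP.
        return forward_lookup(host) == ip
    except OSError:
        # Covers failed lookups (socket.herror and socket.gaierror).
        return False
```

Calling `is_verified_googlebot(ip)` with no extra arguments performs live DNS lookups, so on a large log it is worth running it once per unique IP rather than once per row.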
Working in Excel 
1. Convert .log to .csv (cool tip: just change the file extension) 
2. Sample size (60-120k Googlebot requests / rows is a good size) 
3. Text-to-columns (a space will usually be a suitable delimiter) 
4. Create a table (Label your columns, sort by timestamp)
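The same preparation can be scripted. The sketch below is not part of the Excel workflow itself; it uses Python's `csv` module to split on spaces while keeping double-quoted fields (request, user-agent) in one piece, then writes comma-separated text. The helper names are hypothetical. Note that the bracketed timestamp contains a space, so it lands in two columns, just as it would with a space delimiter in text-to-columns.

```python
import csv
import io


def log_to_rows(lines):
    """Split each log line on spaces, keeping double-quoted fields intact."""
    return list(csv.reader(lines, delimiter=" ", quotechar='"'))


def rows_to_csv(rows):
    """Serialise rows as comma-separated text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


sample_lines = [
    '188.65.114.122 - - [30/Sep/2013:08:07:05 -0400] '
    '"GET /resources/whitepapers/retail-whitepaper/ HTTP/1.1" 200 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
]
rows = log_to_rows(sample_lines)
csv_text = rows_to_csv(rows)
```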
Investigate
Most vs Least Crawled 
Formula: Use COUNTIF on Request URI. 
Tip: Extract top-level category for crawl distribution by site-section. 
http://www.brightonseo.com/speakers/person-name/
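Outside Excel, the COUNTIF step and the top-level category extraction might look like the sketch below. The request URIs are made-up sample data, and `top_level_section` is a hypothetical helper that treats the first path segment as the site section.

```python
from collections import Counter


def top_level_section(uri):
    """Return the first path segment, e.g. '/speakers/person-name/' -> 'speakers'."""
    path = uri.split("?", 1)[0]
    segments = [segment for segment in path.split("/") if segment]
    return segments[0] if segments else "(root)"


# Illustrative request URIs, not real log data.
requests = [
    "/speakers/person-name/",
    "/speakers/person-name/",
    "/speakers/",
    "/about/",
    "/",
]

hits_per_url = Counter(requests)  # equivalent of COUNTIF per URI
hits_per_section = Counter(top_level_section(uri) for uri in requests)
```

`hits_per_url.most_common()` lists URIs from most to least crawled, and `hits_per_section` gives the crawl distribution by site section.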
Crawl Frequency Over Time 
Formula: Pivot date against count of requests. 
Tip: Segment by site section or by user-agent (Googlebot Mobile, Images, Video, etc.).
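A scripted equivalent of the date pivot, assuming timestamps in the `30/Sep/2013:08:07:05 -0400` form shown in the sample log line and an English locale for the month abbreviation. The timestamps here are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def crawl_date(timestamp):
    """Parse a log timestamp and return its calendar date (in the logged offset)."""
    return datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z").date()


# Illustrative timestamps, not real log data.
timestamps = [
    "30/Sep/2013:08:07:05 -0400",
    "30/Sep/2013:21:15:42 -0400",
    "01/Oct/2013:02:03:04 -0400",
]

requests_per_day = Counter(crawl_date(ts) for ts in timestamps)
```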
HTTP Response Codes 
Formula: Total up HTTP Response Codes. 
Tip: Find most common 302s or 404s, filter by code and sort by URL occurrence.
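As a sketch of the same tally in Python: total the status codes, then filter by one code and rank URIs by occurrence. The rows are invented sample data and `most_common_for` is a hypothetical helper name.

```python
from collections import Counter

# Illustrative (request URI, status code) pairs, not real log data.
rows = [
    ("/old-page/", "404"),
    ("/old-page/", "404"),
    ("/missing/", "404"),
    ("/promo/", "302"),
    ("/about/", "200"),
]

status_totals = Counter(status for _, status in rows)


def most_common_for(rows, code, limit=10):
    """Return the most requested URIs for one status code, most frequent first."""
    matching = Counter(uri for uri, status in rows if status == code)
    return matching.most_common(limit)
```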
Level Up 
Robots.txt – Crawl all URLs with Screaming Frog to determine if they are blocked in robots.txt. Investigate most frequently crawled. 
Faceted Nav Issues – Dedupe a list of unique resources, sort by times requested. 
Sitemap – Add your sitemap URLs into an Excel table, VLOOKUP against your logs. Which mapped URLs are crawl deficient? 
CSS / JS – These resources should be crawlable, but are files unnecessary for render absorbing an inordinate amount of crawl budget?
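The sitemap check can be approximated in Python with a count lookup in place of VLOOKUP. This sketch assumes the sitemap entries and the logged URLs are already in the same form (for example, both as paths); `crawl_deficient` and `min_hits` are hypothetical names, and the data is made up.

```python
from collections import Counter


def crawl_deficient(sitemap_urls, crawled_urls, min_hits=1):
    """Return sitemap URLs requested fewer than min_hits times in the logs."""
    hits = Counter(crawled_urls)
    # Counter returns 0 for URLs that never appear in the logs.
    return [url for url in sitemap_urls if hits[url] < min_hits]


# Illustrative data, not real log or sitemap entries.
sitemap_urls = ["/about/", "/speakers/", "/tickets/"]
crawled_urls = ["/about/", "/about/", "/speakers/", "/unlisted/"]

never_crawled = crawl_deficient(sitemap_urls, crawled_urls)
rarely_crawled = crawl_deficient(sitemap_urls, crawled_urls, min_hits=2)
```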
Top Level Crawl Waste 
Formula: Use IF statements to check for every cause of waste.
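A chain of IF statements translates naturally into a small classifier. The slide does not list the causes of waste, so the three rules below (not-found responses, redirects, parameterised URLs) are assumptions chosen for illustration; substitute whatever applies to the site being audited.

```python
def waste_reason(uri, status):
    """Return a label for the first matching rule, or '' when none applies."""
    # Assumed rules for illustration only; the slide does not specify them.
    if status == "404":
        return "not found"
    if status in ("301", "302"):
        return "redirect"
    if "?" in uri:
        return "parameterised URL"
    return ""


# Illustrative (request URI, status code) pairs, not real log data.
rows = [
    ("/about/", "200"),
    ("/old-page/", "404"),
    ("/category/page?colour=blue", "200"),
    ("/promo/", "302"),
]

wasted = [(uri, waste_reason(uri, status)) for uri, status in rows
          if waste_reason(uri, status)]
waste_share = len(wasted) / len(rows)
```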
Crime = Solved
All Brighton SEO attendees will receive the guide via email.
THANKS FOR LISTENING 
Get in touch 
e: tom@builtvisible.com 
t: @tomcbennet 

Analyze Log Files and Improve Your SEO with Excel
