Did you know 30% of travel industry website visitors are unsavory competitors, hackers, spammers, and fraudsters?
When aggressive scrapers took his website offline, Rob Gennaro, Digital Marketing Officer at Red Label Vacations, said enough was enough.
The fact is, travel suppliers, OTAs, and meta search sites are all being scraped by bots, which hurts their marketing metrics, SEO, website performance, and customer loyalty.
You can protect your site from web-scraping competitors and fraudsters.
Attend this FREE 30-minute TLearn webinar to understand:
The prevalence and impact of bots on your website
How to identify and block fraudsters and scrapers
When a web scraper is actually good
The future of online travel and website security
Our panelists are:
Rob Gennaro, digital marketing officer, Red Label Vacations
Rami Essaid, co-founder and CEO, Distil Networks
Kevin May, moderator and editor, Tnooz
Nick Vivion, producer and reporter, Tnooz
4. Poll no. 1
Has your website been a victim of web scraping and/or bad bot attacks in the last 12 months?
5. Poll no. 2
Do you currently have a bot defense solution in place?
6. Agenda
The growing bot problem
Web scraping bots and your data
Is web scraping legal?
Red Label Vacations' journey to clean traffic
Selection criteria for a bot detection solution
Distil Networks overview
7. How Big is the Problem?
Up to 38% of traffic on travel websites is bad bots
4.2 million IP addresses impacted by “Pushdo” botnet alone
15% bot traffic can equate to hitting each of your pricing pages 30 times per month
8. Why the Massive Increase in Bot Traffic?
Online data has increased in value
Pricing, incentive packages, flight routes, reviews, images, star ratings, availability, hotel names/descriptions, and editorial are changing daily
Anyone can get in the game
Cheap or free virtual servers, bandwidth, easy-to-use tools, and scrapers for hire
Bots no longer tied to IP addresses
Bots cycle through random IP addresses
Bots hide behind anonymous proxies
Consumer IPs now infected with bot traffic too
9. What Is Web Scraping?
Web Scraping
Also known as screen scraping, web scraping is the act of copying large amounts of data from a website – either manually or with an automated program (bot)
Legitimate Scraping
Scraping can sometimes be benevolent and totally acceptable, for example when search engine bots index your website
Malicious Scraping
A systematic theft of intellectual property accessible on a website, including pricing, content, images, and proprietary data
10. What Are Scrapers Doing with Your Travel Site?
Posting your content on competitor sites
Scrapers steal your traffic and advertising dollars. Duplicative content and high bounce rates diminish your SEO
Undermining your prices
Bots monitor your prices so competitors can undercut you with lower price listings
Executing searches on your site
The resulting API calls to third parties can cost you
11. Bots Impact Your Website and Bottom Line
○ Cause brownouts
○ Undermine pricing strategies
○ Damage the human visitor’s experience
○ Negatively impact revenues
○ Waste advertising spend
○ Lower site quality score and hurt SEO
○ Hijack accounts
14. Is the Legal Route Effective?
Hard to prosecute scrapers
No easy way to detect or identify stolen data in derivatives
Legal route is too expensive
Travel website’s legal bill for one “scraper” > $10M
Copyrights and terms of use don’t have teeth
Easy for thieves to assert plausible deniability
16. About Red Label Vacations
○ Largest independent travel company in Canada
○ 19 brands
○ Deals on vacations, flights, cruises, hotels and car rentals
Canada’s premier online travel agency service offering cheap flights, airline tickets, last minute vacation packages and discounted cruises
17. Red Label Vacations Complex IT Infrastructure
Complex IT Infrastructure
○ Total of 19 different web properties
○ 5 servers for RedTag.ca
○ 5 servers for ITA (for flight technology)
○ API calls into ITA, Sabre, Softvoyage, Hotels.com, Cartrawler
○ Mixed web infrastructure environments (outsourced hosting, owned data centers)
○ Mix of web application stacks (e.g., .NET, PHP)
○ Akamai CDN
18. Red Label Vacations Bot Challenges
Bot Challenges
○ Homegrown IP-blocking system wasn’t working
○ Bots came in through proxies; IP addresses were spoofed
○ Bots caused brownouts
○ Brownouts caused immediate loss of revenue ($1000s)
○ Bots can hurt Google quality score and SEO
○ Akamai CDN was difficult to manage
19. Red Label Vacations Selection Criteria
Bot Detection and Mitigation Solution Requirements
○ Block web scrapers without impacting human visitors
○ Accurately identify good bots vs. bad bots
○ Increase website availability and speed
○ Detect automated browsing tools
○ Simple setup
○ Little or no maintenance, “self-optimizing” solution
○ Reduce costs and complexity of Akamai
20. Red Label Vacations Results with Distil
Improved Website Performance with Distil
○ Uptime went from 99.6% to 99.9%
○ Faster load times; no errors
○ User time on site increased; bounce rate decreased
○ Detailed reporting distinguishes human visitors from malicious bots
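To put the uptime figures above in concrete terms, the short calculation below converts each percentage into hours of downtime. It is only a back-of-the-envelope sketch: the 30-day month is an assumption, and the function name `downtime_hours` is made up for illustration.

```python
# Assumption: a 30-day month, used only to turn percentages into hours.
HOURS_PER_MONTH = 30 * 24


def downtime_hours(uptime_percent: float, hours: float = HOURS_PER_MONTH) -> float:
    """Return the hours of downtime implied by an uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * hours


before = downtime_hours(99.6)  # uptime before, as stated on the slide
after = downtime_hours(99.9)   # uptime after, as stated on the slide
print(f"{before:.2f} h -> {after:.2f} h of downtime per month")
```

Under that assumption, 99.6% uptime allows roughly 2.9 hours of downtime a month, while 99.9% allows about 43 minutes.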
21. Red Label Vacations Results with Distil
Monthly Cost Savings with Distil
○ 65% less expensive than Akamai
○ Reduced costs for third party API calls
○ Cost savings due to improved uptime
○ Eliminated tax on internal teams
25. Selection Criteria: Purpose-Built Solution
Bot Detection is a New Category, NOT a Feature
○ NOT a Content Delivery Network (CDN)
○ NOT a Distributed Denial of Service (DDoS) protection solution
○ NOT a Web Application Firewall (WAF)
○ NOT a simple IP list or set of scripts
A purpose-built bot detection solution is always updating and evolving
26. Selection Criteria: Complete Protection
Internal Teams Catch 20%
[Figure: detection techniques shown on the slide: IP block, user agent testing, IP analysis, user agent testing, JavaScript test, cookie, Selenium test, browser rate limiting, automated browser, PhantomJS, machine learning, IP cycling]
A Purpose-Built Solution Should Catch 99.9%
28. Selection Criteria: Accuracy
Inline Fingerprinting
Fingerprints stick to the bot even if it attempts to reconnect from random IP addresses or hide behind an anonymous proxy
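As a rough illustration of why a fingerprint can survive an IP change, the sketch below hashes a handful of client attributes and leaves the IP address out. This is not Distil's method, which the slides do not describe; the field list, the `fingerprint` function, and the sample values are all hypothetical.

```python
import hashlib

# Hypothetical attribute set. A real product would rely on many more signals.
# The IP address is deliberately excluded from the fingerprint.
FINGERPRINT_FIELDS = ("user_agent", "accept_language", "accept_encoding", "timezone")


def fingerprint(client: dict) -> str:
    """Hash the selected client attributes into a stable identifier."""
    parts = [f"{name}={client.get(name, '')}" for name in FINGERPRINT_FIELDS]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Made-up example: the same client reconnects from a different IP address.
first_visit = {
    "ip": "198.51.100.7",
    "user_agent": "ExampleScraper/1.0",
    "accept_language": "en-US",
    "accept_encoding": "gzip",
    "timezone": "UTC",
}
second_visit = dict(first_visit, ip="203.0.113.42")

same_client = fingerprint(first_visit) == fingerprint(second_visit)
print(same_client)
```

Because the identifier does not depend on the address, a block placed on it would still apply after the bot switches proxies.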
Known Violators Database
Real-time updates from a Known Violators Database, which is based on the collective intelligence of all protected sites
Behavioral Modeling and Machine Learning
Machine-learning algorithms pinpoint behavioral anomalies specific to your site’s unique traffic patterns
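The idea of measuring a visitor against the site's own traffic can be sketched with a plain z-score. This is a simple statistical stand-in, not the machine-learning models the slide refers to; the function `is_anomalous`, the threshold of 3, and the baseline numbers are assumptions for illustration.

```python
from statistics import mean, stdev


def is_anomalous(baseline: list[float], observed: float, threshold: float = 3.0) -> bool:
    """Flag a value far above the site's own baseline, measured in standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline samples
    if sigma == 0:
        return observed > mu
    return (observed - mu) / sigma > threshold


# Made-up baseline: pages per minute seen from typical human sessions on one site.
baseline_pages_per_minute = [2, 3, 1, 4, 2, 3, 2, 5, 3, 2]

print(is_anomalous(baseline_pages_per_minute, 4))   # within the normal spread
print(is_anomalous(baseline_pages_per_minute, 40))  # far outside it
```

The same visitor rate could be normal on one site and unusual on another, which is why the baseline is computed per site.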
29. Selection Criteria: Accuracy
Browser Automation Tool Detection
JavaScript Validation on the connection stream identifies browser automation tools
Advanced Rate Limiting
Set rate limits such as pages per minute, pages per session, and session length
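The three limits named above (pages per minute, pages per session, and session length) could be enforced per session along these lines. This is a minimal sketch, not a description of any vendor's implementation; `SessionLimits`, `Session`, `allow_request`, and the default values are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SessionLimits:
    # Hypothetical defaults; real values would be tuned per site.
    max_pages_per_minute: int = 30
    max_pages_per_session: int = 500
    max_session_seconds: float = 4 * 3600


@dataclass
class Session:
    started_at: float
    total_pages: int = 0
    recent: deque = field(default_factory=deque)  # timestamps from the last 60 seconds


def allow_request(session: Session, now: float, limits: SessionLimits) -> bool:
    """Return True if the page view is within all three limits, and record it."""
    if now - session.started_at > limits.max_session_seconds:
        return False  # session length exceeded
    if session.total_pages >= limits.max_pages_per_session:
        return False  # pages-per-session budget used up
    while session.recent and now - session.recent[0] >= 60:
        session.recent.popleft()  # forget page views older than one minute
    if len(session.recent) >= limits.max_pages_per_minute:
        return False  # pages-per-minute exceeded
    session.recent.append(now)
    session.total_pages += 1
    return True
```

Timestamps are passed in explicitly as seconds, so the same function works with wall-clock time in production and with fixed values in tests.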
“Good Bot” Authentication
Validate that good bot requests (Google, Bing, etc.) map to the correct user agent and IP range
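A check that a claimed good bot really comes from the expected IP range might look like the sketch below. The network table is a placeholder: the addresses are reserved documentation ranges, not real crawler ranges, and the name `classify_bot_claim` is hypothetical. In practice the ranges would have to come from each search engine's own documentation.

```python
import ipaddress

# Placeholder table: these are reserved documentation networks, NOT real crawler ranges.
GOOD_BOT_NETWORKS = {
    "Googlebot": [ipaddress.ip_network("192.0.2.0/24")],
    "bingbot": [ipaddress.ip_network("198.51.100.0/24")],
}


def classify_bot_claim(user_agent: str, remote_ip: str) -> str:
    """Return 'verified', 'spoofed', or 'unclaimed' for a request."""
    address = ipaddress.ip_address(remote_ip)
    for token, networks in GOOD_BOT_NETWORKS.items():
        if token.lower() in user_agent.lower():
            # The user agent claims to be a known good bot: check its source address.
            if any(address in network for network in networks):
                return "verified"
            return "spoofed"
    return "unclaimed"
```

A request that claims a good bot's user agent but arrives from outside that bot's range is treated as spoofed, which is the case simple user-agent allow lists miss.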
30. How Travel Companies Benefit from Distil
Increase insight & control over human, good bot & bad bot traffic
Block 99.9% of malicious bots without impacting legitimate users
Slash the high tax bots place on internal teams & web infrastructure
Protect data from web scrapers, unauthorized aggregators & hackers
33. Thank you!
Send your questions and comments to kevin@tnooz.com
Replay and presentation of webinar will be available on www.tnooz.com
Editor's Notes
Tor
Travel package margins are tight
All of their prices should be on par
Competitors will drop their prices on packages based on time of day
Late at night
Reviews
Diminishing data quality
Waste advertising spend - bots hit Google AdWords, click fraud report
Hurts SEO
1,000 visits and time on site is low, high bounce rate, Google’s algorithm, lowers your quality score
Had a form on their site and it became a spam haven…
Ryanair lost their suit in Europe
In the US it’s illegal
In emerging markets
It Depends
Canada
Akamai is good at being a CDN, DDoS
Distil’s main focus was a CDN
Looking for something to block bots visiting the site
Everyone says they offer the next greatest thing
Notices right away that
University or Company that shares the same NAT as the person launching the Bot