@dawnieando from @MoveItMarketing
DUPLICATE CONTENT: MYTHS, TYPES &
WAYS TO MAKE IT WORK FOR YOU
Duplicate Content Penalty ‘Myth’… It Just Won’t Die
Query Refinement Suggestion
Next Probable Queries on “near duplicate urls can cause”
2017
At least 30% of the
web is a duplicate
of other pages on
the web
“And that’s OK”
The Duplicate Content ‘Penalty’ Myth
‘Real’ duplicates (matching content checksum) filtered and not indexed
“Each content filter sends the retrieved web pages to Dupserver to determine if they are duplicates of other web pages”
http://www.google.ch/patents/US20120317089
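The checksum idea above can be sketched in a few lines. This is only an illustration, assuming a plain SHA-256 over the page text; the function names and example URLs are hypothetical, and nothing here reflects how the patent's Dupserver actually works.

```python
import hashlib


def content_checksum(text: str) -> str:
    """Return a SHA-256 hex digest of the page text (outer whitespace trimmed)."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


def filter_exact_duplicates(pages: dict[str, str]) -> dict[str, str]:
    """Keep the first URL seen for each distinct checksum and drop the rest."""
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for url, text in pages.items():
        digest = content_checksum(text)
        if digest not in seen:
            seen.add(digest)
            kept[url] = text
    return kept


# Hypothetical sample: two URLs serve byte-identical text.
pages = {
    "http://example.com/a": "Red dress, size 10.",
    "http://example.com/a?ref=home": "Red dress, size 10.",
    "http://example.com/b": "Blue dress, size 12.",
}
unique_pages = filter_exact_duplicates(pages)
print(sorted(unique_pages))
```

A checksum only catches 'real' duplicates: a single changed character gives a different digest, which is why near-duplicates need the techniques covered later in the deck.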
Filters
Handling Near-Duplicate Content Attracted Lots of Research
§ Dennis Fetterly
§ Marc Najork
§ Mark Manasse
§ Ziv Bar-Yossef
§ Monica Henzinger
§ William Pugh
§ Andrei Broder
Some Notable ‘Spot the Difference’ Researchers
DETECTING DUPLICATES & NEAR-DUPLICATES EARLY SAVES ON RESOURCES / EFFICIENCY
Because… Near Duplicate Content is More Difficult to Detect
than Exact Duplicates
‘Detecting Duplicate and Near Duplicate Files’
IT’S AN ONGOING REAL WORLD CHALLENGE
(Henzinger / Pugh, 2003, 2009, 2011, 2012, 2011, 2016)
These Google patents in the series keep being ‘tweaked’ (A is not the same as B)
A lot of busy Googlebots & potential for duplicates
• The web doubled in size 2010 – 2012
• Another 1/3 by 2015
• Finite search engine resources
• Processes automated for scale
“I just never have any ‘me-time’ any more”
Near
Duplicates
Do Not
Change
Often
SO… WHY WASTE RESOURCES CRAWLING THEM?
DENNIS FETTERLY
The Slow Page Evolution of Near Duplicates
“Clusters of near-duplicate documents are fairly stable: Two documents that are near-duplicates of one another are very likely to still be near-duplicates 10 weeks later”
(Fetterly, Manasse & Najork, 2003)
… The Raters Guidelines still ask raters to catch ‘dupes’
In Fact… There’s a whole section of the guidelines dedicated to them
2017
Mostly Stable For Years… But… The Web is Always Changing
2017
Near-Dupes are still doing strange things
John Mu at International Search Summit
§ Nearly the same but not the same still causes confusion
§ Particularly problematic on internationalization
§ But applies to all sites with pages not the same but ‘nearly-the-same’
2017
BUT…Different
types of
‘duplicate
content’
• Full duplication
• Partial duplication
• Document inclusion
• In-document duplication
• (Local duplication (in-same-site))
Not all types may be treated the same
PERFECT
DUPLICATES
Filtered before indexing
D.U.S.T. (DIFFERENT URL, SIMILAR TEXT)
DUSTBUSTER - Do Not Crawl in The Dust… Ziv Bar-Yossef
Reduce crawling and wasted resources to low importance pages
CAVEAT: IT IS NOT KNOWN WHETHER THIS IS BEING USED AT ALL. RESEARCH AND THEORY
§ Builds crawling ‘rules’
§ Detects duplicate content URL patterns
§ From small ‘sampling’ visits
§ Swerves ‘DUST’
§ DUSTBUSTER
§ Saves crawling resources
§ Potentially Popular CMS configurations URL parameters detect ‘DUST’
2003
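To make the 'rules from small sampling visits' idea concrete, here is a toy sketch. It is not the published DUSTBUSTER algorithm: it simply assumes we already hold a content fingerprint for a handful of sampled URLs and flags a query parameter as likely 'DUST' when changing its value never changes the fingerprint. All names, URLs and fingerprints are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url: str, param: str) -> str:
    """Return the URL with every occurrence of one query parameter removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def candidate_dust_params(samples: dict[str, str]) -> set[str]:
    """samples maps a sampled URL to its content fingerprint.

    A parameter is flagged when URLs that differ only in that parameter
    were sampled at least twice and always shared one fingerprint.
    """
    names = {
        k
        for url in samples
        for k, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)
    }
    dust: set[str] = set()
    for name in names:
        fingerprints = defaultdict(set)  # URL without `name` -> fingerprints seen
        visits = defaultdict(int)        # URL without `name` -> sampled URL count
        for url, fingerprint in samples.items():
            key = strip_param(url, name)
            fingerprints[key].add(fingerprint)
            visits[key] += 1
        compared = [key for key, count in visits.items() if count > 1]
        if compared and all(len(fingerprints[key]) == 1 for key in compared):
            dust.add(name)
    return dust


# Hypothetical sample of four crawled URLs and their fingerprints.
samples = {
    "http://example.com/list?cat=dresses&sessionid=1": "fpA",
    "http://example.com/list?cat=dresses&sessionid=2": "fpA",
    "http://example.com/list?cat=shoes&sessionid=1": "fpB",
    "http://example.com/list?cat=shoes&sessionid=3": "fpB",
}
print(candidate_dust_params(samples))
```

Here `sessionid` is flagged because it never changes the content, while `cat` is not, since changing it does. A crawler could then skip further `sessionid` variants, which is the resource saving the slide describes.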
Filtered before indexing
Cookie Cutter Sites
Filtered before indexing
Query &
Category
Agnostic
Never hit query run-time
auction
Because…
They’re not indexed…
Filtered
Tripping Flags
NEAR DUP == TRUE
NEAR DUP == FALSE
Query Agnostic Nature of Near-Duplicate Clustering
Single URL Content Fingerprint
But… The Single
URL Fingerprint
May Not Be The
One You Choose
BOILERPLATE ISSUES
Filtered before indexing
TOKENS, VECTORS & SHINGLING
(w) Shingling
A rose is a rose is a rose
Shingling
A rose is a rose is a rose
N-Gram (where ‘n’ is no. of words (tokens) in snapshot)
[A rose is a] [rose is a rose] [is a rose is] (n = 4)
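The shingling of the 'rose' sentence above can be reproduced with a short sketch. It assumes whitespace tokenisation and lower-casing, which are simplifications; the function name `shingles` is hypothetical.

```python
def shingles(text: str, w: int = 4) -> list[tuple[str, ...]]:
    """Return every run of w consecutive word tokens, in document order."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)]


all_shingles = shingles("A rose is a rose is a rose", w=4)
unique_shingles = set(all_shingles)

print(len(all_shingles))     # 5 overlapping windows over 8 tokens
print(len(unique_shingles))  # 3 distinct shingles, matching the slide
```

The sentence has eight tokens, so a window of four slides over it five times, but two of those windows repeat. The three distinct shingles are exactly the ones listed on the slide.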
SHINGLE
VECTORS
SUPERSHINGLE
MEGASHINGLE
Shingles, Supershingles & Megashingles
WORD == TOKEN
http://corpus.tools/wiki/Onion
N-gram length (word string)
POTENTIAL EXAMPLE
http://corpus.tools/wiki/Onion
Dup Content Threshold e.g. 0.5 (50%)
POTENTIAL EXAMPLE
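As a rough illustration of how a threshold like 0.5 could be applied, the sketch below compares two texts by the overlap of their shingle sets (the 'resemblance' measure from Broder et al, 1997). It is an assumption that a tool would score pages this way; Onion and search engines may well measure overlap differently, and the function names and sample sentences are hypothetical.

```python
def shingle_set(text: str, w: int = 4) -> set[tuple[str, ...]]:
    """Distinct w-word shingles of a text (whitespace tokens, lower-cased)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def resemblance(a: str, b: str, w: int = 4) -> float:
    """Shared shingles divided by all distinct shingles across both texts."""
    sa, sb = shingle_set(a, w), shingle_set(b, w)
    union = sa | sb
    if not union:  # both texts shorter than w tokens: nothing to compare
        return 0.0
    return len(sa & sb) / len(union)


def is_near_duplicate(a: str, b: str, threshold: float = 0.5, w: int = 4) -> bool:
    """Flag the pair when resemblance reaches the chosen threshold."""
    return resemblance(a, b, w) >= threshold


page_a = "the quick brown fox jumps over the lazy dog"
page_b = "the quick brown fox jumps over the lazy cat"
page_c = "completely different words appear in this other sentence"

print(round(resemblance(page_a, page_b), 3))  # 5 shared of 7 distinct shingles
print(is_near_duplicate(page_a, page_b))
print(is_near_duplicate(page_a, page_c))
```

Changing one word at the end of a nine-word sentence still leaves five of seven shingles shared, so the pair clears a 0.5 threshold; the unrelated sentence shares none.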
Broder, A.Z., Glassman, S.C., Manasse, M.S. and Zweig, G., 1997. Syntactic clustering of the web. Computer
Networks and ISDN Systems, 29(8-13), pp.1157-1166.
“We have developed an
efficient way to determine
the syntactic similarity of
files and have applied it to
every document on the
World Wide Web”
(Broder et al, 1997)
Broder, A.Z., Glassman, S.C., Manasse, M.S. and Zweig, G., 1997. Syntactic clustering of the web. Computer
Networks and ISDN Systems, 29(8-13), pp.1157-1166.
Documents grouped
together to meet
similar queries
equally (in a cluster)
Multiple Title
Candidates For A
Query
DYNAMIC, CONTEXTUAL SEARCH
Not?
Query
Agnostic
Quilting Web Pages
DUPLICATE CONTENT
TYPE – NEAR DUPE
(QUILTING)
UNIQUE
PARAGRAPH
EXTERNAL
SYNDICATED
EXTERNAL
SYNDICATED
EXTERNAL
SYNDICATED
HEADER - TEMPLATE
FOOTER - TEMPLATE
UNIQUE
PARAGRAPH
ASIDE
CONTENT INCLUSION
CMS’
UNIQUE MAIN CONTENT BUT ITS CONTENT IS INCLUDED ELSEWHERE
TEASER ‘INCLUDED’ ELSEWHERE
May… or May NOT be Filtered before
indexing
?
Pages that look very different but meet the same user information need equally
Cross Over, Query Class & Semantic Collisions
Possible Treatment of Near-Duplicate Query Candidates
“If more than one candidate is determined to be part of a ‘search query cluster’, the most important one based on factors such as relevance, freshness, importance is returned. The others are eliminated.”
(Henzinger / Pugh, 2012, 2016)
Last updated 2016
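The selection described in the quote above can be sketched as 'keep the best candidate per cluster'. The equal weighting of relevance, freshness and importance is purely an assumption for illustration, as are the class, field and URL names; the patent does not publish a scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    cluster_id: int    # candidates in one near-duplicate cluster share an id
    relevance: float
    freshness: float
    importance: float


def score(candidate: Candidate) -> float:
    """Hypothetical equal-weight score; real weighting is unknown."""
    return candidate.relevance + candidate.freshness + candidate.importance


def pick_cluster_representatives(candidates: list[Candidate]) -> list[Candidate]:
    """Return the highest-scoring candidate per cluster; the rest are eliminated."""
    best: dict[int, Candidate] = {}
    for candidate in candidates:
        current = best.get(candidate.cluster_id)
        if current is None or score(candidate) > score(current):
            best[candidate.cluster_id] = candidate
    return list(best.values())


candidates = [
    Candidate("https://example.com/red-dress", 1, 0.9, 0.2, 0.5),
    Candidate("https://example.com/red-dress?sort=price", 1, 0.9, 0.1, 0.3),
    Candidate("https://example.com/blue-dress", 2, 0.4, 0.9, 0.2),
]
winners = [c.url for c in pick_cluster_representatives(candidates)]
print(winners)
```

The point for site owners is the last step: the losing URL in a cluster is not penalised, it simply is not the one returned.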
May… or May NOT be Filtered
before indexing
?
URL Parameter-driven Ecommerce Platforms
How Are Choosing Strategies Catered For in Ecommerce?
§ FACETED NAVIGATION & WEBSITE FILTERS == Allows for ‘Elimination by Aspects’
§ PAGINATION == Reduces ‘Too Much Choice’ effects
§ SORTING == Caters for ‘FIRST / BEST’ choosing strategies
CHOICE-ASSISTING FUNCTIONALITY
HEURISTICS
And with these choice-assisting functionalities come…
“Exponentially multiplicative URLs”
Exponentially Multiplicative URLs From Faceted Navigation…
100 DRESSES
5 COLOURS
10 SIZES
2 LENGTHS
4 SUPPLIERS
100 x 5 x 10 x 2 x 4 = 40,000 URLs
And that’s without HTTPS, WWW/non or internationalization
100 DRESSES
5 COLOURS
10 SIZES
2 LENGTHS
4 SUPPLIERS
100 x 5 x 10 x 2 x 4 = 40,000 URLs
X 2 BECAUSE… HTTPS VERSION = 80,000 URLs
X 2… BECAUSE… WWW / NON WWW VERSION = 160,000 URLs
X 5… BECAUSE… EN / FR / ES / DE / IT (e.g.) = 800,000 URLs
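The arithmetic on these two slides is just a running product, which a few lines can verify. The dictionary keys are labels chosen for this sketch; the counts come straight from the slides.

```python
from math import prod

# Facet counts taken from the slide.
facets = {"dresses": 100, "colours": 5, "sizes": 10, "lengths": 2, "suppliers": 4}

# Site-wide variants that multiply every faceted URL again.
variants = {"http / https": 2, "www / non-www": 2, "languages": 5}

faceted_urls = prod(facets.values())
total_urls = faceted_urls * prod(variants.values())

print(f"{faceted_urls:,} faceted URLs")  # 40,000
print(f"{total_urls:,} URLs in total")   # 800,000
```

Adding one more facet with only three values would take the total to 2.4 million, which is why each new filter deserves a crawl-impact check.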
May… or May NOT be Filtered
before indexing
?
THAT’S A LOT
OF URLs FOR
100 DRESSES
Bored Googlebot
(Unrelated to speed)
When You
Stop Boring
Googlebot
When You Stop Boring Googlebot
NOT SPEED RELATED
CANONICALIZATION
The Canonical Tag - Otherwise Known As… RFC 6596
‘THE CANONICAL LINK RELATION’
2012
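In practice the canonical link relation is usually declared as a link element with rel="canonical" in the page head. The sketch below, using only Python's standard library, pulls any such declarations out of an HTML string so they can be audited; the class name, function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link> whose rel value includes 'canonical'."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "canonical" in rel_values and href:
            self.canonicals.append(href)


def find_canonicals(html: str) -> list[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = (
    "<html><head>"
    '<link rel="stylesheet" href="/site.css">'
    '<link rel="canonical" href="https://www.example.com/dresses/">'
    "</head><body></body></html>"
)
print(find_canonicals(page))
```

Returning a list rather than a single value is deliberate: a page that declares more than one canonical target is itself sending a mixed signal worth flagging.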
‘The Canonical Link Relation’ – RFC6596 Is Still Adhered To
2017
50% OF SEOs: “SEARCH ENGINES HAVE IGNORED CANONICAL TAGS THEY HAD IMPLEMENTED”
2017
A lot can go wrong with mixed signals…
There Are Many Signals To Consider In Canonicalization
404 & 410
301
302, 303, 307
Valid canonical from ‘context’ URL to valid target
Fall back to default pre ‘Canonical Link Relation’ duplicate handling signals
Valid hreflang (if present and applicable)
Manual Action: SUPER STRONG
STRONG - DIRECTIVE
STRONG - DIRECTIVE
STRONG - DIRECTIVE
STRONG - HINT
STRONG - HINT
DEFAULT
ALL NEED TO BE IN UNISON
HTTPS (Google Specific)
“REL=NEXT / REL=PREV” IS NOT A FORM OF CANONICALIZATION
2017
PAGINATION & SORTING
rel="next" / rel="prev" - RFC 5988 (Web Linking)
2011… -> We’re Still Unclear
‘View-all’ Search Experience
If a canonical is not deemed to be valid there is a likelihood the pre-RFC6596 Canonical Link Relation treatment of duplicates and near-duplicates will be applied, such as ‘internal links’
COMMON CANONICAL MISTAKES
“301s AND 302s ARE BOTH A FORM OF CANONICALIZATION”
2017
Don’t canonicalize from an “index” page to a “noindex” page or vice-versa, because this means the pages are NOT the same. The canonical will likely be ignored.
COMMON CANONICAL MISTAKES
If “hreflang” references an alternative which does not match a canonical link, the canonical will likely be ignored.
Fall Back
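The two mistakes above lend themselves to an automated check over crawl data. This is a minimal sketch under stated assumptions: each page has already been crawled into a small record, the `Page` fields and function name are hypothetical, and a real audit would also need to consider redirects and status codes.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    canonical: str                      # canonical target declared on this page
    noindex: bool = False
    hreflang: dict[str, str] = field(default_factory=dict)  # language -> alternate URL


def mixed_signal_warnings(page: Page, site: dict[str, Page]) -> list[str]:
    """List canonical signals on one page that conflict with each other."""
    warnings: list[str] = []
    target = site.get(page.canonical)
    if target is not None and target.noindex != page.noindex:
        warnings.append(f"{page.url}: index/noindex differs from canonical target {target.url}")
    for language, alternate_url in page.hreflang.items():
        alternate = site.get(alternate_url)
        if alternate is not None and alternate.canonical != alternate.url:
            warnings.append(
                f"{page.url}: hreflang '{language}' points at {alternate_url}, "
                f"which canonicalizes to {alternate.canonical}"
            )
    return warnings


pages = [
    Page("https://example.com/en/", "https://example.com/en/",
         hreflang={"fr": "https://example.com/fr/?sort=price"}),
    Page("https://example.com/fr/?sort=price", "https://example.com/fr/"),
    Page("https://example.com/fr/", "https://example.com/fr/"),
    Page("https://example.com/old", "https://example.com/new"),
    Page("https://example.com/new", "https://example.com/new", noindex=True),
]
site = {page.url: page for page in pages}

for page in pages:
    for warning in mixed_signal_warnings(page, site):
        print(warning)
```

On this sample data two warnings appear: the English page's hreflang points at a French URL that is not its own canonical, and an indexable page canonicalizes to a noindex one.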
HOW TO RESOLVE?
Instead of
‘remove’
consider
‘degroup’
DUPLICATE CONTENT TYPE – NEAR DUPE
(ADDING VALUE)
Hubs &
Authorities
BOWTIE OF THE WEB
Build Strongly Connected Components
SORT OUT YOUR LIBRARY
SYSTEM & QUERY CLASSES
SEARCH ENGINES
LOVE
CATEGORIZATION
Focused
Crawling
CRAWLING CONTENT ON A SPECIFIC TOPIC FOR EFFICIENCY
The ‘Mere’ Categorization Effect (Phenomenon) FTW
Simply labelling products / items as being part of a category, regardless of label, appears to increase perception of variety & positive experience (Mogilner et al, 2008)
HUMANS LOVE CATEGORIES TOO… IT IS A PHENOMENON
Homonyms contribute to need for query refinement
HOMONYMS – WORDS THAT ARE SPELT OR PRONOUNCED THE SAME BUT HAVE DIFFERENT MEANINGS
ROSE
EVENING WATCH
SINK
BACK
ARMS
BOW
CHECK
STRENGTHEN DIFFERENTIAL CONTEXT
INTELLIGENT INTERNAL LINKING
LOCAL NAVIGATION RELEVANCE
TABLE OF CONTENTS STYLE IN PAGE NAVIGATIONAL HEURISTIC FOR SEARCH ENGINE AND HUMAN
PAGINATED TAB THROUGH ON SECTIONS OF REVIEW
GRANULAR RELEVANCE
Parameter Handling
“IS PARAMETER-HANDLING A WAY TO HELP GOOGLE BUILD A SET OF ‘DUSTBUSTER CRAWLING RULES’ EARLY?”
MAKE THE RULES
ADD VALUE TO NEAR DUPES
(INFORMATIONAL VIEWS (INFORMATION ARCHITECTURE))
INFORMATION VIEWS ADDING VALUE AND PASSING STRENGTH TO CANONICAL TARGETS
Internal Links
Are
The Dogs
Sitemaps Below Surface
BUILD STRONG SECTIONS
REVIEWS
BLOG
BUYING GUIDES
COST CALCULATORS
COMMERCE
MAIN SITE THEME (ONTOLOGY)
SEMANTICS RULE
UGC
Power Mapper
Related Content Mostly Adds Value To Other Content
That content is ‘stitched’ from elsewhere
But it is VERY useful overall & helps with searcher ‘foraging’
To create context for what it links out to
NOT Filtered before indexing
Doc IDs meeting contextual information needs - 1, or 2 pages (max) chosen at query run-time
NOT Filtered before indexing
Fighting with each other to be ‘THE’ result
Seems like ‘dilution’
BE VERY CAREFUL WITH ‘PRUNING’
CAN YOU ‘IMPROVE’, ‘DE-GROUP’ OR ‘REMORPH’… RATHER THAN ‘REMOVE’?
THAT’S A LOT
OF URLs FOR
100 DRESSES
Is The Difference Substantively Different To Queries?
Does The Repurposed or Collated Content Add ‘Additional’ Value?
Content meeting informational needs equally is treated differently TO DUPLICATES
GOTCHAS
Gotchas – Velvet
Blues Update
(SOME) URLs (WP
PLUGIN)
BETTER
SEARCH
REPLACE
PLUGIN
REVIEWS
WP For AMP
Internal
Linking
Canonical
Issues
WP For AMP
Internal
Linking
Canonical
Issues
MAGENTO GOTCHA WITH CANONICALS
Understand The Canonical Link Relation Rules – RFC6596
The target (canonical) IRI
MUST identify content that
is either duplicative or a
superset of the content at
the context (referring) IRI.
Google’s Maile Ohye on ‘How To Hire An SEO’
Without &filter=0 Appended to end of Query
https://www.google.co.uk/search?q=red+dresses+size+10+long+sleeves&oq=red+dresses+size+10+long+sleeves&aqs=chrome.0.69i59.12570j0j7&sourceid=chrome&ie=UTF-8
NOBODY HAS MORE THAN ONE LISTING
With filter=0 Appended to end of Query
https://www.google.co.uk/search?q=red+size+10+dresses+long+sleeves&oq=red+size+10+dresses+long+sleeves&aqs=chrome..69i57.13605j0j7&sourceid=chrome&ie=UTF-8&filter=0
ALL SITES HAVE AT LEAST 2 LISTINGS
MISSED
OPPORTUNITIES
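To repeat this comparison for other queries, the two URLs can be built programmatically. The sketch below only constructs the URLs shown on the slides (minus the browser-specific parameters); the function name is hypothetical, and how the search engine responds to filter=0 is as described in the deck and may change.

```python
from urllib.parse import urlencode


def search_url(query: str, unfiltered: bool = False) -> str:
    """Build a search URL, optionally appending filter=0 as on the slide."""
    params = {"q": query}
    if unfiltered:
        params["filter"] = "0"
    return "https://www.google.co.uk/search?" + urlencode(params)


query = "red dresses size 10 long sleeves"
print(search_url(query))
print(search_url(query, unfiltered=True))
```

Opening both URLs side by side shows which of a site's pages were eligible for the query but filtered out at run-time.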
ASOS.com
18%
Quadruple
Listings
Similar Content – Query Refinement SERPs
NOT FILTERED
NOT NEAR-DUPES
Does the searcher want ‘gas engineers, heating engineers, central heating?’
CONFUSING DUPLICATE, NEAR-DUPLICATE (DUST) AND SIMILAR CONTENT COULD COST YOU DEARLY
Maybe a lot of people are confused by duplicates?
§ Be careful about canonicalizing when unnecessary
§ True duplicate content & near-dupes are query and category agnostic
§ Similar is not duplicate
§ You may still have the answers to different queries based on a small important difference
§ AT LEAST 4 TYPES OF DUPLICATE CONTENT
2017
Thank
You
APPENDIX
Problems With The Many ‘Faces’ of Faceted Navigation
https://webmasters.googleblog.com/2014/02/faceted-navigation-best-and-5-of-worst.html - Wednesday, February 12, 2014
Example of faceted navigation:
http://www.example.com/category.php?category=gummy-candies&price=5-10&price=over-10
Facet means ‘little face’ (USEFUL TRIVIA)
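Splitting the example URL above into its facets shows one of the problems directly: the `price` parameter appears twice with different values. This is a small sketch using Python's standard URL parser; the variable names are illustrative only.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "http://www.example.com/category.php"
    "?category=gummy-candies&price=5-10&price=over-10"
)

# parse_qs groups repeated parameters into a list per name.
facets = parse_qs(urlsplit(url).query)
repeated = sorted(name for name, values in facets.items() if len(values) > 1)

print(facets)
print(repeated)
```

Listing repeated parameters across a crawl export is a quick way to find facet combinations that produce URLs no shopper is likely to need.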
Relation Links – ‘Web Linking’
https://tools.ietf.org/html/rfc5988
WEB LINKING – RFC 5988
INTERNET ENGINEERING TASK FORCE
Internationalization – An Additional Layer of Complexity
‘TAGS FOR IDENTIFYING LANGUAGES’ – RFC 5646
https://tools.ietf.org/html/rfc5646
INTERNET ENGINEERING TASK FORCE
A Solution - The Introduction of Hreflang
Wikipedia page on hreflang
Rules on hreflang
https://support.google.com/webmasters/answer/182192?hl=en&ref_topic=2370587 - MULTINATIONAL & MULTILINGUAL SITES AND HREFLANG
https://support.google.com/webmasters/topic/2370587?hl=en&ref_topic=4598733 - HREFLANG Google
https://support.google.com/webmasters/answer/2620865?hl=en&ref_topic=2370587 - USE A SITEMAP FOR HREFLANG
https://support.google.com/webmasters/answer/6144055?hl=en&ref_topic=2370587 - LOCALE AWARE CRAWLING WITH GOOGLEBOT
INTERNATIONALIZED RESOURCE IDENTIFIER
IRI
Internationalized Resource Identifiers (IRIs)
RFC 3987
https://tools.ietf.org/html/rfc3987
INTERNET ENGINEERING TASK FORCE
REFERENCES
References & Sources
Fetterly, D., Manasse, M. and Najork, M., 2003. On the evolution of clusters of near-duplicate web pages. Journal of Web Engineering, 2(4), pp.228-246.
Broder, A.Z., Glassman, S.C., Manasse, M.S. and Zweig, G., 1997. Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8-13), pp.1157-1166.
Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A. and Wiener, J., 2000. Graph structure in the web. Computer networks, 33(1), pp.309-320.
Mogilner, C., Rudnick, T. and Iyengar, S.S., 2008. The mere categorization effect: How the presence of categories increases choosers' perceptions of assortment variety and outcome satisfaction. Journal of Consumer Research, 35(2), pp.202-215.
http://www.seobythesea.com/2008/02/new-google-process-for-detecting-near-duplicate-content/
Pugh, W. and Henzinger, M.H., Google Inc., 2016. Detecting duplicate and near-duplicate files. U.S. Patent 9,275,143.
Alonso, O., Fetterly, D. and Manasse, M., 2013, December. Duplicate news story detection revisited. In Asia Information Retrieval Symposium (pp. 203-214). Springer Berlin Heidelberg.
RFC 5988 – Web Linking - https://tools.ietf.org/html/rfc5988
Najork, M., 2012, August. Detecting quilted web pages at scale. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (pp. 385-394). ACM.
 
Negotiating crawl budget with googlebots
Negotiating crawl budget with googlebotsNegotiating crawl budget with googlebots
Negotiating crawl budget with googlebotsDawn Anderson MSc DigM
 

Duplicate Content Myths Types and Ways To Make It Work For You

  • 1. @dawnieando from @MoveItMarketing. DUPLICATE CONTENT: MYTHS, TYPES & WAYS TO MAKE IT WORK FOR YOU
  • 2. Duplicate Content Penalty ‘Myth’… It Just Won’t Die. Query Refinement Suggestion: Next Probable Queries on “near duplicate urls can cause” (2017)
  • 3. At least 30% of the web is a duplicate of other pages on the web
  • 4. “And that’s OK”
  • 5. The Duplicate Content ‘Penalty’ Myth: ‘Real’ duplicates (matching content checksum) are filtered and not indexed. “Each content filter sends the retrieved web pages to Dupserver to determine if they are duplicates of other web pages” (http://www.google.ch/patents/US20120317089)
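As a rough illustration of the "matching content checksum" idea on slide 5, the sketch below hashes each page body and keeps only the first URL seen per checksum. The function names and the use of SHA-256 are assumptions made for this example; the patent quoted on the slide does not say which checksum is used.

```python
import hashlib


def content_checksum(text: str) -> str:
    """Return a hex digest of the page body; identical bodies give identical digests."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_exact_duplicates(pages: dict[str, str]) -> dict[str, str]:
    """Keep the first URL seen for each distinct checksum (hypothetical helper)."""
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for url, body in pages.items():
        digest = content_checksum(body)
        if digest not in seen:
            seen.add(digest)
            kept[url] = body
    return kept


pages = {
    "http://example.com/a": "Red dress, size 10",
    "http://example.com/a?ref=home": "Red dress, size 10",  # byte-identical copy
    "http://example.com/b": "Blue dress, size 12",
}
print(list(filter_exact_duplicates(pages)))
```

A checksum like this only catches byte-identical bodies; a single changed character produces a different digest, which is why near-duplicates need the separate techniques covered in later slides.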
  • 6. Filters
  • 7. Handling Near-Duplicate Content Attracted Lots of Research. Some Notable ‘Spot the Difference’ Researchers: Dennis Fetterly, Marc Najork, Mark Manasse, Ziv Bar-Yossef, Monika Henzinger, William Pugh, Andrei Broder. DETECTING DUPLICATES & NEAR-DUPLICATES EARLY SAVES ON RESOURCES / EFFICIENCY
  • 8. Because… Near-Duplicate Content is More Difficult to Detect than Exact Duplicates. ‘Detecting Duplicate and Near Duplicate Files’ (Henzinger / Pugh, 2003, 2009, 2011, 2012, 2011, 2016). IT’S AN ONGOING REAL WORLD CHALLENGE. These Google patents in the series keep being ‘tweaked’ (A is not the same as B)
  • 9. A lot of busy Googlebots & potential for duplicates: the web doubled in size 2010 – 2012; another 1/3 by 2015; finite search engine resources; processes automated for scale. “I just never have any ‘me-time’ any more”
  • 10. Near Duplicates Do Not Change Often. SO… WHY WASTE RESOURCES CRAWLING THEM?
  • 11. DENNIS FETTERLY
  • 12. The Slow Page Evolution of Near Duplicates: “Clusters of near-duplicate documents are fairly stable: Two documents that are near-duplicates of one another are very likely to still be near-duplicates 10 weeks later” (Fetterly & Najork, 2003)
  • 13. … The Raters Guidelines still ask raters to catch ‘dupes’. In Fact… There’s a whole section of the guidelines dedicated to them (2017)
  • 14. Mostly Stable For Years… But… The Web is Always Changing (2017)
  • 15. Near-Dupes are still doing strange things (John Mu at International Search Summit, 2017): nearly the same but not the same still causes confusion; particularly problematic on internationalization; but applies to all sites with pages not the same but ‘nearly-the-same’
  • 17. Full duplication; partial duplication; document inclusion; in-document duplication; (local duplication (in-same-site))
  • 18. All types may not be treated the same
  • 19. PERFECT DUPLICATES
  • 20. Filtered before indexing
  • 21. D.U.S.T. (DIFFERENT URL, SIMILAR TEXT)
  • 22. DUSTBUSTER - Do Not Crawl in The Dust… (Ziv Bar-Yossef, 2003). Reduce crawling and wasted resources on low importance pages: builds crawling ‘rules’; detects duplicate content URL patterns from small ‘sampling’ visits; swerves ‘DUST’; saves crawling resources; potentially, popular CMS configurations’ URL parameters detect ‘DUST’. CAVEAT: IT IS NOT KNOWN WHETHER THIS IS BEING USED AT ALL. RESEARCH AND THEORY
  • 23. Filtered before indexing
  • 24. Cookie Cutter Sites
  • 27. Filtered before indexing
  • 29. Never hit query run-time auction. Because… They’re not indexed… Filtered
  • 30. Tripping Flags: NEAR DUP == TRUE / NEAR DUP == FALSE
  • 31. Query Agnostic Nature of Near-Duplicate Clustering
  • 32. Single URL Content Fingerprint
  • 33. But… The Single URL Fingerprint May Not Be The One You Choose
  • 34. BOILERPLATE ISSUES
  • 35. Filtered before indexing
  • 36. TOKENS, VECTORS & SHINGLING
  • 37. (w) Shingling: “A rose is a rose is a rose”
  • 38. Shingling: “A rose is a rose is a rose”. N-Gram (where ‘n’ is the number of words (tokens) in each snapshot; here 4). The distinct 4-word shingles are [A rose is a] [rose is a rose] [is a rose is]
  • 39. Shingles, Supershingles & Megashingles: SHINGLE VECTORS, SUPERSHINGLE, MEGASHINGLE. WORD == TOKEN
  • 40. http://corpus.tools/wiki/Onion
  • 41. http://corpus.tools/wiki/Onion: N-gram length (word string). POTENTIAL EXAMPLE
  • 42. http://corpus.tools/wiki/Onion: Dup Content Threshold, e.g. 0.5 (50%). POTENTIAL EXAMPLE
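The shingling slides (37 to 42) can be sketched in a few lines: split the text into word tokens, take every run of w consecutive tokens as a shingle, and compare two documents by the overlap of their shingle sets (Jaccard resemblance, in the spirit of Broder et al, 1997). This is a minimal sketch under stated assumptions: the function names, the lowercase whitespace tokenizer, and the 0.5 default threshold (borrowed from the slide's "potential example") are illustrative and are not taken from Onion or from any search engine.

```python
def shingles(text: str, w: int = 4) -> set[tuple[str, ...]]:
    """Return the set of distinct w-word shingles (word n-grams) in the text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def resemblance(a: str, b: str, w: int = 4) -> float:
    """Jaccard resemblance of two texts: shared shingles / all shingles."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0  # two texts too short to shingle are treated as identical
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a: str, b: str, w: int = 4, threshold: float = 0.5) -> bool:
    """Flag a pair as near-duplicate when resemblance reaches the threshold."""
    return resemblance(a, b, w) >= threshold


for shingle in sorted(shingles("A rose is a rose is a rose")):
    print(" ".join(shingle))
```

Running this on the slide's sentence gives the same three distinct shingles the slide lists. Raising w makes matching stricter (longer runs must agree), while lowering the threshold flags more pairs as near-duplicates.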
  • 43. Broder, A.Z., Glassman, S.C., Manasse, M.S. and Zweig, G., 1997. Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8-13), pp.1157-1166.
  • 44. “We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web” (Broder et al, 1997)
  • 45. Documents grouped together to meet similar queries equally (in a cluster)
  • 46. Multiple Title Candidates For A Query: DYNAMIC, CONTEXTUAL SEARCH
  • 48. Quilting Web Pages
  • 49. DUPLICATE CONTENT TYPE – NEAR DUPE (QUILTING). Page regions: HEADER - TEMPLATE, FOOTER - TEMPLATE, ASIDE, two UNIQUE PARAGRAPH blocks, three EXTERNAL SYNDICATED blocks
  • 50. CONTENT INCLUSION: CMSs
  • 51. UNIQUE MAIN CONTENT, BUT ITS CONTENT IS INCLUDED ELSEWHERE (TEASER ‘INCLUDED’ ELSEWHERE)
  • 52. May… or May NOT be Filtered before indexing?
  • 53. Pages that look very different but meet the same user information need equally
  • 54. Cross Over, Query Class & Semantic Collisions
  • 55. Possible Treatment of Near-Duplicate Query Candidates: “If more than one candidate is determined to be part of a ‘search query cluster’, the most important one based on factors such as relevance, freshness, importance is returned. The others are eliminated.” (Henzinger / Pugh, 2012, 2016). Last updated 2016
  • 56. May… or May NOT be Filtered before indexing?
  • 57. URL Parameter-driven Ecommerce Platforms
  • 58. How Are Choosing Strategies Catered For in Ecommerce? CHOICE-ASSISTING FUNCTIONALITY (HEURISTICS): FACETED NAVIGATION & WEBSITE FILTERS == allows for ‘Elimination by Aspects’; PAGINATION == reduces ‘Too Much Choice’ effects; SORTING == caters for ‘FIRST / BEST’ choosing strategies
  • 59. And with these choice-assisting functionalities come… “Exponentially multiplicative URLs”
  • 60. Exponentially Multiplicative URLs From Faceted Navigation… 100 DRESSES, 5 COLOURS, 10 SIZES, 2 LENGTHS, 4 SUPPLIERS: 100 x 5 x 10 x 2 x 4 = 40,000 URLs
  • 61. And that’s without HTTPS, WWW/non-WWW or internationalization: 40,000 URLs x 2 (HTTPS VERSION) = 80,000 URLs; x 2 (WWW / NON-WWW VERSION) = 160,000 URLs; x 5 (EN / FR / ES / DE / IT, e.g.) = 800,000 URLs
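The arithmetic on slides 60 and 61 is a running product, which the short sketch below reproduces. The dictionary keys are labels chosen for this example; the counts are the ones on the slides.

```python
from math import prod

# Facet counts from slide 60.
facets = {"dresses": 100, "colours": 5, "sizes": 10, "lengths": 2, "suppliers": 4}

# Site-level multipliers from slide 61.
variants = {"http / https": 2, "www / non-www": 2, "languages (EN/FR/ES/DE/IT)": 5}

base_urls = prod(facets.values())
print(f"faceted navigation alone: {base_urls:,} URLs")

total_urls = base_urls
for name, factor in variants.items():
    total_urls *= factor
    print(f"x {factor} for {name}: {total_urls:,} URLs")
```

Because every factor multiplies the total, adding one more two-valued facet or parameter doubles the URL count again.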
  • 62. May… or May NOT be Filtered before indexing?
  • 63. THAT’S A LOT OF URLs FOR 100 DRESSES. Bored Googlebot (unrelated to speed)
  • 64. When You Stop Boring Googlebot
  • 65. When You Stop Boring Googlebot: NOT SPEED RELATED
  • 66. CANONICALIZATION
  • 67. The Canonical Tag - Otherwise Known As… RFC 6596, ‘THE CANONICAL LINK RELATION’ (2012)
  • 68. ‘The Canonical Link Relation’ – RFC 6596 Is Still Adhered To (2017)
  • 69. 50% OF SEOs: “SEARCH ENGINES HAVE IGNORED CANONICAL TAGS THEY HAD IMPLEMENTED” (2017)
  • 70. A lot can go wrong with mixed signals…
  • 71. There Are Many Signals To Consider In Canonicalization: 404 & 410; 301; 302, 303, 307; valid canonical from ‘context’ URL to valid target; valid hreflang (if present and applicable); Manual Action; HTTPS (Google specific); fall back to default pre-‘Canonical Link Relation’ duplicate handling signals. Strength labels on the slide: SUPER STRONG; STRONG - DIRECTIVE (x3); STRONG - HINT (x2); DEFAULT. ALL NEED TO BE IN UNISON
  • 72. “REL=NEXT / REL=PREV” IS NOT A FORM OF CANONICALIZATION (2017)
  • 73. PAGINATION & SORTING
  • 74. rel=“next” / rel=“prev”: RFC 5988 (Web Linking)
  • 75. 2011… -> We’re Still Unclear
  • 76. ‘View-all’ Search Experience
  • 77. COMMON CANONICAL MISTAKES: If a canonical is not deemed to be valid, it is likely that the pre-RFC 6596 (Canonical Link Relation) treatment of duplicates and near-duplicates will be applied, using signals such as ‘internal links’
  • 78. “301s AND 302s ARE BOTH A FORM OF CANONICALIZATION” (2017)
  • 79. COMMON CANONICAL MISTAKES: Don’t canonicalize from an “index” page to a “noindex” page or vice-versa, because this means the pages are NOT the same; the canonical will likely be ignored. If “hreflang” references an alternative which does not match a canonical link, the canonical will likely be ignored
  • 81. HOW TO RESOLVE?
  • 83. DUPLICATE CONTENT TYPE – NEAR DUPE (ADDING VALUE)
  • 84. Hubs & Authorities: BOWTIE OF THE WEB. Build Strongly Connected Components
  • 85. SORT OUT YOUR LIBRARY SYSTEM & QUERY CLASSES
  • 86. SEARCH ENGINES LOVE CATEGORIZATION
  • 87. Focused Crawling: CRAWLING CONTENT ON A SPECIFIC TOPIC FOR EFFICIENCY
  • 88. The ‘Mere’ Categorization Effect (Phenomenon) FTW: Simply labelling products / items as being part of a category, regardless of the label, appears to increase perception of variety & positive experience (Mogilner et al, 2008). HUMANS LOVE CATEGORIES TOO… IT IS A PHENOMENON
  • 89. Homonyms contribute to need for query refinement. HOMONYMS – WORDS THAT ARE SPELT OR PRONOUNCED THE SAME BUT HAVE DIFFERENT MEANINGS: ROSE, EVENING, WATCH, SINK, BACK, ARMS, BOW, CHECK. STRENGTHEN DIFFERENTIAL CONTEXT
  • 90. INTELLIGENT INTERNAL LINKING
  • 91. LOCAL NAVIGATION RELEVANCE: TABLE OF CONTENTS STYLE IN-PAGE NAVIGATIONAL HEURISTIC FOR SEARCH ENGINE AND HUMAN; PAGINATED TAB-THROUGH ON SECTIONS OF REVIEW; GRANULAR RELEVANCE
  • 92. Parameter Handling: “IS PARAMETER-HANDLING A WAY TO HELP GOOGLE BUILD A SET OF ‘DUSTBUSTER CRAWLING RULES’ EARLY?” MAKE THE RULES
  • 93. ADD VALUE TO NEAR DUPES (INFORMATIONAL VIEWS (INFORMATION ARCHITECTURE))
  • 94. INFORMATION VIEWS ADDING VALUE AND PASSING STRENGTH TO CANONICAL TARGETS
  • 96. Sitemaps Below Surface
  • 97. BUILD STRONG SECTIONS: REVIEWS, BLOG, BUYING GUIDES, COST CALCULATORS, COMMERCE, UGC. MAIN SITE THEME (ONTOLOGY). SEMANTICS RULE
  • 98. Power Mapper
  • 99. Related Content Mostly Adds Value To Other Content: that content is ‘stitched’ from elsewhere, but it is VERY useful overall & helps with searcher ‘foraging’, to create context for what it links out to
  • 100. NOT Filtered before indexing: Doc IDs meeting contextual information needs - 1 or 2 pages (max) chosen at query run-time
  • 101. NOT Filtered before indexing: fighting with each other to be ‘THE’ result. Seems like ‘dilution’
  • 102. BE VERY CAREFUL WITH ‘PRUNING’
  • 103. CAN YOU ‘IMPROVE’, ‘DE-GROUP’ OR ‘REMORPH’… RATHER THAN ‘REMOVE’?
  • 104. THAT’S A LOT OF URLs FOR 100 DRESSES. Is The Difference Substantively Different To Queries?
  • 105. Does The Repurposed or Collated Content Add ‘Additional’ Value?
  • 106. Content meeting informational needs equally is treated differently TO DUPLICATES
  • 107. GOTCHAS
  • 108. Gotchas – Velvet Blues Update (SOME) URLs (WP PLUGIN)
  • 109. BETTER SEARCH REPLACE PLUGIN REVIEWS
  • 110. WP For AMP Internal Linking Canonical Issues
  • 111. WP For AMP Internal Linking Canonical Issues
  • 112. MAGENTO GOTCHA WITH CANONICALS
  • 113. Understand The Canonical Link Relation Rules – RFC 6596: “The target (canonical) IRI MUST identify content that is either duplicative or a superset of the content at the context (referring) IRI.”
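One practical way to audit the plugin and platform gotchas above is to read which canonical target each page actually declares. The sketch below is a minimal illustration using only Python's standard-library HTML parser; the class and function names are hypothetical, and it checks only `<link rel="canonical">` elements in the markup, not HTTP headers or any of the other canonicalization signals the slides mention.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; match case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list[str]:
    """Return all canonical targets declared in the markup (hypothetical helper)."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = """
<html><head>
  <link rel="canonical" href="https://www.example.com/dresses/">
  <link rel="stylesheet" href="/style.css">
</head><body>Red dresses</body></html>
"""
print(find_canonicals(page))
```

If the returned list has more than one distinct URL, the page is sending conflicting canonical declarations, which is the kind of mixed signal slide 70 warns about.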
  • 114. Google’s Maile Ohye on ‘How To Hire An SEO’
  • 115. Without &filter=0 Appended to end of Query: https://www.google.co.uk/search?q=red+dresses+size+10+long+sleeves&oq=red+dresses+size+10+long+sleeves&aqs=chrome.0.69i59.12570j0j7&sourceid=chrome&ie=UTF-8. NOBODY HAS MORE THAN ONE LISTING
  • 116. With &filter=0 Appended to end of Query: https://www.google.co.uk/search?q=red+size+10+dresses+long+sleeves&oq=red+size+10+dresses+long+sleeves&aqs=chrome..69i57.13605j0j7&sourceid=chrome&ie=UTF-8&filter=0. ALL SITES HAVE AT LEAST 2 LISTINGS. MISSED OPPORTUNITIES
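To repeat the comparison on slides 115 and 116 for other queries, the `filter=0` parameter can be appended programmatically. The helper below is a small sketch using Python's standard `urllib.parse`; the function name is made up for this example, and the only parameter it relies on is the `filter=0` shown on the slide.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_off(search_url: str) -> str:
    """Return the search URL with filter=0 as its last query parameter."""
    parts = urlsplit(search_url)
    # Drop any existing filter parameter so it is not sent twice.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "filter"
    ]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_filter_off("https://www.google.co.uk/search?q=red+dresses+size+10+long+sleeves"))
```

Comparing the results page for the original URL with the one for the returned URL shows which additional listings were being held back for the same query.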
  • 118. ASOS.com 18%
  • 119. Quadruple Listings
  • 120. Similar Content – Query Refinement SERPs: NOT FILTERED, NOT NEAR-DUPES. Does the searcher want ‘gas engineers’, ‘heating engineers’ or ‘central heating’?
  • 121. Maybe a lot of people are confused by duplicates? (2017) CONFUSING DUPLICATE, NEAR-DUPLICATE (DUST) AND SIMILAR CONTENT COULD COST YOU DEARLY: be careful about canonicalizing when unnecessary; true duplicate content & near-dupes are query and category agnostic; similar is not duplicate; you may still have the answers to different queries based on a small important difference; AT LEAST 4 TYPES OF DUPLICATE CONTENT
  • 122. Thank You
  • 123. APPENDIX
  • 124. Problems With The Many ‘Faces’ of Faceted Navigation: https://webmasters.googleblog.com/2014/02/faceted-navigation-best-and-5-of-worst.html (Wednesday, February 12, 2014). Example of faceted navigation: http://www.example.com/category.php?category=gummy-candies&price=5-10&price=over-10. Facet means ‘little faces’ (USEFUL TRIVIA)
  • 125. Relation Links – ‘Web Linking’, RFC 5988 (INTERNET ENGINEERING TASK FORCE): https://tools.ietf.org/html/rfc5988
  • 126. Internationalization – An Additional Layer of Complexity: ‘TAGS FOR IDENTIFYING LANGUAGES’, RFC 5646 (INTERNET ENGINEERING TASK FORCE): https://tools.ietf.org/html/rfc5646
  • 127. A Solution - The Introduction of hreflang. Wikipedia page on hreflang. Rules on hreflang: https://support.google.com/webmasters/answer/182192?hl=en&ref_topic=2370587 (MULTINATIONAL & MULTILINGUAL SITES AND HREFLANG); https://support.google.com/webmasters/topic/2370587?hl=en&ref_topic=4598733 (HREFLANG, Google); https://support.google.com/webmasters/answer/2620865?hl=en&ref_topic=2370587 (USE A SITEMAP FOR HREFLANG); https://support.google.com/webmasters/answer/6144055?hl=en&ref_topic=2370587 (LOCALE-AWARE CRAWLING WITH GOOGLEBOT)
  • 128. INTERNATIONALIZED RESOURCE IDENTIFIER (IRI): Internationalized Resource Identifiers (IRIs), RFC 3987 (INTERNET ENGINEERING TASK FORCE): https://tools.ietf.org/html/rfc3987
  • 129. REFERENCES
  • 130. References & Sources: Fetterly, D., Manasse, M. and Najork, M., 2003. On the evolution of clusters of near-duplicate web pages. Journal of Web Engineering, 2(4), pp.228-246. Broder, A.Z., Glassman, S.C., Manasse, M.S. and Zweig, G., 1997. Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8-13), pp.1157-1166. Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A. and Wiener, J., 2000. Graph structure in the web. Computer Networks, 33(1), pp.309-320. Mogilner, C., Rudnick, T. and Iyengar, S.S., 2008. The mere categorization effect: How the presence of categories increases choosers' perceptions of assortment variety and outcome satisfaction. Journal of Consumer Research, 35(2), pp.202-215.
  • 131. References & Sources: http://www.seobythesea.com/2008/02/new-google-process-for-detecting-near-duplicate-content/. Pugh, W. and Henzinger, M.H., Google Inc., 2016. Detecting duplicate and near-duplicate files. U.S. Patent 9,275,143. Alonso, O., Fetterly, D. and Manasse, M., 2013, December. Duplicate news story detection revisited. In Asia Information Retrieval Symposium (pp. 203-214). Springer Berlin Heidelberg. RFC 5988 – Web Linking - https://tools.ietf.org/html/rfc5988
  • 132. References & Sources: Najork, M., 2012, August. Detecting quilted web pages at scale. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (pp. 385-394). ACM. Source: Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A. and Wiener, J., 2000. Graph structure in the web. Computer Networks, 33(1), pp.309-320.