Many webmasters and SEOs undermine their own potential to rank in Google Search because the wrong pages end up ranking for their search terms. In many cases their pages are competing with each other. Google does not always know which page from a site to rank for a query, for a number of reasons; two of these are 'too similar' content and 'wrong version' indexation. There are several ways to identify this problem and to formulate strategies to fix it, using on-page signals and your own site's power and relevance to give clues from within as to which page is most important for a given query.
6. BUT… WE ARE DROWNING IN A SEA OF CONTENT…
WHAT’S WORSE THAN DUPLICATE CONTENT?
7. MAKING GOOGLE PLAY SPOT THE DIFFERENCE
‘TOO SIMILAR CONTENT’ IS ONE OF THE CANNIBALS
8. WE ARE ALL PUBLISHERS
SOURCE: http://wordpress/activity/posting
“Make more content” can have a downside
9. A LOT OF THE CONTENT IS ‘KIND OF THE SAME’
“There’s a needle in here somewhere”
“It’s an important needle too”
10. GOOGLE CAN GET CONFUSED AS TO WHICH PAGE IT SHOULD RANK FROM YOUR SITE FOR KEY TERMS AMONG ‘TOO SIMILAR’ PAGES
11. The Four Main Types Of Cannibalisation - Slideshare @jonearnshaw
http://www.slideshare.net/jonathanearnshaw/seo-46813620
12. AS ONE PAGE GOES UP… OTHER ‘TOO SIMILAR’ PAGES GO DOWN… SOMETIMES A LOT
13. SOME SYMPTOMS
• Different pages ranking for the same term
• Hanging around on page 2
• Never quite achieving those top spots
• Different URLs ranking for the same term in your industry's seasonal periods
• Not getting the CTR (click-through rate) despite ranking reasonably well
• Shared impressions in Google Search Console for different pages for the same terms
17. ARE MANY URLs FIGHTING WITH EACH OTHER FOR THE SAME QUERY?
CAVEAT: CONTEXT OF QUERY WILL ALSO COME INTO PLAY BUT GENERALLY YOU DON’T WANT SEVERAL URLS IN THE LIST
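One way to look for URLs fighting over the same query is to group a query/page export by query and flag any query where more than one URL takes a meaningful share of the impressions. The sketch below is illustrative only: the row layout `(query, url, impressions)`, the `min_share` cut-off of 10% and the example.com URLs are all assumptions, not something taken from the slides or from any Google tool.

```python
from collections import defaultdict

def find_competing_urls(rows, min_share=0.1):
    """Return {query: [(url, impressions), ...]} for queries where more than
    one URL holds at least `min_share` of that query's impressions."""
    by_query = defaultdict(lambda: defaultdict(int))
    for query, url, impressions in rows:
        by_query[query.strip().lower()][url] += impressions

    competing = {}
    for query, urls in by_query.items():
        total = sum(urls.values())
        if total == 0:
            continue
        # Keep only URLs with a meaningful share, so stray impressions are ignored.
        significant = [(u, n) for u, n in urls.items() if n / total >= min_share]
        if len(significant) > 1:
            competing[query] = sorted(significant, key=lambda p: p[1], reverse=True)
    return competing

# Hypothetical rows: (query, page URL, impressions)
sample_rows = [
    ("red shoes", "https://example.com/shoes/red/", 900),
    ("red shoes", "https://example.com/blog/red-shoes-trends/", 600),
    ("shoe size chart", "https://example.com/help/size-chart/", 400),
    ("shoe size chart", "https://example.com/shoes/", 10),
]
report = find_competing_urls(sample_rows)
for query, urls in report.items():
    print(query, urls)
```

As the caveat on the slide says, query context matters, so a flagged query is a prompt to look closer rather than proof of cannibalisation.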
22. DO A SITE:YOURDOMAIN.COM SEARCH IN GOOGLE SERPS
DO A SITE:www.YOURDOMAIN.COM SEARCH IN GOOGLE SERPS
CHECK THE DIFFERENCE
REPEAT WITH HTTPS
23. JUST A FEW OF THE CAUSES OF CANNIBALIZATION
• ‘Too similar or wrong version content’
• Skewed votes of ‘awesomeness’ from within your own site
• Anchor text inconsistency
• Lack of logic within site structure
• Not enough ‘environmental context’ surrounding the content
• Placement in site structure
• Sharing relevance for queries with other pages (even those which are not that similar)
• Optimising several pages for the same terms (even unwittingly)
• Incorrect use of canonicalisation and hreflang
24. SKEWED ‘IMPORTANCE’ VIA INTERNAL LINKING
STOP VOTING FOR THE WRONG TARGETS FROM WITHIN YOUR OWN SITE
THE MOST IMPORTANT PAGES SHOULD BE TOWARD THE TOP OF THE LIST… NOT YOUR BLOG OR BLOG CATEGORIES IDEALLY
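To see which pages your own site is 'voting' for, you can count how many distinct pages link to each internal target and sort the result. This is a minimal sketch under stated assumptions: the `(source_url, target_url)` pairs are hypothetical and would come from your own crawl, and counting distinct linking pages is only a rough proxy for internal importance.

```python
from collections import Counter

def rank_by_internal_links(links):
    """links: iterable of (source_url, target_url).
    Returns [(target, linking_pages)] ordered by the number of distinct
    source pages linking to each target, highest first."""
    # De-duplicate repeated links and ignore self-links.
    unique_edges = {(s, t) for s, t in links if s != t}
    counts = Counter(t for _, t in unique_edges)
    # Sort by count (descending), then URL, so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical crawl output
sample_links = [
    ("/", "/shoes/"),
    ("/blog/", "/shoes/"),
    ("/blog/post-1/", "/blog/"),
    ("/blog/post-2/", "/blog/"),
    ("/", "/blog/"),
    ("/shoes/", "/blog/"),
    ("/", "/shoes/"),  # duplicate link, counted once
]
ranking = rank_by_internal_links(sample_links)
for target, linking_pages in ranking:
    print(linking_pages, target)
```

In this made-up data the blog out-votes the shoes section, which is the pattern the slide warns against.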
25. BAD VERY SIMILAR CONTENT CAN BE GENERATED VIA ‘WRONG PARAMETER’ PICKUP
SOMETIMES GOOGLEBOT PICKS UP ON THE WRONG PARAMETER FIELD FROM DYNAMIC URLS AND HEADS OFF INTO THE MAZE. THE IMPORTANT URL IS LOST
CHECK LOGS FOR STRANGE URLS
THE IMPORTANT URL LOSES IMPORTANCE WHEN GOOGLEBOT ENCOUNTERS DYNAMICALLY GENERATED RANDOM BUT ‘LOGICAL’ CONTENT FROM PARAMETERS
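Checking logs for strange URLs can start with a simple count of which query-string parameters appear in Googlebot requests. The sketch below assumes the common 'combined' access-log layout and made-up log lines; the regular expression will need adjusting for your server. It matches on the user-agent string only, which can be spoofed, so treat the output as a first pass.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

# Assumed "combined" log layout: "METHOD path PROTOCOL" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_parameter_counts(log_lines):
    """Count how often each query-string parameter name appears in requests
    whose user agent contains 'Googlebot'."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        query = urlsplit(match.group("path")).query
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts

# Hypothetical log lines
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /shoes?colour=red&sessionid=abc123 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2016:13:55:40 +0000] "GET /shoes?sessionid=xyz789&sort=price HTTP/1.1" 200 5090 "-" "{GOOGLEBOT}"',
    '203.0.113.9 - - [10/Oct/2016:13:56:01 +0000] "GET /shoes?colour=blue HTTP/1.1" 200 5100 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
param_counts = googlebot_parameter_counts(sample_log)
for name, hits in param_counts.most_common():
    print(hits, name)
```

A parameter such as the hypothetical `sessionid` showing up heavily in crawler requests is the kind of 'wrong parameter' pickup worth investigating.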
26. INCONSISTENT ANCHOR TEXT INTERNAL LINKING
BE CONSISTENT IN INTERNAL ANCHOR LINKS BUT DON’T BE SPAMMY EITHER
MAKE SURE THAT THE CONTEXT AND LINK IS USEFUL TO THE VISITOR AND ALSO TO THE SEARCH ENGINE
IF YOUR ‘INTERNAL ANCHOR CLOUD’ SCREAMS ‘SPAM ON MONEY TERMS’… NOOO
AVOID BEING ‘FORMULAIC’
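Anchor text consistency can be reviewed by listing, for each anchor text, the internal URLs it points to; the same anchor pointing at several targets is a mixed signal. This is a rough sketch, assuming you already have hypothetical `(target_url, anchor_text)` pairs from a crawl. It only spots inconsistency and says nothing about whether an anchor profile looks spammy.

```python
from collections import Counter, defaultdict

def anchor_text_report(links):
    """links: iterable of (target_url, anchor_text).
    Returns (by_target, by_anchor): anchor counts per target URL, and the
    set of target URLs per normalised anchor text."""
    by_target = defaultdict(Counter)
    by_anchor = defaultdict(set)
    for target, anchor in links:
        # Normalise case and whitespace so trivial variations group together.
        text = " ".join(anchor.lower().split())
        by_target[target][text] += 1
        by_anchor[text].add(target)
    return by_target, by_anchor

def ambiguous_anchors(by_anchor):
    """Anchor texts that point at more than one target URL."""
    return {a: sorted(t) for a, t in by_anchor.items() if len(t) > 1}

# Hypothetical crawl output
sample_anchors = [
    ("/shoes/red/", "Red Shoes"),
    ("/blog/red-shoes-trends/", "red shoes"),
    ("/shoes/red/", "red  shoes"),
    ("/help/size-chart/", "shoe size chart"),
]
by_target, by_anchor = anchor_text_report(sample_anchors)
conflicts = ambiguous_anchors(by_anchor)
print(conflicts)
```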
27. USE BREADCRUMBS TO EMPHASISE IMPORTANCE & RELEVANCE IN SITES
Image credit: https://www.smashingmagazine.com/2009/03/breadcrumbs-in-web-design-examples-and-best-practices/
[Diagram: breadcrumb trail HOME > SITE SECTION > CATEGORY > PRODUCT / ARTICLE, with the labels MOST, FEWER, FEWER, SINGLE]
TEXT OUTPUT ONLY BREADCRUMB
28. SOME NOTES ON CANONICALIZATION
1) Duplicate or ‘very near duplicate’ content should be canonicalized
2) You can’t no-index a URL and then canonicalize it to something else
3) With and without a trailing slash is different
4) Don’t forget to switch your canonicals when moving to HTTPS
5) Self referencing canonicalization can help if your content gets scraped
6) Avoid canonicalizing URLs to another if it’s too different
7) Don’t canonicalize paginated URLs to the first page of the set; use next/prev
8) Remember that Google is looking for a single version of a URL to index; canonicalization is key to this
9) Don’t canonicalize just for rankings. Canonicalize to reduce duplication
10) Consider whether a 301 redirect would be the best fit
11) Review ‘URL parameters’ on dynamic sites to help choose canonicals
SOME CANONICALS MAY BE IGNORED IF GOOGLEBOT / THE SEARCH ENGINE THINKS YOU MADE A MISTAKE (e.g. too different a URL)
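A few of the notes above (2, 3 and 4) can be checked mechanically once you have a page's HTML. The following is a minimal sketch using only the Python standard library; the function names, the issue wording and the example.com URLs are illustrative assumptions, and it compares URLs as plain strings without any normalisation.

```python
from html.parser import HTMLParser

class HeadSignals(HTMLParser):
    """Collect the rel=canonical href and the meta robots content from a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = (attrs.get("content") or "").lower()

def canonical_issues(page_url, html):
    """Return a list of human-readable problems found for one page."""
    parser = HeadSignals()
    parser.feed(html)
    canonical = parser.canonical
    if canonical is None:
        return ["no canonical tag found"]
    issues = []
    if "noindex" in parser.robots and canonical != page_url:
        issues.append("noindex combined with a canonical to another URL")
    if page_url.startswith("https://") and canonical.startswith("http://"):
        issues.append("HTTPS page canonicalizes to HTTP")
    if canonical != page_url and canonical.rstrip("/") == page_url.rstrip("/"):
        issues.append("canonical differs only by trailing slash")
    return issues

# Hypothetical page
sample_html = (
    '<head><meta name="robots" content="noindex,follow">'
    '<link rel="canonical" href="http://example.com/shoes/red"></head>'
)
found = canonical_issues("https://example.com/shoes/red/", sample_html)
for issue in found:
    print(issue)
```

Whether a flagged canonical is really a mistake still needs a human look; for example, a trailing-slash difference may be deliberate if the other version redirects.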
29. “PICK YOUR TARGETS”
“Let your target pages be a clear winner always”
BUT… DO NOT OVER OPTIMISE ANYTHING EVER
YOU DON’T NEED TO OPTIMISE EVERYTHING
MAKE YOUR TARGET PAGE ‘THE MOST RELEVANT’ ON A TOPIC
30. MERGING OF ‘TOO SIMILAR’ CONTENT FOR CONCENTRATION (BULK UP (DON’T STEAL) THE POWER)
1) Variants
2) Stemming
3) Synonyms
4) Associated keywords
5) Theme level, section and page level
6) Silos
DO A CONTENT AUDIT
Find variants for a keyword in Google Search Console in content keywords
They’re potentially the same thing
You need to merge pages if they can’t stand on their own two feet alone
CONCENTRATE RATHER THAN DILUTE
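As part of a content audit, candidate pages for merging can be shortlisted by comparing body text. The sketch below uses word shingles and Jaccard similarity, which is a generic near-duplicate technique chosen here for illustration, not a method named in the slides. The three-word shingle size, the 0.5 threshold and the sample pages are all assumptions.

```python
import re
from itertools import combinations

def shingles(text, size=3):
    """Set of overlapping word n-grams ('shingles') from the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Shared shingles divided by all shingles across both sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def too_similar_pairs(pages, threshold=0.5):
    """pages: {url: body_text}. Returns [(url_a, url_b, score)] for pairs
    scoring at or above the threshold."""
    sets = {url: shingles(text) for url, text in pages.items()}
    pairs = []
    for url_a, url_b in combinations(sorted(sets), 2):
        score = jaccard(sets[url_a], sets[url_b])
        if score >= threshold:
            pairs.append((url_a, url_b, round(score, 2)))
    return pairs

# Hypothetical page bodies
sample_pages = {
    "/red-shoes/": "Buy red shoes online with free delivery on all orders",
    "/shoes-red/": "Buy red shoes online with free delivery on every order",
    "/size-chart/": "Use our size chart to convert UK and US shoe sizes",
}
merge_candidates = too_similar_pairs(sample_pages)
print(merge_candidates)
```

A high score only says the wording overlaps; deciding whether a page can stand on its own two feet is still an editorial call.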
32. ORGANISE TO SOME ORDERED LOGIC TO ADD FURTHER MEANING
‘MAKE LIKE A LIBRARY’
A BLOG IS NOT A LIBRARY
IT IS A RANDOM COLLECTION OF MANY UNRELATED THINGS
33. INTENT TO CONTENT MAPPING
BOTH HUMANS AND SEARCH ENGINES NEED A LOGICAL STRUCTURE AND SIGNS TO UNDERSTAND THE CONTENT WHICH MATCHES QUERY INTENT. THIS CAN HELP TO AVOID CANNIBALISATION
34. THE LOGIC RELEVANCE DIRECTOR ‘THE PERMANENT TOPICAL HUB PAGE’
CATS
GALLERY (Pictures of cats). Queries: cat pictures, cat picture categories, kitten pictures, etc
FORUM: Chatting cat lovers. Long tail community question answering. Builds the buzz around the rest of the site too
BREEDERS: A directory of breeders categorised internally by cat type. Cat breeders
PET CARE: Lots of help and advice on the care of cats. Queries: My cat is ill, cat care, sick cat
THE CAT PAGE IS MOST POWERFUL - but relevance & queries are clearly matched with intent
‘Wall-ties’ of semantically ‘related’ relevance
35. INTENT TO CONTENT FRAMEWORK
HUB CONTENT
HERO CONTENT
THE BRAND FRONT DOOR
HELP CONTENT
An environment related to ‘support’, ‘questions’, ‘technical Q and As’, ‘opening times’
ANSWER QUESTIONS on PRODUCTS ETC, ideas, solutions, styles
This is where the community, blogs, trends, fashion, opinions, thoughts, ramblings and content less focused on transaction and sales belong. Don’t compete with your hero pages here. Build the buzz to support the sales. UGC ideas too
[Diagram labels: CAT CAT CAT / sc sc sc sc / P P P P P]
e.g. Query: Shoes, buy shoes, red shoes, etc
E.g. US shoe sizes, shoe size charts, shoes don’t fit, shoe styles, shoe reviews
The Shoe Site
Strong power here
36. 21 WAYS TO AVOID SITE CANNIBALISATION
1) Utilise internal linking connecting related items & sections, navigation, breadcrumbs and some contextual (BUT DO NOT BE SPAMMY)
2) Anchor text consistency (BUT DO NOT BE SPAMMY)
3) Canonicalization of near similar content
4) Page title differentiation
5) H1, H2, H3 tag differentiation
6) Intent to content mapping & topical permanent hub pages
7) If you can’t improve (for now), noindex (temporarily) and then improve
8) Exclusion of near duplicate non-preferred version from XML sitemap
9) Exclusion of 301 redirects from XML sitemap
10) Verify all possible versions of your site in Google Search Console (http/https/www/non-www)
11) Choose 1 version of your site (HTTPS/HTTP/WWW/NONWWW) and 301 redirect other versions
12) Review your pages for duplicate body content
13) Check for multiple similar URLs sharing impressions for queries in GSC
14) Keep boilerplate areas to a minimum
15) Avoid ‘filler’ or ‘placeholder’ content just for the sake of having content
16) Avoid ‘spinning content around’ in different parts of your site without having decent unique content
16) Use breadcrumbs to include site section to provide further context
17) Review parameters and check to see that Googlebot is not crawling the wrong ones
18) Fold up thin content and merge to make ‘great content’ which stands out on its own
19) Review internal links and chop out any redirect chains to end target
20) Check any hreflang tags to make sure right content is ranking in right language if internationalised
21) Check server logs on larger sites to check for abnormalities & Googlebot visiting strange places
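Point 19 (chopping out redirect chains) can be sketched as follows: given a map of known redirects, follow each internal link target to its end target and list the links that should be updated. The `redirects` dictionary and link pairs here are hypothetical inputs you would build from your own crawl; nothing is fetched over the network.

```python
def resolve_redirect(url, redirects, max_hops=10):
    """Follow `redirects` ({from_url: to_url}) and return (final_url, hops).
    Raises ValueError on a loop or when max_hops is exceeded."""
    seen = {url}
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen or hops > max_hops:
            raise ValueError(f"redirect loop or chain too long at {url}")
        seen.add(url)
    return url, hops

def links_to_update(internal_links, redirects):
    """internal_links: iterable of (source_url, target_url).
    Returns [(source, old_target, final_target, hops)] for every link whose
    target redirects at least once."""
    fixes = []
    for source, target in internal_links:
        final, hops = resolve_redirect(target, redirects)
        if hops:
            fixes.append((source, target, final, hops))
    return fixes

# Hypothetical redirect map and internal links
sample_redirects = {
    "http://example.com/shoes": "https://example.com/shoes",
    "https://example.com/shoes": "https://www.example.com/shoes/",
}
sample_internal_links = [
    ("/blog/", "http://example.com/shoes"),
    ("/", "https://www.example.com/shoes/"),
]
fixes = links_to_update(sample_internal_links, sample_redirects)
for source, old, final, hops in fixes:
    print(f"{source}: {old} -> {final} ({hops} hops)")
```

The same resolved end targets are the URLs you would keep in the XML sitemap under points 8 and 9, leaving the redirecting versions out.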
37. “EMPHASISE IMPORTANCE”
“Make sure you’re giving the right hints on URLs to Google from within your own website”
ALWAYS REMEMBER ‘NUANCES’ ARE KEY
BIG TAKE AWAY
38. TWITTER - @dawnieando
GOOGLE+ - +DawnAnderson888
LINKEDIN - msdawnanderson
MOVE IT MARKETING - www.move-it-marketing.co.uk
THANK YOU