This document provides an overview of bibliometric tools and metrics that can be used to measure and evaluate research impact and productivity. It discusses common metrics such as publication counts, citations, the h-index, g-index and m-index, which are calculated using bibliographic databases such as Scopus and Web of Science. It also explores newer altmetric tools that can measure the impact of research outputs beyond publications. The document emphasizes that bibliometrics should be used cautiously and as a supplement to peer review, so that they do not create perverse incentives for researchers.
Research Librarian's Tools for Measuring Impact
1. What's in the research librarian's tool shed?
26 Aug 2014
Lucia Schoombee
Customer Consultant – Research Intelligence
Elsevier
2. Contents
1. Introduction
2. Nuts and bolts
3. High power tools
4. Hammer
5. Comprehensive, automated tools
6. New gadgets
7. Skill & know-how
8. Safety & Caution
3. Introduction
• Researchers = “simplify things and help us to be less busy”
• Evidence of productivity and impact
• Needed for: performance appraisal, promotion, funding applications, rating and inclusion in a curriculum vitae
• Profusion of indicators and tools
• All tools provide quantitative means of articulating a researcher's productivity and impact
• NB! Supplementary to peer review
4. Nuts and bolts
• Publication counts
• Mostly books, articles, book chapters & conference proceedings
• “Work not published is work not done”
• Citations
• Broadly, a citation is a reference to a published or unpublished source (not always the original source)
• Attributes scientific work and gives credit to the original source of an idea, theory or concept
• Citation analysis = measuring the impact of an author or a publication by counting the number of times that author or publication has been cited by other works
5. Indicators
• An indicator is a composite numeric symbol/expression that indicates the state or level of something
• The nuts and bolts (publication counts and citations) are used to produce indicators of research performance
• Most common indicators used in science: h-index, g-index, m-index, impact factor
6. h-index
• Developed by Prof Jorge Hirsch in 2005
• The h-index is calculated from the number of publications and the number of citations per publication
• The h-index is now recognized as an industry standard that gives information about the performance of researchers and research areas that is very useful in some situations
7. h-index formula
• A scientist has an h-index of 9 if his or her top 9 most-cited publications have each received at least 9 citations; it is 13 if the top 13 most-cited publications have each received at least 13 citations; and so on
• [A scientist has index h if h of his/her Np papers have at least h citations each, and the other (Np − h) papers have no more than h citations each]
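The definition above translates directly into a short calculation. As a minimal sketch (my own illustration, not from the presentation; the citation counts are invented), the following Python ranks a researcher's papers by citations and returns the h-index:

def h_index(citations):
    # Largest h such that h papers have at least h citations each.
    ranked = sorted(citations, reverse=True)  # most-cited first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(h_index([25, 17, 9, 9, 9, 4, 3, 1]))  # 5: the top 5 papers each have at least 5 citations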
8. From the horse's mouth
Prof Dermott Diamond, National Centre for Sensor Research, Dublin City University, talks about bibliometrics for the individual.
10. Advantages of h-index
• Combines quantity (publications) and impact (citations)
• Objective measure of performance
• Insensitive to papers with few citations
• Better than other single-number criteria: impact factor, total number of documents, total number of citations, citations-per-paper rate and number of highly cited papers
• Easy to obtain
• Easy to understand
11. Limitations of h-index
• Publication and citation patterns vary between disciplines
• Not time sensitive
• Citations accrued by already highly cited papers are not reflected in the h-index
• Easy to obtain, so there is a risk of indiscriminate use and over-reliance
• May change the behaviour of scientists (e.g. self-citations)
• There are also technical limitations:
• Difficulty obtaining the complete output of scientists
• Deciding whether self-citations should be removed or not
12. g-index
[Given a set of articles] ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g² citations.
Source: https://who.rocq.inria.fr/Jean-Charles.Gilbert/publications/indices.jpg
13. g-index calculation
Method/calculation: rank all the documents of the unit in decreasing order of citations. The g-index is the highest rank position at which the square of the rank does not exceed the accumulated number of citations.
Source: Costas, R., & Bordons, M. (2008). Is g-index better than h-index? An exploratory study at the individual level. Scientometrics, 77(2), 267-288.
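As a rough sketch of this ranking procedure (my own illustration, not from the source; the citation counts are invented), the Python below pads the ranking with "fictitious" zero-citation papers so that, as in Egghe's definition, the g-index can exceed the number of real papers:

def g_index(citations):
    # Largest g such that the top g papers together received at least g*g citations.
    ranked = sorted(citations, reverse=True)
    total = sum(ranked)
    cumulative, g, rank = 0, 0, 1
    while rank * rank <= total:  # g can never exceed the square root of total citations
        # Ranks beyond the real list count as "fictitious" papers with 0 citations.
        cumulative += ranked[rank - 1] if rank <= len(ranked) else 0
        if cumulative >= rank * rank:
            g = rank
        rank += 1
    return g

# A hypothetical scientist with 4 papers of 10 citations each has an h-index of 4,
# but a g-index of 6: the top 6 ranks (including 2 fictitious papers) together
# hold 40 >= 36 citations.
print(g_index([10, 10, 10, 10]))  # 6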
14. m-index
• The h-index depends on the duration of each scientist's career because the pool of publications and citations increases over time.
• In order to compare scientists at different stages of their careers, Hirsch presented the “m parameter”, which is the result of dividing h by the scientific age of a scientist (number of years since the author's first publication).
• The m-index is defined as h/n, where n is the number of years since the scientist's first published paper.
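As a minimal sketch (values invented; 2014 is assumed as the evaluation year, matching the date of the presentation):

def m_index(h, first_publication_year, current_year=2014):
    # m-index = h divided by the number of years since the first publication.
    career_years = max(current_year - first_publication_year, 1)  # avoid dividing by zero
    return h / career_years

# A hypothetical researcher with h = 12 whose first paper appeared in 2002
# has a scientific age of 12 years, so m = 1.0.
print(m_index(12, 2002))  # 1.0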
16. High power tools
• Most comprehensive and robust sources of publication and citation data are citation indexes
• E.g.:
• Scopus
• Web of Science
• Google Scholar
17. Scopus at a glance
• The largest abstract and citation database of peer-reviewed research literature from around the world. Belongs to Elsevier.
• More than 21,900 titles from more than 5,000 international publishers in 105 different countries
• Over 54 million records and 23 million patents from 5 patent offices worldwide
• All content is evaluated by an independent, 15-person, international board of experts called the Content Selection and Advisory Board (CSAB)
19. What content does Scopus include?
CONFERENCES
• 17k events, 5.5M records (10%)
• Conference expansion: 1,000 conferences, 6,000 conference events, 400k conference papers, 5M citations
• Mainly Engineering and Physical Sciences

BOOKS
• 421 book series (28K volumes, 925K items)
• 29,917 books (311K items)
• Books expansion: 75K books by 2015, with a focus on Social Sciences and A&H

PATENTS
• 24M patents from 5 major patent offices

JOURNALS
• 21,912 peer-reviewed journals, 367 trade journals
• Full metadata, abstracts and cited references (pre-1996)
• >2,800 fully Open Access titles
• Going back to 1823
• Funding data from acknowledgements
• By subject field: Physical Sciences 6,600; Health Sciences 6,300; Social Sciences 6,350; Life Sciences 4,050
20. References
• 33 million records date back to 1996, 85% including references
• 21 million records date back to 1823, with no references
• 3 million records added per year (5,500 records added per day)
25. Web of Science
• ~12,000 Journal titles
• Databases:
• Science Citation Index Expanded (1900-present)
• Social Sciences Citation Index (1900-present)
• Arts & Humanities Citation Index (1975-present)
• Book Citation Index (2005-present)
• Conference Proceedings Citation Index (1993-present)
• 54 million records - 3,300 publishers - “100 years of abstracts”
• 6.5 million conference records - 760 million+ cited references
• Updated weekly - Since 1955
28. Google Scholar Limitations
• Evasive about exactly which resources it indexes
• Update schedule unknown
• Definition of “scholarly” includes grey literature, such as article preprints, conference proceedings, and other materials that are not peer-reviewed
• BUT Google Scholar is “freely” available and easy to use
29. Hammer
• Journal Impact Factor
• A widely used journal metric, and the most notorious
• Represents the impact a journal has in relation to other journals in a specific field
• Measures the frequency with which the average article in a journal has been cited in a particular year
• Despite criticism, widely used to articulate an author's reputation
30. Comprehensive, automated tools
• Multi-dimensional, flexible and efficient tools that handle complex queries and simulations and can compare entities
• E.g. SciVal & InCites
31. SciVal in a nutshell
SciVal offers access to the research performance of 220 countries and 4,600 research institutions worldwide, as well as groups of institutions.
• Visualize research performance: ready-made, at-a-glance snapshots of any selected entity
• Benchmark your progress: flexibility to create and compare any research groups
• Develop collaborative partnerships: identify and analyze existing and potential collaboration opportunities
32. The layers of SciVal
Using advanced data analytics and supercomputer technology, SciVal allows you to instantly process an enormous amount of data and generate powerful data visualizations on demand, in seconds.
• Scopus data only
• 1996 onwards
• Weekly updates
• Query around 75 trillion metric values
35. Gadgets
• One often also finds new, unopened gadgets which seem great but are difficult to figure out. Altmetrics fall into this category: they show a lot of promise but are not yet fully understood or trusted.
• Variety of research outputs: datasets, software, posters, slides, videos, websites and articles
• Impact is measured by: number of tweets, bookmarks, Likes on Facebook, blog posts, media mentions, etc.
• Altmetrics track how many times the outputs have been viewed, downloaded, cited, reused/adapted, shared, bookmarked, or commented upon
36. Altmetrics tools
A few examples of the rapidly growing number of new tools for measuring the impacts of these kinds of research outputs include:
• Altmetric: analyses articles and datasets
• ImpactStory: analyses articles, datasets, blog posts, software, etc.
• Topsy: an index of the public social web
• SlideShare: analyses videos, presentations, slides and documents posted to its website
• Plum Analytics and CitedIn
37. Using Altmetrics.com
• Go to http://www.altmetric.com/
• Librarians & researchers: request a free trial account of the Explorer using the form at the bottom of the page
• Go to Product > Explorer and Sign in
• Pick an article by keyword, journal, etc.
• N.B. Altmetrics is an article-based metric
38. Altmetric donut
• Badge can be embedded
• Add “altmetric it” to browser toolbar
39. Altmetrics in Scopus
• Appears in the sidebar of Scopus articles
• Only when there is data available for the article being viewed
• E.g. Priem, J., Groth, P., & Taraborelli, D. (2011). The altmetrics collection. PLoS ONE, 7(11), e48753.
40. Impactstory
• Go to www.impactstory.org
• Create a profile
• ImpactStory collects your products from SlideShare, ORCID, Figshare and GitHub
41. Skill and know-how
• Ale Ebrahim, Nader, et al. "Effective strategies for increasing citation frequency." International Education Studies 6.11 (2013): 93-99
• Visibility is the key to higher citations
• By reviewing the relevant articles, this paper extracted 33 different ways of increasing citation possibilities
42. Skill and know-how (2)
• Use a unique name consistently
• Use a standardised institutional affiliation and address, using no abbreviations
• Assign keyword terms to the manuscript
• Publish in a journal with a high impact factor
• Self-archive articles
• Team-authored articles get cited more
43. Skill and know-how (3)
• Write review articles
• Contribute to Wikipedia
• Join academic social networking sites
• Share detailed research data
• Publish your work in a journal covered by the highest number of abstracting and indexing services
• Create an online CV
• Make a podcast about your research
44. Safety
• Note the limitations of citation analysis
• Papers are cited for a variety of reasons, some of which are not related to the value of the research
• Citation analysis relies heavily on journal content; however, many subject areas prefer to publish books and are therefore not well represented in citation indexes
• The major indexes have an STM focus
45. Caution
• Indicators should be a means to an end, not an end in themselves
• Use indicators in combination (never rely on one indicator)
• Peer review should still be regarded as the primary judge of science
• Avoid perverting the science process; use metrics responsibly
• Guard against “publication obesity”
When asked about the kind of support they need, a common response of researchers is “simplify things and help us to be less busy”.
In reality the opposite is true – researchers increasingly have to give evidence of their productivity.
This “evidence” is mostly required for:
performance appraisal
promotion
funding applications
rating
include in curriculum vitae
The prospect of distinguishing between research indicators and understanding the advantages and pitfalls of each can be quite bewildering.
The aim of the presentation is to demystify the metrics involved at the micro level to quantify individual researchers’ performance.
I will try to do this in simple terms, highlighting the advantages and limitations of each.
The first items we will find in our tool shed are Nuts and Bolts which represent PUBLICATION COUNTS and CITATIONS.
Publication counts are mostly: books, articles, book chapters & conference proceedings.
Also: theses & dissertations / datasets / patents and other IP / presentations / reports / etc
NB because - in science - the saying goes: “Work not published is work not done”
A citation is a reference to a published (or unpublished) source.
Regarded as a measure of value and indicates contribution to a scientific field
When talking about Citation analysis – We refer to measuring the impact of an author or a publication, by counting the number of times that author or publication has been cited by other works
Publication and citation counts = units used to build indicators (these often look very complicated)
What is an indicator? It is a composite numeric symbol/expression that indicates the state or level of something.
Four common indicators used in science today are the h-index, g-index, m-index and impact factor, which we will look at next.
Firstly,
There's the h-index, which is an indicator that uses a researcher's number of citations in relation to his or her number of articles as a measure of impact.
h-index is now recognized as an industry standard that gives information about the performance of Researchers and Research Areas that is very useful in some situations
FOR INTEREST:
Developed by Prof Jorge Hirsch in 2005
Hirsch is an Argentine American professor of physics at the University of California, San Diego.
He introduced the concept in 2005 by means of a paper (An index to quantify an individual's scientific research output) published in Proceedings of the National Academy of Sciences of the United States of America, 102(46), 16569-16572.
The concept was specifically novel in that it used both publications and citations in determining an individual's performance.
The h-index proposes that a scientist has an h-index of 9 if his top 9 most-cited publications have each received at least 9 citations.
It is 13 - if the scientist’s top 13 most-cited publications have each received at least 13 citations; and so on
For the introduction, I have a short video by Prof Diamond who explains very neatly the role of metrics in determining an individual researcher's impact.
Until a few years ago universities measured their research endeavour primarily by number of publications.
The focus was particularly on the impact factors of the journals in which researchers were publishing.
Increasingly, however, research performance is being measured by looking at a researcher's number of publications and the number of citations per publication.
This is because the number of times that a publication is cited is regarded as a measure of the value (or impact) of the publication.
When illustrated in a graph, the h-index lies on the 45° line where the number of citations intersects the number of publications, with the number of citations given on the Y-axis and the papers numbered in order of decreasing citations on the X-axis.
Advantages of h-index:
Combines quantity (publications) and impact (citations).
It is regarded as an objective measure of performance
Insensitive to papers with few citations
Better than other single-number criteria - impact factor, total number of documents, total number of citations, citation per paper rate and number of highly cited papers.
Easy to obtain – in citation databases such as Scopus and Web of Science
Easy to understand.
Source: Costas, R., & Bordons, M. (2007). The h-index: Advantages, limitations and its relation with other bibliometric indicators at the micro level. Journal of Informetrics, 1(3), 193-203.
Limitations of h-index:
Publication and citation patterns vary between subject fields
High publication/citation rates: neurosciences, life sciences, pharmacology and toxicology.
Low publication/citation rates: mathematics, computer science, arts and humanities.
The h-index favours more established scientists who have had a longer time to accumulate citations.
Highly cited papers that continue to accumulate citations are not reflected in the h-index.
Over-reliance, because it is easy to obtain.
Can provoke changes in the publishing behaviour of scientists, such as an artificial increase in the number of self-citations.
There are also technical limitations:
Sometimes difficult to obtain the complete output of scientists with very common names
Difficult to decide whether self-citations should be removed or not (Self-citations can be misused but is also considered acceptable science practice)
The g-index was proposed by Leo Egghe in a paper published in 2006
The aim of the g-index was to rectify the fact that the h-index did not reflect a scientist’s most highly cited paper
It therefore tried to improve on the h-index by giving more weight to highly-cited articles.
g is the largest number of articles that together received at least g² citations.
A scientist with a g-index of 59 means that his/her top 59 papers, together, have at least 59 × 59 (g²) citations.
The g-indices of researchers vary more than their h-indices, which makes the g-index a useful measure for distinguishing between researchers.
From a practical point of view, in order to obtain the g-index of a scientist, it is necessary to rank all the documents by decreasing order of citations
The position where the square of the rank is equal to the sum of citations corresponds to the g-index.
If the number of documents is not enough for the g-index calculation, a few “fictitious” documents with 0 citations are assumed in order to complete the calculation (several examples of the calculation can be examined in Egghe [2006]). For example, a hypothetical scientist with 4 documents can obtain an h-index of 4 and a g-index of 6; two “fictitious” documents with 0 citations are needed in order to calculate the g-index properly.
Another limitation of the h-index is that it favours more established scientists who have had a longer time to accumulate citations
To correct this problem, and because the pool of publications and citations increases over time, Hirsch introduced the m indicator, which is the result of dividing h by the scientific age of a scientist (number of years since the author's first publication).
The m-index is therefore defined as h divided by n, where n is the number of years since the first published paper of the scientist
It is often advisable to use all three indicators when evaluating researchers (appointment, promotion, for collaboration, etc.)
In this table the h-, g- and m-indices of four researchers are listed.
Dr Ray has the highest h-index, indicating a pattern of consistently high citations.
Dr Smith has the highest g-index which shows that he has published work with extraordinary impact
Dr Ray, however, also has the highest m-index, which indicates that he has accumulated the most impact relative to the span of his career.
Where the h-index is generally available in citation indexes, this is not the case for the g- and m-index.
Having looked into the nuts and bolts, we can now look at where to find this information.
To find citation information you need to use a citation index
Examples of citation indexes are Scopus, Web of Science and Google Scholar
Citation indexes are young inventions, dating back to the 1960s, when the idea arose that articles could be indexed by using citation information as retrieval terms instead of keywords and descriptors assigned by indexers.
So, a citation index is a bibliographic index which includes not only a specific publication but also the references that were made to that publication.
It was a very significant invention in terms of information retrieval, and even more significant in terms of research measurement and evaluation.
However some limitations:
The focus is on journal literature - to the exclusion of books
They generally have a strong English language bias and have limited coverage of foreign language journals
There is an emphasis on journals published in the US, UK and Western Europe
Coverage of science journals is superior to that of social sciences and humanities journals.
Slide shows the spread of publishers included in Scopus
On the right are the scientific fields covered. Scopus consists of:
32% Health Sciences
30 % Physical Sciences
23 % Social Sciences
15 % Life Sciences
Focus is on journals
Also covers 17,000 conferences
Includes 30,000 books (by 2015 books will expand to 75,000, focusing on social sciences and humanities).
Patents from 5 patent offices
Currently Scopus has 33M records with references that date back to 1996
21M records without citations date back to 1823 (Expansion Project will take citations back to 1970 and fill up the gaps after 1996)
3 million records added per year
(5,500 records added per day)
On the Scopus SEARCH page, select AUTHOR SEARCH
Type in the name of the author, initial, affiliation
Click Search
Select “View Citation Overview”
The “Citation Overview” covers citations received since 1996
Filters include:
An option is provided to exclude self-citations and citations from books.
An option is provided for sorting (by year or by citations)
The date range can be set
The h-index is provided
All publications are listed aggregated by year and by total
From the “Citation Overview” you can go to “The Author Evaluator” which visualises the citation report using graphs
The “Author Evaluator” provides complete information about all publications and citations
It once again provides options to eliminate self-citations, etc.
The graph shown here is Van Helden's h-index.
~12,000 Journal titles
The Web of Science interface provides online access to the following databases:
Science Citation Index Expanded (1900-present)
Social Sciences Citation Index (1900-present)
Arts & Humanities Citation Index (1975-present)
Book Citation Index (2005-present)
Conference Proceedings Citation Index (1993-present)
54 million records - 3,300 publishers - “100 years of abstracts”
6.5 million conference records - 760 million+ cited references
Updated weekly - Since 1955
Google Scholar was established in 2004 and is the other big player in the citation business
To use Google Scholar’s citation information, you have to create a profile under My Citations
For this, Go to Google Scholar (http://scholar.google.co.za/)
Click on “My Citations”
Complete registration form
Google Scholar will collect your articles and create a profile page automatically. The profile page lists your articles and citations and creates an h-index for you based on the data in Google Scholar.
Google Scholar does have its limitations though
They are evasive about exactly which resources are indexed
The Update schedule is unknown
Their definition of “scholarly” includes grey literature, including article preprints, conference proceedings, and other materials not peer-reviewed
Duplication of records
The advantage however is that Google Scholar is “freely” available and easy to use
Next tool is the HAMMER (the tool you use when you don’t know any other)
IMPACT FACTOR is a Journal Metric used to represent the impact a journal has in relation to other journals in a specific field
It divides the number of citations a journal received by the number of articles it published, over a two-year period.
NOTORIOUSLY used to evaluate a researcher's performance.
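As a rough sketch of the standard two-year calculation (all numbers below are invented for illustration):

# IF(2014) = citations received in 2014 to items published in 2012-2013,
#            divided by the number of citable items published in 2012-2013.
citations_2014_to_items_2012_2013 = 920
citable_items_2012_2013 = 400

impact_factor_2014 = citations_2014_to_items_2012_2013 / citable_items_2012_2013
print(impact_factor_2014)  # 2.3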
Then you find – really comprehensive, sophisticated tools (put wood in on one side and a chair comes out the other side)
These tools offer multi-dimensional, flexible and efficient features to handle complex queries and simulations, and are able to compare entities.
I'm going to focus on SciVal.
SciVal is a platform which allows you to configure, visualize and export information about countries, institutions and individuals
Provides:
Ready-made- at a glance snapshots
Benchmarking using 15 metrics
Evaluation of co-authorships
SciVal consists of 3 modules: overview, benchmarking and collaboration
Data-source is Scopus,
Updated weekly with data from Scopus
SciVal makes use of HPCC supercomputing technology to calculate its metrics, giving fast access.
Users can select the metrics they want to use, to explore the entities of their choice.
Allows you to use a large number of “ready made” entities such as 4600 institutions, 223 countries
You can also create your own entities, such as researchers, groups of researchers, research areas.
One often also finds new and unopened gadgets which seem great but are difficult to figure out.
Altmetrics fall into this category, showing a lot of promise, but not yet being fully understood and trusted.
Looks at a variety of research outputs: datasets, software, posters, slides, videos, websites and articles
Impact is measured by: number of tweets, bookmarks, Likes on Facebook, blog posts, media mentions, etc.
Altmetrics tracks how many times the outputs have been viewed, downloaded, cited, reused/adapted, shared, bookmarked, or commented upon
A few examples of the rapidly growing number of new tools for measuring the impacts of these kinds of research outputs include:
Altmetric analyses articles and datasets
ImpactStory analyses articles, datasets, blog posts, software, etc.
Topsy an index of the public social web
Slideshare analyses videos, presentations, slides and documents posted to its website
Plum Analytics and CitedIn
Altmetric watches:
Social media sites (e.g. Twitter, Facebook, Pinterest, Google+), science blogs,
Mainstream media outlets (NY Times, The Guardian, Scientific American, New Scientist)
Reference managers for mentions of academic papers.
It cleans up this data, enriches it and then allows authors, readers and researchers to see it in context
When you find the article you're interested in, an Altmetric score will be shown: a score surrounded by a donut.
Colours reflect the sources
blue for Twitter
yellow for blogs
red for mainstream media sources
and so on.
Calculations: the exact algorithm is not public, but it looks at the number of mentions and weights the sources.
E.g. a news story counts for more than a Facebook post, and attention from a researcher counts for more than an ordinary tweet.
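As a purely illustrative sketch of that idea (Altmetric's real algorithm and weights are not public, so the source weights and mention counts below are invented):

# Hypothetical weighted attention score: mentions from different source types
# contribute different amounts to the total.
SOURCE_WEIGHTS = {
    "news": 8.0,       # a news story counts for more...
    "blog": 5.0,
    "twitter": 1.0,    # ...than a tweet
    "facebook": 0.25,  # or a Facebook post
}

def attention_score(mentions):
    # Sum mention counts per source, weighted by source type.
    return sum(SOURCE_WEIGHTS.get(source, 0) * count
               for source, count in mentions.items())

print(attention_score({"news": 2, "blog": 1, "twitter": 30, "facebook": 4}))  # 52.0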
Altmetrics has been incorporated in Scopus
Appears in sidebar of Scopus articles
Only when there is data available for the article being viewed
ImpactStory is an open source, web-based tool that provides altmetrics to help researchers measure and share the impacts of all their research outputs—from traditional ones such as journal articles, to alternative research outputs such as blog posts, datasets, and software.
It is similar to Google Scholar Citations, in the sense that you go to the website, create a profile and import your products
It provides context to its metrics so that they are meaningful to non-experts
The metrics provided by ImpactStory can be used by researchers who want to know how many times their work has been downloaded and shared
(beyond only considering citations to journal articles)
The tool shed would be worthless if it didn't go hand in hand with someone who knows how to use the tools and has a clear vision of what he/she wants to accomplish.
I would therefore like to bring to your attention an article by Ale Ebrahim
In which he declares that “Visibility is the key to higher citations”
and he reviews articles that suggest ways to increase citations
(some of the suggestions are whimsical and others are noteworthy – we’ll have a look at a few)
Finally – a few words of caution. Although we’ve looked at citations as a powerful means of measuring and enhancing your research footprint, it does have shortcomings that should be taken into account.
Keep in mind that papers are cited for a variety of reasons, some of which are not related to the value of the research. E.g.
Citing friends or colleagues
Citing certain authors to show respect for their work
A paper with errors may be cited heavily
Self-citations (to inflate personal citation counts)
Citation analysis relies heavily on journal content. Many subject areas, however, prefer to publish books and are therefore not well represented in citation indexes.
Also keep in mind that the major citation indexes have an STM bias.
CAUTION
Indicators should be a means to an end, not an end in themselves.
Use indicators in combination (never rely on one indicator).
Peer review should still be regarded as the primary judge of science.
Avoid perverting the science process; use metrics responsibly.
Guard against “publication obesity”.