2. Introduction
Hans Peter Luhn (1st July 1896-19th August 1964)
worked at IBM as a computer scientist and later
became manager of the information retrieval research
division.
He created the Luhn algorithm, KWIC indexing, etc.
Luhn was the first person to propose counting term
occurrences to identify the subject of a document.
3. KWIC
H.P. Luhn and his associates produced and distributed
copies of machine-produced permuted title indexes at
the International Conference on Scientific Information
held in Washington in 1958. The technique was named the
Keyword-in-Context (KWIC) index.
The American Chemical Society adopted KWIC in 1961 for
the publication of ‘Chemical Titles’.
4. Cont…
KWIC is a form of word index in which each occurrence
of the keyword is displayed together with its surrounding
words in a list of strings.
In a KWIC index the keyword appears in the center.
KWIC is an automatic indexing system.
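The centered layout described above can be sketched in a few lines of Python. This is only an illustration: the function name kwic_lines and the width parameter are assumptions made here, not part of Luhn's original system.

```python
def kwic_lines(text, keyword, width=30):
    """Return one display line per occurrence of keyword in text.

    The left context is right-aligned in a fixed-width column so that
    every keyword starts at the same position (the "center" column).
    """
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[:i])[-width:]      # words before the keyword
            right = " ".join(words[i + 1:])[:width]  # words after the keyword
            lines.append(f"{left:>{width}}  {word}  {right}")
    return lines


for line in kwic_lines("Essentials of Business communication", "business"):
    print(line)
```

With the default width of 30, the keyword always begins at column 32, whatever the length of the words before it.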
5. Three parts in KWIC Index
Keywords:
Subject-denoting words which serve as approach
terms.
Context:
The selected keywords also specify the particular
context of the document (the rest of the words in the
title).
Identification or location code:
An address code is used to lead the user to the full
bibliographic description of the document.
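One possible way to model these three parts is a small record type. The sketch below is an assumption for illustration only: the class name KwicEntry, its field names, and the render layout are hypothetical and do not come from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KwicEntry:
    keyword: str  # subject-denoting approach term
    context: str  # rest of the words in the title
    code: str     # identification or location code

    def render(self) -> str:
        # Illustrative layout only; real KWIC listings align these in columns.
        return f"{self.keyword}: {self.context} [{self.code}]"


entry = KwicEntry("Business", "Essentials of ... communication", "580")
print(entry.render())
```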
6. KWIC indexing stages
1) Preparation of the ‘Stop list’. It is the list of
words that have no value for
indexing/retrieval.
2) Selection of keywords from the title of the document
after omitting stop words.
3) Index generation by rotating the title
and placing each keyword in the appropriate place with
meaningful context.
7. Cont…
4) Separate the last and first words of the title with a
stroke ‘/’ or an asterisk ‘*’ in each entry, and print
the keywords in bold letters.
5) In each entry, print the identification code at
the right-most end.
6) Arrange the entries in alphabetical order based on
the keywords.
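The six stages can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the stop list is a small made-up sample, every non-stop word is treated as a keyword (so a compound term such as "Library science" would yield two entries), and bold printing is not represented in plain text. The names STOP_WORDS and kwic_entries are hypothetical.

```python
# Stage 1: an assumed, deliberately tiny stop list.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_entries(title, code, stop_words=STOP_WORDS):
    """Build rotated title entries, one per keyword, sorted alphabetically."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in stop_words:
            continue  # Stage 2: stop words never become keywords
        head = " ".join(words[i:])  # Stage 3: rotate so the keyword leads
        tail = " ".join(words[:i])
        # Stage 4: '/' marks where the title's last word meets its first word.
        # Stage 5: the identification code goes at the right-most end.
        entry = f"{head}/{tail} {code}" if tail else f"{head}/{code}"
        entries.append(entry)
    # Stage 6: alphabetical order by the leading keyword, ignoring case.
    return sorted(entries, key=str.lower)


for entry in kwic_entries("Essentials of Business communication", "580"):
    print(entry)
# Business communication/Essentials of 580
# communication/Essentials of Business 580
# Essentials of Business communication/580
```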
8. Examples
Title:
Essentials of Business communication – 580
Keywords in the title:
Essentials
Business
Communication
Entries:
1. Essentials of Business communication/580
2. Business communication/Essentials of 580
3. Communication/Essentials of Business 580
Arrangement of Entries:
1. Business communication/Essentials of 580
2. Communication/Essentials of Business 580
3. Essentials of Business communication/580
9. Cont…
Title:
Encyclopedia of research on library science – 020.72
Keywords in the title:
Encyclopedia
Research
Library science
Entries:
1. Encyclopedia of Research on library science/020.72
2. Library science/Encyclopedia of research on 020.72
3. Research on Library science/Encyclopedia of 020.72
Arrangement of Entries:
1. Encyclopedia of Research on library science/020.72
2. Library science/Encyclopedia of research on 020.72
3. Research on Library science/Encyclopedia of 020.72
10. Merits of KWIC
It is always current and based on the actual words of
the authors.
It is easy to make compared to other indexing systems.
Because of the mechanical method of preparation,
more information can be displayed than with manual
methods.
The cost is low because of machine compilation.
11. Demerits of KWIC
Lack of terminology control, because it is based
entirely on the title of the document.
Some titles are misleading and direct users to
unrelated documents.
12. Conclusion
KWIC is based entirely on machine processing. It makes
it easy for users to retrieve all the documents related
to the keywords they supply, because it brings
related documents together. Luhn's development is very
useful to information providers because the index is
easy to produce.