Keyword in context (KWIC) indexing is a technique for indexing a
collection of descriptions, each of which is a short piece of text (twenty to
thirty words or so) such as an article title. The principal operation in KWIC
indexing is rotation, which consists of circularly shifting the description
right (or left) one word at a time, once per word, to create a set of
descriptions. For example, the
title "Multithreaded, Parallel and Distributed Programming", when
rotated, produces the following set:
Multithreaded, Parallel and Distributed Programming
Parallel and Distributed Programming Multithreaded,
and Distributed Programming Multithreaded, Parallel
Distributed Programming Multithreaded, Parallel and
Programming Multithreaded, Parallel and Distributed
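Rotation can be sketched in a few lines of Python. This is only an illustration, not a required implementation: the function name rotations is made up here, words are assumed to be separated by whitespace, and the shift direction is left, to match the example above.

```python
def rotations(description):
    """Return every circular left shift of a description, one per word."""
    # Assumption: words are whatever str.split() yields, so punctuation
    # stays attached to its word (e.g. "Multithreaded,").
    words = description.split()
    # Shift i moves the first i words to the end of the description.
    return [" ".join(words[i:] + words[:i]) for i in range(len(words))]


if __name__ == "__main__":
    title = "Multithreaded, Parallel and Distributed Programming"
    for line in rotations(title):
        print(line)
```

Run as a script, this prints the five rotations listed above, starting with the unrotated title.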
The KWIC index of a set of descriptions is produced as follows:
- Rotate each description and add the rotations to the set.
- Sort the set to produce the index.
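The two steps above might look like the following in Python. Treat it as a sketch under stated assumptions: the name kwic_index is invented for illustration, duplicate rotations are kept rather than merged, and the sort ignores letter case (the text does not say which ordering to use).

```python
def kwic_index(descriptions):
    """Build a simple KWIC index: rotate every description, then sort."""
    rotated = []
    for description in descriptions:
        words = description.split()
        # Step 1: add each circular left shift of this description.
        for i in range(len(words)):
            rotated.append(" ".join(words[i:] + words[:i]))
    # Step 2: sort the rotations. Case-insensitive order is an assumption
    # made here so that "and ..." is not pushed after every capitalized line.
    return sorted(rotated, key=str.casefold)


if __name__ == "__main__":
    titles = ["Multithreaded, Parallel and Distributed Programming"]
    for line in kwic_index(titles):
        print(line)
```

With a plain sorted(rotated) instead, uppercase letters would sort before lowercase ones, which is a legitimate alternative if a strict character-code ordering is wanted.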
KWIC indexing can get fancier: rotated descriptions that begin with common
words, known as stop words (such as "and Distributed Programming
Multithreaded, Parallel" above), can be eliminated; the rotated descriptions
can be unrotated back to the original description after sorting (in which case
the first word of the rotated description is usually highlighted somehow in the
unrotated description); and so on. For now, though, simple KWIC indexing is
good enough.
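For reference only, the fancier variant could be sketched as below. Everything specific in it is an assumption: the stop-word list is a small made-up sample, the name kwic_entries is hypothetical, highlighting is done by wrapping the keyword in asterisks, and sorting ignores letter case.

```python
# Hypothetical stop-word list; a real index would use a longer one.
STOP_WORDS = {"a", "an", "and", "of", "the"}


def kwic_entries(descriptions, stop_words=STOP_WORDS):
    """Return unrotated descriptions, ordered by rotation, keyword marked."""
    entries = []
    for description in descriptions:
        words = description.split()
        for i, word in enumerate(words):
            # Compare without surrounding punctuation or letter case.
            keyword = word.strip(".,;:!?\"'").casefold()
            if keyword in stop_words:
                continue  # drop rotations that begin with a stop word
            rotated = " ".join(words[i:] + words[:i])
            # The unrotated form: original word order, keyword highlighted.
            marked = " ".join(words[:i] + ["*" + word + "*"] + words[i + 1:])
            entries.append((rotated.casefold(), marked))
    # Sort by the rotated text, then return only the highlighted originals.
    entries.sort()
    return [marked for _, marked in entries]


if __name__ == "__main__":
    for line in kwic_entries(["Multithreaded, Parallel and Distributed Programming"]):
        print(line)
```

For the example title this yields four entries, one each for Distributed, Multithreaded, Parallel and Programming; the rotation beginning with "and" is dropped.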