Ispell has a long and convoluted history. I have tried to track down
as much as possible about it and condense it below.

THE DEVELOPMENT OF SPELL-CHECKING AND THE FIRST ISPELL

The following background information on spelling checkers in general,
and ispell in particular, was provided to me by Les Earnest
(les@dec-lite.stanford.edu):
> The earliest spelling checker (of sorts) of which I am aware was in a
> program that attempted to automatically receive human-keyed Morse
> code, which can be ambiguous because of the variable timing between
> dots, dashes, intercharacter pauses, and interword pauses. This
> program didn't use a full dictionary; instead, it used a table of
> digraphs (two-letter sequences) that occur in English and barred
> improper letter sequences. This program was written by someone at MIT
> Lincoln Lab around 1959 and, I think, ran on the TX-2 computer there.
> Unfortunately, I don't remember his name. I might still have the
> paper he wrote in my files but it would take a major search to find it
> and I might not succeed.
>
> A program that I wrote in 1961 to read cursive writing contained a
> real spelling checker, using the 10,000 most common English words.
> It is reported in:
> L. Earnest, "Machine Recognition of Cursive Writing," Information
> Processing 62, (Proc. IFIP Congress 1962, Munich), North-Holland,
> Amsterdam, 1963.
> and
> N. Lindgren, "Machine Recognition of Human Language, Part III -
> Cursive Script Recognition," IEEE Spectrum, May 1965.
>
> I brought that dictionary to Stanford and got a PhD student to write
> a spelling checker for text in Lisp running on our PDP-6 computer at
> the Stanford Artificial Intelligence Lab around 1967.
> Unfortunately, I do not remember which student it was; it could have
> been Gil Falk. It was a rather simple program (certainly much
> simpler than the earlier cursive writing program) and I didn't think
> of it as a significant development at the time.
>
> [Later], I got another PhD student, Ralph Gorin, to do a better and
> faster spelling checker sometime in the early '70s, still using my
> old dictionary. Ralph later wrote an article about it in CACM. I
> believe that he later augmented the dictionary.
[note: Ralph has since informed me that he wrote no such article. The
program was called SPELL and was written in 1971. Ralph provided me
with a reference to "Computer Programs for Spelling Correction", by
James L. Peterson, Springer-Verlag, Berlin, 1980, No. 96 in the series
"Lecture Notes in Computer Science." This book states that Ralph's
SPELL program, which was the direct ancestor of ispell, was the first
computer program written for checking the spelling of text documents.
The book is also a good source of references on spelling programs.]
> ...
>
> [Ispell] was originally written in PDP-10 assembly language and ran
> under the WAITS operating system, which is similar to TOPS-10 but existed
> only on SAIL (a dual processor KA10/PDP-6 system). It was and is called
> SPELL on that machine. It later was modified to run under Tenex and
> TOPS-20.
[Ralph mentions that SPELL was also ported to MIT's ITS and TOPS-10.]
The Tenex version of ispell was later revised by W. E. Matson (1974),
and Bill Ackerman (1978). Bill has provided the following information:
> I came across the SPELL program in 1978 on ITS. It was a port from
> Stanford, and had the names Ralph Gorin (approximately 1971) and
> Wayne Matson (1974) associated with it. I did 3 things to it:
>
> Rewrote it as a native program for ITS, and, shortly thereafter,
> TOPS-20. (I never did anything for TOPS-10, and am not aware
> that it ever ran on TOPS-10, though it may have.)
>
> Replaced the heuristics for suffix removal, which I found unreliable
> and unsatisfactory, with an algorithm that was driven by specific
> suffix flags in the dictionary. This way, the dictionary would have
> complete control over what words were legal, and there would be no
> spurious hits.
>
> Apparently most importantly, though I had no idea at the time, gave it
> the name "ISPELL", for "ITS version of spell", since I didn't
> consider myself authorized to throw away an existing program
> and overwrite it with a new one under the same name.
>
> I have not followed the history of the program since then, and do not know
> if it still uses the "suffix flags" in its dictionary. But if it does,
> I introduced them. The Ispell algorithm that uses those flags to make
> accurate decisions about the legality of words was documented in great
> detail in James Peterson's Springer-Verlag book. (He spent a semester
> at MIT while working on the book, and I provided him with a lot of
> information and documentation at that time.)
>
> Bill Ackerman
> wba@apollo.hp.com
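As a rough illustration of the flag-driven approach Bill describes, the
sketch below accepts a suffixed word only when its root's dictionary entry
carries the matching flag. The flag letters, the suffix rules, and the tiny
dictionary are all hypothetical; they are not ispell's actual flag set, and
real rules would also have to handle spelling changes such as a dropped
final "e".

```python
# Hypothetical flag table: each flag names one suffix a root may take.
# These three rules are illustrative only, not ispell's real flags.
SUFFIX_FLAGS = {
    "S": "s",
    "D": "ed",
    "G": "ing",
}

# Hypothetical dictionary: root word -> set of flags that root may take.
DICTIONARY = {
    "walk": {"S", "D", "G"},
    "cat": {"S"},
    "the": set(),
}


def is_legal(word):
    """Return True if word is a root, or a root plus a suffix its flags allow."""
    word = word.lower()
    if word in DICTIONARY:
        return True
    for flag, suffix in SUFFIX_FLAGS.items():
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            # Accept only when the dictionary entry carries this flag,
            # so the dictionary alone decides which derived forms exist.
            if flag in DICTIONARY.get(root, set()):
                return True
    return False
```

With these assumed entries, "walked" and "cats" are accepted, while
"cating" and "thes" are rejected because the roots lack the needed flags.
A blind suffix-stripping heuristic would accept them, which is the kind of
spurious hit the flags are meant to prevent.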
Michael Adler adds:
> I did work on ispell in 1982. Actually, I stole the ispell
> dictionary and suffix compression algorithm and wrote a spelling
> checker for CP/M in 8080 assembler that I very creatively called "SPELL."
> By sorting the dictionary alphabetically and using a difference encoding
> I managed to pack the entire dictionary that Bill was using in about
> 56Kb. The CP/M program read a document, sorted all the words alphabetically
> and then checked them. It then reread the document and compared words as
> it found them against the in memory, sorted and checked words. SPELL was
> around in the public domain on CP/M.
>
> I was in high school at the time and talked to Bill only over email.
> We wound up in the same compiler group at Apollo in the late 80's by
> coincidence.
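One common form of difference encoding for a sorted word list is front
coding, in which each word is stored as the number of leading characters it
shares with the previous word plus the remaining tail. The sketch below
shows that idea only; whether Michael's CP/M program used this exact layout
is not stated above, and the function names are made up for illustration.

```python
def encode(words):
    """Front-code a word list as (shared prefix length, remaining tail) pairs."""
    encoded = []
    previous = ""
    for word in sorted(words):
        # Count how many leading characters match the previous word.
        shared = 0
        limit = min(len(word), len(previous))
        while shared < limit and word[shared] == previous[shared]:
            shared += 1
        encoded.append((shared, word[shared:]))
        previous = word
    return encoded


def decode(encoded):
    """Rebuild the sorted word list from its front-coded form."""
    words = []
    previous = ""
    for shared, tail in encoded:
        word = previous[:shared] + tail
        words.append(word)
        previous = word
    return words
```

Because neighbouring words in a sorted dictionary tend to share long
prefixes, most entries shrink to a small count and a short tail.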

DEVELOPMENT OF THE C/UNIX VERSION OF ISPELL

In 1983, Pace Willisson (pace@prep.ai.mit.edu) wrote a C/Unix version
from scratch, based on the ispell documentation.
In 1987, Walt Buehring revised and enhanced ispell, and posted it to
Usenet along with a dictionary. In addition, Walt wrote the first version
of "ispell.el", the emacs interface.
Geoff Kuenning (geoff@ITcorp.com, that's me, and by the way I
pronounce it "Kenning"; the "u" is silent) picked up this version,
fixed some bugs, and added further enhancements, all of which made me
the de-facto ispell maintainer for the net. I also put quite a bit of
work into improving the quality of the dictionaries. In 1987 I began
work on the "munchlist" script, which I originally intended to be used
to add flags to personal dictionary entries. At the same time I was
studying German, and wanted to use ispell to check the papers I was
writing for that class. After thinking about it for some time, I
realized that the suffix flags could be table-driven, which would both
add flexibility and get rid of certain difficult-to-find bugs.
In 1988 I rewrote major portions of the code to do this, resulting in
the first multi-lingual version. Ole Bjoern Hessen (obh@ifi.uio.no)
in Norway alpha-tested this version and provided several important
enhancements.
Bob Devine (vianet!devine) provided two larger dictionaries (which
became the basis for english.1 and english.2) to me for inclusion
with subsequent releases.
Ashwin Ram (ram@cs.yale.edu) made substantial enhancements to Walt
Buehring's emacs interface, and provided them to me for inclusion
with an earlier release.
The emacs interface was then completely overhauled by Ken Stevens
(stevens@hplabs.hp.com), who also beta-tested the software and
without whom this posting would not have been possible. If there's a
feature in the emacs interface that you like, you probably have Ken to
thank for it. His efforts have been tireless for many years.
Martin Boyer made major contributions to the munchlist script,
including producing a version that runs under perl (see
languages/Where for instructions on how to get that version).
Philippe-Andre Prindeville provided xspell (a Motif-based X
interface), and Moritz Willers provided a NeXTStep interface.

DEVELOPMENT AND DEATH OF ISPELL 4.0

Meanwhile, and unbeknownst to me, Pace Willisson was working on his
own improvements to ispell. He focused primarily on dictionary size
and startup time. His solution was a dictionary compression algorithm
that detected and encoded frequent letter pairs. This also reduced
the time needed to read it in. Pace also changed some internal data
structures to improve startup time. Pace and I eventually discovered
each other's efforts, and discussed re-merging our changes, but we
decided that there would be too much work involved. This was partly
because I was close to a release and didn't want to delay it with an
extensive and error-prone merge.
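The general idea of encoding frequent letter pairs can be sketched as
follows. This is not Pace's actual format, which is not described here; it
is a minimal illustration that assumes plain ASCII words and assigns the
byte values 128 through 255 to the most common pairs. All names in it are
hypothetical.

```python
from collections import Counter


def build_pair_table(words, max_pairs=128):
    """Map the most frequent two-letter sequences to byte codes 128..255."""
    max_pairs = min(max_pairs, 128)  # only 128 byte values are free above ASCII
    counts = Counter()
    for word in words:
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    ranked = counts.most_common(max_pairs)
    return {pair: 128 + code for code, (pair, _) in enumerate(ranked)}


def compress(word, table):
    """Greedily replace known pairs with one byte; copy other letters as-is."""
    out = bytearray()
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in table:
            out.append(table[pair])
            i += 2
        else:
            out.append(ord(word[i]))  # assumes ASCII, so the value is < 128
            i += 1
    return bytes(out)


def expand(data, table):
    """Invert compress(): codes become pairs, other bytes become letters."""
    reverse = {code: pair for pair, code in table.items()}
    return "".join(reverse.get(byte, chr(byte)) for byte in data)
```

Each replaced pair saves one byte, so a dictionary stored this way is both
smaller on disk and quicker to read in.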
In late 1992 (if my memory serves correctly), Richard Stallman
contacted me, asking for permission to distribute ispell as part of
the GNU suite. I responded that he was welcome to distribute it, but
that I was not willing to place my software under the GNU General
Public License. Through a misunderstanding, neither of us considered the
possibility of finding a compromise license that both could live with.
So Richard started a search for an alternate version, and found Pace
working right in his back yard.
I have been told that when FSF first learned of Pace's version, they
again considered using International Ispell instead because it was
both more popular and more capable, but this idea was rejected due to
the license misunderstanding. Instead, FSF enhanced Pace's version
somewhat and called it ispell 4.0, apparently in the hopes that by
numbering the version higher, it would become the standard.
When ispell 4.0 was released, much confusion ensued. Many ispell
users innocently "upgraded" to 4.0 and then screamed when they could
not find features to which they had grown accustomed. Europeans in
general were angered by the apparent provincialism shown by the
"dropping" of international support. I found myself inundated with
questions about a version I had never heard of or seen.
One of the earliest and most common suggestions was that FSF should
rename their version "gispell". This had a lot of precedent, both in
the naming of other FSF utilities and in the then-recent change of the
suffix used by gzip from ".z" to ".gz". Unfortunately, the FSF
refused to do this. I may have inadvertently contributed to this
refusal with a Usenet posting in which I tried to clarify what had
happened, pointing out that the FSF version was more recently related
to Pace's than my own. This may have been seen as an acknowledgment
that FSF should have the rights to the name "ispell," and that I
should rename my version.
A flame war arose, and I decided that the only way to solve the
problem was to rename my version to eliminate the confusion. However,
at about the same time Richard Stallman and I began negotiating via
e-mail. We itemized and clarified his objections to my license, and I
learned from a third party that FSF is willing to distribute software
that falls under the University of California license (also known as
the Berkeley license). Richard and I agreed that if I changed my
license to be a paraphrase of the UC license, FSF would be willing to
distribute my version with no changes. Since then, ispell 4.0 has
been dropped by FSF and has pretty well disappeared from the net,
leaving 3.1 as the version of choice for nearly everyone.

OTHER CONTRIBUTORS

Many other enhancements and bug fixes were provided by the numerous
people listed below. Do not assume, because I omit mention of their
specific contributions, that these persons were any less instrumental
in creating the version of ispell that you see before you. Every one
of them made a significant contribution, and it is only a lack of
space that prevents me from detailing these contributions. This
version of ispell is truly a cooperative effort, and it would not
exist without the help of the generous souls listed above and below.
A full list of contributors, including those mentioned above, follows. (I
think I have listed everyone, but if you contributed and aren't listed,
let me know and I'll correct it):
Ivar Aavatsmark
Per Abrahamsen
Robert Abramovitz
Bill Ackerman
Michael Adler
Rohit Aggarwal
Jose Joao Almeida
Jerry Anders
Boris Aronov
Yves Arrouye
Jamshid Afshar
Michael C. B. Ashley
Bertil Askelid
Eric Backus
Isaac Balbin
Neal Becker
Tony Bennett
R. Bernstein
Jim Berry
Peter A. Bigot
E. Jay Berkenbilt
Benno Blumenthal
Uwe Bonnes
Marc Boucher
Martin Boyer
Ethan Bradford
David Brooks
Nicolas Brouard
Peter Bruells
Ferd Brundick
Jack Bryans
Walt Buehring
Richard Caley
John D. Campbell
Keith Cantrell
John Capo
Bill Carpenter
Jesus Carretero
Michael W. Chang
Steven Chaplin
Wei-Jou Chen
Peter Chubb
Stewart Clamen
Henri Cohen
Ken Cox
Robert Crowe
Damian Cugley
Ian Dall
Kevin Dalley
David Dalton
Neal Dalton
Hugh Daniel
Mark Davies
Frederic Devernay
Bob Devine
Paul Dickson
Casper H.S. Dik
Detlev Droege
Steve Dum
Alexander Durner
Jiri Dvorak
Les Earnest
John Eaton
David Edelsohn
Jeff Edmonds
Eric Eide
Orjan Ekeberg
Kevin Ellwood
Michael Ernst
L. Van Eycken
Rik Faith
Ralf Fassel
George Ferguson
Jeff Finger
Werner Fink
John Fitch
Peter Flatau
Jeffrey Friedl
Georg Gieseke
Ralph E. Gorin
Amos A. Gouaux
Andy Gruss
Michael Gschwind
Ron Guilmette
Bhusan Gupta
Michael A. Guravage
Chris Hadley
Mark Hanning-Lee
John Heidemann
Arne Helme
Ole Bjoern Hessen
Karl Heuer
Juergen A. Holm
Denis Howe
Joe Huber
Brian Hunt
imt3b2!imtsft (true name unknown)
Lester Ingber
Nick Ing-Simmons
Richard L. Jackson, Jr.
Michal Jaegermann
John Jendro
Bob Jewett
Trevor Jim
Gary Johnson
Gjalt de Jong
Don Kark
Dan Karron
Brendan Kehoe
Steve Kelem
Vivek Khera
Axel Kielhorn
Masahiro Kitagawa
Peter Knaggs
Don Knuth
Jim Knutson
Heinz Knutzen
Fred Korz
Sebastian Kremer
Geoff Kuenning
Ralf Lammers
Markus Lautenbacher
Jack Lawler
Cherie N. Lawrence
Charles Levert
Doug Lind
Torbjoern Lindgren
Michael N. Lipp
Ernst Lippe
Richard Lloyd
John Lu
Dean Luick
Erik Luijten
Ian MacPhedran
Martin Maechler
Ross Maloney
Albrecht Melan
Lee Melvin
Evan Marcus
Simon Marshall
Dave Mason
W. E. Matson
Meinhard E. Mayer
Rob McMahon
Bob McQueer
Dean Messing
Chris Metcalf
Jim Meyering
Hal Miller
N.O. Monaghan
Chris Moore
Bernd Mueller
Ulrich Mueller
Guido Muesch
John Murdie
Peter Mutsaers
Erik Toubro Nielsen
Gaute Nessan
Keith Neufeld
Paul Nevai
David Neves
Mike Ogush
Thorstein Ohl
Piet van Oostrum
Joe Orost
Pham Dinh-Tuan
Gildas Perrot
Francois Pinard
Israel Pinkas
Paul Placeway
Mick Pont
Jim Prescott
Philippe-Andre Prindeville
Gary Puckering
Philippe Queinnec
Ashwin Ram
Bill Randle
Christopher Rath
Joachim Reinert
Rob Riepel
Marc Ries
Loren J. Rittle
Germic Robert
Philippe Robert
Doug Roberts
Kevin Rodgers
Santiago Rodriguez
Hagen Ross
Jonathan Ross
Arie Rudich
Jonathan Ryshpan
Bruno Salvy
Rich Salz
Julio Sanchez
Paul A. Sand
Ken Scales
Bart Schaefer
Greg Schaffer
Harald Schlangmann
Joachim Schrod
Vernon Schryver
Martin Schulz
Gregory Neil Shapiro
Guy Shaw
David Shepherd
Tom Shott
Joel Shprentz
Duncan Sinclair
Vivek P. Singhal
Klaus Singvogel
George M. Sipe
David M. Smith
Perry Smith
Luis Soltero
David Spuler
Richard Stallman
Kevin B. Stanton
Kjartan Stefansson
Ken Stevens
Andreas Stolcke
Thos Sumner
Bob Sutterfield
Stefan Taxhet
Gruppe Thi
Julian Thompson
Thomas Tornblom
Michael Toy
Bill Triggs
Goeran (Göran) Uddeborg
Marc Ullman
Koaunghi Un
Arjan de Vet
Andrew Vignaux
Christoph Vogelsang
Jochen Voss
David Waitzman
Peter Watkins
Gray Watson
Patrick Weemeeuw
Edward Welbourne
Petri Wessman
Michael Wester
Peter Whaite
Jon L. White
Johan Widen
Fredrik Wilhelmsen
Moritz Willers
Pace Willisson
Joerg Winckler
Bill Wohler
Michael J. Wolski
James Woods
Frank Wuebbeling
Avishai Yacobi
Ken Yap
Benny Yih
Peter Young
Jamie Zawinski
Christos S. Zoulas