*spell.txt*     For Vim version 7.0aa. Last change: 2005 Apr 17


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Spell checking                                          *spell*

1. Quick start                  |spell-quickstart|
2. Generating a spell file      |spell-mkspell|
9. Spell file format            |spell-file-format|

{Vi does not have any of these commands}

Spell checking is not available when the |+syntax| feature has been disabled
at compile time.

==============================================================================
1. Quick start                                          *spell-quickstart*

This command switches on spell checking: >

        :setlocal spell spelllang=en_us

This switches on the 'spell' option and selects US English as the language to
check.

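A minimal sketch of the related commands, assuming the usual Vim option
syntax applies to 'spell' like to any other boolean option: >

        :setlocal spell?
        :setlocal nospell

The first command shows whether spell checking is currently on, the second
switches it off again.
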
The words that are not recognized are highlighted with one of these:
        SpellBad        word not recognized
        SpellRare       rare word
        SpellLocal      wrong spelling for selected region

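These are ordinary highlight groups, so they can presumably be adjusted with
|:highlight|. A sketch, where the colors are arbitrary and only meant as an
illustration: >

        :highlight SpellBad   ctermbg=Red    guibg=Red
        :highlight SpellLocal ctermbg=Yellow guibg=Yellow
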
Vim only checks words for spelling; there is no grammar check.

To search for the next misspelled word:

                                                        *]s* *E756*
]s                      Move to next misspelled word after the cursor.

                                                        *[s*
[s                      Move to the nearest misspelled word before the cursor.
                        DOESN'T WORK YET!


PERFORMANCE

Note that Vim does on-the-fly spell checking. To make this work fast the
word list is loaded in memory. Thus this uses a lot of memory (1 Mbyte or
more). There might also be a noticeable delay when the word list is loaded,
which happens when 'spelllang' is set. Each word list is only loaded once;
it is not deleted when 'spelllang' is made empty. When 'encoding' is set
the word lists are reloaded, thus you may notice a delay then too.


REGIONS

A word may be spelled differently in various regions. For example, English
comes in (at least) these variants:

        en              all regions
        en_us           US
        en_gb           Great Britain
        en_ca           Canada

Words that are not used in one region but are used in another region are
highlighted with SpellLocal.

Always use lowercase letters for the language and region names.

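For example, taking the names from the table above (which regions are really
available depends on the spell file that is installed): >

        :setlocal spelllang=en_gb
        :setlocal spelllang=en

The first command checks for British English, so that words only used in
other regions are highlighted with SpellLocal. The second one accepts the
words of all regions.

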
SPELL FILES

Vim searches for spell files in the "spell" subdirectory of the directories in
'runtimepath'. The name is: LL-XXX.EEE.spl, where:
        LL      the language name
        -XXX    optional addition
        EEE     the value of 'encoding'

Exceptions:
- Vim uses "latin1" when 'encoding' is "iso-8859-15". The euro sign doesn't
  matter for spelling.
- When no spell file for 'encoding' is found, "ascii" is tried. This only
  works for languages where nearly all words are ASCII, such as English. It
  helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  is being edited.

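As an illustration of the naming scheme (the file names follow from the
pattern above, they are not a list of files that come with Vim): with
'encoding' set to "latin1" and 'spelllang' set to "en" Vim would look for
"spell/en.latin1.spl", falling back to "spell/en.ascii.spl". To see which
spell files can be found in 'runtimepath', something like this should work: >

        :echo globpath(&runtimepath, "spell/*.spl")
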
Spelling for EBCDIC is currently not supported.

A spell file might not be available in the current 'encoding'. See
|spell-mkspell| about how to create a spell file. Converting a spell file
with "iconv" will NOT work!

                                                        *E758* *E759*
When loading a spell file Vim checks that it is properly formatted. If you
get an error the file may be truncated, modified or intended for another Vim
version.


WORDS

Vim uses a fixed method to recognize a word. This is independent of
'iskeyword', so that it also works in help files and for languages that
include characters like '-' in 'iskeyword'. The word characters do depend on
'encoding'.

A word that starts with a digit is always ignored.


SYNTAX HIGHLIGHTING

Files that use syntax highlighting can specify where spell checking should be
done:

        everywhere                         default
        in specific items                  use "contains=@Spell"
        everywhere but specific items      use "contains=@NoSpell"

Note that mixing @Spell and @NoSpell doesn't make sense.

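A sketch of what this could look like in a syntax file. The group name
"myComment" and the patterns are made up for the example: >

        :syntax region myComment start="/\*" end="\*/" contains=@Spell

With this, only text inside such comments would be checked, assuming no other
item refers to @Spell.
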
==============================================================================
2. Generating a spell file                              *spell-mkspell*

Vim uses a binary file format for spelling. This greatly speeds up loading
the word list and keeps it small.

You can create a Vim spell file from the .aff and .dic files that Myspell
uses. Myspell is used by OpenOffice.org and Mozilla. You should be able to
find them here:
        http://lingucomponent.openoffice.org/spell_dic.html

:mksp[ell] [-ascii] {outname} {inname} ...              *:mksp* *:mkspell*
                        Generate spell file {outname}.spl from Myspell files
                        {inname}.aff and {inname}.dic.
                        When the [-ascii] argument is present, words with
                        non-ascii characters are skipped. The resulting file
                        ends in "ascii.spl". Otherwise the resulting file
                        ends in "ENC.spl", where ENC is the value of
                        'encoding'.
                        Multiple {inname} arguments can be given to combine
                        regions into one Vim spell file. Example: >
        :mkspell ~/.vim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
<                       This combines the English word lists for US, CA and AU
                        into one en.spl file.
                        Up to eight regions can be combined. *E754* *E755*

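A sketch of the [-ascii] variant, using the same made-up input path as in the
example above: >

        :mkspell -ascii ~/.vim/spell/en /tmp/en_US

Going by the description above, the name of the resulting file should end in
"ascii.spl", so that it can be used when no spell file for the current
'encoding' is available.
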
Since you might want to change the word list for use with Vim the following
procedure is recommended:

1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
   words, etc.
4. Use |:mkspell| to generate the Vim spell file and try it out.

When the Myspell files are updated you can merge the differences:
5. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
6. Use Vimdiff to see what changed: >
        vimdiff xx_YY.orig.dic xx_YY.new.dic
7. Take over the changes you like in xx_YY.dic.
   You may also need to change xx_YY.aff.
8. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to xx_YY.orig.aff.

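On a Unix-like system steps 2, 4 and 8 could be carried out from within Vim
roughly like this. The "cp" and "mv" programs and the output path are
assumptions, and "xx_YY" stands for the actual language and region: >

        " step 2: keep a copy of the original Myspell files
        :!cp xx_YY.aff xx_YY.orig.aff
        :!cp xx_YY.dic xx_YY.orig.dic
        " step 4: generate the Vim spell file from the edited files
        :mkspell ~/.vim/spell/xx xx_YY
        " step 8: the new Myspell files become the reference for the next merge
        :!mv xx_YY.new.dic xx_YY.orig.dic
        :!mv xx_YY.new.aff xx_YY.orig.aff
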
==============================================================================
9. Spell file format                                    *spell-file-format*

This is the format of the files that are used by the person who creates and
maintains a word list.

Note that we avoid the word "dictionary" here. That is because the goal of
spell checking differs from writing a dictionary (as in the book). For
spelling we need a list of words that are OK, thus need not be highlighted.
Names will not appear in a dictionary, but do appear in a word list. And
some old words are rarely used and are common misspellings. These do appear
in a dictionary but not in a word list.

There are two files: the basic word list and an affix file. The affixes are
used to modify the basic words to get the full word list. This significantly
reduces the number of words, especially for a language like Polish. This is
called affix compression.

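As a sketch of how affix compression works, following the Myspell affix
format (the flag "S" and the words are made up for the example), an affix file
could define a suffix "s": >

        SFX S Y 1
        SFX S 0 s .

and the basic word list could then use it: >

        2
        cat/S
        dog

Here "cat/S" stands for both "cat" and "cats", while "dog" only stands for
itself.
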
The format for the affix and word list files is mostly identical to what
Myspell uses (the spell checker of Mozilla and OpenOffice.org). A description
can be found here:
        http://lingucomponent.openoffice.org/affix.readme ~
Note that affixes are case sensitive; this isn't obvious from the description.
Vim supports a few extras. Hopefully Myspell will support these too some day.
See |spell-affix-vim|.

The basic word list and the affix file are combined and turned into a binary
spell file. All the preprocessing has been done, thus this file loads fast.
The binary spell file format is described in the source code (src/spell.c).
But only developers need to know about it.

The preprocessing also allows us to take the Myspell language files and modify
them before the Vim word list is made. The tools for this can be found in the
"src/spell" directory.


WORD LIST FORMAT                                *spell-wordlist-format*

A very short example, with line numbers:

        1       1234
        2       aan
        3       Als
        4       Etten-Leur
        5       et al.
        6       's-Gravenhage
        7       's-Gravenhaags
        8       bedel/P
        9       kado/1
        10      cadeau/2

The first line contains the number of words. Vim ignores it.    *E760*

What follows is one word per line. There should be no white space after the
word.

When the word only has lower-case letters it will also match with the word
starting with an upper-case letter.

When the word includes an upper-case letter, this means the upper-case letter
is required at this position. The same word with a lower-case letter at this
position will not match. When some of the other letters are upper-case it
will not match either.

The same word with all upper-case characters will always be OK.

        word list       matches         does not match ~
        als             als Als ALS     ALs AlS aLs aLS
        Als             Als ALS         als ALs AlS aLs aLS
        ALS             ALS             als Als ALs AlS aLs aLS
        AlS             AlS ALS         als Als ALs aLs aLS

Note that lines 5 to 7 use non-word characters. You can include
any character in a word. When checking the text a word still only matches
when it appears with a non-word character before and after it. For Myspell a
word starting with a non-word character probably won't work.

After the word there is an optional slash and flags. Most of these flags are
letters that indicate the affixes that can be used with this word.

                                                        *spell-affix-vim*
A flag that Vim adds and is not in Myspell is the "=" flag. This has the
meaning that case matters. This can be used if the word does not have the
first letter in upper case at the start of a sentence. Example:

        word list       matches         does not match ~
        's morgens/=    's morgens      'S morgens 's Morgens
        's Morgens      's Morgens      'S morgens 's morgens

                                                        *spell-affix-mbyte*
The basic word list is normally in an 8-bit encoding, which is mentioned in
the affix file. The affix file must always be in the same encoding as the
word list. This is compatible with Myspell. For Vim the encoding may also be
something else, any encoding that "iconv" supports. The "SET" line must
specify the name of the encoding. When using a multi-byte encoding it's
possible to use a larger number of different affixes.

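For example, an affix file for a word list in UTF-8 would presumably start
with a line like this (the exact spelling of the encoding name that is
accepted is an assumption here): >

        SET UTF-8
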
Performance hint: Although using affixes reduces the number of words, it
reduces the speed. It's a good idea to put all the frequently used words in
the word list with the affixes prepended/appended.


 vim:tw=78:sw=4:ts=8:ft=help:norl: