Wiktionary:Misspellings

Wiktionary accepts common misspellings. These entries are intended to help users who search for them: rather than being met with a red link, the user is directed to the correct spelling.

Another use of misspelling entries is by editors mining corpora for forms not yet covered by Wiktionary: they may appreciate having a common database of forms known to be misspellings, even if relatively rare ones. The use case is rather different from the first one.

Common
Almost any word can be misspelled in dozens of ways. For this reason, we only accept common misspellings. There are currently no criteria to establish what a "common" misspelling is. Misspellings have to meet Criteria for inclusion just as much as any other entry, so some entries may be required to pass Requests for verification.

Evidence to support commonness could include uses in reliable third-party published materials, such as books, magazines, leaflets and newspapers. Per our criteria, anything that is in "widespread use" should be included.

Frequency ratio test

 * This paragraph is not known to be universally accepted.

One test of what is a "misspelling" and what is a "common misspelling" is the frequency ratio test, which considers how common the misspelling is relative to the correct spelling. The ratio is the frequency of the correct spelling divided by the frequency of the misspelling, so a higher ratio means a rarer misspelling. Compared to less common alternative spellings, misspellings tend to have poor (high) frequency ratios, and rare misspellings even worse ones. For instance, concieve has a frequency ratio of about 2500, still fine for a "common" misspelling, while concive has a ratio of over 47 000, which would make it a "rare" misspelling. However, this test is not a policy and is not universally accepted by Wiktionary editors. It works less well for hyphenated forms since they are all too often scanned as solid forms.
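The ratio described above can be sketched in a few lines of Python. This is only an illustration, not an agreed procedure: the function names `frequency_ratio` and `classify` and the cutoff parameter `max_common_ratio` are hypothetical, and the text above sets no official threshold.

```python
def frequency_ratio(correct_count: float, misspelling_count: float) -> float:
    """Return how many times more frequent the correct spelling is."""
    if misspelling_count <= 0:
        raise ValueError("the misspelling must be attested at least once")
    return correct_count / misspelling_count


def classify(ratio: float, max_common_ratio: float) -> str:
    """Label a misspelling by comparing its ratio with a chosen cutoff.

    max_common_ratio is a hypothetical, editor-chosen cutoff; no policy fixes it.
    """
    return "common" if ratio <= max_common_ratio else "rare"
```

With a cutoff picked somewhere between the two ratios quoted above, say 10 000 (an assumption made purely for illustration), `classify(2500, max_common_ratio=10_000)` gives "common" and `classify(47_000, max_common_ratio=10_000)` gives "rare".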

Another test of a common misspelling is how common it is relative to other misspellings. For instance, if concieve is accepted as common, we may note that enthousiastic is not hugely rarer and that authoritive is about as common as concieve. By contrast, acclamate is on a different order of magnitude of frequency, so it is less protected by the comparison with concieve. To use this test, one would have to pick a benchmark; concieve is a candidate, but it may be too common and thus too high a bar to compare against.
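The benchmark comparison can be expressed as an order-of-magnitude check. The following Python sketch is only one possible reading of the test: the function names, the `max_gap` parameter and its default of one order of magnitude are assumptions, not anything agreed by editors, and the frequencies passed in would have to come from whatever corpus is being consulted.

```python
import math


def magnitude_gap(benchmark_freq: float, candidate_freq: float) -> float:
    """Return how many orders of magnitude rarer the candidate is than the benchmark."""
    if benchmark_freq <= 0 or candidate_freq <= 0:
        raise ValueError("frequencies must be positive")
    return math.log10(benchmark_freq / candidate_freq)


def comparable_to_benchmark(
    benchmark_freq: float, candidate_freq: float, max_gap: float = 1.0
) -> bool:
    """True if the candidate is less than max_gap orders of magnitude rarer.

    max_gap is a hypothetical tolerance; a candidate more common than the
    benchmark yields a negative gap and therefore always passes.
    """
    return magnitude_gap(benchmark_freq, candidate_freq) < max_gap
```

A candidate three times rarer than the benchmark has a gap of about 0.48 and passes with the default tolerance, while one a hundred times rarer has a gap of about 2 and does not.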

One idea behind the use of frequency ratios in Google Books is that it reveals copyeditors voting, as it were, for what is incorrect by removing it during the editing process. What slips through their fingers is going to be rare.

The test is consistent with WT:CFI: "There is no simple hard and fast rule, particularly in English, for determining which category (correct spellings, misspellings, variant spellings) a specific spelling belongs to. Published dictionaries, grammars, style guides and statistics can be useful guides in this regard but they are not necessarily binding." Note "statistics". A previous version had more explicit language: "statistics concerning the prevalence of various forms".

Absence from Google Ngram Viewer

 * This paragraph is not known to be universally accepted.

If an English variant spelling is not found in Google Ngram Viewer while the "correct" spelling is found, it is hard to claim it is a common misspelling, at least in absolute terms. Such an entry may still serve the purpose of tracking the misspelling for corpus miners.

Copyedited corpus test

 * This paragraph is not known to be universally accepted.

If a putative misspelling is not attested in a copyedited corpus such as Google Books and is attested only in Usenet or the like, that can support classifying it as a misspelling.

Style guides
WT:CFI mentions style guides as one cue for classifying spellings as misspellings or variant spellings. However, RFD discussions mentioning style guides are hard to find. Moreover, it is unclear how this would work in practice: for instance, the GPO style manual favors micro-organism, but that does not make microorganism, spelled solid, a misspelling.

Precedent

 * Deleted rare misspellings include hisown, himand, dolemite, motted, enthousiastic, informacíon, trolly dolly, blackhoe, stylishy, râter, animalike, suthern, increidbly, aqcuire, and antiRoman.
 * Talk:unEnglish resulted in keeping.
 * Further precedents can be found with the search Special:Search/incategory:"RFD_result_(failed)" misspelling.

Typos
Typos are excluded regardless of frequency per WT:CFI, e.g. amgydala. As an aside, this typo is on the same order of magnitude of frequency as concieve after 2000.

Obsolete spellings
Obsolete spellings such as musick are not marked as misspellings. They are misspellings from today's point of view, but they were standard spellings at the time, and their status as misspellings today follows from their being obsolete.

Anomalous spellings
Anomalous spellings, those failing a pattern, are not misspellings.

In English, prefixing capitalized words nearly always retains the capital letter and adds a hyphen. But:
 * antichristian is unusual and much less common than anti-Christian but is still fairly common. It was the dominant spelling in the 19th century.
 * unchristian is unusual yet more common than un-Christian. It was many times more common in the 19th century; it was un-Christian that was rare.
 * unchristianlike is unusual yet more common than un-Christian-like. Together with unchristian-like, it was many times more common in the 19th century and un-Christian-like was rare.
 * transatlantic is unusual yet more common than trans-Atlantic.
 * transpacific is unusual yet as common as trans-Pacific.

English hardly ever uses diacritics such as the diaeresis or the acute accent in spelling. However, the following spellings are common and accepted (see also English terms with diacritical marks):
 * ï: naïve, naïveté
 * é: née, résumé, protégé, émigré
 * ç: Provençal, Curaçao

Urgency of deleting common misspellings
As long as misspellings are marked as such, the reader will not be misled, and there is no urgency. However, going out of one's way to create entries for rare misspellings seems inadvisable: it is useless for readers and creates more cleanup work for others.

Formatting
Misspellings should appear under a part of speech heading like Noun, Adjective or Verb but should not appear in those categories. Misspellings can be included in entries that already have other meanings, or other languages. These entries should appear in the relevant categories. The template is designed to do all the formatting necessary for misspellings.

Example (stationery)
==English==

Noun

 * 1) writing materials

Adjective


So this entry appears in Category:English nouns but not Category:English adjectives, as the template is not used for misspellings.

Alternatives
Wiktionary documents usage; therefore, spellings that are commonly judged to be misspellings are included as such. There are alternatives, however:


 * 1)  for entries that are deliberate misspellings, such as  for.
 * 2)  and  for spellings that are no longer used, but were not considered incorrect at a certain time.
 * 3)  for spellings that are less common, but not considered incorrect.
 * 1)  for spellings that are less common, but not considered incorrect.