Wiktionary talk:Votes/2019-03/Excluding typos and scannos

Opposition
I think the policy of excluding relatively rare misspellings and including common misspellings works reasonably well as is, and does not need to be changed by introduction of a distinction between a misspelling and a typo. It is reasonably easy to administer using Google Ngram Viewer and frequency ratios. The rationale for including common misspellings applies to them whether they are typos or not: someone is all too likely to look them up and the best user experience is provided by soft-redirecting the reader to the usual spelling rather than letting them try to figure that spelling out for themselves. The soft redirect still indicates the form to be a misspelling. The distinction of a misspelling from a typo is much more speculative than a quantitative frequency criterion, and that criterion does all we need, from what I can tell. --Dan Polansky (talk) 19:18, 29 March 2019 (UTC)


 * "People are likely to look it up" can be taken too far. Every day we get dozens of searches for things like "hot sex video" in Arabic. Re "user experience", I think that having actual dictionary entries for things that are basically mistakes and rubbish does serious harm to the user experience (and to our credibility). Equinox ◑ 19:35, 29 March 2019 (UTC)
 * I don't see how concieve stating "Misspelling of conceive" is doing any harm to our credibility or user experience. Right, that entry is not attacked since it is not considered to be a "typo", merely a "misspelling", but the underlying point is the same: it is a plentifully attested form that is unlikely to be considered to be correct by the language users given its poor frequency ratio. A mistake is a mistake, whether a misspelling or a typo, to use the ontology of the proposal. I don't see any argument supporting differentiating the two. --Dan Polansky (talk) 19:42, 29 March 2019 (UTC)
 * And I don't see how floopy stating "Misspelling of floppy" is doing any harm to our credibility or user experience either, where floopy is under attack as a typo. --Dan Polansky (talk) 19:44, 29 March 2019 (UTC)
 * While I agree that the adding of misspellings, typos, and corruptelae should not be taken too far, the proposed change does not seem to help in making it clear what should or should not be added. If it passed, people would switch to claiming that “one does not know of it being accidental”, only adding ground for contention. Taking a current RFDO as an example, SemperBlotto, who has created bisprectal, would perhaps argue this for bisprectal; though for pesudovirus it apparently becomes clearer that the inclusion of such an entry is indefensible. While this vote is intended to make it easier to delete spellings of the “I know it when I see it” kind, it therefore rather achieves the opposite, or at best expresses this desire in gawky fashion.
 * Instead of filling the CFI with minutiae, I would rather strike the whole section for a simple positive test: “A misspelling is only to be included if it is likely to be made and Wiktionary’s competitiveness insinuates its inclusion.” This would catch everything, even manuscript corruptions which have spread somehow. And even by reason of the contrast the respective gloss shows with the deliberate spelling: for the antithesis (also an emanation of the competition that rules mankind). For Metaknowledge says “teh is a prime example of something that is common enough, but that nobody would ever look up”, but I think “who says B must also say A”, and the gloss there is only a way to say, or a breviloquence for, “though the spelling can be deliberate, it most likely is a misspelling”. So:


 * Misspellings, common misspellings and variant spellings : Rare misspellings should be excluded while common misspellings should be included. There is no simple hard and fast rule, particularly in English, for determining whether a particular spelling is “correct”. Published grammars and style guides can be useful in that regard, as can statistics concerning the prevalence of various forms.


 * Most simple typos are much rarer than the most frequent spellings. Some words, however, are frequently misspelled. For example, occurred is often spelled with only one c or only one r, but only occurred is considered correct.


 * It is important to remember that most languages, including English, do not have an academy to establish rules of usage, and thus may be prone to uncertain spellings. This problem is less frequent, though not unknown, in languages such as Spanish where spelling may have legal support in some countries.


 * Regional or historical variations are not misspellings. For example, there are well-known differences between British and American spelling. A spelling considered incorrect in one region may not occur at all in another, and may even dominate in yet another.


 * A misspelling is only to be included if it is likely to be made and Wiktionary’s competitiveness insinuates its inclusion.


 * Combining characters (like ́|this) should exist as main-namespace redirects to their non-combining forms (like ´|this) if the latter exist. Fay Freak (talk) 13:47, 8 April 2019 (UTC)

Ontology of misspellings and typos
My understanding is that each typo in a word is a misspelling. M-W's definition of "typo" is this: "an error (as of spelling) in typed or typeset material".

Here's OneLook for ease of reference:

--Dan Polansky (talk) 13:54, 30 March 2019 (UTC)
 * My understanding of the difference between typos and misspellings is that a typo occurs because the writer knows the spelling considered to be correct, but fails to record it correctly (due to a slip of the finger, a software bug, etc.). The term "typo" usually refers only to mistakes of this kind which occur through keyboard entry; a slip of the pen could be considered to be in a grey area between a misspelling and a typo. A misspelling occurs when a writer doesn't know the spelling considered to be correct and therefore substitutes an incorrect one (in some contexts, a conscious avoidance of the usual spelling may be considered a misspelling as well, especially if it isn't known that the usual spelling was consciously avoided). However, when the distinction between the two terms isn't important, they may be used interchangeably.
 * In any case ( should see this too), the proposed changes should probably incorporate a definition of the difference between a typo and a misspelling. --Hazarasp (talk · contributions) 09:19, 1 April 2019 (UTC)

"Common" misspellings
Maybe this information is in an obvious place and I just don't see it, but what are our criteria for common misspellings (as opposed to just any attested one)? Ultimateria (talk) 00:24, 11 April 2019 (UTC)
 * There are no formally agreed-on criteria. I use a frequency ratio and did a calibration at User talk:Dan Polansky/2013. An example: suggests a frequency ratio of about 1:1000 for concieve, which by my calibration makes concieve a common enough misspelling. --Dan Polansky (talk) 17:20, 13 April 2019 (UTC)
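
The frequency-ratio check described in the reply above could be sketched roughly as follows. This is only an illustration, not the calibration actually used: the function names, the counts, and the cutoff value are all hypothetical. The discussion says only that a ratio of about 1:1000 counted as common enough for concieve, and does not state where the cutoff lies.

```python
from fractions import Fraction


def frequency_ratio(misspelling_count: int, standard_count: int) -> Fraction:
    """Return how often the misspelling occurs per occurrence of the standard spelling."""
    if misspelling_count < 0:
        raise ValueError("misspelling_count must not be negative")
    if standard_count <= 0:
        raise ValueError("standard_count must be positive")
    return Fraction(misspelling_count, standard_count)


def is_common_misspelling(
    misspelling_count: int, standard_count: int, cutoff: Fraction
) -> bool:
    """Treat the misspelling as common when its ratio reaches the chosen cutoff.

    The cutoff is deliberately a required argument: no cutoff has been
    formally agreed on, so any value passed in is the caller's assumption.
    """
    return frequency_ratio(misspelling_count, standard_count) >= cutoff


# Hypothetical counts chosen only to reproduce a ratio of 1:1000.
ASSUMED_CUTOFF = Fraction(1, 1000)  # illustrative value, not an agreed criterion
print(is_common_misspelling(50, 50_000, ASSUMED_CUTOFF))    # ratio 1:1000
print(is_common_misspelling(5, 500_000, ASSUMED_CUTOFF))    # ratio 1:100000
```

Exact fractions are used rather than floats so that a ratio sitting exactly on the cutoff is not misclassified by rounding. The counts themselves would have to come from a corpus source such as Google Ngram Viewer, which this sketch does not query.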