Wiktionary talk:Criteria for inclusion/Archive 1

From Talk:Youve
Do we really want articles on spelling mistakes? This exists under the correct spelling, you've, so I say it's time to delete. — Hippietrail 01:56, 27 Jun 2004 (UTC)


 * It's not always a spelling mistake; sometimes it's intentional. Oscar Wilde apparently insisted on spelling it this way... in school we studied The Importance of Being Earnest written this way, complete with the altogether interesting form havnt—I am almost sure there was a note by him before the text explaining this very usage, but I can't find it online and my copy of the book is not in this town.
 * Anyway, I think they should stay, but marked as "nonstandard"—they may be POVially regarded as mistakes, but they have been let through by editors into published works, so I think these words (youve and youre) satisfy Criteria for inclusion.  —Muke Tever 02:37, 27 Jun 2004 (UTC)


 * I've deleted it, along with "youre", "youll" and "youd". We can't be listing every misspelling or we won't be able to see the wood for the trees.  Restore them if you wish, with discussion and disclaimers along the lines of what you have said.  — Paul G 17:32, 23 Jul 2004 (UTC)


 * PS - it's interesting that these were in but the correct spellings were not.

Invented language inclusion
Clearly all natural languages belong here, but I wonder about invented languages. The "criteria for inclusion" words seem to be focused on the idea of people submitting bogus words... Some invented languages are common, mature, and well-recognized enough to belong here, like Esperanto or Interlingua or Klingon, but a line should probably be drawn somewhere: to take an extreme case, a hoaxlang like E, aka E-Prime clearly doesn't belong. What about Tolkien's Nevbosh (famous conlanger, minor language), or Toki Pona (small community, has a Wikipedia), Brithenig (well-known conlang, spawned many imitators), or Atlantic? (That last one's mine. I don't intend to add it.)

Possibly the language as a whole should conform to one of the criteria here, although the last one ("three independently recorded instances") would be way too lax. Limit to conlangs appearing in published works? or what? Would I be justified in adding Nalian words, from The Edifice, for example?

Would the line differ between artlangs and auxlangs? —Muke Tever 04:34, 29 May 2004 (UTC)
 * well, we could just apply the criteria for inclusion to the un-natural languages like anything else (two published sources within a year, not counting dictionaries and similar). That would cut out a lot of the languages. But I'm not really sure. --Eean 07:28, 11 Dec 2004 (UTC)

Attributive sense?
Can someone clarify this?
 * "Proper names may be included if... the name is used in an attributive sense."

I assume this means something like Lou Gehrig's disease (even though that should really redirect to amyotrophic lateral sclerosis), but in that case the entry isn't just a name. Under what conditions is an entry that's just a name (e.g., Homer) acceptable? - dcljr 05:54, 12 Apr 2005 (UTC)
 * The name Shakespeare appears in Wiktionary all over the place as the author of a quotation. Those quotations are attributed to Shakespeare.  Lou Gehrig's disease is probably OK, (as a proper noun of a specific disorder) while Lou Gehrig and Gehrig would probably get  'ed, and articles that link to them would be redirected to Wikipedia.  --Connel MacKenzie 07:30, 12 Apr 2005 (UTC)
 * That's not what's meant here. A noun in general is used attributively if it's used to modify another noun.  For example, coffee is attributive in coffee cup.  I believe the idea here is that a proper noun appearing attributively implies that it's so well known that it's become part of the general lexicon.  For example, if someone says That was a David Beckham hairstyle. we know that ... well, actually we don't know precisely what that might be, but we know it's something eye-catching.  This is really just a guideline.  I can just as well say That was a Boog Highberger hairstyle, but that doesn't necessarily mean that Boog merits his own entry (but he might :-). -dmh 18:34, 12 Apr 2005 (UTC)
 * That's an issue as well, but I think the idea intended is yet another interpretation of "attributive", namely things like using "Einstein" to refer to a smart person (or with irony, a stupid one) in general. (Or maybe this is just a subclass of dmh's example, except he means proper names used as modifiers, while I mean them used as substantives . . . )
 * However I would ALSO submit (er, if I havnt already) that proper names be addable if they are subject to translation. For example the Greco-Roman hero Aeneas is the same in English and Latin, but in Spanish it's Eneas and in French, Énée. —Muke Tever 15:49, 13 Apr 2005 (UTC)
 * Sounds good to me (on both counts). I'm not sure why proper names should be treated any differently from the rest of the lexicon, anyway. -dmh 04:37, 14 Apr 2005 (UTC)


 * They aren't. Take the infamous refresher course in Main Page (☺) and you'll note that we have two appendices, one for given names (Wiktionary Appendix:First names) and one for family names (Wiktionary Appendix:Surnames).  The difference between us and the encyclopaedia isn't that we don't take names.  It is that our article on a name isn't about a person with that name, but is about the name itself.  Our article on Beckham won't tell us about footballers, as Beckham would, and our article on David won't tell us about statues, as David would; but our articles will tell us the etymologies of the names, their pronunciations, their translations, and their actual meanings (if they have any).  The "attributive sense" description is, as a matter of fact, too narrow for what we actually do, and it is a point that I've long thought of bringing up.  We aren't a genealogy database any more than Wikipedia is, but we are about words, and names are words. Uncle G 02:13, 21 Apr 2005 (UTC)
 * Oddly enough, I was already aware of the lists of names. It seems to me like one of many examples of Wiktionary, given its nature, approaching a given topic from multiple angles.  I wouldn't really infer much about inclusion criteria from the existence of an index, except that people would like to include the terms listed in the index.  Note that given names and surnames are not inherently proper nouns, though they may be so used (e.g., Thatcher and Madonna — and there's a pair to draw to).  Noting that David is derived from Hebrew דוד and that Beckham means (I'm guessing here) "village by the brook" is interesting and useful.  So is noting that David Beckham is the name of an internationally famous footballer, with a link to Wikipedia for more detail.
 * Personally, I'm not convinced that names (in either sense) need to be subject to any special criteria at all. -dmh 16:31, 21 Apr 2005 (UTC)


 * Re-reading, I think I both mistook Uncle G's meaning and responded unclearly. We're probably in closer agreement than it may seem.  I completely agree that the entries for David and Beckham should just talk about those words per se and link to Wikipedia for further info.  In particular, our entries should not contain lists of famous Davids and such.  This is in keeping with our treatments of words in general.  We include them if they're properly attested and limit the definition to the word per se.  E.g., there isn't and shouldn't be a list of famous oak trees under oak.


 * As for proper names like David Beckham, we should include them under the same criteria as other words: They need to be sufficiently attested and the meaning should be non-obvious.  The implications of these basic general principles are a bit different for proper names.  I would argue that endless examples of usages like "David Beckham scored a brilliant goal in the 90th minute." aren't sufficient reason for a Wiktionary entry (while they are reason enough for a Wikipedia entry).  The key question is whether the name is being used in a sense other than the obvious one of "the person named ...".  A good example would be the early 20th-century usage of "Mae West" for a life jacket.


 * A borderline case would be the convention found in sports writing (and elsewhere) of using a person's name to refer to that person's well-known attributes. For example, "This team needs a Michael Jordan, not a Shaquille O'Neal," meaning (roughly) "This team needs someone with MJ's skills and abilities, not Shaq's."  On the one hand, one could use this construction with absolutely anyone, but on the other hand, it only tends to be used with well-known names.  On balance, I'd tend not to count such usages.


 * I would, however, count any reference to David Beckham as a usage of both David and Beckham, for purposes of attestation.


 * I can't quite articulate the general principles that account for everything I just said, but I think they're there, they're pretty clear cut, and they're not too far from the more or less de facto standard of "independent uses in running text with non-obvious meaning." -dmh 17:36, 22 Apr 2005 (UTC)


 * Actually, it may be necessary to differentiate names by person, at least in some cases. One reason is that translating names used to be much more common than it is now: many historical figures have names in many different languages, but modern figures generally don't undergo anything more drastic than transliteration, if even that.  For a concrete example, Homer the poet is Homère in French, but Homer the Simpson is still Homer.  (For kings and popes the tradition still seems to be to translate; see, e.g., the list at it:Papa Benedetto XVI).  —Muke Tever 02:26, 22 Apr 2005 (UTC)


 * The more I think about this, the more I think there's not really any need for rules like "used attributively". Further, I don't think that that particular rule is useful even if we did need such rules in general.
 * For example, "New York" is used in idioms like New York minute, New York pizza, New York bagel and so forth. Each of these is its own idiom.  I can know quite a bit about New York without knowing what makes a New York bagel a New York bagel.  The unit "New York" itself doesn't convey anything more than something like "associated with New York".  To make an analogy, linguists don't consider "cran" to be a proper English morpheme, even though "cranberry" is clearly a compound with "berry" (as it happens, cranberries are crane berries just as gooseberries are goose berries).  Similarly, the existence of "New York minute" doesn't argue for (or against) "New York" on its own.
 * What does argue for New York is that some English speaker unfamiliar with the United States might run across "New York" and want to know what it was the name of. Interestingly, this argues more strongly for Hoboken, Healdsburg and Hovenweep than it does for "New York" since we might convince ourselves (most likely incorrectly) that everyone knows what "New York" means.
 * Absolutely -- how many non-US people know why someone would sing about "New York, New York"? It is appropriate that those who are unsure can look up New York and find out. --Enginear 11:40, 24 August 2006 (UTC)
 * The interesting thing about names is that it is generally clear from context that they are just arbitrary names. If I say "This is my good friend Chris."  It's clear (even in speech) that Chris is a name, and one does not need to look in a dictionary to know what "Chris" means.
 * In short, there's at least an argument to be made that Chris need not be in Wiktionary, but Chris Noth probably should.
 * On the other hand, while one might not need to look up the meaning of Chris, one might very well want to look up its etymology, usual translation into other languages and (in the case of more unusual names) pronunciation. This is clearly dictionary material, and CFI as it stands essentially says as much.  This is also a good rationale for including the (relatively small) class of "phrasebook" entries that would not be included for other reasons.  So perhaps we should expand the Prime Directive a bit to encompass more than just meaning. -dmh 21:18, 6 October 2005 (UTC)


 * The actual state of the dictionary is at odds with the "attributive sense" entry for place names that is currently in the CFI. A search for "place names" http://en.wiktionary.org/wiki/Special:Search?search=place+names yields 1811 results.  A look into place names in Florida shows 9 entries, only two of which, Miami and Naples, have attributive value.  Similarly, Appendix:Place names in Ohio has 13 entries, only two of which, Lima and maybe Minerva (although that is also a name), have attributive value.  It seems like we cannot have it both ways.  Either 1) we allow non-attributive place names and change the CFI or 2) we delete the thousands of non-attributive place name entries that are currently there.  I would argue for 1).  Brholden 21:15, 29 June 2006 (UTC)


 * In my view, the most important reasons for listing place names and personal names are the often fascinating etymology and the way the pronunciation has changed over the years. To take three examples:
 * Rotherhithe [from OE hryther hyth cattle wharf] was a village on the Thames, now subsumed as a part of south-east London. Leading to the centre of Rotherhithe is Redriff Road.  Redriff is merely a phonetic spelling of Rotherhithe according to the pronunciation in the 17th c, when the road was built.
 * Also in the 17th c, Merton, [from OE mere tun farmstead by the pool] now a London borough, was sometimes called Marten. In the 20th c, an actor, Paul Martin, unable to register his name because it had already been used by another actor, decided on a stage name of Paul Merton because he had been brought up there, presumably unaware of its earlier spelling.
 * Igornay is a French village from where a large number of Huguenots emigrated to escape persecution by the Catholic church. The Huguenots were noted as weavers, and many Huguenots finding refuge abroad [they are said to be the original refugées] lived by that trade.  Sigornay was a surname indicating an origin in Igornay, but amongst those families who settled in New Holland [roughly now New York] it became spelt Sigourney and also became a Christian name.  One of F Scott Fitzgerald's teachers was (ironically) a Jesuit priest, Monsignor Sigourney Fay.  FSF appears to have named a female character in The Great Gatsby after him (Mrs Sigourney Howard).  A girl, Susan Alexandra Weaver, read the book, and liked the name so much that at the age of 14 she started using it herself.  Whether she was aware how appropriate it was to her surname, I do not know.
 * I believe we should include such names even where they are not used attributively, because they are interesting words and people like me want to look them up to find their etymology and what they originally meant. I would be happy with an etymology, a pronunciation, and a one line definition with a link to Wikipedia where appropriate. Attribution may be one thing that can make a name interesting, but it certainly isn't the only one, or even the most common. --Enginear 13:26, 24 August 2006 (UTC)


 * Here's an example. A recent SI online article on the Sox/Sox series reads "I can't say we're happy about the situation," said center fielder Johnny Damon, who seems to have been in this same sorry boat more times than Phil Connors ran into Ned Ryerson on the streets of Punxsutawney. "But we'll be all right. I think we have a good-enough team to win."
 * This doesn't support adding Johnny Damon, since he's clearly (from this sentence and the article as a whole) the Boston center fielder. But who are Phil Connors and Ned Ryerson, and where is Punxsutawney?  The article itself explains this obliquely in the next paragraph: "The Red Sox, of course, have been in these sad straits before, way too many times. More often, in fact, than you see lame references to Groundhog Day on the sports pages." But even then it's not crystal clear that Phil and Ned are characters in the film.  To know that there's a connection, you have to know that Punxsutawney is associated with the American minor holiday Groundhog Day.
 * What of this to include? Punxsutawney? Punxsutawney Phil?  Most likely.  Phil Connors and Ned Ryerson, probably not, at least not based on this.  If they're ever mentioned outside the context of the film, then yes, but they're not so mentioned here.  By contrast, Yoda and Clark Kent are definitely part of the lexicon, even if they are entirely fictional, as are Winston Churchill or Mickey Mantle.  Note that even though Estee Lauder may be part of the lexicon, Josephine Esther Mentzer isn't.  Both names should still be in Wikipedia, of course. -dmh 22:27, 6 October 2005 (UTC)

urbandictionary
Just outta curiosity, is there a page on here with some kinda reference to Urbandictionary? Cos that's got loads more protologisms and slang terms on, and could be a rival of some sort for Wiktionary. Something like a "how Wiktionary is different to Urbandictionary" page, with a welcome message for any urbandictionarians we could entice to contribute here instead of there. If it doesn't exist, I'll try to rustle up a semi-decent sort of welcome page on a subpage of my user page. --Wonderfool 09:10, 20 Apr 2005 (UTC)
 * This is contentious. Much of the content of Urbandictionary consists of invented vanity terms that sound amusing but have little currency. It is not the aim of Wiktionary to include such terms, as far as I am aware. — Paul G 09:25, 21 Apr 2005 (UTC)


 * As Paul points out, many of the terms on Urbandictionary fail the Criteria for inclusion because they're not attested. An appearance in Urbandictionary is sometimes a good clue, but it's not enough on its own.  Note that an appearance in Urbandictionary that just defines a term without using it isn't even a valid citation for purposes of the criteria for inclusion, not because it appears in Urbandictionary but because it's not used to convey meaning.


 * As to competition, I think Wiktionary and Urbandictionary are playing in different spaces. Wiktionary aims to be comprehensive, Urbandictionary aims to be up-to-the-minute (and sometimes even ahead :-) in the field of slang.  The two processes are not even that similar.  E.g., there is no voting procedure in Wiktionary. -dmh June 29, 2005 16:55 (UTC)


 * Half the point of Urbandictionary is its humorous content. --Cammoore 09:44, 15 August 2005 (UTC)

"running text"
This new notion of "running text" is about as well-defined as Neurocam is, at the moment. I predict that you'll have a hard time pinning it down, too. I think that it would be better to have criteria that actually say what it is desired to say directly, rather than use a new piece of jargon which then has to be defined. Perhaps this should be expressed in terms of context. Uncle G 03:47, 21 Apr 2005 (UTC)


 * The term may be new, but the notion can't possibly be. The distinction here is between text like:
 * I realized I needed to glork my transmission to get the car to run.
 * and non-uses like
 * No one knows what "glork" means.
 * ... glamor glass gleam glimmer glork glory gloss ...
 * A more subtle example is borrowings from other languages or even from English-based argots like the infamous leet. We choose to exclude leet as a whole (a choice I strongly support), but we do include a few leet-isms like w00t and pr0n exactly because they have been used in plain English text with no other leet-isms around.  Further, we include only the few spellings (out of the many possible) that actually turn up in English contexts.
 * Suggestions for better terminology are always welcome, but the distinction needs to be made in any case. -dmh 16:31, 21 Apr 2005 (UTC)


 * "Running text" is a new notion? Google doesnt seem to think so. I've used the same criterion for marking neologisms in the Latin wiktionary (which in practice doesn't mean protologisms, but mostly Latinizations of proper names.)  IME "running text" means exactly what dmh shows: use of the word in ... well, running text, as opposed to "the word glork"-type things (in addition, I would exclude words only appearing as foreign words in bilingual glossaries: if glorque only appeared in English-French glossaries written by English-speakers as the French for glork it wouldn't make it valid, unless it's actually French; for example, the whole of la Francophonie may be writing glorc...) —Muke Tever 02:44, 22 Apr 2005 (UTC)

I've recast the text here a bit. The phrase "properly formed and grammatical" is redundant with "ordinary" and liable to be interpreted narrowly, excluding perfectly good examples that happen to use a "non-standard" dialect or use some construction that a critic finds objectionable. I had added a pointed example to push at this, but I've removed it in the interest of decreasing the heat/light ratio.

The phrase "in a context that exemplifies its meaning" is unclear. There is a potential open issue here, namely whether it should be possible to discern at least a rough meaning from the context of the word. Since we're trying to establish that the term is used, I don't see how this is necessary. Suppose I find these three independent uses of a term:
 * I was very sad that I lost my fleargle.
 * This fleargle was unlike any I'd ever seen before.
 * I had just acquired a fleargle, a small rodent with shiny fur.
Each of these is valid evidence that the word fleargle is used and expected to be understood, but only one gives any real clue as to the meaning. The new text clearly includes all three, while the old text might have excluded the first two. On the other hand, the first two might be using the word in a different sense, in which case the attestation of the rodent sense is called into question. I would say in such cases that there should be an entry for the word, but the rodent sense should be noted as possibly not well-attested. We know something's going on, and should record that, but we don't know precisely what's going on (and should note that, too). Fortunately, such borderline cases don't seem to come up much. -dmh 04:11, 30 May 2005 (UTC)


 * I've completely removed the section, preferring instead to maintain a link to the ordinary meaning of the term. Although the term is well-known I was surprised to find only one other source that defined it.  I'll keep my eye open for others.


 * The removed material, "In the above, in running text means in ordinary sentences in which the meaning of the term must be known in order to understand the overall meaning." was serving to make the matter unclear. Dmh's "fleargle" rationalization is singularly unhelpful.  The citations, if they exist at all, are obviously evidence that the word is used, but the first two alone are not strong enough evidence to support inclusion in a dictionary.  In the absence of the third example, what does the word mean?  Without any other evidence I'm drawn to conclude that a "fleargle" (I can at least say it's a noun) is a nonce word invented by the author because he didn't have any other word to put in that context.  Are we really going to include every whimsical construction that comes along? Eclecticology 19:17, 31 May 2005 (UTC)


 * There is indeed some ambiguity as to whether we're supporting terms or senses. But if we required individual senses to be attested, we'd throw out even more entries than if we uniformly required terms to be attested as a whole.  Since all we're really trying to do is to build a prima facie case that a term is worth introducing into the Wiki process in the hopes of eventually developing a complete entry, three independent attestations, even if we're not sure what senses are in use, seems reasonable.
 * Naturally, we would like to exclude one-time flights of fancy, but those are generally easy to discern from context. Assuming one actually goes so far as to look at the surrounding context. -dmh 20:54, 6 October 2005 (UTC)

Inflections
I don't see where WT:ELE says that we should include entries for regular inflections. It does say that we should include spellings for "inflections if any, particularly if these are irregular, or prone to other uncertainties such as whether consonants should be doubled.". All this is saying is that words like target (verb form) should note the spelling of targeting and targeted (which is about 50 times as common as targetted), and that an irregular like goose should definitely note the spelling of geese.

This is all goodness. The text quoted above is a bit vague about spelling out completely regular forms. I'd prefer to see it sharpened, but I don't think there's a consensus in practice on whether to include them or not.

What the text does not say, as far as I can tell, is that there should be entries for regular forms. As a rule, I see no particular point in adding entries for walks, walked etc. that have only the obvious meaning (as opposed to walker, for instance, which should be defined). They might help the occasional user in finding a term, but IMHO we have better things to do with our collective time right now. I'd rather define related terms like widow's walk, perp walk or walk the talk than make sure every single regular inflection has its own entry redirecting back to the root.

Particularly useless is wikifying the spellings of regular inflections when they just redirect back to the root. -dmh 16:31, 21 Apr 2005 (UTC)


 * Hello! This is the first well-stated objection to the practice of wikifying links that I've noticed.  I strongly disagree with the notion that redirects are useless.  External links, internal wikified links and searches all benefit from large numbers of redirects.  (Disclaimer: this was not my idea, I'm just one of the more obvious people entering lots of redirects recently.)  As others have said (elsewhere) the practice of replacing content with a redirect should be avoided fiercely (except perhaps in the case of vandalism.)  In general, things should go the other way: replacing redirects with stub articles (when needed.)  The premise of Wiktionary is "all words..." after all.  Even without considering the benefits I mentioned before, that premise should be adequate justification for a whole lot more redirect entries than currently exist.


 * I would also like to note that I did ask for comments before blasting redirects all over the place. I heard no objections at the time (the comments I got were mostly ambivalent: others wouldn't bother entering them, but had no objection to their presence.)


 * I feel I should also point out the genesis of the redirection practice. This came about from discussions about including all senses of a word under a single headword article (like other dictionaries {shudder} do.)  (That was not my idea either.)  As that was shot down, adding redirects was suggested as a partial alternate approach.


 * As a side note though, I had the intent of practicing writing a 'bot for the task of entering either the redirects or the stub articles. I immediately found that percentage-wise, very few articles (about five months ago) had the other senses indicated (wikified or bolded.)  One beneficial side effect of the "redirect syndrome" is that many words now are getting these entered, for some future 'bot to deal with, perhaps.  Also, the format of the other senses is becoming standardized, particularly with the recent addition of the inflection templates.


 * --Connel MacKenzie 02:24, 20 May 2005 (UTC)


 * Let me be a bit clearer about my objections, which are actually fairly narrow:


 * I do object to a wiki link to something that just points back to where you linked from. It violates the "principle of least surprise" in that a link implies new information behind it.
 * I don't object to adding entries for regular inflections. I'm not going to spend any time doing it by hand, or writing a bot to do it for me.  I'd also prefer to see effort invested in making the search function smarter, but that's a different topic.
 * I don't have a strong opinion on what form an entry for a regular inflection should take, whether a redirect (essentially making the search function smarter one special case at a time) or as an entry detailing what the derivation is. If the latter, we should use a template both for consistency and to ensure that, e.g., singing is listed as both progressive and gerund.
 * I hope that clarifies the original comment. -dmh 04:26, 26 May 2005 (UTC)

'unverified'
Connel MacKenzie (in an edit comment) asks: "How is a /. comment page peer reviewed or subjected to editorial verification?"


 * This is a dictionary, not an encyclopedia. Our job is to describe language as people use it.  People do not require their words to be peer reviewed to engage in conversation.  "Editorial verification" can't mean anything other than the imposition of an editor's POV, which is against the spirit of Wikimedia projects.  We can't call any human's spelling, grammar, or usage right or wrong (which would be POV), though we can label it standard or nonstandard. —Muke Tever 23:54, 19 May 2005 (UTC)

This seems as good a place as any for you to question me on that comment, Muke. So I shall explain. The context of what I was doing pertains to a very specific change. My comment about editorial review is my comment, my view on what the old meaning used to imply. Oddly, you didn't quote your own verbose edit line comment. Perhaps that would help you to grok the context I was speaking in.

The old wording basically covered materials that one could check out from a local public library. I somewhat agree with the assessment that that is a decent place in the sand to draw the line.

While Wiki* sites strive to be completely NPOV (Wikipedia much more so than here, based on many past discussions that I've read here) there has to be some point at which one cuts the cruft away. Just as we don't allow submissions of random keyboard pounding, we also have some obligation to not go overboard with it. Everyone contributing here is an editor. :-) And everyone keeping an eye on recent changes is contributing to the overall editorial review process.  That is the very heart of Wiki, not the antithesis!

BTW, people do expect what they say to be editorially reviewed - if they wish to be understood. If they are talking merely to exercise their jaw muscles, then perhaps, they might not desire much review of their output. But that would not be "engaging in conversation." :-)

Describing what a word means, or is accepted to mean by many people is hard to determine. Using previous attempts at just that (i.e. the entire body of published material available in the world today) seems like a fine place to draw the line. The Internet taken as a whole, has proven itself unreliable on many occasions. Technicalities cause weird terms (e.g. grok) to appear. Gradually, such terms enter the "mainstream" and are accepted as "valid" neologisms. But that is a very slow process, and most of those transient terms fade away.

What all that means to me, is that we should not accept absurd made-up words in Wiktionary. That seems to be the consensus around here (both before I got here, and now.) I'm sorry that you do not agree. --Connel MacKenzie 01:25, 20 May 2005 (UTC)


 * I entirely agree. I threw out the old version of forno, for example.  And I haven't defended any word that I don't personally recognize as a word.  It just seems that my threshold is rather lower (or my netslang vocabulary rather more extensive) than yours.  I don't agree that the "library line" is a good one, but if necessary, note smiley, I shall enter into talks with Google to produce The Big Book of the Internet, Volumes 1–∞, coming soon to a library near you.  ;)


 * As for NPOV, it is a wiktionary policy to hold it, however it's just that for most words, it doesn't really come naturally to apply a POV.  There are a few words with controversial definitions out there (check the history of marriage, say); there are perhaps some imported etymologies from old sources that are a little free with terms like 'corruption of'; but that's about it.


 * As for words entering the "mainstream" and becoming valid neologisms, I don't agree with that, and I think for a very important reason: many, many words, even in meatspace never achieve mainstream status. Around the rise of the English language many words were imported from Latin; these inkhorn terms were opaque to people who were not well versed in the classical language.  The first English dictionaries were invented for words like these:  their main focus was on the hard words of English, the unusual ones that people were not likely to know, not the mainstream, well-known ones—which is a focus that came later with the broadening of the audience of dictionaries to include those who are learning English.  —Muke Tever 05:02, 20 May 2005 (UTC)


 * Leet/netslang has been horrifically beaten down here at Wiktionary, repeatedly. I am sorry if I implied that that is my threshold; it is not.  I have always had the impression that the Wiktionary goal was to have a respectable reference, not a free-for-all.  --Connel MacKenzie 06:35, 20 May 2005 (UTC)


 * Leet is not the same as netslang. Nobody has been, nor should have been, rejecting things such as IMHO, meatspace, LOL, interweb, BSOD, teh, fap, asshat... And it's entirely possible to have a respectable reference about things that aren't very respectable in themselves.  It's not a free-for-all, as we do have lines drawn.  —Muke Tever 14:25, 20 May 2005 (UTC)
 * Yes, I agree; thanks for the correction/clarification. Your last sentence summarizes the core idea I was trying to convey; one person's arbitrary "weakening" of the established guidelines would make it a free-for-all.  --Connel MacKenzie 14:44, 20 May 2005 (UTC)

Verifiability is certainly an important criterion. A lot of what we now have for definitions seems to be completely invented. We need to put more emphasis on citing sources as a means of developing our credibility. Leet and a lot of the other barbaric internet jargon that has been appearing should be more severely controlled. Perhaps it should all be kept on an Internet jargon page in a manner similar to what was done with protologisms. Eclecticology 08:52, 23 May 2005 (UTC)


 * Perhaps. I'm not sure I'd point to Appendix:List of protologisms as any kind of an exemplary model though.  It is a pickle.
 * It's a whole barrel of pickles. :-) -Ec
 * The and  templates (not sure of the spelling of the latter) might be a better approach/model.  The What links here feature can be used to group such words (even if a category is not added to the templates) and the load on sysops for the maintenance of the deletion log might be significantly reduced.
 * The problem with letting a lot of these things have their own articles is that it gives them more credibility than they probably deserve. That just encourages more of them.  Eclecticology 21:10, 25 May 2005 (UTC)


 * Hmm. I've never been one for the "Barbarians at the Gate" theory of netslang.  If a (presumably small) online community like alt.squirrelporn uses a bunch of specialized terms in its proceedings, we don't really need to include any of it, any more than we need to include specialized terms used informally within Bill Goddard's research group at Caltech.
 * My repeated experience with terms people like to object to is that they're all too often quite easy to track down and assign at least rough definitions. This would include a couple (teh and asshat) on the list above that "no one has or should have been objecting to", but which have in fact produced objections.  The problem is that they are often contributed in garbled and even ungrammatical form by anonymous parties who come and go in the night, and this lends a certain air of illegitimacy to what are in fact perfectly valid terms.
 * Editorial review seems like a non-issue to me. I try to approach Wiktionary from a linguistic, almost anthropological view, and I'm as much interested in colloquial speech as in the written word.  Colloquial speech, by definition, is not formally reviewed.  On the other hand, formal review by an editor is a good (but not perfect) indicator that a term is widely understood in the sense in which the author is using it.  It's certainly not hard to turn up various flavors of tripe in editorially reviewed text.  To me, the question of whether an editor liked a particular usage is less interesting and significant than whether people use a given term consistently in a given way.
 * For that matter, editorial tastes change, capriciously. The recent famous example is tidal wave vs. tsunami.  Up until the recent tragedy, either term was acceptable, and indeed we see the BBC using tidal wave to describe the Boxing Day event.  However, tidal wave is now fairly widely considered "incorrect" by editors, evidently for the completely spurious reason that tidal waves in the usual sense are not caused by the gravitational influence of the moon.  All this happened in a matter of weeks, giving the lie to the notion that linguistic change is necessarily a slow and gradual process.
 * While I'm on the topic of the speed of change, I'd like to take issue with the idea that adoption of a narrowly used term into the mainstream is a slow and gradual process. While a term can indeed linger for years in narrow usage, the transition from narrow to broad usage can be amazingly quick, thanks at least in part to the mass media.  Again, tsunami would be something of a case in point, though not the most dramatic.  I would expect that usage of the term increased by orders of magnitude in a matter of days as the story was picked up worldwide.  This is not to say that tsunami was too narrowly used beforehand to merit inclusion, only that the process of adoption can move very quickly and most likely this has little to do with how long a term has been in narrow use.
 * Meanwhile, back at Criteria for Inclusion, I don't have any problem relying solely on the internet for attestations, as long as it's clear that the term is used widely and consistently enough that a speaker might expect it to be understood by a complete stranger. This is what the independence criterion is all about.  By the way, I see that this criterion has been considerably expanded and improved since I last saw it.  Thanks! -dmh 05:08, 26 May 2005 (UTC)


 * Ah. Looking through the change log I now see what the original controversy was about.  I'm not sure "(un)verified" is a good distinction to make.  If someone posts a random comment on slashdot, it's quite verifiable that they did so, and if someone challenged me, I would consider a link to slashdot's archives sufficient verification.
 * But this is a red herring. What we're trying to verify is not that someone used the term, but that the term was in sufficiently wide use that someone could use it in a widely-read forum like slashdot and reasonably expect to be understood.  And even this is not quite enough.  I could make up a word like slashdotifiability and use it in a random post and expect to be understood, even if no one else had ever used the word.
 * Which brings us back to independence. We're really trying to establish something more like a reasonable certainty that separate communities of speakers (including purely written usage as "speech" here) use a term consistently and without knowledge of each other.  This is obviously a hard notion to pin down, which is why this page is so long, but as far as I can tell it's not particularly relevant whether usage was in a published work, in someone's living room, on the internet, or someplace else.  The internet is just easier to access online.
 * I'm reasonably happy with the current formulation, which discourages but does not outright prohibit relying on chat rooms, blogs and email. I'm not so sure I'd include blogs, though.  The main problem with chat rooms and email is that they're often limited to a closed community, and so are not strong indicators of independent usage.  Blogs tend to be intended for public consumption.  If someone uses a term on a blog without ever defining it, that strongly suggests that the term is in wider use.  OTOH, if a given term turns up, used in the same sense, in both a scrapbooking chatroom and an ice hockey chatroom and is clearly understood in both, that seems pretty indicative to me.
 * The one thing I do get worked up about is the notion that speech in a chatroom is somehow less deserving of study, per se. Granted, the subject matter of many such venues is less than edifying, but we're doing lexicography here, not literary criticism. -dmh 05:33, 26 May 2005 (UTC)


 * [Your score has gone up by ten points.] —Muke Tever 00:49, 27 May 2005 (UTC)

Wikifying terms we define specially here
I think I understand the idea behind wikifying attest and idiomatic in the "general guideline", but I don't think it helps. The current definition of "attest", in particular is fairly general (and maybe a bit musty), and while the 4th sense agrees with what we say here, it doesn't really add anything.

Given that we go to great pains to explain just what we mean in the article itself, linking to more general definitions doesn't seem particularly useful. Conversely, I don't see any need to try to pull the material on the page into the definitions, since it just amplifies the usual definitions (i.e., we're addressing "attested in what way?" and "how do you tell if a sense is idiomatic?"). -dmh 19:12, 27 May 2005 (UTC)


 * This is a good point. I do think that some of the definitions at attest are not accurate and that that article is in need of cleanup.  (For example, an attestation to a will is not done by the person whose will is under consideration, but by the person who witnesses his signature on the will.) I think that any definition of a word like that which we use on some other page may expand on the word, and on how it can be applied to a particular environment, but it must not contradict the normal usage of the word. Eclecticology 07:35, 28 May 2005 (UTC)

Protologisms
In a tasty bit of irony, the meaning of the term "protologism" seems to be mutating in real time.

The original definition, itself protologistic, was aimed at cases where a particular person perceives a gap in the lexicon and invents a word to fill it. However, a second sense is evolving, namely a narrow usage that particular parties are trying to convince others to use more widely.

Both senses are noted in the entry for protologism, and both are in active use within Wiktionary. Notably, Appendix:List of protologisms uses the first sense, while several entries on RFD use the second sense. To whatever extent they are trying to promote usage of this sense, the second sense is an example of itself, but I digress.

We should try to be clear what sense we use, or more preferably, use separate words for the two senses. Personally, I would prefer to see protologism restricted to the narrower first sense, where it is clear that one person has created a word and a definition together, the existing term neologism be used for terms which are clearly new and not yet in wide use, and perhaps "specialized term" for terms which are long-standing, but only within a narrow community.

It's a separate discussion under what circumstances we admit any of these. My personal opinion is that


 * Protologisms belong on the protologism page until such time as they can be shown to have made it into wider use.
 * Neologisms be admitted fairly liberally, and objectionable ones be marked plain rfd and not rfdProto.
 * Specialized terms be admitted unless there is very clear reason not to. For example, protologism only appears to be used within Wiktionary, but it's been used for quite a while now.

-dmh 04:32, 30 May 2005 (UTC)

All your instruction are belong to this page
Sorry to be critical, but wow... talk about instruction-creep! This article has grown by 600% since I doubled its size back in April. I think it's actually too long to be useful anymore. I can well imagine most newbies seeing the table of contents and just giving up... (I haven't even read the whole thing yet myself.) - dcljr 4 July 2005 08:46 (UTC)


 * Point taken, but perhaps it would help if we made it clearer that the page consists of two parts:
 * "As a general guideline, a term should be included if it's likely that someone would run across it and want to know what it meant."
 * A detailed gloss on that.
 * I think that's really all the page is, and I don't think the general guideline is particularly intimidating. To the extent that the rest is useful in sorting out particular cases (and maybe it isn't, based on some of the RFD traffic), I think it's worthwhile.  -dmh 7 July 2005 03:10 (UTC)

Spellings
Now that we've had the requisite edit skirmish (which I'll take the blame for starting), it's time to talk about how to handle variant spellings.

I have two objections to the current text:


 * The notion that a misspelling can be more common than a correct spelling.
 * The removal of the numeric rules of thumb, which were clearly marked as such and are based on actual observation (though not a detailed and rigorous study).

I'm also keen to avoid the usual prescriptivist quagmire of endless squabbles over whose notion of "correct" is correct. I chose dette/debt as a somewhat extreme example, but one that seems perfectly defensible in a world where prevalent spellings can be wrong and history and etymology are to be given weight over usage. Consider that:


 * The word was originally borrowed from French dette, which is correctly spelled by French rules, which rules are still generally followed to this day for French borrowings (e.g., laundrette, baguette etc.). One could argue that the French dropped the ball here by removing the Latin "b", but c'est la vie.
 * English has never pronounced a "b" in the word, and there is no general English rule for silent "b". This is in contrast with cases like initial "kn", and "wr", which reflect the Old English (and I think even Middle English) pronunciation.
 * The spelling change can be traced to a particular source and date, before which the natural spelling "dette" was accepted evidently without controversy.
 * This change was based on an arbitrary decision to bring the English spelling closer to the Latin root and not on any practical concern. By this reasoning, one might as well insist on spelling the word "debitum" while continuing to pronounce it as "dette".

In short, there is a reasonable case to be made that "dette" is the etymologically and historically correct spelling while "debt" is the aberration. The only real reason for choosing "debt" as the correct spelling is that it is the one overwhelmingly used in all but perhaps the earliest Modern English texts. This is good enough for me, but evidently this is a dangerously descriptivist attitude, liable to lead to "correct" spellings mistakenly being labeled "incorrect" and vice versa based solely on a hundredfold or so difference in prevalence.

This strikes me as yet another non-problem to be solved by adopting a nebulous and subjective notion of "correctness" over easily measurable empirical guidelines. -dmh 7 July 2005 03:58 (UTC)


 * I've moved your latest argumentative addition here to the talk page:
 * An interesting case is souped-up. The etymologically correct spelling is clearly suped-up, but souped-up is overwhelmingly common (about 10:1).  Common sense suggests that suped-up cannot be a misspelling.  On the other hand, calling souped-up a misspelling is wishful thinking.  This is very similar to the debate over the meanings of hacker, and the solution is analogous: list the two spellings as alternates, possibly with a note on the relative prevalences and a note that some will take offense to the spelling souped-up.
 * I don't see who's arguing with you about "souped-up". Your analysis about that is essentially correct.  The "hacker" debate had nothing to do with the spelling.  In any event the "policy" is about setting soft guidelines, not about arguing over questionable specifics.

''Did I say that the hacker debate had anything to do with spelling? The point is that in both cases, a vocal minority considers the great majority of usage to be "incorrect". Endorsing either view would be POV. Instead, we note the state of affairs.''


 * Measurable guidelines don't exist without data. What would be the source of your data for making such measurements?  I remain open to the possibility that a misspelling is more common than a correct form,  but I'm not expecting that it will be a fruitful criterion for adding things to Wiktionary.

''In cases where there are enough usages even to talk about prevalent spellings &mdash; and I'm thinking tens of thousands, at least &mdash; googling is enough for our purposes. Unlike the case for attestation and deriving meaning, we're just looking for orders of magnitude or at best factors of two or three. This doesn't capture regionality, which I leave as an easy exercise for the reader.''


 * There is no value to an obsession over dette or debt. Your obsolete form, "dette", is mentioned in the 1913 Webster with a reference to Chaucer.  "Laundrette" and "baguette" have no relevance because the "-ette" there is a diminutive suffix.  I can't see the point about the lack of an English rule.  It's all very simple; all silent letters are not pronounced.  This point is not particularly subtle, why doubt it?  Let's not get stuck in bdellium over it.  Eclecticology July 8, 2005 07:05 (UTC)

-

Here's what I (Aleph 1.0) say:

If it's in the most recent version of Merriam-Webster's Unabridged Dictionary, 3rd New International Version, it's correct. 72.197.201.129 04:37, 16 May 2006 (UTC)


 * Well, outside of that we're not Merriam-Webster's Unabridged 3rd NIV... "correct" spellings are easy to recognize. Whether a spelling is incorrect is hard to substantiate, given that multiple spellings can have acceptance (ax/axe, color/colour, griffin/gryphon) and dictionaries, especially those written for a particular region's POV, don't always list them all.  Besides, the  dictionary you mention only has, what, 450K words? and only English ones too, I hear—"international" indeed.   —Muke Tever 00:46, 17 May 2006 (UTC)

Misspelled words.
Is there any Wiktionary policy on misspelled words? I just added the misspelling "seperate" to separate. Many dictionaries have a chart of frequently misspelled words; I was unable to find anything like that here. Should users just add misspelled words to the definition pages for their correct spellings? Put in redirects? 66.114.70.80


 * I think there is some merit in your idea, but personally I'd rather have those misspellings which are not words in their own right show up as redlinks when they are entered as links in other entries, to make it more likely that the person making the entry or someone checking up on it will notice the misspelling. I'd rather see a list of commonly misspelled words, something that would fit well in the appendix.  Gene Nygaard 5 July 2005 05:40 (UTC)


 * There is a semi-official policy in Criteria for inclusion. The policy is that common misspellings should be included and labeled as such.  See torroid for example.  Having them present as entries removes any doubt as to what's going on.  If I look up torroid and it says "Common misspelling of toroid" I know exactly what's happening.  I don't have to double-check on a separate page, and I don't have to wonder if it's just a word no one has entered yet.  It would be nice to have a category of common misspellings, to provide a list as well. -dmh 6 July 2005 18:50 (UTC)


 * I really like the notion of a category for misspellings, (common, rare or disputed.) --Connel MacKenzie 6 July 2005 19:16 (UTC)  Even better, perhaps, would be a category of their corrected spellings: Category:Commonly misspelt words.  --Connel MacKenzie 6 July 2005 19:43 (UTC)


 * Agreement with dmh. If someone types in "torroid," and the page "toroid" comes up with no notice of the misspelling, there is a pretty good chance the user won't notice his mistake, and will continue using "torroid." Then again, just having nothing come up could lead to pages where a misspelled word is given the definition of the proper spelling (furthering confusion). Zachol

The Formatting section suggests the following for the definition of a misspelled word:
 * # misspelling of ...

That seems like a confusion of use vs. mention. Following seems better:
 * # misspelling of ...

Is that more correct or am I unaware of some dictionary convention? Rodasmith 21:19, 23 January 2006 (UTC)

My critique
I really have other things to do, but since I've taken the time to read it, I'll opine for a minute or three:
 * Under "Attestation" - change a.k.a. to aka
 * Under "Vandalism" - remove "(generally within minutes)" Consider including at the end of this paragraph something along the lines of:  If you think you have found an article that has been vandalised, please note that with {rfd}; that should bring it to the attention of the administrators faster.
 * Under "Misspellings, . . ." - in the second to last sentence, change "English" to "British" for clarity. Rationale:  In the previous sections we've been using "English" to refer to the language, then all of a sudden you use "English" to refer to "British"-type spellings; I believe that can cause unnecessary confusion.
 * Under "Inflections" - I thought the current practice was to add common inflections. Kindly correct me if I'm wrong.
 * Under "Names of actual people . . ." - I have to disagree with two examples given:
 * Empire State Building - I think this should be included as a dictionary entry because I don't think it's immediately obvious that New York's motto is "The Empire State" from which this building derives its name.
 * Thomas Jefferson - I think this should be included because this is the foundation of "Jeffersonian".

I hope this has been beneficial.

Cheers,

--Stranger 02:22, 12 September 2005 (UTC)


 * I can't agree about "a.k.a." It is normally pronounced as an initialism; removing the periods would give a contrary impression.
 * Nothing in the criteria forbids inflections; we only discourage them as useless. If you want to add them, it's your time.
 * It is clear that Thomas Jefferson refers to a specific historical individual. In the etymology for Jeffersonian it should be sufficient to have something like "derived from Thomas Jefferson (17??-1826)".  The years are useful for giving a time frame to the word.
 * But be sure to hide the wikimachinery: "derived from Thomas Jefferson...." - dcljr 23:26, 6 October 2005 (UTC)

Comment removed from article
This is an HTML comment I'm moving from the Attestation section of the article. Don't ask me what it means... - dcljr 19:04, 30 November 2005 (UTC)
 * &lt;!-- We might want to recommend adding an entry on .../Citations with as much information as is known, on the assumption that Wiktionary will be around as long as Wiktionary is around -->

Widespreadness
Is there any place where appropriateness for inclusion is discussed beyond widespread, which is quoted here? It seems pretty obvious to me in the extreme cases, such that a word used only in British English, would obviously be included, and the usage of knock up as 'have sex with' in South Ajax, rather than the standard "make pregnant" used elsewhere would be excluded, but where jam buster is used by ~1 million English speakers, and maybe recognised by a few million more, is that genuinely 'widespread'? I can't seem to find anywhere here where people even attempt to address this issue. Wilyd 16:59, 12 January 2006 (UTC)
 * Where in the world did you get the ~1 million figure for jam buster? - dcljr 02:21, 16 February 2006 (UTC)


 * Um, that entire section is about defining what we mean by it. Jam buster would not meet our criteria because you won't find A) three independent citations that B) use the term in running text C) spanning a year.  In general, we use books.google.com/print.google.com as a front-line check against protologisms.  The search results for jam buster indicate proper names and strange word combinations, but I didn't see one running text use of the two words together.  --Connel MacKenzie T C 00:27, 21 February 2006 (UTC)


 * Besides books.google.com I can highly recommend Amazon. Use the SIPs feature from one book, then modify the URL to find whatever word or phrase you want in many books. &mdash; Hippietrail 17:19, 22 February 2006 (UTC)


 * Absolutely. In fact, I think there may be some risk from relying on books.google.com too much.  These (and others) are wonderful resources, but dependence on one or the other should be avoided.  --Connel MacKenzie T C 17:25, 22 February 2006 (UTC)

Proper Names that are subject to Translation
There is a category of proper names (place names, given names etc) that are subject to translation. I believe these should be included in Wiktionary.

For example, I think we need the words London and Londres, Munich and München, even if neither is used in an attributive way. Also Peking and Beijing and any other forms in Chinese.

Also, names such as John, Jacques, Giovanni etc, which are all different language forms of the same name (I think). Probably each would qualify anyway.

Also names of stars named differently in different languages.

In fact, any proper name which has a commonly used translation or other language form should be eligible for inclusion, to fulfil the Wiktionary role in translation.

This would allow more proper names than just the attributive use criteria.--Richardb 07:56, 26 February 2006 (UTC)

CFI not universally applicable - Protologisms, WikiSaurus, concordances etc
There are many lists in Wiktionary, of varying purpose. Many of these contain words which do not necessarily meet the CFI as currently proposed. I strongly suggest that it would be wrong to apply the CFI to words in lists such as
 * the list of Protologisms
 * concordances
 * WikiSaurus word lists.

By their very nature, these lists do not carry verification, nor definition. The lists would mostly be wiped out if the CFI were applied to words in them.--Richardb 10:59, 26 February 2006 (UTC)


 * Seems reasonable but I've thought a few times about putting a reference to the CFI in the Wikisaurus entries to give guidance on people considering turning the red links blue. (might have been done I've not looked for a couple of months). MGSpiller 01:34, 28 February 2006 (UTC)


 * I've started a policy page WikiSaurus criteria which addresses the criteria (and action) for words in WikiSaurus. It could probably do with some expansion to cover whether words should be linked or not.--Richardb 01:47, 11 May 2006 (UTC)

ISO code criterion
On the project page:
 * If the language lacks an ISO 639 language code, it is almost surely not acceptable.

Problem with this: Hundreds of extinct languages do not have ISO 639 codes and probably will never have them. This criterion, as presently stated, does not seem consistent with the following statement:
 * As an international dictionary, Wiktionary is intended to include "all words in all languages".

Perhaps this is good for restricting constructed languages, but it doesn't seem good for natural languages.


 * The importance for having this is to avoid treating local dialects as a separate language when the proponents insist that it is a separate language. The word "almost" is also there when someone can make a good case. Eclecticology 00:54, 2 March 2006 (UTC)

Formatting?
Why is formatting (incorrectly) described for a misspelling redirect? Doesn't that belong in WT:ELE or somewhere else? --Connel MacKenzie T C 18:00, 4 March 2006 (UTC)

Pawley list

 * from User:Muke's comments in Wiktionary talk:Idioms...

Anyway: “The object is to describe what it takes to use a language properly as a member of society. Part of this is knowing what things to say, when to say them and how to say them in conventional ways. [...] Instead of striving to keep the lexicon small we need to enrich it. In fact we apply the terms ‘lexicon’, ‘lexeme’ (or ‘lexical item’) and ‘lexicalized’ in ways quite different from the grammarian. Now these terms are defined with respect to cultural facts as well as with respect to purely structural criteria. Complex words and compounds, and perhaps phrases, are considered part of the speaker's cultural lexicon if we can show that they have entered the social tradition, that they have attained the status of social institutions, being recognized as conventional ‘names of things’, as ‘terms’ in a set or terminology, as ‘set phrases’, and perhaps as ‘appropriate things to say’. All grammatical strings are not socially equal. We award special status to those strings that are culturally significant, even though they may also be perfectly grammatical. The upshot is an enormous increase in the number of lexemes compared to the ideal grammarian’s dictionary.” Andrew Pawley, as quoted in Making Dictionaries. The same source quotes his list of criteria for lexeme/headworthiness, which I have previously shared with the IRC channel:
 * 1) The naming test: Can the candidate for a lexeme be referred to in questions or statements such as the following: ‘What is it called?’ ‘It is called X.’ ‘We call it X, but they call it Y.’
 * 2) Membership in a terminological system: [...] Does X encompass other terms; can one say ‘it (dog) is a kind of X (animal)’ (=generic)? Is it a member of a set of similar things; can one say ‘X (a chair) is a kind of Y (furniture)’ (=specific)? Can it be used to show contrast; ‘is it a kind of X (fruit), but not a Y (vegetable)’? Does it have synonyms or antonyms?
 * 3) Customary status: Does the use of the phrase imply certain behavior patterns, values, or sequences of activities that are known by society at large? They represent conventionalized knowledge. For example, expected behavior at the front door is different from at the back door (besides their participation in idioms), indicating that these function as cultural units (lexemes) that are more significant than the sum of the parts. Consider go to the mosque, get off work, take a vacation.
 * 4) Legal status: Some phrases have such status that they are codified in legal usage: driving under the influence, breaking and entering, assault and battery, justifiable homicide. Even so-called ‘primitive’ societies with unwritten languages have categories of this sort for dealing with things like marriage negotiations and litigations over land, property, and adultery.
 * 5) Speech act formulas: Every language has some formulas “which carry out conversational moves” (Pawley 1986:106). For example, excuse me, how are you, y'all have a nice day, etc.
 * 6) Use of acronyms: This is often proof that a multi-word phrase represents concepts that have attained conventionalized or institutionalized status. Consider: VIP, DWI/DUI, IQ, RBI, SAT, ASAP, PTO, PTL, AWOL, BS, RSVP, R and R; in Indonesia: KB, DKI, KK, ABRI, DPRD, GBHN, etc.
 * 7) Single-word synonyms: the only one of its kind ↔ unique.
 * 8) Belonging to a terminological set: This is similar to (2), but focuses more on a pair of antonyms. Consider: tell the truth ↔ tell a lie, take care of ↔ neglect.
 * 9) Base for inflected or derived forms: short temper → short-tempered; ooh and ah → oohing and ahing, Indonesian ke mana → dikemanakannya (‘to where’ → ‘wind up where’).
 * 10) Internal pause unacceptable: The unacceptability of inserting a pause in the middle of clichés, idioms, and compounds is partial indication of their functioning as a unit. Consider the functional differences between bunch of baloney vs. bunch of bananas. One can say two bunches of bananas, but cannot do the same with the figurative sense of bunch of baloney.
 * 11) Inseparability of constituents: Insertion of other material changes the unity or naturalness of a phrasal lexeme. Consider: lead up the garden path. Saying lead up the beautiful garden path shifts it from a figurative to a literal interpretation. This is similar to (10) above.
 * 12) Ambiguity as to whether it should be written as a single word: whatchamacallit, thingamajig, man-in-the-street, oneupmanship.
 * 13) Conventionally reduced pronunciation: bosun (boatswain), won't, can't, o'clock, Newfoundland, Christmas, Worcestershire, thruppence (threepence) etc.
 * 14) Conventionally truncated forms: Widespread occurrence of shortened forms often indicates their role as a lexeme in the language: exam(ination), rad(ical), ex-con(vict), con(vict), con(fidence man), con(fidence trick), ex(-husband/-wife), pro and con, etc.
 * 15) Omission of headword: The modifier stands metonymically for the whole: She had an oral (examination), He had a physical (examination), A short (circuit) cut off the (electrical) power.
 * 16) Omission of final constituents: This often implies conventionalized knowledge: If you can’t beat ’em..., A stitch in time..., I haven’t the faintest (idea). These elided forms are often marked by peculiar intonation.
 * 17) Stress and intonation patterns: Different languages give different phonological clues for what is seen to function as a unit. English often uses stress and intonation. Government jargon is often coined through these means. Consider political matters memorandum.
 * 18) Invariable constituents or grammatical frame: The demanding and rhetorical Who do you think you are? does not have the same impact in the future. Kick the bucket does not mean the same when put in the passive. The thought had crossed my mind, and he took the law into his own hands are unnatural in the passive. Compare also stripped down formulaic sentences easier said than done, spoken like a man! There are also syntactically irregular or archaic idioms like easy does it, no go, no way, be that as it may, (she) wants in, once upon a time.
 * 19) Use of definite article on first mention: In English this can indicate the conventionalized nature of the ‘object’, showing the speaker assumes the identity is understood by the addressee: the fire department, the foreign legion, the eight ball.
 * 20) Writing conventions: Where there is a written tradition these may provide clues to perceived status as a unit. Capitals may indicate lexemes that are not typical proper nouns: Third World, Big Bang, Inner City. Beware that where a society has the luxury of supporting a literary community, some writers manipulate the use of capitals for unconventional purposes. Quotation marks may also indicate unitary status: he was considered a ‘bad boy’. Orally, some speakers use so-called or a preceding pause to mark an equivalent to quote marks.
 * 21) Unpredictability of form-meaning relation in semantic idioms: kick the bucket, chew the fat, shoot the breeze.
 * 22) Arbitrary selection of one meaning: Notice that button hole is a hole FOR putting buttons THROUGH, whereas bullet hole is a hole MADE BY bullets, post hole is a hole FOR setting posts IN, etc.
 * 23) Use in ritual language of parallelism: This is a special case of (2) and (8). Ritual language in parallelisms is widespread. It is found, for example, in Biblical Hebrew and many Austronesian languages, particularly in eastern Indonesia (Fox 1988). Existence as a paired entity in this context is sufficient for justifying its status as a conventionalized unit, and hence a lexeme.

Comments
I believe we should be honoring each of these as CFI. At some time in the past, I had reservations about a few of these, but not anymore. --Connel MacKenzie T C 01:40, 26 March 2006 (UTC)


 * Perhaps I'm misinterpreting the CFI, but I don't see a difference between that, or at least the de facto CFI interpretation, and the list above, save #7. [18:18, 19 April 2006 (UTC)] And #3. Davilla 19:43, 19 April 2006 (UTC)


 * I split this into a separate section for editing ease. --Connel MacKenzie T C 18:35, 19 April 2006 (UTC)


 * I think most of these, with the exception of idioms, come under regular attack, especially #1, #2, #3, #6, #8, #10, #15, #17, #19, #20 and #22. In the past, there was considerably more resistance to  keeping terms containing a space in the headword.  --Connel MacKenzie T C 18:35, 19 April 2006 (UTC)


 * I don't believe #15 has come under attack because the list only requires the definition of e.g. oral as a shortened form of "oral examination". By my understanding it does not require the latter, full form as an entry. If I am misreading, and this should be investigated, then you are correct that the CFI's idiomacy requirement does not match.
 * Reconsidering my own thoughts (again), Pawley must have meant oral examination even though it isn't clearly worded as such. Then you are right, these have not generally been considered idiomatic, and their inclusion would require a change in policy or maybe just thinking. It doesn't seem like it would be too difficult to get folks to support at this stage. Davilla 16:25, 24 April 2006 (UTC)
 * The problem with #3 is that the phrases can be altered, so this sort of information would more likely end up at a shorter phrase as an example, or usage note, or definition of its own (as in get off for "get off work"). I don't agree that take a vacation is the proper place because of all the different words that could be and often are inserted: "take a long vacation", "take many vacations", "take a flight cross country on business and before returning a convenient pleasure vacation". In most cases "take a vacation" is not going to be the search term unless someone already knew that take a vacation was the correct idiomatic phrase to look any of these up. My objections aren't strong, but that's the way I see it. That vacations can be "taken" is the information which needs to somehow be conveyed.
 * Thinking about this a bit more I'm starting to agree with you that phrases like take a vacation are legitimately idiomatic and should be included. This reasoning comes about from considering other phrases that even include "one" or "someone" as placeholders, chosen as the best titles for entries. Thus I would not be surprised if they have come under attack. I am quite curious to know how well the list above matches current practice. Davilla 17:27, 20 April 2006 (UTC)
 * Back to the point, the majority of these I still think do match the CFI. That the others you've mentioned often come under attack is just a result of their nature in being somewhat borderline. For as many of these that fall under no.'s 1, 8, 10, 17, 19 and 20 specifically, there are a good number of similar phrases that clearly would not. Most multiple-word phrases brought to RfD, aside from the clearly vandalous, are ruled as keeps I think. You said some of the legitimate ones were turned down in the past. I'd like to know if any recent deletions have fallen under the above criteria. As far as I've looked, only War in Iraq seems to counter my claim, although I'm hoping perhaps it won't be deleted in the end. Does that match your analysis? Davilla 19:43, 19 April 2006 (UTC)

I would argue that this criterion is very variably applied.
For example, an appearance in someone's online dictionary is suggestive, but it does not show the word actually used to convey meaning.

This criterion is applied at times to exclude words that someone does not like. Yet other words, such as medusetl, are accepted, though the only online reference that can be found is an entry in some dictionary.

My contention is that existence in some recognised dictionary is sufficient for a word to be accepted into Wiktionary. The only discussion really is what is a "recognised dictionary". There are some very obvious candidates, such as the OED.

I believe the Criteria should be amended to reflect this, as this is in reality current practice amongst most of us.--Richardb 01:55, 11 May 2006 (UTC)


 * Actually when words get sent to RFV, anyone who attempts to defend it by citing a dictionary is quite shouted down. As for medusetl, it does currently have a cite from a well-known work, so...   —Muke Tever 21:49, 11 May 2006 (UTC)


 * I've suggested before that inclusion in a dictionary, in fact any dictionary so long as it's in print and not one of these "urban" collections online, should at least qualify it for additional time in the RfV process. I would also be willing to define a "recognised" dictionary, one which would automatically permit a term here, as one for which the criteria for inclusion are strictly stronger than our own. Davilla 16:45, 12 May 2006 (UTC)


 * I'm positive this was one of our original criteria, back before we had a criteria page. I've also seen it shouted down - an attitude I dislike. I know at least the OED has this criterion, even if they can't find any other sources, in which case their citation will list the dictionary it's in, or even something like "in various dictionaries" on occasion. &mdash; Hippietrail 18:52, 12 May 2006 (UTC)


 * Davilla, I would love to hear your definition of a "recognised" dictionary, replete with an initial list of dictionaries. I took a couple stabs at doing that and was shouted down as they say.  --Connel MacKenzie T C 22:18, 12 May 2006 (UTC)


 * I started to compile one some time ago along with contact details. I'm sure I had another one somewhere without the contacts but including a lot more dictionaries, specifically the Gage Canadian, and a good few non-English dictionaries too. &mdash; Hippietrail 22:25, 12 May 2006 (UTC)

Encyclopedic entries, Names that have words derived from them
These are two types of entries which people have been arguing in favour of keeping recently even though they do not currently meet the CFI.


 * 1) Should we add something to say "prominent people like Abraham Lincoln qualify for an entry"?
 * 2) Should we add "because there is a hairstyle named after the Beatles, 'the Beatles' also qualifies for an entry"?

I think #1 depends on whether we have decided to become an encyclopedic dictionary. If we have, we have to decide on which encyclopedic entries to include: people, places, more? It is my firm opinion that a) this must be voted on before adding, b) encyclopedic articles must be marked as such, probably a category is sufficient. Personally I'm not in favour but I'll accept the popular vote.

As for #2, I see no basis whatsoever. Where does this line of reasoning come from? It's certainly not the practice of a single dictionary I can think of and I don't think "Wiktionary is not paper" is the explanation for that either. Why are the etymology sections not enough? In all the cases I can imagine these are covered by #1 as encyclopedic entries anyway, so though I also recommend a vote on #2, a vote to accept #1 would go a long way toward #2 also.

Thoughts? &mdash; Hippietrail 19:33, 12 May 2006 (UTC)


 * With respect to your first point, I contend that Abraham Lincoln should have an entry because he is a reference point - a symbol by which you can identify someone else's characteristics. Same goes for Einstein, Hitler, Mother Teresa, Casanova, Lothario, etc. It is, however, a small list, since the usage must be attested. bd2412 T 20:22, 12 May 2006 (UTC)


 * But on what basis does "being a reference point" bring a term into a dictionary rather than another kind of reference work such as an encyclopedia? What is to be gained from it? Are there any dictionaries you can think of that practice this? If not, why should we pioneer it (besides not being paper)? Also more importantly which tests would you apply to show that a proposed attestation shows use as a reference point? &mdash; Hippietrail 20:50, 12 May 2006 (UTC)
 * Imagine a foreigner to our tongue reading a newspaper that refers to so-and-so as being an Abraham Lincoln (or, more likely, an Abe Lincoln) - he will need to look up the term to see what that means - is it a good thing or bad? Looking up an encyclopedia article on Lincoln might help, but such an extensive coverage will reveal many characteristics - physically fit, honest, witty, brooding, conflicted, lionized, assassinated - any of which could be the one the term denotes. But we can explain that to call someone an Abraham Lincoln is to say that they are honest, indeed completely forthright. Similarly, to say that someone is a Mother Teresa is to say that they are saintly, not that they are old or strict in their views, or even a woman or a Catholic. bd2412 T 21:52, 12 May 2006 (UTC)


 * The examples above are funny because there have been many people named Einstein, Hitler, etc. If we're going to include encyclopedic terms, then shouldn't they be under Albert Einstein and Adolf Hitler, just like Abraham Lincoln? And would it be more appropriate to list "John D. Rockefeller", as he's known, or his full name "John Davison Rockefeller, Sr."? And why not redirect the other, since it's the same person, and also Abe Lincoln, etc.? And what if several people have the same name, like John Thompson? Why not have a disambiguation page, for surnames especially, in case someone is looking for a different person by the same name? In other words, why not just leave encyclopedic entries up to the frickin' encyclopedia in the first place!?! Davilla 05:44, 13 May 2006 (UTC)
 * Um, so if someone were to say to you, "Great idea, Einstein", or "you pick up the tab, Rockefeller", you'd need a disambiguation page to figure out which Einstein/Rockefeller they were talking about? The dictionary should only list the meaning attested to be associated with the name, and should only identify the person used as a reference in the etymology. And last I heard, there is no attested phrase along the lines of "that guy's a regular John Thompson!" bd2412 T 15:56, 13 May 2006 (UTC)
 * That's a TOTALLY different case!! The entry for Einstein as Albert Einstein is NOT encyclopedic because it refers to a specific Einstein. The word is a monicker that can easily be cited out of context, as in the example you gave. I wrote the entry for Rockefeller, by the way. Davilla 18:27, 13 May 2006 (UTC)
 * Then shouldn't the material you offer as the definition of Rockefeller actually be presented as the etymology? bd2412 T 02:16, 14 May 2006 (UTC)
 * Not if Einstein can be cited to mean "Albert Einstein" out of context, which must have been true before the figurative sense could have caught on. I'll see if I can find some. Davilla 09:36, 14 May 2006 (UTC)


 * In other words, terms that convey idiomatic meanings, like Einstein, should have entries like Einstein currently has, and not like Rockefeller. The Beatles, on the other hand, have nothing idiomatic, and thereby do not really meet any criterion, right? &mdash;Vildricianus 20:15, 20 May 2006 (UTC)


 * That's not my opinion. There are ways to justify all three that do not also permit encyclopedic titles like Albert Einstein. Edit: Certainly idiomatic is one good idea. Davilla 17:50, 26 May 2006 (UTC)

Tests for multiple-word entries

 * See: List of idioms that survived RFD DAVilla 09:29, 31 December 2006 (UTC)

After reviewing the Pawley list I've realized that the items aren't criteria so much as clues that an expression is a lexeme. Of course it's multiple-word entries that are of the most interest to us. Essentially, a term is acceptable if it is considered to be a logical unit, especially if it fails sum-of-parts, cannot be altered, or is used differently than the norms of the language would otherwise dictate. I've reconstructed a few tests from this list, but except as a starting point I'm not sure if relying on Pawley is the right way to proceed. Which tests we use should result from the tests developed during debates, as approved by the community. Since the list is disjunctive rather than conjunctive, a legitimate test cannot include any terms that should clearly be excluded. I think that's the best gauge we have so far for evaluating these. But then the enumerated list could never be considered complete, in the sense that new tests could be added when terms generally supported by the community are found not to fall under any accepted rule. For instance, I'm not sure if any of these allow for empty space.

From the Pawley list
As this is meant to be a starting point, I've only selected tests that I'm certain will pass consensus. In fact, that's exactly where I'd hope this is headed, which is why I've omitted some ideas that have potential. Please do not fault me on incompleteness. As stated above, the list will always be incomplete, but can be extended when the need arises.

1. The fancy dress test.
 * Terms that are not understood in a different dialect although all constituents are understood.

3. The fried egg test.
 * Terms that imply certain social knowledge that could not be derived from any of the constituents, nor from their combination.

4. The prior knowledge test.
 * Terms that have a specific technical meaning in a certain field.

5. The never mind test.
 * Terms that are used to structure conversation.

10,12. The in between test.
 * Terms that are tightly bound, in which a pause cannot be inserted, or for which concatenation seems natural, if not standard.

12. The all right test.
 * Terms for which there is even the question as to the legitimacy of concatenation.

16. The easier said test.
 * Terms whose final constituents are omitted, implying conventional knowledge.

17. The rocky chair (or pet name?) test.
 * Terms signified as logical units by unusual patterns of stress or intonation.

18. The "mind was crossed" test.
 * Terms that cannot be rewritten in certain grammatical frames.

18. The once upon a time test.
 * Terms that are irregular or archaic syntactically.

22. The Egyptian pyramid test.
 * Terms which do not have the most general meaning attributable, for which specific meanings are assigned to the constituents.

Some of these, e.g. 10 and 12, may be duplicative, but there's no harm done. The question is if any are too broadly written. We also need to develop tests for phrasebook entries. Davilla 22:28, 13 May 2006 (UTC)

Names of characters in books and films
Are we allowed to add imaginary people? Donald Duck, Gandalf etc but not real people Geoffrey Chaucer, Rudyard Kipling etc. If that is the case - it seems silly to me. Παρατηρητής
 * Well, your complaint is legitimate, if somewhat overstated, but the point I think is to be a reference of language rather than factual information. Going by my own opinions about the inclusion of terms:
 * I would guess Donald Duck could probably be cited out of context. Gandalf would get the axe if the fictional character were part of the definition. But as only a (dubious) external link, that hasn't been the case.
 * It might be difficult, but Geoffrey Chaucer and Rudyard Kipling could be cited in places as simply Chaucer and Kipling, unintroduced. However, I'd doubt you could find either full name out of context.
 * But then, what would you expect? Few people talk about Chaucer or Kipling. Anyways, they're included on Wikipedia, so is there a need for complaint? The point is to avoid duplication. Davilla 17:38, 17 May 2006 (UTC)
 * Except, of course, for the old joke. Do you like Kipling? No, I'm afraid not. I have never kippled before. Andrew massyn 22:35, 20 May 2006 (UTC) :)


 * I haven't a clue about how Chaucer or Donald Duck could meet the CFI. Chaucer should say something about the surname, not the person, and Donald Duck, well, there's a heap of translations there... Should we adapt the CFI regarding translatable names? It seems that they receive more endorsement than non-translatables. &mdash;Vildricianus 22:21, 20 May 2006 (UTC)

Encyclopedic
This subject is currently under debate. This outline establishes my opinion on the idea. Davilla

Encyclopedic means that the sense refers to a specific person, work, or other historic topic. The following pages are specific examples of encyclopedic names that do not merit inclusion in Wiktionary under these titles:


 * Popular films or series such as Harry Potter or I Love Lucy.
 * Culturally significant stories such as the Little Red Hen or the Four Dragons.
 * Titles of novels such as Little Women.
 * Dictionaries such as the Shorter Oxford English Dictionary.
 * Celebrated authors such as William Shakespeare.
 * Important historical figures such as Dwight Eisenhower.
 * Companies and trademarks such as Xerox.
 * Newsworthy locations such as Waco, the city in Texas.

In any of the above examples, a person who ran across the term wondering what it meant would be more likely to look in an encyclopedia than a dictionary. The rationale for not including encyclopedic entries in general reflects the desire to avoid duplicating the content of Wikipedia. However, the rules are inclusive rather than exclusive. That means that encyclopedic terms can be included in many cases, provided they have entered use linguistically rather than just socially or culturally. An entry may be included if it satisfies any of the following criteria:


 * The term is used attributively.
 * Places: New York as used in "New York delicatessen".
 * There are other attested words that derive from the name, not counting original trademarks that have been genericized.
 * Places: The family name Salisbury from the city of Salisbury. Bostonian from Boston, the city in Massachusetts. However, Jeffersonian can refer to any city named "Jefferson".
 * People: Jeffersonian from Jefferson, a Founding Father and U.S. President. Washington the state from Washington the President.
 * Other: Beatlesque from the Beatles, the rock music group; Micro$oft from Microsoft, the software company.
 * Notes: This doesn't work the other way around. Just because there's an entry for Sleeping Beauty doesn't mean Walt Disney's film deserves mention.
 * The term is used idiomatically or figuratively.
 * Places: The capital of a country is used as synecdoche for the government of the entire political state.
 * People: An intelligent person can be labeled an Einstein; there is a great deal of notoriety associated with the name Kennedy.
 * Other: The American Heritage Dictionary defines Cinderella as a person who "unexpectedly achieves recognition or success after a period of obscurity and neglect". This usage derives from Cinderella, the character in the fairy tale.
 * A single name stands for a specific person or place in the general context, although there may be many other people or places with that name. This is generally a first step to figurative use.
 * Places: Athens is a common place name, but without any other contextual information it refers to a specific city in Greece. Thus the Greek city gets an entry while the American city in Georgia, of many places, does not.
 * People: Eisenhower in the general context means the U.S. President, not his grandson David Eisenhower, after whom Camp David is named.
 * Other: For its fans, Rocky is the shortened form of The Rocky Horror Picture Show.
 * The term has standard, or common and non-trivial, translations into many other languages, especially on the other side of the world.
 * Places: Although Taipei is a transliteration from the Chinese 台北, it is a standard name that has survived newer romanizations.
 * People: There are many common mappings for names, and the translations for most celebrities follow these. In some cases there are secondary mappings, such as Martial for Martialis, but even Marcus Valerius Martialis does not deserve an entry at his full name.
 * Other: In Vietnamese, the Vietnam War could just as easily be referred to as Nội chiến, the "Civil War".

Note that proper nouns can be generic terms as well, and these rules for encyclopedic meanings do not apply to given or family names such as John or Smith, or to generic place names such as Jackson, which may be defined simply as common names. The generics of trademarks such as xerox, used as a common word, can also be included if they can be attested under the normal criteria.

Comments
A culturally significant story is bound to have translations, isn't it? Edit: Maybe not. Changed reference above. Davilla 21:54, 26 May 2006 (UTC)

Edit: Trademarks becoming generic do not warrant an entry for the original trademark. Davilla 14:45, 28 May 2006 (UTC)

Edit: OED deserves mention because it could be used out of context and people would be expected to know what it means. Changed to SOED. Davilla 08:27, 21 June 2006 (UTC)

People from Boston, Lincolnshire, England (a very old town) were probably called Bostonians long before Boston, Mass was founded. But that doesn't invalidate your argument. I toyed with suggesting a category (also applying to Boston) where a foreign (to UK in this case) place is so well known that people nearby still assume that the foreign place, rather than the local eponymous one, is being referred to unless specifically differentiated, eg in the UK, Boston, Pennsylvania, Cyprus, Paris, etc or in the US (guessing) Plymouth. However, on reflection, this is probably an encyclopaedic issue, whereas your proposal is linguistic, so more valid.


 * I'm sure there's plenty of room for improvement. Davilla 14:54, 21 June 2006 (UTC)

An interesting problem for the future is Westward Ho! a British town named after a book, and the only official British place name containing a !. Does this make it a valid entry for the town (since the ! makes it a linguistic anomaly) with the book being mentioned in the etymology, or what? Enginear 11:04, 21 June 2006 (UTC)

Stock symbols
I recall it coming up before, but the specific exclusion of stock ticker symbols seems to have been removed? Was it never here, only in conversations on BP? Any objections to indicating that we don't want them here, here on CFI? --Connel MacKenzie T C 07:35, 28 May 2006 (UTC)
 * Vildricianus has suggested keeping this page updated. I don't or barely recall the debate on this topic, and certainly not the outcome, but you make it sound like this was the consensus, in which case that would be the correct action to take. It was a question of no English context, right? Then this might annotate the running-line-of-text concept well. Davilla 14:43, 28 May 2006 (UTC)

Lost patience with CFI
Certain users decided to develop these CFI. Which is not a bad thing, even if the Criteria are somewhat biased and arbitrary. But then those very same users selectively apply the CFI to in effect bowdlerise Wiktionary. The vast majority of words in Wiktionary do not meet CFI, but it's only the "offensive" ones that get targeted.

I really can't be bothered to attempt to modify the "policy", as regardless of the policy the bowdlerisers will still do their best/worst to remove "unsavoury" words.

The CFI are discredited by the very biased way they are used/applied. I can't be bothered to give examples. Just use the Random Page function a few times, and for each page see if the page meets the criteria. Particularly the one about 3 citations. Want to take any bets on what percentage of pages actually meet the criteria ?--Richardb 12:29, 29 May 2006 (UTC)


 * What's a dictionary without a set of criteria? The number of users and contributors is only going to increase over time, and we're de facto defenseless without decent CFI. They're not just there to make deleting content justified; the opposite is equally important. Without CFI, you can't make a point against people who want to remove things unjustly. And BTW, what's with this "vast majority doesn't meet CFI"? It seems to me that you have a different impression of Wiktionary in its current shape. &mdash;Vildricianus 14:37, 29 May 2006 (UTC)

Trademarks

 * Being a trademark or a company name does not guarantee inclusion.

While the companies mentioned in the text may claim their trademarks are nouns and not verbs, should widespread use of the words like "photoshopped" and "googled" trump those claims? They can claim all they want but if millions of people use the word, it exists, at least in my book. - 131.211.210.11 09:10, 30 August 2006 (UTC)
 * That quotation pertains to the brand-specific (generally capitalized) usage. The words you're talking about are not trademarks by our definition because they are generic (so usually lower-case), although it's better to mention the status of the word for the information of readers. What the company says about the word couldn't matter less. They are not the authority on linguistic use, and what they claim does not carry legal weight in this context. The best example is xerox which has been listed in the OED since the fifties. Push comes to shove, it takes three quotations as per the CFI. DAVilla 04:28, 31 August 2006 (UTC)

"Verified through"
“‘Attested’ means verified through”: I don't quite catch it. Is that supposed to imply "verified through any of these" or "verified only through all of these"? Dart evader 20:54, 6 October 2006 (UTC)
 * 1) Clearly widespread use,
 * 2) Usage in a well-known work,
 * 3) Appearance in a refereed academic journal, or
 * 4) Usage in permanently-recorded media, conveying meaning, in at least three independent instances spanning at least a year.


 * Through any of them (note commas after 1 & 2, and or after 3). --Eng in ear 21:05, 6 October 2006 (UTC)


 * I'm surprised the 1st one hasn't been questioned yet. I've always considered 1,000,000 google hits to be a decent indication for that.  Perhaps that is too low a number?  --Connel MacKenzie 21:35, 6 October 2006 (UTC)
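
The "any of them" reading above can be sketched as a simple disjunction. This is only an illustration of the wording quoted in this section, not the policy itself: the `Citation` type, the function names, and the idea of treating "independent instances" as distinct source labels are all assumptions made for the example, and items 1 to 3 are passed in as plain yes/no judgments because the page gives no mechanical test for them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    source: str    # hypothetical label; distinct labels stand in for "independent"
    used_on: date  # date of the permanently recorded use


def meets_three_citation_rule(citations):
    """Item 4: at least three independent instances spanning at least a year."""
    independent_sources = {c.source for c in citations}
    if len(independent_sources) < 3:
        return False
    dates = [c.used_on for c in citations]
    # Assumption: "spanning at least a year" means 365 days or more
    # between the earliest and latest recorded use.
    return (max(dates) - min(dates)).days >= 365


def is_attested(widespread_use=False, well_known_work=False,
                refereed_journal=False, citations=()):
    """Disjunctive reading: satisfying any one of the four items is enough."""
    return (widespread_use
            or well_known_work
            or refereed_journal
            or meets_three_citation_rule(list(citations)))
```

Under this reading a single use in a well-known work suffices on its own, while a term relying on item 4 needs all three parts of that item (three instances, independence, a year's span) at once.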

Typo
There is a typo on the page, but since it is blocked, thought I’d report it here: the word ‘hypocoristics’ should be capitalised. Remove this remark at will. 134.2.147.26 13:39, 18 October 2006 (UTC)
 * Indeed it should, being at the beginning of a sentence. Done. Robert Ullmann 14:04, 18 October 2006 (UTC)