Wiktionary talk:Verifiability

dmh's alternative about "Dubious sources"
The primary requirement in supporting dictionary entries is to verify usage. This is significantly different from verifying the contents of an encyclopedia article. For example, if a random blog makes a statement such as "the average American has 3.8 cousins and 7.4 nephews and nieces," there is no particular reason to believe that the average American actually has 3.8 cousins. However, there is very good reason to believe that the words "American," "cousin," and "average" are in use.

In this regard, the appearance of a term is much more trustworthy than a purported definition. If someone asserts "It is incorrect to use the word podium to refer to a lectern," there is no particular reason to believe this, regardless of whether the statement occurs on the internet or in print. As we are documenting usage, usage trumps opinion. In this particular case, it is not hard to find examples where podium is clearly used to mean a lectern or similar item.

In general, it should be assumed that people use language in order to convey meaning, and not to influence the editors of dictionaries. There are clear exceptions to this, in which a particular party tries to promote a favorite word or usage, often of that party's own invention. Such exceptions are generally easy to identify and in any case, they are the exception.

What are we trying to prove?
Part of the confusion over verifiability arises from the need to prove several different things. Here is an attempt to break this down. I will occasionally refer to currently controversial entries because the issues are close at hand. Naturally, I have an opinion about such terms, but my focus here is to highlight issues, not to push for any particular term (I do this elsewhere :-).

First, we may wish to know whether a term is in use at all. This is where the infamous "This gets xxxx Google hits" is appropriate. "This gets 1 million Google hits." is an easily verifiable statement, and past some arbitrary limit, should be enough to establish that a term is worth studying. Exactly where that limit should be is open to debate and compromise. I would tend to put it in the range of 1,000 or 10,000. Others would prefer more like 1,000,000, and I can understand such a desire. By either measure, terms like webinar are clearly worth an entry, while a term like choad may or may not be, and one like bot herd would need separate justification.

However, mere Google hits do not establish meaning, nor do they speak to idiomaticity ("it is" gets billions of hits). I believe that establishing meaning breaks down to two cases:


 * Technical senses of terms are established by authority, whether formally or informally. E.g., SI units like meter and second have very precise official definitions.  Terms like luminosity distance may not be controlled by a central authority, but still have rigorously defined meanings which are available through various more or less official channels.  If I try to use luminosity distance incorrectly in a journal submission, I won't get published, and conversely, if I run across it in a journal, I can be confident of what it means.  This also applies to non-academic fields; terms like I-beam or A4 paper have standard definitions.  In such cases, it is imperative to track down the most authoritative source available and reference it clearly.  We are trying to verify that some relevant authority has blessed a given definition of a term.
 * Colloquial senses are, pretty much by definition, not established by authority (language mavens to the contrary). They must be gleaned from context, known senses of the term and related terms, etymology, history and so forth.  Existing dictionaries have done considerable work in this regard, but language evolves quickly enough that it is impossible to be completely up to date.  This is where the "wiki" part of "wiktionary" comes into play.  It is our job here to collect usages of terms and try to glean definitions from them in a necessarily somewhat subjective process, but one which must ultimately be supported by actual quotations.  We are trying to verify that people have used terms in a given way.

This second point, as far as I can tell, is the source of much of the disagreement over verifiability. In these cases, we are simply trying to establish that someone actually used a term to convey meaning in a given language. We don't care if this person is making a factually true statement, we don't greatly care if the person is anonymous or pseudonymous, except in rare cases, and we don't care whether the term in question is slang, vulgarity, recently coined or considered distasteful for whatever reason.

There are legitimate reasons to prefer stable sources, including printed sources, in such cases. However, if I find a quotation on Google Groups such as "According to a most conservative estimate, if one pandava jubok will have draupodi-choda 3 times a week, 3x5 = 15 chodas/week = two chodas daily for the poor girl continuously for 14 years, RAIN or SHINE.", it's very clearly verifiable that someone used the word "choda" here. It's even reasonably clear from context that in this case the word means "sex act". Would I prefer to see a well-proofread printed source using the term? Of course. I would also love to see feedback from native speakers. But none of this keeps a quotation such as this from being usable evidence. It shows the term in use, and thanks to Google's archiving, it can be expected to remain verifiable over time. And there are plenty more like it.

This example is particularly instructive because a search of relevant print sources failed to turn up the term. Restricting verifiability to print sources would cut out precisely the kind of terms that Wiktionary is best suited to documenting. N.B., by this I don't mean strictly vulgarity, slang, netspeak and such, though those will likely be overrepresented. I mean anything which is not likely to have made it through the usual channels. It would also include developing senses of existing terms. -dmh 20:30, 21 September 2005 (UTC)

Verifiability
This is a policy think tank that was pulled off of Wikipedia, and after an unsuccessful attempt at adapting it to Wiktionary, sat around for four years. This page is pointless, given that all of Wiktionary is original research, and verifiability is not a requirement at all, short of the attestation requirement that a word exists. --Yair rand 23:28, 26 April 2010 (UTC)
 * I don't think that "all Wiktionary is original research", not like none of our editors have ever checked in a dictionary. Back on topic, delete, it looks like an unfinished transwiki. It's of tenuous relevance and has few incoming links, so just delete it. Mglovesfun (talk) 23:33, 26 April 2010 (UTC)
 * Delete or replace by an updated copy of Descriptivism. Conrad.Irwin 18:16, 27 April 2010 (UTC)
 * Delete this. --msh210℠ 18:20, 27 April 2010 (UTC)
 * It looks like an interesting bit of history. Eclecticology was an important early contributor. But, even so, it doesn't belong at its existing location, as if it reflected our current practice in all regards. DCDuring TALK 23:35, 27 April 2010 (UTC)
 * Deleted. --Yair rand (talk) 01:13, 28 May 2010 (UTC)