Talk:clustersize

RFV discussion: July–August 2019
Scannos I think. You find a space when you look at the actual pages. (Remember, if you do super-hard work and verify this by scraping three from the bottom of the barrel, it's your responsibility to mark it as rare or nonstandard.) Equinox ◑ 09:36, 27 July 2019 (UTC)


 * Actually, it did not seem all that hard to find uses of "clustersize" as opposed to "cluster size". I have added several citations to the citations page, and found quite a few more. Kiwima (talk) 21:47, 27 July 2019 (UTC)
 * I feel unhappy about these kinds of entries that seem to me more like authors either referencing a variable name or being unable to distinguish the writing of a variable name from proper English. Mihia (talk) 14:00, 4 August 2019 (UTC)
 * Wiktionary would be a better dictionary without this entry. - TheDaveRoss  13:04, 9 August 2019 (UTC)


 * I agree, but I also feel that is more a matter for requests for deletion than for requests for verification. I am calling this RFV-passed, but am moving this to requests for deletion. Kiwima (talk) 22:18, 31 August 2019 (UTC)

RFD discussion: August 2019–April 2020
Moved from Requests for Verification. The relevant parts of the discussion appear below:

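 * I feel unhappy about these kinds of entries that seem to me more like authors either referencing a variable name or being unable to distinguish the writing of a variable name from proper English. Mihia (talk) 14:00, 4 August 2019 (UTC)
 * Wiktionary would be a better dictionary without this entry. - TheDaveRoss  13:04, 9 August 2019 (UTC)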

I am calling this RFV-passed, but am moving this to requests for deletion. Kiwima (talk) 22:20, 31 August 2019 (UTC)


 * Moved to RFD by Kiwima on 31 August 2019. Citations are at Citations:clustersize. Frequency is at . What is the deletion rationale? --Dan Polansky (talk) 06:38, 6 September 2019 (UTC)


 * Delete per my comment above. Variable name, not a proper word. Mihia (talk) 19:21, 7 September 2019 (UTC)
 * What makes you think so? Citations:clustersize suggest otherwise, e.g. a quotation from Journal of the Physical Society of Japan: "The clustersize increases by the addition of salt." --Dan Polansky (talk) 07:17, 8 September 2019 (UTC)
 * Not everything ever committed to print is correct English. We need to apply editorial judgement too. There are any number of these variable-name-style compounds citable in running text: "edgelength", "linesize", "sampleweight", etc. etc. Mostly these are written by people who do not understand how to spell, and do not understand the difference between writing e.g. "edgelength" in computer code and "edge length" in normal English. We are not helping anyone by recognising or legitimising these people's spelling errors. Mihia (talk) 22:47, 8 September 2019 (UTC)
 * Even if so, "Variable name" has been refuted, as far as I can tell. "Sometimes used as a variable name" was not refuted, but that can hardly be relevant.
 * As for whether this is a misspelling, it seems to be one given . However, this frequency ratio would suggest it is a common misspelling, and therefore a keepable one (WT:CFI). --Dan Polansky (talk) 13:29, 21 September 2019 (UTC)
 * I am in favour of including "useful" and "important" spelling errors. I am not in favour of including a million™ entries along the lines of "clustersize: misspelling of cluster size". Mihia (talk) 20:08, 21 September 2019 (UTC)
 * Ok; do you have some examples of these "useful" spelling errors, for calibration? (The policy in WT:CFI does not speak of "useful" or "important", so that would be a CFI override.) --Dan Polansky (talk) 20:11, 21 September 2019 (UTC)
 * I mean things like "alot" or "miniscule" or "i" for "I" or "it's" for "its". Mihia (talk) 20:53, 21 September 2019 (UTC)
 * And how many of the uses in your comparison are variable names or named parameters? It's hard to trust frequency ratios in the absence of such information. Chuck Entz (talk) 21:54, 21 September 2019 (UTC)
 * That's a good point. When I was looking at this before, I think I searched for phrases such as "the clustersize is" which it seems could only be (IMO) errors, not legitimate uses of or references to variable names. This does not have sufficient frequency to show up on 'Ngrams', however. Mihia (talk) 22:00, 21 September 2019 (UTC)
 * A better comparison would be the plural, which is rarely used for variables and parameter names. I just did a Google Books search on "cluster sizes" and got 55,000 raw hits. The same search for "clustersizes" came up with 38 raw hits. Of those, a number were really "cluster sizes" when you looked at the page image, and there were a couple where the snippet was selectable text and contained things like "Wealsomeasured distributions oftwo clustersizes". Out of the 38 raw hits, I found 5 where one could look at the page image or snippet image and verify that there was no space. Chuck Entz (talk) 22:51, 21 September 2019 (UTC)
 * (outdent) : the solid form not found., confirms your (Chuck's) observations that the solid forms are often variable names or scannos, although not always.  confirms there are many variable names counted in the results, which should probably be taken into account together with the frequency ratio of 500. I do not know how to deal with this, and I abstain for now. --Dan Polansky (talk) 20:58, 22 September 2019 (UTC)
 * Deleted - TheDaveRoss  19:45, 2 April 2020 (UTC)