Wiktionary:Information desk/2020/August

Free corpora
Where can I find free, average-sized corpora of English and other languages (besides Wikipedia)? I want to count word frequencies and lemma frequencies (using spaCy) in text. Borneq (talk) 11:59, 2 August 2020 (UTC)


 * Do you know about the collection of English corpora at https://www.english-corpora.org? I don't know about other languages. --ColinFine (talk) 22:01, 4 August 2020 (UTC)
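Once a corpus is in hand, the word-frequency half of the question can be sketched with the Python standard library alone. This is a minimal illustration, not a recommended pipeline: the regex tokenizer and the `word_frequencies` helper are assumptions made for the example, and the sample text is invented.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lowercased word forms in a text.

    The regex keeps runs of letters, allowing an internal apostrophe or
    hyphen (e.g. "don't", "well-known"). It is a simplification, not a
    linguistically informed tokenizer.
    """
    tokens = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())
    return Counter(tokens)


# Invented sample text, standing in for a real corpus file.
sample = "The cat sat. The cats sat on the mat."
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

For lemma frequencies, the same `Counter` could be fed `token.lemma_` for each token of a spaCy `Doc` instead of the regex matches; that requires spaCy and a language model (for example `en_core_web_sm`) to be installed, so it is left out of the sketch.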

Understanding how the Verb Inflection system works
I'm looking at the Verb 'Trek' in English

https://en.wiktionary.org/wiki/trek#Verb

If I click 'edit' I can see:

and somehow the system generates this sentence:

trek (third-person singular simple present treks, present participle trekking, simple past and past participle trekked)

How does it know to remove the 'k' from the parameter for the simple present 'treks'?

If I look at the documentation page:

https://en.wiktionary.org/wiki/Template:en-verb/documentation

(added a k)

...seems to make more sense, but looks like the documentation is outdated?

Is there source code I can look through to see what is happening? Or is it hardcoded: 'double kk' > remove a 'k'?

--Bendecko (talk) 17:08, 3 August 2020 (UTC)


 * Hi. Thanks for the question: it prompted me to go looking. The source code is at Module:en-headword, and the relevant code is at lines 536-8. It appears that the module mostly works off the page name (here, "trek"). If the parameter ("trekk") doesn't have a recognised form, it is used as the base of the past and the present participle, but the 3 sg. pres. is formed from the page name. The old format is still supported, according to the code, but yes, the documentation could do with updating. --ColinFine (talk) 22:17, 4 August 2020 (UTC)
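The behaviour described in the reply above can be sketched as follows. This is an illustrative Python approximation of the description only, not a translation of Module:en-headword (which is written in Lua): the function name `en_verb_forms` and its parameters are hypothetical, and real spelling rules such as -es, -ies and e-dropping are ignored.

```python
from typing import Dict, Optional


def en_verb_forms(pagename: str, stem: Optional[str] = None) -> Dict[str, str]:
    """Approximate the described logic for a regular English verb.

    pagename: the entry title, e.g. "trek".
    stem:     an optional unrecognised parameter, e.g. "trekk", used as the
              base for the past and the present participle only.
    """
    base = stem if stem is not None else pagename
    return {
        # The third-person singular is built from the page name, which is
        # why the extra "k" never appears in "treks".
        "third_person_singular": pagename + "s",
        "present_participle": base + "ing",
        "past": base + "ed",
    }


print(en_verb_forms("trek", "trekk"))
```

So no "k" is removed anywhere: on this reading the doubled stem is simply never used for the simple present.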

Category:Chinese_lemmas
I can report we are now at 200,000 Chinese lemmas. 大家辛苦了！ The two-hundred-thousandth entry was 錢荒 ("money shortage"). ---> Tooironic (talk) 23:20, 3 August 2020 (UTC)

ihrzen
There seems to be an issue with the pronunciation section. Tharthan (talk) 23:32, 4 August 2020 (UTC)
 * It was a perfectly fine pronunciation section, or would have been if it were at siezen. I removed the incorrect audio file and replaced the IPA with a request for pronunciation template. The new audio file seems a bit odd, but there's nothing wrong with the pronunciation itself. Chuck Entz (talk) 02:47, 5 August 2020 (UTC)

Lua error: not enough memory
Hello,

I am getting the above error when looking at the page: https://en.wiktionary.org/wiki/%E9%A0%AD#Japanese

Everything looks fine up to the paragraph "Etymology 9", but below "Etymology 10" the error is all over the page.

The following snip exhibits the problem:

///////////////////////snip///////////////////////////////////////////////////////

Etymology 9 Kanji in this term 頭 どたま Grade: 2 Irregular

Contraction of ど (do-, “super-”, often used ironically as a derogatory prefix) + 頭 (atama, “head”). Pronunciation

(Irregular reading) (Tokyo) どたま​ [dòtámáꜜ] (Odaka – [3])[2] IPA(key): [do̞ta̠ma̠]

Noun

頭 • (dotama)

(derogatory) head

Usage notes

Often spelled in hiragana, as どたま. Etymology 10 Kanji in this term 頭 ず Grade: 2 on’yomi

/du/ → /d͡zu/ → /zu/

From Middle Chinese 頭 (MC dəu). The Lua error: not enough memory reading, so likely an earlier borrowing. Pronunciation

Lua error: not enough memory Noun

Lua error: not enough memory

////////////////////////////////////////end snip////////////////////////////////////////////////////////

I am using Fedora 32 (5.7.10-201.fc32) and the Lua error shows up in Firefox (78.0.2) as well as in Chromium (84.0.4147.89). Would anyone have a suggestion as to how I could avoid this error?

Best regards Kai


 * It's a known problem. Our code is too memory-intensive in the eyes of the system. We don't have a solution. —Suzukaze-c (talk) 02:08, 6 August 2020 (UTC)

Replying to posts
How do you reply to comments posted to the talk page of an article, or here on the tearoom?
 * Like this. —Mahāgaja · talk 16:01, 7 August 2020 (UTC)


 * Above reply is not really helpful for beginners. See this Wikipedia article section 'Replying to an existing thread'. PianistHere (talk) 12:05, 11 November 2020 (UTC)

Category:Regional English
Quite a few entries are categorized directly into this category, rather than being categorized into specific regional subcategories. At that point, is there any distinction which it is sensible to make (and reasonable to expect to be maintained) between this and Category:English dialectal terms? - -sche (discuss) 16:08, 7 August 2020 (UTC)

Incorrectly classified nouns
I believe cytoplasm is incorrectly classified as a countable noun (it is in fact an uncountable noun) and given an incorrect plural form (cytoplasms). While the word cytoplasms has been used, and therefore might meet the attestation requirement, it should be noted as a non-standard form. It doesn't appear that I can directly edit the plural form on the page for cytoplasm, though.

How do you fix this?

129.130.144.198 22:07, 7 August 2020 (UTC)
 * It is written in the documentation of : The code is.
 * If you later encounter cases where such a situation only applies to specific senses (and others are perfectly countable or uncountable), you use in front of the sense. Fay Freak (talk) 22:12, 7 August 2020 (UTC)
 * Modified accordingly, although it's obviously not true to say the plural is incorrect. —Μετάknowledge discuss/deeds 22:41, 7 August 2020 (UTC)
 * For example, the plural uses seen here are perfectly fine. --Lambiam 00:55, 8 August 2020 (UTC)

What did Mrs. Bryerson say here? photodiode? It was something used as a verb that sounded like "fododio"...
For context, so you don't have to watch the whole episode (if you don't want to), there isn't much of it, other than that Mrs. Bryerson is a stock old lady neighbor character in , a Canadian kids' cartoon. She is talking to someone on the phone. These particular ramblings seem to purposefully just be random topics jumbled together, because they were thinking...maybe that older ladies would be stereotyped to talk about those topics...? Idk honestly...
 * 1999, , "There's No Place Like Gnome" (season 1, episode 4b):
 * Mrs. Bryerson: Oh, these fenders don't fododio like they used to, but I can fandango with the best of them. Well, sure, skydiving will keep your adrenaline going—keep you young, I think! Bye now!

I won't link the video footage here, since it is copyrighted. But if you want to find footage, the line starts at 21 minutes and 10 seconds into the episode.

Please help me find this out. I intensely dislike having to admit that bits of language on my transcription projects are unidentifiable. PseudoSkull (talk) 11:37, 10 August 2020 (UTC)
 * It sounds more like "fododle-o" to me. I think it's just a madeup word intended to sound funny. —Mahāgaja · talk 12:23, 10 August 2020 (UTC)
 * But is it supposed to have an implied meaning? What do “these fenders” have to do with it? Do these make any sense in the context? Which verb might make sense in the phrase “these fenders don't <> like they used to“? --Lambiam 13:08, 10 August 2020 (UTC)
 * I have no idea. Since Fender is also a surname, she could even be talking about people or electric guitars. —Mahāgaja · talk 14:51, 10 August 2020 (UTC)


 * Is it only me or does the first diphthong sound slightly more open than the second and last ones? If so, maybe the sentence is These fenders don’t fall “dodio” like they used to with L vocalisation, which would relate it to the skydiving mentioned in the second half. — Ungoliant (falai) 15:11, 10 August 2020 (UTC)


 * I found a few other possibilities: fold olio. olio is also a Spanish (or Portuguese) borrowing as fandango is, and it has two musical senses. I don't know about fold having a sense that would match this, but the final sound being olio seems likely, if it's not made-up. PseudoSkull (talk) 01:23, 11 August 2020 (UTC)
 * I think it’s intended as Vo-do-de-o, which was (along with variants like Vo-de-o-do and Vo-do-do-de-o) a popular string of nonsense syllables for scatting along to music back in the 1920s and ’30s. As to the meaning — are probably guitars in context, so it would be a pseudo-old-timey slang way of saying ‘these guitars don’t make music like they used to, but I can dance with the best of them.’— Vorziblix (talk · contribs) 13:48, 11 August 2020 (UTC)
 * I find the Vo-do-de-o theory plausible and appealing (seen here in that spelling on a record sleeve), with Mrs. Bryerson temporarily reliving the personality of her wild years as a young woman, which then would have been in the – but then the brand name Fender is chronosynclastically infundibulated (and, moreover, guitars do not fit this style of music).  --Lambiam 19:03, 11 August 2020 (UTC)


 * You are overthinking it. It's probably just a made-up noise. No need to go full Freud :) Equinox ◑ 18:12, 22 August 2020 (UTC)

phonological notation
Can someone tell me what this means:  C^(vl≠h) → Cː / V_(s, ʃ, x)  I just don't know what the part with the exponent symbol means. Dngweh2s (talk) 03:17, 11 August 2020 (UTC)
 * I don't think it's standard notation, but it might mean "immediately followed by". Where did you see this rule? What language does it apply to? Are there any examples of the sound change? —Mahāgaja · talk 07:19, 11 August 2020 (UTC)

Words from video games
What is the policy on adding words that were made up by video games but are mainly used in the game's community? Examples: Hellstone from Terraria, Shulkers from Minecraft. AntisocialRyan (talk) 05:16, 11 August 2020 (UTC)
 * It's a fair question. I've wondered what the policy would be regarding adding words like hookshot, myself. Tharthan (talk) 05:40, 11 August 2020 (UTC)
 * WT:FICTION applies, I believe. (As does the rest of CFI, notably the part about three independent durably archived uses.) So your supporting quotations would have to be outside of the world of Minecraft or Zelda or whatever, which is obviously quite an obstacle. —Μετάknowledge discuss/deeds 06:15, 11 August 2020 (UTC)
 * This doesn't answer your question, but we have Template:quote-video game --Emit888 (talk) 00:15, 23 August 2020 (UTC)

"when I call you up and say burgundy"
This was a strange idiom that my family members had used around me that I'm just now remembering. It meant the same thing as when pigs fly, never in a million years, etc. As I remembered it and was trying to find it to add to Wiktionary, I found no Google results whatsoever for the phrase, on the entire Internet! So it was apparently familect and nothing more; I was actually unaware of that until now!

Apparently it was passed down from generation to generation; at the time, my grandfather on that side of the family said his grandfather (that'd be my great great grandfather) taught him the phrase, raising the possibility that the phrase could have potentially existed as early on as the 1910s, or maybe before! Nevertheless, if it was used around him as a kid as he claimed, it would've at latest originated in the early 1960s.

But...why the weird wording? The phrase implies that you would never call someone on the telephone and "say burgundy." But does this merely mean you wouldn't mention burgundy (the color or the wine) in conversation, or that you wouldn't call someone and exclaim "Burgundy!" out of the blue, immediately upon connection? Also, why burgundy? Could it be because the color or the wine is unappealing to the speaker? Perhaps because it's a word that's not that commonly used anyway (I haven't heard it used much myself).

Or maybe at the time the phrase was coined, "burgundy" was a code word of some sort that was to be used to signal an operation's completion over the phone—a prank, maybe. Perhaps this mission was completely unsuccessful, and the ones involved lost all hope in it, thus you'd never "call someone up and say 'burgundy'" because any similar mission would fail every time. So then it would have just evolved into general usage for when you think something would never happen. In this case it'd have started as some kind of inside joke, then have been adopted by others who maybe didn't get the joke, or didn't care.

Also, how can something stay in one family and never be used elsewhere? You'd think maybe somebody else would have picked it up, after all these years, making it a more commonly known idiom.

...Or maybe it has, but just hasn't been used by the right people, or by enough people, to be included anywhere on the Internets...

Any ideas? Just thought it'd be interesting to share and collect thoughts on! PseudoSkull (talk) 06:18, 14 August 2020 (UTC)


 * Could it be part of a familect? My dad says things like "I'm having a sit" (I'm on the toilet) and "he's a real dymo" ("he's an idiot"), and as far as I can ascertain, no one outside my family understands these expressions. Either that, or it could be vestiges of some old Queensland dialect. ---> Tooironic (talk) 22:31, 17 August 2020 (UTC)


 * In my family we referred to the undersink garbage disposal as the "muncher". And we called sandwiches denwiches. Also, my Dad would always make up fancy names for combinations of drinks and I'd use them in a bar and nobody understood them... When it comes to WT, though, we obviously can't accept anything without 3 decent cites. Still, if you really want to share this information, Urbandictionary takes everything. I assume the website still exists. --Emit888 (talk) 00:27, 23 August 2020 (UTC)

How do I propose a word of the day?
See title. Thank you. Amin (talk) 04:20, 13 August 2020 (UTC)
 * See Word of the day/Nominations. PseudoSkull (talk) 05:17, 13 August 2020 (UTC)
 * Thank you! Amin (talk) 16:54, 14 August 2020 (UTC)

Abkhaz letters
Both here and on Wikipedia, it says that there are 62 letters in the Abkhaz alphabet, but I counted 64. Is the count a mistake, or is the alphabet? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 00:09, 19 August 2020 (UTC)


 * I think the reason for the discrepancy is that the signs (Ь ь) and (Ә ә) – which do not represent sounds by themselves but only serve as modifiers of letters that do represent sounds – are not considered letters. --Lambiam 13:23, 19 August 2020 (UTC)
 * According to the World Abaza Congress, there are 64 letters. Signs (such as ъ and ь) are also counted as letters in every Cyrillic-script language I have seen to date (except this one if one goes by Wikipedia or sources based on Wikipedia). Thoughts? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 22:12, 19 August 2020 (UTC)
 * I nevertheless think that the discrepancy is based on differences in status assignments for these graphemes. The is generally considered a letter in texts written in an, but not so for the . Lacking a solid and definitive definition of the concept “letter”, I am not prepared to adjudicate whose judgement in status assignment is the better.  --Lambiam 09:37, 20 August 2020 (UTC)

Adding svg images to articles of characters that don't display correctly
Recently, I made a page for 𠶸. It doesn't display on mobile so I wanted to add an svg image from glyphwiki for reference. I wanted the image to be part of the character info box, but it wasn't working, until somehow it did. I don't know what I did to make it work, and I've checked the code and the image url isn't there. How do I do this with another page? Thanks.

EDIT: I've got it. I have to add the svg image to the Unicode Data Image database.


 * How do I add an svg image to the Unicode Data Image database? PianistHere (talk) 12:02, 11 November 2020 (UTC)

Does anyone know of a tool to find "words in many languages but not in another"?
I've tried to search for a tool like that, to find the "hottest" words that are not in my language's Wiktionary. Has it been done yet? --Ignacio Rodríguez (talk) 18:07, 22 August 2020 (UTC)


 * I don't understand what you are asking. You want to find (for example) a Spanish word that has got a Spanish entry on en.wikt, fr.wikt, and de.wikt, but not yet on es.wikt? Equinox ◑ 18:12, 22 August 2020 (UTC)


 * We can use this tool - click on "I miss you". We have a page on WT too that doesn't get updated much. If not, you could ask, who makes amazing lists. --Emit888 (talk) 00:32, 23 August 2020 (UTC)


 * Excuse my poor English. That's precisely what I am looking for: a list of Spanish lemmas currently on en.wikt but not on es.wikt. That's good as well! That website is pretty awesome! --Ignacio Rodríguez (talk) 23:43, 25 August 2020 (UTC)
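For anyone building such a list by hand, the underlying idea is a set difference plus a count. The sketch below is only an assumption about how it could be done, not how the tool mentioned above works: `most_wanted` is a hypothetical helper, and the title sets are invented sample data standing in for real lists of entry titles from each Wiktionary.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def most_wanted(entries_by_wiki: Dict[str, Set[str]], target: str) -> List[Tuple[str, int]]:
    """Rank titles missing from the target wiki by how many other wikis have them."""
    have = entries_by_wiki[target]
    counts = Counter(
        title
        for wiki, titles in entries_by_wiki.items()
        if wiki != target
        for title in titles
        if title not in have
    )
    # Most widely covered titles first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented sample data: Spanish entry titles present on each wiki.
sample = {
    "en": {"gato", "perro", "casa"},
    "fr": {"gato", "casa"},
    "de": {"gato"},
    "es": {"casa"},
}
print(most_wanted(sample, "es"))
```

With real data, the sets would be filled from each project's list of Spanish entries; the ranking then gives the "hottest" missing words first.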

Can I add "Neopet" as a noun?
Not the website/game itself, but the virtual pets inside the game. They have been mentioned in articles dating back to 2005 and scholarly articles, according to Google. This is similar to Pokémon. AntisocialRyan (talk) 20:17, 24 August 2020 (UTC)
 * It probably passes WT:FICTION, but you should still collect the citations before creating the entry. Feel free to ping me if you're unsure about how to interpret the policy. —Μετάknowledge discuss/deeds 00:41, 26 August 2020 (UTC)

How do I help Wiktionary?
I am new here and I want to learn how to help this project. An answer would help a bunch. Thank you. -- Hamuyi(talk) 19:58, 25 August 2020 (UTC)


 * You asked the same question here two months ago. Was the answer given then insufficiently helpful? Please note that we do not like disruptive editing here any better than the folks over at Wikipedia. Which languages are you competent in? If you know Ugaritic, you could help to fill open requests at Requested entries (Ugaritic). --Lambiam 09:43, 26 August 2020 (UTC)

Audio naming policy
What's the naming policy for audios with entries that have multiple vocalisations? For example. ArabicAudios (talk) 16:57, 29 August 2020 (UTC)

Categories with invalid label?
I have created a category "Category:Chickasaw inalienable nouns" here https://en.wiktionary.org/wiki/Category:Chickasaw_inalienable_nouns However, the page keeps showing an error message -- The automatically-generated contents of this category has errors. The label given to the poscatboiler template is not valid. You may have mistyped it, or it simply has not been created yet. To add a new label, please consult the documentation of the template. -- and I cannot figure out what the problem is or how to fix it.

Can someone give me some guidance or point me in the right direction?

Many thanks in advance. Treyzinbox (talk) 23:41, 29 August 2020 (UTC)
 * The category name has to be added to a module. I saw your edits, and I've been trying to figure out the best way to do it. I've never worked with Chickasaw, and your description doesn't help much. Are we talking about a class of nouns used for parts of oneself, close relatives, and the like? Chuck Entz (talk) 00:40, 30 August 2020 (UTC)