User talk:Erutuon/2020

Todo/multiword Spanish lemmas with a hyphen
Hey. After your excellent page Todo/multiword Spanish lemmas not idiom or proverb, I'd like a request of all Spanish entries with a hyphen. There shouldn't be many entries on the list, as Spanish doesn't use them so much. Thanks in advance, anyhow. --ReloadtheMatrix (talk) 19:35, 1 January 2020 (UTC)
 * Done because I have files of all entry names for all languages. Whoops, that was wrong, it's supposed to be lemmas. Fixed. — Eru·tuon 19:43, 1 January 2020 (UTC)
 * Awesome. You rule. Any chance of having the prefixes and suffixes removed? --ReloadtheMatrix (talk) 19:49, 1 January 2020 (UTC)
 * Done. — Eru·tuon 19:56, 1 January 2020 (UTC)

Gah, the search engine includes results for redirects, which is why dimensional was in the list (-dimensional redirects to it). [Edit: Anyway, fixed.] — Eru·tuon 20:34, 1 January 2020 (UTC)
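The filtering requested in this thread (lemmas containing a hyphen, with prefixes and suffixes left out) can be sketched roughly as below. This is only an illustration in Python, assuming the lemma titles are already available as a list of strings; the function name and the sample titles are hypothetical and are not the script that was actually used.

```python
def hyphenated_lemmas(titles):
    """Return titles that contain a hyphen but are not affixes.

    Assumption: a prefix entry ends with "-" and a suffix entry starts
    with "-", so both are dropped.
    """
    result = []
    for title in titles:
        if "-" not in title:
            continue  # no hyphen at all
        if title.startswith("-") or title.endswith("-"):
            continue  # looks like a suffix or a prefix
        result.append(title)
    return result


# Hypothetical sample titles, for illustration only.
sample = ["ex-presidente", "-mente", "anti-", "casa"]
print(hyphenated_lemmas(sample))
```

Redirects, which caused the stray result mentioned above, would have to be excluded before this step, since the sketch only looks at the title strings.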

Adding a pronunciation table for Albanian
Hello,

I'd like to ask you whether you could add a pronunciation table for Albanian with the same structure as the Ancient Greek pronunciation table. I could also provide you with the content for doing so. Apart from that, I'd like to know how links may be added to a template without having to place linking brackets around every term encompassed by it. HeliosX (talk) 17:14, 2 January 2020 (UTC)
 * What Ancient Greek pronunciation table are you referring to? And what sort of template are you talking about? — Eru·tuon 19:36, 3 January 2020 (UTC)
 * I meant the current Ancient Greek pronunciation table that requires the letters to be entered in the page and, for example, this template. There should be, for instance, a link to "ali" and the noun ending in "-ã" or "-i" even though the terms are separated through "ale" in the declension table. It is not sure whether "ali" and the noun ending in "-e" should be linked because the usage of [e] or [i] in positions that allow both is usually somewhat similar and phonologically coherent. HeliosX (talk) 20:00, 3 January 2020 (UTC)
 * Do you ? — Eru·tuon 20:07, 3 January 2020 (UTC)
 * Yes, I meant this one. HeliosX (talk) 20:08, 3 January 2020 (UTC)
 * Ahh, I see. I was confused because "table" made me think of Appendix:Greek pronunciation. I could probably make a pronunciation template for Albanian. I'm not very familiar with Albanian, so I would have to use any information that you can provide, and w:Help:IPA/Albanian and Albanian phonology.
 * I still don't understand the problem with . I also don't understand why there are so many forms in each cell in the table. Does every noun of this type have two indefinite plurals, one in -i and one in -e? — Eru·tuon 20:18, 3 January 2020 (UTC)
 * Thank you for any possible aid with this. I'd have to divide the information about Albanian phonology as far as I'm concerned and as I've gotten to know into three IPA tables.
 * Firstly, the terms of Standard Albanian, which is mostly the same as Tosk, should have three major IPA rows. The first row would be Tosk and its phonemes are all given in the second phonology overview that you were referring to. However, the vowel [ə] is only pronounced when being stressed, in the first syllable of a word or if the word ends with a consonant after the vowel [ə]. The pronunciation due to position in the first syllable applies as well to any terms that are derived so that it is realized always in and . Contrastingly, only in the accusative forms, , , ,  and  and some terms beginning with "atë-" it may be pronounced even in the Tosk rendition of Standard Albanian. Also, the letter "r" is realized either as [ɽ] or often [ɹ] whereas [ɾ] probably does not occur. Hence, there could be a first pronunciation only with [ɽ] and, in the same row, a second pronunciation solely with [ɹ] in addition to denoting that, furthermore, [ɽ] and [ɹ] can be intermixed in a single word. Another matter concerns itself with "ë" that might also be pronounced as [ʉ] but that should only be noted next to the IPA row. The establishment and attribution of this phoneme is also a bit insecure but I've taken note of it.
 * The orthography of Albanian is based on the Korçan dialect of Tosk and, despite not having very many speakers at all, it should be included in the second row because it provides an explanation of the orthography. The vowel [ə] is pronounced everywhere but its speakers may not do so frequently in consideration of having learnt the general phonology of Standard Albanian, omitting these vowels in quite many positions. Nowadays, it seems that "r" is only realized as [ɹ].
 * Even though Gheg frequently may have its own variants for Standard Albanian vocabulary and grammar, its speakers also employ Standard Albanian and would pronounce it differently. Making only the distinction to the pronunciation of the latter in Tosk, the letter "r" has got the phoneme [ɾ], the affricates [t͡ʃ] and [d͡ʒ] can be extended to "gj" and "q", allowing two variants to be placed into the same IPA row.
 * In words that do not belong to Standard Albanian but only to Tosk, a second IPA table with the realization in its own dialect includes [c] and [ɟ] for the letters "q" and "gj" apart from the affricates [c͡ç] and [ɟ͡ʝ] in a single row. Those words don't have any pronunciation in Gheg but as well in the Korçan dialect.
 * In words of Gheg Albanian according to its own pronunciation, not including the other dialects, the information about vowels from this article can be continued for the third IPA table. Nevertheless, "ë" is still realized as [ə] unless the orthography shows that it has been altered. It can be denoted that it may also be pronounced as [ʌ] like in another dialect of Tosk but it does not have to be written into the pronunciation itself. Additionally to the Gheg pronunciation characteristics already entailed by the first IPA table, the consonantal clusters "nd" and "mb" can be pronounced as [nd] or [ⁿd] and [mb] or [ᵐb] and, written differently but derived from those, "n" and "m" as [n] or [nˠ] and [m] or [mˠ]. In order to differentiate the instances of "n" and "m" only as variants for "nd" and "mb" it should perhaps be recognized whether the term is a variant, referring to the template usage, of a term that has "nd" or "mb" in the position of "n" and "m". The characters "q" and "gj" are not only realized as [c͡ç] and [ɟ͡ʝ], but, in extension of their other possible pronunciation, also as [t͡ɕ] and [d͡ʑ]. The consonant [h] is sometimes weakened in particular when not being word-initial and, apparently, [l] can be palatalized into [lʲ] at least before [ə], [ɔ] and possibly [o].
 * In Aromanian, both indefinite plurals may be formed. I would need links for each form that is phonetically close so that those would be, giving just an example,, and ,  even though there is written only "ali, ale featã, feati, feate" in the declension table. HeliosX (talk) 18:29, 6 January 2020 (UTC)
 * I don't know how to display "ali, ale featã, feati, feate" but link to, , , . Which words would link to which entries? "ali" could link to either or , and "featã" could link to either  or.
 * In general if these are just phrases like "to the girl", we would not give them their own entries, and each word would be linked separately – – as in the declension table in  where the forms of the definite article  are linked separately from the forms of . That removes the problem of how to show individual words but link to phrases. But I am just guessing that ali and ale mean "to the", because they don't have entries yet. (I also don't know what the different final vowels mean.) — Eru·tuon 07:28, 6 January 2020 (UTC)
 * They almost can't be used without, or , I found it without any separate particles of the genitive and dative cases for example in "Soarili, cã s-avea disprãs di surorili-a lui, dzenili di munti, iu chindurea cathi tahina, di li-adutsea ghiumi-mplini di lunjin, ta si-sh speal fatsa di liatsa noptsljei" in "Lunjina dit sinduchi" by Aromanian writer Dini Trandu but the author evidently employs those and  might simply have been blended together with the definite form of  as this would have resulted in "liatsa-a noptsljei" according to the author's orthography. The vowels [e] and [i] can be used both and I think that the latter has been influenced maybe by Greek phonology and grammar with "-i" as frequent ending of feminine declensions. However, they could be regarded in the same way as the Albanian article used along with the genitive exoclitics, which are not included in the entries that are linked to. HeliosX (talk) 23:42, 6 January 2020 (UTC)
 * Well, even if the genitive or dative case form doesn't occur without these other words, we don't give entries to phrases unless their meaning is not sum-of-parts as explained in WT:SOP – for example if the meaning of is not a combination of the meaning of  and the meaning of . — Eru·tuon 22:06, 6 January 2020 (UTC)
 * Having reconsidered this in comparison to the linkings in the Albanian declension tables, I'd agree that these particles don't have to be linked. HeliosX (talk) 16:07, 8 January 2020 (UTC)
 * Well, I should clarify – I suggested that a, ale, ali should be linked separately from the noun forms, like the Ancient Greek definite articles in the declension table of : . Regarding the Albanian pronunciation template, I will try to get to it eventually. I have some other projects that I'm working on at the moment. — Eru·tuon 08:31, 9 January 2020 (UTC)

Toilbot unusual edit
DTLHS (talk) 00:32, 5 January 2020 (UTC)
 * Thanks. My regex to match a PoS header followed by a headword-line template wasn't good enough. — Eru·tuon 00:45, 5 January 2020 (UTC)
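A regex of the kind mentioned here, matching a part-of-speech header followed directly by a headword-line template, might look something like the sketch below. It is an assumption-laden illustration in Python, not ToilBot's actual pattern: the list of header names is a small hypothetical sample, and templates containing nested templates or line breaks are deliberately not handled.

```python
import re

# Assumptions: a part-of-speech header is a level-3 or deeper heading, and the
# headword line is a template call that starts on the very next line.
# The header names below are a small hypothetical sample.
POS_HEADER_AND_HEADWORD = re.compile(
    r"^(={3,})[ \t]*(Noun|Verb|Adjective|Adverb)[ \t]*\1[ \t]*\n"
    r"(\{\{[^{}\n|]+(?:\|[^{}\n]*)?\}\})",
    re.MULTILINE,
)


def find_headword_lines(wikitext):
    """Return (header name, headword template) pairs found in wikitext."""
    return [
        (match.group(2), match.group(3))
        for match in POS_HEADER_AND_HEADWORD.finditer(wikitext)
    ]


sample = (
    "==English==\n\n"
    "===Noun===\n{{en-noun}}\n\n# A thing.\n\n"
    "===Verb===\nSome stray text\n{{en-verb}}\n"
)
print(find_headword_lines(sample))
```

In the sample, the Verb section is skipped because something other than a template follows the header, which is the sort of case a looser pattern could get wrong.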

Changing all derivations from Proto-Albanian
Hello,

maybe you could use a tool for multiple edits, if such a tool has been devised, or a programmed account to change all these instances of derivation from Proto-Albanian to inheritance. HeliosX (talk) 16:35, 6 January 2020 (UTC)

update
Hey. Can you update User:Erutuon/abbreviation headers at the next dump please? I estimate it will be around 28% the size of the current page. --Yesyesandmaybe (talk) 10:45, 18 January 2020 (UTC)
 * Yep! It's in the script that updates the other header pages. — Eru·tuon 19:23, 18 January 2020 (UTC)

Module errors from edits to documentation submodules
Please check CAT:E. Chuck Entz (talk) 17:39, 20 January 2020 (UTC)
 * Fixed. Thanks. I wish I'd caught it earlier. — Eru·tuon 19:15, 20 January 2020 (UTC)
 * Well, at least the pages with the errors aren't where a lot of people would see them. It's not a big deal, but the sooner something like this is fixed, the better. Glad I could help. Chuck Entz (talk) 19:22, 20 January 2020 (UTC)

Etymology at epigone.
Hello, Erutuon. I wonder if you will take a moment to visit the English language epigone page when you are able, and check on what I suspect might be an error in the etymology given there. I believe the statement within the Etymology there, that ἐπίγονος comes "from ἐπιγίγνομαι" to be incorrect, as it suggests that γόνος is derived from γίγνομαι. Rather, I think that γόνος, as did γένος, entered Ancient Greek more directly as a lemma from earlier IE sources, instead of being derived from γίγνομαι (please note the Etymology at γόνος, wherein that is indicated, and wherein γόνος is indicated to be merely the equivalent of γίγνομαι + -ος). This is much the same in Latin, wherein the noun genus cannot be said to be a derivative of the verb gigno, but rather, that it is a related word with both deriving from separate IE lexemes. It seems to make more sense to me that the noun ἐπίγονος should be derived as is shown on its page, rather than from ἐπιγίγνομαι. As for myself, I am loath to change any existing etymologies, as I am really not that learned in linguistic history, and so would like to have your more experienced eyes on this (I believe it was Victar who rightly "slapped me down" on an earlier foray of mine into the IE realm). I thought that, instead of just including an etymology template on the page, I might rather just bring it to the attention of someone who probably can assess the etymology properly. Thanks.

Redirect problem
DTLHS (talk) 16:19, 22 January 2020 (UTC)
 * Thanks. I'll exclude redirects and look for the other redirects that my bot messed up. — Eru·tuon 19:38, 22 January 2020 (UTC)

Requested edits
You reverted my edit on Wiktionary:Requested entries because there is no page for Yogotti, but I was told that Wiktionary:Requested entries was the place you request new words. WikitionaryGuy (talk) 23:26, 22 January 2020 (UTC)
 * You must have been misinformed. Requested entries links to the pages where you post requests. In this case, if Yogotti is an English word, you would post it in Requested entries (English). — Eru·tuon 00:35, 23 January 2020 (UTC)

ToilBot "Normalizing" Vandalism
Is there any way you could have your bot avoid normalizing entries that have been edited too recently? I keep finding cases where someone vandalizes an entry and ToilBot tidies it before any patrollers can get to it, thus blocking it from the rollback tool. The only way around that is undoing via the edit history, which is slower and much less convenient. Chuck Entz (talk) 04:19, 23 January 2020 (UTC)
 * Sure. That's pretty annoying. I'll work on a way to skip pages that have been edited within a certain number of hours before I run the script on a large number of pages again. — Eru·tuon 04:41, 23 January 2020 (UTC)
 * Update: now the script finds pages whose most recent edit is in Recent Changes, and it starts from the oldest edits in Recent Changes and stops at edits from 12 hours ago, if it gets that far. I might change the start date because the oldest edits in Recent Changes are from 1 month ago, and some pages are probably edited more often than that. But do you think 12 hours is enough time? — Eru·tuon 19:18, 20 March 2020 (UTC)
 * I would be more comfortable with 24 hours, but there are others who do more rollbacks than I do-, and , to start with. Chuck Entz (talk) 20:33, 20 March 2020 (UTC)
 * 24 hours would probably be enough for me. — surjection ⟨?⟩ 23:05, 20 March 2020 (UTC)
 * Okay, I've changed it to 24 hours margin for vandal-fighting. — Eru·tuon 23:48, 20 March 2020 (UTC)
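The 24-hour margin agreed on here amounts to a simple timestamp filter. The following Python sketch shows the idea under stated assumptions: the mapping of titles to last-edit times is taken as given (how the bot obtains it from Recent Changes is not shown), and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def pages_safe_to_edit(last_edits, now, margin_hours=24):
    """Return titles whose latest edit is at least margin_hours old.

    last_edits maps a page title to the timezone-aware datetime of its most
    recent edit. Fetching those timestamps is outside this sketch.
    """
    cutoff = now - timedelta(hours=margin_hours)
    return sorted(title for title, edited in last_edits.items() if edited <= cutoff)


# Hypothetical data, for illustration only.
now = datetime(2020, 3, 21, 0, 0, tzinfo=timezone.utc)
last_edits = {
    "paadje": now - timedelta(hours=2),    # too recent, left for patrollers
    "walkest": now - timedelta(hours=30),  # old enough to normalize
}
print(pages_safe_to_edit(last_edits, now))
```

Passing `now` in explicitly rather than reading the clock inside the function keeps the cutoff the same for a whole bot run and makes the behaviour easy to check.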

Esperanto ordinal numbers
I see you worked on Module:eo-headword and also applied protection to the page. Could you help me at Grease_pit/2020/January? I can't edit the page myself. 15:41, 24 January 2020 (UTC)

ἵημι problem
Hi! In ἵημι the "Aorist: εἵμην" misses the first three persons of the indicative, although in the wikitext they are present; could you please check why they don't appear? Thank you very much! --Epìdosis (talk) 12:06, 31 January 2020 (UTC)

I see the forms missing in both the header and inside the table. That's because the singular uses first-aorist forms, ἧκᾰ, ἧκᾰς, ἧκε(ν), which are shown in a different table because of the limitations of. And so shows the first-person singular indicative middle  in the header. — Eru·tuon 21:57, 31 January 2020 (UTC)
 * Oops, my error! Thank you very much, --Epìdosis (talk) 21:59, 31 January 2020 (UTC)

Update 2
Hey. Can you gimme another update of User:Erutuon/abbreviation headers at the next dump? I reckon about 60% of the terms have since then been corrected (at least in the Abbreviations subpage anyway), and I find myself visiting pages I've already corrected. TIA --AcpoKrane (talk) 11:58, 18 February 2020 (UTC)
 * Yep, I'll update it when the right dump files come out, as usual. — Eru·tuon 23:50, 22 February 2020 (UTC)
 * Done. Just realized I forgot to do it after the last dump (2020-02-01). — Eru·tuon 23:48, 23 February 2020 (UTC)

Nesting in translations
Hi,

Do you know which module contains the nesting? So that if you add, e.g. a Kurdish translation, you can add "Kurdish/Kurmanji" in the "Nesting"? --Anatoli T. (обсудить/вклад) 02:33, 26 February 2020 (UTC)
 * Yes, that's in MediaWiki:Gadget-TranslationAdder-Data.js, under . — Eru·tuon 23:06, 26 February 2020 (UTC)
 * Thanks but it's not obvious to me how language code "ku" allows nesting "Kurdish/Kurmanji". I'd like to fix Eastern Mari ("chm") as "Mari/Eastern Mari", add a Mongolian nesting "Mongolian/Uyghurjin". --Anatoli T. (обсудить/вклад) 00:05, 27 February 2020 (UTC)
 * Right, MediaWiki:Gadget-TranslationAdder-Data.js only controls nesting that is automatically generated by the TranslationAdder gadget; by editing source code manually, anyone can nest any language any way they want, and that's where the  nesting for   comes from. I think "Mongolian/Ughurjin" requires a different mechanism, which may not exist, because the nesting table in MediaWiki:Gadget-TranslationAdder-Data.js is by language code; it doesn't describe any sub-nestings for writing systems. I'm guessing that the "Serbo-Croatian: Cyrillic: ... Roman: ..." that is in quite a few translations sections was added manually, not by the gadget. — Eru·tuon 00:36, 27 February 2020 (UTC)
 * Thanks. Any language will allow "language name"/Cyrillic or "language name"/Roman. I have fixed the "Eastern Mari" nesting and it seems I can just use Mongolian/Ughurjin or Mongolian/Cyrillic if there is no Mongolian translation present. --Anatoli T. (обсудить/вклад) 04:49, 27 February 2020 (UTC)
 * Okay. I don't see any way to do "Mongolian/Ughurjin" in the translation adder (and wouldn't be able to add that capability), but if that's not necessary, great. — Eru·tuon 19:06, 27 February 2020 (UTC)

Todo/multiword Spanish lemmas not idiom or proverb update
Hey E. Can you rerun Todo/multiword Spanish lemmas not idiom or proverb after the next dump? I linked, over the space of 4 and a bit months, all of the decent entries in there. What I'm looking for exactly is all NEW multiword entries made since the original list, so after making it, would you be able to remove all entries which appear in the original list? Only then will I be able to say that my quest has been completed. Thanks in advance --AcpoKrane (talk) 09:00, 27 February 2020 (UTC)
 * I just used a bot script, so no need to wait. This should be it. — Eru·tuon 23:10, 27 February 2020 (UTC)
 * That's just beautiful. --AcpoKrane (talk) 11:41, 28 February 2020 (UTC)


 * How about getting almost the same thing but for French. Call it Todo/multiword French lemmas (idioms and proverbs too) --Alsowalks (talk) 19:39, 15 March 2020 (UTC)
 * Done! — Eru·tuon 22:29, 17 March 2020 (UTC)
 * Woah. That's massive! How about Todo/multiword Catalan lemmas too?--Alsowalks (talk) 22:35, 17 March 2020 (UTC)
 * Also done! — Eru·tuon 23:32, 17 March 2020 (UTC)
 * Did I say Catalan? Drat, I meant Todo/multiword Italian lemmas and Todo/multiword Portuguese lemmas and Todo/multiword Swedish lemmas --Alsowalks (talk) 23:42, 17 March 2020 (UTC)
 * At this stage maybe I should just preempt you by writing a script to spam Wiktionary with similar pages for all of the 4000 something languages that have entries.... — Eru·tuon 23:35, 19 March 2020 (UTC)
 * I made the ones you asked for, but the poor Wiktionary server probably doesn't want me asking for lists of all lemmas of all languages. — Eru·tuon 18:05, 20 March 2020 (UTC)
 * Erutuon, could you do a rerun of Todo/multiword French lemmas? Quite a few entries have been created since then. 212.224.225.39 16:05, 18 June 2021 (UTC)
 * I can just dump all of them. But I haven't figured out the logic of removing the ones that have been removed in past edits. — Eru·tuon 19:22, 18 June 2021 (UTC)
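The earlier request in this thread, listing only the multiword entries created since the original list, comes down to subtracting one list of titles from another. A minimal Python sketch follows, assuming both lists are already available as sequences of titles; the function name and the sample data are hypothetical.

```python
def new_entries(current_titles, original_titles):
    """Return titles in the current list that were not in the original list,
    keeping the order of the current list."""
    seen = set(original_titles)
    return [title for title in current_titles if title not in seen]


# Hypothetical sample data, for illustration only.
original = ["a pesar de", "por favor"]
current = ["a pesar de", "de nada", "por favor", "sin embargo"]
print(new_entries(current, original))
```

The same subtraction could in principle be applied to titles that editors removed from a list page in earlier revisions, if those titles were collected somehow, but how to collect them is exactly the part left open above.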

Day to Days
How do I change the descendant trees?

https://en.wiktionary.org/wiki/Reconstruction:Proto-Germanic/dagōs Personisgaming (talk) 15:48, 18 March 2020 (UTC)

Sure, you got it!
Nobody else edits as fast around here (except Equinox, of course). Anyway, if I get blocked before I'm done, would you mind adding in the Pronunciation section to all of these words that I recorded today? That would allow me to do other stuff, like, Spanish idioms or nominating people for adminship. --Gorgehater (talk) 22:30, 27 March 2020 (UTC)

Templatehoard
Do you still update it? I'd like to generate some new wanted entry lists. – Jberkel 09:36, 5 April 2020 (UTC)
 * Updated. (I need to figure out how to streamline the process; it's kind of tedious running all the commands.) I tried running the wanted entry script after the 2020-03-01 dump came out, but the first command failed. — Eru·tuon 23:41, 5 April 2020 (UTC)
 * Thanks! Maybe use a simple Makefile to automate the commands? I'll take a look, sometimes there are resource-related problems; unlike Rust, Java needs a lot of memory :) – Jberkel
 * Ok, all regenerated. It was a silly bug in the CBOR deserialization. – Jberkel 22:29, 7 April 2020 (UTC)
 * I made a Makefile and it's now much easier to generate the template dump and entry index: just a single command for each. — Eru·tuon 21:48, 23 April 2020 (UTC)
 * Cool, I'll regenerate the pages. – Jberkel 14:15, 25 April 2020 (UTC)
 * I noticed that the scripts never got to the stage of saving the lists, and looked at the error log but didn't know how to fix it. Something about the Java version number if I recall right. (I wish the error log weren't spammed with progress bars or whatever; it makes it hard to read with .) Do you have time to debug? — Eru·tuon 18:57, 12 May 2020 (UTC)
 * yes, I foolishly updated some dependent libraries to a more modern version of Java, but Spark still needs an ancient version of the JDK. I could rollback to an older version but I'm waiting for the new version of Spark to be released, which should be soon. If it doesn't get released for the next dump I'll revert the changes. – Jberkel 21:03, 12 May 2020 (UTC)

User:ToilBot worsened paadje
Why did User:ToilBot worsen my contribution ? If you don't mind, I would like to revert it. There may be many cases where "usage case" is incorrectly used, but this wasn't one of them. It was just one sentence, that should be a hint for your bot to not touch it. --85.148.244.121 06:04, 11 April 2020 (UTC)
 * Do not revert it. We have standardised headers, which allows us to keep track of the millions of pages on the wiki. Think of it this way: in an idealised, complete entry, there may be many relevant usage notes, or there may only be one, but all usage notes will be under the header 'Usage notes'. —Μετάknowledge discuss/deeds 06:08, 11 April 2020 (UTC)
 * That's OK and why I asked it, but are we still allowed to call us "the English-language Wiktionary" if we refuse to speak English and even have bots to remove English from content which uses it? In an idealised English-language wiktionary, we would be writing English (and that still has plural and singular, if that changes the undeclined word would probably win). On the other hand, I don't even speak standard English very well (Sassenach for Alba); for me, it's OK, I just asked. --85.148.244.121 07:45, 11 April 2020 (UTC)
 * Well, "Usage notes" looks like English to me – certainly not Klingon at least. It does strictly speaking violate the rules of grammatical agreement in paadje, but Wiktionary can do what it wants because there's no Académie Anglaise to punish it for crimes against English grammar. More seriously, it would be a headache to try to make the headers agree in number with the contents of the sections, and it would make entries a bit less machine-readable, so Wiktionary has chosen one grammatical number for each header ("Usage notes" in plural, "Pronunciation" in singular) and I enforce it. This is the current convention, and changing it now might cause various bots and tools to break. — Eru·tuon 08:35, 11 April 2020 (UTC)
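The convention described here, one fixed grammatical number per header, is what makes header normalization mechanical. A rough Python sketch of such a normalization step is given below. It is not ToilBot's code: the mapping contains only two hypothetical examples, and the function name is invented for illustration.

```python
import re

# Hypothetical mapping from nonstandard header text to the standard form.
STANDARD_HEADERS = {
    "Usage note": "Usage notes",
    "Pronunciations": "Pronunciation",
}

# A header line: two or more "=" signs, the name, then the same run of "=".
HEADER_LINE = re.compile(r"^(={2,})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def normalize_headers(wikitext):
    """Replace known nonstandard header names with their standard form.

    Side effect of this sketch: spaces around the header name are trimmed.
    """
    def fix(match):
        equals, name = match.group(1), match.group(2)
        return f"{equals}{STANDARD_HEADERS.get(name, name)}{equals}"

    return HEADER_LINE.sub(fix, wikitext)


print(normalize_headers("====Usage note====\nOnly one note.\n"))
```

Because every entry ends up with the same header text, other tools can look for "Usage notes" alone instead of guessing at singular and plural variants.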

Update to
If you have time, I was wondering if you would see if could be tweaked so that the archaic second person singular present tense (for example, walkest) and archaic third person singular present tense (walketh) forms could be made into links that, if clicked on, would create the inflections in an accelerated manner, in the way that it works with. There might have to be a warning somewhere that editors should check whether these verb forms are attestable. This isn't urgent. — SGconlaw (talk) 16:52, 12 April 2020 (UTC)
 * I've added the second-person singular past-tense form (-edst) and made the table unconditionally link the forms, because up till now they were only linked if the target page existed; linking to nonexistent pages is a requirement for adding acceleration. I think I'll add acceleration to all the forms, not just -eth and -est, as none of them have it yet. — Eru·tuon 20:07, 20 April 2020 (UTC)
 * Thanks. I had no idea the -edst form existed. The format looks odd, though (what’s the significance of the two columns in the “past tense” section?) – perhaps it should match the present tense column? — SGconlaw (talk) 20:10, 20 April 2020 (UTC)
 * The past-tense columns were basically "modern" and "Elizabethan", but I've changed it to the format of the present-tense column. — Eru·tuon 20:45, 20 April 2020 (UTC)
 * Okay, finished the process. Added new acceleration protocols to Module:accel/en for the archaic forms. Let me know if you notice any problems. — Eru·tuon 21:07, 20 April 2020 (UTC)
 * Is the acceleration working? I clicked on the links cherishest and cherishedst (note: not saying these words exist) in the sample on the documentation page, and they just led to blank pages. — SGconlaw (talk) 04:22, 21 April 2020 (UTC)
 * Those links work for me. Do you have the acceleration gadget enabled in your preferences (search for "accelerated creation links" on the page)? — Eru·tuon 05:13, 21 April 2020 (UTC)
 * Accelerated links in work for me. Didn’t know I had to do something extra for these – will check. — SGconlaw (talk) 05:18, 21 April 2020 (UTC)
 * Okay, should work if  does. Oh, the problem is that acceleration doesn't work in the template namespace. Try clicking the links in the conjugation table in  instead. — Eru·tuon 05:30, 21 April 2020 (UTC)
 * Ah, that was the issue. Yes, it's working fine! Thanks again. — SGconlaw (talk) 07:57, 21 April 2020 (UTC)

Should this extra pipe be removed? — SGconlaw (talk) 19:11, 21 April 2020 (UTC)
 * Huh. That definition should be . Aha, here was the problem. — Eru·tuon 19:19, 21 April 2020 (UTC)
 * 👍 — SGconlaw (talk) 04:33, 22 April 2020 (UTC)

Chaucer quotes in English section
Hey. Could I get a list of all Chaucer quotations in the English (but not Middle English) section of an entry? It's because they shouldn't be there, they should be in Middle English. You could put it at Todo/English Chaucer. Thanks in advance --Vitoscots (talk) 17:25, 20 April 2020 (UTC)
 * You already saw, but I made the page. It includes things besides quotes, but it has excerpts so you don't have to waste your time visiting the entry. — Eru·tuon 23:45, 20 April 2020 (UTC)
 * Love ya, Eru! --Vitoscots (talk) 00:17, 21 April 2020 (UTC)

Chaucer list for Shakey and Milly
Hey. Can we get a list of undated Milton and Shakespeare quotes? I guess looking for
 * rfdate| and Milton - WT:Todo/Undated Milton
 * rfdate| and Shakespeare WT:Todo/Undated Shakespeare
 * or anything else that could be useful. Maybe entries using without a year but with the words Milton and Shakespeare in them. --Vitoscots (talk) 12:50, 21 April 2020 (UTC)
 * We already have those wheels, in two forms:
 * Search for, eg, 'hastemplate:"rfdate" insource:/rfdatek\|en\|Chaucer/'
 * Use categories like Category:Requests for date/Chaucer.
 * HTH. DCDuring (talk) 15:00, 21 April 2020 (UTC)
 * Yeah, that works for, which has the author in the template, but doesn't (though you can find examples of  applied to Shakespeare, for instance, among the search results for  ); for instance in :
 * Also I think Wonderfool has an enthusiasm for lists. They help keep him motivated because he can check things off and write down how much work is left.
 * I'll see what I can do. It's more complex than the previous Chaucer request. Gotta figure out what the format typically is. — Eru·tuon 17:49, 21 April 2020 (UTC)
 * Yeah, you gotta keep your volunteers motivated, boss. --Vitoscots (talk) 17:51, 21 April 2020 (UTC)
 * Either alternative technique yields lists from which the completed items disappear, which provides even more motivation. And what about my motivation, having added nearly ten thousand instances and  only now to have WF reject my handiwork? DCDuring (talk) 18:33, 21 April 2020 (UTC)
 * I didn't reject your handiwork, DCD. I was attacking your rfdefs with my steely knife. --Vitoscots (talk) 19:40, 21 April 2020 (UTC)
 * I don't get it. Seems like my making a list is a good way to make your work come to fruition (with dates finally being added)! — Eru·tuon 21:19, 21 April 2020 (UTC)
 * Your Chaucer list was less selective than "my" lists, so you must have essentially ignored the presence or absence of the rfdate and rfdatek templates and the resulting categories. DCDuring (talk) 23:41, 21 April 2020 (UTC)
 * Ahh, I see what you mean now. I thought you were talking about the Milton and Shakespeare lists. The purpose of the Chaucer list is to catch Chaucer quotes that need to be moved from the (Modern) English to the Middle English entry, so yeah, and  aren't involved. (There were false positives because I just searched for "Chaucer" in English sections without trying to figure out if it was the author of a quote, or if it was the Chaucer as opposed to another Chaucer.) But the Milton and Shakespeare lists are only occurrences in conjunction with  so they are making use of your work inserting . — Eru·tuon 00:39, 22 April 2020 (UTC)
 * I see. All of the Chaucer quotes now in English should be in Middle English. We had long been accommodating an excellent contributor who thought Middle English quotes, even of alternative forms, should appear in the entry of the English descendant. Having the dates should make it especially obvious. BTW, it would be nice to locate each quote in the manuscript fragment it was found in. I think there are four of them, but I haven't seen dates for the fragments. BTW, you have seen how many authors there are with rfdatek and rfquotek categories, right? DCDuring (talk) 00:49, 22 April 2020 (UTC)
 * Who was the excellent contributor, out of interest? --Vitoscots (talk) 00:03, 23 April 2020 (UTC)
 * Yes, seems like a tremendous number. Maybe it would be useful to print the templates in some kind of list format, showing the definition under which was placed, and the quote that  was placed on (somewhat like WT:Todo/Undated Milton and WT:Todo/Undated Shakespeare). Then people could more quickly look over the requests to find ones they can fill, without having to visit hundreds of pages and look over the text of them. The list could be put on a Toolforge site, though then it would be harder to give editors the satisfaction of crossing out the requests they had filled. — Eru·tuon 04:25, 24 April 2020 (UTC)
 * To make it easier yet, you could include a link to a search for the quote on Google Books (and Wikisource and Gutenberg?). The searcher might still have to shorten the search string to find the original wording of the quote, but the job would often be very easy indeed. I had thought about that while adding all the templates, but I just wanted to get the ball rolling. DCDuring (talk) 04:58, 24 April 2020 (UTC)
 * Okay, did the Shakespeare and Milton bit. I included the quotes in the list to make your job easier. — Eru·tuon 21:19, 21 April 2020 (UTC)
 * Nice. It didn't take long to clean all those up. --Vitoscots (talk) 22:09, 24 April 2020 (UTC)
 * See Todo/Undated English quote-templates. It's not a Milton-and-Shakespeare-only list, but it's easy to find them on it. — Eru·tuon 21:21, 5 May 2020 (UTC)

Re: Category timestamp
Re your question on #wikimedia-tech, just checking: you are aware that the timestamp is not supposed to reflect when a category was added to a page, aren't you? See mw:Manual:Categorylinks_table. Timestamps are often updated en masse after some template change, for instance. That said, some SQL queries can often shed light on what's going on. Nemo 10:12, 22 April 2020 (UTC)
 * Thanks, I wasn't aware of that dynamicpagelist used  for category additions. That explains why the list is sometimes random. — Eru·tuon 17:34, 22 April 2020 (UTC)

Please
keep things under control from now on. I'm taking a long break --Vitoscots (talk) 19:33, 26 April 2020 (UTC)

Most deleted pages
Was just reading this Special:Diff/47099083/59225779 and wondering if this can be queried – which pages have been deleted many times but do currently exist? Do we have that data? – Jberkel 19:03, 27 April 2020 (UTC)
 * Well, here is the "times deleted leaderboard", with a column indicating which titles actually exist. User talk:Equinox is at the top of the currently existing titles because Equinox doesn't believe in archiving his talk page... sigh... — Eru·tuon 19:38, 27 April 2020 (UTC)
 * Great, thanks! lots of NSFW type words as expected but some interesting ones as well. – Jberkel 20:30, 27 April 2020 (UTC)


 * Sigh sigh sigh. Well, I don't archive my talk page because I see "talk" as an ephemeral thing, like (to some extent) e-mail, IRC, or instant-messenger chat; distinct from the actual meat or content of the project, the entries and appendices. I can see why some people would disagree with this, especially when it's a "page" on the project (and we do archive "unowned" talk like Beer Parlour). But don't panic: my deleted stuff is still available to admins and the future historians who will read us, like Pepys, to find out what really drove amateur lexicographers in the early 21st century. (HI HISTORIAN! I SEE YOU!) I've got a vague memory that somebody (was it Purplebackpack?) tried to pass a vote preventing people deleting their talk pages, but I think it failed... can't remember... don't care really. I do take the point that if people should be able to delete their userspace then this ability shouldn't necessarily be limited to those who happen to have admin rights (required for deletion); there's always the "speedy" tag though, which tends to be respected unless it's being abused to hide ongoing debates etc. Equinox ◑ 23:20, 4 May 2020 (UTC)
 * Wrote the response below but never submitted it. Got angsty or whatever about it. So it sat in my browser (which kindly saves it for me) for weeks.
 * Well, most talk is un-ephemeral here. Admins can get at your talk page stuff at the moment, if they can find it in the long list of deleted revisions. But no one can use the search engine to find out when someone talked to you about something on your talk page. (Unless there's a "search deleted or past revisions" tool somewhere.) So it's best if nobody starts an important discussion on your talk page that someone might want to be able to refer back to: they'd have trouble finding it because it wouldn't come up in search results. It's not quite like a user page because other people are involved in it and it's not quite like personal IM since it's public to begin with.
 * Unfortunately I apparently think too much about the sort-of-lost treasures of discussion on your talk page. I didn't really mean to lecture you about it but I let my little complaint slip out and I pinged you because I didn't want to complain behind your back so to speak...
 * PS: I think I wouldn't support a measure to require people to not delete their talk pages though. Seems too restrictive of personal liberty. — Eru·tuon 22:15, 23 May 2020 (UTC)


 * We agree to license all our "contributions" under whatever freebie licence so I suppose that covers talk pages too. Equinox ◑ 05:04, 31 May 2020 (UTC)

WT:Todo/Undated Bible
Hi. Good work with WT:Todo/Undated Milton, by the way. Could we get something similar for WT:Todo/Undated Bible? It has been noticed that undated Bible quotations are all over this fricking website. For some reason, never tagged them with  so they don't show up in the categories. --Elvinrust (talk) 22:43, 4 May 2020 (UTC)
 * I skipped what was the most common undated cluster. It is necessary to handle not just the KJV, but also Douay and other versions. Also some quotations don't have Bible, but rather, say, Matthew. You could find most of them by searching for 'incategory:"English lemmas" insource:/\#\*[ ]+[A-Z][a-z]+/'. To speed things up you could add 'incategory:"English nouns"' and then proceed to verbs, etc. I would think we would want to link to Wikisource's edition of KJV if at all possible. DCDuring (talk) 22:57, 4 May 2020 (UTC)
 * Made the list using that regex, looking at all English sections. I'm surprised that it really does catch mostly Bible quotations! It seems like it should be too general. I thought of filtering it by "Bible" and names or abbreviations of books of the Bible, but it doesn't seem worth it. — Eru·tuon 02:06, 5 May 2020 (UTC)
 * That regex WAS the one I used to try to catch ALL of the quotations that didn't start with a date. I just didn't insert or  in the biblical quotes. DCDuring (talk) 02:09, 5 May 2020 (UTC)
 * The advantage of using the search with a regex is that it yields a dynamic list, removing those that have been corrected and adding any that may have been added while the operation is in process. DCDuring (talk) 02:12, 5 May 2020 (UTC)
 * Aha, it was your work. Thanks. Yeah, the search engine has its advantages. I could try to update the list more frequently using the MediaWiki API (generating a list of pages, getting their content, searching it), but it would be more complicated and doesn't seem worth the trouble since Wonderfool is the only person who seems to be using the lists at the moment and he likes crossing stuff off. — Eru·tuon 01:21, 7 May 2020 (UTC)
 * I was happy when I discovered how the short regex caught so many (also, such a large proportion) of the quotations without dates. Talk about low-hanging fruit. Going after easy targets does mean that the remaining targets are less likely to be found. The ultimate residual mechanism of manual individual contributor error-detection and -correction is very slow.
 * Also, the regex I used actually allowed for multiple occurrences of "#": [#]+. DCDuring (talk) 13:35, 7 May 2020 (UTC)
 * I used a pattern with the equivalent because it checked that the list marker was at the beginning of a line and " #*" would skip quotes under sub-definitions. I ultimately generated a JSONL file of probably-quote wikitext from English sections to make it faster to find Bible quotes. (Searching the dump for quotes took ~2 minutes at best but searching the quote file takes a few seconds.) — Eru·tuon 19:19, 7 May 2020 (UTC)
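
For illustration, the line-anchored matching and the JSONL step described above might be sketched roughly as follows. This is not the script that was actually used: the function names, the record fields ("title", "quotes") and the sample wikitext are all hypothetical, and the pattern is only one plausible reading of "one or more # followed by * at the beginning of a line, then a capitalized word instead of a date".

```python
import json
import re

# Assumed pattern (not the original): list marker at the start of the line,
# one or more "#" so quotes under sub-definitions are included, then "*",
# spaces, and a capitalized word where a date would normally be.
UNDATED_QUOTE = re.compile(r"^#+\*[ ]+[A-Z][a-z]+")


def undated_quote_lines(wikitext):
    """Return the quote lines that start with a capitalized word, not a date."""
    return [line for line in wikitext.splitlines() if UNDATED_QUOTE.match(line)]


def to_jsonl(pages):
    """Serialize a {title: wikitext} mapping as JSONL, one record per page
    that has at least one matching line. Field names are hypothetical."""
    records = []
    for title, text in pages.items():
        quotes = undated_quote_lines(text)
        if quotes:
            records.append(
                json.dumps({"title": title, "quotes": quotes}, ensure_ascii=False)
            )
    return "\n".join(records)


# Hypothetical sample of an English section.
sample = "\n".join([
    "# A definition.",
    "#* '''1611''', King James Bible, Genesis 1:1",  # dated, not matched
    "#* Bible, Matthew v. 3",                        # undated, matched
    "##* Shakespeare",                               # under a sub-definition, matched
    "#*: In the beginning",                          # quote text line, not matched
    " #* Milton",                                    # marker not at line start, not matched
])

if __name__ == "__main__":
    print(to_jsonl({"beginning": sample}))
```

Anchoring at the start of the line is what keeps a stray " #*" in running text from matching, while "#+" still catches quotes nested under sub-definitions. Writing the matches out once means later searches only read the small file rather than the whole dump.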