Wiktionary:Grease pit/2009/January

=January 2009=

McBot and XML dumps
McBot is busy removing a useless template and doing a couple of other things, editing at a very high rate (at one point I noted nearly 100/minute ;-). There are a lot of them; at one time yesterday there were 94000+ remaining, after it had already done at least 20K. All good, but it floods the daily dump update; I had to fix it after Conrad.Bot clobbered it on a previous occasion.

The simple version is: the dump for today is posted, but does not reflect most of the McBot edits. They (and other 'bot edits) will be integrated into the dump over the next several days.

The gory details (if you care ;-): the process effectively prioritizes the updates it is doing, and limits the total number of updates in each run. That way it doesn't try to load 100K pages from the servers in one pass. It runs 3 times a day, and the run at ~09:00 UTC is posted. The priorities are:
 * 1) templates (all)
 * 2) new pages (max 17000, an arbitrary non-infinite number ;-)
 * 3) edits by non-bots from RC (max 5000)
 * 4) edits by 'bots from RC (max 5000)
 * 5) others found on revision cache refresh (max 3000)
The priorities are not a strict sequence: it always does all templates, and then some from each step. Under normal conditions, it completes all of the updates in each step anyway. At the moment, the fourth group is not being completed. The cache of revision id for each page id is refreshed from the servers at least once in every 7 days (21 runs). Robert Ullmann 11:07, 1 January 2009 (UTC)

Ah, should also note that all deletions needed are done on each run; however it won't do more than 5000 because it just reads the last 5K from the deletion log. This would only be a problem if more than 5000 entries were deleted in 8 hours. (Not very likely.) Robert Ullmann 19:24, 1 January 2009 (UTC)
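
For illustration only, the capped, prioritized selection described above could be sketched in Python roughly as follows. The step names, the function names and the data layout are assumptions made for this sketch; the actual dump-update code is not shown in this thread.

```python
from typing import Dict, List, Optional, Tuple

# Per-step caps as listed above; None means "no cap" (templates are always done in full).
CAPS: List[Tuple[str, Optional[int]]] = [
    ("templates", None),
    ("new pages", 17000),
    ("non-bot edits from RC", 5000),
    ("bot edits from RC", 5000),
    ("revision cache refresh", 3000),
]


def select_updates(pending: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Pick the pages to fetch in one run: all templates, then at most `cap` from each other step."""
    selected = {}
    for step, cap in CAPS:
        pages = pending.get(step, [])
        selected[step] = list(pages) if cap is None else list(pages[:cap])
    return selected


def backlog(pending: Dict[str, List[str]], selected: Dict[str, List[str]]) -> int:
    """Count the updates left over for later runs."""
    return sum(len(pending.get(step, [])) - len(pages) for step, pages in selected.items())
```

With a flood of bot edits in the fourth step, only 5000 of them are taken per run and the rest stay in the backlog, which is why a large bot run takes several days to show up in the dump.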


 * Thanks for keeping us posted! Although it's tempting (for me at least ;) to whinge a little about delays of a few days, it's nothing compared to waiting for a few months as was the previous situation. Thank you again. Conrad.Irwin 02:01, 2 January 2009 (UTC)


 * Karibu. Note that the delay only matters if your use for the dump is affected by the un-updated entries; in this case edits to Spanish verb forms and a few AF and VolkovBot edits. There is still a new dump posted every day. Today's has 36,203 updates not done. Robert Ullmann 11:32, 2 January 2009 (UTC)

New script templates
These have been added recently, and I'd like to migrate the font specifications to the style sheet MediaWiki:Common.css, standardize the class names, etc:


 * – add fallback font-family: sans-serif
 * – add fallback font-family: sans-serif
 * – change class name from script-Geor to Geor
 * – change class name from LA to Latf; add fallback font-family: sans-serif
 * – change class name from Mahal DV to Thaa
 * – add fallback font-family: sans-serif

Any comments or objections? —Michael Z. 2009-01-02 22:14 z 


 * Do we have a guide somewhere for our users that (1) lists the various scripts and fonts, (2) offers help for users who have trouble displaying certain scripts, (3) points to templates and other tools for editors who want to work in these scripts? --EncycloPetey 05:03, 12 January 2009 (UTC)


 * 1 no, except for category:Script templates, and the style sheet MediaWiki:Common.css documents itself. 2 no. 3 just the templates' own documentation:, , , , etc. I have started WT:SCRIPTS to document these templates, but it is not a help page for general readers of the dictionary. —Michael Z. 2009-01-12 06:19 z 

Tibetan font declaration
Currently, the Tibetan font declaration is identical to the generic Unicode one, which is problematic since none of the Unicode fonts support Tibetan. IE (and most other browsers) ignores these fonts (so there's no problem with Tibetan failing to show up even if the fonts are installed), but the declaration in the Common.css should still be changed to something that makes more sense, like

.Tibt { font-family:Jomolhari,'Tibetan Machine Uni','Microsoft Himalaya',sans-serif; }

-- Prince Kassad 14:57, 3 January 2009 (UTC)


 * Should this be made an MSIE-only declaration by using the comment trick? Safari on my Mac seems to display Tibetan. —Michael Z. 2009-01-03 15:15 z 
 * I think it should be made global. Firefox would greatly benefit from it, since it uses Arial Unicode MS as the default font for Tibetan, even though Arial Unicode doesn't support Tibetan. IE supports Tibetan on its own anyway. -- Prince Kassad 15:18, 3 January 2009 (UTC)


 * Okay, I'll make the change to the style sheet. I notice that Tibetan doesn't seem to display in either MSIE or Firefox in my vanilla Win XP system, with or without this font spec, so I assume that a font download is required.  Is it the same on Vista?


 * FYI: both Safari 3.2.1 and Firefox 3.0.4 seem to display Tibetan without any help on a default install of Mac OS X 10.5.6. Adding the above CSS doesn't seem to affect the display in these browsers. —Michael Z. 2009-01-03 21:44 z 


 * Done. —Michael Z. 2009-01-03 21:46 z 


 * To answer your question, Vista ships with the Microsoft Himalaya font which supports Tibetan. It's at the end of the font list because it's awfully small. -- Prince Kassad 22:00, 3 January 2009 (UTC)


 * Oops, I have also updated the template now, which was overriding the style sheet. Look better? —Michael Z. 2009-01-04 00:26 z 
 * Yes, it works now. -- Prince Kassad 17:27, 4 January 2009 (UTC)

Adding "no plural" and "no singular" parameters to {hu-decl}
I am trying to add two new named parameters (ns - no singular, np - no plural) to hu-decl, which is the main layout declension template for Hungarian nouns. Sometimes nouns do not have plurals, and occasionally they lack singulars. The current template generates both. I tried the following in (my test template) but it did not work:

The other test template that goes with this is. The hu-decl template is the layout template for several other templates. It seems a better idea to put the ns and np parameters here rather than in the subtemplates. They are already complex and I would have to modify several templates instead of one. Could someone please help? Thanks. --Panda10 21:25, 4 January 2009 (UTC)


 * In my opinion, the best solution for this is to use multi-layer templating, as is done on Ancient Greek inflection templates and others (I believe the original credit for this goes to Robert Ullmann, but am not sure). I will set up an example.  Give me perhaps a half hour.  -Atelaes λάλει ἐμοί 00:06, 5 January 2009 (UTC)


 * Ok, take a look at User:Atelaes/Sandbox, where is a modified version of, and shows a singular only version of the inflection of kutya.  It doesn't work perfectly for the nominative and essive-formal, because of the PAGENAME calls, but I imagine you'll get the idea nonetheless.  -Atelaes λάλει ἐμοί 00:19, 5 January 2009 (UTC)


 * I'd like to keep the two-column format. If there is no plural, each cell would contain a dash. Is that possible with your solution? --Panda10 01:01, 5 January 2009 (UTC)


 * Take another look now. -Atelaes λάλει ἐμοί 01:09, 5 January 2009 (UTC)


 * Sorry, I don't see any change. It looks the same to me. The code, too. --Panda10 02:01, 5 January 2009 (UTC)


 * Press edit, and then show preview. The changes are apparently lost in the job queue for now.  -Atelaes λάλει ἐμοί 02:36, 5 January 2009 (UTC)


 * Thanks! --Panda10 15:04, 10 January 2009 (UTC)

Ampersands in messages
It looks like MediaWiki now escapes ampersands in certain messages, which breaks the display of those messages if they use HTML entity references or numeric character references; I just fixed MediaWiki:Previousrevision and MediaWiki:Nextrevision, but I don't know if any other messages are affected. Does anyone know more about this? —Ruakh TALK 23:55, 4 January 2009 (UTC)

template:rfc-archive
Can someone please tell me what's wrong with this template that causes the following errors to occur? 1, 2, 3. Thanks.—msh210 ℠  22:24, 8 January 2009 (UTC)


 * @ was an unbalanced attempt at a link and has been fixed. I suspect others will be similar.  I'll work on them.  -Atelaes λάλει ἐμοί 22:27, 8 January 2009 (UTC)


 * I'm not sure about this, but I suspect that the other two require the 1= bit because there are equal signs in the running text (for 2 it's your name, for 3 it's in the link that Robert gave). My guess is that the template is taking everything before the = to be a parameter name, and everything after to be the content of that parameter (a parameter which isn't used, and so isn't displayed).  So, I suspect that, for every rfc discussion containing an =, you'll have to do the 1= trick.  -Atelaes λάλει ἐμοί 22:34, 8 January 2009 (UTC)

Hm, thanks. Is there a way to allow a template to accept a parameter value that includes msh210 ℠  22:36, 8 January 2009 (UTC)


 * Not that I'm aware of (though I'm certainly not the most technically proficient person on this project). My experience is that anytime you have misbalanced brackets of any sort nested inside other brackets (a necessary condition for inclusion in a template), things just go all screwy.  -Atelaes λάλει ἐμοί 22:46, 8 January 2009 (UTC)


 * If you wrap it in <nowiki> tags (as you did just now), or otherwise modify it so that the software doesn't get confused by mismatched brackets — [&#x5B;foo] and [<nowiki/>[foo] both work — it should be fine, but AFAIK there's no way to do it without modifying the comment. If we want to be really precise about such things, we can create and, turn  into basically just   , and use  and  directly in situations where  would be a pain … personally I wouldn't worry about it if I were the one doing the archive, but since I'm not, you can totally be my guest to be as precise as you like. :-)   —Ruakh TALK 05:08, 9 January 2009 (UTC)
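
As a rough illustration of the behaviour guessed at above (text before the first = is taken as a parameter name, so the 1= prefix is needed), here is a minimal Python sketch. The function name is hypothetical, and this is a simplification for illustration, not the MediaWiki parser itself, which also has to handle links and nested templates.

```python
from typing import Dict, List


def parse_template_args(args: List[str]) -> Dict[str, str]:
    """Split raw template arguments: anything before the first '=' becomes the parameter name."""
    params: Dict[str, str] = {}
    position = 0
    for raw in args:
        name, sep, value = raw.partition("=")
        if sep:
            # Named parameter: the running text after '=' ends up under an unexpected name.
            params[name.strip()] = value.strip()
        else:
            # Unnamed parameter: numbered 1, 2, 3, ... in order of appearance.
            position += 1
            params[str(position)] = raw
    return params


# An archived comment containing '=' is swallowed by a parameter nobody displays ...
lost = parse_template_args(["see a=b in the link"])
# ... unless it is passed explicitly as parameter 1.
kept = parse_template_args(["1=see a=b in the link"])
```

In the first call there is no parameter "1" at all, so a template that only displays parameter 1 shows nothing; in the second call the whole comment, including its own =, arrives intact.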

Template:ca-noun-form
This is a very simple template, but it does need to do one thing that I don't know how to set up. It needs to check the values of (or ) and  (or ) to see whether they are valid for gender / number. Can someone help? I could certainly set up something using #switch, but I seem to recall there's a bit of code out there that I could simply call from this template, and have that pre-written code do the work. The problem is that I don't know where the code is or how to call it. --EncycloPetey 04:59, 12 January 2009 (UTC)


 * To begin with, I strongly suggest that the dual input be dropped. Have gender be a named or a numbered input, but not both!  I'm not aware of any sort of pre-written code which could be utilized for this, but it is a fairly simple switch.  -Atelaes λάλει ἐμοί 06:52, 12 January 2009 (UTC)


 * I'm standardizing now, and have only used unnamed. However, the template is set to prefer, and when this is lacking it uses .  The dual input options shouldn't hurt anything, and allows for users who forget whether this particular template uses one coding method or the other.  I often get confused myself between all the various templates I use, some of which take the gender as an unnamed parameter, and some of which require . --EncycloPetey 06:58, 12 January 2009 (UTC)


 * Ok. Well, what are the accepted genders?  Also, what should it do if the gender isn't valid?  I guess my first assumption would be to not display any gender, and simply tag the entry with .  Also, you mentioned something about number, but I'm not seeing any number stuff in the template.  -Atelaes λάλει ἐμοί 07:04, 12 January 2009 (UTC)


 * The is intended to be an optional sg or pl from values of "s" or "p" (or "sg" or "pl").  The value of  should be "m", "f", or "mf" (with "c" as equivalent to "mf").  If the gender isn't valid, it shouldn't display and should request attention, as you guessed.  I'm undecided about whether  should be required; it would certainly simplify coding if it were mandatory. --EncycloPetey 07:14, 12 January 2009 (UTC)


 * Ok, give it a shot and let me know if it works. -Atelaes λάλει ἐμοί 07:20, 12 January 2009 (UTC)
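
For readers following the logic, the validation discussed above amounts to two small lookups. This is a Python sketch of that logic only, not the template code. The function and table names are hypothetical, and two behaviours are assumptions the thread does not settle: an invalid number is flagged for attention in the same way as an invalid gender, and a missing gender is treated as invalid.

```python
from typing import Optional, Tuple

# Accepted inputs and their normalized forms, as stated in the thread ("c" is equivalent to "mf").
GENDERS = {"m": "m", "f": "f", "mf": "mf", "c": "mf"}
NUMBERS = {"s": "sg", "sg": "sg", "p": "pl", "pl": "pl"}


def check_noun_form(gender: Optional[str], number: Optional[str] = None) -> Tuple[Optional[str], Optional[str], bool]:
    """Return (gender to display, number to display, whether the entry needs attention)."""
    g = GENDERS.get(gender)
    n = NUMBERS.get(number) if number is not None else None
    # Assumption: a bad number is handled like a bad gender (not displayed, attention requested).
    needs_attention = g is None or (number is not None and n is None)
    return g, n, needs_attention
```

The number stays optional here, so leaving it out does not by itself request attention.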

"Other pages to be deleted"
When a page is deleted, we now get some text telling us that the template may be deleted. I don't believe this to be true. SemperBlotto 15:27, 16 January 2009 (UTC)


 * It would seem more helpful to leave the tag in the deleted page, if that's what you are suggesting. DCDuring TALK 15:32, 16 January 2009 (UTC)
 * Sorry - didn't understand that - I was asking a question, not making a suggestion. SemperBlotto 15:38, 16 January 2009 (UTC)


 * I've taken the liberty of 'ing the category tag in the template. This had been proposed before, to general indifference, but this change to the software made it a bit more important IMO. -- Visviva 16:29, 16 January 2009 (UTC)


 * I have to admit this is my fault. See my changes to the text that is displayed. (I figured we may as well list them there, as there are never very many and it makes them easier to find.) I also transcluded the top of recent changes so that there is often a link to the talk page of the contributor, which greatly facilitates leaving them a note. (Saves me having to look at Deleted revisions to find it.) As with anything wiki, this can of course be reverted if it is causing issues for people. (I intended to add recent changes to the post-"patrol" and "rollback" pages as well, if this is wanted.) Conrad.Irwin 01:44, 19 January 2009 (UTC)

Images in translations
There are certain situations where one might want to use an image in a translation, mainly when the language is written in a script which is completely unsupported by Unicode. Currently, though, Template:t does not allow images to be used and breaks, even when I use an imagemap as a workaround (see water). Would it be possible to add some kind of parameter to Template:t to allow for images? -- Prince Kassad 21:52, 16 January 2009 (UTC)


 * Don't try to force it into . The template is only useful for languages with FL wikts (presently 167), and mostly pointless for the other 7000+ languages. ( was only created to give Tbot something to do with {t} templates added for nonexistent FL wikts, as a slight improvement over just converting them to ordinary links.) Robert Ullmann 13:57, 17 January 2009 (UTC)
 * Using Template:t is useful even for the other languages, since it automatically links to the correct section and allows you to use script templates, transliteration and gender. Also even without Template:t, using images does not work (again, see water) -- Prince Kassad 19:10, 17 January 2009 (UTC)
 * You might ask RodASmith about this. He has been using ASL images in Translations tables with some success. --EncycloPetey 19:23, 17 January 2009 (UTC)

lang tags

 * See: User_talk:Robert_Ullmann


 * See: Requests_for_deletion/Others

We should be generating HTML/XHTML lang tags for a number of bits where we have the information, to allow the browser font selection and CSS styling to work as designed. Please read the two sections above. Nbarth makes a good point, in that trying to convert languages to script (ala ) is mostly pointless, as it is the language that is wanted. We only want the script in a limited number of cases (e.g. Hant v Hans), and to add classes (for IE; without the customary IE brokenness there would not be much point ...)

Just to explain what this looks like; we want a bit of text in Japanese to look like:

言葉

The class being only for our CSS and for IE<8, other browsers can style on the lang "pseudo-class".

Nbarth also points out that writing  is annoying. Especially when the "Jpan" script selection isn't really the point: the browser setup works on the lang attribute.

Somewhere above, we proposed passing  and   to the various script templates. The templates can then do the right thing. The language is the code, the face is one of  so the template can apply the desired HTML tag for the script (some scripts we want bold, some not; likewise for italic). tells the template this is a headword (from {infl} or whatever) possibly made larger (Han, Arabic). says this use is by {term}, to be italic in some cases.

(The script templates should use b for bold, and i for italic, not "em" and "strong", to be consistent with the WM s/w generated code for double and triple apostrophes.)

In standard use, the script codes are not used in most language tags: for example, "zh-Hant" is used, but "fr-Latn" is not, since "Latn" is the (suppressed) default for "fr". Script codes by themselves are meaningless:  is not allowed (ignored), as it isn't a language tag. This is why the script templates need the lang= from caller, although they can sometimes default reasonably ( defaults to language "ko"). So the templates will suppress the script for a set of languages, default the language when reasonable, and in the case of Latn, suppress the script in all languages except a specific list:  makes sense and is used.

What is then needed is a bit of magic, a script template that can be used by default. Other templates that take lang= and sc= can then do something like:

Nbarth suggested ; I had been thinking (-) we can name it whatever. This bit of magic must be very concise; it will get used a lot, hundreds of thousands of times. It provides a small set of script templates for common languages, and otherwise Does the Right Thing.

The set of languages with scripts supplied by the magic is:

(Yes right now you are thinking, but what about ? This list must be very short to work; other languages will simply have to specify the script. Of course we can revise this list, but making it more than a small bit longer is unworkable.) Are we still using for Ancient Greek, or is it now the same template/class?

Time for Man U/Bolton, must go now ... Robert Ullmann 14:48, 17 January 2009 (UTC)
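
To make the script-suppression rule in the comment above concrete, here is a minimal Python sketch. It is not the proposed template: the function names and the small suppression table are hypothetical, and the real list of languages was not preserved in this thread. It assumes only what is stated above: a script code alone is not a language tag, the implied script is left out ("fr", not "fr-Latn"), and a non-default script is kept ("zh-Hant").

```python
from html import escape
from typing import Optional

# Illustrative table only: language code -> script that is implied and therefore omitted from the tag.
SUPPRESSED_SCRIPT = {"fr": "Latn", "en": "Latn", "ja": "Jpan", "ko": "Kore"}


def lang_tag(lang: str, script: Optional[str] = None) -> str:
    """Build the value of the lang attribute from a language code and an optional script code."""
    if not lang:
        raise ValueError("a script code by itself is not a language tag")
    if script is None or SUPPRESSED_SCRIPT.get(lang) == script:
        return lang
    return f"{lang}-{script}"


def script_span(text: str, lang: str, script: Optional[str] = None) -> str:
    """Wrap text in a span carrying the script as a class (for CSS) and the language as lang."""
    class_attr = f' class="{escape(script, quote=True)}"' if script else ""
    tag = escape(lang_tag(lang, script), quote=True)
    return f'<span{class_attr} lang="{tag}">{escape(text)}</span>'
```

The class is kept even when the script is suppressed in the tag, since the class is what the style sheet and older IE versions key on, while the lang attribute is what the browser's own font selection uses.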

RSS Feed
OTRS has gotten a couple of reports that "The Word of the Day RSS feed hasn't been updated since December 23, 2008." Not sure how to fix this... -- Versa geek  01:36, 19 January 2009 (UTC)
 * AFAIK, Connel is the only one who knows how to work that magic. --EncycloPetey 01:44, 19 January 2009 (UTC)
 * User talk:Connel MacKenzie. Conrad.Irwin 01:52, 19 January 2009 (UTC)

page load error trying to open ventana
When I try to open the page ventana I get a page load error. This problem has existed for at least one week. Does anyone have an idea what is happening? Matthias Buchmeier 14:28, 19 January 2009 (UTC)


 * No problems here [Tried with latest Firefox, Chrome, and IE7 all on MS Vista]. --Bequw → ¢ • τ 06:38, 20 January 2009 (UTC)

American Sign Language language
The categories ase:Etymology and ase:English derivations use, which causes them to be included in Category:American Sign Language language, which is properly redlinked, as it exists at Category:American Sign Language. Would it be possible to fiddle with that template (or the responsible subtemplate) so that it treats ase as an exception and categorizes properly?—msh210 ℠  20:43, 19 January 2009 (UTC)


 * That's not so hard — just a one-line fix to if I'm following the code right — but is this something we might want to do for other languages as well? For example, do we want "____ Creole language", or just "____ Creole"? Maybe we should add support for a langdesc= parameter that defaults to language name language? —Ruakh TALK 22:01, 19 January 2009 (UTC)

Not too active around here any more, but I saw a note about this from msh210. I haven't looked at this code in a while, but I'd probably lean toward doing this in some variation of Template:langname. Currently, I basically call langname and append "language" unconditionally. If there were a template like langname that knew whether or not to include the word "language", that would help here.

That being said, I was never really sold on the idea of using the topic category templates for the etymology and derivation categories. It always seemed to me that there should be something more sophisticated for these cases, but I don't recall whether I had any better ideas on how to approach it. I don't actually recall the details of how langname itself works right now, so that probably doesn't help :) Mike Dillon 23:27, 19 January 2009 (UTC)


 * P.S. I think you'd probably be dealing with Template:topic cat parents/Etymology, not Template:topic cat parents/default. Mike Dillon 23:29, 19 January 2009 (UTC)

I believe I've fixed this. I had to touch both and. I only did it for "ase". Mike Dillon 05:34, 7 February 2009 (UTC)


 * Looking at the diffs for my changes, it's pretty obvious to me in retrospect that a better way to do this would be to make the template respond to a parameter that tells it to append the word "language", omitting it for the language codes that already have "Language" in them.


 * This change should not affect everywhere that is currently used since they wouldn't be passing this new parameter, but I'm still not willing to make that change given my recently low level of activity. Seems like something up Robert's alley, but the changes I made should work for this one case. It just sucks to have to change three pieces of markup if anyone wants to add a language to this exclusion list. Never mind extending it to other cases besides Etymology if they arise. Mike Dillon 05:18, 11 February 2009 (UTC)
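
The parameter-driven approach suggested above can be sketched in Python as follows. This only illustrates the naming rule, not the template markup that was actually changed. The function name and the append_language parameter are hypothetical, and the check for an existing word "language" in the name is one possible reading of "already have 'Language' in them".

```python
def language_category(name: str, append_language: bool = True) -> str:
    """Build a category title, adding the word "language" only when asked and only when it is missing."""
    already_has_it = "language" in name.lower().split()
    if append_language and not already_has_it:
        return f"Category:{name} language"
    return f"Category:{name}"
```

Callers that do not pass the new parameter keep the old behaviour for ordinary names, while "American Sign Language" no longer gets a doubled "language". Whether names such as "____ Creole" should also skip the suffix is left open, as in the discussion.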

Interwicket update
I've taken some of the code from the iwiki bot that updates the entire wikt in one pass, with a global index, and written something that looks at RC on the various wikts and finds things to add here.

I did this more because I wanted to gather more information about the activity on other wikts, not because I wanted to speed up additions of iwiki links; but it is useful. I've noted over time that when I have reason to go look at FL wikts, often there is very little activity (RC last 100 shows several days' worth); but sometimes quite a bit. This way I get to "see" a lot of it. I've noted that the total NS:0 creation rate (not including bots) for all of the FL wikts put together is about the same as the en.wikt rate. For bots, I don't know yet, but looking at the statistics timelines for number of entries one can draw some conclusions.

The practical effect (assuming I or someone keeps running it ;-) is that when entries are added to other wikts that have an en.wikt entry, the iwiki will be added here in some smallish number of minutes (up to 6 hours if the FL wikt is usually very quiet, thus not looked at frequently). When a new entry is added here, it will add any iwikis it finds, like this. It doesn't do an exhaustive search in this case, so some may be left to be found later in a complete pass. It looks at a few large wikts, plus the wikts for the language(s) in the entry, plus any that have iwikis added by the creator.

It also checks sort order whenever looking at entries; if someone adds an entry to an FL wikt, and then adds the iwiki here, the bot will re-check it and sort it in correctly if needed. So it isn't necessary in almost any case to tell well-meaning editors not to add iwikis. They very often want to add an iwiki to the en.wikt, because the "standard" bot relies on hints like that to get started.

So you'll see more edits by Interwicket, comments welcome of course. Robert Ullmann 12:06, 25 January 2009 (UTC)

I've added code to add the reciprocal iwiki link to the FL entry, pointing back to the English. This is desirable for iwiki bot-like things, but runs into the rather large problem of getting bot flags/permissions/(non-denials :-) from 170 wikts... it operates in a test mode, doing a very limited number, unless bot-flagged (or blocked ...). I would like feedback from wherever. Robert Ullmann 14:22, 26 January 2009 (UTC)

StringFunctions
moved from BP

I think it is more logical to 'exploit' the consistencies between language conjugations than to rely on users learning to use hard-to-understand templates. For example, Dutch verbs are more or less consistent when split into groups: regular/irregular (with some exceptions). With the help of the StringFunction extension we could literally have verb conjugations done within the template instead of using e.g etc. Here's a rough sample:

en

There are some errors in this, but with a few tweaks it could be used to get the infinitive from the stem of a regular Dutch verb. It is by no means limited to a verb's stem/infinitive, or even to verbs: plural noun forms are pretty consistent, and it would be a ton more user-friendly. There is of course the odd exception where this would not work, but such errors can easily be avoided by adding a param. (Note: since the wiki does not currently have the extension installed, it would not work right now.) 120.16.131.29 12:39, 25 January 2009 (UTC)
 * on second thought maybe this belongs @ Grease pit?


 * The extension will not be installed on any WM project; Tim and Brion think (and I concur) that it is poorly designed and written, and leads to people trying to "program" the template language far beyond what should be done. Combinations of string functions and #if and #switch lead to painful performance levels. In our case, the templates can extend easily when provided with the stem (common letter-string, sometimes in a couple of forms), although some do this better than others, some not at all. (Did you just drop in to ask this, or are you doing something here?) Robert Ullmann 12:48, 25 January 2009 (UTC)


 * Kinda, yeah. I saw all this was done manually, thought this was a better alternative, and thought I'd suggest it while here; I did not think it was bad performance-wise. As for the templates without the string function, they do work efficiently for some, but unfortunately for a lot of Dutch words (for example) the long/short vowel sounds break when conjugating them, which would make the spelling inaccurate. (Well, this at least explains the sudden performance drop of an off-site wiki, I guess.) 120.16.131.29 13:06, 25 January 2009 (UTC)


 * Wouldn't substituting remove the 'performance issue' then? 120.16.131.29 13:20, 25 January 2009 (UTC)


 * Perhaps, but it would then cause an editing issue. Most users wouldn't know how to read all that code and change the right bit if they found an error.  It would also mean that if we decided to change the template format, we couldn't do so without changing every page that had its code substituted.  The purpose of a template is to allow control of format from a single location, and substituting eliminates that utility. --EncycloPetey 16:29, 25 January 2009 (UTC)

template:diff
I've created for the display of diff URLs (on discussion pages, naturally). If I missed something, please tweak.—msh210 ℠  23:11, 29 January 2009 (UTC)