Talk:auroleus

RFV discussion: October 2013–June 2014
The etymology (aureus + -olum) suggests this is a mistake for aureolus. All I can find is someone’s last name. — Ungoliant (Falai) 11:02, 31 October 2013 (UTC)


 * Is anything I've added to Citations:auroleus at all helpful? You're right that most of the hits on Google Books are for Paracelsus' full name, but tellingly misspelt as Philippus Auroleus Theophrastus Bombastus von Hohenheim. My guess is that this is a misspelling that crops up in many languages, including English, Latin, German, Spanish, Italian, and Korean. — I.S.M.E.T.A. 18:08, 31 October 2013 (UTC)


 * That covers translingual. If the Latin is indeed a misspelling, we need someone to confirm it is common enough to warrant an entry. — Ungoliant (Falai) 18:20, 31 October 2013 (UTC)


 * What's the threshold for Latin misspellings? Bear in mind that I've only searched for the nominative singular masculine; there are thirty-five other forms. — I.S.M.E.T.A. 18:38, 31 October 2013 (UTC)


 * Has anyone here developed something to facilitate searches for all inflected forms of a lemma, using our inflection tables or someone else's? Is Google smart enough to do this in its supported languages? DCDuring TALK 19:03, 31 October 2013 (UTC)


 * I've created which automatically creates exact-phrase Google-Books searches for the declined forms generated by  (and works similarly to it). Here it is in action in the case of auroleus (the code below is   ):


 * It's a bit messy at the moment, but it's a start. I can think of three definite improvements:
 * Removal of the google books: prefix and the phrase-marking quotation marks from every search term (so it looks like );
 * Institution of "macron-blindness" (so it acts more like ); — Stricken. — I.S.M.E.T.A. 22:51, 7 November 2013 (UTC) and,
 * Removal of redundant links for isomorphic forms, meaning that, for example, the nominative singular feminine form would be linked, but the ablative singular feminine, vocative singular feminine, nominative neuter plural, accusative neuter plural, and vocative neuter plural forms would all be unlinked (saving editors from wasting their time making several identical searches).
 * I don't have the time to work on the template any more right now. Y'all are welcome to step in anytime to improve on my work. ;-)  — I.S.M.E.T.A. 20:29, 4 November 2013 (UTC)


 * Maybe it could create a single link to all the quoted forms separated by " OR "? --WikiTiki89 20:34, 4 November 2013 (UTC)


 * Right. I've done a lot of work on, as well as on , which it transcludes. The result, in my opinion, is a rather great improvement: I've instituted improvements № 1 and № 3 which I listed above, and I've added a "SEARCH ALL FORMS" function per WikiTiki's suggestion. I no longer think improvement № 2 would be a good idea (hence my strikethrough), because actual Latin text (this and other dictionaries' standardised presentation notwithstanding) often uses diacritics and/or ligatures, especially during the Mediaeval and Early Modern periods. I intend to add supplementary support for case endings with the spellings -ũ (nom. sg. n., acc. sg. m., acc. sg. n., voc. sg. n.), -æ (gen. sg. f., dat. sg. f., nom. pl. f., voc. pl. f.), -ã (acc. sg. f.), -â (abl. sg. f.), -orũ (gen. pl. m., gen. pl. n.), and -arũ (gen. pl. f.); if anyone knows of any other regular spelling variants for the case endings, let me know, and I'll add them, too. If use of catches on, I can create b.g.c.-searchers for the other Latin declension templates, if desired. — I.S.M.E.T.A. 22:51, 7 November 2013 (UTC)
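The approach described above (one exact-phrase search per distinct form, plus a single "SEARCH ALL FORMS" query) can be sketched roughly as follows. This is only an illustration in Python, not the template's actual code: the function names, the unmarked (macron-less) endings table, and the Google search URL pattern are assumptions made for the example.

```python
from urllib.parse import quote_plus

# First/second-declension adjective endings in unmarked (macron-less) spelling,
# ordered nom, gen, dat, acc, abl, voc. Assumption: this mirrors what the
# inflection table would generate; it is not taken from the template itself.
ENDINGS = (
    "us", "i", "o", "um", "o", "e",          # masculine singular
    "a", "ae", "ae", "am", "a", "a",         # feminine singular
    "um", "i", "o", "um", "o", "um",         # neuter singular
    "i", "orum", "is", "os", "is", "i",      # masculine plural
    "ae", "arum", "is", "as", "is", "ae",    # feminine plural
    "a", "orum", "is", "a", "is", "a",       # neuter plural
)

# Supplementary spelling variants of the case endings named in the discussion.
VARIANT_ENDINGS = ("ũ", "æ", "ã", "â", "orũ", "arũ")


def distinct_forms(stem, include_variants=True):
    """Return each distinct spelling once, in first-seen order,
    so that isomorphic forms do not produce duplicate searches."""
    endings = ENDINGS + (VARIANT_ENDINGS if include_variants else ())
    return list(dict.fromkeys(stem + ending for ending in endings))


def all_forms_query(forms, operator="|"):
    """Join exact-phrase (quoted) forms with the given operator ("|" or "OR")."""
    return f" {operator} ".join(f'"{form}"' for form in forms)


def google_books_url(query):
    """Build a Google Books search URL (hypothetical URL pattern for this sketch)."""
    return "https://www.google.com/search?tbm=bks&q=" + quote_plus(query)


if __name__ == "__main__":
    forms = distinct_forms("aurole")
    print(len(ENDINGS), "table cells,", len(forms), "distinct spellings")
    print(google_books_url(all_forms_query(forms)))
```

Deduplicating with `dict.fromkeys` keeps the first occurrence of each spelling, which matches the idea of linking the nominative singular feminine and leaving its homographs unlinked.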


 * Excellent. I had not realized how difficult it would be to do it "right". I look forward to similar work in other languages and, eventually, some kind of generalization or standardization of how this kind of search is invoked in all inflected languages. Current English is really too easy to do using manual methods, but earlier modern English would also benefit. DCDuring TALK 23:17, 7 November 2013 (UTC)


 * I do have an idea for a template to aid in searching for variant Early-Modern spellings of English compound words (or any word with distinct elements, really), but I don't have the energy to devote to that right now, having just created and . Would you like me to inform you when I've made some progress with it? — I.S.M.E.T.A. 16:44, 8 November 2013 (UTC)


 * I think you should tell the world. Latin and EME are languages for which all-forms search has come up explicitly in RfV. It might well come up in other languages as more questionable entries are added, the more certain ones having already been added. It would seem that each of this kind of template should be added to a suitably named generic category as well as a suitable language category. For now, Category:External link templates, which includes will have to do. DCDuring TALK 20:37, 8 November 2013 (UTC)


 * It's nice to know my work is appreciated. :-) Re notifying the community of these templates' existence, I had intended to post messages to the user talk pages of all the users who've added Latin Babel boxes to their user pages, but quickly abandoned that idea when I noticed that Category:User la lists 264 such users. Instead, I was going to add a notification thereof to News for editors, but it seems I can't edit that page (Is editing it only open to administrators?), so I asked that someone else do it (see Wiktionary talk:News for editors); if you are able to, would you mind posting the notification for me, please? Re categorisation, I had already created Category:Latin inflection-table search templates (which itself is a member of Category:Latin inflection-table templates and the currently non-existent Category:Inflection-table search templates by language) for them; per your suggestion, I've also added both templates to Category:External link templates.
 * Please note that I have written documentation for the templates; it may well help comprehensibility (the documentation for, at >9½kB, is particularly thorough). I'll keep you posted on any developments pertaining to the compound-words searcher. — I.S.M.E.T.A. 11:53, 9 November 2013 (UTC)


 * ✅ DCDuring TALK 19:09, 13 November 2013 (UTC)


 * Thanks very much. I've now added the supplementary support for case endings with the spellings -ũ, -æ, -ã, -â, -orũ, and -arũ.
 * Unfortunately, there seems to be something wrong with the "SEARCH ALL FORMS" function (or, rather, with Google Books’ search engine):
 * A: 317 hits for
 * B: 222 hits for
 * Because A-query ⊂ B-query, it is logically necessary that <tt>A-hits</tt> ≤ <tt>B-hits</tt>, yet <tt>A-hits</tt> ≫ <tt>B-hits</tt>. I don't think that the "SEARCH ALL FORMS" function can be relied upon to give accurate results. — I.S.M.E.T.A. 20:34, 13 November 2013 (UTC)


 * Does Google now allow the pipe character instead of "OR"? --WikiTiki89 20:44, 13 November 2013 (UTC)


 * Yes, it does, and it works like OR, except that it doesn't have the problem mentioned here. Compare the above with and . — I.S.M.E.T.A. 20:58, 13 November 2013 (UTC)


 * I still don't see a difference between "OR" and "|", but whatever. --WikiTiki89 21:38, 13 November 2013 (UTC)


 * There is no difference in their function in the four example queries I've linked to hitherto; however, the following four queries should indicate an important difference between their functions:
 * 176,000 hits for
 * 176,000 hits for
 * 176,000 hits for
 * 37,300 hits for
 * Notice that the fourth search query, which has its first <tt>OR</tt> unbookended by search terms, returns <22% the number of hits returned by the other three search queries. By contrast, the first and third search queries return the same number of hits, even though the third search query has its first <tt>|</tt> unbookended by search terms. Does that elucidate the difference between the functions of OR and <tt>|</tt> in Google Books search queries for you? — I.S.M.E.T.A. 17:36, 15 November 2013 (UTC)


 * Ok, I see. It is a syntactic issue rather than a functional issue. --WikiTiki89 17:40, 15 November 2013 (UTC)


 * Ah, my mistake. Sorry. — I.S.M.E.T.A. 17:54, 15 November 2013 (UTC)

Here are some statistics generated from that table above: — I.S.M.E.T.A. 20:55, 13 November 2013 (UTC)
 * 156 hits for
 * 139 hits for
 * 154 hits for
 * 0 hits for
 * 24 hits for
 * 1 hit for
 * 0 hits for
 * 51 hits for
 * 4 hits for
 * 0 hits for
 * 0 hits for
 * 39 hits for
 * 0 hits for
 * 0 hits for
 * 0 hits for
 * 0 hits for
 * 4 hits for
 * 0 hits for
 * 4 hits for
 * 576 total hits for all forms of *auroleus
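As a quick check of the total, the per-form counts listed above can be summed directly. The short Python snippet below only re-adds the figures quoted in the list; the variable names are chosen for the example.

```python
# Per-form Google Books hit counts, copied in order from the list above.
hits_per_form = [156, 139, 154, 0, 24, 1, 0, 51, 4, 0, 0, 39, 0, 0, 0, 0, 4, 0, 4]

total_hits = sum(hits_per_form)
forms_with_hits = sum(1 for count in hits_per_form if count > 0)

print(f"{total_hits} total hits across {len(hits_per_form)} searches")
print(f"{forms_with_hits} searches returned at least one hit")
```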


 * To make this work really well it might be useful to allow one to search on only the forms that were not found in English and Romance language dictionaries. In this case all the nominative singular forms are used in taxonomic names, is an English word, Aureolus is part of the full Latin name of Paracelsus, etc. Maybe some "ifexist" tests would help find at least those that had Wiktionary entries. I think one could determine whether a Wiktionary entry that did exist included an English, taxonomic, or non-Latin Romance language term. DCDuring TALK 23:04, 13 November 2013 (UTC)


 * Could one not simply discount the links generated for the terms that have an unacceptable noise-to-signal ratio? As for the "SEARCH ALL FORMS" function, I don't think there's much point investing the effort in adding those <tt>ifexist</tt> tests to a function that is demonstrably unreliable on a more fundamental level. In the specific case of the forms of *auroleus, it doesn't look like any of them are legitimate entries: they're just misspellings in various languages of the forms of aureolus. — I.S.M.E.T.A. 17:36, 15 November 2013 (UTC)

Here is an equivalent table for the forms of the correctly spelt aureolus:

And here are the statistics generated from it: Taken as whole lexemes, the relative frequency of <tt>aureolus:auroleus</tt> on Google Books is 2,642¹³⁄₁₈:1. — I.S.M.E.T.A. 17:52, 15 November 2013 (UTC)
 * 133,000 hits for
 * 701,000 hits for
 * 30,400 hits for
 * 0 hits for
 * 107,000 hits for
 * 3,670 hits for
 * 284 hits for
 * 35,400 hits for
 * 20,600 hits for
 * 10 hits for
 * 464 hits for
 * 430,000 hits for
 * 1,290 hits for
 * 0 hits for
 * 800 hits for
 * 0 hits for
 * 11,600 hits for
 * 5,590 hits for
 * 41,100 hits for
 * 1,522,208 total hits for all forms of aureolus


 * In response to DCDuring's suggestion above (see his post timestamped 23:04, 13 November 2013 [UTC]) that one ought to remove from consideration forms homographic with terms in other languages:
 * 1,522,208 (LEXaureolus) − 133,000 (aureolus, species epithet and proper name) − 701,000 (aureola, species epithet and English word) − 30,400 (aureolum, species epithet) − 3,670 (aureolae, English plural noun) − 284 (aureolæ, English plural noun) − 430,000 (aureole, English word) − 41,100 (aureolas, English plural noun) = 1,522,208 − 1,339,454 = 182,754
 * With those forms eliminated, the relative frequency of <tt>aureolus:auroleus</tt> on Google Books is 317⁹⁄₃₂:1; which is to say that LEX*auroleus occurs at less than 0·32% the frequency of LEXaureolus. — I.S.M.E.T.A. 17:45, 18 November 2013 (UTC)
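The subtraction and the resulting ratio can be reproduced with exact fractions. The Python sketch below uses only the figures quoted in this thread; the dictionary keys and variable names are labels chosen for the example.

```python
from fractions import Fraction

total_aureolus = 1_522_208   # all forms of aureolus
total_auroleus = 576         # all forms of *auroleus

# Forms removed from consideration as homographs of terms in other languages.
excluded_hits = {
    "aureolus": 133_000,
    "aureola": 701_000,
    "aureolum": 30_400,
    "aureolae": 3_670,
    "aureolæ": 284,
    "aureole": 430_000,
    "aureolas": 41_100,
}

excluded_total = sum(excluded_hits.values())
remaining = total_aureolus - excluded_total

# Exact ratio of the remaining aureolus hits to the *auroleus hits.
ratio = Fraction(remaining, total_auroleus)
whole, fractional = divmod(ratio, 1)

# Share of *auroleus relative to the remaining aureolus hits, as a percentage.
share_percent = 100 * total_auroleus / remaining

print(f"excluded: {excluded_total:,}; remaining: {remaining:,}")
print(f"ratio: {whole} {fractional} : 1")
print(f"share: {share_percent:.3f}%")
```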


 * In support of my heretofore unchallenged hypothesis that this word is a misspelling wherever it occurs, I hypothesise that this misspelling occurs because it doesn't immediately look wrong, and that this is because it looks like a member of the fairly large class of adjectives derived from the suffix ; coupled with the existence of, this makes *auroleus and its forms look like natural Latin. (Which is without mentioning and even *aureoleus!) — I.S.M.E.T.A. 18:26, 18 November 2013 (UTC)

I suggest that this lexeme's entries be deleted, with <tt>MISSPELLING OF aureol[case ending]</tt> given as their deletion comments. That way, any future editor who attempts to readd the misspelt entry will be given the reason not to do so, and we won't have other editors' misspellings generating blue links. — I.S.M.E.T.A. 20:09, 18 November 2013 (UTC)


 * Why not just do a soft redirect with ? --WikiTiki89 20:20, 18 November 2013 (UTC)


 * Because that generates those undesirable blue links for said misspellings. — I.S.M.E.T.A. 20:30, 18 November 2013 (UTC)
 * Failed. — Ungoliant (falai) 22:49, 18 June 2014 (UTC)