Module talk:User:Ungoliant MMDCCLXIV/archive1

Current state of the placename module prototype

 * the categorisation of placenames is controlled by the Module:User:Ungoliant MMDCCLXIV/data data module:
 * each categorisable type of location is given an entry in that module, each entry has subentries for individual countries and a default subentry.
 * each subentry contains a list of variables, which indicate that the entry should be categorised under that type of place
 * if the value of the variable is something other than true, that value is used in the category name instead of the type of place
 * example 1: → A region of Brazil.Category:Regions of Brazil
 * example 2: → A region of England.Category:Counties and regions of England
 * the special variable "itself"= is used for types of location with standalone categories:
 * example 3: → An atoll in the Pacific Ocean.Category:en:Atolls
 * the entry will only be categorised for the type of place with the lowest position in the hierarchy that is passed as a parameter:
 * example 4: → A municipality of Brazil.Category:Municipalities of Brazil
 * example 5: → A municipality of Paraná, Brazil.Category:Municipalities of Paraná, Brazil
 * “, country” is appended to the category in some cases (see example 5)
 * note that the data module doesn’t affect definitions, just categories. This means that the module also works for types of locations not listed in the data module:
 * example 6: → an orc fortress in Mordor, Middle Earth
 * the module has two types of definitions: standalone and translations. Translations link to an English entry and have a description enclosed in parentheses; standalone definitions consist of the description, italicised
 * the parameters t1=, t2=, t3=, etc. are used to indicate the English translations. When these parameters are not present, a standalone definition is generated
 * example 7: → An island.Category:en:Islands
 * example 8: → Hawaii; Hawai'i (an island)Category:nl:Islands
 * as per common practice, English and translingual definitions begin with an upper-case letter and end in a full stop, whereas foreign-language definitions don’t
 * the module uses a hierarchy array to know the correct order of holonyms
 * example 9: → A neighbourhood in Guldborgsund, Zealand Archipelago, Kattegat, Zealand, Denmark, Scandinavia, Europe, Earth.
 * there is a function that allows changes to the default A, B, C, D formatting
 * example 10: → A state in the South region of the United States.Category:en:States of the United States
 * there is an alias system to save typing time
 * example 11: =  → An oblast of Ukraine.Category:en:Oblasts of Ukraine
 * for ultimate flexibility, there is a parameter def= which bypasses the automatic definition:
 * example 12: → a ruined mediaeval village in ArmeniaCategory:hy:Villages in Armenia
 * the module distinguishes between place types that use of and those that use in (the default is in)
 * example 13: → A city in Brazil.Category:en:Cities in Brazil
 * example 14: → A municipality of Brazil.Category:en:Municipalities of Brazil
 * the parameter also= allows for “2-in-1” definitions:
 * example 15: → A US territory and island in the Pacific Ocean.Category:en:Islands
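The categorisation rules described in the list above can be sketched roughly as follows. This is only an illustration in Python, not the module's actual Lua code: the table layout, the names DATA, HIERARCHY, PLURALS, USES_OF and category(), and the specific entries are all assumptions made for the example, and the language prefix (en:, pt:, ...) is left out.

```python
# Hypothetical mirror of the data module: placetype -> country subentry -> variables.
# A value of True means "use the plural of the placetype"; a string replaces it.
DATA = {
    "region": {
        "default": {"country": True},
        "England": {"country": "Counties and regions"},
    },
    "municipality": {
        "default": {"country": True},
        "Brazil": {"state": True, "country": True},
    },
    "atoll": {
        "default": {"itself": True},
    },
}

# Holonym types ordered from lowest to highest; only the lowest match categorises.
HIERARCHY = ["state", "country"]

# Irregular plurals and placetypes that take "of" (assumed values for this sketch).
PLURALS = {"municipality": "municipalities"}
USES_OF = {"region", "municipality"}


def category(placetype, holonyms):
    """Return a category name, or None when the placetype is not in the data table."""
    entry = DATA.get(placetype)
    if entry is None:
        return None  # the definition is still generated, just without a category
    sub = entry.get(holonyms.get("country"), entry["default"])
    plural = PLURALS.get(placetype, placetype + "s").capitalize()
    if sub.get("itself"):
        return plural  # standalone category such as "Atolls"
    prep = "of" if placetype in USES_OF else "in"
    for kind in HIERARCHY:
        value = sub.get(kind)
        if value and kind in holonyms:
            label = plural if value is True else value
            place = holonyms[kind]
            if kind != "country" and "country" in holonyms:
                place += ", " + holonyms["country"]  # the ", country" suffix
            return f"{label} {prep} {place}"
    return None
```

With these assumed entries, category("municipality", {"state": "Paraná", "country": "Brazil"}) follows the pattern of example 5, and a placetype missing from the table (as in example 6) simply gets no category.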

Thoughts? — Ungoliant (falai) 21:42, 8 October 2015 (UTC)
 * Looks good. One minor comment: I like the style of, where standalone definitions start with capital letter and end with a period, regardless of the language. But I guess you're right in making the module distinguish the formatting between languages if that makes the entries more consistent.
 * I suggest using the name Module:place for the module and Template:place/Template:p (or just Template:p) for the template. "Template:place" is going through WT:RFDO right now, so you might want to read that discussion, which is one month old. In any event, I decided to close the RFDO; see my last note on that discussion. I thought of the name Template:p because a shortcut would be in order, since this is predicted to be a highly used template; it reminds me of Template:l, a widely used shortcut to Template:link. --Daniel Carrero (talk) 19:17, 9 October 2015 (UTC)
 * During your inactivity (2012-2014) there were huge fights over whether definitions should be sentences or fragments. Eventually Mglovesfun suggested that English defs should always be sentences and FL defs should always be fragments, and that’s the practice I’ve been following.
 * But if you insist on having all defs generated by this template be sentences, I won’t fight. — Ungoliant (falai) 19:24, 9 October 2015 (UTC)
 * I did not know about the fights during my inactivity. I've been creating a few entries formatted as sentences (não sei das quantas, jogo do bicho). But, if you don't mind, let's have all defs generated by this template be sentences as you said, at least for now. These are sentences and not fragments if you ask me; they are just not formatted like sentences. In any event, if people disagree with that format and want it reverted to the previous fragment-style state, I suppose that could be easily achieved by re-editing the module in the future anyway, and in that case they probably would want to change and  too. Can you kindly edit the module or should I do it? You know it better than I. --Daniel Carrero (talk) 19:38, 9 October 2015 (UTC)
 * ✅. — Ungoliant (falai) 19:48, 9 October 2015 (UTC)

Advantages of having a template for placenames

 * Consistency: it would be very pleasant to have all placenames with the same formatting and overall look, like given names and inflected forms;
 * automatic categorisation: suppose we decide that the municipalities of Italy should be categorised per region. If all names of Italian municipalities used a template, it would take 10 seconds to fill the categories, instead of a week. Suppose we change our mind again, and decide that all Italian municipalities should be in a single category; a mere 10 seconds more and it’s all done;
 * tracking: even people who hate placenames will benefit from templatisation. Suppose you convince other editors that we should get rid of all names of unincorporated communities. If all placenames were templatised, we could have a list of all unincorporated communities in a jiffy;
 * database-like data: people who parse Wiktionary dumps will benefit from the different markup. For example, if someone wants to publish a wordgame dictionary based on WT data, they probably don’t want players to use placenames, but in the way our definitions are currently formatted it is very hard to distinguish a placename from other types of word.

— Ungoliant (falai) 19:44, 9 October 2015 (UTC)

Things to be done

 * The hierarchy system must be converted into a tree-like structure, and should probably be moved into a data module;
 * the description system (the one that makes region=A|country=B be defined as in the region A of B instead of the standard A, B) is a bit clunky; in addition, it should probably be integrated into the hierarchy data module
 * Wikitiki suggested that all parameters should be positional. That would have the advantage that we wouldn’t need to keep adding new items to the hierarchy structure constantly, but how would the template know, for example, whether a given name is for the US state of Georgia or the Caucasian country?
 * the formatting of def= needs some work;
 * does the community even want such a template? There are at least 3 people who will oppose it for sure.
 * documentation
 * links
 * multiple values for parameter 2
 * integrate linking with the links module
 * the module is getting so convoluted; is it even going to be used/usable?
 * country names that take the article "the" will need special treatment

— Ungoliant (falai) 19:59, 9 October 2015 (UTC)
 * About "* the module is getting so convoluted; is it even going to be used/usable?" This would be 1 point in favor of the current system that is being used in entries, i.e., we have separate templates for each country/type relation, like Template:place:Brazil/municipality and Template:place:Brazil/state capital, which seems less convoluted and easy to use/adapt. In any event, I have faith that a module system could be better, it's just that it's something complex to develop.
 * In my last job this year, I had lots of free time at my desk while waiting for calls from clients. I spent a good chunk of that time with pen and notebook in hand, drawing flowcharts, plans and possible Beer Parlour openers (as in, the first message I'd post) about different types of places in Brazil (and the world, to a lesser extent). That's how I have an idea of the suggestions I could give for this module and how complicated they could be. --Daniel Carrero (talk) 22:21, 11 October 2015 (UTC)


 * Its convolutedness is more due to my incompetence as a programmer than to the fact that it’s a module. But, regardless, I’m fairly happy with the results so far, and I think we’re not far from a debut. As far as I can tell, the only important feature missing is the linking (so that it generates “A municipality of state, country”), documentation and testing the shit out of it.
 * About the lack of links, it’s relatively easy to solve using plainlinks, but I’d like to make it so the language section it links to can be controlled. My idea is to make it so you can use +name to link to English and -name to link to the current language (i.e. → A city in, .) What do you think?
 * — Ungoliant (falai) 22:38, 11 October 2015 (UTC)
 * Another idea is to let people add the links themselves, but country names would have to be treated differently, because its value is used to access the categorisation info. — Ungoliant (falai) 22:42, 11 October 2015 (UTC)
 * I get your point and I'm happy with the results of this module, too. That's why I've been spending time testing it and writing more suggestions about it. (I have more suggestions that I haven't sent yet.) I think it would be a better idea if the module could do the linking by itself. Can't the module convert all instances of state into ? I also agree that we're not far from a debut! I'll comment about on a later message. --Daniel Carrero (talk) 22:48, 11 October 2015 (UTC)


 * I created Template:place for experimentation.


 * One thing:
 * I think the standalone definitions should not be italicized. Reason: Once I was using the template (non-gloss, italicizes the definition) but that seems inappropriate since the definition is actually a gloss, by the standards of.


 * I agree that the definition should be "A municipality in the state of São Paulo, Brazil.", which is better than simply "A municipality of São Paulo, Brazil." Can the module do this ad hoc with Brazil only at least, even if it ignores other countries for now?


 * About the also=:
 * If we use this code:
 * Current state:
 * A yyyyy and xxxxx in the Pacific Ocean.
 * I propose the order to be reversed, which would follow the order the parameters are likely to be used and sounds more natural:
 * A xxxxx and yyyyy in the Pacific Ocean.


 * About capitals. I know this is tricky because I tried programming this with individual parameters before, but I'll leave the proposal here:
 * Current state: A state capital and municipality of São Paulo, Brazil.
 * I propose using the format that I've been using in : A municipality, the capital of the state of São Paulo, Brazil.
 * Or, if it makes the code any easier to do: A municipality, the state capital of São Paulo, Brazil.
 * --Daniel Carrero (talk) 20:32, 10 October 2015 (UTC)
 * I’ll work on that after I’m done with reworking the hierarchy structure. In the meanwhile, I’ve made it so the categories are generated as actual categories instead of text. — Ungoliant (falai) 20:41, 10 October 2015 (UTC)
 * : I’ve implemented these suggestions. Do you mind trying some examples to make sure I didn’t miss anything? — Ungoliant (falai) 01:45, 11 October 2015 (UTC)
 * Ok. I've tried a variety of places in Brazil to test your module. See the messages below. --Daniel Carrero (talk) 09:03, 11 October 2015 (UTC)


 * : Is there any important feature missing? I think we can start the serious tests now. — Ungoliant (falai) 18:42, 12 October 2015 (UTC)

About state capitals
About state capitals: It seems that the module assumes that every state capital is a municipality, which works for Brazil but maybe not for other countries. If we start editing lots of entries for state capitals in Brazil under that assumption, that may be problematic when people start using the template for state capitals that are not municipalities. I believe that US entries would use "town" and "city" more often. I've been testing some other things, which I'll inform you once I finish writing about them. --Daniel Carrero (talk) 08:50, 11 October 2015 (UTC)
 * I tested these 3 variations for basically the same thing, which I believe other people would probably try too, eventually:
 * I would expect this to be the result, with 2 simultaneous categories:
 * A municipality, the state capital of Minas Gerais, Brazil.Category:pt:Municipalities of Minas Gerais, BrazilCategory:pt:State capitals of Brazil
 * However, these were the results, respectively:
 * A municipality, the state capital of Minas Gerais, Brazil.Category:pt:State capitals of Brazil
 * A municipality, the state capital and municipality of Minas Gerais, Brazil.Category:pt:State capitals of Brazil
 * A municipality and state capital in the state of Minas Gerais, Brazil.Category:pt:Municipalities of Minas Gerais, Brazil
 * Only the 1st definition really looks good, the other ones look off. Also, none of the 3 variations uses the 2 different categories. (which indicates that the parameter also= is not currently able to categorize entries) My suggestion:
 * A municipality, the state capital of Minas Gerais, Brazil. (categories: municipalities of [state], state capitals of [country])
 * A city, the state capital of Wisconsin, USA. (categories: cities of [state] or: cities and towns of [state], state capitals of [country])
 * The state capital of Wisconsin, USA. (category: just "state capitals of [country]" -- which is technically wrong because the "city" part has been omitted, but that could be allowed as the natural result of omitting a parameter, which can be fixed later by editing the entry)
 * If you try these 3 variations right now, the results are:
 * A municipality and state capital in the state of Minas Gerais, Brazil.
 * A city and state capital in Wisconsin, USA.
 * A municipality, the state capital of Wisconsin, USA.

About states
About states: --Daniel Carrero (talk) 09:01, 11 October 2015 (UTC)
 * I believe states should be able to link to their capitals in the definition, which seems to have been standard practice for states of Brazil and US (and probably other countries) for a while. For instance:
 * currently returns "A state in the South region of Brazil."
 * If you add a capital parameter or something (say, capital/Florianópolis), it would return "A state in the South region of Brazil, whose capital is Florianópolis." or "Santa Catarina (state in the South region of Brazil, whose capital is Florianópolis)".
 * English entries usually mention capitals while foreign-language entries omit them. Nonetheless, it makes sense to have the template allow capitals regardless of language, don't you think so? It's a feature that I personally would want to use if it's available, especially if it means that we can copy the same definition between languages and basically just change the language code in the template.
 * Aside from "capital", you'll notice that Wisconsin and other entries have a "largest city" variable. The full format could possibly be:
 * A state in the South region of Brazil, whose state capital is Florianópolis.
 * A state in the South region of Brazil, whose largest city is Joinville.
 * A state in the South region of Brazil, whose state capital is Florianópolis and the largest city is Joinville.
 * And the same with t1:
 * Santa Catarina a state in the South region of Brazil, whose state capital is Florianópolis
 * Santa Catarina a state in the South region of Brazil, whose largest city is Joinville
 * Santa Catarina a state in the South region of Brazil, whose state capital is Florianópolis and the largest city is Joinville
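The proposed wording for capitals and largest cities could be assembled along these lines. This is a hedged Python sketch of the proposal only, not anything the module does: the function name state_definition and its parameters are hypothetical, the article "A" is hard-coded, and the parentheses around the translation gloss are an assumption based on the earlier description of translation definitions.

```python
def state_definition(base, capital=None, largest_city=None, t1=None):
    """Build the proposed definition text from a base description and optional extras."""
    extras = []
    if capital:
        extras.append(f"state capital is {capital}")
    if largest_city:
        # "the" appears only when the largest city follows a capital, as in the proposal
        extras.append(("the " if capital else "") + f"largest city is {largest_city}")
    text = base
    if extras:
        text += ", whose " + " and ".join(extras)
    if t1:
        # translation form: English term followed by the description as a gloss
        return f"{t1} (a {text})"
    # standalone form: capitalised sentence with a full stop
    return f"A {text}."
```

For example, state_definition("state in the South region of Brazil", capital="Florianópolis", largest_city="Joinville") yields the third standalone variant listed above.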

About municipalities
--Daniel Carrero (talk) 22:58, 11 October 2015 (UTC)
 * First of all, just a note: the distinction between "in" and "of" is inconsistent sometimes, as far as I've seen in Wikipedia and Google Books. Often, Wikipedia categories for places named as "Blablablahs OF (country)" use in running text "Example is a blablablah IN country."
 * List of municipalities in São Paulo
 * w:Category:Municipalities in Minas Gerais
 * But: List of municipalities of Brazil
 * In any event, it seems that "in" would be the most common preposition to use in this case, including in running text: "Osasco is a municipality in São Paulo, Brazil". I request renaming the category "municipalities of" to "municipalities in", in the module. I'll change the module and the existing categories.
 * Category:pt:Municipalities of Paraná, Brazil -> Category:pt:Municipalities in Paraná, Brazil
 * Category:pt:Municipalities of São Paulo, Brazil -> Category:pt:Municipalities in São Paulo, Brazil


 * I set the values based on the category names. If all municipalities were using this module, it would take a change of two characters in the data module to immediately recategorise every single entry with the new name; this is the biggest advantage of using a single module instead of various templates or, God forbid, manual definitions. — Ungoliant (falai) 01:00, 12 October 2015 (UTC)


 * I know, I'm not blaming you for the "municipalities of". I created the categories with "of" a few weeks ago. With this message I am reconsidering that decision in favor of "in". I take your point about the module, that's great! That said, in the current system of manual templates, we have only Template:place:Brazil/municipality for a single country, so thankfully that only takes a 2-character change too, but I know it would be worse if we had dozens of templates like Template:place:Italy/municipality, Template:place:Mexico/municipality, etc. --Daniel Carrero (talk) 01:09, 12 October 2015 (UTC)

About administrative regions
The Federal District has "administrative regions" and not "municipalities", according to this article. --Daniel Carrero (talk) 22:58, 11 October 2015 (UTC)
 * returns: (without a "the")
 * An administrative region in Federal District, Brazil.(without a category)
 * returns: (with the "the")
 * An administrative region in the Federal District, Brazil.(without a category)
 * I propose the current incorrectly-named category to be renamed this way (including "in the" and the English-sounding "Federal District" rather than Distrito Federal):
 * Category:pt:Municipalities of Distrito Federal, Brazil -> Category:pt:Administrative regions in the Federal District, Brazil
 * Looking it up on Google Books, I've seen Federal District used both with and without a "the"; still, the version with "the" sounds more common and natural. Wikipedia has List of administrative regions in Federal District without "the", but the introduction is "This is a list of the administrative regions (in Portuguese: região administrativa) in the Distrito Federal (DF), Brazil." with "the". w:Gama, Federal District is described as "Gama is an administrative region in the Federal District, Brazil.", with "the".


 * You can add an "administrative region" item to the data module. — Ungoliant (falai) 01:00, 12 October 2015 (UTC)

About en/pt place names
Ungoliant said: "My idea is to make it so you can use +name to link to English and -name to link to the current language (i.e. → A city in, .)"

I propose doing it this way:
 * → A city in,.
 * → A city in,.
 * → A city in,.
 * → A city in,.

I propose that when two languages are used for the same place, English takes precedence and the foreign language is parenthesized. Eventually, this will result in things like "in ". --Daniel Carrero (talk) 23:27, 11 October 2015 (UTC)


 * ✅ (except the last bit; see below). — Ungoliant (falai) 01:00, 12 October 2015 (UTC)

About en/pt types of places
Related to the above, proposal: --Daniel Carrero (talk) 23:32, 11 October 2015 (UTC)
 * → A city in,.


 * I’ll worry about that after the crucial features have been tested properly. — Ungoliant (falai) 01:00, 12 October 2015 (UTC)

About regions of Brazil
I've found terms like South Region to be attestable through Google Books. (with "region" in the name, like the Agra Division of India has "division" in the name)

For that reason, I have created entries for all the 5 regions of Brazil, including Southeast Region, North Region, Northeast Region and Center-West Region.

Proposal (I'll just link the region name because it's part of the proposal):

--Daniel Carrero (talk) 00:48, 12 October 2015 (UTC)
 * A municipality of Paraná in the South region of Brazil.
 * A municipality of Paraná in the South Region of Brazil. (which is what I would use; currently it returns "in the South Region region", with a redundant repeated "region" word that could be omitted)


 * ✅. — Ungoliant (falai) 01:00, 12 October 2015 (UTC)

About the Federal District
Brazil has 27 federative units: 26 states and the Federal District, which is a federal district. (This article lists other federal districts in the world, some are and some aren't named "Federal District".)
 * About the accuracy of categories for states

About the categories:
 * Category:States of Brazil (with all states + Federal District)
 * Category:State capitals of Brazil (with all state capitals + Brasília)

I think the current state categories are fine, they are just a bit inaccurate. The most accurate categories would be "Category:Federative units of Brazil" and "Category:Federative unit capitals of Brazil"/"Category:Capitals of federative units of Brazil".

If you and/or other people want to use "federative unit" categories I guess that's good and I could support that change, but I'm not proposing doing that right now. People often use the word "state" or "estado", just the Federal District is the odd one out.
 * English Wikipedia uses States of Brazil and w:Category:States of Brazil.
 * Portuguese Wikipedia uses Unidades federativas do Brasil and w:pt:Categoria:Unidades federativas do Brasil.

Also, I believe it's very important to mention at the Federal District/Distrito Federal that it is where Brasília is located.
 * About mentioning the capital

We could use the module like this:
 * My proposal
 * The federal district of Brazil, in which the country's capital Brasília is located.Category:States of Brazil
 * The federal district in the Center-West Region of Brazil, in which the country's capital Brasília is located.Category:States of Brazil

For comparison, the first line below is how I defined Distrito Federal a few weeks ago; the second line is the previous definition. --Daniel Carrero (talk) 01:13, 12 October 2015 (UTC)
 * 1) The  in the  of, in which the country's capital  is located.
 * 1) Federal district in central-western Brazil in which the country’s capital, Brasilia is located.


 * If we ever change the name of the category, it will be very simple to amend the categorisation. See how "county" and "region" work for England.
 * I’ve added the item "mention capital" to the data module.
 * — Ungoliant (falai) 01:40, 12 October 2015 (UTC)

October 10 updates
— Ungoliant (falai) 01:40, 11 October 2015 (UTC)
 * There has been a major change in the way information is passed to the module. Instead of placetype=placename, you now use placetype/placename. This means that the values are now positional and therefore must be passed in the correct hierarchical order; however, it also means that the module is much more efficient, and new types of place won’t have to be added to a hierarchy table
 * example 1: → A municipality in the state of Paraná, Brazil. Category:en:Municipalities of Paraná, Brazil
 * another advantage is that you can add information that is not connected to a specific type of place
 * example 2: → An ancient city in Samnium, near Rome.
 * the items in the data module now have the variables plural (default: placetype + s), preposition (default: in), article (default: an if it starts with a vowel, a otherwise) and real_name (default: placetype)
 * example 3: (with the real_name of state capital being "municipality, the state capital") → A municipality, the state capital of São Paulo, Brazil. Category:pt:State capitals of Brazil
 * a few more items have been added to the data module, and to the function that controls special definitions
 * a few formatting changes
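As a rough illustration of the positional placetype/placename format and the default values listed above, here is a small Python sketch. It is not the module's Lua code: the function names parse_holonym and with_defaults are hypothetical, and treating a value without a slash as free text with no placetype is an assumption drawn from example 2.

```python
def parse_holonym(arg):
    """Split 'placetype/placename'; a value without a slash carries no placetype."""
    placetype, sep, placename = arg.partition("/")
    if not sep:
        return None, arg  # free text such as "near Rome"
    return placetype, placename


def with_defaults(placetype, item=None):
    """Fill in the four variables of a data-module item with the stated defaults."""
    item = dict(item or {})
    item.setdefault("plural", placetype + "s")
    item.setdefault("preposition", "in")
    item.setdefault("article", "an" if placetype[0].lower() in "aeiou" else "a")
    item.setdefault("real_name", placetype)
    return item
```

An item only needs to list the variables that differ from the defaults; for instance, a "municipality" item would presumably set plural and preposition and inherit the rest.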

October 11 updates
— Ungoliant (falai) 21:57, 11 October 2015 (UTC) — Ungoliant (falai) 00:43, 12 October 2015 (UTC)
 * Parameter 2 now accepts multiple subparameters (separated by a slash)
 * example 1: → A municipality, the state capital of Paraná, Brazil. Category:pt:Municipalities of Paraná, Brazil Category:pt:State capitals of Brazil
 * the special subvalue "and" replaces the comma and doesn’t affect categories
 * example 2: → An island and village in the Pacific Ocean. Category:pt:Islands Category:pt:Villages
 * example 3: → An island and municipality, the state capital of Santa Catarina, Brazil. Category:pt:Islands Category:pt:Municipalities of Santa Catarina, Brazil Category:pt:State capitals of Brazil
 * there are new parameters for extra information: capital=, largest city= and caplc= (“capital and largest city”)
 * example 4: → A country of North America. Capital: Ottawa. Largest city: Toronto. Category:pt:Countries of North America
 * the system for special definitions has been integrated into the data module. The variable synergy=, which can be added to items in the data module, should contain subitems indexed by the placetype that precedes it, or "default", and these subitems contain the variables before=, between= and after=, which contain the text that goes in the given position. See country for an example.
 * The module now supports the format languagecode:placename to make place names link to language sections
 * example 5: → A city in,.
 * The extra info parameter modern= has been added
 * example 6: → A city in, Ancient Rome; modern Agrigento, Italy.
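The slash-separated subparameters and the languagecode:placename format could be handled roughly like this. Again, this is a Python sketch under assumptions rather than the module's code: split_placetypes and parse_link are hypothetical names, articles and real_name substitutions are ignored, and returning None for a value without a language prefix is my own choice for the example.

```python
def split_placetypes(param2):
    """Return (joined description, list of placetypes used for categories)."""
    parts = param2.split("/")
    # "and" only changes the joiner; it is never categorised
    placetypes = [p for p in parts if p != "and"]
    text = ""
    joiner = ""
    for part in parts:
        if part == "and":
            joiner = " and "  # replaces the comma before the next placetype
            continue
        text += joiner + part
        joiner = ", "
    return text, placetypes


def parse_link(value):
    """Split 'languagecode:placename'; no prefix means no language was given."""
    lang, sep, name = value.partition(":")
    if sep and lang and name:
        return lang, name
    return None, value
```

Note that parse_link as written would misread a place name that itself contains a colon, so a real implementation would need to check the prefix against known language codes.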

October 12 updates
— Ungoliant (falai) 00:59, 13 October 2015 (UTC) — Ungoliant (falai) 03:03, 13 October 2015 (UTC)
 * Linking has been integrated with Module:links
 * example 1: → A city in ; modern.
 * standalone parameters now also accept languages
 * example 2: → A sunken city in an unknown place, maybe  or the.
 * several bugfixes
 * several new items in the data module.
 * new extra info parameter, official=, for the official name of a polity.
 * there is a new parameter, a=, which replaces the article in the gloss
 * example 3: → The southernmost continent.
 * The data item "region" now refers to subnational administrative regions such as those of Italy. For regions with little to no political meaning, such as those of Brazil, "macroregion" is used instead. The text displayed and categories are not affected.

October 13 updates
— Ungoliant (falai) 20:49, 13 October 2015 (UTC)
 * Testing is almost finished
 * the module can now handle names that sometimes have an article and sometimes don’t
 * example 1: → A municipality in, .Category:en:Municipalities in the United States