Template talk:ja-see

"For a list of all kanji with on'yomi えい, not just those used in Sino-Japanese terms, see [...]"
What about linking to Category:Japanese kanji with on reading えい? —Suzukaze-c◇◇ 04:55, 5 February 2019 (UTC)

そほづ
This page doesn't have any categories added by the template . DTLHS (talk) 14:31, 22 April 2019 (UTC)
 * solved; just forgot to put an "h" to . ～ POKéTalker（═◉═） 21:19, 22 April 2019 (UTC)
 * "h" is no longer needed :) --Dine2016 (talk) 06:10, 26 May 2019 (UTC)

historical kana spellings
Do you think showing the historical kana spelling in (e.g. ) is a good idea? I'm afraid casual readers may mistake it for the katakana spelling or the (modern) pronunciation of the term, but I don't know any better way to place it. --Dine2016 (talk) 06:10, 26 May 2019 (UTC)


 * Hm, perhaps it is inappropriate if ja-see is supposed to be simple. —Suzukaze-c◇◇ 06:15, 26 May 2019 (UTC)
 * Thanks. Actually the template would look much clearer if the whole header (しらかわよふね〔シラカハヨフネ〕【白河夜船・白川夜船・白川夜舟】) got removed:


 * The header was added solely to distinguish between words in rare cases like this:


 * What about removing the header when there is only one matching word and displaying the headers when there is more than one? --Dine2016 (talk) 06:42, 26 May 2019 (UTC)


 * I don't have a particular opinion. —Suzukaze-c◇◇ 06:50, 26 May 2019 (UTC)

Category:Japanese terms with usage examples
See かんじざいぼさつ for 観自在菩薩. Category:Japanese terms with usage examples should not appear on the kana spelling of a kanji entry. Is this intentional? ～ POKéTalker（═◉═） 22:58, 17 August 2019 (UTC)
 * Um..., when I wrote, I was influenced by western linguistics, which regarded the spoken language as the language and the written language as a mere encoding of it. Given this view, かんじざいぼさつ and 観自在菩薩 denoted the same term, and since that term had usage examples, it followed that かんじざいぼさつ and 観自在菩薩 would both be in Category:Japanese terms with usage examples. What distinguished the two was that 観自在菩薩 belonged to Category:Japanese spellings with usage examples while かんじざいぼさつ did not.
 * However, it seems that Wiktionary doesn't distinguish spellings from terms, so I wouldn't object to removing that category. (It's easy, just remove

 elseif a == 'ja-usex' or a:find('^quote') then -- special hack
 	return '{{=' .. b
 * and add  to   from Module:ja-parse.) --Dine2016 (talk) 11:42, 18 August 2019 (UTC)

Should we consider using Template:ja-see for romaji entries as well?
This template provides much more useful information to the user than the older. Should we consider using in romaji entries, instead of ?

If readers of this page support this idea, then would need a bit of reworking. Here's an example of what looks like on romaji entries. The template now states that the romaji form is an "alternative" spelling.

Also, if we decide to proceed with this idea, considering the kerfuffle from last time when we tried a few iterations of, we should probably broach the topic at WT:BP or WT:GP. ‑‑ Eiríkr Útlendi │Tala við mig 22:34, 28 August 2019 (UTC)


 * I don't think so. The very first reason I created was because the older format for kana soft-redirects,

Noun

 * 1)  A reading pattern for certain kanji compound words, using the Chinese-derived on'yomi for the first kanji, and the native Japanese kun'yomi for the second kanji.


 * duplicated content (POS, definition, category) from the lemma entry. The current format for rōmaji,

Romanization

 * does not duplicate any content, so there is no reason to replace it with when the current format does a good job. --Dine2016 (talk) 01:35, 29 August 2019 (UTC)
 * Granted, reducing data duplication is a good motive, and does an excellent job of that.
 * I realize I wasn't very clear on my main motivation for bringing this up for romaji entries: usability. The current approach with  is poor usability, in that it presents the user with nearly no information, and it requires the user to click through two different entries (the links on the romaji entry, and then the links on the kana entry) before arriving at the desired main entry.  I think it would be much more useful and user-friendly to do something at least similar to, by providing users with entry information already on the romaji page, without having to click through -- and if they want to see a full entry, have the romaji page provide direct links, rather than the indirect link to the kana entry, where the user would have to click through again.
 * Perhaps itself isn't the correct template for the job for romaji entries.  Would you be supportive of something similar?  ‑‑ Eiríkr Útlendi │Tala við mig 04:09, 29 August 2019 (UTC)
 * Rōmaji could use a similar idea to that for kyūjitai entries (c.f. Talk:天道蟲), namely to simply point to the kana form (in source code) and have the template find the lemma (by fixing double redirects). But it needs to filter the result once again. For example, after fetches content from  and fixes double redirects, it should discard words like  as well as POS like the "proper noun" part of, which are romanized differently. (Well, kyūjitai also needs to filter the result once again, if the words involve ambiguous kanji like .)
 * Alternatively, rōmaji could link to the lemma entries (in source code) directly. The advantage with this approach is that acceleration is faster (After creating, simply make and  point to it, instead of creating a two-level hierarchy). The disadvantage is that homophone lists like , , , ... must be repeated on both  and . --Dine2016 (talk) 06:38, 29 August 2019 (UTC)
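
The first approach described above (point the rōmaji page at the kana page, follow the kana page to its lemmas, then filter the result) might look roughly like the following. This is only a minimal Python sketch, not the template's actual Lua code: the tables `ROMAJI_TO_KANA` and `KANA_TO_LEMMAS` and the function `resolve_romaji` are hypothetical names, and the data is a toy sample assumed for illustration rather than real entry content.

```python
# Toy data (assumed for illustration only): each romaji page stores just its
# kana form, and each kana page lists (lemma, romanization) pairs.
ROMAJI_TO_KANA = {"nihon": "にほん", "Nihon": "にほん"}
KANA_TO_LEMMAS = {
    "にほん": [("日本", "Nihon"), ("二本", "nihon")],
}


def resolve_romaji(romaji):
    """Follow romaji -> kana -> lemmas, then filter by romanization."""
    kana = ROMAJI_TO_KANA.get(romaji)
    if kana is None:
        return []
    # The second filtering step: drop lemmas that share the kana spelling
    # but are romanized differently (e.g. capitalized proper nouns).
    return [lemma for lemma, rom in KANA_TO_LEMMAS.get(kana, []) if rom == romaji]


print(resolve_romaji("nihon"))
print(resolve_romaji("Nihon"))
```

The design point is that the rōmaji page only ever names the kana page, so the homophone list lives in one place, at the cost of the extra filtering pass.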


 * The inclusion of romaji forms was based on discussions many years ago relating to usability and discoverability. As this is the English Wiktionary, we can safely assume that our readership can read English, which is written in the Latin alphabet (romaji).  We cannot assume that our readership can read kana or kanji.  So if an EN WT user has encountered a Japanese word, possibly in transcription, and they come here to look it up but without being able to input Japanese, the argument went that we still needed some way for them to find the entry.  Since Hepburn is the most common Japanese transcription system used in the English-reading world, this was what we adopted here at Wiktionary (with some tweaks).  That doesn't mean that we cannot include other romaji renderings -- just that we only include modified Hepburn in our "official" links, such as in translation tables, or in entry headword lines, or as the romanization we target with  and similar templates.
 * We specifically don't target 訓令式, as that is the romanization scheme adopted in Japan for Japanese readers, and it has various oddities that make it inappropriate for English readers (like zi not being pronounced, or syu not being pronounced ), and oddities that actually render it deficient for describing Japanese (the inability to transcribe certain sounds, like ファ or ティ).
 * We also specifically don't target ワープロ式, as this isn't a standard so much as a de facto practice with many variations, based on what various input method editors will accept for conversion. For instance, long vowels might be the same vowel twice, or the vowel plus a hyphen.  Various consonantal sounds have multiple representations, with ちゃ renderable as tya, tixya, cha, cya, and possibly more.  ん might be nn, as you note, but even in ワープロ式 it could be a single n or even an m so long as it's followed by a consonant.
 * Again, we have no stricture against the creation of romaji entries based on such alternative spelling conventions, and indeed, if users create such entries, I believe we should keep them, so long as they are properly formatted and redirect the user to the appropriate Japanese entries. However, we do not target these for display in our lemma entries, and in Japanese transliteration (linked to from WT:AJA) we explicitly explain that we use a modified version of Hepburn.
 * Returning to your main question of "why anyone would want to look up modern Japanese terms by rōmaji", it comes down to the basic position that we have no alternatives for how to help English readers find Japanese terms, when they don't know kana or kanji and might not even have a Japanese IME installed. This is the same reason we have romanization entries for Gothic, and why the topic of romanized entries for other scripts keeps coming around from time to time in the Beer Parlor and other discussion pages.  If you have some technical approach that would allow a user to input romaji in either the search bar or the URL and still land on the lemma page (or at least the kana soft-redirect page) corresponding to that romaji string, and somehow that romaji string cannot also be interpreted as a word in another language, then I think we can safely get rid of all of our Japanese romaji entries.  I agree that romaji entries are a kludge, and an inelegant one at that, but we (the EN WT community dealing with JA entries) have not been able to come up with a better approach.  ‑‑ Eiríkr Útlendi │Tala við mig 17:44, 17 September 2019 (UTC)
 * What about building rōmaji indexes in the Appendix or Index namespace? We can support multiple transcriptions (standard Hepburn, Waapuro Hepburn and Kunrei-shiki) this way.
 * As for mainspace, I have no objections against rōmaji entries as long as they're voluntary and within reasonable bounds. I vehemently opposed them in the posts above because I mistakenly believed they would be given equal weight to kana entries like a writing system. For example, the current entry is already taking up 20 MB of memory. If we transclude or build the same list at i it will probably cause a memory error and break the rest of the page. But that's clearly not your intention, and I apologize for that. I should have said "The time for using  for rōmaji entries is not mature" rather than attacking users looking up by rōmaji directly (though I still suspect anime fans may try to find 竜 at "ryuu" and be disappointed).
 * What about this compromise: employ something like in ordinary entries like tentō mushi, but switch back to the older format for entries like "i" once memory limits are exceeded? --Dine2016 (talk) 19:40, 17 September 2019 (UTC)
 * Re: multiple romanization schemes in the Appendix or Index namespaces, I think that's a wonderful idea.  Theoretically, we could have a separate appendix or index set up for each romanization scheme, with the boilerplate at the top of each such page explaining what the scheme is and (in brief) how it encodes the Japanese kana and/or sound values.  Presumably, so long as each scheme is a regular encoding, this could be programmatically generated?  And we wouldn't need to have existing romaji pages for each word's spelling, like we do for the categories?  I have no idea how to go about implementing something like this, however.
 * Re: using something like for romaji entries that don't have memory issues, I would also welcome that.
 * Good ideas, thank you! ‑‑ Eiríkr Útlendi │Tala við mig 21:10, 17 September 2019 (UTC)
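
As a rough illustration of the idea of generating per-scheme rōmaji indexes programmatically, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the kana tables cover only a handful of syllables, the function names `romanize` and `build_index` are hypothetical, and the long-vowel handling is deliberately naive (it treats every "ou" and "uu" as a long vowel, which is wrong across morpheme boundaries). "Waapuro" here means Hepburn consonants with long vowels spelled out.

```python
from collections import defaultdict

# Illustrative subset of kana; a real table would cover the full inventory.
KUNREI = {
    "し": "si", "しゃ": "sya", "て": "te", "と": "to", "む": "mu",
    "り": "ri", "りゅ": "ryu", "う": "u", "ん": "n",
}
# Hepburn differs from Kunrei-shiki only for some syllables.
HEPBURN = {**KUNREI, "し": "shi", "しゃ": "sha"}

# Naive long-vowel rules per scheme (see the caveat in the lead-in).
LONG_VOWELS = {
    "hepburn": {"ou": "ō", "uu": "ū"},
    "kunrei": {"ou": "ô", "uu": "û"},
    "waapuro": {},  # long vowels stay spelled out
}


def romanize(kana, scheme):
    """Romanize a kana string under 'hepburn', 'kunrei', or 'waapuro'."""
    table = KUNREI if scheme == "kunrei" else HEPBURN
    parts = []
    i = 0
    while i < len(kana):
        pair = kana[i:i + 2]
        # Prefer two-character digraphs such as りゅ over single kana.
        if len(pair) == 2 and pair in table:
            parts.append(table[pair])
            i += 2
        else:
            parts.append(table[kana[i]])  # KeyError for kana outside the subset
            i += 1
    text = "".join(parts)
    for plain, marked in LONG_VOWELS[scheme].items():
        text = text.replace(plain, marked)
    return text


def build_index(kana_words, scheme):
    """Map each romanized string to the kana spellings that produce it."""
    index = defaultdict(list)
    for kana in kana_words:
        index[romanize(kana, scheme)].append(kana)
    return dict(index)


words = ["りゅう", "しゃしん", "てんとうむし"]
for scheme in ("hepburn", "kunrei", "waapuro"):
    print(scheme, build_index(words, scheme))
```

Because each scheme is just a table plus a long-vowel rule, an index page per scheme could be generated from the kana entries alone, without a mainspace rōmaji page for every spelling.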

inflected forms
Hi everyone. What do you think is the best way to show inflection of alternative spellings?

My initial plan was to make them automatically generated by, on the respective entries of alternative spellings. For example, let's suppose we're soft-redirecting to. In addition to fetching the definitions and categories from, could also fetch the inflectional type (godan verb ending in -su) and inflect the alternative spelling accordingly:

The advantage of this approach is that no extra work is needed, as far as modern spellings are concerned. However, once we start creating kanji spellings involving historical kana such as or, then we need some way to tell the template to use the volitional ending -さう instead of -そう, to keep kana orthography consistent. In other words, the template needs to know whether the current alternative spelling is in modern or historical kana, so that it can supply appropriate inflectional patterns. (Spellings like add additional complexity: if we give modern inflections to  and historical inflections to, then it makes sense to give both to . So there are really three possibilities, not two.)

It might be tempting to add a new parameter to to indicate the kana orthography of the current alternative spelling, but that defeats the purpose of automatically generating the inflection table and is essentially no better than adding  manually. An alternative solution is to mark the kana orthography in the lemma entry, for example via or. I prefer this approach, but I'm not sure which format is better. What do you think?
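
To make the three possibilities concrete (modern only, historical only, or both), here is a minimal Python sketch of orthography-aware inflection for godan verbs in -su. It is not the template's code: the table `SU_ENDINGS`, the function `inflect_su`, and the `orthography` values are hypothetical, only three forms are listed, and the sample spellings 言ひ出す and 貸す are my own illustrations.

```python
# Endings for godan verbs in -su, keyed by kana orthography.
# Of the forms listed here, only the volitional differs.
SU_ENDINGS = {
    "modern": {"terminal": "す", "continuative": "し", "volitional": "そう"},
    "historical": {"terminal": "す", "continuative": "し", "volitional": "さう"},
}


def inflect_su(spelling, orthography):
    """Inflect a spelling ending in す.

    `orthography` is "modern", "historical", or "both"; "both" is meant for
    spellings that look the same under either kana orthography.
    """
    if not spelling.endswith("す"):
        raise ValueError("expected a verb ending in す")
    stem = spelling[:-1]
    kinds = ["modern", "historical"] if orthography == "both" else [orthography]
    forms = {}
    for name in SU_ENDINGS["modern"]:
        variants = []
        for kind in kinds:
            form = stem + SU_ENDINGS[kind][name]
            if form not in variants:  # avoid listing identical forms twice
                variants.append(form)
        forms[name] = variants
    return forms


print(inflect_su("言い出す", "modern")["volitional"])
print(inflect_su("言ひ出す", "historical")["volitional"])
print(inflect_su("貸す", "both")["volitional"])
```

Whichever place ends up storing the orthography (a template parameter or a mark in the lemma entry), the inflection step itself only needs this one extra input.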

Alternatively, we could expand the lemma entry directly:

Conjugation of " 言い出す " (See Appendix:Japanese verbs.)

The advantage of this approach is that it is more logical, and users searching for alternative spellings of inflected forms are likely to land on the lemma entry directly via MediaWiki's search facility. The disadvantage is that the inflection templates must be reworked, and such a format would increase data duplication.

Which approach do you prefer?

--Dine2016 (talk) 11:08, 16 September 2019 (UTC)
 * Both approaches are very interesting, thank you for the efforts. I have no objections. Let's see what other people are going to say. --Anatoli T. (обсудить/вклад) 11:46, 16 September 2019 (UTC)


 * After briefly looking over the proposal, my only concern at this point is the proposal for expanding the lemma entry's table, particularly in edge cases where a given term might have multiple historical kana spellings. The sample above seems to be able to show two forms side-by-side well enough, but I'm not sure it would scale very well to three, as with  -- modern kana もちいる, historical etymological kana もちゐる, historical technically-misspelling kana もちひる.  There are probably other examples out there as well of terms with multiple historical spellings.  ‑‑ Eiríkr Útlendi │Tala við mig 19:35, 16 September 2019 (UTC)


 * I'm not sure which one I like yet, but the second one is rather busy. Perhaps newlines would be an improvement. —Suzukaze-c◇◇ 04:27, 17 September 2019 (UTC)


 * It seems that only users with certain permissions (sysop, autopatrolled, etc.) can edit pages that use this template (when the whole page contains only a Japanese entry, I guess). I got this error when trying to edit 除ける:

Errors:
 * This action has been automatically identified as harmful, and therefore disallowed. If you believe your action was constructive, please inform an administrator of what you were trying to do. A brief description of the abuse rule which your action matched is: strips L3
Marlin Setia1 (talk) 23:43, 14 October 2019 (UTC)


 * That's because the template breaches the standard entry layout, which requires alternative spellings be formatted as

Verb

 * The new format with has not been formally recognized.
 * This issue needs to be brought to Grease pit. In the meanwhile, you can try the following instead:

Definitions

 * --Dine2016 (talk) 03:37, 15 October 2019 (UTC)


 * I'm confused -- there isn't any conjugation that should go at, as that is just a soft-redirect alternative form. Any conjugation tables for よける should currently go in the  entry.  The  entry already has almost the simplest form for soft-redirects:

==Japanese==

Etymology 2

 * (It's a bit wonky, as there's  even though there's only one etym on that page right now, and it's missing the のける reading, but anyway. :) )
 * The absolute smallest form for these soft-redirects, for spellings with only one reading, would be something like the following (assuming that 除ける were only read as yokeru):

==Japanese==


 * I'm uncertain how was running into the abuse filter?  ‑‑ Eiríkr Útlendi │Tala við mig 19:03, 15 October 2019 (UTC)

Sorting on pages using this template
, anyone else interested --

I'm curious if this template / module could be updated to apply sorting. By way of example, uses, but the 抱っこ entry is currently sorted in Category:Japanese_childish_terms under 抱っこ rather than the expected だっこ. ‑‑ Eiríkr Útlendi │Tala við mig 18:05, 7 November 2019 (UTC)
 * I gave it a try, but I still think sortkeys should eventually be eliminated. Sorting だっこ under た and 抱っこ under 抱 (or 扌/手) allows users to look up the same term by either spelling. This may not be obvious for small categories like Category:Japanese childish terms but will make a difference for larger categories with hundreds or thousands of words. But the most important argument against custom sortkeys is that many editors forget it. For example, the editors of the current entry for だっこ forgot to add a sortkey, so that it is sorted under だ instead of the correct た. --Dine2016 (talk) 03:34, 8 November 2019 (UTC)
 * I've asked a few times in a few different fora over the years, both here and on other WM sites, about how to fix sorting for Japanese, and no one seems to know jack shit about how to improve things at the base level, frankly. (That may be my frustration showing.  苦笑)  There was a related thread not long ago asking similar questions about categories for Hungarian, which at least uses the Latin alphabet.  The approach there was to use Lua to customize how sorting happens.  Given the MediaWiki team's complete apathy with regard to some of our basic-functionality needs, perhaps a similar approach, leveraging Module:languages/data2 or some other code, could be applied to Japanese?  (Asking in ignorance of the possible complexity, as I have not understood the current module infrastructure -- IMHO, our module documentation and code comments are pretty horribly lacking...)  ‑‑ Eiríkr Útlendi │Tala við mig 18:58, 8 November 2019 (UTC)
 * If we use modules then it must be for converting だ to た', 抱 to ⼿05, etc. There is no easy way to convert kanji to kana.
 * MediaWiki categories are unserviceable in the first place. The best we can do is to sort kana under kana and kanji under kanji, so that the user can look for terms beginning with むらさき by https://en.wiktionary.org/w/index.php?title=Category:Japanese_lemmas&from=むらさき, and look for terms beginning with 紫 by https://en.wiktionary.org/w/index.php?title=Category:Japanese_lemmas&from=紫, but there's no way to look for terms ending with something or do composition (kanji beginning with X and kana beginning with Y, both a verb and obsolete, etc.). And most importantly, there is no way to customize how entries appear in categories, for example to make むらさきいろ appear as "むらさきいろ【紫色】" and 紫色 appear as "紫色（むらさきいろ）". Suzukaze-c thinks that the best way to improve Wiktionary's usability is via some third-party searching function, and this requires Wiktionary data to be machine parseable. Looking at the current entry layout, I don't think it is. And more disappointing is the fact that the community's efforts are wasted in pandering to MediaWiki's deficient facilities (sorting, references, etc.) even when it would add complexity to page sources and make them unserviceable to interfaces other than MediaWiki. --Dine2016 (talk) 01:26, 9 November 2019 (UTC)
 * Apologies for my lack of clarity; regarding sorting features, what I was envisioning was actually what you propose: for kana, using Lua to deal with the current hackish workarounds of adding  on the end for initial kana with 濁点 and   for initial kana with 半濁点, and for kanji, using Lua to sort by radical + additional stroke count, rather than just sorting by the raw character itself.  Ideally, editors wouldn't have to bother with sortkeys for Japanese at all.  ‑‑ Eiríkr Útlendi │Tala við mig 07:14, 10 November 2019 (UTC)
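
The kana half of the sortkey idea discussed above could be sketched as follows. This is a minimal Python illustration, not code from Module:languages/data2 or any existing module: the function name `kana_sortkey` is hypothetical, it relies on Unicode NFD decomposition to split a voiced kana into base kana plus combining mark, and while the apostrophe for 濁点 follows the "だ to た'" example above, the doubled apostrophe used for 半濁点 is purely a placeholder assumption. Kanji are left untouched, since radical and stroke sorting would need a separate data table.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark
HANDAKUTEN = "\u309a"  # combining semi-voiced sound mark


def kana_sortkey(term):
    """Return a sort key in which an initial voiced kana sorts under its base kana."""
    if not term:
        return term
    # NFD splits e.g. だ into た plus the combining dakuten.
    decomposed = unicodedata.normalize("NFD", term[0])
    base, marks = decomposed[0], decomposed[1:]
    suffix = ""
    if DAKUTEN in marks:
        suffix = "'"
    elif HANDAKUTEN in marks:
        suffix = "''"  # placeholder marker, assumed for this sketch
    return base + term[1:] + suffix


print(kana_sortkey("だっこ"))
print(sorted(["だっこ", "たっこ", "たこ"], key=kana_sortkey))
```

With a function like this applied automatically, editors would not need to remember a manual sortkey for entries such as だっこ.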

New implementation
The new implementation parses the lemma entry in one pass, instead of dividing it by Etymology headers and accepting/rejecting each in an all-or-nothing manner. This means that the following snippet is parsed correctly:

Noun

 * 1) musical note // current list of alt spellings: おたまじゃくし

Noun

 * 1) tadpole // current list of alt spellings: おたまじゃくし, オタマジャクシ

But it also means that fewer categories are copied. In fact, only categories from headword lines and definitions are copied, since the new implementation does not make any assumptions of the format of the rest of the entry.

The new implementation also handles and  in the same manner (as the old ), and their only difference is that the latter speaks of "Sino-Japanese terms" instead of "terms". What about using for both kango and wago, and putting "Sino-Japanese" in the Etymology section? --Nyarukoseijin (talk) 11:26, 23 March 2020 (UTC)
 * I think it is good to unify these 2 temps. I don't think it is quite necessary to clearly distinguish kango from wago in a soft redirect page. -- Huhu9001 (talk) 02:44, 24 March 2020 (UTC)


 * I discovered what might be a failure mode of sorts. See  and .  The call to  on the  page should presumably only pull in the   section from, the section that has  (which I think was also the previous implementation's behavior).  However, what I see on the  page is senses from both etym sections at , which incorrectly gives  the "saint" sense as well.  ‑‑ Eiríkr Útlendi │Tala við mig 17:44, 27 May 2020 (UTC)
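
For readers trying to follow the one-pass behaviour described at the top of this section, here is a minimal Python sketch of the general technique. It is not the module's actual Lua code: the function `collect_senses` and the `@spellings:` marker line are hypothetical stand-ins for however the real implementation tracks the current list of alt spellings, and the entry text is a simplified mock-up of the おたまじゃくし example.

```python
def collect_senses(entry_text, spelling):
    """Scan the entry once, keeping definitions whose current spelling list includes `spelling`."""
    senses = []
    current_spellings = set()
    for raw_line in entry_text.splitlines():
        line = raw_line.strip()
        if line.startswith("@spellings:"):
            # A new headword: replace the running list of alt spellings.
            listed = line[len("@spellings:"):].split(",")
            current_spellings = {item.strip() for item in listed}
        elif line.startswith("# ") and spelling in current_spellings:
            senses.append(line[2:])
        # Every other line (headers, etymology text, ...) is ignored,
        # so nothing is assumed about the rest of the entry's format.
    return senses


ENTRY = """\
===Noun===
@spellings: おたまじゃくし
# musical note
===Noun===
@spellings: おたまじゃくし, オタマジャクシ
# tadpole
"""

print(collect_senses(ENTRY, "オタマジャクシ"))
print(collect_senses(ENTRY, "おたまじゃくし"))
```

Under this scheme a spelling is accepted or rejected per headword rather than per Etymology section, which is also why a spelling list that is not reset between sections would leak senses from one section into another.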

Odd rendering at いただきます
, anyone else -- could you please have a look at the いただきます page and suss out why this template isn't working correctly there? I suspect it might be related to the POS header at lemma form 頂きます, but that's just a guess. ‑‑ Eiríkr Útlendi │Tala við mig 22:15, 11 June 2020 (UTC)


 * ✅? This made it better somehow. —Suzukaze-c (talk) 04:26, 13 August 2020 (UTC)

Oddness at きかい
Pinging, welcoming anyone else with insight --

Issue
doesn't handle alt spellings very well. If a listed kanji compound is an alternative form entry that is just a stub, it gets placed at the bottom of the list in smaller font:


 * (The following entry is uncreated: .)

I haven't confirmed, but this might affect too.

Background
I just had a go at and, lemmatizing at. The other two use. The two kanji entries are rendering as expected. However, has the  (the stub entry) at the bottom, stating that it hasn't been created yet.

Ideas
I've found that this happens if can't find an alt spelling or a kana spelling. I'm not sure of the best way of adding it to a stub entry; has, so we could presumably add  , but that feels weird since this isn't really an "alternative" spelling, strictly speaking, and listing kana in the "alternative spellings" box looks odd. I also and it didn't seem to work, so if we decide to go with this approach, it would require a change to, and possibly.

Looking forward to your input.

‑‑ Eiríkr Útlendi │Tala við mig 18:24, 21 September 2020 (UTC)


 * I don't understand the code, but my guess is that it calls 機械 'uncreated' because there isn't really content, just ja-see. —Suzukaze-c (talk) 20:22, 22 September 2020 (UTC)
 * (I still don't really understand, but following variables, it seems to call it 'uncreated' because there aren't any definitions, and ja-see isn't a definition. —Suzukaze-c (talk) 20:25, 22 September 2020 (UTC))
 * I'll echo Suzukaze's guess.
 * To add to that, in combination with Dine2016's notes at User talk:Eirikr, is that is not intended to point to entries that amount to little more than stubs as alternative spellings, which is effectively what  is.
 * I didn't understand the intended behavior when I last edited the entry, and the template's output message stating that the  entry didn't exist -- when it clearly does, albeit as a stub -- was confusing to me.
 * The easy solution at the entry is simply to remove  from the list of arguments.  ... Which I've now done.  :)  ‑‑ Eiríkr Útlendi │Tala við mig 17:31, 19 July 2021 (UTC)

"key" param -- what is this for?
The documentation isn't clear what the use case is for this parameter. Could someone please explain? ‑‑ Eiríkr Útlendi │Tala við mig 22:49, 24 December 2020 (UTC)
 * As far as I understand, it's just like pagename of other templates. Used only in tests. -- Huhu9001 (talk) 14:55, 12 January 2021 (UTC)

Message from Dine2016
This is Dine2016, the original creator of this template.

There are some things I did wrong with this template:


 * 1. Having two variants, and , is certainly wrong. Most Japanese dictionaries don't treat Sino-Japanese words specially. It is unclear why Wiktionary should.
 * I initially introduced so that Sino-Japanese words could be grouped together. Later I realized that native words were also rich in homophones, and remodeled  after . Now the two templates are almost identical; the only difference is that the latter says "Sino-Japanese" in the footer.
 * The current practice seems to group soft-redirects to native words in one Etymology header, and those to Sino-Japanese words in another. I suggest dropping this boundary, using a single ===Etymology n=== section and a single for all soft-redirects on the same page.
 * 2. Creating  was also a mistake. The original assumption was that graphemes like 竜/龍, 灯/燈, 画/畫/畵, and 代々/代代 were always interchangeable, regardless of which words they spell. So we could simplify

 WORDS               SPELLINGS
 だいだい【代々】  →  代々, 代代
 よよ【世々】      →  代々, 代代, 世々, 世世
 せぜ【世々】      →  世々, 世世
 * to

 WORDS               SPELLINGS       GRAPHEME-VARIANTS
 だいだい【代々】  →  代々            代々  →  代々, 代代
 よよ【世々】      →  代々, 世々      世々  →  世々, 世世
 せぜ【世々】      →  世々
 * The problem is that I didn't convey this two-level hierarchy clearly (I couldn't write good English, to begin with), so people started using the template in the wrong way, like redirecting from 代代 to 世々. This totally defeated what the template was created for.
 * 3.  requires the alternative spelling to appear in the main entry. This is not necessary if the main entry contains only one word, in which case no ambiguities will arise.
 * 4. Current version looks too ugly. Unfortunately I was not a web designer and didn't know how to make it mobile-friendly.

Part of the reason I couldn't write good English lies in the fact that Wiktionary's terminology was unclear. For example, Wiktionary has never developed a concept for "words" (or more accurately, "lexical items"). Instead, its entries are organized around spellings. So one entry may contain several words (くらい = 'dark', 'rank/approximately', 'eating') and one word may span several entries ('rank/approximately' = くらい, くらゐ, 位). requires you to think in words, to organize the data around words, which is difficult in a spelling-centric mode of editing. So I kinda regret creating these templates.

--2409:894C:3C36:279D:5B65:2A08:CFCA:DED6 16:23, 11 July 2021 (UTC)

"The following entry is uncreated"—what?
I don't understand what the template means when it says this. Is it broken in detecting whether entries are created, or is it worded/placed poorly?

I see it in entries like 遣る, where it says:
 * (The following entry is uncreated: やる.)

but やる does, in fact, exist—it's the main entry for the word. What's going on here? --TreyHarris (talk) 20:01, 27 November 2021 (UTC)


 * Fixed. The reason is that you used two kanjitab, one for each alternative form, when in fact you need just one kanjitab with the alternative forms grouped together and separated by commas. Because of how the template was designed, having two kanjitabs confused the ja-see template, as it doesn't know which one to direct to. Shen233 (talk) 20:36, 6 December 2021 (UTC)

Category
I noticed that this template automatically adds entries to Category:Japanese non-lemma forms, which seems problematic. It's a bit inconsistent for Category:Japanese lemmas to contain some kanji spellings, but not others. Binarystep (talk) 23:02, 14 March 2022 (UTC)

Show romaji at kana-only spellings
I understand why we don't transliterate when this template is placed at a spelling that contains kanji, but when there's nothing but kana there are no technical reasons not to: we don't have to worry about dealing with multiple readings, and it should be easy to transliterate plain kana without having to fish around in other entries like we do when kanji are involved. We don't even have to link to a romaji entry- it would be fine to just display it like we do in our other templates:. As for the matter of consistency: I would argue that almost all kana-only entries with this template are basically transliteration entries already- but in a script that's opaque to those who haven't completely mastered the kanas (i.e., most of our readers). Chuck Entz (talk) 07:39, 1 May 2023 (UTC)