Module talk:palindromes

match efficiency
Thanks for the fix. I would have assumed that match, when given an index, would have short-circuit logic that returns as soon as it finds a character. Then again, I suppose it must calculate the Unicode offset every time. Lua can be very dumb sometimes. — JohnC5 18:54, 27 August 2016 (UTC)
 * gmatch avoids that problem; iterating over a string that way would be O(n). A hybrid method may be the fastest: use gmatch to add all the characters individually to a list, which is O(n), then do the same n/2 comparisons as before but using list indexing to find the characters. Extending a list by one element each time may slow it down a little, but if the list is pre-created with n "nil"s then it would probably never reallocate. I really doubt putting that much work into optimising it is worth it though. I like to treat it more like Python: write simple, sensible code first, be clever only if you have to. —CodeCat 18:57, 27 August 2016 (UTC)
 * Fair enough. My only concern is that adding this mildly intense operation to all headwords will only exacerbate Lua memory and time overflows like those that happen at water. I'd like to eke out as much efficiency as possible for these high-use templates. — JohnC5 19:29, 27 August 2016 (UTC)
 * I implemented the hybrid version as I proposed. What do you think? —CodeCat 19:35, 27 August 2016 (UTC)
 * Good enough for me! — JohnC5 20:02, 27 August 2016 (UTC)
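
The hybrid approach described in this thread can be sketched roughly as follows. This is an illustration in Python, not the module's Lua code: the function name `is_palindrome_utf8` and the byte pattern are assumptions made for the example. Working on raw UTF-8 bytes mimics the Lua situation, where a string is a byte sequence and finding the n-th character requires scanning from the start.

```python
import re

# One UTF-8 encoded character: a lead byte followed by any continuation bytes.
# (Assumes well-formed UTF-8 input; this is an illustrative pattern.)
UTF8_CHAR = re.compile(rb"[\x00-\x7F\xC2-\xF4][\x80-\xBF]*")


def is_palindrome_utf8(data: bytes) -> bool:
    """Check a UTF-8 byte string for palindromy, character by character."""
    # Single O(n) pass: collect every character into a list.
    chars = UTF8_CHAR.findall(data)
    n = len(chars)
    # n/2 comparisons, each an O(1) list lookup.
    for i in range(n // 2):
        if chars[i] != chars[n - 1 - i]:
            return False
    return True


print(is_palindrome_utf8("été".encode("utf-8")))
```

Each character is located once during the initial pass, so the total work stays linear instead of repeating an offset calculation for every comparison.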

Verification
These entries are categorized as palindromes and are not currently recognized as such by the module. DTLHS (talk) 21:54, 27 August 2016 (UTC)
 * Many of these are what we would consider repeated-character entries, and a few are cases where we have not accounted for different diacritics. Some questions that remain are 🇨🇬 and 🇨🇬,, where whoever made the entry considered the ch, cs, and sz to be single characters respectively. I do not know enough of these languages to make a judgment about the validity. Some of the Turkish ones seem to be false. — JohnC5 01:04, 28 August 2016 (UTC)
 * What do you think of these? DTLHS (talk) 01:07, 28 August 2016 (UTC)
 * The Hungarian digraphs cs, dz, gy, ly, ny, sz, ty, zs and trigraph dzs are considered one letter, and should be read the same, so and  are palindromes. See this list of Hungarian palindromes in Wikiquote: . The words  and  are also palindromes but their meaning is not identical when read backwards. --Panda10 (talk) 01:21, 28 August 2016 (UTC)
 * In Czech, ch is considered to be one letter for alphabetical sorting purposes, placed after h in the alphabetical order. --Dan Polansky (talk) 07:40, 28 August 2016 (UTC)
 * Thanks for the tip. At present I'm having the module replace ch with χ, which I hope is not used anywhere in Czech orthography. I suppose we could use a series of very obscure characters for substitution, which would almost guarantee that we never accidentally find a false palindrome. What do you think? — JohnC5 15:25, 28 August 2016 (UTC)
 * Couldn't we just use something like the null character (U+0000)? It's guaranteed to never be an entry title. DTLHS (talk) 23:30, 29 August 2016 (UTC)
 * Telugu uses combining characters that might be difficult to account for: for example విరివి DTLHS (talk) 01:12, 28 August 2016 (UTC)
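
One way to handle both the multigraphs and the combining characters raised in this thread is to split a word into comparison units before reversing it, instead of substituting a placeholder character. The sketch below is a Python illustration of that alternative and not the module's actual code. The table `MULTIGRAPHS`, the function names, and the test strings "csacs" and "chach" are made up for the example; grouping by Unicode mark category is only an approximation of real grapheme clusters.

```python
import unicodedata

# Hypothetical per-language tables, longest multigraph first so that
# Hungarian "dzs" is matched before "dz". Lists taken from the thread above.
MULTIGRAPHS = {
    "hu": ["dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs"],
    "cs": ["ch"],
}


def split_units(text, lang):
    """Split text into units: multigraphs, or a character plus its marks."""
    graphs = MULTIGRAPHS.get(lang, [])
    units = []
    i = 0
    while i < len(text):
        for g in graphs:
            if text.startswith(g, i):
                units.append(g)
                i += len(g)
                break
        else:
            # No multigraph here: take one character and attach any following
            # combining marks (Unicode categories Mn, Mc, Me) to it.
            j = i + 1
            while j < len(text) and unicodedata.category(text[j]).startswith("M"):
                j += 1
            units.append(text[i:j])
            i = j
    return units


def is_palindrome(text, lang):
    units = split_units(text.lower(), lang)
    return units == units[::-1]


print(is_palindrome("csacs", "hu"))   # made-up string: cs + a + cs
print(is_palindrome("విరివి", "te"))   # consonant + vowel sign stay together
```

Because the units are compared as whole strings, no substitute character can collide with real text. The greedy matching is an assumption: it does not know about morpheme boundaries where a letter sequence such as "cs" might really be two separate letters.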

On the issue of Kaibun (also here), I would ask for additional guidance. Should we allow romaji palindromes and if so should we strip macrons? — JohnC5 01:32, 28 August 2016 (UTC)
 * I think that the hiragana(+katakana) reading that is passed to the headword should be the only thing examined. IMO romaji shouldn't count at all. —suzukaze (t・c) 01:35, 28 August 2016 (UTC)
 * Could you provide the unvoiced-voiced equivalency as described in the second link (if it is correct)? Also, we just have to ignore the entire Latn block then? — JohnC5 01:45, 28 August 2016 (UTC)
 * 1. Module:ja/data >  has what you are looking for. 2. It doesn't seem like Japanese palindromes involve Latn at all. (pinging  as other Japanese editors, just in case) —suzukaze (t・c) 02:37, 28 August 2016 (UTC)
 * I've made the changes, but now we are missing entries like and  which seem theoretically not to be romaji. Should I take into account what script the template uses? — JohnC5 03:47, 28 August 2016 (UTC)
 * I suppose that would work. —suzukaze (t・c) 03:53, 28 August 2016 (UTC)
 * I think I got it working correctly. — JohnC5 04:15, 28 August 2016 (UTC)
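
The kana-only check with voiced and unvoiced kana treated as equivalent could look roughly like the Python sketch below. It is not the module's implementation: the function names are invented for the example, and stripping the voicing marks through Unicode decomposition stands in for the equivalence table in Module:ja/data, which is not reproduced here. The equivalence rule itself is taken on trust from the discussion above.

```python
import unicodedata

# Combining dakuten and handakuten, which appear after NFD decomposition.
VOICING_MARKS = {"\u3099", "\u309A"}


def is_kana(ch):
    """True for characters in the Hiragana or Katakana blocks."""
    return "\u3040" <= ch <= "\u30FF"


def kana_palindrome(reading):
    """Check a kana reading, ignoring the voiced/unvoiced distinction."""
    # Readings containing anything other than kana (e.g. romaji) never count.
    if not reading or not all(is_kana(ch) for ch in reading):
        return False
    # NFD splits a voiced kana into its base kana plus a combining mark.
    decomposed = unicodedata.normalize("NFD", reading)
    plain = [ch for ch in decomposed if ch not in VOICING_MARKS]
    return plain == plain[::-1]


print(kana_palindrome("たけやぶやけた"))
print(kana_palindrome("kaibun"))
```

The sketch does not unify hiragana with katakana or treat small kana specially; whether it should is a question for editors who know the conventions.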

Are there any considerations for Old or Modern Armenian? — JohnC5 19:02, 29 August 2016 (UTC)

The existence of Category:Mandarin palindromes seems really wrong to me. —suzukaze (t・c) 03:32, 30 August 2016 (UTC)
 * Fixed, I think; the category can be deleted if it becomes empty. DTLHS (talk) 03:35, 30 August 2016 (UTC)
 * In Modern Armenian, Old Armenian and Middle Armenian ու should be considered a single character. In Modern Armenian եւ should be considered a single character. There are no other rules. --Vahag (talk) 09:04, 30 August 2016 (UTC)
 * I think I have implemented it correctly. — JohnC5 14:40, 30 August 2016 (UTC)
 * Cool. --Vahag (talk) 05:42, 31 August 2016 (UTC)

Two-character "palindromes"
Is Category:Chinese palindromes supposed to have two-character entries like 媽媽 in it? This seems to defy lua. (.) —suzukaze (t・c) 06:55, 25 October 2016 (UTC)
 * It's because Module:palindromes/data explicitly says that repeated characters are palindromes in Chinese. —CodeCat 14:14, 25 October 2016 (UTC)


 * Removed. These belong to Category:Chinese reduplications, not palindromes. Wyang (talk) 21:20, 25 October 2016 (UTC)
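
A check that keeps reduplications such as 媽媽 out of the palindrome category might be sketched as follows. This is a Python illustration only: the function name is hypothetical, and reading "repeated characters" as "a single character repeated" is an assumption, since the rule in Module:palindromes/data is not quoted here.

```python
def is_nontrivial_palindrome(word):
    """True if word reads the same reversed and is not one character repeated."""
    chars = list(word)
    # A word built from a single distinct character (媽媽, or any one-character
    # word) reverses to itself trivially, so it is rejected up front.
    if len(set(chars)) < 2:
        return False
    return chars == chars[::-1]


print(is_nontrivial_palindrome("媽媽"))
print(is_nontrivial_palindrome("上海自來水來自海上"))
```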

local missing
Please add  on line 13. Thanks. Dpleibovitz (talk) 13:20, 29 October 2023 (UTC)