User:Dan Polansky/Thesaurus Benefits

This page reflects reasoning not necessarily shared by all editors.

Since the main namespace (mainspace) is designed to feature the same semantic relations as the thesaurus, it is natural to ask why we should have a separate dedicated thesaurus. Alternatives to a thesaurus include featuring its content in the mainspace and organizing its content via categories. Another alternative is appendices. Having a separate thesaurus isn't too inconvenient for the reader: a single click on the thesaurus link is all that is required to navigate to it.

Centralization of lists for ease of maintenance
If a list of, say, 20 synonyms is duplicated in the 20 mainspace entries, adding a new synonym requires adding it in 20 places. That is an impractical maintenance burden. Thesaurus:drunk has over 250 synonyms and over 80 hyponyms as well. However, an alternative to the thesaurus would be to centralize the content in one mainspace entry and direct the reader to that central entry via "See X" from the other entries. Still, that would not take the reader to the specific synonym list on the target page; the reader would have to scroll to the list.

Decluttering mainspace
Synonyms in the mainspace are designed to be listed directly under senses. Adding 50 synonyms there would clutter the mainspace. Thesaurus:drunk has over 250 synonyms.

There can be many hyponyms as well. A reader interested in definitions or translations does not appreciate having to skim through long lists of items they are not interested in. However, the lists could be made collapsible. If so, the reader would also have to click to get to the list content; clicking on a thesaurus link takes the same number of clicks.

An example of a long entry that is not synonymic or solely hyponymic is Thesaurus:number.

Focus on a single sense
The mainspace is for a word or term, not for a particular sense. By contrast, the thesaurus can be designed to have one sense per entry, representing a single place in the semantic space. A thesaurus is a semantic navigation aid; mixing disparate senses such as those in e.g. cat on one page distracts from the semantic navigation purpose.

Semantic focus
A dedicated namespace enables focus on semantic relations to the exclusion of everything else, including etymology, pronunciation, definitions, example sentences, related terms, and derived terms, making it easier for the editor to use their imagination and memory to expand the entry, and for the reader to focus on the semantic relations they are interested in.

Knowing where to navigate next
The thesaurus places links to other thesaurus pages next to list items, via ws. Thus, the reader knows where to go to see other related lists. By contrast, a list in the mainspace does not make it clear which of the items in it are worth navigating to: many contain no further semantic lists.

Benefits over categories
One advantage of categories would be a single place of modification: the mainspace entry. By contrast, to add a term to the thesaurus, we need to modify both the mainspace entry and the thesaurus page.

Flexibility
Thesaurus pages are much more flexible than categories:
 * Register labels (informal, vulgar, obsolete, etc.) can be added next to terms, which is impossible in a category display.
 * Each term can have a gloss to be shown in the tooltip.
 * Other cases of flexibility follow as headings.

Keeping item sequence order
Items can be listed in a particular non-alphabetical order, which is often desirable. For instance, in Thesaurus:number, terms can be listed in numerical order. In Thesaurus:frequency, one can list items from highest to lowest frequency. Compare the lack of order in Category:en:Numbers and Category:English frequency adverbs.

One could achieve a similar effect by passing sort keys to categories. However, when inserting an item in the middle of a sequence, one would have to renumber the keys, or one would have to use 10-stepped keys or the like. That would be far less convenient than editing a thesaurus page.
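The difference in editing effort can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the integer sort keys, and the sample frequency adverbs are hypothetical and do not reflect how categories or thesaurus pages are actually stored.

```python
# Hypothetical model: a category orders its members by an integer sort key,
# while a thesaurus page is simply an ordered list edited in place.

def ordered(keys):
    """Return the items of a {item: sort_key} mapping in sort-key order."""
    return sorted(keys, key=keys.get)

def insert_with_sort_keys(keys, new_item, after):
    """Insert new_item right after `after`; return how many existing keys changed."""
    target = keys[after] + 1
    changed = 0
    if target in set(keys.values()):
        # The slot is taken: every item at or above it must be renumbered.
        for item, key in keys.items():
            if key >= target:
                keys[item] = key + 1
                changed += 1
    keys[new_item] = target
    return changed

# Dense keys: inserting in the middle forces renumbering of later items.
dense = {"always": 1, "often": 2, "rarely": 3, "never": 4}
dense_changed = insert_with_sort_keys(dense, "sometimes", after="often")

# 10-stepped keys: the gap absorbs the insertion, at least for a while.
stepped = {"always": 10, "often": 20, "rarely": 30, "never": 40}
stepped_changed = insert_with_sort_keys(stepped, "sometimes", after="often")

# Thesaurus-style list: one insertion, no keys to manage.
page = ["always", "often", "rarely", "never"]
page.insert(page.index("often") + 1, "sometimes")
```

In this model the dense-key variant has to touch two existing entries, the stepped variant touches none until its gaps fill up, and the plain list never needs any key bookkeeping.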

Single-page nested structures
Thesaurus pages make it possible to show all items in nested structures. There is a separation into synonyms, hyponyms, etc. on a single page, and these lists can be further subdivided, as in e.g. Thesaurus:number. By contrast, a category is a flat unstructured list of items. To support thesaurus-like functions, there would need to be separate categories for synonyms, hyponyms, etc., and to support further nesting, even deeper categories. It is not obvious how to display the aggregated content from all the relevant categories on a single page.
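As a rough sketch of the contrast, the following Python models a nested thesaurus page and counts the flat categories that would be needed to hold the same content. The headings and items are made up for illustration and are not the actual content of Thesaurus:number; the function name is likewise hypothetical.

```python
# Hypothetical nested page: headings map either to a list of terms
# or to further subheadings.
nested_page = {
    "Synonyms": ["number", "numeral"],
    "Hyponyms": {
        "Cardinal numbers": ["one", "two", "three"],
        "Ordinal numbers": ["first", "second"],
    },
}

def categories_needed(node, path=()):
    """Map each leaf list to the flat category that would have to hold it."""
    if isinstance(node, list):
        return {" / ".join(path): node}
    result = {}
    for heading, child in node.items():
        result.update(categories_needed(child, path + (heading,)))
    return result

flat = categories_needed(nested_page)
```

Here one page corresponds to three separate categories, and each extra level of nesting on the page adds more of them; showing them together again would require some aggregation step that the page gets for free.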

Wikipedia does not limit itself to categories for navigation either. It contains:
 * Navigation boxes (W:Template:Philosophy topics). Granted, navigation boxes are in templates and not on separate pages. However, they still usually require the reader to click on the box heading to uncollapse.
 * Lists (List of physical quantities)
 * Outlines (Outline of logic).

History
Unlike a category, a thesaurus page has a history. One can thus fairly easily trace which structural changes took place. That is a benefit for maintenance. When categories are expanded, renamed or split, there is no simple way to track changes to the lists, so the development of a category cannot be easily traced.

Benefits over appendices
Appendices like Appendix:SI units and Appendix:English nationality prefixes present an interesting model for term list presentation. They are more free-form and in some ways richer than the thesaurus, allowing presentation in multi-column tables, mapping e.g. nationality names to prefixes. However, they provide no guide on how to interconnect them into a semantic network, to help answer the question "what is semantically nearby". They are fine for structured term lists as an alternative to categories, but do not replace the hyponymic and meronymic organization provided by the thesaurus. For synonym lists, they do not seem to provide any advantage over the thesaurus, which allows not only listing synonyms but also providing them with glosses and tagging them for register.

Benefits of WordNet-like design
By being designed on the model of WordNet with its relations of hyponymy/hypernymy (subclass, superclass) and meronymy/holonymy (part, whole), the thesaurus can only benefit. It still features the synonymy support found in the thesauri based on synonymy, but it has much more. It provides richer vocabulary navigation features than synonymy and antonymy alone would provide. The original "thesaurus", which is Roget's Thesaurus, was not restricted to synonymy.

Benefits of "Various" section
Allowing a "Various" section provides for richer vocabulary navigation than that provided by a predefined set of relations. Its advantage is also its disadvantage: its out-of-the-box design leaves it more vulnerable to disputes. However, the benefits and richness that this section provides cannot be overstated, as shown e.g. in Thesaurus:number and Thesaurus:frequency. It provides for lateral connections. The classical Roget's Thesaurus is not constrained to any predefined set of relations either.

Other dictionaries
To double-check that having a separate thesaurus may be a good idea, we may note that both Merriam-Webster and the Oxford English Dictionary (OED) have separate thesauri. The thesaurus of the OED is a hierarchical one, a little bit like the one of Wiktionary but somewhat different. Other dictionaries have thesauri as well: there is Macmillan Thesaurus, Collins Thesaurus, Cambridge Thesaurus, and thesaurus.com (a companion of dictionary.com).