User:Dan Polansky/Wordhood

This page deals with the question of what a word is, or wordhood. The question is of interest both to our readers and for project-internal purposes.

Let us get an impression of what is behind the question using English examples:
 * "cat" is a word.
 * Are "cat" and "cats" two words?
 * Is "blueness" a word?
 * Is "ouch" a word?
 * Is "bookshop" a word?
 * Is "thru" a real word?
 * Is "green leaf" a word?
 * Is "black hole" a word?
 * How many words are in the "cat" entry?
 * One word per sense?
 * One word per section separated by etymology or part of speech?
 * Is "the cat that is on the mat" a word?
 * Is "rain cats and dogs" a word?
 * Is "all roads lead to Rome" a word?
 * Is "Peter" a word?
 * Is "London" a word?
 * Is "Microsoft" a word?
 * Is "Gondor" a word?
 * Is a putative word only used by a single family a word?
 * Is a putative word only used in print by a single person a word?
 * Is a putative word only used once by someone a word?
 * Is a morphologically plausible putative word, say one ending in -ness, that no one has ever used a word?
 * Is a non-conventionalized interjection-like utterance revealing pain or dismay a word?

Different sorts of words are recognized:
 * orthographic word
 * phonological word
 * morphological word
 * grammatical word
 * lexical word
 * syntactic word
 * morphosyntactic word
 * onomastic word
 * lexicographical word

Morphology stands in contrast to syntax: morphological words are morphologically composed while phrases and sentences are syntactically composed. Thus, "blueness" is morphologically composed from "blue" and "-ness" while "the cat that is on the mat" is syntactically composed from its words. However, the boundary between morphology and syntax is not sharp and is the subject of ongoing discussion and research. Is "White House", arguably a compound, morphologically composed or syntactically composed?

Some are skeptical about a cross-linguistic notion of "word".

Some argue that Chinese has no English-like words at all.

Clitics are special: the term tends to refer to items that appear word-like syntactically but morpheme-like phonologically. Udi is one language with clitics.

Polysynthetic languages such as Mohawk receive special treatment and investigation as regards wordhood.