User:Pengo/common epithets/about

Background:
 * What's an epithet? Aloe vera is the binomial (scientific) name of a plant. The second word, "vera", is the epithet. If you want to be precise, you might call it a specific epithet, and maybe you switch to saying "specific name" when dealing with animal species. For simplicity, I am calling that second word the "epithet" regardless.
 * The epithet is usually a normal Latin word, such as an adjective (e.g. aureus, golden), or a noun (tigris, tiger), and often describes something about the species. Sometimes it's a weird New Latin term that's been invented for the purpose of naming a species.
 * Despite their widespread use, there is no single source for looking up the meaning of epithets. They are often poorly documented, and their etymology can be difficult to find.
 * Wiktionary is becoming one of the best reference sources for finding these meanings.

How the list works:
 * This is a list of epithets, grouped together where they appear to share the same stem (e.g. alba, album, albus). Irregular forms will likely miss out on being grouped (e.g. bovis, bos) unless someone has edited the list.
 * The group of epithets that appears in the most publications is ranked first, so its line is at the top of the list. Within each group (one line per group), the epithet that contributed most to the group's ranking is listed first. See the points section for more details.
 * The species (binomial name) which contributed most is listed first in the examples.
 * A number like "(2)" means that item hasn't gotten many "importance points", so you may wish to give it less weight if you're using this list for something like editing Wiktionary or for learning the most common specific epithets. (more about points below)
 * ‡ = synonym or obsolete. The item is obsolete, only appears in synonyms, or something odd happened and it wasn't found in the database. This is usually accurate but not always, so please use as a guide only.
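
A rough sketch of the grouping and ranking described above, in Python. This is not the script that built the list: the ending-stripping rule, the function names and the scores are all assumptions made for illustration, and the group score is assumed to be the sum of its members' points.

```python
from collections import defaultdict

# Assumed endings for a naive stemmer; the real grouping rules are not documented here.
ENDINGS = ("us", "um", "is", "a", "e")


def naive_stem(epithet):
    """Strip one common ending, leaving at least two letters of stem."""
    for ending in ENDINGS:
        if epithet.endswith(ending) and len(epithet) > len(ending) + 1:
            return epithet[: -len(ending)]
    return epithet


def rank_groups(points):
    """Group epithets by naive stem, then rank groups and their members by points."""
    groups = defaultdict(list)
    for epithet, score in points.items():
        groups[naive_stem(epithet)].append((epithet, score))

    ranked = []
    for members in groups.values():
        # Highest-scoring epithet first within the group; ties broken alphabetically.
        members.sort(key=lambda m: (-m[1], m[0]))
        ranked.append(members)

    # Highest total score first across groups (assumed: group score = sum of members).
    ranked.sort(key=lambda g: (-sum(score for _, score in g), g[0][0]))
    return [[epithet for epithet, _ in g] for g in ranked]


# Made-up scores, for illustration only.
example = {"alba": 500, "albus": 900, "album": 300,
           "tigris": 1200, "bovis": 50, "bos": 20}

for line in rank_groups(example):
    print(", ".join(line))
```

With these made-up scores the alba/album/albus group outranks tigris because its members are added together, and bovis and bos end up on separate lines because the naive stemmer cannot tell that they belong together.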

Who's this for?
 * Biology students & teachers — These are the most common epithets, so if you're going to learn some Latin for biology, you might want to learn some of the most common ones first. There's also a plant list for botany enthusiasts.
 * Wiktionary editors — check that the most commonly found epithets have complete entries. Create new entries for missing epithets.
 * This list has been created as a guide to help editing and learning. It is not authoritative in any way.

How does Latin work for binomial names?
 * By definition, specific epithets can only be nominative singular/plural or genitive singular/plural nouns or adjectives. The adjectives agree in gender and number with the generic name, but I've never heard of a plural generic name, so the adjectives should all be singular. The first and second declensions are easy, but the third depends on the ending of the stem and how it interacts with the inflectional endings (of which there are variants), which is not that easy to code for. I've never worked with the 4th and 5th declensions, but there are very few such epithets, so you can ignore them. See Latin declension. (Copied from a brief explanation by Chuck Entz in the Tea Room.)
 * Plant species: the genus followed by a single specific epithet in the form of an adjective, a noun in the genitive, or a word in apposition, or several words, but not a phrase name of one or more descriptive nouns and associated adjectives in the ablative (...), nor any of certain other irregularly formed designations (...). If an epithet consists of two or more words, these are to be united or hyphenated. IAPT: Names of species
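
To illustrate the "easy" case mentioned above, here is a minimal Python sketch of gender agreement for first/second declension adjectives of the -us/-a/-um type in the nominative singular. The function name and the gender codes are assumptions for illustration; adjectives in -er, the third declension, and nouns are deliberately left out.

```python
# Nominative singular endings for -us/-a/-um adjectives.
# The keys "m", "f", "n" are an assumed encoding of the genus's gender.
NOMINATIVE_SINGULAR = {"m": "us", "f": "a", "n": "um"}


def agree(stem, gender):
    """Return the adjective form that agrees with a genus of the given gender."""
    return stem + NOMINATIVE_SINGULAR[gender]


for gender in ("m", "f", "n"):
    print(gender, agree("alb", gender))
```

This gives albus, alba and album for the stem alb-, which is why those three forms can be treated as one group.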

Points (details if you want to nerd out on how this was made):
 * A point is given to an epithet for each binomial name in the Catalogue of Life that it appears in (including synonyms). It is given an additional point for each publication (published after 1950) that each of those binomial names appears in. This is where most of the points come from. If a name appears in fewer than 40 publications, it might not register at all and just get 0 due to how ngrams data works. The points are only shown if they're 99 or less, as a sign that the item has been included for completeness but probably isn't a major factor in that group of epithets being ranked highly. Higher numbers aren't shown because they just clutter things up. Each genus is assigned points too, but this is just for show, as they don't count towards the popularity of the group of epithets. Sometimes species you'd expect to be rated higher still get low points, e.g. the Sooty Swift, Cypseloides fumigatus, got only (1) point, while less well known but heavily researched species get very high points.
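
The scoring rule above can be sketched in a few lines of Python. The function names and the shape of the input data are assumptions, not the original script; a name missing from the publication counts stands in for one that did not register in the ngram data.

```python
def epithet_points(binomials, publication_counts):
    """One point per binomial name containing the epithet, plus one per publication.

    binomials: binomial names (including synonyms) that use the epithet.
    publication_counts: assumed mapping of binomial name -> number of post-1950
    publications it appears in; names that did not register are simply absent.
    """
    total = 0
    for name in binomials:
        total += 1 + publication_counts.get(name, 0)
    return total


def display_points(points):
    """Show the number only when it is 99 or less, as the list does."""
    return f"({points})" if points <= 99 else ""


# The Sooty Swift did not register in the publication data, so it scores 1.
sooty = epithet_points(["Cypseloides fumigatus"], {})
print(sooty, display_points(sooty))
```

A name found in the Catalogue of Life but absent from the publication data still earns its single catalogue point, which matches the (1) shown for the Sooty Swift.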

Tea room discussions: (including to-do tasks/ideas)
 * January 2015
 * February 2015

Data sources:
 * Catalogue of Life
 * Google ngram English (all) corpus (2grams)

Lists:
 * Common epithets — the most commonly found specific names in books. Similar ones grouped together. Ranked by popularity in books. Binomial examples given. Page: 1, 2.
 * Missing epithets — the ones that don't have Latin/Translingual entries in Wiktionary yet
 * Common plant epithets
 * Binomials found in books
 * obsolete genus list — for reference