Wiktionary:Frequency lists/English/Project Gutenberg

These lists give the most frequent words found by a simple, straightforward frequency count across all the books on Project Gutenberg. Most are English words, with other languages represented to a lesser extent. Project Gutenberg books are typically digitized once their copyright has expired, which usually means editions published before 1923, so the language does not necessarily reflect current usage. For example, "thy" is listed as the 280th most common word. Also, the Project Gutenberg boilerplate notice appears in each of the 24,000+ books, so its text contributes to the counts.
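As an illustration of what a simple frequency count looks like, here is a minimal Python sketch. The page does not describe the script actually used, so everything here is an assumption: the tokenization rule (runs of letters with optional internal apostrophes), the case folding, and the function names `count_words` and `ranked` are all hypothetical.

```python
import re
from collections import Counter

# Assumed tokenization: runs of ASCII letters, allowing internal apostrophes
# (e.g. "cat's"). The original count may have split words differently.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def count_words(texts):
    """Return a Counter of word frequencies across an iterable of texts."""
    counts = Counter()
    for text in texts:
        # Assumption: fold case so "The" and "the" are counted together.
        counts.update(word.lower() for word in WORD_RE.findall(text))
    return counts


def ranked(counts):
    """Return (rank, word, count) tuples, most frequent first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]


sample = ["The cat and the dog.", "Thy dog and the cat's toy."]
for rank, word, n in ranked(count_words(sample))[:3]:
    print(rank, word, n)
```

A count like this makes no attempt to strip repeated boilerplate or to separate languages, which is why such text shows up in the totals.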

Here are the top 100 words from Project Gutenberg texts in alphabetical order:

a

about

after

all

an

and

any

are

as

at

be

been

before

but

by

can

could

did

do

down

first

for

from

good

great

had

has

have

he

her

him

his

I

if

in

into

is

it

its

know

like

little

made

man

may

me

men

more

Mr

much

must

my

no

not

now

of

on

one

only

or

other

our

out

over

said

see

she

should

so

some

such

than

that

the

their

them

then

there

these

they

this

time

to

two

up

upon

us

very

was

we

were

what

when

which

who

will

with

would

you

your

These wikified terms are intended to be copied to other-language Wiktionaries. If you copy them, please add an interwiki link to this page.

16 April 2006

1-10000

10001-20000

20001-30000

30001-40000

10 October 2005
1-10000

The list divided into blocks of one thousand words:

1-1000

1001-2000

2001-3000

3001-4000

4001-5000

5001-6000

6001-7000

7001-8000

8001-9000

9001-10000
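The rank ranges used for these sub-lists follow a regular pattern, which can be sketched in Python as follows. The helper name `rank_ranges` is hypothetical and is not part of any Wiktionary tooling; it only reproduces the labels shown on this page.

```python
def rank_ranges(total, size=1000):
    """Return 'start-end' labels covering ranks 1..total in blocks of `size`."""
    return [
        f"{start}-{min(start + size - 1, total)}"
        for start in range(1, total + 1, size)
    ]


# Labels for a 10,000-word list split by thousand.
print(rank_ranges(10000))

# Labels for a 40,000-word list split by ten thousand.
print(rank_ranges(40000, size=10000))
```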

16 August 2005

1-10000

10001-20000

20001-30000

30001-40000

40001-50000

50001-60000

60001-70000

70001-80000

80001-90000

90001-100000

Approximately 24,197 files containing 1,712,082,956 words (an average of 70,756.0 words per file), from which about 9,053,310 unique "words" were gleaned.
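The average quoted above can be checked from the two totals. This short Python sketch uses only the figures stated on this page; the variable names are illustrative.

```python
files = 24_197
total_words = 1_712_082_956
unique_words = 9_053_310

# Mean number of words per file, rounded to one decimal place.
average_per_file = round(total_words / files, 1)
print(average_per_file)

# Share of all word occurrences that are distinct "words" (type/token ratio).
print(unique_words / total_words)
```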