User talk:Długosz

Hi, Dlugosz, welcome to Wiktionary. I answered some of your questions on the Talk:spirit, Talk:sol and Talk:Latino pages. If you have other problems or questions about how Wiktionary works, you can also ask in the beer_parlor. Ortonmc 23:41, 11 Mar 2004 (UTC)

Hi, Długosz. Can you justify that your latest additions are not copyright violations?

Here is an example:

alarm vs http://www.thesaurus-dictionary.com/files/a/l/a/alarm.html

Delta G 02:27, 30 Mar 2004 (UTC)


 * The "Webster 1913" material is public domain, and the intro material for this forum mentions several times that it may be used. That site apparently used the same source for that entry.  I'm sure a lot of free dictionaries do.  But I only copy from the original material, not from a site like dictionary.com that mixes results from many sources, or the one you mention, which does not cite a source at all.


 * I suggest you systematically specify the source in the comment whenever you do this. I think it is useful information. Delta G 18:31, 30 Mar 2004 (UTC)


 * OK, I'll add "material copied from Webster's Revised Unabridged Dictionary (1913)" in a comment to each page. Długosz


 * Thanks for the question. In the very earliest days of Wiktionary somebody used a bot to create pages for Wiktionary from the 1913 Webster, and created some 700 pages of material.  I gradually worked to adapt those, but that was a tedious process.  I was almost finished when I became distracted and went on to do other things, leaving maybe 30 pages undone.


 * What you describe doesn't really seem like a bot, since the automated part of the work is all done on the clipboard in your computer. The other thing that I would do would be to take whole pages from the 1913 Webster (70 words more or less), and put them into MS Word where I would use the search and replace functions to make adaptations (being careful not to put in quotes and apostrophes because of the idiosyncratic way that MS has of dealing with these).  I would then copy the whole page into Wiktionary, with the intent of making further adaptations as these pages were being split up.  This also served as a way of keeping track of what has been done and what remains to be done.


 * I was using the University of Chicago ARTFL version of the Webster. This does not imply that that version is any better or worse than any other.  (They have quite a few places where the pagination got screwed up.)  Using one version consistently just made tracking easier.  Many of these pages that I set up should still be there waiting for someone to carry on with the work.  From your perspective, would having whole pages make your work easier?  I could get back to rolling over these pages with only the minimal change needed to isolate the words, and if your technique works better from there, then so much the better.


 * I view the 1913 Webster material as nothing more than a convenient starting point for establishing a baseline. I do adapt as I go along, and make an effort to identify the place where every Shakespeare quote appears in one of his plays.  I thought of also tracking down the quotes of other authors, but after very few words soon realized that this was just too much work for one person.  Whatever modifications we make to the Webster corpus will be what makes Wiktionary stand above the others.  Far too many of those that are there now are based on mindless OCR scans.  Most of our value added needs to wait until we have a solid base to work from.  Eclecticology 08:58, 15 Apr 2004 (UTC)

It is good to have more than one source in order to resolve differences in formatting and presentations. My primary one has been the ARTFL version at: http://humanities.uchicago.edu/orgs/ARTFL/forms_unrest/webster.form.html The one that I have been using as a backup has been the Bootleg version at http://www.bootlegbooks.com/Reference/Webster/Default.htm  Both will allow whole page views. On ARTFL this is a simple matter of calling up the numerical links on the page for a word. None of the online versions handle the legends well. Unless you have an original hard-copy version you pretty well have to guess at these, but that gets easier as you develop the experience.

Sticking to ASCII is very limiting. The ISO 8859-1 characters can be used by simply typing "Alt" + the relevant number. Thus Alt + 0233 will give é. Please note that the number is typed with the leading zero. Most of the other codes that you will need fall within the Latin Extended-A range of Unicode. These can be produced by using the relevant numerical codes with "&" + "#" + number + ";". The number 257 produces &#257;. This and the other vowels with macrons are the most common ones found in Webster.
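
As a rough illustration of the "&" + "#" + number + ";" notation described above, the following Python sketch converts characters to decimal numeric references and back. The helper names `to_ncr` and `from_ncr` are made up for this example; only the standard library's `html.unescape` and `ord` are relied on.

```python
import html


def to_ncr(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"


def from_ncr(ref: str) -> str:
    """Decode a reference such as '&#257;' back into the character."""
    return html.unescape(ref)


if __name__ == "__main__":
    # The vowels with macrons, which sit in the Latin Extended-A block.
    for ch in "āēīōū":
        print(ch, to_ncr(ch))
    # An ISO 8859-1 character: the reference number matches the Alt code 0233.
    print(from_ncr("&#233;"))
```

The same functions work for any other character, so they can be used to look up the number for a symbol copied from another source.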

Please don't remove the Shakespeare quotes, or even the ones from Chaucer. They are a part of the historical record of the language. Adding new quotes is to be encouraged, but they should not "replace" the old ones. Just how much you need to quote will remain an open question. The extracts from Conan Doyle are a great idea, but some are already a century old. For more recent material we need to pay attention to copyright and fair use issues. Eclecticology 18:19, 15 Apr 2004 (UTC)


 * For a small dotted a, hex code 0227 (decimal 551) should have worked, but I guess it's just missing from my machine, so I used the alternative of "a" plus hex 307 = decimal 775: a&#x0307;
 * For the other, it's hex code 02A4 or decimal 676: &#676;
 * The 775 allows a dot to be put on any letter, as in c&#775; or p&#775;. The Unicode hex range 0300 to 0362 has a number of interesting combining diacritical marks that can be added to any letter: z&#831;
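
The hex/decimal arithmetic and the "letter plus combining mark" approach in the bullets above can be checked with a short Python sketch. The helper names `add_dot_above` and `compose` are hypothetical; the sketch assumes only the standard `unicodedata` module.

```python
import unicodedata

# U+0307 COMBINING DOT ABOVE: hex 307 is decimal 775.
COMBINING_DOT_ABOVE = "\u0307"


def add_dot_above(letter: str) -> str:
    """Append the combining dot to any base letter."""
    return letter + COMBINING_DOT_ABOVE


def compose(text: str) -> str:
    """Fold base letter + combining mark into one precomposed character
    where Unicode defines one (NFC normalization)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    # Hex to decimal for the codes mentioned in the discussion.
    for hex_code in ("0227", "0307", "02A4"):
        print(hex_code, "=", int(hex_code, 16))

    dotted_a = add_dot_above("a")
    # Two code points before composition, one after.
    print(len(dotted_a), len(compose(dotted_a)))
    print(unicodedata.name(compose(dotted_a)))
```

Whether the composed or the decomposed form displays correctly still depends on the fonts installed on the reader's machine, which is the problem described above.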


 * Doyle is British. Perhaps a literary American writer from the same period would be a good idea.  (Mark Twain could be interesting.)  I'll need to give it some thought.  There are other possibilities.  Illustrating dictionary entries with quotes from contemporary authors is certainly a fair use, though a full concordance might not be acceptable. Eclecticology 21:16, 15 Apr 2004 (UTC)

Since we have succeeded in misunderstanding each other, perhaps you could give me a specific citation where this ".a" appears. I didn't find it right away in the ARTFL page that I looked at.

As for U+0227, the book that I have is for Unicode 3.0, and I can see now that 30 characters were added to the block at that time. Eclecticology 00:08, 16 Apr 2004 (UTC)

Image:L-words.png
This is being discussed at Requests_for_deletion/Others.--Jusjih 15:18, 18 August 2007 (UTC)