Appendix talk:English phrasal verbs

Phrasal verbs without prepositions?
I took another look at the phrasal verb category preamble and noted that you added adverbs to prepositions in an important bit of the wording. I thought that, though not all verb-preposition collocations were phrasal verbs, all phrasal verbs were verb-preposition collocations. I was having plenty of trouble distinguishing those v-prep collocations that were deemed phrasal from those that were not. Please help me understand how I could distinguish between a verb-adverb collocation that is a phrasal verb and one that is not. DCDuring TALK 02:29, 26 January 2008 (UTC)


 * Generally, the word(s) following the verb are called particles because it is almost impossible to say whether they are prepositions or adverbs. Most of the really idiomatic phrasals take adverbs rather than preps, but that is a generalisation. Example: look after has almost zilch to do with look. The word after in this phrasal is generally considered to be an adverb; nevertheless, it is called a "particle" to remove ambiguity from any discussion. Being an inclusionist, I try not to throw any possible phrasal verb out. Let's see why. Take fill up. There is almost no difference between it and fill. Could you fill (up) the tank, please? So why is it much more common to hear the phrasal form used? Because it is often clearer. One of the important points about a phrasal verb is its accuracy. Why do we say write down in preference to write? Write it down, please or Write it, please? The first implies on a piece of paper that you are not going to lose, whereas the second more or less requires that clause to be added, to be clear. Back to fill up. Fill her up or Fill her. Which is more accurate? The particles add a lot of meaning. Look at cut up. Cut the paper or cut up the paper? Depends if you want to make a single incision or if you want lots of small pieces. It also helps to think about the meaning encased in the particle. Up, for instance, often means completely, but it has many other meanings, too. Away carries the idea of separation and elimination in most examples, e.g. run away, throw away, etc. If you concentrate on the subtle differences (what does this verb mean without the particle, what does the particle add to the sense, etc.), you can determine more or less whether it is a real phrasal verb, although there will often be differences of opinion. I hope this helps a bit. -- Algrif 13:27, 26 January 2008 (UTC)


 * Thanks for the fairly clear explanation. So what kind of verb-particle collocation would an inclusionist exclude from the category of phrasal verbs? Any examples? DCDuring TALK 13:53, 26 January 2008 (UTC)


 * If it's a collocation then it is almost certain to be a phrasal verb as well. The problem is not the collocation itself, but the context. Consider go out. I'm going out tonight is a phrasal verb. I'm going out that door and I'm never coming back is literal. If you say them out loud, you will hear yourself link the "going" to the "out", making it sound like one word in the first example, but not in the second. It's a subjective test that usually works well for native speakers. Choose your examples carefully for the entries. -- Algrif 17:21, 26 January 2008 (UTC)

[moved from Algrif talk page]

template?
If "phrasal verbs" are going to continue (e.g. with the existing category "phrasal verbs"), then could we have a template such as that for "idiom" or "colloquial"? Currently one has to define the verb as a phrasal verb and then also manually include it in the "phrasal verbs" category. Facts707 18:22, 3 February 2010 (UTC)

separable and inseparable phrasal verbs
It would be worth adding something about inseparable and separable phrasal verbs to this appendix, and creating corresponding categories such as Category:English inseparable phrasal verbs. Also a decent phrasal verb template. --Rising Sun talk? contributions 22:04, 16 March 2010 (UTC)
 * Yep, I'm working on that, mate Oxlade2000 (talk) 20:43, 21 February 2021 (UTC)

Statistical methods with Phrasal Verbs
This is a copy from a recent discussion started on the Talk page of User DCDuring. Please feel free to join in.

Hi. Thinking about the problem of phrasal verbs, and the "impressive" counts obtained for non-related collocations (cf. drift apart / drift together), I find translate.google is a good place to go to get actual statistical results rather than relying on raw data (impressive counts are simply raw, unprocessed data). It is an area of interest to me, the way that translation machines work. Perhaps you already know about this yourself. However, I would like to make a point. They use various analog models, which are based on statistical probabilities starting from a huge set (several millions) of real sentences, from books, newspapers, blogs, etc. Very simply, the models are mainly developed from the statistical probability "counts" of one word being next to, or next but one to, or near, another word, combined with the similar probability that the POS of the one will be next to, or next but one to, the POS of the other. In our example, the probability of "drift" being next to or close to "apart". When you make a simple sentence using "drift apart", the translator examines the probabilities, and comes up with a translation as per the phrasal verb -- some form of "slowly separate" (e.g. in Spanish it translates to "alejarse", "separarse") as being the most likely meaning. If, OTOH, you enter a sentence with "drift together", even if you are trying to mean the opposite of "drift apart", the translator will give you a nautical definition for "drift" (in Spanish, "a la deriva"). That gives you a good insight into the statistical significance of the "impressive" counts in the raw data. It would seem that the translator, basing itself on real data, will give an idiomatic phrasal verb meaning to "drift apart", that "apart" deflects the meaning of "drift" (and vice versa), and that the translator model sees the collocation as a single verb unit. (Hover over the translation and you will see how the translator is seeing the words as a single verb unit.)
-- A LGRIF  talk 11:54, 15 March 2013 (UTC)
 * I prefer corpora that are less black-boxy than anything Google offers. I certainly couldn't take what they do on faith, let alone make specific inferences for our purposes from their inferences for theirs. I use COCA and BNC when I need corpora. They even offer some PoS tags (not wholly reliable). DCDuring TALK 12:02, 15 March 2013 (UTC)
 * It's a shame you think that raw data + intuition (that is to say, your own gut feel) is better than a systematic statistical approach to certain problems to do with collocations. It flies in the face of most accepted methods. I only mentioned Google as an easily accessed translator with clear results that demonstrate statistical parsing in practice. You can use any tool you like, if you don't like Google. I have in mind commonly used (by Google and by others too) processes such as the Viterbi algorithm applied to out-of-context parsing. (See Pedia entry for more info). The example I gave you above shows how statistical processing of huge amounts of real English sentences can throw up that the collocation "drift" given "apart" and "apart" given "drift" is statistically significant to the point of having a very specific meaning. I.e. it is a phrasal verb. Google simply puts pretty yellow highlighting as well, if you want. -- A LGRIF  talk 11:37, 16 March 2013 (UTC)
 * If you could make explicit any criteria whatsoever, then it might be possible to have rational discussions. Why not have this discussion in a public venue where more folks are likely to participate? Why don't you advance a proposal for something specific? I'm sure that lots of folks would like to get behind a proposal based on Google translate - because it would fit their intuition and theoretical prejudices.
 * In the meantime, I'm going to be trying to use my intuition to produce explicit criteria to identify the SoP spatial senses that superficially appear to be phrasal verbs. It also would be nice to explicitly define the various contributions that particles can make to non-compositional phrasal verbs. Possibly "aspect marker" is a label that suggests some possibilities. DCDuring TALK 13:25, 16 March 2013 (UTC)
 * It would certainly be better than trying to convince you to stop attempting to destroy perfectly good phrasal verb entries, simply because your gut tells you so, even tho you don't quite grasp or understand them, as you have previously stated. Nothing wrong in trying to learn, but please stop trying to destroy entries as part of the process. -- Discussion moved to Appendix talk:English phrasal verbs. -- A LGRIF  talk 10:04, 17 March 2013 (UTC)
 * I'd have been happy to learn from the master, but the master didn't seem to be interested. DCDuring TALK 10:55, 17 March 2013 (UTC)
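
As a rough illustration of the statistic discussed in this thread ("drift" given "apart" and "apart" given "drift"), the following minimal sketch counts adjacent word pairs and reports both conditional probabilities plus pointwise mutual information. The sentences in TOY_SENTENCES and the function name pair_stats are invented for illustration; this is not how any translation engine works internally, and real work would need a proper corpus such as COCA or BNC.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
TOY_SENTENCES = [
    "we drifted apart over the years",
    "the friends drifted apart slowly",
    "the boats drifted together in the bay",
    "they grew apart after school",
    "the logs drifted downstream",
    "we walked together to the door",
]


def pair_stats(sentences, first, second):
    """Return (P(second | first), P(first | second), PMI) for the adjacent pair."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    pair_count = bigrams[(first, second)]
    if pair_count == 0:
        # The pair never occurs, so there is no association to measure.
        return 0.0, 0.0, float("-inf")

    total_tokens = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    p_second_given_first = pair_count / unigrams[first]
    p_first_given_second = pair_count / unigrams[second]
    # PMI compares the observed pair frequency with what independence would predict.
    p_pair = pair_count / total_bigrams
    p_independent = (unigrams[first] / total_tokens) * (unigrams[second] / total_tokens)
    pmi = math.log2(p_pair / p_independent)
    return p_second_given_first, p_first_given_second, pmi


for particle in ("apart", "together"):
    print(particle, pair_stats(TOY_SENTENCES, "drifted", particle))
```

A score like this only measures how strongly two words co-occur. Whether a high score is enough to call a collocation a phrasal verb is exactly the point in dispute above, so the sketch makes the criterion explicit without settling it.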

 * Copied to this location on 17/03/13 -- A LGRIF  talk 09:57, 17 March 2013 (UTC)
 * The use of corpora (as being non-black-boxy) is very far removed from being considered as "criteria". A corpus is simply a large amount of raw data. The data is very useful for language analysis, but it is still raw. It must be processed to get anything more than "it exists, I've got three cites suitable for inclusion!". One set of analyses (there are others) that has been very successfully used in areas such as speech recognition and translation is Hidden Markov Models and the Viterbi algorithm. As mentioned above, the PoS and actual meaning of a word can be determined with high statistical probability from the surrounding words. One of the functions of these algorithms is that they can be trained to recognize PoS, given a large and varied enough corpus and a training set using real sentences, typically giving success rates of over 95% in real world conditions. Where does this lead us? Firstly, away from the "I've found x examples of w + v + z in my favorite corpus" and nearer to "How can we make use of readily available and running PoS analysis based on proven statistical methods applied to many and varied corpora?". As applied to the particular problem of phrasal verbs, I think it is worth noting that readily available translator engines (such as, but not limited to, Google translate) provide results which are meaningful. If the translator picks up that (using the above example again) drift apart translates as a functioning phrasal verb, but does not pick up "drift together" in the same way, we can be very sure that "drift apart" is actually being used as a phrasal verb in the bulk of actual samples in actual corpora. This seems to be a much better result than "my gut tells me it is SoP". We are supposed to be reflecting actual usage (descriptive, not prescriptive). If statistical results are strongly indicative of "phrasal-ness" (descriptive), then to insist otherwise based on an anecdotal sentence pulled out of thin air is surely being prescriptive.
-- A LGRIF  talk 10:43, 17 March 2013 (UTC)
 * I oppose any attempt to use Google Translate as a tool for Wiktionary. Often I’ve seen it translate phrases in a way that the meaning comes out the opposite.
 * As a quick test, I ran the first paragraph of, and it identified the following as units: referred to as, is the, for the, in the eighth century, that aimed, to recover the, lost to, the invading, Arabs during, the Muslim invasion, of the Iberian peninsula. This is some pretty shoddy unit splitting: for example, “… [lost to] [the invading] [Arabs during] [the Muslim invasion] …” would’ve been better split as “… [lost to] [the invading Arabs] [during the Muslim invasion] …”
 * [Arabs during] is particularly bad as it ignores a noun phrase/prepositional phrase boundary. — Ungoliant (Falai) 12:55, 17 March 2013 (UTC)
 * Thanks for the input, but it is a bit off the mark. We are talking about use of English, not translation of another language. And we are talking about identifying an underlying principle, not about how well Google or any other machine actually translates a paragraph. So the idea is to put in some clear and simple example test sentences to see how they are parsed out. Forget long, convoluted sentences like your example. That is of no help whatsoever. All translators would get that in a twist. -- A LGRIF  talk 14:07, 17 March 2013 (UTC)
 * To clarify: what I'm suggesting is to enter some phrases such as "Over the years we simply drifted apart as we grew older." and see the translation (which is fine, because the sentence is simple). In Portuguese: "Ao longo dos anos nós simplesmente se afastaram como nós crescemos mais velhos." This in turn demonstrates the statistical result that "drifted" given "apart" and "apart" given "drifted" are normally and most commonly considered to be a single verb unit meaning (in this case) "se afastaram". I would say this is evidence of the phrasal verb-ness of "drift apart". This system is better than using a basic parsing tool which would simply tag the individual words whatever you put in. -- A LGRIF  talk 14:31, 17 March 2013 (UTC)
 * No, that translation is wrong on many counts. I tried even simpler sentences now: “I become older.” “I become blue.” and “I go outside.” Only the last one was translated correctly, and it identified the units [become older], [become blue] and [I go outside], which are doubtlessly not phrasal verbs, and neither their Portuguese equivalents. — Ungoliant (Falai) 15:03, 17 March 2013 (UTC)


 * I agree with DCDuring and Ungoliant. Google Translate is a poor resource. Its grouping of words is, as Ungoliant shows, no indication whatsoever of whether they are phrasal verbs and/or set phrases. - -sche (discuss) 20:07, 29 May 2013 (UTC)
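
For readers unfamiliar with the Hidden Markov Model and Viterbi approach to PoS tagging mentioned in this thread, here is a minimal sketch of Viterbi decoding over a toy model. Every probability in START_P, TRANS_P and EMIT_P is hand-picked for illustration and is an assumption, not a value trained on any corpus; the tag names (PRT for particle, ADP for preposition) are likewise just labels chosen for the example. It shows how the same word "out" can receive a different tag depending on what follows it.

```python
import math

FLOOR = 1e-12  # stand-in probability for events the toy model has never seen

# Hypothetical hand-picked probabilities, for illustration only.
STATES = ["PRON", "VERB", "PRT", "ADP", "DET", "NOUN"]
START_P = {"PRON": 1.0}
TRANS_P = {
    "PRON": {"VERB": 1.0},
    "VERB": {"PRT": 0.5, "ADP": 0.5},
    "PRT": {"DET": 0.1},
    "ADP": {"DET": 0.9},
    "DET": {"NOUN": 1.0},
}
EMIT_P = {
    "PRON": {"we": 1.0},
    "VERB": {"went": 1.0},
    "PRT": {"out": 0.6},
    "ADP": {"out": 0.4},
    "DET": {"the": 1.0},
    "NOUN": {"door": 1.0},
}


def log_prob(p):
    """Log of a probability, floored so unseen events do not produce log(0)."""
    return math.log(max(p, FLOOR))


def viterbi(words, states, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for a non-empty list of words."""
    # best[i][s] = (log-probability of the best path ending in state s at word i,
    #               the previous state on that path)
    best = [{}]
    for s in states:
        score = log_prob(start_p.get(s, 0.0)) + log_prob(emit_p.get(s, {}).get(words[0], 0.0))
        best[0][s] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for s in states:
            emit = log_prob(emit_p.get(s, {}).get(words[i], 0.0))
            prev, score = max(
                ((p, best[i - 1][p][0] + log_prob(trans_p.get(p, {}).get(s, 0.0))) for p in states),
                key=lambda item: item[1],
            )
            best[i][s] = (score + emit, prev)

    # Backtrack from the best final state to recover the whole path.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    path.reverse()
    return path


print(viterbi("we went out".split(), STATES, START_P, TRANS_P, EMIT_P))
print(viterbi("we went out the door".split(), STATES, START_P, TRANS_P, EMIT_P))
```

With these invented numbers, "out" is tagged as a particle when it ends the sentence and as a preposition when a determiner follows. A tagger of this kind assigns a PoS to each word; it does not by itself decide whether a verb-particle pair is idiomatic.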