{"id":508384,"date":"2010-04-02T17:14:14","date_gmt":"2010-04-02T21:14:14","guid":{"rendered":"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=2229"},"modified":"2010-04-02T17:14:14","modified_gmt":"2010-04-02T21:14:14","slug":"uh-accommodation","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/508384","title":{"rendered":"Uh accommodation?"},"content":{"rendered":"<p>In the course of an enjoyable conversation over lunch yesterday, <a href=\"http:\/\/www.michaelchorost.com\/\">Michael Chorost<\/a> asked whether disfluency is contagious, in the sense (for example) that talking with someone who uses &#8220;uh&#8221; a lot would tend to lead you to behave similarly.\u00a0 This seems plausible, since such effects can be found in most other variable aspects of speech and language use, so I promised to check &#8212; with a warning that causation is especially difficult to infer from correlation in such cases.<\/p>\n<p><span id=\"more-2229\"><\/span><\/p>\n<p>As data, I used the transcripts of the Fisher English conversational speech corpus, recorded in 2003 and published by the LDC in 2004 and 2005. There are 11,699 conversations in all, or 23,398 conversational sides. As transcribed, the average number of words per conversational side is 960.<\/p>\n<p>The median rate of <em>uh<\/em> usage was about 2.6 per thousand words. But 8,494 of the 23,398 conversational sides (36%) had no instances of <em>uh<\/em> at all, at least as transcribed; and some had a lot &#8212; one conversational side had 80 uhs in 592 total words, for a rate of 135 per thousand.<\/p>\n<p>The correlation between the rates of <em>uh<\/em> usage on the two sides of a conversation was r= 0.383.\u00a0 This is not especially high &#8212; it means that only about 15% of the variance in overall rate of <em>uh<\/em> usage is explained by knowing the interlocutor&#8217;s rate. 
Still, it&#8217;s certainly statistically significant.<\/p>\n<p>And some ways of looking at the relationship are more striking. Thus of the 4,235 conversations in which the A side never used <em>uh<\/em>, in 3,507 cases (83%) the B side also never used <em>uh<\/em>. In contrast, in the 7,464 conversations in which the A side used <em>uh<\/em> at least once, there were only 752 (10%) in which the B side failed to use <em>uh<\/em>.<\/p>\n<p>One graphical way to look at the relationship would be to compare the distribution of uh-usage rates for conversations where the partner&#8217;s uh-usage was greater than the median, to the distribution of rates in conversations where the partner&#8217;s uh-usage was less than the median. Here&#8217;s a &#8220;<a href=\"http:\/\/www.jstatsoft.org\/v28\/c01\/paper\">bean plot<\/a>&#8221; that does this:<\/p>\n<p><a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/UhRate2.png\"><img decoding=\"async\" title=\"Click to embiggen\" src=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/UhRate2.png\" alt=\"\" width=\"500\" \/><\/a><\/p>\n<p>I&#8217;ve plotted the square root of the rate of <em>uh<\/em> usage, just to spread out the distributions a bit. If we leave out the zero-rate cases, a comparison of the log of the <em>uh<\/em>-rate distributions is suggestive:<\/p>\n<p><a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/UhRate3.png\"><img decoding=\"async\" title=\"Click to embiggen\" src=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/UhRate3.png\" alt=\"\" width=\"500\" \/><\/a><\/p>\n<p>A number of trivial explanations for this pattern come to mind. 
For example, transcribers sometimes edit out disfluencies (though the transcribers were instructed not to do so in this case), and so it&#8217;s conceivable that this correlation reflects variation in the behavior of the transcribers rather than variation in the behavior of the speakers.<\/p>\n<p>Assuming that the correlation is really a fact about the behavior of speakers, &#8220;accommodation&#8221; of filled-pause usage is not the only possible explanation for the correlation. There might well be shared factors (complexity of the subject-matter, for example) that would influence both participants at once.<\/p>\n<p>Finally, we know that rates of <em>uh<\/em> usage vary with age and sex. There&#8217;s no guarantee that this set of conversations is balanced for pairings of ages and sexes &#8212; in fact, the distribution of <a href=\"http:\/\/www.ldc.upenn.edu\/Catalog\/docs\/LDC2004T19\/fe_03_topics.sgm\">topics<\/a> might well lead to sex or age correlations among speakers. So it would be helpful to use multiple regression to see if the interlocutor&#8217;s rate of <em>uh<\/em>-usage still had an effect, when sex and age were also included in the model.<\/p>\n<p>Still, I&#8217;ll take these results as tending to confirm Michael&#8217;s conjecture.<\/p>\n<p>(For the purposes of this little experiment, I didn&#8217;t check on uses of <em>um<\/em> or <em>ah<\/em>, and I should have included them as well.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the course of an enjoyable conversation over lunch yesterday, Michael Chorost asked whether disfluency is contagious, in the sense (for example) that talking with someone who uses &#8220;uh&#8221; a lot would tend to lead you to behave similarly.\u00a0 This seems plausible, since such effects can be found in most other variable aspects of speech 
[&hellip;]<\/p>\n","protected":false},"author":4144,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-508384","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/508384","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/4144"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=508384"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/508384\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=508384"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=508384"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=508384"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}