You Have An Accent Even On Twitter

Think all regions tweet alike? They don't, according to Jacob Eisenstein, a researcher at Carnegie Mellon University. When it comes to the language of Twitter, there are regional dialects. For example, Southern Californians tweet "coo" for cool, while in Northern California, it's "koo."

Perhaps like us, you assumed that the language of tweeting, as in communicating by Twitter, whatever its shortcomings, was at least reasonably uniform no matter where its users are atwitter. Well, not so. Thanks to Jacob Eisenstein, we now know that while in Southern California something might be coo - that's cool without the L - a Northern Californian who's making the very same brief, trenchant observation is calling it koo, spelled with a K. And if you're tweeting in New York, you're doing suttin, and that doesn't mean that you're on Sutton Place.

Jacob Eisenstein is a postdoctoral researcher at Carnegie Mellon University. He and his team analyzed more than 380,000 tweets and discovered that even in 140 characters or less, there are regional dialects.

Jacob Eisenstein, welcome to the program.

Mr. JACOB EISENSTEIN (Postdoctoral Fellow, Carnegie Mellon University): Thanks.

SIEGEL: And first, those New Yorkers who were doing suttin, what does that mean?

Mr. EISENSTEIN: It just means they were doing something. It's a nice example of a phenomenon that we see in a couple of different places in Twitter.

You have a standard form, something, which is used throughout the U.S. You have more phoneticized forms that are spelled more how they might be pronounced, like sumthin, S-U-M-T-H-I-N. And then we have a very specific form to New York City, suttin, which is really almost never used outside of sort of the immediate area around New York City.

SIEGEL: What's another pretty good regionalism that you discovered?

Mr. EISENSTEIN: Well, one that we were expecting to find because we had some evidence from speech is a word called hella, which, you know, if you spent any time living in Northern California, people tend to associate with the Northern Californian spoken dialect.

SIEGEL: Hella?

Mr. EISENSTEIN: Hella, yeah, and it just means very: I'm hella tired. And indeed, that shows up on Twitter, too. So that was just some confirmation that we were finding things that did sort of derive from speech.

On the other hand, we found things that really seem unconnected to speech at all. The example you mentioned at the beginning, koo, which you could start with a C or with a K, it's really impossible to speak that difference, I think.

(Soundbite of laughter)

SIEGEL: You should hope.

Mr. EISENSTEIN: So this is something that I think is really unique to, maybe to social media or to written communication.

SIEGEL: Anything in here that truly surprised you about the differences that have developed so quickly on Twitter?

Mr. EISENSTEIN: Yeah, there seem to be regional differences that are almost completely unconnected from spoken language that are really unique to phenomena we find in written social media.

For example, there are a lot of abbreviations. Everybody, or a lot of people, know LOL, which is laugh out loud. And there are a lot of ways to say that things are funny on the Internet.

So LOL is a form that you'll see throughout the U.S. There are other forms that have the same meaning, that are much more regionally distinct. And unfortunately, most of these forms are things that you can't say on the radio, but again just sort of acronyms for things that are funny that you'll find, for example, just in Pennsylvania. There's another one that you'll find sort of centered on Washington, D.C., and just in the Mid-Atlantic area, but again things that would really never find their way into a spoken conversation.

SIEGEL: As you are applying computational research to linguistics, I mean, do you find that something different has happened here, that first email, then texting, then Twitter, with all of its improvised shorthand and creative misspellings, is in fact making written language more like spoken language?

Mr. EISENSTEIN: Yeah, I think that's a great observation, and that's, to me, exactly what's so exciting about studying Twitter. You know, these sorts of regional differences are things that we've known about for a long time in spoken language, and in fact, you know, it's the case that spoken English between different cities in the U.S., linguists believe it's more different now than it was 100 years ago.

But in written language, up until very recently, the only data that you would have available would be very formal written texts, things by journalists or laws, things like that. So there was really no way to identify that kind of regional variation until now.

And now, through social media, we have written communication that's being used in a very conversational, informal way, and so we're starting to see all the same richness and diversity that we see in spoken language, we're starting to see that in written language, too.

SIEGEL: Well, Jacob Eisenstein, thank you very much for talking with us about regional dialects in Twitter.

Mr. EISENSTEIN: Thank you.

SIEGEL: Jacob Eisenstein is a postdoctoral researcher at Carnegie Mellon University.

(Soundbite of song, "Rockin' Robin")

Unidentified Man: (Singing) to hear the robin go tweet tweet tweet. Rockin' Robin. Tweet, tweet, tweet. Rockin' Robin. Tweet tweetly deet.


You are listening to ALL THINGS CONSIDERED.

Copyright © 2011 NPR. All rights reserved. Visit our website terms of use and permissions pages for further information.

NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.