<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//TaxonX//DTD Taxonomic Treatment Publishing DTD v0 20100105//EN" "../../nlm/tax-treatment-NS0.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:tp="http://www.plazi.org/taxpub" article-type="research-article" dtd-version="3.0" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">109</journal-id>
      <journal-id journal-id-type="index">urn:lsid:arphahub.com:pub:3dc5f44e-8666-58db-bc76-a455210e8891</journal-id>
      <journal-title-group>
        <journal-title xml:lang="en">JUCS - Journal of Universal Computer Science</journal-title>
        <abbrev-journal-title xml:lang="en">jucs</abbrev-journal-title>
      </journal-title-group>
      <issn pub-type="ppub">0948-695X</issn>
      <issn pub-type="epub">0948-6968</issn>
      <publisher>
        <publisher-name>Journal of Universal Computer Science</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.3217/jucs-023-11-1038</article-id>
      <article-id pub-id-type="publisher-id">23684</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Research Article</subject>
        </subj-group>
        <subj-group subj-group-type="scientific_subject">
          <subject>E.0 - GENERAL</subject>
          <subject>H.3 - INFORMATION STORAGE AND RETRIEVAL</subject>
          <subject>J.5 - ARTS AND HUMANITIES</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Utilizing Multilingual Language Data in (Nearly) Real Time: The Case of the Nordic Tweet Stream</article-title>
      </title-group>
      <contrib-group content-type="authors">
        <contrib contrib-type="author" corresp="yes">
          <name name-style="western">
            <surname>Laitinen</surname>
            <given-names>Mikko</given-names>
          </name>
          <email xlink:type="simple">mikko.laitinen@uef.fi</email>
          <xref ref-type="aff" rid="A1">1</xref>
        </contrib>
        <contrib contrib-type="author" corresp="no">
          <name name-style="western">
            <surname>Lundberg</surname>
            <given-names>Jonas</given-names>
          </name>
          <xref ref-type="aff" rid="A2">2</xref>
        </contrib>
        <contrib contrib-type="author" corresp="no">
          <name name-style="western">
            <surname>Levin</surname>
            <given-names>Magnus</given-names>
          </name>
          <xref ref-type="aff" rid="A2">2</xref>
        </contrib>
        <contrib contrib-type="author" corresp="no">
          <name name-style="western">
            <surname>Lakaw</surname>
            <given-names>Alexander</given-names>
          </name>
          <xref ref-type="aff" rid="A2">2</xref>
        </contrib>
      </contrib-group>
      <aff id="A1">
        <label>1</label>
        <addr-line content-type="verbatim">University of Eastern Finland and Department of Languages, Linnaeus University, Växjö, Sweden</addr-line>
        <institution>University of Eastern Finland and Department of Languages, Linnaeus University</institution>
        <addr-line content-type="city">Växjö</addr-line>
        <country>Sweden</country>
      </aff>
      <aff id="A2">
        <label>2</label>
        <addr-line content-type="verbatim">Linnaeus University, Växjö, Sweden</addr-line>
        <institution>Linnaeus University</institution>
        <addr-line content-type="city">Växjö</addr-line>
        <country>Sweden</country>
      </aff>
      <author-notes>
        <fn fn-type="corresp">
          <p>Corresponding author: Mikko Laitinen (<email xlink:type="simple">mikko.laitinen@uef.fi</email>).</p>
        </fn>
      </author-notes>
      <pub-date pub-type="collection">
        <year>2017</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>28</day>
        <month>11</month>
        <year>2017</year>
      </pub-date>
      <volume>23</volume>
      <issue>11</issue>
      <fpage>1038</fpage>
      <lpage>1056</lpage>
      <uri content-type="arpha" xlink:href="http://openbiodiv.net/11153093-5962-5E45-B791-13BC2D098E42">11153093-5962-5E45-B791-13BC2D098E42</uri>
      <uri content-type="zenodo_dep_id" xlink:href="https://zenodo.org/record/5505781">5505781</uri>
      <history>
        <date date-type="received">
          <day>01</day>
          <month>09</month>
          <year>2016</year>
        </date>
        <date date-type="accepted">
          <day>26</day>
          <month>11</month>
          <year>2017</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>Mikko Laitinen, Jonas Lundberg, Magnus Levin, Alexander Lakaw</copyright-statement>
        <license license-type="creative-commons-attribution" xlink:href="" xlink:type="simple">
          <license-p>This article is freely available under the J.UCS Open Content License.</license-p>
        </license>
      </permissions>
      <abstract>
        <label>Abstract</label>
        <p>This paper presents the Nordic Tweet Stream, a cross-disciplinary digital humanities project that downloads Twitter messages from Denmark, Finland, Iceland, Norway and Sweden. The paper first introduces some of the technical aspects of creating a real-time monitor corpus that grows every day. Two case studies then illustrate how the corpus could be used as empirical evidence in studies focusing on the global spread of English. Our approach in the case studies is sociolinguistic: we are interested in how widespread multilingualism involving English is in the region, and in what happens to ongoing grammatical change in digital environments. The results are based on 6.6 million tweets collected during the first four months of data streaming. They show that English was the most frequently used language, accounting for almost a third of the tweets. This indicates that Nordic Twitter users choose English as a means of reaching wider audiences. The preference for English is strongest in Denmark and weakest in Finland. Tweeting mostly occurs late in the evening, and high-profile media events such as the Eurovision Song Contest produce considerable peaks in Twitter activity. The prevalent use of informal features such as univerbated verb forms (e.g., gotta for (HAVE) got to) supports previous findings of the speech-like nature of written Twitter data, but the results indicate that tweeters are pushing the limits even further.</p>
      </abstract>
    </article-meta>
  </front>
</article>
