Language in Society

  • ISSN: 0047-4045 (Print) , 1469-8013 (Online)
  • Editors: Professor Susan Ehrlich York University, Canada , and Professor Tommaso Milani The Pennsylvania State University, USA

Recently published articles

(Im)precise personae: The effect of socio-indexical information on semantic interpretation

  • Andrea Beltrama , Florian Schwarz
  • Language in Society , First View

Doing being an average teenager: Deploying ordinariness as subversive disability performance in presentational media

  • Xiaowei May Li

The liminal (vowel) space of womanhood: Fundamental frequency, formants, and the intersex body in Brazil

  • Ashlee Dauphinais Civitello

Linguistic hostility, social exclusion, and the agency of African migrants in Hong Kong

  • Jiapei Gu , Janet Ho

Settle for Biden: The scalar production of a normative presidential candidate on Instagram

  • Katherine Arnold-Murray

Why language revitalization fails: Revivalist vs. traditional ontologies of language in Provence

  • James Costa

Who's being elitist? A debate about the enregisterment of Singlish

Alternative spaces of encounter: Characterological metadiscourses and ‘joint voice’ in Finnish multi-ethnic inclusive theater

  • Tomi Visakko

Call for Proposals for Special Issue of Language in Society (2026)

Language in Society publishes a special issue of the journal once a year. As editors, we aim to make the selection process for these issues relatively transparent by issuing an annual call for proposals with a set deadline, thereby avoiding a ‘first-come, first-served’ system. The proposals are reviewed by the editors and relevant members of the editorial board (i.e., editorial board members with expertise in the subject matter of the proposed special issue). We seek collections of articles that make a significant contribution to the advancement of the study of language in society by pushing current debates forward in innovative ways or by taking discussions in new directions. We are looking for collections of papers that connect in meaningful ways with each other and that cohere to form something more than the sum of the collection’s parts.

The deadline for proposals for the special issue for 2026 is August 31st, 2024. Proposals should describe the contribution of the collection as a whole in approximately 1000 words and should also include 150-word abstracts from authors of the individual papers. Proposals should be e-mailed to [email protected] by the deadline. Please address any questions to the editors at this same email address.

Language style as audience design *

  • Language in Society , Volume 13 , Issue 2

  • Review Article
  • Open access
  • Published: 08 March 2023

Changing perceptions of language in sociolinguistics

  • Jiayu Wang
  • Guangyu Jin
  • Wenhua Li

Humanities and Social Sciences Communications, volume 10, Article number: 91 (2023)

  • Language and linguistics

This paper traces the changing perceptions of language in sociolinguistics. These perceptions of language are reviewed in terms of language in its verbal forms, and language in vis-à-vis as a multimodal construct. In reviewing these changing perceptions, this paper examines different concepts or approaches in sociolinguistics. By reviewing these trends of thought and their applications, this article intends to shed light on ontological issues such as what constitutes language, and where its place is in multimodal practices in sociolinguistics. Expanding the ontology of language from verbal resources toward various multimodal constructs has enabled sociolinguists to pursue meaning-making, indexicalities and social variations in their most authentic state. Language in a multimodal construct entails the boundaries and distinctions between various modes, while language as a multimodal construct sees language itself as multimodal; it focuses on the social constructs, social meaning and language as a force in social change rather than the combination or orchestration of various modes in communication. Language as a multimodal construct has become the dominant trend in contemporary sociolinguistic studies.


Introduction

This article will review a range of sociolinguistic concepts and their applications in multimodal studies, in relation to how language has been conceptualized in sociolinguistics. While there are reviews of specific areas of research in sociolinguistics, including prosody and sociolinguistic variation (Holliday, 2021), language and masculinities (Lawson, 2020), and language change across the lifespan (Sankoff, 2018), few review works have set out to delineate the most fundamental ontological questions in sociolinguistic studies; that is, what is language, and what constitutes it? How do sociolinguists perceive language in relation to other semiotic resources that are part and parcel of social meaning-making and social interaction? Relevant discussions are scattered in passing, mainly in the introductory sections of various sociolinguistic works, such as Blommaert (1999), García and Li (2014) and Makoni and Pennycook (2005). However, there have not been review articles systematically dealing with the changing perceptions of language in sociolinguistic studies.

These issues are worth pursuing because, although sociolinguistics studies language, no review has addressed what actually constitutes language, especially in relation to a wider range of semiotic resources. What makes the review even more imperative is that in an increasingly globalized and high-tech world, linguistic practices are complicated by the super-diversity of ethnic fluidity, communications technologies, and globalized cross-cultural art.

Centring on the ontological perception of language in sociolinguistics, this article consists of five sections. After the “Introduction” section, the next section will review traditional (socio)linguistic perceptions of language as written or spoken signs or symbols that people use to communicate or interact with each other. The section after that will review representative sociolinguistic approaches that place language in multimodal settings which involve the relationship between language and other semiotic resources. They are categorized as the conceptualizations of “language in multimodal construct” and “language as multimodal construct”. These conceptualizations share the common feature that language is not researched merely in terms of written and spoken signs and symbols, but is probed (1) in relation to its multimodal contexts and (re)contextualization (regarding language in multimodal construct), and (2) in terms of its own materiality and spatiality, and linguistic representations of multimodality, for instance, social (inter)action and “smellscapes” (Pennycook and Otsuji, 2015a), which are in turn conflated with linguistic features (regarding language as multimodal construct). The penultimate section and the last section will present a critical reflection and a conclusion of the review, respectively.

Language as written and spoken signs and symbols

What constitutes language(s)? Saussure (1916) distinguishes between langue and parole. The former refers to the abstract, systematic rules and conventions of the signifying system, while the latter represents language in daily use. Chomsky (1965) refers to them as competence (corresponding to langue) and performance (corresponding to parole). Chomsky (1965) assumes that performance is bound up with “grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of this language in actual performance” (Chomsky, 1965, pp. 3–4). He advocates that the agenda of linguistics should be the study of competence of “an ideal speaker-listener, in a completely homogeneous speech-community, who knows its (the speech community’s) language perfectly” (in brackets original). His conception of the ideal language rules out the “imperfections” arising from the influences of social or pragmatic dimensions in real language use. This can be seen as the conception of language as innate human competence. By contrast, constructionists have argued that language cannot be separated from the societal and social domain; social reality is constructed through languages (Berger and Luckmann, 1966), and linguistics should take social dimensions into account, as shown by Systemic Functional Linguistics developed by Halliday. These approaches to language studies, nevertheless, do not pay much attention to the ontological issues of language or linguistics concerning what constitutes language, whether languages can be separated from each other, and whether there are different conceptions of language(s).

Sociolinguistics, taking as its departure an interdisciplinary attempt to be the sociology regarding linguistic issues or linguistics regarding sociological issues, faces the ambivalent positioning of whether it should be sociologically oriented (that is, more explanatory) or linguistically oriented (that is, more descriptive) (Cameron, 1990). Also, there are contentions regarding whether more attention should be paid to epistemically linguistic minutiae (as in conversation analysis or CA), or to the macro-social interpretation of ideology not necessarily dependent on the evident orientation of the participants (as in critical discourse analysis, or CDA), as debated in Blommaert (2005) and Schegloff (1992, 1998a, 1998b, 1999). As such, more sociolinguists than linguists in other disciplines are concerned with the ontology of language regarding its nature and its relation with broader social structures. In other words, such concerns can, firstly, justify the identity of sociolinguistics being either a branch of sociology, or linguistics, or even more broadly, anthropology. They can also delineate the contour of the macro vis-à-vis micro research subjects: are languages seen as separate systems, or inseparable but relatively fixed systems or an integrated construction in relation to their social dimensions of power, ideology and hegemony?

Such ontological concerns are important, because different approaches to research may be engendered accordingly. For instance, variational sociolinguistics is concerned with the linguistic differences within a language (standard language vis-à-vis its variations in dialects) and examines how these differences are linked to social aspects of linguistic practices, such as gender and social status. These differences within a certain category of language may be placed in the changing situations of various language communities or areas (e.g., Labov, 1963, 1966), or in contextualized pragmatic situations (Agha, 2003; Eckert, 2008). Assumptions of separable or separate languages may be well-encapsulated in the works regarding language ideology and linguistic differentiation, such as the studies by Kroskrity (1998), Irvine and Gal (2000), as well as considerable other works on bilingualism or multilingualism. These works treat language as belonging to different standard systems (e.g., English, French, German, and so on) and can be pursued by “enumerating” these categories. In other words, these standard language systems are seen as having clear boundaries between them, and language can be researched by attributing different linguistic resources to (one of) these systems. The stance of the inseparability of language problematizes the enumeration of languages, by discrediting their explanatory potential in linguistic practices. In pedagogical contexts, transnational students are found using language features beyond the boundaries of language systems (Creese and Blackledge, 2010; Lewis et al., 2012). In the context of youth or urban culture, there are loosely fixed assumptions between language and ethnicity (Maher, 2005; Woolard, 1999). In some globalized contexts, new communications technologies as well as globalization itself are changing the traditional power structure in linguistic practices (Jacquemet, 2005; Jørgensen, 2008; Jørgensen et al., 2011). Furthermore, Makoni and Pennycook (2005), by advocating the disinvention of languages, problematize the process of “historical amnesia” (Makoni and Pennycook, 2005, p. 149) of bi- and multilingualism, and their tradition of enumerating languages which reduces sociolinguistics to at best a “pluralization of monolingualism” (Makoni and Pennycook, 2005, p. 148). However, this does not mean that languages cannot be probed as standard categories. It holds a more intricate stance: on the one hand, it problematizes the separation of languages, as language is characterized by fluidity in multi-ethnic settings; on the other hand, it assumes the fixity of the relationship between a given (standard) language and its corresponding identity, ethnicity, and other societal factors (Otsuji and Pennycook, 2010). Fluidity and fixity, however, are not binary attributes that exclude each other; they coexist and mutually influence each other in real-life linguistic practices. By the same token, Blackledge and Creese (2010) and Martin-Jones et al. (2012) also hold a dynamic view on language and identity: while language functions as “heritage” (see Blackledge and Creese, 2010, pp. 164–180) and the positioning or maintenance of national identity, the bondage, however, frequently loosens as it is always contested, resisted and “disinvented” (Makoni and Pennycook, 2005). Table 1 illustrates three kinds of sociolinguistic conceptualizations of language.

The above discussion briefly delineates how contemporary sociolinguistic studies attempt to capture the complex ways in which the notion of language is construed, resisted or reinvented in and through practices. Most of these approaches are based on the traditional assumption of language as written or spoken signs and symbols in its verbal forms. Other forms of resources are generally seen as contexts where these verbal signs and symbols take place. They are contextual facets that contribute to the ideological and sociological corollary of language use, but they are not seen as ontological components in linguistics. Later developments, which integrate multimodal studies into sociolinguistics, show differing stances regarding the ontology of language, as shown in the next section.

Language in vis-à-vis as multimodal construct

Jewitt (2013, p. 141) defines multimodality as “an inter-disciplinary approach that understands communication and representation to be more than about language”. This should be seen as a definition oriented toward social semiotics, in which different semiotic resources are seen as various modes of representation or communication through semiosis. For a sociolinguistic version of the definition, we prefer to interpret it as language in vis-à-vis as a multimodal construct. By using the word “construct”, we would like to point out that multimodality or multimodal conventions enter into sociolinguistic studies because they are socially constructed; that is, sociolinguists research these multimodal dimensions because they are semiotic resources and practices which are constructed by social subjects with power, manipulation and ideology. They are not neutral resources by which people communicate information or by which the process of meaning-making, or semiosis, is realized. Instead, they are a social construct that constitutes the type of Foucauldian knowledge in which sociological power and ideology lie at the core. In this sense, the notions, frameworks, and approaches that we discuss as follows are socially critical in nature and are predominantly related to socially constructed ideologies such as hegemony, power, and identity. As Makoni and Pennycook (2005) note, languages are “invented” by the dominant (colonial) groups through classification and naming in history; they are not neutral practices and they are constructed and invested with ideologies, power and inequality. Sociolinguistics thus needs a historically critical perspective. In fact, since its birth, sociolinguistics has been a discipline focusing on language use in relation to socially critical issues, such as gender, race, class and politics. This focus can be dated back as early as Labov’s (1963, 1966) ethnographical research on variations of English on the island of Martha’s Vineyard, Massachusetts and in New York City. The sound change or phonetic features are studied in relation to ethnicity, social stratification and class. Agha (2003) and Eckert (2008) also probe the phonetic features or regional change of variations in relation to ethnicity and social and economic status.

In fact, the above-mentioned concerns of sociolinguistics are also consistent with CDA (see Wang and Jin, 2022; Wang and Yang, 2022), especially multimodal critical discourse analysis (MCDA), which also contributes to the research trend in terms of language in multimodality. Kress and van Leeuwen (1996) postulate a visual grammar based on systemic functional grammar. Machin (2016), Machin and Mayr (2012) and other scholars have also applied MCDA to various types of discourse. Semiotic resources other than language are analysed to reveal the social construct of power, ideology, and inequality in relation to verbal resources (Wang, 2014, 2016a, 2016b). Language in the multimodal construct in sociolinguistics is quite similar to the social semiotic and critical discourse approach to multimodality: language is seen as one type of resource, amongst other non-language resources (visual, aural, embodied, and spatial) in the meaning-making process. The difference is that sociolinguistic approaches toward language in multimodality focus much more on social interaction, power and ideology, and their research frequently includes ethnographical data and observations. Language as a multimodal construct, by contrast, sees language as a more integral part of multimodal resources, and vice versa; less distinct boundaries are seen as existing between languages and non-languages. These two trends of conceptions are discussed below.

Language in multimodal construct

To place language studies in the multimodal construct is not a new practice in sociolinguistics. Agha (2003, p. 29) analyses the Bainbridge cartoon, treating accent not as “object of metasemiotic scrutiny”, but as an integral element in “the social perils of improper demeanour in many sign modalities” such as dress, posture, gait and gesture. His discussion demonstrates how language studies can be embedded in a larger multimodal scope. Language is contextualized by its peripheral multimodal paralinguistic sign systems. In Eckert (2008, p. 25), the process of “bricolage” (Hebdige, 1984), in which “individual resources can be interpreted and combined with other resources to construct a more complex meaningful entity”, is linked to the style and language variations which reflect social meaning. She gives examples of how the clothing of students at Palo Alto High School affords them certain types of styles to convey social meaning. The research of Eckert (2001), Coupland (2003, 2007) and other scholars represents the “third-wave” sociolinguistic studies, which see the use of variation in terms of personal and social styles (Eckert, 2012). Language and other semiotic resources constitute a stylistic complex that makes social meaning and constructs social styles and identities together. Goodwin (2007) extensively encompasses multimodal interaction in the examination of participation, stance and affect in a “homework” interaction between a father and his daughter, where gaze, gesture, and the spatial environment are taken into account. Goodwin’s research is partly premised on Bourdieu’s (1991, pp. 81–89) associating bodily hexis with habitus, which is also a notion that is multimodal in itself. The deployment of different bodily modes in different contexts of participation (such as homework, archaeology, and surgery) depends on conventions of various social practices or their respective habitus.

Research regarding language in multimodal construct shares some common ground with the social semiotic approach towards multimodality. First, in communication, there are different modes of resources or semiotic types that convey social meaning and embed ideology. Second, these resources consist of language and “non-language”: the former being written or spoken signs and symbols that social actors use to communicate, and the latter being the visual, aural, or embodied resources in which language is situated. Third, meaning-making is done through the orchestration of these resources.

In contrast to social semiotic approaches, with an anthropology-oriented concern, language in the multimodal construct as a sociological and sociolinguistic approach usually bases itself on ethnographical observations of social interaction. Language is seen as a component in social interactional discourse; other semiotic modes or resources are also important resources through which language use is contextualized. To be more specific, language in multimodal construct shows concerns with language as one type of semiotic resource that is placed in multimodal contexts in the following aspects:

First, meaning-making through other resources is seen as “add-ons” to that of language. In other words, language indexes social meaning and ideology in collaboration with other types of resources. An example is Agha’s (2003) analysis of the Bainbridge cartoon in which clothes, demeanour, and even body shape work in collaboration with accent in conveying register and social status. Second, language as one type of social meaning-making resource can be conceptualized in relation to the meaning-making process of other resources. For example, the process of “bricolage” is probed in relation to variations with their indexed styles and social categorization in terms of “gender and adolescence” (Eckert, 2008, p. 458). This concept is used to offer clues regarding how “the differential use of variables constituted distinct styles associated with different communities of practice” (Eckert, 2008, p. 458). Third, language is one of the communicative modes in social interactional discourse. It does not necessarily take the central role, because other types of resources, such as gestures, gaze, and the environment where these actions take place, jointly constitute the social meaning-making process. This can be best encapsulated in Goodwin’s (2007) analysis of the “homework” interaction between a father and his daughter. In this quite mundane interactional discourse, the father uses different embodied actions to negotiate different moral and affective stances through the “homework interaction” with his daughter. Conversation as a linguistic resource plays a role in the interaction, while embodied actions are key factors in affecting these stances.

Language as a multimodal construct

A slightly different approach to studies of language in multimodal contexts is to view it as a multimodal construct: either in the way that language is considered as autonomously constituting the semiotic texture (e.g., in the art form of the “text art” where text is also seen as picture) or in the way that some traditionally assumed extra-linguistic modes are considered as special forms or dimensions of language. This trend of research includes recent studies on language in space, social interactional multimodal discourse analysis, and new concepts or conceptualizations of language in society, as discussed below.

Language in space: semiotic landscape, place semiotics, and discourse geography

Jaworski and Thurlow (2010) review the notion of spatialization, that is, the semiotics and discursivity of space, and the extension of the notion of the linguistic landscape. By so doing, they frame the concept of semiotic landscape as encapsulating how written discourse interacts with other multimodal discursive resources with blurring boundaries in between.

In their opinion, space is “not only physically but also socially constructed, which necessarily shifts absolutist notions of space towards more communicative or discursive conceptualizations” (Jaworski and Thurlow, 2010, p. 7). Sociological research on space thus is more oriented toward spatialization, “the different processes by which space comes to be represented, organized and experienced” (Jaworski and Thurlow, 2010, p. 6). This spatialization, as represented discursively, is intrinsically multimodal:

Echoing the sentiments of Kress and van Leeuwen quoted at the start of this chapter, Markus and Cameron argue that ‘[b]uildings themselves are not representations’ (p. 15), but ways of organizing space for their users; in other words, the way buildings are used and the way people using them relate to one another, is largely dependent on the spoken, written and pictorial texts about these buildings… Architecture and language (spoken and written) may then form an even more complex, multi-layered landscape (or cityscape) combining built environment, writing, images, as well as other semiotic modes, such as speech, music, photography, and movement… (Jaworski and Thurlow, 2010, pp. 19–20)

The “spatial turn” (Jaworski and Thurlow, 2010, p. 6) in sociolinguistics thus adds the analytical dimensions of multimodal resources to the traditional concept of the linguistic landscape. Written language itself does convey social meaning and ideologies, while it is situated in materiality (the materials it is written on) and spatiality (the places where it appears). The concept of the semiotic landscape blurs the traditional boundary between language and non-language.

Unlike social semiotic approaches towards multimodality, researchers of semiotic landscape pay predominant attention to the “metalinguistic or metadiscursive nature of ideologies” (Jaworski and Thurlow, 2010, p. 11). In Kallen’s words, the concept of semiotic landscape starts from the assumption that “signage is indexical of more than the ostensive message of the sign” (Kallen, 2010, p. 41); signage indexes ideologies that are embedded in, or indicated by, different types of space or spatiality: city centre, tourist places, districts and so on. Less interest is invested in the process of semiosis regarding how different modes of signs are orchestrated to communicate information, which is one of the primary endeavours of social semiotics (Li and Wang, 2022; Wang, 2014, 2019; Wang and Li, 2022). As such, in ethnographical studies or data analysis, language, materiality, and spatiality are usually seen as interwoven with each other, with no distinct boundaries in between; or at least, boundary-marking is not the primary concern of semiotic landscape.

In the same vein, Scollon and Scollon (2003, p. 2) coin the term “geosemiotics” (or “place semiotics”) which is “the study of the social meaning of signs and discourses and of our actions in the material world”. Their research objects are signs in public places. The conceptual framework of “geosemiotics” sees language as a multimodal construct in terms of the following aspects. First, verbal language is analysed by using social semiotic approaches to visuals. Code preference (regarding which language is seen as “primary” language) shown on signs or buildings is analysed by using Kress and van Leeuwen’s (1996, p. 208) conception of compositional meaning indexed by different positions in pictures. Second, language is seen as multimodal itself. Language on signs or buildings is analysed in terms of the multimodal inscription (see Scollon and Scollon, 2003, pp. 129–142) that includes fonts, letter form, material quality, layering and state changes. Third, the emplacement (referring to meaning-making through positioning signs in different places) in geosemiotics, similar to Jaworski and Thurlow’s (2010) approach towards the semiotic landscape, is predominantly concerned with spatiality and metalinguistic or metadiscursive ideology, rather than the interaction and orchestration of different modes (language vis-à-vis non-language) in semiosis.

Similar to the concepts of semiotic landscape and place semiotics, Gu (2009, 2012) postulates the framework of four-borne discourse and discourse geography. Based on Blommaert’s (2005, p. 2) view of discourse as “language-in-action”, Gu analyses the language and activities in social actors’ trajectories of time and space in the land-borne situated discourse (LBSD): a type of discourse categorized by Gu (2009) according to different types of spatiality as carriers and places where the discourses take place. In Gu’s (2012) conceptualizations, language and discourse are metaphorically spatialized: language is seen in terms of the place where it takes place. Multimodality is evaluated based on space (Gu, 2009). Though it is arguable to what extent language is seen as a conflation of modes or semiotic attributes in Gu (2009), his work demarcates an ambivalent boundary between language and the “non-language”. Also, in “spatializing” language as discourse geography, it represents language and discourse as a PLACE or SPACE metaphor that is multimodal itself. In addition, it analyses the translation between different modes, for instance, the “modalization” of written language into visuals and sounds; visuals are also seen as forms of “modalized” language and vice versa. As such, Gu (2009) also represents the “spatial turn” of sociolinguistics which can be seen as the research trend that regards language as multimodal construct.

In general, the trend to spatialize language and discourse (or the “spatial turn”), with the concepts or frameworks such as semiotic landscape, place semiotics, and discourse geography, treats language as multimodal construct in the following two aspects. First, it focuses on metalinguistic or metadiscursive ideologies that are embedded in different modes of signs or symbols; also, Gu’s research metaphorically theorizes social interaction through multimodality. Second, it posits that language itself is multimodal or modalizable in meaning-making. Written language has its multimodal dimensions such as facets of its inscription including fonts, letterform, material quality, layering and state changes (Scollon and Scollon, 2003). Different forms of language are multimodal in terms of spatiality: they can be naturally multimodal and aural-visual, for instance in televised discourse; written language can also be “modalized” into visuals (Gu, 2009, p. 11). Overall, language is either considered as signs in the spatialized system or actions in trajectories of activities. It is an integral part of multimodal construct, where other modes (visual, gesture, action, and so on) are not peripheral or auxiliary, but frequently they also belong to linguistic resources, for instance, the visual resources in text arts.

Multimodal studies from the social interactional perspective

There are sociolinguistic approaches towards multimodality that combine social interactional sociolinguistics (Goffman, 1959, 1963, 1974), the social semiotic approach towards multimodality (Kress and van Leeuwen, 1996), and intercultural communication (Wertsch, 1998). We summarize these approaches as multimodal studies from the social interactional perspective, which include mediated discourse analysis (Scollon and Scollon, 2003) and multimodal interaction analysis (Norris, 2004); the latter grew out of the former.

Multimodal studies from the social interactional perspective focus on people’s daily actions and interactions, and the environment and technologies with(in) which they take place. This trend of research sees discourse as (embedded in) social interaction and sets out to investigate social action through the multimodal resources used in daily interaction, such as gestures, postures, and language (see Jones and Norris, 2005). In Norris’s (2004) framework for multimodal interaction analysis, the units of analysis form a layered, hierarchical system of actions: lower-level actions, such as an utterance of spoken language, a gesture, or a posture, and higher-level actions, which consist of chains of lower-level actions. Norris (2004) also coins the term “modal density” to refer to the complexity of the modes a social actor uses to produce higher-level actions.

The focus on hierarchical levels of actions and the concept of “modal density” prompt reflection on what constitutes mode and what constitutes language. Language in multimodal interaction analysis is seen as a type of lower-level action among the other embodied resources that are at interactants’ disposal. These embodied resources are seen as different modes, such as gesture, gaze, and proxemics. Arguably, however, gestures and gazes in Norris (2004) are also seen as forms of language in interaction. Furthermore, regarding the mode of spoken language, Norris, in her 2004 book and her other works, methodologically treats it as a multimodal construct in which pitch and intonation are visualized through various fonts in a wave-shaped annotation, along with the policeman’s gestures, as shown in Fig. 1.

Fig. 1. The policeman’s spoken language is treated as a multimodal construct where the pitches and intonation are visualized through various fonts in the wave-shaped annotation, along with his gestures.

Multimodal studies from the social interactional perspective, similar to other sociolinguistic approaches to multimodality, target the meta-modal or metadiscursive facets of ideology. This is done through a bottom-up approach, that is, by examining general social categories such as power, dominance and ideology through people’s daily (inter)action. This trend of research focuses on basic units of action in people’s daily interaction; its conception of mode and language is oriented toward seeing language as multimodal, and its methodological treatment of language shows the same orientation. Multimodal studies from the social interactional perspective are intended to reveal the ideology and power embedded in language as action. Overall, they perceive language as a multimodal construct in social (inter)action.

Metrolingualism, heteroglossia, polylanguaging and multimodality

In the second section of the paper, we mentioned the works on some similar notions such as metrolingualism and polylanguaging. In this section, we will review the latest application of the notion of metrolingualism in multimodal analysis and discuss why other related notions or approaches also encapsulate the conceptualization regarding language as a multimodal construct.

Metrolingualism is a concept postulated by Otsuji and Pennycook ( 2010 ) originally referring to “creative linguistic conditions across space and borders of culture, history and politics, as a way to move beyond current terms such as multilingualism and multiculturalism” (Otsuji and Pennycook, 2010 , p. 244). Their later works (Pennycook and Otsuji, 2014 , 2015a , 2015b ) develop the concept and reformulate it as a broader notion encompassing the everyday language use in the city and linguistic landscapes in urban settings.

In Pennycook and Otsuji (2014, 2015b), metrolingualism involves the practice of “metrolingual multitasking” (Pennycook and Otsuji, 2015b, p. 15), in which “linguistic resources, everyday tasks and social space are intertwined” (Pennycook and Otsuji, 2015b, p. 15). Metrolingualism thus is not only concerned with the mixed use of linguistic resources (from different languages); it also concerns how language use is embedded in broader multimodal practices, such as the (embodied) actions accompanying or included in the metrolingual process, the (changing) spaces or places where these actions and language use take place, and the objects in the environment. Pennycook and Otsuji (2015b) include an olfactory mode in their analysis of metrolingual practices in cities. Smell is represented through linguistic or pictorial signs in the city and suburb to constitute “smellscapes” in relation to social activities, ethnicities, gender and races. Metrolingual smellscapes are represented through the conflation of written and visual signs and symbols (e.g., street signs), social activities (e.g., buying and selling, and riding a bus), objects (e.g., spices), and places or spaces (e.g., suburb markets, coffee shops, buses and trains). The conventional distinction between language and the non-language is less important, or not at issue here, as smells have to be represented through language or visuals, and resources other than languages are also conceptualized as metrolingual.

Language in Pennycook and Otsuji’s (2014, 2015a, 2015b) conception of metrolingualism, in this regard, is seen as being integrated into different types of activities and actions; it is also spatialized in the sense that metrolingual practice is seen as involving the organization of space, the relationship between “locution and location” (Pennycook and Otsuji, 2015b, p. 84), and the (historical) layers of cities (Pennycook and Otsuji, 2015b, p. 140). As discussed in earlier sections, this spatialization is intrinsically multimodal.

In relation to metrolingualism, Jaworski (2014) briefly reviews the history of arts and writing, from which he chooses the art form of “text art” as his research subject. Referring to the notion of metrolingualism, he sees these art forms as “metrolingual art”, where language interacts with other modes or is seen as part of the visual mode. He suggests that it would be useful to “extend the range of semiotic features amenable to metrolingual usage to include whole multimodal resources” (Jaworski, 2014, p. 151). The multimodal representations in text art are realized by mixing, meshing and queering of the linguistic features, as well as by its relation to a “melange of styles, genres, content, and materiality” (Jaworski, 2014, p. 151). In this regard, the multimodal affordances (Kress, 2010; Jewitt, 2009) realized by materiality (e.g., the paper, cloth, or walls on which the language is written), media (e.g., soundtrack, video, moving images, etc.), and styles (e.g., fonts, letterform, layering such as add-ons or decorations) are an integral part of metrolingualism. Subsequently, he postulates that it would be useful to align the concept of heteroglossia with metrolingualism, so as “to extend the idea of metrolingualism beyond ‘hybrid and multilingual’ speaker practices (Otsuji and Pennycook, 2010, p. 244) and move towards a more ‘generic’ view of metrolingualism as a form of heteroglossia” (Jaworski, 2014, p. 152). This alignment relates the subject position taken by the producers of text art to their social orientation or alignment as regards power, domination, hegemony, and ideology in a broader social realm.

This is also in line with Bailey’s discussion of heteroglossia: “(a) heteroglossia can encompass socially meaningful forms in both bilingual and monolingual talk; (b) it can account for the multiple meanings and readings of forms that are possible, depending on one’s subject position, and (c) it can connect historical power hierarchies to the meanings and valences of particular forms in the here-and-now” (Bailey, 2007, pp. 266–267; also quoted in Jaworski, 2014, p. 153). Overall, Jaworski (2014) shows how metrolingualism and heteroglossia can be used to analyse features of language and their place in the multimodal construct. He also discusses how other notions that are similar to metrolingualism may bear a relationship with multimodality in that they stress “the importance of linguistic features (rather than discrete languages) as resources for speakers to achieve their communicative aims” (Jaworski, 2014, p. 138).

Apart from the concepts of metrolingualism and heteroglossia, Jaworski (2014) touches upon the relationship between polylanguaging and multimodality, but he does not elaborate on it. Jørgensen (2008) demonstrates how polylanguaging is concerned with the use of language features in language practice among adolescents in superdiverse societies. Some of these language features “would be difficult to categorize in any given language” (Jørgensen et al., 2011, p. 25); that is, they do not belong to any standard language system (e.g., English, Chinese, German). In addition, emoticons are frequently used in communication via social networking software. If some of these language features do not belong to any given language, it is difficult to say whether they can be seen as language at all. The attention to features of language hence blurs the boundary between language and other semiotic resources. Of course, these features can be seen as a type of linguistic (lexical, morphemic or phonemic) unit that still belongs to language, but they are frequently used in multimodal meaning-making. Below we use Jørgensen et al.’s (2011, p. 26) example (Fig. 2) to illustrate this.

Fig. 2. The “majority boy” makes use of resources from the minority’s language (the word “shark”).

Jørgensen et al.’s analysis of this example focuses on the “majority boy” using the word “shark”, which is a loan word from Arabic. As a majority member, he is using the minority’s language, to which he is not entitled. Judging by the interaction, it can be seen that “both interlocutors are aware of the norm and react accordingly” (Jørgensen et al., 2011, p. 25). As such, they note that one feature of polylanguaging is “the use of resources associated with different ‘languages’ even when the speaker knows very little of these” (Jørgensen et al., 2011, p. 25).

What also needs attention, but is not discussed by Jørgensen et al. (2011), is the interlocutors’ creative use of these features in polylanguaging: the word “shark” is written as a prolonged “shaarkkk” for its phonetic and visual effects. This creative configuration of the language feature “shark” draws other interlocutors’ attention toward the polylanguaging practice. The emoticon “:D” following it demonstrates that the speaker knows that he is violating the “normal” rules; that is, he is using minority language features to which he is not entitled. The repeated words “cough, cough”, followed by the emoticon “:D”, also demonstrate this.

Polylanguaging, as formulated by Jørgensen et al. (2011), departs from the multilingualism tradition of enumerating languages and focuses instead on language features that may not belong to any given language. In this sense, emoticons and creative configurations of words can also be seen as language features, ones that are creatively used by a virtual community of (young) netizens in communication. These features are multimodal in the following respects. First, they visualize the polylanguaging practice by creating new forms of words, for instance, the prolonged word “shaarkkk”. This creation is itself a process of polylanguaging, in the sense that it uses the features of common language, or language in people’s daily life (that is, non-cyber language), to create new cyber-language that is used by members of a virtual community. Second, these language features utilize the multimodal resources of embodiment in polylanguaging. For example, emoticons use different letters or punctuation marks (as language features from people’s daily written language) to represent different facial expressions and emotions. The repetition of the words “cough, cough”, as “a reference to a cliché way of expressing doubt or scepticism” (Jørgensen et al., 2011, p. 27), also takes on an embodied stance. It shows that the interlocutors are aware that the majority boy is using the minority’s language, to which he is not entitled. Hence, this embodied stance indexes the polylanguaging practice. To summarize, polylanguaging entails seeing language as a multimodal construct, as interlocutors creatively adapt language features from daily communication (face-to-face or written communication not involving the internet) or utilize embodied language features when polylanguaging in online communication.

Discussion and a critical reflection

In the sections “Language as written and spoken signs and symbols” and “Language in vis-à-vis as multimodal construct” above, we delineated the ontological perceptions of language in sociolinguistics, including language as spoken and written signs and symbols, and language in a multimodal construct vis-à-vis language as a multimodal construct. In teasing out the various approaches, we find that language in sociolinguistics has undergone several stages of development. Language as spoken and written signs and symbols has been pursued in variational sociolinguistics, bi- and multilingualism, and the latest theoretical and conceptual trends of research that do not see language as separate and separable systems or codes. Language in sociolinguistics, however, has been predominantly placed in nuanced and complicated relationships with other semiotic resources. Research regarding language in multimodal constructs sees language and non-language resources as different modes, or types of resources. These different modes have boundaries, and efforts are made to see how the modes combine with one another in meaning-making; language itself is a distinctive type of mode, interdependent with but different from other modes.

Research regarding language as a multimodal construct sees language itself as multimodal. Language is spatialized (that is, probed in relation to the various kinds of spatiality and materiality where it appears); in the social interactional approach to multimodality, it is embodied and seen as embedded in a layered and hierarchical system of modes (including gesture, posture, and intonation) in social interaction; in the latest concepts built on languaging, language is regarded as “inventions” (Makoni and Pennycook, 2005), as cross- and trans-cultural practice, instead of separable and enumerable codes or systems. Language is entangled and integrated with objects (for instance, signage, and the materiality where it appears) and with multitasking that draws on embodied resources (gestures, talking, and simultaneously doing other things).

Expanding the ontology of language from verbal resources toward various multimodal constructs has enabled sociolinguists to pursue meaning-making, indexicalities and social variations in their most authentic state. Language itself is multimodal, though it cannot be denied that language and other modes do have boundaries and distinctions (though not always). Whenever a language is spoken, the stresses, intonations, and paralinguistic resources are all integrated into it. Focusing on language per se has generated fruitful outcomes in sociolinguistic studies, but placing language among multi-semiotic resources has innovated the field and has become the dominant trend in contemporary sociolinguistics. Research on language both in and as multimodal constructs has captured the complex ways in which language interacts with multiple subjects, materiality, objects and spatiality. But the latest research in sociolinguistics increasingly sees language itself as an intricate multimodal construct, as encapsulated by various new concepts and theories including translanguaging, metrolingualism, and polylanguaging, in the contexts of globalization, migration, multi-ethnicity, and new communication technologies. Language is not only seen as separable codes and systems spoken or written by different groups of people; it also entails a wider range of communicative repertoires, including embodied meaning-making, objects, and the environment where the written or spoken signs are placed. It hence may be speculated that sociolinguistics will be increasingly less concerned with the boundaries between language and non-language resources, and will focus more on social constructs, social meaning, and language as a force in social change.

The enumerating and separating way of studying language and multimodality (that is, delineating inter-semiotic boundaries and focusing on how modes of communication are combined in meaning-making) has generated various outcomes, especially in the field of grammar-oriented social semiotic research and MCDA. However, contemporary sociolinguistic studies have immensely expanded their scope toward a wider range of areas beyond the discursive, grammatical, and communicative. The three research paradigms regarding language as a multimodal construct reviewed in “Language as multimodal construct” have proved to be feasible approaches toward language in social interaction, geo-semiotics, and language use in ethnographical and multi-ethnic settings. The ontology of language in sociolinguistics, in this regard, may be perceived in terms of the sociology and societal facets of the multimodal construct, rather than language placed in a multitude of semiotic types or verbal resources per se. A critical reflection on the ontology of language is one of the prerequisites of innovation in contemporary linguistics, and it is also the objective of this comprehensive review.

As can be seen from the above discussion, there are several versions of the perception of language in sociolinguistics. First, perceptions of language as a written or verbal system are moving, or have moved, from the enumerating traditions of bi- or multilingualism towards seeing language as an inseparable entity with fixity and fluidity. In other words, new approaches in sociolinguistics come to see languages as comprising different features, repertoires, or resources, rather than different or discrete standard languages such as English, French, German and so on. The negotiation, construction, or attribution of ethnicity, identity, power and ideologies through language has also taken on a more dynamic and diverse look. Second, there is sociolinguistic research that places language within the multimodal construct. Language is seen as being contextualized by other multimodal semiotics that are seen as “non-language”. However, more research comes to see language as a multimodal construct; that is, language, be it written or spoken, is multimodal in itself, as it comprises multimodal elements such as type, font, materiality, intonation, embodied representations and so on. It is also activated (seen as actions or activities) or spatialized in different approaches such as mediated discourse analysis, multimodal interaction analysis, geosemiotics, semiotic landscape, and metrolingualism, discussed earlier. Third, these changing perceptions of language in sociolinguistics result from researchers’ innovative efforts to view language from different perspectives. More importantly, they arise from the fact that language itself is also changing as society changes. As mentioned at the beginning, the world has become increasingly globalized and communications technologies have fundamentally changed the ways people interact with each other. Linguistic practices are complicated by the super-diversity of ethnic fluidity (e.g., the diversity of ethnic groups and the ever-present changes in ethnic structure), communications technologies, and globalized cross-cultural art.

In sum, it can be argued that contemporary sociolinguistics has become increasingly concerned with languaging (trans-, poly-, metro-, pluri-, and so on), rather than languages as a type of (static and fixed) verbal resource with demarcated boundaries separating them from other multimodal resources. Language is multimodal; it is embedded in or represents social activities, places or spaces, objects, and smells. Language in society belongs to and constitutes the “semiotic assemblage” (Pennycook, 2017) that can be better analysed holistically so as to reach an understanding of “how different trajectories of people, semiotic resources and objects meet at particular moments and places” (Pennycook, 2017, p. 269). At a fundamental level of sociolinguistic ontology, this trend of research reflects the changing ways in which sociolinguists come to understand what language is and how it should be understood as part of a more general range of semiotic practices.

Agha A (2003) The social life of cultural value. Language Commun 23(3–4):231–273

Berger P, Luckmann T (1966) The social construction of reality: a treatise in the sociology of knowledge. Doubleday, New York

Blackledge A, Creese A (2010) Multilingualism: a critical perspective. Continuum, London

Blommaert J (Ed.) (1999) Language ideological debates, vol. 2. Walter de Gruyter, Berlin

Blommaert J (2005) Discourse: a critical introduction. Cambridge University Press, Cambridge

Bourdieu P (1991) Language and symbolic power [Thompson JB (ed and introd)] (trans: Raymond G, Adamson M). Polity Press/Blackwell, Cambridge

Bailey B (2007) Heteroglossia and boundaries. In: Heller M (Ed.) Bilingualism: a social approach. Palgrave Macmillan, New York, pp. 257–274

Cameron D (1990) Demythologizing sociolinguistics: why language does not reflect society. In: Joseph J, Taylor T (eds) Ideologies of language. Routledge, London, pp. 79–93

Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge, Massachusetts

Coupland N (2003) Sociolinguistic authenticities. J Sociolinguist 7(3):417–431

Coupland N (2007) Style: language variation and identity. Cambridge University Press, Cambridge

Creese A, Blackledge A (2010) Translanguaging in the bilingual classroom: a pedagogy for learning and teaching? Mod Language J 94:103–115

Eckert P, Rickford JR (Eds.) (2001) Style and sociolinguistic variation. Cambridge University Press, Cambridge

Eckert P (2008) Variation and the indexical field. J Sociolinguist 12(4):453–476

Eckert P (2012) Three waves of variation study: the emergence of meaning in the study of sociolinguistic variation. Annu Rev Anthropol 41(1):87–100

García O, Li W (2014) Translanguaging: language, bilingualism and education. Palgrave Macmillan, London

Goffman E (1959) The presentation of self in everyday life. Doubleday, New York, NY

Goffman E (1963) Behavior in public places. Free Press, New York, NY

Goffman E (1974) Frame analysis. Harper & Row, New York, NY

Goodwin C (2007) Participation, stance, and affect in the organization of activities. Discourse Soc 18(1):53–73

Gu Y (2009) Four-borne discourses: towards language as a multi-dimensional city of history. In: Li W, Cook V (eds.) Linguistics in the real world. Continuum, London, pp. 98–121

Gu Y (2012) Discourse geography. In: Gee JP, Handford M (eds.) The Routledge handbook of discourse analysis. Routledge, London, pp. 541–557

Hebdige D (1984) Framing the youth ‘problem’: the construction of troublesome adolescence. In: Garms-Homolová V, Hoerning EM, Schaeffer D (eds.) Intergenerational Relationships. Lewiston, NY: C. J. Hogrefe, pp.184–195

Holliday N (2021) Prosody and sociolinguistic variation in American Englishes. Annu Rev Linguist 7:55–68

Irvine JT, Gal S (2000) Language ideology and linguistic differentiation. In: Kroskrity PV (ed.) Regimes of language: ideologies, polities, and identities. School of American Research Press, Santa Fe, pp. 35–84

Jaworski A (2014) Metrolingual art: multilingualism and heteroglossia. Int J Biling 18(2):134–158

Jaworski A, Thurlow C (eds.) (2010) Semiotic landscapes: language, image, space. Continuum, New York

Jewitt C (2009) Different approaches to multimodality. In: Jewitt C (ed) The Routledge handbook of multimodal analysis. Routledge, Abingdon, pp. 28–39

Jewitt C (2013) Multimodality and digital technologies in the classroom. In: de Saint-Georges I, Weber J (eds) Multilingualism and multimodality: current challenges for educational studies. Sense Publishing, Boston, pp. 141–152

Jørgensen JN (2008) Poly-lingual languaging around and among children and adolescents. Int J Multiling 5(3):161–176

Jørgensen JN, Karrebæk MS, Madsen LM, Møller JS (2011) Polylanguaging in superdiversity. Diversities 13(2):23–37

Jacquemet M (2005) Transidiomatic practices: language and power in the age of globalization. Language Commun 25:257–277

Jones R, Norris S (2005) Discourse as action/discourse in action. In: Norris S, Jones R (eds) Discourse in action: introducing mediated discourse analysis. Routledge, London, pp. 1–3

Kallen J (2010) Changing landscapes: language, space and policy in the Dublin linguistic landscape. In: Jaworski A, Thurlow C (eds) Semiotic landscapes: language, image, space. New York: Continuum, pp. 41–58

Kress GR (2010) Multimodality: a social semiotic approach to contemporary communication. Routledge, London

Kress GR, van Leeuwen T (1996) Reading images: the grammar of visual design. Routledge, London

Kroskrity PV (1998) Arizona Tewa Kiva speech as a manifestation of linguistic ideology. In: Schieffelin BB, Woolard KA, Kroskrity P (eds) Language ideologies: practice and theory. Oxford University Press, New York, pp. 103–122

Labov W (1963) The social motivation of a sound change. Word 19(3):273–309

Labov W (1966) Hypercorrection by the lower middle class as a factor in linguistic change. Sociolinguistics 1966:84–113

Lawson R (2020) Language and masculinities: history, development, and future. Annu Rev Linguist 6(1):409–434

Lewis WG, Jones B, Baker C (2012) Translanguaging: origins and development from school to street and beyond. Educ Res Eval 18(7):641–654

Li W, Wang J (2022) Chronotopic identities in contemporary Chinese poetry calligraphy. Poznan Stud Contemp Linguist 58(4):861–884

Machin D (2016) The need for a social and affordance-driven multimodal critical discourse studies. Discourse Soc 27(3):322–334

Machin D, Mayr A (2012) How to do critical discourse analysis: a multimodal introduction. Sage, London

Maher J (2005) Metroethnicity, language, and the principle of Cool. Int J Sociol Language 11:83–102

Makoni S, Pennycook A (2005) Disinventing and (re)constituting languages. Crit Inq Language Stud 2(3):137–156

Martin-Jones M, Blackledge A, Creese A (eds) (2012) The Routledge handbook of multilingualism. Routledge, London

Norris S (2004) Analyzing multimodal interaction: a methodological framework. Routledge, London

Otsuji E, Pennycook A (2010) Metrolingualism: fixity, fluidity and language in flux. Int J Multiling 7:240–254

Pennycook A (2017) Translanguaging and semiotic assemblages. Int J Multiling 14(3):269–282

Pennycook A, Otsuji E (2014) Metrolingual multitasking and spatial repertoires: ‘Pizza mo two minutes coming’. J Sociolinguist 18(2):161–184

Pennycook A, Otsuji E (2015a) Making scents of the landscape. Linguist Landsc 1(3):191–212

Pennycook A, Otsuji E (2015b) Metrolingualism. Language in the city. Routledge, New York

Sankoff G (2018) Language change across the lifespan. Annu Rev Linguist 4:297–316

Schegloff EA (1992) In another context. In: Duranti A, Goodwin C (eds) Rethinking context: language as an interactive phenomenon. Cambridge University Press, Cambridge, pp. 191–227

Schegloff EA (1998a) Positioning and interpretative repertoires: conversation analysis and poststructuralism in dialogue: reply to Wetherell. Discourse Soc 9(3):413–416

Schegloff EA (1998b) Reply to Wetherell. Discourse Soc 9(3):457–60

Schegloff EA (1999) ‘Schegloff’s texts’ as ‘Billig’s data’: a critical reply. Discourse Soc 10(4):558–572

Scollon R, Scollon S (2003) Discourses in place: language in the material world. Routledge, New York

Saussure F (1916) Course in general linguistics. Duckworth, London

Wang J (2014) Criticising images: critical discourse analysis of visual semiosis in picture news. Crit Arts 28(2):264–286

Wang J (2016a) Multimodal narratives in SIA’s “Singapore Girl” TV advertisements—from branding with femininity to branding with provenance and authenticity? Soc Semiot 26(2):208–225

Wang J (2016b) A new political and communication agenda for political discourse analysis: critical reflections on critical discourse analysis and political discourse analysis. Int J Commun 10:19

Wang J (2019) Stereotyping in representing the “Chinese Dream” in news reports by CNN and BBC. Semiotica 2019(226):29–48

Wang J, Jin G (2022) Critical discourse analysis in China: history and new developments. In: Aronoff M, Chen Y, Cutler C (eds) Oxford Research Encyclopedia of Linguistics. Oxford University Press. https://doi.org/10.1093/acrefore/9780199384655.013.909

Wang J, Li W (2022) Situating affect in Chinese mediated soundscapes of suona. Soc Semiot. https://doi.org/10.1080/10350330.2022.2139171

Wang J, Yang M (2022) Interpersonal-function topoi in Chinese central government’s work report (2020) as epidemic (counter-) crisis discourse. J Language Politics. https://doi.org/10.1075/jlp.22022.wan

Wertsch JV (1998) Voices of the mind: a sociocultural approach to mediated action. Harvard University Press, Cambridge, MA

Woolard K (1999) Simultaneity and bivalency as strategies in bilingualism. J Linguist Anthropol 8(1):3–29

Acknowledgements

Our thanks are extended to Dr. William Dezheng Feng for his constructive advice on the earlier drafts of the paper. This work is supported by the National Social Science Foundation of China (Project No. 18CYY050); the Foreign Language Education Foundation of China (Project No. ZGWYJYJJ11A030); and the Self-Determined Research Funds of CCNU from MOE for basic research and operation (Project No. CCNU20TD008).

Author information

Authors and affiliations.

Central China Normal University, Wuhan, China

Jiayu Wang, Guangyu Jin & Wenhua Li

Inner Mongolia Agricultural University, Hohhot, China

Guangyu Jin

Contributions

All three authors contributed to the conception and design of the study. JW mainly participated in drafting the work. GJ revised it critically for important intellectual content. WL participated in major intellectual contributions to the Chinese versions of the paper (unpublished); her ideas and points are integrated into the final version of this paper. All three authors are corresponding authors responsible for the final approval of the version to be published.

Corresponding authors

Correspondence to Jiayu Wang , Guangyu Jin or Wenhua Li .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed consent

Additional information.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Wang, J., Jin, G. & Li, W. Changing perceptions of language in sociolinguistics. Humanit Soc Sci Commun 10, 91 (2023). https://doi.org/10.1057/s41599-023-01574-5


Received: 12 September 2022

Accepted: 20 February 2023

Published: 08 March 2023

DOI : https://doi.org/10.1057/s41599-023-01574-5


Second Language Research


Second Language Research is an international peer-reviewed, quarterly journal, publishing original theory-driven research concerned with second (and additional) language acquisition and second language performance. This includes both experimental studies and contributions aimed at exploring conceptual issues. In addition to providing a forum for investigators in the field of non-native language learning, it seeks to promote interdisciplinary research which links acquisition studies to related non-applied fields such as neurolinguistics, psycholinguistics, theoretical linguistics, bilingualism, and first language developmental psycholinguistics.

Note that studies of foreign language teaching and learning are outside the scope of Second Language Research , unless they make a substantial contribution to understanding the process and nature of second language acquisition. Types of publications include full-length research articles (about 9,000 words), research notes (about 4,000 words), review articles of recent books or timely topics (about 5,000 words), discussion and commentary (about 5,000 words), invited keynote articles (about 11,000 words) and guest-edited, thematic issues.

This journal is a member of the Committee on Publication Ethics (COPE) .

Electronic access :

Second Language Research is available to browse on SAGE Journals Online.

“ Second Language Research is a central resource in the field, especially for cutting-edge work on linguistic and cognitive issues in SLA. It is clearly among our first-tier journals.” Professor Michael Long University of Maryland, USA

Second Language Research publishes theoretical papers, original research and review articles on simultaneous or consecutive second and additional language acquisition in children and adults. In addition to providing a forum for investigators in the field of non-native language learning, the journal seeks to promote interdisciplinary research which links second language acquisition studies to related non-applied fields such as:

  • Theoretical linguistics focused on second and additional language acquisition
  • Neuroscience and cognitive science
  • First language developmental psycholinguistics
  • Heritage language studies
  • Psycholinguistics

The journal does not publish papers that focus on educational and pedagogical issues in language teaching and language testing.

University of Illinois at Urbana-Champaign, USA
University of Southampton, UK and University of Iowa, USA
Boston College, USA
Boston University, USA
Queen Mary University of London, UK
Université Laval, Canada
University of Kansas, USA
Nebrija University, Madrid, Spain
Germany
Heriot-Watt University, UK
Kookmin University, South Korea
Washington University in St. Louis, USA
University of Illinois, USA
State Univ of New York at Stony Brook, USA
University of Illinois Chicago, USA
University of Calgary, Canada
University of Edinburgh, UK
University of Reading, UK
Purdue University, USA
Indiana University, USA
University of Southampton, UK
University of Western Sydney, Australia
University of Potsdam, Germany
University of Kansas, USA
University of Hawaii, USA
University of Lund, Sweden
Bogaziçi University, Turkey
University of Utah, USA
Chuo University, Japan
Technical University Braunschweig, Germany
Doshisha University, Japan
University of Illinois Urbana Champaign, USA
University of Illinois, USA
University of Pittsburgh, USA
Korea National University of Education, South Korea
Central Connecticut State University, USA
The University of Arizona, USA
Georgetown University, USA
University of Ottawa, Canada
City University of Hong Kong, Hong Kong
University of Reading, UK
University of Essex, UK
Universidad Pompeu Fabra, Spain
University of Paderborn, Germany and University of Newcastle upon Tyne, UK
Universite Francois Rabelais, France
York University, UK
UiT The Arctic University of Norway, Norway and University Nebrija, Spain
University of Iowa, USA
University of Arizona, USA
Gunma Prefectural Women's University, Japan
University of Edinburgh, UK
National Taiwan Tsinghua University, Taiwan
Florida State University, USA
Kanagawa University, Japan
University of Kansas, USA
University of Cambridge, UK
McGill University, Canada
Chinese University of Hong Kong, China
Newcastle University, UK
Shanghai Jiao Tong University, China / University of Cambridge, UK
  • Academic Search Premier
  • British Education Index
  • Contents Pages in Education
  • Current Contents / Social and Behavioral Sciences
  • Current Index to Journals in Education
  • EMBASE/Excerpta Medica
  • Educational Research Abstracts Online - e-Psyche
  • IBZ: International Bibliography of Periodical Literature
  • IBZ: International Bibliography of Periodical Literature in the Humanities and Social Sciences
  • ISI Discovery Agent
  • Informationszentrum Für Fremdsprachenforschung (IFS)
  • International Bibliography of Book Reviews of Scholarly Literature in the Humanities and Social Sciences
  • International Bibliography of Book Reviews of Scholarly Literature on the Humanities and Social Sciences
  • Language Teaching
  • Linguistics Abstracts
  • Linguistics and Language Behavior Abstracts
  • MLA Abstracts of Articles in Scholarly Journals
  • MLA International Bibliography
  • Professional Development Collection
  • Social Sciences Citation Index (SSCI)
  • Social Services Abstracts
  • Sociological Abstracts
  • e-Psyche (Ceased)

Manuscript Submission Guidelines: Second Language Research

This Journal is a member of the Committee on Publication Ethics

Please read the guidelines below then visit the Journal’s submission site http://mc.manuscriptcentral.com/SLR to upload your manuscript. Please note that manuscripts not conforming to these guidelines may be returned.

Only manuscripts of sufficient quality that meet the aims and scope of Second Language Research will be reviewed. The editorial team is proud that in 2020, the average time to first decision was  31 days  and the average time to a final decision was  48 days .

There are no fees payable to submit or publish in this Journal. Open Access options are available - see section 3.3 below.

As part of the submission process you will be required to warrant that you are submitting your original work, that you have the rights in the work, that you are submitting the work for first publication in the Journal and that it is not being considered for publication elsewhere and has not already been published elsewhere, and that you have obtained and can supply all necessary permissions for the reproduction of any copyright works not owned by you.

Please see our guidelines on prior publication and note that Second Language Research  may accept submissions of papers that have been posted on pre-print servers; please alert the Editorial Office when submitting (contact details are at the end of these guidelines) and include the DOI for the preprint in the designated field in the manuscript submission system. Authors should not post an updated version of their paper on the preprint server while it is being peer reviewed for possible publication in the journal. If the article is accepted for publication, the author may re-use their work according to the journal's author archiving policy.

If your paper is accepted, you must include a link on your preprint to the final version of your paper.

  • What do we publish? 1.1 Aims & Scope 1.2 Article types 1.3 Writing your paper
  • Editorial policies 2.1 Peer review policy 2.2 Authorship 2.3 Acknowledgements 2.4 Funding 2.5 Declaration of conflicting interests
  • Publishing policies 3.1 Publication ethics 3.2 Contributor's publishing agreement 3.3 Open access and author archiving
  • Preparing your manuscript 4.1 Formatting 4.2 Artwork, figures and other graphics 4.3 Supplementary material 4.4 Reference style 4.5 English language editing services 4.6 Statistical Guidelines
  • Submitting your manuscript 5.1 ORCID 5.2 Information required for completing your submission 5.3 Permissions
  • On acceptance and publication 6.1 Sage Production 6.2 Online First publication 6.3 Access to your published article 6.4 Promoting your article
  • Further information

1. What do we publish?

1.1 Aims & Scope

Before submitting your manuscript to Second Language Research , please ensure you have read the  Aims & Scope .

1.2 Article Types

The Journal considers the following kinds of articles for publication. Any submission that does not respect the word limit will be sent back to the author without review. Please note that the manuscript length described below includes only the main body of the text, footnotes and all citations within it. The manuscript length does not include abstract, section titles, figure and table captions, funding statements, acknowledgments and references in the bibliography. 

  (a) Full Articles   (9,000 words) 

Full research reports must include original experimental findings related to an area of relevance to second language acquisition research and theory. Authors must clearly state their hypotheses or research questions and the results must include a quantitative presentation of the data. Research reports must make an important contribution to the field of Second Language Research and demonstrate rigorous methodology and statistical analysis of the results. We do not discourage contributions that present null results if the authors clearly state the hypothesis tested and the meaning and relevance of the null results themselves.  

All research reports must include an introduction (brief and focused), methods, discussion and conclusions. The conclusion must address the broader implications of the results and clearly state how the study contributed to the field of SLA.

(b) Research Notes  (4,000 words) 

Research notes are short reports and discussion papers of interest to the Second Language Research community. Research notes also include original research and follow the same outline as above but should be highly focused on one specific question related to SLA. Research notes may include replications of previously published studies.

(c) Review Articles   (5,000 words) 

Review articles that provide a synthesis in areas covered by the journal, or which assess methods, professional resources (including publications), or conceptual advances in the field. Normally, review articles are broader in focus than research notes and do not include original research.

All books for the review articles section should be sent to the review editor:

Margaret Thomas, Program in Linguistics, Lyons Hall, Boston College, Chestnut Hill, MA 02467, USA

E-mail:  [email protected]    

(d) Keynote Articles   with commentaries (11,000 words)

Keynote articles are normally commissioned by the Editors. They present a relevant new theory or model, or address a specific topic that is being currently debated in the field and take a clear position on one side of the debate. The goal of a keynote article is to present the issues most relevant to the topic, take a particular perspective on this topic and situate it in the broader field of SLA. Once a keynote is submitted, other experts on the topic will be invited to comment on the keynote article from their particular perspective.

(e) Discussion and Commentary (5,000 words)

Discussion and commentary are short articles addressing questions and concerns of a theoretical nature. Their goal is to initiate a conversation on a particular burning issue related to second language acquisition and encourage exchange of ideas, opinions and perspectives on that issue among researchers in the field.

(f) Registered Reports

Registered Reports are submissions that go through a two-phase review process. In Stage 1, the methods and proposed analyses are reviewed before data are collected and the study is pre-registered. In Stage 2, reviewers consider the full study, including results and interpretation. This format of article seeks to avoid a variety of inappropriate research practices, including inadequate statistical power, selective reporting of results, and publication bias, but still offers the flexibility to conduct subsequent exploratory (unregistered) analyses.

Starting in 2021, Second Language Research accepts Registered Report submissions. Detailed instructions for this format are available here and on the journal’s submission site, with instructions for authors and instructions for reviewers. Those can also be obtained by email from the editorial office.

Note that Registered Reports are different from preprints. A preprint is an online only, pre-peer reviewed version of a manuscript that is made openly available on a preprint server. They are not peer reviewed, and a preprint is not considered to be published, although they are often assigned DOIs. Preprints provide a unique benefit to the research community by allowing authors to rapidly disseminate their research before their papers are peer reviewed and published. They also allow researchers to work on their paper with the input of others in the research community before it is submitted for journal publication. However, they do not guarantee publication of the article. Since June 2018, SLR collects the DOIs for any preprint versions of the published articles to link a preprint version of an article to the final published version, providing increased transparency for readers.

1.3 Writing your paper

The Sage Author Gateway has some general advice on how to get published, plus links to further resources. Sage Author Services also offers authors a variety of ways to improve and enhance their article, including English language editing, plagiarism detection, and video abstract and infographic preparation.

1.3.1 Make your article discoverable

When writing up your paper, think about how you can make it discoverable. The title, keywords and abstract are key to ensuring readers find your article through search engines such as Google. For information and guidance on how best to title your article, write your abstract and select your keywords, have a look at this page on the Gateway: How to Help Readers Find Your Article Online


2. Editorial policies

2.1 Peer review policy

Each submission is subject to an in-house evaluation process and once this is completed, the article is either determined unsuitable for publication in SLR or sent for external review. This process of initial editorial evaluation may take up to two weeks. Once this review has been conducted, the manuscript is either returned to the author or assigned to an acting editor who sends it out to reviewers. 

The Editors of  Second Language Research  typically ask for 3 independent reviews of submissions they judge to be potentially publishable. Guidelines for  reviewers  can be found  here .

Submissions must be submitted in a format that will allow double-blind reviewing of the manuscript. 

  • The author's name(s) should not be included in headers or footers or in any part of the file (such as in 'Properties') which can reveal her/his/their identity. All funding sources should also be anonymized if they can be used to identify the author(s).
  • Any references to previous work by the same author within the text or in the references themselves should use the format ‘Author XXX’.

2.2 Authorship

All parties who have made a substantive contribution to the article should be listed as authors. Principal authorship, authorship order, and other publication credits should be based on the relative scientific or professional contributions of the individuals involved, regardless of their status. A student is usually listed as principal author on any multiple-authored publication that substantially derives from the student’s dissertation or thesis.

Please note that AI chatbots, for example ChatGPT, should not be listed as authors. For more information see the policy on Use of ChatGPT and generative AI tools .

2.3 Acknowledgements

All contributors who do not meet the criteria for authorship should be listed in an Acknowledgements section. Examples of those who might be acknowledged include a person who provided purely technical help, or a department chair who provided only general support.

Please supply any personal acknowledgements separately to the main text to facilitate anonymous peer review.

2.3.1 Third party submissions

Where an individual who is not listed as an author submits a manuscript on behalf of the author(s), a statement must be included in the Acknowledgements section of the manuscript and in the accompanying cover letter. The statements must:

  • Disclose this type of editorial assistance – including the individual’s name, company and level of input
  • Identify any entities that paid for this assistance
  • Confirm that the listed authors have authorized the submission of their manuscript via third party and approved any statements or declarations, e.g. conflicting interests, funding, etc.

Where appropriate, Sage reserves the right to deny consideration to manuscripts submitted by a third party rather than by the authors themselves .

2.3.2 Writing assistance

Individuals who provided writing assistance, e.g. from a specialist communications company, do not qualify as authors and so should be included in the Acknowledgements section. Authors must disclose any writing assistance – including the individual’s name, company and level of input – and identify the entity that paid for this assistance. It is not necessary to disclose use of language polishing services.

2.4 Funding

Second Language Research requires all authors to acknowledge their funding in a consistent fashion under a separate heading.  Please visit the Funding Acknowledgements page on the Sage Journal Author Gateway to confirm the format of the acknowledgment text in the event of funding, or state that: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors. 

2.5 Declaration of conflicting interests

Second Language Research encourages authors to include a declaration of any conflicting interests and recommends you review the good practice guidelines on the Sage Journal Author Gateway

3. Publishing Policies

3.1 Publication ethics

Sage is committed to upholding the integrity of the academic record. We encourage authors to refer to the Committee on Publication Ethics’ International Standards for Authors and view the Publication Ethics page on the Sage Author Gateway

3.1.1 Plagiarism

Second Language Research and Sage take issues of copyright infringement, plagiarism or other breaches of best practice in publication very seriously. We seek to protect the rights of our authors and we always investigate claims of plagiarism or misuse of published articles. Equally, we seek to protect the reputation of the journal against malpractice. Submitted articles may be checked with duplication-checking software. Where an article, for example, is found to have plagiarised other work or included third-party copyright material without permission or with insufficient acknowledgement, or where the authorship of the article is contested, we reserve the right to take action including, but not limited to: publishing an erratum or corrigendum (correction); retracting the article; taking up the matter with the head of department or dean of the author's institution and/or relevant academic bodies or societies; or taking appropriate legal action.

3.1.2 Prior publication

If material has been previously published it is not generally acceptable for publication in a Sage journal. However, there are certain circumstances where previously published material can be considered for publication. Please refer to the guidance on the Sage Author Gateway or if in doubt, contact the Editor at the address given below.

3.2 Contributor's publishing agreement

Before publication, Sage requires the author as the rights holder to sign a Journal Contributor’s Publishing Agreement. Sage’s Journal Contributor’s Publishing Agreement is an exclusive licence agreement which means that the author retains copyright in the work but grants Sage the sole and exclusive right and licence to publish for the full legal term of copyright. Exceptions may exist where an assignment of copyright is required or preferred by a proprietor other than Sage. In this case copyright in the work will be assigned from the author to the society. For more information please visit the Sage Author Gateway

3.3 Open access and author archiving

Second Language Research offers optional open access publishing via the Sage Choice programme and Open Access agreements, where authors can publish open access either discounted or free of charge depending on the agreement with Sage. Find out if your institution is participating by visiting Open Access Agreements at Sage . For more information on Open Access publishing options at Sage please visit Sage Open Access . For information on funding body compliance, and depositing your article in repositories, please visit Sage’s Author Archiving and Re-Use Guidelines and Publishing Policies .

4. Preparing your manuscript for submission

4.1 Formatting

The preferred format for your manuscript is Word. Templates are available on the Manuscript Submission Guidelines page of our Author Gateway.

4.2 Artwork, figures and other graphics

For guidance on the preparation of illustrations, pictures and graphs in electronic format, please visit Sage’s Manuscript Submission Guidelines   

Figures supplied in colour will appear in colour online regardless of whether or not these illustrations are reproduced in colour in the printed version. For specifically requested colour reproduction in print, you will receive information regarding the costs from Sage after receipt of your accepted article.

4.3 Supplementary material

Second Language Research  does not currently accept supplemental files.

4.4 Reference style

Second Language Research adheres to the Sage Harvard reference style. View the Sage Harvard guidelines to ensure your manuscript conforms to this reference style.

If you use EndNote to manage references, you can download the Sage Harvard EndNote output file

4.5 English language editing services

Authors seeking assistance with English language editing, translation, or figure and manuscript formatting to fit the journal’s specifications should consider using Sage Language Services. Visit Sage Language Services on our Journal Author Gateway for further information.

4.6 Statistical Guidelines

Authors should follow the general guidelines for statistical analysis and representation of statistical results as outlined here .

5. Submitting your manuscript

Second Language Research is hosted on Sage Track, a web based online submission and peer review system powered by ScholarOne™ Manuscripts. Visit http://mc.manuscriptcentral.com/SLR  to login and submit your article online.

Manuscripts should have a separate title page with the author's name, full postal address and email address. The first page of the text should carry the title of the article without the name of the author (see also section on Anonymity above). Each article must be accompanied by an abstract of about 200 words.

If you are a new user, you will first need to create an account. Submissions should be made by logging in and selecting the ‘Author Center’ and the 'Click here to Submit a New Manuscript' option. Follow the instructions on each page, clicking the 'Next' button on each screen to save your work and advance to the next screen. If at any stage you have any questions or require the user guide, please use the ‘Online Help’ button at the top right of every screen.

IMPORTANT: Please check whether you already have an account in the system before trying to create a new one. If you have reviewed or authored for the journal in the past year it is likely that you will have had an account created. For further guidance on submitting your manuscript online please visit ScholarOne Online Help .

5.1 ORCID

As part of our commitment to ensuring an ethical, transparent and fair peer review process, Sage is a supporting member of ORCID, the Open Researcher and Contributor ID. ORCID provides a persistent digital identifier that distinguishes researchers from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports automated linkages between researchers and their professional activities, ensuring that their work is recognised.

The collection of ORCID IDs from corresponding authors is now part of the submission process of this journal. If you already have an ORCID ID you will be asked to associate that to your submission during the online submission process. We also strongly encourage all co-authors to link their ORCID ID to their accounts in our online peer review platforms. It takes seconds to do: click the link when prompted, sign into your ORCID account and our systems are automatically updated. Your ORCID ID will become part of your accepted publication’s metadata, making your work attributable to you and only you. Your ORCID ID is published with your article so that fellow researchers reading your work can link to your ORCID profile and from there link to your other publications.

If you do not already have an ORCID ID please follow this link to create one or visit our ORCID homepage to learn more.

5.2 Information required for completing your submission

You will be asked to provide contact details and academic affiliations for all co-authors via the submission system and identify who is to be the corresponding author. These details must match what appears on your manuscript. At this stage please ensure you have included all the required statements and declarations and uploaded any additional supplementary files (including reporting guidelines where relevant).

5.3 Permissions

Please also ensure that you have obtained any necessary permission from copyright holders for reproducing any illustrations, tables, figures or lengthy quotations previously published elsewhere. For further information including guidance on fair dealing for criticism and review, please see the Copyright and Permissions page on the Sage Author Gateway

6. On acceptance and publication

6.1 Sage Production

Your Sage Production Editor will keep you informed as to your article’s progress throughout the production process. Proofs will be sent by PDF to the corresponding author and should be returned promptly.  Authors are reminded to check their proofs carefully to confirm that all author information, including names, affiliations, sequence and contact details are correct, and that Funding and Conflict of Interest statements, if any, are accurate.

6.2 Online First publication

Online First allows final articles (completed and approved articles awaiting assignment to a future issue) to be published online prior to their inclusion in a journal issue, which significantly reduces the lead time between submission and publication. Visit the Sage Journals help page for more details, including how to cite Online First articles.

6.3 Access to your published article

Sage provides authors with online access to their final article.

6.4 Promoting your article

Publication is not the end of the process! You can help disseminate your paper and ensure it is as widely read and cited as possible. The Sage Author Gateway has numerous resources to help you promote your work. Visit the Promote Your Article page on the Gateway for tips and advice.

7. Further information

Any correspondence, queries or additional requests for information on the manuscript submission process should be sent to the Second Language Research editorial office as follows:

Silvina Montrul  University of Illinois at Urbana-Champaign Department of Spanish, Italian & Portuguese 4080 Foreign Languages Building, MC-176 707 S. Mathews Ave Urbana, IL 61801 USA


To order single issues of this journal, please contact SAGE Customer Services at 1-800-818-7243 / 1-805-583-9774 with details of the volume and issue you would like to purchase.

  • Front Hum Neurosci

Learning a Foreign Language: A Review on Recent Findings About Its Effect on the Enhancement of Cognitive Functions Among Healthy Older Individuals

The number of older people is increasing, especially in developed countries. This demographic trend, however, may cause serious problems, such as an increase in aging diseases, one of which is dementia, whose main symptom is a decline in cognitive functioning. Although there has been ongoing pharmacological research on this neurological disorder, it has not brought satisfying results as far as its treatment is concerned. Therefore, governments all over the world are trying to develop alternative, non-pharmacological strategies/activities, which could help to prevent this cognitive decline while this aging population is still healthy in order to reduce future economic and social burden. One of the non-pharmacological approaches that may enhance cognitive abilities and protect against decline in the healthy older population seems to be the learning of a foreign language. The purpose of this mini-review article is to discuss recent findings about the effect of foreign language learning on the enhancement of cognitive functions among healthy older individuals. The findings, divided into three research areas, show that the learning of a foreign language may generate a lot of benefits for older individuals, such as enhancement of cognitive functioning and self-esteem, increased opportunities for socializing, or reduction of costs. However, as Ware et al. (2017) indicate, any intervention program on foreign language learning should be well thought out and tailored to the needs of older people in order to be effective and avoid accompanying factors, such as older people’s anxiety or low self-confidence. Nevertheless, more empirical studies should be done in this area.

Introduction

The population is aging. In Europe, for example, people aged 65 years and over form 18% of the whole population. By 2050, the older population is expected to outnumber the young population in many developed countries (Statista, 2017). This demographic trend may cause serious problems, such as an increase in aging diseases, one of which is dementia, whose main symptom is a decline in cognitive functioning. This decline is connected with brain atrophy, particularly in the temporal cortex, the region related to declarative memory (see Buckner, 2004), which is encoded by the hippocampus, entorhinal cortex and perirhinal cortex; with the loss of synaptic connections (Maston, 2010); and with the occurrence of neuropathological symptoms associated with dementia (see Antoniou and Wright, 2017). Although pharmacological research on this neurological disorder is ongoing, it has not yet brought satisfactory results as far as treatment is concerned (Karakaya et al., 2013).

Therefore, governments all over the world are trying to develop alternative, non-pharmacological strategies and activities that could help to prevent this cognitive decline while the aging population is still healthy, in order to reduce the future economic and social burden (Maresova et al., 2016). These non-pharmacological interventions, which have a positive impact on the enhancement of cognitive functions, can be divided into several groups: physical activities, cognitive training, healthy diet (see Klimova and Kuca, 2015), and social enhancement interventions (see Ballesteros et al., 2015), including the use of modern information and communication technologies (Peter et al., 2013; Ballesteros et al., 2014). One cognitive training activity that may enhance cognitive abilities and protect against decline in the healthy older population seems to be the learning of a foreign language (see Antoniou et al., 2013; Kroll and Dussias, 2017). As Connor (2016) points out, learning a foreign language can promote thinking skills, increase mental agility and delay the aging of the brain. However, as Kurdziel et al. (2017) explain, retrieving new words is harder for older people because their fluid intelligence (i.e., the ability to reason and solve problems) and their working, short-term memory (i.e., the management of immediately available information) are affected in the course of aging. In contrast, their crystallized intelligence (i.e., the ability to use experience, knowledge and skills) remains intact during the aging process (see Kavé et al., 2008). Kurdziel et al. (2017) also state that language ability declines more slowly in older people than global memory does. Moreover, older individuals possess a superior raw vocabulary, even when compared with well-educated young adults. In addition, foreign language learning does not have any side effects (Bak, 2016) and can help reduce a country's economic burden (Bialystok et al., 2016). Simply put, it does no harm (see Strauss, 2015). Abutalebi and Clahsen (2015) argue that knowledge about language processing in older individuals and about the potential factors that prevent cognitive decline is currently highly desirable, since it may help society prepare for the demographic changes it faces.

The purpose of this mini-review article is to discuss recent findings about the effect of foreign language learning on the enhancement of cognitive functions among healthy older individuals.

Methods

The methodology of this mini-review article is based on Moher et al. (2009). Studies were selected on the basis of the following keyword collocations: healthy aging and foreign language learning; healthy older individuals and foreign language learning; healthy older individuals and bilingualism. The searches were run in established databases: Web of Science, PubMed, Scopus and ScienceDirect. The search was not limited by time since studies on the research topic were scarce. Altogether 43 studies, including both review and original articles, were detected; most of them were identified in ScienceDirect and Web of Science, followed by PubMed and Scopus. The analysis was done by identifying the key words and checking for duplicates across the databases mentioned above. Afterwards, the studies were assessed for relevance, i.e., their abstracts were checked to verify whether each study corresponded to the set goal. After the exclusion of irrelevant studies, 26 studies remained for the full-text analysis. Of these 26 studies, 12 were empirical or randomized controlled studies, which are described in detail in Table 1. The review studies (e.g., Antoniou et al., 2013; Lee and Tzeng, 2016; Kurdziel et al., 2017), the studies dealing with younger adults (e.g., Schlegel et al., 2012; Bellander et al., 2016) and the studies with patients suffering from dementia, specifically Alzheimer's disease (e.g., Woumans et al., 2015; Bialystok et al., 2016), were used for comparison. Moreover, the author also explored websites connected with the research topic, e.g., SeniorsMatter (2017).
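The selection counts reported above can be tallied in a few lines. The following is only an illustrative sketch: the three input numbers come from the text, while the function name `screening_flow` and the dictionary keys are hypothetical labels introduced here. The text does not report how many studies were removed as duplicates and how many as irrelevant, so the sketch reports only the combined figure.

```python
def screening_flow(identified: int, full_text: int, empirical: int) -> dict:
    """Derive the remaining counts of a study-selection flow.

    identified: studies detected in the database search
    full_text:  studies kept for full-text analysis
    empirical:  empirical or randomized controlled studies among those kept
    """
    if not (identified >= full_text >= empirical >= 0):
        raise ValueError("counts must be non-negative and non-increasing")
    return {
        "identified": identified,
        # Duplicates plus irrelevant studies; the text gives no breakdown.
        "excluded_before_full_text": identified - full_text,
        "full_text": full_text,
        "empirical": empirical,
        # Reviews and studies of other populations, used for comparison.
        "used_for_comparison": full_text - empirical,
    }


# Counts reported in the text: 43 detected, 26 analyzed in full text,
# 12 empirical studies summarized in Table 1.
flow = screening_flow(identified=43, full_text=26, empirical=12)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

Under these counts, 17 studies were excluded before the full-text analysis, and 14 of the 26 analyzed studies served as comparison material rather than as entries in Table 1.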

Table 1. An overview of the detected empirical studies on the effect of foreign language learning on the enhancement of cognitive functions among healthy older individuals.

| Study | Objective | Number of subjects | Main outcome measures | Results |
|---|---|---|---|---|
| Ansaldo et al. (2015) | To examine the behavioral and neural traces of nonverbal interference control in healthy older bilinguals and monolinguals. | 20 subjects, mean age: 74 years. | Language assessment, neuropsychological tests, functional magnetic resonance imaging (fMRI) scanning. | Elderly bilinguals deal with interference control without recruiting a circuit that is particularly vulnerable to aging. |
| Bak et al. (2014) | To explore the effect of bilingualism on later-life cognition, controlling for childhood intelligence. | 853 participants. | First tested in 1947 (age 11) and then at the age of 70; a series of cognitive tests, including an intelligence test, with results compared against the participants' own test scores at the age of 11. | Bilinguals, including those who acquired a second language at a later age, performed significantly better than predicted from their baseline cognitive abilities, with the strongest effects on general intelligence and reading; the findings also suggest a positive effect of bilingualism on later-life cognition, including in those who acquired their second language in adulthood. |
| Bak et al. (2016) | To investigate the impact of a short intensive language course on attentional functions. | 67 participants aged 18–78 years. | Auditory tests of attentional inhibition and switching. | Even a short period of intensive language learning can modulate attentional functions, and all age groups can benefit from this effect. |
| Diaz-Orueta et al. (2012) | To examine and define the user requirements for developing digital learning games for older Europeans. | 86 subjects aged 60+ years from Spain, the Netherlands and Greece. | Focus group sessions with audio and video recordings. | The main aspects of interest were challenge, socialization, fun, learning opportunities and escape from daily routine. In addition, the content of these games should focus on foreign language learning, physical activity, or culture. |
| Kousaie and Phillips (2017) | To investigate the benefit of bilingualism among healthy older bilinguals and monolinguals with the help of behavioral and electrophysiological measures. | 43 healthy elderly individuals, aged between 60 and 83 years. | Montreal Cognitive Assessment, EEG recording. | There is evidence that older bilinguals show enhanced cognitive processing compared with older monolingual individuals. |
| Lawton et al. (2015) | To explore whether clinically diagnosed Alzheimer's disease and vascular dementia occurred at a later age in bilingual than in monolingual, immigrant and U.S.-born, Hispanic Americans. | 1789 community-dwelling Hispanic Americans, aged ≥60 years. | Cognitive testing; clinical examination; self-report using a three-point Likert-type scale for the evaluation of language proficiency. | Mean age of dementia diagnosis was not significantly different for bilingual and monolingual, U.S.-born or immigrant, Hispanic Americans. |
| Ramos et al. (2017) | To explore the relationship between language learning and switching ability in elderly monolingual participants who learned a second language during a whole academic year. | 43 older individuals aged 60–80 years. | A color-shape switching task. | The acquisition of a second language in the elderly does not necessarily lead to an enhancement of switching ability as measured by switching costs. |
| Sanders et al. (2012) | To verify whether non-native English speakers (n-NES) have a lower risk of incident dementia/AD and whether educational level might modify this relationship. | 1944 healthy older individuals aged ≥70 years. | Battery of cognitive performance tests at baseline and at each successive annual evaluation; nested Cox proportional hazards models. | n-NES status does not appear to have an independent protective effect against incident dementia/AD, and n-NES status may contribute to the risk of dementia in an education-dependent manner. |
| Ware et al. (2017) | To determine whether an English training program integrating technology is feasible for older French people. | 14 older people, average age: 75 years. | Standardized tests for measuring cognitive functions, questionnaires, post-intervention semi-directive interviews, and a content/theme analysis. | The program was stimulating and enjoyable, and it might be used as a therapeutic and cognitive intervention in the future. |
| Wilson et al. (2015) | To test the hypothesis that foreign language and music instruction in early life are associated with a lower incidence of mild cognitive impairment (MCI) and a slower rate of cognitive decline in old age. | 964 healthy older individuals. | Cognitive testing and clinical classification of MCI. | Higher levels of foreign language and music instruction during childhood and adolescence are associated in old age with a lower risk of developing MCI but not with the rate of cognitive decline. |
| Yeung et al. (2014) | To determine whether bilingualism is associated with dementia in cross-sectional or prospective analyses of older adults. | 1616 community-living healthy older adults. | Self-reports; cognitive testing; clinical examination. | There is no association between speaking more than one language and dementia. |
| Zahodne et al. (2014) | To test the hypothesis that dementia is diagnosed at older ages in bilinguals compared to monolinguals. | 1067 healthy older Hispanic immigrants in New York. | Self-report using a four-point Likert-type scale for the evaluation of language proficiency; Selective Reminding Test; Boston Naming Test; tests of verbal and nonverbal abstraction and letter fluency; Color Trails Test; Cox regression. | There is no protective effect of bilingualism on age-related cognitive decline or the development of dementia. |

Findings and Their Discussion

As stated in the “Methods” section, there is a lack of studies on foreign language learning and its effect on the enhancement of cognitive functioning in older people, apart from those on bilingualism (see Klimova et al., 2017a). Overall, the identified studies can be divided into three main areas: studies concerning brain plasticity in old age and foreign language learning; studies focused on foreign language learning among healthy older individuals; and studies on bilingualism and healthy aging, including electrophysiological studies. All of them also discuss cognitive aspects.

Plasticity of the Brain in the Old Age and Foreign Language Learning

The brain retains considerable plasticity even in old age. Although neural deterioration increases with age, the brain has the capacity to increase neural activity and develop neural scaffolding to regulate cognitive function (Park and Reuter-Lorenz, 2009; Reuter-Lorenz and Park, 2014). For example, Cheng et al. (2015) maintain that both short-term and long-term foreign language learning may lead to changes in the structure of the brain, which may in turn promote cognitive reserve, i.e., resilience to neuropathological damage of the brain (Stern, 2013). This is supported by Lee and Tzeng (2016), who claim that foreign language learning results in effective structural as well as functional connectivity in the brain due to neural plasticity. They indicate that this effective connectivity enhances the capacity for language processing and general executive control by reorganizing neural circuitries. Furthermore, research shows that foreign language learning has a positive impact on both white and gray matter structures (see Bellander et al., 2016). For instance, Schlegel et al. (2012), in their randomized controlled study of 11 English speakers (average age of 20 years) who took a 9-month intensive course in written and spoken Modern Standard Chinese and 16 controls who did not study a language, reported that white matter plasticity played a significant role in adult language learning. Although their adult learners showed progressive changes in white matter tracts associated with traditional left hemisphere language areas and their right hemisphere analogs, the most important changes appeared in frontal lobe tracts crossing the genu of the corpus callosum, a region that is not generally included in current neural models of language processing. Tyler et al. (2010), in their study on preserved syntactic processing across the life span, argue that this is caused by the shift from a primarily left hemisphere frontotemporal system to a bilateral functional language network. In addition, Connor (2016) described a study of retired people who took an intensive course of 5 h a day on the Isle of Skye to learn Gaelic (see Bak et al., 2016). After the course, the scientists found that these people were more mentally agile than those who took a course on something else. As Antoniou et al. (2013) indicate, foreign language training may engage a larger brain network than other forms of cognitive training that have been investigated (e.g., math and crossword puzzles), and it is likely to require long-distance neural connections. However, not all findings on brain plasticity and the aging process are positive. For instance, the controlled study by Ramos et al. (2017) reports that switching ability (i.e., the ability to shift attention between one task and another) was not enhanced by learning a foreign language, in this case Basque, among elderly Spanish people.

Foreign Language Learning Among Healthy Older Individuals

In the most recent study on foreign language learning and its effect on cognitive functioning, Ware et al. (2017) developed a technology-based English training program for older French adults. The program was based on the assumptions provided by Antoniou et al. (2013): computer-based language training can be administered anywhere and at any time to suit the learner's needs, the content can be adjusted, items can be repeated, and learners can socialize. The average age of the participants was 75 years. The course lasted 4 months and consisted of sixteen 2-h sessions. The researchers used a standardized test for measuring cognitive functions (Montreal Cognitive Assessment), as well as the University of California Loneliness Assessment for measuring subjective feelings of loneliness and social isolation; neither measure changed significantly after the course. Nevertheless, the researchers found that their program was feasible for this age group and that the participants enjoyed it. Similarly, Bak et al. (2016) studied the effect of a short, 1-week Scottish Gaelic course on attentional functions among 67 adults aged between 18 and 78 years and found that even a short period of intensive language learning can modulate attentional functions and that all age groups can benefit from this effect. At the beginning of the course there was no difference between the groups. At the end of the course, however, a considerable improvement in attention switching was detected in the language group (p < 0.001) but not in the control group (p = 0.127), independent of the age of the subjects. They also suggested that these short-term effects could be maintained through continued practice, with a minimum study period of 5 h a week.

Research also indicates that age is not such a significant factor in second language acquisition, whereas the length of exposure to the target language is important (Bialystok, 1997). On the one hand, older people may need more time and practice to learn a foreign language because of difficulty distinguishing new sounds and retrieving novel words; on the other hand, they are more relaxed and motivated to learn (see SeniorsMatter, 2017). As already pointed out, the main problem for older people is retrieving new words (see Kurdziel et al., 2017). However, they are able to retain new words easily if the words are presented in context. Kurdziel et al. (2017) also revealed that newly learned words are stored in the hippocampus during encoding and then integrated into the lexicon during sleep. Sleep quality, however, is often impaired in old age, and older people are therefore unable to retain as many words as their younger counterparts, whose sleep is longer and unbroken.

Diaz-Orueta et al. (2012) report that the main incentives for older people to learn a foreign language are challenge, socialization, fun, learning opportunities and escape from daily routine. Moreover, older individuals may already have experience of learning a foreign language, which can help them in acquiring a new one (see Singleton and Lengyel, 1995).

Kurdziel et al. (2017) go further, suggesting that learning should continue throughout aging because older people who keep mentally and physically active are less likely to be cognitively impaired and depressed. In fact, depression seems to be one of the most serious comorbidities in the aging process (Popa-Wagner et al., 2014; Sandu et al., 2015). Furthermore, foreign language learning increases self-confidence and enables older people to travel and communicate with their peers in foreign countries.

Bilingualism and Healthy Aging

The theory of bilingualism states that people who acquire a second language in adulthood may delay cognitive decline in later life by approximately 4.5 years (see Bialystok et al., 2007, 2016; Bak et al., 2014; Wilson et al., 2015; Woumans et al., 2015). In their recent study, on the basis of measures of cognitive function and brain structure, Bialystok et al. (2016) show that bilingualism can delay cognitive decline. As Bialystok et al. (2004) and Bialystok (2006) state, bilingualism helps compensate for age-related losses in certain executive processes. Furthermore, bilingual people possess better mental flexibility because they are used to adapting to constant changes and processing information more effectively than monolingual individuals. However, these results come mainly from retrospective studies on bilingualism; prospective studies, such as Lawton et al. (2015), Sanders et al. (2012), Yeung et al. (2014), or Zahodne et al. (2014), have not yielded significant results in this respect (see Klimova et al., 2017a). For instance, Mukadam et al. (2017), in the most recent study, found that retrospective studies are prone to confounding by education or by cultural differences in the presentation of dementia, and are thus not suitable for establishing causal links between risk factors and outcomes. However, electrophysiological studies indicate that bilingualism may enhance cognitive functions among healthy older individuals (e.g., Kousaie and Phillips, 2017). Moreover, as Ansaldo et al. (2015) state, healthy older bilinguals deal with interference control without recruiting a circuit that is particularly vulnerable to aging.

Table 1 summarizes the main findings of the studies on the effect of foreign language learning on the enhancement of cognitive functions for healthy older individuals.

The limitations of this mini-review mainly involve a lack of relevant studies on the research topic. This scarcity may lead to overestimation of the reported effects, which may have an adverse impact on the validity of the reviewed studies (see Melby-Lervåg and Hulme, 2016).

Overall, some of the findings in Table 1, as well as those from the other studies mentioned, indicate that learning a foreign language may generate benefits for older individuals, such as enhanced cognitive functioning (Bak et al., 2014, 2016; Ansaldo et al., 2015; Kousaie and Phillips, 2017), improved self-esteem (Ware et al., 2017), or increased opportunities for socializing (Diaz-Orueta et al., 2012; Ballesteros et al., 2015). Bialystok et al. (2016) also emphasize that second-language learning has long-term implications for public health in terms of cost-effectiveness. In addition, as Ware et al. (2017) indicate, any intervention program on foreign language learning should be well thought out and tailored to the needs of older people in order to be effective and to avoid accompanying factors such as older people's anxiety or low self-confidence.

Compared with intervention studies focusing on physical activities (see Klimova et al., 2017b), there is still less evidence for the effect of foreign language learning on the enhancement of cognitive functions among the healthy aging population. This is mainly due to a lack of research in this area.

Author Contributions

BK drafted, analyzed, wrote and read the whole manuscript herself.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding. This article is supported by the project Excellence (2018) at the Faculty of Informatics and Management of the University of Hradec Králové, Czechia.

References

  • Abutalebi J., Clahsen H. (2015). Bilingualism, cognition, and aging. Editorial. Biling. Lang. Cogn. 18, 1–2. doi: 10.1017/s1366728914000741
  • Ansaldo A. I., Ghazi-Saidi L., Adrover-Roig D. (2015). Interference control in elderly bilinguals: appearances can be misleading. J. Clin. Exp. Neuropsychol. 37, 455–470. doi: 10.1080/13803395.2014.990359
  • Antoniou M., Gunasekera G., Wong P. C. M. (2013). Foreign language training as cognitive therapy for age-related cognitive decline: a hypothesis for future research. Neurosci. Biobehav. Rev. 37, 2689–2698. doi: 10.1016/j.neubiorev.2013.09.004
  • Antoniou M., Wright S. M. (2017). Uncovering the mechanisms responsible for why language learning may promote healthy cognitive aging. Front. Psychol. 8:2217. doi: 10.3389/fpsyg.2017.02217
  • Bak T. H. (2016). Language lessons to help protect against dementia. BMJ 354:i5039. doi: 10.1136/bmj.i5039
  • Bak T. H., Long M. R., Vega-Mendoza M., Sorace A. (2016). Novelty, challenge and practice: the impact of intensive language learning on attentional functions. PLoS One 11:e0153485. doi: 10.1371/journal.pone.0153485
  • Bak T. H., Nissan J. J., Allerhand M. M., Deary I. J. (2014). Does bilingualism influence cognitive aging? Ann. Neurol. 75, 959–963. doi: 10.1002/ana.24158
  • Ballesteros S., Kraft E., Santana S., Tziraki C. (2015). Maintaining older brain functionality: a targeted review. Neurosci. Biobehav. Rev. 55, 453–477. doi: 10.1016/j.neubiorev.2015.06.008
  • Ballesteros S., Toril P., Mayas J., Reales J. M., Waterworth J. A. (2014). An ICT social network in support of successful ageing. Gerontechnology 13, 37–46. doi: 10.4017/gt.2014.13.1.007.00
  • Bellander M., Berggren R., Mårtensson J., Brehmer Y., Wenger E., Li T. Q., et al. (2016). Behavioral correlates of changes in hippocampal gray matter structure during acquisition of foreign vocabulary. Neuroimage 131, 205–213. doi: 10.1016/j.neuroimage.2015.10.020
  • Bialystok E. (1997). The structure of age: in search of barriers to second language acquisition. Second Lang. Res. 13, 116–137. doi: 10.1191/026765897677670241
  • Bialystok E. (2006). Effect of bilingualism and computer video game experience on the Simon task. Can. J. Exp. Psychol. 60, 68–79. doi: 10.1037/cjep2006008
  • Bialystok E., Abutalebi J., Bak T. H., Burke D. M., Kroll J. (2016). Aging in two languages: implications for public health. Ageing Res. Rev. 27, 56–60. doi: 10.1016/j.arr.2016.03.003
  • Bialystok E., Craik F. I. M., Freedman M. (2007). Bilingualism as a protection against the onset of symptoms of dementia. Neuropsychologia 45, 459–464. doi: 10.1016/j.neuropsychologia.2006.10.009
  • Bialystok E., Craik F. I. M., Klein R., Viswanathan M. (2004). Bilingualism, aging, and cognitive control: evidence from the Simon task. Psychol. Aging 19, 290–303. doi: 10.1037/0882-7974.19.2.290
  • Buckner R. L. (2004). Memory and executive function in aging and AD: multiple factors that cause decline and reserve factors that compensate. Neuron 44, 195–208. doi: 10.1016/j.neuron.2004.09.006
  • Cheng K. W., Deng Y. H., Li M., Yan H. M. (2015). The impact of L2 learning on cognitive aging. ADMET 3, 260–273. doi: 10.5599/admet.3.3.206
  • Connor S. (2016). Learning second language can delay ageing of the brain, say scientists. Available online at: http://www.independent.co.uk/news/science/learning-second-language-can-delay-ageing-of-the-brain-say-scientists-a6873796.html [Accessed on December 28, 2017].
  • Diaz-Orueta U., Facal D., Nap H. H., Ranga M. M. (2012). What is the key for older people to show interest in playing digital learning games? Initial qualitative findings from the LEAGE project on a multicultural European sample. Games Health J. 1, 115–123. doi: 10.1089/g4h.2011.0024
  • Karakaya T., Fußer F., Schröder J., Pantel J. (2013). Pharmacological treatment of mild cognitive impairment as a prodromal syndrome of Alzheimer’s disease. Curr. Neuropharmacol. 11, 102–108. doi: 10.2174/157015913804999487
  • Kavé G., Eyal N., Shorek A., Cohen-Mansfield J. (2008). Multilingualism and cognitive state in the oldest old. Psychol. Aging 23, 70–78. doi: 10.1037/0882-7974.23.1.70
  • Klimova B., Kuca K. (2015). Alzheimer’s disease: potential preventive, non-invasive, intervention strategies in lowering the risk of cognitive decline – a review study. J. Appl. Biomed. 13, 257–261. doi: 10.1016/j.jab.2015.07.004
  • Klimova B., Valis M., Kuca K. (2017a). Bilingualism as a strategy to delay the onset of Alzheimer’s disease. Clin. Interv. Aging 12, 1731–1737. doi: 10.2147/CIA.s145397
  • Klimova B., Valis M., Kuca K. (2017b). Cognitive decline in normal aging and its prevention: a review on non-pharmacological lifestyle strategies. Clin. Interv. Aging 12, 903–910. doi: 10.2147/CIA.s132963
  • Kousaie S., Phillips N. A. (2017). A behavioral and electrophysiological investigation of the effect of bilingualism on aging and cognitive control. Neuropsychologia 94, 23–35. doi: 10.1016/j.neuropsychologia.2016.11.013
  • Kroll J. F., Dussias P. E. (2017). The benefits of multilingualism to the personal and professional development of residents of the US. Foreign Lang. Ann. 50, 248–259. doi: 10.1111/flan.12271
  • Kurdziel L. B. F., Mantua J., Spencer R. M. C. (2017). Novel word learning in older adults: a role for sleep? Brain Lang. 167, 106–113. doi: 10.1016/j.bandl.2016.05.010
  • Lawton D. M., Gasquoine P. G., Weimer A. A. (2015). Age of dementia diagnosis in community dwelling bilingual and monolingual Hispanic Americans. Cortex 66, 141–145. doi: 10.1016/j.cortex.2014.11.017
  • Lee R. R. W., Tzeng O. J. L. (2016). Neural bilingualism: a new look at an old problem. Lang. Linguist. 17, 147–193. doi: 10.1177/1606822X15614523
  • Maresova P., Klimova B., Novotny M., Kuca K. (2016). Alzheimer’s disease and Parkinson’s diseases: expected economic impact on Europe – a call for a uniform European strategy. J. Alzheimers Dis. 54, 1123–1133. doi: 10.3233/JAD-160484
  • Maston L. (2010). Declarative (explicit) and procedural (implicit) memory. Available online at: http://www.human-memory.net/types_declarative.html [Accessed on March 11, 2018].
  • Melby-Lervåg M., Hulme C. (2016). There is no convincing evidence that working memory training is effective: a reply to Au et al. (2014) and Karbach and Verhaeghen (2014). Psychon. Bull. Rev. 23, 324–330. doi: 10.3758/s13423-015-0862-z
  • Moher D., Liberati A., Tetzlaff J., Altman D. G., The PRISMA Group (2009). Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 6:e1000097. doi: 10.1371/journal.pmed.1000097
  • Mukadam N., Sommerlad A., Livingston G. (2017). The relationship of bilingualism compared to monolingualism to the risk of cognitive decline or dementia: a systematic review and meta-analysis. J. Alzheimers Dis. 58, 45–54. doi: 10.3233/JAD-170131
  • Park D. C., Reuter-Lorenz P. (2009). The adaptive brain: aging and neurocognitive scaffolding. Annu. Rev. Psychol. 60, 173–196. doi: 10.1146/annurev.psych.59.103006.093656
  • Peter C., Kreisner A., Schröter M., Hyosun K., Bieber G., Öhberg F., et al. (2013). AGNES: connecting people in a multimodal way. J. Multimod. User Interf. 7, 229–245. doi: 10.1007/s12193-013-0118-z
  • Popa-Wagner A., Buga A. M., Tica A. A., Albu C. V. (2014). Perfusion deficits, inflammation and aging precipitate depressive behavior. Biogerontology 15, 439–448. doi: 10.1007/s10522-014-9516-1
  • Ramos S., Garcia Y. F., Anton E., Casaponsa A., Dunabeitia J. A. (2017). Does learning a language in the elderly enhance switching ability? J. Neurolinguistics 43, 39–48. doi: 10.1016/j.jneuroling.2016.09.001
  • Reuter-Lorenz P. A., Park D. C. (2014). How does it STAC up? Revisiting the scaffolding theory of aging and cognition. Neuropsychol. Rev. 24, 355–370. doi: 10.1007/s11065-014-9270-9
  • Sanders A. E., Hall C. B., Katz M. J., Lipton R. B. (2012). Non-native language use and risk of incident dementia in the elderly. J. Alzheimers Dis. 29, 99–108. doi: 10.3233/JAD-2011-111631
  • Sandu R. E., Buga A. M., Uzoni A., Petcu E. B., Popa-Wagner A. (2015). Neuroinflammation and comorbidities are frequently ignored factors in CNS pathology. Neural Regen. Res. 10, 1349–1355. doi: 10.4103/1673-5374.165208
  • Schlegel A. A., Rudelson J. J., Tse P. U. (2012). White matter structure changes as adults learn a second language. J. Cogn. Neurosci. 24, 1664–1670. doi: 10.1162/jocn_a_00240
  • SeniorsMatter (2017). Learning a language as a senior. Available online at: http://seniorsmatter.com/learning-a-language-as-a-senior/
  • Singleton D. M., Lengyel Z. (1995). The Age Factor in Second Language Acquisition: A Critical Look at the Critical Period Hypothesis. Clevedon: Multilingual Matters.
  • Statista (2017). Proportion of selected age groups of world population in 2017, by region. Available online at: https://www.statista.com/statistics/265759/world-population-by-age-and-region/ [Accessed on January 2, 2018].
  • Stern Y. (2013). Cognitive reserve: implications for assessment and intervention. Folia Phoniatr. Logop. 65, 49–54. doi: 10.1159/000353443
  • Strauss S. (2015). Does bilingualism delay dementia? CMAJ 187, E209–E210. doi: 10.1503/cmaj.109-5022
  • Tyler L. K., Shafto M. A., Randall B., Wright P., Marslen-Wilson W. D., Stamatakis E. A. (2010). Preserving syntactic processing across the adult life span: the modulation of the frontotemporal language system in the context of age-related atrophy. Cereb. Cortex 20, 352–364. doi: 10.1093/cercor/bhp105
  • Ware C., Damnee S., Djabelkhir L., Cristancho V., Wu Y. H., Benovici J., et al.. (2017). Maintaining cognitive functioning in healthy seniors with a technology-based foreign language program: a pilot feasibility study . Front. Aging Neurosci. 9 :42. 10.3389/fnagi.2017.00042 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wilson R. S., Boyle P. A., Yang J., James B. D., Bennett D. A. (2015). Early life instruction in foreign language and music and incidence of mild cognitive impairment . Neuropsychology 29 , 292–302. 10.1037/neu0000129 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Woumans E., Santens P., Sieben A., Versijpt J., Stevens M., Duyck W. (2015). Bilingualism delays clinical manifestation of Alzheimer’s disease . Biling. Lang. Cogn. 18 , 568–574. 10.1017/s136672891400087x [ CrossRef ] [ Google Scholar ]
  • Yeung C. M., St John P. D., Menec V., Tyas S. L. (2014). Is bilingualism associated with a lower risk of dementia in community-living old adults? Cross-sectional and prospective analyses . Alzheimer Dis. Assoc. Disord. 28 , 326–332. 10.1097/WAD.0000000000000019 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zahodne L. B., Schofield P. W., Farrell M. T., Stern Y., Manly J. J. (2014). Bilingualism does not alter cognitive decline or dementia risk among Spanish speaking immigrants . Neuropsychology 28 , 238–246. 10.1037/neu0000014 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]

ORIGINAL RESEARCH article

The effect of language learning strategies on proficiency, attitudes and school achievement.

Anita Habók*

  • Institute of Education, University of Szeged, Szeged, Hungary

This study examines language learning strategy (LLS) use in connexion with foreign language attitude, proficiency and general school achievement among lower secondary students in Years 5 and 8 ( n = 868) in Hungary. An adapted version of the Strategy Inventory for Language Learning questionnaire was used for data collection. The results showed that Hungarian students mainly engage in metacognitive strategies in both years. Differences between more and less proficient language learners’ strategy use have also been found. With regard to the effect of LLS on foreign language attitude, the foreign language mark and school achievement, path analysis indicated a good fit in both years. The metacognitive, social and memory strategies primarily influenced foreign language attitudes and marks in Year 5. The metacognitive strategies had a slight impact on school achievement as well as on foreign language marks. We demonstrated the dominant effect of metacognitive strategies and the low effect of memory strategies in Year 8. In addition, metacognitive strategies also influenced foreign language marks. The effect of foreign language marks on school achievement was also remarkable. There was a strong impact on the children’s attitudes through these variables.

Introduction

In recent decades, a number of studies have focused on foreign language learning, with the emphasis often having been placed on language learning strategies (LLS; Wong and Nunan, 2011 ; Oxford, 2016 ). Several studies have confirmed that these strategies aid students in becoming more effective learners inside the classroom and foster more efficient development of students’ mastery of the target language after leaving school ( Wong and Nunan, 2011 ). However, less is known about the structure and relationship between LLS, foreign language attitude, the foreign language mark and general school achievement (GA). Recent studies have mainly dealt with LLS among university students and upper secondary students, with only a few investigations having been conducted among lower secondary students. In the present study, we aim to examine young Hungarian students’ LLS use and its connexion to foreign language attitude, the foreign language mark and school achievement at the beginning and end of lower secondary school. We believe that it adds value to the article that we have investigated a young age group, as the beginning period of language learning can establish the success of the entire process. Another advantage of our research is that we analysed the whole language learning process in connexion with several other factors to represent the complexity of the language learning process.

Theoretical Background

Studies on LLS in recent decades have identified a large number of strategies which are employed by English as a foreign/second language (EFL/ESL) learners, and several strategy categorisation patterns have also been established. The most frequently used taxonomy was developed by Oxford (1990) . She identified three direct and three indirect strategy types. Direct strategies are specific means of language use: memory, cognitive and compensatory (or compensation) strategies. Indirect strategies, such as metacognitive, affective and social strategies, support language learning indirectly. Recently, Oxford revisited her strategy categories and developed a model with four different strategy categories: cognitive, affective and sociocultural-interactive as well as a master category of “metastrategies.” Metastrategies comprise metacognitive, meta-affective and meta-sociocultural-interactive strategies ( Griffiths and Oxford, 2014 ; Oxford, 2016 ). However, she did not elaborate on this strategy classification, and thus our study relied on her original taxonomy.

Various studies have focused on LLS use and aimed to identify the strategies most frequently employed by language learners ( Chamot, 2004 ; Magogwe and Oliver, 2007 ; Wu, 2008 ; Chen, 2009 ; Al-Qahtani, 2013 ; Charoento, 2016 ; Alhaysony, 2017 ; Dawadi, 2017 ). Overall, it can be concluded that the most commonly used LLS in these studies were metacognitive, compensation and cognitive strategies. However, Chamot (2004) pointed out that different strategy preferences were reported by students in different cultural contexts. Chinese and Singaporean students reported a higher level preference for social strategies and lower use of affective strategies than European students.

Some studies have dealt with the implementation of the SILL with a focus on school-aged students ( Magogwe and Oliver, 2007 ; Chen, 2009 , 2014 ; Gunning and Oxford, 2014 ; Platsidou and Kantaridou, 2014 ; Pfenninger and Singleton, 2017 ). The overall conclusion of these studies has been that young learners mostly used social, affective and compensation strategies. The use of memory strategies was relatively low ( Doró and Habók, 2013 ). The attitudes of learners at this age toward language learning are particularly important since they can greatly determine motivation, learning outcomes and later success in language learning ( Platsidou and Kantaridou, 2014 ; Platsidou and Sipitanou, 2014 ).

As the purpose of investigating LLS is to foster learning processes and improve language level, research projects often deal with LLS use in relation to language learning proficiency ( Khaldieh, 2000 ; Magogwe and Oliver, 2007 ; Wu, 2008 ; Chen, 2009 ; Liu, 2010 ; Al-Qahtani, 2013 ; Platsidou and Kantaridou, 2014 ; Charoento, 2016 ; Rao, 2016 ). The notion of proficiency has been defined and operationalised in a multitude of ways by various researchers. Charoento (2016) used self-ratings, Wu (2008) used the results from language proficiency and achievement tests, and Magogwe and Oliver (2007) incorporated language course grades into their analysis. Most studies have shown a positive relationship between LLS and proficiency, but the direction of their connexion was often different. Some researchers have stressed that strategy use was mainly determined by proficiency. More proficient students engaged in LLS more frequently and also employed a broader range of strategies overall compared to less proficient students ( Khaldieh, 2000 ; Wu, 2008 ; Rao, 2016 ). Al-Qahtani (2013) and Charoento (2016) demonstrated that successful students mainly used cognitive strategies, while Wu (2008) emphasised significant utilisation of cognitive, metacognitive and social strategies among more proficient university students. Chen (2009) pointed to the use of fewer communication strategies among proficient learners, but noted that they employed them more efficiently than less proficient learners. In addition, Magogwe and Oliver (2007) also established that the basic difference in LLS use between proficient and less proficient learners was that more successful students not only used certain LLS significantly more often, but were also able to select the most appropriate strategies depending on the goal of their task.

Some studies have dealt with the effect of LLS use on language proficiency. Both Liu (2010) and Platsidou and Kantaridou (2014) pointed out that learning strategy influences language use and that it plays a significant role in anticipating perceived language performance. Wu (2008) noted that cognitive strategies have the most dominant influence on proficiency. Rao (2016) found that students’ English proficiency significantly affected their learning strategy use and also observed that high-level students avail themselves of more strategies more frequently than low-level students.

Another essential area of LLS research is the study of strategy use in relation to affective variables, such as attitude and motivation ( Shang, 2010 ; Jabbari and Golkar, 2014 ; Platsidou and Kantaridou, 2014 ). Most of these studies have found that learners with a positive attitude employed LLS more frequently compared to learners with a negative attitude. Platsidou and Kantaridou (2014) reported that attitudes toward second language learning influence both direct and indirect strategy uses and that changing learners’ attitudes toward language learning can thus foster their strategy practises. Jabbari and Golkar (2014) established that learners with a positive attitude employ cognitive, compensation, metacognitive and social strategies more frequently.

It can be concluded that LLS use has been studied extensively in recent decades. Most research has found that LLS cannot be analysed separately; it must be examined in relation to certain other factors, among which foreign language attitudes and proficiency play a central role ( Griffiths and Incecay, 2016 ). However, most previous studies preferred university students or adults to primary or secondary school-aged students. Furthermore, a limited amount of research has investigated the relationship of LLS with attitude toward foreign language learning and the foreign language mark. There has also been a dearth of scholarship on how language proficiency and school achievement are determined by LLS use and attitude. Our study aims to fill this gap and attempts to present a comprehensive view of the relationship between LLS use and language attitude and between proficiency and general school achievement by focusing on school children at the beginning and end of lower secondary school. The specific research question we focus on in this paper is the following:

What are lower secondary school children’s strategy use preferences, and how are these connected with their foreign language attitude, proficiency and general school achievement? Based on the relevant literature, we assume that students of this age mainly employ indirect strategies, such as affective, metacognitive and social strategies, and that these have a significant impact on their foreign language learning attitude, proficiency and general school achievement.

Materials and Methods

Participants

The participants in the present study were lower secondary students (11- and 14-year-olds) in Hungary ( n Year5 = 450, n Year8 = 418). Participation in the study was voluntary both for schools and students. This study was carried out in accordance with the recommendations of the University of Szeged, the Hungarian law and the municipalities that maintain the schools. The IRB of the Doctoral School (University of Szeged) specifically approved this research project. The agreements are documented and stored in written form in the schools.

Our target group generally started learning a foreign language in Year 4. As one portion of our sample have been learning a foreign language for at least four years, they must have experience of how they learn language. In Hungary, the primary level of education is composed of the elementary and lower secondary school levels; hence, the transition occurs with relatively few major changes, and children have the same language teacher during these school levels. While the foreign language teacher does not change, the other school subjects are taught by specialist teachers as of Year 5. Learning difficulties and differences among children grow considerably from the beginning of lower secondary school; hence, diagnosing language learning attitude is particularly essential.

Instruments

The Strategy Inventory for Language Learning (SILL, Oxford, 1990 ) was administered to investigate the children’s LLS use. The SILL is a standardised measurement tool, and it is applicable to various foreign languages. The questionnaire is clustered into six strategy fields: (1) memory (9 items); (2) cognitive (14 items); (3) compensation (6 items); (4) metacognitive (9 items); (5) affective (6 items); and (6) social strategies (6 items). The participants were asked to respond to each statement on a five-point Likert scale. The answers ranged from ‘1 = never or almost never true of me’ to ‘5 = always or almost always true of me.’ The reported internal consistency reliabilities of the questionnaires ranged between 0.91 and 0.94 (Cronbach’s alpha) ( Oxford and Burry-Stock, 1995 ; Ardasheva and Tretter, 2013 ). The questionnaire was administered in Hungarian to eliminate differences in English knowledge and make it suitable for the language levels in these age groups. The reliability of the Hungarian version was confirmed in previous research ( Doró and Habók, 2013 ). In addition, the children were asked to self-report their foreign language attitude, foreign language mark (indicating students’ foreign language knowledge) and general school achievement (grade point average, which includes students’ achievement in all subjects) on a five-point scale. In Hungarian schools, the different proficiency levels are rated on a five-point scale: 1 is the weakest mark, and 5 is the best.
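
The scoring implied by this description can be illustrated with a short sketch. It assumes that each strategy field is scored as the mean of its item ratings and that the 50 items are stored in field order (memory first, social last); the function name `score_sill` and the example respondent are hypothetical and are not taken from the study.

```python
from statistics import mean

# Item counts per strategy field, as listed in the text (50 items in total).
SILL_FIELDS = {
    "memory": 9,
    "cognitive": 14,
    "compensation": 6,
    "metacognitive": 9,
    "affective": 6,
    "social": 6,
}


def score_sill(responses):
    """Return the mean rating (1-5) for each strategy field.

    Assumes `responses` holds the 50 ratings in field order
    (memory first, social last); this ordering is an assumption.
    """
    expected = sum(SILL_FIELDS.values())
    if len(responses) != expected:
        raise ValueError(f"expected {expected} ratings, got {len(responses)}")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("ratings must be integers from 1 to 5")

    scores, start = {}, 0
    for field, n_items in SILL_FIELDS.items():
        # Slice out this field's items and average them.
        scores[field] = mean(responses[start:start + n_items])
        start += n_items
    return scores


# Hypothetical respondent: rates every memory item 2 and every other item 4.
example = [2] * 9 + [4] * 41
print(score_sill(example))
```

Averaging rather than summing keeps the six fields comparable even though they contain different numbers of items.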

Design and Procedure

Quantitative research design was employed through online survey methodology. The SILL questionnaire was administered via the eDia online testing platform, which was developed by the Centre for Research on Learning and Instruction for assessing Year 1–6 children’s foreign language knowledge and attitudes. One school lesson was provided for data collection; however, the children needed approximately 20 min to hand in their ratings. Both the children and teachers are familiar with this system because the online platform has been in use since 2009.

Data were handled confidentially during the testing procedure; the children used an identification code provided by research administrators. The researchers were only able to see the codes, and only the teachers were able to identify their students with the codes. All the instructions were in the online questionnaire, so the children were able to answer the questions individually. The teachers were also requested to report the children’s questions, remarks and difficulties during testing. Finally, the teachers reported no misunderstandings or problematic items during data collection.

The data analyses were twofold. First, SPSS for Microsoft Windows 20.0 was employed for classical test analysis, which included an estimation of frequencies, means and standard deviations. The significance of differences among the variables was determined by ANOVA. Second, path analysis was carried out with the SPSS AMOS v20 software package to analyse the effect of strategy use on the variables under observation ( Arbuckle, 2008 ). The model fit was indicated by the Tucker–Lewis index (TLI), the normed fit index (NFI), the comparative fit index (CFI) and the root mean square error of approximation (RMSEA) ( Byrne, 2010 ; Kline, 2015 ).
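
For readers unfamiliar with ANOVA, the following minimal sketch shows how a one-way F statistic is obtained from between-group and within-group variability. It is a hand-written illustration, not SPSS output; the function name `one_way_anova_f` and the three groups of ratings are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical strategy ratings for three groups of learners.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value indicates that the group means differ by more than would be expected from the spread within the groups; the associated p value is then read from the F distribution with the two degrees of freedom shown.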

Descriptive Analysis

General Strategy Use among Lower Secondary School Children

The mean scores and standard deviations showed moderate LLS use, with the use of metacognitive, affective and social strategies being the highest in Year 5 (Table 1 ). Compensatory strategies were employed significantly less often than the other strategy types. In Year 8, besides metacognitive and social strategies, cognitive strategies were relied on the most. Metacognitive strategy use was similarly high in both age groups. Significant differences were found between the age groups in memory, compensation and affective strategies ( p ≤ 0.01). While the use of affective strategies was relatively high in Year 5, these strategies were the least frequently employed in Year 8.

TABLE 1. The strategy use results for the sample.

Differences in Strategy Use among Students with Different Proficiency Levels

One of our goals was to identify students’ LLS use preferences according to their proficiency levels. To implement this goal, we grouped the children into categories according to their proficiency, which was derived from their foreign language marks.

We combined the foreign language marks for those children who were evaluated with a 1 or a 2. These children showed a very low knowledge level and demonstrated a large number of difficulties and misunderstandings in foreign language learning. The next group was formed of children who were assessed at mark 3. This mark indicated an average knowledge level with gaps. Children who were evaluated with a mark 4 had fewer significant deficits. Children who received a mark 5 were the highest performers in school. Tables 2 , 3 summarise our results on strategy use according to foreign language marks. The number of children is also indicated according to each category.
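
The grouping described above can be sketched as follows. The group labels, the function names and the example records are hypothetical; only the rule of merging marks 1 and 2 and keeping marks 3, 4 and 5 separate comes from the text.

```python
from collections import defaultdict
from statistics import mean


def proficiency_group(mark):
    """Map a 1-5 foreign language mark to one of the four groups in the text."""
    if mark not in (1, 2, 3, 4, 5):
        raise ValueError("mark must be an integer from 1 to 5")
    # Marks 1 and 2 are merged into a single low-proficiency group.
    return "1-2" if mark <= 2 else str(int(mark))


def mean_use_by_group(records):
    """records: iterable of (mark, mean strategy rating) pairs."""
    buckets = defaultdict(list)
    for mark, strategy_mean in records:
        buckets[proficiency_group(mark)].append(strategy_mean)
    return {group: mean(values) for group, values in sorted(buckets.items())}


# Hypothetical (mark, mean metacognitive rating) pairs.
sample = [(1, 2.0), (2, 3.0), (3, 3.0), (4, 3.5), (5, 4.0), (5, 4.5)]
print(mean_use_by_group(sample))
```

The resulting group means are the kind of values that a table of strategy use by foreign language mark would report for each strategy field.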

TABLE 2. Means of strategy users according to their foreign language mark in Year 5.

TABLE 3. Means of strategy users according to their foreign language mark in Year 8.

Multivariate Analyses

The Relationships between LLS and Foreign Language Attitude, LLS and Foreign Language Marks, and LLS and General School Achievement

Our results demonstrated that the sample was evaluated at an approximate level of mark 4 ( M Year5 = 3.84, SD Year5 = 1.17; M Year8 = 3.62, SD Year8 = 1.17); however, Year 5 children achieved significantly higher marks ( p < 0.01). As regards children’s attitudes, we found no significant differences between the years ( M Year5 = 3.53, SD Year5 = 1.35; M Year8 = 3.43, SD Year8 = 1.23; p > 0.05). On the whole, children’s foreign language marks are higher than their ratings of their attitude toward foreign languages. The average school achievement showed significantly higher means than foreign language marks in both years ( M Year5 = 3.82, SD Year5 = 0.87, p < 0.001; M Year8 = 3.62, SD Year8 = 1.17, p < 0.001).

We also examined the correlation between LLS and attitude toward foreign languages, LLS and the foreign language mark, and LLS and general school achievement. We observed the strongest correlations between language learning strategy use and attitude in Year 5 ( r = 0.53–0.20; p < 0.001–0.05). The correlation coefficient between attitude and the foreign language mark was also significant ( r = 0.37; p < 0.001). We noted that children who achieved higher in foreign languages showed a more positive attitude toward them. We also noticed a significant relationship between the foreign language mark and strategy use ( r = 0.49–0.13; p < 0.001–0.05).
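
As an illustration of the statistic reported here, the sketch below computes a Pearson correlation coefficient by hand. The function name `pearson_r` and the five pairs of attitude ratings and marks are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical self-reported attitude ratings and foreign language marks.
attitude = [1, 2, 3, 4, 5]
marks = [2, 3, 3, 5, 5]
print(round(pearson_r(attitude, marks), 3))
```

Values near 1 or -1 indicate a strong linear relationship and values near 0 a weak one, which is how the ranges of r quoted in this section can be read.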

In both years, we found significant relationships ( r Year5 = 0.70–0.12; p < 0.001–0.01; r Year8 = 0.82–0.66; p < 0.001–0.01) between overall strategy use and foreign language marks, attitudes and general school achievement. However, the relationship between affective strategies and school achievement was not significant. We observed that children who use LLS have positive attitudes toward language learning, except in the case of compensation and affective strategies.

The Effect of Language Learning Strategies on Attitude, School Marks and General School Achievement

We analysed the effect of LLS on foreign language attitude, school marks and general achievement using AMOS. We were looking for causalities between questionnaire fields and further variables by constructing a theoretical model on the basis of Oxford’s strategy taxonomy and children’s background data. We hypothesised that strategy factors largely influence children’s attitude toward language learning and through this the other variables. The model we created showed appropriate fit indices for the final model and indicated a good fit to our data in both years (Figures 1 , 2 ).
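
In a path model of this kind, the influence of a strategy field on an outcome that runs "through" attitude is conventionally quantified as the product of the standardised coefficients along the path. The sketch below shows this arithmetic; the function name `path_effects` and the coefficient values are hypothetical and are not the estimates shown in the figures.

```python
def path_effects(a, b, c_direct=0.0):
    """Decompose a simple mediated effect.

    a        : coefficient from predictor to mediator (e.g. strategy -> attitude)
    b        : coefficient from mediator to outcome (e.g. attitude -> mark)
    c_direct : direct coefficient from predictor to outcome, if the model has one
    """
    indirect = a * b  # effect transmitted through the mediator
    return {
        "direct": c_direct,
        "indirect": indirect,
        "total": c_direct + indirect,
    }


# Hypothetical standardised coefficients, for illustration only.
effects = path_effects(a=0.5, b=0.4, c_direct=0.1)
print({name: round(value, 3) for name, value in effects.items()})
```

With several mediators, the indirect effects along each path are computed in the same way and summed.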

FIGURE 1. The path model for LLS influence on foreign language mark through foreign language attitude and general school achievement (GA) in Year 5.

FIGURE 2. The path model for LLS influence on foreign language mark through foreign language attitude and general school achievement (GA) in Year 8.

Year 5: χ²(13) = 18.309, p = 0.146; Year 8: χ²(13) = 23.893, p = 0.18. An analysis of the hypothesised path model indicated a comparative fit index (CFI) of 0.998 in Year 5 and 0.994 in Year 8. The RMSEA (root mean square error of approximation) was also good in both years, 0.030 in Year 5 and 0.049 in Year 8. Both the Tucker–Lewis index (TLI Year5 = 0.992; TLI Year8 = 0.981) and the normed fit index (NFI Year5 = 0.992; NFI Year8 = 0.989) confirmed that the model we constructed was a good fit to our data.
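
The link between the chi-square statistic and the RMSEA can be made concrete with a commonly used point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). The sketch below assumes this formula and a single-group model; the function name `rmsea` is hypothetical, and software such as AMOS may work with an effective sample size that differs from the nominal one, so reported values need not be reproduced exactly.

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """Point estimate of the RMSEA for a single-group model.

    Uses sqrt(max(chi_square - df, 0) / (df * (n - 1))), a common
    textbook formula; this choice of formula is an assumption here.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n greater than 1")
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Year 5 values from the text: chi-square = 18.309, df = 13, n = 450.
print(round(rmsea(18.309, 13, 450), 3))
```

Under these assumptions the Year 5 figures give a value of about 0.030, in line with the reported RMSEA. When χ² is smaller than the degrees of freedom, the formula returns 0.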

Discussion

The main aim of the present study was to deepen our understanding of LLS in a foreign language learning context. Therefore, first, we identified the strategy use preferences in the sample and specified the most and least often used strategies among children with different proficiency levels. Second, we examined the children’s LLS use in connexion with their foreign language attitude, proficiency and general school achievement. Our results confirmed some results from previous studies and also established new relationships among the variables.

Regarding the general strategy use preferences of the sample, the students reported moderate use of the six strategy categories. The use of indirect strategies, more precisely, metacognitive, affective and social strategies, was the highest in Year 5, while metacognitive, cognitive and social strategies were the most frequently employed in Year 8. These findings shed light on the different preferences among the different ages and proficiency levels. While affective strategies play a significant role in Year 5, cognitive strategies become more dominant later. Metacognitive and social strategies remained the most frequently used in both Years. Our result is consistent with those reported by Dawadi (2017) who discovered similar strategy preferences. We can also reinforce Alhaysony’s (2017) results that high school sample did not engage in affective strategies, and Charoento’s (2016) findings about the low use of memory strategies.

We also examined the differences in strategy use among students with different proficiency levels in both Years. In Year 5, the analysis demonstrated significant differences in strategy use in four areas: the memory, cognitive, metacognitive and social fields. We noted no significant differences among children in compensation and affective strategies. As regards memory strategies, we observed that low-achieving children rarely employed them. Low achievers used cognitive strategies significantly less often than good and high performers. As our results showed, the most successful learners are also metacognitive strategy users, and they engage in social strategies significantly more often. In Year 8, we observed significant differences in every field among children with different proficiencies. As in Year 5, the use of metacognitive and social strategies was the most frequent among the high-achieving students; however, cognitive strategy use was also relatively high. Charoento (2016) and Rao (2016) reported the same results, so we can confirm their finding that high achievers avail themselves of strategies significantly more frequently than low-performing learners.

We also investigated the relationship between LLS and foreign language attitude, LLS and the foreign language mark, and LLS and general school achievement. According to our results, we found that children who prefer foreign language learning reported significantly higher strategy use. As regards foreign language marks, the relationships between different kinds of strategy users and their foreign language marks were low. Children with high proficiency did not necessarily employ each of the strategies at a higher rate. The same result was reached by Chen (2009) . The relationship between affective strategies and school achievement was not significant. We observed that children who use LLS have positive attitudes toward language learning. So our findings partly confirmed previous results reported by Jabbari and Golkar (2014) and Platsidou and Kantaridou (2014) .

Concerning the impact of strategy use on foreign language learning attitudes, proficiency and general school achievement, the effect of the questionnaire fields on foreign language attitude was considerable in Year 5; attitudes were strongly influenced by metacognitive strategies, and the effect of social strategies was also high. While memory and cognitive strategies showed positive paths to attitudes, compensation and affective strategies indicated negative effects on attitudes. Foreign language attitudes had the same effect on foreign language marks as these marks did on general achievement. A lower but significant effect of metacognitive strategies was found on general school achievement in Year 5.

In Year 8, we found similar tendencies. The effect of metacognitive strategies on foreign language attitudes was very high, while that of memory strategies was low. The effect of social strategies was lost in Year 8. The impact of foreign language attitude on the foreign language mark was almost the same as in Year 5, but that of the foreign language mark on general school achievement was twice as high. Shawer (2016) likewise highlighted what our results have also shown: strategy use has a significant effect on general school achievement. Metacognitive strategies also had a direct effect on foreign language marks. On the whole, not only did we observe a strong use of metacognitive strategies, but the effect of metacognitive strategies on attitudes was also dominant in both years. Moreover, metacognitive strategies influenced school achievement in Year 5 and foreign language marks in Year 8.

To sum up, our results demonstrated that, as in other studies, our Hungarian sample showed significant preferences for metacognitive strategy use. Compensatory strategies were the least frequently preferred in Year 5 and memory strategies were the least common in Year 8, a finding which also reinforced previous research outcomes ( Doró and Habók, 2013 ). We observed significant differences between more and less proficient students in strategy use. In line with other research ( Platsidou and Kantaridou, 2014 ), we conclude that more proficient learners avail themselves of a broader range of strategies than less proficient students and that strategy use has a significant effect on foreign language marks.

The research focused on the whole language process in connexion with several other factors among young students. The added value of our research is not only that we discovered relationships between factors required for foreign language learning, but direct and indirect underlying effects have also been brought to light through path analysis. These analyses provide a comprehensive view both of the dominant role of metacognitive strategies and of the foreign language learning process generally.

In spite of its value, the study has certain limitations. First, we employed a self-report instrument for data collection which does not address students’ deeper views on learning. Qualitative methods would make it possible to gain a more detailed understanding of foreign language learning through interviews, including think-aloud procedures and classroom observations. Second, the current research into LLS and proficiency among Hungarian students was conducted with participants from two different years at the lower secondary school level, so generalisation of the results is limited. In addition, our sample was not representative. Further research would be necessary to fully examine the relationship between language learning strategies, language learning attitudes, foreign language proficiency and general achievement among Hungarian students in a variety of years and in a larger sample.

Third, the current research only used two measurement points of proficiency, the foreign language mark and general achievement, which are evaluated by different teachers. In future, we will collect a wider range of language proficiency data, including language proficiency tests and interviews. Fourth, a comparison of LLS and general learning strategies would produce a more nuanced overview of students’ strategy use.

Conclusion and Pedagogical Implications

The main purpose of the present study was to ascertain the effect of LLS on other variables, such as foreign language attitude, foreign language proficiency and general school achievement among secondary school children in Hungary at the beginning and end of lower secondary school. In the beginner phase of learning foreign languages, it is important to better understand the relationship between language learning and related factors. Hence, our main objective was to provide a complex overview of these measurement points and to examine how LLS can support children in the first phase of the language learning process.

We used the Hungarian translation of Oxford’s Strategy Inventory for Language Learning questionnaire and supplemented it with the children’s self-reports of their foreign language attitudes and proficiency indicated by their foreign language mark and school achievement. This provided the basis for our research.

Past research has demonstrated that students with more frequent LLS use have better chances to become more proficient language learners. It has been pointed out that students that are more proficient engage in a wider range of strategies and select learning strategies dependent on learning tasks. Thus, teachers are encouraged to introduce a range of strategies for children to be able to select those that are most appropriate to features of their personality and relevant to learning tasks. At this age, introducing LLS is significant, particularly for children with low and average foreign language marks. It would be essential to motivate children to discover a variety of ways to practise their foreign language and find opportunities to read and engage in conversations with others. Children who are able to recognise the significance of language learning and use a broad range of strategies can find new ways and opportunities to practise language and to improve their proficiency. Hence, it would be highly recommended to integrate LLS consciously into foreign language lessons.

Ethics Statement

This study was carried out in accordance with the recommendations of the University of Szeged. According to these recommendations, participation in the study was voluntary for both schools and students. The participating schools obtained the parents’ consent for their children’s engagement in the research. Under Hungarian law, it is the schools’ responsibility to conclude a written agreement with the parents about their consent to allow their children to take part in research. The whole process is permitted and coordinated by the municipalities that maintain the schools. The agreements are documented and stored in written form in the schools. The authors declare that data collection and handling strictly adhered to the usual standards of research ethics as approved by the University of Szeged.

Author Contributions

AH and AM substantially contributed to the conception and design of the study, data collection, analysis and interpretation of data for the research. Both have written the manuscript and reviewed all parts of the manuscript. AH and AM have given final approval of the final version to be published. AH and AM agree to be accountable for all aspects of the work.

The research was funded by the University of Szeged.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Alhaysony, M. (2017). Language learning strategies use by Saudi EFL students: the effect of duration of English language study and gender. Theory Pract. Lang. Stud. 7, 18–28. doi: 10.17507/tpls.0701.03


Al-Qahtani, M. F. (2013). Relationship between English language, learning strategies, attitudes, motivation, and students’ academic achievement. Educ. Med. J. 5, 19–29. doi: 10.5959/eimj.v5i3.124


Arbuckle, J. L. (2008). AMOS (Version 17.0) [Computer Software] . Chicago, IL: SPSS.


Ardasheva, Y., and Tretter, T. R. (2013). Strategy inventory for language learning–ELL student form: testing for factorial validity. Mod. Lang. J. 97, 474–489. doi: 10.1111/j.1540-4781.2013.12011.x

Byrne, B. M. (2010). Structural Equation Modelling Using AMOS. Basic Concepts, Applications, and Programming , 2nd Edn. New York: Routledge.

Chamot, A. U. (2004). Issues in language learning strategy research and teaching. Electron. J. Foreign Lang. Teach. 1, 14–26.

Charoento, M. (2016). Individual learner differences and language learning strategies. Contemp. Educ. Res. J. 7, 57–72.

Chen, M. (2014). Age differences in the use of language learning strategies. Engl. Lang. Teach. 7, 144–151. doi: 10.5539/elt.v7n2p144

Chen, M. L. (2009). Influence of grade level on perceptual learning style preferences and language learning strategies of Taiwanese English as a foreign language learners. Learn. Individ. Dif. 19, 304–308. doi: 10.1016/j.lindif.2009.02.004

Dawadi, S. (2017). Language learning strategies profiles of EFL learners in Nepal. Eur. J. Educ. Soc. Sci. 2, 42–55.

Doró, K., and Habók, A. (2013). Language learning strategies in elementary school: the effect of age and gender in an EFL context. J. Linguist. Lang. Teach. 4, 25–37.

Griffiths, C., and Oxford, R. (2014). The twenty-first century landscape of language learning strategies: introduction to this special issue. System 43, 1–10. doi: 10.1016/j.system.2013.12.009

Griffiths, C., and Incecay, G. (2016). “New directions in language learning strategy research: engaging with the complexity of strategy use,” in New Directions in Language Learning Psychology , eds C. Gkonou, D. Tatzl, and S. Mercer (Berlin: Springer), 25–38. doi: 10.1007/978-3-319-23491-5_3

Gunning, P., and Oxford, R. L. (2014). Children’s learning strategy use and the effects of strategy instruction on success in learning ESL in Canada. System 43, 82–100. doi: 10.1016/j.system.2013.12.012

Jabbari, M. J., and Golkar, N. (2014). The relationship between EFL learners’ language learning attitudes and language learning strategies. Int. J. Linguist. 6, 161–167. doi: 10.5296/ijl.v6i3.5837

Khaldieh, S. A. (2000). Learning strategies and writing processes of proficient vs. less-proficient learners of Arabic. Foreign Lang. Ann. 33, 522–533. doi: 10.1111/j.1944-9720.2000.tb01996.x

Kline, R. B. (2015). Principles and Practice of Structural Equation Modeling , 4th Edn. New York, NY: Guilford Press.

Liu, J. (2010). Language learning strategies and its training model. Int. Educ. Stud. 3, 100–104. doi: 10.5539/ies.v3n3p100

Magogwe, J. M., and Oliver, R. (2007). The relationship between language learning strategies, proficiency, age, and self-efficacy beliefs: a study of language learners in Botswana. System 35, 338–352. doi: 10.1016/j.system.2007.01.003

Oxford, R. L. (1990). Language Learning Strategies: What Every Teacher Should Know . Boston, MA: Heinle and Heinle.

Oxford, R. L. (2016). Teaching and Researching Language Learning Strategies: Self-Regulation in Context . New York, NY: Routledge.

Oxford, R. L., and Burry-Stock, J. A. (1995). Assessing the use of language learning strategies worldwide with the ESL/EFL version of the strategy inventory for language learning (SILL). System 23, 1–23. doi: 10.1016/0346-251X(94)00047-A

Pfenninger, S. E., and Singleton, D. (2017). Beyond Age Effects in Instructional L2 Learning: Revisiting the Age Factor . Clevedon: Multilingual Matters. doi: 10.21832/PFENNI7623

Platsidou, M., and Kantaridou, Z. (2014). The role of attitudes and learning strategy use in predicting perceived competence in school-aged foreign language learners. J. Lang. Lit. 5, 253–260. doi: 10.7813/jll.2014/5-3/43

Platsidou, M., and Sipitanou, A. (2014). Exploring relationships with grade level, gender and language proficiency in the foreign language learning strategy use of children and early adolescents. Int. J. Res. Stud. Lang. Learn. 4, 83–96. doi: 10.5861/ijrsll.2014.778

Rao, Z. (2016). Language learning strategies and English proficiency: interpretations from information-processing theory. Lang. Learn. J. 44, 90–106. doi: 10.1080/09571736.2012.733886

Shang, H. F. (2010). Reading strategy use, self-efficacy and EFL reading comprehension. Asian EFL J. 12, 18–42.

Shawer, S. F. (2016). Four language skills performance, academic achievement, and learning strategy use in preservice teacher training programs. TESOL J. 7, 262–303. doi: 10.1002/tesj.202

Wong, L. L. C., and Nunan, D. (2011). The learning styles and strategies of effective language learners. System 39, 144–163. doi: 10.1016/j.system.2011.05.004

Wu, Y. L. (2008). Language learning strategies used by students at different proficiency levels. Asian EFL J. 10, 75–95.


Keywords : language learning strategy, foreign language attitude, foreign language mark, general school achievement, lower secondary students

Citation: Habók A and Magyar A (2018) The Effect of Language Learning Strategies on Proficiency, Attitudes and School Achievement. Front. Psychol. 8:2358. doi: 10.3389/fpsyg.2017.02358

Received: 06 July 2017; Accepted: 26 December 2017; Published: 11 January 2018.


Copyright © 2018 Habók and Magyar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Anita Habók, [email protected]

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Natural language processing: state of the art, current trends and challenges

  • Published: 14 July 2022
  • Volume 82, pages 3713–3744 (2023)


  • Diksha Khurana 1 ,
  • Aditya Koli 1 ,
  • Kiran Khatter   ORCID: orcid.org/0000-0002-1000-6102 2 &
  • Sukhdev Singh 3  


This article has been updated

Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to various fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish four phases by discussing different levels of NLP and components of Natural Language Generation, followed by the history and evolution of NLP. We then discuss in detail the state of the art, presenting the various applications of NLP, current trends, and challenges. Finally, we present a discussion on some available datasets, models, and evaluation metrics in NLP.


1 Introduction

A language can be defined as a set of rules or a set of symbols, where symbols are combined and used for conveying or broadcasting information. Since not all users are well-versed in machine-specific languages, Natural Language Processing (NLP) caters to those users who do not have enough time to learn new languages or to master them. In fact, NLP is a branch of Artificial Intelligence and Linguistics devoted to making computers understand statements or words written in human languages. It came into existence to ease the user’s work and to satisfy the wish to communicate with the computer in natural language. It can be classified into two parts, i.e. Natural Language Understanding (or Linguistics) and Natural Language Generation, which involve the tasks of understanding and generating text, respectively. Linguistics is the science of language, which includes Phonology, which refers to sound; Morphology, word formation; Syntax, sentence structure; Semantics, meaning; and Pragmatics, which refers to understanding. Noam Chomsky, one of the first linguists of the twentieth century to develop syntactic theories, holds a unique position in the field of theoretical linguistics because he revolutionized the area of syntax (Chomsky, 1965) [ 23 ]. Further, Natural Language Generation (NLG) is the process of producing meaningful phrases, sentences and paragraphs from an internal representation. The first objective of this paper is to give insights into the various important terminologies of NLP and NLG.

In the existing literature, most of the work in NLP is conducted by computer scientists, while various other professionals such as linguists, psychologists, and philosophers have also shown interest. One of the most interesting aspects of NLP is that it adds to our knowledge of human language. The field of NLP is related to different theories and techniques that deal with the problem of communicating with computers in natural language. Some of the researched tasks of NLP are Automatic Summarization (producing an understandable summary of a set of text, providing summaries or detailed information for text of a known type), Co-Reference Resolution (determining, for a sentence or larger set of text, all words which refer to the same object), Discourse Analysis (identifying the discourse structure of connected text, i.e. the study of text in relation to social context), Machine Translation (automatic translation of text from one language to another), Morphological Segmentation (breaking words into individual meaning-bearing morphemes), Named Entity Recognition (NER; used in information extraction to recognize named entities and then classify them into different classes), Optical Character Recognition (OCR; automatic text recognition by translating printed and handwritten text into machine-readable format), and Part-of-Speech Tagging (determining the part of speech of each word in a sentence). Some of these tasks, such as machine translation, named entity recognition and optical character recognition, have direct real-world applications. Although NLP tasks are closely interwoven, they are frequently treated separately for convenience.
Some tasks, such as automatic summarization and co-reference analysis, act as subtasks that are used in solving larger tasks. Nowadays NLP is much talked about because of its various applications and recent developments, although in the late 1940s the term was not even in existence. It is therefore interesting to look at the history of NLP, the progress made so far, and some of the ongoing projects that make use of NLP. The second objective of this paper focuses on these aspects. The third objective of this paper concerns datasets, approaches, evaluation metrics and the challenges involved in NLP. The rest of this paper is organized as follows. Section 2 deals with the first objective, covering the various important terminologies of NLP and NLG. Section 3 deals with the history of NLP, applications of NLP and a walkthrough of recent developments. Datasets used in NLP and various approaches are presented in Section 4, and Section 5 covers evaluation metrics and challenges involved in NLP. Finally, a conclusion is presented in Section 6.

2 Components of NLP

NLP can be classified into two parts, i.e. Natural Language Understanding and Natural Language Generation, which involve the tasks of understanding and generating text, respectively. Figure 1 presents the broad classification of NLP. The objective of this section is to discuss Natural Language Understanding (NLU), or Linguistics, and Natural Language Generation (NLG).

Fig. 1 Broad classification of NLP

NLU enables machines to understand natural language and analyze it by extracting concepts, entities, emotion, keywords etc. It is used in customer care applications to understand the problems reported by customers either verbally or in writing. Linguistics is the science which involves the meaning of language, language context and various forms of the language. So, it is important to understand various important terminologies of NLP and different levels of NLP. We next discuss some of the commonly used terminologies in different levels of NLP.

Phonology is the part of Linguistics which refers to the systematic arrangement of sound. The term phonology comes from Ancient Greek, in which the term phono means voice or sound and the suffix -logy refers to word or speech. In 1993 Nikolai Trubetzkoy stated that Phonology is “the study of sound pertaining to the system of language”, whereas Lass (1998) [ 66 ] wrote that phonology deals broadly with the sounds of language and is the sub-discipline of linguistics concerned with the behavior and organization of sounds. Phonology includes the semantic use of sound to encode the meaning of any human language.

The different parts of a word represent the smallest units of meaning, known as morphemes. Morphology, which concerns the nature of words, is built on morphemes. For example, the word precancellation can be morphologically analyzed into three separate morphemes: the prefix pre, the root cancella, and the suffix -tion. The interpretation of a morpheme stays the same across all words, so humans can break an unknown word into morphemes to understand its meaning. For example, adding the suffix -ed to a verb conveys that the action of the verb took place in the past. Words that cannot be divided and have meaning by themselves are called lexical morphemes (e.g. table, chair). The affixes (e.g. -ed, -ing, -est, -ly, -ful) that are combined with a lexical morpheme are known as grammatical morphemes (e.g. worked, consulting, smallest, likely, useful). Grammatical morphemes that occur only in combination are called bound morphemes (e.g. -ed, -ing). Bound morphemes can be divided into inflectional morphemes and derivational morphemes. Adding inflectional morphemes to a word changes grammatical categories such as tense, gender, person, mood, aspect, definiteness and animacy. For example, the addition of the inflectional morpheme -ed changes the root park to parked. Derivational morphemes change the semantic meaning of the word they are combined with. For example, in the word normalize, the addition of the bound morpheme -ize to the root normal changes the word from an adjective (normal) to a verb (normalize).
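The decomposition of precancellation into prefix, root and suffix can be illustrated with a minimal sketch. The affix lists below are hypothetical and chosen only to cover the examples in the text; a real morphological analyzer would need a lexicon and would handle far more cases than this greedy matching does.

```python
# Hypothetical affix inventories, chosen only for this illustration.
PREFIXES = ("pre", "un", "re")
SUFFIXES = ("tion", "ize", "ing", "ed", "ly")


def segment(word):
    """Split a word into (prefix, root, suffix); "" marks an absent affix."""
    # Take the first listed prefix that the word starts with, if any.
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    # Take the first listed suffix that leaves a non-empty root, if any.
    suffix = next(
        (s for s in SUFFIXES if rest.endswith(s) and len(rest) > len(s)), ""
    )
    root = rest[: len(rest) - len(suffix)]
    return prefix, root, suffix


print(segment("precancellation"))  # ('pre', 'cancella', 'tion')
print(segment("normalize"))        # ('', 'normal', 'ize')
print(segment("parked"))           # ('', 'park', 'ed')
```

Because the matching is purely string-based, a word that merely begins with "re" or ends with "ed" would be split as well, which is one reason practical systems rely on vocabularies rather than affix lists alone.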

At the lexical level, humans, as well as NLP systems, interpret the meaning of individual words. Several types of processing contribute to word-level understanding, the first of these being the assignment of a part-of-speech (PoS) tag to each word. In this processing, words that can act as more than one part of speech are assigned the most probable PoS tag based on the context in which they occur. At the lexical level, words that have only one meaning can be replaced by a semantic representation; the nature of that representation varies according to the semantic theory deployed in the NLP system. Therefore, at the lexical level, the structure of words is analyzed with respect to their lexical meaning and PoS. In this analysis, text is divided into paragraphs, sentences, and words. Assigning the correct PoS tag improves the understanding of the intended meaning of a sentence. The lexical level is also used for cleaning and feature extraction, using techniques such as removal of stop words, stemming and lemmatization. Stop words such as ‘in’, ‘the’ and ‘and’ are removed because they do not contribute to any meaningful interpretation and their high frequency may increase computation time. Stemming reduces the words of a text to their root form by removing suffixes. For example, consulting and consultant are converted to consult after stemming, using is converted to us, and driver is reduced to driv. Lemmatization does not simply remove the suffix of a word; instead, it returns the base form of the word with the use of a vocabulary. For example, for the token drived, stemming results in “driv”, whereas lemmatization attempts to return the correct base form, either drive or drived, depending on the context in which it is used.
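The cleaning steps described above can be sketched in a few lines. This is a toy illustration, not the behavior of any particular library: the stop-word set, the suffix list and the lemma vocabulary are all hypothetical, and the stemmer is a naive suffix stripper rather than an established algorithm such as Porter's.

```python
# All three resources below are hypothetical and deliberately tiny.
STOP_WORDS = {"in", "the", "and", "a", "of", "is"}
SUFFIXES = ("ing", "ant", "er", "ed", "s")
LEMMAS = {"drove": "drive", "driven": "drive", "driving": "drive"}


def remove_stop_words(tokens):
    """Drop high-frequency function words."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def naive_stem(word):
    """Strip the first matching suffix, keeping at least two characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Look the word up in a vocabulary; unknown words are returned as-is."""
    return LEMMAS.get(word, word)


print(remove_stop_words(["the", "driver", "in", "the", "car"]))  # ['driver', 'car']
print([naive_stem(w) for w in ["consulting", "consultant", "using", "driver"]])
# ['consult', 'consult', 'us', 'driv']
print(lemmatize("drove"))  # drive
```

The output shows the trade-off named in the text: stemming is cheap but can produce non-words such as "us" and "driv", while lemmatization returns real words but only for entries its vocabulary contains.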

After PoS tagging at the lexical level, words are grouped into phrases, phrases are grouped into clauses, and clauses are then combined into sentences at the syntactic level. This level emphasizes the correct formation of a sentence by analyzing its grammatical structure. The output of this level is a representation of the sentence that reveals the structural dependencies between words. This process is also known as parsing, which uncovers phrases that convey more meaning than the individual words do. The syntactic level examines word order, stop words, morphology and the PoS of words, which the lexical level does not consider. Changing the word order changes the dependencies among words and may also affect the comprehension of sentences. For example, the sentences “ram beats shyam in a competition” and “shyam beats ram in a competition” differ only in syntax, yet they convey different meanings [ 139 ]. The syntactic level retains stop words because removing them changes the meaning of the sentence. It does not support lemmatization or stemming because converting words to their base form changes the grammar of the sentence. It focuses on identifying the correct PoS of the words in a sentence. For example, in the phrase “frowns on his face”, “frowns” is a noun, whereas it is a verb in the sentence “he frowns”.
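The effect of word order in the ram/shyam example can be made concrete with a small sketch. It is not a parser: the helper below is hypothetical and simply assumes that the verb is known in advance and that the subject and object are the single words directly before and after it.

```python
def subject_verb_object(sentence, verb):
    """Read off the words immediately before and after a known verb.

    Assumes the verb occurs in the sentence and is neither its first
    nor its last word.
    """
    tokens = sentence.lower().split()
    i = tokens.index(verb)
    return {"subject": tokens[i - 1], "verb": verb, "object": tokens[i + 1]}


first = subject_verb_object("ram beats shyam in a competition", "beats")
second = subject_verb_object("shyam beats ram in a competition", "beats")

print(first)   # {'subject': 'ram', 'verb': 'beats', 'object': 'shyam'}
print(second)  # {'subject': 'shyam', 'verb': 'beats', 'object': 'ram'}
```

Both sentences contain exactly the same words, so any analysis that ignores order would treat them as identical; only the positional information distinguishes who beats whom.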

At the semantic level, the most important task is to determine the proper meaning of a sentence. To understand the meaning of a sentence, human beings rely on their knowledge of language and of the concepts present in that sentence, but machines cannot count on these techniques. Semantic processing determines the possible meanings of a sentence by processing its logical structure to recognize the most relevant words and to understand the interactions among words or different concepts in the sentence. For example, it understands that a sentence is about “movies” even if it does not contain that word but contains related concepts such as “actor”, “actress”, “dialogue” or “script”. This level of processing also incorporates the semantic disambiguation of words with multiple senses (Elizabeth D. Liddy, 2001) [ 68 ]. For example, the word “bark” as a noun can mean either the sound that a dog makes or the outer covering of a tree. The semantic level examines words for their dictionary interpretation, or the interpretation is derived from the context of the sentence. For example, the sentence “Krishna is good and noble.” is talking either about Lord Krishna or about a person named “Krishna”. That is why, to get the proper meaning of the sentence, the appropriate interpretation is chosen by looking at the rest of the sentence [ 44 ].
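One simple way to pick a sense from context, sketched here for the “bark” example, is a gloss-overlap heuristic in the spirit of the simplified Lesk algorithm. This technique is brought in only for illustration and is not described in the text above; the sense inventory, the glosses and the ignored-word list are hypothetical.

```python
# Hypothetical sense inventory with short dictionary-style glosses.
SENSES = {
    "bark": {
        "dog_sound": "the sound that a dog makes",
        "tree_covering": "the outer covering of a tree trunk",
    }
}
IGNORED = {"the", "a", "of", "that", "at", "was", "is"}


def content_words(text):
    """Lower-case the text and keep only the non-ignored words."""
    return {w.strip(".,").lower() for w in text.split()} - IGNORED


def disambiguate(word, sentence):
    """Return the sense whose gloss shares the most words with the sentence."""
    context = content_words(sentence)
    scores = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    return max(scores, key=scores.get)


print(disambiguate("bark", "The dog let out a loud bark at the stranger"))
# dog_sound
print(disambiguate("bark", "The bark of the old tree was rough"))
# tree_covering
```

When no gloss overlaps with the sentence, this sketch falls back to the first listed sense, which is an arbitrary choice; practical systems typically use sense frequencies or learned representations instead.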

While the syntactic and semantic levels deal with sentence-length units, the discourse level of NLP deals with more than one sentence. It deals with the analysis of logical structure by making connections among words and sentences that ensure coherence. It focuses on the properties of the text that convey meaning by interpreting the relations between sentences and uncovering linguistic structures from texts at several levels (Liddy, 2001) [ 68 ]. Two of the most common are Anaphora Resolution and Coreference Resolution. Anaphora resolution is achieved by recognizing the entity referenced by an anaphor, so that references used within the text are resolved to the same sense. For example, (i) Ram topped in the class. (ii) He was intelligent. Here (i) and (ii) together form a discourse. Human beings can quickly understand that the pronoun “he” in (ii) refers to “Ram” in (i). The interpretation of “He” depends on another word, “Ram”, presented earlier in the text. Without determining the relationship between these two structures, it would not be possible to decide why Ram topped the class and who was intelligent. Coreference resolution is achieved by finding all expressions that refer to the same entity in a text. It is an important step in various NLP applications that involve high-level NLP tasks such as document summarization and information extraction. In fact, anaphora is encoded through one of the processes called co-reference.
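The Ram example can be reproduced with a deliberately naive heuristic: link each pronoun to the most recently seen capitalized word. This is a sketch under strong assumptions (the pronoun list is hypothetical, and every capitalized non-pronoun is treated as an entity, including an ordinary sentence-initial word), not a description of how real anaphora resolvers work.

```python
PRONOUNS = {"he", "she", "it", "they"}  # hypothetical, incomplete list


def resolve_pronouns(sentences):
    """Pair each pronoun with the most recent capitalized non-pronoun token."""
    last_entity = None
    links = []
    for sentence in sentences:
        for token in sentence.rstrip(".").split():
            if token.lower() in PRONOUNS:
                if last_entity is not None:
                    links.append((token, last_entity))
            elif token[0].isupper():
                last_entity = token
    return links


discourse = ["Ram topped in the class.", "He was intelligent."]
print(resolve_pronouns(discourse))  # [('He', 'Ram')]
```

The sketch works across the sentence boundary, which is the point of discourse-level processing, but it ignores gender, number and syntactic role, all of which matter once a text mentions more than one candidate antecedent.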

The pragmatic level focuses on knowledge or content that comes from outside the content of the document. It deals with what the speaker implies and what the listener infers. In fact, it analyzes meaning that is not directly stated. Real-world knowledge is used to understand what is being talked about in the text. By analyzing the context, a meaningful representation of the text is derived. When a sentence is not specific and the context does not provide any specific information about that sentence, pragmatic ambiguity arises (Walton, 1996) [ 143 ]. Pragmatic ambiguity occurs when different persons derive different interpretations of the text, depending on its context. The context of a text may include references to other sentences of the same document, which influence the understanding of the text, and the background knowledge of the reader or speaker, which gives meaning to the concepts expressed in that text. Semantic analysis focuses on the literal meaning of the words, whereas pragmatic analysis focuses on the inferred meaning that readers perceive based on their background knowledge. For example, the sentence “Do you know what time is it?” is interpreted as “asking for the current time” in semantic analysis, whereas in pragmatic analysis the same sentence may be read as “expressing resentment to someone who missed the due time”. Thus, semantic analysis is the study of the relationship between linguistic utterances and their meanings, whereas pragmatic analysis is the study of the context that influences our understanding of linguistic expressions. Pragmatic analysis helps users to uncover the intended meaning of the text by applying contextual background knowledge.

The goal of NLP is to accommodate one or more specialties of an algorithm or system. Assessing NLP on an algorithmic system allows for the integration of language understanding and language generation. It is even used in multilingual event detection. Rospocher et al. [ 112 ] proposed a novel modular system for cross-lingual event extraction for English, Dutch, and Italian texts, using different pipelines for different languages. The system incorporates a modular set of leading multilingual NLP tools. The pipeline integrates modules for basic NLP processing as well as more advanced tasks such as cross-lingual named entity linking, semantic role labeling and time normalization. Thus, the cross-lingual framework allows for the interpretation of events, participants, locations, and time, as well as the relations between them. The output of these individual pipelines is intended to be used as input for a system that builds event-centric knowledge graphs. All modules take standard input, perform some annotation, and produce standard output, which in turn becomes the input for the next module in the pipeline. The pipelines are built as a data-centric architecture so that modules can be adapted and replaced. Furthermore, the modular architecture allows for different configurations and for dynamic distribution.
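The idea that every module reads a standard input, adds an annotation layer and hands a standard output to the next module can be sketched as follows. This is only an illustration of the architectural pattern, not the system of Rospocher et al.; the module names, the dictionary-based document format and the annotation rules are all hypothetical.

```python
def tokenize(doc):
    """Add a 'tokens' layer to the shared document."""
    doc["tokens"] = doc["text"].rstrip(".").split()
    return doc


def tag_entities(doc):
    """Add an 'entities' layer: capitalized tokens that are not sentence-initial."""
    doc["entities"] = [
        t for i, t in enumerate(doc["tokens"]) if i > 0 and t[0].isupper()
    ]
    return doc


def normalize_time(doc):
    """Add a 'years' layer: four-digit tokens converted to integers."""
    doc["years"] = [int(t) for t in doc["tokens"] if t.isdigit() and len(t) == 4]
    return doc


def run_pipeline(text, modules):
    """Pass one document through the modules in order."""
    doc = {"text": text}
    for module in modules:
        doc = module(doc)
    return doc


result = run_pipeline(
    "The summit took place in Rome in 2019.",
    [tokenize, tag_entities, normalize_time],
)
print(result["entities"], result["years"])  # ['Rome'] [2019]
```

Because each module only depends on the shared document format, a module can be swapped for a language-specific variant or dropped from the list without touching the others, which is the property the data-centric design is meant to provide.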

Ambiguity is one of the major problems of natural language; it occurs when one sentence can lead to different interpretations. It is usually encountered at the syntactic, semantic, and lexical levels. In the case of syntactic ambiguity, one sentence can be parsed into multiple syntactic forms. Semantic ambiguity occurs when the meaning of words can be misinterpreted. Lexical ambiguity refers to a single word that can have multiple meanings. Each of these levels can produce ambiguities that can be resolved using knowledge of the complete sentence. Ambiguity can be handled by various methods such as Minimizing Ambiguity, Preserving Ambiguity, Interactive Disambiguation and Weighting Ambiguity [ 125 ]. One of the methods proposed by researchers to deal with ambiguity is preserving ambiguity, e.g. (Shemtov 1997; Emele & Dorna 1998; Knight & Langkilde 2000; Tong Gao et al. 2015; Umber & Bajwa 2011) [ 39 , 46 , 65 , 125 , 139 ]. Their objectives are closely in line with removing or minimizing ambiguity. They cover a wide range of ambiguities, and there is a statistical element implicit in their approach.

Natural Language Generation (NLG) is the process of producing meaningful phrases, sentences and paragraphs from an internal representation. It is a part of Natural Language Processing and happens in four phases: identifying the goals, planning how the goals may be achieved by evaluating the situation and the available communicative sources, and realizing the plans as text (Fig. 2 ). It is the opposite of Natural Language Understanding.

Speaker and Generator

Fig. 2 Components of NLG

To generate a text, we need to have a speaker or an application and a generator or a program that renders the application’s intentions into a fluent phrase relevant to the situation.

Components and Levels of Representation

The process of language generation involves the following interwoven tasks. Content selection: Information should be selected and included in the set. Depending on how this information is parsed into representational units, parts of the units may have to be removed while others may be added by default. Textual organization: The information must be textually organized according to the grammar; it must be ordered both sequentially and in terms of linguistic relations such as modification. Linguistic resources: To support the information’s realization, linguistic resources must be chosen. In the end these resources come down to choices of particular words, idioms, syntactic constructs, etc. Realization: The selected and organized resources must be realized as actual text or voice output.
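The four tasks can be sketched as four small functions chained together in a template-based generator. Everything here is a hypothetical illustration: the facts, the field names and the phrase templates are invented, and real NLG systems make these decisions with far richer grammars or learned models.

```python
# Hypothetical internal representation of what the application knows.
FACTS = [
    {"topic": "weather", "city": "Paris", "condition": "sunny", "temp_c": 24},
    {"topic": "traffic", "city": "Paris", "condition": "heavy"},
]


def select_content(facts, topic):
    """Content selection: keep only the facts relevant to the goal."""
    return [f for f in facts if f["topic"] == topic]


def organize(fact):
    """Textual organization: fix the order in which fields are mentioned."""
    return [(k, fact[k]) for k in ("city", "condition", "temp_c") if k in fact]


def choose_resources(ordered):
    """Linguistic resources: pick a phrase template for each field."""
    templates = {
        "city": "In {}",
        "condition": "it is {}",
        "temp_c": "with a temperature of {} degrees Celsius",
    }
    return [templates[key].format(value) for key, value in ordered]


def realize(phrases):
    """Realization: join the phrases into one surface sentence."""
    return phrases[0] + ", " + " ".join(phrases[1:]) + "."


def generate(facts, topic):
    fact = select_content(facts, topic)[0]
    return realize(choose_resources(organize(fact)))


print(generate(FACTS, "weather"))
# In Paris, it is sunny with a temperature of 24 degrees Celsius.
```

Keeping the stages separate means that the wording can be changed in `choose_resources` without touching what is said or in which order, mirroring the division of labor described above.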

Application or Speaker

The application or speaker only maintains the model of the situation. Here the speaker just initiates the process and does not take part in the language generation. It stores the history, structures the content that is potentially relevant and deploys a representation of what it knows. All of these form the situation, while selecting a subset of the propositions that the speaker has. The only requirement is that the speaker must make sense of the situation [ 91 ].

3 NLP: Then and now

In the late 1940s the term NLP was not in existence, but work on machine translation (MT) had started. Research in this period was not completely localized. Russian and English were the dominant languages for MT (Andreev, 1967) [ 4 ]. MT/NLP research almost died in 1966 following the ALPAC report, which concluded that MT was going nowhere. Later, however, some MT production systems were providing output to their customers (Hutchins, 1986) [ 60 ]. By this time, work on the use of computers for literary and linguistic studies had also started. As early as 1960, signature work influenced by AI began with the BASEBALL Q-A system (Green et al., 1961) [ 51 ]. LUNAR (Woods, 1978) [ 152 ] and Winograd’s SHRDLU were natural successors of these systems, and they were seen as a step up in sophistication in terms of their linguistic and task-processing capabilities. There was a widespread belief that progress could only be made on two fronts: one was the ARPA Speech Understanding Research (SUR) project (Lea, 1980), and the other was a set of major system development projects building database front ends. The front-end projects (Hendrix et al., 1978) [ 55 ] were intended to go beyond LUNAR in interfacing with large databases. In the early 1980s computational grammar theory became a very active area of research, linked with logics for meaning and knowledge and with the ability to deal with the user’s beliefs and intentions and with functions like emphasis and themes.

By the end of the decade, powerful general-purpose sentence processors like SRI’s Core Language Engine (Alshawi, 1992) [ 2 ] and Discourse Representation Theory (Kamp and Reyle, 1993) [ 62 ] offered a means of tackling more extended discourse within the grammatico-logical framework. This was a period of growing community. Practical resources, grammars, tools, and parsers became available (for example, the Alvey Natural Language Tools) (Briscoe et al., 1987) [ 18 ]. The (D)ARPA speech recognition and message understanding (information extraction) conferences were notable not only for the tasks they addressed but also for their emphasis on heavy evaluation, starting a trend that became a major feature of the 1990s (Young and Chase, 1998; Sundheim and Chinchor, 1993) [ 131 , 157 ]. Work on user modeling (Wahlster and Kobsa, 1989) [ 142 ] was another strand of research. Cohen et al. (2002) [ 28 ] put forward a first approximation of a compositional theory of tune interpretation, together with the phonological assumptions on which it is based and the evidence from which they drew their proposals. At the same time, McKeown (1985) [ 85 ] demonstrated that rhetorical schemas could be used to produce text that is both linguistically coherent and communicatively effective. Some NLP research marked out important topics for the future, such as word sense disambiguation (Small et al., 1988) [ 126 ] and probabilistic networks; statistically colored NLP and work on the lexicon also pointed in this direction. Statistical language processing was a major theme in the 1990s (Manning and Schuetze, 1999) [ 75 ], because it involves more than data analysis alone. Information extraction and automatic summarization (Mani and Maybury, 1999) [ 74 ] were also points of focus. Next, we present a walkthrough of developments from the early 2000s.

3.1 A walkthrough of recent developments in NLP

The main objectives of NLP include the interpretation, analysis, and manipulation of natural language data for the intended purpose using various algorithms, tools, and methods. However, many challenges are involved, which may depend on the natural language data under consideration, and this makes it difficult to achieve all the objectives with a single approach. Therefore, the development of different tools and methods in NLP and related areas of study has received much attention from researchers in the recent past. The developments are shown in Fig. 3.

Fig. 3 A walkthrough of recent developments in NLP

In the early 2000s, neural language modeling was introduced, in which the probability of the next word (token) is determined given the n previous words. Bengio et al. [ 12 ] proposed a feed-forward neural network with a lookup table that represents the n previous words in the sequence. Collobert et al. [ 29 ] proposed applying multitask learning to NLP, where two convolutional models with max pooling were used to perform part-of-speech tagging and named entity recognition. Mikolov et al. [ 87 ] proposed a word embedding process that addressed the dense vector representation of text. They also reported the challenges faced by the traditional sparse bag-of-words representation. After the advancement of word embeddings, neural networks that take variable-length input for further processing were introduced in NLP. Sutskever et al. [ 132 ] proposed a general framework for sequence-to-sequence mapping in which encoder and decoder networks map from sequence to vector and from vector to sequence, respectively. The use of neural networks has played a very important role in NLP. The existing literature shows that neural networks were not widely used in the early 2000s, but by 2013 there had been enough discussion of their use in NLP to transform the field and pave the way for implementing various neural networks in NLP. Earlier, convolutional neural networks (CNNs) had contributed to image classification and the analysis of visual imagery. Later, CNNs were applied to NLP tasks such as Sentence Classification [ 127 ], Sentiment Analysis [ 135 ], Text Classification [ 118 ], Text Summarization [ 158 ], Machine Translation [ 70 ] and Answer Relations [ 150 ].
An article by Newatia (2019) [ 93 ] illustrates the general architecture behind any CNN model and how it can be used in the context of NLP. One can also refer to the work of Wang and Gang [ 145 ] for applications of CNNs in NLP. Further, neural networks that are recurrent in nature because they perform the same function for every element of the data, known as Recurrent Neural Networks (RNNs), have also been used in NLP and found ideal for sequential data such as text, time series, financial data, speech, audio, and video, among others; see the article by Thomas (2019) [ 137 ]. One modified version of the RNN is Long Short-Term Memory (LSTM), which is very useful in cases where only the important information needs to be retained for a much longer time while irrelevant information is discarded; see [ 52 , 58 ]. Further development of the LSTM has led to a slightly simpler variant, the gated recurrent unit (GRU), which has shown better results than standard LSTMs in many tasks [ 22 , 26 ]. Attention mechanisms [ 7 ], which let a network learn what to pay attention to given the current hidden state and annotation, together with the use of Transformers, have also brought significant developments in NLP; see [ 141 ]. Note that Transformers have the potential to learn longer-term dependencies but are limited by a fixed-length context in the setting of language modeling. In this direction, Dai et al. [ 30 ] recently proposed a novel neural architecture, Transformer-XL (XL for extra long), which enables learning dependencies beyond a fixed length. Further, the work of Rae et al. [ 104 ] on the Compressive Transformer, an attentive sequence model that compresses memories for long-range sequence learning, may be helpful for readers. One may also refer to the recent work by Otter et al. [ 98 ] on the uses of deep learning for NLP, and the relevant references cited therein.
The BERT (Bidirectional Encoder Representations from Transformers) [ 33 ] model and its successors have also played an important role in NLP.
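
The language modeling objective mentioned at the start of this discussion can be written compactly. The following is a sketch in illustrative notation (the symbols $w_t$ for the token at position $t$, $T$ for the sequence length, and $n$ for the context size are our own labels, not taken from [ 12 ]): the probability of a sequence is approximated as a product of next-word probabilities, each conditioned on the n previous words.

```latex
P(w_1, \dots, w_T) \;\approx\; \prod_{t=1}^{T} P\left(w_t \mid w_{t-n}, \dots, w_{t-1}\right)
```

In a neural language model, each conditional factor is produced by a network that takes representations of the n previous words as input, rather than being estimated from counts.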

Many researchers have worked on NLP, building tools and systems that make NLP what it is today. Tools such as sentiment analyzers, part-of-speech (POS) taggers, chunkers, named entity recognition (NER), emotion detection, and semantic role labeling have made a huge contribution to NLP and are good topics for research. Sentiment analysis (Nasukawa et al., 2003) [ 156 ] works by extracting sentiments about a given topic, and it consists of topic-specific feature term extraction, sentiment extraction, and association by relationship analysis. It utilizes two linguistic resources for the analysis: the sentiment lexicon and the sentiment pattern database. It analyzes documents for positive and negative words and tries to give ratings on a scale of −5 to +5. Most tagsets currently in use are derived from English. The most widely used tagsets, which serve as standard guidelines, are designed for Indo-European languages, and tagsets for Asian or Middle Eastern languages are less researched. Various authors have worked on part-of-speech taggers for languages such as Arabic (Zeroual et al., 2017) [ 160 ], Sanskrit (Tapswi & Jain, 2012) [ 136 ], and Hindi (Ranjan & Basu, 2003) [ 105 ] to efficiently tag and classify words as nouns, adjectives, verbs, etc. The authors of [ 136 ] used a treebank technique to create a rule-based POS tagger for Sanskrit. Sanskrit sentences are parsed to assign the appropriate tag to each word using a suffix stripping algorithm, wherein the longest matching suffix is searched for in the suffix table and tags are assigned accordingly. Diab et al. (2004) [ 34 ] used a supervised machine learning approach, adopting Support Vector Machines (SVMs) trained on the Arabic Treebank to automatically tokenize, part-of-speech tag, and annotate base phrases in Arabic text.
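
To make the longest-suffix lookup concrete, the following is a minimal sketch of a suffix-table tagger. It is not the tagger of [ 136 ]: the suffix table, the tag names, and the function names are hypothetical, and English-like suffixes stand in for the Sanskrit suffix table, which is not reproduced here.

```python
# Hypothetical suffix table; English-like suffixes stand in for the
# Sanskrit table of [136], which is not given in the text.
SUFFIX_TABLE = {
    "ing": "VERB",
    "ed": "VERB",
    "ness": "NOUN",
    "ly": "ADV",
    "ous": "ADJ",
    "s": "NOUN",
}


def tag_by_longest_suffix(word, table=SUFFIX_TABLE, default="UNK"):
    """Return the tag of the longest suffix of `word` found in `table`."""
    # Candidate suffixes are tried from longest to shortest,
    # so the first hit is the longest matching suffix.
    for start in range(len(word)):
        suffix = word[start:]
        if suffix in table:
            return table[suffix]
    return default


def tag_sentence(sentence):
    """Tag every whitespace-separated word of a sentence."""
    return [(w, tag_by_longest_suffix(w.lower())) for w in sentence.split()]
```

For instance, "kindness" ends in both "ness" and "s"; because longer suffixes are tried first, the "ness" entry decides the tag. A word with no suffix in the table receives the default tag.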

Chunking is the process of separating phrases from unstructured text. Since single tokens may not represent the actual meaning of the text, it is advisable to treat a phrase such as “North Africa” as a single unit instead of the separate words ‘North’ and ‘Africa’. Chunking, also known as “shallow parsing”, labels parts of sentences with syntactic category labels such as Noun Phrase (NP) and Verb Phrase (VP). Chunking is often evaluated using the CoNLL 2000 shared task. Various researchers (Sha and Pereira, 2003; McDonald et al., 2005; Sun et al., 2008) [ 83 , 122 , 130 ] used the CoNLL test data for chunking, with features composed of words, POS tags, and tags.
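
As a rough illustration of how chunking groups tokens such as ‘North’ and ‘Africa’ into one phrase, the sketch below extracts noun-phrase chunks from already POS-tagged input. The chunk pattern (optional determiners and adjectives followed by one or more nouns), the Penn-Treebank-style tag names, and the function name are assumptions made for this example; the systems cited above learn chunk boundaries statistically rather than from a fixed pattern.

```python
def np_chunks(tagged):
    """Group (word, tag) pairs into simple noun-phrase chunks.

    A chunk here is any run of determiners (DT) and adjectives (JJ)
    followed by one or more nouns (tags starting with NN). This pattern
    is an illustrative simplification, not a full chunk grammar.
    """
    chunks, current, has_noun = [], [], False
    for word, tag in tagged:
        if tag.startswith("NN"):
            current.append(word)
            has_noun = True
        elif tag in ("DT", "JJ") and not has_noun:
            # Pre-modifiers are collected until the first noun arrives.
            current.append(word)
        else:
            # Any other tag closes the current chunk, if it has a noun.
            if has_noun:
                chunks.append(" ".join(current))
            current, has_noun = [], False
            if tag in ("DT", "JJ"):
                current.append(word)
    if has_noun:
        chunks.append(" ".join(current))
    return chunks
```

Given the tagged sentence "The trade route crosses North Africa", this groups "The trade route" and "North Africa" as two chunks and leaves the verb outside any chunk.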

Particular words in a document refer to specific entities or real-world objects such as locations, people, and organizations. To find the words that have a unique context and are more informative, noun phrases in the text documents are considered. Named entity recognition (NER) is a technique to recognize and separate the named entities and group them under predefined classes. In the era of the Internet, however, people often use slang rather than traditional or standard English, and such text cannot be processed well by standard natural language processing tools. Ritter (2011) [ 111 ] proposed the classification of named entities in tweets because standard NLP tools did not perform well on tweets. They rebuilt the NLP pipeline, starting from POS tagging and then chunking, for NER. This improved performance in comparison to standard NLP tools.

Emotion detection investigates and identifies types of emotion from speech, facial expressions, gestures, and text. Sharma (2016) [ 124 ] analyzed conversations in Hinglish, a mix of English and Hindi, and identified the usage patterns of POS. Their work was based on language identification and POS tagging of mixed script. They tried to detect emotions in mixed script by combining machine learning and human knowledge. They categorized sentences into six groups based on emotion and used the TLBO technique to help users prioritize their messages based on the emotions attached to each message. Seal et al. (2020) [ 120 ] proposed an efficient emotion detection method that searches for emotional words in a pre-defined emotional keyword database and analyzes the emotion words, phrasal verbs, and negation words. Their proposed approach exhibited better performance than recent approaches.

Semantic Role Labeling (SRL) works by assigning semantic roles to the constituents of a sentence. For example, in the PropBank (Palmer et al., 2005) [ 100 ] formalism, one assigns roles to words that are arguments of a verb in the sentence. The precise arguments depend on the verb frame, and if multiple verbs exist in a sentence, a word might have multiple tags. State-of-the-art SRL systems comprise several stages: creating a parse tree, identifying which parse tree nodes represent the arguments of a given verb, and finally classifying these nodes to compute the corresponding SRL tags.

Event discovery in social media feeds (Benson et al., 2011) [ 13 ] uses a graphical model to analyze a social media feed and determine whether it contains the name of a person, a venue, a place, a time, etc. The model operates on noisy feeds of data to extract records of events by aggregating information across multiple messages. Despite the noise of irrelevant messages and very irregular message language, the model was able to extract records with a broader array of features on factors.

We first give insights on some of the mentioned tools and relevant work done before moving to the broad applications of NLP.

3.2 Applications of NLP

Natural Language Processing can be applied in various areas such as Machine Translation, Email Spam Detection, Information Extraction, Summarization, and Question Answering. Next, we discuss some of these areas along with the relevant work done in those directions.

Machine Translation

As most of the world is online, making data accessible and available to all is a challenge. A major obstacle to making data accessible is the language barrier: there are a multitude of languages with different sentence structures and grammars. Machine translation generally means translating phrases from one language to another with the help of a statistical engine like Google Translate. The challenge for machine translation technologies lies not in directly translating words but in keeping the meaning of sentences intact along with grammar and tenses. Statistical machine learning systems gather as much data as they can find that appears to be parallel between two languages, and they crunch this data to find the likelihood that something in Language A corresponds to something in Language B. In September 2016, Google announced a new machine translation system based on artificial neural networks and deep learning. In recent years, various methods have been proposed to automatically evaluate machine translation quality by comparing hypothesis translations with reference translations. Examples of such methods are word error rate, position-independent word error rate (Tillmann et al., 1997) [ 138 ], generation string accuracy (Bangalore et al., 2000) [ 8 ], multi-reference word error rate (Nießen et al., 2000) [ 95 ], BLEU score (Papineni et al., 2002) [ 101 ], and NIST score (Doddington, 2002) [ 35 ]. All these criteria try to approximate human assessment and often achieve an astonishing degree of correlation to human subjective evaluation of fluency and adequacy (Papineni et al., 2001; Doddington, 2002) [ 35 , 101 ].
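
Of the evaluation measures listed above, word error rate is the simplest to illustrate. The sketch below computes it as the word-level edit distance between a hypothesis translation and a single reference, divided by the reference length. Whitespace tokenization, the single reference, and the function name are assumptions made for this example, not details taken from the cited papers.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

A hypothesis identical to the reference scores 0, and each missing, extra, or substituted word adds one edit. Because extra words count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.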

Text Categorization

Categorization systems take as input a large flow of data, such as official documents, military casualty reports, market data, and newswires, and assign the items to predefined categories or indices. For example, the Carnegie Group’s Construe system (Hayes, 1991) [ 54 ] takes Reuters articles as input and saves much time by doing work that would otherwise be done by staff or human indexers. Some companies use categorization systems to categorize trouble tickets or complaint requests and route them to the appropriate desks. Another application of text categorization is email spam filtering. Spam filters are becoming important as the first line of defence against unwanted emails. The problem of false negatives and false positives in spam filters is at the heart of NLP technology, as it comes down to the challenge of extracting meaning from strings of text. A filtering solution applied to an email system uses a set of protocols to determine which incoming messages are spam and which are not. Several types of spam filters are available. Content filters: Review the content within the message to determine whether it is spam or not. Header filters: Review the email header looking for fake information. General blacklist filters: Stop all emails from blacklisted senders. Rules-based filters: Use user-defined criteria, such as stopping mail from a specific person or mail that includes a specific word. Permission filters: Require anyone sending a message to be pre-approved by the recipient. Challenge-response filters: Require anyone sending a message to enter a code to gain permission to send email.

Spam Filtering

Spam filtering works using text categorization, and in recent times various machine learning techniques have been applied to text categorization or anti-spam filtering, such as Rule Learning (Cohen, 1996) [ 27 ], Naïve Bayes (Sahami et al., 1998; Androutsopoulos et al., 2000; Rennie, 2000) [ 5 , 109 , 115 ], Memory-based Learning (Sakkis et al., 2000b) [ 117 ], Support Vector Machines (Drucker et al., 1999) [ 36 ], Decision Trees (Carreras and Marquez, 2001) [ 19 ], the Maximum Entropy Model (Berger et al., 1996) [ 14 ], and Hash Forest with a rule encoding method (T. Xia, 2020) [ 153 ], sometimes combining different learners (Sakkis et al., 2001) [ 116 ]. These approaches are preferable because the classifier is learned from training data rather than built by hand. Naïve Bayes is preferred because of its performance despite its simplicity (Lewis, 1998) [ 67 ]. In text categorization, two types of models have been used (McCallum and Nigam, 1998) [ 77 ]. Both models assume that a fixed vocabulary is present. In the first model, a document is generated by first choosing a subset of the vocabulary and then using the selected words any number of times, at least once, irrespective of order. This is called the multi-variate Bernoulli model. It captures which words are used in a document, irrespective of how many times they occur and in what order. In the second model, a document is generated by choosing a set of word occurrences and arranging them in any order. This is called the multinomial model; in addition to what the multi-variate Bernoulli model captures, it also captures how many times a word is used in a document. Most text categorization approaches to anti-spam email filtering have used the multi-variate Bernoulli model (Androutsopoulos et al., 2000) [ 5 ] [ 15 ].
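
The difference between the two document models is easiest to see in the feature vectors they imply. The sketch below builds both representations over a small fixed vocabulary; the vocabulary, the whitespace tokenization, and the function names are hypothetical and chosen only for illustration. It shows the document representations only, not a full Naïve Bayes classifier.

```python
from collections import Counter

# Hypothetical fixed vocabulary, assumed known in advance by both models.
VOCABULARY = ["free", "money", "meeting", "offer"]


def bernoulli_features(text, vocab=VOCABULARY):
    """Multi-variate Bernoulli view: 1 if the word occurs at all, else 0."""
    present = set(text.lower().split())
    return [1 if word in present else 0 for word in vocab]


def multinomial_features(text, vocab=VOCABULARY):
    """Multinomial view: how many times each vocabulary word occurs."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]
```

For the message "Free money free offer free", the Bernoulli view records only that "free", "money", and "offer" are present, while the multinomial view also records that "free" occurs three times. Both views ignore word order.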

Information Extraction

Information extraction is concerned with identifying phrases of interest in textual data. For many applications, extracting entities such as names, places, events, dates, times, and prices is a powerful way of summarizing the information relevant to a user’s needs. In the case of a domain-specific search engine, the automatic identification of important information can increase the accuracy and efficiency of a directed search. Hidden Markov models (HMMs) have been used to extract the relevant fields of research papers. These extracted text segments are used to allow searches over specific fields, to provide effective presentation of search results, and to match references to papers. A familiar example is the pop-up ads on websites that show, with discounts, items you recently looked at in an online store. In information retrieval, two types of models have been used (McCallum and Nigam, 1998) [ 77 ]. Both models assume that a fixed vocabulary is present. In the first model, a document is generated by first choosing a subset of the vocabulary and then using the selected words any number of times, at least once, without any order. This is called the multi-variate Bernoulli model. It captures which words are used in a document, irrespective of how many times they occur and in what order. In the second model, a document is generated by choosing a set of word occurrences and arranging them in any order. This is called the multinomial model; in addition to what the multi-variate Bernoulli model captures, it also captures how many times a word is used in a document.

Knowledge discovery has become an important area of research in recent years. Knowledge discovery research uses a variety of techniques to extract useful information from source documents, such as: Parts of Speech (POS) tagging; Chunking or Shallow Parsing; Stop-words (frequently used words that must be removed before processing documents); Stemming (mapping words to some base form; it has two methods, dictionary-based stemming and Porter-style stemming (Porter, 1980) [ 103 ]. The former has higher accuracy but a higher implementation cost, while the latter has a lower implementation cost and is usually insufficient for IR); Compound or Statistical Phrases (compounds and statistical phrases index multi-token units instead of single tokens); and Word Sense Disambiguation (the task of understanding the correct sense of a word in context; when used for information retrieval, terms are replaced by their senses in the document vector).

The extracted information can be applied for a variety of purposes, for example to prepare a summary, build databases, identify keywords, or classify text items according to pre-defined categories. For example, CONSTRUE, developed for Reuters, is used to classify news stories (Hayes, 1992) [ 54 ]. It has been suggested that although many IE systems can successfully extract terms from documents, acquiring relations between the terms is still a difficulty. PROMETHEE is a system that extracts lexico-syntactic patterns relative to a specific conceptual relation (Morin, 1999) [ 89 ]. IE systems should work at many levels, from word recognition to discourse analysis at the level of the complete document. The Blank Slate Language Processor (BSLP) approach (Bondale et al., 1999) [ 16 ] has been applied to the analysis of a real-life natural language corpus consisting of responses to open-ended questionnaires in the field of advertising.

There is a system called MITA (Metlife’s Intelligent Text Analyzer) (Glasgow et al., 1998) [ 48 ] that extracts information from life insurance applications. Ahonen et al. (1998) [ 1 ] suggested a mainstream framework for text mining that uses pragmatic and discourse-level analyses of text.

Summarization

Information overload is a real problem in this digital age, and our reach and access to knowledge and information already exceed our capacity to understand it. This trend is not slowing down, so the ability to summarize data while keeping the meaning intact is highly desirable. This is important not only for recognizing and understanding the important information in a large set of data; it is also used to understand deeper emotional meanings. For example, a company can determine the general sentiment on social media about its latest product offering. This application is a valuable marketing asset.

Text summarization can be categorized by the number of documents into two important categories: single-document summarization and multi-document summarization (Zajic et al. 2008 [ 159 ]; Fattah and Ren 2009 [ 43 ]). Summaries can also be of two types: generic or query-focused (Gong and Liu 2001 [ 50 ]; Dunlavy et al. 2007 [ 37 ]; Wan 2008 [ 144 ]; Ouyang et al. 2011 [ 99 ]). The summarization task can be either supervised or unsupervised (Mani and Maybury 1999 [ 74 ]; Fattah and Ren 2009 [ 43 ]; Riedhammer et al. 2010 [ 110 ]). A supervised system requires training data for selecting relevant material from the documents, and learning techniques need a large amount of annotated data. A few techniques are as follows:

Bayesian Sentence-based Topic Model (BSTM) uses both term-sentence and term-document associations for summarizing multiple documents (Wang et al. 2009 [ 146 ]).

Factorization with Given Bases (FGB) is a language model where sentence bases are the given bases, and it utilizes document-term and sentence-term matrices. This approach groups and summarizes the documents simultaneously (Wang et al. 2011 [ 147 ]).

Topic Aspect-Oriented Summarization (TAOS) is based on topic factors. These topic factors are various features that describe topics; for example, capitalized words are used to represent entities. Different topics can have different aspects, and different preferences of features are used to represent different aspects (Fang et al. 2015 [ 42 ]).

Dialogue System

Dialogue systems are very prominent in real-world applications, ranging from providing support to performing a particular action. Support dialogue systems require context awareness, whereas systems that perform an action do not require much context awareness. Earlier dialogue systems focused on small applications such as home theater systems. These dialogue systems utilize the phonemic and lexical levels of language. Habitable dialogue systems offer the potential for fully automated dialogue systems by utilizing all levels of a language (Liddy, 2001) [ 68 ]. This leads to systems that enable robots to interact with humans in natural language, such as Google Assistant, Microsoft’s Cortana, Apple’s Siri, and Amazon’s Alexa.

NLP is applied in the medical field as well. The Linguistic String Project-Medical Language Processor (LSP-MLP) is one of the large-scale NLP projects in the field of medicine [ 21 , 53 , 57 , 71 , 114 ]. The LSP-MLP helps physicians extract and summarize information on signs or symptoms, drug dosage, and response data, with the aim of identifying possible side effects of any medicine while highlighting or flagging data items [ 114 ]. The National Library of Medicine is developing the Specialist System [ 78 , 79 , 80 , 82 , 84 ]. It is expected to function as an information extraction tool for biomedical knowledge bases, particularly Medline abstracts. The lexicon was created using MeSH (Medical Subject Headings), Dorland’s Illustrated Medical Dictionary, and general English dictionaries. The Centre d’Informatique Hospitaliere of the Hopital Cantonal de Geneve is working on an electronic archiving environment with NLP features [ 81 , 119 ]. In the first phase, patient records were archived. At a later stage, the LSP-MLP was adapted for French [ 10 , 72 , 94 , 113 ], and finally a proper NLP system called RECIT [ 9 , 11 , 17 , 106 ] was developed using a method called Proximity Processing [ 88 ]. Its task was to implement a robust and multilingual system able to analyze and comprehend medical sentences and to preserve the knowledge in free text in a language-independent knowledge representation [ 107 , 108 ]. Columbia University in New York has developed an NLP system called MEDLEE (MEDical Language Extraction and Encoding System) that identifies clinical information in narrative reports and transforms the textual information into a structured representation [ 45 ].

3.3 NLP in talk

We next discuss some of the recent NLP projects implemented by various companies:

ACE Powered GDPR Robot Launched by RAVN Systems [ 134 ]

RAVN Systems, a leading expert in Artificial Intelligence (AI), Search and Knowledge Management Solutions, announced the launch of a software robot powered by RAVN ACE (“Applied Cognitive Engine”) to help facilitate GDPR (“General Data Protection Regulation”) compliance. The Robot uses AI techniques to automatically analyze documents and other types of data in any business system that is subject to GDPR rules. It allows users to quickly and easily search, retrieve, flag, classify, and report on data deemed to be highly sensitive under GDPR. Users can also identify personal data in documents, view feeds on the latest personal data that requires attention, and produce reports on the data suggested for deletion or securing. RAVN’s GDPR Robot is also able to speed up requests for information (Data Subject Access Requests, “DSAR”) in a simple and efficient way, removing the need for a manual approach to these requests, which tends to be very labor-intensive. Peter Wallqvist, CSO at RAVN Systems, commented that GDPR compliance is of universal importance, as it affects any organization that controls and processes data concerning EU citizens.

Link: http://markets.financialcontent.com/stocks/news/read/33888795/RAVN_Systems_Launch_the_ACE_Powered_GDPR_Robot

Eno A Natural Language Chatbot Launched by Capital One [ 56 ]

Capital One has announced a chatbot for customers called Eno. Eno is a natural language chatbot that people interact with through texting. Capital One claims that Eno is the first natural language SMS chatbot from a U.S. bank that allows customers to ask questions using natural language. Customers can interact with Eno, asking questions about their savings and other matters through a text interface. Eno creates an environment in which it feels as though one is interacting with a human. This is a different platform from those used by other brands, which launch chatbots on services like Facebook Messenger and Skype. Capital One believed that Facebook has too much access to a person’s private information, which could get the bank into trouble with the privacy laws U.S. financial institutions work under. For example, a Facebook Page admin can access full transcripts of the bot’s conversations. If that were the case, admins could easily view customers’ personal banking information, which is not acceptable.

Link: https://www.macobserver.com/analysis/capital-one-natural-language-chatbot-eno/

Future of BI in Natural Language Processing [ 140 ]

Several companies in the BI space are trying to keep up with the trend and working hard to ensure that data becomes friendlier and more easily accessible, but there is still a long way to go. NLP will also make BI easier to access, as a GUI is not needed, because nowadays queries are made by text or voice command on smartphones. One of the most common examples is that Google can tell you today what tomorrow’s weather will be. But soon enough, we will be able to ask our personal data chatbot about customer sentiment today, and how customers will feel about the brand next week, all while walking down the street. Today, NLP tends to be based on turning natural language into machine language. But as the technology matures, especially the AI component, the computer will get better at “understanding” the query and start to deliver answers rather than search results. Initially, the data chatbot will probably be asked a question such as ‘how have revenues changed over the last three quarters?’ and then return pages of data for you to analyze. But once it learns the semantic relations and inferences of the question, it will be able to automatically perform the filtering and formulation necessary to provide an intelligible answer, rather than simply showing you data.

Link: http://www.smartdatacollective.com/eran-levy/489410/here-s-why-natural-language-processing-future-bi

Using Natural Language Processing and Network Analysis to Develop a Conceptual Framework for Medication Therapy Management Research [ 97 ]

This study describes a theory derivation process used to develop a conceptual framework for medication therapy management (MTM) research. The MTM service model and the chronic care model were selected as parent theories. Review article abstracts targeting medication therapy management in chronic disease care were retrieved from Ovid Medline (2000–2016). Unique concepts in each abstract were extracted using MetaMap, and their pair-wise co-occurrences were determined. This information was then used to construct a network graph of concept co-occurrence, which was further analyzed to identify content for the new conceptual model. In total, 142 abstracts were analyzed. Medication adherence was the most studied drug therapy problem and co-occurred with concepts related to patient-centered interventions targeting self-management. The enhanced model consists of 65 concepts clustered into 14 constructs. The framework requires additional refinement and evaluation to determine its relevance and applicability across a broad audience, including underserved settings.
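
The pair-wise co-occurrence step can be sketched as follows. This is an illustrative reconstruction rather than the procedure of [ 97 ]: the concept extraction itself is assumed to have been done already, each abstract is represented as a plain list of concept strings, and the function name is hypothetical. The resulting counts can serve as edge weights of a co-occurrence network.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(concepts_per_abstract):
    """Count, for each concept pair, the abstracts in which both appear."""
    counts = Counter()
    for concepts in concepts_per_abstract:
        # Each concept is counted once per abstract; sorting gives every
        # pair a single canonical (alphabetical) orientation.
        unique = sorted(set(concepts))
        for first, second in combinations(unique, 2):
            counts[(first, second)] += 1
    return counts
```

Each key of the returned counter is an edge between two concepts and each value is the number of abstracts supporting that edge, so frequently paired concepts appear as heavily weighted edges.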

Link: https://www.ncbi.nlm.nih.gov/pubmed/28269895?dopt=Abstract

Meet the Pilot, world’s first language translating earbuds [ 96 ]

The world’s first smart earpiece, Pilot, will soon be able to translate over 15 languages. According to Springwise, Waverly Labs’ Pilot can already translate five spoken languages (English, French, Italian, Portuguese, and Spanish) and seven written languages (German, Hindi, Russian, Japanese, Arabic, Korean and Mandarin Chinese). The Pilot earpiece is connected via Bluetooth to the Pilot speech translation app, which uses speech recognition, machine translation, machine learning, and speech synthesis technology. The user simultaneously hears the translated version of the speech in the second earpiece. Moreover, the conversation need not be between only two people; other users can join in and discuss as a group. As of now, the user may experience a lag of a few seconds between the speech and its translation, which Waverly Labs aims to reduce. The Pilot earpiece will be available from September but can be pre-ordered now for $249. The earpieces can also be used for streaming music, answering voice calls, and getting audio notifications.

Link: https://www.indiegogo.com/projects/meet-the-pilot-smart-earpiece-language-translator-headphones-travel#/

4 Datasets in NLP and state-of-the-art models

The objective of this section is to present the various datasets used in NLP and some state-of-the-art models in NLP.

4.1 Datasets in NLP

A corpus is a collection of linguistic data, either compiled from written texts or transcribed from recorded speech. Corpora are intended primarily for testing linguistic hypotheses, e.g., to determine how a certain sound, word, or syntactic construction is used across a culture or language. There are various types of corpus. In an annotated corpus, the implicit information in the plain text has been made explicit by specific annotations. An un-annotated corpus contains plain text in its raw state. Different languages can be compared using a reference corpus. Monitor corpora are non-finite collections of texts which are mostly used in lexicography. A multilingual corpus contains small collections of monolingual corpora based on the same sampling procedure and categories for different languages. A parallel corpus contains texts in one language and their translations into other languages, aligned sentence by sentence or phrase by phrase. A reference corpus contains text of spoken (formal and informal) and written (formal and informal) language which represents various social and situational contexts. A speech corpus contains recorded speech, transcriptions of the recordings, and the time at which each word occurred in the recorded speech. There are various datasets available for natural language processing; some of these are listed below for different use cases:

Sentiment Analysis: Sentiment analysis is a rapidly expanding field of natural language processing (NLP) used in a variety of fields such as politics and business. Widely used datasets for sentiment analysis are:

Stanford Sentiment Treebank (SST): Socher et al. introduced SST, which contains sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences from movie reviews and poses novel challenges for sentiment compositionality [ 127 ].

Sentiment140: It contains 1.6 million tweets annotated with negative, neutral and positive labels.

Paper Reviews: It provides reviews of computing and informatics conferences written in English and Spanish languages. It has 405 reviews which are evaluated on a 5-point scale ranging from very negative to very positive.

IMDB: For natural language processing, text analytics, and sentiment analysis, this dataset offers thousands of movie reviews split into training and test datasets. This dataset was introduced by Maas et al. in 2011 [ 73 ].

Sentiraama: G. Rama Rohit Reddy of the Language Technologies Research Centre, KCIS, IIIT Hyderabad, created the corpus “Sentiraama.” The corpus is divided into four datasets, each of which is annotated with a two-value scale that distinguishes between positive and negative sentiment at the document level. The corpus contains data from a variety of fields, including book reviews, product reviews, movie reviews, and song lyrics. The annotators meticulously followed the annotation technique for each of them. The folder “Song Lyrics” in the corpus contains 339 Telugu song lyrics written in Telugu script [ 121 ].

Language Modelling: Language models analyse text data to calculate word probabilities. They use an algorithm to interpret the data, which establishes rules for context in natural language. The model then uses these rules to predict or construct new sentences. In essence, the model learns the basic characteristics and features of a language and then applies them to new phrases. Widely used datasets for language modelling are as follows:

Salesforce’s WikiText-103 dataset has 103 million tokens collected from 28,475 featured articles from Wikipedia.

WikiText-2 is a scaled-down version of WikiText-103. It contains 2 million tokens with a vocabulary size of 33,278.

The Penn Treebank portion of the Wall Street Journal corpus includes 929,000 tokens for training, 73,000 tokens for validation, and 82,000 tokens for testing. Its context is limited since it comprises sentences rather than paragraphs [ 76 ].

The Ministry of Electronics and Information Technology’s Technology Development Programme for Indian Languages (TDIL) launched its own data distribution portal ( www.tdil-dc.in ) which has cataloged datasets [ 24 ].
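To make the idea of a language model that "calculates word probability" concrete, the following is a minimal sketch of a bigram model. It is only an illustration: the toy corpus, the function names `train_bigram_model` and `predict_next`, and the `<s>`/`</s>` boundary markers are assumptions introduced here and are not taken from any of the datasets or systems listed above.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next word | previous word) from raw sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Hypothetical boundary markers so sentence starts/ends are modelled too.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {word: c / total for word, c in followers.items()}
    return model


def predict_next(model, word):
    """Return the most probable next word, or None if the word was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)


# Toy corpus (an assumption for illustration only).
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
bigram_model = train_bigram_model(toy_corpus)
print(predict_next(bigram_model, "the"))
```

Real language models trained on corpora such as WikiText-103 use far richer context than a single preceding word, but the underlying principle of estimating the probability of the next token from observed text is the same.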

Machine Translation: The task of converting text in one natural language into another language while preserving the sense of the input text is known as machine translation. Widely used datasets are as follows:

Tatoeba is a collection of multilingual sentence pairings. A tab-delimited pair of an English text sequence and the translated French text sequence appears on each line of the dataset. Each text sequence might be as simple as a single sentence or as complex as a paragraph of many sentences.

The Europarl parallel corpus is derived from the European Parliament’s proceedings. It is available in 21 European languages [ 40 ].

WMT14 provides machine translation pairs for English-German and English-French. These datasets comprise 4.5 million and 35 million sentence pairs, respectively. Byte-Pair Encoding with 32K merge operations is used to encode the phrases.

There are around 160,000 sentence pairs in the IWSLT 14 dataset, which covers the English-German (En-De) and German-English (De-En) directions. There are around 200K training sentence pairs in the IWSLT 13 dataset.

The IIT Bombay English-Hindi corpus comprises parallel corpora for English-Hindi as well as monolingual Hindi corpora gathered from several existing sources and corpora generated over time at IIT Bombay’s Centre for Indian Language Technology.
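The tab-delimited layout described above for the Tatoeba English-French pairs can be read with a few lines of code. The sketch below is a minimal illustration; the function name `parse_pairs` and the sample text are hypothetical, and the choice to skip malformed lines and to ignore any extra columns is an assumption rather than part of the dataset's specification.

```python
def parse_pairs(text):
    """Parse lines of the form 'source<TAB>target' into (source, target) tuples."""
    pairs = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            # Assumption: blank or malformed lines are skipped silently.
            continue
        # Assumption: any columns beyond the first two are ignored.
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


# Hypothetical sample in the tab-delimited layout.
sample = "Go.\tVa !\nI'm home.\tJe suis chez moi.\n\nmalformed line\n"
pairs = parse_pairs(sample)
for english, french in pairs:
    print(english, "->", french)
```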

Question Answering System: Question answering systems provide real-time responses and are widely used in customer care services. The datasets used for dialogue systems and question answering systems are as follows:

Stanford Question Answering Dataset (SQuAD): It is a reading comprehension dataset made up of questions posed by crowd workers on a collection of Wikipedia articles.

Natural Questions: It is a large-scale corpus presented by Google used for training and assessing open-domain question answering systems. It includes 300,000 naturally occurring queries as well as human-annotated responses from Wikipedia pages for use in QA system training.

Question Answering in Context (QuAC): This dataset is used to describe, comprehend, and participate in information seeking conversation. In this dataset, instances are made up of an interactive discussion between two crowd workers: a student who asks a series of open-ended questions about an unknown Wikipedia text, and a teacher who responds by offering brief extracts from the text.

Neural learning models are overtaking traditional models for NLP [ 64 , 127 ]. In [ 64 ], the authors used a CNN (Convolutional Neural Network) model for sentiment analysis of movie reviews and achieved 81.5% accuracy. The results illustrate that a CNN was an appropriate replacement for state-of-the-art methods. The authors of [ 127 ] combined SST and a Recursive Neural Tensor Network for sentiment analysis of single sentences. This model improves the accuracy of sentence classification by 5.4% compared to traditional NLP models. The authors of [ 135 ] proposed a combined Recurrent Neural Network and Transformer model for sentiment analysis. This hybrid model was tested on three different datasets (Twitter US Airline Sentiment, IMDB, and Sentiment140) and achieved F1 scores of 91%, 93%, and 90%, respectively. This model outperformed the state-of-the-art methods.

Santoro et al. [ 118 ] introduced a relational recurrent neural network with the capacity to learn to classify information and perform complex reasoning based on the interactions between compartmentalized information. They used a relational memory core (RMC) to handle such interactions. The model was tested for language modeling on three different datasets (GigaWord, Project Gutenberg, and WikiText-103). Further, they compared the performance of their model with traditional approaches for dealing with relational reasoning on compartmentalized information. The results achieved with the RMC show improved performance.

Merity et al. [ 86 ] extended conventional word-level language models based on Quasi-Recurrent Neural Network and LSTM to handle granularity at both the character and word level. They tuned the parameters for character-level modeling using the Penn Treebank dataset and for word-level modeling using WikiText-103. In both cases, their model outperformed the state-of-the-art methods.

Luong et al. [ 70 ] used neural machine translation on the WMT14 dataset to translate English text into French. The model demonstrated a significant improvement of up to 2.8 bilingual evaluation understudy (BLEU) points compared to various neural machine translation systems. It outperformed the commonly used MT systems on the WMT14 dataset.

Fan et al. [ 41 ] introduced a gradient-based neural architecture search algorithm that automatically finds architectures with better performance than the Transformer and conventional NMT models. They tested their model on WMT14 (English-German translation), IWSLT14 (German-English translation), and WMT18 (Finnish-to-English translation) and achieved 30.1, 36.1, and 26.4 BLEU points, respectively, which is better than the Transformer baselines.

Wiese et al. [ 150 ] introduced a deep learning approach based on domain adaptation techniques for handling biomedical question answering tasks. Their model achieved state-of-the-art performance on biomedical question answering and outperformed the previous state-of-the-art methods in that domain.

Seunghak et al. [ 158 ] designed a Memory-Augmented-Machine-Comprehension-Network (MAMCN) to handle dependencies faced in reading comprehension. The model achieved state-of-the-art performance at the document level on the TriviaQA and QUASAR-T datasets, and at the paragraph level on the SQuAD dataset.

Xie et al. [ 154 ] proposed a neural architecture where candidate answers and their representation learning are constituent centric, guided by a parse tree. Under this architecture, the search space of candidate answers is reduced while preserving the hierarchical, syntactic, and compositional structure among constituents. Using SQuAD, the model delivers state-of-the-art performance.

4.2 State-of-the-art models in NLP

The rationalist or symbolic approach assumes that a crucial part of the knowledge in the human mind is not derived from the senses but is fixed in advance, probably by genetic inheritance. Noam Chomsky was the strongest advocate of this approach. It was believed that machines could be made to function like the human brain by giving them some fundamental knowledge and a reasoning mechanism; linguistic knowledge is directly encoded in rules or other forms of representation. This helps the automatic processing of natural languages [ 92 ]. Statistical and machine learning approaches entail the development of algorithms that allow a program to infer patterns. An iterative learning phase is used to optimize the numerical parameters that characterize a given algorithm’s underlying model against a numerical measure. Machine-learning models can be broadly categorized as either generative or discriminative. Generative methods create rich models of probability distributions, because of which they can generate synthetic data. Discriminative methods are more functional: they directly estimate posterior probabilities based on observations. Srihari [ 129 ] illustrates the difference with the task of identifying an unknown speaker’s language: a generative approach would apply deep knowledge of numerous languages to perform the match, whereas a discriminative approach relies on a less knowledge-intensive method that uses the distinctions between languages. Generative models can become troublesome when many features are used, whereas discriminative models allow the use of more features [ 38 ]. Examples of discriminative methods are logistic regression and conditional random fields (CRFs); examples of generative methods are Naive Bayes classifiers and hidden Markov models (HMMs).

Naive Bayes Classifiers

Naive Bayes is a probabilistic algorithm based on probability theory and Bayes’ Theorem that predicts the tag of a text such as a news item or a customer review. It calculates the probability of each tag for the given text and returns the tag with the highest probability. Bayes’ Theorem is used to predict the probability of a feature based on prior knowledge of conditions that might be related to that feature. Naïve Bayes classifiers are applied to usual NLP tasks such as segmentation and translation, but they have also been explored in less usual areas such as segmentation for infant learning and distinguishing documents that contain opinions from those that contain facts. Anggraeni et al. (2019) [ 61 ] used ML and AI to create a question-and-answer system for retrieving information about hearing loss. They developed I-Chat Bot, which understands the user input, provides an appropriate response, and produces a model that can be used to search for information about hearing impairments. The problem with Naïve Bayes is that we may end up with zero probabilities when, for a certain class, we meet words in the test data that are not present in the training data.
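The tag-scoring procedure and the zero-probability problem described above can be illustrated with a small sketch. The training reviews, the tag names and the helper functions below are hypothetical. The remedy shown, add-one (Laplace) smoothing controlled by the parameter `alpha`, is a commonly used technique that is not discussed in this paper and is included here only as an assumption about how the problem is typically handled.

```python
import math
from collections import Counter


def train(docs):
    """docs: list of (text, tag). Returns the counts needed for prediction."""
    tag_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, tag in docs:
        tag_counts[tag] += 1
        words = text.lower().split()
        word_counts.setdefault(tag, Counter()).update(words)
        vocab.update(words)
    return tag_counts, word_counts, vocab


def log_posterior(text, tag, model, alpha):
    """Unnormalized log P(tag | text) under the naive independence assumption."""
    tag_counts, word_counts, vocab = model
    score = math.log(tag_counts[tag] / sum(tag_counts.values()))  # log prior
    total_words = sum(word_counts[tag].values())
    for word in text.lower().split():
        numerator = word_counts[tag][word] + alpha
        denominator = total_words + alpha * len(vocab)
        if numerator == 0:
            # Without smoothing, one unseen word zeroes out the whole tag.
            return float("-inf")
        score += math.log(numerator / denominator)
    return score


def predict(text, model, alpha=1.0):
    """Return the tag with the highest posterior score."""
    return max(model[0], key=lambda tag: log_posterior(text, tag, model, alpha))


# Hypothetical toy training data.
docs = [
    ("great phone love it", "positive"),
    ("love the screen", "positive"),
    ("terrible battery", "negative"),
    ("awful screen hate it", "negative"),
]
model = train(docs)
review = "love the battery"
print(log_posterior(review, "positive", model, alpha=0.0))  # zero-probability case
print(predict(review, model, alpha=1.0))
```

With `alpha=0.0` the word "battery" never occurs in a positive review, so the positive tag receives probability zero regardless of the other words; with `alpha=1.0` every word keeps a small non-zero probability and the remaining evidence can still decide the tag.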

Hidden Markov Model (HMM)

An HMM is a system that shifts between several states, generating a feasible output symbol with each switch. The sets of viable states and unique symbols may be large, but they are finite and known. We can observe the outputs, but the system’s internals are hidden. Some of the problems that can be solved are as follows. Inference: given a certain sequence of output symbols, compute the probabilities of one or more candidate state-switch sequences. Pattern matching: find the state-switch sequence most likely to have generated a particular output-symbol sequence. Training: given output-symbol chain data, estimate the state-switch/output probabilities that fit this data best.
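The task of finding the state sequence most likely to have generated an observed output sequence is commonly solved with the Viterbi algorithm, which the paper does not name; the sketch below is offered under that assumption. The two hidden states, the three output symbols and all probabilities are a hypothetical toy model chosen only for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               the state that preceded s on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return probability, path


# Hypothetical toy model: hidden weather states, observed activities.
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emit_p = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
probability, path = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(path, probability)
```

The training problem, estimating the transition and output probabilities from data, requires a separate procedure and is not covered by this sketch.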

Hidden Markov Models are extensively used for speech recognition, where the output sequence is matched to the sequence of individual phonemes. HMMs are not restricted to this application; they have several others, such as bioinformatics problems, for example multiple sequence alignment [ 128 ]. Sonnhammer mentioned that Pfam holds multiple alignments and hidden Markov model-based profiles (HMM-profiles) of entire protein domains. The determination of domain boundaries, family members and alignments is done semi-automatically, based on expert knowledge, sequence similarity, other protein family databases and the capability of HMM-profiles to correctly identify and align the members. HMMs may be used for a variety of NLP applications, including word prediction, sentence production, quality assurance, and intrusion detection systems [ 133 ].

Neural Network

Earlier, machine learning techniques such as Naïve Bayes and HMM were widely used for NLP, but by the end of 2010 neural networks had transformed and enhanced NLP tasks by learning multilevel features. A major use of neural networks in NLP is word embedding, where words are represented in the form of vectors. These vectors can be used to recognize similar words by observing their closeness in the vector space. Other uses of neural networks are observed in information retrieval, text summarization, text classification, machine translation, sentiment analysis and speech recognition. Initially the focus was on feedforward [ 49 ] and CNN (convolutional neural network) architectures [ 69 ], but later researchers adopted recurrent neural networks to capture the context of a word with respect to the surrounding words of a sentence. LSTM (Long Short-Term Memory), a variant of RNN, is used in various tasks such as word prediction and sentence topic prediction [ 47 ]. To observe the word arrangement in both forward and backward directions, researchers have explored bi-directional LSTM [ 59 ]. In machine translation, an encoder-decoder architecture is used where the dimensionality of the input and output vectors is not known. Neural networks can be used to anticipate a state that has not yet been seen, such as future states for which predictors exist, whereas an HMM predicts hidden states.
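The idea of recognizing similar words by their closeness in a vector space can be sketched with cosine similarity, one common closeness measure (the paper does not specify which measure is meant, so this choice is an assumption). The three-dimensional vectors below are invented for illustration; real embeddings are learned from data and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to the given word's vector."""
    return max(
        (w for w in embeddings if w != word),
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )


# Hypothetical toy embeddings (not learned from any corpus).
embeddings = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}
print(most_similar("king", embeddings))
```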

Bi-directional Encoder Representations from Transformers (BERT) is a model pre-trained on unlabeled text from BookCorpus and English Wikipedia. It can be fine-tuned to capture context for various NLP tasks such as question answering, sentiment analysis, text classification, sentence embedding, and interpreting ambiguity in text [ 25 , 33 , 90 , 148 ]. Earlier language models examine the text in only one direction, which suits sentence generation by predicting the next word, whereas the BERT model examines the text in both directions simultaneously for better language understanding. BERT provides a contextual embedding for each word present in the text, unlike context-free models (word2vec and GloVe). For example, in the sentences “he is going to the riverbank for a walk” and “he is going to the bank to withdraw some money”, word2vec will have one vector representation for “bank” in both sentences, whereas BERT will have a different vector representation for “bank” in each. Muller et al. [ 90 ] used the BERT model to analyze tweets with COVID-19 content. The use of the BERT model in the legal domain was explored by Chalkidis et al. [ 20 ].

BERT considers at most 512 tokens, so a long text sequence must be divided into multiple shorter sequences of at most 512 tokens each. This is a limitation of BERT: it cannot handle long text sequences directly.
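Dividing a long token sequence into pieces that fit the 512-token limit can be sketched as follows. The helper `split_into_windows` and its `overlap` parameter are hypothetical and are not part of BERT or of any particular library; letting consecutive windows overlap, so that context at the boundaries is not lost, is a common practice assumed here rather than something stated in this paper.

```python
def split_into_windows(tokens, max_len=512, overlap=0):
    """Split a token list into consecutive windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens. In practice some positions
    are also reserved for special tokens such as [CLS] and [SEP], so a
    smaller max_len would be passed; that detail is left to the caller.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the text
    return windows


# Hypothetical input: 1200 placeholder token ids.
tokens = list(range(1200))
plain = split_into_windows(tokens)
overlapping = split_into_windows(tokens, overlap=128)
print([len(w) for w in plain])
print([len(w) for w in overlapping])
```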

5 Evaluation metrics and challenges

The objective of this section is to discuss evaluation metrics used to evaluate the model’s performance and involved challenges.

5.1 Evaluation metrics

Since the number of labels in most classification problems is fixed, it is easy to determine the score for each class and, as a result, the loss from the ground truth. In image generation problems, the output resolution and ground truth are both fixed. As a result, we can calculate the loss at the pixel level using the ground truth. In NLP, however, although the output format is predetermined, its dimensions cannot be specified. This is because a single statement can be expressed in multiple ways without changing its intent and meaning. Evaluation metrics are important for evaluating a model’s performance, particularly if we are trying to solve two problems with one model.

BLEU (BiLingual Evaluation Understudy) Score: Each word in the output sentence is scored 1 if it appears in any of the reference sentences and 0 if it does not. The number of words that appeared in one of the reference translations is then divided by the total number of words in the output sentence to normalize the count so that it is always between 0 and 1. For example, suppose the ground truth is “He is playing chess in the backyard” and the output sentences are S1: “He is playing tennis in the backyard”, S2: “He is playing badminton in the backyard”, S3: “He is playing movie in the backyard” and S4: “backyard backyard backyard backyard backyard backyard backyard”. The scores of S1, S2 and S3 would be 6/7, 6/7 and 6/7, while S4 would score 7/7 even though it is meaningless. S1, S2 and S3 all get the same score although the information in S1 and S3 is not the same. This is because BLEU assumes that the words in a sentence contribute equally to its meaning, which is not the case in real-world scenarios. Using a combination of uni-grams, bi-grams and higher-order n-grams, we can capture the order of words in a sentence. We may also set a limit on how many times each word is counted based on how many times it appears in each reference phrase, which helps us prevent excessive repetition from being rewarded.
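The word-matching score described above, and the effect of limiting how often each word may be counted, can be reproduced with a short sketch. The function `unigram_precision` is a hypothetical helper that covers only the unigram case with a single reference; the full BLEU metric additionally combines higher-order n-gram precisions and applies a brevity penalty, neither of which is implemented here.

```python
from collections import Counter


def unigram_precision(candidate, reference, clip=False):
    """Fraction of candidate words that appear in the reference.

    With clip=True, each word is credited at most as many times as it
    occurs in the reference, which penalizes excessive repetition.
    """
    cand_words = candidate.lower().split()
    ref_counts = Counter(reference.lower().split())
    if not cand_words:
        return 0.0
    if clip:
        cand_counts = Counter(cand_words)
        matches = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    else:
        matches = sum(1 for w in cand_words if w in ref_counts)
    return matches / len(cand_words)


reference = "He is playing chess in the backyard"
s1 = "He is playing tennis in the backyard"
s4 = "backyard backyard backyard backyard backyard backyard backyard"

print(unigram_precision(s1, reference))             # 6 of 7 words match
print(unigram_precision(s4, reference))             # every word matches
print(unigram_precision(s4, reference, clip=True))  # only one match is credited
```

Without clipping, the degenerate sentence S4 obtains a perfect score; with clipping its score drops to 1/7, while S1 is unaffected because none of its words is repeated.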

GLUE (General Language Understanding Evaluation) score: Previously, NLP models were almost always built to perform effectively on a single task. Various models such as LSTM and Bi-LSTM were trained solely for that task and very rarely generalized to other tasks. For example, a model built for named entity recognition could not generally be reused for textual entailment. GLUE is a set of datasets for training, assessing, and comparing NLP models. It includes nine diverse task datasets designed to test a model’s language understanding. To obtain a comprehensive assessment of a model’s performance, GLUE tests the model on a variety of tasks rather than a single one. Single-sentence tasks, similarity and paraphrase tasks, and inference tasks are among them. For example, in sentiment analysis of customer reviews, we might be interested in analyzing ambiguous reviews and determining which product the client is referring to in the reviews. Thus, the model obtains a good “knowledge” of language in general after some generalized pre-training. When the time comes to test a model on a given task, this universal “knowledge” gives us an advantage. With GLUE, researchers can evaluate their model and score it on all nine tasks. The final performance score of the model is the average of those nine scores. It makes little difference how the model looks or works as long as it can analyze inputs and predict outcomes for all the tasks.
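The final GLUE score described above is a plain average over the nine tasks, which the following sketch illustrates. The per-task scores are invented for illustration and do not describe any real model, and `glue_average` is a hypothetical helper; the official leaderboard's aggregation may differ in detail, for example where a task reports more than one metric.

```python
from statistics import mean

# Hypothetical per-task scores for an imaginary model (not real results).
hypothetical_scores = {
    "CoLA": 60.0,
    "SST-2": 93.0,
    "MRPC": 88.0,
    "STS-B": 87.0,
    "QQP": 71.0,
    "MNLI": 84.0,
    "QNLI": 91.0,
    "RTE": 70.0,
    "WNLI": 65.0,
}


def glue_average(scores):
    """Average the nine task scores into one overall number."""
    if len(scores) != 9:
        raise ValueError("expected scores for exactly nine tasks")
    return mean(scores.values())


overall = glue_average(hypothetical_scores)
print(round(overall, 1))
```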

Keeping these metrics in mind helps to evaluate the performance of an NLP model for a particular task or a variety of tasks.

5.2 Challenges

The applications of NLP have been growing day by day, and with them new challenges are also arising despite a lot of work done in the recent past. Some of the common challenges are as follows. Contextual words and phrases: the same words and phrases can have different meanings in a sentence, which is easy for humans to understand but challenging for machines. Similar challenges arise with synonyms, because humans use many different words to express the same idea; moreover, words of different intensity such as large, huge, and big may be used by different persons, which makes it challenging to process the language and design algorithms that accommodate all these issues. Homonyms, words that are pronounced the same but have different definitions, are also problematic for question answering and speech-to-text applications because the input is not in written form. Sentences using sarcasm and irony may be understood by humans in the opposite sense to their literal meaning, so designing models to deal with such sentences is a really challenging task in NLP. Furthermore, sentences with any type of ambiguity, in the sense that they can be interpreted in more than one way, are also an area to work upon where more accuracy can be achieved. Language containing informal phrases, expressions, idioms, and culture-specific lingo makes it difficult to design models intended for broad use. Having a lot of data on which to train and update models on a regular basis may improve them, but it is really challenging to deal with words that have different meanings in different geographic areas. Such issues also occur across domains: the meaning of words or sentences in the education industry may differ from their meaning in health, law, defense, etc. So, NLP models may work well for an individual domain or geographic area, but for broad use such challenges need to be tackled. In addition to the above-mentioned challenges, misspelled or misused words can also create problems. Although autocorrect and grammar correction applications have improved a lot due to continuous developments in this direction, predicting the intention of a writer, and that too from a specific domain or geographic area, while accounting for sarcasm, expressions, informal phrases, etc. is really a big challenge. There is no doubt that NLP models for the most widely used languages have been doing very well and are improving day by day, but there is still a need for models that serve all persons rather than only those with specific knowledge of a particular language and technology. One may further refer to the work of Sharifirad and Matwin (2019) [ 123 ] for classification of different online harassment categories and challenges, Baclic et al. (2020) [ 6 ] and Wong et al. (2018) [ 151 ] for challenges and opportunities in public health, Kang et al. (2020) [ 63 ] for a detailed literature survey and technological challenges relevant to management research and NLP, and a recent review by Alshemali and Kalita (2020) [ 3 ], and the references cited therein.

In the recent past, models dealing with Visual Commonsense Reasoning [ 31 ] and NLP have also been getting the attention of several researchers, and this seems a promising and challenging area to work upon. These models try to extract information from an image or video using a visual reasoning paradigm, in the way that humans can infer from a given image or video things beyond what is visually obvious, such as objects’ functions and people’s intents and mental states. In this direction, Wen and Peng (2020) [ 149 ] recently suggested a model to capture knowledge from different perspectives and perceive common sense in advance, and the results of the experiments conducted on the visual commonsense reasoning dataset VCR seem very satisfactory and effective. The work of Peng and Chi (2019) [ 102 ], which proposes a Domain Adaptation with Scene Graph approach to transfer knowledge from the source domain with the objective of improving cross-media retrieval in the target domain, and that of Yen et al. (2019) [ 155 ] are also very useful for further exploring the use of NLP in its relevant domains.

6 Conclusion

This paper is written with three objectives. The first objective is to give insights into the various important terminologies of NLP and NLG, which can be useful for readers interested in starting their early career in NLP and in work relevant to its applications. The second objective of this paper focuses on the history, applications, and recent developments in the field of NLP. The third objective is to discuss datasets, approaches and evaluation metrics used in NLP. The relevant work done in the existing literature with its findings, and some of the important applications and projects in NLP, are also discussed in the paper. The last two objectives may serve as a literature survey for readers already working in NLP and relevant fields, and can further provide motivation to explore the fields mentioned in this paper. It is to be noted that even though a great amount of work on natural language processing is available in literature surveys (one may refer to [ 15 , 32 , 63 , 98 , 133 , 151 ], each focusing on one domain such as the usage of deep-learning techniques in NLP, techniques used for email spam filtering, medication safety, management research, intrusion detection, and the Gujarati language), there is still not much work on regional languages, which can be the focus of future research.

Change history

25 July 2022

Affiliation 3 has been added into the online PDF.

References

Ahonen H, Heinonen O, Klemettinen M, Verkamo AI (1998) Applying data mining techniques for descriptive phrase extraction in digital document collections. In research and technology advances in digital libraries, 1998. ADL 98. Proceedings. IEEE international forum on (pp. 2-11). IEEE

Alshawi H (1992) The core language engine. MIT press

Alshemali B, Kalita J (2020) Improving the reliability of deep neural networks in NLP: A review. Knowl-Based Syst 191:105210

Andreev ND (1967) The intermediary language as the focal point of machine translation. In: Booth AD (ed) Machine translation. North Holland Publishing Company, Amsterdam, pp 3–27

Androutsopoulos I, Paliouras G, Karkaletsis V, Sakkis G, Spyropoulos CD, Stamatopoulos P (2000) Learning to filter spam e-mail: A comparison of a naive bayesian and a memory-based approach. arXiv preprint cs/0009009

Baclic O, Tunis M, Young K, Doan C, Swerdfeger H, Schonfeld J (2020) Artificial intelligence in public health: challenges and opportunities for public health made possible by advances in natural language processing. Can Commun Dis Rep 46(6):161

Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In ICLR 2015

Bangalore S, Rambow O, Whittaker S (2000) Evaluation metrics for generation. In proceedings of the first international conference on natural language generation-volume 14 (pp. 1-8). Assoc Comput Linguist

Baud RH, Rassinoux AM, Scherrer JR (1991) Knowledge representation of discharge summaries. In AIME 91 (pp. 173–182). Springer, Berlin Heidelberg

Baud RH, Rassinoux AM, Scherrer JR (1992) Natural language processing and semantical representation of medical texts. Methods Inf Med 31(2):117–125

Baud RH, Alpay L, Lovis C (1994) Let’s meet the users with natural language understanding. Knowledge and Decisions in Health Telematics: The Next Decade 12:103

Bengio Y, Ducharme R, Vincent P (2001) A neural probabilistic language model. Proceedings of NIPS

Benson E, Haghighi A, Barzilay R (2011) Event discovery in social media feeds. In proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies-volume 1 (pp. 389-398). Assoc Comput Linguist

Berger AL, Della Pietra SA, Della Pietra VJ (1996) A maximum entropy approach to natural language processing. Computational Linguistics 22(1):39–71

Blanzieri E, Bryl A (2008) A survey of learning-based techniques of email spam filtering. Artif Intell Rev 29(1):63–92

Bondale N, Maloor P, Vaidyanathan A, Sengupta S, Rao PV (1999) Extraction of information from open-ended questionnaires using natural language processing techniques. Computer Science and Informatics 29(2):15–22

Borst F, Sager N, Nhàn NT, Su Y, Lyman M, Tick LJ, ..., Scherrer JR (1989) Analyse automatique de comptes rendus d'hospitalisation. In Degoulet P, Stephan JC, Venot A, Yvon PJ, rédacteurs. Informatique et Santé, Informatique et Gestion des Unités de Soins, Comptes Rendus du Colloque AIM-IF, Paris (pp. 246–56)

Briscoe EJ, Grover C, Boguraev B, Carroll J (1987) A formalism and environment for the development of a large grammar of English. IJCAI 87:703–708

Carreras X, Marquez L (2001) Boosting trees for anti-spam email filtering. arXiv preprint cs/0109015

Chalkidis I, Fergadiotis M, Malakasiotis P, Aletras N, Androutsopoulos I (2020) LEGAL-BERT: the muppets straight out of law school. arXiv preprint arXiv:2010.02559

Chi EC, Lyman MS, Sager N, Friedman C, Macleod C (1985) A database of computer-structured narrative: methods of computing complex relations. In proceedings of the annual symposium on computer application in medical care (p. 221). Am Med Inform Assoc

Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259

Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge, Massachusetts

Choudhary N (2021) LDC-IL: the Indian repository of resources for language technology. Lang Resources & Evaluation 55:855–867. https://doi.org/10.1007/s10579-020-09523-3

Chouikhi H, Chniter H, Jarray F (2021) Arabic sentiment analysis using BERT model. In international conference on computational collective intelligence (pp. 621-632). Springer, Cham

Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555

Cohen WW (1996) Learning rules that classify e-mail. In AAAI spring symposium on machine learning in information access (Vol. 18, p. 25)

Cohen PR, Morgan J, Ramsay AM (2002) Intention in communication. Am J Psychol 104(4)

Collobert R, Weston J (2008) A unified architecture for natural language processing. In proceedings of the 25th international conference on machine learning (pp. 160–167)

Dai Z, Yang Z, Yang Y, Carbonell J, Le QV, Salakhutdinov R, (2019) Transformer-xl: attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860

Davis E, Marcus G (2015) Commonsense reasoning and commonsense knowledge in artificial intelligence. Commun ACM 58(9):92–103

Desai NP, Dabhi VK (2022) Resources and components for Gujarati NLP systems: a survey. Artif Intell Rev:1–19

Devlin J, Chang MW, Lee K, Toutanova K, (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

Diab M, Hacioglu K, Jurafsky D (2004) Automatic tagging of Arabic text: From raw text to base phrase chunks. In Proceedings of HLT-NAACL 2004: Short papers (pp. 149–152). Assoc Computat Linguist

Doddington G (2002) Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In proceedings of the second international conference on human language technology research (pp. 138-145). Morgan Kaufmann publishers Inc

Drucker H, Wu D, Vapnik VN (1999) Support vector machines for spam categorization. IEEE Trans Neural Netw 10(5):1048–1054

Dunlavy DM, O’Leary DP, Conroy JM, Schlesinger JD (2007) QCS: A system for querying, clustering and summarizing documents. Inf Process Manag 43(6):1588–1605

Elkan C (2008) Log-Linear Models and Conditional Random Fields. http://cseweb.ucsd.edu/welkan/250B/cikmtutorial.pdf accessed 28 Jun 2017.

Emele MC, Dorna M (1998) Ambiguity preserving machine translation using packed representations. In proceedings of the 36th annual meeting of the Association for Computational Linguistics and 17th international conference on computational linguistics-volume 1 (pp. 365-371). Association for Computational Linguistics

Europarl: A Parallel Corpus for Statistical Machine Translation (2005) Philipp Koehn , MT Summit 2005

Fan Y, Tian F, Xia Y, Qin T, Li XY, Liu TY (2020) Searching better architectures for neural machine translation. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28:1574–1585

Fang H, Lu W, Wu F, Zhang Y, Shang X, Shao J, Zhuang Y (2015) Topic aspect-oriented summarization via group selection. Neurocomputing 149:1613–1619

Fattah MA, Ren F (2009) GA, MR, FFNN, PNN and GMM based models for automatic text summarization. Comput Speech Lang 23(1):126–144

Feldman S (1999) NLP meets the jabberwocky: natural language processing in information retrieval. Online-Weston Then Wilton 23:62–73

Friedman C, Cimino JJ, Johnson SB (1993) A conceptual model for clinical radiology reports. In proceedings of the annual symposium on computer application in medical care (p. 829). Am Med Inform Assoc

Gao T, Dontcheva M, Adar E, Liu Z, Karahalios K DataTone: managing ambiguity in natural language interfaces for data visualization, UIST ‘15: proceedings of the 28th annual ACM symposium on User Interface Software & Technology, November 2015, 489–500, https://doi.org/10.1145/2807442.2807478

Ghosh S, Vinyals O, Strope B, Roy S, Dean T, Heck L (2016) Contextual lstm (clstm) models for large scale nlp tasks. arXiv preprint arXiv:1602.06291

Glasgow B, Mandell A, Binney D, Ghemri L, Fisher D (1998) MITA: an information-extraction approach to the analysis of free-form text in life insurance applications. AI Mag 19(1):59

Goldberg Y (2017) Neural network methods for natural language processing. Synthesis lectures on human language technologies 10(1):1–309

Gong Y, Liu X (2001) Generic text summarization using relevance measure and latent semantic analysis. In proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval (pp. 19-25). ACM

Green Jr, BF, Wolf AK, Chomsky C, Laughery K (1961) Baseball: an automatic question-answerer. In papers presented at the may 9-11, 1961, western joint IRE-AIEE-ACM computer conference (pp. 219-224). ACM

Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J (2016) LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems 28(10):2222–2232

Article   MathSciNet   Google Scholar  

Grishman R, Sager N, Raze C, Bookchin B (1973) The linguistic string parser. In proceedings of the June 4-8, 1973, national computer conference and exposition (pp. 427-434). ACM

Hayes PJ (1992) Intelligent high-volume text processing using shallow, domain-specific techniques. Text-based intelligent systems: current research and practice in information extraction and retrieval, 227-242.

Hendrix GG, Sacerdoti ED, Sagalowicz D, Slocum J (1978) Developing a natural language interface to complex data. ACM Transactions on Database Systems (TODS) 3(2):105–147

"Here’s Why Natural Language Processing is the Future of BI (2017) " SmartData Collective. N.p., n.d. Web. 19

Hirschman L, Grishman R, Sager N (1976) From text to structured information: automatic processing of medical reports. In proceedings of the June 7-10, 1976, national computer conference and exposition (pp. 267-275). ACM

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

Huang Z, Xu W, Yu K (2015) Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991

Hutchins WJ (1986) Machine translation: past, present, future (p. 66). Ellis Horwood, Chichester

Jurafsky D, Martin J (2008) H. Speech and language processing. 2nd edn. Prentice-Hall, Englewood Cliffs, NJ

Kamp H, Reyle U (1993) Tense and aspect. In from discourse to logic (pp. 483-689). Springer Netherlands

Kang Y, Cai Z, Tan CW, Huang Q, Liu H (2020) Natural language processing (NLP) in management research: A literature review. Journal of Management Analytics 7(2):139–172

Kim Y. (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882

Knight K, Langkilde I (2000) Preserving ambiguities in generation via automata intersection. In AAAI/IAAI (pp. 697-702)

Lass R (1998) Phonology: An Introduction to Basic Concepts. Cambridge, UK; New York; Melbourne, Australia: Cambridge University Press. p. 1. ISBN 978–0–521-23728-4. Retrieved 8 January 2011Paperback ISBN 0–521–28183-0

Lewis DD (1998) Naive (Bayes) at forty: The independence assumption in information retrieval. In European conference on machine learning (pp. 4–15). Springer, Berlin Heidelberg

Liddy ED (2001). Natural language processing

Lopez MM, Kalita J (2017) Deep learning applied to NLP. arXiv preprint arXiv:1703.03091

Luong MT, Sutskever I, Le Q V, Vinyals O, Zaremba W (2014) Addressing the rare word problem in neural machine translation. arXiv preprint arXiv:1410.8206

Lyman M, Sager N, Friedman C, Chi E (1985) Computer-structured narrative in ambulatory care: its use in longitudinal review of clinical data. In proceedings of the annual symposium on computer application in medical care (p. 82). Am Med Inform Assoc

Lyman M, Sager N, Chi EC, Tick LJ, Nhan NT, Su Y, ..., Scherrer, J. (1989) Medical Language Processing for Knowledge Representation and Retrievals. In Proceedings. Symposium on Computer Applications in Medical Care (pp. 548–553). Am Med Inform Assoc

Maas A, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies (pp. 142-150)

Mani I, Maybury MT (eds) (1999) Advances in automatic text summarization, vol 293. MIT press, Cambridge, MA

Manning CD, Schütze H (1999) Foundations of statistical natural language processing, vol 999. MIT press, Cambridge

MATH   Google Scholar  

Marcus MP, Marcinkiewicz MA, Santorini B (1993) Building a large annotated corpus of english: the penn treebank. Comput Linguist 19(2):313–330

McCallum A, Nigam K (1998) A comparison of event models for naive bayes text classification. In AAAI-98 workshop on learning for text categorization (Vol. 752, pp. 41-48)

McCray AT (1991) Natural language processing for intelligent information retrieval. In Engineering in Medicine and Biology Society, 1991. Vol. 13: 1991., Proceedings of the Annual International Conference of the IEEE (pp. 1160–1161). IEEE

McCray AT (1991) Extending a natural language parser with UMLS knowledge. In proceedings of the annual symposium on computer application in medical care (p. 194). Am Med Inform Assoc

McCray AT, Nelson SJ (1995) The representation of meaning in the UMLS. Methods Inf Med 34(1–2):193–201

McCray AT, Razi A (1994) The UMLS knowledge source server. Medinfo MedInfo 8:144–147

McCray AT, Srinivasan S, Browne AC (1994) Lexical methods for managing variation in biomedical terminologies. In proceedings of the annual symposium on computer application in medical care (p. 235). Am Med Inform Assoc

McDonald R, Crammer K, Pereira F (2005) Flexible text segmentation with structured multilabel classification. In proceedings of the conference on human language technology and empirical methods in natural language processing (pp. 987-994). Assoc Comput Linguist

McGray AT, Sponsler JL, Brylawski B, Browne AC (1987) The role of lexical knowledge in biomedical text understanding. In proceedings of the annual symposium on computer application in medical care (p. 103). Am Med Inform Assoc

McKeown KR (1985) Text generation. Cambridge University Press, Cambridge

Book   Google Scholar  

Merity S, Keskar NS, Socher R (2018) An analysis of neural language modeling at multiple scales. arXiv preprint arXiv:1803.08240

Mikolov T, Chen K, Corrado G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems

Morel-Guillemaz AM, Baud RH, Scherrer JR (1990) Proximity processing of medical text. In medical informatics Europe’90 (pp. 625–630). Springer, Berlin Heidelberg

Morin E (1999) Automatic acquisition of semantic relations between terms from technical corpora. In proc. of the fifth international congress on terminology and knowledge engineering-TKE’99

Müller M, Salathé M, Kummervold PE (2020) Covid-twitter-bert: A natural language processing model to analyse covid-19 content on twitter. arXiv preprint arXiv:2005.07503

"Natural Language Processing (2017) " Natural Language Processing RSS. N.p., n.d. Web. 25

"Natural Language Processing" (2017) Natural Language Processing RSS. N.p., n.d. Web. 23

Newatia R (2019) https://medium.com/saarthi-ai/sentence-classification-using-convolutional-neural-networks-ddad72c7048c . Accessed 15 Dec 2021

Nhàn NT, Sager N, Lyman M, Tick LJ, Borst F, Su Y (1989) A medical language processor for two indo-European languages. In proceedings. Symposium on computer applications in medical care (pp. 554-558). Am Med Inform Assoc

Nießen S, Och FJ, Leusch G, Ney H (2000) An evaluation tool for machine translation: fast evaluation for MT research. In LREC

Ochoa, A. (2016). Meet the Pilot: Smart Earpiece Language Translator. https://www.indiegogo.com/projects/meet-the-pilot-smart-earpiece-language-translator-headphones-travel . Accessed April 10, 2017

Ogallo, W., & Kanter, A. S. (2017). Using natural language processing and network analysis to develop a conceptual framework for medication therapy management research. https://www.ncbi.nlm.nih.gov/pubmed/28269895?dopt=Abstract . Accessed April 10, 2017

Otter DW, Medina JR, Kalita JK (2020) A survey of the usages of deep learning for natural language processing. IEEE Transactions on Neural Networks and Learning Systems 32(2):604–624

Ouyang Y, Li W, Li S, Lu Q (2011) Applying regression models to query-focused multi-document summarization. Inf Process Manag 47(2):227–237

Palmer M, Gildea D, Kingsbury P (2005) The proposition bank: an annotated corpus of semantic roles. Computational linguistics 31(1):71–106

Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In proceedings of the 40th annual meeting on association for computational linguistics (pp. 311-318). Assoc Comput Linguist

Peng Y, Chi J (2019) Unsupervised cross-media retrieval using domain adaptation with scene graph. IEEE Transactions on Circuits and Systems for Video Technology 30(11):4368–4379

Porter MF (1980) An algorithm for suffix stripping. Program 14(3):130–137

Rae JW, Potapenko A, Jayakumar SM, Lillicrap TP, (2019) Compressive transformers for long-range sequence modelling. arXiv preprint arXiv:1911.05507

Ranjan P, Basu HVSSA (2003) Part of speech tagging and local word grouping techniques for natural language parsing in Hindi. In Proceedings of the 1st International Conference on Natural Language Processing (ICON 2003)

Rassinoux AM, Baud RH, Scherrer JR (1992) Conceptual graphs model extension for knowledge representation of medical texts. MEDINFO 92:1368–1374

Rassinoux AM, Michel PA, Juge C, Baud R, Scherrer JR (1994) Natural language processing of medical texts within the HELIOS environment. Comput Methods Prog Biomed 45:S79–S96

Rassinoux AM, Juge C, Michel PA, Baud RH, Lemaitre D, Jean FC, Scherrer JR (1995) Analysis of medical jargon: The RECIT system. In Conference on Artificial Intelligence in Medicine in Europe (pp. 42–52). Springer, Berlin Heidelberg

Rennie J (2000) ifile: An application of machine learning to e-mail filtering. In Proc. KDD 2000 Workshop on text mining, Boston, MA

Riedhammer K, Favre B, Hakkani-Tür D (2010) Long story short–global unsupervised models for keyphrase based meeting summarization. Speech Comm 52(10):801–815

Ritter A, Clark S, Etzioni O (2011) Named entity recognition in tweets: an experimental study. In proceedings of the conference on empirical methods in natural language processing (pp. 1524-1534). Assoc Comput Linguist

Rospocher M, van Erp M, Vossen P, Fokkens A, Aldabe I, Rigau G, Soroa A, Ploeger T, Bogaard T(2016) Building event-centric knowledge graphs from news. Web Semantics: Science, Services and Agents on the World Wide Web, In Press

Sager N, Lyman M, Tick LJ, Borst F, Nhan NT, Revillard C, … Scherrer JR (1989) Adapting a medical language processor from English to French. Medinfo 89:795–799

Sager N, Lyman M, Nhan NT, Tick LJ (1995) Medical language processing: applications to patient data representation and automatic encoding. Methods Inf Med 34(1–2):140–146

Sahami M, Dumais S, Heckerman D, Horvitz E (1998) A Bayesian approach to filtering junk e-mail. In learning for text categorization: papers from the 1998 workshop (Vol. 62, pp. 98-105)

Sakkis G, Androutsopoulos I, Paliouras G, Karkaletsis V, Spyropoulos CD, Stamatopoulos P (2001) Stacking classifiers for anti-spam filtering of e-mail. arXiv preprint cs/0106040

Sakkis G, Androutsopoulos I, Paliouras G et al (2003) A memory-based approach to anti-spam filtering for mailing lists. Inf Retr 6:49–73. https://doi.org/10.1023/A:1022948414856

Santoro A, Faulkner R, Raposo D, Rae J, Chrzanowski M, Weber T, ..., Lillicrap T (2018) Relational recurrent neural networks. Adv Neural Inf Proces Syst, 31

Scherrer JR, Revillard C, Borst F, Berthoud M, Lovis C (1994) Medical office automation integrated into the distributed architecture of a hospital information system. Methods Inf Med 33(2):174–179

Seal D, Roy UK, Basak R (2020) Sentence-level emotion detection from text based on semantic rules. In: Tuba M, Akashe S, Joshi A (eds) Information and communication Technology for Sustainable Development. Advances in intelligent Systems and computing, vol 933. Springer, Singapore. https://doi.org/10.1007/978-981-13-7166-0_42

Chapter   Google Scholar  

Sentiraama Corpus by Gangula Rama Rohit Reddy, Radhika Mamidi. Language Technologies Research Centre, KCIS, IIIT Hyderabad (n.d.) ltrc.iiit.ac.in/showfile.php?filename=downloads/sentiraama/

Sha F, Pereira F (2003) Shallow parsing with conditional random fields. In proceedings of the 2003 conference of the north American chapter of the Association for Computational Linguistics on human language technology-volume 1 (pp. 134-141). Assoc Comput Linguist

Sharifirad S, Matwin S, (2019) When a tweet is actually sexist. A more comprehensive classification of different online harassment categories and the challenges in NLP. arXiv preprint arXiv:1902.10584

Sharma S, Srinivas PYKL, Balabantaray RC (2016) Emotion Detection using Online Machine Learning Method and TLBO on Mixed Script. In Proceedings of Language Resources and Evaluation Conference 2016 (pp. 47–51)

Shemtov H (1997) Ambiguity management in natural language generation. Stanford University

Small SL, Cortell GW, Tanenhaus MK (1988) Lexical Ambiguity Resolutions. Morgan Kauffman, San Mateo, CA

Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In proceedings of the 2013 conference on empirical methods in natural language processing (pp. 1631-1642)

Sonnhammer EL, Eddy SR, Birney E, Bateman A, Durbin R (1998) Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res 26(1):320–322

Srihari S (2010) Machine Learning: Generative and Discriminative Models. http://www.cedar.buffalo.edu/wsrihari/CSE574/Discriminative-Generative.pdf . accessed 31 May 2017.]

Sun X, Morency LP, Okanohara D, Tsujii JI (2008) Modeling latent-dynamic in shallow parsing: a latent conditional model with improved inference. In proceedings of the 22nd international conference on computational linguistics-volume 1 (pp. 841-848). Assoc Comput Linguist

Sundheim BM, Chinchor NA (1993) Survey of the message understanding conferences. In proceedings of the workshop on human language technology (pp. 56-60). Assoc Comput Linguist

Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems

Sworna ZT, Mousavi Z, Babar MA (2022) NLP methods in host-based intrusion detection Systems: A systematic review and future directions. arXiv preprint arXiv:2201.08066

Systems RAVN (2017) "RAVN Systems Launch the ACE Powered GDPR Robot - Artificial Intelligence to Expedite GDPR Compliance." Stock Market. PR Newswire, n.d. Web. 19

Tan KL, Lee CP, Anbananthen KSM, Lim KM (2022) RoBERTa-LSTM: A hybrid model for sentiment analysis with transformers and recurrent neural network. IEEE Access, RoBERTa-LSTM: A Hybrid Model for Sentiment Analysis With Transformer and Recurrent Neural Network

Tapaswi N, Jain S (2012) Treebank based deep grammar acquisition and part-of-speech tagging for Sanskrit sentences. In software engineering (CONSEG), 2012 CSI sixth international conference on (pp. 1-4). IEEE

Thomas C (2019)  https://towardsdatascience.com/recurrent-neural-networks-and-natural-language-processing-73af640c2aa1 . Accessed 15 Dec 2021

Tillmann C, Vogel S, Ney H, Zubiaga A, Sawaf H (1997) Accelerated DP based search for statistical translation. In Eurospeech

Umber A, Bajwa I (2011) “Minimizing ambiguity in natural language software requirements specification,” in Sixth Int Conf Digit Inf Manag, pp. 102–107

"Using Natural Language Processing and Network Analysis to Develop a Conceptual Framework for Medication Therapy Management Research (2017) " AMIA ... Annual Symposium proceedings. AMIA Symposium. U.S. National Library of Medicine, n.d. Web. 19

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I, (2017) Attention is all you need. In advances in neural information processing systems (pp. 5998-6008)

Wahlster W, Kobsa A (1989) User models in dialog systems. In user models in dialog systems (pp. 4–34). Springer Berlin Heidelberg, User Models in Dialog Systems

Walton D (1996) A pragmatic synthesis. In: fallacies arising from ambiguity. Applied logic series, vol 1. Springer, Dordrecht)

Wan X (2008) Using only cross-document relationships for both generic and topic-focused multi-document summarizations. Inf Retr 11(1):25–49

Wang W, Gang J, 2018 Application of convolutional neural network in natural language processing. In 2018 international conference on information Systems and computer aided education (ICISCAE) (pp. 64-70). IEEE

Wang D, Zhu S, Li T, Gong Y (2009) Multi-document summarization using sentence-based topic models. In proceedings of the ACL-IJCNLP 2009 conference short papers (pp. 297-300). Assoc Comput Linguist

Wang D, Zhu S, Li T, Chi Y, Gong Y (2011) Integrating document clustering and multidocument summarization. ACM Transactions on Knowledge Discovery from Data (TKDD) 5(3):14–26

Wang Z, Ng P, Ma X, Nallapati R, Xiang B (2019) Multi-passage bert: A globally normalized bert model for open-domain question answering. arXiv preprint arXiv:1908.08167

Wen Z, Peng Y (2020) Multi-level knowledge injecting for visual commonsense reasoning. IEEE Transactions on Circuits and Systems for Video Technology 31(3):1042–1054

Wiese G, Weissenborn D, Neves M (2017) Neural domain adaptation for biomedical question answering. arXiv preprint arXiv:1706.03610

Wong A, Plasek JM, Montecalvo SP, Zhou L (2018) Natural language processing and its implications for the future of medication safety: a narrative review of recent advances and challenges. Pharmacotherapy: The Journal of Human Pharmacology and Drug Therapy 38(8):822–841

Woods WA (1978) Semantics and quantification in natural language question answering. Adv Comput 17:1–87

Xia T (2020) A constant time complexity spam detection algorithm for boosting throughput on rule-based filtering Systems. IEEE Access 8:82653–82661. https://doi.org/10.1109/ACCESS.2020.2991328

Xie P, Xing E (2017) A constituent-centric neural architecture for reading comprehension. In proceedings of the 55th annual meeting of the Association for Computational Linguistics (volume 1: long papers) (pp. 1405-1414)

Yan X, Ye Y, Mao Y, Yu H (2019) Shared-private information bottleneck method for cross-modal clustering. IEEE Access 7:36045–36056

Yi J, Nasukawa T, Bunescu R, Niblack W (2003) Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques. In data mining, 2003. ICDM 2003. Third IEEE international conference on (pp. 427-434). IEEE

Young SJ, Chase LL (1998) Speech recognition evaluation: a review of the US CSR and LVCSR programmes. Comput Speech Lang 12(4):263–279

Yu S, et al. (2018) "A multi-stage memory augmented neural network for machine reading comprehension." Proceedings of the workshop on machine reading for question answering

Zajic DM, Dorr BJ, Lin J (2008) Single-document and multi-document summarization techniques for email threads using sentence compression. Inf Process Manag 44(4):1600–1610

Zeroual I, Lakhouaja A, Belahbib R (2017) Towards a standard part of speech tagset for the Arabic language. J King Saud Univ Comput Inf Sci 29(2):171–178

Download references

Acknowledgements

The authors would like to express their gratitude to the Research Mentors from CL Educate: Accendere Knowledge Management Services Pvt. Ltd. for their comments on earlier versions of the manuscript. Any remaining errors are our own and should not tarnish the reputations of these esteemed persons. We would also like to thank the Editor, Associate Editor, and anonymous referees for their constructive suggestions, which led to many improvements on an earlier version of this manuscript.

Author information

Authors and affiliations.

Department of Computer Science, Manav Rachna International Institute of Research and Studies, Faridabad, India

Diksha Khurana & Aditya Koli

Department of Computer Science, BML Munjal University, Gurgaon, India

Kiran Khatter

Department of Statistics, Amity University Punjab, Mohali, India

Sukhdev Singh

Corresponding author

Correspondence to Kiran Khatter.

Ethics declarations

Conflict of interest.

The first draft of this paper was written under the supervision of Dr. Kiran Khatter and Dr. Sukhdev Singh, associated with CL Educate: Accendere Knowledge Management Services Pvt. Ltd. and deputed at the Manav Rachna International University. The draft is also available on arXiv at https://arxiv.org/abs/1708.05148

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Khurana, D., Koli, A., Khatter, K. et al. Natural language processing: state of the art, current trends and challenges. Multimed Tools Appl 82, 3713–3744 (2023). https://doi.org/10.1007/s11042-022-13428-4

Received: 03 February 2021

Revised: 23 March 2022

Accepted: 02 July 2022

Published: 14 July 2022

Issue Date: January 2023

DOI: https://doi.org/10.1007/s11042-022-13428-4

  • Natural language processing
  • Natural language understanding
  • Natural language generation
  • NLP applications
  • NLP evaluation metrics

Title: A Survey of Large Language Models

Abstract: Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable progress is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way how we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.
Comments: ongoing work; 124 pages, 946 citations
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

COMMENTS

  1. Language and linguistics

    In a comprehensive quantitative coanalysis of linguistic and genetic data across China, Yang et al. find evidence to suggest that demographic diffusion, cultural diffusion and linguistic ...

  2. The Language Learning Journal

    The Language Learning Journal (LLJ) is an academic, peer-reviewed journal, providing a forum for research and scholarly debate on current aspects of foreign and second language learning and teaching.

  3. Language Teaching Research: Sage Journals

    Language Teaching Research is a peer-reviewed journal that publishes research within the area of second or foreign language teaching. Although articles are written in English, the journal welcomes studies dealing with the teaching of languages other …

  4. Language Learning

    It publishes empirical and theoretical articles that apply methods of inquiry from disciplines including psychology, linguistics, cognitive science, neuroscience, and educational research. We welcome studies of the learning of oral, signed, and written language and across diverse naturalistic, formal, and laboratory settings.

  5. Language in Society

    Language in Society is an international journal of sociolinguistics concerned with language and discourse as aspects of social life. The journal publishes empirical articles of general theoretical, comparative or methodological interest to students and scholars in sociolinguistics, linguistic anthropology, and related fields.

  6. LANGUAGE ACQUISITION AND LANGUAGE LEARNING

    The following paper is to analyze the synergy between language acquisition and language learning within a multicultural environment on the pedagogical discourse.

  7. Research on learning and teaching of languages other than

    In this review of System's scholarship on the learning and teaching of languages other than English (LOTEs), we focus on 12 articles on language pedagogy and language learners, selected from a total of 208 relevant articles published in the journal (until 2020).

  8. Changing perceptions of language in sociolinguistics

    This paper traces the changing perceptions of language in sociolinguistics. These perceptions of language are reviewed in terms of language in its verbal forms, and language in vis-à-vis...

  9. Second Language Research

    Second Language Research is an international peer-reviewed, quarterly journal, publishing original theory-driven research concerned with second (and additional) language acquisition and second language performance.

  10. Research on Language and Social Interaction

    Kristine Fitch. Publication office: Taylor & Francis, Inc., 530 Walnut Street, Suite 850, Philadelphia, PA 19106. Authors can choose to publish gold open access in this journal.

  11. Learning a Foreign Language: A Review on Recent Findings

    The findings, divided into three research areas, show that learning a foreign language may bring older individuals many benefits, such as enhanced cognitive functioning and self-esteem, increased opportunities for socializing, or reduced costs.

  12. A STUDY OF LANGUAGE LEARNING STRATEGIES OF COLLEGE FEMALE

    Abstract: This paper investigated female ESL students’ preferred language learning strategies in the Philippine context. The researchers also identified the most and least preferred language learning strategies and how a) task requirement, b) age, and c) length of time learning English affect their use of language learning ...

  13. Frontiers

    We demonstrated the dominant effect of metacognitive strategies and the low effect of memory strategies in Year 8. Metacognitive strategies also influenced foreign language marks. The effect of foreign language marks on school achievement was also remarkable.

  14. [2402.06196] Large Language Models: A Survey

    In this paper, we review some of the most prominent LLMs, including three popular LLM families (GPT, LLaMA, PaLM), and discuss their characteristics, contributions and limitations. We also give an overview of techniques developed to build and augment LLMs.

  15. (PDF) Research on Language and Learning: implications for

    Abstract. Taking into account several limitations of communicative language teaching (CLT), this paper calls for the need to consider research on language use and learning through...

  16. Natural language processing: state of the art, current trends

    Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to various fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish ...

  17. A Comprehensive Overview of Large Language Models

    Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction.

  18. [2303.18223] A Survey of Large Language Models

    In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation.