Cultural Heritage Topics in Online Queries: Comparison
Please note this is a comparison between Version 1 by Karol Król and Version 2 by Catherine Yang.

New communication channels and methods for retrieving information can provide increasingly precise data describing how cultural heritage is perceived, protected, promoted, and shared. Many internet users search for cultural-heritage-related topics using online search engines and databases. 

  • keyword analysis
  • content analysis
  • cultural heritage
  • digital cultural heritage

1. Introduction

Cultural heritage components may originate from the natural environment or be manmade. They are most commonly divided into tangible, intangible, and natural categories [1,2]. The notion of “heritage” began to evolve in the final decades of the twentieth century, when people began to apply the concept in place of such terms as “monument” or “cultural asset”. The international view of cultural heritage offers a global perspective which surpasses national, regional, or local ones. Cultural heritage assets harbour unique features that encourage communities to preserve and protect them. They can also evoke feelings and emotions [3]. Consequently, cultural heritage assets are no longer defined solely by their material attributes. This has given rise to the category of intangible cultural heritage, whose protection was long ignored [4]. Moreover, a new heritage category has recently come to light: digital cultural heritage [5].
Most people have positive associations with the concept of heritage and attribute positive values to it. Components of tangible culture, such as works of art, everyday objects, architecture, and landscapes, are generally considered to be for the common good and beneficial to all, as are intangible heritage components such as dances, music, theatre performances, rites, language, and memory [6]. This has led to the popular belief that investment in cultural heritage (and other forms of culture) is beneficial to the local economy not only in terms of cultural consumption, but also through increased employment and income [7,8].
Studies show that socioeconomic development should take into account cultural heritage and be based on the achievements of past generations [9,10]. Cultural heritage is a source of traditions, practices, and customs. Handicrafts, cuisines, and cultural events can be restored thanks to historical records and oral traditions. These can all provide a framework for socioeconomic and cultural development. Nevertheless, cultural heritage components are fleeting and can be forgotten if no comprehensive effort is made to preserve them. Moreover, education and transgenerational transfer of traditional skills are necessary to achieve economic value through the use of heritage [11].
Living in today’s knowledge society implies a recognition of the importance of the past and consideration of cultural heritage as a fundamental background of our identities [12]. Information and communication technology offers easier access to and a more comprehensive view of cultural heritage artefacts. It can also further enrich and improve heritage education through innovative learning and teaching methods [13]. New communication channels and methods for retrieving information can provide increasingly precise data on how cultural heritage is perceived, protected, promoted, and shared, and also on the expectations of the audience [14]. Generally available search tools and social media can provide information on changing trends, moods, and opinions regarding cultural heritage from both the local and global perspectives [15,16].

2. Language as Cultural Heritage

Every human behaviour intended to produce a response in others involves social communication, and language is a primary communication tool [17][32]. Communication is not only about facts but also involves interpretations, attitudes, values, and understandings and perceptions of reality. Therefore, language is not only an instrument of communication but an entire system that determines the process [18][33]. As such, it is the primary carrier of culture and intercultural communication. An individual using a certain language is, to some extent, a participant in the culture established by the group that speaks this language. Every national language is a tool, material, and component of culture. It extends over the everyday lives of its users, their spirituality, and the entirety of their intellectual and artistic activity. Being the raw material of national literature, it preserves and reflects the complexity of historical experience. It constitutes the linguistic picture of the world, which is a collection of principles included in category-driven grammatical relationships and semantic lexical structures. It exhibits language-specific perceptions of individual components of the world and a general understanding of how it is organised, including its hierarchies and values. Consequently, every national language is rooted in the separate culture of the user community, which it affects [19][34]. Language is, in a way, a treasury and custodian of generations’ worth of knowledge about the world. This knowledge is in and of itself an important component of the cultural heritage of the community that speaks the language. The cognitive and cultural functions of a language are usually considered based on national (ethnic) languages. The linguistic differentiation of a population reflects cultural (and civilisational) differentiation, while cultural pluralism determines the abundance of cultural achievements of humanity.
On the other hand, there is a strong need for the circulation of ideas and transfer of knowledge beyond the boundaries of national languages [20][35].

3. Introduction to Polish Research on Verbal Communication on the Internet

Polish is a large Central European language [21][36]. Polish language studies, Polish applied linguistics, and language pedagogy have identified “variations”, “dialects”, or “registers” of the language, with such qualifiers as “professional”, “expert”, “special”, “specialised”, or “specialist”. The use of specific phrases is most often associated with a specific circle, class, or social group that stands out from the population in terms of age or generation (language of the youth, students, or a generation), or profession (IT or legal jargon). According to sociolinguists, a special language is usually limited to communities delimited by a kind of social bond, such as a class, stratum, circle, or professional group [22][37]. Moreover, Polish linguistics divides specialist Polish into a professional language and professiolect, which is mainly spoken, and academic language, which is primarily written. The same applies to various groups of enthusiasts, including those in grassroots retrogaming, retrocomputing, and general (digital) cultural heritage societies [23][38]. Polish research focusing on linguistic aspects of verbal communication on the internet began in the early days of the internet in Poland, in the late twentieth century. This research most often concerns models of communication on the internet, typology of internet communication, the ontological status of the language on the internet, analysis of the notions of text and hypertext, a genealogical map of the internet language, characterisation of individual linguistic subsystems in internet communication, etiquette in technology, use of various codes to construct a text to be published online, identification and analysis of plays on words and their potential impact on the Polish language, and implications of internet-based language and communication practices in the virtual space for everyday language and communication [24][39]. 
Such analyses may be relevant to the moulding of the media discourse on cultural heritage. They may be particularly useful for the promotion of local heritage, which is hard to promote to a large online audience.

4. English as a Dominant Language

Cultural diversity is inherent to linguistic diversity. In the time of globalisation and the global village, it is the most valuable resource of humanity, a treasury of traditions, and a reservoir of growth opportunities [25][40]. This assumption emphasises peaceful, partnership-based collaboration among nations, with mutual respect for their differences and individualities. It also attributes particular humanistic values to multilingualism and polyculturalism [26][41]. On the other hand, the use of multiple languages may be advised against in some domains. The roles of dominant languages in science and technology, economy, tourism, transport and logistics, politics, and particularly mass media are growing, as they facilitate international communication and make information available to a broader audience. In general, global linguistic communication has always been multifaceted and diversified. However, some areas of social life exhibit a strong trend towards universalisation and unification, mostly in favour of English today. Its dominance is clear in science, where linguistic diversity is perceived as a hindrance to the transfer and perception of information [27][42]. English is the primary tool of international scientific communication today in terms of original works, citations, and researcher collaboration [28][43]. Nevertheless, expert knowledge and familiarity with the terminology used in a specific field are just as important [29][44]. The universal use of English as the language of international academic discourse breaches the barrier of multilingualism in science to facilitate the transfer of knowledge. Paradoxically, the ever-growing dominance of this language is sometimes considered a barrier in and of itself; it may significantly hinder the participation of researchers from outside English-speaking countries in the global scientific effort, limiting the impact of contributions in languages other than English on global scientific development [26][41].
The same applies to the flow of information more generally, particularly information of local relevance that one might wish to promote internationally, which could be the case for tourism or the promotion of local cultural heritage.

5. Cultural Heritage on the Internet

Online cultural content is available in many forms, such as texts, images, soundtracks, videos, and NFTs [30][45]. These items concern various topics, including art, handicraft, etc., and are written in various languages. Such content comes from diverse independent preservation organisations, such as museums, archives, libraries, or individuals, and is targeted both at amateurs and experts. The problem of finding and connecting information in such an environment of heterogeneous content delivery and data format can be both a barrier for end users trying to access cultural content and a challenge for content producers [31][46]. Recent decades have seen new bold campaigns by public institutions and private organisations to digitise cultural artefacts, yielding huge digital collections. These campaigns offer public access to millions of digital objects from many collections of cultural heritage through multilingual graphic user interfaces. However, large datasets such as digital libraries suffer from low availability to the general public and a difficult search process [32][47]. Smart conversational agents facilitate access to information in semantic networks through natural-language interaction and structured answers to user queries. Nevertheless, these tools are not as commonplace as search engines. In traditional portals, search is usually based on free text search (e.g., Google), database queries, and/or a stable classification hierarchy. Semantic content makes it possible to provide the end user with more “intelligent” facilities based on ontological concepts and structures [31][46].

6. Keyword Analyses

Research paper keyword analyses usually focus on one or multiple journals over a specific period. The frequency of keywords and their links to individual articles facilitate the visualisation of the results [33][48]. A keyword analysis can demonstrate whether a specific field is gaining popularity in the literature by arousing interest among researchers (a popular topic) or is relatively recent and paves the way for future efforts (an emerging topic) [34][35][49,50]. Such insight can hardly be gained through a traditional literature review [36][51]. Research paper keywords can be categorised according to two main attributes: dominance and perseverance. The former reflects the frequency of the keyword in a set of papers. It is the number of papers in which specific keywords occur. The latter attribute is related to the temporal continuity of the topic. Combined, the two attributes create a matrix of four areas. Each defines a homogeneous group of keywords [37][52]. Analysis of research paper keywords can identify descriptors of primary research topics and gene words that indicate knowledge domains formed through cross-referencing and hybridisation of core keywords [38][53]. If the number of papers or other sources is small, it is possible to analyse them manually using test and computing tools. For larger numbers of publications or online sources, more advanced methods for obtaining textual data are used, such as text mining or web scraping. Web scraping is a technique for extracting information from online sources. It replaces the manual, mechanical, and repetitive feeding of a spreadsheet or other software. It is broadly used to analyse large datasets via various packages and scripts, most often in R and Python. It has been applied to analyse manager job postings [39][54], to analyse emotions during the COVID-19 pandemic [40][55], and in numerous other focus papers [41][56]. 
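The two-attribute categorisation described above (dominance and perseverance) can be illustrated with a minimal Python sketch. Several choices here are assumptions not taken from the source: perseverance is approximated by the number of distinct publication years in which a keyword occurs, the cut-offs `min_papers` and `min_years` are arbitrary, the quadrant labels are illustrative, and the sample papers are invented.

```python
from collections import defaultdict


def keyword_matrix(papers, min_papers=3, min_years=2):
    """Place each keyword in one of four areas of a dominance/perseverance matrix.

    papers: iterable of (year, list_of_keywords) pairs.
    Dominance    = number of papers in which the keyword occurs.
    Perseverance = number of distinct years in which it occurs (an assumption).
    """
    paper_counts = defaultdict(int)
    years_seen = defaultdict(set)
    for year, keywords in papers:
        # Normalise case and whitespace; count each keyword once per paper.
        for kw in {k.strip().lower() for k in keywords}:
            paper_counts[kw] += 1
            years_seen[kw].add(year)

    matrix = {}
    for kw, n_papers in paper_counts.items():
        n_years = len(years_seen[kw])
        dominance = "dominant" if n_papers >= min_papers else "non-dominant"
        perseverance = "persistent" if n_years >= min_years else "transient"
        matrix[kw] = (n_papers, n_years, f"{dominance}, {perseverance}")
    return matrix


# Invented sample data, for illustration only.
SAMPLE_PAPERS = [
    (2019, ["cultural heritage", "digital heritage"]),
    (2020, ["cultural heritage", "web scraping"]),
    (2021, ["cultural heritage", "web scraping"]),
    (2021, ["NFT"]),
    (2021, ["nft", "Cultural Heritage"]),
    (2021, ["nft"]),
]

if __name__ == "__main__":
    for keyword, row in sorted(keyword_matrix(SAMPLE_PAPERS).items()):
        print(keyword, row)
```

With real data, the cut-offs would normally be derived from the distribution of keyword counts rather than fixed in advance.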
Packages in R used to harvest and explore large textual datasets from the internet include rvest [42][57] and downloader [43][58]. Beautiful Soup and Requests-HTML [44][45][59,60] are popular Python packages. Social media data can be scraped using facebook-scraper and Selenium. The application programming interface (API), which facilitates interactions among applications, is an important factor in sharing online resources. Today, its most common context is Web API and REST API [46][61]. These tools are to be used in line with the terms and conditions of the respective service. Slightly less commonly used tools for such analyses are keyword planners, which can identify and discover new keywords for a topic [47][22].
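The extraction step that scraping packages perform can be sketched in a few lines of Python. This sketch deliberately uses the standard-library `html.parser` module as a stand-in for the packages named above, and it parses an HTML string instead of fetching a live page, so no network access or service terms are involved. The class name, function name, and sample page are hypothetical.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collect the page title and the contents of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr_map.get("name") or "").lower() == "keywords":
            content = attr_map.get("content") or ""
            # Keywords are conventionally comma-separated in this meta tag.
            self.keywords.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_keywords(html):
    """Return (title, keywords) parsed from an HTML document string."""
    parser = MetaKeywordParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.keywords


# Hypothetical page content, for illustration only.
SAMPLE_HTML = """<html><head>
<title>Local heritage trail</title>
<meta name="keywords" content="cultural heritage, wooden architecture, Rural Cultural Heritage">
</head><body><p>Placeholder text.</p></body></html>"""

if __name__ == "__main__":
    print(extract_keywords(SAMPLE_HTML))
```

In practice, a dedicated package handles malformed markup and page retrieval more robustly; the sketch only shows what "extracting information" amounts to for one metadata field.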

7. The Information Potential of Keywords

A keyword is a word or phrase that search engine users enter as a query. Keywords identify specific content. They can be included in the metadata and help to better adjust search results to user queries. There are three primary types of keyword: branded keywords (such as names of brands), generic keywords (most often consisting of one or two words that do not designate the searched product or piece of information precisely, for example, “cultural heritage”), and long-tail keywords, which are detailed queries typed into a search engine (such as “rural cultural heritage”). Geolocated or geotargeted keywords with place or street names, for example, are becoming increasingly popular [48][62]. The foundation of the long-tail strategy is that a broad and diversified set of keyphrases that are niche, less competitive, and each able to individually generate limited traffic results in significant aggregated organic traffic [49][63]. Nonstandard, unpopular keywords and detailed, precise phrases of multiple words increase a website’s reach if used in large numbers. The idea of long-tail keywords was born from user behaviour. People increasingly search for specific information using complex queries of several words in various orders and diverse grammatical forms. According to the long-tail concept, users who type in uncommon, nonstandard, or long queries are searching for specific results and are more determined, making target conversion more probable [50][64]. Knowledge of keywords used online can help adjust content to users’ expectations. When the context and frequency of keywords are known, the content can be made more appealing to users. Consequently, they can reach specific websites more often. These activities are part of content marketing and search engine marketing (SEM), and are the basis of search engine optimisation (SEO) [51][65]. The context of specific keywords in users’ queries can be identified using online applications.
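The three keyword types and the aggregation idea behind the long-tail strategy can be sketched as follows. This is a rough heuristic under stated assumptions, not an established classification method: a query counts as long-tail when it has three or more words (consistent with the examples above, but the threshold is an assumption), the brand list contains a hypothetical name, and the traffic figures are invented relative values.

```python
from collections import Counter


def classify_keyword(query, brand_names, long_tail_min_words=3):
    """Label a query as 'branded', 'generic', or 'long-tail' (heuristic)."""
    words = query.lower().split()
    padded = " " + " ".join(words) + " "
    # Branded: the query contains a known brand name as a whole word or phrase.
    for brand in brand_names:
        if " " + " ".join(brand.lower().split()) + " " in padded:
            return "branded"
    # Long-tail: detailed queries of several words (threshold is an assumption).
    if len(words) >= long_tail_min_words:
        return "long-tail"
    return "generic"


def traffic_by_type(query_volumes, brand_names):
    """Sum relative traffic per keyword type to show the aggregate long-tail effect."""
    totals = Counter()
    for query, volume in query_volumes.items():
        totals[classify_keyword(query, brand_names)] += volume
    return dict(totals)


# Hypothetical brand and invented relative volumes, for illustration only.
BRANDS = {"examplemuseum"}
SAMPLE_VOLUMES = {
    "cultural heritage": 100,
    "rural cultural heritage": 30,
    "wooden church restoration grants": 25,
    "intangible heritage folk dances": 20,
    "examplemuseum tickets": 15,
}

if __name__ == "__main__":
    for q in SAMPLE_VOLUMES:
        print(q, "->", classify_keyword(q, BRANDS))
    print(traffic_by_type(SAMPLE_VOLUMES, BRANDS))
```

In this invented sample, each long-tail query attracts little traffic on its own, yet the three together approach the volume of the single generic keyword, which is the aggregation effect the long-tail strategy relies on.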
Tools that propose keywords (keyword planners or generators) are based on keywords entered by users who search for information using search engines [52][66]. For example, when one enters a query into the Google search engine, it suggests other similar keywords. Keyword-suggesting tools fetch the suggestions from search engines and display them in their interface. However, the information is merely illustrative and the keyword frequency is not given in absolute values.