The epistemology of the Internet lies at the junction of several philosophical fields, such as social epistemology, virtue epistemology, and ethics, while non-philosophical disciplines are also involved in addressing the problems under epistemic investigation: information science, communication science, and the science of technology. Reviewing the research in this field, we can say that social epistemology is more than a contributing discipline; it is in fact the philosophical branch within which the epistemology of the Internet falls, since the former provides the conceptual framework and shares most of its methodology with the latter.
The current entry outlines the key theoretical points and challenges of this research and argues that, in order to crystallize a systematic epistemology of the Web space, we should not focus exclusively on the social dimension of the epistemology of the Internet and on its social practical targets, but rather build a firm theoretical foundation for this field by investigating the nature and evolution of the main entities specific to the Internet (websites, search engines, groups, forums, posts, online communications, etc.) as epistemic products in their own right.
The epistemology of the Internet studies the nature of the knowledge provided by the Internet and how the Internet affects knowledge through its specific human-driven electronic entities (websites, search engines, groups, forums, posts, online communications, etc.), created, developed, and operating under specific conditions (social media interaction, exploitation of big data, anonymity, etc.). This includes how knowledge is produced, transmitted, evaluated, and trusted in online environments (Floridi, 2010 [1], 2011 [2]; Lynch, 2016 [3]).
Obviously, this field intersects social epistemology, which deals with knowledge in social contexts; virtue epistemology, which deals with the degree to which agents’ intellectual character matters for validating information as knowledge; and ethics, which deals with the norms of producing the various kinds of knowledge and the moral aspects of making it public and available to users (Goldman, 1999 [4]; Frost-Arnold, 2023 [5]). Other, non-philosophical disciplines are involved and contribute to the epistemology of the Internet, namely information science, communication science, and the science of technology (Goldman & Whitcombe, 2011 [6]; Heersmink, 2018 [7]; Spence, 2009 [8]). Such disciplines account for both the nature and specificity of Internet knowledge, especially the way it is produced and communicated and how it differs from traditional print-published knowledge. Further disciplines are expected to become involved in the future, as the Internet evolves technologically and socially at a high rate.
Research in the epistemology of the Internet has addressed themes related to the specific features of the information delivered in the Web space, including the technological processes of its delivery. The thematic areas that have received the most attention are: search engines, information-seeking, and credibility; trust, authority, and credibility; epistemic virtues and agent epistemology; epistemic risks (misinformation, overload, echo chambers); epistemic injustice and feminist, postcolonial, and situated knowledge; the ontology and methodology of big data; anonymity, identity, and online agency; content moderation, platform power, and epistemic governance; and virtues, norms, ethics, and educational implications.
In what follows we briefly review each area in turn, outlining the major problems and debates that have emerged from the research.
The Web space gives easy access to massive amounts of information, and search engines mediate that access: they structure and filter the information in ways that affect what users see. It follows that users must often evaluate sources for credibility, authority, and evidence beyond the search engines’ algorithmic ranking, which is based on criteria of questionable epistemic virtue. More generally, virtue epistemology has been applied to how we ought to use the Internet in epistemically good ways, with patience, attentiveness, intellectual humility, etc. (Heersmink, 2018 [7]).
The concept of ‘default trust’ has been advanced in relation to personalization, algorithmic bias, and filter bubbles, in which users may not see dissenting or diverse views (Bartsch et al., 2025 [9]). However, both the nature and the role of default trust are ambiguous. Given that users are often not well trained in evaluating evidence or sources, especially in “fluid” or non-academic domains, there is a tension between usability/convenience and epistemic standards (supposed to transcend the Internet as a medium), a tension that has not yet been regimented in theory.
Trust and knowledge are fundamentally entwined in online contexts. Users often rely on authority (expertise, reputation) and credibility (truth, consistency), but what counts as authoritative or credible can vary and be manipulated (Rini, 2017 [10]; Wang & Emurian, 2005 [11]). It is worth noting that factors beyond the content itself and its authorship count for users when assessing credibility, for instance the aesthetics of a site’s design (Robins & Holmes, 2008 [12]).
Some work proposes formal models for evaluating web sources (e.g., knowledge-based trust, endogenous versus exogenous signals, transparency of source, provenance, signals like references or backlinks, etc.). Dong et al. (2015 [13]) propose a quantitative model using endogenous signals to estimate web source trustworthiness, distinguishing extraction errors from factual errors. Rowley and Johnson (2013 [14]) develop models linking credibility, trust, and authority, analyzing how they interact in online settings. But foundational questions remain unanswered: How should credible authority be defined? How should anonymity and pseudonymity be dealt with? How can the risk of “authority bias” (trusting someone because of perceived status rather than evidence) be accounted for theoretically? Automated ranking algorithms are themselves authorities of a certain sort, but they are opaque and associated with only one entity as epistemic agent, the search engine running them. Algorithmic influence and platform power complicate the task of accounting for who is really authoritative.
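To make the intuition behind such endogenous, content-based models concrete, here is a deliberately simplified sketch. It is not Dong et al.’s actual probabilistic model (which, among other things, separates extraction errors from genuine source errors); it merely illustrates the core idea of scoring a source by how well its extracted fact triples agree with a reference set of accepted facts. All names and data are hypothetical.

```python
# Toy sketch of a knowledge-based trust score: a source's trustworthiness
# is estimated as the fraction of its extracted (subject, attribute, value)
# triples that agree with a reference set of accepted facts.
# This is an illustrative simplification, not the model from Dong et al. (2015).

def knowledge_based_trust(extracted, accepted):
    """Return a naive trustworthiness score in [0, 1]."""
    if not extracted:
        return 0.0  # no claims extracted: no basis for trust
    correct = sum(1 for triple in extracted if triple in accepted)
    return correct / len(extracted)

# Hypothetical reference knowledge base.
accepted_facts = {
    ("Earth", "shape", "oblate spheroid"),
    ("water", "boiling point at 1 atm", "100 C"),
}

# A hypothetical source asserting one accepted and one rejected fact.
source_a = [
    ("Earth", "shape", "oblate spheroid"),
    ("water", "boiling point at 1 atm", "50 C"),
]

print(knowledge_based_trust(source_a, accepted_facts))  # 0.5
```

The obvious limitation, and part of why foundational questions remain open, is that such a score is only as good as the reference set of accepted facts, which simply relocates the problem of authority one level up.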
Drawing on the virtue epistemology of the Internet, some papers (e.g., Heersmink, 2018 [7]) argue that we need to cultivate our own intellectual virtues to use the Internet well: curiosity, autonomy, humility, carefulness, open-mindedness, etc. These virtues are supposed to help counteract or mitigate the epistemic risks of the web (misinformation, overload, bias). In general, it is argued that how agents use the Internet, including which intellectual virtues they exercise, is crucial for epistemic reliability (Baehr, 2011 [15]).
But there is tension in this argument, as not everyone has equal capability or training. Virtues may also conflict (e.g., breadth versus carefulness), while environmental and platform structures can hinder or distort them (e.g., incentives for sensational content). Moreover, how to integrate virtue cultivation into education or platform design is underexplored. Overall, the question arises whether this way of accounting for users’ capabilities (of a strongly social nature) does not displace the Internet, the main object of investigation, from the spotlight of research within virtue epistemology.
The Web space massively amplifies information, including false or misleading information. Big Data and social media privilege speed over depth (Boyd & Crawford, 2012 [16]). Choices of what to show or hide, algorithmic selection, popularity contests, virality, etc., can distort epistemic communities. The Web space is overloaded with information; there is so much content that evaluating everything, individually or by comparison, is impossible. This is the main reason why the heuristic approach dominates.
Echo chambers and filter bubbles potentially reduce exposure to conflicting views and nuanced information. Evidence for “filter bubbles” is mixed; it depends on both individual behavior and platform architecture (Nguyen, 2020 [17]). Attempts to moderate content run into issues of censorship, bias, free expression, and platform power. Distinguishing misinformation from disinformation (a matter of intent) is difficult (Floridi, 2011 [2]). The question arises as to what epistemic standards apply in different domains (everyday life versus science) and how to account for the social dimension of these standards (originating in the social purpose of preventing harm to users) in a systematic epistemology of the Internet.
Since the online environment is not neutral, it inherits and can amplify social inequalities. Marginalized voices may be silenced, dismissed, or stereotyped. Fricker (2007 [18]) distinguishes between testimonial injustice (where someone’s testimony is undervalued because of prejudice) and hermeneutical injustice (when someone lacks the resources to make sense of their oppression).
Feminist epistemology, standpoint theory, epistemologies of ignorance, and epistemic injustice are invoked in analyses of online knowing, raising the question of how we can theoretically approach concepts like “whose knowledge counts” and regimentation projects like “decolonizing Internet knowledge” (Pohlhaus, 2017 [19]).
Anonymity complicates traditional frameworks of epistemic injustice and undermines attempts to remediate injustice in large networked platforms (Origgi & Ciranna, 2017 [20]). Another issue is that efforts to “correct” injustice may impose other biases (global South versus North, language, access, or algorithmic bias in training data) (Fricker, 2007 [18]).
Big Data brings both opportunities (for instance, scaling or pattern detection) and challenges. Collection, filtering, and use of massive datasets raise questions about representation, validity, bias, quantification versus interpretation (Boyd & Crawford, 2012 [16]). Such questions have both a statistical nature and an ethical-epistemological one.
Resnyansky (2019 [21]) critiques over-reliance on quantitative data, advocating for integrating social scientific theory in the epistemological account of Big Data.
From an epistemological perspective, there are tensions between quantitative and qualitative approaches, the danger of treating data as neutral, and the issue of “data universalism” (assuming that data generalizes). And since platforms act as both producers of data and controllers of data access, power issues are also raised (Törnberg & Törnberg, 2018 [22]).
Anonymity and pseudonymity change how we perceive knowledge, trust, and authority: they can empower marginalized voices, but also enable deception, trolling, and fake accounts. In addition, identity cues affect the reception of testimony. Fricker-inspired frameworks explore how anonymity and identity in digital spaces complicate power relations in testimony and credibility (Origgi & Ciranna, 2017 [20]).
A theory of Internet-style anonymized knowledge should explain how to balance privacy and anonymity against accountability, including how to distinguish sincere pseudonyms from deceit. A factual issue to be reflected in any theoretical account of this knowledge is that there is cultural variation in how identity is tied to credibility (Bartsch et al., 2025 [9]).
Online platforms (social networks, search engines, Wikipedia) are the Internet’s key gatekeepers: their ranking algorithms, moderation policies, and community norms decide what is visible and to what degree (Rowley & Johnson, 2013 [14]). They shape what knowledge is accessed, produced, and considered legitimate from their point of view (McDowell, 2002 [23]).
The major question of epistemic authority relative to platforms is who decides what content is moderated and what counts as harmful or false, and how that authority should exercise its role, not to mention cross-jurisdictional issues (Petit, 2006 [24]; Alaimo & Kallinikos, 2017 [25]; Rowell & Call-Cummings, 2020 [26]; Zuckerman, 2021 [27]). Ranking algorithms are not transparent, and there is real potential for epistemic injustice in moderation processes. Overall, the incentives of platforms conflict with epistemic ideals, which transcend the specificity of the Internet as an epistemic medium.
Researchers agree that to navigate the Web well and safely, we need norms, virtues, and skills such as critical thinking, epistemic humility, and skepticism. Since platform designs often do not reward virtue, educational systems should integrate training in evaluating sources, recognizing bias, and corroborating claims in online environments. Spence (2009 [8]) emphasizes that online information carries both epistemic and ethical obligations for its producers, disseminators, and users.
However, it is difficult to adopt standards sensitive to context and culture (Brady & Fricker, 2019 [28]). Moreover, norms may conflict with each other (openness versus privacy, free speech versus preventing harm, for instance) (Turilli & Floridi, 2009 [29]). Educating large populations is challenging and adds a new social dimension to a complete account of the epistemology of the Internet, originating in the educational nature of the Internet.
In what follows I will outline the major tensions and open problems raised by research in the epistemology of the Internet, relative to the theoretical framework in which that research was developed. These challenges either result from incomplete or unsettled theory (including its conceptual framework) or require further research within the same framework.
To what extent can we have universal norms of epistemic evaluation (what counts as credible and what trust requires) given cultural, linguistic, and institutional variation?
An answer to this question should resolve, at an epistemic-ethical level, the dissension that arises when platforms and authorities try to impose standard criteria with which local and cultural contexts may disagree.
Many systems and human-managed processes (search engines, AI recommendation systems, and content moderation) operate opaquely. For instance, users often do not see why certain content is ranked higher or removed, and algorithmic decisions can embody biases. The question is how we can have justified belief or trust when the mediators or agents are not transparent about their criteria and policies, so that discovering and correcting these is difficult or impossible. Should mere exposure in a ranking, whether opaque or transparent, be considered an epistemic criterion for Internet knowledge?
With vast amounts of information, users rely on heuristics. But such heuristics (surface cues, popularity, recency, and so on) may mislead users and divert them from acquiring essential knowledge about all the sources they consult.
There is a trade-off: users cannot deeply investigate everything and are forced into superficial evaluation; however, this practice carries the risks of false beliefs and of compiling irrelevant information that is then converted into knowledge.
It may seem that this trade-off does not pose ethical problems, since what is criticized is not the information itself but one’s subjective way of evaluating it. However, as far as a clear theoretical account of the epistemology of the Internet is concerned, the trade-off problem reduces to optimizing the form and structure of the epistemic units so that the essential information can be extracted and used as relevant knowledge regardless of users’ habits and constraints.
Not all false information is equally pernicious. Some is erroneous by conception and authoring, but some is intentionally false. While information erroneous “by naïve conception” can eventually be judged and qualified as such by evaluating the credentials of its authors and by critical thinking about the information itself, intentionally false information is typically construed in a sort of “expert” mode in order to mislead, and is hence hard to identify as false.
A theory of the epistemology of the Internet should then capture the intentional dimension of the diffusion of information, as belonging to the social dimension of the Internet. The problem of epistemic responsibility becomes very important in light of the misinformation/disinformation issue. Therefore, when fine-graining the social dimension of the Internet within an epistemological framework, epistemic responsibility should be granted an essential theoretical role as a factor in evaluating information as knowledge. Moreover, when distinguishing between Internet knowledge and traditional press-published (on-paper or online) knowledge, the former should be accounted for as relative to the Internet-specific epistemic responsibility.
Answering the question of what responsibilities platforms and authors have to correct, retract, or flag false content, so that users can differentiate erroneous from intentionally misleading information, and how they should be regulated, falls within the ethical side of the delivery of information via the Internet (Himma & Tavani, 2008 [30]). The question arises as to how these ethical aspects can be adequately represented in a theory of the epistemology of the Internet, since there is no absolute regulatory or epistemic authority ruling over the Internet as a whole.
The core question, whose answer should be a prerequisite for constructing a theory of Internet knowledge, is what trust should amount to in online settings. Is it about reliability or about deeper epistemic virtue? When is it justified to trust, when to defer, and when to be skeptical? And how do power and authority factor in?
Such questions should be answered within both a general-epistemology and a social-epistemology framework, and then adapted to the specificity of a conceptual framework for an epistemology of the Internet.
From a merely ethical perspective, the Internet should provide epistemic-literacy content, training users to assess sources, detect bias, and engage in good practices. In this regard, the questions of how to build such content, how to assign it epistemic authority in online settings, and what the role of educational institutions and platform design should be are still awaiting answers. However, such content, principles, and policies have to be developed in an academic setting in order to be adequate, which raises again the same issues of trust and authority in online settings. Besides that, such content would have the same status as any other on the Internet, as no authority can compel users to access certain content prior to their use of the Internet.
Technology advances at high speed on the Internet. AI, generative models, and automation have already raised questions about bias, hallucination, and the erasure of non-dominant epistemologies. A foundational question arises for a theory of the epistemology of the Internet: how can such a theory be developed and crystallized if it must always adapt to new technology? A solution to this problem should transcend the nature of the means by which information and knowledge are delivered online and should depend on the information itself and its nature. However, technology is part of the nature of the Internet.
The epistemology of the Internet is a rich, rapidly growing field extending classical concerns of justification, knowledge, authority, and trust into Web-mediated contexts. Strengths include combining empirical research and case studies with normative reflection, attention to social and power dimensions, and recognition that technology mediates knowledge. Challenges remain in balancing rigorous epistemic ideals with real-world constraints, managing platform power and opacity, ensuring fairness and justice, and integrating diverse epistemic cultures.
Research in this field is far from crystallizing a systematic theory of the epistemology of the Internet. The attribute ‘systematic’ is justified by the aim of integrating this theory into traditional epistemology and of establishing the precise relationships of this prospective theory with the more general social epistemology, by drawing on the specificity of Internet knowledge.
As expected, research has focused on the social dimension of the Internet, given that practical targets of a social-ethical nature emerged with the Internet’s development. However, this focus diverts the path toward a systematic epistemology of the Internet, precisely because the epistemic units or products specific to the Web space (websites, search engines, groups, forums, posts, online communications, etc.) are those delivering and processing knowledge for users through the Internet interface. In traditional (general) epistemology, the orthodox definition of knowledge as “justified true belief” is relative only to the content of the belief itself, which is subject to testing for justification and truth, even though epistemic agents or agencies are part of the process of delivering it. In fact, the social dimension of knowledge in traditional epistemology concerns (in a weak mode) only the epistemic agents and much less the ‘users’ of knowledge, who always exist in whatever medium. In the reviewed research, great emphasis is placed on the social and ethical aspects of Internet users’ acquisition of information and knowledge, and less on what counts as knowledge in this medium.
Obviously, a wide interdisciplinary and multidisciplinary framework is taking shape with the analysis of Internet knowledge from the providers’ and users’ perspectives. But epistemology itself is not that interdisciplinary: it calls for other philosophical fields and at most the cognitive sciences, but not for disciplines beyond them in society. Prospecting for standards and norms to avoid Internet-specific harms and optimize users’ experience is a merely social aim, and many disciplines are welcome and able to contribute. However, for a systematic theory of Internet knowledge that integrates into traditional epistemology and is distinguished from social epistemology, foundational theoretical research is needed first at this stage, centered on a metatheoretical reflection on what counts as “knowledge” in online settings. Theoretical research should investigate the epistemic products of the Internet in a social-free (but not epistemic-agents-free) conceptual framework and establish their place and specificity in a clear taxonomy of “published knowledge”, which will include traditional on-paper information as knowledge (books, press, courses, etc.). This direction also assumes, as a prerequisite, precisely establishing the distinction between information and knowledge in online settings and adopting a methodology specific to analytical philosophy.