Yesin, V.; Karpinski, M.; Yesina, M.; Vilihura, V.; Kozak, R.; Shevchuk, R. Technique for Searching Data. Encyclopedia. Available online: https://encyclopedia.pub/entry/51371 (accessed on 04 July 2024).
Technique for Searching Data

The growing popularity of outsourcing data to third-party cloud servers has a downside: data owners are seriously concerned about security because of possible leakage. The desire to reduce the risk of losing data confidentiality has motivated the development of mechanisms that make effective use of encryption to protect data. However, traditional encryption methods face a problem: by making it impossible for insiders and outsiders to access data without knowing the keys, they also exclude the possibility of searching.

database security; database management system (DBMS); confidentiality; encryption

1. Introduction

Today, storing and processing data on third-party remote cloud servers is widespread and growing explosively [1]. However, as the scale, value, and centralization of data increase, the reverse side of this process is revealed: the problems of ensuring data security and privacy are aggravated, which seriously concerns data owners and users. There is an identified risk that data stored in databases may be compromised [2], and this cannot be allowed under various international laws and standards, such as the General Data Protection Regulation (GDPR) [3], the Payment Card Industry Data Security Standard (PCI DSS) [4], the Health Insurance Portability and Accountability Act (HIPAA) [5], and some others. The owner of the data must be sure that the data stored on the third-party remote servers of the service provider are protected from theft by outsiders. Moreover, these data must be protected even from the service provider itself (a valid user, known as an insider) if that provider cannot be trusted.
One of the fundamental solutions to this problem is the use of relevant cryptographic methods and primitives. Encryption is the standard approach to providing confidentiality for data that are outsourced to so-called honest-but-curious cloud servers. Encryption makes it impossible for both insiders and outsiders to access data without knowing the keys. However, encryption also has a downside. The direct use of traditional data encryption/decryption approaches in most cases makes it difficult to perform search operations on encrypted data [6][7][8]. A simple solution to this problem is to download the entire dataset from the corresponding storage, decrypt it locally, and search for the required data. This approach creates serious performance issues that negate the benefits of outsourcing, making it unacceptable for most applications. Another method allows the server to decrypt the data, execute the query on the server side, and send only the results back to the user. However, in this case, the level of security is reduced, since data protected by encryption can potentially become available to the service provider (privileged user). Therefore, it is desirable to support the fullest possible server-side search functionality with the least possible loss of data confidentiality. In particular, a secure search system should aim to ensure that the service provider learns nothing about the data stored in the secure database or about the queries, and that the requester of the relevant data (querier) learns nothing except the query results [2].
The problem of searching data in encrypted databases has aroused great interest both in the scientific community and in industry [9]. To solve the problem of providing a search in cryptographically protected databases, studies have been carried out on new cryptographic primitives, new data structures for searchable encryption, and refined notions of security [6][10]. The solutions available today for searching encrypted data combine non-trivial ideas from cryptography, the theory of algorithms and data structures, information retrieval, and databases [2][6][11]. However, despite the wide variety of options offered, there is no dominant solution for all use cases. The goal of a security plan, according to Andress [12], is to find the balance between protection, usability, and cost. Similar views are held by Fuller et al. [2], who believe that designing a protected search system is a balance between security, functionality, performance, and usability. Therefore, it is important for data owners and users to understand the fairly wide range of secure database systems offered for their various applications and which compromises are acceptable for their respective use case. All this has stimulated research in the field of secure data management and increased its relevance.

2. A Brief Survey of Technique for Searching Data in a Cryptographically Protected Database

The security of searchable encryption schemes is assessed by the information that is revealed, or leaked, during their operation to an attacker who has access to the database server. Bösch et al. [6] divide the information that such schemes can leak into three groups:
(a) index information (information about the keywords contained in the index);
(b) search pattern (information that can be obtained by knowing whether two search results refer to the same keyword);
(c) access pattern (information that is implied by the query (search) results, namely which documents contain the requested keyword for each of the queries [13] or which document identifiers match the query [14]).
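The three leakage groups can be made concrete with a minimal Python sketch. It assumes a hypothetical scheme in which the client derives a deterministic search token from each keyword with HMAC-SHA256 and the server stores a token-to-document-identifier index. The key, the sample documents, and the function names are illustrative assumptions and are not taken from any scheme cited here.

```python
import hashlib
import hmac
from collections import defaultdict

CLIENT_KEY = b"demo-key-not-for-production"  # hypothetical secret, known to the client only


def token(keyword: str) -> str:
    """Deterministic search token: equal keywords always give equal tokens."""
    return hmac.new(CLIENT_KEY, keyword.encode(), hashlib.sha256).hexdigest()


def build_index(documents: dict) -> dict:
    """Client-side construction of the index that is uploaded to the server."""
    index = defaultdict(list)
    for doc_id, words in documents.items():
        for word in words:
            index[token(word)].append(doc_id)
    return dict(index)


def server_observations(index: dict, query_tokens: list):
    """Everything below is computed without the key, as a curious server could."""
    # (a) Index information: here, the sizes of the posting lists.
    index_info = sorted(len(ids) for ids in index.values())
    # (b) Search pattern: which query positions carry the same token.
    positions = defaultdict(list)
    for pos, tok in enumerate(query_tokens):
        positions[tok].append(pos)
    search_pattern = sorted(p for p in positions.values() if len(p) > 1)
    # (c) Access pattern: which document identifiers match each query.
    access_pattern = [tuple(index.get(tok, ())) for tok in query_tokens]
    return index_info, search_pattern, access_pattern


documents = {1: ["flu", "fever"], 2: ["flu"], 3: ["diabetes"]}
index = build_index(documents)
query_tokens = [token(q) for q in ["flu", "diabetes", "flu"]]
index_info, search_pattern, access_pattern = server_observations(index, query_tokens)
# search_pattern == [[0, 2]]: the first and third queries used the same keyword.
# access_pattern == [(1, 2), (3,), (1, 2)]
```

The server never sees a plaintext keyword, yet it learns that two of the three queries are identical and which documents each query touches. This is the raw material for the statistical attacks discussed in the literature.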
Bösch et al. [6] note that in many schemes, there is leakage of at least the search pattern and the access pattern. Revealing the search pattern may not be a problem in some scenarios, whereas in others it is unacceptable. For example, in a medical database, disclosing a search pattern through statistical analysis (which allows an attacker to get full information about the plaintext keywords) can lead to the leakage of a large amount of information. This information can then be matched with other (anonymous) public databases.
Fuller et al. [2] distinguish two types of entities that can pose a security threat to a database: a valid user, known as an insider, who performs one or more roles, and an outsider, who can monitor and potentially modify network interactions between valid users. Attackers are further divided into those that persist for the lifetime of the database and those that obtain a snapshot at a single point in time [15]. In addition, Fuller et al. [2] differentiate attackers into semi-honest (or honest-but-curious) ones, who follow the prescribed protocols but may try to get additional information from the data they observe, and malicious ones, who actively perform actions aimed at obtaining additional information or influencing the operation of the system. They also note that much of the active research in protected search technology considers semi-honest security against a persistent insider adversary. Special attention is paid to the types of objects within a protected search system that are vulnerable to leaks: (a) data items and any indexing data structures; (b) queries; (c) records returned in response to queries or other relationships between data items and queries; (d) access control rules and the results of their application.
The cryptographic community has developed several common primitives that completely or partially solve the problem of searching in a secure database: fully homomorphic encryption [16][17][18][19]; functional encryption [20][21], with its subclasses and earlier representatives (predicate encryption [22][23], identity-based encryption [24], and attribute-based encryption [25]); and some others. Protected search techniques are often based on these primitives, but rarely rely solely on one of them. Instead, they tend to use specialized protocols, often with some leakage in order to improve performance [2].
One possible approach to reduce the damage caused by a server compromise is to encrypt sensitive data and run all computation (application logic) on the clients. However, as noted by Popa et al. [26], some important applications are not suitable for this approach, for example, database-backed websites that process queries to generate data for the user and applications that compute over large amounts of data. Another possible approach is the use of such theoretical solutions as fully homomorphic encryption (FHE) [16][17][18][19]. Its use allows servers to compute arbitrary functions over encrypted data while only clients see the decrypted data. However, one of the problems of schemes with fully homomorphic encryption is performance, since current schemes require large computational resources and large storage overheads [6][26]. For some applications, so-called somewhat homomorphic encryption schemes may be used. These schemes are more efficient than FHE, but only allow a certain number of additions and multiplications [16][18]. The main problem when using somewhat or fully homomorphic encryption is that the resulting search schemes require search time that is linear in the size of the dataset, which is too slow for practical use in modern applications.
As noted earlier, the problem of searching over encrypted data is of great interest from both theoretical and practical points of view. This is explained by the importance of ensuring the security and privacy of data stored and processed on third-party remote cloud servers of the service provider. However, as noted by some experts in this field [9][27], research on this topic is mostly focused on the scenario of a user who outsources an encrypted set of documents (such as e-mails or medical records) and would like to continue keyword search in this encrypted dataset. In practice, many companies, organizations, and institutions store data in databases that use the relational data model. Users are accustomed to the widely accepted SQL, which allows them to store, query, and update their data in a convenient way. Databases that support SQL (this applies in general to both NewSQL and some NoSQL databases that also support the SQL query paradigm) provide fast search and retrieval of records, provided that the database can read the data contents. Encryption removes this ability, so solutions designed for searching encrypted documents cannot be applied directly to traditional databases.
To address some of these issues, Hacigümüş et al. [28] have developed techniques by which the bulk of the work of executing SQL queries can be performed by the service provider without the need to decrypt the stored data. The paper explores an algebraic structure for query splitting to minimize client-side computations. Using a so-called “coarse index” allows the SQL query to be partially executed on the provider side. The result of this partial query, a superset of the matching rows, is sent to the client. The final correct result of the query is found by decrypting the data and executing a compensation query on the client side.
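The idea of a server computing on ciphertexts it cannot read can be illustrated with a much simpler relative of FHE. The sketch below uses a toy version of the Paillier cryptosystem, which is only additively homomorphic and is not one of the schemes cited above. The tiny primes and all function names are assumptions chosen for readability, and the parameters offer no real security.

```python
import math
import secrets

# Toy parameters (assumption): real deployments use primes of 1024 bits or more.
P, Q = 293, 433
N = P * Q
N2 = N * N
G = N + 1                                           # standard simple choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P - 1, Q - 1), private
MU = pow(LAM, -1, N)                                # valid because G = N + 1 (Python 3.8+)


def encrypt(m: int) -> int:
    """Client side: randomized encryption of an integer 0 <= m < N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Client side: requires the private values LAM and MU."""
    u = pow(c, LAM, N2)
    return ((u - 1) // N * MU) % N


def server_add(c1: int, c2: int) -> int:
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % N2


total = decrypt(server_add(encrypt(20), encrypt(22)))  # 42
```

The server in this sketch only ever handles `c1`, `c2`, and their product. Fully homomorphic schemes extend the same principle to arbitrary functions, and a search built this way must still touch every encrypted item, which is the source of the linear search time noted above.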
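The split between a coarse server-side filter and a client-side compensation query can be sketched as follows. This is a simplified illustration rather than the construction from [28]: the fixed-width bucketing, the toy HMAC-based stream cipher used in place of a real cipher, and all names are assumptions.

```python
import hashlib
import hmac
import json
import secrets

KEY = secrets.token_bytes(32)   # never leaves the client
BUCKET_WIDTH = 25_000           # hypothetical width of one coarse-index bucket


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with an HMAC-SHA256 keystream (no authentication)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_row(row: dict) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + xor_stream(KEY, nonce, json.dumps(row).encode())


def decrypt_row(blob: bytes) -> dict:
    return json.loads(xor_stream(KEY, blob[:16], blob[16:]))


def bucket(salary: int) -> int:
    """Coarse index: the server only learns which range a value falls in."""
    return salary // BUCKET_WIDTH


rows = [
    {"name": "ann", "salary": 30_000},
    {"name": "bob", "salary": 48_000},
    {"name": "eve", "salary": 52_000},
    {"name": "dan", "salary": 90_000},
]
# What the provider stores: (opaque ciphertext, bucket identifier).
server_table = [(encrypt_row(r), bucket(r["salary"])) for r in rows]


def server_query(buckets: set) -> list:
    """Provider side: filter on the coarse index only, no decryption."""
    return [blob for blob, b in server_table if b in buckets]


def client_range_query(low: int, high: int) -> list:
    wanted = set(range(bucket(low), bucket(high) + 1))
    candidates = [decrypt_row(blob) for blob in server_query(wanted)]
    # Compensation query: discard false positives from the coarse filter.
    return [r for r in candidates if low <= r["salary"] <= high]


result = client_range_query(45_000, 60_000)  # rows for bob and eve
```

Here the provider returns three candidate rows (buckets 1 and 2) and the client keeps two. Wider buckets leak less about the values but send more false positives to the client.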
Popa et al. [26] proposed a system called CryptDB that supports SQL queries over encrypted data. This solution is based on various types of encryption, such as random (RND), deterministic (DET), and order-preserving encryption (OPE), applied to a SQL table column. To request data from an encrypted database, CryptDB converts an unencrypted SQL query into its encrypted equivalent and decrypts the appropriate encryption layers. CryptDB achieves its goals using three ideas: running queries over encrypted data using a new encryption strategy with SQL support, dynamically adjusting the encryption level using encryption onions to minimize the information disclosed to the untrusted DBMS server, and chaining encryption keys to user passwords so that only authorized users can access the encrypted data. Although CryptDB protects data confidentiality, it does not guarantee the integrity, freshness, or completeness of the results returned to the application. However, the main disadvantage of CryptDB, as noted by Azraoui et al. [9], is that whenever one layer is removed, the encryption scheme becomes weaker. In light of this, the main problem is to provide a practical solution for searching over encrypted databases that does not suffer from the leakage occurring in CryptDB and that provides transparent processing of complex queries over encrypted SQL databases. In their paper [9], the authors attempt to solve this problem by proposing a practical construction for searching data in encrypted SQL databases that limits information leakage. Their solution is based on the searchable encryption technique developed by Curtmola et al. [29] for unstructured documents. This mechanism creates an inverted search index of keywords in the database to enable keyword search queries over encrypted data. The practicality of this solution is achieved through the use of the cuckoo hashing technique, which makes the search in the index efficient. The proposed solution supports Boolean and range queries.
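The inverted-index idea can be sketched in a few lines of Python. This is a heavily simplified illustration in the spirit of searchable symmetric encryption, not the construction of [9] or [29]: a plain dictionary stands in for the cuckoo hash table, the index is static, and the keys, the `column=value` keyword format, and the function names are assumptions.

```python
import hashlib
import hmac
import secrets

K_LABEL = secrets.token_bytes(32)   # client key for index labels
K_ENTRY = secrets.token_bytes(32)   # client key for per-keyword entry keys


def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def mask(key: bytes, data: bytes) -> bytes:
    """XOR data with a PRF keystream; applying it twice restores the data."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += prf(key, counter.to_bytes(4, "big"))
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_index(table: dict) -> dict:
    """Client side: map PRF(keyword) to a masked list of matching row ids."""
    postings = {}
    for row_id, keywords in table.items():
        for w in keywords:
            postings.setdefault(w, []).append(row_id)
    index = {}
    for w, ids in postings.items():
        packed = b"".join(i.to_bytes(4, "big") for i in ids)
        index[prf(K_LABEL, w.encode())] = mask(prf(K_ENTRY, w.encode()), packed)
    return index


def trapdoor(keyword: str) -> tuple:
    """Client side: the only values revealed to the server for one query."""
    return prf(K_LABEL, keyword.encode()), prf(K_ENTRY, keyword.encode())


def server_search(index: dict, td: tuple) -> list:
    """Server side: locate the entry and unmask it with the supplied entry key."""
    label, entry_key = td
    blob = index.get(label)
    if blob is None:
        return []
    packed = mask(entry_key, blob)
    return [int.from_bytes(packed[i:i + 4], "big") for i in range(0, len(packed), 4)]


table = {1: ["city=Kyiv", "dept=IT"], 2: ["city=Lviv", "dept=IT"], 3: ["city=Kyiv", "dept=HR"]}
index = build_index(table)
it_rows = server_search(index, trapdoor("dept=IT"))        # [1, 2]
kyiv_rows = server_search(index, trapdoor("city=Kyiv"))    # [1, 3]
both = sorted(set(it_rows) & set(kyiv_rows))               # Boolean AND: [1]
```

Until a trapdoor for a keyword is issued, the server holds only pseudorandom labels and masked lists. A Boolean conjunction is shown here as an intersection of two single-keyword results, which is one simple way to obtain it and is not necessarily how [9] handles Boolean queries.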
Pilyankevich et al. [27] propose a system called Acra, which, among other things, provides search over encrypted data in SQL databases. The proposed Acra Searchable Encryption (Acra SE) solution is based on a blind indexing approach that develops the original idea of the CipherSweet project [30]. The main component of the Acra SE scheme is the so-called Acra Server, which works as a reverse proxy (a transparent encryption/decryption proxy server). It sits between the application and the database. The application does not know that the data are encrypted before they reach the database, and the database does not know that someone has encrypted the data. It is worth noting that the encryption and secure search functions of Acra Server can be configured for each column. This means that every table in the database can be fully encrypted (every column), partially encrypted (some columns are encrypted, some not), or fully unencrypted. The security properties of Acra’s searchable encryption are very similar to those of CipherSweet, which is exposed to the risk of partially known plaintext attacks. In this connection, Pilyankevich et al. [27] provide practical recommendations to ensure security. However, despite these measures aimed at securing the storage and search of sensitive data, Acra, like CipherSweet, which was taken as a prototype of a searchable encryption scheme, supports only minimal query functionality, namely equality queries.
Various DBMSs offer the so-called “transparent data encryption” (TDE) technology [31], which allows sensitive data stored in database files, as well as in files related to data recovery, such as redo logs, archive logs, and backup tapes, to be selectively encrypted. The essence of transparent encryption is that a combination of two keys is used: a unique key for each database table and a master key that is stored outside the database in the so-called “wallet”. Data stored on disk are encrypted; however, they are automatically decrypted for the legitimate user to process queries. That is, when the user selects encrypted columns, the DBMS quietly extracts the key from the “wallet”, decrypts the columns, and shows them to the user. As a result, the server must have access to the decryption keys, and an attacker who has compromised the DBMS software can gain access to all data. Therefore, the main goal of TDE is to protect sensitive data located in the corresponding files of the operating system. TDE is not a full-fledged encryption system and should not be used in this capacity.
In addition, the ability to perform search operations over encrypted databases makes systems more complex and increases both the amount of memory required and the query execution time. Moreover, some searchable encryption schemes do not provide sufficient data confidentiality when performing certain queries: with long-term observation, an attacker can obtain a significant part of the information about sensitive data. All this shows that the secure search problem remains open and that further research in this direction is needed to ensure secure work with remote databases and data storages.
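The general blind-index idea behind such equality search can be sketched with Python's standard `sqlite3` module. This is an illustration of the principle only and does not reproduce the internals of Acra or CipherSweet. The table layout, the column names, the toy stream cipher, and the use of an untruncated HMAC-SHA256 value as the index are all assumptions.

```python
import hashlib
import hmac
import secrets
import sqlite3

ENC_KEY = secrets.token_bytes(32)     # protects the stored values
INDEX_KEY = secrets.token_bytes(32)   # separate key for the blind index


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with an HMAC-SHA256 keystream (no authentication)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(value: str) -> bytes:
    nonce = secrets.token_bytes(16)   # randomized: equal values give different blobs
    return nonce + xor_stream(ENC_KEY, nonce, value.encode())


def decrypt(blob: bytes) -> str:
    return xor_stream(ENC_KEY, blob[:16], blob[16:]).decode()


def blind_index(value: str) -> bytes:
    """Keyed hash of the plaintext, stored next to the ciphertext."""
    return hmac.new(INDEX_KEY, value.encode(), hashlib.sha256).digest()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email_enc BLOB, email_bidx BLOB)")
for email in ["alice@example.com", "bob@example.com"]:
    db.execute(
        "INSERT INTO users (email_enc, email_bidx) VALUES (?, ?)",
        (encrypt(email), blind_index(email)),
    )


def find_by_email(email: str) -> list:
    """Proxy side: rewrite an equality predicate into a lookup on the blind index."""
    cur = db.execute("SELECT email_enc FROM users WHERE email_bidx = ?", (blind_index(email),))
    return [decrypt(row[0]) for row in cur]


found = find_by_email("bob@example.com")  # ["bob@example.com"]
```

Because the index is a keyed hash, the database can match equal values but cannot compare, sort, or pattern-match them, which is why this approach is limited to equality queries.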
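The two-level key arrangement, and the reason it does not protect against a compromised DBMS, can be seen in the following sketch. It models the scheme in the abstract and is not the implementation of any particular DBMS product. The toy stream cipher, the dictionary standing in for the wallet, and all names are assumptions.

```python
import hashlib
import hmac
import secrets


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with an HMAC-SHA256 keystream (no authentication)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + xor_stream(key, nonce, plaintext)


def unseal(key: bytes, blob: bytes) -> bytes:
    return xor_stream(key, blob[:16], blob[16:])


# The "wallet": holds the master key outside the database files.
wallet = {"master_key": secrets.token_bytes(32)}

# Database files on disk: the table key is stored only in wrapped form.
table_key = secrets.token_bytes(32)
disk = {
    "wrapped_table_key": seal(wallet["master_key"], table_key),
    "card_numbers": [seal(table_key, n.encode()) for n in ("4111-0000", "5500-1111")],
}
del table_key  # the plaintext table key is not kept anywhere


def dbms_select(disk: dict, wallet: dict) -> list:
    """What the DBMS does transparently for every query on encrypted columns."""
    key = unseal(wallet["master_key"], disk["wrapped_table_key"])
    return [unseal(key, c).decode() for c in disk["card_numbers"]]


visible_to_user = dbms_select(disk, wallet)  # ["4111-0000", "5500-1111"]
```

Someone who steals only `disk` (for example, a backup tape) sees ciphertext. Anyone who controls the running DBMS process can call the equivalent of `dbms_select` and read everything, which matches the limitation described above.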

References

  1. Abadi, D.; Ailamaki, A.; Andersen, D.; Bailis, P.; Balazinska, M.; Bernstein, P.; Boncz, P.; Chaudhuri, S.; Cheung, A.; Doan, A.; et al. The Seattle Report on Database Research. ACM Sigmod Rec. 2020, 48, 44–53.
  2. Fuller, B.; Varia, M.; Yerukhimovich, A.; Shen, E.; Hamlin, A.; Gadepally, V.; Shay, R.; Mitchell, J.D.; Cunningham, R.K. SoK: Cryptographically protected database search. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 172–191.
  3. General Data Protection Regulation GDPR. Available online: https://gdpr-info.eu/ (accessed on 2 August 2023).
  4. Payment Card Industry (PCI) Data Security Standard. Requirements and Testing Procedures Version 4.0. 2022. Available online: https://www.pcisecuritystandards.org/documents/PCI-DSS-v4_0.pdf (accessed on 2 August 2023).
  5. Atchinson, B.K.; Fox, D.M. From the field: The politics of the health insurance portability and accountability act. Health Aff. 1997, 16, 146–150.
  6. Bösch, C.; Hartel, P.; Jonker, W.; Peter, A. A survey of provably secure searchable encryption. ACM Comput. Surv. (CSUR) 2014, 47, 1–51.
  7. Yesin, V.; Vilihura, V. Research on the main methods and schemes of encryption with search capability. Radiotekhnika 2022, 2, 138–155.
  8. Yesin, V.; Vilihura, V. Researching basic searchable encryption schemes in databases that support SQL. Radiotekhnika 2022, 3, 53–74.
  9. Azraoui, M.; Önen, M.; Molva, R. Framework for Searchable Encryption with SQL Databases. In Proceedings of the 8th International Conference on Cloud Computing and Services Science (CLOSER 2018), Madeira, Portugal, 19–21 March 2018; pp. 57–67.
  10. Ramasamy, R.; Vivek, S.S.; George, P.; Kshatriya, B.S.R. Dynamic verifiable encrypted keyword search using bitmap index and homomorphic MAC. In Proceedings of the 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud), New York, NY, USA, 26–28 June 2017; pp. 357–362.
  11. Kamara, S. Encrypted search. XRDS 2015, 21, 30–34.
  12. Andress, J. The Basics of Information Security: Understanding the Fundamentals of InfoSec in Theory and Practice, 2nd ed.; Syngress: Waltham, MA, USA, 2014.
  13. Liu, C.; Zhu, L.; Wang, M.; Tan, Y.A. Search pattern leakage in searchable encryption: Attacks and new construction. Inf. Sci. 2014, 265, 176–188.
  14. Oya, S.; Kerschbaum, F. Hiding the access pattern is not enough: Exploiting search pattern leakage in searchable encryption. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Virtual, 11–13 August 2021; pp. 127–142.
  15. Grubbs, P.; McPherson, R.; Naveed, M.; Ristenpart, T.; Shmatikov, V. Breaking web applications built on top of encrypted data. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 1353–1364.
  16. Gentry, C. Computing arbitrary functions of encrypted data. Commun. ACM 2010, 53, 97–105.
  17. van Dijk, M.; Gentry, C.; Halevi, S.; Vaikuntanathan, V. Fully Homomorphic Encryption over the Integers; Advances in Cryptology – EUROCRYPT 2010. EUROCRYPT 2010. Lecture Notes in Computer Science; Gilbert, H., Ed.; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6110, pp. 24–43.
  18. Brakerski, Z.; Vaikuntanathan, V. Fully Homomorphic Encryption from Ring-LWE and Security for Key Dependent Messages; Advances in Cryptology—CRYPTO 2011. CRYPTO 2011. Lecture Notes in Computer Science; Rogaway, P., Ed.; Springer: Berlin/Heidelberg, Germany, 2011; Volume 6841, pp. 505–524.
  19. Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory 2014, 6, 1–36.
  20. Garg, S.; Gentry, C.; Halevi, S.; Raykova, M.; Sahai, A.; Waters, B. Candidate indistinguishability obfuscation and functional encryption for all circuits. SIAM J. Comput. 2016, 45, 882–929.
  21. Boneh, D.; Sahai, A.; Waters, B. Functional Encryption: Definitions and Challenges; Theory of Cryptography. TCC 2011. Lecture Notes in Computer Science; Ishai, Y., Ed.; Springer: Berlin/Heidelberg, Germany, 2011; Volume 6597, pp. 253–273.
  22. Boneh, D.; Waters, B. Conjunctive, Subset, and Range Queries on Encrypted Data; Theory of Cryptography. TCC 2007. Lecture Notes in Computer Science; Vadhan, S.P., Ed.; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4392, pp. 535–554.
  23. Katz, J.; Sahai, A.; Waters, B. Predicate Encryption Supporting Disjunctions, Polynomial Equations, and Inner Products; Advances in Cryptology—EUROCRYPT 2008. EUROCRYPT 2008. Lecture Notes in Computer Science; Smart, N., Ed.; Springer: Berlin/Heidelberg, Germany, 2008; Volume 4965, pp. 146–162.
  24. Boneh, D.; Franklin, M. Identity-Based Encryption from the Weil Pairing; Advances in Cryptology—CRYPTO 2001. CRYPTO 2001. Lecture Notes in Computer Science; Kilian, J., Ed.; Springer: Berlin/Heidelberg, Germany, 2001; Volume 2139, pp. 213–229.
  25. Sahai, A.; Waters, B. Fuzzy Identity-Based Encryption; Advances in Cryptology—EUROCRYPT 2005. EUROCRYPT 2005. Lecture Notes in Computer Science; Cramer, R., Ed.; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3494, pp. 457–473.
  26. Popa, R.A.; Redfield, C.M.; Zeldovich, N.; Balakrishnan, H. CryptDB: Protecting confidentiality with encrypted query processing. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles 2011, SOSP 2011, Cascais, Portugal, 23–26 October 2011; pp. 85–100.
  27. Pilyankevich, E.; Kornieiev, D.; Storozhuk, A. Proxy-Mediated Searchable Encryption in SQL Databases Using Blind Indexes. Cryptol. Eprint Arch. 2019, 806.
  28. Hacigümüş, H.; Iyer, B.; Li, C.; Mehrotra, S. Executing SQL over encrypted data in the database-service-provider model. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, Madison, WI, USA, 4–6 June 2002; pp. 216–227.
  29. Curtmola, R.; Garay, J.; Kamara, S.; Ostrovsky, R. Searchable symmetric encryption: Improved definitions and efficient constructions. In Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS ‘06), Association for Computing Machinery, Alexandria, VA, USA, 30 October–3 November 2006; pp. 79–88.
  30. CipherSweet. Available online: https://ciphersweet.paragonie.com/ (accessed on 2 August 2023).
  31. McCarty, R.J. Methods and Systems for Transparent Data Encryption and Decryption. US Patent 7,426,745 B2, 16 September 2008.
Update Date: 14 Nov 2023