A key-value database is a non-relational database (or a NoSQL database) that stores data as a key-value pair. This database is similar to maps or dictionaries, where each key is associated with only one value. There is no connection between values, and they are unrelated. It can be simple data types like strings, numbers, or complex objects. On another side, the key must be unique to avoid ambiguity while looking for its value in a database.
Here data is kept in tables made up of rows and columns in a standard relational database design. The database developer can specify many attributes of the data stored in advance, which opens up possibilities for optimisations like data compression, data aggregations, and data access. Instead of complex aggregations, key-value databases are more flexible and offer fast read and write performance. In other words, the database looks for a single key and returns its corresponding value rather than performing complex aggregations.
The key-value store provides a way to store, access, and update data using simple get, put and delete commands. So the simplicity of this model makes this database fast, easy to use, portable, and flexible. This is highly partitionable and can scale horizontally.
Some critical operations on key-value store:
Key-value databases are used in applications when just one type of item can be looked for based on one property. It offers considerable speed and scalability benefits compared to traditional databases. Most key-value storage systems seek the following properties:
Consistency: Most key-value database systems claim to be eventually consistent, which means that data updates must be propagated across all nodes. The system which uses replication must ensure that all clones are consistent, and retrieving a value for a particular key must always be the same.
Throughput: The throughput statistic, which focuses on the number of transactions performed in a given unit of time, is one of the fundamental goals of key-value databases.
Scalability: A key-value database system satisfies the scalability criteria if it can distribute data and a load of read/write operations across multiple servers. This can increase the system’s performance.
Availability: The availability feature ensures that readers/writers always succeed, and data must be available to the user at any given time.
Partition Tolerance: Data must be stored in several places. The system can handle failures of some individual servers and continue functioning as a whole.
In several situations, choosing a key-value database to store data would be the best idea. It can process a constant stream of read/write operations with low latency on large-scale data. Additionally, the built-in redundancy can handle lost storage or data without issues.
We can use this database in the following scenarios:
Session management: We can use it to manage and store session attributes in online applications on a large scale. One of the best examples is multiplayer online games, where we use a key-value database to manage each player’s session.
In-memory data caching: We can use it as an effective cache mechanism for frequently accessed but rarely updated data to accelerate application responses.
User preference storage and personalisation: We can use it to provide product recommendations and store personalized lists of items for individual customers. One of the best examples is e-commerce.
Implementing blockchain-based solutions: This database is common when implementing blockchain-based ideas. Here key is a hash value, and value is the corresponding block entity.
Real-time random data access: Key-value databases offer fast in-memory access. So we can use it in a situation when our application needs to handle lots of small continuous reads and writes.
Storing basic information: We can use it in an application to keep some basic data based on key-value pairs. For example, we can store the web page URL as the key and the webpage as the value.
Content delivery networks: We can use it for multimedia storage and serve large web objects (video, images, audio, etc.) in a high availability environment.
Several implementations of key-value databases are available in the market. They offer different functionalities and improvements over others.
Aerospike: Aerospike is the world’s leading provider of next-generation, real-time NoSQL data solutions for organizations of any size. The Aerospike database provides lower latency with a predictable performance at any scale, with industry-leading uptime.
Apache Cassandra: Apache Cassandra is an open-source NoSQL distributed database for scalability and high availability without compromising performance. This is a complex key-value database that Facebook first released as open-source in 2008. Cassandra employs data structure building blocks to generate up to 4 or 5-dimensional hashes to represent data. Values are just a collection of other key-value pairs.
Amazon Dynamo DB: Amazon DynamoDB is a key-value, NoSQL document database with single-digit millisecond performance. It’s a fully managed, multi-region, multi-master, persistent database for internet-scale applications with built-in security, backup and restores, and in-memory caching.
Berkeley DB: Berkeley DB is a family of embedded key-value database libraries providing scalable, high-performance data management services to applications. Here records are made up of key-value pairs, which operate on keys using three different access methods: B+tree, hashing, and fixed-or-variable-length records (Recno).
Memcached: Reduces the number of times an external data source must be accessed by caching data and objects in RAM to speed up webpages. It’s open-source and free.
Riak: Riak is a key-value database inspired by Dynamo. It’s primarily written in Erlang. The data comprises a pair of keys and values kept in a bucket that serves as a namespace and allows the same key to appear in several buckets. Riak has a native interface and an HTTP API that enables users to manipulate data using GET, PUT, POST, and DELETE methods.
Redis: Strings, lists, sets, sorted sets, and hashes are among the data structures used by Redis. Redis only supports insert, update, delete, and lookup operations. The dataset is kept in RAM by Redis servers; however, the data is written to disc at customisable intervals for backup and system shutdown.
Because of the ability to return a value given a specific key, we can use a key-value database as a cache to provide crucial real-time information. It can store a pool of read data so that future data requests can be quickly accessed at a later time to improve performance.
There are some differences between cache and key-value store:
Enjoy learning, Enjoy system design!
Subscribe to get weekly content on data structure and algorithms, machine learning, system design and oops.