In Machine Learning and Data Science, we need to store large amounts of data so that our algorithms can access it efficiently. That's where data structures come into the picture. In our previous article, we learned about two important data structures in Python used in ML and Data Science, Tuples and Lists. In this article, we will cover two other data structures frequently used in AI applications: **Sets** and **Dictionaries**.

After reading this blog, we will be able to understand:

- Concept of sets
- Different operations that can be performed on sets
- How to convert lists into sets
- Concept of dictionaries
- Different operations that can be performed on dictionaries
- Comparison between sets and dictionaries

So let's start with sets.

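A set is a data structure that can store elements of different data types together, as long as each element is hashable (for example numbers, strings, and tuples of hashable values). Unlike lists and tuples, sets are unordered, meaning they do not maintain the position or index of the elements they contain.

- Every element in a set is unique and duplicate values cannot be stored.
- Sets are defined using curly brackets. An empty set must be created with set(), because {} creates an empty dictionary.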

```
>>> a = {'EnjoyAlgorithms', 1.2, 7, 7}
>>> a
{'EnjoyAlgorithms', 1.2, 7}
>>> type(a)
<class 'set'>
>>> a = {'EnjoyAlgorithms', 1.2, (7, 7)}
>>> a
{'EnjoyAlgorithms', 1.2, (7, 7)}
```

When we print the set that we created, we will notice that duplicates are automatically removed. However, when we include duplicates as a tuple, they will be stored in the same form without being removed. This is because a tuple such as (7, 7) is treated as a single entity.

We can convert a list or tuple into a set using the built-in **set()** function. This process of converting one data type to another is known as type casting. For example, we can use `set(my_list)` to convert the list `my_list` into a set. This returns a new set in which any duplicates present in the original list have been removed.

Let's understand this using an example:

```
>>> a = ('a','b','c', 'd')
>>> set(a)
{'c', 'd', 'a', 'b'}
>>> a = ['a', 'b', 'c', 'd']
>>> set(a)
{'c', 'd', 'a', 'b'}
>>> a = ['a', 'a', 'a', 'b']
>>> set(a)
{'a', 'b'}
```

An exciting thing to note here: Lists and Tuples store elements in a specific order, so the order in which elements are stored matters. However, sets do not store any information about the position of elements, so after conversion, the order of elements in the original list or tuple is lost. Additionally, when we convert a list (with duplicates) into a set, only the unique elements are retained. This property can be useful in certain situations.
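
As one illustration of where this property helps, the sketch below removes repeated labels from a list. The `labels` list is made-up data, and `sorted()` is used only to get a predictable order back, since the set itself does not guarantee one.

```python
# Hypothetical data: a list of class labels with repeats.
labels = ['cat', 'dog', 'cat', 'bird', 'dog']

# Converting to a set keeps one copy of each label, in no guaranteed order.
unique_labels = set(labels)

# sorted() returns a list, so we get a predictable order back.
sorted_labels = sorted(unique_labels)

print(len(unique_labels))   # 3
print(sorted_labels)        # ['bird', 'cat', 'dog']
```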

A set cannot store a regular set as one of its elements. Set elements must be hashable, and regular sets are mutable and therefore unhashable, so writing a set inside a set raises a TypeError. To build a hierarchical structure, we can use the immutable **frozenset** type for the inner sets. For example, a set called "animals" can contain several frozensets of different types of animals, such as "mammals", "birds", and "reptiles". This allows for more complex and organized data structures, but it also makes it more difficult to access the data.

`a = {'EnjoyAlgorithms', 1.2, (7, 7), frozenset({7, 7})}`
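
A minimal sketch of nesting in practice: a regular set placed inside another set is rejected with a TypeError, while a frozenset is accepted. The names `mammals`, `birds`, and `animals` are illustrative only.

```python
# A regular set is mutable and unhashable, so it cannot be a set element.
try:
    nested = {'EnjoyAlgorithms', 1.2, {7, 7}}
except TypeError as err:
    print('Cannot nest a regular set:', err)

# A frozenset is immutable and hashable, so it can be stored inside a set.
mammals = frozenset({'dog', 'cat'})
birds = frozenset({'sparrow', 'eagle'})
animals = {mammals, birds}

print(mammals in animals)   # True
print(len(animals))         # 2
```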

As we discussed, sets do not contain information about positions, so we can not access elements using the index values. But we can check whether a particular element is present in a given set or not. For example:

```
>>> a = {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
>>> 'e' in a
True
```

**Single Element:** We can add elements in a given set by using the ".add" operation in the case of single elements. For example:

```
>>> a = {'a', 'b', 'c', 'd'}
>>> a.add('e')
>>> a
{'e', 'a', 'b', 'd', 'c'}
>>> a.add('c')
>>> a
{'e', 'a', 'b', 'd', 'c'}
```

**Multiple Elements:** We can add multiple elements in a given set using the ".update" operation. For example:

```
>>> a = {'a', 'b', 'c', 'd'}
>>> b = ['e', 'f', 'g']
>>> # Now we want to add all the elements of b to a
>>> a.update(b)
>>> a
{'c', 'g', 'd', 'a', 'f', 'b', 'e'}
```

Now we know how to add elements inside a set. Let's also learn about the method of removing elements from a given set.

To remove a single element from a set, we can use either the ".remove()" or ".discard()" operation. The difference is that ".remove()" raises a KeyError if the element is not present, while ".discard()" does nothing in that case. To remove several elements at once, we can use ".difference_update()" or the equivalent "-=" operator.

```
>>> a = {'EnjoyAlgorithms', 1.2, 7, 'ML'}
>>> a.remove(1.2)
>>> a
{'EnjoyAlgorithms', 'ML', 7}
>>> a.discard('ML')
>>> a
{'EnjoyAlgorithms', 7}
```
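
The difference between the two operations only shows up when the element is missing. A short sketch, reusing the set from above, that also removes two elements in one call:

```python
a = {'EnjoyAlgorithms', 1.2, 7, 'ML'}

# discard() silently does nothing when the element is missing.
a.discard('Python')

# remove() raises KeyError when the element is missing.
try:
    a.remove('Python')
except KeyError:
    print("remove() raised KeyError for a missing element")

# difference_update() removes several elements in one call.
a.difference_update({1.2, 7})
print(a == {'EnjoyAlgorithms', 'ML'})   # True
```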

We can use the same **len()** function that we saw in the case of lists and tuples to find the length of a given set in Python.

```
>>> a = {'EnjoyAlgorithms', 1.2, 7, 'ML'}
>>> len(a)
4
```

We have seen multiple things about Sets, but one of the key things that make sets unique in Python is the support of mathematical operations. Let's quickly see how these Sets can be used to find union, intersection, or other mathematical operations.

If we are familiar with the sets in mathematics, we might know that sets can be defined using a Venn diagram or circular representation.

There are a lot of important mathematical operations that can be done using Sets.

Union is an operation that forms a new set containing every unique element found in any of the individual sets. In simple terms, a union consists of all elements from all the sets, with duplicates stored only once. We can use the (|) operator in Python to calculate the union. For example:

```
>>> a = {'EnjoyAlgorithms', 1.2, 7, 'ML'}
>>> b = {'Python', 'C++', 'Java', 'EnjoyAlgorithms'}
>>> a|b
{'ML', 1.2, 7, 'C++', 'Python', 'Java', 'EnjoyAlgorithms'}
```

We can also use the ".union()" operation for finding the union. The general syntax would be **set.union(set1, set2…).** For example:

```
>>> a = {'EnjoyAlgorithms', 1.2, 7, 'ML'}
>>> b = {'Python', 'C++', 'Java', 'EnjoyAlgorithms'}
>>> c = {'SS', 'OOPs'}
>>> d = a.union(b,c)
>>> d
{'ML', 1.2, 7, 'C++', 'Python', 'Java', 'SS', 'EnjoyAlgorithms', 'OOPs'}
```

Using intersection operation, we can form a new set containing common elements from all the sets. We can use Python's (&) ampersand operator to find the intersection of sets. For example:

```
>>> a = {'EnjoyAlgorithms', 1.2, 7, 'ML'}
>>> b = {'Python', 'C++', 'Java', 'EnjoyAlgorithms'}
>>> a&b
{'EnjoyAlgorithms'}
>>> a.intersection(b)
{'EnjoyAlgorithms'}
>>> c = {'a', 'b'}
>>> a.intersection(b, c)
set()
```

Sets a and b have only one element in common, 'EnjoyAlgorithms', so that is the only element in the result. We can also use the ".intersection()" method to find the intersection. The general syntax would be set.intersection(set1, set2 … etc). If there are no elements common to all the sets, the result is an empty set, shown as set() in the example above.

We can also find the difference between two sets using the (-) operator in Python. If we subtract set B from set A by using A-B, the result is a new set containing the elements of A that are not present in B. For example:

```
>>> a = {'EnjoyAlgorithms', 1.2, 'ML', 7}
>>> b = {'Python', 'C++', 'Java', 'EnjoyAlgorithms'}
>>> a-b
{'ML', 1.2, 7}
```
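
Note that the difference depends on the order of the operands. The sketch below reuses the same two sets to compare a-b with b-a, and shows the equivalent ".difference()" method; the variable names `only_in_a` and `only_in_b` are just for illustration.

```python
a = {'EnjoyAlgorithms', 1.2, 'ML', 7}
b = {'Python', 'C++', 'Java', 'EnjoyAlgorithms'}

# Elements of a that are not in b (same result as a - b).
only_in_a = a.difference(b)

# Elements of b that are not in a.
only_in_b = b - a

print(only_in_a == {'ML', 1.2, 7})              # True
print(only_in_b == {'Python', 'C++', 'Java'})   # True
```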

We can also check whether a given set is a subset of another set or not. A set is a subset of another set if all the elements present in the former set can be found in the latter set. For example:

```
>>> a = {'ML', 'DataScience', 'RL', 'DL', 'NN'}
>>> b = {'DL', 'NN'}
>>> b.issubset(a)
True
```
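
The same check can be written with comparison operators, and there is a matching ".issuperset()" method for the opposite direction. A brief sketch using the sets from above:

```python
a = {'ML', 'DataScience', 'RL', 'DL', 'NN'}
b = {'DL', 'NN'}

print(b <= a)            # True: b is a subset of a
print(a.issuperset(b))   # True: a contains every element of b
print(a.issubset(b))     # False: a has elements that b lacks
print(b < b)             # False: a set is not a proper subset of itself
```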

It's important to note that assigning a set to a new variable does not create a new set. Both names refer to the same set object, so any operation performed through the new name is also visible through the original name. For example:

```
>>> a = {'ML', 'DataScience', 'RL', 'DL', 'NN'}
>>> b = a
>>> b.remove('RL')
>>> b
{'ML', 'NN', 'DataScience', 'DL'}
>>> a
{'ML', 'NN', 'DataScience', 'DL'}
```

Often, though, we want an independent set that we can modify without affecting the original. For that, we use the ".copy()" method to make a copy of the original set and then perform operations on the copy.

```
>>> a = {'ML', 'DataScience', 'RL', 'DL', 'NN'}
>>> b = a.copy()
>>> b.remove('RL')
>>> b
{'ML', 'NN', 'DataScience', 'DL'}
>>> a
{'ML', 'NN', 'DataScience', 'DL', 'RL'}
```
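
One way to confirm whether two names refer to the same set is the `is` operator. A small sketch, where `alias` and `clone` are illustrative names:

```python
a = {'ML', 'DataScience', 'RL'}

alias = a          # second name for the same set object
clone = a.copy()   # new, independent set with the same elements

print(alias is a)   # True
print(clone is a)   # False
print(clone == a)   # True

# Changing the copy leaves the original untouched.
clone.add('DL')
print('DL' in a)    # False
```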

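Sets also work with the built-in functions min(), max(), and sum(). In addition, the ".intersection_update()" and ".difference_update()" operations modify a set in place instead of returning a new one. For example:

```
>>> a = {1, 2, 3, 4, 5}
>>> min(a)
1
>>> max(a)
5
>>> sum(a)
15
>>> b = {3, 4, 5, 6, 7}
>>> a.intersection_update(b)
>>> a
{3, 4, 5}
>>> a = {1, 2, 3, 4, 5}
>>> a.difference_update(b)
>>> a
{1, 2}
```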

That's enough for the basic understanding of sets. Let's learn about our second data structure for this blog, i.e., Dictionaries.

A dictionary is another type of collection in Python that can store values of different data types. In lists, we used integer indexes as addresses for the elements. Dictionaries work along the same lines, but instead of integer indexes, we look up values using keys, which are most often strings but can be any hashable object.

A dictionary contains the elements in the form of "keys" and "values" pairs.

To create a dictionary, we use curly brackets, just as we did for sets. But here, we place each key followed by a colon and then the corresponding value. The keys must be hashable (in practice, immutable types such as strings, numbers, or tuples) and unique, which means we can not have two keys with the same name. The values, on the other hand, can be immutable, mutable, or even duplicates. We can also store lists, tuples, or sets as values inside the dictionary. Each key-value pair is separated from the next by a comma (,). For example:

```
>>> a = {'key1':[1,2,3], 'key2':'EnjoyAlgorithms', 'key3':7, 'key4':{1,3,5}}
>>> type(a)
<class 'dict'>
>>> alpha = {1.2:1.2, 7:7, "This is awesome": 11}
>>> type(alpha)
<class 'dict'>
```

Please note that these keys can be strings, integers, or floats. Let's quickly see some essential operations that can be performed on dictionaries.

We can extract a value from a dictionary by placing the corresponding key in square brackets. We can also use the .get() method to do the same. Unlike square brackets, which raise a KeyError for a missing key, .get() returns None when the key is not found. For example:

```
>>> a = {'key1':[1,2,3], 'key2':'EnjoyAlgorithms', 'key3':7, 'key4':{1,3,5}}
>>> a['key2']
'EnjoyAlgorithms'
>>> a.get('key2')
'EnjoyAlgorithms'
```
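
The two lookups behave differently when the key does not exist. A short sketch, where 'key9' is a key we deliberately leave out:

```python
a = {'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms'}

# Square brackets raise KeyError for a missing key.
try:
    a['key9']
except KeyError:
    print("a['key9'] raised KeyError")

# get() returns None, or a default value we supply, instead of raising.
print(a.get('key9'))                # None
print(a.get('key9', 'not found'))   # not found

# The in operator checks whether a key exists.
print('key2' in a)                  # True
```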

We can add a new “key-value” pair with just an assignment operation, *given_dict[key] = value*. For example:

```
>>> a = {'key1':[1,2,3], 'key2':'EnjoyAlgorithms', 'key3':7, 'key4':{1,3,5}}
>>> a['key5'] = (1,2,4)
>>> a
{'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms', 'key3': 7, 'key4': {1, 3, 5}, 'key5': (1, 2, 4)}
```

Please note that there was no 'key5' earlier, but it appeared after the assignment.

To update the value of an existing key, we assign a new value to that key. For example:

```
>>> a = {'key1':[1,2,3], 'key2':'EnjoyAlgorithms', 'key3':7, 'key4':{1,3,5}}
>>> a['key4'] = 'ML'
>>> a
{'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms', 'key3': 7, 'key4': 'ML'}
```

**Can you guess how we can change any key in a dictionary?**

As we said, keys are immutable, so a key can not be renamed in place. Instead, we add a new key with the new name that holds the same value as the older key, and then delete the older key.
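
A minimal sketch of this renaming, using the built-in .pop() method, which removes a key and returns its value so both steps fit on one line. The name 'new_key' is only an example.

```python
a = {'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms', 'key3': 7}

# pop() removes 'key3' and returns its value, which we store under 'new_key'.
a['new_key'] = a.pop('key3')

print(a)   # {'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms', 'new_key': 7}
```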

We can use the del operation to delete a particular key from a given dictionary. For example:

```
>>> a = {'key1':[1,2,3], 'key2':'EnjoyAlgorithms', 'key3':7, 'key4':{1,3,5}}
>>> del a['key4']
>>> a
{'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms', 'key3': 7}
```

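We can get all the keys and all the values separately by using the .keys() and .values() methods, respectively. These return view objects (dict_keys and dict_values) rather than lists; we can wrap them in list() if we need an actual list. For example: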

```
>>> a = {'key1':[1,2,3], 'key2':'EnjoyAlgorithms', 'key3':7, 'key4':{1,3,5}}
>>> a.keys()
dict_keys(['key1', 'key2', 'key3', 'key4'])
>>> a.values()
dict_values([[1, 2, 3], 'EnjoyAlgorithms', 7, {1, 3, 5}])
```
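
Two things are worth seeing in code here: a view reflects later changes to the dictionary, and the related .items() method lets us loop over keys and values together. A short sketch:

```python
a = {'key1': [1, 2, 3], 'key2': 'EnjoyAlgorithms', 'key3': 7}

# The view object stays linked to the dictionary.
keys = a.keys()
a['key4'] = 'ML'
print(list(keys))   # ['key1', 'key2', 'key3', 'key4']

# items() yields (key, value) pairs, which is handy in a for loop.
for key, value in a.items():
    print(key, '->', value)
```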

The fromkeys() method creates a new dictionary from a collection of keys, assigning the same default value to every key. If we do not provide a default value, every key is assigned None.

```
>>> a = dict.fromkeys([7,11], 'EnjoyAlgorithms')
>>> a
{7: 'EnjoyAlgorithms', 11: 'EnjoyAlgorithms'}
>>> a = dict.fromkeys([7,11])
>>> a
{7: None, 11: None}
```
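
One caveat: fromkeys() assigns the same object to every key, so a mutable default such as a list ends up shared between all keys. The sketch below shows the effect and one common alternative, a dictionary comprehension; the key names 'train' and 'test' are made up for illustration.

```python
# Every key points to the same list object.
shared = dict.fromkeys(['train', 'test'], [])
shared['train'].append(0.9)
print(shared)     # {'train': [0.9], 'test': [0.9]}

# A dictionary comprehension creates a separate list for each key.
separate = {name: [] for name in ['train', 'test']}
separate['train'].append(0.9)
print(separate)   # {'train': [0.9], 'test': []}
```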

We can store a dictionary as the value of a key, which creates a nested dictionary. For example:

```
>>> a = {'key1':[1,2,3], 'key2':{'ML':'EnjoyAlgorithms'}}
>>> type(a['key1'])
<class 'list'>
>>> type(a['key2'])
<class 'dict'>
```
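
To reach a value inside a nested structure, we chain the lookups one after another. A brief sketch using the same dictionary; the 'DL' entry is added only for illustration.

```python
a = {'key1': [1, 2, 3], 'key2': {'ML': 'EnjoyAlgorithms'}}

# First lookup returns the inner dictionary, second lookup returns its value.
print(a['key2']['ML'])   # EnjoyAlgorithms

# The same chaining works for a list stored as a value.
print(a['key1'][0])      # 1

# We can also add a new pair to the inner dictionary.
a['key2']['DL'] = 'Deep Learning'
print(a['key2'])         # {'ML': 'EnjoyAlgorithms', 'DL': 'Deep Learning'}
```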

That's all for understanding the basics of sets and dictionaries.

In this article, we learned about two important data structures frequently used in Machine learning and data science domains, Sets and Dictionaries. We looked into various operations that can be performed on these data structures and how they store elements. With this blog, we summarized all the important data structures in Python for ML and data science domains. We hope you enjoyed the article.

**Next Blog:** Conditions and Branching in Python

**Enjoy learning!**
