Logo

dev-resources.site

for different kinds of informations.

Hashing

Published at
11/13/2022
Categories
hash
hashing
systems
webdev
Author
pragyasapkota
Categories
4 categories in total
hash
open
hashing
open
systems
open
webdev
open
Author
13 person written this
pragyasapkota
open
Hashing

Previously, we discussed hashing briefly under the section IP hashing-based selection in the article โ€” Load Balancing. However, it might be a little tricky for some which is why this is an article with the basic concept of hashing.

What is hashing?

The internet has been growing in its content rapidly. The content on the internet doubles every six months making it very difficult to find a particular one. The usual array and linked lists used to store data lead to a hard time when the data is large.

Thus, Enters HASHING.

A certain data of large size in an array takes O(logn) times in a sorted one and O(n) in the unsorted one. It might feel easier if the sorted array uses special techniques like binary search but if itโ€™s not, we might have to use linear search. However, we cannot be sure if the error margin will be minimized. Hence, hashing is used so that we can have the constant time of O(1) on the operations. This is a far more effective method since the time of operation is independent of the data size n.

Map-The data structure

The word โ€˜mapโ€™ means the relation of two sets โ€” (key, value). The relation between the key and the value is that given a key, you can use a function to get a value. The function is called a hash function.

For a simple understanding, letโ€™s imagine an array where the key is the index, and the value of that index is the required value. The array X with the key โ€œiโ€ gives the value from X[i]. The difference from the concept of an array on hashing is that the concept of a hash table is more generalized in comparison. The latter can have any value as a key instead of having an integer all the time. It can even be a name or an object times according to your requirement. The only trick here is to have a valid hash function to form an index that can store the object at a particular location from where we can later find the object easily.

Example: -

Letโ€™s imagine a set of strings {โ€œpragyaโ€, โ€œsapkotaโ€, โ€œmediumโ€}. Now we need to store these strings on a table. We need to have a quick update and the operation time should be O(1). We canโ€™t be responsible for the ordering of the operation or even the maintenance.

Now, letโ€™s think of a schema where we number the alphabets like a=1, b=2, c=3,. โ€ฆ.., z=26.

So, letโ€™s have a number for each string.

  1. โ€œpragyaโ€ = 16+18+1+7+25+1 = 68

  2. โ€œsapkotaโ€ = 19+1+16+11+15+20+1 = 83

  3. โ€œmediumโ€ = 13+5+4+9+21+13 = 65

If you have a table of size 4, we can use the hash function as
sum mod 4
and thus place it on the table for the table.

  1. โ€œpragyaโ€ = 68 mod 4 = 0

  2. โ€œsapkotaโ€ = 83 mod 4 = 3

  3. โ€œmediumโ€ = 65 mod 4 = 1

The table will look as below: -

0 1 2 3
pragya medium sapkota

As you can notice from above that the idea is computing a location for a string with the help of a hash function. Here, the hash function is sum of the characters mod Table size.

Hash

The way of searching a string with the use of the value is a very effective way of storing words in a dictionary.

Hashing: The problems

On the above strings {โ€œpragyaโ€, โ€œsapkotaโ€, โ€œmediumโ€} we used the lowercase letters which have the numbers 68, 83, and 65 respectively. However, if we have the permutations of the letters such as โ€œaygarpโ€ for โ€œpragyaโ€ or โ€œsaPkotAโ€ for โ€œsapkotaโ€ or โ€œMediumโ€ for โ€œmediumโ€ โ€” the numbers will the same as 68, 83, and 65. This leads to storing more than one string in one location which ultimately leads to what we call on hashing terms โ€” COLLISION.

Likewise, another problem with hashing is determining the size of a hash table. The size is preferred to be a prime number to prevent collisions.

All of this leads us to two points as follows: -

  • Picking a good hash function

  • Dealing with the collisions

The hash function is supposed to be easy to compute and the one to avoid the maximum collisions.

Consistent Hashing

If you have five servers and one of the servers dies due to some reason, the hashing function wonโ€™t be able to know it by itself. And if you then add a replacement server, it wonโ€™t know about it too. But the mod operator still stays 5 despite one server being removed and the other being added. For solving this problem, we have Consistent Hashing.

The concept of consistent hashing was brought to reduce the problem of hashing. However, it doesnโ€™t eliminate those problems. This may not sound much in small-scale data but for large data โ€” this is very important. The function applies to the requests and the servers to result in output in a set range of values. It reduces the problem of failed servers where the traffic still gets routed and the new server that has no allocations.

Conclusion

Hashing can be an intricate topic when we learn system design. But we can divide each of the terms and mini topics to learn them individually in detail. Consistent hashing is one of the important ones to study and understand so that we understand other system design concepts like load balancing.

GitHub logo pragyaasapkota / System-Design-Concepts

A repo with some system design concepts.

System Design

Systems design is the process of defining elements of a system like modules, architecture, components and their interfaces and data for a system based on the specified requirements.

This is a index for the concepts of system.

If you wish to open these in a new tab, Press CTRL+click

Thank you!!!


Show some โค๏ธ by starring this repository!



I hope this article was helpful to you.

Please donโ€™t forget to follow me!!!

Any kind of feedback or comment is welcome!!!

Thank you for your time and support!!!!

Keep Reading!! Keep Learning!!!

hashing Article's
30 articles in total
Favicon
What is Hashing? A Complete Guide for Developers and Security Professionals
Favicon
The Evolution of Hashing Algorithms: From MD5 to Modern Day
Favicon
Pattern 9: Hashing
Favicon
Lithe Hash: Um Mรณdulo Robusto para Hashing Seguro de Senhas
Favicon
The Bcrypt Algorithm for Secure Password Hashing
Favicon
Sorting Algorithms That Use Hash Tables
Favicon
๐–ค๐—‡๐–ผ๐—‹๐—’๐—‰๐—๐—‚๐—ˆ๐—‡ ๐–บ๐—‡๐–ฝ ๐–ง๐–บ๐—Œ๐—๐—‚๐—‡๐—€: ๐–ง๐—ˆ๐— ๐–ณ๐—๐–พ๐—’ ๐–ฏ๐—‹๐—ˆ๐—๐–พ๐–ผ๐— ๐–ธ๐—ˆ๐—Ž๐—‹ ๐–ฃ๐–บ๐—๐–บ ๐–ฃ๐—‚๐–ฟ๐–ฟ๐–พ๐—‹๐–พ๐—‡๐—๐—…๐—’
Favicon
Two Unconventional Ways to store Passwords: Honeywords & Rock Salt
Favicon
Difference Between Encryption and Hashing ๐Ÿ”๐Ÿ”‘
Favicon
Basic File Integrity Monitoring System
Favicon
Lithe Hash: A Robust Module for Secure Password Hashing
Favicon
Comparative Analysis of Password Hashing Algorithms: Argon2, bcrypt, scrypt, and PBKDF2
Favicon
Cryptojs vs. Bcryptjs: Which password hashing method should you trust?
Favicon
Distributed Hash Generation Algorithm in Python โ€” Twitter Snowflake Approach
Favicon
Understanding Bcrypt Rounds: Balancing Security and Performance
Favicon
Wednesday Links - Edition 2023-05-03 ๐Ÿ‡ต๐Ÿ‡ฑ๐Ÿ“œ
Favicon
Understanding Hashing Algorithms: A Beginner's Guide
Favicon
Hashing in Blockchain Technology: How it Ensures Data Integrity
Favicon
Using Hash Codes as Unique Identifiers
Favicon
What is Password Hashing Algorithm?
Favicon
Hashing
Favicon
๐Ÿ”How to encrypt variables in NodeJS
Favicon
Lava lamps securing the Web?๐Ÿคทโ€โ™‚๏ธ
Favicon
What is the difference between encryption, hashing and salting?
Favicon
Security in The Blockchain
Favicon
Hashing Algorithms and creating a simple file integrity monitor (FIM)
Favicon
Hashing, Explain it to me like I'm 5.
Favicon
#Hashing in Java
Favicon
How to hash a password in Go
Favicon
Sony FAIL : why it is absolutely necessary to encrypt passwords

Featured ones: