Hashing Algorithms and creating a simple file integrity monitor (FIM)
The CIA triad
CIA stands for Confidentiality, Integrity and Availability. These are the three pillars of every security infrastructure, and they represent the goals that security experts work to uphold in their company. Here's what each one means in simple terms:
Confidentiality is keeping data private and hidden from people who are not supposed to see it. A simple example is the data exchanged between a client and a server in an online store (passwords, credit card information, preferences ...).
Integrity is maintaining the consistency and trustworthiness of data: making sure it doesn't change when it's not supposed to and, if it does, that the user knows about it. This is what we will cover in this tutorial. We will build a simple FIM (File Integrity Monitor) that uses a hashing algorithm to keep tabs on changes (writes) made to a set of files and raises a warning when such changes happen, so that the user can take the necessary precautions.
Availability is ensuring that systems remain online and available for those who need them.
Hashing Algorithms
A hashing algorithm, or cryptographic hash function, is an algorithm that takes an arbitrary amount of input data and produces a fixed-size output called a hash value, or just "hash." In the most basic cases, that hash can be stored instead of a password and later used to verify the user. A good hash function has the following properties:
- Non-reversibility. It is very hard to recover the original input (for example, a password) from its hash.
- Diffusion. The slightest change to the input produces an entirely different output, which makes it harder to infer anything about the input from the hash.
- Determinism. A given input must always produce the same hash value.
- Collision resistance. It should be hard to find two different inputs that hash to the same value.
- Non-predictability. The hash value should not be predictable from the input.
There are many hashing algorithms. In this post we will use the SHA-256 hash function (the sha256sum command), which is still approved as a secure algorithm.
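To make determinism and diffusion concrete, here is a minimal sketch using sha256sum. The helper name hash_of is made up for this illustration and is not part of the FIM script.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the FIM script):
# print the SHA-256 hex digest of a string.
hash_of() {
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

h1=$(hash_of "hello world")
h2=$(hash_of "hello world")   # same input as h1
h3=$(hash_of "hello World")   # a single letter changed

echo "h1: $h1"
echo "h2: $h2"
echo "h3: $h3"

[ "$h1" = "$h2" ] && echo "Determinism: identical inputs give identical hashes"
[ "$h1" != "$h3" ] && echo "Diffusion: a one-letter change gives a different hash"
echo "Digest length: ${#h1} hex characters, whatever the input size"
```

Comparing h1 and h3 by eye shows that the two digests have nothing obvious in common, even though the inputs differ by one character.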
FIM (File Integrity Monitor)
File Integrity Monitoring (FIM) is a security practice that consists of verifying the integrity of operating system and application software files by comparing them to a trusted "baseline" in order to determine whether tampering or fraud has occurred. This is mainly done with hashing algorithms.
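Before building anything by hand, it may help to see the baseline idea in its smallest form. The sketch below relies on sha256sum's own check mode rather than the script developed in this post. The file names are made up, and everything happens in a temporary directory.

```bash
#!/usr/bin/env bash
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
echo "original content" > "$workdir/report.txt"

# Record the trusted baseline in sha256sum's own "hash  path" format.
sha256sum "$workdir/report.txt" > "$workdir/baseline.sha256"

# --check re-hashes every listed file; --status prints nothing and
# reports the result through the exit code only.
if sha256sum --check --status "$workdir/baseline.sha256"; then
    before="unchanged"
else
    before="modified"
fi

echo "tampered content" > "$workdir/report.txt"

if sha256sum --check --status "$workdir/baseline.sha256"; then
    after="unchanged"
else
    after="modified"
fi

echo "before tampering: $before"
echo "after tampering:  $after"
rm -rf "$workdir"
```

This only detects changed or missing files that are listed in the baseline. Spotting newly created files needs the extra bookkeeping that the script in this post adds.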
Coding our basic FIM
In our application, the input will be each file in the directory that we would like to monitor for changes, and the output will be that file's hash, its digital thumbprint. The hashes are stored in a file so that they can later be compared to a newly calculated hash. If the two are equal, no changes have been made to the file; otherwise, the file has been changed. We will also cover the cases where a file is deleted or a new file is created.
Here's a chart to help you understand the functioning of the scripts we are about to see
Now for the code. This step-by-step guide uses Bash (the Bourne Again SHell), a widely used shell and scripting language for automating tasks, but you can also find the Python and PowerShell versions on the GitHub page.
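#User input
#NOTE: $monitoring_dir is used below but never set in this walkthrough;
#as an assumption, it is taken here from the first argument (default: current directory)
monitoring_dir="${1:-.}"
echo -ne "would you like to\n 1) Collect a new .baseline\nOr\n 2) Proceed with the previously recorded one\n [ 1 | 2 ] ? "
read -r ans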
This gets the user's input. Easy enough, right?
function calculate_file_hash(){
    #quoting "$1" keeps paths that contain spaces in one piece
    filehash=$(sha256sum "$1" | cut -d ' ' -f 1)
    filepath=$1
    path_and_hash="$filepath|$filehash"
    echo "$path_and_hash"
}
Here we created a function that calculates the hash of the file whose path is passed as its argument and prints the result as a filepath|filehash pair.
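As a quick check of the record format, the sketch below calls a copy of the function on a temporary file and splits the result back into its two fields. The sample file and the recorded_path and recorded_hash variables are only for illustration.

```bash
#!/usr/bin/env bash
# Copy of the function described above, so this sketch runs on its own.
calculate_file_hash() {
    local filehash
    filehash=$(sha256sum "$1" | cut -d ' ' -f 1)
    echo "$1|$filehash"
}

sample=$(mktemp)              # hypothetical sample file
echo "some data" > "$sample"

record=$(calculate_file_hash "$sample")
echo "record: $record"

# Split the record on the first '|': path before it, hash after it.
IFS='|' read -r recorded_path recorded_hash <<< "$record"
echo "path: $recorded_path"
echo "hash: $recorded_hash"

rm -f "$sample"
```

Because the record is split on '|', this format assumes that monitored file names do not contain that character.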
First scenario: collecting the baseline
if [ "$ans" = "1" ]; then
    #start from an empty .baseline.txt, discarding any previous one
    #hidden file starts with a . (in linux based systems)
    : > .baseline.txt
    #filling in the .baseline.txt file with filepath|filehash pairs
    for entry in "$monitoring_dir"/*
    do
        [ -f "$entry" ] || continue #skip directories, which sha256sum cannot hash
        res=$(calculate_file_hash "$entry")
        echo "$res" >> .baseline.txt
    done
In this part, the user has decided to collect a new baseline. The old one is discarded if it exists, and the file_path|file_hash pairs produced by the calculate_file_hash function are stored in the newly created .baseline.txt file.
else
    declare -A path_hash_dict
    #creating a dictionary with filepath as key and filehash as value
    #reading line by line keeps paths that contain spaces intact
    while IFS='|' read -r path hash
    do
        path_hash_dict["$path"]=$hash
    done < .baseline.txt
In the second scenario, the user wants to start monitoring the files. First we create a dictionary (a Bash associative array, available since Bash 4) where each key is a file path and its value is that file's hash. This gives us easy access to the data stored in the .baseline.txt file.
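Here is a self-contained sketch of that loading step, for readers who have not used associative arrays before. The baseline content is made up: the paths are hypothetical and the hashes are short placeholders, not real digests.

```bash
#!/usr/bin/env bash
# Hypothetical baseline; the "hashes" are placeholders, not real digests.
baseline=$(mktemp)
cat > "$baseline" <<'EOF'
./docs/notes.txt|aaaa1111
./docs/my report.txt|bbbb2222
EOF

declare -A path_hash_dict
# Reading line by line keeps the path with a space in one piece.
while IFS='|' read -r path hash; do
    path_hash_dict["$path"]=$hash
done < "$baseline"

echo "entries loaded: ${#path_hash_dict[@]}"

key="./docs/my report.txt"
echo "hash for '$key': ${path_hash_dict[$key]}"

# ${array[key]+x} expands to "x" only when the key exists.
if [[ -n "${path_hash_dict[$key]+x}" ]]; then
    known="yes"
else
    known="no"
fi
echo "is '$key' in the baseline? $known"

rm -f "$baseline"
```

Looking up a path is then a single expression, which is what makes the monitoring loop short.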
    while true
    do
        sleep 1
        #checking if a file has been deleted
        for key in "${!path_hash_dict[@]}"; do
            if [ ! -f "$key" ]; then
                echo -e "A file has been REMOVED ! FILE NAME : $key"
            fi
        done
        #checking if a file has been created or changed
        for file in "$monitoring_dir"/*
        do
            [ -f "$file" ] || continue #skip directories
            hash=$(sha256sum "$file" | cut -d ' ' -f 1)
            if [[ -z "${path_hash_dict[$file]+set}" ]]; then
                #the path is not a key in the baseline dictionary
                echo -e "A file has been CREATED ! FILE NAME : $file"
            elif [ "$hash" != "${path_hash_dict[$file]}" ]; then
                echo -e "A file has been CHANGED ! FILE NAME : $file"
                ls -la "$file"
            fi
        done
    done
fi
Let the monitoring start! In this infinite while loop, if a key in our dictionary doesn't correspond to a file in the monitored directory, that file has been deleted.
If a file's path is not among the keys in our dictionary, a new file has been created in the monitored directory.
Lastly, we calculate the hash of each file and compare it to the hash stored in the dictionary. If they differ, the file has been modified.
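To see all three checks fire without waiting on an infinite loop, here is a minimal sketch that runs a single monitoring pass over a temporary directory. The function name scan_once, the demo files and the shortened messages are assumptions made for this illustration; they are not taken from the script on GitHub.

```bash
#!/usr/bin/env bash
# Hypothetical demo directory with two files.
dir=$(mktemp -d)
echo "alpha" > "$dir/a.txt"
echo "beta"  > "$dir/b.txt"

# Build the baseline in memory: path -> hash.
declare -A baseline
for f in "$dir"/*; do
    [ -f "$f" ] || continue
    baseline["$f"]=$(sha256sum "$f" | cut -d ' ' -f 1)
done

# One monitoring pass; prints one line per detected event.
scan_once() {
    local f current
    # A baseline path that no longer exists was removed.
    for f in "${!baseline[@]}"; do
        [ -f "$f" ] || echo "REMOVED $f"
    done
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        current=$(sha256sum "$f" | cut -d ' ' -f 1)
        if [[ -z "${baseline[$f]+x}" ]]; then
            echo "CREATED $f"      # path unknown to the baseline
        elif [ "$current" != "${baseline[$f]}" ]; then
            echo "CHANGED $f"      # known path, different hash
        fi
    done
}

quiet=$(scan_once)                 # nothing has happened yet

echo "edited" >> "$dir/a.txt"      # modify a file
rm "$dir/b.txt"                    # delete a file
echo "new" > "$dir/c.txt"          # create a file

events=$(scan_once)
echo "$events"
rm -rf "$dir"
```

The first pass reports nothing, and the second reports one event per change. Wrapping scan_once in a loop with sleep 1 gives the continuous behaviour described above.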
You can find a more complete version of this script on GitHub, along with the Python and PowerShell versions.
Credit where credit's due,
- This post was inspired by Josh Madakor's YouTube video. Check out his YouTube channel for cybersecurity-related content.
- Some lines from this article about cryptographic hash functions