dev-resources.site
for different kinds of informations.
Indexable Containers in Haskell
While learning Haskell, you will come across List
, the most popular indexable container available in this programming language. However, as you dive deeper into Haskell, List
's limitations, e.g. slow indexing or immutability, will become more and more pervasive. Although we cannot fix List
's flaws, we can use alternative modules to achieve better performance and/or functionality. In this post, we will discuss List
and two modules you can use instead, Data.Array
and Data.Vector
, aiming to collect essential knowledge on them in one place and differentiate between them.
List
List
is one of the most basic data structures in every programming language and, if Haskell isn't your first programming language, you might already be familiar with it. Yet, Haskell's built-in List
is rather special for three reasons:
- Firstly, it can contain exclusively same-type elements (contrary to, e.g., lists in Python).
- Secondly, it has performance characteristics of linked lists.
- Lastly, as everything in Haskell, it is immutable.
> let longList = [1..50000] -- A List of 50,000 elements
> let elemAt50000 = longList !! 50000 -- Indexing into the last, 50,000th element (the operation is slow!)
> elemAt50000
50000
Listing 1: Creating a List
containing 50,000 elements and indexing into the last element
Indexing into List
can be slow because the program iterates from the first element to find the desired element. Interestingly, List
can generate values as you index into it. This is caused by Haskell's being lazy and not evaluating elements until you try to read the value.
> let infiniteList = [1..] -- Creating an infinite List (you can do this because Haskell is a lazy programming language)
> let thisEvaluatesMillionElements = longList !! 1000000
> thisEvaluatesMillionElements
1000000
Listing 2: Creating an infinite List
containing 50,000 elements and indexing into the last element
List
is convenient to use because of its simplicity but performing certain operations on it is much more complicated than it would be for an Array
or a Vector
, e.g. copying. Hence, compared to the other two, List
is slow and best suited for simpler purposes. You can use List
if you don't need immediate access to each element, want to iterate from the first element, to avoid any additional dependencies, or to iterate on an infinite number of elements.
Array
In contrast to List
, Array
is a mutable data structure (except immutable IArray
) that allows immediate access to its elements. Its main advantage is the amount of control it gives you over indexing. For example, you can index into Array
by Int
, Int32
, Int64
, and even Char
. Furthermore, it supports lower and upper bounds, meaning you can assign any position to its first element. Hence, the first element can hold the 5th position and to access it you would have to index into 5. Another advantage Array
has over List
is StorableArray
, a type that can be used with C functions and is available in the Array
module.
> import Data.Array (Array, listArray, array, (!!))
> let arr = listArray (5, 10) "abcde" :: Array Int Char -- Creating an Array of Chars, indexed by Ints
> arr ! 6
'b'
> let charArr = array ('a', 'c') [('a', "I'm at index a"), ('b', "I'm at index b")] -- Creating an Array of Strings, indexed by Chars
> charArr ! 'b'
"I'm at index b"
Listing 3: Creating and reading Array
s with various index ranges and types
However, Array
's advantages also lead to its disadvantages. Because Array
grants you control over indexing, you cannot simply create an Array
, you have to define its indexing as well. It is reasonable, then, that if you're working with an already existing Array
, you need to always keep in mind which indexing types it implements.
> arr ! (4 :: Int)
*** Exception: Ix{Integer}.index: Index (4) out of range ((5,10))
> charArr ! (1 :: Int)
<interactive>: error:
• Couldn't match expected type ‘Char’ with actual type ‘Int’
Listing 4: Examples of incorrect indexing into an Array
Combining mutability and O(1) indexability, Array
is a powerful container and can serve complex purposes. It is also built into Haskell and recommended if you consider additional dependencies less than desirable.
If you want to learn more about Array
, this tutorial on MMHaskell might be helpful.
Vector
As a data structure, Vector
combines the best qualities of List
and Array
: a simple API and O(1) access to elements. Similarly to List
, Vector
is immutable (except for MVector
which you can read on here) and can be indexed only by Int
.
> import Data.Vector (fromList)
> let vec = fromList [1, 2, 3]
> vec
[1,2,3]
Listing 5: Creating a Vector
from a List
However, Vector
has features that distinguish it from List
and Array
:
- it is growable, meaning that you can append elements to a
Vector
; - it supports loop fusion, which increases its efficiency (Mark Karpov has explained this process well in his blog).
> import Data.Vector.Mutable (new, write, read, grow)
> mvec <- new 5 :: IO (MVector RealWorld Char) -- Creating a Vector of Chars
> write mvec 6 'x'
*** Exception: index out of bounds (6,5)
> grown <- grow mvec 30 -- Growing the Vector by 30 elements
> Data.Vector.Mutable.read grown 6
'x'
Listing 6: Mutating and growing a Vector
Please note that Vector
is not built into Haskell. The package is available on Hackage. Just like Array
, the Vector
module offers a Data.Vector.Storable that is compatible with C functions.
Using Vector
is recommended if you need Array
's O(1) access, don't mind extra dependencies and lack of generic indexing, and want to perform actions too complex for lists or are looking for a mutable container (only when using MVector
interface).
Conclusion
Familiarizing yourself with the containers available in a given programming language is a necessary aspect of the learning process. This post gathers the basic information on all three indexable Haskell containers, each with a different purpose, and possibly useful for your project. The table below summarizes the most important features of containers discussed in this post:
Feature | List | Array | Vector |
---|---|---|---|
Difficulty | low | high | medium |
Built-In | yes | yes | no |
Mutable | no | yes (except IArray ) |
no (except MVector ) |
Indexable by |
Int |
e.g. Int , Word32 , Bool , or Char
|
Int |
Happy coding!
Featured ones: