This year for Advent of Code, I decided not just to use C++, but to write completely over-engineered, modern C++. What's the fun in writing the most basic, boring code just to get the correct answer?
Part 1
Part one gives us a rather long list of paired numbers in the form:
38450 56790
94765 36795
89694 26251
96083 99006
...
It asks us to sort both lists and add up the distances between the corresponding numbers in each list.
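To make that concrete with the small example lists from the official puzzle (the same ones shown again in Part 2 below): sorting the left list gives 1 2 3 3 3 4 and the right list gives 3 3 3 4 5 9, so the pairwise distances are 2, 1, 0, 1, 2 and 5, which sum to 11.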
Since C++ still doesn't have networking in the standard library, I opted to save the input into a file in my project directory. Thus, the first step is reading and interpreting the data.
Reading
inline std::expected<std::ifstream, aoc::exceptions::AocException> Day01::openFile(
        const std::filesystem::path &path) noexcept {
    if (!exists(path)) {
        return std::unexpected(aoc::exceptions::FileOpenError(path.string()));
    }
    std::ifstream f(path);
    if (!f.is_open()) {
        return std::unexpected(aoc::exceptions::FileOpenError(path.string()));
    }
    return f;
}
The first thing to look at is std::expected. This is a feature added in C++23 that provides a clean way to handle errors without throwing, and it forces the caller to handle the error. We will see how that works later.
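If you haven't seen std::expected before, here is a minimal standalone sketch (separate from the actual solution, with made-up names) of the pattern: a function returns either a value or an error, and the caller has to check which one it got before using it.

#include <expected>
#include <print>
#include <string>

// Hypothetical example: parse a sign character, returning either a value or an error message.
std::expected<int, std::string> parseSign(const char c) {
    if (c == '+') return 1;
    if (c == '-') return -1;
    return std::unexpected(std::string("not a sign: ") + c);
}

int main() {
    const auto result = parseSign('?');
    if (!result) {
        // No exceptions involved: the caller checks and handles the error explicitly.
        std::println("error: {}", result.error());
        return 1;
    }
    std::println("sign is {}", *result);
}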
We then open the file with the passed std::filesystem::path, added in C++17, which abstracts away the platform's file system.
Lastly, we check that the file exists and was opened successfully; otherwise we return a std::unexpected to signal that an error happened, which removes the need to throw.
Validating the Data
Now we get our first taste of metaprogramming:
template<Numeric T>
std::expected<Day01::NumberLists<T>, aoc::exceptions::AocException> Day01::readLists(
        const std::filesystem::path &path) noexcept {
    auto stream = openFile(path);
    if (!stream) {
        return std::unexpected(stream.error());
    }
    std::vector<T> listOne;
    std::vector<T> listTwo;
    T num1;
    T num2;
    // Directly read the two numbers into the vectors
    try {
        while (true) {
            if (!(stream.value() >> num1)) {
                // End of file is OK, other failures are errors
                if (!stream.value().eof()) {
                    return std::unexpected(aoc::exceptions::DataFormatError("Stream in invalid state"));
                }
                break;
            }
            // If we got num1 but can't get num2, that's an error
            if (!(stream.value() >> num2)) {
                return std::unexpected(aoc::exceptions::DataFormatError("Stream in invalid state"));
            }
            listOne.push_back(num1);
            listTwo.push_back(num2);
        }
    } catch (const std::bad_expected_access<T>&) {
        return std::unexpected(aoc::exceptions::DataFormatError("Stream in invalid state"));
    } catch (const std::ios_base::failure&) {
        return std::unexpected(aoc::exceptions::DataFormatError("Invalid number format"));
    } catch (const std::exception&) {
        return std::unexpected(aoc::exceptions::DataFormatError("Error reading numbers"));
    }
    // If the vectors are not equal, something went wrong reading the data
    if (listOne.size() != listTwo.size()) {
        return std::unexpected(aoc::exceptions::DataFormatError(
            std::format("vectors are not equal: {} != {}", listOne.size(), listTwo.size())));
    }
    return NumberLists<T>(std::move(listOne), std::move(listTwo));
}
template<Numeric T> lets us restrict T to numeric types while still accepting any kind of number. Numeric uses C++'s concepts, which give us the ability to constrain which types can be passed for T. Here, we check whether it's an integral or floating-point type using std::is_arithmetic_v (C++17):
template<typename T>
concept Numeric = std::is_arithmetic_v<T>;
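As a quick standalone illustration (not part of the solution), the concept can be exercised with static_assert to see which types it accepts:

#include <string>
#include <type_traits>

template<typename T>
concept Numeric = std::is_arithmetic_v<T>;

static_assert(Numeric<int>);          // integral types satisfy the concept
static_assert(Numeric<double>);       // so do floating-point types
static_assert(!Numeric<std::string>); // anything non-arithmetic is rejected at compile time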
NumberLists is a simple struct that simplifies passing the two vectors around. Again, we use Numeric to allow any type of number.
template<Numeric T>
struct NumberLists {
    std::vector<T> left;
    std::vector<T> right;
    [[nodiscard]] constexpr NumberLists(std::vector<T> l, std::vector<T> r) noexcept;
    [[nodiscard]] size_t size() const noexcept;
};
In the function body we validate that we actually received values from the file and add them to their respective lists. If there was an error, we return another std::unexpected for the caller to handle. And since we know the lists should be the same size, we also perform that check as a final validation.
Finally, we return the lists wrapped up in our NumberLists, using std::move to transfer ownership.
The use of std::move here is an often overlooked optimization for newcomers in C++. When we read data into listOne and listTwo, we create two potentially large vectors containing our numbers. Without std::move, constructing NumberLists would make copies of these vectors, meaning all the numbers would be copied into new memory locations. By using std::move, we instead transfer ownership of the existing vectors directly into the NumberLists structure - essentially just updating some pointers rather than copying all the data.
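To make the difference concrete, here is a tiny standalone example (separate from the solution) comparing a copy with a move:

#include <print>
#include <utility>
#include <vector>

int main() {
    std::vector<int> source(1'000'000, 42);

    const std::vector<int> copied = source;            // copies all one million elements
    const std::vector<int> moved  = std::move(source); // just takes over the existing buffer

    // After the move, source is left in a valid but unspecified (typically empty) state.
    std::println("copied: {}, moved: {}, source: {}",
                 copied.size(), moved.size(), source.size());
}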
Getting the Answer
Now we get to the actual data processing:
inline void Day01::partOne() {
    auto lists = readLists<int64_t>(INPUT_FILE);
    if (!lists) {
        std::println("Error reading lists: {}", lists.error().what());
        return;
    }
    // Simple ordered sorting
    std::ranges::sort(lists->left);
    std::ranges::sort(lists->right);
    auto total = calculateWithLists<int64_t>(lists->left, lists->right,
        [](const int64_t a, const int64_t b) { return std::abs(a - b); });
    std::println("Total is {}", total);
}
First, we use the previous functions to get our lists. We then check whether we received a std::unexpected and handle it accordingly.
Next, we use std::ranges::sort (added in C++20) to let the standard library handle the sorting for us. std::ranges::sort is the more modern option compared to std::sort, as it has stricter compile-time constraints and lets us pass the container directly instead of a pair of iterators.
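For comparison, a small standalone snippet showing both forms side by side:

#include <algorithm>
#include <functional>
#include <vector>

int main() {
    std::vector<int> v{5, 3, 1, 4, 2};

    std::sort(v.begin(), v.end()); // classic form: pass a pair of iterators
    std::ranges::sort(v);          // ranges form: pass the container directly

    // Comparators still work as usual, e.g. sorting in descending order:
    std::ranges::sort(v, std::ranges::greater{});
}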
Now for some more complicated metaprogramming:
template<Numeric T, ListBinaryOperation<T> BinaryOp>
T Day01::calculateWithLists(std::span<const T> left, std::span<const T> right, BinaryOp op) noexcept {
    return std::transform_reduce(
        left.begin(), left.end(), // First range
        right.begin(),            // Second range
        T{0},                     // Initial value
        std::plus(),              // Reduction operation
        op                        // Transform operation (lambda)
    );
}
This function again uses our Numeric concept, along with a new ListBinaryOperation concept:
template<typename F, typename T>
concept ListBinaryOperation = Numeric<T> && requires(F f, T a, T b)
{
    { f(a, b) } -> std::convertible_to<T>;
};
In this concept, we have a callable F and a Numeric T. The concept then requires that the result of calling F on a and b be convertible to our Numeric T. Because of this, we get very strong compile-time constraints and error messages that are much easier to track down.
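To see the constraint in action, here is a small standalone check (using the same concept definitions as above) where one callable is accepted and another is rejected at compile time:

#include <concepts>
#include <functional>
#include <string>
#include <type_traits>

template<typename T>
concept Numeric = std::is_arithmetic_v<T>;

template<typename F, typename T>
concept ListBinaryOperation = Numeric<T> && requires(F f, T a, T b)
{
    { f(a, b) } -> std::convertible_to<T>;
};

// std::plus<int> takes two ints and yields an int, so it satisfies the concept.
static_assert(ListBinaryOperation<std::plus<int>, int>);

// A callable returning std::string is rejected: the result is not convertible to int.
struct Describe {
    std::string operator()(int, int) const { return "nope"; }
};
static_assert(!ListBinaryOperation<Describe, int>);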
Our calculateWithLists function takes a generic std::span<const T>, so it accepts any contiguous list of any Numeric type. We also pass a BinaryOp lambda, which std::transform_reduce will use for its per-element operation.
For the body, we use std::transform_reduce, added in C++17. It neatly encapsulates looping over both lists, performing an operation on each pair, and adding up the results of those operations. This is essentially equivalent to:
T total = 0;
for (size_t i = 0; i < left.size(); i++) {
    total += op(left[i], right[i]);
}
but with the advantages of being more concise and potentially more optimized.
Back in our main part one function, we call calculateWithLists with our lists and the lambda:
[](const int64_t a, const int64_t b) { return std::abs(a - b); }
which performs std::abs(a - b) on each pair of numbers in the lists. The answer then ends up neatly compacted in our variable total.
Lastly, we use the new std::println (added in C++23) with its built-in formatting to display our answer on the console.
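std::println takes the same format strings as std::format, so the usual format specifiers work; a quick standalone example:

#include <print>

int main() {
    std::println("Total is {}", 1666427);                     // prints the arguments and appends a newline
    std::println("Padded: {:>10}", 42);                       // width and alignment specifiers
    std::println("Hex: {:#x}, fixed: {:.2f}", 255, 3.14159);  // base and precision specifiers
}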
And there we have it!
> Total is 1666427
Part 2
In part 2 we use the same data but perform a slightly different computation. For each item in our first list, we need to count how many times that value appears in the second list. We then multiply the number of occurrences by the value we are searching for and add all of those products up into a so-called "similarity score."
Using the official example list:
3 4
4 3
2 5
1 3
3 9
3 3
- The first number in the left list is 3. It appears in the right list three times, so the similarity score increases by 3 * 3 = 9.
- The second number in the left list is 4. It appears in the right list once, so the similarity score increases by 4 * 1 = 4.
- The third number in the left list is 2. It does not appear in the right list, so the similarity score does not increase (2 * 0 = 0).
- The fourth number, 1, also does not appear in the right list.
- The fifth number, 3, appears in the right list three times; the similarity score increases by 9.
- The last number, 3, appears in the right list three times; the similarity score again increases by 9.
So, for these example lists, the similarity score at the end of this process is 31 (9 + 4 + 0 + 0 + 9 + 9).
While part 1 required sorting and comparing corresponding elements, part 2 gives us a different challenge that's better solved using a hash map to track frequencies. This lets us avoid repeatedly scanning the right list for each element in the left list, reducing our time complexity from O(n²) to O(n).
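Conceptually, the two approaches look like this (a simplified standalone sketch with hypothetical function names, not the actual solution code):

#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Naive approach: scan the whole right list once per left element -> O(n^2) overall.
int64_t similarityNaive(const std::vector<int64_t> &left, const std::vector<int64_t> &right) {
    int64_t score = 0;
    for (const int64_t value: left) {
        score += value * std::ranges::count(right, value);
    }
    return score;
}

// Frequency-map approach: one pass to count, one pass to sum -> O(n) on average.
int64_t similarityWithMap(const std::vector<int64_t> &left, const std::vector<int64_t> &right) {
    std::unordered_map<int64_t, int64_t> frequency;
    for (const int64_t value: right) {
        frequency[value]++;
    }
    int64_t score = 0;
    for (const int64_t value: left) {
        score += value * frequency[value];
    }
    return score;
}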
Reading and Validating
This is the exact same as above, so I won't go over it again.
Getting the Answer
inline void Day01::partTwo() {
    const auto lists = readLists<int64_t>(INPUT_FILE);
    if (!lists) {
        std::println("Error reading lists: {}", lists.error().what());
        return;
    }
    // Create frequency map
    std::unordered_map<int64_t, int64_t> frequency;
    frequency.reserve(lists->size());
    for (const int64_t num: lists->right) {
        frequency[num]++;
    }
    // O(1) lookup
    auto similarityScore =
        calculateWithLists<int64_t>(lists->left, lists->right, [&frequency](const int64_t left, const int64_t) {
            return left * frequency[left];
        });
    std::println("Similarity score is {}", similarityScore);
}
We again start off by getting our lists and handling any errors.
After that, we create a frequency map to keep track of how many times each number appears in our right list. When working with unordered_map, we can use reserve() to pre-allocate the internal bucket structure that the map uses to organize its elements. By calling reserve(), we're telling the map to prepare enough buckets to handle our expected number of elements efficiently. While this won't prevent memory allocation for the elements themselves, it does prevent the more expensive operation of rehashing - where the map needs to redistribute all its elements into a new, larger bucket structure as it grows. In this case, lists->size() gives us an upper bound on the number of unique elements we might insert, though the actual count could be lower if there are duplicates in the right list.
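As a small standalone illustration of what reserve() does to the bucket structure (the exact bucket counts are implementation-defined):

#include <print>
#include <unordered_map>

int main() {
    std::unordered_map<int, int> counts;
    std::println("buckets before reserve: {}", counts.bucket_count());

    // Ask for room for 1000 elements up front; the map sizes its bucket array so that
    // inserting up to 1000 elements will not force a rehash.
    counts.reserve(1000);
    std::println("buckets after reserve:  {}", counts.bucket_count());

    for (int i = 0; i < 1000; ++i) {
        counts[i] = i;
    }
    // The bucket count is unchanged: no rehash was needed during the inserts.
    std::println("buckets after inserts:  {}", counts.bucket_count());
}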
Then, we simply iterate through each number in our right list and use it as a key in our frequency map. When we use the increment operator on frequency[num], we're either incrementing an existing count or, if this is the first time we've seen this number, the map automatically creates a new entry initialized to zero before incrementing it. The result is a map where each unique number from our right list is associated with the count of how many times it appears.
This automatic creation and the hash table lookups make the unordered_map particularly efficient for this kind of frequency counting - each operation typically takes constant time, O(1), though this can degrade if we have many hash collisions.
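A tiny standalone example of that operator[] behaviour:

#include <print>
#include <unordered_map>

int main() {
    std::unordered_map<int, int> frequency;

    frequency[7]++; // key 7 did not exist: it is value-initialized to 0, then incremented to 1
    frequency[7]++; // key 7 exists: its count simply becomes 2

    std::println("count for 7: {}", frequency[7]); // 2
    std::println("count for 9: {}", frequency[9]); // 0; note that merely reading via [] also inserts the key
}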
Because we again want the sum of an operation over both lists, we can reuse our calculateWithLists function and pass a new lambda for the operation:
[&frequency](const int64_t left, const int64_t) { // Note: the right value is ignored, as we don't need it here.
    return left * frequency[left];
});
In the lambda, we take the value of the item in the left list and multiply it by the number of times it appears, as stored in our frequency map, which we capture by reference with [&frequency] so we can use it inside the lambda.
All of these products are then summed together and stored in our similarityScore variable.
Finally, we print our value using std::println:
> Similarity score is 24316233
All of the code for my AoC 2024 solutions is hosted on GitHub for easier viewing and testing.
Thanks for reading!