*Incredible time on Oct 4, 2017*

*Incredible time on Oct 2, 2020*

Skip to content
Featured## Wonderful Times

## An Amazing Mathemusician

## Awesome Swiss Tables

## Founding Our Awesome Kick-ass Company

## Monty Hall Problem

## A Wet Dream You’ll Never Wanna Wake Up From

## Efficient Ways to Sort Strings

## The Bizarreness of Optimization

View Original
View Original
View Original
## How To Implement A Real Emplace Routine

## Innovation

## Happy Happy Birthday

*Incredible time on Oct 4, 2017*

*Incredible time on Oct 2, 2020*

Recently I found a beautiful, funny, intelligent woman, Vi Hart, a YouTuber who creates entertaining, thought-provoking mathematics and music videos that explain mathematical concepts through doodles. It is a delightful and enlightening experience to watch her doodling, drawing those beautiful, adorable mathematical geometric shapes. Besides, she explains those mathematical concepts in a singing tone, which is cheerful and amusing to listen to. I think she did a wonderful job in spreading the idea that mathematics can be, and yes, is fun & beautiful in such exciting and entertaining ways. That is also what I always want to make the general audience feel about.

The basic idea is open addressing, but without any redirections (thus making Swiss tables very memory efficient). Also, Swiss tables store a densely packed array of metadata with each entry consisting of 1 byte (which makes the metadata table easier to fit into CPU cache, as opposed to, say, a metadata of pointers of 8 bytes), sort of like UNIX file system inodes and bit maps (see OSTEP: Chapter 40 File System Implementation). Thus, we can use the metadata to tell if an entry in the table is present or not (by using the single control bit and the 7-bit H2 hash, see the details at https://abseil.io/about/design/swisstables).

And there’s an excellent video about it on youtube.

Finally, let’s see the performance differences on the counting words benchmark.

Today is a special occasion (& might be a historic moment) that we might appreciate a few years later: I, Yichuan Wang, & Robin Qiu just founded our own company. We were roomies in college, and they’re brilliant & awesome guys, and I **firmly believe** in a few years we will create a very **bright** future, with our hands, our passion, our genius minds, and our entrepreneurial spirits. I am just feeling passionate all over my body, my blood, my cells, just like Richard Hendricks. So, work hard, keep moving, and be a badass :) Notes on April 4, 2022.

This problem is actually a little tricky and counterintuitive, but really fun to think about (as a brain teaser).

Here is the problem (I’ll just copy it from Wikipedia, lazy evil being :)

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who

knowswhat’s behind the doors, opens another door, say No. 3, whichhas a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage toswitchyour choice?

At first sight, it appears to be a fifty-fifty chance to win without switching, as there’re only two doors left, and behind one there’s the car. Most people do think so (me too). But there is a piece of important information that people may not notice, the host **knows** what’s behind each door. On your choosing (a 1/3 probability of the car and a 2/3 probability of a goat), he purposely opens a door that has a goat. Actually, when you choose a door, you only have a 1/3 chance to win the car. The car behind one of the other two doors has a 2/3 chance. This won’t change with or without the host telling ya one of the other two has a goat. If the host opens a door with a goat, that means the other door left has a 2/3 chance to have the car. Thank you for that extra 33.3%!!

Mathematically speaking, we denote Ga, Gb the two goats. When choosing one door, say door 1, it has a 1/3 chance that the car is right behind it. But more likely (a 2/3 probability) it’s a goat behind it. It can be Ga or Gb. When it’s Ga, the host opens one of the other two doors behind which Gb is. When it’s Gb, the host opens the door where the one remaining goat Ga is. If now you decide to switch, then you get the car. There’s a 2-in-3 chance of happening (**on choosing Ga or Gb, the car may be behind one of the other 2 doors, thus we have 2 x 2 = 4 cases, divided by the total number of permutations which is P(3, 3) = 6 cases**). But if you were lucky and picked the door with the car (with a 1/3 probability) at the very beginning, and then decide to switch to the other remaining door, you lose. But there’s just a 1-in-3 chance (1 x 2 = 2 divided by P(3, 3) = 6). So, statistically, if you continue to play this game for, like 300 times, you’ll probably win about 200 times and lose only about 100 times. Overall, the odds are in your favor if you do the switch. (I also found a nice picture on Wikipedia, so again, I’ll shamelessly steal it)

But if there’re, say 100 doors, and only behind one of them has a car. And you’ve just picked your lucky number, door No. 13, but still, you are pretty much damn sure about it that the chance you win will be very very slim (1/100). Now if the host opens the other 98 doors, all with goats (why the hell would he do so??), and let you decide if you will switch your choice. Ha, this is a no-brainer, of course, I’ll f*$#ing do it, just switch!

Anyway, the most important thing is the host **has the information (the power you want :)** If he doesn’t know where the car is and randomly opens a door you didn’t pick, then you’ll get a 50% chance of winning. But that’s not how the game plays, he may open a door with the car (Ooops)! Also, it’s different from that first let the host opens a door with a goat and then let you choose one (you just have two choices then), it’s really a 50% win.

Interestingly and historically, this problem was made popular in a letter from reader Craig F. Whitaker, quoted in Marilyn vos Savant‘s “Ask Marilyn” column. There’s a great youtube video about it, see https://www.youtube.com/watch?v=dOQowCeAnRs&t=19s

Recommended watching & reading:

https://www.youtube.com/watch?v=4Lb-6rxZxx0

I just couldn’t believe it, … seemed after I told Ania how desperately, madly I’ve fallen love with her, we gazed into each other’s eyes, and…, and the most amazing, exciting thing that I’ve always dreamed of happened… WOW, she kissed me passionately, it was a big sloppy kiss, a minutes-long tongue kiss, a hell of a kiss & my first kiss (I know it’s lame ;) I could feel her sweet tongue screwing mine, the best experience I could ever dream about from a romantic movie seeing the male protagonist kissing the female protagonist, but even much better. Then I kissed her back, with all the strength I could have, the wonderful exciting feelings can never be spoken…

What’s next was a lot of raw, animal, sexual touches – Yup, we made love. Making love with the one you’ve always dreamed of, the one you’ve known, loved & been crazy about for many years is a sheer thing I could die for.

This is such a beautiful dream and I had a thought & strong (VERY VERY intense) feeling to make love to her for the rest of my life, then I woke up. FUCK! I found a fact that the more you don’t wanna wake up from an amazing dream, the more likely you’ll, and be very soon, ’cause you’re just sooooooooooooo fucking excited ;) Damn!

**Background**: consider the hashtable with closed addressing/separate chaining:

Typically, we can have two different schemes to link through all the nodes. One is to link the last node of the current bucket to the first node of the next non-empty bucket. See the above picture and imagine taht there are links among these buckets. Another is to make all the nodes look exactly the same as the above picture (by which I mean to link the tail nodes with null pointers).

The second approach will have some more work to obtain the successor of the current node. It may need to scan a lot of (empty) buckets to get its successor. Sounds awful, huh? Actually not. By amortized analysis (assume the `max_load_factor==1`

), ideally, we will have each node in a (different) bucket exactly. So it can be efficient.

How about the first one? The bad news is that it can suffer from very terrible insertion overhead. Now imagine you have a very big hashtable, `tab[i]`

and `tab[j]`

are two adjacent (non-empty) buckets. The horrible thing is that you have a very big difference `|i-j|`

, and then an item whose hash code `k`

is between `i`

and `j`

is needed to be inserted there. We will need to scan all the way down from `k`

to `j`

in order to update the links of the three nodes: the last node of bucket `i`

, the newly inserted node, the first node of bucket `j`

(you can think of this process as the insertion of a doubly-linked list). And if we constantly perform this pathological operation, `j-1, i+1, j-2, i+2,...`

, things can get really bad, bad, bad, bad.

**The Problem**: Now if I ask you a simple question, after building a hashtable, we iterate through all the elements, which do you think is faster, the all-linked-up one (1st case) or the only-in-bucket-linked one (2nd case)?

Here is a piece of test code and the whole code can be found here.

int main() { #if defined(USE_HASHMAP) mySymbolTable::HashMap<int, int> mp{}; #elif defined(USE_HASHMAP2) mySymbolTable::alternative::HashMap<int, int> mp{}; #else std::unordered_map<int, int> mp{}; #endif constexpr int N = 1'000'000; int k = 0; for(int i=0; i < N; ++i) mp[N-i] = i; auto t1 = std::clock(); for(auto it=mp.begin(); it!=mp.end(); ++it) ++k; auto t2= std::clock(); auto iteration_time_elapsed = (t2 - t1) / (double)1'000; std::cout << "Iterating through " << k << " items used " << iteration_time_elapsed << "ms\n"; return 0; }

Essentially, they run the following code respectively

_self& operator++() { _ptr = _ptr->_next; return *this; }

vs

_self& operator++() { auto next = _table->next_entry(_ptr, _bucket_index); _ptr = next.first; _bucket_index = next.second; return *this; } std::pair<node_ptr, size_t> next_entry(node_ptr x, size_t bucket_index) const { assert(x != nullptr && "cannot increment end() iterator"); if (x->_next) return { x->_next, bucket_index }; size_t i = bucket_index + 1; for (; i < _hashtable.size(); ++i) { if (_hashtable[i]) return { _hashtable[i], i }; } return { nullptr, i }; }

So, which one? (I think it should the first one, any rational human beings would make this choice, I suppose. Don’t believe it??)

Now have a look at their assembly code

The bizarre thing is that in debug mode (`-g`

flag), the 1st case beats the 2nd by about one fifth (`t1 = 82% * t2`

). While if optimization is enabled (`-O3`

flag), it’s a whole different story. The summary table is given below.

Here, `_d`

suffix means debug version, `*1`

suffix means `myst::HashMap`

(1st case), `*2`

suffix means `myst::alternative::HashMap`

(2nd case), and no suffix suggests we are using `std::unordered_map`

.

We can see from the results that optimization `-O3`

barely boosts the performance of the 1st case (`std::unordered_map`

and `myst::HashMap`

) except the `test_int`

cases, but rather make a significant boost in the 2nd case (`myst::alternative::HashMap`

). Given the 2nd is a little bit more complicated, it is therefore expected there will be more space for the 2nd case to optimize, for example, inlining. Still, I don’t think it can beat the 1st small code. How do we explain this weird phenomenon?

(In my understanding, small code means fast, think about cache, and technically speaking, 2nd case should never be faster than the 1st one.)

Above is a piece of code that attempts to implement the emplace family routines. One can insert something like `const value_type& (value_type&&) val`

or variadic arguments `Args&&... args`

, both of which then can be handled by the `insert_leaf_at`

subroutine

Of course, this simple implementation is flawed as it requires an extra move operation. In order to obtain the key from `Args&&... args`

, we need to somehow get the value itself which can be a value (set) or a pair of <key, val> values (map). This is done by using the compile-time template bool constant/parameter `IsMap`

. If we must construct the actual object within the method `insert_at_leaf`

, then, well, an extra move operation is inevitable.

The solution to the question is we construct the node in-place, thus the name `empalce`

;) We first construct the node in the heap using the variadic arguments `node_ptr z = new_node(/*parent=*/nullptr, std::forward<Args>(args)...);`

, then `get_key`

from the node by reference. And what is left is we walk through the tree and find the right position to insert the node and reset the parent link. TadaðŸŽ‰, there you are. This makes even more sense if we look at the C++ specification where `node_type`

and `node_handle`

were introduced in C++17. Their presence makes implementing `emplace`

more easily and naturally.

But the C++ folks are soooo ambitious and if the original one has so many combined benefits (simplicity, flexibility, versatility, etc.) and actually works pretty well (given the overhead of an extra move operation is negligible), we can just keep the way it is :)

Ref: A C++17 compliant AVL map container.

So, happy happy birthday to a marvelous friend, Ania;)