
I've already done some research on this topic, but it still isn't quite clear to me.

According to this post, Patricia trees are radix trees with $r = 2$. Every Patricia tree I've seen so far is used to store alphanumeric symbols. Assuming that 8 bits are used to represent one symbol, that $r$ is the number of bits compared, and that the radix of a radix tree is $2^r$, shouldn't the radix of a Patricia tree be $2^8 = 256$?

Consider a Patricia tree built from the following inputs (in this order):

  • smile
  • smiled
  • smiles

According to the linked post, the Patricia tree should then look like this:


smile - - - smiled - - - smiles

But intuitively (going by the alphanumeric representation), it should look like this:

       d
      /
smile 
      \
       s
Ian Fako
  • Not necessarily. If you know that the data is alphanumeric, you can use a smaller radix. – Yuval Filmus Dec 06 '17 at 15:31
  • I understand the referenced post differently. A symbol may have 8 bits, but in Patricia trees you can split at any one of these bits, thus each split still has only at most two children. If you store all 256 permutations of a single symbol, your tree is 8 levels deep and 128 nodes wide. – TilmannZ Dec 07 '17 at 20:25
  • @YuvalFilmus if you have another look at the post, there is an example of a patricia tree. This tree is correct for the binary representation but not for the raw alphanumeric. Any ideas how to explain this? – Ian Fako Dec 08 '17 at 11:06
  • A Patricia tree has radix 2. It treats its input as a stream of bits. – Yuval Filmus Dec 08 '17 at 11:09
  • @YuvalFilmus I updated my post to explain what I don't understand – Ian Fako Dec 08 '17 at 11:25
  • A Patricia tree considers a binary encoding of its contents, and uses it to construct a radix 2 tree. As far as it is concerned, there is no such thing as a letter - everything is a bit. – Yuval Filmus Dec 08 '17 at 11:39

1 Answer


Quoting Wikipedia:

PATRICIA trees are radix trees with radix equals 2, which means that each bit of the key is compared individually and each node is a two-way (i.e., left versus right) branch.

According to Wikipedia, Patricia is just a "brand name" coined by Donald R. Morrison, standing for Practical Algorithm To Retrieve Information Coded In Alphanumeric.

Patricia trees thus accept their input in the form of binary strings. If the input is a word, then it is first encoded in binary.
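To connect this with the question's notation, here is a short worked calculation. It assumes, as the question does, that $r$ denotes the number of bits compared at each branching step, so that the radix is $2^r$. A Patricia tree compares a single bit per step:

$$
r = 1 \quad\Longrightarrow\quad \text{radix} = 2^{r} = 2^{1} = 2 .
$$

Branching on a whole 8-bit symbol at once would instead give

$$
r = 8 \quad\Longrightarrow\quad \text{radix} = 2^{8} = 256 ,
$$

which describes a byte-wise radix tree rather than a Patricia tree. An 8-bit symbol is not a single branching step in a Patricia tree; it spans up to 8 bit positions at which a branch may occur.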

As for the smile/smiled/smiles example, it is explained in the linked post – each of these words is encoded as a binary string, and this is what the Patricia tree sees. It is completely unaware of the semantic interpretation of the bitstring as an alphanumeric word.
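The following is a minimal sketch of what "the Patricia tree sees" for these three words. It assumes 8-bit ASCII encoding (as in the question); the helper names `to_bits` and `first_differing_bit` are made up for illustration and are not taken from the linked post or from any library. It does not build a tree; it only shows the bit strings and the bit position at which two keys first diverge, which is where a radix-2 tree would branch.

```python
def to_bits(word: str) -> str:
    """Encode an ASCII word as a string of '0'/'1' characters, 8 bits per symbol.

    Assumption: 8-bit ASCII, most significant bit first.
    """
    return "".join(format(byte, "08b") for byte in word.encode("ascii"))


def first_differing_bit(a: str, b: str) -> int:
    """Return the index of the first bit at which the encodings of a and b differ.

    If one encoding is a prefix of the other, return the length of the
    shorter encoding (the first position where only one key still has bits).
    """
    bits_a, bits_b = to_bits(a), to_bits(b)
    for i, (x, y) in enumerate(zip(bits_a, bits_b)):
        if x != y:
            return i
    return min(len(bits_a), len(bits_b))


for word in ("smile", "smiled", "smiles"):
    print(f"{word:7s}{to_bits(word)}")

# 'd' is 01100100 and 's' is 01110011: they share the bits 011 and differ
# at the fourth bit of the sixth symbol, i.e. bit index 5 * 8 + 3 = 43.
print(first_differing_bit("smiled", "smiles"))  # 43
print(first_differing_bit("smile", "smiled"))   # 40 (smile is a prefix)
```

Note that "smiled" and "smiles" diverge in the middle of a symbol, not at a symbol boundary, which is why the tree drawn over bits does not match the intuitive picture drawn over letters. How a key that is a proper prefix of another key is handled (for example with a terminator or an explicit length) depends on the implementation and is not covered by this sketch.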

Yuval Filmus