Skip to content

Word Frequency

This data structure should store words, its frequencies and eventually some other information.

Description

Alternatives

  1. unordered_map< string,int >
  2. list< pair< string,int >>
  3. trie

Experiments

  1. Building time: How long to insert all words in the document.
  2. Traversal time: How long to visit every word in the collection.
  3. Query time: How long to find a word in the collection.

Input

  1. jekyll.txt: The strange case of Dr. Jekyll and Mr. Hyde.
  2. jekyll100.txt: 100x jekyll.txt
  3. random-english.txt: text formed from dictionary with 5K english words
  4. random.txt: text formed from random space-separated strings.

Experiment: jekyll-x100.txt

Scenarios/Test Cases query traversal building
trie 0s 417ms 0s 0ms 1s 281ms
list 0s 418ms 0s 0ms 4s 331ms
unordered_map 0s 419ms 0s 0ms 0s 975ms

Experiment: jekyll.txt

Scenarios/Test Cases query traversal building
trie 0s 415ms 0s 0ms 0s 16ms
list 0s 415ms 0s 0ms 0s 21ms
unordered_map 0s 413ms 0s 0ms 0s 13ms

Memory consumption

trie list unordered_map
1x 6x 6x