Distinct Characters
May 9, 2017
We’ve done similar exercises in the past:
Write a program to determine if all the characters in a string are distinct. For instance, the word “Programming” has two m and two g, so all the characters are not distinct, but the word “Praxis” has six distinct characters.
Your task is to write a program to determine if all the characters in a string are distinct; you should provide three solutions, one that takes time O(n²), one that takes time O(n log n), and one that takes time O(n). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Advertisements
Pages: 1 2
In Python.
Added an additional method to Paul’s solution in Python. Uses pruning, combined with dictionary add/lookup is O(1) average. Has better performance, depending on distribution of frequently occuring characters amongst all different words (i.e. character ‘e’ reoccurs frequently in English).
Rutger: I expect the difference between the set object and your emulation of it is just noise.
John: order_n_prune is rougly 10-20% faster than order_n for the words in /usr/share/dict/words (linux). For other datasets this may not hold, i.e. a dataset with words that only have unique characters.
Klong: Not sure how to provide the 3 different solutions, so here’s the quickest one
str::”Programming Praxis”
“Programming Praxis”
str=?str
0
str::”Praxis”
“Praxis”
str=?str
1
:” ?a returns a list containing unique elements from ‘a’ in order of appearance.”
[sourcecode lang="css"]
Not sure how to do the 3 solutions, so here’s just one:
DISTCHAR (str) ; Check for distinct characters
n arr,flg,i
s flg=1
f i=1:1:$l(str) d q:’flg
. s char=$e(str,i)
. i $d(arr(char)) s flg=0
. e s arr(char)=””
q flg
MCL> W $$DISTCHAR^DISTCHAR(“Programming Praxis”)
0
MCL> W $$DISTCHAR^DISTCHAR(“Praxis”)
1
Just the fastest solution (O[n] time) in C11:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
bool
distinct (char *str);
int
main (int argc, char **argv)
{
char *strings[] = {"Programming", "Praxis", "Cat", "Dog", "Dogg", NULL};
for (register size_t i = 0; strings[i] != NULL; ++i)
{
if (distinct(strings[i]))
printf("'%s' is distinct\n", strings[i]);
else
printf("'%s' is not distinct\n", strings[i]);
}
exit(0);
}
// O(n) solution
bool
distinct (char *str)
{
size_t arr[127] = {0}; // ASCII
const size_t n = strlen(str);
for (register size_t i = 0; i < n; ++i)
if (++arr[str[i]] > 1)
return false;
return true;
}