## Sorting an array of strings (with repetitions) according to a given ordering

We get two arrays:

``ordering = ["one", "two", "three"] ``

and

``input = ["zero", "one", "two", "two", "three", "three", "three", "four"]; ``

We want to find the array `output` so that

``output = ["one", "two", "two", "three", "three", "three", "zero", "four"] // or output = ["one", "two", "two", "three", "three", "three", "four", "zero"] ``

The strings (with possible repetitions) should be sorted as in the `ordering` array. Strings that do not appear in `ordering` should be placed at the end of the new array; their relative order doesn’t matter.

The $$n^{2}$$ solution is obvious. Can we do better? Memory use doesn’t matter, and the algorithm doesn’t have to be in-place.
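One possible way to beat the quadratic approach is a counting pass with a hash map. This is a minimal sketch, not a definitive answer: the function name `sort_by_ordering` is made up for illustration, and it assumes the entries of `ordering` are distinct.

```python
from collections import Counter

def sort_by_ordering(items, ordering):
    """Arrange items as in `ordering`; strings not in `ordering` go last."""
    counts = Counter(items)            # one pass over the input
    known = set(ordering)              # assumes `ordering` has no duplicates
    result = []
    for key in ordering:               # emit each known string as often as it occurred
        result.extend([key] * counts[key])
    # Unknown strings are appended in their input order (any order is allowed).
    result.extend(s for s in items if s not in known)
    return result

ordering = ["one", "two", "three"]
data = ["zero", "one", "two", "two", "three", "three", "three", "four"]
print(sort_by_ordering(data, ordering))
# ['one', 'two', 'two', 'three', 'three', 'three', 'zero', 'four']
```

With hashing, this does an expected constant amount of work per element of `items` and `ordering`, at the cost of extra memory, which the question allows.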

## Finding encryption algorithm from known and encrypted password strings [duplicate]

I am working with a piece of software which seems to use some type of lightweight, “home baked” password encryption algorithm.

I know a number of clear text passwords as well as their corresponding encrypted values as they are stored in the database. Does anyone know of a tool or a means to find the underlying algorithm and/or hash type which might be used? I would like to be able to decrypt these and use them in an application of my own.

Examples (clear text -> encrypted):

• ``test123 -> 2404483248``
• ``tb -> 43971963``
• ``ks -> 43912691``
• ``mm -> 43937163``
• ``et -> 43941139``


## Counting strings with balanced substrings

Consider a string over the characters $$a, b, c$$ only. Such a string is called good if the number of $$a$$'s plus the number of $$b$$'s is equal to the number of $$c$$'s.

Given an integer $$n$$, find the number of strings of length $$n$$ consisting only of characters $$a,b,c$$ such that all of its substrings of length $$k$$ are good.

Example:

For $$n = 3, k = 2$$ the answer is $$6$$.

For $$n = 2, k = 1$$ the answer is $$0$$.

I could only solve the case with two characters. Can anyone help me solve it when there are three characters?
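To make the definition concrete, here is a brute-force sketch that enumerates all $$3^n$$ strings and checks every window of length $$k$$. It is only meant to reproduce the two examples for small $$n$$, not to be an efficient solution; the names `is_good` and `count_strings` are illustrative, and it assumes $$1 \le k \le n$$.

```python
from itertools import product

def is_good(s):
    """A string is good if (#a + #b) equals #c."""
    return s.count("a") + s.count("b") == s.count("c")

def count_strings(n, k):
    """Count strings over {a, b, c} of length n whose length-k substrings are all good.

    Brute force over 3**n candidates; assumes 1 <= k <= n.
    """
    total = 0
    for chars in product("abc", repeat=n):
        s = "".join(chars)
        if all(is_good(s[i:i + k]) for i in range(n - k + 1)):
            total += 1
    return total

print(count_strings(3, 2))  # 6
print(count_strings(2, 1))  # 0
```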

## Given a list of strings, find every pair $$(x,y)$$ where $$x$$ is a substring of $$y$$. Possible to do better than $$O(n^2)$$?

Consider the following algorithmic problem: Given a list of strings $$L = [s_1, s_2, \dots, s_n]$$, we want to know all pairs $$(x,y)$$ where $$x$$ is a substring of $$y$$. A trivial algorithm would be this:

``foreach x in L:    foreach y in L:       if x is substring of y:          OUTPUT x,y ``

However, this performs $$O(n^2)$$ tests of the form "is $$x$$ a substring of $$y$$?". Is there a faster algorithm?
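For reference, the trivial algorithm can be written out as a runnable baseline. This is just a sketch of the quadratic approach described above, not a faster one; the function name `substring_pairs` is made up, and skipping the pair of a string with itself is an assumption on my part.

```python
def substring_pairs(strings):
    """Return every pair (x, y) of distinct list entries with x a substring of y."""
    pairs = []
    for i, x in enumerate(strings):
        for j, y in enumerate(strings):
            # Assumption: a string paired with itself is not interesting.
            if i != j and x in y:
                pairs.append((x, y))
    return pairs

print(substring_pairs(["ab", "cab", "b", "xyz"]))
# [('ab', 'cab'), ('b', 'ab'), ('b', 'cab')]
```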

## Splitting strings into groups of similar strings

I would like to split a list of strings into groups of strings that differ by at most one character.

For instance, given:

``[John, Alibaba, Johny, Alidaba, Mary] ``

I would expect three groups:

``[John, Johny], [Alibaba, Alidaba], [Mary] ``

My first thought was to use some clustering algorithm with Levenshtein distance, but that seems like overkill to me.

Is there a better approach?
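One possible approach, sketched under assumptions: instead of computing full Levenshtein distances, test directly whether two strings are within one insertion, deletion, or substitution, and merge matching strings with a union-find structure. The names `within_one_edit` and `group_similar` are illustrative. Note that merging is transitive, so a chain of near-matches can put two strings that differ by more than one character into the same group, and the pairwise loop is still quadratic in the number of strings.

```python
def within_one_edit(s, t):
    """True if s and t differ by at most one insertion, deletion, or substitution."""
    if len(s) > len(t):
        s, t = t, s                    # make s the shorter string
    if len(t) - len(s) > 1:
        return False
    i = 0
    while i < len(s) and s[i] == t[i]:
        i += 1                         # skip the common prefix
    if len(s) == len(t):
        return s[i + 1:] == t[i + 1:]  # at most one substituted character
    return s[i:] == t[i + 1:]          # t has exactly one extra character

def group_similar(words):
    parent = list(range(len(words)))   # union-find over word indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            if within_one_edit(words[i], words[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, word in enumerate(words):
        groups.setdefault(find(i), []).append(word)
    return list(groups.values())

print(group_similar(["John", "Alibaba", "Johny", "Alidaba", "Mary"]))
# [['John', 'Johny'], ['Alibaba', 'Alidaba'], ['Mary']]
```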

## Constructing Generalised Suffix Tree from a large set of strings

Is there a published method for constructing a generalised suffix tree from a large set of strings (~ 500 000) without needing to concatenate them?

I would like to use the resulting suffix tree for a pattern search problem.

## Length of strings accepted by DFA

Problem: Given a DFA $$D$$, find all possible lengths of strings accepted by $$D$$.

It seems plausible that these lengths can be represented as $$a_i + k b_i$$ for integers $$k \ge 0$$. What algorithm would find all such pairs $$(a_i, b_i)$$?
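One way to approach this, sketched under assumptions: ignore the letters and track the set of states reachable by some string of each exact length. That sequence of sets must eventually repeat, which gives a non-repeating prefix followed by a cycle, and each accepted length inside the cycle yields a pair $$(a_i, b_i)$$ with $$b_i$$ equal to the cycle length. The representation below (`delta` as a nested dict, `start`, `accepting`) is hypothetical, a pair with $$b_i = 0$$ stands for a single isolated length, and the number of distinct state sets can grow quickly, so treat this as an illustration for small automata.

```python
def accepted_lengths(delta, start, accepting):
    """Return pairs (a, b): every length a + k*b (k >= 0) is accepted.

    delta: dict state -> dict symbol -> state (a total DFA, hypothetical encoding).
    accepting: set of accepting states.
    A pair (a, 0) denotes the single length a.
    """
    seen = {}       # state set -> first length at which it occurred
    history = []    # history[l] = states reachable by some string of length l
    current = frozenset([start])
    while current not in seen:
        seen[current] = len(history)
        history.append(current)
        current = frozenset(t for q in current for t in delta[q].values())
    prefix = seen[current]             # lengths below this never repeat
    period = len(history) - prefix     # cycle length of the sequence of state sets
    pairs = []
    for length, states in enumerate(history):
        if states & accepting:
            pairs.append((length, 0 if length < prefix else period))
    return pairs

# Strings of even length over {a}.
even = {0: {"a": 1}, 1: {"a": 0}}
print(accepted_lengths(even, 0, {0}))   # [(0, 2)]
```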

## How to facilitate the export of secret strings from an offline system?

I want to use Shamir’s Secret Sharing algorithm to store a randomly generated passphrase securely by spreading the secret shares on paper for example.

The passphrase is generated on an offline system. I am looking for a way to ease the process of “exporting” those secrets which can be quite long (~100 hexadecimal characters).

First I converted the secrets from hexadecimal to base64. That is not bad but not enough.
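For scale, here is a small sketch of that conversion using only the Python standard library. The helper name `hex_to_base64` is made up, and the random value is a stand-in for a real share.

```python
import base64
import secrets

def hex_to_base64(hex_string):
    """Re-encode a hexadecimal string as base64 text."""
    return base64.b64encode(bytes.fromhex(hex_string)).decode("ascii")

hex_share = secrets.token_hex(50)       # 50 random bytes -> 100 hex characters
b64_share = hex_to_base64(hex_share)
print(len(hex_share), len(b64_share))   # 100 68
```

So a 100-character hexadecimal string shrinks to 68 base64 characters, including one padding character.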

Then I tried to compress the strings using different methods but because it is random data it does not compress well (or at all).

Then I thought of printing them as QR codes. That works fine, but the issue comes later when I need to import the secrets back, because I would need a camera.

Is there anything else I could try?

## Regular expression for strings not starting with 10

How can I construct a regular expression for the language over $$\{0,1\}$$ which is the complement of the language represented by the regular expression $$10(0+1)^*$$?
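One way to sanity-check a candidate answer is to compare it against the original expression on all short strings. The sketch below tests the candidate $$\varepsilon + 0(0+1)^* + 1 + 11(0+1)^*$$, which is my own guess (a string avoids the prefix 10 if it is empty, starts with 0, equals 1, or starts with 11); the function name `find_mismatch` is illustrative. A passing check on short strings is evidence, not a proof.

```python
import re
from itertools import product

starts_with_10 = re.compile(r"10[01]*")
# Candidate complement; the empty first alternative stands for epsilon.
candidate = re.compile(r"(?:|0[01]*|1|11[01]*)")

def find_mismatch(max_len):
    """Return a string on which the candidate is not the exact complement, or None."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            in_original = starts_with_10.fullmatch(s) is not None
            in_candidate = candidate.fullmatch(s) is not None
            if in_original == in_candidate:   # must be in exactly one of the two
                return s
    return None

print(find_mismatch(8))  # None
```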

## Regular expressions for set of all strings on alphabet $$\{a, b\}$$

I came across the following regular expressions, each of which equals $$(a+b)^*$$ (the set of all strings on the alphabet $$\{a, b\}$$):

• $$(a^*+bb^*)^*$$
• $$(a^*b+b^*a)^*$$
• $$(a^*bb^*+b^*ab^*)^*(a^*b+b^*a)^*b^*a^*$$

I want to generalise the different ways in which we can extend the original regular expression $$(a+b)^*$$ without changing its meaning, so that it still denotes the set of all strings on the alphabet $$\{a, b\}$$. I think we can do this in two ways:

• P1: We can concatenate anything to $$a$$ and $$b$$ inside brackets of $$(a+b)^*$$
• P2: We can concatenate $$(a+b)^*$$ with any regular expression which has a star at the outermost level ($$(…)^*$$)

• P3: I know $$(a+b)^* = (a^*+b)^* = (a+b^*)^*= (a^*+b^*)^*$$. So I guess P1 and P2 also apply to them.

Am I correct about P1 to P3?

Q. I also know that $$(a+b)^*=(a^*b^*)^*=b^*(a^*b)^*=(ab^*)^*a^*$$. Can we append some pattern of regular expressions to these as well without changing their original meaning?