Shortest common supersequence with Genetic Algorithm

I’m trying to solve the shortest common supersequence problem with a genetic algorithm. I find it a bit hard to reduce the size of the chromosomes from one generation to the next.

I know that the maximum size of a chromosome is the total length of the input strings. But even for a “naive” genetic algorithm (totally random strings of different lengths as the initial population, a mutation that replaces a character, a fitness function that returns how many of the input strings a chromosome contains, etc.), how can crossover reduce the chromosome length? If we choose n-point crossover, a child can never be shorter than the shorter parent.

So how can a genetic algorithm solve problems that call for the shortest possible chromosome? How can crossover reduce the chromosome length?
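For what it’s worth, variable-length GAs typically handle this with a crossover whose cut points are chosen independently in each parent, often called cut-and-splice crossover; because the cut points are independent, a child can end up shorter than both parents. A minimal sketch in Python (the function name is mine):

```python
import random

def cut_and_splice(parent_a, parent_b):
    """Cut-and-splice crossover: each parent gets its own independent
    cut point, so the children's lengths can differ from both parents'."""
    i = random.randint(0, len(parent_a))  # cut point in parent_a
    j = random.randint(0, len(parent_b))  # independent cut point in parent_b
    child1 = parent_a[:i] + parent_b[j:]
    child2 = parent_b[:j] + parent_a[i:]
    return child1, child2
```

Total material is conserved (the two children together have the same length as the two parents), but a small i combined with a large j produces a child shorter than either parent, which is exactly what n-point crossover cannot do.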

Banker’s algorithm – additional allocation

I have a problem with the banker’s algorithm. There are 6 processes and one type of resource. The allocation is (0, 29, 35, 10, 25, 35) and the MAX need is (50, 80, 50, 25, 40, 90). The available resources are equal to D. The first question is: what is the minimum value of D for the state to be safe? I have managed to solve this, and the value is 15. But the next question is: to how many processes can an additional 5 resources be allocated while keeping the state safe (multiple situations)? I am not sure I understand this. Does it mean that I need to add 5 to the allocation array, or to both the allocation array and the MAX need?
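For reference, the safety check for a single resource type is short enough to verify the D = 15 answer directly; a sketch of the safety algorithm in Python:

```python
def is_safe(available, allocation, max_need):
    """Safety algorithm for a single resource type: repeatedly find a
    process whose remaining need fits in the available pool, let it
    finish, and reclaim its allocation."""
    need = [m - a for m, a in zip(max_need, allocation)]
    finished = [False] * len(allocation)
    progress = True
    while progress:
        progress = False
        for i, done in enumerate(finished):
            if not done and need[i] <= available:
                available += allocation[i]  # process i finishes, releases resources
                finished[i] = True
                progress = True
    return all(finished)

allocation = [0, 29, 35, 10, 25, 35]
max_need   = [50, 80, 50, 25, 40, 90]
```

With these arrays, is_safe(15, allocation, max_need) returns True while is_safe(14, allocation, max_need) returns False: the remaining needs are (50, 51, 15, 15, 15, 55), so with only 14 available no process can run to completion.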

Proof of the heuristic function in the A* algorithm

Maybe I am missing something very easy and obvious.

But I don’t understand why the estimated cost of the source vertex is subtracted from the overall estimated cost when the heuristic function $h$ is monotonic: $$d'(x, y) = d(x, y) + h(y) - h(x)$$

What I currently know:

The A* algorithm can be seen as an extension of Dijkstra’s algorithm. At each iteration of its main loop, it chooses the vertex with the minimum of the estimated cost plus the cost of the path to that vertex:

For a vertex $u$ and its successor $v$, the overall cost is calculated as $$f(u, v) = d(u, v) + h(v)$$ using some heuristic function $h$, where:

  • $d(u, v)$ is the cost of the path from $u$ to $v$
  • $h(v)$ is the estimated cost from $v$ to the target vertex $t$

If for any adjacent vertices $u$ and $v$ it holds that $$h(u) \le d(u, v) + h(v),$$ then $h$ is monotonic. In other words, the heuristic satisfies the triangle inequality along the edges of the graph.

The Wikipedia page on the A* algorithm states:

If the heuristic h satisfies the additional condition $h(x) ≤ d(x, y) + h(y)$ for every edge $(x, y)$ of the graph (where d denotes the length of that edge), then h is called monotone, or consistent. In such a case, A* can be implemented more efficiently—roughly speaking, no node needs to be processed more than once (see closed set below)—and A* is equivalent to running Dijkstra’s algorithm with the reduced cost $d'(x, y) = d(x, y) + h(y) − h(x)$.

My questions are:

and A* is equivalent to running Dijkstra’s algorithm with the reduced cost $d'(x, y) = d(x, y) + h(y) − h(x)$.

Is there any proof of this equivalence?

It is clear to me that $0 \le d(x, y) + h(y) − h(x)$, so the reduced cost is feasible (non-negative). But:

  • Why is this formula chosen as the new distance function?
  • Is there a formal proof that it works?
  • Why is it not enough to run Dijkstra with $d'(x, y) = d(x, y) + h(y)$?
  • What is the math behind it?
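One useful observation behind the equivalence: along any path from $s$ to $t$, the heuristic terms in $d'$ telescope, so the reduced cost of the whole path equals its true cost plus the constant $h(t) - h(x)$ evaluated at the endpoints. Both cost functions therefore rank $s$-$t$ paths identically, and consistency makes every reduced edge weight non-negative, which is exactly what Dijkstra requires. (Dropping the $-h(x)$ term breaks the telescoping, so $d(x, y) + h(y)$ alone can change which path is shortest.) A small numerical check in Python; the graph and heuristic are made up, and $h$ is the exact distance to t, which is always consistent:

```python
# Directed graph as {u: [(v, weight), ...]} -- a made-up example.
graph = {
    's': [('a', 1), ('b', 2)],
    'a': [('t', 3)],
    'b': [('t', 1)],
    't': [],
}
# Consistent heuristic: h(u) <= d(u, v) + h(v) holds for every edge.
h = {'s': 3, 'a': 3, 'b': 1, 't': 0}

def w(u, v):
    return dict(graph[u])[v]          # original edge weight d(u, v)

def w_reduced(u, v):
    return w(u, v) + h[v] - h[u]      # the reduced cost d'(u, v)

def path_cost(path, weight):
    return sum(weight(u, v) for u, v in zip(path, path[1:]))

# Every s->t path's reduced cost differs from its true cost by the
# same constant h(t) - h(s): the interior h-terms telescope away.
for path in [['s', 'a', 't'], ['s', 'b', 't']]:
    assert path_cost(path, w_reduced) == path_cost(path, w) + h['t'] - h['s']
    assert path_cost(path, w_reduced) >= 0  # consistency => non-negative
```

Because the offset $h(t) - h(s)$ is the same for every $s$-$t$ path, the shortest path under $d'$ is the shortest path under $d$, which is the heart of the equivalence claim.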

Algorithm for Team Meetings scheduling optimization

Consider persons P (about 100) and teams T (about 20), where each person is a member of 2-3 teams. Schedule a succession of meetings M of the teams such that each person can attend, and optimize waiting time by running meetings in parallel where possible.

Optimal: meetings of the teams T should be scheduled as close in time as possible, ideally in parallel, as long as no person would have to be in two meetings at the same time.

Sub-optimal: allow meetings to run in parallel if only one or two persons (or some percentage of the team size) would be “overbooked”, so that those persons could attend only one of the parallel meetings.

For example:

P1 member of T1, T4
P2 member of T1, T3
....

Expected result:

|   Timeslot 1   |   Timeslot 2   |  Timeslot 3 ... |
| T1: {P1 P2 P3} | T3: {P4 P2 P3} |  ...            |
| T2: {P4 P5 P6} | T4: {P1 P5 P6} |  ...            |

Minimize the number of timeslots; maximize the number of parallel meetings.

Can someone point me to an algorithm?

Preferably in pseudo-code or a C-style language. Or is there a routine available for this in some kind of library, or even a procedure in an application like Excel, a database, etc.?
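One common way to model the conflict-free version is graph coloring: one node per team, an edge between any two teams that share a member, and timeslots as colors. A greedy coloring sketch in Python, using made-up membership data consistent with the example above (greedy coloring is fast but not optimal in general):

```python
from itertools import combinations

# Hypothetical membership data: person -> teams they belong to
members = {
    'P1': ['T1', 'T4'], 'P2': ['T1', 'T3'], 'P3': ['T1', 'T3'],
    'P4': ['T2', 'T3'], 'P5': ['T2', 'T4'], 'P6': ['T2', 'T4'],
}

# Conflict graph: edge between teams that share at least one person
conflicts = {t: set() for ts in members.values() for t in ts}
for teams in members.values():
    for a, b in combinations(teams, 2):
        conflicts[a].add(b)
        conflicts[b].add(a)

# Greedy coloring, highest-degree teams first: assign each team the
# smallest timeslot not already used by a conflicting team.
slot = {}
for team in sorted(conflicts, key=lambda t: -len(conflicts[t])):
    taken = {slot[n] for n in conflicts[team] if n in slot}
    slot[team] = next(s for s in range(len(conflicts)) if s not in taken)
```

On this data the greedy pass uses two timeslots, pairing T1 with T2 and T3 with T4, matching the expected result above. The "sub-optimal" variant (tolerating a few overbooked persons) would need weighted edges and a threshold instead of hard conflicts.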

Is there any advantage to combining a hash algorithm with a key-derivation function?

Let’s assume I would like to secure passwords using a modern KDF such as Argon2. The flow of information would look like this: $hash, $salt = argon2id($password, $salt).

Is there any advantage to first hashing the password with SHA-256/512, like so: $hash, $salt = argon2id(sha256($password), $salt)?
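For illustration, both flows can be written out in Python. The standard library has no Argon2 binding, so this sketch uses hashlib.scrypt as a stand-in memory-hard KDF; the structure of the two flows is the same:

```python
import hashlib
import os

def kdf(password: bytes, salt: bytes) -> bytes:
    # Stand-in for argon2id: scrypt is the memory-hard KDF in the stdlib.
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)

salt = os.urandom(16)
password = b"correct horse battery staple"

# Flow 1: run the KDF directly over the password
h1 = kdf(password, salt)

# Flow 2: pre-hash with SHA-256, then run the KDF over the digest
h2 = kdf(hashlib.sha256(password).digest(), salt)
```

Note that the pre-hash fixes the KDF input at 32 bytes regardless of password length, which matters for KDFs with input-length limits (bcrypt’s 72-byte cap is the classic example); whether that is an advantage for Argon2 specifically is the question being asked.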

Induction to prove equivalence of a recursive and iterative algorithm

Using induction, how do you prove that two implementations of the Towers of Hanoi, one recursive and the other iterative, perform identical move operations? The implementations are as follows.

Hanoi(n, src, dst, tmp):
  if n > 0:
    Hanoi(n-1, src, tmp, dst)
    move disk n from src to dst
    Hanoi(n-1, tmp, dst, src)

And iteratively,

If n is even, swap pegs 1 and 2. At the i-th step, make the only legal move that avoids peg i mod 3. If there is no legal move, then all disks are on peg i mod 3 and the puzzle is solved.

hanoi(n, s, t, d):
  pegs[s] <- [1, 2, ..., n]; pegs[t] <- []; pegs[d] <- []
  if n is even:
    swap pegs[1] and pegs[2]
  i <- 1
  while pegs[d].length < n:
    avoid <- i mod 3
    if top of pegs[(avoid + 1) mod 3] > top of pegs[(avoid + 2) mod 3]:
      move top of pegs[(avoid + 2) mod 3] to pegs[(avoid + 1) mod 3]
    else:
      move top of pegs[(avoid + 1) mod 3] to pegs[(avoid + 2) mod 3]
    i++

I’m having a hard time figuring out how to turn these into mathematical expressions or a recurrence relation on which I can then perform induction to show their equivalence.

My searching has only turned up problems that deal with the complexity or correctness of algorithms; I’m interested in showing that at every $i$-th step the moves performed are identical.
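One way to build intuition before attempting the induction is to implement both versions and compare their move lists directly. A sketch in Python, where the final relabeling encodes the "if n is even, swap pegs 1 and 2" rule:

```python
def hanoi_rec(n, src=0, dst=2, tmp=1, moves=None):
    """Recursive version: returns the list of (from_peg, to_peg) moves."""
    if moves is None:
        moves = []
    if n > 0:
        hanoi_rec(n - 1, src, tmp, dst, moves)
        moves.append((src, dst))
        hanoi_rec(n - 1, tmp, dst, src, moves)
    return moves

def hanoi_iter(n):
    """Iterative version: at step i, make the only legal move that
    avoids peg (i mod 3); relabel pegs 1 and 2 when n is even."""
    pegs = [list(range(n, 0, -1)), [], []]
    dest = 2 if n % 2 else 1          # peg where disks accumulate pre-relabeling
    moves, i = [], 1
    while len(pegs[dest]) < n:
        avoid = i % 3
        a, b = (avoid + 1) % 3, (avoid + 2) % 3
        top_a = pegs[a][-1] if pegs[a] else float('inf')
        top_b = pegs[b][-1] if pegs[b] else float('inf')
        if top_a < top_b:             # only the smaller top disk may move
            pegs[b].append(pegs[a].pop())
            moves.append((a, b))
        else:
            pegs[a].append(pegs[b].pop())
            moves.append((b, a))
        i += 1
    if n % 2 == 0:                    # "swap pegs 1 and 2"
        swap = {0: 0, 1: 2, 2: 1}
        moves = [(swap[x], swap[y]) for x, y in moves]
    return moves
```

For small n the two move lists agree (and both have length 2^n - 1); establishing that agreement for all n and all i is exactly what the induction has to do.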

Proof of QuickSort algorithm correctness

Recently I’ve studied QuickSort and understood its general idea.

Basically, we do the following:

  1. Pick a pivot element from the array (no matter which one or how, in this context).
  2. Rearrange the elements of the array so that elements less than the pivot are placed before it (in the left part of the array) and elements greater than the pivot are placed after it (in the right part).
  3. Apply the same procedure to the left and right parts of the array.
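The three steps above can be written out directly; a minimal in-place sketch in Python (Lomuto partition, with the last element as the pivot):

```python
def quicksort(arr, lo=0, hi=None):
    """In-place QuickSort following the three steps above."""
    if hi is None:
        hi = len(arr) - 1
    if lo < hi:
        # Step 2: partition around the pivot (here: the last element)
        pivot = arr[hi]
        i = lo
        for j in range(lo, hi):
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi] = arr[hi], arr[i]   # pivot lands in its final position
        # Step 3: recurse on the parts left and right of the pivot
        quicksort(arr, lo, i - 1)
        quicksort(arr, i + 1, hi)
    return arr
```

The base case lo >= hi corresponds to a sub-array of length 0 or 1, which is already sorted.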

Everything looks quite clear, but my question is: why does this eventually sort the entire array?

I’ve read lots of articles about QuickSort, but none of them provides a clear proof that the algorithm really works the way they describe it. So I’m looking for such a proof.

I realize that:

  • An array of one element (i.e., of length 1) is already sorted.
  • After any partition, the pivot is placed in its final position.

So what is the proof that, after applying the QuickSort procedure to a given array of length N, its elements are ordered once the procedure completes?

P.S. I’ve seen several proofs based on mathematical induction. Unfortunately, I’m not familiar with it yet, so I’m primarily looking for a non-induction proof.

Is the complexity of this algorithm O($\sqrt{n}$) or linear?

Let’s say I have two identical jars and I want to find the lowest height from which a jar breaks when dropped. I can drop a jar from increments of steps on a staircase, but I want to solve this problem sublinearly.

I decided to increment using square numbers (1, 4, 9, 16, etc.) for a complexity of O($\sqrt{n}$), so that once a jar breaks I have an upper and a lower bound. If the jar breaks at 16, I know that the breaking height is between 9 + 1 and 16. If I then run the algorithm linearly from 10 to 16, does that make my algorithm linear?
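The linear phase only ever scans between two consecutive squares, a gap of at most about $2\sqrt{n}$ steps, so the total stays O($\sqrt{n}$). A simulation under my own framing of the two-jar model, counting drops for every possible breaking height:

```python
import math

def drops_used(threshold):
    """Drops used by the square-increment strategy when a jar first
    breaks at height `threshold` (two-jar model)."""
    drops, k = 0, 1
    while k * k < threshold:        # first jar survives at k^2
        drops += 1
        k += 1
    drops += 1                      # first jar breaks at k^2 (k^2 >= threshold)
    low = (k - 1) ** 2 + 1
    # Second jar: linear scan from low up to at most k^2 - 1
    for floor in range(low, k * k):
        drops += 1
        if floor >= threshold:      # second jar breaks here
            break
    return drops

n = 10_000
worst = max(drops_used(t) for t in range(1, n + 1))
```

For n = 10,000 the worst case comes out to 298 drops, just under $3\sqrt{n}$: roughly $\sqrt{n}$ drops to bracket the answer plus at most about $2\sqrt{n}$ for the scan. The linear phase therefore does not make the algorithm linear.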