How to prove that this problem is NP-complete (or NP-Hard)

I have the following data:

  • A set $ V$ of tasks, the starting time $ s_j$ of each task and the duration $ p_j$ of each task.

  • A set $ K$ of resource, each resource has an availability function $ R_{k}$ that is piecewise constant.That is, for each $ t = 0, .., T-1$ , we precise $ R_{k}(t)$ the number of units available at $ t$ . $ R_k$ is an array of length $ T$ .

  • Each task $ j$ needs $ r_{j,k}$ resources to be processed (it could be zero). This quantity needs to be available during all the processing time starting from $ s_j$ .

For example consider :

  • Task$ A$ has processing time $ 3$ and starts at time period $ t=2$ and needs 2 units of some resource $ k$
  • Task $ B$ has processing time $ 4$ starts at time $ t=3$ and needs 3 units of the same resource $ k$ .

Then if $ R_{k}(t) = [*,*,2,6,6,3,*,*]$ then we are ok since at time $ t=2$ only task $ A$ is active and it requires $ 2$ units, at time $ t=3$ , both tasks are active and the sum of their utilization is $ 2+3 = 5 \leq 6$ ; same at time $ t=4$ . At time $ t=4$ , only task $ B$ is active and it requires $ 3$ units.

However, if $ R_{k}(t) = [*,*,2,4,6,3,*,*]$ , is not ok since at time $ t=3$ , both tasks $ A$ and $ B$ are active and their total use is equal to $ 5$ wheras only $ 4$ units are available.

Here is my attempt to verify that the resource utilization at each $ t$ is no larger than the availability function. So the answer is yes or no (we can say that this is a decision problem).

For each time t in [0,T-1]   For each resource k in K      total_use = 0, active_set = A     for each task j in V       if s_j<=t and s_j+p_j > t and r_{j,k}>0 \if the task is active at time t and it requires positive amount of resource k in order to be processed)         total_use += r_{j,k}         active_set := active_set U {j}        if total_use > R_{k}(t)         print(at time t the usage of resource k exceeds its capacity, active_set)         return False return True 

The algorithm here is pseud-polynomial. Unfortunately, I need to find a polynomial one in order to say that the problem is in $ \mathcal{NP}$ .

Is protein folding NP-hard and how to prove that?

This question has two facets that are related.

Is the general problem of protein folding really NP-hard?

The hydrophobic-polar protein folding model (Ken Dill et al., 1985) stated the problem on a lattice (similar to self-avoiding walk):

Simplification significantly reduces the computational effort in handling the model, although even in this simplified scenario the protein folding problem is NP-complete.

However, this simplification takes a problem which has some local convexity characteristics in real life, since protein folding can be thought as the problem of finding the global minimum on some energy "cost function", also known as the enemy landscape that was illustrated by Ken Dill et al., 2012:

enter image description here

As stated by others, translating a problem defined over continuous domain into problems over the integers or a binary lattice may introduce higher complexity cost. You may think about it as taking a linear problem and solve it using 3SAT solver, any problem in $ P$ is presumably reducible to 3SAT. I assume that there is a convex nature to the energy landscape, and this piecewise convexity is beneficial for the search for the minimal energy state.

How to reduce hydrophobic-polar protein folding to TSP/3SAT/CNF?

The work by Bahi et al. really confused me. I expected it to be something relatively simple to reduce 3SAT or TSP to the problem of protein lattice, because intuitively it seems that finding Hamiltonian path could be stated as finding a protein conformation on a lattice.

Bahi, Jacques M., et al. Computational investigations of folded self-avoiding walks related to protein folding. Computational biology and chemistry 47 (2013): 246-256.

Bahi, Jacques M., et al. Chaos of protein folding. In IJCNN 2011, Int. Joint Conf. on Neural Networks, pages 1948–1954, San Jose, California, United States, July 2011.

Bahi, Jacques M., et al. Protein folding in the 2D hydrophobic-hydrophilic (HP) square lattice model is chaotic. Cognitive Computation, 4(1):98–114, 2012.

Bahi, Jacques M., et al. Is protein folding problem really a NP-complete one ? First investigations. arXiv:1306.1372, October 2012. Submitted to Journal of Bioinformatics and Computational Biology (Elsevier).

Bahi, Jacques M., et al. Unfoldable self-avoiding walks are infinite. Consequences for the protein structure prediction problem. arXiv.org, 2013.

Bahi, Jacques M., et al. Protein structure prediction software generate two different sets of conformations. Or the study of unfolded self-avoiding walks. arXiv:1306.1439, 2013.

Is minimising the total number of one entries in binary matrices $CA$ and $C^TB$ NP-HARD?


Given a two rectangular binary matrices $ A$ and $ B$ with dimensions $ c\times a$ and $ c \times b$ respectively, does there exist an invertible binary matrix C with dimensions $ c \times c$ such that the total number of 1 entries in $ CA$ and $ C^TB$ is below a target threshold $ t$ ?

Here we are working in $ GF(2)$ , where multiplication and addition are AND and XOR respectively.

What is the hardness of this problem? Are there approximation algorithms to this problem?

We know this problem is in $ NP$ , as a valid $ C$ can be used as a certificate. Also, we know how to find $ C$ when there exists the a solution for $ t = a+b$ , by using Gaussian Elimination.

How many clauses are required for SAT to be NP-hard in CNF formulas?

It is not hard to see that SAT for a CNF formula with $ n$ variables and a constant number of clauses can be solved in polynomial time. On the other hand, it is not hard to see that a CNF formula with $ n$ variables and $ O(n)$ clauses is enough for NP-hardness (consider for example the instances of SAT associated with the natural formula for 3-colorability, applied to planar graphs).

We could define this formally as $ \text{CNFSAT}-f-\text{Clauses}$ , a family of problems parameterized by a function $ f$ , in which instances are formulas in CNF such that if they have $ n$ variables, then they have at most $ f(n)$ clauses. Based on this, what I’d like to know is what is the smallest function $ g$ such that we know there exists $ f \in O(g)$ such that $ \text{CNFSAT}-f-\text{Clauses}$ is already NP-hard. We know that g = 1 (constant # of clauses) does not work, and $ g = n$ (linear number of clauses) works.

What about $ g = \log n$ ? Is there a simple algorithm for CNFSAT over formulas that have $ O(\lg \lg n)$ clauses?

Do any single-cell organisms exist that approximate NP-hard problems within a factor better than $1/2$ $log$2?

I’ve seen on Wikipedia; that set covering cannot be approximated in polynomial time to within a factor mentioned above. Unless $ NP$ has quasipoly-time algorithms.

Now, this must pertain to classical algorithms and does not mention any approximation algorithms that may only work in nature.

(eg. Things like Amoebas solving $ TSP$ problems)

  • Do any single-cell organisms show any promise in solving $ NP$ -hard problems in polynomial-time?

  • Or approximating them better than any known classical algorithms?

Is there a *natural* problem that is NP-hard on trees, but in P on non-trees?

It seems intuitive that any natural problem that is NP-hard on trees, should be hard on graphs that are not trees. But perhaps this is wrong?

Question: Is there some natural decision problem on graphs that is NP-hard when we are promised the graph is a tree, but in $ P$ when the graph is promised to not be a tree.

Thoughts:

  • Here’s a list of problems that are NP-hard on trees: https://cstheory.stackexchange.com/questions/1215/np-hard-problems-on-trees The ones I saw were all also hard on non-trees.

  • Given any NP-complete language $ M$ that is NP-complete on trees, you can obtain a tautological example via the decision problem $ L$ where $ x \in L$ iff $ x \in M$ and $ x$ is a tree. This is pretty unnatural though.

  • Is there some object that trees have a lot of of, but non-trees only have a few of?

A variant of hitting set problem? Is this also a NP-hard problem?

Let’s start from finding a minimum hitting set problem. Given a collection of sets $ U=\{\{S_1\},\{S_2\},\{S_3\},\{S_4\},\{S_5\},\{S_6\}\}=\{\{1, 2, 3\}, \{1, 3, 4\}, \{1, 4, 5\}, \{1, 2, 5\}, \{2, 3\}, \{4, 5\}\}$ ,
it is easy to know that a minimum hitting set is $ \{2,4\}$ .

I am thinking what this problem would be if the set $ S$ is also a collection of sets. For instance, given $ S_1=\{\{1,2,3\},\{3,4\},\{1,2\}\}$ ,
$ S_2=\{\{3\},\{3,5\},\{1,3\}\}$ ,
$ S_3=\{\{2,5\},\{4\},\{1,5\},\{1,10,6,7\}\}$ ,
then a minimum “hitting set” is $ \{3,4\}$ . That is because $ \{3,4\}$ only consists of two elements and hits every set (i.e., $ \{3,4\}\in S_1$ , $ \{3\}\subset \{3,4\}$ and $ \{4\}\subset \{3,4\}$ ).

Does anyone have an idea to solve this problem? Do you think it is a variant of hitting set problem? Is it an NP-hard problem?