A reduction from $HP$ to $\{(\langle M \rangle, \langle k \rangle) : \text{$M$ visits at least $k$ states on any input}\}$

I tried to define the following reduction from $HP$ to $\{(\langle M \rangle, \langle k \rangle) : \text{$M$ visits at least $k$ states on any input}\}$.

Given a pair $(\langle M\rangle , \langle x\rangle)$ we define $M_x$ such that, for any input $y$, $M_x$ simulates $M$ on the input $x$. We denote by $Q_M + c$ the number of states needed for the simulation of $M$ on $x$, and add $Q_M + c + 1$ further special states to $M_x$, namely $q'_1, q'_2, \ldots, q'_{Q_M + c + 1}$, where $q'_{Q_M + c + 1}$ is defined as the only final state of $M_x$. Now, in case the simulation of $M$ on $x$ halts (i.e., $M$ reaches one of its final states), $M_x$ moves to $q'_1$ and then continues to walk through all the special states until it reaches $q'_{Q_M + c + 1}$.

We define the reduction $(\langle M \rangle , \langle x \rangle) \longrightarrow (\langle M_x \rangle , \langle Q_M + c + 1 \rangle)$.

In case $(\langle M \rangle , \langle x \rangle) \in HP$, then for any input $y$, $M_x$ walks through all the special states and thus visits at least $Q_M + c + 1$ states. Otherwise, $M$ doesn't stop on $x$, so $M_x$ doesn't visit any special state and thus visits at most $Q_M + c$ states (the states needed for the simulation).
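To make sure I have the construction right, here is the same reduction written as a small Python sketch (purely illustrative: the machine $M$ is modeled as a Python callable, and calling $M(x)$ stands in for simulating $M$ on $x$):

def reduction(M, x, Q_M, c):
    """Map (<M>, <x>) to (<M_x>, Q_M + c + 1)."""
    def M_x(y):
        # Ignore y and simulate M on the hard-wired input x (may never halt).
        M(x)
        # Reached only if M halted on x: walk through the fresh special states
        # q'_1, ..., q'_{Q_M + c + 1}; the last one is the only accepting state.
        for _ in range(Q_M + c + 1):
            pass
        return "accept"
    return M_x, Q_M + c + 1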

Is it OK? If you have other ideas or suggestions, please let me know.

Are there any enumerations of (machines for) languages in P such that all the machines can be simulated efficiently on any input?

From Computational Complexity: A Modern Approach: Efficient Universal Turing Machine Theorem: There exists a TM $U$ such that for every $x, \alpha \in \{0, 1\}^*$, $U(x, \alpha) = M_\alpha(x)$, where $M_\alpha$ denotes the TM represented by $\alpha$. Furthermore, if $M_\alpha$ halts on input $x$ within $T$ steps then $U(x,\alpha)$ halts within $C T \log T$ steps, where $C$ is a number independent of $|x|$ and depending only on $M_\alpha$'s alphabet size, number of tapes, and number of states.

From Kozen INDEXINGS OF SUBRECURSIVE CLASSES: "the class of polynomial time computable functions is often indexed by Turing machines with polynomial time counters…. The collection of all (encodings over (0, 1)) of such machines provides an indexing of PTIME…we have Theorem: No universal simulator for this indexing can run in polynomial space."

He then goes on to say: "can it be proved that gr U for any indexing of PTIME requires more than polynomial space to compute? We have proved this for a wide class of indexings, namely counter indexings satisfying the succinct composition property."

  • gr U is the graph of the universal function U and (barring details) represents the minimum power necessary to simulate P uniformly.

  • And the counter indexing (or polynomial time counters) he is referring to is specified in the answer here: How does an enumerator for machines for languages work?
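For concreteness, here is a minimal sketch of the kind of clocked ("counter") indexing mentioned above (my own paraphrase with illustrative names, not Kozen's construction verbatim): an index is a pair (machine, k), and the indexed machine runs the given machine under a step budget of n^k + k on inputs of length n.

# A sketch of a polynomial-clock ("counter") indexing. The machine is modeled
# as a Python generator that yields once per simulated step and finally
# returns its answer; the names and the exact clock are illustrative.

def clocked(machine, k):
    """The P-machine indexed by (machine, k): run `machine` on input w for
    at most len(w)**k + k steps and reject if the clock runs out."""
    def run(w):
        budget = len(w) ** k + k
        steps = machine(w)               # generator: one yield per step
        for _ in range(budget):
            try:
                next(steps)
            except StopIteration as halted:
                return halted.value      # halted within the clock: its answer
        return False                     # clock expired: reject by convention
    return run

Every such index decides a language in P (the clock enforces it), and every language in P shows up under some index with a large enough k, which is what makes this an indexing of PTIME.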

I'm wondering how the efficient universal Turing machine theorem relates to Kozen's result that, for certain types of enumerations of P, there is no machine that can efficiently simulate the machines enumerated. What causes simulation to be difficult, and can it be circumvented? Namely: does there exist an enumeration of P that describes the (machines for) languages in P in such a way that allows them to be efficiently simulated (with no more than poly overhead) on any input, or, as Kozen puts it, "allow(s) easy construction of programs from specifications"?

My guess is that part of the reason for the difficulty is that the efficient simulation theorem only says that there exists a machine that can efficiently simulate any single TM, but not a whole class of them… and when you start having to simulate more than one TM you lose the ability to optimize your simulator's design to solve any particular language (run any particular machine) and have to design the simulator with the various machines you need to simulate in mind (and the more different those machines are, the larger your overhead gets).

PS. A possible example could be in implicit complexity theory… where they construct languages that are complete for certain complexity classes. A language that is complete for P doesn’t seem to have trouble running its programs (which represent the languages in P)? But, if there are examples here, how do they overcome the difficulty Kozen is referring to, as they are programming systems and thus enumerations / indexings?

Just as a related aside… I think I have a proof that, for a language Lp, 1. and 2. below cannot both be true at the same time:

  1. An enumeration of P, call it language Lp (whose members are the strings in the enumeration) is decidable in poly time.

  2. All the P machines / languages represented by strings in Lp can be efficiently simulated by a universal simulator for P on any input.

It makes sense that there would be a relationship between the way the machines are encoded and the overhead of simulating them, and 1. can be made true, so that leaves 2. and brings us to the question being asked… Is it possible that 2. is always false, meaning that for ANY enumeration/encoding of P (any language Lp), simulation of those machines is not efficient for any universal simulator for P?

Here’s a rough sketch for anyone interested:

Take L := {w : w ∈ Lp and W(w) = 0}, where W is the machine that w encodes.

So, one way to do this is: our diagonal function maps w to the language in P that w encodes (if w encodes a machine for a language in P, i.e., if w is a member of Lp), and if it does not, then it maps w to the language containing all words. The existence of a map between a w and a language translates to: w ∈ L iff w is not a member of the language it is mapped to.

Since all w's that aren't encodings of P machines (members of Lp) are mapped to the language containing all words, they are members of L iff they are not members of this language. This is always false, so all words that are not members of Lp are not members of L.

This leaves only words of Lp as candidates for membership in L. But for w to be a member of L, not only does it need to be a member of Lp; the machine that w encodes, W, also needs to evaluate to 0 on w. That is, W(w) = 0 and w ∈ Lp if and only if w ∈ L.

L is not in P. If L were in P, then some w would encode a machine for L, call it Wl, and for that w, w ∈ L iff Wl(w) = 0, i.e., w ∈ L iff w is not in L.

Now, let's employ assumption 1., that Lp is poly-time decidable, as well as assumption 2., that any machine specified by Lp can be simulated with no more than poly overhead by a universal simulator.

Then we can devise an algorithm, namely: given w, decide whether w ∈ Lp. If w ∈ Lp, then run W(w). If W(w) = 0, then w ∈ L; otherwise w ∉ L.
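Under those two assumptions, the algorithm is just this (a minimal sketch; in_Lp and universal_sim are hypothetical stand-ins for the poly-time decider of Lp from 1. and the efficient universal simulator from 2.):

def decide_L(w, in_Lp, universal_sim):
    # Assumption 1.: membership in Lp is decidable in polynomial time.
    if not in_Lp(w):
        return False              # words outside Lp are never in L
    # Assumption 2.: simulate the machine W encoded by w, on input w itself,
    # with no more than polynomial overhead.
    return universal_sim(w, w) == 0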

By the above algorithm, under assumptions 1. and 2., L would be in P, which contradicts what was shown above (that L is not in P). I think the previous proof is correct, and I conclude that 1. and 2. cannot both be true at the same time.

Is it correct or incorrect to say that an input, say $C$, causes an average run-time of an algorithm?

I was going through the text Introduction to Algorithms by Cormen et al. where I came across an excerpt which I felt required a bit of clarification.

Now, as far as I have learned, while the best-case and worst-case time complexities of an algorithm arise for a certain physical input to the algorithm (say an input $A$ causes the worst-case run time of the algorithm, or an input $B$ causes the best-case run time, asymptotically), there is no such physical input which causes the average-case runtime of an algorithm, as the average-case run time is, by its definition, the runtime of the algorithm averaged over all possible inputs. It is something which, I hope, only exists mathematically.
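Written out (with some distribution over the inputs of size $n$ assumed, just to fix notation):

$$T_{\mathrm{avg}}(n) \;=\; \mathrm{E}\big[T(X)\big] \;=\; \sum_{x \,:\, |x| = n} \Pr[X = x]\, T(x),$$

so $T_{\mathrm{avg}}(n)$ is a weighted average of the individual runtimes $T(x)$; no single input $x$ has to satisfy $T(x) = T_{\mathrm{avg}}(n)$ exactly, although some inputs may happen to.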

But on the other hand, inputs to an algorithm which are neither the best-case input nor the worst-case input are supposed to be somewhere in between both extremes, and the performance of our algorithm on them is measured by none other than the average-case time complexity, as the average-case time complexity of the algorithm lies between the worst- and best-case complexities, just as our input lies between the two extremes.

Is it correct or incorrect to say that an input, say $C$, causes an average run-time of an algorithm?

The excerpt from the text which made me ask such a question is as follows:

In the context of the analysis of quicksort,

In the average case, PARTITION produces a mix of "good" and "bad" splits. In a recursion tree for an average-case execution of PARTITION, the good and bad splits are distributed randomly throughout the tree. Suppose, for the sake of intuition, that the good and bad splits alternate levels in the tree, and that the good splits are best-case splits and the bad splits are worst-case splits. Figure (a) shows the splits at two consecutive levels in the recursion tree. At the root of the tree, the cost is $n$ for partitioning, and the subarrays produced have sizes $n-1$ and $0$: the worst case. At the next level, the subarray of size $n-1$ undergoes best-case partitioning into subarrays of size $(n-1)/2 - 1$ and $(n-1)/2$. Let's assume that the boundary-condition cost is $1$ for the subarray of size $0$.

The combination of the bad split followed by the good split produces three subarrays of sizes $0$, $(n-1)/2 - 1$ and $(n-1)/2$ at a combined partitioning cost of $\Theta(n)+\Theta(n-1)=\Theta(n)$. Certainly, this situation is no worse than that in Figure (b), namely a single level of partitioning that produces two subarrays of size $(n-1)/2$, at a cost of $\Theta(n)$. Yet this latter situation is balanced!
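Writing out the cost comparison from the excerpt (my arithmetic, nothing beyond what the book states):

$$\underbrace{\Theta(n)}_{\text{bad split of } n \text{ into } n-1 \text{ and } 0} \;+\; \underbrace{\Theta(n-1)}_{\text{good split of } n-1} \;=\; \Theta(n) \qquad \text{versus} \qquad \underbrace{\Theta(n)}_{\text{one balanced split of } n},$$

and in both cases the surviving subarrays have size roughly $(n-1)/2$, so intuitively the bad splits only affect the constant hidden in the $O(n \log n)$ bound.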

Is Unicode character encoding a safe alternative to HTML encoding when rendering unsafe user input to HTML?

I am building a web application in which a third-party library is used, which transforms the user input into JSON and sends it to a controller action. In this action, we serialize the input using the standard Microsoft serializer from the System.Text.Json namespace.

public async Task<IActionResult> Put([FromBody] JsonElement json)
{
    // Re-serialize the incoming JSON to a string using System.Text.Json.
    string result = JsonSerializer.Serialize(json);
    // (The original snippet stops here; a return is needed for it to compile.)
    return Ok(result);
}

However, currently the JSON is rendered back to the page within a script block, using @Html.Raw(), which raised an alarm with me when I reviewed the code.

While testing if this creates an opening for script injection, I added

<script>alert("HACKED");</script> 

to the input. This input is transformed into

\u003Cscript\u003Ealert(\u0027HACKED\u0027);\u003C/script\u003E 

when serialized.

This looks fine. Rendering this to the page did not result in code execution when I tested it.

So, is Unicode character encoding really a good protection against script injection, or should I not rely on it?

Is it conceivable that the Unicode encoding is lost somewhere during processing? For example, by (de)serializing once more, etc.?
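To make that concern concrete, here is a minimal illustration (Python here, but any JSON parser behaves the same): the \u003C-style escapes are part of the JSON string literal, not of the underlying value, so one more round of deserialization hands back the raw markup.

import json

# The serialized output from above, as a JSON string literal.
escaped = r'"\u003Cscript\u003Ealert(\u0027HACKED\u0027);\u003C/script\u003E"'

# One more deserialization removes the escapes entirely.
print(json.loads(escaped))   # <script>alert('HACKED');</script>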

This seems like a question that has been asked and answered before, but I couldn’t find it.

Map two input streams, one of graphics objects and the other of characters, through Show

The question is how to sequentially execute Show with two streams of input. The first is a graphics-object stream and the second is a character stream supplying labels for the graphics. I tried

ss = {{ListPlot[x1]}, {ListPlot[x2]}, …};
labelling = {aa, bb, cc, dd, …};
Map[Show[#1, PlotLabel -> StringJoin[#2, "…", "…"]] &, {ss, labelling}]

I tried both Map and MapThread, with inconsistent results, i.e., it works sometimes and does not work other times. It became consistent when I put the labelling elements into individual curly brackets, i.e., labelling = {{aa}, {bb}, {cc}, {dd}}. I wonder why this is the case?

Responsive input sliders – is there a better way?

I am building an app where a user controls inputs via multiple sliders, each with discrete steps. The app then uses the combination of slider values to run some calculations and returns an output value.

The sliders are meant to be highly responsive, so the user can immediately see the impact of their input combination on the output value. So I have resorted to eagerly calculating ALL possible combinations of sliders and letting the client query the results in the browser itself for faster response times.

The problem is that the number of possible combinations can seriously explode. If I have 5 inputs each with 10 steps, that is 10^5 = 100,000 possible combinations, and every additional slider multiplies that by another factor of 10! That quickly becomes too expensive to calculate eagerly.
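For reference, here is roughly what the eager approach looks like (a minimal Python sketch; `calculate` is a stand-in for the real calculation, which isn't shown here):

from itertools import product

# Precompute the output for every combination of slider positions; the
# resulting table is what the client would query in the browser.
def build_lookup_table(num_sliders, steps_per_slider, calculate):
    table = {}
    for combo in product(range(steps_per_slider), repeat=num_sliders):
        table[combo] = calculate(combo)
    return table

# 5 sliders with 10 steps each -> 10**5 = 100,000 entries; each extra
# slider multiplies the table size by another factor of 10.
table = build_lookup_table(5, 10, calculate=sum)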

Is there a better way to do this? I could try to restrict the number of sliders that can be moved at any point, to lower the solution space… but I'm wondering if there's a better solution out there.