MySQL – multiple counts on relations based on conditions – JOIN VS. SUBQUERY

I don’t want to share my exact DB structure, so let’s assume this analogy:

--categories--
id
name

--products--
id
name
cat_id

I then have SQL like this:

SELECT categories.*,
       count(CASE WHEN products.column1 = something1
                   AND products.column2 = something2 THEN 1 END) AS count1,
       count(CASE WHEN products.column3 = something3 THEN 1 END) AS count2
FROM categories
LEFT JOIN products ON products.cat_id = categories.id
GROUP BY categories.id

The problem here is that the GROUP BY is taking too long: it’s the difference between a 0.2s query and a 2.5s query.
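Since the title asks about JOIN vs. subquery: the same per-category counts can be computed with correlated subqueries, which avoids grouping the whole joined row set and can behave differently when the join fans out. A runnable sketch using Python's sqlite3 (the schema and the 'something' values are placeholder literals from the analogy above, not real data, and MySQL's optimizer may behave differently from SQLite's):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, cat_id INTEGER,
                       column1 TEXT, column2 TEXT, column3 TEXT);
INSERT INTO categories VALUES (1, 'toys'), (2, 'books');
INSERT INTO products VALUES
  (1, 'p1', 1, 'something1', 'something2', 'x'),
  (2, 'p2', 1, 'x',          'x',          'something3'),
  (3, 'p3', 2, 'something1', 'something2', 'something3');
""")

# Variant 1: LEFT JOIN + conditional aggregation (as in the question).
join_rows = con.execute("""
SELECT categories.id,
       count(CASE WHEN products.column1 = 'something1'
                   AND products.column2 = 'something2' THEN 1 END) AS count1,
       count(CASE WHEN products.column3 = 'something3' THEN 1 END) AS count2
FROM categories
LEFT JOIN products ON products.cat_id = categories.id
GROUP BY categories.id
ORDER BY categories.id
""").fetchall()

# Variant 2: correlated subqueries -- no GROUP BY over the joined rows.
sub_rows = con.execute("""
SELECT categories.id,
       (SELECT count(*) FROM products
         WHERE products.cat_id = categories.id
           AND products.column1 = 'something1'
           AND products.column2 = 'something2') AS count1,
       (SELECT count(*) FROM products
         WHERE products.cat_id = categories.id
           AND products.column3 = 'something3') AS count2
FROM categories
ORDER BY categories.id
""").fetchall()

print(join_rows)  # both variants return the same counts
print(sub_rows)
```

Whether the subquery form is faster depends on the indexes (a composite index on products(cat_id, column1, column2) and one on products(cat_id, column3) would be the first thing to try) and on how many product rows each category joins to.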

BCNF decomposition which generates redundant relations (as for 3NF)

Sometimes the 3NF decomposition algorithm generates redundant relations, where all attributes of some R_i already appear in another R_j. The algorithm is supposed to delete such redundant relations.

I read several descriptions of the BCNF decomposition algorithm and none of them mention a similar situation, which led me to think that it never occurs for BCNF.

However, I stumbled upon this example: consider R(A,B,C,D) and the set of functional dependencies F = {A→B, B→C, B→D}. R is not in BCNF because, e.g., BC→D is a consequence of F and BC is not a superkey. If we choose to decompose R using BC→D, we obtain R1(BCD) and R2(ABC). The relation R1 is in BCNF, but R2 is not, because we have B→C and B is not a superkey of R2. If we decompose R2 using B→C, we obtain R3(BC) and R4(AB). Finally we obtain a BCNF decomposition of R as R1(BCD), R3(BC) and R4(AB). But all attributes of R3 already appear in R1, therefore it would be preferable to delete R3.
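The superkey tests used in this example can be checked mechanically with the standard attribute-closure algorithm; a small Python sketch, using the F from the example:

```python
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # if the left-hand side is contained in the closure so far,
            # the right-hand side must be added to it
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

F = [({'A'}, {'B'}), ({'B'}, {'C'}), ({'B'}, {'D'})]

# B -> C holds in R2(ABC), but the closure of B omits A, so B is not a
# superkey of R and R2 is not in BCNF:
print(closure({'B'}, F))        # {'B', 'C', 'D'}
# BC -> D is a consequence of F (D is in the closure of BC), while BC is
# not a superkey of R because its closure also omits A:
print(closure({'B', 'C'}, F))   # {'B', 'C', 'D'}
```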

Did I apply the BCNF algorithm incorrectly, or should one add this final deletion step to the algorithm?

Relations between deciding languages and computing functions in advice machines

I’m trying to understand the implications of translating between functions and languages for P/Poly complexity. I’m not sure whether the following all makes sense; I’m giving it my best shot given my current understanding of the concepts. (I have a project in which I want to discuss Hava Siegelmann’s analog recurrent neural nets, which recognize languages in P/Poly, but I’d like to understand, and be able to explain to others, the implications this has for computing functions.)

Suppose I want to use an advice Turing machine $T_1$ to compute a function from binary strings to binary strings, $f: \{0,1\}^* \rightarrow \{0,1\}^*$. $T_1$ will be a machine that can compute $f$ in polynomial time given advice that is polynomial-size in the length of arguments $s$ to $f$, i.e. $f$ is in P/Poly. (Can I say this? I have seen P/Poly defined only for languages, not for functions with arbitrary (natural number) values.)

Next suppose I want to treat $f$ as defining a language $L(f)$, by encoding its arguments and corresponding values into strings, where $L(f) = \{\langle s,f(s)\rangle\}$ and $\langle\cdot,\cdot\rangle$ encodes $s$ and $f(s)$ into a single string.

For an advice machine $T_2$ that decides this language, the inputs are of length $n = |\langle s,f(s)\rangle|$, so the relevant advice for such an input will be the advice for length $n$.

Question 1: If $T_1$ can return the result $f(s)$ in polynomial time, must there be a machine $T_2$ that decides $\{\langle s,f(s)\rangle\}$ in polynomial time? I think the answer is yes: $T_2$ can extract $s$ from the input $\langle s,f(s)\rangle$, use $T_1$ to calculate $f(s)$, then encode $s$ with $f(s)$ and compare the result with the original encoded string. Is that correct?
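That construction can be sketched concretely. Here $\langle s,r\rangle$ is a toy pairing, s + '#' + r (assuming '#' occurs in neither part), and f is an arbitrary placeholder standing in for whatever $T_1$ computes:

```python
def f(s):
    # placeholder for the function computed by T1 (here: string reversal)
    return s[::-1]

def encode(s, r):
    # toy pairing function <s, r>; assumes '#' occurs in neither string
    return s + "#" + r

def decide_Lf(x):
    """T2: decide whether x is <s, f(s)> by re-running T1 and comparing."""
    if "#" not in x:
        return False
    s, _, r = x.partition("#")
    return encode(s, f(s)) == x

print(decide_Lf(encode("01", f("01"))))   # True: "01#10" is in L(f)
print(decide_Lf(encode("01", "01")))      # False: "01" != f("01") = "10"
```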

Question 2 (my real question): If we are given a machine $T_2$ that can decide $\{\langle s,f(s)\rangle\}$ in polynomial time, must there be a way to embed $T_2$ in a machine $T_3$ so that $T_3$ can return $f(s)$ in polynomial time?

I suppose that if $T_2$ must include $T_1$, then the answer is of course yes: $T_3$ just uses the capabilities of $T_1$ embedded in $T_2$ to calculate $f(s)$. But what if $T_2$ decides $L(f)$ some other way? Is that possible?

If we are given $s$, we know its length, but not the length of $f(s)$. So in order to use $T_2$ to find $f(s)$, it seems there must be a sequential search through all strings of the form $\langle s,r\rangle$ for arbitrary $r$. (I’m assuming that the length of $f(s)$ is unbounded, but that $f$ has a value for every $s$, so the search can take an arbitrary length of time, but $f(s)$ will ultimately be found.)
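That sequential search can be sketched like this, with candidates r enumerated in order of increasing length so the search terminates once it reaches $|f(s)|$; the decider stands in for $T_2$ and is built from a placeholder f purely so the example runs:

```python
from itertools import count, product

def f(s):
    # placeholder for the (unknown to T3) function; used only to build T2
    return s[::-1]

def decide(x):
    # T2: decides membership in L(f), with <s, r> encoded as s + '#' + r
    s, _, r = x.partition("#")
    return r == f(s)

def search(s):
    """T3: recover f(s) using only the decider, by length-increasing search."""
    for n in count(0):                       # candidate lengths 0, 1, 2, ...
        for bits in product("01", repeat=n):
            r = "".join(bits)
            if decide(s + "#" + r):
                return r

print(search("001"))  # '100', found after roughly 2^(|f(s)|+1) decider calls
```

The number of decider calls is exponential in $|f(s)|$, which is exactly the issue raised below: the cost is governed by the output length, not the input length.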

One thought I have is that the search for a string that encodes $s$ with $f(s)$ has time complexity that depends on the length of the result $f(s)$ (plus $|s|$, but that would be swamped when $f(s)$ is long).

So now the time complexity does not depend on the length of the input, but only on the length of $f(s)$. Maybe $L(f)$ is in P/Poly if $f$ is in P? (Still confused here.)

Thinking about these questions in terms of Boolean circuits has not helped.

Forming recurrence relations

I have these 2 examples in my textbook:

Example 1

public int f(int n) {
    if (n == 1)
        return 1;
    else
        return n * f(n - 1);
}

The textbook shows how the recurrence relation is formed from the above code:

T(0) = a           for some constant a
T(n) = T(n-1) + b  for some constant b and a recursive term

Example 2

public int myFunction(int n) {
    if (n == 1)
        return 1;
    else
        return 2 * myFunction(n/2) + myFunction(n/2) + 1;
}

The textbook shows how the recurrence relation is formed from the above code:

T(1) = c              for some constant c
T(n) = 2T(n/2) + b    for some constant b and a recursive term

The problem

Despite reading the textbook multiple times (and failed attempts at searching online), I still do not understand how the +b comes about in both examples. Can anyone enlighten me?
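One way to see where the +b comes from is to instrument the code: every call performs a constant amount of non-recursive work (the comparison, the multiplication, the subtraction), and b simply names that constant. A quick check in Python (mirroring Example 1) counts the calls; each call contributes one unit of b:

```python
calls = 0

def f(n):
    global calls
    calls += 1           # each call contributes the constant b in T(n) = T(n-1) + b
    if n == 1:           # constant non-recursive work: one comparison ...
        return 1
    return n * f(n - 1)  # ... plus one multiply and one subtract before recursing

f(10)
print(calls)  # 10: n calls in total, each doing O(1) = b extra work
```

So T(n) = T(n-1) + b unrolls to b·(n-1) + T(1), i.e. linear time, and b never needs a concrete value; it only records that the per-call work is bounded by a constant.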


Recurrence Relations for Perfect Quad Trees (same as binary trees but with 4 children instead of 2)

I have to write and solve a recurrence relation for n(d), showing how I arrive at the formula and how I solve the recurrence, and then prove my answer is correct using induction. A perfect quad tree is basically a perfect binary tree but with 4 children at each node rather than 2, where the leaf nodes in the deepest layer have no children. The number of nodes at precisely depth d is designated by n(d). For example, the root node has depth d = 0 and is the only node at that depth, so n(0) = 1.

Does this mean it would be T(n) = 4T(n/4) + d? And then prove that?

I’m really confused and would appreciate any help or resources.
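One way to sanity-check a candidate recurrence before proving it: by the definition above, every node at depth d-1 has 4 children, which suggests n(d) = 4·n(d-1) with n(0) = 1, and a quick script can compare that against the closed form 4^d (a numerical check, not a proof):

```python
def n(d):
    # candidate recurrence: each node at depth d-1 contributes 4 children
    if d == 0:
        return 1          # the root is the only node at depth 0
    return 4 * n(d - 1)

for d in range(6):
    print(d, n(d), 4 ** d)   # the recurrence tracks the closed form 4^d
```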

Calculating complexity for recursive algorithm with codependent relations

I wrote a program recently which was based on a recursive algorithm, solving for the number of ways to tile a 3xn board with 2×1 dominoes:

F(n) = F(n-2) + 2*G(n-1)

G(n) = G(n-2) + F(n-1)

F(0) = 1, F(1) = 0, G(0) = 0, G(1) = 1

I tried to calculate the complexity using methods I know, such as the recursion tree and expansion, but neither led to an answer. I have never come across such a recursion before, where the relations are codependent.

Am I using the wrong methods, or using the methods in the wrong way? If so, can anyone offer a solution?
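For evaluating the relations themselves, a bottom-up computation avoids the exponential blow-up of naive top-down recursion and also lets one inspect the growth rate empirically (the printed ratio is only an empirical hint at the asymptotics, not a derivation):

```python
def tilings(n_max):
    """Evaluate the codependent recurrences F, G bottom-up in O(n) time."""
    F = [0] * (n_max + 1)
    G = [0] * (n_max + 1)
    F[0], F[1], G[0], G[1] = 1, 0, 0, 1
    for n in range(2, n_max + 1):
        F[n] = F[n - 2] + 2 * G[n - 1]
        G[n] = G[n - 2] + F[n - 1]
    return F, G

F, G = tilings(20)
print(F[2], F[4], F[6])   # 3, 11, 41: tilings of the 3x2, 3x4, 3x6 boards
print(F[20] / F[18])      # ratio of consecutive even terms approaches 2 + sqrt(3)
```

For the complexity analysis itself, one standard trick is to eliminate one function by substitution (use G(n) = G(n-2) + F(n-1) inside F's relation) to get a single higher-order recurrence in F alone, which the usual recursion-tree and characteristic-equation methods can handle.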

What are the common practices for weighting tag relations?

I am working on a webapp (fullstack JS) where users create documents and attach tags to them. They also select a list of tags they are interested in and attach them to their profile.

I am not a math guy, but I did some NLP as a hobbyist and learnt about latent semantic indexing: as I understand it, you create a table where you store each pair of words you parsed, and then add weight to a pair whenever its two words are found next to each other.

I was thinking of doing the same thing with tags: when 2 tags appear on the same document or profile, I increase the weight of their pair. That would allow me to get a ranking of the “closest” tags to a given one.
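This first idea is only a few lines; here is a sketch in Python for brevity (the function names are made up, and the same shape ports directly to JS with a Map keyed on the sorted pair):

```python
from collections import Counter
from itertools import combinations

weights = Counter()  # (tag_a, tag_b) -> co-occurrence count, keys sorted

def observe(tags):
    """Increase the weight of every unordered pair of tags seen together."""
    for a, b in combinations(sorted(set(tags)), 2):
        weights[(a, b)] += 1

def closest(tag, k=3):
    """Rank the tags that most often co-occur with `tag`."""
    scores = Counter()
    for (a, b), w in weights.items():
        if a == tag:
            scores[b] += w
        elif b == tag:
            scores[a] += w
    return [t for t, _ in scores.most_common(k)]

observe(["js", "node", "webapp"])   # one document's tags
observe(["js", "node"])             # another document
observe(["js", "css"])              # a profile
print(closest("js"))  # 'node' ranks first: it co-occurred with 'js' twice
```

In practice you would likely normalize the raw counts (e.g. by how often each tag appears on its own), since very popular tags otherwise dominate every ranking.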

Then I remembered that I came across web graphs, where websites were represented in a 2D space (x and y coordinates) and placed depending on their links using a force-directed layout algorithm.

While I do know how I would implement my first idea, I am not sure about the second one. How do I spread the tag coordinates when they are created? Do they all start at x: 0, y: 0?
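On the second idea: force-directed layouts typically initialize nodes at random positions rather than all at (0, 0), where the pairwise repulsion would be degenerate, and then iterate two forces until the layout settles: attraction along edges and repulsion between all pairs. A bare-bones sketch (all constants are arbitrary choices, not canonical values):

```python
import random

def layout(nodes, edges, steps=200, seed=42):
    """Minimal force-directed layout: springs on edges, repulsion everywhere."""
    random.seed(seed)
    pos = {n: [random.uniform(-1, 1), random.uniform(-1, 1)] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        for a in nodes:                       # repulsion between every pair
            for b in nodes:
                if a == b:
                    continue
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
                force[a][0] += 0.01 * dx / d2
                force[a][1] += 0.01 * dy / d2
        for a, b in edges:                    # attraction along edges (springs)
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += 0.05 * dx
            force[a][1] += 0.05 * dy
            force[b][0] -= 0.05 * dx
            force[b][1] -= 0.05 * dy
        for n in nodes:                       # move each node along its net force
            pos[n][0] += force[n][0]
            pos[n][1] += force[n][1]
    return pos

pos = layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
print(pos)  # positions after the simulation settles
```

Edge weights from the co-occurrence idea slot in naturally as spring strengths, so the two approaches combine rather than compete.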

Since I assume this is a common case of data sorting, I wondered what would be the common/best practices recommended by people in the field.

Are there documents, articles, libraries (npm?) or Wikipedia pages you could point me to, to help me understand what can or should ideally be done? Is my first option a good default?

Also, please let me know in comments if I should add or remove a tag to this question or edit its title: I’m not even sure of how to categorize it.

Encoding order relations in CNF

I want to convert timetable scheduling problems to SAT problems. Suppose there are $t$ time slots and $c$ classes. I will define $t\times c$ variables $x_{ij}$, where $x_{ij}$ is true iff class $j$ takes place in time slot $i$. My problem is: suppose there is a constraint that class $a$ takes place after class $b$. How do I encode that efficiently in CNF?
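Assuming the usual exactly-one-slot-per-class clauses are also present, a direct encoding of “a after b” forbids every violating pair of slots: for all $i \le j$ add the binary clause $(\lnot x_{i,a} \lor \lnot x_{j,b})$, i.e. class $a$ in slot $i$ and class $b$ in a slot $j$ no earlier than $i$ cannot hold together. That is $O(t^2)$ clauses; a sketch of the generator (the DIMACS variable numbering is my own convention):

```python
def after_clauses(a, b, t, c):
    """CNF clauses forcing class `a` to take place strictly after class `b`.

    x_{i,j} (class j in time slot i, both 0-based) is DIMACS variable i*c + j + 1.
    For every pair of slots i <= j, forbid a in slot i together with b in slot j.
    """
    var = lambda i, j: i * c + j + 1
    clauses = []
    for i in range(t):           # candidate slot for class a
        for j in range(i, t):    # slot for class b that is not earlier
            clauses.append([-var(i, a), -var(j, b)])
    return clauses

cls = after_clauses(a=0, b=1, t=4, c=2)
print(len(cls))   # t*(t+1)/2 = 10 binary clauses
```

A more compact alternative is a ladder-style (sequential) encoding with auxiliary variables meaning “class b has occurred in slot i or earlier”, which brings the clause count down to O(t) at the cost of t extra variables.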

Visualize file relations

I’m designing an interface where file relations are a bit complicated, and I could really use some help on how to visualize them.

Basically, it’s a program where you can change different kinds of parameters of trains. The problem is that one train file contains loads of other files (speed etc.), and these files can also be part of other trains. How can I show the relation that one file is part of two or more trains?

Many Thanks, Hannah

What are the relations between these two descriptions of let polymorphism?

In Types and Programming Languages by Pierce, there are two descriptions of let-polymorphism.

Section 23.8, Fragments of System F, on p. 359 says

This has led to various proposals for restricted fragments of System F with more tractable reconstruction problems.

The most popular of these is the let-polymorphism of ML (§22.7), which is sometimes called prenex polymorphism because it can be viewed as a fragment of System F in which type variables range only over quantifier-free types (monotypes) and in which quantified types (polytypes, or type schemes) are not allowed to appear on the left-hand sides of arrows. The special role of let in ML makes the correspondence slightly tricky to state precisely; see Jim (1995) for details.

Section 22.7, Let Polymorphism, says

The first step is to change the ordinary typing rule for let so that, instead of calculating a type for the right-hand side t1 and then using this as the type of the bound variable x while calculating a type for the body t2, …, it instead substitutes t1 for x in the body, and then typechecks this expanded expression … We write a constraint-typing rule for let in a similar way:

[image: the constraint-typing rule for let, CT-LetPoly]

In essence, what we’ve done is to change the typing rules for let so that they perform a step of evaluation

[image: the revised typing rule for let]

The second step is to rewrite the definition of double using the implicitly annotated lambda-abstractions from §22.6.

let double = λf. λa. f(f(a)) in
let a = double (λx:Nat. succ (succ x)) 1 in
let b = double (λx:Bool. x) false in
...

The combination of the constraint typing rules for let (CT-LetPoly) and the implicitly annotated lambda-abstraction (CT-AbsInf) gives us exactly what we need: CT-LetPoly makes two copies of the definition of double, and CT-AbsInf assigns each of the abstractions a different type variable. The ordinary process of constraint solving does the rest.

What are the relations between the two descriptions?

Does each of the two descriptions imply (or lead to) the other? How? More specifically, do the first description’s

  • type variables range only over quantifier-free types (monotypes)

  • quantified types (polytypes, or type schemes) are not allowed to appear on the left-hand sides of arrows

and the second description’s

  • the constraint typing rules for let (CT-LetPoly)
  • the implicitly annotated lambda-abstraction (CT-AbsInf)

imply each other, and how?


Related to my previous question What is "Hindley-Milner (i.e., unification-based) polymorphism"?