## Parallel computation in a computer with 2 CPUs

My computer has 2 CPUs, each with 8 cores. However, when using parallel operation, only 8 cores of one CPU can be called. How can I set it so that 8 cores of the other CPU can also participate in the calculation?

## Distributed predicate computation on event stream

My question is actually a request for papers, articles, texts, or books on a problem that I’m trying to solve at work.

I’m working on a program that computes a predicate value (true or false) for a given object in a distributed system in which there is a stream of events that can change the object’s attributes and, consequently, the predicate value. Whenever the predicate value changes, the program must send a notification about this change.

For example, consider that there is an object A which has an attribute called name and consider that there is a predicate P which is true when the object’s name is equal to Jhon. Each event in the stream has a timestamp and a value for the attribute name. So consider the following sequence of events:

    e1 = { name: Jhon, timestamp: 1 }
    e2 = { name: Jhon, timestamp: 2 }
    e3 = { name: Peter, timestamp: 3 }
    e4 = { name: Doug, timestamp: 4 }
    e5 = { name: Jhon, timestamp: 5 }

Now, the events don’t necessarily show up in the stream in the correct order and, even worse, there are multiple computers processing this stream of events in parallel. However, for simplicity, I’ll continue this example with only one computer.

If the events arrive and are processed in the order described above, then the notifications sent should be:

    P(A) = true when e1 arrives
    P(A) = false when e3 arrives
    P(A) = true when e5 arrives

That is the correct sequence of notifications. Now, imagine that the computer receives the events in the following order:

e1, e5, e2, e4, e3 

A naive algorithm which doesn’t consider the event’s timestamp would send an incorrect sequence of notifications:

    P(A) = true when e1 arrives
    P(A) = false when e4 arrives

The algorithm that I’m working on considers the timestamps and infers when a notification should have been sent but was not. So when e3 arrives, it will notice that the notification P(A) = true for e5 was not sent. This feels a bit like reinventing the wheel, though I’m not aware of any literature on this problem. I would like some references to this problem or to something similar, such as papers dealing with this kind of problem.

The real problem is considerably more complex: it involves storing the predicate $$\times$$ object state in a database that acts as shared state between the computers processing the stream. Thousands of events arrive per second, so it is not possible to keep all events stored in a database.
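For the single-machine case in the example above, one way to make the idea concrete is to keep the events seen so far sorted by timestamp, recompute the points where P flips, and compare them with what has already been announced. The sketch below is only an illustration under loud assumptions: it stores every event (which the real system cannot do), it assumes the first event in timestamp order always produces an initial notification, and the names `Event`, `Notifier`, `transitions`, and `on_event` are hypothetical.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    timestamp: int
    name: str = field(compare=False)  # ordering uses the timestamp only


def predicate(event):
    """P from the example: true when the name equals Jhon."""
    return event.name == "Jhon"


def transitions(timeline):
    """List (timestamp, value) for every point where P changes along a sorted timeline."""
    result = []
    previous = None  # assumption: the earliest event always yields a notification
    for event in timeline:
        value = predicate(event)
        if value != previous:
            result.append((event.timestamp, value))
            previous = value
    return result


class Notifier:
    def __init__(self):
        self.timeline = []  # all events seen so far, sorted by timestamp (assumption)
        self.sent = []      # transitions currently believed to be correct

    def on_event(self, event):
        """Insert an event and return (new notifications, notifications to retract)."""
        bisect.insort(self.timeline, event)
        current = transitions(self.timeline)
        added = [t for t in current if t not in self.sent]
        retracted = [t for t in self.sent if t not in current]
        self.sent = current
        return added, retracted


if __name__ == "__main__":
    names = {1: "Jhon", 2: "Jhon", 3: "Peter", 4: "Doug", 5: "Jhon"}
    notifier = Notifier()
    for ts in (1, 5, 2, 4, 3):  # the out-of-order arrival from the example
        print(ts, notifier.on_event(Event(ts, names[ts])))
```

With the arrival order e1, e5, e2, e4, e3, this sketch ends with transitions at timestamps 1, 3, and 5, which matches the correct sequence. Along the way it announces a change to false at timestamp 4 and retracts it once e3 arrives, so a real design would need to decide whether retractions are acceptable or whether notifications should be delayed.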

## Time computation problems with “Maximize” expression

I just started to use Mathematica a few weeks ago. I am afraid there is something that does not work on my laptop because, for the following simple command, it takes too much time. Do you know if there is a mistake in this command syntax? Thank you in advance for your help.

    Maximize[{(g/e)*Sqrt[((g - e)^2 + (f - h)^2)], 0 <= e <= 1, 0 <= f <= 1,
      e^2 + f^2 == 1, 0 <= g <= e, 0 <= h <= f}, {e, f, g, h}]
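As a rough sanity check on what this expression does over the feasible region, here is a coarse numerical scan in Python. It is not a replacement for `Maximize` and not a diagnosis of the slowdown; it assumes real variables and skips `e = 0`, where `g/e` is undefined. The function names `objective` and `coarse_scan` are hypothetical.

```python
import math


def objective(e, f, g, h):
    """The expression passed to Maximize, valid for e > 0."""
    return (g / e) * math.sqrt((g - e) ** 2 + (f - h) ** 2)


def coarse_scan(steps=60):
    """Scan a grid of feasible points with e > 0; return (best value, best point)."""
    best_value, best_point = -1.0, None
    for i in range(1, steps + 1):               # e = i/steps, skipping e = 0
        e = i / steps
        f = math.sqrt(max(0.0, 1.0 - e * e))    # enforces e^2 + f^2 == 1
        for a in range(steps + 1):              # g ranges over [0, e]
            g = e * a / steps
            for b in range(steps + 1):          # h ranges over [0, f]
                h = f * b / steps
                value = objective(e, f, g, h)
                if value > best_value:
                    best_value, best_point = value, (e, f, g, h)
    return best_value, best_point


if __name__ == "__main__":
    for steps in (10, 30, 60):
        value, point = coarse_scan(steps)
        print(steps, value, point)
```

On such a grid the largest values sit at the smallest `e`, with `g = e` and `h = 0`, where the expression reduces to `f = Sqrt[1 - e^2]`. That is below 1 for every `e > 0` and tends to 1 as `e` goes to 0, so the supremum appears not to be attained at any point where the expression is defined. This may be part of why an exact maximization struggles, though that is only a guess.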

## Faster computation of $ke^{-(x - h)^2}$

The question is quite simple: almost every computer language today provides the $$\exp(x)$$ function in its standard library, which can be used to compute expressions like $$ke^{-(x - h)^2}.$$ However, I would like to know whether this function is the fastest way to compute the above expression. In other words, is there some way to compute $$ke^{-(x - h)^2}$$ faster than with the standard library’s $$\exp(x)$$ while keeping the result very accurate?

I would like to specify that Taylor series will not work for my application, nor will any other polynomial approximations.
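One non-polynomial idea, which applies only under an assumption the question does not state (that the expression is needed at many evenly spaced values of x), is a multiplicative recurrence: consecutive values differ by a factor that itself changes by a constant factor. The sketch below illustrates the arithmetic rather than serving as a benchmark, and the names `gaussian_direct` and `gaussian_recurrence` are hypothetical.

```python
import math


def gaussian_direct(k, h, x0, dx, n):
    """Evaluate k*exp(-(x-h)^2) at x0, x0+dx, ... with one exp call per point."""
    return [k * math.exp(-((x0 + i * dx - h) ** 2)) for i in range(n)]


def gaussian_recurrence(k, h, x0, dx, n):
    """Same values using three exp calls in total and two multiplications per point."""
    u = x0 - h
    value = k * math.exp(-u * u)                 # G_0
    ratio = math.exp(-2.0 * u * dx - dx * dx)    # G_1 / G_0
    step = math.exp(-2.0 * dx * dx)              # ratio_{i+1} / ratio_i (constant)
    out = []
    for _ in range(n):
        out.append(value)
        value *= ratio   # advance G_i -> G_{i+1}
        ratio *= step    # advance the ratio for the next point
    return out


if __name__ == "__main__":
    direct = gaussian_direct(2.5, 0.3, -3.0, 0.006, 1000)
    fast = gaussian_recurrence(2.5, 0.3, -3.0, 0.006, 1000)
    print(max(abs(a - b) / a for a, b in zip(direct, fast)))
```

The trade-off is that rounding errors accumulate along the recurrence, so for long runs it would probably need to be re-seeded with a direct $$\exp$$ call every so often. Whether that meets the "very accurate" requirement depends on the application.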

## CPU Registers and Computation

How exactly does the control unit in the CPU retrieve data from registers? Does it retrieve it bit by bit?

For example, if I’m adding two numbers, A+B, how does the computation take place at the memory level?
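To make the A+B case concrete, here is a small software model of a ripple-carry adder, one textbook way of building addition out of one-bit full adders. It is only a sketch: in typical hardware the bits of a register are read together over parallel wires and all the adder stages exist at once, so the loop below stands in for circuitry rather than for a sequence of steps the control unit performs. Real CPUs also often use faster adder designs, and the names `full_adder` and `ripple_carry_add` are hypothetical.

```python
def full_adder(a_bit, b_bit, carry_in):
    """Add three single bits; return (sum bit, carry out)."""
    sum_bit = a_bit ^ b_bit ^ carry_in
    carry_out = (a_bit & b_bit) | (carry_in & (a_bit ^ b_bit))
    return sum_bit, carry_out


def ripple_carry_add(a, b, width=8):
    """Add two unsigned integers of the given bit width; return (result, final carry)."""
    result, carry = 0, 0
    for i in range(width):  # bit 0 is the least significant bit
        sum_bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry


if __name__ == "__main__":
    print(ripple_carry_add(0b00001111, 0b00000001))
    print(ripple_carry_add(200, 100))  # overflows 8 bits, so the final carry is set
```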

## Understanding CRC Computation with PCLMULQDQ

I am currently reading this paper which shows how to calculate CRC using the instruction PCLMULQDQ. I don’t quite understand the equations in it yet.

1. Starting with this one for the definition of crc32:

$$\mathrm{CRC}(M(x)) = x^{\deg(P(x))} \cdot M(x) \bmod P(x)$$

Before reading this paper, I had only ever seen $$\mathrm{CRC}(M(x)) = M(x) \bmod P(x)$$. Why can $$M(x)$$ be multiplied by $$x^{\deg(P(x))}$$ (and still give the CRC), and why do they do it?

2. $$M(x) = D(x) \cdot x^T \oplus G(x)$$.

Again, I do not understand how they derived this equation.

3. I also don’t understand why they are using multiplication at all (although this might become clear once I understand 1) and 2)).

All CRC implementations I’ve seen so far are basically divisions with or without precomputed values, so why do they use multiplication?

In addition, I’d also be very thankful to be pointed towards resources that could help me derive these equations myself in the future.

EDIT: I’ve managed to derive 2). Any bitstring M representing a polynomial can be cut into two smaller bitstrings A and B. To put them back together to create M, you have to multiply A by x to the power of the number of digits (= T) in B, thereby appending T zeros to the right end of A. Once you’ve done this, you can xor (add in the Galois field) the two back together, creating M. E.g. M = 10101, A = 101, B = 01 => T = 2; A * x^T = 10100. Xor with B: 10100 xor 01 = 10101 = M.

EDIT2: typos
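The splitting argument in the first EDIT can be checked mechanically by treating Python integers as bit strings of GF(2) polynomial coefficients. The sketch below is not taken from the paper: the generator `P = 0b1011` is a hypothetical toy polynomial, and `gf2_mod`, `gf2_mul`, and `crc` are illustrative helper names. It checks three facts: the remainder is linear over xor, multiplying by $$x^{\deg(P(x))}$$ means appending that many zero bits (so the CRC can be xored into them and the whole string leaves remainder zero), and reducing $$A \cdot x^T$$ gives the same result as carry-less multiplying A by the precomputed constant $$x^T \bmod P(x)$$ and then reducing. As far as I understand it, that last identity is where multiplication enters the folding approach.

```python
def gf2_mod(value, poly):
    """Remainder of value(x) divided by poly(x), with coefficients in GF(2)."""
    degree = poly.bit_length() - 1
    while value.bit_length() - 1 >= degree:
        # Cancel the leading term by xoring in a shifted copy of poly.
        value ^= poly << (value.bit_length() - 1 - degree)
    return value


def gf2_mul(a, b):
    """Carry-less (GF(2) polynomial) product of a and b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def crc(message, poly):
    """x^deg(P) * M(x) mod P(x): append deg(P) zero bits, then reduce."""
    degree = poly.bit_length() - 1
    return gf2_mod(message << degree, poly)


P = 0b1011                 # hypothetical toy generator x^3 + x + 1
A, B, T = 0b101, 0b01, 2   # the split from the EDIT above
M = (A << T) ^ B           # A * x^T xor B

if __name__ == "__main__":
    print(bin(M))
    print(gf2_mod(M, P) == gf2_mod(A << T, P) ^ gf2_mod(B, P))
    print(gf2_mod((M << 3) ^ crc(M, P), P))
    print(gf2_mod(A << T, P) == gf2_mod(gf2_mul(A, gf2_mod(1 << T, P)), P))
```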

## Computational complexity books for a mathematician

I recently attended some lectures on computational complexity (or complexity theory, I am not sure which is the correct name) and I fell in love with it. I would like to find some books, online courses, or resources of any kind to self-study this (surely) wonderful subject.

My background is pure mathematics with an emphasis on discrete mathematics (graph theory, crypto, coding theory, combinatorics…), with no background in computer science. I am not sure if the latter is mandatory.

## What is the simplest quantum algorithm to visualize a quantum computation?

I’m interested in visualizing how a simple quantum computation can be done, step by step. Can you help me?

I need any simple example of how qubits can be used to make a computation.
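One small example that can be followed step by step is preparing an entangled two-qubit state: start in |00>, apply a Hadamard gate to the first qubit, then a CNOT controlled by the first qubit. The sketch below simulates this with a plain list of four amplitudes ordered |00>, |01>, |10>, |11>. It is a classical simulation for illustration only, and the function names are hypothetical.

```python
import math


def apply_hadamard_q0(state):
    """Hadamard on the first qubit of a two-qubit state [a00, a01, a10, a11]."""
    s = 1.0 / math.sqrt(2.0)
    a00, a01, a10, a11 = state
    return [s * (a00 + a10), s * (a01 + a11), s * (a00 - a10), s * (a01 - a11)]


def apply_cnot_q0_q1(state):
    """CNOT with the first qubit as control: swaps the |10> and |11> amplitudes."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def probabilities(state):
    """Measurement probabilities for outcomes 00, 01, 10, 11."""
    return [abs(amplitude) ** 2 for amplitude in state]


def run_bell_circuit():
    """Return the labelled state after each step of the circuit."""
    steps = []
    state = [1.0, 0.0, 0.0, 0.0]
    steps.append(("start in |00>", state))
    state = apply_hadamard_q0(state)
    steps.append(("after Hadamard on qubit 0", state))
    state = apply_cnot_q0_q1(state)
    steps.append(("after CNOT", state))
    return steps


if __name__ == "__main__":
    for label, state in run_bell_circuit():
        print(label, state, probabilities(state))
```

After the last step only the outcomes 00 and 11 have non-zero probability, each one half, so measuring one qubit fixes the other.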

## Same computation order using postfix notation?

I’m trying to understand arithmetic using stacks, specifically converting infix notation to postfix notation. My question is how to convert an expression like 1 + (2 + 3) + (4 + 5) so that it is computed in exactly the order that the order of operations dictates.

Meaning: 1 + (2 + 3) + (4 + 5) becomes 1 + 5 + (4 + 5) then 1 + 5 + 9 then 6 + 9

If you do:

Push 1, Push 2, Push 3, Add, Push 4, Push 5, Add

Where Add pops the top two operands off the stack, adds them, and then pushes the sum back on the stack.

Alternative notation: 1 2 3 + 4 5 +

How do you add 1 + (sum of 2 and 3) next? If you do another Add, then the sum of 2 and 3 will be added to the sum of 4 and 5, because these are the top two on the stack. Is it impossible to do it in the exact same order, that is, evaluating each parenthesized group first from left to right and then doing the rest of the computation left to right?
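For comparison, here is a minimal sketch of the usual infix-to-postfix conversion (the shunting-yard approach) together with a stack evaluator that records each operation as it happens. It handles only non-negative integers, `+`, `-`, `*`, and parentheses, and the names `to_postfix` and `evaluate_postfix` are hypothetical.

```python
import operator
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2}
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def to_postfix(expression):
    """Convert an infix string to a list of postfix tokens (left-associative operators)."""
    output, operators = [], []
    for token in re.findall(r"\d+|[-+*()]", expression):
        if token.isdigit():
            output.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()  # discard the matching "("
        else:
            # Emit pending operators of equal or higher precedence first.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                output.append(operators.pop())
            operators.append(token)
    while operators:
        output.append(operators.pop())
    return output


def evaluate_postfix(tokens):
    """Evaluate postfix tokens; return (value, list of (left, op, right) in execution order)."""
    stack, steps = [], []
    for token in tokens:
        if token in OPERATIONS:
            right = stack.pop()
            left = stack.pop()
            steps.append((left, token, right))
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(int(token))
    return stack.pop(), steps


if __name__ == "__main__":
    postfix = to_postfix("1 + (2 + 3) + (4 + 5)")
    print(" ".join(postfix))
    print(evaluate_postfix(postfix))
```

For the expression in the question this produces `1 2 3 + + 4 5 + +`: the Add that joins 1 with the sum of 2 and 3 is issued before 4 and 5 are pushed. The additions therefore run as 2 + 3, 1 + 5, 4 + 5, 6 + 9. Every operator receives the same operands as in the infix reading, but 1 + 5 happens before 4 + 5 rather than after both parenthesized groups.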

## Minimizing computation time

I’m having issues with reducing the computation time of an NMinimize call. I’ve tried setting WorkingPrecision, using N, and adding constraints to the variables in NMinimize, but the computation time is still far too long.

    ClearAll["*"]
    $Assumptions = {HeavisideTheta[0] == 1, HeavisideTheta[0.] == 1,
       0 < j1 <= 5, 0 < j2 <= 5, 0 < j3 <= 5, 0 < j4 <= 5, 0 < t1,
       t1 <= t2, t2 <= t3, t3 <= t4, t4 <= t5, t5 <= t6, t6 <= t7}
    eps = 0.05
    wn = 2 Pi 5

    func[q_?NumericQ] :=
     Module[{vmax, sol, j, a, v, x, constraint1, constraint2, constraint3,
       t1, t2, t3, t4, t5, t6, t7, yofs, yoft, j1, j2, j3, j4, temp500},
      j[t_] =
       j1 (UnitStep[t] - UnitStep[t - t1]) -
        j2 (UnitStep[t - t2] - UnitStep[t - t3]) -
        j3 (UnitStep[t - t4] - UnitStep[t - t5]) +
        j4 (UnitStep[t - t6] - UnitStep[t - t7]);
      a[t_] = Integrate[j[t], t];
      v[t_] = Integrate[a[t], t];
      x[t_] = Integrate[v[t], t];
      constraint1 =
       FullSimplify[
        a[t7], {0 < j1 <= 5, 0 < j2 <= 5, 0 < j3 <= 5, 0 < j4 <= 5,
         0 < t1, t1 <= t2, t2 <= t3, t3 <= t4, t4 <= t5, t5 <= t6,
         t6 <= t7}];
      constraint2 =
       FullSimplify[
        v[t7], {0 < j1 <= 5, 0 < j2 <= 5, 0 < j3 <= 5, 0 < j4 <= 5,
         0 < t1, t1 <= t2, t2 <= t3, t3 <= t4, t4 <= t5, t5 <= t6,
         t6 <= t7}];
      constraint3 =
       FullSimplify[
        x[t7], {0 < j1 <= 5, 0 < j2 <= 5, 0 < j3 <= 5, 0 < j4 <= 5,
         0 < t1, t1 <= t2, t2 <= t3, t3 <= t4, t4 <= t5, t5 <= t6,
         t6 <= t7}];
      vmax = (j1 t1 t1 + (t3 - t2) j1 t1)/2 + (t2 - t1) j1 t1;
      sol = NMinimize[{t7, {t1 j1 - j2 (t3 - t2) == 0,
          j1 t1 <= 0.7, -j3 (t5 - t4) >= -0.7,
          j1 t1 (t2 + t3 - t1)/2 <= 0.14, constraint1 == 0,
          constraint2 == 0, constraint3 == 0.05, 0 < j1 <= 5, 0 < j2 <= 5,
          0 < j3 <= 5, 0 < j4 <= 5, 0 < t1, t1 <= t2, t2 <= t3, t3 <= t4,
          t4 <= t5, t5 <= t6, t6 <= t7, t7 < 2}}, {j1, j2, j3, j4, t1,
         t2, t3, t4, t5, t6, t7}];
      {j1, j2, j3, j4, t1, t2, t3, t4, t5, t6,
        t7} = {j1, j2, j3, j4, t1, t2, t3, t4, t5, t6, t7} /. sol[[2]];
      Clear[j3, j4, t5, t6, t7, t4];
      j3 = q;
      j4 = q;
      {t4, t5, t6, t7} = {t4, t5, t6, t7} /.
        NMinimize[{t7, {constraint1 == 0, constraint2 == 0,
            constraint3 == 0.05, -j3 (t5 - t4) >= -0.7, t5 > t4, t6 >= t5,
            t7 > t6, t4 >= t3}}, {t4, t5, t6, t7}][[2]];
      yofs = -LaplaceTransform[a[t], t, s]/(s^2 + 2 eps wn s + wn^2);
      yoft = InverseLaplaceTransform[yofs, s, t];
      NMinimize[{N[Re[FullSimplify[yoft, t > t7]]], t > t7, t < 1.5 t7},
        t, WorkingPrecision -> 9][[1]]
      (*Column[{NMinimize[{Re[FullSimplify[yoft, t > t7]], t > t7}, t][[1]],
        Plot[Re[yoft], {t, t7, 1.5 t7}]}]*)
      ]
    (*func[3]
      func[4] *)
    DiscretePlot[func[q], {q, 3, 5, 0.01}]

When I switch the function’s output to the commented-out version and call the function twice individually, the results are what I want (it takes around 3 minutes to compute the 2 outputs). But running the discrete plot, which makes 200 calls, would take about 5 hours, which is too much. Is there a way to reduce this? Using N, adding constraints, or setting WorkingPrecision isn’t changing the computation time either.