Postgres: Efficient schema for querying with one exact string match and two range matches

The table I need to query:

CREATE TABLE regions (
    state      text NOT NULL,
    zip_begin  text NOT NULL,  -- 9-digit zip code
    zip_end    text NOT NULL,
    date_begin date NOT NULL,
    date_end   date,
    data ...
)

There are ~50 states and between 0 and 5M rows per state. The zip and date ranges might overlap.

The following will be one of the most common queries in my OLTP application:

SELECT data
FROM regions
WHERE state = {state}
  AND {date} BETWEEN date_begin AND date_end
  AND {zip} BETWEEN zip_begin AND zip_end

This query usually yields one row, but may sometimes yield more.

From the Postgres docs, it sounds like a GiST index might do what I need, but I don’t really understand how those work.
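In short, GiST can index "point falls inside an interval" containment tests, which is exactly this query's shape, and the btree_gist extension lets the plain equality column share the same index. A sketch of one possible setup (untested; the ziprange type and the index name are my own additions, since Postgres has no built-in range type over text):

```sql
-- btree_gist allows the scalar equality column (state) to participate
-- in a GiST index alongside range columns.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- There is no built-in range type over text, so define one.
CREATE TYPE ziprange AS RANGE (subtype = text);

-- Expression index over the state plus both intervals, '[]' = inclusive bounds.
CREATE INDEX regions_lookup_idx ON regions USING gist (
    state,
    daterange(date_begin, date_end, '[]'),
    ziprange(zip_begin, zip_end, '[]')
);

-- Rewrite the query with containment (@>) so the index can be used:
SELECT data
FROM regions
WHERE state = $1
  AND daterange(date_begin, date_end, '[]') @> $2::date
  AND ziprange(zip_begin, zip_end, '[]') @> $3::text;
```

One caveat: a NULL date_end becomes an open-ended range here and will match any date, whereas BETWEEN rejects NULL bounds, so decide which semantics you actually want before switching.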

Efficient method to find patterns $\frac{a}{b}=k \frac{c}{d}$ in the set

I have a list of rules. The whole set is bigger; this is just a small part:

rule1=MapIndexed[Subscript[A, First@#2]->#&,{0.017840140244041125`,0.03701275157916095`,0.051660116184981536`,0.08951663681268851`,0.11158027392659164`,0.11199090055895211`,0.11722479640123873`,0.12880796041375156`,0.13998629375261973`,0.14799408926444602`,0.1603406353916348`,0.1691630967698906`,0.17603820358593392`,0.17625123467649714`,0.19645343424258566`,0.21551707277352605`,0.2181102569613515`,0.23391774392335418`,0.2550771120859639`,0.268556465535236`,0.3048461639996855`,0.31092372219882125`,0.3127938111713915`,0.31392896827464023`,0.3409781586745735`,0.34596915956848695`,0.3529110816686192`,0.3616255919094322`,0.367472697175696`,0.3934402946502719`,0.4047693032887368`,0.4311153156718978`,0.4683757162129396`,0.46952099223657906`,0.5072262199843859`,0.5221019133548683`,0.5343944162249954`,0.547890105814087`,0.5599232760856494`,0.5668852126411741`,0.5798751142315612`,0.5934466012255624`,0.5974131205016744`,0.6095346242098095`,0.6435968663448772`,0.6590218413254265`,0.6941771182743655`,0.7420644764943874`,0.7438917323215621`,0.7969894568033051`,0.8000537128029253`,0.8374749504134635`,0.8407436273863764`,0.8616085956899902`,0.9112275732642781`,0.9275783068606626`,0.9389371022951489`,0.9437334963687791`,0.9593199516799973`,1}]; 

Here $a, b, c, d$ are four numbers from the set, at most two of which are equal, satisfying the relationship
$\frac{a}{b} = k \frac{c}{d}$
where $k$ is a rational number.
When $k = 1$, the GroupBy-based approach is fast:

GroupBy[
  #/#2 -> (#/#2 /. rule1) & @@@ Subsets[Keys@rule1, {2}],
  Round[Last@#, 10^-10.] & -> First,
  Equal @@ # &
] // DeleteCases[True] // AbsoluteTiming

When $k \neq 1$ (assume $k = 1/3$), it's not easy to do with GroupBy. I thought of a way using Gather, but it's much slower than the previous approach:

Gather[
  #/#2 -> (#/#2 /. rule1) & @@@ Subsets[Keys@rule1, {2}],
  Abs[#[[2]] #2[[2]] - 1/3] < 10^-10. &
] // Select[Length@# > 1 &] // AbsoluteTiming

I want to know: if $k \neq 1$, is there a way to use GroupBy instead of Gather, or some other more efficient method?
Thank you in advance.
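The GroupBy trick generalizes to $k \neq 1$ by hashing: bucket every pair ratio under a rounded key, then for each bucket do a single hash lookup for the bucket at $r/k$, instead of Gather's pairwise comparisons. The idea is language-agnostic; here is a Python sketch (the helper name and tolerance scheme are mine, not the original Mathematica code):

```python
from itertools import combinations

def ratio_matches(values, k, digits=10):
    """Find pairs of index pairs ((a, b), (c, d)) with
    values[a]/values[b] ~= k * values[c]/values[d].

    Bucket every pair ratio by a rounded key, then for each bucket look
    up the bucket at r/k -- one hash lookup per bucket instead of
    comparing all pairs of pairs. Ratios landing exactly on a rounding
    boundary can be missed; this is a sketch, not a robust tolerance scheme.
    """
    buckets = {}  # rounded ratio -> (raw ratio, list of index pairs)
    for i, j in combinations(range(len(values)), 2):
        r = values[i] / values[j]
        raw, pairs = buckets.setdefault(round(r, digits), (r, []))
        pairs.append((i, j))
    matches = []
    for raw, pairs in buckets.values():
        hit = buckets.get(round(raw / k, digits))
        if hit is not None and hit[1] is not pairs:
            matches += [(p, q) for p in pairs for q in hit[1]]
    return matches
```

For example, with values [1, 2, 3, 6] and k = 1/2, the pair ratios 1/3 and 2/3 satisfy the relation, so ((0, 2), (1, 2)) is reported.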

Efficient sinc interpolation of a list

I'm looking for an efficient way to construct a sinc interpolation of a list of numbers, i.e. an interpolating function in the form of a Fourier series with a chosen maximum frequency. The input list is supposed to be an equidistant sample of some real smooth function (e.g. a function with a finite-support Fourier transform), and the interpolation should be a continuous function that is a sum of harmonic functions (cosines and sines) whose coefficients equal the Fourier coefficients of the list for all frequencies up to the chosen maximum frequency (see Wikipedia: Whittaker–Shannon interpolation formula). Increasing the max frequency beyond the band limit of the original function should give a perfect reconstruction of it. I'd guess there is a predefined Mathematica function doing this, but I couldn't find it.

One way would be to apply Interpolation to the list, then NFourierSeries to the resulting InterpolatingFunction (choosing the desired max frequency), but this is obviously very inefficient. LowpassFilter should be doing something similar, but its output is a list, not a function. One option for an efficient solution that I'm trying to implement is to compute the Fourier coefficients from the list using Fourier, construct a list of harmonic functions with the right frequencies (e.g. Table[Sin[2 Pi n #/T], {n, nmax}] & for the sines), and then Dot-multiply the two lists. I found this excellent post that computes the correct factors: Numerical Fourier transform of a complicated function
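For reference, the "Fourier coefficients + harmonic list" route can be sketched in a few lines outside Mathematica as well. A minimal NumPy version (my own helper; assumes the samples cover exactly one period and nmax is below the Nyquist index, so the Nyquist term needs no special casing):

```python
import numpy as np

def trig_interp(samples, period, nmax):
    """Truncated Fourier-series (sinc) interpolant of real, equidistant
    samples covering one full period.

    Keeps harmonics up to nmax; with nmax at or above the band limit of
    the sampled function, this reconstructs it exactly
    (Whittaker-Shannon). Sketch only: assumes nmax < len(samples)/2.
    """
    n = len(samples)
    c = np.fft.rfft(samples) / n  # c[m] = m-th complex Fourier coefficient
    def f(t):
        t = np.asarray(t, dtype=float)
        out = np.full_like(t, c[0].real)  # DC term
        for m in range(1, nmax + 1):
            w = 2 * np.pi * m / period
            # 2 Re(c_m e^{i w t}) = 2 (Re c_m cos(wt) - Im c_m sin(wt))
            out = out + 2 * (c[m].real * np.cos(w * t) - c[m].imag * np.sin(w * t))
        return out
    return f
```

Sampling cos(2*pi*t) at 8 points on [0, 1) and interpolating with nmax = 3 reproduces the cosine to machine precision.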

Disk space efficient foreign key index for insert intensive scientific database?

I'm working on tuning a scientific database whose associated simulation is very insert-intensive (i.e., it runs for a long time inserting data, then executes a summary query at the end). One of the tables is starting to cause problems: the table size is 235 GB with index sizes of 261 GB, and the server only has 800 GB, so we would like to free up a bit of space.

Currently there is one foreign key reference (integer data) that is stored as a clustered b-tree. This has been good for the summary queries, but likely isn't helping the disk space issues.

Is there a more disk-efficient way of storing this foreign key index? Would it make sense to switch to a hash index instead of the b-tree?

What is an efficient way to get a look-at direction from either a quaternion or a transformation matrix?

So, I have an object in my custom engine (C++), with a column-major transform in world space. I’m using a package that takes a look-at direction as an input. What’s the most efficient way to get a look-at direction from this transform? Do I extract the rotation matrix? Do I try to extract a quaternion?
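Neither extraction should be necessary: the columns of the upper-left 3x3 block of the transform are the object's local axes expressed in world space, so the look direction is just one (possibly negated) column. A NumPy sketch of the idea (my own helper; assumes -Z is local forward, OpenGL-style, which your engine's convention may not match):

```python
import numpy as np

def look_direction(world_from_local):
    """Look-at direction of a 4x4 world transform.

    The upper-left 3x3 block's columns are the local basis axes in
    world space, so no quaternion or rotation-matrix extraction is
    needed: take the local-forward column directly. Assumes local
    forward is -Z (OpenGL convention); use +m[:3, 2] for +Z-forward.
    Normalization only matters if the transform carries scale.
    """
    fwd = -world_from_local[:3, 2]  # third column = local Z axis in world space
    return fwd / np.linalg.norm(fwd)
```

Note that "column-major" is purely a memory-layout concern in C++; in the math above it just means indexing the basis vectors as columns. So in C++ the same operation is reading three consecutive floats of the third column and negating them.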

Efficient algorithm for this combinatorial problem [closed]

$\newcommand{\argmin}{\mathop{\mathrm{argmin}}\limits}$

I am working on a combinatorial optimization problem and I need to figure out a way to solve the following equation. It naturally popped up in a method I chose to use in an assignment I was working on.

Given a fixed set $\Theta$ with each element in $(0,1)$ and $N$ elements in total ($N$ is about 25), I need to find a permutation of the elements of $\Theta$ such that
$$\vec K = \argmin_{\vec k = \mathrm{Permutation}(\Theta)} \sum_{i=1}^N t_i D(\mu_i\|k_i)$$
where $\vec t, \vec \mu$ are given vectors of length $N$ and $D(p\|q)$ is the KL divergence of the Bernoulli distributions with parameters $p$ and $q$ respectively. Further, the $N$ elements of $\vec t$ sum to 1 and all elements of $\vec \mu$ lie in $[0,1]$.

It is just impossible to go through all $N!$ permutations. A greedy algorithm that does not give the exact $\vec K$ would also be acceptable to me if there is no other apparent method. Please let me know how to proceed!
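One observation that may help: the objective is a sum of independent per-position costs $t_i D(\mu_i\|k_i)$, so this is a linear assignment problem, solvable exactly in $O(N^3)$ with the Hungarian algorithm rather than by enumerating $N!$ permutations. A sketch using SciPy's solver (helper names are mine):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.special import xlogy

def bern_kl(p, q):
    """KL divergence D(p||q) between Bernoulli(p) and Bernoulli(q).

    xlogy handles the p = 0 and p = 1 endpoints (0 * log 0 == 0), which
    matters since mu may touch [0, 1]; q stays in (0, 1) by assumption.
    """
    return xlogy(p, p / q) + xlogy(1 - p, (1 - p) / (1 - q))

def best_permutation(theta, t, mu):
    """Exact minimizer of sum_i t_i * D(mu_i || k_i) over permutations.

    Build the N x N cost matrix C[i, j] = t_i * D(mu_i || theta_j) and
    solve the resulting linear assignment problem in O(N^3) via the
    Hungarian algorithm.
    """
    cost = t[:, None] * bern_kl(mu[:, None], theta[None, :])
    row, col = linear_sum_assignment(cost)
    return theta[col]  # position i receives theta[col[i]]
```

For N around 25 this is essentially instantaneous, and it returns the exact $\vec K$, so no greedy approximation is needed.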

In theory, should neuromorphic computers be more efficient than traditional computers when performing logic?

There is a general sentiment about neuromorphic computers that they are simply "more efficient" than von Neumann machines.

I've heard a lot of talk about neuromorphic computers in the context of machine learning.
But is there any research into performing logic and mathematics in general on such computers? How would one translate arithmetic, logic, and algorithms into "instructions" for a neuromorphic computer, if there are no logic structures in the hardware itself?

It is common to draw parallels with the brain in this context, so here's one: brains are great at recognising faces and such, but I don't think I can do maths faster than an Arduino (and that thing doesn't need much energy).

Most efficient method for set intersection

Suppose I have two finite sets, $A$ and $B$, with arbitrarily large cardinalities, the ordered integral elements of which are determined by unique (and well-defined) polynomial generating functions $f:\mathbb{N}\rightarrow\mathbb{Z}$ given by, say, $f_1(x_i)$ and $f_2(x_j)$, respectively. Assume, also, that $A\cap B$ is always a singleton set $\{a\}$ such that $a=f_1(x_i)=f_2(x_j)$ where I've proven that $i\neq j$.

Assuming you can even avoid the memory-dump problem, it seems the worst way to find $\{a\}$ is to generate both sets and then check for the intersection. I wrote a simple program in Sagemath that does this, and, as I suspected, it doesn't work well for sets with even moderately large cardinalities.

Is there a better way to (program a computer to) find the intersection of two sets, or is it just as hopeless (from a time-complexity perspective) as trying to solve $f_1(x_i)=f_2(x_j)$ directly when the cardinalities are prohibitively large? Is there a parallel-computing possibility? If not, perhaps there's a way to limit the atomistic search based on a range of values: i.e., each loop terminates the search after it finds the first $i$ value such that $f_1(x_i)>f_2(x_j)$, knowing that $f_1(x_{i+1}), f_1(x_{i+2}), f_1(x_{i+3}), \cdots, f_1(x_{i+n})>f_1(x_i)>f_2(x_j)$.
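The range-limiting idea in the last paragraph is essentially a two-pointer merge: if both generating functions are increasing on the index range of interest, repeatedly advance whichever sequence currently holds the smaller value. Neither set is ever materialized, and the early-termination condition is built in. A Python sketch under that monotonicity assumption (and assuming, as the question guarantees, that a common value exists):

```python
def poly_intersection(f1, f2):
    """Find the common value of two strictly increasing integer sequences.

    Two-pointer merge: evaluate both sequences lazily and advance
    whichever one is currently smaller, so neither set is materialized.
    Assumes f1 and f2 are strictly increasing on n = 0, 1, 2, ... and
    that a common value exists (the singleton intersection from the
    question); otherwise this loops forever.
    """
    i = j = 0
    while True:
        a, b = f1(i), f2(j)
        if a == b:
            return a
        if a < b:
            i += 1   # f1 is behind; its later values only grow
        else:
            j += 1   # f2 is behind
```

The cost is $O(i^* + j^*)$ evaluations, where $i^*, j^*$ are the indices at which the match occurs, versus generating and hashing both full sets. It also parallelizes naturally by splitting the index range into blocks and running the merge per block.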

An efficient way of calculating φ(φ(p·q)) where p and q are prime

Let p and q be prime numbers and φ Euler's totient function. Is there an efficient way of computing φ(φ(p·q)) = φ((p-1)(q-1)) that is not simply based on factoring (p-1) and (q-1)?

Obviously, if neither p nor q equals two, then (p-1) and (q-1) are even, and consequently their prime factorizations are entirely different from the prime factorization of p·q. Therefore I assume that no such shortcut exists.

Am I overlooking something?

What is the most efficient way to turn a list of directory path strings into a tree?

I'm trying to find the most efficient way of turning a list of path strings into a hierarchical tree of hash maps, using these rules:

  • Node labels are delimited/split by ‘/’
  • Hash maps have the structure:
{
    label: "Node 0",
    children: []
}
  • Node labels are also keys, so for example all nodes with the same label at the root level will be merged

So the following input:

[
    "Node 0/Node 0-0",
    "Node 0/Node 0-1",
    "Node 1/Node 1-0/Node 1-0-0"
]

Would turn into:

[
    {
        label: "Node 0",
        children: [
            { label: "Node 0-0", children: [] },
            { label: "Node 0-1", children: [] }
        ]
    },
    {
        label: "Node 1",
        children: [
            {
                label: "Node 1-0",
                children: [
                    { label: "Node 1-0-0", children: [] }
                ]
            }
        ]
    }
]
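One efficient approach (sketched here in Python; the structure maps directly to any language with hash maps) is to build a nested dict keyed by label in a single pass, so merging same-label siblings is an O(1) dict lookup per path segment, and then convert that trie into the label/children shape at the end:

```python
def build_tree(paths):
    """Turn '/'-delimited path strings into a label/children tree.

    Phase 1: insert every path into a nested dict keyed by label
    (a trie), which merges duplicate labels for free -- O(total
    segments) overall. Phase 2: convert the trie to the requested
    list-of-maps shape.
    """
    trie = {}
    for path in paths:
        node = trie
        for label in path.split("/"):
            node = node.setdefault(label, {})  # reuse or create child

    def to_list(children):
        return [{"label": label, "children": to_list(sub)}
                for label, sub in children.items()]

    return to_list(trie)
```

Python dicts preserve insertion order, so siblings come out in first-seen order, matching the example. Building the dict-of-dicts first and converting once avoids the linear "search children for an existing label" scan that a naive list-based merge would do at every level.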