What is considered an asymptotic improvement for graph algorithms?

Let's say we are trying to solve some algorithmic problem A that depends on an input of size n. We say that algorithm B, which runs in time T(n), is asymptotically better than algorithm C, which runs in time G(n), if T(n) = O(G(n)) but G(n) is not O(T(n)).

My question is related to the asymptotic running time of graph algorithms, which usually depends on |V| and |E|. Specifically, I want to focus on Prim’s algorithm. If we implement the priority queue with a binary heap, the running time is O(E log V). With a Fibonacci heap, we can get a running time of O(V log V + E).

My question is: do we say that O(V log V + E) is asymptotically better than O(E log V)?

Let me clarify: I know that if the graph is dense, the answer is yes. But if E = O(V), both solutions are the same. I am more interested in what is usually defined as an asymptotic improvement when we have more than one variable and, even worse, the variables are not independent (V-1 <= E < V^2, since we assume the graph is connected for Prim’s algorithm).
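To make the sparse versus dense comparison concrete, here is a minimal sketch that simply evaluates the two bounds quoted above and takes their ratio. The function names are made up for illustration, and constant factors are ignored, so the numbers only indicate growth and say nothing about real running times.

```python
import math


def binary_heap_bound(v, e):
    """The E log V bound quoted for Prim's algorithm with a binary heap."""
    return e * math.log2(v)


def fibonacci_heap_bound(v, e):
    """The V log V + E bound quoted for Prim's algorithm with a Fibonacci heap."""
    return v * math.log2(v) + e


def bound_ratio(v, e):
    """How many times larger the binary-heap bound is than the Fibonacci-heap bound."""
    return binary_heap_bound(v, e) / fibonacci_heap_bound(v, e)


if __name__ == "__main__":
    for exponent in (10, 14, 18):
        v = 2 ** exponent
        sparse = bound_ratio(v, v - 1)             # E = V - 1 (a tree)
        dense = bound_ratio(v, v * (v - 1) // 2)   # E = V(V-1)/2 (complete graph)
        print(f"V=2^{exponent}: sparse ratio {sparse:.3f}, dense ratio {dense:.3f}")
```

On sparse inputs the ratio stays bounded as V grows, while on dense inputs it grows roughly like log V. That is the sense in which one bound is never worse (up to constants) over the whole range V-1 <= E < V^2 and strictly better on part of it.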


Batching multiple nearest surface queries: Is it faster? Are there better algorithms?

I’m working on an algorithm that computes lots of "nearest point on a triangulated surface" queries in 3D as a way to resample data sets, and I’m wondering if there is any information out there on speeding up these queries. My gut tells me that partitioning the set of query points in a voxel grid or something similar, and processing them in batches, could be a speedup, but I can’t quite see how I could use that efficiently. I’m also not sure whether the time cost of partitioning would be offset by the search speedup. Is running N independent queries really the best way?

I found that there are papers and research for the all-knn algorithm, but that’s for searching within a single set. And then, those speedups take advantage of the previously computed neighbors or structure within the single set, so I can’t use them. It feels close though.

Any help is appreciated.
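For reference, the partitioning step described above can be sketched as follows. This only shows the bucketing of query points into voxel cells; `bucket_queries` and `cell_size` are hypothetical names, and how each batch would then share work against the surface (for example, one candidate triangle set per cell) is an assumption left open here.

```python
import math
from collections import defaultdict


def bucket_queries(points, cell_size):
    """Group query point indices by the voxel cell that contains them.

    points: iterable of (x, y, z) tuples.
    cell_size: edge length of a cubic voxel (assumed uniform).
    Returns a dict mapping integer cell coordinates to lists of point indices.
    """
    buckets = defaultdict(list)
    for index, (x, y, z) in enumerate(points):
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        buckets[key].append(index)
    return dict(buckets)


if __name__ == "__main__":
    queries = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.9), (1.5, 0.0, 0.0)]
    for cell, indices in bucket_queries(queries, 1.0).items():
        print(cell, indices)
```

The bucketing itself is a single linear pass over the query points, so whether it pays off depends on how much per-cell work can actually be shared.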

Simplest effective algorithms

Suppose I explore all the algorithms and tools necessary for learning from data (training a model with data) so that I can predict a numeric estimate (for example, house prices) or a class (for instance, the species of an iris flower) for any new example that I have not seen before. I would start with the simplest algorithms and work toward those that are more complex. The four algorithms represent a good starting point for any data scientist.

Regression has a long history in statistics, from building simple but effective linear models of economic, psychological, social, or political data, to hypothesis testing for understanding group differences, to modeling more complex problems with ordinal values, binary and multiple classes, count data, and hierarchical relationships. It is also a common tool in data science, a Swiss Army knife for machine learning that I can apply to a wide range of problems. Stripped of most of its statistical properties, linear regression is perceived by data science practitioners as a simple, understandable, yet effective algorithm for estimation and, in its logistic-regression version, for classification as well.

I would like to know about the simplest algorithm to use as a tool in data science for machine learning, and about linear regression as a simple and understandable yet effective algorithm for estimation and, if possible, in its logistic-regression version, for classification as well.
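As a point of reference, simple linear regression with a single feature can be sketched in a few lines using the ordinary least squares closed form. The function names and the toy numbers below are made up for illustration; a real analysis would normally use a library and more than one feature.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Numeric estimate for a new example x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Toy data lying exactly on y = 2x + 1 (made up, not real measurements).
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(slope, intercept, predict(slope, intercept, 10))
```

Logistic regression keeps the same linear form but passes it through a sigmoid and is fitted iteratively rather than in closed form, which is why it is usually presented as the next step after this one.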

Computability of sequential cubic-order algorithms

I have a cubic-order algorithm which must be executed sequentially; there appears to be no way of making it parallel.

I need to come up with an estimate of the maximum input size that can be solved using today’s technology. So the essential problem seems to be how to relate the cost of a single step to execution time on “typical” hardware. Is there a gold-standard way of doing this, e.g. something an attentive peer reviewer would like to see done in a manuscript?
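The arithmetic behind such an estimate can be sketched as follows, under the loud assumption that the algorithm performs exactly n^3 steps of equal cost and that a steps-per-second rate has been measured on the target machine. Both function names are hypothetical, and this is not claimed to be a standard that reviewers expect.

```python
import time


def measure_steps_per_second(step, repetitions=100_000):
    """Time a callable that performs one step and return an empirical rate."""
    start = time.perf_counter()
    for _ in range(repetitions):
        step()
    elapsed = time.perf_counter() - start
    return repetitions / elapsed


def max_input_size(steps_per_second, budget_seconds):
    """Largest n such that n**3 steps fit in the time budget."""
    steps = steps_per_second * budget_seconds
    n = int(round(steps ** (1.0 / 3.0)))
    # Correct for floating-point error in the cube root.
    while n ** 3 > steps:
        n -= 1
    while (n + 1) ** 3 <= steps:
        n += 1
    return n


if __name__ == "__main__":
    rate = measure_steps_per_second(lambda: None)
    one_day = 24 * 60 * 60
    print("measured rate:", rate)
    print("max n for a one-day budget:", max_input_size(rate, one_day))
```

Because of the cube root, the estimate is insensitive to the measured rate: a machine that is 8 times faster only doubles the feasible n.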

Can Homotopy Type Theory be used to derive more efficient algorithms on more efficient data representations from less efficient ones?

I’ve read here that in HoTT, compilers could swap out less efficient representations of data for more efficient ones and I’m wondering whether my interpretation of this statement is correct.

Say we have two different ways of representing the natural numbers, unary (zero and successor) and binary. Here is a function that checks evenness on the former representation:

    even : UnaryNat -> Bool
    even zero = true
    even (succ zero) = false
    even (succ (succ n)) = even n

If we then have an isomorphism between the unary and binary representations, we trivially get an evenness function for the binary representation “for free”, simply by converting a given binary natural number to a unary one, applying the even function, and converting the result back to the binary representation. Obviously, this is not very efficient, and we also don’t need HoTT for this.

A better way to check whether a binary natural number is even would be to check whether its least significant digit is zero. My question is: could we derive this more efficient algorithm for binary natural numbers from our definition of evenness for unary natural numbers using HoTT? If so, would this also be possible for other data types? I haven’t studied any HoTT yet, and since it appears to be a pretty complex subject, I would like to find out whether it’s as exciting as I think it is. Thanks!
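To pin down the two versions being compared, here is a plain Python sketch (not HoTT, and not a claim about what HoTT can derive). Unary numbers are modeled as lists of `"S"` markers and binary numbers as lists of bits, least significant first; all of these encodings and names are assumptions made for illustration.

```python
def int_to_bits(n):
    """Binary representation as a list of bits, least significant first."""
    bits = []
    while n > 0:
        bits.append(n & 1)
        n >>= 1
    return bits


def binary_to_unary(bits):
    """One direction of the isomorphism: bits -> a list with one 'S' per unit."""
    value = sum(bit << position for position, bit in enumerate(bits))
    return ["S"] * value


def even_unary(units):
    """Mirror of the unary definition: strip two successors at a time."""
    units = list(units)
    while len(units) >= 2:
        units.pop()
        units.pop()
    return len(units) == 0


def even_binary_transported(bits):
    """Evenness 'for free': convert to unary, then reuse even_unary."""
    return even_unary(binary_to_unary(bits))


def even_binary_direct(bits):
    """The efficient version: inspect only the least significant bit."""
    return not bits or bits[0] == 0


if __name__ == "__main__":
    for n in (0, 1, 6, 7):
        bits = int_to_bits(n)
        print(n, even_binary_transported(bits), even_binary_direct(bits))
```

The two binary functions agree on every input, but the transported one does work proportional to the value of the number, while the direct one does constant work. The question is whether the second can be obtained mechanically from the first.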

Search Space for Genetic Algorithms

I have not been able to find anywhere a general formula for the size of the search space for Genetic Algorithms (GA).

I would imagine that such a formula would involve the binomial coefficient, perhaps via Stars and Bars.


The reason I ask is because I have developed my own GA and would like to know the search space size as a means to motivate the need for, and usefulness of, metaheuristics like GAs for a manuscript that I am currently writing.

As an example, consider a simple binary (0/1) GA with string length $L = 10$ and population size (number of chromosomes) $N = 100$. Possible solutions could be:

    0100001110
    1011010010
    etc.

Since 0/1 can be repeated in any given string (chromosome), there would be exactly $2^N$ possible configurations. This would generalize to $k^N$ for any $k$-ary problem.

I feel this isn’t the whole story.

If I had to guess a closed-form expression using Stars and Bars for the binary GA case, it might be something like

$$\binom{N + 2^N - 1}{N} = \binom{N + 2^N - 1}{2^N - 1} = \binom{100 + 2^{100} - 1}{100} = \binom{100 + 2^{100} - 1}{2^{100} - 1} \sim 10^{2852}$$

Is this the right line of thinking?

Any thoughts are greatly welcomed.
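The Stars and Bars count above can be evaluated exactly with integer arithmetic. The sketch below assumes a population is treated as an unordered multiset of chromosomes; `multiset_count` is a made-up helper name. Note that the number of distinct binary strings of length $L$ is $2^L$, so the code evaluates the count both with $2^N$ kinds (as written above) and with $2^L$ kinds for comparison.

```python
import math


def multiset_count(num_kinds, num_draws):
    """Stars and Bars: multisets of size num_draws drawn from num_kinds items."""
    return math.comb(num_kinds + num_draws - 1, num_draws)


if __name__ == "__main__":
    L, N = 10, 100
    as_written = multiset_count(2 ** N, N)    # 2^N kinds, as in the formula above
    per_string = multiset_count(2 ** L, N)    # 2^L kinds: distinct length-L strings
    print("digits with 2^N kinds:", len(str(as_written)))
    print("digits with 2^L kinds:", len(str(per_string)))
```

With $2^{100}$ kinds the result has 2853 decimal digits, which agrees with the $\sim 10^{2852}$ estimate above.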

Analysis of the running time of (lazy) Dijkstra’s algorithm

I’m trying to figure out the running time of Dijkstra’s algorithm. All the sources I have read say that the running time is O(E * log(E)) for a lazy implementation.

But when we do the math we get O(E * (log(E) + E * log(E))).

Since E isn’t a constant, I don’t see how someone could reduce this to O(E * log(E)).

Are we analyzing it wrong, or is it possible to reduce it?

        while (!minPQ.isEmpty()) {                     // <=== O(E)
            Node min = minPQ.poll();                   // <=== O(log(E))
            for (Edge edge : graph.adj(min)) {         // <=== O(E)
                if (min.getId() == target.getId()) {
                    // Source and Target = Same edge
                    if (edgeTo.size() == 0) edgeTo.put(target, edge);
                    return;
                }
                relax(edge, min, vehicle);             // <=== O(log(E)) (because of add method on PQ)
            }
        }
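One way to check the counting is to instrument a lazy implementation. The sketch below is a Python stand-in, not the Java code above: it assumes stale queue entries are skipped once a node is finalized, and it counts how many pushes happen in total. The graph and all names are made up for illustration.

```python
import heapq


def lazy_dijkstra(adjacency, source):
    """Lazy Dijkstra: push duplicates instead of decreasing keys.

    adjacency: dict mapping node -> list of (neighbor, weight) pairs.
    Returns (distances, number_of_pushes).
    """
    dist = {source: 0}
    done = set()
    heap = [(0, source)]
    pushes = 1
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale entry: this node was already finalized
        done.add(node)
        # Each node's adjacency list is scanned only once, when it is finalized.
        for neighbor, weight in adjacency.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
                pushes += 1
    return dist, pushes


if __name__ == "__main__":
    graph = {
        "a": [("b", 4), ("c", 1)],
        "b": [("d", 1)],
        "c": [("b", 2), ("d", 5)],
        "d": [],
    }
    print(lazy_dijkstra(graph, "a"))
```

Under that assumption, the inner loop runs at most once per edge over the whole execution rather than E times per outer iteration, so the queue never receives more than E + 1 entries. The total is then at most E + 1 pops and E pushes at O(log E) each, which is where the O(E log E) figure comes from.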

Quantum Cryptography Algorithms Implementations

Post-quantum cryptography is a type of cryptography based on mathematical problems that are believed to be hard even for quantum computers. It has many algorithms and implementations, such as NTRU, McEliece, SIDH, etc.

But there is a difference between post-quantum cryptography and quantum cryptography, which relies on physical properties instead of mathematics. I’d like to know some quantum cryptography algorithms, and also whether they have implementations, for example on GitHub or anything like that.

Thank you

Time complexity analysis of 2 arbitrary algorithms – prove or disprove

We are given 2 algorithms A and B such that for each input size, algorithm A performs half the number of steps algorithm B performs on the same input size.

We denote the worst-case time complexity of each one by $g_A(n)$ and $g_B(n)$, respectively.

Also, we know there’s a positive function $f(n)$ such that $g_A(n) \in \Omega(f(n))$.

Is it possible that $g_B(n) \in \Omega(f(n))$? Is it necessary?

It seems naive to think that it’s necessary, but I can’t figure out how to contradict it.
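One way to write out the relationship, assuming that "half the number of steps" is read as $g_A(n) = \tfrac{1}{2} g_B(n)$ for every $n$ (an interpretation of the statement, not something given in exactly this form), and writing $c > 0$ and $n_0$ for the constants in the definition of $\Omega$:

$$g_A(n) \ge c \, f(n) \ \text{for all } n \ge n_0 \quad \Longrightarrow \quad g_B(n) = 2 \, g_A(n) \ge 2c \, f(n) \ \text{for all } n \ge n_0$$

Under that reading, the same $n_0$ works with the constant $2c$, so $g_B(n) \in \Omega(f(n))$ would follow rather than merely being possible.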

Is the usage of asymptotic notation for these algorithms correct?

So after reading a lot of information around asymptotic analysis of algorithms and the use of Big O / Big Ω and Θ, I’m trying to grasp how to utilise this in the best way when representing algorithms and operations on data structures.

For example there is a recommended website where I got this screenshot from describing Quicksort and I’ve noticed a few issues that stand out to me based on what I’ve learnt.

  1. Is it possible for all notations to represent the “Best”, “Average”, and “Worst” cases? If so, how is this possible? For example, in the “Worst” case, how can Big Ω represent the upper bound? The upper bound is tied to Big O.
  2. I thought that in order to find Θ, Big O and Big Ω had to be the same. In the screenshot, the “Best” case is n log(n) and the “Worst” case is n^2, so how can it be Θ(n log(n))?
  3. Take, for instance, a hash table data structure. If you were to analyze the time complexity of inserting an element, would I be correct in saying you could interchangeably say Ω(1) and O(N), or conversely “Average case is O(1)” and “Worst case is O(N)”?
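To see that "case" and "bound" are separate things, it can help to count operations for one specific Quicksort variant on different inputs. The sketch below assumes a first-element pivot with Lomuto-style partitioning, which is only one of many variants and not necessarily the one in the screenshot; the function name is made up.

```python
import random


def quicksort_comparisons(values):
    """Sort a copy of values with first-element-pivot Quicksort.

    Returns (number_of_element_comparisons, sorted_list).
    Uses an explicit stack to avoid recursion limits on bad inputs.
    """
    data = list(values)
    comparisons = 0
    stack = [(0, len(data) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = data[lo]
        boundary = lo  # last index holding an element smaller than the pivot
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if data[j] < pivot:
                boundary += 1
                data[boundary], data[j] = data[j], data[boundary]
        data[lo], data[boundary] = data[boundary], data[lo]
        stack.append((lo, boundary - 1))
        stack.append((boundary + 1, hi))
    return comparisons, data


if __name__ == "__main__":
    n = 1000
    worst, _ = quicksort_comparisons(range(n))   # already sorted: worst case here
    shuffled = list(range(n))
    random.shuffle(shuffled)
    typical, _ = quicksort_comparisons(shuffled)
    print("sorted input:", worst, "comparisons")
    print("shuffled input:", typical, "comparisons")
```

For this variant, an already sorted input of size n costs exactly n(n-1)/2 comparisons, so the worst-case function is itself Θ(n^2). Each case (best, average, worst) is its own function of n, and O, Ω, or Θ can each be used to bound any one of them.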