How can I generate a random sample of unique vertex pairings from a undirected graph, with uniform probability?

I’m working on a research project where I have to pair up entities together and analyze outcomes. Normally, without constraints on how the entities can be paired, I could easily select one random entity pair, remove it from the pool, then randomly select the next entity pair.

That would be like creating a random sample of vertex pairs from a complete graph.

However, this time around the undirected graph is now

  • incomplete
  • with a possibility that the graph might be disconnected.

I thought about using the above method but realized that my sample would not have a uniform probability of being chosen, as probabilities of pairings are no longer independent of each other due to uneven vertex degrees.

I’m banging my head at the wall for this. It’s best for research that I generate a sample with uniform probability. Given that my graph has around n = 5000 vertices, is there an algorithm that i could use such that the resulting sample fulfils these conditions?

  1. There are no duplicates in the sample (all vertices in the graph only is paired once).
  2. The remaining vertices that are not in the sample do not have an edge with each other. (They are unpaired and cannot be paired)
  3. The sample generated that meets the above two criteria should have a uniform probability of being chosen as compared to any other sample that fulfils the above two points.

There appear to be some work done for bipartite graphs as seen on this stackoverflow discussion here. The algorithms described obtains a near-uniform sample but doesn’t seem to be able to apply to this case.

Algorithm design: find any path in an undirected acyclic graph which has a total sum of the nodes as a specific value

The question was asked in an interview, and I’m not sure if this is the most optimized answer, but here goes-

You have an undirected acyclic graph, where each node has a non-negative value x and the edges have 0 value. You need to find a path with any number of nodes on it such that the total sum of all node values is exactly k. There are N nodes in the graph. You can start at any node.

I gave the brute-force solution for this problem, which is basically do a dis on each node until you find a valid “root” node, such that it has a path to one of the other nodes, with a total sum being k. The time complexity of this solution would be n^2, since we are doing DFS on each node.

The question was posed as a delivery problem, where you have N connected houses, each having a package of weight k, and the truck can start at any house and should follow a path, and has to pick the package from every house that is on that path. The weights should sum exactly to k.

How to minimize diffusion in undirected weighted graph

I’ve got a work that has to do with networks, where each edge has a weight ($ w\in(0,1)$ ) and some random set of nodes are ‘infected’. I don’t know which nodes are (or how many). The diffusion spread is according to linear threshold model (with some random threshold $ \theta\in(0.3,0.5)$ . I’m allowed to delete only $ 10\%$ of edges from the original graph, before the parameters are set (I don’t know the initial infected nodes, nor the threshold a-priori).

I’ve tried multiple methods like:

  1. removing highest weighted edges first – that results good performance when only small amount of initial infected nodes is given.
  2. removing local-bridges – I’ve sorted the edges by neighbourhood overlap, and then removed by weight. – This works good for low thresholds.
  3. Multiple variations of vertex cover, edge cover, centrality, betweenness etc – all underperformed the “naive” remove by weight approach (presented in (1)).
  4. Clustering heuristics – tried to construct clusters and remove cutting edges (by weight, etc).

I find it hard to “accept” that approach (1) which seem to be very naive – outperforms all other.

Also, went over several papers and implemented some of the suggested ideas – none outperformed that naive approach.

I will appreciate if anyone with similar experience has anything to suggest.

Detecting a cycle in an undirected graph and printing the vertices if there is a cycle, printing no cycle otherwise

Problem: Find an O(E + V) time algorithm that outputs the vertices of a cycle of G, if it exists. If G has no cycles, the algorithm outputs no cycle. So if there is a cycle 1, … 3, 1, it would print 1, …, 3, 1.

So I was thinking of doing a BFS, and if the currently examined vertex’s neighbor v has been visited prior, then there is a cycle. However, I am not sure how I might go about tracking the vertices in order to print. I believe I would have to view the BFS-Tree, but unsure.

Detected if an undirected graph G has a cycle

Trying to understand complexity well, I found myself with the following problem.

Consider the following algorithm to detect if an undirected graph $ G = (V, E)$ has a cycle.

Imagine that $ V = \{1 …|V|\}$ (in other words that the vertices are numbered from $ 1$ to $ |V|$ ). An explorer plants a flag on point $ 1$ and then moves on the graph according to the following principle.

  • For each vertex $ u \in V$ the different edges $ (u,v)$ around the point $ u$ can be ordered according to the size of $ v$ . So we can talk about the ith edge around $ u$ . . Note that if $ (u,v)$ is the ith edge around $ u$ it is possible that $ (v,u)$ is not the ith edge around from $ v$ .
  • If the explorer reaches the point $ u$ of degree $ k$ using the ith edge around $ u$ then it starts from $ u$ using the $ (i + 1)$ th edge around from $ u$ . If $ i = k$ then the explorer starts from the first edge around $ u$ .

As a first attempt, the explorer leaves point $ 1$ using the first edge around $ 1$ . If it returns to point $ 1$ by a different edge then he concludes that $ G$ contains a cycle.

If on the other hand it returns to point $ 1$ to across the same edge, then it begins its exploration again, starting by the second edge of point $ 1$ , then the third edge and so on.

If he has exhausted all edges around $ 1$ and has always returned to $ 1$ by the same edge, so he plants his flag on point $ 2$ and so on.

My questions are :

  1. how can we show that $ G$ contains a cycle if and only if there is a point $ u$ and an edge $ (u,v)$ around $ u$ such that the explorer leaving $ u$ by the edge $ (u,v)$ does not return to $ u$ by $ (u,v)$ .
  2. What is the is the space complexity of the algorithm described here?

Find two non overlapping paths in undirected graph

I’ve been struggling on how to approach a problem for over two weeks now. I am supposed to generate an undirected graph which represents a map of streets, and I am supposed to divide the map into two sets of non overlapping streets and both paths should be roughly the same distance.

Is this a common CS problem? What kind of algorithm should I use to solve it?

I’m just looking for a general direction here on what kind of algorithm should I use to do that. I’m a self taught newbie in CS and maths so I have some major gaps in my formal knowledge. The thing is, that I’m supposed to show that the technique I use to solve it involves AI. Initially I tought that I could ask this question aiming to solve the problem straight from the graph. But I now realize that my lack of technical knowledge prevented anyone from clearly understanding the problem.

Anyway, I thought of solving the problem by what I think is the Hungarian Algorithm to solve the Travelling Salesman Problem as shown here. I expected to obtain the most optimum path to traverse the whole graph and after that walk such path and stopping in the edge closest to half the total distance in order to have the two paths I required. However, not sure if it will work and I’m probably going in a completely wrong direction here. Thanks in advance.

Algorithm for undirected projective dependency parsing

I am looking for an algorithm that will get an optimal projective undirected dependency parse.

That is, given the sentence ‘Mary does love John’, and an edge-weight function $ f$ (that is, a function from pairs of words in the sentence to real numbers), like

$ f$ (Mary,does) = 1

$ f$ (Mary,love) = 5

$ f$ (Mary,John) = 2

$ f$ (does,love) = 2

$ f$ (does,John) = 3

$ f$ (love,John) = 5

I want an algorithm that will give the edges of the maximum projective spanning tree as: {(Mary,love), (does,love), (love,John)} with a total weight of 12. That is, it should give this:

Crucially, the algorithm shouldn’t give {(Mary,love), (does,John), (love,John)}, even though it has a higher weight of 13, because in that case the dependencies are not projective (the arcs (Mary,love), (does,John) cross each other). That is, it should not give this:

Equivalently, I am asking: what algorithms exist for finding a Minimum/Maximum Spanning Tree on an ordered graph, such that the resulting structure is projective? A bit more formally: given a totally ordered set of nodes $ (S,<)$ , and a edge weight function $ W: S × S → ℝ$ , is there a good algorithm for finding the optimal-weight spanning tree over this set of nodes, such that the resulting structure has the property that every sub-tree is contiguous in the ordering (for any subtree $ T$ , for all $ t,s$ in $ T$ there is no $ r$ not in $ T$ such that $ t<r<s$ )?

  • Without the projectivity requirement, any classic MST algorithm will do (such as Prim’s, or Kruskal’s).
  • With the projectivity requirement, but for directed graphs/dependencies Eisner’s algorithm (a faster version of Arc-factored Projective Parsing) is standard for getting the optimal directed projective dependency parse.

I am looking for a (probably CYK-like) algorithm like Eisner’s, but which will work on undirected dependency parsing. That is, an algorithm for finding the maximum projective spanning tree for undirected ordered graphs. Or, perhaps, a proof that Eisner’s algorithm will with some modification to work on undirected graphs will be guaranteed to give the optimal projective spanning tree.

About shortest cycles in undirected graphs

In an undirected (and unweighted) graph, to find the shortest cycle containing a specific vertex $ s$ , it is usually said to run a BFS from $ s$ and the first time to find a re-visit, then that is the wanted smallest cycle length.

This does not seem to be true, quoting from page 3:

Note that if you run BFS from $ s$ and stop as soon as you revisit a vertex (using a previously ununused edge), you may not get the shortest path containing $ s$ .

Examples where this technique seems to be suggested: An efficient algorithm to find a shortest cycle including a specific vertex

Another technique, finding the shortest cycle in the whole graph, by running BFS from each vertex, also seems to detect only the shortestLength + 1 in a special case, as mentioned in this paper: in the “Unweighted case”. On the contrary, here ( it is mentioned (first and second paragraphs) that running BFS from every vertex gives the shortest length (girth) in all cases. This is also mentioned in an archive (

Which of all algorithms/methods is true for:

  1. Finding the shortest-length cycle in an undirected graph?
  2. Finding the shortest-length cycle that passes through a known vertex $ s$ in an undirected graph?
  3. Where is the pitfall in each of my above compare-and-contrasts? (I cannot believe that some of the above quoted might even be untrue…)

Note: A randomized sub-cubic algorithm seems to exist, in the paper “A shortest cycle for each vertex of a graph” by Raphael Yuster:

Orientation of undirected graph, give an algorithm to mimize the maximum in-degree of any node

I’m trying to solve this question.

Given: An undirected graph G = (V,E)

Output: An orientation of G so as to minimize the maximum in-depth of any node.

So we have undirect graph G. We can create a directed graph by giving each edge direction. For example, e=(u,v) can be an edge from u to v or v to u. We want to minimize the max in-depth degree of any node for the new directed graph.

I’m trying to come up with a poly-time algorithm.

  1. we give an arbitrary direction to each edge.

  2. calculate the current max in-degree of the node.

  3. change each edge direction and repeat 2, if it is smaller, then update the max.

1 needs E times.

2 needs E + V to check.

3 needs E iterations

Total is E + E * (E + V)

so the total is O(E* (E+V)). Does this work?