Efficient algorithm to prune a graph

I’m faced with a problem whose name I don’t know, and for which I’m searching for an efficient algorithm.

Given two lists of nodes (with no shared elements), the aim is to remove edges between elements of the two lists whenever a given metric is higher than a threshold. There is no need to compute edges between elements within the same list, although the metric can still be applied between them if necessary. The lists contain ~1,000,000 nodes, so brute force is not an option.
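
To illustrate what I mean, here is the naive brute-force version of the operation (assuming, purely as an example, that nodes are 2D points and the metric is Euclidean distance; my real metric is different). This quadratic loop is exactly what I want to avoid at this scale:

    import math

    def surviving_edges(list_a, list_b, metric, threshold):
        # naive O(len(list_a) * len(list_b)) version: keep only the cross edges
        # whose metric is below the threshold
        return [(a, b) for a in list_a for b in list_b
                if metric(a, b) <= threshold]

    # toy data, purely for illustration
    points_a = [(0.0, 0.0), (5.0, 5.0)]
    points_b = [(0.5, 0.5), (100.0, 100.0)]
    print(surviving_edges(points_a, points_b, math.dist, 2.0))
    # -> [((0.0, 0.0), (0.5, 0.5))]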

I really don’t know whether I’m being clear, or whether I can add information to help you help me.

Thank you in advance, D.

Graph problem with cities in a road network

In a country, some pairs of cities are connected by one-way roads. Each pair of cities has at most one road directly connecting them. Each road has a cost for reversing its direction. Let $s, t$ be two cities. Can we give an efficient algorithm to find the minimum cost we have to pay so that $s$ cannot reach $t$?

I think the minimum cost should be the same as the minimum cut between $s$ and $t$. My idea is that, given a minimum cut, perhaps we can reverse the directions of all the roads in that cut. But I’m not sure whether this idea works, since some roads may already be in the “correct” direction.
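
For concreteness, here is the computation my idea boils down to, written with networkx and using each road’s reversal cost as the capacity of the corresponding directed edge (the road list is made up, and I’m not claiming this is correct; it’s just what I would like to verify):

    import networkx as nx

    # hypothetical input: (u, v, cost) means a one-way road u -> v whose
    # direction can be reversed for the given cost
    roads = [('s', 'a', 3), ('a', 't', 1), ('s', 'b', 2), ('b', 't', 5)]

    G = nx.DiGraph()
    for u, v, cost in roads:
        G.add_edge(u, v, capacity=cost)

    cut_value, (side_s, side_t) = nx.minimum_cut(G, 's', 't')
    # the roads crossing from the s-side to the t-side are the ones my idea would reverse
    to_reverse = [(u, v) for u, v in G.edges if u in side_s and v in side_t]
    print(cut_value, to_reverse)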

Highly asymmetric regular graph

Let $G$ be a regular connected simple graph on $n$ vertices with chromatic number $\chi$ and maximum degree $\Delta$. It follows that $G$ is $\chi$-partite. Suppose we remove one of the partite sets of vertices. What, then, is the maximum degree of the induced subgraph formed by the remaining vertices?

I can say with some confidence that the induced subgraph would have a maximum degree of $\chi-2$ (as the remaining partite sets must be connected with each other, otherwise the graph would be disconnected). In addition, if the graph is vertex-transitive, I think the maximum degree of the induced subgraph would be $\Delta-1$. Any hints or counterexamples in this case? Thanks in advance.
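
In case it helps for hunting counterexamples, this is the kind of brute-force check I had in mind for small graphs (the exponential colouring search and the random_regular_graph setup are entirely my own experiment, so only tiny n):

    from itertools import product
    import networkx as nx

    def optimal_coloring(G):
        # brute-force a proper colouring with the fewest colours (tiny graphs only)
        nodes = list(G.nodes)
        for k in range(1, len(nodes) + 1):
            for assignment in product(range(k), repeat=len(nodes)):
                colour = dict(zip(nodes, assignment))
                if all(colour[u] != colour[v] for u, v in G.edges):
                    return colour, k

    G = nx.random_regular_graph(d=3, n=8, seed=0)
    assert nx.is_connected(G)  # the question assumes connectivity; re-seed if this fails

    colour, chi = optimal_coloring(G)
    removed = {v for v in G if colour[v] == 0}      # drop one colour class
    H = G.subgraph(set(G) - removed)
    print(chi, max(dict(H.degree).values()))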

Reducing Graph Reachability to SAT (CNF)

So I came across this problem in my textbook. I was wondering how to develop a reduction from the Graph Reachability problem to the SAT (CNF) problem (i.e., the formula is satisfiable iff there exists a path in graph G from the start node to the end node).

1) I can’t wrap my head around how to go from something that can be solved in polynomial time (Graph Reachability) to something that is NP-complete (SAT).

2) I can’t seem to find a way to turn the nodes/edges of the graph into actual CNF clauses that correspond to reachability.

I tried to think about algorithms like Floyd-Warshall that determine whether a path exists from the start node to the end node, but I can’t seem to turn that idea into actual CNF clauses. Help would be much appreciated!
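
For what it’s worth, here is the shape of the encoding I’ve been toying with: position variables x[v][i] meaning “the walk is at vertex v at step i”, with staying put allowed so that shorter paths can be padded out. I’m genuinely unsure whether this is the intended reduction:

    def reachability_to_cnf(n, edges, s, t):
        # clauses are lists of signed ints (DIMACS style); the formula should be
        # satisfiable iff t is reachable from s in the digraph on vertices 0..n-1
        var = lambda v, i: v * n + i + 1        # variable number for x[v][i]
        succ = {u: {u} for u in range(n)}       # staying put is always allowed
        for u, v in edges:
            succ[u].add(v)

        clauses = [[var(s, 0)], [var(t, n - 1)]]        # start at s, end at t
        for i in range(n):                              # at most one vertex per step
            for u in range(n):
                for v in range(u + 1, n):
                    clauses.append([-var(u, i), -var(v, i)])
        for i in range(n - 1):                          # follow an edge (or stay) at every step
            for u in range(n):
                clauses.append([-var(u, i)] + [var(v, i + 1) for v in succ[u]])
        return clauses

    # tiny example: 0 -> 1 -> 2, asking whether 2 is reachable from 0
    print(reachability_to_cnf(3, [(0, 1), (1, 2)], 0, 2))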

Can graph execution be better optimized than imperative programs?

I’ve been reading about Google’s TensorFlow and the way it represents calculations as graphs that are then executed by an engine. While the concept is interesting, I would like to understand why they made that choice instead of the arguably simpler imperative programming found, e.g., in PyTorch.

The TensorFlow documentation lists several advantages:

  • Parallelism. By using explicit edges to represent dependencies between operations, it is easy for the system to identify operations that can execute in parallel.
  • Distributed execution. By using explicit edges to represent the values that flow between operations, it is possible for TensorFlow to partition your program across multiple devices (CPUs, GPUs, and TPUs) attached to different machines. TensorFlow inserts the necessary communication and coordination between devices.
  • Compilation. TensorFlow’s XLA compiler can use the information in your dataflow graph to generate faster code, for example, by fusing together adjacent operations.
  • Portability. The dataflow graph is a language-independent representation of the code in your model. You can build a dataflow graph in Python, store it in a SavedModel, and restore it in a C++ program for low-latency inference.

Portability makes sense; I’m more interested in the performance aspect. It seems that parallel and distributed computation doesn’t require graph execution, since PyTorch offers them too. I assumed that building a graph enabled the engine to perform optimizations that could not be done with an imperative program.
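
To make the comparison concrete, here is roughly the experiment I have in mind, written against the newer tf.function API as a stand-in for “graph mode” (that choice, and the whole snippet, are my own assumption rather than anything from the docs):

    import tensorflow as tf

    def compute(x):
        # two independent matmuls: a graph runtime can see that they don't
        # depend on each other and schedule or fuse them as it sees fit
        a = tf.reduce_sum(tf.matmul(x, x))
        b = tf.reduce_sum(tf.matmul(x, tf.transpose(x)))
        return a + b

    x = tf.random.normal((512, 512))
    eager_result = compute(x)           # imperative: each op runs as it is reached
    graph_fn = tf.function(compute)     # traced once into a dataflow graph
    graph_result = graph_fn(x)          # executed by the graph engine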

But then I read about TensorFlow’s upcoming eager mode, which basically does away with the graph API and lets us use imperative programming like in other libraries. The documentation for eager mode suggests that it approaches and could reach the performance of graph mode:

For compute-heavy models, such as ResNet50 training on a GPU, eager execution performance is comparable to graph execution. But this gap grows larger for models with less computation and there is work to be done for optimizing hot code paths for models with lots of small operations.

One thing I didn’t find is distributed training with eager mode, but as mentioned earlier, other imperative libraries seem to offer that despite not using graphs.

I’m not sure what to make of this. Does graph execution have a performance advantage over imperative programming after all?

Graph of function, continuous projection

$X$ and $Y$ are topological spaces, and $f: X \rightarrow Y$ is a map (we do not assume that $f$ is continuous). Consider the graph $A = \{(x, f(x)) \in X \times Y \mid x \in X\}$. Is the projection $\pi: A \rightarrow X$, $(x, f(x)) \mapsto x$, a homeomorphism? If not, is it enough to assume that $A$ is a closed subspace of $X \times Y$? If $X$ and $Y$ are metrizable spaces, how can one prove that $\pi$ is a homeomorphism using sequential continuity? Finally, suppose that, by some miracle, $\pi$ is a homeomorphism but $f$ is not continuous: is such a phenomenon possible?
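
For what it’s worth, the observation I keep coming back to (which may or may not be the right track) is that $\pi$ is always a continuous bijection, so everything seems to hinge on its inverse
$$\pi^{-1}: X \rightarrow A, \qquad x \mapsto (x, f(x)),$$
whose continuity looks closely tied to the continuity of $f$ itself.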

How to duplicate a graph in Google Docs and use a separate source/link of data

When I copy and paste a graph in Google Docs, it creates two graphs linked to the same spreadsheet. I’m looking to make 100 identically styled graphs and quickly modify each one by editing its underlying data.

Any idea how to duplicate the source table, so that I can modify each one independently?

Convert LineString objects to dicts and serialize JSON from a networkX graph

I have a networkX MultiDiGraph. I want to make it JSON serializable, but there are LineString elements in the edge attributes, so I first map them to dicts and then call json.dumps(). For some reason I am still getting the error: TypeError: Object of type LineString is not JSON serializable.

    import matplotlib.cm as cm
    import networkx as nx
    from networkx.readwrite import json_graph
    import numpy as np
    import osmnx as ox
    import pandas as pd
    import json
    from shapely.geometry import Point, mapping, shape
    %matplotlib inline
    ox.config(log_console=True, use_cache=True)
    ox.__version__

    place = 'Santa Monica, California, USA'
    gdf = ox.graph_from_place(place)

    # convert the LineString stored under edge key 0 to a plain dict via shapely's mapping()
    for n, nbrs in gdf.adjacency():
        for nbr, edict in nbrs.items():
            try:
                edict[0]['geometry'] = mapping(edict[0]['geometry'])
            except: next

    data1 = json_graph.node_link_data(gdf)

    s1 = json.dumps(data1)

Then I get the error:

TypeError: Object of type LineString is not JSON serializable.

But as you can see, in the for loop I replace all of these LineStrings with dicts. I even go back and check, and I don’t see any LineStrings.
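
For reference, here is the kind of sanity check I have in mind when I say I go back and look (same gdf as above; this snippet is mine, not from any library docs):

    from shapely.geometry import LineString

    # look for any LineString still hiding in the edge attributes
    leftovers = [(u, v, k) for u, v, k, data in gdf.edges(keys=True, data=True)
                 if isinstance(data.get('geometry'), LineString)]
    print(leftovers)  # I expected this to be empty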

Given a complete graph, find an optimal path with two costs on each edge

We are given a complete graph in which each edge has two costs, $a$ and $b$. We need to find a path that passes through each node exactly once and has minimum total cost. The cost of a path is the maximum of the two sums: the sum of the $a$-costs and the sum of the $b$-costs of the edges crossed along the path.

Someone told me that this is easily solvable by taking the maximum of $a$ and $b$ for each edge and then using the dynamic programming technique that stores the last visited node and the bitmask of visited nodes. I couldn’t find a counterexample, but I’m not sure whether this approach is correct.
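
For reference, this is my understanding of the suggested approach: run Held-Karp over the single weight w = max(a, b) per edge. The cost matrices below are hypothetical, and I’m aware this DP minimises the sum of max(a, b), which is not literally the stated objective; that is part of why I’m unsure:

    def suggested_dp(a, b):
        # Held-Karp over w[u][v] = max(a[u][v], b[u][v]) on an n-node complete graph
        n = len(a)
        w = [[max(a[u][v], b[u][v]) for v in range(n)] for u in range(n)]
        INF = float('inf')
        # dp[mask][v] = min w-cost of a path visiting exactly the nodes in mask, ending at v
        dp = [[INF] * n for _ in range(1 << n)]
        for v in range(n):
            dp[1 << v][v] = 0
        for mask in range(1 << n):
            for v in range(n):
                if dp[mask][v] == INF:
                    continue
                for u in range(n):
                    if (mask >> u) & 1:
                        continue
                    new_cost = dp[mask][v] + w[v][u]
                    if new_cost < dp[mask | (1 << u)][u]:
                        dp[mask | (1 << u)][u] = new_cost
        return min(dp[(1 << n) - 1])

    # tiny hypothetical instance
    a = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
    b = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
    print(suggested_dp(a, b))  # -> 5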

Matching Algorithm – How to construct a bipartite-like graph with heterogeneous matching rules

We have a set S. Elements in this set can be matched according to the following rules:

[matching rules: see the image in the original post]

The input to the matching algorithm is an array of variable size consisting of elements in S. Each element in the array has a particular size or “quantity” that can be matched (I imagine this can simply be modeled as edge weights).

The first question is: how do we maximize the total quantity matched? The second question is: how do we optimize the time complexity?

Intuitively, I think the problem could be modeled as a weighted bipartite graph and solved with a max-flow algorithm. The challenge is that elements can be matched in different ways, so I’m not sure what the graph should look like given these extra rules, or whether they imply that a different approach should be used.
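
To show what I mean by the max-flow intuition, here is a minimal sketch of the plain bipartite case, ignoring the special matching rules for now (left, right, quantity and allowed_pairs are all made-up inputs for illustration):

    import networkx as nx

    def max_matched_quantity(left, right, quantity, allowed_pairs):
        # quantity[x] is how much of element x can be matched;
        # allowed_pairs is the set of (l, r) pairs the matching rules permit
        G = nx.DiGraph()
        for l in left:
            G.add_edge('source', l, capacity=quantity[l])
        for r in right:
            G.add_edge(r, 'sink', capacity=quantity[r])
        for l, r in allowed_pairs:
            G.add_edge(l, r, capacity=min(quantity[l], quantity[r]))
        flow_value, _ = nx.maximum_flow(G, 'source', 'sink')
        return flow_value

    # hypothetical usage
    print(max_matched_quantity(
        left=['a1', 'a2'], right=['b1'],
        quantity={'a1': 3, 'a2': 2, 'b1': 4},
        allowed_pairs={('a1', 'b1'), ('a2', 'b1')}))   # -> 4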