Sync logins and Agent jobs across all replicas in Distributed Availability Group

I am using the dba_CopyLogins stored procedure to sync logins on replicas in a Distributed Availability Group, but the database-level permissions are not transferred because a database sync operation is in progress. Is there any way to sync all logins and permissions from the global primary to the forwarder and the other replicas? Also, how do I copy Agent jobs?

Distributed predicate computation on event stream

My question is actually a request for papers, articles, texts, or books on a problem that I’m trying to solve at work.

I’m working on a program that computes a predicate value (true or false) for a given object in a distributed system in which there is a stream of events that can change the object’s attributes and, consequently, the predicate value. Whenever the predicate value changes, the program must send a notification about this change.

For example, consider that there is an object A which has an attribute called name and consider that there is a predicate P which is true when the object’s name is equal to Jhon. Each event in the stream has a timestamp and a value for the attribute name. So consider the following sequence of events:

e1 = { name: Jhon, timestamp: 1 }
e2 = { name: Jhon, timestamp: 2 }
e3 = { name: Peter, timestamp: 3 }
e4 = { name: Doug, timestamp: 4 }
e5 = { name: Jhon, timestamp: 5 }

Now, the events don’t necessarily show up in the stream in the correct order and, even worse, there are multiple computers processing this stream in parallel. However, for simplicity, I’ll continue this example considering only one computer.

If the events arrive and are processed in the order described above, then the notifications sent should be:

P(A) = true when e1 arrives
P(A) = false when e3 arrives
P(A) = true when e5 arrives

That is the correct sequence of notifications. Now, imagine that the computer receives the events in the following order:

e1, e5, e2, e4, e3 

A naive algorithm which doesn’t consider the events’ timestamps would send an incorrect sequence of notifications:

P(A) = true when e1 arrives
P(A) = false when e4 arrives

The algorithm that I’m working on considers the timestamps and infers when a notification should have been sent but was not. So when e3 arrives, it will notice that the notification P(A) = true for e5 was not sent; I’ve sketched the idea below. This feels a bit like reinventing the wheel, but I’m not aware of any literature about this problem. I would like some references on this problem or something similar, such as papers dealing with this kind of problem.
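Here is a minimal sketch of the idea in Python (not my actual implementation: a simple in-memory buffer stands in for the shared database, and it keeps every event, which the real system cannot afford to do):

    import bisect

    def predicate(name):
        # P(A): true when the object's name is "Jhon"
        return name == "Jhon"

    class PredicateTracker:
        """Tracks P(A) for one object over an out-of-order event stream."""

        def __init__(self):
            self.events = []    # (timestamp, name), kept sorted by timestamp
            self.notified = []  # transitions already announced: (timestamp, bool)

        def on_event(self, timestamp, name):
            bisect.insort(self.events, (timestamp, name))
            # Recompute the transition sequence and announce anything new.
            # Note: a late event can also *shift* an already-announced
            # transition to an earlier timestamp; handling that correction
            # is part of what the real algorithm has to deal with.
            for transition in self._transitions():
                if transition not in self.notified:
                    self.notified.append(transition)
                    ts, value = transition
                    print(f"P(A) = {str(value).lower()}, effective at timestamp {ts}")

        def _transitions(self):
            # Walk the events in timestamp order and yield each point where
            # the predicate value changes.
            last = None
            for ts, name in self.events:
                p = predicate(name)
                if p != last:
                    yield (ts, p)
                    last = p

    tracker = PredicateTracker()
    for ts, name in [(1, "Jhon"), (5, "Jhon"), (2, "Jhon"), (4, "Doug"), (3, "Peter")]:
        tracker.on_event(ts, name)

With the arrival order e1, e5, e2, e4, e3, it is the arrival of e4 that reveals the notification for e5 was never sent, and the sketch emits it at that point.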

The real problem is considerably more complex, since it involves storing the predicate $\times$ object state in a database that works as shared state between the computers processing the stream, and I’m talking about thousands of events arriving per second, so it’s not possible to keep all the events stored in some database.

What are some advanced background topics I’ll need for distributed systems and networks research?

I am a new graduate student in Computer Science who would like to be able to read and understand modern distributed systems research papers. My current background and coursework are at the level of undergraduate and introductory graduate courses in:

  • Networks (TCP/IP stack and applications)
  • Distributed Systems (graduate-level course covering time (logical/vector clocks), 2PC and 3PC, multicast and membership, election, consistency, consensus and quorums (Paxos), DHTs and overlays, and some modern applications like ZooKeeper)
  • Undergraduate Algorithms, Discrete Mathematics, and Theory of Computation (basic DFA/NFA and an intro to Turing machines, without rigorous mathematics)

However, I find this background insufficient for reading modern research in networks and distributed systems. In particular, I am not aware of modern protocols like QUIC, nor of the formal methods mentioned in the papers, which I believe include some sort of model checking and the like. Also, for many of the distributed systems topics I mentioned above, I lack the background to verify and prove the correctness of the protocols, or even to follow the proofs that are given.

Any suggestions for a reading list that would prepare me to understand modern research in this area would be very helpful.

Creating a top-hat distributed random number generator

[Image: the two-line Fortran snippet described below]

I have this Fortran code which generates a flat distribution, as it produces a single random number centered on 0. The function GRNDM (the Geant4 random number generator) produces uniformly distributed random numbers between the values of 0 and 1. RDUMMY is the name of the vector filled with the random number, and the argument “1” states the length of the vector: i.e., GRNDM here will produce a single random number between 0 and 1. The second line then produces random numbers in the interval [μ−σ/2, μ+σ/2].
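In case the image doesn’t load, the two lines do something equivalent to this Python rendering (MU and SIGMA stand for the μ and σ above; the values are illustrative, as the real code defines them elsewhere):

    import random

    MU, SIGMA = 0.0, 1.0  # illustrative values only

    # First line: GRNDM(RDUMMY, 1) fills RDUMMY with one uniform deviate in [0, 1)
    u = random.random()

    # Second line: shift and scale to the interval [MU - SIGMA/2, MU + SIGMA/2]
    x = MU + SIGMA * (u - 0.5)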

I was wondering if there was a way of changing it to produce random numbers with a top hat distribution?

SQL Server 2019 Always On using Distributed Network Name

We’re running a couple of SQL Servers in Azure that are set up with an Always On availability group and Windows Failover Clustering. The servers run Windows Server 2019 and SQL Server 2019. When we set up the cluster, it was configured to use a Distributed Network Name (DNN) instead of a static cluster IP address. Thanks to this, we shouldn’t need an internal load balancer, according to these notes: https://github.com/MicrosoftDocs/azure-docs/issues/34648.

I’m struggling to understand exactly how this works, though. Based on what I’ve read, it seems like our connection strings should point to the DNS name of the cluster (let’s call it AgCluster). If I look in DNS, there is an A record for AgCluster pointing to sql1 and another pointing to sql2. When I use AgCluster in my connection string, it always seems to connect me to the primary server, even if I have ApplicationIntent=ReadOnly set. When I query @@SERVERNAME, I always get the same server.
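For reference, the connection string I’m testing with looks roughly like this (the database name is a placeholder):

    Server=AgCluster;Database=MyAppDb;Integrated Security=True;ApplicationIntent=ReadOnly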

So with the Distributed Network Name setup, what should I use in my connection strings to make sure read/write queries go to the primary and read-only queries go to a secondary? Any guides on setting this up in general would be helpful. Thanks!

Service demand in the context of distributed systems

I am confused about this question: imagine we have a disk where the service demand of database transactions is 0.1 sec, and then we increase the disk speed by 40%. How does the service demand change? The answer says we should take 60% of the original service demand, i.e., 0.1 × 0.6, but I think it should be 0.1 × 0.4. Does anyone have any idea?
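In other words, the two computations I’m weighing are (my restatement, not the book’s): $D_{new} = 0.6 \times 0.1 = 0.06$ sec (the book’s answer) versus $D_{new} = 0.4 \times 0.1 = 0.04$ sec (my reading).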

Reference: https://learning.oreilly.com/library/view/performance-by-design/0130906735/ch02.html, exercise 4

Why aren’t distributed computing and/or GPUs considered non-deterministic Turing machines if they can run multiple jobs at once?

So we know a nondeterministic Turing machine (NTM) is just a theoretical model of computation. They are used in thought experiments to examine the abilities and limitations of computers, commonly to discuss P vs NP and how NP problems cannot be solved in polynomial time unless the computation is done on a hypothetical NTM. We also know an NTM would use a set of rules to prescribe more than one action for any given situation; in other words, it attempts many different options simultaneously.

Isn’t this what distributed computing does across commodity hardware: run many different possible calculations in parallel? And a GPU does this within a single machine. Why isn’t this considered an NTM?

Ideal time complexity in analysis of distributed protocol

I need some explanation about the definition of ideal time complexity. My textbook says:

The ideal execution delay or ideal time complexity, T: the execution delay experienced under the restrictions “Unitary Transmission Delays” and “Synchronized Clocks;” that is, when the system is synchronous and (in the absence of failure) takes one unit of time for a message to arrive and to be processed.

What is meant by “Synchronized Clocks”?

Take, for example, the broadcast problem and the flooding protocol.

In this protocol, each uninformed node waits for some informed node (at the beginning, only the source) to send it the information, and then it resends the information to all of its neighbors.

Now, the ideal time complexity of this protocol is at most the eccentricity of the source, and so at most the diameter of the communication graph.

Now, if the ideal time complexity is this, then necessarily all nodes send messages to their neighbors in parallel, correct?

and we are assuming that:

  • The source sends the message to each neighbor => 1 unit of time
  • The neighbors of the source send the message to their neighbors => 1 unit of time

and so on, until we reach the node farthest from the source.

Is this a correct view? To check my understanding, I’ve also written a small simulation below.
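Here is the simulation (my own sketch, not from the textbook): it counts synchronous rounds of flooding, one unit of time per round, and the result equals the eccentricity of the source.

    def flooding_rounds(adj, source):
        """Simulate synchronous flooding: in each round, every informed node
        sends the message to all of its neighbors, at a cost of one unit of
        time per round. Returns the number of rounds until every reachable
        node is informed, which equals the eccentricity of the source."""
        informed = {source}
        frontier = [source]
        rounds = 0
        while frontier:
            next_frontier = []
            for node in frontier:
                for neighbor in adj[node]:
                    if neighbor not in informed:
                        informed.add(neighbor)
                        next_frontier.append(neighbor)
            if next_frontier:
                rounds += 1
            frontier = next_frontier
        return rounds

    # Example: a path graph 0-1-2-3; the eccentricity of node 0 is 3.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(flooding_rounds(adj, 0))  # prints 3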

Crawl distributed file system error: The object was not found

SharePoint 2013 Standard server.

We have a distributed file system (DFS) which I want to crawl.

I use a UNC path like:

\\machine1234\Departments\HR

The account that runs the crawler can access it; I tested this by logging in with that account and pasting the above path into Explorer.

I run a full crawl and get the following error message:

The object was not found

I have run the crawl with and without a proxy; same result. Any ideas?