Why is the run time with a loop of this structure considered O(log n)

I used the search function and did a good amount of Google searching, but wasn't able to get a straight answer on how a loop of the form below is translated to a proper summation where the function derived from the summation is $O(\log n)$.

Example of the for loop:

    int j = 0;
    for (int i = 1; i < n; i *= 2) {
        j = j + n * 2;
    }

So I understand that within the loop body we have 3 operations (multiplication, addition, then assignment).

I understand that the index $i$ takes the values $1, 2, 4, 8, 16, 32, 64, 128, 256, \dots$ ($i$ is only ever a power of 2, up to $n$).

So essentially $i$ takes the values $2^m$ for $m \in [0, \log_2 n]$, right?

Also, $2^m < n$, so the loop executes $\log_2 n - 0 + 1 = \log_2 n + 1$ times, right? How do we go about expressing this in summation notation?

Is it this (just taking a guess, not sure if it’s right):

$$\sum\limits_{i = 0}^{\log_2 n + 1} 3$$ since we have 3 operations? If this is the answer, why do we keep the upper bound of the summation at $\log_2 n$ instead of $n$? Also, I know typically we'd use $\log n$, but I put in the base 2 to help clear things up in my own head.

Could someone please show how the summation of the for loop is properly written?
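For what it's worth, here is a quick simulation (my own sanity check, in Python) that seems to confirm the count grows like $\log_2 n$:

    import math

    # Count the loop's iterations and compare with log2(n). For the strict
    # test i < n the count is ceil(log2 n); with i <= n it would instead be
    # floor(log2 n) + 1.
    for n in (2, 8, 9, 100, 1024, 10**6):
        iterations = 0
        i = 1
        while i < n:        # mirrors: for (int i = 1; i < n; i *= 2)
            i *= 2
            iterations += 1
        print(n, iterations, math.ceil(math.log2(n)))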

Why not implement a Union-Find structure using the root as the direct parent?

I just learned about using UF with union by rank and path compression. A path can be compressed by attaching a node to its root after Find is called on the node. If the goal here is to flatten the tree, why not just implement the tree so that each node is directly attached to its root (instead of its true parent)? That way, maximum compression would be achieved from the start. What is the downside of this, as long as union by rank is used along with it?
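For reference, a minimal sketch of the standard version I learned (union by rank plus path compression), so it's clear what I'd be changing:

    # Array-based: element i's parent is parent[i]; i is a root iff parent[i] == i.
    def find(parent, x):
        root = x
        while parent[root] != root:          # locate the root
            root = parent[root]
        while parent[x] != root:             # path compression: re-parent the path
            parent[x], x = root, parent[x]
        return root

    def union(parent, rank, x, y):
        rx, ry = find(parent, x), find(parent, y)
        if rx == ry:
            return
        if rank[rx] < rank[ry]:              # union by rank: shallower tree goes below
            rx, ry = ry, rx
        parent[ry] = rx                      # only one node (a root) is re-parented
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    # usage: 10 singleton sets
    parent, rank = list(range(10)), [0] * 10
    union(parent, rank, 1, 2)
    assert find(parent, 1) == find(parent, 2)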

Finding a data structure using hash tables

I learned about “perfect square hashing”: a static scheme that, given a subset S ⊆ U of N keys, builds a hash table of size O(N^2) and maps the keys into it in O(N) time. After the table is created, each key search takes O(1).
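For illustration, here is a rough sketch of that static scheme as I understand it. The hash family h(k) = ((a*k + b) mod p) mod m is just a common universal-family stand-in, and I'm assuming integer keys smaller than the prime p:

    import random

    def build_square_table(keys):
        p = (1 << 61) - 1                       # prime, assumed larger than any key
        m = len(keys) ** 2                      # the O(N^2) table
        while True:
            a, b = random.randrange(1, p), random.randrange(p)
            h = lambda k: ((a * k + b) % p) % m
            table = [None] * m
            # with m = N^2 slots, Pr[any two keys collide] < 1/2, so a few
            # redraws of (a, b) suffice in expectation
            if all(_place(table, h, k) for k in keys):
                return table, h                 # collision-free: done

    def _place(table, h, key):
        if table[h(key)] is not None:           # slot taken: collision, redraw h
            return False
        table[h(key)] = key
        return True

    def search(table, h, key):
        return table[h(key)] == key             # a single probe: O(1)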

But I want to work under a different assumption:

At the start of the algorithm we are only given N, the number of keys that will be added to the data structure over the entire run; the keys themselves arrive scattered throughout the run, and we do not know in advance what they will be (only that they come from our key space U). We want O(1) running time (not just in expectation) for every insertion and search during the run.

I need to propose a data structure (and algorithms) that, given N (the number of keys that will enter the data structure when used), is initialized in O(N^2) time and then supports insertion and search of keys such that, with probability greater than 1/2, every one of these operations runs in O(1) for each key during the run.

I can't find anything that works as required. Can you help me?

Is there a data structure that can perform range modulo additions and range minimum queries?

It is well-known that the Segment Tree performs range additions and range minimum queries in O(log N) each.
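For concreteness, a minimal sketch of that standard structure (a recursive segment tree with lazy propagation for range add / range min) looks something like this:

    class SegmentTree:
        def __init__(self, values):
            self.n = len(values)
            self.mn = [0] * (4 * self.n)       # subtree minimum
            self.lazy = [0] * (4 * self.n)     # pending addition for the subtree
            self._build(1, 0, self.n - 1, values)

        def _build(self, node, lo, hi, values):
            if lo == hi:
                self.mn[node] = values[lo]
                return
            mid = (lo + hi) // 2
            self._build(2 * node, lo, mid, values)
            self._build(2 * node + 1, mid + 1, hi, values)
            self.mn[node] = min(self.mn[2 * node], self.mn[2 * node + 1])

        def _push(self, node):                 # push the pending add to children
            for child in (2 * node, 2 * node + 1):
                self.lazy[child] += self.lazy[node]
                self.mn[child] += self.lazy[node]
            self.lazy[node] = 0

        def add(self, l, r, x, node=1, lo=0, hi=None):   # range add, O(log N)
            if hi is None:
                hi = self.n - 1
            if r < lo or hi < l:
                return
            if l <= lo and hi <= r:            # fully covered: mark lazily
                self.mn[node] += x
                self.lazy[node] += x
                return
            self._push(node)
            mid = (lo + hi) // 2
            self.add(l, r, x, 2 * node, lo, mid)
            self.add(l, r, x, 2 * node + 1, mid + 1, hi)
            self.mn[node] = min(self.mn[2 * node], self.mn[2 * node + 1])

        def query_min(self, l, r, node=1, lo=0, hi=None):  # range min, O(log N)
            if hi is None:
                hi = self.n - 1
            if r < lo or hi < l:
                return float('inf')
            if l <= lo and hi <= r:
                return self.mn[node]
            self._push(node)
            mid = (lo + hi) // 2
            return min(self.query_min(l, r, 2 * node, lo, mid),
                       self.query_min(l, r, 2 * node + 1, mid + 1, hi))

    # st = SegmentTree([5, 2, 7]); st.add(0, 1, 10); st.query_min(0, 2) -> 7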

Let each element in the array have two values, V[i] and M[i]. We define a “range modulo add” as the following: add +X to V[i] for each element in the range L <= i <= R, and then replace V[i] with V[i] mod M[i] for each L <= i <= R. Can both this operation and range minimum queries be run in (worst-case or average-case) o(N)? If not on arbitrary ranges [L, R], is it possible to handle range minimum queries and range modulo adds on the entire array quickly?

Structure for getting $|\{ \{a,b\} \subset S : a+b \le d \}|$ in O(1)

I am struggling with an exercise from an old algorithms exam:
$d$ is constant for the whole structure. Propose a structure for which you can do:

  • Init(S) // called only once
  • Insert(x, S): $S := S \cup \{x\}$ in $O(\log |S|)$
  • Delete(x, S): $S := S \setminus \{x\}$ in $O(\log |S|)$
  • Get(S) = $|\{ \{a,b\} \subset S : a+b \le d \}|$ in $O(1)$

I am trying to do that with an AVL tree augmented with additional members, such as the number of pairs of nodes $u, v$ with $v.value + u.value \le d$.
Could somebody give me a hint?
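To make my attempt concrete, here is a sketch of the counter-maintenance idea I'm going for, with a plain sorted Python list standing in for the AVL tree (so Insert/Delete are not yet logarithmic here; an order-statistic tree would make the same rank queries $O(\log |S|)$):

    import bisect

    class PairCounter:
        """Maintain |{ {a,b} in S : a + b <= d }| under inserts/deletes."""

        def __init__(self, d):
            self.d = d
            self.s = []        # stand-in for the balanced BST
            self.count = 0     # the maintained answer

        def insert(self, x):
            # rank query: how many existing y satisfy y <= d - x?
            self.count += bisect.bisect_right(self.s, self.d - x)
            bisect.insort(self.s, x)           # O(|S|) here; O(log |S|) in a tree

        def delete(self, x):
            self.s.remove(x)                   # O(|S|) here; O(log |S|) in a tree
            # pairs destroyed: remaining y with x + y <= d
            self.count -= bisect.bisect_right(self.s, self.d - x)

        def get(self):
            return self.count                  # O(1): just read the counter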

ERROR: structure of query does not match function result type

    CREATE OR REPLACE FUNCTION "cem_dashboard"."n_performance_value"(
        "city" character varying,
        "testdate" character varying,
        "technology" int4,
        "param" character varying)
    RETURNS TABLE(
        "sim_network_operator_code_a" float8,
        "client_city" text,
        "signal_cell_type_a" float8,
        "test_date" text,
        "data_value" float8) AS $BODY$
    BEGIN
        IF param = 'download' THEN
            RETURN QUERY
            SELECT cem_dashboard.network_performance.sim_network_operator_code_a,
                   network_performance.signal_cell_type_a,
                   network_performance.test_date,
                   sum(network_performance.download_kbps) as download_kbps
            FROM network_performance
            WHERE cem_dashboard.network_performance.client_city = city
              AND cem_dashboard.network_performance.test_date LIKE testdate
              AND cem_dashboard.network_performance.signal_cell_type_a = technology
            GROUP BY cem_dashboard.network_performance.sim_network_operator_code_a,
                     cem_dashboard.network_performance.signal_cell_type_a,
                     cem_dashboard.network_performance.test_date;
        ELSIF param = 'upload' THEN
            RETURN QUERY
            SELECT cem_dashboard.network_performance.sim_network_operator_code_a,
                   network_performance.signal_cell_type_a,
                   network_performance.test_date,
                   sum(network_performance.upload_kbps) as upload_kbps
            FROM network_performance
            WHERE cem_dashboard.network_performance.client_city = city
              AND cem_dashboard.network_performance.test_date LIKE testdate
              AND cem_dashboard.network_performance.signal_cell_type_a = technology
            GROUP BY cem_dashboard.network_performance.sim_network_operator_code_a,
                     cem_dashboard.network_performance.signal_cell_type_a,
                     cem_dashboard.network_performance.test_date;
        ELSE
            RETURN QUERY
            SELECT cem_dashboard.network_performance.sim_network_operator_code_a,
                   network_performance.signal_cell_type_a,
                   network_performance.test_date,
                   sum(network_performance.latency) as latency
            FROM network_performance
            WHERE cem_dashboard.network_performance.client_city = city
              AND cem_dashboard.network_performance.test_date LIKE testdate
              AND cem_dashboard.network_performance.signal_cell_type_a = technology
            GROUP BY cem_dashboard.network_performance.sim_network_operator_code_a,
                     cem_dashboard.network_performance.signal_cell_type_a,
                     cem_dashboard.network_performance.test_date;
        END IF;
    END;
    $BODY$
    LANGUAGE plpgsql VOLATILE
    COST 100
    ROWS 1000

I want to make a function whose body is an IF conditional. In each branch of the IF I run a SQL query whose conditions depend on the given parameters, but I am confused about how to return the result.

Please help me 🙁

Sorry if my English is bad; I use Google Translate.

A data structure for efficiently finding nodes relative to others (e.g. whether a node is earlier in a list than another node)

Suppose we have N elements, which we'll treat as simple objects. Is there a data structure I can use that will allow me to see which node appears earlier, based on some arbitrary insertion order, given a reference to both nodes? I'll need some kind of data structure $\mathcal{D}$, where $\mathcal{D}$ supports the following operations (assuming $A$ and $B$ are node references, and $A \neq B$):

$\mathcal{D}.addBefore(A, B)$

$\mathcal{D}.addAfter(A, B)$

$\mathcal{D}.addLast(A)$

$\mathcal{D}.earlier(A, B) \rightarrow bool$

$\mathcal{D}.remove(A)$

I'd like to implement some kind of $Earlier$ predicate which takes two nodes and returns whether $A$ comes before $B$. For example, if this were an indexed list and we had the index of each node, then it'd simply be:

$$Earlier(A, B) \iff A.index < B.index$$

The ordering is determined by a user who inserts the elements as they see fit. They are allowed to add either after some node or before some node; if the data structure is empty, they can just add an element, and the element added first remains the only element in the data structure until another is added.

A practical example of this problem is that a user is pasting files into a directory, but the file explorer lets the user paste files before or after any file in the list. The file explorer must display the files in order that the user requests, so if a list is used to hold the files, then [A, B, C] should render as [A, B, C], and if the user pastes a file D before B, then the list should render [A, D, B, C].

This becomes a problem when I need to insert before another item: inserting into the middle of a list backed by an array has a big overhead. My next thought was to go with a linked list, because I will have references to the two nodes and can quickly insert with my handle to the node. This is $\mathcal{O}(1)$ for insertion.

The actual problem I have is that insertions are not too frequent, but searching for which node comes first between two given nodes is a common operation. This makes the naive $\mathcal{O}(n)$ search pretty painful when dealing with a lot of nodes, as in the worst case I have to scan all the other nodes in the list to determine which one is behind/ahead of the other.

My main roadblock is that since the user can insert them in any order (and it needs to stay in the order the user inserts them in), I have to use some data structure that maintains this invariant.

As such, with a linked list I am stuck currently at:

$$Earlier \in \mathcal{O}(n)$$

and iterating over the list is of course $\mathcal{O}(n)$, along with removal being $\mathcal{O}(1)$, since it's trivial to unlink a node with a reference to it in a doubly linked list.
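For concreteness, here is roughly what my current doubly-linked-list version looks like (names are my own):

    class Node:
        """A handle the user keeps; prev/next are managed by the list."""
        def __init__(self, value):
            self.value = value
            self.prev = self.next = None

    class OrderedList:
        def __init__(self):
            self.head = self.tail = None

        def add_last(self, node):                  # O(1)
            node.prev, node.next = self.tail, None
            if self.tail:
                self.tail.next = node
            else:
                self.head = node
            self.tail = node

        def add_before(self, a, b):                # insert a just before b, O(1)
            a.prev, a.next = b.prev, b
            if b.prev:
                b.prev.next = a
            else:
                self.head = a
            b.prev = a

        def add_after(self, a, b):                 # insert a just after b, O(1)
            a.prev, a.next = b, b.next
            if b.next:
                b.next.prev = a
            else:
                self.tail = a
            b.next = a

        def remove(self, node):                    # O(1) unlink
            if node.prev:
                node.prev.next = node.next
            else:
                self.head = node.next
            if node.next:
                node.next.prev = node.prev
            else:
                self.tail = node.prev

        def earlier(self, a, b):                   # the painful O(n) part
            cur = a.next
            while cur is not None:                 # scan forward from a
                if cur is b:
                    return True                    # b found ahead of a
                cur = cur.next
            return False                           # never saw b, so b precedes a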


My solution to the problem:

Now, we can change the data structure if we want, so a linked list isn’t required. The only thing that is required is the ability to let the users iterate over the data structure and get the elements back in the order they placed them in.

This makes me think of whether there's a tree structure I can use. For example, what if I were to take a binary tree that balances itself such that the search depth is approximately $\mathcal{O}(\lg n)$, something like a self-balancing tree? The first thing that jumps to mind is an AVL tree where I'd track the sizes of the subtrees and update them as they change. This isn't quite an AVL tree, since there's no implicit ordering between the nodes, but the idea of self-balancing is something I'd like to exploit to get a good search runtime.

To make this viable, our users will have references to our nodes. This way we can put each node in a hash table and do an $\mathcal{O}(1)$ lookup to find the node in the tree. Inserting a node before or after it is not too bad either: you create a new subtree at the current node by adding a parent node, making the previous node a leaf, and adding the new element as the other leaf. To make this visually make sense:

        o                   o
       / \     add A       / \      rebalance
      o   o   ------->    o   o    ---------->  ...
     / \      before X   / \        if needed
    o   X               o   o
                           / \
                          A   X

or

        o                   o
       / \     add A       / \      rebalance
      o   o   ------->    o   o    ---------->  ...
     / \      after X    / \        if needed
    o   X               o   o
                           / \
                          X   A

where o is another node (that is either a parent, or a leaf).

A consequence of this data structure is that it is a full binary tree and each leaf is a value we’re storing, and the parent nodes do not store any value.

The cost of adding a node to this self-balancing binary tree is $\mathcal{O}(1)$ to find the node (since we assume we can look up the node reference in a hash table), then $\mathcal{O}(1)$ to insert it by adding a parent and attaching the two nodes, and then $\mathcal{O}(\lg n)$ to rebalance the tree. This means insertion is $\mathcal{O}(\lg n)$. Not too bad.

Searching for which of two elements comes earlier becomes “traverse from both nodes up to the root and find the lowest common ancestor; whichever node comes from the left branch is earlier”, which is $\mathcal{O}(\lg n)$. Searching is now logarithmic as well.
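Here is a sketch of what I mean, with parent pointers on every node; the names and the leaf-split insertion helper are my own, and rebalancing is elided:

    class TNode:
        """Internal nodes carry no value; elements live at the leaves."""
        def __init__(self, value=None):
            self.parent = self.left = self.right = None
            self.value = value

    def add_after(x, a):
        """Split leaf x into a parent with children (x, a); add_before mirrors
        this with the children swapped. O(1) before any rebalancing; if x was
        the root, the returned parent is the new root."""
        p, grand = TNode(), x.parent
        p.left, p.right = x, a
        x.parent = a.parent = p
        p.parent = grand
        if grand is not None:
            if grand.left is x:
                grand.left = p
            else:
                grand.right = p
        return p

    def _depth(n):
        d = 0
        while n.parent is not None:
            n, d = n.parent, d + 1
        return d

    def earlier(a, b):
        """Climb both leaves to their lowest common ancestor; the one that
        arrives on the left branch is earlier. O(lg n) in a balanced tree."""
        da, db = _depth(a), _depth(b)
        while da > db:                       # lift the deeper leaf first
            a, da = a.parent, da - 1
        while db > da:
            b, db = b.parent, db - 1
        while a.parent is not b.parent:      # climb in lockstep to the LCA
            a, b = a.parent, b.parent
        return a.parent.left is a            # left subtree precedes right subtree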

As such, this means we now get:

$$Earlier \in \mathcal{O}(\lg n)$$

Further, iterating over the binary tree is $\mathcal{O}(n)$: since it's a full binary tree, at worst there are approximately $2n$ nodes to visit in total. Since the naive list solution previously was $\mathcal{O}(n)$, we're looking good.

Finally, removal is probably the same as AVL tree removal and thus also $\mathcal{O}(\lg n)$.


But can we do better?

Overall the above solution is decent, but it would be really nice if I could knock the search down to $\mathcal{O}(1)$ if possible, or something really small that feels effectively constant, like the $\mathcal{O}(\alpha(n))$ of some disjoint-set operations.

Is it possible to do something like this in $o(\lg n)$ time? I am willing to trade away performance on addition, deletion, and iteration to get a better search time, as that is my bottleneck.

I don't know what other data structures are out there; maybe you know. Or maybe there is some other method I can use that I don't know about which would allow me to achieve very quick search times. I can augment the data structures, so that is always an option on the table.

I also understand that getting a better runtime might require going to the literature and implementing some exotic data structure, where the cost of implementing and maintaining it may be more than it's worth; in that case the self-balancing tree might be the only viable solution, since this isn't Google-scale data and doesn't need such a solution. But since this is a problem from a hobby project, I figure I can try things out with little repercussion.

How to prevent people from duplicating my data structure?


INTRODUCTION

First of all, I'm a beginner with RSA mechanisms and the like, but I'm really interested in knowing whether this is possible.

The scenario: I organized a party where certain people receive a special invitation voucher that I made with my private key. They will later use it to enter, and I will check that it is valid with a public key.

To begin with, let's suppose the voucher has someone's ID “signed” below with a private key. By this means, I can ensure THIS person will enter with THAT entrance ID.

MAIN PART

But now I want these vouchers to be transferable in OFFLINE mode with some sort of program. So now a voucher would consist of two blocks: the second one ensures the ticket IS the genuine one, and the first block contains the actual holder of the voucher:

………………………………………. BLOCK 1

ACTUAL HOLDER DATA

SIGNATURE OF THIS BLOCK

………………………………………. BLOCK 2

ORIGINAL HOLDER

VOUCHER ID

SIGNATURE OF THIS BLOCK

……………………………………….

I provide the first invitee with his own private key, so he can modify the first block to change the ACTUAL HOLDER DATA to the new person he transfers the voucher to. When he does this, he also hands the private key to the new person.
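For illustration, here is a rough sketch of the signature mechanics I have in mind, using Ed25519 from Python's cryptography package as a stand-in for RSA signatures (the names and block encoding are made up):

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    organizer_key = Ed25519PrivateKey.generate()   # mine, never shared
    transfer_key = Ed25519PrivateKey.generate()    # handed to the first invitee

    # BLOCK 2: fixed at issue time, signed by me.
    block2 = b"original-holder=Mallory;voucher-id=42"
    block2_sig = organizer_key.sign(block2)

    # BLOCK 1: rewritten and re-signed with the transfer key on each hand-over.
    block1 = b"actual-holder=Alice"
    block1_sig = transfer_key.sign(block1)

    # At the door, both signatures must verify (verify() raises if they don't).
    organizer_key.public_key().verify(block2_sig, block2)
    transfer_key.public_key().verify(block1_sig, block1)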

THE FLAW

If (let's call him) Mallory makes a copy of his voucher before he transfers it to Alice correctly, he would ALSO be able to enter my party, although Alice should be the only one capable of entering. And so, the number of invitees could be infinite.

My question is simple: how can I disable Mallory's voucher?

  • Option 1 would be to make the transaction online and record on a server that Alice is the new owner of THAT voucher. Otherwise I need a mechanism to make the old voucher obsolete, unless anyone proposes something better.

Anyone?