mysql partition by date and id

At the moment I’m partitioning my table daily by


On a daily basis I load data into that table, I have billions of rows in that table. I was thinking on switch to exchange partition for better performance. However the data is getting from multiple sources at different times. I do however can do exchange partition if I could do partition the table by (date, sourceId).

Is that possible? Do I have to use sub-partitions or is there other ways to solve it?

Does Oracle guarentee that ORA_HASH is used to determine which hash partition a row is inserted, and will this be honored in the future?

I use hash partitioning for a few of my very large tables, and occasionally I have a use case where it would be convenient to have a mechanism that would return the partition name that a row would be inserted into, given a partition value.

This blog here shows that we can use ORA_HASH function for this purpose. Incidentally, it appears this page is the only page on the entire internet that explains this.

I’ve used it successfully and it works in all cases that I have tried. It seems ORA_HASH is definitely what Oracle itself uses to pick the hash partition that it inserts data into, and that at least on the current version of Oracle it is safe to use for this use case.

However there is no guarantee in the documentation that Oracle even uses it, or will continue to use it in the future. This makes me think that using ORA_HASH in this way is not safe or future proof. What if a DB is upgraded and ORA_HASH no longer behaves this way?

For reference, you can use the following SQL to return the hash partition for a given value:

SELECT partition_name FROM all_tab_partitions WHERE table_name = 'FOO'     AND partition_position = ORA_HASH('bar', n - 1) + 1 

Where 'bar' is the value you wish to analyze, and n is the number of partitions in your table. There are some edge cases when the number of partitions is not a power of 2, which is covered in the blog article linked above.

Calculating the running time of Quicksort’s PARTITION procedure

I am confused about calculating the PARTITION procedure’s running time.

PARTITION procedure is used in the Quicksort Algorithm to partition the array $ A[p…r]$

I analyzed the PARTITION procedure provided in the book Introduction to Algorithms:

  PARTITION(A, p, r) 1.  x= A[r] 2.  i= p-1 3.  for(j = p to r-1) 4.       if(A[j]<= x) 5.         i= i+1 6.         exchange A[i] with A[j] 7.  exchange A[i+1] with A[r] 8.  return i+1 

Now what I figured out that:

  1. Every procedure except lines $ 3-6$ has a constant time i.e: $ \Theta(1)$
  2. For loop would execute $ r-1-p+1$ times.

Thus the running time should be: $ \Theta(r-p)$

While the book quotes:

The running time of PARTITION on the subarray $ A[p…r]$ is $ \Theta(n)$ where $ n= r-p+1$ .

Could someone please help me out in knowing how to prove the aforementioned complexity?

Thank you.

Linear Partition problem (Dynamic Programming) in a circular array

I am practicing algorithms for the last couple of days and I came across this question, the gist of which is:

Given apples and oranges arranged in a circle indexed form 1 to n , where n is an odd number(we know which ones apple and which ones orange), divide these into k contiguous groups each havind odd number of fruits such that most of the groups have more apples than oranges. Also, the arrangement can be such that a group can contain fruits of indices (n-3, n-2, n-1, n, 1 , 2, 3).

This appears like the linear partition problem to me, but the circular arrangement confuses me. Also, I was thinking of masking Apples as 1 and Oranges as -1 so that it’s easy to calculate the which one is higher in the group( if sum is +ve, then apples are higher else oranges). Also, I observed that k must be an odd number as n is an odd and each of the k groups have odd fruits, so sum of odd number of odds is an odd number.

We have to maximize the sum of each of the groups in this case right?

It would be great if someone can help out.

Thanks a lot!!!

The Partition Problem: Reducing to prove NP-Completeness

I am struggling with the below problem. Curious to hear any guidance.

The Partition Problem is the following:

$ \textbf{Instance:}$ A (multi-)set of numbers $ S = \{a_1, a_2, \ldots , a_n \}$ .

$ \textbf{Question:}$ Can $ S$ be partitioned into two (multi-)sets $ A$ and $ B$ such that the sum of the numbers in $ A$ is equal to the sum of the numbers in $ B$ ?

Prove that the Partition Problem is NP-complete.

Things I could reduce from that I know of are 3-SAT, Vertex Cover, Subset Sum, Independent Set, and the clique problem. My assumption is that I should be reducing from the Subset Sum problem, but I struggle with that problem as well.

Anyone able to help shed some light on this? Also, any explanation in plain english (maybe with some notation) would be greatly appreciated as I’m struggling with the concepts here. Just mathematical notation alone might make it more difficult for me to understand at the moment.

Thank you in advance!

Computing the partition function

Are there in any relatively efficient algorithms to compute the partition function $ P(n)$ for a given $ n$ ? What’s the runtime complexity or class of this problem as a function of $ n$ ?

Perhaps I’m missing something trivial here, but other than listing some recurrent definitions of this function and some of their asymptotics, I don’t see this precise information (complexity class, and known algorithms with their runtime/space complexity) in either of these articles:

  • Partition(number theory)
  • Partition function (number theory)

Using DD to get the hash of a non-system partition encrypted by VeraCrypt

I am trying to use DD for Windows to obtain the hash of a non-system partition that was encrypted via Veracrypt, but have run into a bit of a problem.

The command I used to get the hash of the encrypted partition looks like this

dd if=\?\Device\HarddiskVolume11 of=hash_output.txt bs=512 count=1 

And this command (in theory) should create a file called hash_output.txt that contains the encrypted hash that should, for example, look something similar to this:


However, the output I am getting when issuing the DD command above looks more like this:

fb55 d397 2879 2f55 7653 24a3 c250 14d3 3711 7109 e563 617f ab73 f11a 3469 33bb 

Which is obviously not the hash I was expecting so I am hoping someone might be able to help me figure out what I am doing wrong.

Some things to note:

  • I am 100% positive that the drive I am selecting in the DD command is the right drive.
  • There is only 1 encrypted partition on the drive that spans the entire size of the drive.
  • There is no physical / functional damage to the drive which would cause this issue.
  • This on an external 1tb drive that is connected via usb 3.0 (I have tried other cables and ports).
  • The same DD command worked fine for a test drive that I encrypted using the same parameters that were set for this drive.

Why do b*trees partition 2 nodes into 3?

I’m writing a b*tree library in Rust. I’m thinking it might be better to make the purposeful decision to only implement half of the b* optimizations. (And not because it is easier, although it is.)

According to my reading B* is 2 optimizations (for insertion anyway):

  1. If a node is full but a sibling has free space, move something over to the sibling before adding, then no recursion is necessary. With circular buffers this is really fast.
  2. If a node is full and has a full sibling, split them into 3 nodes, distribute the existing elements as 2/3 2/3 2/3 among them.

Without the rule 2 optimization, when a node is full, it is just split into 2 nodes with a 1/2 1/2 distribution between them.

I see how theoretically rule 2 is interesting, it guarantees that every node in the tree is always 2/3 full (except for the root and for very small trees).

But for a downside, consider the use case where you usually add the nodes in order (for example, adding 1, 2, 3, 4, 5….mostly in order). When you keep rule 2 then the tree is almost always exactly 2/3 full and no better; because, after the sibling gets it allotted 2/3, it is never looked at again. But without rule 2, the tree (at least the leaves anyway) is nearly 100% full all the time. Because the last node which is full gets split into 2 nodes, they each get 1/2 of the elements, and the previous sibling remains full and untouched. On the subsequent insertions, due to rule 1, the last 2 nodes fill up to be 100% full again before splitting.

And about the case with random order insertions, with rule 1 and without rule 2, the worse case scenario (I think really rare and difficult to construct) is a 50% fill rate and the best case is still the 100% fill rate. The average I would guess is still the same as with rule 2, since the tree is still usually between 2/3 and 100% full.

For a tree with large nodes (approaching page size), a possible problem would be increasing the number of page loads because of the need to check siblings. But it seems to me that even though rule 2 reduces the variance in the density of the tree, it doesn’t do much to actually increase the absolute density of the tree, that is moreso accomplished by rule 1. So I would expect the average number of page accesses to be about the same or probably less.

Can someone give me a good reason to keep rule 2 when inserting into a B* Tree?

Prove Product Partition is NP-complete in the strong sense

I am trying to understand how to prove that the Product Partition problem is NP-complete in the strong sense. The problem is similar to the normal Partition problem, except in this case the product of the subsets is taken instead of the sum.

I have only managed to find one paper discussing the issue: paper

This proof is very complicated and tough to understand. Could someone provide a simpler explanation of the process?