Can the test set of attributes be a subset of the training set’s attributes?

I’m currently writing an application for a trick-based card game, where agents are assigned points based on the accuracy of their predictions of how many hands they’re going to win. The number of tricks predicted signals a level of confidence to the other agents and may also allow the agent to choose the trump suit.

I’ve compiled a set of attributes (such as cards in hand, score sum total in hand) that will be useful in the prediction, but it would be ideal if I could include previous predictions in the training set.

My question is: can my test data [attributes] be a subset of the training data [attributes ∩ predictions], given that both are used to predict the number of hands that the agent will win?

If $U_1,U_2$ are linearly independent sets, then $(<U_1> \cap <U_2>) = <U_1 \cap U_2>$

If I take an element $v$ in $(<U_1> \cap <U_2>)$, why can this element be written as

$v=\sum_{i=1}^{k}\lambda_iz_i+\sum_{i=k+1}^{k+n}\lambda_{i}x_i=\sum_{i=1}^{k}\mu_iz_i+\sum_{i=k+1}^{k+m}\mu_i y_i$

where $z_1,\dots,z_k\in U_1\cap U_2$, $x_{k+1},\dots,x_{k+n}\in U_1\backslash U_2$ and $y_{k+1},\dots,y_{k+m}\in U_2\backslash U_1$?

I thought every element of $(<U_1> \cap <U_2>)$ must be a linear combination of vectors that are in both $U_1$ and $U_2$.

Project Euler #60 Prime pair sets

The problem is as follows:

The primes 3, 7, 109, and 673, are quite remarkable. By taking any two primes and concatenating them in any order the result will always be prime. For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of these four primes, 792, represents the lowest sum for a set of four primes with this property.

Find the lowest sum for a set of five primes for which any two primes concatenate to produce another prime.
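To make the property concrete, here is a minimal sketch that checks the four-prime example from the problem statement. The helper names `is_prime` and `is_prime_pair_set` are chosen for illustration only, and plain trial division is assumed to be fast enough for numbers this small.

```python
from itertools import permutations


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers in this example."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def is_prime_pair_set(primes) -> bool:
    """True if every ordered pair of distinct members concatenates to a prime."""
    return all(is_prime(int(f"{p}{q}")) for p, q in permutations(primes, 2))


example = [3, 7, 109, 673]
print(is_prime_pair_set(example), sum(example))
```

Using `permutations` rather than `combinations` covers both concatenation orders for each pair, which is what "in any order" requires.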

    %%time
    import itertools
    import time

    t1 = time.time()


    def prime():
        yield 3
        yield 7
        for i in itertools.count(11, 2):
            e = int(i ** .5) + 1
            for j in range(2, e + 1):
                if i % j == 0:
                    break
            else:
                yield i


    def is_prime(n):
        if n < 2:
            return False
        if n == 2:
            return True
        e = int(n ** .5) + 1
        for i in range(2, e + 1):
            if n % i == 0:
                return False
        else:
            return True


    def power_up(n):
        # helper function: return the next power of 10 above n
        i = 1
        while 1:
            if n < i:
                return i
            i *= 10


    def conc(x, y):
        # helper function: check that both xy and yx are prime
        if not is_prime((x * power_up(y)) + y):
            return False
        else:
            return is_prime(y * power_up(x) + x)


    def conc3(x, y, z):  # not used, it did not improve the performance
        a = conc(x, y)
        if not a:
            return False
        b = conc(x, z)
        if not b:
            return False
        c = conc(y, z)
        if not c:
            return False
        return True


    one = []
    two = []
    three = []
    four = []
    found = 0

    for i in prime():
        if found:
            break
        try:
            if i > sum_:
                break
        except:
            pass
        one += [i]
        for j in one[:-1]:  # on the fly list
            if conc(i, j):
                two += [[i, j]]
                for _, k in two:  # check against k only if it is in a two pair
                    if _ == j:
                        for x in [i, j]:
                            if not conc(x, k):
                                break
                        else:
                            three += [[i, j, k]]
                            for _, __, l in three:
                                if _ == j and __ == k:
                                    for x in [i, j, k]:
                                        if not conc(x, l):
                                            break
                                    else:
                                        four += [[i, j, k, l]]
                                        # print(i, j, k, l)
                                        for _, __, ___, m in four:
                                            if _ == j and __ == k and ___ == l:
                                                for x in [i, j, k, l]:
                                                    if not conc(x, m):
                                                        break
                                                else:
                                                    a = [i, j, k, l, m]
                                                    t2 = time.time()
                                                    try:
                                                        if sum(a) < sum_:
                                                            sum_ = sum(a)
                                                    except:
                                                        # assign sum_ with the first value found
                                                        sum_ = sum(a)
                                                    print(
                                                        f"the sum now is {sum(a)}, the sum of [{i}, {j}, {k}, {l}, {m}], found in {t2-t1:.2f}sec"
                                                    )
                                                    if i > sum_:
                                                        # if the first element checked is greater than the found sum, then we are sure we found it,
                                                        # this is the only way we can be sure we found it.
                                                        # it took 1 and a half min to find the first one, and confirm that after 42min.
                                                        # my way is not fast, but what I practised here is to find the number without a guessed boundary
                                                        found = 1
                                                        print(f"the final result is {sum_}")

I found the first candidate in 75 sec, which I think is too long. I would like suggestions on how to improve the performance.

Difference sets of $b$-separated Sidon sets

For integers $0<b<p$ and a real $\alpha\in(0,1)$, define a maximal $b$-separated Sidon set with parameters $p,\alpha$ to be a set of integers $a_1<\dots<a_m$ in the interval $(p^\alpha,p-p^\alpha)$, with $m$ as large as possible, such that

  1. $a_i-a_j\neq a_{i'}-a_{j'}$ holds if $i\neq i'$ or $j\neq j'$ or both.

  2. $\min_{1\leq i<j\leq m}|a_i-a_j|>b$ holds.
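The two conditions can be checked mechanically for a candidate set. The sketch below is only an illustration: the function name `is_b_separated_sidon` is made up here, and condition 1 is read as applying to pairs with $i\neq j$ and $i'\neq j'$ (otherwise every difference $a_i-a_i=0$ would collide), which is an assumption about the intended meaning. It checks the conditions only, not maximality.

```python
from itertools import combinations


def is_b_separated_sidon(a, b, p, alpha):
    """Check conditions 1 and 2 for a candidate set `a` (maximality is NOT checked)."""
    a = sorted(a)
    lo, hi = p ** alpha, p - p ** alpha
    # All elements must lie strictly inside (p^alpha, p - p^alpha).
    if any(not (lo < x < hi) for x in a):
        return False
    # Positive differences a_j - a_i for i < j; the list is sorted, so y > x.
    diffs = [y - x for x, y in combinations(a, 2)]
    # Condition 1 (assumed reading): all differences of distinct pairs are distinct.
    if len(diffs) != len(set(diffs)):
        return False
    # Condition 2: every pairwise gap exceeds b.
    return all(d > b for d in diffs)


print(is_b_separated_sidon([11, 13, 20, 40], b=1, p=100, alpha=0.5))
```

Checking only the positive differences is enough, because two equal negative differences would give two equal positive ones after swapping the roles of the indices.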

Let $\mathcal D(\mathcal R)$ denote the set of differences $a_i-a_j$ represented by a maximal $b$-separated Sidon set $\mathcal R$ with parameters $p,\alpha$.

Is there a maximum $r<p-2p^\alpha$ such that for every integer $k\in[b,r]$ there are always two such sets $\mathcal R$ and $\mathcal R'$ with $k\in\mathcal D(\mathcal R)$ but $k\notin\mathcal D(\mathcal R')$?

Karp hardness of two vertex sets in a digraph

Given a digraph $G(V,A)$ and a number $k$, we want to find two vertex subsets $S,T\subseteq V$ such that:

  1. $|S|+|T|=k$
  2. For every $v\notin S\cup T$, $v$ has no arcs going into $S$ and no arcs coming from $T$. In other words, from $v$'s point of view, $S$ is the source and $T$ is the terminal, hence their names.
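Verifying a proposed pair $(S,T)$ is straightforward, as the following sketch suggests. It assumes the graph is given as a list of arcs `(u, v)` meaning $u\to v$, and it reads condition 2 as "no arc from an outside vertex into $S$, and no arc from $T$ to an outside vertex"; the function name `is_valid_pair` is hypothetical.

```python
def is_valid_pair(vertices, arcs, S, T, k):
    """Check conditions 1 and 2 for candidate subsets S and T."""
    S, T = set(S), set(T)
    # Condition 1: the two subsets have k vertices in total.
    if len(S) + len(T) != k:
        return False
    outside = set(vertices) - S - T
    for u, v in arcs:
        # Condition 2a: an outside vertex must not have an arc into S.
        if u in outside and v in S:
            return False
        # Condition 2b: an outside vertex must not receive an arc from T.
        if u in T and v in outside:
            return False
    return True


path = [(1, 2), (2, 3), (3, 4)]
print(is_valid_pair([1, 2, 3, 4], path, S={1}, T={4}, k=2))
```

The check runs in time linear in the number of arcs, so under this reading a candidate solution can be verified in polynomial time.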

So, can this be solved by an efficient algorithm, or is it NP-complete?

Algebraic construction of $\varepsilon$-biased sets

Let $\ell>1$ be an integer and consider the mapping $\text{Tr}:\mathbb{F}_{2^\ell}\to\mathbb{F}_{2^\ell}$ defined by $$\text{Tr}(x)=x^{2^0}+x^{2^{1}}+\cdots+x^{2^{\ell-1}}$$ It is then possible to show the following:

  1. $\text{Tr}$ maps $\mathbb{F}_{2^\ell}$ into $\mathbb{F}_2$.
  2. If $a\in\mathbb{F}_{2^\ell}$ is non-zero, then the mapping $f_a:\mathbb{F}_{2^\ell}\to\mathbb{F}_{2}$ defined by $f_a(x)=\text{Tr}(a\cdot x)$ is $\mathbb{F}_2$-linear and $\mathbb{E}_{x\sim\mathbb{F}_{2^\ell}}[f_a(x)]=\frac{1}{2}$.
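Both properties can be checked by brute force in a small field. The sketch below assumes $\ell=3$ and represents $\mathbb{F}_{2^3}$ as polynomials over $\mathbb{F}_2$ modulo $x^3+x+1$, with elements stored as integers whose bits are the coefficients. These choices, and the names `gf_mul` and `trace`, are illustrative assumptions and not part of the construction above.

```python
def gf_mul(a, b, ell, modulus):
    """Multiply a and b in GF(2^ell); `modulus` encodes an irreducible polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if (a >> ell) & 1:       # degree reached ell: reduce modulo the polynomial
            a ^= modulus
    return result


def trace(x, ell, modulus):
    """Tr(x) = x^(2^0) + x^(2^1) + ... + x^(2^(ell-1)), by repeated squaring."""
    total, power = 0, x
    for _ in range(ell):
        total ^= power
        power = gf_mul(power, power, ell, modulus)
    return total


ELL = 3
MODULUS = 0b1011                 # x^3 + x + 1 (assumed choice of irreducible polynomial)
FIELD = range(1 << ELL)

traces = [trace(x, ELL, MODULUS) for x in FIELD]
print(traces)                    # every entry should be 0 or 1

# f_a(x) = Tr(a * x) should take the value 1 on exactly half of the field.
for a in range(1, 1 << ELL):
    ones = sum(trace(gf_mul(a, x, ELL, MODULUS), ELL, MODULUS) for x in FIELD)
    print(a, ones)
```

A count of $2^{\ell-1}$ ones for each non-zero $a$ is exactly the statement that the expectation of $f_a$ is $\frac{1}{2}$.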

Now, we consider the set $S=\{s(x,y,z):x,y,z\in\mathbb{F}_{2^\ell}\}$, where the entries of $s(x,y,z)$ are indexed by $0\leq i,j$ with $i+j\leq c\sqrt{n}$ ($c$ is a constant chosen so that there are exactly $n$ entries). For such $x,y,z$ and $i,j$ we set $s(x,y,z)_{i,j}=\text{Tr}(x^iy^jz)$.

I want to show that for an appropriate choice of $\ell$, the set $S$ described above is an $\varepsilon$-biased set of size $O(n\sqrt{n}/\varepsilon^3)$.

So, fix a non-empty test $\tau\in\{0,1\}^n$; we need to show that $$\bigg|\mathbb{E}_{s\in S}\Big[(-1)^{\langle s,\tau\rangle}\Big]\bigg|\leq \varepsilon$$

Let $x,y,z\in\mathbb{F}_{2^\ell}$ and consider $\langle s(x,y,z),\tau\rangle$. I managed to show (from the $\mathbb{F}_2$-linearity above, indexing $\tau$ the same way as $s(x,y,z)$) that $$\langle s(x,y,z),\tau\rangle=\cdots=f_z\Big(\sum_{i,j}x^iy^j\tau_{i,j}\Big)$$ Finally, I thought of defining the bivariate polynomial $p_\tau(x,y)=\sum\limits_{i,j}x^iy^j\tau_{i,j}$ and saying that, since it is a non-zero polynomial of degree at most $2c\sqrt{n}$, it attains each value of $\mathbb{F}_{2^\ell}$ with multiplicity at most $2c\sqrt{n}2^\ell$ (from Schwartz-Zippel), so $\forall\alpha\in\mathbb{F}_{2^\ell}:\Pr\limits_{x,y\in\mathbb{F}_{2^\ell}}[p_\tau(x,y)=\alpha]=O(\sqrt{n}/2^\ell)$.

I want to use this but I am stuck. Maybe we can say that the distribution of $p_\tau(x,y)$ is close enough to $U_{\mathbb{F}_{2^\ell}}$ in statistical distance to infer that the expected value of $f_z(p_\tau(x,y))$ is close enough to $1/2$?

Quick and space-efficient way to find whether two sets intersect

I hope you can help me.

Given a large number of sets of integers, I’d like to be able to ask quickly (i.e. in O(1)) whether any two given sets intersect. Note that I don’t need the exact intersection, just a yes/no answer. I am also fine with some false positives. Finally, the representation of the sets should be space-efficient (i.e. smaller than the set size).

Ideally, I’d also like to (infrequently) update the sets.

My requirements make me think of Bloom filters, which 1) represent sets efficiently, 2) allow an O(1) containment test and 3) have some false positives. Unfortunately, they don’t apply to two-set intersection.
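For reference, the three Bloom filter properties listed above look roughly like this in code. This is a minimal sketch, not a tuned implementation: the class name `BloomFilter`, the 256-bit size, the three hash functions and the use of seeded SHA-256 are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with a constant number of hash probes per item."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


edges = BloomFilter()
for edge_id in range(1, 51):
    edges.add(edge_id)
print(edges.might_contain(7))
```

The filter occupies a fixed number of bits regardless of how many ids are added, and each containment test touches only `num_hashes` positions, which is where the constant-time bound comes from.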

Any ideas? Thanks!

(Just FYI, the sets are subsets of ids of adjacent edges from a huge graph)

L-packets in the local Langlands correspondence: why finite sets?

Let $G$ be a connected, reductive group over a local field $k$, and let $^LG$ be the Langlands dual group. As explained by Borel in his article in the Corvallis proceedings, the general local Langlands correspondence should give (1) a partition of the classes of irreducible admissible representations of $G(k)$ into finite sets, called L-packets, and (2) a bijection between the L-packets and the equivalence classes of admissible homomorphisms of the Weil-Deligne group $W_k'$ into $^LG$.

When $G = \operatorname{GL}_n$, the L-packets are just singleton sets. I believe that only the local Langlands conjectures for $\operatorname{GL}_2$ were proved at the time Borel’s article was written. There were no worked out examples of L-packets with more than one element at the time, as far as I know.

Why did Borel and others in the 1970s expect the L-packets to be finite? Why do we still expect this today?