Simple Uniform Hashing Assumption and worst-case complexity for hash tables

My question: Is the Simple Uniform Hashing Assumption (SUHA) sufficient to show that the worst-case amortized time complexity of hash table lookups is O(1)?

The Wikipedia article says that this assumption implies that the average length of a chain is $\alpha = n / m$ (for $n$ keys stored in $m$ buckets), but…

  • …this is true even without this assumption, right? If the distribution is [4, 0, 0, 0] the average length is still 1.
  • …this is a probabilistic statement, which is of little use when discussing worst case complexity, no?

It seems to me like a different assumption would be needed. Something like:

The difference between the largest and smallest bucket is bounded by a constant factor.

Maybe this is implied by SUHA? If so, I don't see how.
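
To make the first bullet concrete, here is a minimal sketch (the helper name `chain_stats` is made up for this illustration). It computes the average and the longest chain for two tables that each hold 4 keys in 4 buckets. The average is the same for both, while the longest chain, which is what a worst-case lookup has to walk, is not.

```python
def chain_stats(chain_lengths):
    """Return (average chain length, longest chain) for a list of bucket sizes."""
    n = sum(chain_lengths)   # number of stored keys
    m = len(chain_lengths)   # number of buckets
    return n / m, max(chain_lengths)


skewed = [4, 0, 0, 0]   # every key landed in the same bucket
even = [1, 1, 1, 1]     # one key per bucket

print(chain_stats(skewed))  # (1.0, 4)
print(chain_stats(even))    # (1.0, 1)
```

The average is just n / m by arithmetic, whatever the hash function does. Only the maximum depends on how the keys are spread.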

Irreversibly hash email addresses while preserving format/entropy

I’m looking to parse mail log data as part of a machine learning project. As part of that I need to ensure that the processed data has been scrubbed of personally identifiable information (specific users/client domain etc).

A quick search led me to pyffx and the built-in secrets module.

I'm looking to scrub the email addresses while retaining the formatting and the beginning character sequence:

    #!/usr/bin/env python
    # coding: utf-8

    import pyffx, secrets

    def ffx_encrypt(email, secret):
        raw_user, raw_domain = email.split('@')
        # retaining first few characters to test entropy of bulk sender lists
        user_chars = raw_user[:3]
        user_rem = raw_user[3:]
        # get unique characters for each string to retain entropy
        uniq_user_chars = ''.join(set(raw_user))
        uniq_dom_chars = ''.join(set(raw_domain))

        e_user = pyffx.String(secret, alphabet=uniq_user_chars, length=len(user_rem))
        e_dom = pyffx.String(secret, alphabet=uniq_dom_chars, length=len(raw_domain))

        user_encrypt = e_user.encrypt(user_rem)
        dom_encrypt = e_dom.encrypt(raw_domain)

        return user_chars + user_encrypt + '@' + dom_encrypt

    # To be generated at runtime
    secret = secrets.token_hex(32).encode()

    print(ffx_encrypt('test1@gmail.com', secret))
    print(ffx_encrypt('firstname_surname1@mail.net', secret))
    print(ffx_encrypt('username1@mail.co.uk', secret))
    print(ffx_encrypt('username1@gmail.com', secret))
    print(ffx_encrypt('username1@mail.net', secret))
    print(ffx_encrypt('bounce-mc.uk1147123_813.721605-sue.test=mail.net@mail555.atl123.test.net', secret))

    ## Sample run results
    # teste@limigooac
    # firms_smnnueefrna_@tnmaenmi
    # userersua@k.m.auuamo
    # userersua@limigooac
    # userersua@tnmaenmi
    # bout50um7=8s_t43n07s0.6tn5knt0e366u-7c73bl3_2iio@1.eisnss5i1l32s.3.ea..3

At the moment I'm not focused on performance, elegance or robustness. I mainly want to avoid stepping on a landmine: an obvious flaw in my implementation that would make the post-processed email addresses/domains reversible.

Feedback would be greatly appreciated.

Authenticate a user with a hash and a helper

I have the following code to authenticate a user.

    [HttpPost]
    [ValidateAntiForgeryToken]
    public virtual async System.Threading.Tasks.Task<ActionResult> Index1(LoginViewModel model, string returnUrl)
    {
        string sql = @"SELECT COUNT(*)
                       FROM USURAIO
                       WHERE USUARIO_ACCESO = @usuario AND PASSWORD = @password";

        using (SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["default"].ToString()))
        {
            conn.Open();

            SqlCommand command = new SqlCommand(sql, conn);
            // The parameter name must match the @usuario placeholder used in the query.
            command.Parameters.AddWithValue("@usuario", model.Username);

            string hash = Helper.EncodePassword(string.Concat(model.Username, model.Password));
            command.Parameters.AddWithValue("@password", hash);

            int count = Convert.ToInt32(command.ExecuteScalar());

            if (count == 0)
                return Json(new { success = false, message = false }, JsonRequestBehavior.AllowGet);
            else
                return Json(new { success = true, message = true, url = Url.Action("Index", "Home") }, JsonRequestBehavior.AllowGet);
        }
    }

However, I get an error on the helper saying that it does not exist in the current context. As I understand it, a using directive should make it work, but I don't know which one it is or how to fix this. I would appreciate any help.

PostgreSQL using hash join with small table

I have a view (B) that returns ~20M records and a table (A) that has ~50M records. If I run A INNER JOIN B ON A.id = B.id, the planner performs a hash join, as expected. If I run B WHERE B.id IN ('value','value','value'), it performs a nested loop with an index scan on B, as expected. But if I run B WHERE B.id IN (SELECT id FROM A LIMIT 3), it performs a hash join, and performance degrades terribly. Why is that? I tried disabling hash joins, but the result is even worse: it still uses sequential scans on both sides.

Does putting salt first make it easier for attacker to bruteforce the hash?

Many guides for storing passwords recommend hash(salt + password) rather than hash(password + salt).

Doesn't putting the salt first make it much faster for the attacker to bruteforce the password? They can precompute the state of the hashing function after it has consumed the bytes of the salt, and then, for each of their billions or trillions of attempts, they only need to finish calculating the hash using the bytes of the password.

In other words, each bruteforce iteration needs to calculate only the hash of the password intermediateHashState(password) instead of the whole hash(salt + password).

And if the salt was placed after the password, the attacker wouldn’t have this shortcut.

Does this advantage exist and is it significant?
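
To show what I mean by an intermediate state, here is a minimal sketch using Python's hashlib. The salt value, the candidate passwords and the two helper names are made up for this illustration, and SHA-256 stands in for whatever hash is actually used. The state is saved after it has been fed the salt, and each guess continues from a copy of that state.

```python
import hashlib

# Hypothetical values, for illustration only.
salt = b"example-salt-0123456789"
candidates = [b"123456", b"password", b"hunter2"]

# State of the hash function after it has been fed only the salt.
salt_state = hashlib.sha256()
salt_state.update(salt)


def hash_from_saved_state(password):
    """Continue from a copy of the saved salt state instead of starting over."""
    state = salt_state.copy()
    state.update(password)
    return state.hexdigest()


def hash_from_scratch(password):
    """Plain hash(salt + password), computed from the beginning each time."""
    return hashlib.sha256(salt + password).hexdigest()


for guess in candidates:
    assert hash_from_saved_state(guess) == hash_from_scratch(guess)
```

This only shows that the saved state can be reused and gives the same digest. It does not measure how much work, if any, is actually saved per guess.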

Given a family of hash functions in table form, how can I know whether it’s universal?

I’ve been given the following two families of hash functions:

[Images: tables defining the families H and G]

Each family has three functions $\{0,1,2,3,4\} \to \{0,1,2\}$ that can be seen in the tables above. For each family I need to decide whether it's universal. I know that a family of hash functions $F$ is called universal if for every $x \neq y$, $\text{Pr}(f(x) = f(y)) \le \frac{1}{m}$ ($f$ is a function in $F$, and $m$ is the size of the range). However, I don't understand how to calculate this probability. Should I calculate it for any one of the functions or for the whole family?
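
Under the usual reading of the definition, $x$ and $y$ are fixed and the probability is taken over a function $f$ drawn uniformly at random from the family. For a family given as a table, that amounts to counting, for each pair $x \neq y$, how many of the functions collide on it. Below is a minimal sketch of that check. The families `family_a` and `family_b` are made-up examples, not the H and G from the tables, and `is_universal` is a name chosen for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def is_universal(family, m):
    """family: list of functions, each given as a list of outputs indexed by input.

    Returns True if, for every pair x != y, the fraction of functions in the
    family with f(x) == f(y) is at most 1/m.
    """
    domain = range(len(family[0]))
    for x, y in combinations(domain, 2):
        collisions = sum(1 for f in family if f[x] == f[y])
        if Fraction(collisions, len(family)) > Fraction(1, m):
            return False
    return True


# Hypothetical family: no pair of inputs collides under more than one function.
family_a = [
    [0, 0, 1, 1, 2],
    [0, 1, 0, 2, 1],
    [0, 2, 1, 0, 1],
]

# Hypothetical family: inputs 0 and 1 collide under two of the three functions.
family_b = [
    [0, 0, 1, 1, 2],
    [0, 1, 0, 2, 1],
    [0, 0, 2, 1, 1],
]

print(is_universal(family_a, 3))  # True
print(is_universal(family_b, 3))  # False
```

In `family_b` the pair (0, 1) collides with probability 2/3, which exceeds 1/3, so a single bad pair is enough to rule the family out.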

Is there a good way to hash abstract binding trees?

The hash function should be invariant under alpha-renaming. Using de Bruijn notation seems to be possible, but it requires alpha-converting the whole tree when a binding is created, and has the unhappy consequence that a substructure of an abt is not a well-formed abt (since the de Bruijn indices are broken). So, is there a good (neat, elegant and/or efficient) way to construct such a function? By the way, is there any study on this matter? Any help is appreciated!
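
For reference, here is a minimal sketch of the de Bruijn idea mentioned above, assuming a toy term encoding as nested tuples (`("var", name)`, `("lam", name, body)`, `("app", fn, arg)`). The encoding and the names `canonical` and `alpha_hash` are made up for this illustration. The tree keeps its names, and the indices are computed on the fly while hashing, by tracking the enclosing binders.

```python
import hashlib


def canonical(term, env=()):
    """Render a term with bound variables replaced by de Bruijn indices.

    env lists the names of the enclosing binders, innermost first.
    """
    tag = term[0]
    if tag == "var":
        name = term[1]
        for index, bound in enumerate(env):
            if bound == name:
                return f"b{index}"      # bound: distance to its binder
        return f"f:{name}"              # free: keep the name itself
    if tag == "lam":
        _, name, body = term
        return "(lam " + canonical(body, (name,) + env) + ")"
    if tag == "app":
        _, fn, arg = term
        return "(app " + canonical(fn, env) + " " + canonical(arg, env) + ")"
    raise ValueError(f"unknown term tag: {tag!r}")


def alpha_hash(term):
    """Hash of the name-free rendering, so alpha-equivalent terms agree."""
    return hashlib.sha256(canonical(term).encode("utf-8")).hexdigest()


const_xy = ("lam", "x", ("lam", "y", ("var", "x")))
const_ab = ("lam", "a", ("lam", "b", ("var", "a")))
second_xy = ("lam", "x", ("lam", "y", ("var", "y")))

assert alpha_hash(const_xy) == alpha_hash(const_ab)
assert alpha_hash(const_xy) != alpha_hash(second_xy)
```

This avoids rewriting the tree, but it does not address the substructure concern: the hash of a subterm still depends on the binders around it, so subterm hashes cannot simply be cached and reused.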