Sliding Window Dictionary String Matching

Consider the following problem. We are given a set of patterns (strings) $\Pi = \{\pi_i\}$, a text $s$, and a window length $k$. We want a list of all shifts $0 \le i \le |s|-k$ such that every pattern in $\Pi$ is contained in the substring $s[i:i+k]$.

Can this be solved in linear or near-linear time? It can of course be solved in quadratic time $O(|s| |\Pi| + \sum_i |\pi_i|)$ by finding all pattern occurrences with KMP or Aho-Corasick and then post-processing them for each window.
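To make the baseline concrete, here is a minimal sketch of the post-processing step. It is only an illustration under some assumptions: `occurrences` and `windowShifts` are hypothetical names, patterns are assumed to be non-empty, and `String.prototype.indexOf` stands in for KMP or Aho-Corasick, so the matching phase here does not have the running time of those algorithms. For each pattern, every occurrence at position $p$ makes the shifts $p + |\pi| - k \le i \le p$ valid, and the union of these intervals is counted once per pattern.

```javascript
// Hypothetical helper: all start positions of `pattern` in `text` (overlaps allowed).
// indexOf is a stand-in for KMP / Aho-Corasick; patterns are assumed non-empty.
function occurrences(text, pattern) {
  if (pattern.length === 0) throw new Error("patterns must be non-empty");
  const positions = [];
  let from = text.indexOf(pattern);
  while (from !== -1) {
    positions.push(from);
    from = text.indexOf(pattern, from + 1);
  }
  return positions;
}

// Returns every shift i with 0 <= i <= |s| - k such that each pattern
// occurs entirely inside s[i : i + k].
function windowShifts(s, patterns, k) {
  const lastShift = s.length - k;
  if (lastShift < 0) return [];
  const covered = new Array(lastShift + 1).fill(0); // patterns satisfied per shift
  for (const pattern of patterns) {
    let nextFree = 0; // first shift not yet counted for this pattern
    for (const p of occurrences(s, pattern)) {
      // An occurrence at p fits in the window iff p + |pattern| - k <= i <= p.
      const lo = Math.max(nextFree, p + pattern.length - k, 0);
      const hi = Math.min(p, lastShift);
      for (let i = lo; i <= hi; i++) covered[i]++;
      if (hi + 1 > nextFree) nextFree = hi + 1;
    }
  }
  const result = [];
  for (let i = 0; i <= lastShift; i++) {
    if (covered[i] === patterns.length) result.push(i);
  }
  return result;
}

console.log(windowShifts("xabcayabca", ["ab", "ca"], 4)); // [ 1, 6 ]
```

Because each shift is incremented at most once per pattern, the post-processing costs $O(|s| |\Pi|)$ plus the number of occurrences, which matches the quadratic bound above rather than improving on it.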

The motivation for this problem is finding matches for a topic (represented by the set of patterns) in a text. In that context it actually makes sense to require the matches to be non-overlapping, so I’m also interested in that case, but it might be easier to start with the relaxed version.

I would also be interested in generalizations of the problem that allow for approximate matches of some kind, e.g., only requiring a threshold on the size of the subset of matching patterns, allowing matches within a given edit distance, or something using hidden Markov models or general probabilistic graphical models. I would be surprised if any such generalization can be solved in subquadratic time, though.

Object in string format as a value inside a JSON

Well, I have the following JSON:

{ "creatorId": "#1", "data": { "id": "10", "creator": "#1" }, "subs": ["1"] }

But the content of data must not be an object; it must be an object in string format.

{ "creatorId": "#1", "data": "{ \"id\": \"10\", \"creator\": \"#1\" }", "subs": ["1"] }

All of this is because my intention is to take the data field from the JSON and convert it into a JS object.

const dataObj = JSON.parse(recoverJson.data); 

Of course, the simplest thing would be to pass the object directly and not have to extract the string and so on. But it is a requirement of the API that what arrives is a string.

How could I build the JSON with a “stringified” object?
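One possible way, shown here only as a minimal sketch, is to call `JSON.stringify` on the inner object first and then again on the outer one. The variable names `data`, `payload`, and `body` are assumptions for illustration; `recoverJson` and `dataObj` follow the snippet above.

```javascript
// Inner object that the API expects to receive as a string.
const data = { id: "10", creator: "#1" };

// Stringify the inner object first, then place the result in the outer object.
const payload = {
  creatorId: "#1",
  data: JSON.stringify(data),
  subs: ["1"],
};

// Serializing the outer object escapes the quotes of the inner string automatically.
const body = JSON.stringify(payload);

// Receiving side: parse the outer JSON, then parse the data field again.
const recoverJson = JSON.parse(body);
const dataObj = JSON.parse(recoverJson.data);
console.log(typeof recoverJson.data, dataObj.id);
```

The escaped quotes (`\"`) are produced by the second `JSON.stringify` call, so they do not need to be written by hand.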

Algorithm to find repeated patterns in a large string

For optimization purposes I’m trying to analyze a large list of executed program commands to find chunks of commands that are executed over and over again. This problem is similar to searching for repeated substrings in a string. However, in my case I’m not looking for the longest substrings but rather for smaller substrings that occur very often.

For example, say each command is represented by a letter; then a program might look like xabca yabca xabca yabca. If we are looking for the longest repeated substrings, the best result is xabca yabca. A “better” result would be abca, though. While being shorter, it occurs more often in the string. a occurs even more often on its own, but it would be considered too short a match. So an algorithm should be parameterizable by a minimum and maximum chunk length.

Things I have tried so far:

  • I played with suffix trees to find the longest repeated substrings that occur at least k times. While that is simple to implement, it doesn’t work well in my use case, because overlapping substrings are found as well. Trying to remove those wasn’t very successful either. The approach mentioned in this post either gave wrong or incomplete results (or I misunderstood the approach), and it also doesn’t seem to be customizable. Suffix trees still seem the most promising approach to me. Perhaps someone has an idea how to include the minimum/maximum chunk lengths in the search here?
  • Another attempt was using the substring table that is created for the LZW compression algorithm. The problem with this approach is that it doesn’t find repeated chunks that occur early, and it also creates longer and longer table entries the further it processes the input (which makes sense for compression but not in my use case).
  • My best solution so far is the brute-force approach, i.e. building a dictionary of every possible substring and counting how often it occurs in the input. However, this is slow for large inputs and consumes a huge amount of memory.
  • Another idea was searching for single commands that occur most frequently, and then somehow inspecting the local environments of those commands for repeated patterns. I didn’t come up with a good algorithm here, though.
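For reference, the brute-force idea from the list above, restricted to a minimum and maximum chunk length and counting only non-overlapping occurrences, might look roughly like the sketch below. The function name `countChunks` and its parameters are assumptions for illustration, and the non-overlap rule used here is a simple greedy left-to-right one.

```javascript
// Counts non-overlapping occurrences of every substring whose length lies in
// [minLen, maxLen]. Brute force: time and memory grow with
// text.length * (maxLen - minLen + 1).
function countChunks(text, minLen, maxLen) {
  const stats = new Map(); // chunk -> { count, lastEnd }
  for (let len = minLen; len <= maxLen; len++) {
    for (let i = 0; i + len <= text.length; i++) {
      const chunk = text.slice(i, i + len);
      let entry = stats.get(chunk);
      if (!entry) {
        entry = { count: 0, lastEnd: 0 };
        stats.set(chunk, entry);
      }
      // Greedy rule: count an occurrence only if it starts at or after
      // the end of the previously counted occurrence of the same chunk.
      if (i >= entry.lastEnd) {
        entry.count++;
        entry.lastEnd = i + len;
      }
    }
  }
  return [...stats]
    .map(([chunk, entry]) => ({ chunk, count: entry.count }))
    .filter((entry) => entry.count > 1) // keep repeated chunks only
    .sort((a, b) => b.count - a.count || b.chunk.length - a.chunk.length);
}

const ranked = countChunks("xabcayabcaxabcayabca", 4, 5);
console.log(ranked[0]); // { chunk: 'abca', count: 4 }
```

On the example program (written without spaces) this ranks abca first with four occurrences, ahead of xabca and yabca with two each. It has the same speed and memory problems described in the list, so it is only practical for small inputs or narrow length ranges.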

What other algorithms are there that could be useful in this scenario? What I’m looking for is not necessarily the best match but a good heuristic. My input data is pretty big, with strings up to about 100MB in length; the chunk sizes will usually be in the range from 10 to 50.

Changing the connection string at runtime

I have an application written in VB.NET that I run on two or three different networks. Is there a way to test the connection to a database and, if it does not respond, use another connection string? That is, if Conexion1 does not work, use Conexion2.

I use Entity Framework to connect to the databases.

My idea is something like this:

Regards

How does a ghost hunter use the screwdriver and string in the “ghost hunting kit”?

I am playing a prefab character (Maximilian Hirst, parapsychologist) who has a ghost hunting kit consisting of a thermometer, string, screwdriver, and bible. Neither the GM nor I could come up with a reason or method of use for the string and screwdriver. Is it a plumb line thing? Some kind of ley line detection?

The string returned by my method is assigned to a [value] and I need to pass it to an [(ngModel)]

I have a method that saves an image in Firebase Storage; it returns the URL, and I assign it to a [value] like this:

<div class="form-group">                     <input                          type="text"                         [value]="urlImage | async"                         class="form-control"                         name="avatar"                         #avatar="ngModel"                         [(ngModel)]="perfilService.seleccionarPerfil.avatar"                         placeholder="Url">

But the [(ngModel)] does not accept this assignment. How could I pass the value I get from [value] so that [(ngModel)] validates it for me?