## A deterministic finite-state automaton for finding all (potentially overlapping) regular expression matches?

I was working on a bioinformatics practice problem named Finding a Protein Motif on rosalind.info. In essence, I was given a particular regular expression, `N[^P](S|T)[^P]`, and was asked to find all matches.

Solving that problem is not the goal here; I already have a ‘working’ solution. In essence, I manually designed a state machine that can find all matches for that regular expression.

And here is a ‘visualization’ of the state machine, to make it clear how it is manually designed.

```dot
digraph G {
    0 [label="0,''"]
    1 [label="1,'N'"]
    2 [label="2,'NN'"]
    3 [label="3,'NS|NX'"]
    4 [label="4,'NNS'"]
    5 [label="5,'NSS|NXS'"]
    0 -> 1 [label="N"]
    0 -> 0 [label="P"]
    0 -> 0 [label="S"]
    0 -> 0 [label="X"]
    1 -> 2 [label="N"]
    1 -> 0 [label="P"]
    1 -> 3 [label="S"]
    1 -> 3 [label="X"]
    2 -> 2 [label="N"]
    2 -> 0 [label="P"]
    2 -> 4 [label="S"]
    2 -> 3 [label="X"]
    3 -> 1 [label="N"]
    3 -> 0 [label="P"]
    3 -> 5 [label="S"]
    3 -> 0 [label="X"]
    4 -> 1 [label="N(accept)"]
    4 -> 0 [label="P"]
    4 -> 5 [label="S(accept)"]
    4 -> 0 [label="X(accept)"]
    5 -> 1 [label="N(accept)"]
    5 -> 0 [label="P"]
    5 -> 0 [label="S(accept)"]
    5 -> 0 [label="X(accept)"]
}
```

The classical theory allows us to convert a regular expression to a non-deterministic finite-state automaton and then convert it to a deterministic finite-state automaton through subset construction. In particular, subset construction guarantees that if there exists an accepting computation, then the deterministic finite-state automaton would also accept.

Let's say I have a regular expression that matches twice. The corresponding deterministic finite-state automaton would accept after the first match, but then it doesn’t know which state to move to in order to detect overlapping matches. I guess I could restart one character after the beginning of the first match, which in the worst case would probably lead to quadratic time, as we can imagine with `{.*}` on `{{{{{}`.

In the worst case, I expect quadratic time (e.g. `{{{{{}}}}}}`), but it would be great if the running time were output-sensitive, since I believe that in many cases the output size is not quadratic.

It would be great if the state machine I used to find all matches could be generalized (apparently it sometimes needs linear space, not just a single state) or designed automatically. Are there existing theories for that?
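For concreteness, here is a minimal Python sketch of the hand-built machine, transcribed from the graph above. The names `TRANSITIONS`, `PENDING`, `classify`, `dfa_matches` and `lookahead_matches` are made up for this illustration, and the symbol class `S` is assumed to stand for "S or T" while `X` stands for any other residue. A single pass suffices here because every match of this particular pattern has length four; the lookahead-based function is only a cross-check, not a claim about how the general problem should be solved.

```python
import re

# Transition table transcribed from the graph above.
# Symbol classes: "N", "P", "S" (assumed to mean S or T), "X" (anything else).
TRANSITIONS = {
    0: {"N": 1, "P": 0, "S": 0, "X": 0},
    1: {"N": 2, "P": 0, "S": 3, "X": 3},
    2: {"N": 2, "P": 0, "S": 4, "X": 3},
    3: {"N": 1, "P": 0, "S": 5, "X": 0},
    4: {"N": 1, "P": 0, "S": 5, "X": 0},
    5: {"N": 1, "P": 0, "S": 0, "X": 0},
}

# States in which the last three characters already form N[^P](S|T),
# so any non-P character completes a match.
PENDING = {4, 5}


def classify(ch):
    """Map a character to one of the four symbol classes."""
    if ch == "N" or ch == "P":
        return ch
    if ch in "ST":
        return "S"
    return "X"


def dfa_matches(text):
    """Return the start offsets of all (overlapping) matches in one pass."""
    starts = []
    state = 0
    for i, ch in enumerate(text):
        sym = classify(ch)
        if state in PENDING and sym != "P":
            starts.append(i - 3)  # the match covers text[i-3 : i+1]
        state = TRANSITIONS[state][sym]
    return starts


def lookahead_matches(text):
    """Cross-check: a zero-width lookahead consumes nothing,
    so finditer tries every start position."""
    return [m.start() for m in re.finditer(r"(?=N[^P][ST][^P])", text)]


print(dfa_matches("NNSSA"))        # [0, 1]
print(lookahead_matches("NNSSA"))  # [0, 1]
```

The machine never restarts after a match: the state itself remembers which shorter prefixes are still alive, which is exactly what breaks down for patterns such as `{.*}` whose matches have unbounded length.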

## “No engine matches” when importing accounts

Importing accounts to this engine shows “no engine matches” for all URLs.

```ini
[setup]
enabled=1
default checked=0
engine type=Article
description=
dofollow=1
anchor text=1
creates own page=1
uses pages=0
multiple posts per account=1
;;; API MAIN VARIABLES
[api_url]
type=extract
default=http://gsapi.local:9090
static=1
type=extract
back=}
static=1
[api_target_url]
type=extract
default=%targethost%
static=1
;;; API REQUIRED VARIABLES
[api_engine_name]
type=extract
default=test
static=1
;;; NORMAL VARIABLES
[URL]
type=url
[Anchor_Text]
type=text
alternate data=%spinfile-generic_anchor_text.dat%
[Article]
type=memo
allow html=1
must be filled=1
auto modify=0
custom mode=1
must be filled=1
hint=The login for websites that need an account. Use numbers and letters only.
min length=10
upcase=0
static=1
must be filled=1
hint=A password used for websites that need an account. Use numbers and letters only.
static=1
type=email
static=1

-----
[STEP1]
submit success="success":true
submit failed="success":false
submit failed retry=XXXXXXXXXXXXXXXXXXXXXXXX
verify submission=1
verify by=url
verify interval=10
verify timeout=99999999999999999999
first verify=5
verify on unknown status=0
[STEP2]
modify url=%api_target_url%
[STEP3]
post data=engine_name=%api_engine_name%&target_url=%api_target_url%&url=%url%
form request with=XMLHttpRequest
encode post data=3
```

## Created POST route, but resulting in RoutingError (No Route matches POST)

I am setting up a new route, `/api/v1/example_two`, that I can POST to (create); however, it results in a `No route matches [POST]` RoutingError.

I have tried declaring the `post` route explicitly and creating the route through `resources`.

`config/routes.rb`

```ruby
Rails.application.routes.draw do
  resources :roles, only: [:index], defaults: { format: :xml }

  defaults format: :json do
    scope :v1 do
      resources :example_one, only: [:create, :show], param: :uuid
      resources :example_two, only: [:create], param: :uuid
    end
  end
end
```

and I have a controller: `app/controllers/example_two.rb`

```ruby
class example_two < ApplicationController
  def create
    ...
  end
end
```

I expect it to return whatever is in `example_two#create`; however, it results in `ActionController::RoutingError (No route matches [POST] "/api/v1/example_two")`.

## Is there a way to filter out partial matches from search results on YouTube?

When searching for “JoJo’s Bizarre Adventure,” I frequently just search the keyword “jojo,” but the first results I see are often for a musical artist of the same name.

Instead, I’d prefer that my YouTube results not show anything pertaining to the artist, and only show results pertaining to the anime itself.

Is there a way to filter out specific content from the search results?

## How do I use matches to find name?

How do I use `matches` so that it returns true?

```js
var a = document.querySelector('.href');
console.log(a);
var a2 = a.matches('name');
var a3 = a.matches('ad');
console.log(a2);
console.log(a3);
```

```html
<a href="#" class="href" name='ad'></a>
```

## Matching Algorithm – How to construct a bipartite-like graph with heterogeneous matching rules

We have a set . Elements in this set can be matched according to the following rules:

The input to the matching algorithm is an array of variable size consisting of elements in . Each element in the array has a particular size or “quantity” that can be matched (I imagine this can simply be modeled as edge weights).

The first question is: how do we maximize the total quantity matched? The second question is: how do we optimize the time complexity?

Intuitively, I think the problem could be modeled as a weighted bipartite graph and solved with a max-flow algorithm. The challenge is that elements can be matched in different ways, so I’m not sure what the graph should look like given these extra rules, or whether they imply that a different approach should be used.
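To make the max-flow intuition concrete, here is a minimal Python sketch under loud assumptions: the actual matching rules are not reproduced here, so the element names, the quantities, and the `allowed` pairs below are entirely hypothetical, and the elements are assumed to split cleanly into a "left" and a "right" side. It uses a plain shortest-augmenting-path (Edmonds-Karp style) search; `max_matched_quantity` is a made-up name.

```python
from collections import defaultdict, deque


def max_matched_quantity(left, right, allowed):
    """left/right: dicts mapping a (hypothetical) element name to its quantity.
    allowed: iterable of (left_name, right_name) pairs that may be matched.
    Returns the maximum total quantity that can be matched."""
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities
    src, sink = "source", "sink"
    for name, qty in left.items():
        cap[src][("L", name)] += qty
    for name, qty in right.items():
        cap[("R", name)][sink] += qty
    unbounded = sum(left.values())  # large enough to never be the bottleneck
    for l, r in allowed:
        cap[("L", l)][("R", r)] += unbounded

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {src: None}
        queue = deque([src])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]

        # Push flow and update the residual graph.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical data: "a" may only match "x"; "b" may match "x" or "y".
print(max_matched_quantity({"a": 5, "b": 3}, {"x": 4, "y": 6},
                           [("a", "x"), ("b", "x"), ("b", "y")]))  # 7
```

If the real rules let an element match within its own group, or in more than two ways, the graph is no longer bipartite and this construction would need to change; that is exactly the open part of the question.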

## How to get exact matches on top of search results?

I’m using Search API with Solr for Drupal 8. I’ve added the title field to the index and given it the maximum boost. The problem is that when I search for a keyword, the exact match is not at the top of the results.

For example, assume I have content with the titles “test content”, “the test content”, “small test content”, etc. “test content” should be at the top of the list when I search for “test content”, but right now it appears below the others.

Any help is appreciated. Thank you.

## JSONLayout has no parameter that matches element KeyValuePair

Below is the JSONLayout configuration in log4j2.xml:

```xml
<JSONLayout complete="true" charset="UTF-8" compact="true">
    <KeyValuePair key="application-name" value="sample-app"></KeyValuePair>
</JSONLayout>
```

POM.xml

```
org.apache.logging.log4j:log4j-core:jar:2.7:compile
org.apache.logging.log4j:log4j-api:jar:2.7:compile
org.apache.logging.log4j:log4j-jul:jar:2.7:compile
com.fasterxml.jackson.core:jackson-databind:jar:2.9.9:compile
```

I see the message being printed in JSON format, but somehow the `KeyValuePair` element is not being recognized.

```
2019-06-01 21:11:23,305 localhost-startStop-1 ERROR layout JSONLayout has no parameter that matches element KeyValuePair
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
```

Any idea why keyvaluepair is not recognized?

## Send email notification when a data in column matches a certain value

I am trying to make Google Sheets send a notification to one or more particular people when one of the cells in column Q is equal to the word “High”. I want the email to be sent when the value is “High” because the column holds the priority of the task.

How can I do this?

Any help would be greatly appreciated!

## Function to check if received message matches any of the expected messages

I have a message coming in and I need to match it against the expected messages. The program will eventually do something as a result of receiving those messages. I am not very experienced at programming, but surely there is a better way to declare all the possible messages, for example in a separate entity, and then use them within this HexSearch.cpp file?

I tried to search for how to do that, but I couldn’t find the right words to ask a search engine. There are many more messages than those shown here that still need to be declared; this is just a sample, and I already don’t like looking at it.

```cpp
#include "HexSearch.h"

void searchFunction(int num, char msg[]) {

    static const char readReq[] = { 0x92 };
    static const char readResp[] = { 0x00, 0x02, 0x12, 0x34, 0xA1 };

    static const char writeReq[] = { 0x0A, 0xE0 };
    static const char writeResp[] = { 0x00, 0x02, 0x11, 0x01, 0x98 };

    static const char resetReq[] = { 0x00, 0xFF };
    static const char resetResp[] = { 0x00, 0x21, 0x23, 0x0E, 0xAE, 0x11, 0x3A };

    static const char verReq[] = {0x00, 0xA2};
    static const char verResp[] = {0x00, 0x03, 0x82, 0xAA, 0x07, 0x88, 0xA9};

    static const char typeReq[] = {0x00, 0x67};
    static const char typeResp[] = {0x00, 0x03, 0x00, 0x00, 0xC4, 0x77};

    static const char askReq[] = {0x00, 0x55};
    static const char askResp[] = {0x00, 0x01, 0xFE, 0xFF};

    if (num == 4) {
        replyMsg(msg, 2, 3, readReq, readResp, sizeof(readResp) / sizeof(readResp[0]));
    }
    else if (num == 5) {
        replyMsg(msg, 2, 4, writeReq, writeResp, sizeof(writeResp) / sizeof(writeResp[0]));
        replyMsg(msg, 2, 4, resetReq, resetResp, sizeof(resetResp) / sizeof(resetResp[0]));
        replyMsg(msg, 2, 4, verReq, verResp, sizeof(verResp) / sizeof(verResp[0]));
        replyMsg(msg, 2, 4, typeReq, typeResp, sizeof(typeResp) / sizeof(typeResp[0]));
        replyMsg(msg, 2, 4, askReq, askResp, sizeof(askResp) / sizeof(askResp[0]));
    }
}

void replyMsg(char msg[], int startArr, int endArr, const char* receiv, const char* resps, int respL) {
    if (std::equal(msg + startArr, msg + endArr, receiv)) {
        for (int x = 0; x < respL; x++) {
            serialPC.putc(resps[x]);
        }
    }
}
```

The code works. I am interested in improving it only. `num` is the total number of bytes of a message. E.g. `readReq` has one byte of data, but has also got 2 start bytes and 1 end byte so a total of 4. `readResp` array has the 2 start bytes, 2 data bytes, and one end byte and so it has a total size of 5 bytes. The 2nd byte is the one which specifies the length of a message. `msg[]` is the message coming in from a serial connection essentially.

As an example, if `msg[] = { 0x00, 0x01, 0x92, 0x56 }` then `num = 4`, and `replyMsg` will compare the 3rd byte, see that it matches `readReq`, and so output `readResp`.
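The framing described above (two start bytes, the data bytes, one end byte) suggests keeping the request/response pairs in a single lookup table rather than in separate `if` branches. The following is only a language-neutral sketch of that idea, written in Python for brevity; `RESPONSES` and `lookup_response` are hypothetical names, only three of the pairs are copied over, and in the C++ file the same table could presumably be a `static const` array of structs searched in a loop.

```python
# Request payload (the data bytes only) -> full response frame.
# Byte values are copied from the arrays in the question.
RESPONSES = {
    bytes([0x92]): bytes([0x00, 0x02, 0x12, 0x34, 0xA1]),        # read
    bytes([0x0A, 0xE0]): bytes([0x00, 0x02, 0x11, 0x01, 0x98]),  # write
    bytes([0x00, 0xA2]): bytes([0x00, 0x03, 0x82, 0xAA, 0x07, 0x88, 0xA9]),  # version
}


def lookup_response(msg):
    """Return the response frame for an incoming frame, or None if unknown.

    Assumes the layout described above: 2 start bytes, data, 1 end byte.
    """
    payload = bytes(msg[2:-1])  # strip the 2 start bytes and the end byte
    return RESPONSES.get(payload)


reply = lookup_response(bytes([0x00, 0x01, 0x92, 0x56]))
print(reply.hex())  # 00021234a1
```

With a table like this, adding a new message means adding one entry, and the message length no longer needs its own branch because the payload is sliced out of the frame before the lookup.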