Is Allow: in robots.txt an exclusion to an exclusion?

Say I have a MediaWiki website and all the Special: namespace pages are excluded in robots.txt with Disallow: Special:, but there are a few specific Special: pages that I do want to include:

  • Allow:Special:RecentChanges
  • Allow:Special:RandomPage
  • Allow:Special:Categories

Is Allow: in robots.txt an exclusion to an exclusion?

To ask a more specific, two-part question: is the code above what I need to add to robots.txt, and is it correct to say that these Allow: lines are "exclusions to the (general) exclusion"?
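For reference, here is a minimal sketch of what such a robots.txt could look like. It assumes the Special: pages are served under paths beginning with /wiki/ (adjust the prefix to your wiki's URL layout), and note that robots.txt paths start with a slash and that a space follows the directive name:

    User-agent: *
    Disallow: /wiki/Special:
    Allow: /wiki/Special:RecentChanges
    Allow: /wiki/Special:RandomPage
    Allow: /wiki/Special:Categories

Under Google's documented rule that the most specific (longest) matching path wins, the three Allow: lines carve exceptions out of the broader Disallow:, i.e. they act as exclusions to the general exclusion.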

Robots Exclusion Protocol Specification

Three articles from Google regarding changes to the Robots Exclusion Protocol:

Quote:

For 25 years, the Robots Exclusion Protocol (REP) has been one of the most basic and critical components of the web. It allows website owners to exclude automated clients, for example web crawlers, from accessing their sites – either partially or completely….

…the REP was never turned into an official Internet standard, which means that developers have interpreted the protocol somewhat differently over the years. And since its inception, the REP hasn’t been updated to cover today’s corner cases….

…Together with the original author of the protocol, webmasters, and other search engines, we’ve documented how the REP is used on the modern web, and submitted it to the IETF…


Full article with specifics on the changes: Formalizing the Robots Exclusion Protocol Specification, July 01, 2019

Quote:

… we open sourced the C++ library that our production systems use for parsing and matching rules in robots.txt files. This library has been around for 20 years and it contains pieces of code that were written in the 90’s. Since then, the library evolved; we learned a lot about how webmasters write robots.txt files and corner cases that we had to cover for, and added what we learned over the years also to the internet draft when it made sense.
We also included a testing tool in the open source package to help you test a few rules….


Read more in Google’s robots.txt parser is now open source, July 01, 2019

Quote:

…In the interest of maintaining a healthy ecosystem and preparing for potential future open source releases, we’re retiring all code that handles unsupported and unpublished rules (such as noindex) on September 1, 2019. For those of you who relied on the noindex indexing directive in the robots.txt file, which controls crawling, there are a number of alternative options:


Read the full article on unsupported rules in robots.txt, July 02, 2019

What is a counterexample for Lamport’s distributed mutual exclusion algorithm with non-FIFO message queues?

Lamport’s distributed mutual exclusion algorithm (also described here) solves the mutual exclusion problem for $N$ processes with $3(N-1)$ messages per request (one “take and release lock” cycle).

It requires that, for each pair of processes $P$ and $Q$, all messages sent by $P$ to $Q$ are received and processed by $Q$ in the order they were sent. E.g., if $P$ sends messages $m_1$ and $m_2$ in that order, $Q$ cannot receive $m_2$ before receiving $m_1$.

I would like to see how the algorithm breaks if I remove that first-in-first-out condition and allow reordering. The only counterexample I was able to build uses two processes that want to acquire the shared resource:

  1. $P$ starts with clock 10 and sends request $m_1$ to $Q$
  2. $Q$ starts with clock 1 and sends request $m_2$ to $P$
  3. $Q$ receives request $m_1$ with timestamp 10 and sends acknowledgement message $m_3$ to $P$
  4. $P$ receives message $m_3$ before $m_2$ and enters the critical section. As far as $P$ is concerned, it is the only process wanting the resource
  5. $P$ receives message $m_2$ and responds to $Q$ with an acknowledgement
  6. $Q$ enters the critical section: $Q$'s request has timestamp 1 and $P$'s request has timestamp 10, so $Q$ has priority

However, that requires $P$ to respond to $Q$'s request $m_2$ while inside the critical section. Otherwise, $Q$ will receive the acknowledgement only after $P$ has left the critical section, and there will be no conflict.
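To make the trace concrete, here is a minimal Python sketch of the interleaving above (the class and helper names are ad hoc, not from Lamport's paper). It applies the usual entry rule: a process may enter the critical section once its own request carries the smallest timestamp in its queue and it has received a message with a larger timestamp from every other process.

    class Proc:
        def __init__(self, name, clock):
            self.name = name
            self.clock = clock
            self.queue = {}   # known requests: process name -> request timestamp
            self.seen = {}    # largest timestamp received from each other process

        def request(self):
            # timestamp the request with the current clock value
            self.queue[self.name] = self.clock
            return ("REQ", self.name, self.clock)

        def receive(self, msg):
            kind, sender, ts = msg
            self.clock = max(self.clock, ts) + 1          # Lamport clock update
            self.seen[sender] = max(self.seen.get(sender, 0), ts)
            if kind == "REQ":
                self.queue[sender] = ts
                return ("ACK", self.name, self.clock)     # acknowledge every request
            return None

        def may_enter(self, others):
            mine = self.queue[self.name]
            first = all(mine < ts for n, ts in self.queue.items() if n != self.name)
            acked = all(self.seen.get(o, 0) > mine for o in others)
            return first and acked

    P, Q = Proc("P", 10), Proc("Q", 1)
    m1 = P.request()               # 1. P requests with timestamp 10
    m2 = Q.request()               # 2. Q requests with timestamp 1
    m3 = Q.receive(m1)             # 3. Q receives m1 and acknowledges it
    P.receive(m3)                  # 4. the ACK overtakes m2, so P sees no competitor...
    p_enters = P.may_enter(["Q"])  #    ...and enters the critical section (True)
    ack = P.receive(m2)            # 5. P, still inside, acknowledges Q's request
    Q.receive(ack)                 # 6. Q's request (timestamp 1) heads its queue...
    q_enters = Q.may_enter(["P"])  #    ...so Q enters as well (True)
    print(p_enters, q_enters)      # True True: both are in the critical section

Since $P$ has not yet sent a release, both checks succeed at the same time, which is exactly the violation described above.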

The question is: how can one construct a counterexample in which processes do not respond to external messages while in the critical section?

Inclusion Exclusion Application

If $A$, $B$, $C$ are finite sets, then the number of elements in exactly one of the sets $A$, $B$, $C$ is

$$n(A)+n(B)+n(C)-2\,n(A \cap B)-2\,n(A \cap C)-2\,n(C \cap B)+3\,n(A \cap B \cap C).$$

I can derive the above through inclusion-exclusion, but is there a general formula for $n$ finite sets?
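For reference, the standard generalization (not derived in the question itself) can be stated as follows: writing $S_k$ for the sum of the sizes of all $k$-fold intersections of the sets $A_1,\dots,A_n$, the number of elements lying in exactly one of the sets is

$$\sum_{k=1}^{n} (-1)^{k-1}\,k\,S_k, \qquad \text{where } S_k = \sum_{1 \le i_1 < \dots < i_k \le n} n\!\left(A_{i_1} \cap \dots \cap A_{i_k}\right).$$

For $n = 3$ the coefficients $1, -2, 3$ recover the three-set formula above.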

Conditionally formatting duplicate values in Google Sheets with exclusion criteria?

I’m currently working in a Google sheet where I use the following formula to catch duplicate values with conditional formatting:

=COUNTIF($B:$B,B2)>1

While this works on its own, I find that it does not account for duplicate values that I do not want to be counted. In column O, I have values marking the row as “Canceled”. So I want Google Sheets to conditionally format cells only if the following criteria are met:

  • The value in column B has a duplicate

AND

  • Neither duplicate value has the word “canceled” in column O for that row.

This is the formula I tried, but it no longer formats duplicates:

=AND((COUNTIF($B:$B,B2)>1=TRUE),ISNUMBER(SEARCH("Cancelled",$O:$O)=FALSE))

I think what this has done is tell the logic not to format duplicates unless there are no instances of “canceled” anywhere in column O.
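As a rough sketch (assuming the cancelled rows contain the exact text "Cancelled" in column O and the formatted range starts at row 2; adjust the spelling or add wildcards to match your data), one way to express both conditions is to restrict the duplicate count to non-cancelled rows and also test the current row:

    =AND($O2<>"Cancelled", COUNTIFS($B:$B, B2, $O:$O, "<>Cancelled") > 1)

Here COUNTIFS only counts rows whose column O is not "Cancelled", so a cancelled duplicate no longer pushes the count above 1, and the first condition keeps cancelled rows themselves from being highlighted.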

Loop download and exclusion list file

I created the following program to download some files from NCBI (GEO):

    for (i in 1:5) {
      x <- GEO[i, 1]
      myPath <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/?acc=", x, "&format=file")
      download.file(myPath, paste0(x, ".tar"))
      out <- tryCatch(
        {
          message("This is the 'try' part")
          readLines(con = myPath, warn = FALSE)
        },
        error = function(cond) {
          message(paste("URL does not seem to exist:", myPath))
          message("Here's the original error message:")
          message(cond)
          # Choose a return value in case of error
          return(NA)
        },
        warning = function(cond) {
          message(paste("URL caused a warning:", myPath))
          message("Here's the original warning message:")
          message(cond)
          # Choose a return value in case of warning
          return(NULL)
        },
        finally = {
          message(paste("Processed URL:", myPath))
          message("Some other message at the end")
        }
      )
      return(out)
    }

I'm trying to use tryCatch in the loop, but it is not working as I expected.

How does one write a tryCatch loop so that (i) when a URL is wrong, the code does not stop and continues downloading the rest of the URL list, and (ii) when a URL is wrong, the code saves those “wrong URLs” in a separate file (an exclusion list)?
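One possible sketch (not a drop-in fix; it assumes GEO is a data frame whose first column holds the accession IDs, and it writes the failures to a hypothetical file named excluded_urls.txt) is to wrap only the download in tryCatch, return a success flag, and collect the failing URLs:

    failed <- character(0)

    for (i in 1:5) {
      x      <- GEO[i, 1]
      myPath <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/?acc=", x, "&format=file")

      ok <- tryCatch({
        download.file(myPath, destfile = paste0(x, ".tar"), mode = "wb")
        TRUE                                   # reached only if the download succeeded
      },
      error = function(cond) {
        message("URL does not seem to exist: ", myPath)
        FALSE
      },
      warning = function(cond) {
        message("URL caused a warning: ", myPath)
        FALSE
      })

      if (!ok) failed <- c(failed, myPath)     # remember the bad URL and keep looping
    }

    writeLines(failed, "excluded_urls.txt")    # the exclusion list file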

Mutual Exclusion by disabling interrupt

I am reading William Stallings's operating systems book, which gives this pseudocode for disabling interrupts to achieve mutual exclusion:

while (true) {
    /* disable interrupts */;
    /* critical section */;
    /* enable interrupts */;
    /* remainder */;
}

But I don't get why there is a while (true) loop in the code: it means that this part of the code repeats forever, but why?