Snort analyze reply based on request

I’m trying to write a Snort rule which detects whether certain binary files were requested via HTTP, based on a regex matching their names. But it should only send an alert if the file exists (e.g. the server replies HTTP 200 OK).

Is it possible to have this kind of “stateful” scan? What other kind of technique could I use, since the files contain no reliable information I could search for?

The current look of my rule:

alert tcp $EXTERNAL_NET any -> $HOME_NET $HTTP_PORTS (pcre:"/\d{6}-\d\.\d\.pdf$/U"; sid:90000512; classtype:patent-access;)
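
For reference, the usual Snort technique for this kind of request/response pairing is flowbits: a first rule tags the session when the request matches and stays silent, and a second rule alerts only when a 200 response arrives on a tagged session. A rough sketch, not a tested rule (the msg strings, the flowbit name, and both sids are my own placeholders, and http_stat_code assumes the HTTP preprocessor is enabled):

alert tcp $EXTERNAL_NET any -> $HOME_NET $HTTP_PORTS (msg:"patent PDF requested"; flow:to_server,established; pcre:"/\d{6}-\d\.\d\.pdf$/U"; flowbits:set,pdf.requested; flowbits:noalert; sid:90000601; classtype:patent-access;)
alert tcp $HOME_NET $HTTP_PORTS -> $EXTERNAL_NET any (msg:"patent PDF served with 200 OK"; flow:to_client,established; content:"200"; http_stat_code; flowbits:isset,pdf.requested; sid:90000602; classtype:patent-access;)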

What can cause low cost and high runtime in EXPLAIN ANALYZE?

I have a database that pretty consistently runs queries at roughly cost/10 in milliseconds. There are a couple of queries where EXPLAIN ANALYZE reports a cost of 2000 (which I’d expect to land somewhere in the ballpark of 200 ms), but runs take multiple minutes.

My first thought is that some other activity is bogging down Postgres and causing this (either other processes on the machine or concurrent database activity). Is there anything else I should be looking into? Am I mistaken to expect similar cost-to-time ratios across different queries?
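
For reference, a hedged way to see where the wall-clock time actually goes (the table and column names below are placeholders) is to request buffer statistics alongside the timings:

-- "cost" is in arbitrary planner units, not milliseconds; the BUFFERS
-- counts show how much real I/O the run performed.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM some_table WHERE some_column = 42;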

Difficulty in understanding these summations to analyze time complexity

I want to know whether the calculation of the complexity of Shell sort at this link is correct: https://stackabuse.com/shell-sort-in-java/

Here’s the shell sort algorithm:

void shellSort(int array[], int n) {
    // Start with a large gap, then halve it on every pass.
    for (int gap = n / 2; gap > 0; gap /= 2) {
        // Do a gapped insertion sort for this gap size.
        for (int i = gap; i < n; i += 1) {
            int temp = array[i];
            int j;
            // Shift earlier gap-sorted elements up until the
            // correct location for array[i] is found.
            for (j = i; j >= gap && array[j - gap] > temp; j -= gap) {
                array[j] = array[j - gap];
            }
            array[j] = temp;
        }
    }
}

Let me attach the site author’s calculation using summations:

[The author’s calculation appears in the original post as six summation images, labeled 1st through 6th; they are not reproduced here.]

Where did he get the O(n log n) from? And why O(n^2)?
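
Since the images did not survive extraction, here is a standard reconstruction of that analysis (my own sketch, not necessarily the author’s exact steps) for the gap sequence n/2, n/4, ..., 1 used in the code above. There are about \log_2 n passes, and in the worst case the pass with gap g behaves like g insertion sorts of n/g elements each, i.e. O(g \cdot (n/g)^2) = O(n^2/g) work, giving

\sum_{k=1}^{\log_2 n} O\!\left(\frac{n^2}{n/2^k}\right) = \sum_{k=1}^{\log_2 n} O\!\left(n \cdot 2^k\right) = O(n^2).

In the best case (an already sorted input) each pass makes only about n - g comparisons, so the total is

\sum_{k=1}^{\log_2 n} O(n) = O(n \log n).

That is presumably where the O(n log n) lower end and the O(n^2) upper end come from.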

How to analyze a USB device for possibly malicious capabilities?

So I recently ordered a Chinese external USB card, and I would like to find out whether it has some hidden functionality which might turn out to be malicious. It has buttons integrated into it, so lsusb -vv on Linux shows it as having HID capabilities, which already alarmed me, since HID could be used to inject keystrokes.

  • How do I go on continuing my analysis?
  • Can I dump more information about its capabilities using libusb?
  • How do I dump its firmware for reverse-engineering purposes? According to [this], that’s only possible with a JTAG/UART connection?
  • Is there something like Wireshark but for USB?

Bonus points if you also add some libusb example code.
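
To sketch the descriptor-dumping part: the snippet below uses pyusb, the Python bindings over libusb, rather than the raw C API, and the vendor/product IDs are placeholders to be replaced with the values lsusb reports for the card.

import usb.core

# Placeholder IDs: substitute the ones lsusb shows for the device.
dev = usb.core.find(idVendor=0x1234, idProduct=0x5678)
if dev is None:
    raise ValueError("device not attached")

# Walk every configuration, interface and endpoint the device declares.
for cfg in dev:
    print("configuration", cfg.bConfigurationValue)
    for intf in cfg:
        # bInterfaceClass 0x03 is HID, the capability worth watching.
        print("  interface %d, class 0x%02x"
              % (intf.bInterfaceNumber, intf.bInterfaceClass))
        for ep in intf:
            print("    endpoint 0x%02x" % ep.bEndpointAddress)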

Is it possible to analyze and decrypt personal messages?

I’m working on my thesis, doing a pentest of an Android application called Picmix. My research is to test for vulnerabilities in this app when it is used on public WiFi. What I’m trying to do is capture and analyze the traffic, or decrypt personal messages, without the other device knowing.

I use two smartphones with two of my own accounts to run the personal-message tests. Capturing uploaded and fetched images already works fine with driftnet on my Kali box.
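
For context, the capture setup described above usually looks something like the following sketch; the interface name and IP addresses are placeholders, arpspoof comes from the dsniff package on Kali, and this should only be run against your own test devices:

echo 1 > /proc/sys/net/ipv4/ip_forward          # keep forwarding so the phone stays online
arpspoof -i wlan0 -t 192.168.1.10 192.168.1.1   # sit between the test phone and the gateway
driftnet -i wlan0                               # show images found in the passing traffic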

This is for educational purposes only.

PostgreSQL EXPLAIN ANALYZE from an SQL file?

I have written a long query for PostgreSQL, and I want to use EXPLAIN ANALYZE to find out running-time / parallel-worker information about that query. I can run the SQL file using

psql -d database -f myquery.sql 

but I don’t know how to EXPLAIN ANALYZE that SQL file from the command line, and because it is a really long query, it’s hard to paste into the psql console. So, is there any other way to run EXPLAIN ANALYZE on the SQL file?
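
One possible approach (a sketch, assuming myquery.sql contains a single statement) is to prepend the EXPLAIN keywords and pipe the result into psql:

(echo 'EXPLAIN (ANALYZE, VERBOSE)'; cat myquery.sql) | psql -d database

With ANALYZE, the plan output includes the Workers Planned / Workers Launched lines for parallel plans.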

How to Analyze and Document a SQL Server Stored Procedure

I have been handed a large SQL Server batch process containing <100 stored procedures and functions. There is no documentation, and I cannot rely on the comments.

At a high level, what are the best practices for efficiently documenting the logic and data flow?

Can anyone point me to documentation, tools or other guidance?

E.g.: one process is an ETL that imports text files into tables, massages the data, then loads the target tables. This ETL spans 2 databases, about 20 procs/functions, and 20 or so tables. Probably <5000 lines of code. (It’s pretty gnarly.)

I need to become the SME on this bad boy. In the past it has been hunt-and-peck, walking the code chunk by chunk. There must be a better way.
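
As a starting point for the data-flow side, SQL Server’s dependency catalog can list what each procedure references. A sketch (sys.sql_expression_dependencies and the functions used are real system objects; run it in each of the two databases, since the view is per-database):

-- For every stored procedure, list the objects it references,
-- including cross-database references.
SELECT
    OBJECT_SCHEMA_NAME(d.referencing_id) AS proc_schema,
    OBJECT_NAME(d.referencing_id)        AS proc_name,
    d.referenced_database_name,
    d.referenced_schema_name,
    d.referenced_entity_name
FROM sys.sql_expression_dependencies AS d
WHERE OBJECTPROPERTY(d.referencing_id, 'IsProcedure') = 1
ORDER BY proc_schema, proc_name;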

Set up a pipeline to analyze data stored in a web app DB

Background:

  • So there is a (Ruby) web app with a production Postgres DB (hosted in the cloud)
  • I would like to run some machine learning algorithms in a Python setting on the production data and (ultimately) deploy the model in a production setting (in the cloud)
  • I only know how to run these algorithms locally on, say, a NumPy array that fits in memory, and assuming the training data is fixed
  • Let us say the dataset of interest would ultimately be too large to fit in memory, so the data would need to be accessed in batches.

My general question is:

What is a good way to go about setting up the pipeline to run the algorithms on the production data?

To be more specific, here is my current reasoning, which may or may not make sense, along with more specific questions:

  • Considering the algorithms will need to access the data over and over, read speed will be pretty important. We cannot afford to access it over the network, and we cannot keep querying the web app’s production DB anyway. What is the best way to store the data and make it available to the machine learning algorithms to process? Copy everything to another relational DB that the Python code can access locally? (See the batched-read sketch after this list.)

  • Finding the right model is probably easiest if done locally on a sample of the data that fits in memory. Once a good candidate is found, we can retrain it with all the data we have. Should we do the second step locally as well? Or should you generally try to set up a complete production pipeline that lets you work with a larger amount of data at this stage already?

  • Let us say you have new data being written regularly. If you do the initial training by visiting batches of the data you have at time 0, and then stop training, you probably have to retrain from scratch using all of the data you have at some later time t? Is the retraining something that is reasonable to automate in production?
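
A minimal sketch of the batched-read idea from the first bullet, assuming psycopg2 on the Python side and a scikit-learn estimator that supports partial_fit; the DSN, table, and column names are placeholders:

import numpy as np
import psycopg2
from sklearn.linear_model import SGDClassifier

conn = psycopg2.connect("dbname=analytics_copy")  # a local copy, not the production DB
model = SGDClassifier()                           # an estimator with partial_fit

# A named (server-side) cursor streams rows in chunks instead of
# pulling the whole result set into memory at once.
with conn.cursor(name="train_stream") as cur:
    cur.itersize = 10_000
    cur.execute("SELECT feature_a, feature_b, label FROM training_data")
    while True:
        rows = cur.fetchmany(10_000)
        if not rows:
            break
        batch = np.asarray(rows, dtype=float)
        X, y = batch[:, :2], batch[:, 2].astype(int)
        model.partial_fit(X, y, classes=[0, 1])   # classes is required on the first call

Whether an SGD-style model fits the problem is a separate question; the point is only the streaming pattern.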

General hints and sources that help with these kinds of questions are appreciated.