How to convert a Dataset into an indexed dataset / association-of-associations given a column header?

Given a dataset such as the following:

[Image: input dataset]

If "letter" is the header that is chosen, how do I convert it into an indexed dataset / association-of-associations?

That is, how do I define f such that f[dataset_, columnHeader_] produces the following?

[Image: desired output]

Note that GroupBy comes close, but its result cannot be used with Part to extract column data. For example:

data = {
  <|"letter" -> "a", "foo" -> 1, "bar" -> 2|>,
  <|"letter" -> "b", "foo" -> 3, "bar" -> 4|>,
  <|"letter" -> "c", "foo" -> 5, "bar" -> 6|>
};
dataDS = Dataset[data];
dataDSg = GroupBy[dataDS, Key["letter"]];
dataDSg[All, "foo"] (* <- produces an error *)

Whereas data in association-of-associations form works fine:

data2 = <|
  "a" -> <|"foo" -> 1, "bar" -> 2|>,
  "b" -> <|"foo" -> 3, "bar" -> 4|>,
  "c" -> <|"foo" -> 5, "bar" -> 6|>
|>;
data2DS = data2 // Dataset;
data2DS[All, "foo"] (* <- returns a dataset with 1, 3, 5 *)

How can I fit a best-fit curve to specific columns of a dataset and display it together with the corresponding points?

I have the following dataset:

ArstreYSust = Dataset[{
  <|"Velocidad [m/s]" -> 8,
    "Fuerza de arrastre \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.06,
    "Fuerza de sustentación \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.19,
    "Fuerza de arrastre \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.14,
    "Fuerza de sustentación \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.38,
    "Fuerza de arrastre \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.21,
    "Fuerza de sustentación \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.4,
    "Fuerza de arrastre \[Theta]=8\[Degree] CON FLAP [N]" -> 0.24,
    "Fuerza de sustentación \[Theta]=8\[Degree] CON FLAP [N]" -> 0.75|>,
  <|"Velocidad [m/s]" -> 10,
    "Fuerza de arrastre \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.08,
    "Fuerza de sustentación \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.31,
    "Fuerza de arrastre \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.2,
    "Fuerza de sustentación \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.5,
    "Fuerza de arrastre \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.3,
    "Fuerza de sustentación \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.68,
    "Fuerza de arrastre \[Theta]=8\[Degree] CON FLAP [N]" -> 0.35,
    "Fuerza de sustentación \[Theta]=8\[Degree] CON FLAP [N]" -> 1.36|>,
  <|"Velocidad [m/s]" -> 12,
    "Fuerza de arrastre \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.11,
    "Fuerza de sustentación \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.49,
    "Fuerza de arrastre \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.28,
    "Fuerza de sustentación \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.79,
    "Fuerza de arrastre \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.46,
    "Fuerza de sustentación \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.86,
    "Fuerza de arrastre \[Theta]=8\[Degree] CON FLAP [N]" -> 0.44,
    "Fuerza de sustentación \[Theta]=8\[Degree] CON FLAP [N]" -> 1.96|>,
  <|"Velocidad [m/s]" -> 14,
    "Fuerza de arrastre \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.15,
    "Fuerza de sustentación \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.72,
    "Fuerza de arrastre \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.4,
    "Fuerza de sustentación \[Theta]=14\[Degree] SIN FLAP [N]" -> 1.12,
    "Fuerza de arrastre \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.69,
    "Fuerza de sustentación \[Theta]=20\[Degree] SIN FLAP [N]" -> 1.32,
    "Fuerza de arrastre \[Theta]=8\[Degree] CON FLAP [N]" -> 0.6,
    "Fuerza de sustentación \[Theta]=8\[Degree] CON FLAP [N]" -> 3.17|>,
  <|"Velocidad [m/s]" -> 16,
    "Fuerza de arrastre \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.18,
    "Fuerza de sustentación \[Theta]=2\[Degree] SIN FLAP [N]" -> 0.8,
    "Fuerza de arrastre \[Theta]=14\[Degree] SIN FLAP [N]" -> 0.45,
    "Fuerza de sustentación \[Theta]=14\[Degree] SIN FLAP [N]" -> 1.29,
    "Fuerza de arrastre \[Theta]=20\[Degree] SIN FLAP [N]" -> 0.78,
    "Fuerza de sustentación \[Theta]=20\[Degree] SIN FLAP [N]" -> 1.53,
    "Fuerza de arrastre \[Theta]=8\[Degree] CON FLAP [N]" -> 0.71,
    "Fuerza de sustentación \[Theta]=8\[Degree] CON FLAP [N]" -> 3.48|>
}];

I have already built my list plot.

But I don't know how to build a fitted curve directly from my dataset so that I can include it in the Show command. I need to do this for several different columns.

Thanks for reading and for your help.

Dataset of Hard Instances of SUBSET-SUM

I know that for factoring we have the RSA numbers, where quickly factoring one of them would (usually) indicate a breakthrough in the field. Is there something similar for SUBSET-SUM: hard instances that would be a "big deal" if solved? I found this, but those instances don't seem to be unsolved.

One way would be to take the RSA numbers, convert them to 3-SAT, and then convert to SUBSET-SUM, but the weights generated are very large. Maybe there's a way to convert FACTOR (specifically, the special case of two prime factors) to SUBSET-SUM directly?
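To illustrate why the size of the weights matters, here is a minimal Python sketch of the textbook dynamic-programming approach to SUBSET-SUM. The function name `subset_sum` and the sample numbers are made up for illustration. The table of reachable sums can hold up to target + 1 entries, so the method is only practical when the weights are small, and it would be hopeless for weights as large as those produced from an RSA number.

```python
def subset_sum(weights, target):
    """Return a list of weights summing to target, or None if impossible.

    Assumes positive integer weights. Each weight is used at most once.
    """
    # Maps each reachable sum to (previous sum, index of the weight added).
    reachable = {0: None}
    for i, w in enumerate(weights):
        # Iterate over a snapshot so weight i is never added twice.
        for s in list(reachable):
            t = s + w
            if t <= target and t not in reachable:
                reachable[t] = (s, i)
    if target not in reachable:
        return None
    # Walk back from the target to recover one solution.
    chosen = []
    s = target
    while reachable[s] is not None:
        prev, i = reachable[s]
        chosen.append(weights[i])
        s = prev
    return chosen[::-1]


example = [3, 34, 4, 12, 5, 2]
print(subset_sum(example, 9))
print(subset_sum(example, 30))
```

The memory and time used grow with the number of distinct reachable sums, which is why this is called a pseudo-polynomial algorithm.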

What factors of the integer dataset being sorted can I change, in order to compare two sorting algorithms?

I am comparing two comparison-based sorting algorithms that are built on binary tree data structures: Tree Sort and Heap Sort. I am measuring the time each algorithm takes to sort integer datasets of increasing size. However, I am wondering whether there are other properties of the integer dataset itself that I could vary, for example its standard deviation, that would benefit my comparison.
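As a rough illustration of the kind of experiment described above, the following Python sketch times a simple heap sort and an unbalanced tree sort on integer datasets that differ in initial order, number of duplicates, and spread. All names (`heap_sort`, `tree_sort`, `make_dataset`, `KINDS`) and the chosen dataset kinds are assumptions for illustration, not a standard benchmark. The tree sort here uses a plain unbalanced binary search tree, so sorted or nearly sorted input should make it degenerate, which is probably the most visible factor.

```python
import heapq
import random
import time


def heap_sort(values):
    """Heap sort using the standard-library binary heap."""
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def tree_sort(values):
    """Tree sort with an unbalanced BST; nodes are [key, left, right].

    Insertion and traversal are iterative so that degenerate inputs
    do not hit Python's recursion limit.
    """
    root = None
    for v in values:
        node = [v, None, None]
        if root is None:
            root = node
            continue
        cur = root
        while True:
            side = 1 if v < cur[0] else 2  # duplicates go to the right
            if cur[side] is None:
                cur[side] = node
                break
            cur = cur[side]
    # Iterative in-order traversal yields the keys in sorted order.
    out, stack, cur = [], [], root
    while stack or cur is not None:
        while cur is not None:
            stack.append(cur)
            cur = cur[1]
        cur = stack.pop()
        out.append(cur[0])
        cur = cur[2]
    return out


KINDS = ["uniform", "narrow_gaussian", "few_distinct",
         "sorted", "reversed", "nearly_sorted"]


def make_dataset(n, kind, seed=0):
    """Build n integers whose order, duplicates, or spread depend on kind."""
    rng = random.Random(seed)
    if kind == "uniform":
        return [rng.randint(0, 10**6) for _ in range(n)]
    if kind == "narrow_gaussian":  # small standard deviation, many repeats
        return [round(rng.gauss(0, 10)) for _ in range(n)]
    if kind == "few_distinct":
        return [rng.randint(0, 9) for _ in range(n)]
    if kind == "sorted":
        return list(range(n))
    if kind == "reversed":
        return list(range(n, 0, -1))
    if kind == "nearly_sorted":  # sorted, then about 1% random swaps
        data = list(range(n))
        for _ in range(max(1, n // 100)):
            i, j = rng.randrange(n), rng.randrange(n)
            data[i], data[j] = data[j], data[i]
        return data
    raise ValueError(f"unknown kind: {kind}")


def time_sort(sort_fn, data):
    """Return the wall-clock seconds taken by one call of sort_fn."""
    start = time.perf_counter()
    sort_fn(data)
    return time.perf_counter() - start


if __name__ == "__main__":
    for kind in KINDS:
        data = make_dataset(2000, kind)
        print(f"{kind:16s} heap={time_sort(heap_sort, data):.4f}s "
              f"tree={time_sort(tree_sort, data):.4f}s")
```

For a comparison sort, the standard deviation of the values mostly matters indirectly, through how many duplicates it produces; the initial order of the data is likely to have a larger effect.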

12.1.1 “Part” functionality having issues with Dataset

I made some large datasets of SEM micrograph images and metadata in 12.0, and they had been working without problems for several weeks. But now, when I run the notebook in 12.1.1, it seems to fail to define the Part of the dataset I need. I just reinstalled 12.0 and the problem still occurs, so maybe I'm just an idiot.

[Image: dataset]

[Image: output]

Where can I download a benign PE dataset? Or at least, which website is the best candidate for crawling and downloading normal executables?

I'm planning to gather a benign dataset for my ML malware detection model.

The problem I'm having is finding benign PE files. I just need a source that has a dataset of normal executables; I will scan them with VT and extract the benign ones, but I can't find anything useful.

If there is nothing out there, then what is the best website to use with a PE downloader crawler (meaning it is easy to crawl and to download .exe files from automatically, without running into problems)?

Another problem with using a download website is installers: most of the files are installers, so I would need to install the program first. Is there a good solution to this? Is there an AutoIt script that can somehow run all types of installers?

(I tried looking at surveys on using ML in malware detection, such as [1], but it seems that none of the papers have released a useful benign dataset, other than simple Windows files that anyone can gather. Those sets contain fewer than 10k files, some as few as 1000. I need to gather a large benign dataset of more than 50,000 files, because my malware dataset is really large.)

[1] https://www.sciencedirect.com/science/article/pii/S0167404818303808

Dataset Collection

I have to collect aerial images (taken from drones flying at a height of roughly 30 m to 40 m) of people gathering in groups of 3-4. I have manually checked some images on the internet and looked for videos on YouTube, but these only show dense crowds. What I need are aerial images or YouTube videos of sparse crowds, or any dataset of that kind. Can someone point me to appropriate links? I have also found a paper, but it is not clear to me how its dataset was formed.

Machine Learning: Missing Data in Dataset and Imputer

I am a newbie in ML, and I am learning how to fill in missing data in a dataset using Imputer. These are the few lines of code that I came across:

# Note: Imputer was removed in scikit-learn 0.22; sklearn.impute.SimpleImputer replaces it.
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])

I don't understand the roles of the fit and transform functions. It would be great if someone could help. Thank you.
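To make the two steps concrete, here is a minimal pure-Python sketch of the idea behind the fit/transform split. The class `MeanImputer` and the sample rows are hypothetical stand-ins for illustration, not scikit-learn's implementation: `fit` only learns something from the data (here, the mean of each column, ignoring missing values), and `transform` applies what was learned (here, replacing each missing value with the stored column mean).

```python
import math


class MeanImputer:
    """Hypothetical stand-in that mimics the fit/transform split."""

    def fit(self, rows):
        # Learn one mean per column, skipping NaN entries.
        n_cols = len(rows[0])
        self.means_ = []
        for c in range(n_cols):
            present = [r[c] for r in rows if not math.isnan(r[c])]
            self.means_.append(sum(present) / len(present))
        return self

    def transform(self, rows):
        # Replace each NaN with the mean learned for its column.
        return [
            [self.means_[c] if math.isnan(v) else v for c, v in enumerate(r)]
            for r in rows
        ]


nan = float("nan")
train = [[1.0, 10.0], [nan, 20.0], [3.0, nan]]

imputer = MeanImputer().fit(train)
print(imputer.means_)               # [2.0, 15.0]
print(imputer.transform(train))     # [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
print(imputer.transform([[nan, 5.0]]))  # [[2.0, 5.0]]
```

Keeping the steps separate means the means learned from one set of rows can be reused on other rows, as the last line shows.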