Machine learning, kNN and NB algorithms: is there a way to make the code more compact and nicer looking?

So this is the task I am working on:

“In this assignment you will implement the K-Nearest Neighbour and Naïve Bayes algorithms and evaluate them on a real dataset using the stratified cross validation method. You will also evaluate the performance of other classifiers on the same dataset using Weka. Finally, you will investigate the effect of feature selection, in particular the Correlation-based Feature Selection method (CFS) from Weka.”

I have done the Weka part and have also written separate code for the performance evaluation. Here, I need help with the main code, where I implement the kNN and NB algorithms.

Here is my code. I want to reduce the number of lines and, if possible, make it look neater. I have tried splitting it into more functions and calling them to do the work, but is there more I can do? Also, can the for loops be written more compactly?

import heapq
import sys
import math as m
from decimal import Decimal


class Entry:
    def __init__(self, attributes):
        self.attributes = attributes
        self.diabetes = "yes" if "yes" in attributes else "no" if "no" in attributes else ""

    def __str__(self):
        string = ""
        for attributeNum, attribute in enumerate(self.attributes):
            if attributeNum == len(self.attributes) - 1:
                string += str(attribute)
            else:
                string += str(attribute) + ','
        return string

    def set_ifdiabetes(self, diabetes):
        self.diabetes = diabetes

    def euclidean(self, other_entry):
        sum = 0.0
        for i in range(len(self.attributes)):
            sum += m.pow(float(self.attributes[i]) - float(other_entry.attributes[i]), 2)
        return m.sqrt(sum)


class NB:
    def __init__(self, training_data, testing_data):
        self.training_data = training_data
        self.testing_data = testing_data

        self.training_entries = []
        self.testing_entries = []

        self.diabetes_yes = []
        self.diabetes_no = []

        self.mu_diabetesyes = []
        self.mu_diabetesno = []

        self.sigma_diabetesyes = []
        self.sigma_diabetesno = []

        self.num_attributes = 0

        self.p_diabetesyes = 0
        self.p_diabetesno = 0

        self.num_diabetesyes = 0
        self.num_diabetesno = 0

    def train(self):
        self.traincleandata()
        self.get_mus()
        self.get_sigmas()

    def traincleandata(self):
        for line in self.training_data:
            if self.num_attributes == 0:
                self.num_attributes = len(line.split(','))
                for i in range(self.num_attributes):
                    self.diabetes_yes.append([])
                    self.diabetes_no.append([])
                    self.mu_diabetesyes.append(0.0)
                    self.mu_diabetesno.append(0.0)
                    self.sigma_diabetesyes.append(0.0)
                    self.sigma_diabetesno.append(0.0)
            params = line.split(',')
            cleanparams = getcleanparams(params)
            entry = Entry(cleanparams)
            self.training_entries.append(entry)

            if (entry.diabetes == "yes"):
                for i in range(len(entry.attributes) - 1):
                    self.diabetes_yes[i].append(Decimal(entry.attributes[i]))
                self.p_diabetesyes += 1
                self.num_diabetesyes += 1
            else:
                for i in range(len(entry.attributes) - 1):
                    self.diabetes_no[i].append(Decimal(entry.attributes[i]))
                self.p_diabetesno += 1
                self.num_diabetesno += 1

        self.p_diabetesyes = Decimal(self.p_diabetesyes) / len(self.training_entries)
        self.p_diabetesno = Decimal(self.p_diabetesno) / len(self.training_entries)

    def get_mus(self):
        for i in range(self.num_attributes - 1):
            self.mu_diabetesyes[i] = sum(self.diabetes_yes[i]) / len(self.diabetes_yes[i])
            self.mu_diabetesno[i] = sum(self.diabetes_no[i]) / len(self.diabetes_no[i])

    def get_sigmas(self):

        sigSumYes = [0] * self.num_attributes
        sigSumNo = [0] * self.num_attributes

        for i in range(self.num_attributes - 1):
            for j in range(self.num_diabetesyes):
                sigSumYes[i] += m.pow(self.diabetes_yes[i][j] - self.mu_diabetesyes[i], 2)
            self.sigma_diabetesyes[i] = m.sqrt(sigSumYes[i] / (len(self.diabetes_yes[i]) - 1))
            for j in range(self.num_diabetesno):
                sigSumNo[i] += m.pow(self.diabetes_no[i][j] - self.mu_diabetesno[i], 2)
            self.sigma_diabetesno[i] = m.sqrt(sigSumNo[i] / (len(self.diabetes_no[i]) - 1))

    def test(self):
        self.testcleandata()
        self.testalgo()

    def testcleandata(self):
        for line in self.testing_data:
            params = line.split(',')
            cleanparams = getcleanparams(params)
            entry = Entry(cleanparams)
            self.testing_entries.append(entry)

    def testalgo(self):
        counter = 1
        P_diabetesyes = [0] * self.num_attributes
        P_diabetesno = [0] * self.num_attributes
        for entry in self.testing_entries:
            pYesEntry = 1
            pNoEntry = 1
            for i in range(self.num_attributes - 1):
                P_diabetesyes[i] = Decimal((1 / (self.sigma_diabetesyes[i] * m.sqrt(2 * m.pi))) * m.pow(m.e, (-m.pow(Decimal(entry.attributes[i]) - self.mu_diabetesyes[i], 2) / (2 * m.pow(self.sigma_diabetesyes[i], 2)))))
                P_diabetesno[i] = Decimal((1 / (self.sigma_diabetesno[i] * m.sqrt(2 * m.pi))) * m.pow(m.e, (-m.pow(Decimal(entry.attributes[i]) - self.mu_diabetesno[i], 2) / (2 * m.pow(self.sigma_diabetesno[i], 2)))))
                pYesEntry *= float(P_diabetesyes[i])
                pNoEntry *= float(P_diabetesno[i])

            pYesEntry *= float(self.p_diabetesyes)
            pNoEntry *= float(self.p_diabetesno)

            entry.set_ifdiabetes("yes") if (pYesEntry/pNoEntry >= 1) else entry.set_ifdiabetes("no")
            counter += 1


class kNN:
    def __init__(self, training_data, testing_data, k):
        self.k = k
        self.training_data = training_data
        self.testing_data = testing_data
        self.training_entries = []
        self.testing_entries = []

    def __str__(self):
        string_to_return = ''
        for entry in self.training_entries:
            string_to_return = string_to_return + str(entry) + '\n'
        return string_to_return

    def train(self):
        for line in self.training_data:
            params = line.split(',')
            self.training_entries.append(Entry(getcleanparams(params)))

    def test(self):
        counter = 1
        for line in self.testing_data:
            params = line.split(',')
            self.testing_entries.append(Entry(getcleanparams(params)))
        for testEntry in self.testing_entries:
            nearest = []
            for trainEntry in self.training_entries:
                current_entry = (testEntry.euclidean(trainEntry), str(trainEntry.diabetes), str(trainEntry))
                nearest.append(current_entry)
            heapq.heapify(nearest)
            nearest.sort()
            nearest = nearest[:int(self.k)]
            num_diabetes = 0
            for entry in nearest:
                if (entry[1] == "yes"):
                    num_diabetes += 1
            testEntry.set_ifdiabetes("yes") if (num_diabetes >= int(self.k)/2) else testEntry.set_ifdiabetes("no")
            counter += 1

    def compare(self):
        for entry in self.testing_entries:
            print("-----------")
            print(entry.compare(self.training_entries[0]))
            print(entry)
            print(self.training_entries[0])


def getcleanparams(params):
    params_nospaceortab = []
    for param in params:
        params_nospaceortab.append(param.strip())
    return params_nospaceortab


if __name__ == '__main__':
    training_file = open(sys.argv[1])
    testing_file = open(sys.argv[2])
    classifier_input = sys.argv[3]

    training_lines = training_file.readlines()
    training_cleanlines = getcleanparams(training_lines)

    testing_lines = testing_file.readlines()
    testing_cleanlines = getcleanparams(testing_lines)

    if classifier_input == "NB":
        classifier = NB(training_cleanlines, testing_cleanlines)
        classifier.train()
        classifier.test()
        for entry in classifier.testing_entries:
            print(entry.diabetes)
    elif "NN" in classifier_input:
        classifier = kNN(training_cleanlines, testing_cleanlines, (classifier_input.index('NN') - 1))
        classifier.train()
        classifier.test()
        for entry in classifier.testing_entries:
            print(entry.diabetes)
    else:
        print("Error: unknown classifier type")
        sys.exit()
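One possible direction for compacting the code above, offered as a rough sketch rather than a drop-in replacement: parse each line once into numeric features plus a label, then let the standard library do the looping. The function names (parse, knn_predict, nb_train, nb_predict, gaussian_pdf) are hypothetical and not part of the original code. The sketch assumes Python 3.8+ (for math.dist and math.prod), a trailing "yes"/"no" label on training rows, and at least two training rows per class with non-constant feature values.

```python
import heapq
import math
from collections import Counter
from statistics import mean, stdev


def parse(lines):
    """Turn comma-separated lines into (features, label) pairs.

    Assumption: the label, when present, is a trailing "yes"/"no" field;
    unlabeled (test) rows get label None.
    """
    rows = []
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        label = fields.pop() if fields[-1] in ("yes", "no") else None
        rows.append(([float(f) for f in fields], label))
    return rows


def knn_predict(train, features, k):
    """Majority vote among the k training rows closest to `features`."""
    nearest = heapq.nsmallest(k, train, key=lambda row: math.dist(row[0], features))
    votes = Counter(label for _, label in nearest)
    # Ties go to "yes", mirroring the >= comparison in the original code.
    return "yes" if votes["yes"] >= votes["no"] else "no"


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def nb_train(train):
    """Return {label: (prior, [(mu, sigma) for each feature])}."""
    model = {}
    for label in ("yes", "no"):
        rows = [features for features, lab in train if lab == label]
        # zip(*rows) transposes rows into per-feature columns;
        # stdev is the sample standard deviation (n - 1 denominator).
        stats = [(mean(col), stdev(col)) for col in zip(*rows)]
        model[label] = (len(rows) / len(train), stats)
    return model


def nb_predict(model, features):
    """Pick the class with the larger prior-times-likelihood score."""
    def score(label):
        prior, stats = model[label]
        return prior * math.prod(
            gaussian_pdf(x, mu, sigma) for x, (mu, sigma) in zip(features, stats)
        )
    # Comparing scores directly avoids dividing by a zero "no" score.
    return "yes" if score("yes") >= score("no") else "no"
```

With something like this, the kNN branch reduces to calling knn_predict once per parsed test row, and the NB branch to one nb_train call followed by nb_predict per row. Whether dropping Decimal in favour of plain floats is acceptable depends on the precision the assignment requires.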

Do I need a transit visa with my Turkish passport when traveling from Istanbul to Munich (Germany) and then to Nice (France) on my way to Canada?

I keep searching on this website and calling the French embassy, but I still don’t know whether a transit visa is enough for this kind of layover, or whether I would need a full visa for a 1 h 30 min layover in Nice coming from Munich.

Does a (nice) centerless group always have a centerless profinite completion?

This is an extension of a question I asked here on Math.SE


Assume that I have a finitely generated, residually finite, centerless group $G$. Is it true that the profinite completion $\hat{G}$ also has trivial center?

In the linked question, user YCor was able to show that this fails in general if you drop either finite generation or residual finiteness. However, the result happens to be true if $G$ is a surface group. I’d like to know whether this is a phenomenon specific to surface groups or a more general fact.

Nice Form of Vector Field

Let $G$ be a reductive algebraic group (maybe reductive is not necessary) over an algebraically closed field $k$ of characteristic zero. Let $X$ be a homogeneous affine $G$-variety, i.e. $X=G/K$ for some reductive subgroup $K$ of $G$.

In this case we obtain an action by vector fields of the Lie algebra $\mathfrak{g}$ of $G$ on functions on $X$, the algebra $k[X]$. Choose an element $u\in\mathfrak{g}$.

My question is: are there statements about nice local forms of $u$? E.g., I would like to know about statements of the form: if $u$ vanishes at a point $a$ on $X$, there exists a neighborhood $U$ of $a$ with coordinates $x_1,\dots,x_n$ (coming from an etale map $U\to\mathbb{A}^n$) such that
$$u=\sum\limits_i x_{j_i}\partial_{x_{k_i}}$$
Any references or comments are greatly appreciated!

I’m looking for a “nice” looking GRUB loader

I’ve recently installed both Elementary OS and Ubuntu onto a laptop I had. When the GRUB screen loads up to choose which one to boot, it’s very plain: just the command-line look with Elementary, Ubuntu, and a few other options written out in text. I’m sure most of you know the screen I’m talking about. My question is, is there any way to change it and make it “nicer”? I’ve looked and I keep getting suggestions for Burg and a couple of others, but they all seem to be abandoned. Does anyone have any suggestions? If you have a screenshot, that would be greatly appreciated too. Thank you in advance.

My printer and computer quit playing nice

We have an Epson WP-4530, which worked well enough with this computer for a couple of years. We had a power outage and all hell broke loose. I was able to re-establish communication with the printer, but all it would print was a couple of characters over and over again.
I got the latest 64-bit .deb driver, installed it, and rebooted, and now it prints much more complicated gibberish instead of the test pages and whatnot that I am trying to print. Can anyone help me with this? Thanks. Ubuntu 18.04.1 LTS, printer driver epson-201113w

I will give you nice, relevant keyword research with instant delivery for $10

Hello everyone, I am RUMMAN121, and I can do any kind of YouTube work. Please order soon; I complete every job quickly and honestly. YouTube, Twitter, Instagram, Facebook: social media marketing.

1000+ YouTube likes: only $30
100+ custom comments: only $4
1000+ YouTube subscribers: only $35
1000+ Twitter likes: only $12
1000+ Twitter followers: only $12
1000+ Facebook likes: only $22
1000+ Facebook followers: only $12
1000+ Instagram likes: only $12
1000+ Instagram followers: only $12

If you are interested, please contact me soon. Thanks.

Quality of my service:
*** 100% money back guaranteed.
*** You can order from me anytime for any video likes.
*** My service never violates the rules for video likes.
*** Order now without any hesitation.

In simple SEO terms, a local citation is simply a place where your company is mentioned on other websites and elsewhere on the Internet. Local citations are used heavily in helping you to rank in local search results. An example of a citation could be a business directory such as Yell, Thomson Local or Brown Book, where your company is mentioned explicitly by name. Local citations do not need to include a link to your site. A citation could also be where your company is mentioned, cited, referenced or spoken about on other local websites.

A local citation is any online mention of the name, address, and phone number of a local business. Citations can occur on local business directories, on websites and apps, and on social platforms. Citations help Internet users to discover local businesses and can also affect local search engine rankings. Local businesses can actively manage many citations to ensure data accuracy.

A local citation is any mention of your business on the web; it is any combination of your company name, phone number, address, zip or postal code, and website address. Citations in SEO are a key factor in improving your local search results. Local citations come in various forms, for example!

by: justworkedSMM
Created: —
Category: Onsite SEO & Research
Viewed: 131