Pricing data checker script

I was recently given a task to implement a data-checker script which, given an input file containing (date, last_price) values, should check for three different kinds of errors (missing values, stale values and outliers) and return a list of errors. I wrote the code below, and the feedback I got was that it was not “pythonic” enough. Can someone please let me know how I can make this more pythonic? Which parts look good, and which don’t? This will be extremely helpful as I write more Python code in the future.

#!/usr/bin/python3.6

import sys
import csv
import pprint
import statistics
from datetime import date


class DataChecker:
    class DatePrice:
        def __init__(self, input_date, price):
            try:
                day, month, year = input_date.split("/")
                self._price_date = date(int(year), int(month), int(day))
            except ValueError:
                # Don't tolerate invalid date
                raise

            try:
                self._price = float(price)
            except (TypeError, ValueError):
                self._price = 0

        @property
        def date(self):
            return self._price_date.strftime("%d/%m/%Y")

        @property
        def date_obj(self):
            return self._price_date

        @property
        def price(self):
            return self._price

        def __repr__(self):
            return f"{self.date}, {self.price}"

    def __init__(self, input_date_price_values):
        self._date_price_values = []
        for date, price in input_date_price_values:
            try:
                self._date_price_values.append(self.DatePrice(date, price))
            except ValueError:
                pass

        self._date_price_values.sort(key=lambda x: x.date_obj)

        self._stale_price_dict = {}

        self._outlier_low, self._outlier_high = self._calculate_outlier_thresholds_using_iqr(
            self._date_price_values
        )

    def check_for_errors(self):
        """
        returns -> List[tuple(date, float, error)]

        errors = 'missing value', 'stale value' or 'outlier'

        Uses 3 different error checkers to check for errors in data
        1. Checks for missing values in data -> categorises missing values as any value == 0, or empty string or nulls
        2. Checks for stale values in data -> categorises stale values as any value that remains unchanged
             for 5 business days. For stale values it returns the last date on which it was repeated
        3. Checks for outlier values in data -> Uses Interquartile range (IQR) and a low threshold of
             first-quartile-value - 1.2 x IQR and high-threshold of third-quartile-value + 1.2 x IQR.
             Any values outside this range are deemed as outliers
        """
        errors = []
        for datePrice in self._date_price_values:
            if self._is_value_missing(datePrice.price):
                self._add_to_errors(datePrice, "missing value", errors)
            elif self._is_value_stale(datePrice.price):
                self._add_to_errors(datePrice, "stale value", errors)
            elif self._is_value_outlier(datePrice.price):
                self._add_to_errors(datePrice, "outlier", errors)
            else:
                continue

        return errors

    def _add_to_errors(self, datePrice, error_string, errors):
        error_tuple = (datePrice.date, datePrice.price, error_string)
        errors.append(error_tuple)

    def _is_value_missing(self, price):
        if price is None or price == 0:
            return True
        return False

    def _is_value_stale(self, price):
        if price in self._stale_price_dict:
            self._stale_price_dict[price] += 1
            if self._stale_price_dict[price] >= 5:  # 5 business days in week
                return True
        else:
            self._stale_price_dict.clear()
            self._stale_price_dict[price] = 1
        return False

    def _is_value_outlier(self, price):
        if price < self._outlier_low or price > self._outlier_high:
            return True
        return False

    def _calculate_outlier_thresholds_using_iqr(self, data_price_values):
        price_values = sorted([dataPrice.price for dataPrice in data_price_values])

        median_index = len(price_values) // 2
        first_quartile = statistics.median(price_values[:median_index])
        third_quartile = statistics.median(price_values[median_index + 1 :])

        iqr = third_quartile - first_quartile
        low_iqr = first_quartile - 1.2 * iqr
        high_iqr = third_quartile + 1.2 * iqr

        return low_iqr, high_iqr

    def _calculate_outlier_thresholds_using_mean_deviation(self, data_price_values):
        price_values = sorted([dataPrice.price for dataPrice in data_price_values])

        mean_value = statistics.mean(price_values)
        std_dev = statistics.stdev(price_values)

        low_iqr = mean_value - 2 * std_dev
        high_iqr = mean_value + 2 * std_dev

        return low_iqr, high_iqr


def check_file_data(file_path):
    with open(file_path) as data_file:
        raw_data = csv.DictReader(data_file)
        input_data = []
        for row in raw_data:
            input_data.append((row["Date"], row["Last Price"]))

        data_checker = DataChecker(input_data)
        errors = data_checker.check_for_errors()

        pp = pprint.PrettyPrinter(indent=4)
        pp.pprint(errors)
        print(f"Total Errors Found: {len(errors)}")

        return errors


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Please provide filepath")
        sys.exit()

    file_path = sys.argv[1]
    check_file_data(file_path)

To test, put the above code in a file called “data_checker.py” and the test code below in a file called “test_data_checker.py” in the same directory, then run:

python3.6 -m pytest -v test_data_checker.py

import pytest

from data_checker import DataChecker

test_data = [
    (
        pytest.param(
            [("01/02/2010", "10"), ("02/02/2010", "10.09"), ("03/02/2010", "10.12")],
            [],
            id="no-errors-in-data",
        )
    ),
    (
        pytest.param(
            [("01/02/2010", "0.0"), ("02/02/2010", ""), ("03/02/2010", "10.12")],
            [("01/02/2010", 0, "missing value"), ("02/02/2010", 0, "missing value")],
            id="2-zero-values",
        )
    ),
    (
        pytest.param(
            [
                ("01/02/2010", "2"),
                ("02/02/2010", "1.12"),
                ("03/02/2010", "1.12"),
                ("04/02/2010", "1.12"),
                ("05/02/2010", "1.12"),
                ("06/02/2010", "1.11"),
            ],
            [],
            id="4-repeated-values-no-stale",
        )
    ),
    (
        pytest.param(
            [
                ("01/02/2010", "1.10"),
                ("02/02/2010", "1.12"),
                ("03/02/2010", "1.12"),
                ("04/02/2010", "1.12"),
                ("05/02/2010", "1.12"),
                ("06/02/2010", "1.12"),
                ("07/02/2010", "1.11"),
            ],
            [("06/02/2010", 1.12, "stale value")],
            id="1-stale-value",
        )
    ),
    (
        pytest.param(
            [
                ("01/02/2010", "0"),
                ("02/02/2010", "1.12"),
                ("03/02/2010", "1.12"),
                ("04/02/2010", "1.12"),
                ("05/02/2010", "1.12"),
                ("06/02/2010", "1.12"),
                ("07/02/2010", "1.11"),
            ],
            [("01/02/2010", 0, "missing value"), ("06/02/2010", 1.12, "stale value")],
            id="1-missing-1-stale-value",
        )
    ),
    (
        pytest.param(
            [
                ("01/02/2010", "1.11"),
                ("02/02/2010", "5"),
                ("03/02/2010", "1.12"),
                ("04/02/2010", "1.11"),
                ("05/02/2010", "1.12"),
                ("06/02/2010", "1.12"),
                ("07/02/2010", "1.11"),
            ],
            [("02/02/2010", 5, "outlier")],
            id="1-outlier-value",
        )
    ),
    (
        pytest.param(
            [
                ("01/02/2010", "0"),
                ("02/02/2010", "5"),
                ("03/02/2010", "1.12"),
                ("04/02/2010", "1.11"),
                ("05/02/2010", "1.12"),
                ("06/02/2010", "1.12"),
                ("07/02/2010", "1.12"),
                ("08/02/2010", "1.12"),
                ("09/02/2010", "1.12"),
            ],
            [
                ("01/02/2010", 0, "missing value"),
                ("02/02/2010", 5, "outlier"),
                ("09/02/2010", 1.12, "stale value"),
            ],
            id="missing-stale-outlier-value",
        )
    ),
]


@pytest.mark.parametrize("input_data, expected_value", test_data)
def test_check_for_error(input_data, expected_value):
    data_checker = DataChecker(input_data)
    errors = data_checker.check_for_errors()
    assert errors == expected_value
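For reference, the stale-value and outlier rules described in the check_for_errors docstring can be sketched as two standalone functions. This is only an illustration, not the script's own implementation: the names stale_rows and iqr_thresholds, the run_length and k parameters, and the use of itertools.groupby are all assumptions made for this sketch. Unlike the quartile slice in the script, it leaves the middle element out of the upper half only when the number of prices is odd.

```python
import statistics
from itertools import groupby


def stale_rows(rows, run_length=5):
    """Return rows whose price has repeated run_length times or more in a row.

    rows is a list of (date, price) tuples already sorted by date
    (a hypothetical input shape chosen for this sketch).
    """
    flagged = []
    for _price, run in groupby(rows, key=lambda row: row[1]):
        # Everything from the run_length-th consecutive repeat onwards is stale.
        flagged.extend(list(run)[run_length - 1:])
    return flagged


def iqr_thresholds(prices, k=1.2):
    """Return (low, high) bounds: Q1 - k * IQR and Q3 + k * IQR."""
    ordered = sorted(prices)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Skip the median itself only when the number of values is odd.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr
```

Both functions expect numeric prices, and iqr_thresholds needs at least two values because statistics.median raises StatisticsError on an empty list.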

Explicit Song Checker

To stay in practice with my Python, I’ve decided to write an explicit song checker. It checks each word in the song against an array of explicit words contained in the file. I’ve decided not to include the words that are checked against, as I am not sure about the rules on having explicit words in a program’s code. Feedback on efficiency and structure is mostly what I’m going for. Style and other critiques are invited as well.

explicit_song_checker.py

explicit_words = [
    #explicit words not shown
]

def isClean(song_path):
    with open(song_path) as song:
        for line in song:
            words = line.split(" ")
            for word in words:
                for explicit_word in explicit_words:
                    if word == explicit_word:
                        return False

        return True

def main():
    song = raw_input("Enter path to song: \n")
    if isClean(song):
        print("CLEAN")
    else:
        print("EXPLICIT")

if __name__ == '__main__':
    main()
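One common way to speed up the word lookup described above is to keep the explicit words in a set and normalise each word before testing membership. The sketch below assumes Python 3; the name is_clean, its parameters, and the placeholder words are hypothetical, and the punctuation stripping and lowercasing are additions that the original script does not perform.

```python
import string


def is_clean(lines, explicit_words):
    """Return True if no word in lines matches an entry in explicit_words.

    lines can be any iterable of strings, such as an open file object.
    """
    banned = {word.lower() for word in explicit_words}  # average O(1) lookups
    for line in lines:
        for word in line.split():  # split() with no argument also drops newlines
            if word.strip(string.punctuation).lower() in banned:
                return False
    return True


if __name__ == "__main__":
    placeholder_words = ["darn", "heck"]  # hypothetical stand-ins for the real list
    lyrics = ["La la la", "Oh heck, not again"]
    print("CLEAN" if is_clean(lyrics, placeholder_words) else "EXPLICIT")
```

Taking an iterable of lines rather than a path keeps the check separate from file handling, so the same function works on an open file or on a list of strings in a test.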

Strong password checker in Python

This is a Leetcode problem –

A password is considered strong if the conditions below are all met:

1. It has at least 6 characters and at most 20 characters.

2. It must contain at least one lowercase letter, at least one uppercase letter, and at least one digit.

3. It must NOT contain three repeating characters in a row (...aaa... is weak, but ...aa...a... is strong, assuming other conditions are met).

Write a function strong_password_checker(s) that takes a string s as input and returns the MINIMUM number of changes required to make s a strong password. If s is already strong, return 0.

Insertion, deletion or replacement of any one character each count as one change.
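The three conditions can also be checked directly, which is handy for verifying any candidate password that a solution produces. This is only a sketch of the strength test, not of the minimum-change computation; the name is_strong is hypothetical.

```python
import re


def is_strong(password: str) -> bool:
    """Return True if password meets all three conditions listed above."""
    return (
        6 <= len(password) <= 20
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        # (.)\1\1 matches any character followed by two copies of itself.
        and re.search(r"(.)\1\1", password, flags=re.DOTALL) is None
    )
```

For instance, is_strong("aAsaxqwd2aa") is True, while is_strong("aaaaa") is False because the string is too short, lacks an uppercase letter and a digit, and contains a run of three.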

Here is my solution to this challenge –

def strong_password_checker(s: str) -> int:
    def has_lower(s):
        for c in s:
            if c.islower(): return 1
        return 0

    def has_upper(s):
        for c in s:
            if c.isupper(): return 1
        return 0

    def has_digits(s):
        for c in s:
            if c.isnumeric(): return 1
        return 0

    def find_repeats(s):
        i = 0
        j = 0
        repeats = []
        while i < len(s) - 1:
            if s[i+1] == s[i]:
                i += 1
                continue
            if (i - j + 1) > 2: repeats.append(i - j + 1)
            i += 1
            j = i
        if (i - j + 1) > 2: repeats.append(i - j + 1)
        return repeats

    def repeats_after_delete(reps, d):
        if d >= sum([r - 2 for r in reps]):
            return []
        reps = sorted(reps, key=lambda d: d%3)
        while d > 0:
            for i in range(len(reps)):
                if reps[i] < 3:
                    continue
                r = reps[i] % 3 + 1
                reps[i] -= min(r, d)
                d -= r
                if d <= 0:
                    break
        return [r for r in reps if r > 2]

    def num_repeats_change(repeats):
        return sum([r // 3 for r in repeats])

    total_changes = 0
    format_changes = (1 - has_lower(s)) + (1 - has_upper(s)) + (1 - has_digits(s))
    repeats = find_repeats(s)
    if len(s) < 6:
        repeat_change = num_repeats_change(repeats)
        total_changes = max([6 - len(s), format_changes, repeat_change])
    elif len(s) > 20:
        repeats = repeats_after_delete(repeats, len(s) - 20)
        repeat_change = num_repeats_change(repeats)
        total_changes = len(s) - 20 + max([repeat_change, format_changes])
    else:
        repeat_change = num_repeats_change(repeats)
        total_changes = max([repeat_change, format_changes])
    return total_changes

Here are some example outputs –

#print(strongPasswordChecker("aaaaaaaaaaaaAsaxqwd1aaa"))
>>> 6
#Explanation - aa1aa1aa1aAsaxqwd1aa (just an example - delete three characters, add three digits to replace consecutive a's)

#print(strongPasswordChecker("aaaaa"))
>>> 2
#Explanation - aaAaa1 (just an example - add a character (so the length is at least 6), add an uppercase letter to replace consecutive a's)

#print(strongPasswordChecker("aAsaxqwd2aa"))
>>> 0
#Explanation - A strong password (all conditions are met)

Here are the times taken for each output –

%timeit strongPasswordChecker("aaaaaaaaaaaaAsaxqwd1aaa")
>>> 18.7 µs ± 1.75 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit strongPasswordChecker("aaaaa")
>>> 5.05 µs ± 594 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit strongPasswordChecker("aAsaxqwd2aa")
>>> 7.19 µs ± 469 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So, I would like to know whether I could make this program shorter and more efficient.

Any help would be highly appreciated.

Is there a better / improved / optimised way to approach this input checker?

I have a form with multiple inputs (around 39) on which I want to perform some checks and display custom messages based on sets of conditions.

I found a way to address my problem, but I’m wondering whether there’s a better way to approach this (and I’m fairly sure there is).

My trouble comes from the fact that there are a lot of custom fields to handle (many inputs to check, each with its own error display, and sometimes more than one custom message to display), plus quite a lot of checks to perform.

function hideShowError(elem, sub) {
    sub = sub || false;
    parentElem = getBoxParent($(elem), sub);

    if (parentElem != false)
        parentElem.addClass('errorWrapper');

    return false;
}

function hideShowErrorWithMessage(elem, message, special, sub, noBorder) {
    sub = sub || false;
    special = special || false;
    noBorder = noBorder || false;
    parentElem = getBoxParent($(elem), sub);

    if (parentElem != false) {
        parentElem.addClass('errorWrapper');
        if (noBorder)
            parentElem.addClass('errorWrapper-no-border');
        parentElem.find('.errorSpe').text(message);
        special ? parentElem.find('.disable-error').addClass("active") : parentElem.find('.disable-error-pb').removeClass("active");
    }

    return false;
}

function getBoxParent(childElem, sub) {
    childElem = childElem || "";
    sub = sub || false;

    if (childElem != "") {
        if (!sub)
            return childElem.closest('.questBox3');
        return childElem.closest('.subQuestBox3');
    }
    return false;
}

function testInput() {
    var input1 = $('input1-t').is(':checked') || $('input1-f').is(':checked');
    var input2 = $('input2-f').is(':checked');
    var input3 = !$('input3-t').is(':checked') && !$('input2-f').is(':checked');
    var input4 = $('input4-f').is(':checked');
    var input5 = !$('input5-t').is(':checked') && !$('input4-f').is(':checked');
    var input6 = $('input6-f').is(':checked');
    var input7 = !$('input7-t').is(':checked') && !$('input6-f').is(':checked');
    var input8 = $('input8-f').is(':checked');
    var input9 = !$('input9-t').is(':checked') && !$('input8-f').is(':checked');
    var input10 = $('input10-t').is(':checked');
    var input11 = !$('input11-f').is(':checked') && !$('input10-t').is(':checked');
    var input12 = $('input12-t').is(':checked') || $('input12-f').is(':checked');
    var input13 = $('input13-t').is(':checked') || $('input13-f').is(':checked');
    var input14 = $('input14-t').is(':checked') || $('input14-f').is(':checked');
    var input15 = $('input15-f').is(':checked');
    var input16 = $('input16select option:selected').length > 0 && $('input16select option:selected').val() != 0;
    var input17 = $('input17select option:selected').length > 0 && $('input17select option:selected').val() != 0;
    var input18 = $('input18').val().length > 0;
    var input19 = $('input19').val().length > 0;
    var input20 = $('input20').val().length > 0;
    var input21 = $('input21').val().length > 0;
    var input22 = $('input22').val().length > 0;
    var input23 = $('input23').val().length > 0;
    var input24 = $('input24').val().length > 0;
    var input25 = $('input25-t').is(':checked') || $('input25-f').is(':checked');
    var input26 = $('input26-t').is(':checked') || $('input26-f').is(':checked');
    var input27 = $('input27-t').is(':checked') || $('input27-f').is(':checked');
    var input28 = $('input28').val().length > 0 || $('input28-2').val().length > 0 || $('input28-3').val().length > 0;
    var input29 = $('input29-t').is(':checked') || $('input30-f').is(':checked');
    var input30 = $('input30-f').is(':checked');
    var input31 = $('input31-t').is(':checked') || $('input32-f').is(':checked');
    var input32 = $('input32-f').is(':checked');
    var input33 = $('input33').val().length > 0;
    var input34 = $('input33').val() != "" && (!$.isNumeric($('input33').val()) || ($('input33').val() > 99 || $('input33').val() < 0));

    // checkbox input
    var input35 = $('input35:checked').length;
    var input36 = $('input36:checked').length;

    // siret control
    var numSiret = ($('input28').val().toString()).replace(/\s+/g, '');

    var validSiret = ("" == numSiret || (verif_siren_siret(numSiret, 9) && verif_siren_siret(numSiret, 14)))

    // Get CP (97, 971, 972 ...)
    var departement = '<%= @adresse.try(:departement) %>'.match("^97");

    // Error check Question 1 (input)
    if (!input1)
        ret = hideShowError('input1-t');
    // Error check Question 9 (input)
    if (input2 && !input3)
        ret = hideShowErrorWithMessage('input2-f', "<%= t('sometrad1') %>", false, false, true);
    // Error check Question 10 (input)
    if (input4 && !input5)
        ret = hideShowErrorWithMessage('input4-f', "<%= t('sometrad1') %>", false, false, true);
    // Error check Question 11 (input)
    if (input6 && !input7)
        ret = hideShowErrorWithMessage('input6-f', "<%= t('sometrad1') %>", false, false, true);
    // Error check Question 12 (input)
    if (input8 && !input9)
        ret = hideShowErrorWithMessage('input8-f', "<%= t('sometrad1') %>", false, false, true);
    // Error check Question 13 (input)
    if (input10 && !input11)
        ret = hideShowErrorWithMessage('input10-t', "<%= t('sometrad1') %>", false, false, true);

    if (!input2 && input3)
        ret = hideShowErrorWithMessage('input2-f', "<%= t('sometrad2') %>", false);
    if (!input4 && input5)
        ret = hideShowErrorWithMessage('input4-f', "<%= t('sometrad2') %>", false);
    if (!input6 && input7)
        ret = hideShowErrorWithMessage('input6-f', "<%= t('sometrad2') %>", false);
    if (!input8 && input9)
        ret = hideShowErrorWithMessage('input8-f', "<%= t('sometrad2') %>", false);
    if (!input10 && input11)
        ret = hideShowErrorWithMessage('input10-t', "<%= t('sometrad2') %>", false);
    if (!input12)
        ret = hideShowError('input12-t');
    if ($('input12-t').is(':checked') && !input13)
        ret = hideShowErrorWithMessage('input13-t', "<%= t('sometrad2') %>", false, true);
    if (!input14)
        ret = hideShowErrorWithMessage('input14-t', "<%= t('sometrad2') %>", false, true);
    if (input15)
        ret = hideShowErrorWithMessage('input13-t', "<%= t('sometrad3') %>", false, true);
    if (!input16)
        ret = hideShowError('input16select');
    if (!input17)
        ret = hideShowError('input17select');
    if (!input18)
        ret = hideShowError('input18');
    if (!input19)
        ret = hideShowError('input19');
    if (!input25)
        ret = hideShowError('input25-t');
    if ($('input25-t').is(':checked') && !input20)
        ret = hideShowError('input20');
    if ($('input25-t').is(':checked') && !input21)
        ret = hideShowError('input21');
    if ($('input25-t').is(':checked') && !input22)
        ret = hideShowError('input22');
    if ($('input25-t').is(':checked') && !input23)
        ret = hideShowError('input23');
    if ($('input25-t').is(':checked') && !input24 && null != departement)
        ret = hideShowError('input24');
    if (!input26)
        ret = hideShowError('input26-t');
    if (input26 && !input27)
        ret = hideShowErrorWithMessage('input27-t', "<%= t('sometrad2') %>", false);
    if ($('input27-t').is(':checked') && !input28) {
        ret = hideShowErrorWithMessage('input28', "<%= t('sometrad4') %>", false);
        ret = hideShowErrorWithMessage('input28-2', "<%= t('sometrad4') %>", false);
        ret = hideShowErrorWithMessage('input28-3', "<%= t('sometrad4') %>", false);
    }
    if ($('input27-t').is(':checked') && $('input28').val().length > 0 && !validSiret)
        ret = hideShowErrorWithMessage('input28', "<%= t('sometrad5') %>", false);

    // Error check Question 13 (checkbox)
    if (input35 > 0)
        ret = hideShowErrorWithMessage('#pb-decence-1', "<%= t('sometrad2') %>", false, false, true);
    if (input35 == 0 && input36 == 0)
        ret = hideShowErrorWithMessage('#pb-decence-1', "<%= t('sometrad2') %>", false);

    if (!input29)
        ret = hideShowErrorWithMessage('input29-t', "<%= t('sometrad2') %>", false);
    if (input30)
        ret = hideShowErrorWithMessage('input29-t', "<%= t('sometrad6') %>", false);

    if (!input31)
        ret = hideShowErrorWithMessage('input31-t', "<%= t('sometrad2') %>", false);
    if (input32)
        ret = hideShowErrorWithMessage('input31-t', "<%= t('sometrad7') %>", false, false, true);

    if (!input33)
        ret = hideShowErrorWithMessage('input33', "<%= t('sometrad8') %>", false);
    if (input34)
        ret = hideShowErrorWithMessage('input33', "<%= t('sometrad2') %>", false);
}

I have also added an HTML (Slim) block of code. Although the inputs are not all the same, the general structure is mostly the same.

div.questGridWrapper
    div.questBox
        span= t('quest_9')
    div.questBox
        span.special-info
            span= t('quest_error_warn.warn_0') + " "
            <i class="fa fa-question-circle"></i>
        div.no-display-info
            br
            span= t('quest_9_prec.sub_1')
            br
            br
            ul
                li.li-style
                    span= t('input.sub_2')
                span.color OR
                li.li-style
                    span= t('input.sub_3')
div.questGridWrapper
    div.questBox3
        div.form-group.radio_buttons
            div.form-check-hacked

                span.radio
                    label for="input-t"
                        - if @pb_projet_courant.pb_logement.espace_suffisant == true
                            input.form-check-input.radio_buttons type="radio" name="input[value]" checked="checked" id="input-t" value="true"
                            = t('generic.quest_yes')
                        - else
                            input.form-check-input.radio_buttons type="radio" name="input[value]" id="input-t" value="true"
                            = t('generic.quest_yes')

                span.radio
                    label for="input-f"
                        - if @pb_projet_courant.pb_logement.espace_suffisant == false
                            input.form-check-input.radio_buttons type="radio" name="input[value]" checked="checked" id="input-f" value="false"
                            = t('generic.quest_no')
                        - else
                            input.form-check-input.radio_buttons type="radio" name="input[value]" id="input-f" value="false"
                            = t('generic.quest_no')

        .error-message
            span.errorSpe.errorSpe-noBorder= t('someDefaultTrad')

(I had to anonymize the code before posting, so sadly getting a working example may take some time, but the code itself actually works. I’m looking for a way to improve my if statements, if there is one, given the number of conditions that are checked.)

Index Checker

I’m using Google index checker and would like to know what it actually checks for.
I tested it with a list of 35 of my websites, in every case the home page, with the correct http or https and www or non-www prefix.
I only get 12 reported as indexed, and while some are indeed not indexed, at least 80% actually are when I check by hand on Google.

What’s the search string that’s used?

PS: I get 18 indexed on Yahoo and 14 on Bing, and I never tried to get indexed on either of them.

Spell checker for Word 2007 is not working

From this question it is clear that a language without the ABC checkers will not work for spell checking. I’m aware of this. What is not answered there is what to do when one wants to use a language that doesn’t have the ABC checkers: how does one activate (install) that particular language in Word?

EDIT: I tried to install all three language service packs from Microsoft, but the following error message appears.

[Image: screenshot of the error message]

Setup: Windows 10 64 bit, Office 2007