Problem installing pandas [aarch64-linux-gnu-gcc failed with exit status 1]

I’m having some trouble installing pandas.

    creating build/temp.linux-aarch64-3.5
    creating build/temp.linux-aarch64-3.5/pandas
    creating build/temp.linux-aarch64-3.5/pandas/_libs
    aarch64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -Ipandas/_libs -I./pandas/_libs -Ipandas/_libs/src/klib -Ipandas/_libs/src -I/home/amessios/.virtualenvs/legal_subs/lib/python3.5/site-packages/numpy/core/include -I/usr/include/python3.5m -I/home/amessios/.virtualenvs/legal_subs/include/python3.5m -c pandas/_libs/window.cpp -o build/temp.linux-aarch64-3.5/pandas/_libs/window.o -Wno-unused-function
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    pandas/_libs/window.cpp: In function ‘PyObject* __pyx_pf_6pandas_5_libs_6window_24roll_generic(PyObject*, PyObject*, __pyx_t_5numpy_int64_t, __pyx_t_5numpy_int64_t, PyObject*, PyObject*, int, PyObject*, int, PyObject*, PyObject*)’:
    pandas/_libs/window.cpp:43540:61: error: ‘PyArrayObject {aka struct tagPyArrayObject}’ has no member named ‘data’
      __pyx_v_buf = ((__pyx_t_5numpy_float64_t *)__pyx_v_arr->data);
                                                              ^
    pandas/_libs/window.cpp:43599:67: error: ‘PyArrayObject {aka struct tagPyArrayObject}’ has no member named ‘data’
      __pyx_v_oldbuf = ((__pyx_t_5numpy_float64_t *)__pyx_v_bufarr->data);
                                                                    ^
    pandas/_libs/window.cpp:43627:23: error: ‘PyArrayObject {aka struct tagPyArrayObject}’ has no member named ‘data’
        __pyx_v_bufarr->data = ((char *)__pyx_v_buf);
                        ^
    pandas/_libs/window.cpp:43709:21: error: ‘PyArrayObject {aka struct tagPyArrayObject}’ has no member named ‘data’
      __pyx_v_bufarr->data = ((char *)__pyx_v_oldbuf);
                      ^
    pandas/_libs/window.cpp: In function ‘PyObject* __pyx_pf_6pandas_5_libs_6window_30ewmcov(PyObject*, __Pyx_memviewslice, __Pyx_memviewslice, __pyx_t_5numpy_float64_t, int, int, int, int)’:
    pandas/_libs/window.cpp:45682:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      __pyx_t_2 = ((__pyx_t_1 != __pyx_v_N) != 0);
                              ^
    error: command 'aarch64-linux-gnu-gcc' failed with exit status 1
    error
    ERROR: Failed building wheel for pandas
    Running setup.py clean for pandas
    Running command /home/amessios/.virtualenvs/legal_subs/bin/python3.5 -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-zytfd7ni/pandas/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' clean --all
    running clean
    Failed to build pandas
    Installing collected packages: pandas

I have Ubuntu 16.04, Python 3.5.2, and all the dependencies I think I should need. I’m also installing in a virtual environment. Does anyone have any idea what is going wrong?

Parsing data from JSON API in Pandas

I made a little program to parse data from an API. I do not have experience with Pandas. It works, but I would like to know how to do this better and more efficiently.

    import json

    import pandas as pd
    import requests


    class Albion_data():
        def __init__(self, item):
            self.location = f'Bridgewatch,Thetford,FortSterling,Martlock,Bridgewatch,Lymhurst'
            self.item = item

        @property
        def get_data(self):
            url = f"https://www.albion-online-data.com/api/v1/stats/Prices/{self.item}?locations={self.location}"
            response = (requests.get(url).text)
            response_json = json.loads(response)
            if response_json != []:
                response_json = pd.DataFrame(response_json)[["item_id", "city", "sell_price_min"]]
                max = response_json.loc[response_json["sell_price_min"].idxmax()]
                min = response_json.loc[response_json["sell_price_min"].idxmin()]
                gain = max[2] - min[2]
                data = pd.DataFrame(
                    [[max[0], max[1], min[1], min[2], max[2], gain]],
                    columns=["ITEM", "CITY max ", "CITY min", "MIN_PRICE", "MAX PRICE", "GAIN"])

                return data


    def get_data_frame():
        item_list = pd.read_csv('items.txt', sep=':', names=['number', 'item'])['item']
        item_price = pd.DataFrame(columns=["ITEM", "CITY max ", "CITY min", "MIN_PRICE", "MAX PRICE", "GAIN"])
        for item in item_list:
            item_price = item_price.append(Albion_data(item).get_data)
            item_price.to_csv('data.csv')
        return item_price
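One possible direction, offered as a sketch rather than a definitive rewrite: separate the HTTP call from the price summary, collect one plain dict per item, and build the DataFrame once at the end instead of appending inside the loop. The function names `summarize_prices` and `build_table`, the tidied column names, and the sample records (including the item id `T4_BAG`) are illustrative assumptions; only the keys `item_id`, `city` and `sell_price_min` come from the code above.

```python
import pandas as pd

# Column names tidied for the sketch (assumption, not the original headers).
COLUMNS = ["ITEM", "CITY_MAX", "CITY_MIN", "MIN_PRICE", "MAX_PRICE", "GAIN"]


def summarize_prices(records):
    """Summarize one item's decoded JSON list; return None if the list is empty."""
    if not records:
        return None
    prices = pd.DataFrame(records)[["item_id", "city", "sell_price_min"]]
    highest = prices.loc[prices["sell_price_min"].idxmax()]
    lowest = prices.loc[prices["sell_price_min"].idxmin()]
    return {
        "ITEM": highest["item_id"],
        "CITY_MAX": highest["city"],
        "CITY_MIN": lowest["city"],
        "MIN_PRICE": lowest["sell_price_min"],
        "MAX_PRICE": highest["sell_price_min"],
        "GAIN": highest["sell_price_min"] - lowest["sell_price_min"],
    }


def build_table(responses):
    """Build the result table in one step from a list of decoded responses."""
    rows = [summarize_prices(records) for records in responses]
    return pd.DataFrame([row for row in rows if row is not None], columns=COLUMNS)


# Made-up sample standing in for two API responses (the second one is empty).
sample = [
    [
        {"item_id": "T4_BAG", "city": "Martlock", "sell_price_min": 100},
        {"item_id": "T4_BAG", "city": "Lymhurst", "sell_price_min": 250},
    ],
    [],
]
table = build_table(sample)
print(table)
```

In a real run each element of `responses` would be the decoded body of one request, and `table.to_csv('data.csv')` would be called once after the loop rather than on every iteration.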

Code Scrapes, Queries Against a Database, and Does intersection in Pandas

The code base is split between three different files: odbc.py, scrape.py, and dataprocessor.py. The job of odbc.py is to take scraped data and determine whether or not those results are in a database. Ultimately, I’d like to extend this script to contain more general functionality, so that the queries being performed are less dependent on the specifics of checking for titles; for instance, I’d like to check which interviews were added to the database and return the results and a sample path. The top100 variable is an example of the output from scrape.py.

Here is the code for odbc.py:

    import pyodbc
    from ConfigFunc import GetConfig

    #This function allows for the user to pull ODBC #
    Location = 'AzureWinMedia06232019'
    connectConfig = GetConfig(Location)

    #Establishes Connection with Azure Database via ODBC#
    conn = pyodbc.connect("""Driver={};Server={};Database={};Uid={};
    Pwd={};Encrypt={};TrustServerCertificate={};
    Connection Timeout={};""".format(connectConfig["Driver"], connectConfig["Server"],
                                     connectConfig["Database"], connectConfig["User"],
                                     connectConfig["Password"], connectConfig["Encrypt"],
                                     connectConfig["TrustedServer"], connectConfig["ConnectionTimeout"]))

    top100 = {
        1: {'artist': 'Lil Nas X Featuring Billy Ray Cyrus', 'title': 'Old Town Road'},
        2: {'artist': 'Billie Eilish', 'title': 'Bad Guy'},
        3: {'artist': 'Khalid', 'title': 'Talk'},
        4: {'artist': 'Jonas Brothers', 'title': 'Sucker'},
        5: {'artist': 'Ed Sheeran & Justin Bieber', 'title': "I Don't Care"},
        6: {'artist': 'Post Malone', 'title': 'Wow.'},
        7: {'artist': 'Post Malone & Swae Lee', 'title': 'Sunflower (Spider-Man: Into The Spider-Verse)'},
        8: {'artist': 'DaBaby', 'title': 'Suge'},
        9: {'artist': 'Chris Brown Featuring Drake', 'title': 'No Guidance'},
        10: {'artist': 'Sam Smith & Normani', 'title': 'Dancing With A Stranger'},
        11: {'artist': 'Polo G Featuring Lil Tjay', 'title': 'Pop Out'},
        12: {'artist': 'Shawn Mendes', 'title': "If I Can't Have You"},
        13: {'artist': 'Ava Max', 'title': 'Sweet But Psycho'},
        14: {'artist': 'Taylor Swift Featuring Brendon Urie', 'title': 'ME!'},
        15: {'artist': 'Halsey', 'title': 'Without Me'},
        16: {'artist': 'Ariana Grande', 'title': '7 Rings'},
        17: {'artist': 'Lizzo', 'title': 'Truth Hurts'},
        18: {'artist': 'Marshmello & Bastille', 'title': 'Happier'},
        19: {'artist': 'Blake Shelton', 'title': "God's Country"},
        20: {'artist': 'Morgan Wallen', 'title': 'Whiskey Glasses'},
        21: {'artist': 'Panic! At The Disco', 'title': 'High Hopes'},
        22: {'artist': 'Luke Combs', 'title': 'Beer Never Broke My Heart'},
        23: {'artist': 'Daddy Yankee & Katy Perry Featuring Snow', 'title': 'Con Calma'},
        24: {'artist': 'Young Thug, J. Cole & Travis Scott', 'title': 'The London'},
        25: {'artist': 'J. Cole', 'title': 'Middle Child'},
        26: {'artist': 'City Girls', 'title': 'Act Up'},
        27: {'artist': 'benny blanco, Halsey & Khalid', 'title': 'Eastside'},
        28: {'artist': 'Katy Perry', 'title': 'Never Really Over'},
        29: {'artist': 'Mustard & Migos', 'title': 'Pure Water'},
        30: {'artist': 'Tyler, The Creator', 'title': 'Earfquake'},
        31: {'artist': 'Panic! At The Disco', 'title': 'Hey Look Ma, I Made It'},
        32: {'artist': 'Meek Mill Featuring Drake', 'title': 'Going Bad'},
        33: {'artist': 'Dan + Shay', 'title': 'Speechless'},
        34: {'artist': 'Lady Gaga & Bradley Cooper', 'title': 'Shallow'},
        35: {'artist': 'Khalid', 'title': 'Better'},
        36: {'artist': 'Lee Brice', 'title': 'Rumor'},
        37: {'artist': 'Ariana Grande', 'title': "Break Up With Your Girlfriend, I'm Bored"},
        38: {'artist': 'Travis Scott', 'title': 'Sicko Mode'},
        39: {'artist': 'Thomas Rhett', 'title': 'Look What God Gave Her'},
        40: {'artist': 'A Boogie Wit da Hoodie', 'title': 'Look Back At It'},
        41: {'artist': 'Calboy', 'title': 'Envy Me'},
        42: {'artist': 'Billie Eilish', 'title': "When The Party's Over"},
        43: {'artist': 'Halsey', 'title': 'Nightmare'},
        44: {'artist': 'Jonas Brothers', 'title': 'Cool'},
        45: {'artist': 'Luke Combs', 'title': 'Beautiful Crazy'},
        46: {'artist': 'Kane Brown', 'title': 'Good As You'},
        47: {'artist': 'Cardi B', 'title': 'Press'},
        48: {'artist': 'Lil Baby', 'title': 'Close Friends'},
        49: {'artist': 'Ed Sheeran Featuring Chance The Rapper & PnB Rock', 'title': 'Cross Me'},
        50: {'artist': 'YG, Tyga & Jon Z', 'title': 'Go Loko'},
        51: {'artist': 'Cardi B & Bruno Mars', 'title': 'Please Me'},
        52: {'artist': 'Brett Eldredge', 'title': 'Love Someone'},
        53: {'artist': 'Offset Featuring Cardi B', 'title': 'Clout'},
        54: {'artist': 'YK Osiris', 'title': 'Worth It'},
        55: {'artist': 'Lewis Capaldi', 'title': 'Someone You Loved'},
        56: {'artist': 'Kelsea Ballerini', 'title': 'Miss Me More'},
        57: {'artist': 'P!nk', 'title': 'Walk Me Home'},
        58: {'artist': 'Billie Eilish', 'title': 'Bury A Friend'},
        59: {'artist': 'Maren Morris', 'title': 'GIRL'},
        60: {'artist': 'DJ Khaled Featuring SZA', 'title': 'Just Us'},
        61: {'artist': 'Luke Bryan', 'title': "Knockin' Boots"},
        62: {'artist': 'Luke Combs', 'title': "Even Though I'm Leaving"},
        63: {'artist': '5 Seconds Of Summer', 'title': 'Easier'},
        64: {'artist': 'Summer Walker X Drake', 'title': 'Girls Need Love'},
        65: {'artist': 'Lil Tecca', 'title': 'Ran$om'},
        66: {'artist': 'Blanco Brown', 'title': 'The Git Up'},
        67: {'artist': 'Meek Mill Featuring Ella Mai', 'title': '24/7'},
        68: {'artist': 'Jason Aldean', 'title': 'Rearview Town'},
        69: {'artist': 'Bad Bunny & Tainy', 'title': 'Callaita'},
        70: {'artist': 'DJ Khaled Featuring Cardi B & 21 Savage', 'title': 'Wish Wish'},
        71: {'artist': 'Dan + Shay', 'title': 'All To Myself'},
        72: {'artist': 'Chase Rice', 'title': 'Eyes On You'},
        73: {'artist': 'Beyonce', 'title': 'Before I Let Go'},
        74: {'artist': 'Eric Church', 'title': 'Some Of It'},
        75: {'artist': 'Marshmello Featuring CHVRCHES', 'title': 'Here With Me'},
        76: {'artist': 'Lil Uzi Vert', 'title': 'Sanguine Paradise'},
        77: {'artist': 'Lunay, Daddy Yankee & Bad Bunny', 'title': 'Soltera'},
        78: {'artist': 'Florida Georgia Line', 'title': 'Talk You Out Of It'},
        79: {'artist': 'Yo Gotti Featuring Lil Baby', 'title': 'Put A Date On It'},
        80: {'artist': 'Eli Young Band', 'title': "Love Ain't"},
        81: {'artist': 'NLE Choppa', 'title': 'Shotta Flow'},
        82: {'artist': 'Pedro Capo X Farruko', 'title': 'Calma'},
        83: {'artist': 'Avicii', 'title': 'Heaven'},
        84: {'artist': 'The Chainsmokers & Bebe Rexha', 'title': 'Call You Mine'},
        85: {'artist': 'Billie Eilish', 'title': 'Ocean Eyes'},
        86: {'artist': 'Megan Thee Stallion', 'title': 'Big Ole Freak'},
        87: {'artist': 'Future', 'title': 'Please Tell Me'},
        88: {'artist': 'Cody Johnson', 'title': 'On My Way To You'},
        89: {'artist': 'SHAED', 'title': 'Trampoline'},
        90: {'artist': 'Chris Young', 'title': 'Raised On Country'},
        91: {'artist': 'Nicky Jam X Ozuna', 'title': 'Te Robare'},
        92: {'artist': 'Ozuna', 'title': 'Amor Genuino'},
        93: {'artist': 'Jonas Brothers', 'title': 'Only Human'},
        94: {'artist': 'Yella Beezy, Gucci Mane & Quavo', 'title': 'Bacc At It Again'},
        95: {'artist': 'Bryce Vine Featuring YG', 'title': 'La La Land'},
        96: {'artist': 'Juice WRLD', 'title': 'Robbery'},
        97: {'artist': 'Ozuna x Daddy Yankee x J Balvin x Farruko x Anuel AA', 'title': 'Baila Baila Baila'},
        98: {'artist': 'Future', 'title': 'XanaX Damage'},
        99: {'artist': 'Future', 'title': 'Government Official'},
        100: {'artist': 'Sech Featuring Darell', 'title': 'Otro Trago'},
    }


    def read(conn, query):
        """Executes a Query against the specified connection and query params"""
        cursor = conn.cursor()
        cursor.execute(query)
        data = cursor.fetchall()
        print(data)
        return data


    def top100Search(conn, top100):
        """Executes a query for the values in the top100 variable against the database specified within conn param"""
        results = []
        for items in top100:
            topquery = "SELECT Title, Performer FROM Media WHERE Title ='" + top100[items]['title'].replace("'","''") + "'"
            temp = read(conn, topquery)
            if temp == []:
                continue
            else:
                results.append(temp)
        return results


    def resultsparser(results):
        """Flattens the list of per-title result lists into a single list of rows"""
        b = []
        for i in range(0, len(results)):
            for j in range(0, len(results[i])):
                b.append(results[i][j])
        return b

The code imports an external function (GetConfig), which looks like this:

    import pyodbc
    import ConfigParser

    def GetConfig(remoteServer):
        """Needs a specified server in config and returns back the ODBC parameters Server, Driver, Database, User, Password, Encrypt, TrustedServer, ConnectionTimeout"""

        Config = ConfigParser.ConfigParser()
        Config.read('activeconfig.ini')
        Server = Config.get(remoteServer, 'Server')
        Driver = Config.get(remoteServer, 'Driver')
        Database = Config.get(remoteServer, "Database")
        User = Config.get(remoteServer, "User")
        Password = Config.get(remoteServer, "Password")
        Encrypt = Config.get(remoteServer, "Encrypt")
        TrustedServer = Config.get(remoteServer, "TrustedServer")
        ConnectionTimeout = Config.get(remoteServer, "ConnectionTimeout")
        return {"Server": Server, "Driver": Driver, "Database": Database, "User": User, "Password": Password,
                "Encrypt": Encrypt, "TrustedServer": TrustedServer, "ConnectionTimeout": ConnectionTimeout}

The purpose of the code above is to get the database connection settings, which are stored in a config.ini file. The file is structured as shown below and points to a remote Azure database, which is a replica of a media library:

    [SomeServer]
    Server:
    Password:
    Database:
    User:
    Driver:
    Encrypt:
    TrustedServer:
    ConnectionTimeout:

Overall, I’m wondering whether my code is over-specified, or whether any of my functions or variables should be moved into classes. I haven’t included the scraper or the Pandas portion of the code, as I thought it would be a bit much, but I could include it if that would be helpful.
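One way to make the lookup less tied to hand-built SQL strings is to pass the title as a query parameter instead of concatenating and escaping it. The sketch below uses the standard-library sqlite3 module with an in-memory table purely as a stand-in for the Azure database, so that it runs anywhere; pyodbc also accepts `?` placeholders, but the exact behaviour against the real server is an assumption to verify. The function name `find_titles` and the two-entry sample are hypothetical.

```python
import sqlite3


def find_titles(conn, titles):
    """Return (Title, Performer) rows for every title found in the Media table."""
    cursor = conn.cursor()
    found = []
    for title in titles:
        # The driver handles quoting, so no manual replace("'", "''") is needed.
        cursor.execute(
            "SELECT Title, Performer FROM Media WHERE Title = ?", (title,)
        )
        # extend() flattens as we go, so no separate result parser is required.
        found.extend(tuple(row) for row in cursor.fetchall())
    return found


# Stand-in database (assumption): one table with the two columns queried above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Media (Title TEXT, Performer TEXT)")
conn.execute(
    "INSERT INTO Media VALUES (?, ?)",
    ("I Don't Care", "Ed Sheeran & Justin Bieber"),
)
conn.commit()

# Two entries copied from the top100 example, enough to show a hit and a miss.
top100 = {
    5: {"artist": "Ed Sheeran & Justin Bieber", "title": "I Don't Care"},
    6: {"artist": "Post Malone", "title": "Wow."},
}
matches = find_titles(conn, [entry["title"] for entry in top100.values()])
print(matches)
```

Because `find_titles` takes a plain list of titles, the same function could be reused for other lookups by changing only the list that is passed in.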

Writing a pandas dataframe to a csv file and renaming on a for loop

I have a script that reads a SQL database into pandas data frames, which are then concatenated in a loop to form one data frame. I need to write this combined data frame to a CSV file and name the file from a list of IDs.

I am using DataFrame.to_csv to write the file and os.rename to change the name.

    for X, df in d.iteritems():
        newdf = pd.concat(d)
        for X in newdf:
            export_csv = newdf.to_csv (r'/Users/uni/Desktop/corrindex+id/X.csv', index = False, header = None)
            for X in NAMES:
                os.rename ('X.csv',X)

This is the code that concatenates the data frames together. In the third loop, NAMES = ‘rt35’ but in the future this will be a list of similar names.

I expect to get a file named rt35.csv. However, I get either r.csv or X.csv, along with this error:

OSError: [Errno 2] No such file or directory 

The files are writing correctly, the only issue is the name.
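Two details in the loop would explain these names: 'X.csv' inside quotes is a literal string, not the loop variable X, and iterating over a string such as 'rt35' yields single characters ('r', 't', ...). A minimal sketch of building the file name directly, so that no rename is needed, might look like this. The sample frames, the temporary output directory and the list `names` are assumptions made so the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the dict of data frames read from SQL (assumption).
d = {
    "first": pd.DataFrame({"a": [1, 2], "b": [3, 4]}),
    "second": pd.DataFrame({"a": [5], "b": [6]}),
}
newdf = pd.concat(d)  # concatenate once, outside any loop

# Keep the names in a list, even when there is only one of them.
names = ["rt35"]

# Temporary directory used only for the sketch; replace with the real folder.
out_dir = tempfile.mkdtemp()

written = []
for name in names:
    # Put the variable into the path instead of the literal text "X".
    path = os.path.join(out_dir, f"{name}.csv")
    newdf.to_csv(path, index=False, header=False)
    written.append(path)

print(written)
```

If each name should receive a different data frame rather than the same concatenated one, the loop could iterate over pairs of names and frames instead.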

How do I delete rows based on specific cell content in pandas

I am preprocessing my data using Python’s pandas library.

This is a project to train an algorithm to predict “roles”.

This is the result I get when I run print(vagas.role_name.value_counts()):

    Security Entry                                                      9300
    Retail Entry                                                        6562
    Healthcare                                                          5884
    Food & Hospitality                                                  2559
    Unmatched Role                                                      1922
    Security Experienced                                                1481
    Education                                                            541
    Corporate Experienced                                                538
    Retail Experienced                                                   309
    Service Technician                                                   188
    Transportation                                                       183
    Sales                                                                175
    Software & Technology                                                148
    General Labor                                                        128
    Corporate Entry                                                      110
    Tire Sales & Service                                                  44
    Insurance Sales Agent REFERRAL ONLY                                   33
    Test and Referrals ONLY                                               29
    Customer Service                                                      28
    Insurance Sales Agent In Person                                       18
    Insurance Sales Agent REFERRAL ONLY – Reliable Life Insurance         17
    Security Officer                                                      12
    Security Guard (Road Guard)                                            9
    Insurance Sales Agent In Person – Reliable Life Insurance              8
    Insurance Sales Agent Phone Interview – Reliable Life Insurance        5
    Insurance Sales Agent In Person – Union National Life Insurance        4
    Insurance Sales Agent Phone Interview                                  3
    Guest Service Call Center Representative – example role only           3
    Manager in Training – example role only                                3
    Visual Merchandiser at Forever 21                                      2
    Sales Associate at Forever 21 LIVE                                     2
    DO NOT USE                                                             2
    Truck Driver – CDL                                                     1
    TRAINING ROLE ONLY                                                     1
    Experienced Material Handler MCkesson (4+years)                        1
    Tire service Technician                                                1
    Assistant Visual Manager at Forever 21                                 1
    test role MH                                                           1
    Sales & Administration                                                 1
    Lead of Service – Fashion LIVE                                         1
    Delivery Professional                                                  1
    Security Officer – Armed                                               1
    Co Manager at Forever 21                                               1
    Name: role_name, dtype: int64

I want to remove all “roles” that have fewer than 100 rows.
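A minimal sketch of one common way to do this: take the value counts, keep the labels whose count reaches the threshold, and filter the frame with `isin`. The tiny `vagas` frame here is made-up data; only the column name `role_name` and the threshold of 100 come from the question.

```python
import pandas as pd

# Made-up stand-in for the real data: two frequent roles and one rare one.
vagas = pd.DataFrame(
    {"role_name": ["Healthcare"] * 120 + ["Sales"] * 100 + ["DO NOT USE"] * 2}
)

counts = vagas["role_name"].value_counts()

# Labels of the roles that appear in at least 100 rows.
frequent_roles = counts[counts >= 100].index

# Keep only the rows whose role is in that set.
filtered = vagas[vagas["role_name"].isin(frequent_roles)]

print(filtered["role_name"].value_counts())
```

The comparison uses `>= 100` so that a role with exactly 100 rows is kept, matching "fewer than 100" as the removal rule.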

Correct syntax for conditional statement in pandas

I have a conditional statement I’m trying to work out in pandas in Anaconda. I’ve imported numpy as np.

I need to create a new “Text” field and, if the existing “truncated” field is “False”, use the string in the existing “text” field. Otherwise (that is, if the value of the “truncated” field is “True”), use the string in the existing “extended_tweet.full_text” field.

I’m trying to follow the instructions on the page “Pandas conditional creation of a series/dataframe column”, but it’s not a direct parallel, as my ‘choices’ are the values of other fields and not a given string.

Here’s my code:

    conditions = [
        (df['truncated'] == 'False'),
        (df['truncated'] == 'True')]
    choices = ['text'], ['extended_tweet.full_text']
    df['Text'] = np.select(conditions, choices, default='null')

After running that, all ‘Text’ values are ‘null’.

I’ve tried variations for the ‘choices’ options code, and am thinking the problem is the way I’m indicating the options in the choices line (the example code I’m following is using given ‘string’ values). But I can’t sort out the right way to indicate I want the string values in the stated fields used in the new ‘Text’ field.

Any help greatly appreciated.
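A sketch of one likely fix, under the assumption that “truncated” holds real booleans rather than the strings 'True' and 'False' (which would explain why neither condition ever matches): pass the columns themselves, not their names, as the values to choose from. With only two cases, `np.where` is enough. The sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample with the three columns named in the question.
df = pd.DataFrame(
    {
        "truncated": [False, True],
        "text": ["short tweet", "long tweet cut off"],
        "extended_tweet.full_text": [None, "long tweet in full"],
    }
)

# Where truncated is True take the full text, otherwise the plain text.
# If the column held the strings 'True'/'False' instead, the condition
# would be df["truncated"] == "True".
df["Text"] = np.where(
    df["truncated"], df["extended_tweet.full_text"], df["text"]
)

print(df["Text"].tolist())
```

The same idea carries over to `np.select`: the entries of `choices` would be `df['text']` and `df['extended_tweet.full_text']` rather than lists containing the column names.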

Need to write a program that processes lists of applicants using pandas

I need to write a program in which a function takes the number of admission places as input and outputs the list of admitted applicants. This has to be done with the pandas package in Python. There are about 20 files in Excel format, and each file is one specialty. The principle is as follows: first we build one combined list from all the specialties, then we sort the admitted applicants into new lists according to specialty. The lists include each applicant’s priority (where they want to go; this also has to be respected). If anyone can help, I would be extremely grateful.

Link to the files
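Since the files themselves are not shown, the following is only a rough sketch under several assumptions: the Excel files have already been read (for example with `pd.read_excel`) and combined with `pd.concat` into one table; that table has the hypothetical columns `applicant`, `specialty`, `score` and `priority` (1 meaning the most wanted specialty); each applicant has a single score; and every specialty has the same number of places. Applicants are then processed from the highest score down, and each one gets the most wanted specialty that still has a free place.

```python
import pandas as pd


def admit(applications, seats):
    """Return the admitted applicants and the specialty each one was given."""
    remaining = {spec: seats for spec in applications["specialty"].unique()}
    admitted = {}
    # Highest score first; within one applicant, most wanted specialty first.
    ordered = applications.sort_values(
        ["score", "applicant", "priority"], ascending=[False, True, True]
    )
    for row in ordered.itertuples(index=False):
        if row.applicant in admitted:
            continue  # already placed in a more wanted specialty
        if remaining[row.specialty] > 0:
            admitted[row.applicant] = row.specialty
            remaining[row.specialty] -= 1
    return pd.DataFrame(sorted(admitted.items()), columns=["applicant", "specialty"])


# Made-up combined list: one row per applicant and specialty applied to.
applications = pd.DataFrame(
    {
        "applicant": ["Ann", "Ann", "Bob", "Bob", "Cid", "Cid"],
        "specialty": ["Math", "Physics", "Math", "Physics", "Physics", "Math"],
        "score": [90, 90, 85, 85, 80, 80],
        "priority": [1, 2, 1, 2, 1, 2],
    }
)

result = admit(applications, seats=1)
print(result)
```

With one place per specialty, Ann takes Math, Bob falls back to his second choice Physics, and Cid is not admitted. Splitting `result` by specialty afterwards can be done with `result.groupby("specialty")`.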

Parse date format in Pandas using Python

I have a column in a Pandas Dataframe containing birth dates in object/string format:

    0    16MAR39
    1    21JAN56
    2    18NOV51
    3    05MAR64
    4    05JUN48

I want to convert them to a date format for processing. I have used

    #Convert String to Datetime type
    data['BIRTH'] = pd.to_datetime(data['BIRTH'])

but the result is …

    0   2039-03-16
    1   2056-01-21
    2   2051-11-18
    3   2064-03-05
    4   2048-06-05
    Name: BIRTH, dtype: datetime64[ns]

Clearly the dates have the wrong century prefix (“20” instead of “19”)

I handled this using …

data['BIRTH'] = np.where(data['BIRTH'].dt.year > 2000, data['BIRTH'] - pd.offsets.DateOffset(years=100), data['BIRTH']) 

Result

    0       1939-03-16
    1       1956-01-21
    2       1951-11-18
    3       1964-03-05
    4       1948-06-05
    Name: BIRTH, Length: 10302, dtype: datetime64[ns]

I am wondering:

  1. Is there a way to process the data that will get it right the first time?
  2. Is there a better way to process the data after the incorrect conversion?
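On the first point, a two-digit year is inherently ambiguous, and the `%y` directive follows the usual convention of mapping small values such as 39 into the 2000s. One sketch that avoids the wrong century altogether, under the loudly stated assumption that every birth year in the column belongs to the 1900s, is to splice the century into the string and parse it with an explicit four-digit-year format. The sample series reuses the five values shown above.

```python
import pandas as pd

birth = pd.Series(["16MAR39", "21JAN56", "18NOV51", "05MAR64", "05JUN48"])

# Assumption: all birth years are 19xx, so "16MAR39" becomes "16MAR1939".
with_century = birth.str[:5] + "19" + birth.str[5:]

# An explicit format avoids guessing and makes the parse strict.
parsed = pd.to_datetime(with_century, format="%d%b%Y")

print(parsed)
```

If the data could also contain births from 2000 onwards, this shortcut no longer holds and a cutoff-based correction after parsing, like the `np.where` step above, is still needed.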

I’m an amateur coder and, as far as I understand things, Pandas is optimised for processing efficiency, so I wanted to use the Pandas datetime functionality for that reason. But is it better to consider NumPy’s or Pandas’ datetime functionality here? I know this dataset is small, but I am trying to improve my skills so that I know what to consider when I work on larger datasets.

Source data