Cleaner Python

I am currently reworking some Python scripts at work that manipulate student data and prepare it for upload to various places. These destinations include MS Access*, MySQL (managed through phpMyAdmin), and other third-party platforms that require the data to be structured so that it maps correctly onto what are mostly SQL Server targets.


Previously, I was using the ‘re’ module to read a file name and extract a string value, which was used both to fill in a value in the file and to name the final output (in this case, an Excel document).

	import re
	filetxt = str(newgpa)
	# Every run of four digits in the file name; the third is the term code
	termc = re.findall(r'[0-9]{4}', filetxt)
	termcode = termc[2]
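To make the indexing concrete, here is a minimal sketch of that extraction run against a made-up file name. The name 'gpa_2023_0512_4238.xlsx' and the helper extract_term_code() are hypothetical; only the pattern and the third-match index come from the snippet above. The sketch also returns None instead of raising an IndexError when the name holds fewer than three four-digit groups.

```python
import re

def extract_term_code(filename, position=2):
    """Return the four-digit run at `position` in filename, or None if absent."""
    matches = re.findall(r'[0-9]{4}', filename)
    if len(matches) <= position:
        return None
    return matches[position]

# Hypothetical file name containing three four-digit groups
print(extract_term_code('gpa_2023_0512_4238.xlsx'))  # prints 4238
```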

In the new version, I have removed that entire block (only a small segment of it appears above) and simply use the input() and exec() functions to tell the script which way to go, that is, which script to run next, depending on local needs. The segment below was part of the old version too; what has changed is that the regex block above is gone.

yesno = input('Do you want the CumulativeGPA, yes or no? ')

if yesno == 'yes':
    # Run the follow-up script in the current namespace
    exec(open('NewGPAyes2.py').read())
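A hedged variation on the same idea: normalize the typed answer before comparing it, and hand the chosen file to runpy.run_path() from the standard library instead of exec(open(...).read()). This is only a sketch, not the script described here. The helpers pick_script() and run_next() are hypothetical, and returning None for any other answer is an assumption, since the snippet above only shows the 'yes' branch.

```python
import runpy

def pick_script(answer):
    """Map a typed answer to the script to run next, or None for no match."""
    # strip() and lower() let ' Yes ' or 'YES' count as 'yes'
    if answer.strip().lower() == 'yes':
        return 'NewGPAyes2.py'
    return None

def run_next(answer):
    """Run the script chosen by `answer`; return its globals, or None."""
    script = pick_script(answer)
    if script is None:
        return None
    # run_path executes the file in a fresh namespace and returns that namespace
    return runpy.run_path(script, run_name='__main__')

# Usage (left commented out so the sketch does not prompt on import):
# run_next(input('Do you want the CumulativeGPA, yes or no? '))
```

One difference to keep in mind: exec() as used above runs the second script inside the calling script's namespace, so variables are shared, while run_path() gives the second script a namespace of its own.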

In another of the files called by exec(), the older working version contained a series of Pandas rename calls of the form [dataframename].rename().

gpaextractpd.rename(columns={"University ID":"IUID"}, inplace=True)
gpaextractpd.rename(columns={"Term Code":"Term"}, inplace=True)
gpaextractpd.rename(columns={"Cumulative GPA":"CumulativeGPA"}, inplace=True)
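For comparison, DataFrame.rename() accepts several old-to-new pairs in one columns mapping, so the three calls above could also be written as a single call. The sketch below uses a made-up one-row frame (the values are placeholders, not real student data) and assigns the result back rather than using inplace=True.

```python
import pandas as pd

# Placeholder row; only the column names come from the post
gpaextractpd = pd.DataFrame({
    "University ID": ["0000000000"],
    "Term Code": ["0000"],
    "Cumulative GPA": [0.0],
})

# One mapping handles all three renames
gpaextractpd = gpaextractpd.rename(columns={
    "University ID": "IUID",
    "Term Code": "Term",
    "Cumulative GPA": "CumulativeGPA",
})

print(list(gpaextractpd.columns))  # ['IUID', 'Term', 'CumulativeGPA']
```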

In the new version of this file, which I am simply calling [filenamefromversion1]2.py (as in the second code block from the top), this little block has been removed.

There are various reasons for these changes:

  • The 3rd party platforms referenced above have not changed: they still have to be queried manually rather than through an API
  • We are currently migrating more data sources and queries from Access to MySQL, which affects several things
  • With the move to MySQL, the round-tripping of data from one system to another and back again can be built around better, more standardized routines and SQL, and the data can be better prepared on output for Python and then for its final destination
  • Being able to redesign the data output in MySQL means I can rely on even simpler Python to manipulate the information, which is carried in Excel files intended for manual upload (this point is important for me)
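One way the last two points could play out: if the query itself hands back columns under their final names, the Python side has nothing left to rename. The sketch below uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name gpa_extract and the sample row are hypothetical, and only the column names are taken from the rename block earlier in the post.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL source (assumption)
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE gpa_extract ('
    '"University ID" TEXT, "Term Code" TEXT, "Cumulative GPA" REAL)'
)
conn.execute("INSERT INTO gpa_extract VALUES ('0000000000', '0000', 0.0)")

# Aliases in the SELECT give the output its final column names
cursor = conn.execute(
    'SELECT "University ID" AS IUID, '
    '"Term Code" AS Term, '
    '"Cumulative GPA" AS CumulativeGPA '
    'FROM gpa_extract'
)
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
conn.close()

print(columns)  # ['IUID', 'Term', 'CumulativeGPA']
```

In MySQL the same AS aliases apply; column names that contain spaces are normally quoted there with backticks rather than double quotes.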

There are a lot of little examples from both the Python and the MySQL sides that I could demo here, but this post is focused on cleaner, more concise code. Python is already a very clean language, but this new process is making my code for these tasks even more concise. Yes!

There is more work needed on various steps in this process, but much has already been done, making it quicker, more consistent, and more robust. In fact, I will be working on this project this very day. Deployment from DEV to production has been successful thus far, and the rest is intended to ‘go live’ through the Spring and Summer as each piece is completed and polished.
