The Most Underrated Python Packages

In my experience as a Python user, I’ve come across a lot of different packages and curated lists. Some are in my bookmarks, like the great awesome-python-data-science curated list or the awesome-python curated list. If you don’t know them, go check them out ASAP.

In this post, I’d like to show you something else. These are the results of late-night GitHub/Reddit browsing, and cool stuff shared by colleagues.

Some of these packages are really unique; others are just fun to use and are real underdogs among the data scientists and statisticians I’ve worked with.

Let’s start!

Misc (the weird ones)

  • Knock Knock: Send notifications from Python to mobile devices, the desktop, or email.
  • tqdm: Extensible Progress Bar for Python and CLI, with built-in support for pandas.
  • Colorama: Simple cross-platform colored terminal text.
  • Pandas-log: It provides feedback about basic pandas operations. Great for debugging long pipe chains.
  • Pandas-flavor: The easy way to extend Pandas DataFrame/Series.
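
To give a feel for how little code tqdm needs, here is a minimal sketch that wraps a loop in a progress bar. The function name slow_square and the sleep call are placeholders of mine, not part of tqdm; the fallback is only there so the snippet still runs if tqdm is not installed.

```python
import time

try:
    from tqdm import tqdm  # third-party: pip install tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm: no bar, same iteration.
    def tqdm(iterable, **kwargs):
        return iterable


def slow_square(values):
    """Square each value, showing a progress bar while iterating."""
    results = []
    for v in tqdm(values, desc="squaring"):
        time.sleep(0.01)  # stand-in for real work
        results.append(v * v)
    return results


squares = slow_square(range(5))
print(squares)  # [0, 1, 4, 9, 16]
```

Wrapping any iterable in tqdm(...) is usually all it takes; the bar itself is written to stderr, so it does not mix with your printed results.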

Data Cleaning and Manipulation

  • ftfy: Fixes mojibake and other glitches in Unicode text, after the fact.
  • janitor: A lot of cool functions to clean data.
  • Optimus: Another package for data cleaning.
  • Great-expectations: A great package to check if your data obeys your expectations.
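
If you have never met mojibake, the sketch below creates some on purpose and then repairs it with ftfy's fix_text function. The helper names make_mojibake and repair are mine, and the fallback branch is an assumption for illustration: it only undoes this one specific UTF-8/Latin-1 mix-up, whereas ftfy detects many more cases on its own.

```python
def make_mojibake(text):
    """Simulate the classic bug: UTF-8 bytes wrongly decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Repair mojibake with ftfy if available, else reverse the mix-up by hand."""
    try:
        import ftfy  # third-party: pip install ftfy
    except ImportError:
        # Fallback (assumption: the damage is exactly a UTF-8/Latin-1 mix-up).
        return text.encode("latin-1").decode("utf-8")
    return ftfy.fix_text(text)


garbled = make_mojibake("café")
print(garbled)          # cafÃ©
print(repair(garbled))  # café
```

The nice part about ftfy is that you do not need to know which wrong encoding caused the damage; the manual fallback above only works because we created the problem ourselves.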

Data Exploration and Modelling

  • Pandas-profiling: Creates an HTML report full of statistics from a pandas DataFrame.
  • pydqc: Lets you compare statistics between two datasets.
  • Pandas-summary: An extension to the pandas DataFrame describe function.
  • pivottablejs: Drag’n’drop pivot tables for pandas DataFrames inside a Jupyter notebook.

Performance Checking and Optimization

  • Py-spy: Sampling profiler for Python programs.
  • pyperf: Toolkit to run Python benchmarks.
  • snakeviz: An in-browser Python profile viewer with great support for Jupyter notebooks.
  • Cachier: Persistent, stale-free, local and cross-machine caching for Python functions.
  • Faiss: A library for efficient similarity search and clustering of dense vectors.
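
snakeviz does not profile anything itself; it visualizes the stats files written by the standard library's cProfile. As a rough sketch, this is one way to produce such a file. The workload sum_of_squares and the file name example.prof are made up for the example.

```python
import cProfile
import os
import pstats
import tempfile


def sum_of_squares(n):
    """Deliberately simple workload to profile."""
    return sum(i * i for i in range(n))


def profile_to_file(path, n=200_000):
    """Run the workload under cProfile and save the raw stats to `path`."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = sum_of_squares(n)
    profiler.disable()
    profiler.dump_stats(path)
    return result


prof_path = os.path.join(tempfile.gettempdir(), "example.prof")
profile_to_file(prof_path)

# Quick text summary of the five most expensive calls by cumulative time.
pstats.Stats(prof_path).sort_stats("cumulative").print_stats(5)
```

With snakeviz installed, running snakeviz followed by the path to that file from a shell should open the same data as an interactive chart in the browser, which is much easier to read than the text table.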

I hope you found something useful or fun for your work. I’m going to expand the post in the future, so stay tuned for new updates!
