Basic Python Data Visualization with Matplotlib

In this short post, you will learn how to create a basic plot with Python. To create a plot we will use Matplotlib.

What is Matplotlib?

Matplotlib is a package for drawing charts with Python. It is often used from Jupyter Notebook, a web application in which you can write and run Python code interactively and see the charts directly in the browser. However, you will mostly work with scripts that save the data visualizations as files.

The data to be plotted is often stored in array types whose structure is optimized for fast mathematical operations. The package that provides these arrays is NumPy, which is usually imported with:

import numpy as np
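As a small illustration of why NumPy arrays are convenient for plot data, the sketch below builds an array and applies arithmetic to every element at once. The variable names and values are made up for this example.

```python
import numpy as np

# Five evenly spaced x values from 0 to 4 (illustrative data only)
x = np.linspace(0, 4, 5)

# Arithmetic is applied to every element at once, without an explicit loop
y = x ** 2

print(x.tolist())  # [0.0, 1.0, 2.0, 3.0, 4.0]
print(y.tolist())  # [0.0, 1.0, 4.0, 9.0, 16.0]
```

Arrays like x and y can be passed directly to Matplotlib's plotting functions.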

Data Visualization in Python

Below is an example of how matplotlib is used to draw a diagram and save it to a file. First, you need to import matplotlib and its pyplot module (which draws the graphics):

import matplotlib
import matplotlib.pyplot as plt

Second, to save the plots as .png files without opening a window, we select the non-interactive Agg backend with the following command:

matplotlib.use('AGG')

Third, you type the following (into your script) to create the plot:

plt.figure()

plt.plot([1, 2, 75, 6, 7])
plt.ylabel('The label is useless in this plot example')

plt.savefig("data_visualization_in_Python_example.png")
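The steps above can be combined into one script. The version below is a minimal sketch, not the only way to do it: the sine-wave data, the axis labels, and the output file name are assumptions made for illustration. It selects the backend before importing pyplot, which works in all Matplotlib versions (older releases ignored a backend change made after pyplot was imported).

```python
import os

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; set before importing pyplot
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration: one sine wave sampled at 100 points
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.figure()
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.title("Example plot")

# Hypothetical output file name; dpi sets the resolution of the PNG
output_file = "sine_example.png"
plt.savefig(output_file, dpi=150)
plt.close()  # release the figure once it has been saved

print(os.path.exists(output_file))
```

Closing the figure after saving matters mostly in scripts that create many plots in a loop, where open figures would otherwise accumulate in memory.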

Briefly about the plot() function

The matplotlib package has a module called pyplot. This module defines the plot() function, which we use to draw simple line graphs.
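To make this a little more concrete, here is a small sketch of plot() called with explicit x and y values, a format string, and a label. The data and the output file name are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file instead of opening a window
import matplotlib.pyplot as plt

# Illustrative data only
x = [0, 1, 2, 3, 4]
squares = [v ** 2 for v in x]
doubles = [2 * v for v in x]

plt.figure()
# "ro-" means red, circle markers, solid line; plot() returns a list
# containing one Line2D object per line drawn
lines = plt.plot(x, squares, "ro-", label="x squared")
# "b--" means blue, dashed line
plt.plot(x, doubles, "b--", label="2x")
plt.legend()

plt.savefig("plot_function_example.png")  # hypothetical file name
plt.close()
```

When only one list is given, as in plt.plot([1, 2, 75, 6, 7]), the values are used as y and the x values default to 0, 1, 2, and so on.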

Other libraries to use when doing data science:

  1. Pandas is great for data wrangling, plotting, and descriptive statistics.
  2. Seaborn is a wrapper around matplotlib that is easier to use for statistical plots.
  3. Statsmodels is useful for statistical data analysis.
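As a taste of the first item in the list, the sketch below uses Pandas for descriptive statistics. It assumes Pandas is installed, and the column names and measurements are made up. Seaborn and Statsmodels are left out to keep the example short.

```python
import pandas as pd

# Made-up measurements, for illustration only
df = pd.DataFrame({
    "height_cm": [170, 165, 180, 175],
    "weight_kg": [68, 59, 82, 74],
})

# describe() reports count, mean, std, min, quartiles, and max per column
summary = df.describe()
print(summary)

mean_height = df["height_cm"].mean()
print(mean_height)  # 172.5
```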

Make sure to check this site for Pandas Python Tutorials, data visualization, data analysis, and many more neat guides and how-tos for programming.

That’s all! Let me know if you need anything!

10 Open Data resources to use with Python

Recently, I asked on Twitter whether there are any good sources of free and open data to use when learning Python (and R).

In this post I will list the suggestions I have received so far.

  • Awesome Public Datasets: A huge collection of public datasets, categorized by field (e.g., biology, economics, machine learning).
  • UCI Machine Learning Repository: “…currently maintain 349 data sets as a service to the machine learning community”
  • Kaggle (https://www.kaggle.com/datasets): Also a list of publicly available datasets.
  • Government data: Governmental data. Everything from agriculture to science & research. Very interesting.
  • Google Public Data: Huge collection of different data sources that are public. Seems really nice.
  • Amazon public data sets: “AWS hosts a variety of public data sets that anyone can access for free.” Seems interesting.
  • MovieLens: “Learn more about movies with rich data, images, and trailers. Browse movies by community-applied tags, or apply your own tags. Explore the database with expressive search tools.” MovieLens is not really a data source of the kind I asked for. However, the suggestion was that one could use the movie ratings to learn Hadoop/Spark/MapReduce. I may give this a try, if I ever get the time.

These were the data sources that people on Twitter suggested in reply to my tweet. I have also found this one very interesting myself: Open Psychology Data, a journal that describes open and reusable psychology data. If you are interested in playing around with personality data, it can be found here. Finally, the APA has links to open data sets: Data Links.

I know, the title is wrong: I gave you a huge number of different data sources to use. Some may contain overlapping links to data, but I would assume that we now have data to play around with for quite some time. Do you know of any more data sources that are open and free? Please leave a comment!

Bands Incorporated — OUseful.Info, the blog…

A few weeks ago, as I was doodling with some Companies House director network mapping code and simple Companies House chatbot ideas, I tweeted an example of Iron Maiden’s company structure based on co-director relationships. Depending on how the original search is seeded, the maps may also include elements of band members’ own personal holdings/interests. The […]

via Bands Incorporated — OUseful.Info, the blog…