6. Advanced plotting

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import seaborn as sns

We have already seen two ways to plot data: "raw" Matplotlib, which in principle lets one create any possible plot but often requires a lot of code, and the simpler plotting built into Pandas. While the latter is very practical for quickly looking through data, it becomes cumbersome for more complex plots.

Here we look at another type of plotting that rests on the concepts of the grammar of graphics. This approach makes it possible to create complex plots in which the data are split by color, shape, etc. without having to do a grouping operation beforehand. We will mainly look at Seaborn, and finish with an example using Plotnine, the Python port of ggplot.

Importing data

We come back here to the dataset of Swiss towns. To make the dataset more interesting, we add some categorical data to it. First we attempt to add the main language of each town. This is a good example of the kind of data wrangling one often has to do when combining information from different sources.

#load table indicating to which canton each town belongs
cantons = pd.read_excel('Datasets/be-b-00.04-osv-01.xls',sheet_name=1)[['KTKZ','ORTNAME']]

#load general table with infos on towns
towns = pd.read_excel('Datasets/2018.xls', skiprows=list(range(5))+list(range(6,9)),
                      skipfooter=34, index_col='Commune',na_values=['*','X'])
towns = towns.reset_index()

#merge tables using the town name. This adds the canton abbreviation to the main table 
towns_canton = pd.merge(towns, cantons, left_on='Commune', right_on='ORTNAME',how = 'inner')
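
An inner merge silently drops every town whose name has no exact match in the canton table. The following is a minimal sketch of how such rows could be spotted, using two small made-up tables (the rows and the names towns_demo and cantons_demo are illustrative, not taken from the datasets above):

```python
import pandas as pd

# Toy stand-ins for the real tables (made-up rows, for illustration only)
towns_demo = pd.DataFrame({'Commune': ['Bern', 'Lausanne', 'Lugano'],
                           'Habitants': [1000, 2000, 3000]})
cantons_demo = pd.DataFrame({'ORTNAME': ['Bern', 'Lugano', 'Basel'],
                             'KTKZ': ['BE', 'TI', 'BS']})

# A left merge with indicator=True keeps every town and adds a '_merge' column
# telling whether a matching row was found in the right-hand table
check = pd.merge(towns_demo, cantons_demo, left_on='Commune',
                 right_on='ORTNAME', how='left', indicator=True)

# Towns that an inner merge would have dropped
unmatched = check.loc[check['_merge'] == 'left_only', 'Commune'].tolist()
print(unmatched)
```

Running the same check on the real tables would show how many towns are lost because of spelling differences between the two sources.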

#load data indicating languages of each canton
language = pd.read_excel('Datasets/je-f-01.08.01.02.xlsx',skiprows=[0,2,3,4],skipfooter=11)
languages = language[['Allemand (ou suisse allemand)','Français (ou patois romand)',
         'Italien (ou dialecte tessinois/italien des grisons)']]
languages = languages.apply(pd.to_numeric, errors='coerce')
#check which language has majority in each canton
languages['language'] = np.argmax(languages.values.astype(float),axis=1)
code={0:'German', 1:'French', 2:'Italian'}
languages['Language'] = languages.language.apply(lambda x: code[x])
languages['canton'] = language['Unnamed: 0']
languages = languages[['canton','Language']]
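
The majority language is found above with np.argmax followed by a lookup in a dictionary. As a possible alternative, here is a small sketch with made-up percentages (the canton names and numbers are hypothetical) where the columns are already named after the languages, so that idxmax returns the label directly:

```python
import pandas as pd

# Made-up percentages for three hypothetical cantons
shares = pd.DataFrame({'German': [80.0, 5.0, 10.0],
                       'French': [3.0, 85.0, 5.0],
                       'Italian': [6.0, 4.0, 88.0]},
                      index=['canton A', 'canton B', 'canton C'])

# idxmax(axis=1) returns, for each row, the name of the column with the largest value
majority = shares.idxmax(axis=1)
print(majority)
```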

#load table matching canton name to abbreviation
cantons_abbrev = pd.read_excel('Datasets/cantons_abbrev.xlsx')
#add full canton name to table by merging on abbreviation
canton_language = pd.merge(languages, cantons_abbrev,on='canton')

#add language by merging on canton abbreviation
towns_language = pd.merge(towns_canton, canton_language, left_on='KTKZ', right_on='abbrev')

#label towns with at least 50% agricultural surface as 'Land' (rural), the others as 'City'
towns_language['town_type'] = towns_language['Surface agricole en %'].apply(lambda x: 'Land' if x>=50 else 'City')

#Create a new party column and a new party score column
parties = pd.melt(towns_language,id_vars=['Commune'], value_vars=['UDC','PS','PDC'], 
                  var_name= 'Party', value_name='Party score')
towns_language = pd.merge(parties, towns_language, on='Commune')

towns_language
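
The melt step is what turns one column per party into the pair of columns Party and Party score used in the plots. A minimal sketch on a made-up table (the towns and scores are hypothetical) shows the change from wide to long format:

```python
import pandas as pd

# Made-up scores for two hypothetical towns (wide format: one column per party)
wide = pd.DataFrame({'Commune': ['Town A', 'Town B'],
                     'UDC': [30.0, 10.0],
                     'PS': [20.0, 25.0],
                     'PDC': [5.0, 40.0]})

# melt stacks the party columns into one 'Party' column and one 'Party score' column,
# so each town now appears once per party
long = pd.melt(wide, id_vars=['Commune'], value_vars=['UDC', 'PS', 'PDC'],
               var_name='Party', value_name='Party score')
print(long)
```

Each town appears three times in the long table, which is also why every town occupies three rows in the merged towns_language table.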

Basic plotting

We finally have a table with mostly numerical information but also two categorical variables: language and town type (land or city). With Seaborn we can now easily make all sorts of plots. For example, what are the average scores of the different parties?

sns.barplot(data = towns_language, y='Party score', x = 'Party');
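
By default the height of each bar is the mean of the y values in that category. To check the numbers behind such a plot, the same averages can be computed with a groupby. This is only a sketch on made-up scores, not the real data:

```python
import pandas as pd

# Made-up party scores in long format (two hypothetical towns per party)
scores = pd.DataFrame({'Party': ['UDC', 'UDC', 'PS', 'PS', 'PDC', 'PDC'],
                       'Party score': [30.0, 40.0, 20.0, 10.0, 5.0, 15.0]})

# Mean score per party: the quantity that the bar heights represent
mean_scores = scores.groupby('Party')['Party score'].mean()
print(mean_scores)
```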

Do land towns vote more for the right-wing party?

g = sns.scatterplot(data = towns_language, y='UDC', x = 'Surface agricole en %', s = 10, alpha = 0.5);
g.set_xlim([0,100]);

Using categories as "aesthetics"

The great advantage of these packages is that they allow categories to be included as "aesthetics" of the plot. For example, we looked before at average party scores. But do they differ between language regions? We can simply specify that the hue (color) should be mapped to the town language:

sns.barplot(data = towns_language, y='Party score', x = 'Party', hue = 'Language');
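
Mapping a category to hue spares us the explicit grouping that would otherwise be needed. For comparison, here is a sketch of that grouping done by hand with pivot_table, on made-up scores (the values are hypothetical):

```python
import pandas as pd

# Made-up scores with a party and a language for each row
scores = pd.DataFrame({
    'Party': ['UDC', 'UDC', 'UDC', 'PS', 'PS', 'PS'],
    'Language': ['German', 'German', 'French', 'German', 'French', 'French'],
    'Party score': [30.0, 40.0, 10.0, 20.0, 30.0, 20.0]})

# One row per party, one column per language, each cell holding the mean score:
# the same split that hue='Language' performs inside the bar plot
table = scores.pivot_table(index='Party', columns='Language',
                           values='Party score', aggfunc='mean')
print(table)
```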

The same works for scatter plots. Is the relation between land and voting for the right language-dependent?

g = sns.scatterplot(data = towns_language, y='UDC', x = 'Surface agricole en %', hue = 'Language',
                    s = 10, alpha = 0.5);
g.set_xlim([0,100]);

Statistics

We see a difference in the last plot, but it is still difficult to clearly see the relation. Luckily these packages allow us to either create summary statistics or to fit the data:

g = sns.lmplot(data = towns_language, x = 'Surface agricole en %', y='UDC', hue = 'Language', scatter=True,
              scatter_kws={'alpha': 0.1});
g.ax.set_xlim([0,100]);
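
The lines drawn by lmplot come from a linear regression fitted separately for each hue level. To obtain the slopes as numbers, a comparable fit can be sketched with np.polyfit inside a groupby. The data below are made up (points lying exactly on two lines), and the column names x and y are placeholders:

```python
import numpy as np
import pandas as pd

# Made-up points lying exactly on two lines, one per hypothetical language group
x = np.array([0.0, 25.0, 50.0, 75.0, 100.0])
demo = pd.concat([
    pd.DataFrame({'x': x, 'y': 20 + 0.3 * x, 'Language': 'German'}),
    pd.DataFrame({'x': x, 'y': 10 + 0.1 * x, 'Language': 'French'}),
])

# Fit y = slope * x + intercept separately within each group
fits = {}
for name, group in demo.groupby('Language'):
    slope, intercept = np.polyfit(group['x'], group['y'], deg=1)
    fits[name] = (slope, intercept)

for name, (slope, intercept) in fits.items():
    print(f'{name}: slope={slope:.2f}, intercept={intercept:.2f}')
```

On real data the rows with missing values would have to be dropped first, since np.polyfit does not ignore NaN.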

Now we can also do the same exercise for all parties. Does the relation hold?

g = sns.lmplot(data = towns_language, x = 'Surface agricole en %', y='Party score', 
               hue = 'Party', scatter=True,
              scatter_kws={'alpha': 0.1});
g.ax.set_xlim([0,100]);

Adding even more information

We can recover the coordinates of each town from another source (Poste). Again, by merging we can add that information to our main table:

coords = pd.read_csv('Datasets/plz_verzeichnis_v2.csv', sep=';')[['ORTBEZ18','Geokoordinaten']]
#split the 'lat, long' string into two numbers; missing entries become NaN
coords['lat'] = coords.Geokoordinaten.apply(lambda x: float(x.split(', ')[0]) if isinstance(x, str) else np.nan)
coords['long'] = coords.Geokoordinaten.apply(lambda x: float(x.split(', ')[1]) if isinstance(x, str) else np.nan)

towns_language = pd.merge(towns_language,coords, left_on='Commune', right_on='ORTBEZ18')
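
The coordinate parsing above applies a lambda function twice. A possible vectorized alternative is sketched below on made-up strings, assuming the same 'lat, long' layout. It splits each string once and converts both parts to numbers:

```python
import numpy as np
import pandas as pd

# Made-up coordinate strings in the 'lat, long' layout, with one missing value
demo = pd.DataFrame({'Geokoordinaten': ['47.0, 8.5', '46.2, 6.1', np.nan]})

# Split each string once into two columns (labelled 0 and 1),
# then convert both columns to numbers; missing entries stay NaN
parts = demo['Geokoordinaten'].str.split(', ', expand=True)
parts = parts.apply(pd.to_numeric, errors='coerce')
demo['lat'] = parts[0]
demo['long'] = parts[1]
print(demo)
```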

So now we can also look at the geography of these parameters. For example, who votes for the right-wing party?

fig, ax = plt.subplots(figsize = (12,8))
sns.scatterplot(data = towns_language, x= 'long', y = 'lat', hue='UDC', style = 'Language', palette='Reds', ax = ax);

# Note: if you are used to ggplot, the 'plotnine' package offers the same grammar in Python

Output of the towns_language cell from the "Importing data" section (truncated display):

	Commune	Party	Party score	Code commune	Habitants	Variation en %	Densité de la population par km²	Etrangers en %	0-19 ans	20-64 ans	...	PBD	PST/Sol.	PES	Petits partis de droite	KTKZ	ORTNAME	canton	Language	abbrev	town_type
0	Aeugst am Albis	UDC	30.929249	1	1977	8.388158	249.936789	13.100658	20.586748	62.822458	...	2.617442	0.167638	7.075094	4.888178	ZH	Aeugst am Albis	Zurich	German	ZH	City
1	Aeugst am Albis	PS	18.645940	1	1977	8.388158	249.936789	13.100658	20.586748	62.822458	...	2.617442	0.167638	7.075094	4.888178	ZH	Aeugst am Albis	Zurich	German	ZH	City
2	Aeugst am Albis	PDC	2.076428	1	1977	8.388158	249.936789	13.100658	20.586748	62.822458	...	2.617442	0.167638	7.075094	4.888178	ZH	Aeugst am Albis	Zurich	German	ZH	City
3	Affoltern am Albis	UDC	33.785785	2	11900	7.294203	1123.701605	27.848740	20.285714	62.201681	...	4.164299	0.190049	6.211047	1.768197	ZH	Affoltern am Albis	Zurich	German	ZH	Land
4	Affoltern am Albis	PS	19.080314	2	11900	7.294203	1123.701605	27.848740	20.285714	62.201681	...	4.164299	0.190049	6.211047	1.768197	ZH	Affoltern am Albis	Zurich	German	ZH	Land
5	Affoltern am Albis	PDC	4.585387	2	11900	7.294203	1123.701605	27.848740	20.285714	62.201681	...	4.164299	0.190049	6.211047	1.768197	ZH	Affoltern am Albis	Zurich	German	ZH	Land
6	Bonstetten	UDC	29.100156	3	5435	5.349874	731.493943	14.149034	23.808648	60.717571	...	3.803108	0.112518	6.661066	1.915807	ZH	Bonstetten	Zurich	German	ZH	City
7	Bonstetten	PS	20.403265	3	5435	5.349874	731.493943	14.149034	23.808648	60.717571	...	3.803108	0.112518	6.661066	1.915807	ZH	Bonstetten	Zurich	German	ZH	City
8	Bonstetten	PDC	3.378541	3	5435	5.349874	731.493943	14.149034	23.808648	60.717571	...	3.803108	0.112518	6.661066	1.915807	ZH	Bonstetten	Zurich	German	ZH	City
9	Hausen am Albis	UDC	34.937369	4	3571	6.279762	262.573529	14.533744	22.738729	60.403248	...	4.656087	0.193911	8.021665	1.825436	ZH	Hausen am Albis	Zurich	German	ZH	City
10	Hausen am Albis	PS	19.393305	4	3571	6.279762	262.573529	14.533744	22.738729	60.403248	...	4.656087	0.193911	8.021665	1.825436	ZH	Hausen am Albis	Zurich	German	ZH	City
11	Hausen am Albis	PDC	2.881915	4	3571	6.279762	262.573529	14.533744	22.738729	60.403248	...	4.656087	0.193911	8.021665	1.825436	ZH	Hausen am Albis	Zurich	German	ZH	City
12	Hedingen	UDC	30.114599	5	3687	8.123167	564.624809	14.971522	22.484405	62.110117	...	3.768864	0.227988	6.466387	1.840045	ZH	Hedingen	Zurich	German	ZH	Land
13	Hedingen	PS	22.478008	5	3687	8.123167	564.624809	14.971522	22.484405	62.110117	...	3.768864	0.227988	6.466387	1.840045	ZH	Hedingen	Zurich	German	ZH	Land
14	Hedingen	PDC	3.918166	5	3687	8.123167	564.624809	14.971522	22.484405	62.110117	...	3.768864	0.227988	6.466387	1.840045	ZH	Hedingen	Zurich	German	ZH	Land
15	Kappel am Albis	UDC	48.615099	6	1110	20.915033	140.151515	18.018018	26.486486	60.180180	...	5.134268	0.312447	4.382706	2.769802	ZH	Kappel am Albis	Zurich	German	ZH	City
16	Kappel am Albis	PS	10.285425	6	1110	20.915033	140.151515	18.018018	26.486486	60.180180	...	5.134268	0.312447	4.382706	2.769802	ZH	Kappel am Albis	Zurich	German	ZH	City
17	Kappel am Albis	PDC	2.744469	6	1110	20.915033	140.151515	18.018018	26.486486	60.180180	...	5.134268	0.312447	4.382706	2.769802	ZH	Kappel am Albis	Zurich	German	ZH	City
18	Knonau	UDC	32.876136	7	2168	20.444444	335.085008	17.158672	24.077491	60.885609	...	5.944968	0.008415	5.801919	3.437395	ZH	Knonau	Zurich	German	ZH	City
19	Knonau	PS	18.436553	7	2168	20.444444	335.085008	17.158672	24.077491	60.885609	...	5.944968	0.008415	5.801919	3.437395	ZH	Knonau	Zurich	German	ZH	City
20	Knonau	PDC	3.126052	7	2168	20.444444	335.085008	17.158672	24.077491	60.885609	...	5.944968	0.008415	5.801919	3.437395	ZH	Knonau	Zurich	German	ZH	City
21	Maschwanden	UDC	43.383446	8	626	1.623377	133.475480	12.140575	23.162939	59.744409	...	3.452833	0.644309	5.170990	2.114654	ZH	Maschwanden	Zurich	German	ZH	City
22	Maschwanden	PS	22.732529	8	626	1.623377	133.475480	12.140575	23.162939	59.744409	...	3.452833	0.644309	5.170990	2.114654	ZH	Maschwanden	Zurich	German	ZH	City
23	Maschwanden	PDC	3.502396	8	626	1.623377	133.475480	12.140575	23.162939	59.744409	...	3.452833	0.644309	5.170990	2.114654	ZH	Maschwanden	Zurich	German	ZH	City
24	Mettmenstetten	UDC	35.671015	9	4861	14.565166	373.062164	14.873483	22.341082	60.851677	...	4.352704	0.133017	5.059457	3.569025	ZH	Mettmenstetten	Zurich	German	ZH	City
25	Mettmenstetten	PS	18.800282	9	4861	14.565166	373.062164	14.873483	22.341082	60.851677	...	4.352704	0.133017	5.059457	3.569025	ZH	Mettmenstetten	Zurich	German	ZH	City
26	Mettmenstetten	PDC	3.649155	9	4861	14.565166	373.062164	14.873483	22.341082	60.851677	...	4.352704	0.133017	5.059457	3.569025	ZH	Mettmenstetten	Zurich	German	ZH	City
27	Obfelden	UDC	36.174029	10	5131	9.496372	680.503979	20.015591	23.055935	60.202690	...	4.358116	0.086903	3.705416	2.982453	ZH	Obfelden	Zurich	German	ZH	Land
28	Obfelden	PS	16.922138	10	5131	9.496372	680.503979	20.015591	23.055935	60.202690	...	4.358116	0.086903	3.705416	2.982453	ZH	Obfelden	Zurich	German	ZH	Land
29	Obfelden	PDC	6.488176	10	5131	9.496372	680.503979	20.015591	23.055935	60.202690	...	4.358116	0.086903	3.705416	2.982453	ZH	Obfelden	Zurich	German	ZH	Land
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
6318	Muriaux	UDC	14.609053	6753	504	3.703704	29.857820	7.341270	19.841270	60.515873	...	NaN	4.320988	8.641975	NaN	JU	Muriaux	Jura	French	JU	City
6319	Muriaux	PS	8.641975	6753	504	3.703704	29.857820	7.341270	19.841270	60.515873	...	NaN	4.320988	8.641975	NaN	JU	Muriaux	Jura	French	JU	City
6320	Muriaux	PDC	20.370370	6753	504	3.703704	29.857820	7.341270	19.841270	60.515873	...	NaN	4.320988	8.641975	NaN	JU	Muriaux	Jura	French	JU	City
6321	Le Noirmont	UDC	8.346334	6754	1845	10.877404	90.485532	15.880759	21.680217	61.517615	...	NaN	2.964119	7.332293	NaN	JU	Le Noirmont	Jura	French	JU	City
6322	Le Noirmont	PS	25.663027	6754	1845	10.877404	90.485532	15.880759	21.680217	61.517615	...	NaN	2.964119	7.332293	NaN	JU	Le Noirmont	Jura	French	JU	City
6323	Le Noirmont	PDC	15.834633	6754	1845	10.877404	90.485532	15.880759	21.680217	61.517615	...	NaN	2.964119	7.332293	NaN	JU	Le Noirmont	Jura	French	JU	City
6324	Saignelégier	UDC	9.322820	6757	2556	2.240000	80.707294	8.020344	21.244131	60.015649	...	NaN	5.287570	9.137291	NaN	JU	Saignelégier	Jura	French	JU	Land
6325	Saignelégier	PS	24.768089	6757	2556	2.240000	80.707294	8.020344	21.244131	60.015649	...	NaN	5.287570	9.137291	NaN	JU	Saignelégier	Jura	French	JU	Land
6326	Saignelégier	PDC	25.278293	6757	2556	2.240000	80.707294	8.020344	21.244131	60.015649	...	NaN	5.287570	9.137291	NaN	JU	Saignelégier	Jura	French	JU	Land
6327	Soubey	UDC	21.739130	6759	134	-8.843537	9.933284	2.238806	14.925373	56.716418	...	NaN	0.000000	8.695652	NaN	JU	Soubey	Jura	French	JU	Land
6328	Soubey	PS	15.217391	6759	134	-8.843537	9.933284	2.238806	14.925373	56.716418	...	NaN	0.000000	8.695652	NaN	JU	Soubey	Jura	French	JU	Land
6329	Soubey	PDC	42.028986	6759	134	-8.843537	9.933284	2.238806	14.925373	56.716418	...	NaN	0.000000	8.695652	NaN	JU	Soubey	Jura	French	JU	Land
6330	Alle	UDC	8.108108	6771	1817	7.769870	171.415094	10.621904	24.766098	54.980737	...	NaN	1.719902	4.176904	NaN	JU	Alle	Jura	French	JU	City
6331	Alle	PS	14.557740	6771	1817	7.769870	171.415094	10.621904	24.766098	54.980737	...	NaN	1.719902	4.176904	NaN	JU	Alle	Jura	French	JU	City
6332	Alle	PDC	48.341523	6771	1817	7.769870	171.415094	10.621904	24.766098	54.980737	...	NaN	1.719902	4.176904	NaN	JU	Alle	Jura	French	JU	City
6333	Beurnevésin	UDC	16.279070	6773	127	-8.633094	24.950884	6.299213	12.598425	48.818898	...	NaN	3.100775	6.201550	NaN	JU	Beurnevésin	Jura	French	JU	City
6334	Beurnevésin	PS	6.976744	6773	127	-8.633094	24.950884	6.299213	12.598425	48.818898	...	NaN	3.100775	6.201550	NaN	JU	Beurnevésin	Jura	French	JU	City
6335	Beurnevésin	PDC	48.062016	6773	127	-8.633094	24.950884	6.299213	12.598425	48.818898	...	NaN	3.100775	6.201550	NaN	JU	Beurnevésin	Jura	French	JU	City
6336	Boncourt	UDC	8.318099	6774	1205	-7.307692	133.740289	10.539419	17.925311	52.614108	...	NaN	0.548446	2.285192	NaN	JU	Boncourt	Jura	French	JU	Land
6337	Boncourt	PS	22.760512	6774	1205	-7.307692	133.740289	10.539419	17.925311	52.614108	...	NaN	0.548446	2.285192	NaN	JU	Boncourt	Jura	French	JU	Land
6338	Boncourt	PDC	44.972578	6774	1205	-7.307692	133.740289	10.539419	17.925311	52.614108	...	NaN	0.548446	2.285192	NaN	JU	Boncourt	Jura	French	JU	Land
6339	Bonfol	UDC	21.452703	6775	665	-2.349486	48.969072	9.172932	15.789474	53.984962	...	NaN	2.364865	6.418919	NaN	JU	Bonfol	Jura	French	JU	Land
6340	Bonfol	PS	15.371622	6775	665	-2.349486	48.969072	9.172932	15.789474	53.984962	...	NaN	2.364865	6.418919	NaN	JU	Bonfol	Jura	French	JU	Land
6341	Bonfol	PDC	31.418919	6775	665	-2.349486	48.969072	9.172932	15.789474	53.984962	...	NaN	2.364865	6.418919	NaN	JU	Bonfol	Jura	French	JU	Land
6342	Bure	UDC	5.503356	6778	685	3.162651	50.036523	6.131387	19.854015	54.306569	...	NaN	0.671141	1.208054	NaN	JU	Bure	Jura	French	JU	Land
6343	Bure	PS	8.859060	6778	685	3.162651	50.036523	6.131387	19.854015	54.306569	...	NaN	0.671141	1.208054	NaN	JU	Bure	Jura	French	JU	Land
6344	Bure	PDC	49.395973	6778	685	3.162651	50.036523	6.131387	19.854015	54.306569	...	NaN	0.671141	1.208054	NaN	JU	Bure	Jura	French	JU	Land
6345	Coeuve	UDC	8.194444	6781	732	8.124077	62.994836	3.551913	25.136612	56.147541	...	NaN	2.222222	5.000000	NaN	JU	Coeuve	Jura	French	JU	City
6346	Coeuve	PS	24.722222	6781	732	8.124077	62.994836	3.551913	25.136612	56.147541	...	NaN	2.222222	5.000000	NaN	JU	Coeuve	Jura	French	JU	City
6347	Coeuve	PDC	36.527778	6781	732	8.124077	62.994836	3.551913	25.136612	56.147541	...	NaN	2.222222	5.000000	NaN	JU	Coeuve	Jura	French	JU	City