Summary

The notebook

Hi, this notebook page was written to analyse the 2019-2020 SARS-CoV-2 pandemic outbreak in Europe.

It was written to:

  • provide the current status of the outbreak as reported by the authorities;
  • present this status with interactive charts;
  • write a small library to analyse and predict the plateau and duration of the disease;
  • add something new to my personal website.

The disease

The disease COVID-19 is caused by the virus SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2).

Your help

This notebook let me practise some older skills and, eventually, reach out to an interested audience.
If you have a suggestion on how to make the notebook more useful, I would be grateful to hear it ( nuno.aja@gmail.com ).

Kind regards, The author.

NOTICE: ongoing work


Section I

In this section we focus on downloading the current datasets, then processing and loading them.

For convenience and security, the code that downloads, processes and presents the data is wrapped in our own library module.


In [1]:
import modules_loader
from modules.analytics.sars_cov2_2019_20 import analytics
In [2]:
analysis = analytics.scenario()
available_countries = analysis.download_source_datafiles(force = True)
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-bvpac0mt because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
In [3]:
target_cases = analysis.process_dataset(available_countries)
Preparing dataset for 196 coutries.
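
As a rough illustration of the processing step, the figures shown later in this notebook are consistent with Active = Infected - Recovered - Deaths. The sketch below reproduces that relation with pandas for two countries. It is an assumption about how the columns relate, not the library's actual code, and the helper name add_active_column is hypothetical.

```python
import pandas as pd


def add_active_column(totals: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with Active = Infected - Recovered - Deaths (assumed relation)."""
    result = totals.copy()
    result["Active"] = result["Infected"] - result["Recovered"] - result["Deaths"]
    # Same column order as the tables displayed in this notebook
    return result[["Active", "Infected", "Recovered", "Deaths"]]


# Figures copied from the outputs shown in Section II
index = pd.MultiIndex.from_tuples(
    [("Italy", ""), ("Spain", "")], names=["Country", "Province"]
)
totals = pd.DataFrame(
    {
        "Infected": [9219391, 8676916],
        "Recovered": [0, 0],
        "Deaths": [142205, 91437],
    },
    index=index,
)

statistics = add_active_column(totals)
print(statistics)
```

The computed Active values match the ones displayed for Italy and Spain in Section II.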

Section II

In this section we focus on the current data exploration.


Selecting the (20) countries with most active cases

We now select countries by number of active cases and plot the resulting data on a bar chart.

Our function plots the data automatically and also saves the chart.

Equivalent code would be the following:

top_20_countries_active_cases = analysis.statistics_by_country.sort_values(by='Active', ascending=False).head(20)
top_20_countries_active_cases.iplot(kind='bar', subplots=False, title='The (20) countries with most active cases');
In [4]:
top_20_countries_active_cases = analysis.show_top_cases(20, feature = 'Active')

The 20 countries with most active cases

The data for these selected countries follows.

In [5]:
display(analysis.statistics_by_province.loc[list(top_20_countries_active_cases.index)].sort_values(by='Active', ascending=False))
Country         Province                                          Active  Infected Recovered  Deaths
US                                                              67712182  68569950         0  857768
India                                                           37731080  38218773         0  487693
Brazil                                                          22803267  23425392         0  622125
United Kingdom                                                  15353878  15506750         0  152872
France                                                          13740891  13865920         0  125029
Turkey                                                          10581049  10666302         0   85253
Russia                                                          10399545  10716397         0  316852
Italy                                                            9077186   9219391         0  142205
Spain                                                            8585479   8676916         0   91437
Germany                                                          8244941   8361262         0  116321
Argentina                                                        7327998   7446626         0  118628
Iran                                                             6099777   6231909         0  132132
Colombia                                                         5493083   5624520         0  131437
Poland                                                           4270656   4373718         0  103062
Mexico                                                           4193198   4495310         0  302112
Indonesia                                                        4131336   4275528         0  144192
Ukraine                                                          3858681   3963917         0  105236
Netherlands                                                      3659718   3680896         0   21178
South Africa                                                     3471007   3564578         0   93571
Philippines                                                      3240581   3293625         0   53044
France          Reunion                                           101781    102216         0     435
                Guadeloupe                                         72825     73660         0     835
                French Guiana                                      70131     70483         0     352
                Martinique                                         68118     68931         0     813
                French Polynesia                                   46639     47275         0     636
United Kingdom  Channel Islands                                    37489     37613         0     124
France          Mayotte                                            35003     35189         0     186
Netherlands     Curacao                                            34161     34368         0     207
                Aruba                                              31416     31603         0     187
United Kingdom  Isle of Man                                        20109     20179         0      70
France          New Caledonia                                      14245     14527         0     282
United Kingdom  Cayman Islands                                     12008     12023         0      15
                Gibraltar                                          11116     11216         0     100
                Bermuda                                             9034      9144         0     110
Netherlands     Sint Maarten                                        8723      8798         0      75
France          St Martin                                           7653      7713         0      60
Netherlands     Bonaire, Sint Eustatius and Saba                    5797      5824         0      27
United Kingdom  British Virgin Islands                              5402      5449         0      47
                Turks and Caicos Islands                            5239      5271         0      32
France          Saint Barthelemy                                    2946      2952         0       6
United Kingdom  Anguilla                                            2180      2187         0       7
France          Saint Pierre and Miquelon                            553       553         0       0
                Wallis and Futuna                                    447       454         0       7
United Kingdom  Montserrat                                           147       148         0       1
                Falkland Islands (Malvinas)                           85        85         0       0
                Saint Helena, Ascension and Tristan da Cunha           4         4         0       0
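
The table above lists overseas provinces as well as the 20 countries because selecting by country name on a (Country, Province) index returns every row belonging to that country. The sketch below shows this behaviour on a small sample, with figures copied from this notebook. The variable names are illustrative, and filtering on an empty Province to keep country-level rows only is an assumption about how the index is laid out.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("France", ""),
        ("France", "Reunion"),
        ("Netherlands", ""),
        ("Netherlands", "Aruba"),
        ("Portugal", ""),
    ],
    names=["Country", "Province"],
)
stats = pd.DataFrame(
    {"Active": [13740891, 101781, 3659718, 31416, 1983756]}, index=index
)

# A list of first-level labels selects every province of those countries
selected = stats.loc[["France", "Netherlands"]].sort_values(
    by="Active", ascending=False
)

# Keep only the country-level rows (assumed to carry an empty Province label)
country_only = selected[selected.index.get_level_values("Province") == ""]

print(selected)
print(country_only)
```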

Plotting the cases by day

We now plot the daily figures for a few countries, combining several features (Active, Infected, Recovered and Deaths) for each location.

In [6]:
analysis.display_locations(countries = [ 'Italy', 'Portugal', 'Spain', 'US' ],
                           fill = False,
                           logy = False)



Country         Province      Active  Infected Recovered  Deaths
US                          67712182  68569950         0  857768



Country         Province      Active  Infected Recovered  Deaths
Spain                        8585479   8676916         0   91437



Country         Province      Active  Infected Recovered  Deaths
Portugal                     1983756   2003169         0   19413



Country         Province      Active  Infected Recovered  Deaths
Italy                        9077186   9219391         0  142205

Interactive Sunburst Pie Charts

We now present the interactive charts, where you can select a region or subregion to see its cases in more detail.

In [7]:
analysis.display_sunburst_chart(label = 'Active')
In [8]:
analysis.display_sunburst_chart(label = 'Recovered')
In [9]:
analysis.display_sunburst_chart(label = 'Deaths')
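
A sunburst chart is driven by a parent/child hierarchy, here region and subregion. The sketch below builds parallel labels, parents and values lists from (country, province, value) rows. This is the general layout that hierarchical chart traces such as plotly's Sunburst accept. Whether display_sunburst_chart works this way internally is an assumption, and build_sunburst_nodes is a hypothetical helper.

```python
from collections import defaultdict


def build_sunburst_nodes(rows):
    """Build (labels, parents, values) lists for a two-level hierarchy.

    rows: (country, province, value) tuples; an empty province marks the
    country-level row. Each country node gets the sum of all its rows.
    """
    rows = list(rows)
    country_totals = defaultdict(int)
    for country, _province, value in rows:
        country_totals[country] += value

    labels, parents, values = [], [], []
    # Inner ring: one node per country, attached to the root ("")
    for country, total in country_totals.items():
        labels.append(country)
        parents.append("")
        values.append(total)
    # Outer ring: one node per province, attached to its country
    for country, province, value in rows:
        if province:
            labels.append(province)
            parents.append(country)
            values.append(value)
    return labels, parents, values


# Active cases copied from the outputs in this notebook
rows = [
    ("Netherlands", "", 3659718),
    ("Netherlands", "Aruba", 31416),
    ("Netherlands", "Curacao", 34161),
    ("Portugal", "", 1983756),
]
labels, parents, values = build_sunburst_nodes(rows)
print(list(zip(labels, parents, values)))
```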

Querying dataset

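We now query our dataset for countries with more than 50 million infections, more than 200,000 deaths and fewer active cases than recoveries, together with some additional locations. In the data shown here the Recovered column is 0 for every location, so the condition Active < Recovered cannot hold and the first table is empty; only the explicitly listed locations appear.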

In [10]:
interesting_locations = {
    ('United Kingdom', ''),
    ('Brazil', ''),
}
analysis.display_locations(
    locations = interesting_locations,
    query='Infected > 50000000 & Active < Recovered & Deaths > 200000',
    provinces = False
)
Country         Province      Active  Infected Recovered  Deaths



Country         Province      Active  Infected Recovered  Deaths
Brazil                      22803267  23425392         0  622125



Country         Province      Active  Infected Recovered  Deaths
United Kingdom              15353878  15506750         0  152872
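
The query string uses the same syntax as pandas DataFrame.query, where & combines conditions. The sketch below runs that expression on a three-row sample copied from this notebook. It assumes that display_locations passes the string on to DataFrame.query, which the source does not confirm.

```python
import pandas as pd

stats = pd.DataFrame(
    {
        "Active": [67712182, 22803267, 15353878],
        "Infected": [68569950, 23425392, 15506750],
        "Recovered": [0, 0, 0],
        "Deaths": [857768, 622125, 152872],
    },
    index=pd.Index(["US", "Brazil", "United Kingdom"], name="Country"),
)

# Same expression as in the cell above: no row can satisfy Active < Recovered
# while Recovered is 0 everywhere, so the result is empty
full = stats.query("Infected > 50000000 & Active < Recovered & Deaths > 200000")

# Dropping the recovery condition leaves only the rows above both thresholds
relaxed = stats.query("Infected > 50000000 & Deaths > 200000")

print(full)
print(relaxed)
```

With this sample the full query returns no rows, while the relaxed one keeps only the US.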

Selected Countries and Locations

We now present additional charts for a selected country, the Netherlands, including its provinces.

In [11]:
analysis.display_locations(countries = 'Netherlands', logy=False, provinces = True)



Country      Province                              Active  Infected Recovered  Deaths
Netherlands                                       3659718   3680896         0   21178



Country      Province                              Active  Infected Recovered  Deaths
Netherlands  Bonaire, Sint Eustatius and Saba        5797      5824         0      27



Country      Province                              Active  Infected Recovered  Deaths
Netherlands  Aruba                                  31416     31603         0     187



Country      Province                              Active  Infected Recovered  Deaths
Netherlands  Curacao                                34161     34368         0     207



Country      Province                              Active  Infected Recovered  Deaths
Netherlands  Sint Maarten                            8723      8798         0      75

Animated Geographic Map

We now present an animation of the cases around the world since January 2020. It is published at https://blog.njaniceto.com/demos/geographic-evolution-2019-20-pandemic/.

In [12]:
# Generate an animated geographic map
#analysis.display_geomap()

Final Remarks

The author

Nuno André Jeremias de Aniceto is a Technology Consultant with experience in Software Engineering, Software Architecture and DevOps.
He holds a Master's degree in Computer Science Engineering with a focus on Computer Vision, Big Data, Multimedia and 3D Simulations.
He has specializations in Deep Learning and in Data Engineering on Google Cloud Platform.

The source of the data

The datasets are compiled by Johns Hopkins University, and the data sources themselves may contain some issues (such as the "Recovered" figures for Canadian provinces).

As of 2020-03-28 the datasources are:

References