Summary

The notebook

Hi, this notebook was written to analyse the 2019-2020 outbreak of the SARS-CoV-2 virus in Europe.

It was written to:

  • provide the current status of the outbreak as reported by the authorities;
  • present this status with interactive charts;
  • write a small library to analyse and predict the plateau and duration of the disease;
  • add something new to my personal website.

The disease

The disease COVID-19 is caused by the virus SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2).

Your help

This notebook let me practise some older skills and, eventually, reach out to an interested audience.
If you have a suggestion on how to make the notebook more useful, I would be thankful to hear it ( nuno.aja@gmail.com ).

Kind regards, The author.

NOTICE: ongoing work


Section I

In this section we focus on getting the current datasets, processing them and loading them.

For convenience and security, the code that downloads, processes and presents the data is kept in our own module, imported below.


In [ ]:
import modules_loader
from modules.analytics.sars_cov2_2019_20 import analytics
In [ ]:
analysis = analytics.scenario()
available_countries = analysis.download_source_datafiles(force = True)
In [ ]:
target_cases = analysis.process_dataset(available_countries)

Section II

In this section we focus on the current data exploration.


Selecting the (20) countries with most active cases

We now select the countries with the most active cases and plot the result as a bar chart.

Our function plots the data automatically and also saves the chart.

Equivalent code would be the following:

top_20_countries_active_cases = analysis.statistics_by_country.sort_values(by='Active', ascending=False).head(20)
top_20_countries_active_cases.iplot(kind='bar', subplots=False, title='The (20) countries with most active cases');
In [4]:
top_20_countries_active_cases = analysis.show_top_cases(20, feature = 'Active')

The 20 countries with most active cases

Next, the data for these selected countries, including their provinces.

In [5]:
display(analysis.statistics_by_province.loc[list(top_20_countries_active_cases.index)].sort_values(by='Active', ascending=False))
Country         Province                              Active  Infected  Recovered  Deaths
US                                                  19398156  19740468          0  342312
France                                               2363858   2597124     169250   64016
United Kingdom                                       2360340   2432888          0   72548
Spain                                                1709153   1910218     150376   50689
Netherlands                                           775976    787300          0   11324
Belgium                                               624801    644242          0   19441
Brazil                                                611233   7619200    6814092  193875
Italy                                                 564395   2083689    1445690   73604
Russia                                                544861   3100018    2499465   55692
Sweden                                                428652    437379          0    8727
Germany                                               361971   1741153    1345952   33230
Ukraine                                               333679   1076880     724143   19058
Serbia                                                331828    334991          0    3163
India                                                 257656  10266674    9860280  148738
Poland                                                227506   1281414    1025889   28019
Mexico                                                222267   1413935    1066771  124897
Canada          Quebec                                191657    199822          0    8165
Iran                                                  184944   1218753     978714   55095
Canada          Ontario                               178605    183104          0    4499
Hungary                                               165880    319543     144234    9429
Argentina                                             144089   1613928    1426676   43163
Canada          Alberta                                99382    100428          0    1046
Canada          British Columbia                       50407     51300          0     893
Canada          Manitoba                               23852     24513          0     661
Canada          Saskatchewan                           15006     15160          0     154
France          French Polynesia                       11895     16851       4842     114
France          Guadeloupe                              6223      8620       2242     155
France          Martinique                              5932      6072         98      42
France          Mayotte                                 2871      5890       2964      55
France          French Guiana                           2830     12896       9995      71
Netherlands     Curacao                                 1457      4230       2759      14
Canada          Nova Scotia                             1418      1483          0      65
United Kingdom  Gibraltar                                751      1973       1216       6
United Kingdom  Channel Islands                          710      3024       2256      58
Canada          New Brunswick                            588       596          0       8
France          Reunion                                  434      8972       8496      42
Canada          Newfoundland and Labrador                386       390          0       4
Canada          Nunavut                                  265       266          0       1
Netherlands     Aruba                                    261      5442       5132      49
United Kingdom  Bermuda                                  164       595        421      10
United Kingdom  Turks and Caicos Islands                 104       874        764       6
France          St Martin                                 94       961        855      12
Canada          Prince Edward Island                      94        94          0       0
Netherlands     Sint Maarten                              87      1434       1320      27
Canada          Yukon                                     59        60          0       1
United Kingdom  Cayman Islands                            34       330        294       2
Canada          Northwest Territories                     24        24          0       0
Canada          Repatriated Travellers                    13        13          0       0
Canada          Grand Princess                            13        13          0       0
United Kingdom  Falkland Islands (Malvinas)               12        29         17       0
United Kingdom  British Virgin Islands                    11        86         74       1
France          Saint Barthelemy                          11       184        172       1
Netherlands     Bonaire, Sint Eustatius and Saba           5       186        178       3
United Kingdom  Isle of Man                                4       377        348      25
France          Saint Pierre and Miquelon                  2        16         14       0
United Kingdom  Anguilla                                   1        13         12       0
United Kingdom  Montserrat                                 0        13         12       1
France          New Caledonia                              0        38         38       0
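
The table above has more rows than countries because selecting a list of country names from a frame indexed by (Country, Province) returns every province row of those countries. A minimal sketch of that behaviour, using a few rows copied from the table and assuming `statistics_by_province` is a pandas DataFrame with such a two-level index (the variable names below are illustrative, not the library's own):

```python
import pandas as pd

# A few rows copied from the table above; an empty province marks the country-level row.
rows = [
    ("Netherlands", "", 775976, 787300, 0, 11324),
    ("Netherlands", "Curacao", 1457, 4230, 2759, 14),
    ("Netherlands", "Aruba", 261, 5442, 5132, 49),
    ("Belgium", "", 624801, 644242, 0, 19441),
    ("Portugal", "", 68205, 406051, 331016, 6830),
]
columns = ["Country", "Province", "Active", "Infected", "Recovered", "Deaths"]
by_province = (
    pd.DataFrame(rows, columns=columns)
    .set_index(["Country", "Province"])
    .sort_index()
)

# A list of first-level labels keeps all provinces of the listed countries.
selected = by_province.loc[["Netherlands", "Belgium"]].sort_values(
    by="Active", ascending=False
)
print(selected)
```

Two countries are requested, yet four rows come back: the Netherlands contributes its country-level row plus two provinces.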

Plotting the cases by day

We now plot the daily data for some countries, combining several features (active, infected, recovered and deaths) in each chart.

In [14]:
analysis.display_locations(countries = [ 'Italy', 'Portugal', 'Spain', 'US' ],
                           fill = False,
                           logy = False)



Country      Province        Active  Infected  Recovered  Deaths
Portugal                      68205    406051     331016    6830
Italy                        564395   2083689    1445690   73604
Spain                       1709153   1910218     150376   50689

Interactive Sunburst Pie Charts

Now we present the interactive charts, where you can select a region or subregion to see its cases in more detail.

In [7]:
analysis.display_sunburst_chart(label = 'Active')
In [8]:
analysis.display_sunburst_chart(label = 'Recovered')
In [9]:
analysis.display_sunburst_chart(label = 'Deaths')
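
A sunburst chart nests each province inside its country, so the country ring carries the sum of its rows. The sketch below only illustrates that aggregation with figures copied from the table in this notebook; it does not reproduce `display_sunburst_chart`, whose implementation is not shown here, and it assumes that the row with an empty province holds the mainland figure. The `Share` column is a hypothetical name.

```python
import pandas as pd

# Active cases copied from the table shown earlier in this notebook.
rows = [
    ("Netherlands", "", 775976),
    ("Netherlands", "Curacao", 1457),
    ("Netherlands", "Aruba", 261),
    ("Netherlands", "Sint Maarten", 87),
    ("Netherlands", "Bonaire, Sint Eustatius and Saba", 5),
    ("Belgium", "", 624801),
]
active = pd.DataFrame(rows, columns=["Country", "Province", "Active"])

# Inner ring: one value per country, summed over all of its rows.
by_country = active.groupby("Country")["Active"].sum()

# Outer ring: the fraction of the country total that each row represents.
active["Share"] = active["Active"] / active["Country"].map(by_country)
print(by_country)
```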

Querying dataset

We now query our dataset for countries where recoveries exceed active cases and where infections and deaths are above chosen thresholds, together with some additional locations.

In [ ]:
interesting_locations = {
    ('United Kingdom', ''),
    ('Brazil', ''),
}
analysis.display_locations(
    locations = interesting_locations,
    query='Infected > 50000000 & Active < Recovered & Deaths > 200000',
    provinces = False
)
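
The query string above follows the expression syntax of pandas `DataFrame.query`. Assuming `display_locations` hands the string to that method (this is not confirmed here), the filter can be tried on its own as in the sketch below. The rows are copied from the table in this notebook and the thresholds are illustrative, not the ones used in the cell above.

```python
import pandas as pd

# Country-level rows copied from the table shown earlier in this notebook.
rows = [
    ("Brazil", 611233, 7619200, 6814092, 193875),
    ("India", 257656, 10266674, 9860280, 148738),
    ("Italy", 564395, 2083689, 1445690, 73604),
    ("United Kingdom", 2360340, 2432888, 0, 72548),
    ("Hungary", 165880, 319543, 144234, 9429),
]
columns = ["Country", "Active", "Infected", "Recovered", "Deaths"]
by_country = pd.DataFrame(rows, columns=columns).set_index("Country")

# Illustrative thresholds; all three conditions must hold for a row to be kept.
matches = by_country.query(
    "Infected > 1000000 & Active < Recovered & Deaths > 100000"
)
print(matches.index.tolist())
```

Inside a query string, `&` is treated like `and`, so the comparisons do not need extra parentheses.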

Selected Countries and Locations

Now we present additional charts of the cases for selected countries and locations around the world.

In [ ]:
analysis.display_locations(countries = 'Netherlands', logy=False, provinces = True)



Country      Province        Active  Infected  Recovered  Deaths
Netherlands  Sint Maarten        87      1434       1320      27
Netherlands                  775976    787300          0   11324

Animated Geographic Map

Now we present the animation of the cases around the world since January 2020. It is published at https://blog.njaniceto.com/demos/geographic-evolution-2019-20-pandemic/.

In [ ]:
# Generate an animated geographic map
#analysis.display_geomap()

Final Remarks

The author

Nuno André Jeremias de Aniceto is a Technology Consultant with experience in Software Engineering, Software Architecture and DevOps.
He holds a Master's degree in Computer Science Engineering with a focus on Computer Vision, Big Data, Multimedia and 3D Simulations.
He has specializations in Deep Learning and in Data Engineering on Google Cloud Platform.

The source of the data

The datasets are compiled by Johns Hopkins University, and the data sources themselves may present some issues (such as the "Recovered" figures for the Canadian provinces).
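
The figures shown in this notebook are consistent with Active = Infected − Recovered − Deaths, so a location that reports no recoveries ends up with an inflated "Active" value. The sketch below checks that relation on three rows copied from the table and flags the affected ones. The relation is inferred from the displayed numbers rather than taken from the library's code, and the `RecoveredMissing` column is a hypothetical name.

```python
import pandas as pd

# Infected, Recovered and Deaths copied from the table shown earlier in this notebook.
rows = [
    ("Canada / Quebec", 199822, 0, 8165),
    ("France", 2597124, 169250, 64016),
    ("Italy", 2083689, 1445690, 73604),
]
stats = pd.DataFrame(
    rows, columns=["Location", "Infected", "Recovered", "Deaths"]
).set_index("Location")

# Assumed relation; it reproduces the "Active" column of the table for these rows.
stats["Active"] = stats["Infected"] - stats["Recovered"] - stats["Deaths"]

# Flag rows that report infections but zero recoveries:
# their "Active" value is then only an upper bound.
stats["RecoveredMissing"] = (stats["Recovered"] == 0) & (stats["Infected"] > 0)
print(stats)
```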

As of 2020-03-28 the data sources are:

References