"/mnt/tera/ext4/code/sdmx-sandbox/venv/lib/python3.9/site-packages/pandasdmx/remote.py:11: RuntimeWarning: optional dependency requests_cache is not installed; cache options to Session() have no effect\n",
"> __Dataflow__: set di metadati relativi a una misura effettuata (ad esempio, `educ_enrl1ad - Students by ISCED level, study intensity and sex`)\n",
"\n",
"> __Message__: risposta HTTPS ricevuta in seguito a una richiesta effettuata ad un server di dati\n",
"\n",
"Poi, scarichiamo _tutti_ i dataflow disponibili usando `.dataflow()` sul client creato in precedenza per effettuare una richiesta al server Eurostat, creando un `pandasdmx.message.Message`:"
"> __Series__: una specie di `dict` più veloce e avanzato implementato da `pandas`\n",
"\n",
"PandaSDMX ha la funzionalità che cercavamo di cercare dataset per keyword!\n",
"\n",
"Per effettuare la ricerca, usiamo il metodo `.to_pandas()` per convertire il `Message` in oggetti Python e/o `pandas`, poi usiamo i metodi \"nativi\" per trovare quello che ci serve:"
"text/plain": "(DS-018995 EU trade since 1988 by SITC\n DS-022469 EXTRA EU trade since 1999 by mode of transport...\n DS-032655 EU trade since 1988 by BEC\n DS-043227 EFTA trade since 1995 by SITC\n DS-066341 Sold production, exports and imports by PRODCO...\n ... \n yth_incl_120 Young people living in households with very lo...\n yth_part_010 Frequency of getting together with relatives o...\n yth_part_020 Frequency of contacts with relatives or friend...\n yth_part_030 Participation of young people in activities of...\n yth_volunt_010 Participation of young people in informal volu...\n Length: 6573, dtype: object,\n DSD_DS-018995 \n DSD_DS-022469 \n DSD_DS-032655 \n DSD_DS-043227 \n DSD_DS-066341 \n ..\n DSD_yth_incl_120 \n DSD_yth_part_010 \n DSD_yth_part_020 \n DSD_yth_part_030 \n DSD_yth_volunt_010 \n Length: 6573, dtype: object)"
"text/plain": "educ_enrl1ad Students by ISCED level, study intensity and sex\neduc_enrl1at Students by ISCED level, type of institution a...\neduc_enrl1tl Students by ISCED level, age and sex\neduc_enrl5 Tertiary students (ISCED 5-6) by field of educ...\neduc_enrl6 Tertiary students (ISCED 5-6) non-citizens, n...\neduc_enrl8 Tertiary students (ISCED 5-6) by country of ci...\neduc_enrllng1 Students in ISCED 1-3 by modern foreign langua...\neduc_enrllng2 Students in ISCED 1-3 by number of modern fore...\neduc_fiaid Financial aid to students\neduc_ilev Distribution of pupils/ students by level\neduc_iste Pupil/ student - teacher ratio and average cla...\neduc_mofo_dst Foreign students by level of education and cou...\neduc_mofo_fld Foreign students by level and field of education\neduc_mofo_gen Foreign students by level of education and sex\neduc_mofo_orig Foreign students by level of education and cou...\neduc_momo_dst Students going abroad by level of education an...\neduc_momo_fld Students from abroad by level and field of edu...\neduc_momo_gen Students from abroad by level of education and...\neduc_momo_orig Students from abroad by level of education and...\neduc_outc_pisa Underachieving 15-year-old students by sex and...\neduc_renrlrg1 Students by level of education, orientation, s...\neduc_renrlrg3 Students by age, sex and NUTS 2 regions\neduc_thmob Student mobility\neduc_uoe_enra01 Pupils and students enrolled by education leve...\neduc_uoe_enra02 Pupils and students enrolled by education leve...\neduc_uoe_enra03 Pupils and students enrolled by education leve...\neduc_uoe_enra04 Pupils and students by education level - as % ...\neduc_uoe_enra05 Pupils and students in education by age groups...\neduc_uoe_enra06 Pupils and students in education aged 30 and o...\neduc_uoe_enra07 Expected school years of pupils and students b...\neduc_uoe_enra08 Students in post-compulsory education - as % o...\neduc_uoe_enra09 Students participation at the end of compulsor...\neduc_uoe_enra11 Pupils and students enrolled by education leve...\neduc_uoe_enra12 Pupils and students enrolled by sex, age and N...\neduc_uoe_enra13 Distribution of pupils and students enrolled i...\neduc_uoe_enra16 Pupils and students enrolled by education leve...\neduc_uoe_enrt01 Students enrolled in tertiary education by edu...\neduc_uoe_enrt02 Students enrolled in tertiary education by edu...\neduc_uoe_enrt03 Students enrolled in tertiary education by edu...\neduc_uoe_enrt04 Distribution of students enrolled at tertiary ...\neduc_uoe_enrt05 Ratio of the proportion of tertiary students o...\neduc_uoe_enrt06 Students enrolled in tertiary education by edu...\neduc_uoe_enrt07 Students in tertiary education by age groups -...\neduc_uoe_enrt08 Students in tertiary education - as % of 20-24...\neduc_uoe_fina01 Financial aid to students by education level -...\neduc_uoe_fine09 Public expenditure on education per pupil/stud...\neduc_uoe_fine10 Pupils and students enrolled by education leve...\neduc_uoe_fini04 Annual expenditure on educational institutions...\neduc_uoe_fini06 Ratio of annual expenditure per student at the...\neduc_uoe_mobs01 Mobile students from abroad enrolled by educat...\neduc_uoe_mobs02 Mobile students from abroad enrolled by educat...\neduc_uoe_mobs03 Share of mobile students from abroad enrolled ...\neduc_uoe_mobs04 Distribution of mobile students from abroad en...\neduc_uoe_perp04 Ratio of pupils and students to teachers and a...\nhrst_fl_tefor Participation of foreign students in tertiary ...\ntsc00028 Doctorate students in science and technology f...\ndtype: object"
"> __Structure__: metadati su come sono strutturate le misure di un dataflow (cosa è stato misurato, quali filtri è possibile applicare, note, etc)\n",
"\n",
"_Particolarità di Eurostat: la structure va richiesta separatamente dal dataflow, in quanto tutti i campi a parte `id` di `dataflow.structure` sono sempre vuoti._\n",
"\n",
"Scopriamo prima il label della structure, poi scarichiamo da Eurostat la structure del dataflow che ci interessa con il metodo `.datastructure()`:"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "code",
"execution_count": 9,
"outputs": [
{
"data": {
"text/plain": "<DataStructureDefinition ESTAT:DSD_educ_enrl1ad(1.0): DSWS Data Structure Definition>"
"Ispezioniamo la structure che abbiamo scaricato, visualizzandola contemporaneamente [sul Data Explorer di Eurostat](https://ec.europa.eu/eurostat/databrowser/view/educ_enrl1ad/default/table?lang=en)\n",
"\n",
"> __Measures__: valori aggregati relativi alle misure effettuate, simili a `COUNT(*)` dell'SQL\n",
"\n",
"> __Dimensions__: filtri applicabili ai dati raccolti in modo simile all'`HAVING` dell'SQL\n",
"\n",
"> __Attributes__: ???\n",
"\n",
"> __Annotations__: commenti che possono essere aggiunti al dataflow"
"Infine, richiediamo i dati da Eurostat, limitandoli a quelli dell'`IT`alia dal 2010 in poi e selezionando solo il `WORKTIME` `TOTAL`, e convertiamoli in una Series multi-chiave:"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "code",
"execution_count": 28,
"outputs": [
{
"data": {
"text/plain": "FREQ UNIT ISCED97 SEX WORKTIME GEO TIME_PERIOD\nA NR ED0 F TOTAL IT 2010 808706.0\n 2011 811615.0\n 2012 815656.0\n M TOTAL IT 2010 872281.0\n 2011 876225.0\n ... \n UNK M TOTAL IT 2011 NaN\n 2012 NaN\n T TOTAL IT 2010 NaN\n 2011 NaN\n 2012 NaN\nName: value, Length: 279, dtype: float64"
"Tra le sorgenti di dati di cui abbiamo parlato, sono [completamente supportate](https://pandasdmx.readthedocs.io/en/latest/sources.html):\n",
"\n",
"- `ESTAT` - Eurostat\n",
"- `ISTAT` - ISTAT\n",
"\n",
"Queste sorgenti non supportano lo standard `SDMX-MD` ma solo lo standard `SDMX-JSON`, che [non supporta query di metadati e struttura](https://pandasdmx.readthedocs.io/en/latest/sources.html#data-source-limitations):\n",
"\n",
"- `OECD` - Organisation for Economic Cooperation and Development"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## Archiviazione dati\n",
"\n",
"Se si vogliono replicare dati provenienti da queste fonti, si potrebbe usare tranquillamente un database **relazionale** (SQL) le cui tabelle sono generate a runtime in base alla struttura del dataflow desiderato.\n",
"\n",
"[SQLAlchemy](https://www.sqlalchemy.org/) potrebbe essere utile in questo caso; non sono particolarmente familiare con l'[ORM di Django](https://docs.djangoproject.com/en/3.1/topics/db/models/), ma sembrano molto simili (anche se [si direbbe che SQLAlchemy supporti query più complesse](https://stackoverflow.com/questions/18199053/example-of-what-sqlalchemy-can-do-and-django-orm-cannot))."