# PandaSDMX

- [Up-to-date documentation (v1.4.1)](https://pandasdmx.readthedocs.io/en/latest/)
- [Short example (with few explanations)](https://pandasdmx.readthedocs.io/en/master/example.html)
- [In-depth example (but somewhat out of date)](https://pandasdmx.readthedocs.io/en/latest/walkthrough.html)

## Installation

- The latest version does not work with Pydantic 1.8.1 and requires 1.7 ([dr-leo/pandaSDMX#204](https://github.com/dr-leo/pandaSDMX/issues/204))

In [1]:
!pip install pandasdmx pydantic==1.7
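
Because of the version pin above, it can help to check which Pydantic release is actually installed before importing the library. The following is only a minimal sketch: `installed_version` and `pydantic_is_pinned` are hypothetical helper names, and the check assumes that only the 1.7 series is known to work, which is all the linked issue tells us.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pydantic_is_pinned(version: str) -> bool:
    """Assumption: only the 1.7 series (1.7 or 1.7.x) is known to be compatible."""
    return version == "1.7" or version.startswith("1.7.")


pydantic_version = installed_version("pydantic")
if pydantic_version is None:
    print("pydantic is not installed")
elif pydantic_is_pinned(pydantic_version):
    print(f"pydantic {pydantic_version} matches the 1.7 pin")
else:
    print(f"pydantic {pydantic_version} may not work; consider pydantic==1.7")
```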



## Example

In [2]:
import pandas
import pandasdmx

# For type annotations
import pandasdmx.message
import pandasdmx.model
import pandasdmx.source
import pandasdmx.source.estat



Several data sources are available, including Eurostat:

In [3]:
# Create an SDMX-ML "client" for communicating with Eurostat
eurostat: pandasdmx.Request = pandasdmx.Request("ESTAT")
eurostat

<pandasdmx.api.Request at 0x7fc5ec2b6970>

PandaSDMX appears to implement the metadata search functionality we were looking for:

In [4]:
# Download the metadata of ALL dataflows available on Eurostat
# This takes a few minutes: there are 6573 dataflows!
all_flows_msg: pandasdmx.message.Message = eurostat.dataflow()
all_flows_msg

<pandasdmx.StructureMessage>
  <Header>
    id: 'IDREF372221'
    prepared: '2021-03-13T13:41:50.771000+00:00'
    receiver: <Agency Unknown>
    sender: <Agency Unknown>
    source: 
    test: False
  response: <Response [200]>
  DataflowDefinition (6573): DS-018995 DS-022469 DS-032655 DS-043227 DS...
  DataStructureDefinition (6573): DSD_DS-018995 DSD_DS-022469 DSD_DS-03...

In [5]:
# Convert the results into two pandas Series: one with the dataflows and one with their corresponding structures
_dict: dict[str, pandas.Series] = all_flows_msg.to_pandas()
all_flows: pandas.Series = _dict["dataflow"]
all_structs: pandas.Series = _dict["structure"]
all_flows, all_structs

(DS-018995                               EU trade since 1988 by SITC
 DS-022469         EXTRA EU trade since 1999 by mode of transport...
 DS-032655                                EU trade since 1988 by BEC
 DS-043227                             EFTA trade since 1995 by SITC
 DS-066341         Sold production, exports and imports by PRODCO...
                                         ...                        
 yth_incl_120      Young people living in households with very lo...
 yth_part_010      Frequency of getting together with relatives o...
 yth_part_020      Frequency of contacts with relatives or friend...
 yth_part_030      Participation of young people in activities of...
 yth_volunt_010    Participation of young people in informal volu...
 Length: 6573, dtype: object,
 DSD_DS-018995          
 DSD_DS-022469          
 DSD_DS-032655          
 DSD_DS-043227          
 DSD_DS-066341          
                      ..
 DSD_yth_incl_120       
 DSD_yth_part_010       
 DSD_yth_pa

In [6]:
# Search the all_flows Series for entries whose description contains "student"
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html
student_flows: pandas.Series = all_flows[all_flows.str.contains("student", case=False)]
student_flows

educ_enrl1ad        Students by ISCED level, study intensity and sex
educ_enrl1at       Students by ISCED level, type of institution a...
educ_enrl1tl                    Students by ISCED level, age and sex
educ_enrl5         Tertiary students (ISCED 5-6) by field of educ...
educ_enrl6         Tertiary students (ISCED 5-6)  non-citizens, n...
educ_enrl8         Tertiary students (ISCED 5-6) by country of ci...
educ_enrllng1      Students in ISCED 1-3 by modern foreign langua...
educ_enrllng2      Students in ISCED 1-3 by number of modern fore...
educ_fiaid                                 Financial aid to students
educ_ilev                  Distribution of pupils/ students by level
educ_iste          Pupil/ student - teacher ratio and average cla...
educ_mofo_dst      Foreign students by level of education and cou...
educ_mofo_fld       Foreign students by level and field of education
educ_mofo_gen         Foreign students by level of education and sex
educ_mofo_orig     Foreign student
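
The filter above can be tried offline on a small hand-built Series. This is only a sketch: the four labels and descriptions are copied from the outputs shown earlier, while `sample_flows` and `matches` are hypothetical names. Passing `regex=False` is a design choice made here so that the search term is treated as literal text; the cell above relies on the default regular-expression matching.

```python
import pandas as pd

# A tiny stand-in for the full dataflow Series (index: label, value: description).
sample_flows = pd.Series(
    {
        "educ_enrl1ad": "Students by ISCED level, study intensity and sex",
        "DS-018995": "EU trade since 1988 by SITC",
        "educ_fiaid": "Financial aid to students",
        "DS-032655": "EU trade since 1988 by BEC",
    }
)

# Boolean mask: True where the description contains "student", ignoring case.
mask = sample_flows.str.contains("student", case=False, regex=False)

# Boolean indexing keeps only the matching rows, preserving their labels.
matches = sample_flows[mask]
print(matches)
```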

In [12]:
# Take the first match and fetch the corresponding dataflow and structure
my_flow_label: str = student_flows.index[0]
# Download the specific dataflow we are interested in
my_flow_msg: pandasdmx.message.Message = eurostat.dataflow(my_flow_label)
my_flow: pandasdmx.model.DataflowDefinition = my_flow_msg.dataflow[my_flow_label]
# Find the label (a plain string id) of the data structure
my_struct_label: str = my_flow.structure.id
# Download the structure of the dataflow
my_struct_msg: pandasdmx.message.Message = eurostat.datastructure(my_struct_label)
my_struct: pandasdmx.model.DataStructureDefinition = my_struct_msg.structure[my_struct_label]
my_flow, my_struct

(<DataflowDefinition ESTAT:educ_enrl1ad(1.0): Students by ISCED level, study intensity and sex>,
 <DataStructureDefinition ESTAT:DSD_educ_enrl1ad(1.0): DSWS Data Structure Definition>)

In [14]:
# Inspect the structure, which contains:
# - annotations
# - measures
# - attributes
# - dimensions
my_struct.annotations, my_struct.measures, my_struct.attributes, my_struct.dimensions

([],
 <MeasureDescriptor: <PrimaryMeasure OBS_VALUE>>,
 <AttributeDescriptor: <DataAttribute OBS_FLAG>; <DataAttribute OBS_STATUS>>,
 <DimensionDescriptor: <Dimension FREQ>; <Dimension UNIT>; <Dimension ISCED97>; <Dimension SEX>; <Dimension WORKTIME>; <Dimension GEO>; <TimeDimension TIME_PERIOD>>)
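
The order of the dimensions matters because an SDMX REST query identifies series with a key: one value per non-time dimension, in the order given by the structure, separated by dots, with `+` joining alternative values and an empty position meaning "any value". The sketch below builds such a key in plain Python. `build_sdmx_key` is a hypothetical helper, not part of PandaSDMX, and the codes `A`, `IT` and `FR` are illustrative assumptions that have not been checked against the Eurostat code lists.

```python
from typing import List, Mapping, Sequence, Union

# Dimension order as listed by the structure above, without the time dimension.
DIMENSION_ORDER = ["FREQ", "UNIT", "ISCED97", "SEX", "WORKTIME", "GEO"]


def build_sdmx_key(
    dimension_order: Sequence[str],
    selection: Mapping[str, Union[str, Sequence[str]]],
) -> str:
    """Build a dot-separated SDMX key; unselected dimensions stay empty."""
    unknown = set(selection) - set(dimension_order)
    if unknown:
        raise KeyError(f"Unknown dimensions: {sorted(unknown)}")

    parts: List[str] = []
    for dimension in dimension_order:
        value = selection.get(dimension, "")
        if isinstance(value, str):
            parts.append(value)
        else:
            # Several allowed values for one dimension are joined with "+".
            parts.append("+".join(value))
    return ".".join(parts)


# Illustrative selection: annual data for two (assumed) country codes.
key = build_sdmx_key(DIMENSION_ORDER, {"FREQ": "A", "GEO": ["IT", "FR"]})
print(key)
```

A key string of this shape is what the `key` argument of `Request.data` is expected to receive; whether a given selection returns data depends on the codes the source actually defines.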